[Hardware] "success"
John L. Bass
jbass at dmsd.com
Sat Oct 21 03:44:08 EDT 2006
Kris Amy wrote:
>What is actually required in terms of hardware for this?
>
>I'm just browsing around on that website and unsure of what you need.
>
>
Hi Kris,
There is a solution space for nearly every type of FPGA, ranging from
fully unrolled solutions that run one key per clock, to tightly rolled
bit/digit serial designs that fit hundreds or thousands of engines, each
requiring around 2,500 clocks per key, for an aggregate of one solution
every few clocks. Bit/digit serial designs can be run at nearly the max
clock rate for the FPGA, while parallel designs have deeper combinatorial
logic and more routing delays, and will need the clock rate tuned to the
design's internal latencies.
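A rough sketch of the throughput arithmetic behind that tradeoff. The engine counts and clock rates below are made-up illustrative values, not measurements from any real device; only the 2,500 clocks per key figure comes from the text above.

```python
def keys_per_second(engines: int, clock_hz: float, clocks_per_key: int) -> float:
    """Aggregate key rate for `engines` identical engines on one device."""
    return engines * clock_hz / clocks_per_key


def clocks_between_solutions(engines: int, clocks_per_key: int) -> float:
    """Average number of clocks between finished keys across all engines."""
    return clocks_per_key / engines


# Hypothetical numbers, for illustration only.
# One fully unrolled pipeline at a derated 100 MHz clock:
unrolled = keys_per_second(engines=1, clock_hz=100e6, clocks_per_key=1)

# 1,000 bit/digit serial engines at an assumed 300 MHz clock:
serial = keys_per_second(engines=1000, clock_hz=300e6, clocks_per_key=2500)

print(f"unrolled: {unrolled:.3g} keys/s")
print(f"serial:   {serial:.3g} keys/s")
print(f"serial clocks between solutions: {clocks_between_solutions(1000, 2500)}")
```

With these assumed values the two approaches land in the same range, and the serial design delivers a key every 2.5 clocks. The real comparison depends on how many engines actually fit and what clock each style sustains.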
Thermally and power-wise, the bit/digit serial designs are superior, as
they suffer much less from the excess logic transitions found in
multilevel combinatorial parallel solutions. You can also fully pack
nearly any device, whereas large parallel solutions will leave large
unused areas unless you also pack them with rolled or bit/digit serial
engines.
In every case, these designs will have VERY high utilization, with
toggle rates over short periods of time that are near 100% for serial
designs (worst case) and between 25% and 50% for parallel designs. This
consumes a lot of power (more than most student boards can provide) and
produces heat that generally requires extreme measures to cool, like
high-end overclocker solutions (water cooling) for the largest devices.
Some devices, like the older large Virtex, Virtex-E, and Virtex-2
products, are unstable even with extreme engineering solutions for the
power and heat. Current Virtex-4 and Virtex-5 devices in BGA packages
are better, and might actually be stable with extreme engineering for
power and cooling.
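To see why toggle rate matters so much, here is a first-order sketch using the textbook CMOS dynamic power relation P = a * C * V^2 * f (conventions differ on whether a factor of 1/2 is folded into the activity term a). The capacitance, voltage, and clock values are placeholders I picked for illustration, and the model ignores static power and glitching, so treat it as a scaling argument only.

```python
def dynamic_power_watts(toggle_rate: float,
                        switched_cap_farads: float,
                        vdd_volts: float,
                        clock_hz: float) -> float:
    """First-order CMOS dynamic power: P = a * C * V^2 * f."""
    return toggle_rate * switched_cap_farads * vdd_volts ** 2 * clock_hz


# Hypothetical device: 50 nF of total switched capacitance, 1.0 V core,
# 300 MHz clock. None of these figures come from a datasheet.
CAP = 50e-9
VDD = 1.0
CLK = 300e6

for rate in (0.25, 0.50, 1.00):
    watts = dynamic_power_watts(rate, CAP, VDD, CLK)
    print(f"toggle rate {rate:4.0%}: {watts:.2f} W")
```

Power scales linearly with toggle rate in this model, so a design toggling near 100% draws about four times what the same logic would at 25%.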
In short, this is very feasible and fast, but may require some REAL
worst-case engineering for larger FPGAs, and for some smaller ones.
Current FPGA vendors do not spec the worst-case numbers, so some careful
application- and design-specific in-system measuring and derating will
have to occur for FPGA key cracking systems to be stable and error free.
On-die thermals will certainly have to be monitored, with external
over-temperature measures implemented to keep from reducing an expensive
FPGA to burned-out trash (I've created a few thermally failed Virtex
FPGAs already).
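A minimal sketch of the kind of over-temperature policy described above. How the die temperature is read and how the clock or power is actually cut is board specific and not shown; the thresholds, step size, and function names are hypothetical.

```python
def thermal_action(temp_c: float,
                   derate_at_c: float = 70.0,
                   trip_at_c: float = 85.0) -> str:
    """Map a die temperature reading to an action.

    Thresholds are illustrative assumptions, not vendor limits.
    """
    if temp_c >= trip_at_c:
        return "shutdown"
    if temp_c >= derate_at_c:
        return "derate"
    return "run"


def next_clock_mhz(current_mhz: float, action: str,
                   step_mhz: float = 10.0, floor_mhz: float = 50.0) -> float:
    """Pick the next clock setting for a given action."""
    if action == "shutdown":
        return 0.0  # stop the clock (or cut core power) entirely
    if action == "derate":
        return max(floor_mhz, current_mhz - step_mhz)
    return current_mhz


# Example: walk a made-up series of temperature readings.
clock = 200.0
for reading in (62.0, 71.5, 78.0, 86.0):
    action = thermal_action(reading)
    clock = next_clock_mhz(clock, action)
    print(f"{reading:5.1f} C -> {action:8s} clock now {clock:.0f} MHz")
```

The shutdown path should live outside the FPGA itself (a supervisor or power controller), since a device in thermal trouble cannot be trusted to protect itself.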
I haven't done the math for a couple of years, but the largest Virtex-5
devices, with a design optimized for 5-LUTs and 6-LUTs and aggressive
fitting to use the support logic (muxes, expanders, and carry logic) in
a digit serial configuration, should yield an aggregate performance such
that a hundred or two of these large FPGAs are as fast as all the member
processors in DNet. Maybe fewer FPGAs. The easiest way to do bit/digit
serial designs is as a schematic macro (core) that is then hand- or
script-tiled with LOCs and wired to common support logic by a
script-generated high-level HDL design. This is necessary to avoid
routing delays and routing congestion, and should produce a max clock
rate design (inside your power/cooling limits) that is nearly 100%
utilized. Anticipate that you will have some stability problems with
power/thermal doing this ... possibly fatal ones, and you may have to
derate the device to protect it from failure or to get stable operation.
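As a sketch of what script tiling can look like, the following generates one placement constraint per engine instance on a regular grid. It assumes a relationally placed macro anchored with UCF-style RLOC_ORIGIN lines; the instance naming, macro footprint, and grid size are hypothetical, and whether you constrain with LOC on individual elements or RLOC_ORIGIN on the macro depends on your tool flow, so treat the emitted lines as a template.

```python
def tile_constraints(cols: int, rows: int,
                     macro_w: int, macro_h: int,
                     prefix: str = "engine") -> list[str]:
    """Emit one placement constraint per engine on a cols x rows grid.

    macro_w and macro_h are the macro footprint in slice coordinates
    (assumed values; measure them from your own placed macro).
    """
    lines = []
    for cx in range(cols):
        for cy in range(rows):
            x = cx * macro_w  # slice X origin of this tile
            y = cy * macro_h  # slice Y origin of this tile
            lines.append(
                f'INST "{prefix}_x{cx}_y{cy}" RLOC_ORIGIN = X{x}Y{y};'
            )
    return lines


# Example: a 2 x 2 grid of macros, each 4 slices wide and 8 slices tall.
for line in tile_constraints(cols=2, rows=2, macro_w=4, macro_h=8):
    print(line)
```

The same loop can emit the matching instance list for the top-level HDL, so the instance names in the constraints and in the generated design never drift apart.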