[Hardware] "success"

John L. Bass jbass at dmsd.com
Sat Oct 21 03:44:08 EDT 2006


Kris Amy wrote:

>What is actually required in terms of hardware for this?
>
>I'm just browsing around on that website and unsure of what you need.
>  
>
Hi Kris,

There is a solution space for nearly every type of FPGA, ranging from
fully unrolled solutions that run one key per clock, to tightly rolled
bit/digit serial designs that fit hundreds or thousands of engines,
each requiring around 2,500 clocks per key, so the device as a whole
still produces a solution every few clocks. Bit/digit serial designs can
be run at nearly the max clock rate for the FPGA, while parallel designs
have deeper combinatorial paths and more routing delays, and need the
clock rate tuned to the design's internal latencies.
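
As a rough sketch of the arithmetic behind that trade-off, the aggregate
key rate is engines times clock rate divided by clocks per key. The
engine counts and clock rates below are illustrative assumptions, not
measurements from any particular device; only the 2,500 clocks per key
figure comes from the text above.

```python
def keys_per_second(engines, clock_hz, clocks_per_key):
    """Aggregate key rate for `engines` identical engines that each
    need `clocks_per_key` clocks to test one key."""
    return engines * clock_hz / clocks_per_key


def clocks_per_solution(engines, clocks_per_key):
    """Average number of clocks between finished keys, device-wide."""
    return clocks_per_key / engines


# Assumed numbers for illustration only.
# One fully unrolled engine, clock tuned down for deep logic:
unrolled_rate = keys_per_second(engines=1, clock_hz=100e6, clocks_per_key=1)

# 500 bit-serial engines running near the device's max clock:
serial_rate = keys_per_second(engines=500, clock_hz=300e6, clocks_per_key=2500)
serial_gap = clocks_per_solution(engines=500, clocks_per_key=2500)

print(f"unrolled: {unrolled_rate:.3g} keys/s")
print(f"serial:   {serial_rate:.3g} keys/s, one key every {serial_gap:g} clocks")
```

With these assumed inputs the serial array finishes a key every 5
clocks; the real balance depends on how many engines actually fit and on
the clock each style can sustain.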

Thermally and power-wise, the bit/digit serial designs are superior, as
they suffer much less from the excess logic transitions that occur in
multilevel combinatorial parallel solutions. You can also fully pack
nearly any device, whereas large parallel solutions will leave large
unused areas unless you also pack them with rolled or bit/digit serial
engines.

In every case, these designs will have VERY high utilization, with
toggle rates over short periods of time that are near 100% for serial
designs (worst case) and between 25% and 50% for parallel designs. This
consumes a lot of power (more than most student boards can provide) and
produces heat that generally requires extreme measures to cool, like
using high-end overclockers' solutions (water cooling) for the largest
devices. Some devices, like the older large Virtex, Virtex-E, and
Virtex-2 products, are unstable even with extreme engineering solutions
for the power and heat. Current Virtex-4 and Virtex-5 devices in BGA
packages are better, and might actually be stable with extreme
engineering for power and cooling.
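
To see why toggle rate matters so much, here is a minimal sketch using
the common first-order CMOS model, dynamic power = toggle rate x
switched capacitance x Vdd^2 x clock. The capacitance, voltage, and
clock values are placeholder assumptions, and the model ignores static
leakage and glitch transitions inside deep combinatorial logic, so treat
it as a scaling argument and not as a power estimate for any real part.

```python
def dynamic_power_watts(toggle_rate, switched_capacitance_f, vdd, clock_hz):
    """First-order dynamic power: alpha * C * V^2 * f.

    toggle_rate is the fraction of nodes switching per clock (0.0 to 1.0).
    """
    return toggle_rate * switched_capacitance_f * vdd ** 2 * clock_hz


# Placeholder values, assumed for illustration only.
C_TOTAL = 1e-9   # total switched capacitance in farads (hypothetical)
VDD = 1.0        # core voltage in volts (hypothetical)
CLOCK = 300e6    # clock rate in Hz (hypothetical)

p_full = dynamic_power_watts(1.00, C_TOTAL, VDD, CLOCK)
p_quarter = dynamic_power_watts(0.25, C_TOTAL, VDD, CLOCK)

# Power scales linearly with toggle rate at a fixed clock and voltage.
print(f"100% toggle draws {p_full / p_quarter:.1f}x the power of 25% toggle")
```

The same relation shows why derating the clock is the obvious knob when
the supply or the cooling cannot keep up.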

In short, this is very feasible and fast, but may require some REAL
worst-case engineering for larger FPGAs, and for some smaller ones.
Current FPGA vendors do not spec the worst-case numbers, so some careful
application/design-specific in-system measuring and derating will have
to occur for FPGA key cracking systems to be stable and error free.
On-die thermals will certainly have to be monitored, with external
over-temp measures implemented to keep from reducing an expensive FPGA
to burnt-out trash (I've created a few thermally failed Virtex FPGAs
already).
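
The over-temp logic can be sketched as a simple trip/resume guard with
hysteresis. This is only an illustration of the decision logic: the
class name and the thresholds are assumptions, and in a real system the
guard would live in an external supervisor (a microcontroller or a
hardware comparator) reading the die temperature sensor, not in a script.

```python
class ThermalGuard:
    """Disable the engines at trip_c and re-enable them only after the
    die has cooled to resume_c (hysteresis avoids rapid on/off cycling)."""

    def __init__(self, trip_c, resume_c):
        if resume_c >= trip_c:
            raise ValueError("resume_c must be below trip_c")
        self.trip_c = trip_c
        self.resume_c = resume_c
        self.enabled = True

    def update(self, die_temp_c):
        """Feed one temperature sample; return True if engines may run."""
        if self.enabled and die_temp_c >= self.trip_c:
            self.enabled = False
        elif not self.enabled and die_temp_c <= self.resume_c:
            self.enabled = True
        return self.enabled


# Hypothetical thresholds; take real limits from your own measurements.
guard = ThermalGuard(trip_c=85, resume_c=70)
samples = [60, 84, 85, 80, 71, 70, 60]
states = [guard.update(t) for t in samples]
print(states)
```

Once tripped at 85, the guard stays off through 80 and 71 and only
re-enables at 70, which is the behavior you want when the part heats
back up within seconds of being restarted.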

I haven't done the math for a couple of years, but the largest
Virtex-5 devices, with a design optimized for 5-LUTs and 6-LUTs and
aggressive fitting to use support logic (muxes, expander and carry
logic) in a digit serial configuration, should yield an aggregate
performance such that a hundred or two of these large FPGAs are as fast
as all the member processors in DNet. Maybe fewer FPGAs. The easiest way
to do bit/digit serial designs is as a schematic macro (core) that is
then hand/script tiled with LOCs and wired to common support logic with
a script-generated HDL high-level design. This is necessary to avoid
routing delays and routing congestion, and should produce a max clock
rate design (inside your power/cooling limits) that is nearly 100%
utilized. Anticipate that you will have some stability problems with
power/thermal doing this ... possibly fatal ones, and you may have to
derate the device to protect it from failure, or to get stable
operation.
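
The script tiling step can be sketched roughly as below: a small
generator that emits one UCF-style LOC line per macro instance on a
regular grid. The instance naming scheme, the macro footprint in slices,
and the grid size are all hypothetical, and the exact constraint form
(LOC on an anchor versus RLOC_ORIGIN on a relationally placed macro)
should be checked against the constraints guide for your tool version.

```python
def tile_locs(cols, rows, macro_w, macro_h, prefix="engine"):
    """Return UCF-style LOC lines placing a cols x rows grid of macros.

    macro_w and macro_h are the macro footprint in slices, so each
    instance is anchored at a multiple of that footprint.
    """
    lines = []
    for row in range(rows):
        for col in range(cols):
            x = col * macro_w
            y = row * macro_h
            # Instance names like engine_<col>_<row> are an assumption.
            lines.append(f'INST "{prefix}_{col}_{row}" LOC = SLICE_X{x}Y{y};')
    return lines


# Hypothetical 2 x 2 grid of macros that are 4 slices wide, 8 tall.
ucf_lines = tile_locs(cols=2, rows=2, macro_w=4, macro_h=8)
print("\n".join(ucf_lines))
```

The same loop can emit the matching instantiations for the high-level
HDL, so that the placement file and the netlist never drift apart.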

