[Hardware] Notes on packing serial cores into Xilinx devices

Elektron elektron_rc5 at yahoo.ca
Sun Aug 15 11:33:57 EDT 2004

On 13 Aug, 2004, at 23:20, Dan Oetting wrote:

> Using some reasonable value for cracking rate and operating cost for 
> some piece of hardware, what is the break-even acquisition cost for 
> that hardware. [PS: this is trick question, what is the trick?]

'reasonable value'. And the acquisition cost will end up being negative.

In times like this, I like to work backwards from what you can reliably 
estimate (how heavy the nuke in Armageddon was, how much energy you can 
get in a nuclear reaction, the speed you need to give the asteroid to 
make it pass the Earth if it blew up on the moon, the density of the 
moon) to calculate a ridiculous requirement (the asteroid is only 2 km 
a side). An asteroid that big would actually cause a mass-extinction, 
but they have to claim it was the size of Texas. There's also a huge 
plot hole ("Why the hell do we have a bomb that can split Texas in 

Now, you win $10000. Assume electricity costs an insanely low 1 cent 
per kWh. Assume equipment is free. So you have 1 million kWh (== 3.6e12 
J) to spend cracking 2^71 keys (since, on average, you search half the 
keyspace to find the result). That's 1.5e-9 Joules per key.

Assume we have a low threshold voltage of 0.3V (either germanium or 
silicon, I forget). That means you have 5.1e-9 Coulombs per key. That's 
3.2e10 electrons.

Let's estimate the minimum number of transistor*clocks you need to do 
RC5. Assume you need no bits of storage (which you don't if you're not 
pipelining). Skip this if you don't feel like counting transistors.
Two adds per S round. 405 transistors per add [1], so 810 for an 
S-round, so 63180 total.
Two 32-bit adds per L-round, so 810 transistors. A 5-bit add, so 54 
transistors [2]. The rotate costs 32 5-bit hard-coded comparisons, 
which should average 1.5*5=7.5 transistors per comparison [1], negated 
(another transistor), and fed to 32 transistors to select the correct 
rotate amount. 40.5 transistors, 32 comparisons, so 1296 transistors. 
2160 per L-round, 168480 total.
A 32-bit XOR per E-round, 64 transistors. A rotate, 1296 transistors. 
An add, 405 transistors. 1765 per E-round, 45890 total.
A 64-bit hard-coded comparison with the ciphertext, which has 43 
on-bits, so 64+43=107 transistors.

107+45890+168480+63180=277657 transistors. About 11400 electrons per 
transistor. This kind of efficiency isn't likely to happen any time 

N.B. You can probably do ANDs in one transistor and ORs in two, if you 
use a different type. It's close enough for an order-of-magnitude 
estimate though.

Interestingly, the chance of having 43 or more on-bits in the first 64 
bits of ciphertext is roughly 4/1000.

- Purr

[1] A "half-adder" has a XOR, and an AND. The XOR takes two 
transistors, the AND takes three. Five per half-adder, 32+31=63 
half-adders, 315 bits. Add the 30 OR gates, 30*3=90 transistors. 405 

[2] Similarly, 5+4=9 half-adders, 3 OR gates, so 9*5 + 3*3 = 54 

[3] Comparing numbers is easy. You have a VCC fed through n FETs. If 
any of them are 'on', the VCC is blocked, thus we're comparing with n 
zeroes. To compare with one-bits first negate the corresponding input 
with a FET. So 1 FET per compare with 0, 2 FETs per compare with 1.

More information about the Hardware mailing list