[Hardware] Has anyone made progress on that CUDA client?

John L. Bass jbass at dmsd.com
Fri Nov 21 12:15:44 EST 2008

Kigen wrote:
> As my subject states, has anyone made progress on that CUDA client?
> Cause NVIDIA was so kind as to release the "GPU" NVIDIA Tesla C1060
> Computing Processor.  It comes with 240 cores working @ 1.296Ghz each.
>  They also advocate using multiple ones, like sticking 4 of them
> together.
> You can search TigerDirect to find it.  Not the cheapest thing on the block.
> _______________________________________________
> Hardware mailing list
> Hardware at lists.distributed.net
> http://lists.distributed.net/mailman/listinfo/hardware
CUDA GPU engines are very interesting for a lot of reasons. Rather than 
waste it on stupid brute force trying to recover the key, I would 
suggest focusing instead on trying to recover the data with another attack.

Two years ago, while I was working with Guerric Meurice on the FPGA RC5 
engine I noticed that there were a high number of partial matches on the 
output while cycling the key counter. It turned out that was just a 
statistical artifact of running a counter, and "multiplying" it with the 
SBox key scheduling algorithm. The ability to "leak" information of a 
particular wrong SBox was very dependent on the input cipher data.

That discovery, lead to what I call the "Wheel of Fortune" attack on RC5 
and other SBox based ciphers. The game "Wheel of Fortune" is about 
completing a challenge based on minimal partial knowledge. Using similar 
strategies, we can deduce significant amounts of information from a 
large number of limited partial disclosures. The nature of CBC mode 
makes this possible, because each block is only on two cipher blocks and 
the SBox/key. We do not have to use the same flawed SBox for every 
block, but can pick and choose from a set of flawed SBoxes, and then use 
a higher scoring strategy to make sure the text is consistant (IE ...we 
can form a rational message from the parts), using a "Wheel of Fortune" 
approach to identify those missing/flawed parts and correct.

The data protocol for 802.11b uses a statistical approach to filtering 
noise, that is based on normalizing random the input energy from 
multiple frequency shifted samples to extract synchronized data sources 
because they add above the normalized noise floor.

This gives us an automated way to filter correct data leaked from the 
flawed SBoxes, by keeping a running statistical sum on both the SBox 
bits, and the "clear text" output bits, which we selectively enhance by 
choosing to sum only events which have a higher that average partial 
match during random perturbation of the SBox. I've tried several ways to 
randomly change the SBoxes, but finally settled on random key values and 
using the key scheduling algorithm as the primary means. This was easy 
because the rest of the hardware was already there from the RC5 brute 
force framework. All I needed to do was change the counter on the front 
end, and add scoring hardware on the back end ... then pass high scoring 
"keys" back up to the software for statistical analysis.

We can use two different knowledge sources to score bits. First we can 
score SBox/Output bits that also have a high match score on the first 
three blocks that exceeds more than a standard deviation or two 
(weighted by the std deviation). We can do the same for output bytes 
filtering for those that have two zeros in the high bits for these same 
better than average SBoxes. This is relatively easy to do ... mask off 
the unknown bits in blocks higher than the known group, XOR with the 
known values, invert, and count ones bits. In the FPGA that was 
relatively free, but shouldn't be that costly in a GPU.

I got mixed results with the limited testing I got done before RSA 
canceled the challenges. There are several ways to improve the process 
from here, that anyone that is really a crypto geek might enjoy. I 
personally believe RSA canceled the challenges in part because I tipped 
Burt and Ron that this might be possible, when I made inquires about 
getting credit for cracking RC5 based on recovering good data or SBoxes 
(see attached).

I've been side tracked doing factoring and EC stuff for a couple years, 
that recently created an overlap with the Wheel of Fortune strategy, and 
brought me back to it. If the 2007 code still exists, I will post it ... 
and give someone a head start, or maybe not, as that might force them 
into the same problems I was having in 2007 trying to improve the 
automated output correctness above 40-80%. Sometimes fresh ideas are 
better. I think I understand how to get it to converge better today ... 
and also have two new strategies for recovering the key, one based on 
this at the key level to try.

What really sucks, is that the RSA guys make a killing selling their 
company ($1.3B if I remember right), and then they expect everyone to do 
crypto research to improve their product for free. Like the $10,000 
prizes for RC5 keys would break the bank. Or even the better factoring 

If we can crack RC5 with any degree of reliability using attacks other 
than brute force, well, that evens the playing field a bit. The scary 
part, is if this can be made to reliably work, it takes out most of the 
known crypto algorithms. That should be worth some bucks to somebody ... 
maybe we can get DARPA or NSA to create a government challenge for this 
... with a $1M price tag ... that will get people working on cracking 
data, not stupid brute force key attacks.

The value of this attack if matured, is at least the value of the RSA 
sale ... so the asking price should be a significant part of the $1.3B 
if they want to "buy" the crack ... otherwise leak results, and delay 
the day that the script kiddies go after everyones data.

Have fun,

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: RSA-070207
Url: http://lists.distributed.net/pipermail/hardware/attachments/20081121/29da6f4a/attachment.pl 

More information about the Hardware mailing list