[Hardware] Has anyone made progress on that CUDA client?
John L. Bass
jbass at dmsd.com
Fri Nov 21 12:15:44 EST 2008
> As my subject states, has anyone made progress on that CUDA client?
> Cause NVIDIA was so kind as to release the "GPU" NVIDIA Tesla C1060
> Computing Processor. It comes with 240 cores working @ 1.296Ghz each.
> They also advocate using multiple ones, like sticking 4 of them
> You can search TigerDirect to find it. Not the cheapest thing on the block.
> Hardware mailing list
> Hardware at lists.distributed.net
CUDA GPU engines are very interesting for a lot of reasons. Rather than
waste it on stupid brute force trying to recover the key, I would
suggest focusing instead on trying to recover the data with another attack.
Two years ago, while I was working with Guerric Meurice on the FPGA RC5
engine I noticed that there were a high number of partial matches on the
output while cycling the key counter. It turned out that was just a
statistical artifact of running a counter, and "multiplying" it with the
SBox key scheduling algorithm. The ability to "leak" information of a
particular wrong SBox was very dependent on the input cipher data.
That discovery, lead to what I call the "Wheel of Fortune" attack on RC5
and other SBox based ciphers. The game "Wheel of Fortune" is about
completing a challenge based on minimal partial knowledge. Using similar
strategies, we can deduce significant amounts of information from a
large number of limited partial disclosures. The nature of CBC mode
makes this possible, because each block is only on two cipher blocks and
the SBox/key. We do not have to use the same flawed SBox for every
block, but can pick and choose from a set of flawed SBoxes, and then use
a higher scoring strategy to make sure the text is consistant (IE ...we
can form a rational message from the parts), using a "Wheel of Fortune"
approach to identify those missing/flawed parts and correct.
The data protocol for 802.11b uses a statistical approach to filtering
noise, that is based on normalizing random the input energy from
multiple frequency shifted samples to extract synchronized data sources
because they add above the normalized noise floor.
This gives us an automated way to filter correct data leaked from the
flawed SBoxes, by keeping a running statistical sum on both the SBox
bits, and the "clear text" output bits, which we selectively enhance by
choosing to sum only events which have a higher that average partial
match during random perturbation of the SBox. I've tried several ways to
randomly change the SBoxes, but finally settled on random key values and
using the key scheduling algorithm as the primary means. This was easy
because the rest of the hardware was already there from the RC5 brute
force framework. All I needed to do was change the counter on the front
end, and add scoring hardware on the back end ... then pass high scoring
"keys" back up to the software for statistical analysis.
We can use two different knowledge sources to score bits. First we can
score SBox/Output bits that also have a high match score on the first
three blocks that exceeds more than a standard deviation or two
(weighted by the std deviation). We can do the same for output bytes
filtering for those that have two zeros in the high bits for these same
better than average SBoxes. This is relatively easy to do ... mask off
the unknown bits in blocks higher than the known group, XOR with the
known values, invert, and count ones bits. In the FPGA that was
relatively free, but shouldn't be that costly in a GPU.
I got mixed results with the limited testing I got done before RSA
canceled the challenges. There are several ways to improve the process
from here, that anyone that is really a crypto geek might enjoy. I
personally believe RSA canceled the challenges in part because I tipped
Burt and Ron that this might be possible, when I made inquires about
getting credit for cracking RC5 based on recovering good data or SBoxes
I've been side tracked doing factoring and EC stuff for a couple years,
that recently created an overlap with the Wheel of Fortune strategy, and
brought me back to it. If the 2007 code still exists, I will post it ...
and give someone a head start, or maybe not, as that might force them
into the same problems I was having in 2007 trying to improve the
automated output correctness above 40-80%. Sometimes fresh ideas are
better. I think I understand how to get it to converge better today ...
and also have two new strategies for recovering the key, one based on
this at the key level to try.
What really sucks, is that the RSA guys make a killing selling their
company ($1.3B if I remember right), and then they expect everyone to do
crypto research to improve their product for free. Like the $10,000
prizes for RC5 keys would break the bank. Or even the better factoring
If we can crack RC5 with any degree of reliability using attacks other
than brute force, well, that evens the playing field a bit. The scary
part, is if this can be made to reliably work, it takes out most of the
known crypto algorithms. That should be worth some bucks to somebody ...
maybe we can get DARPA or NSA to create a government challenge for this
... with a $1M price tag ... that will get people working on cracking
data, not stupid brute force key attacks.
The value of this attack if matured, is at least the value of the RSA
sale ... so the asking price should be a significant part of the $1.3B
if they want to "buy" the crack ... otherwise leak results, and delay
the day that the script kiddies go after everyones data.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
More information about the Hardware