[Hardware] Notes... The case for an open client

Dan Oetting dan_oetting at uswest.net
Wed Aug 18 23:38:46 EDT 2004


On Aug 18, 2004, at 11:18 AM, david fleischer wrote:

> What is the point of having an error check if re-checking the key
> block will take the same amount of work as the key processing
> itself? Is this going to be a sampled check only? To have fail-safe
> checking it would be necessary to have a means of checking that
> every block was processed correctly.

Yes, the residual would primarily be used for checking random samples 
of results, but it would also be used for initial startup 
certification. Running test keys, as the current d.net client does, 
only tests the calculations for one key per block. A test with a 
residual result verifies the calculations for every key in the block. 
Software cores may be able to compensate by running tests on a large 
number of short blocks; hardware cores may not be able to run short 
blocks.

The partial-match key result has only about a 0.00000005% chance 
(roughly 2 in 2^32) of revealing a single-bit error while processing 
a block. Half of that chance is the false positive case, which is at 
least easy to check. The other half is the false negatives, which 
require reprocessing the block to detect. By retesting 1 in a 
thousand blocks with a residual check we would catch errors at a rate 
of about 1 in 1000, roughly a 2-million-fold increase in error 
detection.
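The back-of-envelope arithmetic behind those figures, as a runnable 
check:

    #include <stdio.h>

    int main(void)
    {
        /* chance an erroneous block is revealed by the partial-match
           result alone: about 2 * 2^-32 (false positive plus false
           negative) */
        double p_partial = 2.0 / 4294967296.0;

        /* chance it is revealed when 1 block in 1000 is retested with
           a residual check (the residual itself misses about 1 error
           in 2^32) */
        double p_retest = (1.0 / 1000.0) * (1.0 - 1.0 / 4294967296.0);

        printf("partial match: %.2e  retest: %.2e  gain: %.0fx\n",
               p_partial, p_retest, p_retest / p_partial);
        return 0;
    }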

It's not necessary for the processing to be fail-safe, because the 
contest itself is fail-safe. If we somehow get unlucky and miss the 
key because of a hardware glitch, we will know it and can begin a 
second search. The second search would begin by reprocessing the key 
space that was initially handled by the clients with the highest 
detected error rates. Then the key space of the clients with the 
highest undetected error rates would be reprocessed (remember, 
though, that each block reprocessed will change the estimate of the 
client's undetected error rate).
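That ordering could be as simple as a sort over per-client 
statistics. A sketch in C, with the structure and rate fields 
invented for illustration:

    #include <stdlib.h>

    struct client_stats {
        unsigned id;
        double detected_err_rate;   /* measured by failed residual retests */
        double undetected_err_rate; /* estimated from errors retests missed */
    };

    /* Reprocess the riskiest key space first: highest detected error
       rate, then highest undetected error rate. */
    static int riskier_first(const void *a, const void *b)
    {
        const struct client_stats *x = a, *y = b;
        if (x->detected_err_rate != y->detected_err_rate)
            return x->detected_err_rate < y->detected_err_rate ? 1 : -1;
        if (x->undetected_err_rate != y->undetected_err_rate)
            return x->undetected_err_rate < y->undetected_err_rate ? 1 : -1;
        return 0;
    }

    void order_for_second_search(struct client_stats *c, size_t n)
    {
        qsort(c, n, sizeof c[0], riskier_first);
    }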

> --- Elektron <elektron_rc5 at yahoo.ca> wrote:
>>>
>>> A CRC could be used instead of a sum for the residual. The CRC
>>> polynomial X^32+1 would be easy enough to implement in both
>>> hardware and software and offers the same distributive properties
>>> as addition.
>>
>> XORing is probably better than adding (except there's no carrying,
>> so there's a greater chance that an error in one bit could be
>> negated by a second error in the same bit).

With only 32 bits in the residual we are going to miss about 1 in 4 
billion errors anyway. Two independent errors conspiring to cancel 
each other out will be so rare compared to the individual error rates 
that we wouldn't be able to measure it. What we might need to be 
concerned with is a single error event that always manifests itself 
in a way that cancels out the change in the residual. One such event 
is a particle decay, or perhaps a cascade from a flaw in the silicon, 
that saturates a gate for 2 or more clock cycles so that the error 
shows up in two consecutive keys.

A single-bit error cannot propagate far before it gets split by a 
carry or rotated to a different position. Only the results of nearly 
the last operation are likely to carry the same change into the 
residual from one key to the next. The last operation in RC5 is:
	E[0] = rotl(E[0]^c,c)+S[24]
At the bit level the adder contains exclusive-or logic. Saturating an 
input of an exclusive-or can turn a 0 output into a 1 or a 1 into a 
0, depending on the state of the other input. So the result is two 
consecutive errors, caused by a single event, that have a real 
probability of canceling out even when an adder is used for the 
residual function.

An adder is probably better than an XOR for the residual function, 
but not significantly so.
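The two candidate residual update functions, side by side (a sketch):

    #include <stdint.h>

    /* XOR folding: the same bit flipped in two consecutive results
       always cancels exactly. */
    static inline uint32_t residual_xor(uint32_t r, uint32_t result)
    {
        return r ^ result;
    }

    /* Additive folding (mod 2^32): flipping bit k changes the sum by
       +2^k or -2^k, so a pair of errors only cancels when the two
       flips happen to move in opposite directions. */
    static inline uint32_t residual_add(uint32_t r, uint32_t result)
    {
        return r + result;
    }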


