[Hardware] Notes... The case for an open client

Dan Oetting dan_oetting at uswest.net
Wed Aug 18 11:44:36 EDT 2004

On Aug 16, 2004, at 1:01 PM, Elektron wrote:

>> Prior to the start of RC5-72, there was some internal discussion about
>> ignoring the mangling requirements and simply incrementing natively,
>> but it was decided to instead stay with standard RC5 mangling that had
>> been done in RC5-56 and RC5-64 rather than introduce any possible
>> problems.  There had been timings done of the performance difference
>> gained by removing the extra mangle/unmangle operations, and it turned
>> out to be relatively trivial (less than a few percent in keyrate, I
>> think).  Although speed improvements here and there are always good,
>> picking the safety of past implementation prevailed.
> This way, we can keep the first few S-boxes dependent on L[0] and L[1] 
> (I think). The 1/256 of the time that it overflows will always be a 
> problem, then be too much (though you could increment key.mid by 0x100 
> and check for an overflow, which would save you the 1/65536 and 1/16M 
> chance that other bytes overflow).

For reference, the 12 bytes in the RC5 key (in RSA order) are:

(msb)                                        (lsb)
 h3, h2, h1, h0 : m3, m2, m1, m0 : l3, l2, l1, l0
     key.hi     :     key.mid    :     key.lo

For RC5-72, h3, h2 and h1 are all zero, leaving 9 bytes or 72 bits.
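
As a rough illustration of the layout above, this Python sketch splits a 
12-byte key into the three words. The names split_key and is_rc5_72_key 
are hypothetical, and reading each word most significant byte first (h3 
as the top byte of key.hi) is an assumption taken from the diagram, not 
from any client source.

```python
def split_key(key_bytes):
    """Split a 12-byte key (h3 first, l0 last) into (hi, mid, lo) words."""
    if len(key_bytes) != 12:
        raise ValueError("expected 12 key bytes")
    # Assumption: within each word the leftmost byte is the most significant.
    hi = int.from_bytes(key_bytes[0:4], "big")
    mid = int.from_bytes(key_bytes[4:8], "big")
    lo = int.from_bytes(key_bytes[8:12], "big")
    return hi, mid, lo


def is_rc5_72_key(key_bytes):
    """True when h3, h2 and h1 are zero, leaving 72 significant bits."""
    return all(b == 0 for b in key_bytes[0:3])
```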

The key is used in Round 1 of the key expansion. key.lo is first used 
in stage 0, key.mid is first used in stage 1 and key.hi is first used 
in stage 2.
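
To make the stage ordering concrete, here is a minimal sketch of the 
first few stages of Round 1 using the standard RC5 32-bit mixing step. 
It assumes key.lo, key.mid and key.hi are passed in directly as the 
words L[0], L[1] and L[2]; any byte swapping needed to get them into 
that form is not modelled, and round1_prefix is a hypothetical helper 
name.

```python
P32 = 0xB7E15163  # RC5 magic constants for 32-bit words
Q32 = 0x9E3779B9
MASK = 0xFFFFFFFF


def rotl(x, n):
    """Rotate a 32-bit value left by n bits (n taken modulo 32)."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK


def round1_prefix(lo, mid, hi, stages):
    """Run the first `stages` stages of round 1 (at most 26) and return S[0:stages]."""
    s = [(P32 + i * Q32) & MASK for i in range(stages)]
    l = [lo, mid, hi]  # assumed to be L[0], L[1], L[2]
    a = b = 0
    for i in range(stages):
        # S[i] is updated first, using the previous A and B.
        a = s[i] = rotl((s[i] + a + b) & MASK, 3)
        # Then the key word for this stage is mixed: L[0], L[1], L[2], L[0], ...
        j = i % 3
        b = l[j] = rotl((l[j] + a + b) & MASK, a + b)
    return s
```

Under these assumptions S[0] is a constant, S[1] depends only on key.lo, 
S[2] depends on key.lo and key.mid, and key.hi first shows up in S[3]. 
That is why the order in which the bytes change affects how much of the 
expansion can be reused between keys.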

The least significant byte (l0) is the last byte d.net will change, so 
if we want to avoid overlapping work with d.net, that must also be the 
last byte that the hardware project changes.

Unless an overriding reason comes up for being different, we should 
stick with the d.net key order. A core will process a block by 
iterating through all 2^32 combinations of h0, m3, m2 and m1. A core 
that automatically processes sequential blocks should then increment 
the bytes in the following order: m0, l3, l2, l1.
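
The increment order can be sketched as a byte-wise counter with carry. 
This is only an illustration: the list representation, the helper names 
and the assumption that h0 changes fastest inside a block are not 
spelled out above. The carry into l0 shown here is the optional case.

```python
# Byte names from fastest-changing to slowest-changing.
# Assumption: within a block, h0 changes fastest and m1 slowest.
KEY_ORDER = ["h0", "m3", "m2", "m1", "m0", "l3", "l2", "l1", "l0"]
BLOCK_BYTES = 4  # h0, m3, m2, m1 are exhausted inside one block


def increment(key, start=0):
    """Add 1 at position `start`, carrying toward l0.

    `key` is a list of 9 byte values in KEY_ORDER.
    Returns (new_key, wrapped).
    """
    out = list(key)
    for i in range(start, len(out)):
        out[i] = (out[i] + 1) & 0xFF
        if out[i] != 0:
            return out, False
    return out, True  # every byte from `start` onward wrapped to zero


def next_block(key):
    """Advance to the next sequential block: m0 first, then l3, l2, l1, l0."""
    return increment(key, start=BLOCK_BYTES)
```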

For each block processed the core must return the residual check and 
the number of partial match keys found within the block. The core 
should also return as many of the partial match keys as practical.

The residual check is the arithmetic sum of the low word of encrypted 
text produced by each key in the block. A partial match key is one that 
produces a low word of encrypted text that matches the target cypher 
text.
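
A minimal software model of what a core returns per block might look 
like the following. The function names, the max_reported limit and the 
toy cipher are all hypothetical stand-ins; in particular toy_cipher is 
not RC5, and the sum is assumed to wrap modulo 2^32.

```python
MASK32 = 0xFFFFFFFF


def process_block(keys, encrypt_low_word, target_low, max_reported=4):
    """Return (residual, match_count, reported_matches) for one block."""
    residual = 0
    match_count = 0
    reported = []
    for key in keys:
        low = encrypt_low_word(key) & MASK32
        # Assumption: the arithmetic sum wraps modulo 2**32.
        residual = (residual + low) & MASK32
        if low == target_low:
            match_count += 1
            # Keep only as many partial match keys as is practical.
            if len(reported) < max_reported:
                reported.append(key)
    return residual, match_count, reported


def toy_cipher(key):
    """Stand-in for the real encryption; NOT RC5, only for the demo."""
    return (key * 0x9E3779B1) & MASK32
```

Because the residual is a plain sum, the residuals of two adjacent 
ranges add up to the residual of the combined range, which is what 
makes it usable for spot checks on partial ranges.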

With the key order settled, we are ready to run. The first person with 
a hardware core built and tested gets to pick l0 for the first hardware 
super block. Until a key server is available, just log your blocks 
locally, try to complete contiguous ranges of blocks, and report which 
key range you have and plan to process so other hardware operators 
won't overlap your work.

If implemented in non-reprogrammable hardware, give some thought to 
passing in the bytes h1, h2 and h3 as constants instead of hardcoding 
0's. This way your great-grandchildren may be able to dust off the old 
silicon to use in the next contest.

Incrementing l0 (and maybe l1 and l2) is optional. If they are not 
incremented, the core will recycle within the same super block until 
restarted with a new set of parameters.

A CRC could be used instead of a sum for the residual. The CRC 
polynomial X^32+1 would be easy enough to implement in both hardware 
and software and offers the same distributive properties as addition. 
It would be nice if everyone used the same residual function, but it's 
not absolutely necessary. The residual is used for local testing and 
random verification.
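
One reading of the X^32+1 suggestion, sketched in Python: a bitwise 
remainder over GF(2) with no initial value and no bit reflection (both 
are assumptions, since neither is specified above). With that 
polynomial, x^32 reduces to 1, so for whole 32-bit words the result is 
just the XOR of the words. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def crc_x32_plus_1(words, crc=0):
    """Bitwise remainder of the word stream modulo x^32 + 1 over GF(2)."""
    for word in words:
        for bit_index in range(31, -1, -1):  # most significant bit first
            bit = (word >> bit_index) & 1
            carry = crc >> 31                # coefficient that becomes x^32
            crc = ((crc << 1) & MASK32) | bit
            crc ^= carry                     # x^32 reduces to 1: fold into bit 0
    return crc


def xor_fold(words):
    """Plain XOR of all words; matches crc_x32_plus_1 for whole 32-bit words."""
    result = 0
    for word in words:
        result ^= word
    return result
```

Under these assumptions the residuals of two ranges combine with XOR in 
the same way that sums combine with addition.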
