[Hardware] Notes... The case for an open client

Jeff Lawson bovine at distributed.net
Mon Aug 16 14:26:17 EDT 2004

> >
> > Do we accept the extra overhead for compatibility or find the easiest
> > possible implementation for the cores? Adaption of the software cores
> > to join the hardware project will be relatively easy and may even make
> > them faster.
> What's preventing us from switching from byte-reversed to 
> non-byte-reversed? (Why the hell is it byte-reversed anyway? Is it 
> advantageous to do strange things to the key?)

The RC5 algorithm requires keys in reversed order:

             key.hi key.mid   key.lo
 unmangled       AB:CDEFGHIJ:KLMNOPQR
 mangled         QR:OPMNKLIJ:GHEFCDAB

"Even if it looks like a little/big endian problem, it isn't. Whatever
endianness the underlying system has, we must swap every byte in the key
before sending it to rc5_pub_data.unit_func()."
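
As a rough illustration of the diagram above (a sketch only, not the
client's actual code), the mangling can be written as a full byte reversal
of the 9-byte key. The function name mangle_key and the (hi, mid, lo) tuple
layout are assumptions made for this example; hi is taken to be 8 bits wide
and mid and lo 32 bits each.

```python
def mangle_key(hi, mid, lo):
    """Byte-reverse a 72-bit key given as (hi: 8 bits, mid: 32 bits, lo: 32 bits).

    Hypothetical helper for illustration; not taken from the client source.
    """
    # Pack the three fields into one 72-bit integer: AB:CDEFGHIJ:KLMNOPQR
    key = ((hi & 0xFF) << 64) | ((mid & 0xFFFFFFFF) << 32) | (lo & 0xFFFFFFFF)
    # Reading the big-endian bytes back as little-endian reverses all 9 bytes.
    rev = int.from_bytes(key.to_bytes(9, "big"), "little")
    # Split the reversed value back into fields: QR:OPMNKLIJ:GHEFCDAB
    return (rev >> 64) & 0xFF, (rev >> 32) & 0xFFFFFFFF, rev & 0xFFFFFFFF


if __name__ == "__main__":
    hi, mid, lo = mangle_key(0x01, 0x23456789, 0xABCDEF10)
    print(f"{hi:02X}:{mid:08X}:{lo:08X}")  # 10:EFCDAB89:67452301
```

Because the operation is a plain reversal, applying it twice returns the
original key, so the same routine serves for unmangling.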

Prior to the start of RC5-72, there was some internal discussion about
ignoring the mangling requirements and simply incrementing natively, but it
was decided to stay with the standard RC5 mangling that had been used in
RC5-56 and RC5-64 rather than risk introducing new problems.  Timings had
been done of the performance gained by removing the extra mangle/unmangle
operations, and the difference turned out to be relatively trivial (less
than a few percent in keyrate, I think).  Although speed improvements here
and there are always good, picking the safety of past implementation
seemed the better choice.

> Also, isn't it possible to recode the client to internally not bother
> (much) with byte-reverse adds (until it overflows by 2^32)? Aren't all
> blocks 2^32, at the moment, anyway? (I don't see an option for it)

I think there is already some expectation that cores can do their own
incrementing order inside calls to unit_func(), but once unit_func()
returns, the client expects all keys in the mangle-incremented range to have
been checked.

Originally, preserving the same strict incrementing order between the
cores was very important, since the client checkpoints and your work could
be resumed by a potentially different core (which might have a different
idea of how to increment).  However, in recent times the client will restart
a block if it sees that the checkpoint data was written by a different
client version or cpu/core identifier.

In any case, changing the incrementing of the "high" bits much above 32 bits
would not really be possible without (effectively) discarding all of the
work that has already been done.

