[Hardware] partially unrolled?
Dan Oetting
dan_oetting at qwest.net
Tue Nov 28 11:49:36 EST 2006
On Nov 27, 2006, at 5:23 PM, John L. Bass wrote:
> Can someone post what the Dnet core counting sequence, work
> assignments, and work checking requirements are for a valid DNet
> core to participate?
The best source to answer these questions is the reference core in
the d.net source code <http://distributed.net/source/>.
The key is incremented starting with the most significant byte. This
allows some optimization in the first few stages for the software
cores. The hardware cores will save power and may save a few gates if
you can load the pre-computed constants.
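A minimal sketch of that counting order in Python. It assumes the key is
held as a 9-byte array (72 bits) with index 0 treated as the most
significant byte; the real byte layout is whatever the reference core
defines, and increment_key is a hypothetical name used only here.

```python
def increment_key(key: bytearray) -> None:
    """Increment the key in place, starting at index 0.

    Assumption: index 0 is the "most significant" byte, so it changes
    on every step and a carry ripples toward higher indices.
    """
    for i in range(len(key)):
        key[i] = (key[i] + 1) & 0xFF
        if key[i] != 0:
            return  # no carry out of this byte, done
    # Falling out of the loop means every byte wrapped around to zero.


key = bytearray(9)          # 72-bit key, all zero
increment_key(key)          # only byte 0 changes
print(key.hex())
```

Because only one byte changes on most steps, any work that depends
solely on the slower-changing bytes can be computed once and reused.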
Work assignments were originally handed out in blocks of 2^28 keys.
This may have been increased to 2^32 keys. Hardware cores will want
much larger blocks or you will spend all of your time restarting the
core.
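To put rough numbers on that, here is a small Python calculation of how
long one block lasts. The key rate of 1e9 keys per second is purely a
hypothetical figure for a hardware core, and the 2^40 block size is only
an illustration of "much larger"; neither comes from d.net.

```python
def block_seconds(log2_keys: int, keys_per_second: float) -> float:
    """Time to exhaust a block of 2**log2_keys keys at a given rate."""
    return (1 << log2_keys) / keys_per_second


RATE = 1e9  # hypothetical hardware rate in keys per second

for bits in (28, 32, 40):
    print(f"2^{bits} keys: {block_seconds(bits, RATE):.2f} s")
```

At that assumed rate a 2^28 block is finished in about a quarter of a
second, so any fixed restart cost per block quickly dominates.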
The work checking is still rather poor, but I guess they didn't want
to waste 1 extra clock cycle generating the check code. The current
check code according to r75-ref.cpp is to count the number of partial
matches in the low 32 bits of the cypher text and return the key for
the last such match.
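A rough Python sketch of that check as described above, not a copy of
the reference core. scan_block and encrypt_lo are hypothetical names;
encrypt_lo stands in for whatever produces the low 32 bits of the
cypher text for a key, and keys are walked in plain integer order
rather than in the d.net counting order.

```python
from typing import Callable, Optional, Tuple


def scan_block(
    start: int,
    count: int,
    encrypt_lo: Callable[[int], int],
    target_lo: int,
) -> Tuple[int, Optional[int]]:
    """Count partial matches and remember the key of the last one."""
    matches = 0
    last_key = None
    for key in range(start, start + count):
        # A partial match is agreement on the low 32 bits only.
        if (encrypt_lo(key) & 0xFFFFFFFF) == target_lo:
            matches += 1
            last_key = key
    return matches, last_key
```

Note that only the last matching key survives; earlier partial matches
show up in the count but cannot be identified from the result.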
The client uses a set of test parameters to verify that the core is
working properly. The test parameters represent a set of short work
units with known results. These tests are defined in selftest.cpp and
for rc5-72 start on a 2^16 key boundary and run for 2^17 keys.
A better check would be to return every key that has a partial match.
There is no need then to even generate the high 32 bits of the cypher
text since the partial matches can be rechecked in software by the
client or preferably by the server after the results have been
logged. One of my software cores used this optimization because I ran
out of registers to save the last output of round 2. I just had the
fast core find the keys with a partial match and used a slow core to
reprocess those keys to look for a full match. This saves about 1 or 2%.
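The split between a fast filter and a slow recheck could look roughly
like this in Python. All names are hypothetical: encrypt_lo stands in
for a fast core that only produces the low 32 bits, and encrypt_full
for a slow core that returns both halves as a (low, high) pair.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def fast_pass(
    keys: Iterable[int],
    encrypt_lo: Callable[[int], int],
    target_lo: int,
) -> List[int]:
    """Return every key whose low 32 bits match the target."""
    return [k for k in keys if encrypt_lo(k) == target_lo]


def slow_pass(
    candidates: Iterable[int],
    encrypt_full: Callable[[int], Tuple[int, int]],
    target_lo: int,
    target_hi: int,
) -> Optional[int]:
    """Recheck the candidates and return the first full match, if any."""
    for k in candidates:
        lo, hi = encrypt_full(k)
        if lo == target_lo and hi == target_hi:
            return k
    return None
```

The slow pass only ever sees the few candidates the fast pass lets
through, so it can run in software on the client or on the server.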
A more robust check to ensure that no bits are getting lost at any
time would be to accumulate an xor sum of the low 32 bits of every
cypher text result. This sum would need to be read out after each
block is processed.
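A sketch of such an accumulator in Python, again with hypothetical
names: encrypt_lo stands in for the core's low 32-bit output, and keys
are walked in plain integer order for simplicity.

```python
from typing import Callable


def xor_checksum(
    start: int,
    count: int,
    encrypt_lo: Callable[[int], int],
) -> int:
    """XOR together the low 32 bits of every result in the block."""
    acc = 0
    for key in range(start, start + count):
        acc ^= encrypt_lo(key) & 0xFFFFFFFF
    return acc
```

A single flipped bit in any one result flips the same bit of the sum,
so comparing the sum against a second run of the same block exposes it.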
-- Dan O.