[Hardware] The market of ASICs (One GigaKey / Second?)
Dan Oetting
dan_oetting at uswest.net
Mon Aug 9 12:05:21 EDT 2004
On Aug 8, 2004, at 9:33 PM, Elektron wrote:
> That said, memory access is not a major bottleneck (on a PowerPC),
> since you can load into a temporary register several instructions
> ahead with minimal penalty, so it may be faster to keep the S-boxes in
> RAM, preload them, and use the extra registers for another pipe (the
> bottleneck here is that the PPC can only load once per clock cycle).
I tried that. With a parallel core I was never able to schedule all the
loads and stores so that they didn't block and slow down the pipe. What
I did hit on is that the PPC has enough registers that you don't have
to write anything to RAM. Since Round 3 and the encryption free up 1 S
register each stage, and Round 1 uses 1 more S register each stage, the
two can be wrapped together so that processing of the next key starts
while the previous key is being finished. It worked out that Rounds 1
and 3 could be perfectly meshed to keep the execution units busy and
every dispatch cycle full. Round 2 had 1 dispatch hole per stage, and
all but 2 of these were filled with housekeeping instructions such as
incrementing the key for the next iteration.
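
The register bookkeeping behind that overlap can be sketched as a small
model. This is only an illustration of the counting argument, not the
actual core: the function name live_s_registers and the table size of 26
S words are assumptions made for the example and do not come from the
post. At each stage the key being finished has retired one more S
register and the key being started has claimed one more, so the total
number of live S registers should stay constant.

```python
def live_s_registers(num_s_words):
    """Model S-register occupancy while Round 3 of one key overlaps Round 1 of the next.

    Returns a list of (stage, finishing_key_live, next_key_live, total) tuples.
    """
    rows = []
    for stage in range(num_s_words + 1):
        # Round 3 plus encryption has freed `stage` S registers of the old key.
        finishing_key_live = num_s_words - stage
        # Round 1 of the next key has filled `stage` S registers.
        next_key_live = stage
        rows.append((stage, finishing_key_live, next_key_live,
                     finishing_key_live + next_key_live))
    return rows


# Assumed S-table size, chosen only for illustration.
NUM_S_WORDS = 26

if __name__ == "__main__":
    for stage, old, new, total in live_s_registers(NUM_S_WORDS):
        print(f"stage {stage:2d}: old key {old:2d}  next key {new:2d}  total {total:2d}")
```

Under this model the two keys together never need more S registers than
one key does on its own, which is why nothing has to be spilled to RAM
as long as a single table fits in the register file.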