[Hardware] The market of ASICs (One GigaKey / Second?)

Dan Oetting dan_oetting at uswest.net
Mon Aug 9 12:05:21 EDT 2004

On Aug 8, 2004, at 9:33 PM, Elektron wrote:

> That said, memory access is not a major bottleneck (on a PowerPC), 
> since you can load into a temporary register several instructions 
> ahead with minimal penalty, so it may be faster to keep the S-boxes in 
> RAM, preload them, and use the extra registers for another pipe (the 
> bottleneck here is that the PPC can only load once per clock cycle).

I tried that. With a parallel core I was never able to schedule all the 
loads and stores so that they didn't block and slow down the pipe. What 
I did hit on is that the PPC has enough registers that you don't have to 
write anything to RAM. Since Round 3 and the encryption free up 1 S 
register each stage, and Round 1 uses 1 more S register each stage, they 
can be wrapped together to start processing the next key while the 
previous key is being finished. It worked out that Rounds 1 and 3 could 
be perfectly meshed to keep the execution units busy and every dispatch 
cycle full. Round 2 had 1 dispatch hole per stage, and all but 2 of 
these were filled with housekeeping instructions such as incrementing 
the key for the next iteration.
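
To make the register argument concrete, here is a rough sketch of the
wrapping in Python. It is not the actual core. It assumes RC5-32/12
with the key given as a list of 32-bit words (three words, i.e. a
72-bit key, is my assumption), and every name in it is made up for
illustration. The 26-entry list S stands in for the 26 S registers.
It models only the data dependencies, not PPC dispatch or timing.

```python
MASK = 0xFFFFFFFF
P32, Q32 = 0xB7E15163, 0x9E3779B9   # standard RC5-32 magic constants
ROUNDS = 12
T = 2 * (ROUNDS + 1)                # 26 S words

def rotl(x, n):
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK

def reference_encrypt(key_words, pt):
    """Plain RC5-32/12: full key setup, then encryption (len(key) <= 26)."""
    S = [(P32 + i * Q32) & MASK for i in range(T)]
    L = list(key_words)
    a = b = 0
    for k in range(3 * T):          # three rounds over the S array
        i, j = k % T, k % len(L)
        a = S[i] = rotl((S[i] + a + b) & MASK, 3)
        b = L[j] = rotl((L[j] + a + b) & MASK, a + b)
    x, y = (pt[0] + S[0]) & MASK, (pt[1] + S[1]) & MASK
    for r in range(1, ROUNDS + 1):
        x = (rotl(x ^ y, y) + S[2 * r]) & MASK
        y = (rotl(y ^ x, x) + S[2 * r + 1]) & MASK
    return x, y

def stage(st, s):
    """One key-setup stage for key state st on S word s; returns new S word."""
    L, j = st["L"], st["j"]
    a = rotl((s + st["a"] + st["b"]) & MASK, 3)
    b = rotl((L[j] + a + st["b"]) & MASK, a + st["b"])
    L[j], st["a"], st["b"], st["j"] = b, a, b, (j + 1) % len(L)
    return a

def absorb(i, s, x, y, pt):
    """Feed round-3 S word number i straight into the encryption."""
    if i == 0:
        return (pt[0] + s) & MASK, y
    if i == 1:
        return x, (pt[1] + s) & MASK
    if i % 2 == 0:
        return (rotl(x ^ y, y) + s) & MASK, y
    return x, (rotl(y ^ x, x) + s) & MASK

def pipelined_encrypt(keys, pt):
    """Round 3 + encryption of one key meshed with round 1 of the next."""
    S = [0] * T                     # the only S storage; nothing is spilled
    out, prev = [], None
    for key in list(keys) + [None]: # trailing None drains the last key
        cur = None if key is None else {"L": list(key), "a": 0, "b": 0, "j": 0}
        x = y = 0
        for i in range(T):
            if prev is not None:    # round 3: last read of S[i] for old key
                x, y = absorb(i, stage(prev, S[i]), x, y, pt)
            if cur is not None:     # round 1: first write of S[i] for new key
                S[i] = stage(cur, (P32 + i * Q32) & MASK)
        if prev is not None:
            out.append((x, y))
        if cur is not None:
            for i in range(T):      # round 2: all 26 S words are live
                S[i] = stage(cur, S[i])
        prev = cur
    return out
```

In the meshed loop, stage i of round 3 is the last reader of slot i and
stage i of round 1 is its first writer (round 1 only needs the constant
P32 + i*Q32 as input), so the count of live S words stays at 26 for the
two keys combined. Comparing pipelined_encrypt(keys, pt) against
reference_encrypt(k, pt) for each key is an easy sanity check.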

