[Hardware] The market of ASICs (One GigaKey / Second?)

jbass at dmsd.com jbass at dmsd.com
Mon Aug 9 13:05:03 EDT 2004

"Dan Oetting" <dan_oetting at uswest.net> writes:
> I tried that. With a parallel core I was never able to schedule all the 
> loads and stores to keep from blocking and slowing down the pipe. What 
> I did hit on is that the PPC has enough registers so you don't have to 
> write anything to RAM. Since Round 3 and the encryption free up 1 S 
> register each stage and Round 1 uses 1 more S register each stage they 
> can be wrapped together to start processing the next key as the 
> previous key is being finished. It worked out that Rounds 1 and 3 could 
> be perfectly meshed to keep the execution units busy and every dispatch 
> cycle full. Round 2 had 1 dispatch hole per stage and all but 2 of 
> these were filled with housekeeping instructions such as incrementing 
> the key for the next iteration.

Awesome!! I don't meet many cycle counters these days. I had almost thought
that was becoming a lost art! I dropped out of that biz some six years back
because nobody was willing to pay for performance anymore.

On the PIIIs and later, how close did you get? What's the speedup over gcc's
best effort of:

        leal    (%eax,%esi), %ebx
        andl    $31, %ecx
        addl    -488(%ebp), %ecx
        leal    (%edi,%ebx), %esi
        movl    %eax, -328(%ebp)
        movl    -488(%ebp), %eax
        roll    %cl, %esi
        addl    -428(%ebp), %eax
        addl    %esi, %eax
        roll    $3, %eax
        movl    %esi, %ecx
        movl    %eax, -492(%ebp)

        leal    (%eax,%edi), %ebx
        andl    $31, %ecx
        addl    -492(%ebp), %ecx
        leal    (%esi,%ebx), %edi
        movl    %eax, -324(%ebp)
        movl    -492(%ebp), %eax
        roll    %cl, %edi
        addl    -424(%ebp), %eax
        addl    %edi, %eax
        roll    $3, %eax
        movl    %edi, %ecx
        movl    %eax, -496(%ebp)

for each stage?
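
For anyone reading along without the source handy, here is roughly what each
of those stages computes, written out in C. This is only a sketch, assuming
the standard RC5-32 key mixing step (A = S[i] = ROTL(S[i]+A+B, 3), then
B = L[j] = ROTL(L[j]+A+B, A+B)), which is what the listing above appears to
implement. The names rotl32, rc5_mix_stage and rc5_key_mix are made up for
illustration and are not taken from any client source.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate left by n bits. Masking n to 0..31 mirrors what x86 "roll %cl"
 * does in hardware, which is why the (A+B)&31 in the listing is cheap. */
static uint32_t rotl32(uint32_t x, uint32_t n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

/* One key-mixing stage (hypothetical helper name):
 *   A = S[i] = ROTL(S[i] + A + B, 3)
 *   B = L[j] = ROTL(L[j] + A + B, A + B)
 * The second line uses the freshly updated A. */
static void rc5_mix_stage(uint32_t *a, uint32_t *b,
                          uint32_t *s_i, uint32_t *l_j)
{
    uint32_t new_a = rotl32(*s_i + *a + *b, 3);
    uint32_t sum   = new_a + *b;
    uint32_t new_b = rotl32(*l_j + sum, sum);

    *s_i = new_a;
    *l_j = new_b;
    *a   = new_a;
    *b   = new_b;
}

/* Full mixing loop: 3 * max(t, c) stages over S[0..t-1] and L[0..c-1].
 * Assumption: with t = 26 and c = 3 this is three passes over S, which
 * is presumably what "Round 1/2/3" refers to in the quoted text. */
static void rc5_key_mix(uint32_t *s, size_t t, uint32_t *l, size_t c)
{
    uint32_t a = 0, b = 0;
    size_t i = 0, j = 0;
    size_t n = 3 * (t > c ? t : c);

    for (size_t k = 0; k < n; k++) {
        rc5_mix_stage(&a, &b, &s[i], &l[j]);
        i = (i + 1) % t;
        j = (j + 1) % c;
    }
}
```

In the gcc output the two adds feeding each rotate, the rotate itself, and the
spill of S[i] to the stack are all visible per stage. A hand-scheduled core
would presumably keep A, B and the L words in registers and unroll the index
arithmetic away entirely.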

I did a long Google search for serial RC5 cores yesterday. It doesn't seem
that anybody wants to publish any respectable RC5 cores at all, even though
RC5 now appears to be standard fare as an FPGA class assignment and a number
of people are trying to do thesis work around it.

