[RC5] AMD core

Zypher rc5 at planetfortress.com
Wed Feb 16 09:51:46 EST 2000

I can give a little "proof" that RC5 can somehow be made faster on K7s...

First off, I regularly run either RC5 or Gamma Flux on my Athlon system...

I know my usual rates when nothing else is running...they're pretty static,
since I leave it on overnight.

THEN, launch both programs...let them go through a few blocks.

Compare the rates...to put it simply, they add up to more than 100%. (GF
takes more, but RC5 does not slow down as much as GF 'takes'.)
They come to about 120%...and both are the type of program that soaks up
nearly 100% of idle CPU time. Going by my very limited knowledge of CPUs,
I'd have to say each is utilizing different parts of the core, and I know
the K7 core was made to do as many instructions per cycle as was feasible...
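
To make the "adds up to about 120%" arithmetic explicit, here is a rough
sketch in C. The function name and the idea of feeding it rates by hand are
my own assumptions for illustration; neither client exposes anything like
this, and no real measurements are included.

```c
#include <assert.h>

/* Hypothetical helper: combined throughput of two programs sharing one
   CPU, as a fraction of what each achieves when running alone
   (1.0 == 100%). Anything above 1.0 suggests the two are not competing
   for exactly the same execution resources. */
double combined_fraction(double a_alone, double a_shared,
                         double b_alone, double b_shared)
{
    /* Standalone rates must be positive for the ratios to make sense. */
    assert(a_alone > 0.0 && b_alone > 0.0);
    return a_shared / a_alone + b_shared / b_alone;
}
```

With made-up rates, say one program dropping from 100 to 50 units/s and the
other from 200 to 140 units/s, the result is roughly 1.2, i.e. the 120%
kind of figure described above.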

So if each is getting work out of 'another part' of the CPU, then they can
both be made to use that other part to some extent. I'll leave GF to
dcypher, but RC5 is on-topic here ;)

I think the real task is getting a decent amount of work done by writing the
right code. Certainly beyond me, though I'm glad to see there are others on it.
I live for speed in all its forms.

rc5 at planetfortress.com
> >Has anyone looked further into the AMD core optimisation as discussed in:
> >http://lists.distributed.net/hypermail/rc5.Oct1999/0000.html
> >It looked very promising to me, but I'm afraid I have neither the
> >programming skill nor the time to look into it. If anyone should however
> >know about some good literature about this subject, I would be glad to
> >hear about it.
> I have started some work on theoretical code.  The main change to the
> original idea is that I have decoupled the round 3 key expansion and
> encryption phases and will attempt to do most of the encryption using MMX
> instructions while the integer code does the key expansion for the next
> keys.  Originally I had MMX instructions doing part of round 1 of the key
> expansion of the next two keys while the integer code finished it off.
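
For anyone wondering what "decoupling" key expansion from encryption means
in the quoted reply, here is a plain scalar C sketch of RC5-32/12/8 with
the two phases as separate functions. This is NOT the MMX code being
described and not the distributed.net core; it only shows that encryption
needs nothing but the finished S table, which is what would let one unit
encrypt with the current key while another expands the next. The function
names are my own, and I am assuming the key is already loaded as two
32-bit words.

```c
#include <assert.h>
#include <stdint.h>

#define RC5_ROUNDS 12                  /* r in RC5-32/12/8 */
#define RC5_T (2 * (RC5_ROUNDS + 1))   /* 26 subkey words */
#define RC5_C 2                        /* 8-byte key = two 32-bit words */

static uint32_t rotl32(uint32_t x, uint32_t n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

static uint32_t rotr32(uint32_t x, uint32_t n)
{
    n &= 31;
    return (x >> n) | (x << ((32 - n) & 31));
}

/* Phase 1: key expansion. Three passes over S (3 * 26 = 78 steps). */
void rc5_expand_key(const uint32_t key[RC5_C], uint32_t S[RC5_T])
{
    uint32_t L[RC5_C] = { key[0], key[1] };
    uint32_t A = 0, B = 0;
    int i, j, k;

    assert(RC5_T >= RC5_C);            /* so 3 * RC5_T is the step count */
    S[0] = 0xB7E15163u;                /* P32 */
    for (i = 1; i < RC5_T; i++)
        S[i] = S[i - 1] + 0x9E3779B9u; /* Q32 */

    for (k = 0, i = 0, j = 0; k < 3 * RC5_T; k++) {
        A = S[i] = rotl32(S[i] + A + B, 3);
        B = L[j] = rotl32(L[j] + A + B, A + B);
        i = (i + 1) % RC5_T;
        j = (j + 1) % RC5_C;
    }
}

/* Phase 2: encryption. Reads only the finished S table. */
void rc5_encrypt(const uint32_t S[RC5_T], const uint32_t pt[2],
                 uint32_t ct[2])
{
    uint32_t A = pt[0] + S[0], B = pt[1] + S[1];
    int i;

    for (i = 1; i <= RC5_ROUNDS; i++) {
        A = rotl32(A ^ B, B) + S[2 * i];
        B = rotl32(B ^ A, A) + S[2 * i + 1];
    }
    ct[0] = A;
    ct[1] = B;
}

/* Inverse of rc5_encrypt, included only so the sketch can be checked. */
void rc5_decrypt(const uint32_t S[RC5_T], const uint32_t ct[2],
                 uint32_t pt[2])
{
    uint32_t A = ct[0], B = ct[1];
    int i;

    for (i = RC5_ROUNDS; i >= 1; i--) {
        B = rotr32(B - S[2 * i + 1], A) ^ A;
        A = rotr32(A - S[2 * i], B) ^ B;
    }
    pt[0] = A - S[0];
    pt[1] = B - S[1];
}
```

The real cores interleave the last expansion pass with the encryption
rounds to save work; keeping them apart as above costs a little, but it is
what makes it possible to overlap two different keys in the first place.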

