[RC5] Performance on P-IV

Peter Cordes peter at llama.nslug.ns.ca
Fri May 11 16:00:04 EDT 2001

On Thu, May 10, 2001 at 05:18:55PM -0400, James Sharp wrote:
> How exactly does the G4/AltiVec code work?  RC5 isn't truly vectorizable,
> because of inter-loop dependencies.  Does the dnet stuff break it down
> into parallel tasks using the vector registers each as a separate
> pseudo-processor?

 The source for the AltiVec core is in the source tarball, which you can
download.  IIRC, there are a few comments in the code.

 I might be wrong about this, but I think it takes advantage of having lots
of registers by working on multiple keys at the same time, instead of
overlapping multiple stages of the same key.  The latter is not very easy:
each step depends on the result of the one before it, and the designers of
the algorithm were (more or less) trying to stop you from figuring out how
to optimize away steps in the loop.
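
 To make the "multiple keys at once" idea concrete, here is a rough sketch in
plain C.  It is not the dnet AltiVec core (I haven't copied anything from
it); it just runs RC5-32/12/8 for one key, and then for LANES keys in
lockstep, with arrays standing in for vector registers.  The names rc5_one,
rc5_lanes and LANES are made up for this example, and LANES = 4 assumes four
32-bit words per 128-bit register.

```c
#include <stdint.h>

#define ROUNDS 12                 /* RC5-32/12/8 */
#define T (2 * (ROUNDS + 1))      /* subkey table size: 26 words */
#define LANES 4                   /* assumption: four 32-bit lanes per vector */
#define P32 0xB7E15163u           /* RC5 magic constants for 32-bit words */
#define Q32 0x9E3779B9u

/* Rotate left by n mod 32; safe for n == 0. */
static uint32_t rotl(uint32_t x, uint32_t n) {
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

/* One key at a time: every step needs the result of the step before it,
 * so there is nothing to spread across vector lanes. */
static void rc5_one(const uint32_t key[2], const uint32_t pt[2], uint32_t ct[2]) {
    uint32_t S[T], L[2] = { key[0], key[1] }, A = 0, B = 0;

    S[0] = P32;
    for (int i = 1; i < T; i++) S[i] = S[i - 1] + Q32;

    /* Key mixing: 3 * max(T, 2) = 3 * T serial steps. */
    for (int k = 0, i = 0, j = 0; k < 3 * T; k++, i = (i + 1) % T, j = (j + 1) % 2) {
        A = S[i] = rotl(S[i] + A + B, 3);
        B = L[j] = rotl(L[j] + A + B, A + B);
    }

    A = pt[0] + S[0];
    B = pt[1] + S[1];
    for (int r = 1; r <= ROUNDS; r++) {
        A = rotl(A ^ B, B) + S[2 * r];
        B = rotl(B ^ A, A) + S[2 * r + 1];
    }
    ct[0] = A;
    ct[1] = B;
}

/* LANES keys in lockstep.  The chain within each lane is still serial, but
 * the lanes are independent of each other, so each inner loop over v is
 * the kind of work a single vector instruction could do. */
static void rc5_lanes(uint32_t key[LANES][2], const uint32_t pt[2],
                      uint32_t ct[LANES][2]) {
    uint32_t S[T][LANES], L[2][LANES], A[LANES] = { 0 }, B[LANES] = { 0 };

    for (int v = 0; v < LANES; v++) {
        S[0][v] = P32;
        L[0][v] = key[v][0];
        L[1][v] = key[v][1];
    }
    for (int i = 1; i < T; i++)
        for (int v = 0; v < LANES; v++) S[i][v] = S[i - 1][v] + Q32;

    for (int k = 0, i = 0, j = 0; k < 3 * T; k++, i = (i + 1) % T, j = (j + 1) % 2)
        for (int v = 0; v < LANES; v++) {
            A[v] = S[i][v] = rotl(S[i][v] + A[v] + B[v], 3);
            B[v] = L[j][v] = rotl(L[j][v] + A[v] + B[v], A[v] + B[v]);
        }

    for (int v = 0; v < LANES; v++) {
        A[v] = pt[0] + S[0][v];
        B[v] = pt[1] + S[1][v];
    }
    for (int r = 1; r <= ROUNDS; r++)
        for (int v = 0; v < LANES; v++) {
            A[v] = rotl(A[v] ^ B[v], B[v]) + S[2 * r][v];
            B[v] = rotl(B[v] ^ A[v], A[v]) + S[2 * r + 1][v];
        }

    for (int v = 0; v < LANES; v++) {
        ct[v][0] = A[v];
        ct[v][1] = B[v];
    }
}
```

 Calling rc5_lanes with four candidate keys and one plaintext block should
give the same four ciphertexts as four separate calls to rc5_one.  Note that
the rotate amounts differ per lane, so a real vector version needs a rotate
that takes a separate count for each element.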

#define X(x,y) x##y
Peter Cordes ;  e-mail: X(peter at llama.nslug. , ns.ca)

"The gods confound the man who first found out how to distinguish the hours!
 Confound him, too, who in this place set up a sundial, to cut and hack
 my day so wretchedly into small pieces!" -- Plautus, 200 BCE