[rc5] V2 Speed Probs on x86?

Ronald Van Iwaarden rrt0136 at ibm.net
Mon Jun 30 10:37:36 EDT 1997

On Sun, 29 Jun 1997 23:57:50 -0500, root wrote:

>Ronald Van Iwaarden wrote:
>> It is interesting to note that the AMD K5 got the biggest increase followed by
>> the Cyrix 6x86.  The K6 did not get as great an increase which is a bit of a
>> mystery to me.  My first instinct was that the branch prediction routines of the
>> K5 and 6x86 were the biggest contributor.  These routines probably make them act

>There were not very many branches in the v1 code (minimum of two conditionals
>per key as I recall, max of 8-10), and I doubt that the optimization of even 20
>branches and their pipeline flushes (all of the ifs are bunched together) in
>relation to the ~1000 or so instructions that are required to "check a key"
>would produce a 50% performance boost.

I never saw the source code, but this seems to be a good point.
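
As a rough sanity check on that arithmetic, here is a back-of-the-envelope
sketch in Python. The ~1000 instructions and 20 branches are the figures from
the quoted message; the one-cycle-per-instruction baseline and the penalty
values are assumptions picked for illustration, not measurements of any of
these chips, and the function names are made up for this example.

```python
def max_speedup_from_branches(instructions, branches, penalty_cycles, base_cpi=1.0):
    """Best-case speedup if every branch mispredicted before and none do after.

    Assumes a flat base_cpi cycles per instruction (an assumption, not a
    measured value) plus penalty_cycles for each mispredicted branch.
    """
    base = instructions * base_cpi
    before = base + branches * penalty_cycles  # every branch pays the penalty
    after = base                               # no branch pays the penalty
    return before / after


def required_penalty(instructions, branches, speedup, base_cpi=1.0):
    """Per-branch penalty (in cycles) needed to explain a given speedup."""
    return (speedup - 1.0) * instructions * base_cpi / branches


if __name__ == "__main__":
    # Penalty values below are illustrative guesses only.
    for penalty in (3, 4, 10):
        s = max_speedup_from_branches(1000, 20, penalty)
        print(f"penalty={penalty:2d} cycles: at most {100 * (s - 1):.1f}% faster")
    # Penalty each of the 20 branches would have to cost to give a 50% boost.
    print(required_penalty(1000, 20, 1.5), "cycles per branch")
```

Under these assumptions, even a worst-to-best change in branch prediction
yields only single-digit or low double-digit percentages, and each of the 20
branches would need to cost about 25 cycles to account for 50% on its own.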

>My best guess is that they figured out how to make better use of the two
>parallel integer pipelines and that helped every(x86)body. The big jump for
>AMD and Cyrix must be from the larger caches on the non-Intel machines which
>allow all of the inner loop code to reside in the L1 cache. Compare how

If this is true, then the MMX P5s, which have twice the L1 cache of the plain
P5s, should be getting Cyrix/AMD-level speeds.  Can anyone out there with an MMX
machine give a benchmark for a P166MMX or P200MMX?

      o           Work to live; \   rvan at tiger.cudenver.edu
     /\            Live to bike; \   http://www-math.cudenver.edu/~rvan
   _`\ `_<===       Bike to work! \  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
__(_)/_(_)___.-._                  \   Note the new addresses!
