[rc5] V2 Speed Probs on x86?
Ronald Van Iwaarden
rrt0136 at ibm.net
Mon Jun 30 10:37:36 EDT 1997
On Sun, 29 Jun 1997 23:57:50 -0500, root wrote:
>Ronald Van Iwaarden wrote:
>> It is interesting to note that the AMD K5 got the biggest increase followed by
>> the Cyrix 6x86. The K6 did not get as great an increase which is a bit of a
>> mystery to me. My first instinct was that the branch prediction routines of the
>> K5 and 6x86 were their biggest benefactor. These routines probably make them act
>There were not very many branches in the v1 code (minimum of two conditionals
>per key as I recall, max of 8-10) and I doubt that the optimization of even 20
>branches and their pipeline flushes (all of the ifs are bunched together) in
>relation to the ~1000 or so instructions that are required to "check a key"
>would produce a 50% performance boost.
I never saw the source code, but this seems to be a good point.
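
A rough back-of-the-envelope check of that argument, as a sketch only. The instruction and branch counts come from the estimate quoted above; the 4-cycle flush penalty and the one-cycle-per-instruction baseline are my own assumptions, not measurements of the client, and the function names are made up for illustration.

```python
# Upper bound on the speedup from removing every branch-related pipeline flush.
# Assumptions (not from the client source): each flush costs a fixed number of
# cycles, and every other instruction costs `cycles_per_instruction` cycles.

def max_speedup_from_branches(instructions, branches, flush_penalty_cycles,
                              cycles_per_instruction=1.0):
    """Ratio of worst-case time (every branch flushes) to flush-free time."""
    base = instructions * cycles_per_instruction
    worst = base + branches * flush_penalty_cycles
    return worst / base


def penalty_needed_for(target_speedup, instructions, branches,
                       cycles_per_instruction=1.0):
    """Flush penalty (cycles per branch) required to explain target_speedup."""
    base = instructions * cycles_per_instruction
    return (target_speedup - 1.0) * base / branches


if __name__ == "__main__":
    # ~1000 instructions per key, 20 branches, assumed 4-cycle flush penalty.
    print(max_speedup_from_branches(1000, 20, 4))   # 1.08
    # Penalty per branch that a 50% boost would require under the same model.
    print(penalty_needed_for(1.5, 1000, 20))        # 25.0
```

Under these assumptions, perfect branch handling buys about 8%, and a 50% boost would need every one of the 20 branches to cost roughly 25 cycles, which supports the point that branches alone are unlikely to explain the jump.
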
>My best guess is that they figured out how to make better use of the two
>parallel integer pipelines and that helped every(x86)body. The big jump for
>AMD and Cyrix must be from the larger caches on the non-Intel machines which
>allow all of the inner loop code to reside in the L1 cache. Compare how
If this is true, then the MMX P5s, which have larger L1 caches than the original
Pentium, should be getting Cyrix/AMD-level speeds. Can any MMX owners out there
post a benchmark for a P166MMX or P200MMX?
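
To make the cache hypothesis concrete, here is a small sketch. The L1 sizes are the published figures for each chip as far as I know (the 6x86 cache is unified, so code shares it with data). The 12 KB loop size is purely hypothetical, since I have not seen the client source, and the helper name is invented for this example.

```python
# Which CPUs could hold an inner loop of a given size entirely in L1?
# Sizes are the L1 capacity available to code, in KB.
L1_CODE_CACHE_KB = {
    "Pentium (P54C)": 8,    # 8 KB code + 8 KB data
    "Pentium MMX": 16,      # 16 KB code + 16 KB data
    "AMD K5": 16,           # 16 KB code + 8 KB data
    "Cyrix 6x86": 16,       # 16 KB unified, shared with data
    "AMD K6": 32,           # 32 KB code + 32 KB data
}


def cpus_where_loop_fits(loop_kb):
    """Return the CPUs (sorted by name) whose L1 can hold loop_kb of code."""
    return sorted(name for name, kb in L1_CODE_CACHE_KB.items()
                  if loop_kb <= kb)


if __name__ == "__main__":
    # Hypothetical 12 KB inner loop: too big for the classic Pentium's 8 KB.
    for name in cpus_where_loop_fits(12):
        print(name)
```

If the cache explanation holds, a loop in that hypothetical size range would put the Pentium MMX in the same group as the K5 and 6x86, which is why an MMX benchmark would be a useful test.
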
o Work to live; \ rvan at tiger.cudenver.edu
/\ Live to bike; \ http://www-math.cudenver.edu/~rvan
_`\ `_<=== Bike to work! \ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
__(_)/_(_)___.-._ \ Note the new addresses!