[RC5] hyper threading on Intel P4

Jeroen v.d.burg at planet.nl
Fri Dec 20 08:45:09 EST 2002


The first Pentium had 2 pipelines, and all later P2, P3 and P4 also have 2.
This means they can execute 2 instructions in the same clock cycle.
What I think hyper-threading does is simply split the 2 pipelines
into 2 'different' CPUs.
So 2x 3,060,000,000 cycles/sec for one program turns into
3,060,000,000 for one program and another 3,060,000,000 for the other.
Some instructions can't pair in the pipelines, so one pipeline
is stalled. As a result the 1st pipeline may be doing 3,060,000,000
instructions/sec and the second one only 1,500,000,000.
When hyper-threading is enabled, all instructions can 'pair' because
no instruction has to wait for an instruction from the other process.
But when a program is optimized well enough, these stalls won't
happen and thus hyper-threading is not effective.
I'm not saying the RC5 core for the P4 can't be optimized. The current
core just uses both pipelines very well, but maybe it does
some things it doesn't have to do and thus can still be optimized.
Or maybe it can use other instructions to do the same thing more quickly.
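
To put rough numbers on the model described above, here is a small Python
sketch. It is only a toy calculation under the assumptions of this post
(2 pipelines, and the example instruction rates given above); the function
name is made up for illustration, and it does not describe how the P4
really schedules instructions.

```python
# Toy model of the "two pipelines" picture from this post.
# All figures are the illustrative ones used above, not measurements.
CLOCK_HZ = 3_060_000_000  # 3.06 GHz, one instruction per pipeline per cycle at best


def pipeline_utilisation(issued_per_pipeline, clock_hz=CLOCK_HZ):
    """Fraction of the peak issue rate that is actually used."""
    peak = clock_hz * len(issued_per_pipeline)
    return sum(issued_per_pipeline) / peak


# One program, second pipeline stalled about half of the time.
stalled = pipeline_utilisation([3_060_000_000, 1_500_000_000])

# A well-optimized program that keeps both pipelines busy.
optimized = pipeline_utilisation([3_060_000_000, 3_060_000_000])

print(f"stalled program:   {stalled:.1%} of peak, {1 - stalled:.1%} left over")
print(f"optimized program: {optimized:.1%} of peak, {1 - optimized:.1%} left over")
```

In this picture the "left over" share is all a second thread could ever
pick up, and for a core that already pairs everything it is zero.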

Merry Christmas and a happy new year :-)
 Jeroen

P.S. I'm not sure if I named everything correctly, like 'pipeline'; I think
it's called something else, but I hope you know what I mean ;-P


*********** REPLY SEPARATOR  ***********

On 19-12-2002 at 22:02 Adam Crews wrote:

>Hello,
> 
>It appears that the new P4 3.06 GHz chip with Intel's hyper-threading
>performs much, much slower when 2 RC5 threads are started.  This chip
>presents itself as 2 CPUs to the OS, and as a result dnet
>automatically starts 2 threads.  The client version is dnetc
>v2.9001-477-CTR-02111118 for Win32 (WindowsNT 5.1).
> 
>When this happens, the machine cracks a horrid 2 million keys a second.
>[Dec 19 05:36:48 UTC] 2 crunchers ('a' and 'b') have been started.
>--snip--
>[Dec 19 15:25:07 UTC] RC5-72: Summary: 18 packets (18.00 stats units)
>                      0.09:51:44.17 - [2,117,507 keys/s]
>
> 
>When the client is configured to only start 1 thread, it runs much better
>at 4 million keys a second:
>[Dec 20 05:05:05 UTC] 1 cruncher has been started.
>[Dec 20 05:17:04 UTC] RC5-72: Completed CA:4263E76E:00000000 (1.00 stats units)
>                      0.00:11:59.10 - [4,214,779 keys/s]
> 
>It appears that the threads are stepping on each other and generally
>slowing things down.
> 
>If the guys working on the x86 cores would like to test code on this chip,
>I am willing to give it a run and provide any feedback.
> 
>Or if anyone has any ideas on why this happens this way, I would be
>interested to know.
> 
>-Adam
>
>--
>To unsubscribe, send 'unsubscribe rc5' to majordomo at lists.distributed.net
>rc5-digest subscribers replace rc5 with rc5-digest
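
For reference, the two rates in the quoted logs can be compared directly.
The short Python sketch below only divides the reported figures; it assumes
the summary rate for 2 crunchers is the combined rate of both threads, which
is how Adam reads it, and the variable names are made up for illustration.

```python
# Rates copied from the quoted dnetc log lines (keys per second).
ONE_THREAD_KEYS_PER_S = 4_214_779   # 1 cruncher
TWO_THREAD_KEYS_PER_S = 2_117_507   # 2 crunchers, assumed to be the combined rate

# How much faster the single-thread configuration is overall.
ratio = ONE_THREAD_KEYS_PER_S / TWO_THREAD_KEYS_PER_S

# Average rate of each cruncher when two are running (under the assumption above).
per_thread = TWO_THREAD_KEYS_PER_S / 2

print(f"1 thread vs 2 threads: {ratio:.2f}x")
print(f"per thread with 2 running: {per_thread:,.0f} keys/s")
```

So with both logical CPUs busy, the reported total is about half of what a
single thread manages on its own.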



