[RC5] interesting Xeon core anomaly

Slawek sgp at telsatgp.com.pl
Wed Aug 3 16:24:46 EDT 2005

In message to "D.net Discussion" <rc5 at lists.distributed.net> sent Mon, 01
Aug 2005 10:26:28 +0200 you wrote:

NP> by coincidence (load testing a repaired server) I came across this
NP> funny behavior (2.8 GHz Xeon/HT Prestonia, 496 client core #7 SGP
NP> 3-pipe) I thought I should report:
NP> dnetc -numcpu 1 yields 4,994 kkeys/s (lower than one would expect from
NP> Prescott bench)
NP> dnetc -numcpu 2 yields 2,412 kkeys/s (as expected due to missing SMT
NP> optimization)
NP> but
NP> running dnetc -numcpu 1 and /another/ dnetc -numcpu -multiok
NP> /simultaneously/ yield 2x 2,674,xxx keys/s - not much faster, but
NP> obviously there seems to remain some room for optimization on Xeon.

When you launch two crunchers from one client, those crunchers are two
threads of one process. If you run two clients, the crunchers run in
separate processes.
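
To make the distinction concrete, here is a small sketch (Python, not the
client's own code; the function names are made up for illustration). Threads
started inside one process all report the same process ID, while separately
launched programs each report their own.

```python
import os
import subprocess
import sys
import threading


def pids_seen_by_threads(n=2):
    """Run n threads inside this process and collect the PID each one sees."""
    pids = []

    def worker():
        pids.append(os.getpid())

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pids


def pids_seen_by_processes(n=2):
    """Launch n separate interpreter processes and collect each one's PID."""
    cmd = [sys.executable, "-c", "import os; print(os.getpid())"]
    return [int(subprocess.check_output(cmd)) for _ in range(n)]
```

The first function corresponds to one client with -numcpu 2, the second to
two clients started side by side. The scheduler may treat the two cases
differently.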

I don't know whether your operating system is more likely to move threads
between processors than to move whole processes. If it is, that could be
the reason for the anomaly you see. You can recompile a test client from
source and change it to set an affinity mask that prevents threads from
jumping between processors, and you'll find out whether that's the case.
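
As a rough sketch of what setting an affinity mask looks like, here is a
Linux-only example in Python. It is an illustration, not a patch for the
client: pin_to_cpu is a hypothetical helper, and I am assuming Linux, since
the operating system was not stated. In the client's own C/C++ code the
corresponding calls would be sched_setaffinity() on Linux or
SetThreadAffinityMask() on Windows.

```python
import os


def pin_to_cpu(cpu):
    """Restrict the calling thread to a single logical CPU (Linux only).

    Returns the affinity set that is in effect afterwards.
    """
    allowed = os.sched_getaffinity(0)  # CPUs we may currently run on
    if cpu not in allowed:
        raise ValueError(f"CPU {cpu} is not in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(0, {cpu})     # pid 0 means "the caller"
    return os.sched_getaffinity(0)
```

Each cruncher would call something like this once at startup with its own
CPU number. If the two-thread rate then matches the two-client rate, thread
migration was the likely cause.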

BTW, I think it is possible to write a core that would run faster using two
threads on one hyperthreaded processor than the current cores, which use only
one thread per real processor. Unfortunately, I don't have time to check this
out. It would need two different cores running on the two virtual processors.
That doesn't fit too well into the current client's code, so oh well. If
somebody is willing to give it a try, please drop me a note.

Slawomir Piotrowski / Telsat GP
Rejestracja Czasu Pracy i Kontrola Dostepu
