[RC5] Why does Pentium 3 process faster than Pentium 4?
ruheejih at calvados.zrz.tu-berlin.de
Wed Dec 11 09:28:49 EST 2002
>Well, contrary to what Intel would have you believe, clockspeed isn't the
>most important factor in determining a chip's speed. A lesser clocked
>Athlon, or G4 can perform admirably against faster clocked P4s. A lot of
>this has to do with the depth of the chip's pipeline, and the number of
>instructions executed per clock (IPC).
Honestly, I can't hear this anymore...
(1) In most cases (in calculation-intensive software, including RC5-72)
the length of the pipeline does not play any role, because most of the
computing is done in long unrolled loops without branches. Even when
branches come into play, the P4 has very good branch prediction, and
the programmer can help the P4 with branch hints ("branch likely to be
taken", etc.).
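To illustrate point (1), here is a minimal sketch of an unrolled loop with
a branch hint. The function name sum_unrolled and the LIKELY/UNLIKELY
macros are made up for this example; __builtin_expect is a GCC/Clang
extension, and whether the compiler turns it into actual P4 hint prefixes
or only rearranges the code layout is up to the compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang extension; on other compilers the hint simply disappears. */
#if defined(__GNUC__)
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define LIKELY(x)   (x)
#define UNLIKELY(x) (x)
#endif

/* Sum n words, four per iteration. The unrolled body is straight-line
   code, so the only branch left in the hot path is the loop test, which
   is taken on almost every pass and therefore easy to predict. */
uint32_t sum_unrolled(const uint32_t *v, size_t n)
{
    uint32_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    for (; LIKELY(i + 4 <= n); i += 4) {
        s0 += v[i];
        s1 += v[i + 1];
        s2 += v[i + 2];
        s3 += v[i + 3];
    }

    /* Leftover elements (0 to 3 of them). */
    for (; i < n; i++)
        s0 += v[i];

    return s0 + s1 + s2 + s3;
}
```

With a body like this the pipeline is refilled only when the loop exits,
so a long pipeline costs little, however deep it is.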
(2) The P4 is clearly optimized for SSE and SSE2, and there it is a
real monster. In single-precision FP it beats everything on the market.
The reason is the trace cache and the fact that most SSE/SSE2
instructions consume only one uop each.
(3) The G4 does NOT perform admirably faster; in most cases it is no
more efficient than the Athlon or P4. That is a myth, based on the old
prejudice that "RISC" is better than "CISC" (I don't distinguish between
RISC and CISC anymore). In fact, there is only one benchmark/type of
application where the G4 is faster than the P4 or Athlon: RC5-64 and we all
(4) AltiVec is a very cool vector unit, but in most cases it is not more
efficient than SSE/3DNow!. In rare cases it can sustain 8 single-precision
operations per clock (8*MHz), but that is not enough to beat the much
higher clocked P4 in SSE.
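The "8*MHz" figure is just clock rate times operations per clock, so the
comparison in point (4) comes down to simple arithmetic. A rough sketch
follows; the function names are invented for this example, and any clock
speeds you plug in are your own assumptions, not measurements.

```c
/* Peak rate in millions of operations per second:
   clock rate (MHz) times operations completed per clock. */
double peak_mops(double mhz, double ops_per_clock)
{
    return mhz * ops_per_clock;
}

/* Operations per clock that chip B must sustain at mhz_b to match
   chip A running at mhz_a with ops_a operations per clock. */
double break_even_ops(double mhz_a, double ops_a, double mhz_b)
{
    return peak_mops(mhz_a, ops_a) / mhz_b;
}
```

For example, with a hypothetical 1000 MHz chip doing 8 operations per
clock, peak_mops(1000.0, 8.0) gives 8000, and break_even_ops(1000.0, 8.0,
2000.0) gives 4: a chip clocked twice as high needs only half the
per-clock throughput to draw level.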
(5) The P4 is so slow in RC5 because of its slow shift/rotate. You can
find some good information about it here: http://www.emulators.com/pentium4.htm
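To show why rotate speed matters so much for point (5), here is a sketch
of one round of RC5 with 32-bit words, following the published algorithm.
It is not taken from the distributed.net client cores; the function names
are made up for this example. Every half-round contains a rotate whose
count depends on the data, and the key expansion (not shown) rotates in
every step as well, so a slow rotate sits directly on the critical path.

```c
#include <stdint.h>

/* Rotate left/right by r bits; only the low 5 bits of r are used. */
static uint32_t rotl32(uint32_t x, uint32_t r)
{
    r &= 31;
    return (x << r) | (x >> ((32 - r) & 31));
}

static uint32_t rotr32(uint32_t x, uint32_t r)
{
    r &= 31;
    return (x >> r) | (x << ((32 - r) & 31));
}

/* One RC5-32 encryption round i (i >= 1) using the expanded key table s.
   Each line is xor, data-dependent rotate, add: nothing can proceed
   until the rotate has finished. */
void rc5_round(uint32_t *a, uint32_t *b, const uint32_t *s, unsigned i)
{
    *a = rotl32(*a ^ *b, *b) + s[2 * i];
    *b = rotl32(*b ^ *a, *a) + s[2 * i + 1];
}

/* Inverse of rc5_round, included only to check the sketch round-trips. */
void rc5_unround(uint32_t *a, uint32_t *b, const uint32_t *s, unsigned i)
{
    *b = rotr32(*b - s[2 * i + 1], *a) ^ *a;
    *a = rotr32(*a - s[2 * i], *b) ^ *b;
}
```

Because each result feeds the next rotate count, the rotates form one
long dependency chain, and unrolling cannot hide their latency.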
(6) The P4 does not behave well with traditionally compiled code. It
requires special optimizations by the compiler to show better performance.
If you take the Intel optimization guidelines into account, you will be
able to write fast code for the P4.
I am no P4 expert; I do more Athlon stuff. But I don't like P4
bashing, because the P4 is not a bad CPU.
> The P4 is also a slightly different architecture than the PIII. Some
>things have changed that affect dnet on it.
The P4 is not completely different