[HARDWARE] Overclocking CPUs (long rant)

taniwha taniwha at taniwha.com
Wed Sep 22 21:12:50 EDT 1999


Hector Herrera wrote:
> IMHO,
>
> if you are running a long term project such as cracking RC5, then
> the decreased lifetime of an over clocked CPU will not pay off the
> 5-10% increase in performance.
>
> Any thoughts on this?

I don't think there's any data either way - and lots of FUD
from Intel (largely butt-covering IMHO rather than conspiracy
sorts of stuff) because they're worried about liability
(and their reputation) and to a lesser extent sales. I think that
the freezing of the clock multipliers was more a reaction to the
fake chips that were coming onto the market (people selling
PII-350s as PII-450s) and Intel's potential liability/loss
of reputation if/when these failed and people held them
responsible.

I bet there are some chips for which there is a decreased
lifetime, and some for which there's not. It's more likely
to be true for chips run at higher voltages or running
hotter. Knowing which is which probably requires a
statistical treatment that just isn't being done by
the overclockers in the methodical manner the CPU design teams
are probably using for their post-silicon burn-in testing.

Actual failure rates have to do with things like electromigration
[which depends on the widths of the metal wires (these change from
wafer to wafer depending on the process) and the peak currents
flowing through them - these often take a long time to fail
(basically the electrons start pushing aluminum atoms around until
the wire goes open) and one needs to use a statistical model
to figure out if any particular wire will suffer] and oxide
thicknesses - again it's very much process/wafer specific.
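
One common empirical model for electromigration lifetime is Black's
equation, MTTF = A * J**(-n) * exp(Ea / (k*T)). The sketch below only
compares two operating points, so the process constant A cancels. The
exponent n = 2 and activation energy Ea = 0.7 eV are assumed
textbook-style values, not data for any real process, and the
operating points are made up for illustration.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def relative_lifetime(j_ratio, temp_c_base, temp_c_new,
                      n=2.0, activation_energy_ev=0.7):
    """Return MTTF_new / MTTF_base under Black's equation.

    j_ratio is J_new / J_base (current density ratio).
    n and activation_energy_ev are ASSUMED values, not measurements.
    """
    t_base = temp_c_base + 273.15  # convert Celsius to Kelvin
    t_new = temp_c_new + 273.15
    current_term = j_ratio ** (-n)
    thermal_term = math.exp(
        (activation_energy_ev / BOLTZMANN_EV_PER_K)
        * (1.0 / t_new - 1.0 / t_base)
    )
    return current_term * thermal_term


# Hypothetical operating points: 10% more current density and the
# die running at 85 C instead of 70 C.
print(relative_lifetime(1.10, 70.0, 85.0))
```

With these assumed numbers the ratio comes out well below 1, but the
real answer depends on constants that only the fab knows.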

As a chip designer I know that giving a chip a 'speed' is a
pretty hit and miss sort of thing - you tend to be pretty
conservative in order to hit all the points on the process
curves - this usually leaves some performance on the
table - even speed binning is often done at a higher frequency
on the tester, in the knowledge that this will make sure that
things will work over the operating temp/voltage ranges.
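
A minimal sketch of the guard-banded binning idea described above:
a part only earns a rated speed if it passed on the tester at some
margin above that speed. The function name, the bin list and the 10%
guard band are all hypothetical; a real flow would test across
voltage and temperature corners rather than use one number.

```python
def assign_speed_bin(tester_fmax_mhz, bins_mhz, guard_band=0.10):
    """Return the highest rated speed the part qualifies for, or None.

    A bin qualifies only if the part ran on the tester at that bin's
    frequency plus the guard band (an ASSUMED flat margin).
    """
    passing = [b for b in bins_mhz
               if b * (1.0 + guard_band) <= tester_fmax_mhz]
    return max(passing) if passing else None


BINS_MHZ = [350, 400, 450]  # hypothetical rated speeds

# A part that tops out at 470 MHz on the tester is sold as a 400,
# because the 450 bin would need it to pass at 495 MHz.
print(assign_speed_bin(470.0, BINS_MHZ))
```

The gap between the tester result and the rated speed is the
performance left on the table.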

Next a warning about overclocking - each chip has some
worst case speed paths - you don't know what they are (chances
are the designers think they do and have nightmares about
them and have spent months sweating bullets pulling
picoseconds out of them :-). Worse than this - the speed
paths change from day to day on the fab depending on where
in the process a particular wafer happens to turn out
(remember those designers were sweating bullets over the
worst [or maybe typical if they are binning] case of some
statistical model of how the path would turn out). NOBODY
really knows - the company who made the chip THINKS they
do, to some statistical certainty.

What I'm trying to get at here is that what will fail first when
you overclock is unknown (from your point of view). It may be a path
that you don't use much - maybe you can run quake, rc5 and compile
the linux kernel over and over until you are blue in the face
without any problems .... but a speed path in the multiplier will
mean that it silently screws up your taxes ....
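
To illustrate why everyday workloads can miss a slow path, here is a
toy model: a 32-bit ripple-carry adder, where the delay depends on
how far a carry has to ripple. This is an assumption made for
illustration only (real CPU adders are not built this way), and the
function names are made up. The point is just that random operands
rarely come near the worst-case chain.

```python
import random


def longest_carry_chain(a, b, width=32):
    """Longest distance, in bits, that a single carry ripples in a + b."""
    longest = 0
    chain = 0  # length of the ripple currently in flight (0 = no carry)
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        if x & y:                 # generate: a fresh carry starts here
            chain = 1
        elif (x ^ y) and chain:   # propagate: the carry ripples one more bit
            chain += 1
        else:                     # kill, or nothing coming in to propagate
            chain = 0
        longest = max(longest, chain)
    return longest


def sample_chains(trials=20_000, width=32, seed=1999):
    """Histogram of longest carry chains over random operand pairs."""
    rng = random.Random(seed)
    counts = [0] * (width + 1)
    for _ in range(trials):
        a = rng.getrandbits(width)
        b = rng.getrandbits(width)
        counts[longest_carry_chain(a, b, width)] += 1
    return counts


counts = sample_chains()
typical = max(range(len(counts)), key=counts.__getitem__)
print("most common longest chain:", typical)
print("trials with a chain of 20+ bits:", sum(counts[20:]))
print("worst case, 0xFFFFFFFF + 1:", longest_carry_chain(0xFFFFFFFF, 1))
```

In this model the full 32-bit ripple only shows up for very specific
operands, which is the kind of input a game or a compile may never
happen to produce.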

Finally you should realize that the chip debug people spent a lot
of time testing the chip - this is where the manufacturer's real
numbers come from and why they think that they can meet
them - shmooing the voltage/temp/freq to try and get some idea
of where the chip functions - but they're doing this on a complex
beast, and even their tests can not test every path at speed
(there will be data-driven timing dependencies in large combinatorial
paths where the possibilities are just too many - for example
you can't afford to test adding/multiplying/dividing all
possible combinations of 32-bit integers, or 80-bit floating point
values - it just takes too long (let's see, at least
2**64 clocks to multiply all possible combinations of two
32-bit integers - my back-of-the-envelope calculation says over
1000 years at 400MHz)). But I bet they're a lot better focused than
playing quake or compiling the kernel. Even so, even THEY
really don't know absolutely for sure that a particular
chip will even run at the speed they quote (just that it
will probably run the test programs they shmoo'd it with).
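
The back-of-the-envelope number above can be checked in a few lines.
This sketch assumes one clock per operand pair, which is optimistic
for a real multiplier, and the function name is made up.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def exhaustive_test_years(operand_bits, clock_hz, clocks_per_case=1):
    """Years needed to try every pair of operand_bits-wide operands."""
    cases = 2 ** (2 * operand_bits)  # every (a, b) combination
    return cases * clocks_per_case / clock_hz / SECONDS_PER_YEAR


# Two 32-bit integers at 400MHz, one clock per multiply (assumed).
print(exhaustive_test_years(32, 400e6))

# Two 80-bit floating point values at the same clock rate.
print(exhaustive_test_years(80, 400e6))
```

The 32-bit case works out to roughly 1,460 years, the same ballpark
as the estimate above, and the 80-bit case is far beyond any
practical test time.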

anyway - this is my standard overclocker's warning rant -
not to discourage it - just to give people a better idea of what
they are dealing with - basically it's a risk each and every time
- you get to take it and shouldn't bitch about it when it
doesn't pan out.

I've probably told you all way more than you wanted to hear
so I'm sorry for going on a bit too long

	Paul Campbell aka Taniwha


