[RC5] Stats suggestion

Jim C. Nasby jim at nasby.net
Wed Jan 10 01:28:07 EST 2001

Here's my commentary on the exponential weighted mean (the proper term
for what's been discussed, iirc).

First, the new statsbox is *massive* compared to tally. Quad Xeon 500s
and hardware raid. Loads of RAM. Furthermore, the deal that was signed
with United Devices should allow DCTI to make periodic upgrades to this
box, something that never happened with tally. I'm never in favor of poor
design that mandates massive hardware, but in this case there is little
reason to worry about adding a few extra fields.

EWM is a very useful tool, and I think it might be a neat stat to have,
but the downside is that a ton of our users will have no clue what the
hell it means, which will result in even more mail to
help at distributed.net, which is already heavily overloaded. Everyone on
this list might understand it, but for every person on this list, there
are 10 active participants that aren't. Because of this, we have to be
very careful with what features we add.

A sliding window average (and sum, for that matter) is very easy to
explain on the other hand. "You've done 12382 work units in the past 90
days, for an average of 183.302 work units per day" is completely
self-explanatory (no, I didn't do the math, 183.302 came out of thin
air). The code to do this calculation is basically as simple as the EWM
code, and the record widths will be the same as with EWMs. The only
added storage requirement is to keep a copy of the past 90 days (or
however long the longest sliding window is) worth of master data, which
should only amount to a few hundred meg (remember that sbIII has 108G of
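
To make the sliding window idea concrete, here is a minimal sketch in
Python. It is not the actual stats code; the class name, the method
names and the 90 day default are all assumptions made up for
illustration. It keeps one participant's daily work unit counts for the
window and maintains a running sum, so the daily update costs about the
same as an EWM update.

```python
from collections import deque


class SlidingWindow:
    """Running sum and average over the most recent `days` daily counts.

    Hypothetical helper for illustration only, not part of statsbox.
    """

    def __init__(self, days=90):
        self.days = days
        self.window = deque()  # one entry per day, oldest on the left
        self.total = 0

    def add_day(self, units):
        """Record one day's work units and drop the day that aged out."""
        self.window.append(units)
        self.total += units
        if len(self.window) > self.days:
            self.total -= self.window.popleft()

    def average(self):
        """Average work units per day over the days currently held."""
        return self.total / len(self.window) if self.window else 0.0


# Small usage example with a 3 day window.
w = SlidingWindow(days=3)
for units in (10, 20, 30, 40):
    w.add_day(units)
# The window now holds 20, 30 and 40.
```

The deque is the "copy of the past 90 days" mentioned above: that
history is the only extra storage, and both the sum and the average
fall out of it directly.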

I think that the added stat of how many work units you've done in a
given period is also very nice, and there's no way to do that with an
EWM (please don't suggest EWM * days... it's painfully inaccurate).
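
A quick sketch of why EWM * days goes wrong, again in Python. The
smoothing factor of 0.1 and the made-up participant (idle for 80 days,
then 100 work units a day for 10 days) are assumptions picked only to
show the effect; the function name is hypothetical.

```python
def ewm(daily, alpha=0.1):
    """Exponential weighted mean of a list of daily counts.

    alpha is an assumed smoothing factor; larger values weight
    recent days more heavily.
    """
    mean = daily[0]
    for units in daily[1:]:
        mean = alpha * units + (1 - alpha) * mean
    return mean


# Made-up history: nothing for 80 days, then 100 units/day for 10 days.
history = [0] * 80 + [100] * 10

actual_total = sum(history)           # what a 90 day window sum reports
ewm_estimate = ewm(history) * len(history)

print(actual_total, round(ewm_estimate))
```

The EWM has mostly forgotten the 80 idle days, so multiplying it by 90
gives an estimate several times larger than the work actually done in
the period. A window sum has no such problem.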

As far as 'rolling your own' EWM goes, phistory_raw provides you with
all the info that you need.
