[RC5] Stats suggestion
peter at llama.nslug.ns.ca
Wed Jan 10 21:37:36 EST 2001
On Wed, Jan 10, 2001 at 01:28:07AM -0600, Jim C. Nasby wrote:
> Here's my commentary on the exponential weighted mean (the proper term
> for what's been discussed, iirc).
> First, the new statsbox is *massive* compared to tally. Quad Xeon 500s
> and hardware raid. Loads of RAM. Furthermore, the deal that was signed
> with United Devices should allow DCTI to make periodic upgrades to this
> box, something that never happened with tally. I'm never in favor of poor
> design that mandates massive hardware, but in this case there is little
> reason to worry about adding a few extra fields.
> EWM is a very useful tool, and I think it might be a neat stat to have,
> but the downside is that a ton of our users will have no clue what the
> hell it means, which will result in even more mail to
> help at distributed.net, which is already heavily overloaded. Everyone on
> this list might understand it, but for every person on this list, there
> are 10 active participants that aren't. Because of this, we have to be
> very careful with what features we add.
> A sliding window average (and sum, for that matter) is very easy to
> explain on the other hand. "You've done 12382 work units in the past 90
> days, for an average of 183.302 work units per day" is completely
> self-explanatory (no, I didn't do the math, 183.302 came out of thin
> air). The code to do this calculation is basically as simple as the EWM
> code, and the record widths will be the same as with EWMs. The only
> added storage requirement is to keep a copy of the past 90 days (or
> however long the longest sliding window is) worth of master data, which
> should only amount to a few hundred meg (remember that sbIII has 108G of
> I think that the added stat of how many work units you've done in a
> given period is also very nice, and there's no way to do that with an
> EWM (please don't suggest EWM * days... it's painfully inaccurate).
> As far as 'rolling your own' EWM goes, phistory_raw provides you with
> all the info that you need.
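For concreteness, the sliding-window sum and average described above could be
sketched roughly as follows in Python. This is only an illustration, not
anything from the stats code: the function name sliding_window_stats, the
one-count-per-day input, and the choice to divide by the number of days seen
so far (until the window fills) are all assumptions.

```python
from collections import deque


def sliding_window_stats(daily_units, window_days=90):
    """Return a list of (window_sum, window_average), one entry per day.

    Hypothetical helper: daily_units is assumed to be one work-unit
    count per day, oldest first. Until window_days days have been seen,
    the average is taken over the days available so far.
    """
    window = deque()   # the last window_days daily counts
    total = 0          # running sum of everything currently in the window
    results = []
    for units in daily_units:
        window.append(units)
        total += units
        if len(window) > window_days:
            # Drop the day that just slid out of the window.
            total -= window.popleft()
        results.append((total, total / len(window)))
    return results
```

Note that this needs the last window_days daily counts on hand for each
participant, which is the extra storage mentioned above.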
I quite like the roll-your-own weighting factor idea. If we maintain it
_as well_ as a fixed-weight average, and never use the adjustable one for
ranking, participants can use it to keep track of how they are doing.
Sure, they could do it themselves, since the data is available, but it is so
easy for the stats server to do this that it would be really nice if it
did. You might want to make the half-life short at first, so the average
catches up to a new rate quickly, then lengthen it again for more stability.
Or, if you have a persistent net connection, you might just keep the
half-life quite short, to catch shorter-term trends.
If it never gets used for ranking, we could even let people set the
weighting factors to whatever they want, to make managing them easier.
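To show how little work this would be, here is a rough Python sketch of an
exponentially weighted mean with an adjustable half-life. The names
ewm_update and ewm_series are made up for illustration, and the sketch
assumes one update per day and seeds the mean with the first day's count.
The real stats code may well do it differently.

```python
def ewm_update(previous_mean, todays_units, half_life_days):
    """One daily update of an exponentially weighted mean.

    Hypothetical helper. The weight is chosen so that an old
    observation's influence halves every half_life_days days.
    """
    alpha = 1.0 - 0.5 ** (1.0 / half_life_days)
    return previous_mean + alpha * (todays_units - previous_mean)


def ewm_series(daily_units, half_life_days):
    """Return the running EWM after each day, oldest day first.

    Assumption: the mean is seeded with the first day's count.
    """
    mean = None
    series = []
    for units in daily_units:
        if mean is None:
            mean = float(units)
        else:
            mean = ewm_update(mean, units, half_life_days)
        series.append(mean)
    return series
```

Since only the previous mean is stored, a participant could change
half_life_days between updates: short while catching up to a new rate, then
longer again for stability.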
#define X(x,y) x##y
Peter Cordes ; e-mail: X(peter at llama.nslug. , ns.ca)
"The gods confound the man who first found out how to distinguish the hours!
Confound him, too, who in this place set up a sundial, to cut and hack
my day so wretchedly into small pieces!" -- Plautus, 200 BCE