[RC5] Stats suggestion

bwilson at fers.com
Fri Jan 5 14:00:00 EST 2001


I don't have a problem rebuilding these rates from the dawn of time, if
necessary.  It doesn't sound like each pass is terribly intensive (simple
addition and multiplication).  The advantage of a method like this is that
it sounds like it will level out the megaflushers with everyone else.  It
helps strike a balance between the instantaneous (today) and the total.

Even the people who megaflush on their first day won't get gratification
for long.  So what if your first day puts you at #335?  You dropped 2000
positions on day 2, and another 4000 on day 3!

If this works out as well as it sounds it might, we might even start using
this number for a new ranking (or even a replacement ranking).  It
emphasizes long term consistent results, but especially rewards people who
are actively contributing today, and it doesn't penalize people who only
flush once a week or month.

As far as time periods go, each average requires another numeric field in the
rank table, and another field means more data, larger tables, and fractionally
slower stats, so I want to keep the number of choices small - I think 3 is
the max.  I'll probably start with one on a 30-day half-life and see
whether it makes more sense to go longer or shorter.  Week/Month/Year
averages are an intriguing option, especially since we have people who
have contributed for multiple years, and we want to keep emphasizing the
long-term view of the projects.
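
For illustration only, here is a minimal sketch of the kind of daily update
discussed in this thread. It is not the actual stats code. It assumes the
average is rolled forward once per day and that the decay is chosen so an
old value loses half its weight after the half-life; the function and
parameter names are made up for the example.

```python
def decay_factor(half_life_days):
    """Per-day multiplier d chosen so that d ** half_life_days == 0.5."""
    return 0.5 ** (1.0 / half_life_days)


def update_average(previous_avg, todays_value, half_life_days=30.0):
    """Fold one day's value into the running exponential average.

    Only yesterday's average and today's value are needed, so each pass
    costs one addition and two multiplications per participant.
    """
    d = decay_factor(half_life_days)
    return d * previous_avg + (1.0 - d) * todays_value


# Hypothetical participant: average of 1000, then nothing submitted for
# 30 days. With a 30-day half-life the average decays to about half.
avg = 1000.0
for _ in range(30):
    avg = update_average(avg, 0.0)
print(round(avg, 3))
```

Each extra time period (week, month, year) would just be another stored
average updated the same way with a different half_life_days.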

__
Bruce Wilson, Manager, FERS Business Services
bwilson at fers.com, 312.245.1750, http://www.fers.com/
PGP KeyID: 5430B995, http://www.lasthome.net/~bwilson/
"A good programmer is someone who looks
both ways before crossing a one-way street."




Ben Clifford <benc at hawaga.org.uk>
Sent by: owner-rc5 at lists.distributed.net
2001-01-05 11:31
Please respond to rc5


        To:     rc5 at lists.distributed.net
        cc:
        Subject:        Re: [RC5] Stats suggestion



On Fri, 5 Jan 2001, Michael Pelletier wrote:

> The only time you'd have to pull
> values from every day under consideration is the first time you generate
> these values for the database, unless you wanted to just start from
> square one and ignore historical data, and let it build up over the days
> and weeks.  Just seed yesterday's exponential average with yesterday's
> value modified by some factor, or zero, and let it roll forward.

Indeed, no matter what starting value you choose, the result will
eventually reach the right value.

I would suggest, as a low-processing way of doing it, seeding everyone's
result with their past n days of data, where n is something around the
half-life. This would produce a value that is "about
right".

For new contestants, there is a decision to be made whether to seed them
at 0 or at their first day's rate.

Starting at 0 means it will take them a while to reach their correct
average, but it takes into account the fact that they have been doing
nothing for the time before they started.

Starting at their first day's rate would allow someone to build up a lot of
keys offline and submit them all on their first day, allowing them to
"spam" the stats (as can happen now). However, it puts honest new
participants in a fairer position as they will start around the correct
rate rather than have to spend several multiples of the half-life getting
up to a decent value.

So I would probably vote for the second option.
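
To make the trade-off concrete, here is a rough sketch comparing the two
seeds. It assumes a 30-day half-life and a simple once-a-day update; the
function name and the sample numbers are invented for illustration and are
not taken from the real stats.

```python
def roll_forward(daily_values, seed, half_life_days=30.0):
    """Return the exponential average after each day, starting from seed."""
    d = 0.5 ** (1.0 / half_life_days)  # weight kept from the previous day
    avg = seed
    history = []
    for value in daily_values:
        avg = d * avg + (1.0 - d) * value
        history.append(avg)
    return history


# Hypothetical honest newcomer: a steady 1000 per day for 60 days.
steady = [1000.0] * 60
from_zero = roll_forward(steady, seed=0.0)
from_first_day = roll_forward(steady, seed=steady[0])

# Hypothetical megaflusher: 30000 saved-up on day 1, then 1000 per day.
flusher = [30000.0] + [1000.0] * 59
flusher_avg = roll_forward(flusher, seed=flusher[0])

for day in (1, 30, 60):
    print(day, round(from_zero[day - 1]), round(from_first_day[day - 1]),
          round(flusher_avg[day - 1]))
```

Seeded at 0, the steady participant is still only at half of the true rate
after one half-life. Seeded at the first day's rate, the steady participant
is right from the start, while the megaflusher's inflated figure is pulled
halfway back toward the real rate every half-life.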

--
http://www.hawaga.org.uk/travel/ for my rotating world map applet
http://www.hawaga.org.uk/benc_key.txt PGP / GPG key 0x30F06950 - please use
it!


--
To unsubscribe, send 'unsubscribe rc5' to majordomo at lists.distributed.net
rc5-digest subscribers replace rc5 with rc5-digest







