[RC5] Stats Database Trivia

David McNett nugget at slacker.com
Wed Mar 18 22:04:40 EST 1998

On 18-Mar-1998, Marc Sissom wrote:
> In addition to this there is transient data that includes the email
> address of the checker, the IP, CPU type, OS type, etc. All that
> stuff that you see in the stats. I have no idea how long they keep
> this data, but I'm sure that they must distill it down periodically.

Excellent beginning, Marc; allow me to pick up where you've left
off.  The databases used by the stats server are not production
databases in the sense that we would ever need to rely on them to
verify work performed.  They exist only to keep us stats-mongers
happy.  The keymaster spits out a logfile entry that shows the
details of each block completed.  Those of you who run personal
proxies will be familiar with the logfile format, but for those
who are not, it contains the timestamp, email address, block
location, block size, cpu, os, and client version.  These logs are
stored perpetually, should we ever need to reference an accurate
and detailed record of what has been done.  Yes, there is much
redundant data, but fortunately for us, redundant data tends to
compress quite well (about 10:1).
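
To illustrate why repetitive log lines compress so well, here is a
minimal sketch using Python's zlib.  The line layout, the addresses,
the platform names and the version string are all made up for the
example; the real keymaster format is not reproduced here, and the
ratio on synthetic data will not match the 10:1 seen on real logs.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable
emails = [f"user{i}@example.com" for i in range(50)]  # made-up addresses
platforms = [("x86", "Linux"), ("x86", "Win32"), ("PPC", "MacOS")]  # made-up values

def fake_entry(n):
    # Hypothetical layout: timestamp, email, block location, block size,
    # cpu, os, client version (the fields named above, in an assumed order).
    cpu, os_name = rng.choice(platforms)
    block = f"{rng.getrandbits(64):016X}"
    stamp = f"03/18/98 22:{n // 60 % 60:02d}:{n % 60:02d}"
    return f"{stamp},{rng.choice(emails)},{block},1,{cpu},{os_name},v2.7\n"

raw = "".join(fake_entry(n) for n in range(20000)).encode("ascii")
packed = zlib.compress(raw, 9)
ratio = len(raw) / len(packed)
print(f"{len(raw)} bytes -> {len(packed)} bytes, about {ratio:.1f}:1")
```

Only the block location changes unpredictably from line to line;
everything else repeats, which is what the compressor exploits.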

Each night, the stats server takes the new logfiles for that day
(typically around 240mb of log representing 2,400,000 individual
log entries) and condenses them into one giant table that tracks:

   date, email, # of blocks, team affiliation

This means that we have one record per participant for each day
that they've submitted blocks.
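
The nightly rollup can be sketched roughly as follows.  This is not
the stats server's actual code: the function name, the dictionary
keys and the team lookup (team_of) are assumptions for illustration,
and it simply counts one block per log entry.

```python
from collections import Counter

def nightly_rollup(entries, team_of):
    """Collapse per-block log entries into one row per participant per day.

    entries: iterable of dicts with at least "date" and "email" keys
             (hypothetical shape, not the real logfile format).
    team_of: dict mapping email -> team (hypothetical lookup).
    """
    counts = Counter((e["date"], e["email"]) for e in entries)
    return [
        (date, email, blocks, team_of.get(email))
        for (date, email), blocks in sorted(counts.items())
    ]

entries = [
    {"date": "1998-03-18", "email": "alice@example.com"},
    {"date": "1998-03-18", "email": "bob@example.com"},
    {"date": "1998-03-18", "email": "alice@example.com"},
]
rows = nightly_rollup(entries, {"alice@example.com": "Team A"})
for row in rows:
    print(row)
```

Three log entries become two rows here, which is how millions of
daily entries shrink to one record per participant per day.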

Smaller, ancillary tables are also created to track platform
stats.  In addition, the participant and team rankings are
retabulated and buffered.

It is from these tables that all the stats you see on the stats
site are built.

In total, the entire stats system is occupying about 319 megabytes
of a 750 megabyte database device.  The primary table (tot_team)
weighs in at 1,547,246 rows in ~75mb of space.  The indexes add
another 50mb or so.
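
For a sense of scale, the figures quoted in this message work out
as below.  This is back-of-the-envelope arithmetic only, assuming
"mb" means 1,000,000 bytes; the variable names are just labels.

```python
MB = 1_000_000  # assumption: decimal megabytes

# Daily logs: ~240mb for ~2,400,000 entries, compressing at about 10:1.
log_bytes_per_entry = 240 * MB / 2_400_000
compressed_mb_per_day = 240 / 10

# Primary table: 1,547,246 rows in ~75mb.
row_bytes = 75 * MB / 1_547_246

# Database device: ~319 of 750 megabytes in use.
device_used = 319 / 750

print(f"log entry:  ~{log_bytes_per_entry:.0f} bytes")
print(f"compressed: ~{compressed_mb_per_day:.0f} mb of log per day")
print(f"table row:  ~{row_bytes:.1f} bytes")
print(f"device use: ~{device_used:.0%}")
```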

|David McNett      |To ensure privacy and data integrity this message has|
|nugget at slacker.com|been encrypted using dual rounds of ROT-13 encryption|
|Birmingham, AL USA|Please encrypt all important correspondence with PGP!|