[rc5] stats problem

Bill Plein bill at diablo.net
Tue Nov 4 17:51:04 EST 1997


>The database needed to provide the web stats would be a lot smaller than
>the one required for the whole project. The problem is how to push even
>this smaller amount of data out to the stats web servers, which you still
>need to recruit.

Sure, but the processing power to calculate summary tables, based on
MILLIONS of records, is immense.

Here is the layout of each database, with the pseudo-code query a team lookup would run:

MainStats:
	1 record per block (millions of records)
	Select count(*) from stats_table
	where team_id=37

ProxyStats:
	1 record per team (thousands of records)
	Select block_count from stats_table
	where team_id=37

Now in order to populate the ProxyStats machines, you are going to have to
do the following:

	Insert into ProxyStats.dbo.stats_table (team_id, block_count)
	Select team_id, count(*) from MainStats.dbo.stats_table
	group by team_id

If MainStats were properly indexed, this would take a while, but would be
manageable. Proper indexing, of course, increases the size of
MainStats.dbo.stats_table.

And we're talking gigabytes of storage here, folks.
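For anyone who wants to try the rollup on a small scale, here is a rough
sketch in Python using the built-in sqlite3 module. SQLite stands in for the
real SQL server, and the table names main_stats and proxy_stats, the index
name, and the five toy rows are all made up for illustration. It only shows
the shape of the query, not how it would perform on millions of rows.

```python
import sqlite3


def build_summary(conn):
    """Rebuild the per-team summary table from the per-block table."""
    cur = conn.cursor()
    # Every team's count changes each run, so replace the whole summary.
    cur.execute("DELETE FROM proxy_stats")
    cur.execute(
        "INSERT INTO proxy_stats (team_id, block_count) "
        "SELECT team_id, COUNT(*) FROM main_stats GROUP BY team_id"
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- One row per completed block (hypothetical schema).
    CREATE TABLE main_stats (
        block_id INTEGER PRIMARY KEY,
        team_id  INTEGER NOT NULL
    );
    -- The index speeds up grouping by team, at the cost of extra storage.
    CREATE INDEX idx_main_team ON main_stats (team_id);
    -- One row per team.
    CREATE TABLE proxy_stats (
        team_id     INTEGER PRIMARY KEY,
        block_count INTEGER NOT NULL
    );
    """
)

# Toy data: three blocks for team 37, two for team 42.
conn.executemany(
    "INSERT INTO main_stats (team_id) VALUES (?)",
    [(37,), (42,), (37,), (37,), (42,)],
)

build_summary(conn)
summary = dict(conn.execute("SELECT team_id, block_count FROM proxy_stats"))
```

After this runs, a lookup for one team reads a single row from proxy_stats
instead of counting rows in main_stats.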

But anyhow, you populate ProxyStats maybe once an hour, and distribute that
table to the proxy servers. That would be an 8000-record table at this
point, and EVERY record changes, so all 8000 records would be transmitted.

Yup, it is do-able. Now you just have to find the proxy SQL servers, and at
something like $3000 a pop for licensing (they need multiple client
licenses if being used by a web server), a LEGAL and LICENSED proxy network
is very expensive.

Then again, the proxy stats servers could receive flat ASCII text records
and use a Perl CGI script...
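
To make the flat-file idea concrete, here is a rough sketch. It is written
in Python rather than Perl, and the record format (one tab-separated
"team_id, block_count" line per team) and the function names are assumptions
for illustration, not anything the project has agreed on. A CGI script on a
proxy would do the load and lookup part.

```python
import io


def dump_summary(rows, out):
    """Write one 'team_id<TAB>block_count' line per team, sorted by team."""
    for team_id, block_count in sorted(rows):
        out.write(f"{team_id}\t{block_count}\n")


def load_summary(src):
    """Read the flat file back into a dict keyed by team_id."""
    table = {}
    for line in src:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        team_id, block_count = line.split("\t")
        table[int(team_id)] = int(block_count)
    return table


def lookup(table, team_id):
    """Return the block count for a team, or 0 if the team is unknown."""
    return table.get(team_id, 0)


# Main stats side: serialize the summary (toy rows, not real stats).
buf = io.StringIO()
dump_summary([(42, 2), (37, 3)], buf)
flat_text = buf.getvalue()

# Proxy side: load the received text and answer a lookup.
table = load_summary(io.StringIO(flat_text))
```

With no SQL server on the proxy, the only thing to ship each hour is the
text file itself.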


--
Bill Plein
bill at diablo.net 
PGP Key:
http://keys.pgp.com:11371/pks/lookup?op=get&exact=on&search=0x3860E5B9
