[RC5] Why turn off stats during update?

Kevin van Haaren KvanHaaren at HNTB.com
Thu Jan 1 22:52:30 EST 1998

There are two reasons for turning off the stats during an update:
1) When you hit the stats page it runs a query against the database to
get the info you're looking for.  The processor power for a query
(search) is usually larger than that for adding data (this is a rough
rule of thumb; it really depends on the indexing you set up).  So by
turning off the queries during the update cycle you can throw a lot
more processor at the update and get it done a LOT faster.

2) Microsoft's SQL Server (which I believe the stats use) does page
level record locking, not row level record locking.  This means that
when you request stats for a particular e-mail, all the records in the
"page" that particular information is on are locked.  If the update tries
to write to one of those records during your query it will be blocked by
the lock, and will have to wait until the page is unlocked before
continuing.  [Actually, thinking about this, it may not be a problem
since queries are read-only.  They may not lock the records during a
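
To make the page vs. row distinction concrete, here is a toy sketch in
Python.  It is not how SQL Server is implemented; ROWS_PER_PAGE,
page_of() and blocked_rows() are made-up names for illustration, and it
simply assumes a lock is held at all (which, as noted above, may not be
true for read-only queries).

```python
# Toy model of lock granularity; an illustration, not SQL Server internals.
ROWS_PER_PAGE = 4  # hypothetical: a real page holds however many rows fit


def page_of(row_id: int) -> int:
    """Return the number of the page that holds a given row."""
    return row_id // ROWS_PER_PAGE


def blocked_rows(locked_row: int, granularity: str, total_rows: int) -> list:
    """List the rows a writer cannot touch while locked_row is locked."""
    if granularity == "row":
        # Row-level locking: only the row being read is held.
        return [locked_row]
    if granularity == "page":
        # Page-level locking: every row sharing the page is held too.
        page = page_of(locked_row)
        return [r for r in range(total_rows) if page_of(r) == page]
    raise ValueError(f"unknown granularity: {granularity}")


if __name__ == "__main__":
    print(blocked_rows(5, "row", 12))
    print(blocked_rows(5, "page", 12))
```

With page granularity, one lookup holds up writes to the neighbouring
rows as well, which is the contention described above.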

There are a couple of things that could be done to fix this.  The first
requires throwing hardware at the problem (something in short supply at
Bovine): set up a second "update" stats box.  This would hold the master
db and all updates would be done to it.  Once an update was complete it
could replicate the db to a "lookup" stats box.  Note that the db is
huge (a record for each block, maybe even each key), so even replicating
only the "changed" data would be a huge move of data.  The internet is
probably too slow for the "lookup" stats box to be located anywhere but
in the same facility as the "update" stats box.
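
A rough way to sanity-check the "too slow" worry is to work out the
transfer time for the changed records.  The sketch below is only the
arithmetic; transfer_hours() is a made-up helper, and the record count,
record size and link speed in the example are placeholders, not
measurements of the real stats db.

```python
def transfer_hours(changed_records: int, bytes_per_record: int,
                   link_bits_per_second: float) -> float:
    """Estimate hours needed to copy the changed records over a link.

    Ignores protocol overhead, compression and contention, so treat the
    result as a lower bound.
    """
    total_bits = changed_records * bytes_per_record * 8
    seconds = total_bits / link_bits_per_second
    return seconds / 3600


if __name__ == "__main__":
    # All three inputs are hypothetical placeholders; plug in real numbers.
    hours = transfer_hours(changed_records=1_000_000,
                           bytes_per_record=100,
                           link_bits_per_second=1_000_000)
    print(f"{hours:.2f} hours")
```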

Another option is to use the fact that once the db is updated it is
static for 24 hours.  You could start tracking which queries occur most
frequently, and generate static HTML pages for those queries at update
time.  This would free up some lookup time by having cached pages ready
for the most common requests.  This assumes that requests are pretty
much the same from day to day, with people looking up the same info.
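
The caching idea could look something like the Python sketch below.
Everything in it is an assumption for illustration: the query log is
just a list of the e-mail addresses people looked up, lookup() stands in
for the real database query, and the HTML is a placeholder, not the
actual stats page.

```python
from collections import Counter
from html import escape


def most_common_queries(query_log, top_n):
    """Return the top_n most frequently requested queries, most popular first."""
    return [query for query, _ in Counter(query_log).most_common(top_n)]


def render_page(query, lookup):
    """Run the (expensive) lookup once and wrap the result in static HTML."""
    result = lookup(query)
    return (f"<html><body><h1>Stats for {escape(query)}</h1>"
            f"<p>{escape(str(result))}</p></body></html>")


def build_cache(query_log, lookup, top_n=100):
    """At update time, pre-render pages for the most popular queries."""
    return {q: render_page(q, lookup)
            for q in most_common_queries(query_log, top_n)}


def serve(query, cache, lookup):
    """Serve a cached page if one exists; otherwise fall back to the db."""
    page = cache.get(query)
    if page is None:
        page = render_page(query, lookup)
    return page
```

Since the db is static between updates, the cache never goes stale
during the day; it just gets rebuilt from the latest query log at the
next update.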

BTW I have nothing to do with Distributed.Net (except for donating CPU
cycles!).  I also have no complaints about the stats (well ok, just one:
my listing is way too low, 2082nd place just doesn't roll off the tongue).
I only check them once every couple of days.


> -----Original Message-----
> From:	James Mastros [SMTP:root at jennifer-unix.dyn.ml.org]
> Sent:	Wednesday, December 31, 1997 9:34 PM
> To:	rc5 at llamas.net
> Subject:	Re: [RC5] Stats are back up
> On Wed, 31 Dec 1997, David McNett wrote:
> > On Wed, Dec 31, 1997 at 09:51:28PM -0500, James Mastros wrote:
> > > And back down:
> > 
> > I'm not sure it's fair to call "updating" "down", but we're all a bit
> > gun shy considering the stats server's unreliability lately.
> > 
> > Just so you know, the reason this load is taking longer than normal is
> > due to the fact that we've managed to transfer all the logfiles across
> > for the missing "19th" data and are loading two days this run.
> > 
> > Oh, Happy New Year to everyone across the pond!
> > 
> > -/\/ugget
> >  (working for your stats on new year's eve)
> Sorry... It wasn't time for updates (by my arithmetic, anyway -- which is
> probably wrong), and the page says that the stats are off for "maintenance",
> not updating... perhaps the message should be made more clear... Something
> along the lines of:
> Stats will be available again soon.
> 	Stats are not available while the server is analysing today's
> 	activity.  Updates normally occur between about 00:00 GMT and 01:30
> 	GMT.  If you get this message far outside of these hours: don't
> 	worry; you would get a different message if this were an unscheduled
> 	outage.  If you really care that much about the stats, consider
> 	mailing (or joining) the mailinglist.
> A couple of notes: 1) Make "mailinglist" a link to the mailinglists page.
>                    2) Don't put that part about a different message if there
>                       isn't a different message.
> and btw, why are all the stats down during update?  (Not
> that I know what I'm talking about in databases -- I end up reimplementing
> database functionality inside spreadsheets.)
> BTW - when I say BTW, I mean just that - don't feel in any hurry to answer
> that... If you're working on stats now, then you have far more important
> things to do.
> 	-=- James Mastros
