[stats-dev] new ogr-ver-incr method
Jim C. Nasby
decibel at distributed.net
Tue Dec 9 12:57:35 EST 2003
On Tue, Dec 09, 2003 at 04:23:39PM +0000, Chris Hodson wrote:
> I have started to run into some problems getting the incremental version
> of ogr-verification (ogr-ver-incr or just -incr) to work properly.
> Debugging this has been dragging and progress is slow, to say the least.
> While I'm not giving up entirely on this, I think I can get a simpler
> (code-wise) version running a bit quicker. This probably won't be as
> fast as what I'm trying to get done right now (see
> http://cvs.distributed.net/viewcvs.cgi/stats-proc/ogr-verification/daily_update.sql?rev=1.17),
> but it should be much much faster than what is currently in place and
> much easier to debug. I wrote up an outline of the new method and some
> pseudo-code. It assumes a working knowledge of the tables that are
> there now. If you're interested and don't know about the tables I'd be
> happy to write something up. Any comments would be appreciated.
>
>
> 1. Load day_results with today's work (normalized and de-duped)
I don't think you should be de-duping. I'd like -verify to know what
users are slamming us with duped work.
> 2. Create affected_ids
> select L2.id from ogr_idlookup L1, ogr_idlookup L2
> where L1.retire_date = :RUNDATE
> AND L1.stats_id = L2.stats_id
>
> 3. Generate list of affected stubs
> select stub_id, nodecount from day_results
> UNION
> select stub_id, nodecount from ogr_results r, affected_ids aid
> where r.id = a.id
>
> 4. Delete stubs affected
> delete from ogr_summary
> where ogr_summary.stub_id = affected_stubs.stub_id
> and ogr_summary.nodecount = affected_stubs.nodecount
>
> 5. Generate day_summary
> SELECT stub_id, nodecount, count(DISTINCT l.stats_id) AS ids,
> max(p.version) AS max_version
> FROM day_results dr, OGR_idlookup l, platform p
> WHERE l.id = dr.id
> AND p.platform_id = dr.platform_id
> AND NOT dr.in_results
> GROUP BY stub_id, nodecount
>
> 6. Insert day_summary into ogr_summary (could really combine steps 5 and 6)
Yeah, you should combine those, otherwise you're needlessly creating a
temporary table.
--
Jim C. Nasby, Database Consultant jim at nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828
Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"
More information about the stats-dev
mailing list