[stats-dev] new ogr-ver-incr method

Jim C. Nasby decibel at distributed.net
Tue Dec 9 12:57:35 EST 2003


On Tue, Dec 09, 2003 at 04:23:39PM +0000, Chris Hodson wrote:
> I have started to run into some problems getting the incremental version
> of ogr-verification (ogr-ver-incr or just -incr) to work properly.
> Debugging this has been dragging and progress is slow, to say the least.
> While I'm not giving up entirely on this, I think I can get a simpler
> (code-wise) version running a bit quicker.  This probably won't be as
> fast as what I'm trying to get done right now (see
> http://cvs.distributed.net/viewcvs.cgi/stats-proc/ogr-verification/daily_update.sql?rev=1.17),
> but it should be much much faster than what is currently in place and
> much easier to debug.  I wrote up an outline of the new method and some
> pseudo-code.  It assumes a working knowledge of the tables that are
> there now.  If you're interested and don't know about the tables I'd be
> happy to write something up.  Any comments would be appreciated.
> 
> 
> 1. Load day_results with today's work (normalized and de-duped)
I don't think you should be de-duping. I'd like -verify to know what
users are slamming us with duped work.

> 2. Create affected_ids
> 	select L2.id from ogr_idlookup L1, ogr_idlookup L2
> 	where L1.retire_date = :RUNDATE
> 	AND L1.stats_id = L2.stats_id
> 
> 3. Generate list of affected stubs
> 	select stub_id, nodecount from day_results
> 	UNION
> 	select stub_id, nodecount from ogr_results r, affected_ids aid
> 		where r.id = a.id
> 
> 4. Delete stubs affected
> 	delete from ogr_summary
> 		where ogr_summary.stub_id = affected_stubs.stub_id
> 		and ogr_summary.nodecount = affected_stubs.nodecount
> 
> 5. Generate day_summary
> 	SELECT stub_id, nodecount, count(DISTINCT l.stats_id) AS ids,
> 		max(p.version) AS max_version
> 		FROM day_results dr, OGR_idlookup l, platform p
> 		WHERE l.id = dr.id
> 			AND p.platform_id = dr.platform_id
> 			AND NOT dr.in_results
> 		GROUP BY stub_id, nodecount
> 
> 6. Insert day_summary into ogr_summary (could really combine steps 5 and 6)
Yeah, you should combine those, otherwise you're needlessly creating a
temporary table.
-- 
Jim C. Nasby, Database Consultant                  jim at nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"


More information about the stats-dev mailing list