[stats-dev] Incremental ogr-verify

stats-dev at lists.distributed.net stats-dev at lists.distributed.net
Tue Jun 17 11:40:01 EDT 2003


I saw that dB and TheJet discussed this, and I talked to Decibel a bit, but
I figured I would bring it back here so everyone could follow along and
throw in their opinions.  Most of my coments are about one section, so I've
copied the query in question.

First off, as I mentioned to decibel last night, max_version will never
change.  We aren't throwing out any work, just reassigning it, so we can
skip dealing with it entirely.  Not a huge deal, but it makes things a bit
simpler.

I guess I'm not sure that this query is doing the right thing, as it doesn't
look at where an id is being retired to.  If we have a stub that has been
done by ids 1,2,3 and 1 retires to 4, there is no need to do anything.  In
fact, in the above case, if 1 retires to 2, the stub should still be counted
as having been done twice (by 2 and 3).

I would propose we do one of two things, when we see that a stub has been
"tainted" by having one of the people who did it retire (any of the ids that
did it retire to the retire_to id) we either go through the chain one by one
and figure out if it's still valid, or we simply mark the stub,nodecount
combo as needing to be recalculated.

I had orginally thought about doing something similar to what TheJet
proposed.  Namely, having a cursor of all the affected stubs and going
through them one by one.  This would also buy us the benefit of being able
to keep ogr_complete up to date.  Adding some triggers might help and/or
obviate the need for the cursor, but I'mnot sure how anyone feels about
that.

Trying to spark discussion as much as anything else

-Nerf

-----Original Message-----

Build a list of new information.

SELECT stub_id, nodecount
        , (SELECT count(*)
                FROM retired_stub_id rsi
                WHERE rsi.stub_id = r.stub_id
                    AND rsi.nodecount = r.nodecount
            ) AS ids
        , max(p.version) AS max_version
    INTO TEMP retired_new_info
    FROM ogr_results r, platform p
        , (SELECT DISTINCT stub_id, nodecount FROM retired_stub_id) rs
    WHERE r.stub_id = rs.stub_id
        AND r.nodecount = rs.nodecount
        AND r.id NOT IN (SELECT id
                                FROM retired_stup_id rsi
                                WHERE rsi.stub_id = r.stub_id
                                    AND rsi.nodecount = r.nodecount
                            )
        AND p.platform_id = r.platform_id
;

UPDATE OGR_results
    SET participants = participants - n.ids
        , max_client = n.max_version
    FROM retired_new_info n
    WHERE r.stub_id = n.stub_id
;

-- 
Jim C. Nasby (aka Decibel!)                    jim at nasby.net



More information about the stats-dev mailing list