[stats-dev] Why do we care about date in the load functions?

Jim C. Nasby decibel at distributed.net
Thu Jul 29 09:11:51 EDT 2004


On Thu, Jul 29, 2004 at 10:59:20AM +0100, Paul Richards wrote:
> > Looking at the query for tmsummary I'm detecting some wonkyness...
> > 
> > SELECT *, last_date::DATE - first_date::DATE + 1 AS 
> > days_working FROM team_rank WHERE team_id = 1828 AND 
> > project_id = 8 AND last_date = (SELECT MAX(last_date) FROM 
> > team_rank t2 WHERE t2.team_id = team_rank.team_id AND 
> > t2.project_id = team_rank.project_id)
> > 
> > Uhm... huh?!? What's with the (expensive) max() operation?
> 
> I'm guessing that thejet originally thought it would be possible to load
> previous date stats in that manner. If team_rank will only ever contain the
> current days, we can simplify that query.
> 
> There's no where in the php that tries to supply a date, so it's safe to
> remove.
 
Neither rank table contains any history; it's one row per ID. I'll open
a bug to change the behavior.

> > Looking further, both team.php and participant.php's load 
> > functions accept a date. Does anyone know why? There's only 
> > one day's worth of data in the _rank tables. The only thing 
> > this would do is potentially exclude teams/participants that 
> > weren't active yesterday, but that doesn't make much sense 
> > since these functions should only be used for single teams or 
> > participants, not for something like yesterday listings.
> > 
> > By the way, the reason I started looking at this is because 
> > tmsummary is executing at least 15(!!) queries. See
> > http://decibel.statsdev.distributed.net/team/tmsummary.php?pro
> ject_id=8&team=1828&debug=1
> > and http://bugs.distributed.net/show_bug.cgi?id=3729. 
> 
> http://cvs.distributed.net/viewcvs.cgi/stats-html2/etc/team.php.diff?r1=1.25
> &r2=1.26
> ^ 
> That should probably say as a * - we're generating a 'team object' from it.
> If in the future we added caching of objects, we'd then have issues with
> incomplete objects.
> 
> Would you rather that
> http://decibel.statsdev.distributed.net/team/tmsummary.php?project_id=8&team
> =1828&debug=1 did:
> 
> SELECT *, last_date::DATE - first_date::DATE + 1 AS days_working FROM
> team_rank WHERE team_id IN (4264,1828,28304,28166,4719) AND project_id = 8
> AND last_date = (SELECT MAX(last_date) FROM team_rank t2 WHERE t2.team_id =
> team_rank.team_id AND t2.project_id = team_rank.project_id)
> 
> And a single query?
> 
> If so, I'll fix that up for you

I guess if we're populating a full object then maybe the * should be in
there. There's no reason for the IN clause though, or the last_date
clause. Just change the field list if you think we should have the * in
there.

> >Also, something's wrong with the query reporting. For example,
> > http://decibel.statsdev.distributed.net/participant/psummary.p
> > hp?project_id=8&id=39622&debug=1
> > is showing only 2 queries, which isn't right by a long-shot. 
> > (The neighbor and friend queries are missing, at a minimum).
> 
> $g_query_array or whatever it's called should be a global, not part of the
> class. I'm guessing this is what's causing the issue.
> 
> It's a 'bug' that when I first added it was part of the class.
> 
> Paul

Yeah, I added a method to get the array and renamed it from g_query to
_query since it wasn't global. Arguably globals are bad, but if that's
the way to make this work go ahead and change it. Having this working
will make debug *much* easier.
-- 
Jim C. Nasby, Database Consultant                  jim at nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"


More information about the stats-dev mailing list