[stats-dev] log loader

Jim C. Nasby decibel at distributed.net
Fri Apr 22 17:22:07 EDT 2005


On Fri, Apr 22, 2005 at 02:16:51PM -0500, Chris Hodson wrote:
> On 22-Apr-2005, Jim C. Nasby wrote:
> > On Fri, Apr 22, 2005 at 03:03:04AM -0500, Jeff Lawson wrote:
> > > 
> > > > A few words about the pre-processor: if we use a C 
> > > > pre-processor, it's obviously faster, but at the cost of 
> > > > portability. 
> > > 
> > > The performance difference might not be that bad if you ensure that the
> > > limiting rate is the I/O of the database insertions, and not your other
> > > pre-processing activities.  For example, by using threading to continue to
> > > parse log lines while you're waiting for the database to do the bulk insert
> > > statement that is executing in another thread.  Of course threading in Perl
> > > is a rather rarely used feature and some consider it to still be a little
> > > experimental.
> > 
> > Nerf was actually suggesting that we just insert via perl, and not do a
> > bulk copy. IMO we should use a bulk copy, as it will be much faster.
> > 
> 
> I was wondering how much of a speed difference it would be vs. how much portability/readability we would be giving up.
> 
> Except for the initial load, I'm thinking this is going to be run either daily or hourly.  Either way, it doesn't add up to a lot of extra processing time, even if it's half as fast.
> 
> Just throwing out ideas at this point.

How would we be giving up portability or readability? And keep in mind
that we already have a logmod written in C that works, it just needs
some minor tweaks for the log database.
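
For illustration only, the overlap Jeff describes above (keep parsing
while another thread waits on the bulk load) could be sketched roughly
like this. It is written in Python rather than Perl or C purely to keep
it short. The comma-separated log format, the names parse_line and
run_pipeline, and the bulk_load callback are all hypothetical; none of
this is taken from logmod or the real log database.

```python
import queue
import threading

_DONE = object()  # sentinel that tells the loader thread to stop


def parse_line(line):
    """Hypothetical parser: split a comma-separated log line into fields."""
    return tuple(field.strip() for field in line.split(","))


def run_pipeline(lines, bulk_load, batch_size=1000):
    """Parse in the calling thread while a second thread loads batches.

    bulk_load is a stand-in for whatever performs one bulk copy per batch.
    """
    batches = queue.Queue(maxsize=4)  # bounded so parsing cannot run far ahead
    errors = []

    def loader():
        while True:
            batch = batches.get()
            if batch is _DONE:
                break
            if errors:
                continue  # after a failure, keep draining so the parser never blocks
            try:
                bulk_load(batch)
            except Exception as exc:
                errors.append(exc)

    worker = threading.Thread(target=loader)
    worker.start()

    batch = []
    for line in lines:
        batch.append(parse_line(line))
        if len(batch) >= batch_size:
            batches.put(batch)
            batch = []
    if batch:
        batches.put(batch)  # flush the final partial batch
    batches.put(_DONE)
    worker.join()

    if errors:
        raise errors[0]  # surface the first load failure to the caller
```

With a single loader thread the batches arrive in the order they were
parsed, so passing something like loaded.append as bulk_load is enough
to try it out. The overlap only helps if the loader spends most of its
time waiting on the database, which is the case Jeff is describing.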
-- 
Jim C. Nasby, Database Consultant           decibel at distributed.net
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

