[RC5] Possible bug

Mark Duell mduell52 at cox.net
Sat Jul 12 22:56:45 EDT 2003


----- Original Message ----- 
From: "Michael Pelletier" <mvpel at nortelnetworks.com>
To: <rc5 at lists.distributed.net>
Sent: Saturday, July 12, 2003 5:44 PM
Subject: [RC5] Possible bug


> I've got a number of clients set up to all work from a common set of
> buffers accessed over NFS, so that I can run a process to do the
> fetch/flush while the rest of the systems crunch the packets:
>
> Clients:
>  [buffers]
>  buffer-only-in-memory=no
>  alternate-buffer-directory=/net/gallant/vol/vol0/home/mis/mvpel/dnetc/
>  frequent-threshold-checks=0
>  [networking]
>  disabled=yes
>  [ogr]
>  fetch-workunit-threshold=1
>
> As long as the buffers are full, the clients pluck one packet at a
> time from the buff-in.ogr file in the specified alt buffer dir, and
> write it out to buff-out.ogr in that dir.
>
> However, due to the extremely short amount of time needed to process
> the 24-stubs, the buff-in.ogr ran out of stubs, which I believe
> triggered one of these clients to attempt a fetch/flush operation.
>
> However, I think it was attempting to flush the file to itself, in
> essence.  For 20 minutes, and occasionally through the day, I got:
>
> [Jul 12 01:56:09 UTC] FlushFile error: The source file isn't getting smaller.
>                       Check for a loop in your (remote) buffer settings!
> [Jul 12 01:56:09 UTC] OGR: Transferred 4 packets (5.76 stats units) to file.
> [Jul 12 01:56:10 UTC] FlushFile error: The source file isn't getting smaller.
>                       Check for a loop in your (remote) buffer settings!
> [Jul 12 01:56:10 UTC] OGR: Transferred 4 packets (5.76 stats units) to file.
>
> etc. etc. etc.
>
> The buffer shrank from about 17,000 stats units worth of OGR packets
> down to about zero.  I'm not sure if these packets are lost or not,
> I guess I'll see what happens when the stats run completes.  :-(
>
> The working dir of all the clients is the same as the
> alternate-buffer-directory.
>
> Did I tickle something in the fetch/flush code that caused this due
> to my configuration of the clients?
>
> Any comments or recommendations would be welcome.
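
The "loop in your (remote) buffer settings" warning, together with the
working dir being the same as the alternate-buffer-directory, suggests the
flush source and destination may be the same file. That is only a guess at
the cause, not a confirmed reading of the client code. A rough way to check
it from outside the client might look like the sketch below. The function
name is made up for illustration, and the default file name buff-out.ogr is
taken from the description above:

```python
import os


def same_buffer_file(working_dir, alt_buffer_dir, name="buff-out.ogr"):
    """Return True if both directories resolve to the same buffer file.

    Hypothetical helper: it only compares paths on disk and does not
    reproduce how the client itself chooses its buffer files.
    """
    a = os.path.join(working_dir, name)
    b = os.path.join(alt_buffer_dir, name)
    if os.path.exists(a) and os.path.exists(b):
        # Compares device and inode, so symlinks and alternate mount
        # paths to the same file are detected as well.
        return os.path.samefile(a, b)
    # The file may not exist yet; fall back to comparing resolved paths.
    return os.path.realpath(a) == os.path.realpath(b)
```

If this returns True for a client's working dir and its
alternate-buffer-directory, a flush from one to the other would be writing
the file back onto itself.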

A slightly more elegant way would be to run a pproxy on the machine that
currently hosts the buffers on NFS, and then have all the other machines
fetch/flush from it. Running a pproxy also lets you do some other nifty
things, like stats.

Mark Duell



