[RC5-PROXYPER] Trimming log-files...

Benoit Hudson bh at cs.brown.edu
Fri Dec 5 09:55:05 EST 1997

> > The FreeBSD proxyper automatically rotates the log at 0000 GMT. A few 
> > minutes past I use cron to gzip that log and move to another directory.
> So you mean the keylog-files where statistics are made about who finished 
> which block when under what OS and so on, don't you. I think I will have 
> to gzip them, too. But first I have to learn a bit more perl to read in a 
> gzipped file and split the data by emails;-)

I actually have worked on a log-size-reducer, which reduces the logfiles
to about 2-3% of the original size (essentially, I convert to binary and
throw away the block number, then I gzip the result).  Get in touch with
me if you're interested.

Then, instead of opening the logfiles directly, just 
	open(LOGFILE, "$command $logfile |");  
which opens a pipe from <command> and lets you read its output (the
command could be, say, gzcat, or my binary log reader).
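
For anyone who would rather do the same thing from C than from perl, here
is a minimal sketch using popen().  The function names open_log_pipe and
count_log_lines are made up for illustration, and the sketch assumes the
logfile name contains no shell metacharacters, since the command line goes
to the shell unquoted (just as it does in the perl one-liner).

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Open the output of "command logfile" for reading, like perl's
 * open(LOGFILE, "$command $logfile |").  Returns NULL on failure.
 * Assumption: logfile has no shell metacharacters; it is not quoted. */
FILE *open_log_pipe(const char *command, const char *logfile)
{
    size_t len = strlen(command) + 1 + strlen(logfile) + 1;
    char *cmdline = malloc(len);
    FILE *pipe_fp;

    if (cmdline == NULL)
        return NULL;
    snprintf(cmdline, len, "%s %s", command, logfile);
    pipe_fp = popen(cmdline, "r");
    free(cmdline);
    return pipe_fp;
}

/* Count the lines the command produces; -1 if the pipe cannot be opened. */
long count_log_lines(const char *command, const char *logfile)
{
    FILE *fp = open_log_pipe(command, logfile);
    long lines = 0;
    int c;

    if (fp == NULL)
        return -1;
    while ((c = getc(fp)) != EOF)
        if (c == '\n')
            lines++;
    pclose(fp);
    return lines;
}
```

Calling count_log_lines("gzcat", "some.log.gz") would then count the
entries of a compressed log without ever unpacking it to disk.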

> 24 [hera] dk> du -s
> 60899   .
> So the file space seems to be freed, although ls -l gives
> -rw-------   1 dk       AEG      15129397 Dec  5 09:58 nohup.out

This is probably because of the structure of S5-based filesystems (like
ufs, ext2fs, and so on).  In order to figure out which blocks are
allocated to a file, you store 'pointers' (block numbers) in an array in
the inode for the file (slightly more complicated than that, but that's
the basics). There's a rather neat optimization: to represent an
empty block (all 0's), you just set the pointer to 0.  This means that a
file created with: 
	fd = open("foo", O_WRONLY | O_CREAT, 0644); /* must be writable */
	lseek(fd, 1<<20, SEEK_SET); /* seek to the 1-MB mark */
	write(fd, &fd, sizeof(fd)); /* write something       */

is just over a megabyte in size, which is what ls will report, but
actually uses only one data block, which is what df and du will report
(depending on how your filesystem is configured, probably 1K). 
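
If you want to see the two numbers side by side, the following is a rough
sketch that creates such a file and reads both sizes back with stat().
The names make_sparse_file and file_sizes are hypothetical, and the
512-byte unit for st_blocks is an assumption that holds on most systems
but is not guaranteed everywhere.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create path with a hole: seek to offset and write one int there.
 * Returns 0 on success, -1 on error. */
int make_sparse_file(const char *path, off_t offset)
{
    int marker = 42;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1 ||
        write(fd, &marker, sizeof(marker)) != (ssize_t)sizeof(marker)) {
        close(fd);
        return -1;
    }
    return close(fd);
}

/* Fetch the size ls -l shows (st_size) and the space du counts
 * (st_blocks, assumed here to be in 512-byte units). */
int file_sizes(const char *path, long long *apparent, long long *allocated)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    *apparent = (long long)st.st_size;
    *allocated = (long long)st.st_blocks * 512;
    return 0;
}
```

On a filesystem that supports holes, calling make_sparse_file("foo", 1<<20)
and then file_sizes should give an apparent size just over a megabyte and
an allocated size of a block or so; the exact allocated figure depends on
the filesystem.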

And, in fact, this would be the situation with the output of the proxy:
the write offset on the proxy's open file descriptor is not going to move
just because you truncated the file out from under it, so the next time it
writes, the data lands at the old offset, however far into the file that
was.  So we've got an inode with a bunch of pointers set to 0, then some
real blocks.
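
Here is a small sketch of that sequence in one process, assuming the
trimming was done by truncating the file by name while the writer still
had it open (I am guessing at what was actually done).  The function name
size_after_trim is made up for the example.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Write first through a descriptor, truncate the file by name (as an
 * outside log trimmer might), then write second through the same
 * descriptor.  Returns the resulting file size, or -1 on error. */
long long size_after_trim(const char *path, const char *first,
                          const char *second)
{
    struct stat st;
    size_t n1 = strlen(first), n2 = strlen(second);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;
    if (write(fd, first, n1) != (ssize_t)n1       /* offset is now n1      */
        || truncate(path, 0) != 0                 /* file empty, offset n1 */
        || write(fd, second, n2) != (ssize_t)n2   /* lands at offset n1    */
        || fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    close(fd);
    return (long long)st.st_size;
}
```

The size that comes back is the length of both writes added together, even
though only the second write's data is really in the file; everything
before it is a hole.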

In other words, df doesn't lie about how much more you can put on your
disk; ls does.

Note that programs like cp aren't smart enough to notice the inode
structure, so if you cp <logfile> <elsewhere>, <elsewhere> will actually
use what ls says about it (the larger amount).
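
A copy can keep the holes if it looks for them.  The sketch below is one
simple way to do it, not how any particular cp works: it seeks over any
block that reads back as all zeros instead of writing it.  The name
copy_keeping_holes and the 4096-byte block size are arbitrary choices for
the example.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define COPY_BLOCK 4096

/* Return 1 if the n bytes at buf are all zero. */
static int all_zero(const char *buf, ssize_t n)
{
    ssize_t i;

    for (i = 0; i < n; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}

/* Copy src to dst, seeking over all-zero blocks instead of writing them.
 * Returns 0 on success, -1 on error. */
int copy_keeping_holes(const char *src, const char *dst)
{
    char buf[COPY_BLOCK];
    ssize_t n;
    off_t total = 0;
    int ok = 0;
    int in = open(src, O_RDONLY);
    int out;

    if (in < 0)
        return -1;
    out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }
    while ((n = read(in, buf, sizeof(buf))) > 0) {
        if (all_zero(buf, n)) {
            if (lseek(out, n, SEEK_CUR) == (off_t)-1)  /* leave a hole */
                break;
        } else if (write(out, buf, (size_t)n) != n) {
            break;
        }
        total += n;
    }
    /* n == 0 means end of file was reached without an error; set the
     * final length explicitly in case the file ends in a hole. */
    if (n == 0 && ftruncate(out, total) == 0)
        ok = 1;
    close(in);
    if (close(out) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

The copy then has the same length and contents as the original, and on a
filesystem that supports holes it should take roughly the space du
reported for the original rather than what ls reported.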

	-- Benoit

To unsubscribe, send 'unsubscribe rc5-proxyper' to majordomo at llamas.net
rc5-proxyper-digest subscribers replace rc5-proxyper with rc5-proxyper-digest
