[RC5] spam

David Taylor rc5 at xfiles.nildram.co.uk
Mon Jun 21 18:01:46 EDT 1999


On Sun, 20 Jun 1999, Ben Li wrote:

> >Denie Andriessen said:
> ...
> >It would still be rather easy actually, although admittedly it would be 
> >more work for the spammer.  Once you know the font being used and have 
> >a sample of every character in that font (easy) it would not be hard to 
> >write a program to compare blocks of pixels with those which are 
> >already known and to reconstruct the email addresses.
> >
> >It's just about the simplest form of image recognition.
> 
> I don't think that that is a really big concern since that would require
> quite a bit of work from a spammer, just to get a maximum of ~100,000 email
> addresses.  One GIF for each character sounds good since it prevents easy

I agree. Though the work wouldn't be too difficult, it would be more effort
than the average spammer would put into something. It's like car security:
you don't (can't) make a car 100% secure, you just design it to delay the
thieves for 5-10 minutes.
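
For reference, the block-comparison idea quoted above could look roughly like the sketch below. It assumes a fixed-width font and exact pixel matches; the 3x3 glyph bitmaps and the names GLYPHS, recognise and render are made up for illustration and are not taken from any real stats page.

```python
# Hypothetical 3x3 monochrome samples, one per character of the font.
# A real font would need a sample bitmap for every character in use.
GLYPHS = {
    "a": ("###", "#.#", "###"),
    "b": ("#..", "###", "###"),
    "@": ("###", "#.#", "#.."),
}
GLYPH_WIDTH = 3
GLYPH_HEIGHT = 3


def recognise(image_rows):
    """Cut a rendered line into fixed-width blocks and look each one up."""
    # Invert the sample table: bitmap -> character.
    lookup = {bitmap: char for char, bitmap in GLYPHS.items()}
    text = []
    for x in range(0, len(image_rows[0]), GLYPH_WIDTH):
        block = tuple(row[x:x + GLYPH_WIDTH] for row in image_rows)
        text.append(lookup.get(block, "?"))  # "?" marks an unknown block
    return "".join(text)


def render(text):
    """Build a test image by placing the known glyphs side by side."""
    return ["".join(GLYPHS[c][r] for c in text) for r in range(GLYPH_HEIGHT)]


print(recognise(render("a@b")))
```

Exact matching only works because the page would draw every character from the same bitmaps each time. Anti-aliasing or a proportional font would need a tolerant comparison, which is where the extra effort comes in.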

> One GIF for each character sounds good since it prevents easy
> harvesting from pages but the number of HTTP requests (neglecting local
> agent caching) would increase very dramatically due to the need to fetch a
> file for each character in each address displayed.  A simpler way of

Uhm.  That would mean he could forget the image recognition and just figure
out the names of the 'a' image, 'b' image, 'c' image, '@' image, etc.
It would be the same as the above (find an example of each letter),
followed by little or no work modifying a simple web spider to collect the
filenames and convert them.
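
A minimal sketch of that conversion step, assuming the page uses one image per character with predictable filenames. The filenames in IMAGE_TO_CHAR and the function name address_from_images are hypothetical; the table would be filled in by hand after looking at one example of each image.

```python
import re

# Hypothetical filename -> character table (an assumption, not a real site).
IMAGE_TO_CHAR = {"a.gif": "a", "b.gif": "b", "at.gif": "@", "dot.gif": "."}

# Pull the src attribute out of every <img> tag, in document order.
IMG_SRC = re.compile(r'<img[^>]*\bsrc="([^"]+)"', re.IGNORECASE)


def address_from_images(html_source):
    """Rebuild the text spelled out by a run of per-character images."""
    names = IMG_SRC.findall(html_source)
    # Drop any directory part, then map each filename back to its character.
    return "".join(
        IMAGE_TO_CHAR.get(name.rsplit("/", 1)[-1], "?") for name in names
    )


page = '<img src="/c/a.gif"><img src="/c/at.gif"><img src="/c/b.gif">'
print(address_from_images(page))
```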

> preventing much of the harvesting would be to send the addresses as HTML
> ampersand codes, like &#64;, which will render as an address when viewed as
> a human but will not show up as addresses when read as a file.  Any browser
> worth using will support the use of such codes (including lynx).  This
> solution, however, would increase the average page size rather than
> the number of server hits as above.  Randomly embedding ASCII
> 000 in addresses may also work since browsers generally do not render nulls
> whereas harvesting programs may treat them as something to be parsed :-)  

Again, these solutions are *very* easy to work around.
The .gif idea, while I originally liked it, can also be worked around,
would increase server load and bandwidth usage, and would mean I'd need
to load up Netscape just to check my damned stats.
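
To show how little effort the entity and null tricks would take to undo, here is a rough sketch of what a harvester might do. It assumes the addresses are hidden only by HTML character references and embedded NUL bytes; the ADDRESS pattern is deliberately simplistic and the function name harvest is made up for illustration.

```python
import html
import re

# Deliberately loose pattern; real address syntax is more complicated.
ADDRESS = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def harvest(page_source):
    """Strip NUL bytes, decode character references, then scan as usual."""
    cleaned = html.unescape(page_source.replace("\x00", ""))
    return ADDRESS.findall(cleaned)


print(harvest("contact: user&#64;example&#46;com"))
print(harvest("contact: us\x00er@exa\x00mple.org"))
```

Both calls print the plain address, so the two preprocessing steps are all that a spider would need to add.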
 
> 
> Cheers,
> 
> Ben.

-- 
David Taylor
E-Mail:	dtaylor at nildram.co.uk.spam
ICQ:	268004
[Remove .spam from e-mail to reply]

--
To unsubscribe, send 'unsubscribe rc5' to majordomo at lists.distributed.net
rc5-digest subscribers replace rc5 with rc5-digest


