rc5 at xfiles.nildram.co.uk
Mon Jun 21 18:01:46 EDT 1999
On Sun, 20 Jun 1999, Ben Li wrote:
> >Denie Andriessen said:
> >It would still be rather easy actually, although admittedly it would be
> >more work for the spammer. Once you know the font being used and have
> >a sample of every character in that font (easy) it would not be hard to
> >write a program to compare blocks of pixels with those which are
> >already known and to reconstruct the email addresses.
> >It's just about the simplest form of image recognition.
> I don't think that that is a really big concern since that would require
> quite a bit of work from a spammer, just to get a maximum of ~100,000 email
> addresses.
I agree. Though the work wouldn't be too difficult, it would be more effort
than the average spammer would put into something. It's like car thieves:
you don't (can't) make a car 100% secure, you just design it to delay the
thieves for 5-10 minutes.
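
For a sense of scale, the block-of-pixels comparison quoted above could
look roughly like the Python sketch below. Everything in it is an
assumption for illustration: the 3x3 glyph table, the '#'/'.' pixel
notation and the function names are made up, and a real attempt would
first have to decode the GIF and sample the actual font.

```python
# Hypothetical 3x3 fixed-width "font"; each glyph is a tuple of pixel rows.
GLYPHS = {
    "a": ("###", "#.#", "###"),
    "b": ("#..", "###", "###"),
    "@": ("###", "#.#", "#.."),
    ".": ("...", "...", ".#."),
}


def render(text, glyphs=GLYPHS):
    """Draw text as rows of pixels, standing in for the image on the page."""
    height = len(next(iter(glyphs.values())))
    return ["".join(glyphs[ch][row] for ch in text) for row in range(height)]


def recognize(rows, glyphs=GLYPHS):
    """Compare each fixed-width block of pixels with the known glyphs."""
    width = len(next(iter(glyphs.values()))[0])
    # Reverse table: pixel block -> character.
    lookup = {bitmap: ch for ch, bitmap in glyphs.items()}
    chars = []
    for x in range(0, len(rows[0]), width):
        block = tuple(row[x:x + width] for row in rows)
        chars.append(lookup.get(block, "?"))  # '?' marks an unknown block
    return "".join(chars)
```

Feeding the output of render() back into recognize() recovers the text,
which is all the "recognition" a fixed, known font would need.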
> One GIF for each character sounds good since it prevents easy
> harvesting from pages but the number of HTTP requests (neglecting local
> agent caching) would increase very dramatically due to the need to fetch a
> file for each character in each address displayed. A simpler way of
Uhm. That would mean he could forget the image recognition and just figure
out the name of the 'a' image, 'b' image, 'c' image, '@' image, etc.
It would be the same as the above (find an example of each letter),
followed by little or no work modifying a small web spider to collect the
filenames and convert them.
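
A minimal sketch of that filename trick, in Python. The naming scheme is
purely hypothetical (one letter per file, plus "at.gif" and "dot.gif"),
as are the /chars/ path and the function name; a real page could name its
images anything, but the mapping only has to be worked out once.

```python
import re

# Hypothetical mapping, built by looking at one example of each image.
# Names not listed here are assumed to be the character itself (a.gif -> 'a').
NAME_TO_CHAR = {"at": "@", "dot": "."}

# Capture the base name of every .gif referenced by an <img> tag.
IMG_SRC = re.compile(r'<img[^>]*\bsrc="(?:[^"]*/)?([^"/]+)\.gif"', re.IGNORECASE)


def address_from_images(html_source):
    """Rebuild the text spelled out by a run of one-character images."""
    names = IMG_SRC.findall(html_source)
    return "".join(NAME_TO_CHAR.get(name, name) for name in names)
```

No pixels are ever looked at; the spider only reads the HTML it was
already fetching.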
> preventing much of the harvesting would be to send the addresses as HTML
> ampersand codes, like &#64;, which will render as an address when viewed by
> a human but will not show up as addresses when read as a file. Any browser
> worth using will support the use of such codes (including lynx). This
> solution, however, would increase the average page size in exchange for an
> increase in the number of server hits as above. Randomly embedding ASCII
> 000 in addresses may also work since browsers generally do not render nulls
> whereas harvesting programs may treat them as something to be parsed :-)
Again, these solutions are *very* easy to work around with little effort.
The .gif idea, while I originally liked it, is also possible to work
around, would increase the server load and bandwidth usage, and would mean
I'd need to load up Netscape just to check my damned stats.
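
As a rough illustration of how little effort the ampersand-code and null
tricks would cost a harvester, here is a sketch in Python. It assumes the
nulls are raw NUL bytes in the page and uses a deliberately simple address
pattern; the function name and the pattern are mine, not taken from any
real harvesting tool.

```python
import html
import re

# Deliberately simple address pattern, good enough for a sketch.
ADDRESS = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harvest(page_source):
    """Undo both obfuscations, then search the text as usual."""
    without_nulls = page_source.replace("\x00", "")  # drop embedded NULs
    decoded = html.unescape(without_nulls)           # &#64; -> @, &#46; -> .
    return ADDRESS.findall(decoded)
```

Two extra lines of preprocessing are enough, so neither trick delays a
harvester for long.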
E-Mail: dtaylor at nildram.co.uk.spam
[Remove .spam from e-mail to reply]
To unsubscribe, send 'unsubscribe rc5' to majordomo at lists.distributed.net
rc5-digest subscribers replace rc5 with rc5-digest