Saturday, May 30, 2009

Boolean Logic

I've been seeing a lot of people post their email addresses on their blogs with something like "send me an email at username at domainname dot com", thinking that this will outsmart spambots.

Object lesson: ("email" OR "mail" OR "message") {adjacent 3 strings to} ([Username] AND ("@" OR "at" OR "a") AND [Domain] AND ("." OR "dot" OR "point" OR "period") AND [top-level domain]).
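In code, that string could look roughly like the Python sketch below. It is only an illustration, not what any real spambot runs: the names OBFUSCATED and harvest are made up for this example, it skips the "email"/"mail"/"message" proximity clause, and it only handles a single-label domain.

```python
import re

# Hypothetical pattern mirroring the Boolean string above:
# [username] (@ | at | a) [domain] (. | dot | point | period) [top-level domain]
OBFUSCATED = re.compile(
    r"\b([A-Za-z0-9._%+-]+)"                   # username
    r"\s*(?:@|\s(?:at|a)\s)\s*"                # "@", " at " or " a "
    r"([A-Za-z0-9-]+)"                         # domain (single label only)
    r"\s*(?:\.|\s(?:dot|point|period)\s)\s*"   # ".", " dot ", " point ", " period "
    r"([A-Za-z]{2,})\b",                       # top-level domain
    re.IGNORECASE,
)


def harvest(text):
    """Return every address the pattern can rebuild from the text."""
    return [
        f"{user}@{domain}.{tld}"
        for user, domain, tld in OBFUSCATED.findall(text)
    ]


print(harvest("send me an email at username at domainname dot com"))
# ['username@domainname.com']
```

A pattern this loose will also turn a phrase like "look at example dot com" into a bogus address, but a spammer has little reason to care about a few false positives.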

Thing is, spambots are programmed by people. The first spammer who saw the old "at domain dot com" trick wrote up a Boolean string to get past it in about 45 seconds, like I just did, and theirs was probably a lot better than mine. Just sayin'. I'm not saying there's no way to get past spambots, but you've got to be a bit more clever than that.

1 comment:

Shane said...

You are merely scratching at the surface.

You may or may not remember this xkcd comic about Regular Expressions, but pretty much every programming language has a robust regex library available for doing exactly this kind of stuff in a very clean and efficient way, often in a single line of code.

Professionals who design things like email validators (for legitimate sites) and spam bots have thought through any scheme you can think of for masking your email address. They will have debates amongst themselves about how perfect their systems are:

http://www.regular-expressions.info/email.html

In any case, the organized criminals behind the major spyware/botnet infestations are very intelligent computer scientists. The real key isn't to outrun the bear, it's to outrun your friend.