megan@elon

megan conklin's blog -- elon university, department of computing sciences

Saturday, December 13, 2003

SIGCSE poster & musings on spam & viruses

I got approved to present a poster at the 2004 SIGCSE (pronounced "sig-see") conference in Norfolk, VA, March 3-7. SIGCSE is the ACM's Special Interest Group on Computer Science Education. A poster is not a very big deal, but I'm excited because it will be my first SIGCSE conference. My poster is entitled "Lemons Into Lemonade: Using Spam as a Teaching Tool".

And in related news, Gavin Stubberfield (real name: Jeremy Jaynes) was arrested this week in Raleigh for peddling products through unsolicited commercial email (spam). This is great news. Unfortunately, it's a little too late, since I'm more or less retired from the spam-fighting business. It was a short, but interesting, career. In fact, the SIGCSE poster will be my last spam paper.

What drove me out of spam fighting? Well, I was trying to fight spam using both the law and computer science at the same time. I was trying to develop data mining techniques that would be able to (1) match spam messages to individual spammers, and (2) organize the spams of end users in order to make prosecutions and lawsuits of these spammers easier. This was sort of a spammer fingerprinting project.
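To make the fingerprinting idea a little more concrete, here is a minimal sketch, not the actual project code. It assumes that the hostnames a spam links to are a usable "fingerprint" and simply groups messages that advertise the same host. The function names, the regular expression, and the sample messages are all hypothetical, invented for illustration.

```python
import re
from collections import defaultdict

# Assumption: a linked hostname is a crude but useful spammer "fingerprint".
# This pattern only catches plain http/https URLs in the message text.
URL_HOST = re.compile(r"https?://([A-Za-z0-9.-]+)", re.IGNORECASE)


def linked_hosts(message):
    """Return the set of lower-cased hostnames linked from one message."""
    return {h.lower().rstrip(".") for h in URL_HOST.findall(message)}


def group_by_host(messages):
    """Map each linked hostname to the indices of the messages that use it."""
    groups = defaultdict(list)
    for i, msg in enumerate(messages):
        for host in sorted(linked_hosts(msg)):
            groups[host].append(i)
    return dict(groups)


# Hypothetical sample spams (made-up hosts under the reserved .example TLD).
SAMPLES = [
    "Cheap meds at http://pills.example/buy now",
    "Refinance today: http://loans.example/apply",
    "Best prices!! http://PILLS.example/order?id=7",
]

if __name__ == "__main__":
    for host, indices in group_by_host(SAMPLES).items():
        print(host, indices)
```

A real fingerprint would presumably combine several features (relay headers, whois records, message templates) rather than relying on one, since spammers rotate domains often.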

Then, last month, the US Congress passed the atrocious CAN-SPAM legislation, which took away an individual's right to sue a spammer. Thus, no need for my software. ISPs can still sue, but they don't really need the kind of software I was making. (Mine was designed more for an end user who wanted to "mine" their own spam without using grep or whois or any of that stuff.) So, it's back to the drawing board for me. There are still many, many interesting projects going on with mining spam, but most of them center on building better spam filters, rather than on the kind of pattern matching (spammer fingerprinting?) I was trying to do.

Well, why not pitch in with the filtering projects? The Bayesian techniques they're using are based on machine learning, right? And this is my area.... Ugh. The whole spam filtering thing reminds me of the anti-virus business. (Which I already lived through once.) The virus writers write a virus, the anti-virus folks incorporate the virus signature into their product. Lather, rinse, repeat. To me, this gets really boring. The anti-virus companies get accused of "wanting" viruses to continue to spread so that they can sell more product. There is little public research being done on viruses; they're not particularly interesting anymore from a technical standpoint, so what is there to study besides ethical issues and sociological mumbo-jumbo? You could use machine learning or data mining techniques to try to separate viral from non-viral code, but these techniques are not used in any commercial product that I'm aware of. Why not? I don't know.
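For readers who haven't seen the Bayesian filtering approach, here is a rough sketch of the general idea: a tiny naive Bayes classifier over word counts with add-one smoothing. It is not the implementation of any particular filter. The class name, the whitespace tokenizer, and the training messages are hypothetical, and the sketch assumes each label has been trained on at least one message.

```python
import math
from collections import Counter


def tokenize(text):
    """Very crude tokenizer: lower-case and split on whitespace."""
    return text.lower().split()


class TinyBayes:
    """Toy naive Bayes spam/ham classifier (illustrative only)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def log_score(self, text, label):
        # Assumes both labels have at least one training message;
        # otherwise the prior below would be log(0).
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total = sum(self.word_counts[label].values())
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for word in tokenize(text):
            # Add-one (Laplace) smoothing so unseen words don't zero the score.
            count = self.word_counts[label][word]
            score += math.log((count + 1) / (total + len(vocab)))
        return score

    def classify(self, text):
        return max(("spam", "ham"), key=lambda label: self.log_score(text, label))


def build_demo_filter():
    """Train on a handful of made-up messages."""
    f = TinyBayes()
    f.train("cheap meds buy now", "spam")
    f.train("buy cheap pills", "spam")
    f.train("meeting notes attached", "ham")
    f.train("lunch at noon", "ham")
    return f


if __name__ == "__main__":
    demo = build_demo_filter()
    print(demo.classify("cheap pills now"))
    print(demo.classify("meeting at noon"))
```

The lather-rinse-repeat problem shows up here too: as soon as spammers change their vocabulary, the filter has to be retrained on fresh examples.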

Wednesday, December 10, 2003

Prepping for Winter Term

Mostly I've been working this week on the Winter Term (GST) class. I've watched Minority Report, Tomb Raider, and X2 this week, and have rented the oldies Johnny Mnemonic and Net Force.