Server-Side SpamBayes

25 01 2009

With most spam filtering systems, I always have to check for (or at least worry about) ham that’s been misclassified as spam, and I sometimes have to wade through spam that has landed in my inbox. Life’s too short.

For years now I’ve been using SpamBayes instead, and I just love it. Because it has a separate “unsure” folder that receives the few messages it can’t classify with confidence (these are almost always spam), it delivers an extremely low rate of false negatives and effectively zero false positives. The unsure folder also tells me exactly which messages I need to train on.
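To make the three-folder idea concrete, here is a minimal sketch of how a spam score might be split into ham, unsure, and spam. This is not SpamBayes’s own code: the function names are made up for illustration, and the two cutoff values are assumptions (SpamBayes lets you configure its cutoffs, so check your own settings).

```python
# Illustrative cutoffs only; these numbers are assumptions, not values
# taken from the article. SpamBayes makes its cutoffs configurable.
HAM_CUTOFF = 0.20
SPAM_CUTOFF = 0.90


def classify(score: float) -> str:
    """Map a spam probability in [0, 1] to 'ham', 'unsure', or 'spam'."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score < HAM_CUTOFF:
        return "ham"
    if score < SPAM_CUTOFF:
        return "unsure"
    return "spam"


def needs_training(score: float) -> bool:
    """Messages in the unsure band are the ones worth training on."""
    return classify(score) == "unsure"
```

Widening the gap between the two cutoffs sends more mail to the unsure folder but makes outright misclassification less likely, which is the trade-off I’m happy to make.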

Well, I’ve finally gotten around to documenting and publishing my approach. I’ve posted an article and the code on GitHub. I hope this helps somebody else as much as it helps me.




