SpamBayes is a Bayesian spam filter written in Python. It is free, cross-platform software. The most notable difference between a conventional Bayesian filter and the filter used by SpamBayes is that there are three classifications rather than two: spam, non-spam and unsure. The user trains the filter by marking messages as either ham (valid email) or spam; when filtering a message, the filter generates one score for ham and another for spam.
SpamBayes is not a single application. The core code is a message classifier, and several applications in the SpamBayes project use that classifier in specific contexts. Most of the current applications run on the client side, although a number of people have experimented with running SpamBayes on mail servers to classify incoming mail for multiple users.
If the spam score is high and the ham score is low, the message will be classified as spam. If the spam score is low and the ham score is high, the message will be classified as ham. If the scores are both high or both low, the message will be classified as unsure. This approach leads to a low number of false positives and false negatives, but it may leave a number of unsure messages that need a human decision.
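The three-way decision described above can be sketched in a few lines of Python. This is only an illustration, not SpamBayes' own code: the function name `classify`, the thresholds `high` and `low`, and the assumption that both scores lie between 0 and 1 are all hypothetical. The sketch also treats any in-between combination as unsure, which the description above does not spell out.

```python
def classify(spam_score: float, ham_score: float,
             high: float = 0.9, low: float = 0.2) -> str:
    """Return 'spam', 'ham', or 'unsure' for a pair of scores.

    Assumes both scores are in [0, 1]; `high` and `low` are
    illustrative thresholds, not values taken from SpamBayes.
    """
    # Strong spam evidence and weak ham evidence: spam.
    if spam_score >= high and ham_score <= low:
        return "spam"
    # Strong ham evidence and weak spam evidence: ham.
    if ham_score >= high and spam_score <= low:
        return "ham"
    # Both high, both low, or anything in between: leave it to the user.
    return "unsure"


if __name__ == "__main__":
    for scores in [(0.95, 0.05), (0.05, 0.95), (0.95, 0.95), (0.05, 0.05)]:
        print(scores, classify(*scores))
```

Requiring the two scores to agree before committing to a label is what keeps false positives and false negatives low; the cost is that conflicting or weak evidence ends up in the unsure group.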