Having a spam filter in place is vital these days, and the one I use is SpamBayes. It has an excellent Outlook plugin which works very well. The accuracy is very good (I have 2.5k spams and 10k hams for training), though I do get a reasonable number of SpamSuspects, which are mostly spam.
The problem I had was that it is a client-only solution, so I had to download all that email to Outlook before it was identified as spam. This is a particular problem when I am on site with clients, and my only email access is typically via webmail. Picking out real email amongst the multitudes of spam is not something I enjoy.
So, I wondered how I could get the spam filtering to happen on my ISP’s server without me having to download the mail first. Like many ISPs, mine does have a SpamAssassin option, but I don’t find it very accurate, and because it modifies my emails I would have to retrain SpamBayes.
Fortunately, my ISP (Beyond Perception) is accommodating about the software I can install there, so I worked out how to install SpamBayes filtering directly on the server using the spam/ham training database that I had accumulated locally in Outlook.
.forward
This file sits in my home directory and just forwards emails to Procmail, which is what the ISP uses for mail processing. Some trial and error and Googling showed that the following magic spell does what I need.
~$ cat .forward
|/usr/bin/procmail -f-
.procmailrc
This controls how Procmail works.
~$ cat .procmailrc
# .procmailrc for using Python SpamBayes filter
MAILDIR=$HOME/mail
DEFAULT=$MAILDIR/inbox
LOGFILE=$MAILDIR/log
# Run the spam filter program on all emails, which will then have X-Spambayes headers added for
# processing by the later rules below. This just executes the Python SpamBayes program, specifying
# the appropriate database.
:0fw:hamlock
| $HOME/utils/spamfilter
# Anything classified as spam is put directly in the spam box (might eventually
# just delete it).
:0
* ^X-Spambayes-Classification: spam
${MAILDIR}/spam/inbox
# The above just puts the mail directly into a file. If I used the following action it would forward
# the email (depends on your mail handling setup)
# ! spam@vaccaperna.co.uk
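To check what the spam recipe above would do with a given message without touching the live mailbox, the same header test can be reproduced offline. The following is a minimal Python sketch and not part of my setup: the function name `destination` and the sample messages (including their score values) are made up for illustration, and it only mimics the one rule shown, not Procmail in general.

```python
from email import message_from_string


def destination(raw_message, maildir="mail"):
    """Return the mailbox path the spam recipe would choose for raw_message."""
    msg = message_from_string(raw_message)
    # Like the Procmail condition, look only at the headers and ignore case.
    verdict = (msg.get("X-Spambayes-Classification") or "").strip().lower()
    if verdict.startswith("spam"):
        return maildir + "/spam/inbox"
    # Anything else falls through to DEFAULT.
    return maildir + "/inbox"


# Hypothetical sample messages; the second mentions the header only in its body.
spam = "From: a@example.com\nX-Spambayes-Classification: spam; 0.99\n\nBuy now\n"
ham = ("From: b@example.com\nX-Spambayes-Classification: ham; 0.01\n\n"
       "X-Spambayes-Classification: spam\n")

print(destination(spam))
print(destination(ham))
```

The second sample is there to show that a matching line in the message body does not count, which is also how the recipe behaves since it tests headers only.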
Transferring spam/ham classification DB
The above works fine but needed "training". I realised I could shortcut this by reusing the spam/ham classification database from my local Outlook installation. All I needed to do was export the database to a text file (using dbExpImp.py, a script which is part of SpamBayes), ftp that file up to the server and recreate the database there. This is a slightly manual process, but the whole thing works well enough that I did the full transfer once and now only transfer updates every month or so.
The database is sitting in:
C:\Documents and Settings\<username>\Application Data\SpamBayes
The command to export it is:
dbExpImp.py -e -D default_bayes_database.db -f db.txt
This required a slight tweak to the SpamBayes files to ensure the correct database format was used (Berkeley DB).
The reverse procedure on the server converts the text file back again.
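As a rough sketch of the round trip, the two steps can be written as a pair of command lines built by one small Python helper. This is an illustration only: the helper names are made up, the `-i` option is assumed to be the import counterpart of `-e` (check `dbExpImp.py -h` for your version), and the database file name on the server is assumed to be the same as the local one.

```python
import subprocess


def dbexpimp_command(direction, db_file, text_file):
    """Build the dbExpImp.py argument list for an export ("e") or import ("i")."""
    if direction not in ("e", "i"):
        raise ValueError("direction must be 'e' (export) or 'i' (import)")
    # -D and -f are the options used in the export command above;
    # -i for import is an assumption, not taken from the original command.
    return ["python", "dbExpImp.py", "-" + direction, "-D", db_file, "-f", text_file]


def run_step(command):
    """Run one step; needs Python and dbExpImp.py reachable from the current directory."""
    subprocess.check_call(command)


# Export locally, upload db.txt, then import on the server.
export_cmd = dbexpimp_command("e", "default_bayes_database.db", "db.txt")
import_cmd = dbexpimp_command("i", "default_bayes_database.db", "db.txt")

print(" ".join(export_cmd))
print(" ".join(import_cmd))
```

Nothing is executed until `run_step` is called, so the printed commands can be checked by eye first.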



