The overall philosophy behind how we are running Spamassassin is that your mail is filtered on the server Mail.CS.Trinity.Edu . Spamassassin divides your mail into two mailboxes, one for ham (non-spam mail) and one for spam. The names of the two mailboxes are arbitrary. The simplest thing is to have the ham placed in a mailbox called inbox in your default mail directory. The spam could be placed in a mailbox called something like spamfound. In addition you might have two mailboxes called ham and spam that will be used for training, as we will see shortly. The reason for using inbox as the mail file read by your mail client (such as mutt, pine, kmail, etc.) is that, without something like POP3, your mail client can't access your real mail spool file on the server, which is:
/var/spool/mail/<your_login_name>
If you want to use a mail client and have it pop mail, then you would leave your mail in the normal Linux spool file. As you go through your mail you may find mail in your spool file that is actually spam, so it should be transferred into the spam mailbox. If you find mail in your spamfound mailbox that is really ham, transfer it to your ham mailbox. Any spam found in your inbox mailbox should be transferred to the spam mailbox. Initially you might find that Spamassassin makes a large number of mistakes, but with time and training the mistakes should shrink to the point that your spool file is nearly or maybe totally spam free. Training should be done with about equal amounts of spam and ham, so you might purposely transfer some messages from your inbox (your good mail file) to the ham folder to even out the number of spam and ham messages used in training. More about this later. The data gathered by Spamassassin during training is kept in databases found in the .spamassassin sub-directory of your home directory.
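To keep the ham and spam training sets roughly equal, it helps to see how many messages each training mailbox holds. The following is a minimal sketch, assuming the mailboxes are in mbox format and live in $HOME/Mail; the function name count_messages and the MAIL_DIR variable are made up for this illustration. It counts the "From " separator lines, which is only an approximation of the message count.

```shell
#!/bin/sh
# Count the messages in an mbox file by counting "From " separator lines.
# Prints 0 when the file does not exist.
count_messages() {
    if [ -f "$1" ]; then
        grep -c '^From ' "$1"
    else
        echo 0
    fi
}

# Assumed location of the training mailboxes (override with MAIL_DIR).
MAIL_DIR="${MAIL_DIR:-$HOME/Mail}"
echo "ham:  $(count_messages "$MAIL_DIR/ham")"
echo "spam: $(count_messages "$MAIL_DIR/spam")"
```

If the two numbers are far apart, move some good messages from inbox to ham (or wait for more spam) before training.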
The filtering recipe is in a file called spamassassin-default.rc and, as you will see, it is run from a .procmailrc script that you place in your home directory.
The .procmailrc file that you place in your home directory is actually run by the server Mail.CS.Trinity.Edu, but since the client machines and the server share your home directory, you can do all the procmail configuration and train the Spamassassin databases on a client, and these files will be used for filtering on the server.
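Because the home directory is shared, the mail directory and the training mailboxes can be prepared from any client machine. The following is a minimal sketch, assuming the mailboxes live in $HOME/Mail (the directory used by the example configuration and sa-learn commands in this document); the function name setup_mail_dir is made up for this illustration, and restricting the directory to mode 700 is a suggestion rather than a requirement.

```shell
#!/bin/sh
# Create the mail directory and empty ham/spam training mailboxes
# if they are missing. The directory defaults to $HOME/Mail (an assumption).
setup_mail_dir() {
    dir="${1:-$HOME/Mail}"
    mkdir -p "$dir" || return 1
    chmod 700 "$dir"                 # keep the mail readable by you only
    for box in ham spam; do
        # ":" is a no-op; the redirection creates an empty file
        [ -e "$dir/$box" ] || : > "$dir/$box"
    done
}
```

Running setup_mail_dir "$HOME/Mail" once is enough; existing mailboxes are left untouched.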
You should perform the following steps to set up your Spamassassin configuration.
1. Adding the following recipes to the top of your .procmailrc will get the spam out of the way, allowing everything else to be filtered as per your normal procmail recipes. The example .procmailrc script shown below presumes that you have subscribed to the fedora mail list and that you wish to have all fedora messages written into a mailbox file called fedora.
PATH=$HOME/bin:/usr/bin:/usr/ucb:/bin:/usr/local/bin:
SHELL=/bin/sh
MAILDIR = $HOME/Mail # You'd better make sure it exists
LOGFILE = $MAILDIR/procmail.log
LOCKFILE= $HOME/.lockmail
INCLUDERC=/etc/mail/spamassassin/spamassassin-default.rc

:0
*^Subject:.*\[SPAM\]
spam

:0
*^To:.*email@example.com
fedora

:0
inbox

The recipe that refers to fedora puts all mail that is addressed to:
firstname.lastname@example.org into a mailbox called fedora.
2. You train the database using the following commands, assuming that you have collected ham in a mailbox file called ham and spam in a mailbox file called spam, both in your Mail directory. For ham:
sa-learn --mbox --ham Mail/ham
For spam:
sa-learn --mbox --spam Mail/spam
It turns out that it is productive to train Spamassassin on the ham and spam that it has already identified.
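The two training commands above can be wrapped so that both mailboxes are checked before anything is learned. This is only a sketch, assuming the training mailboxes are $HOME/Mail/ham and $HOME/Mail/spam; the function name train_mailboxes is made up for this illustration, and the sa-learn invocations are the same ones shown in step 2.

```shell
#!/bin/sh
# Train Spamassassin on the ham and spam mailboxes in one step.
# Returns non-zero if sa-learn or either mailbox is missing.
train_mailboxes() {
    dir="${1:-$HOME/Mail}"
    if ! command -v sa-learn >/dev/null 2>&1; then
        echo "sa-learn not found in PATH" >&2
        return 1
    fi
    for box in ham spam; do
        if [ ! -f "$dir/$box" ]; then
            echo "missing mailbox: $dir/$box" >&2
            return 1
        fi
    done
    # Same commands as in step 2, run only when both mailboxes exist.
    sa-learn --mbox --ham "$dir/ham" && sa-learn --mbox --spam "$dir/spam"
}
```

Calling train_mailboxes with no argument uses $HOME/Mail; pass another directory if your mailboxes live elsewhere.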
Spamassassin configuration file
Your Spamassassin configuration directory is called .spamassassin and is located in your home directory. If it does not already exist, create it using the
mkdir .spamassassin
command while in your home directory.
In that directory you will find the Spamassassin databases as well as a configuration file called user_prefs. This file can be generated by using the web interface at http://www.yrex.com/spam/spamconfig.php . The default configuration file produced by this web page is:
# SpamAssassin config file for version 2.5x
# generated by http://www.yrex.com/spam/spamconfig.php (version 1.01)

# How many hits before a message is considered spam.
required_hits 5.0

# Whether to change the subject of suspected spam
rewrite_subject 0

# Text to prepend to subject if rewrite_subject is used
subject_tag [SPAM]

# Encapsulate spam in an attachment
report_safe 1

# Use terse version of the spam report
use_terse_report 0

# Enable the Bayes system
use_bayes 1

# Enable Bayes auto-learning
auto_learn 1

# Enable or disable network checks
skip_rbl_checks 0
use_razor2 1
use_dcc 1
use_pyzor 1

# Mail using languages used in these country codes will not be marked
# as being possibly spam in a foreign language.
ok_languages all

# Mail using locales used in these country codes will not be marked
# as being possibly spam in a foreign language.
ok_locales all

In this file the required_hits option tells the system what score (level of “spamness”) a message must reach to be treated as spam. The subject_tag you choose should match the one being searched for in the .procmailrc file. Note that in Spamassassin 2.5x the tag is only added to the subject when rewrite_subject is set to 1, so with the default value of 0 shown here a procmail recipe that looks for [SPAM] in the subject will not match.
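Since the subject tag has to agree in two files, it can be handy to print the relevant lines side by side and compare them by eye. The following is a minimal sketch, assuming user_prefs lives in the .spamassassin directory in your home directory and that your procmail recipes are in $HOME/.procmailrc; the function name show_tag_settings is made up for this illustration.

```shell
#!/bin/sh
# Print the subject-tagging settings from user_prefs next to the
# Subject conditions in .procmailrc so they can be compared by eye.
show_tag_settings() {
    prefs="${1:-$HOME/.spamassassin/user_prefs}"
    rc="${2:-$HOME/.procmailrc}"
    echo "user_prefs:"
    grep -E '^[[:space:]]*(rewrite_subject|subject_tag)[[:space:]]' "$prefs"
    echo ".procmailrc:"
    grep -n 'Subject' "$rc"
}
```

Remember that the brackets in the procmail condition are escaped with backslashes because they are special characters in a regular expression, so \[SPAM\] in .procmailrc corresponds to [SPAM] in user_prefs.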
You can execute the command:
perldoc Mail::SpamAssassin::Conf
for details of the options that can be used in this file.
Further information on Spamassassin can be obtained at http://www.spamassassin.org .