I have a few customers who use basic IMAP e-mail hosting through a hosting provider such as Bluehost. This basic e-mail account provides all the features they need, but they were getting blasted with hundreds of spam messages every day. Most of our other customers use Exchange Online or G Suite, both of which filter spam very well. I struggled to configure SpamAssassin to filter out the bad stuff without also filtering out the good stuff, and I couldn’t find any tutorials online that helped. Here we go…

What is SpamAssassin?

SpamAssassin is an intelligent e-mail filter that uses a diverse range of tests to identify unsolicited bulk e-mail, more commonly known as spam. These tests are applied to e-mail headers and content to classify messages using advanced statistical methods. In addition, SpamAssassin has a modular architecture that allows other technologies to be quickly wielded against spam, and it is designed for easy integration into virtually any e-mail system.

How does it work?

This flexible and powerful set of Perl programs, unlike older spam filtering approaches, uses the combined score from multiple types of checks to determine if a given message is spam.

Its primary features are:

  • Header tests
  • Body phrase tests
  • Bayesian filtering (BayesFaq)
  • Automatic address whitelist/blacklist (AutoWhitelist)
  • Automatic sender reputation system (TxRep)
  • Manual address whitelist/blacklist (ManualWhitelist)
  • Collaborative spam identification databases (DCC, Pyzor, Razor2)
  • DNS Blocklists, also known as “RBLs” or “Realtime Blackhole Lists”
  • Character sets and locales

Even though any one of these tests might, by itself, misidentify ham (legitimate mail) or spam, their combined score is very difficult to fool.
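The combined-score idea can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not SpamAssassin’s actual code: the function names and the `EXAMPLE_SCORES` table are hypothetical, the scores are illustrative values rather than SpamAssassin defaults, and the sketch assumes a message is flagged once its total reaches the threshold (5 by default).

```python
# Illustrative per-rule scores, assumed for this sketch only.
# They are NOT SpamAssassin's default scores.
EXAMPLE_SCORES = {
    "BAYES_99": 4.5,            # Bayesian filter is almost certain it is spam
    "T_SPF_PERMERROR": 3.0,     # SPF record could not be evaluated
    "RCVD_IN_MSPIKE_WL": -3.0,  # sender appears on a whitelist
}


def combined_score(hits, scores):
    """Add up the scores of every rule that matched; unknown rules count as 0."""
    return sum(scores.get(rule, 0.0) for rule in hits)


def is_spam(hits, scores, required=5.0):
    """Flag the message once the combined score reaches the threshold."""
    return combined_score(hits, scores) >= required


if __name__ == "__main__":
    examples = (
        ["BAYES_99"],
        ["BAYES_99", "T_SPF_PERMERROR"],
        ["BAYES_99", "T_SPF_PERMERROR", "RCVD_IN_MSPIKE_WL"],
    )
    for hits in examples:
        total = combined_score(hits, EXAMPLE_SCORES)
        print(hits, total, is_spam(hits, EXAMPLE_SCORES))
```

With these assumed values, a single strong hit stays under the threshold, two spam indicators together cross it, and a whitelist hit pulls the total back down.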

Tips and Tricks for using SpamAssassin in cPanel

  1. Many rules deserve greater weight or significance. For example, the rule “BAYES_99” should score close to the spam threshold. (If you’re using the default threshold of 5, use something close, like 4.5.)
  2. You do not want a single attribute or rule to trigger the filter on its own. For example, I assigned “T_SPF_PERMERROR” a value of 3, which stays below the threshold: a bad SPF record is a red flag, but some people just don’t know how to set these up properly.
  3. You can assign negative values to rules that indicate legitimate mail. This may be the most important thing to do: if you are going to score spam indicators aggressively, you need some rules that pull the total back down for good mail. For example, “RCVD_IN_MSPIKE_WL” means the sender is on the Mailspike whitelist, so I gave it a value of “-3”. I also assign similar negative values to “DKIM_SIGNED” and “DKIM_VALID”, as they are good indicators of authentic e-mail.
  4. If the e-mail user doesn’t communicate internationally, you can benefit from adding a mid-range value to rules such as “RELAYCOUNTRY_CN” or “RELAYCOUNTRY_RU”. China and Russia relay a significant amount of spam and malicious messages.
  5. READ THE PROCESSED HEADERS!!! This is beyond important: the headers show which rules fired on a message, which tells you what rules to add or which values to adjust to catch similar e-mail next time. Each user may need some customization of the ruleset for it to work properly; this is generally done by analyzing the headers of both good and bad messages.
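The tips above can be sketched in Python. This is a rough helper, not part of cPanel or SpamAssassin: the function names `parse_spam_status` and `render_user_prefs` are hypothetical, and the sample header is made up for illustration. It assumes the usual `X-Spam-Status` layout (`Yes/No, score=… required=… tests=A,B,C …`) and the `score RULE_NAME value` line syntax that SpamAssassin reads from a `user_prefs` file; the override values are the ones mentioned in this post.

```python
import re


def parse_spam_status(value):
    """Split the value of an X-Spam-Status header into its main parts."""
    # Long headers are folded across lines; collapse all whitespace runs.
    flat = re.sub(r"\s+", " ", value).strip()
    score = re.search(r"score=(-?\d+(?:\.\d+)?)", flat)
    required = re.search(r"required=(-?\d+(?:\.\d+)?)", flat)
    # The rule list runs from "tests=" up to the next "key=" field (or the end).
    tests = re.search(r"tests=(.*?)(?:\s+\w+=|$)", flat)
    names = []
    if tests:
        names = [t.strip() for t in tests.group(1).split(",") if t.strip()]
    return {
        "flagged": flat.lower().startswith("yes"),
        "score": float(score.group(1)) if score else None,
        "required": float(required.group(1)) if required else None,
        "tests": names,
    }


def render_user_prefs(overrides):
    """Render one 'score RULE value' line per override."""
    return "\n".join(
        f"score {rule} {value:.1f}" for rule, value in overrides.items()
    )


if __name__ == "__main__":
    # Made-up header value for illustration only.
    sample = (
        "No, score=3.1 required=5.0 tests=BAYES_99,DKIM_SIGNED,\n"
        "\tDKIM_VALID,T_SPF_PERMERROR autolearn=no version=3.4.2"
    )
    print(parse_spam_status(sample)["tests"])

    # Values taken from the tips in this post.
    print(render_user_prefs({
        "BAYES_99": 4.5,
        "T_SPF_PERMERROR": 3,
        "RCVD_IN_MSPIKE_WL": -3,
    }))
```

Running the parser over a handful of good and bad messages gives you a list of the rules that actually fire for a given mailbox, which is a reasonable starting point for deciding which scores to override.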