How to Deal With Email Spam?
Since I already touched on the topic of email spam this week, let me put a question to all of you. How do you deal with overwhelming email spam? Not even counting the spam my company LandlordMax receives each day, which I can assure you is an order of magnitude more, I personally get several thousand spam emails a day! Yes, that’s each and every day. I know some of you get a fraction of that and others get orders of magnitude more! I can’t imagine being Bill Gates (most spammed person in the world) and dealing with his spam levels (4 million spam emails a day).
What I’m starting to notice is that this is becoming a losing battle. I generally let my email client classify most of the emails as spam and throw them in the bulk/junk/spam folder (the name depends on which email client you use). However, at least a few hundred don’t get caught, and worse, several get marked as spam when they aren’t (false positives). Because I’m running a business and many of the emails are very important and not just personal correspondence, this means I need to sift through the junk folder each and every day. I need to double-check thousands of spam emails each day. Talk about boring, and especially error-prone! It appears that I miss the odd email here and there (not many, but a few a month), which is perfectly understandable from my perspective, but not from that of the people who sent the email (which is also perfectly understandable from their perspective). So what are my options?
I do have several options, and others have been suggested to me, but to be honest I haven’t found a solution that I really like yet. So far the best options I’ve seen are:
Continue the same way
This is where I’m at now but I’m looking for a better solution. This is getting more and more time consuming.
Use Spam Assassin
I’m afraid of false positives! Those are the absolute worst. At least with my current solution I can manually filter them and the blame is on me.
Purchase a commercial email spam filter
Which one? And what about false positives?
I forget the term for this type of product, but it basically sends out an email to everyone who emails you to validate who they are. Then it only allows emails through to you from those who responded to your validation email. So basically it’s a way of validating that the emails you’re getting are from a person and not a bot.
Frank Neville from Surfulater (great product by the way) suggested this option a while back to me (thank you Frank) but I’m not too keen on it (this is probably the only time we respectfully had a difference of opinion on a business related item so far that I know of). I understand it’s working great for him, but I’m still hesitant because of the risks of not having people respond. I believe that the onus should be on you, not the person trying to contact you. If someone is trying to reach you to initiate a business deal, or respond to your request, the less trouble they have to go through the more likely things will work out for you. Much like the easier you make the purchasing process the higher your sales conversions are likely to be.
Just delete all the emails in the spam folder
No way, I know for a fact that I’d be deleting many important emails!
Add senders to my “safe list” (or similar terminology depending on your email client).
Yes, but this takes time. This does alleviate the issue, but only for people who’ve already written to you. Initial contacts will still have the same issue, which means you still have to sift through all the junk emails.
So what’s the solution? I don’t know. I’d love to hear your suggestions so fire away!
· October 18th, 2006 · 1:30 pm · Permalink
We use Outlook and have found SpamBayes (free at https://spambayes.sourceforge.net/) to be very effective. After training we get at worst just a couple of false positives a month. However, this may prove less than effective for those in the mortgage industry.
· October 18th, 2006 · 1:52 pm · Permalink
[…] Steph, founder of LandLordMax Property Management Software, has posted a very descriptive post about the problems of dealing with email spam. […]
· October 18th, 2006 · 1:59 pm · Permalink
It’s interesting that you mention a Bayesian filter. That’s exactly what we use for our support system. So far it’s pretty accurate, it took a while to train because of the exact reason you mention, the words in our emails are often used in spam emails (mortgages, real estate, loans, etc.)
This does look like an interesting open source solution for a personal client. I’ll take a look at it.
Thanks for the suggestion!
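To illustrate why a Bayesian filter needs training on your own mail, especially when legitimate messages share vocabulary with spam, here is a minimal naive Bayes sketch in Python. It is not how SpamBayes or HelpSpot implement their filters; the class name `TinyBayes`, the tokenizer, and the sample messages are all made up for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class TinyBayes:
    """Toy naive Bayes spam filter (hypothetical, for illustration only)."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record the words of one message under 'spam' or 'ham'."""
        self.words[label].update(tokenize(text))
        self.messages[label] += 1

    def spam_probability(self, text):
        """Estimate P(spam | text) using add-one smoothing."""
        vocab = len(set(self.words["spam"]) | set(self.words["ham"])) or 1
        total_msgs = sum(self.messages.values())
        log_score = {}
        for label in ("spam", "ham"):
            # Prior: how common this class was in training (smoothed).
            log_p = math.log((self.messages[label] + 1) / (total_msgs + 2))
            total_words = sum(self.words[label].values())
            for word in tokenize(text):
                count = self.words[label][word]
                log_p += math.log((count + 1) / (total_words + vocab))
            log_score[label] = log_p
        # Turn the two log scores back into a single probability.
        diff = log_score["ham"] - log_score["spam"]
        return 1.0 / (1.0 + math.exp(diff))


# Made-up training data: note "mortgage" appears in both classes.
f = TinyBayes()
f.train("cheap mortgage rates refinance now", "spam")
f.train("lowest loan rates click here", "spam")
f.train("question about tenant mortgage field in your software", "ham")
f.train("invoice for rental property report", "ham")

spam_like = f.spam_probability("refinance your loan now")
ham_like = f.spam_probability("question about the rental report")
```

With only four training messages the estimates are crude, which is the point: a word like "mortgage" only stops counting against a message once the filter has seen it in enough legitimate mail.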
· October 18th, 2006 · 2:22 pm · Permalink
After some more reviewing, I decided to try out SpamBayes. It looks very powerful and in common use.
Something to note, when you first install it, all your unread emails go to the Junk Watch or Junk Suspect folder (I don’t remember which one).
That caught me by surprise because I initially thought it had marked all my emails as read. That wasn’t the case, they were just put in another folder. For me this was very scary because I use read/unread as a flag of what emails I’ve responded to, as well as a todo list (I currently have over 1000 emails as todo tasks). Yes I do use other software for my todo (TreePad), but this is also very quick as I don’t have to bring everything over.
Anyways, this is just to let you know not to be surprised like I was!
· October 18th, 2006 · 3:04 pm · Permalink
SpamAssassin is actually pretty good about false positives (or lack thereof). The worst case for it is first-contact from unknown people, but even then it does well with the default ruleset.
The biggest thing that it adds (IMHO) that a pure Bayesian system can’t is checks against various blacklists and distributed hash networks. The checks for Razor and Spamhaus take out an enormous amount of junk, and I’ve yet to have any of the hundred or so domains I host mail for have a problem with blocked mail.
SpamAssassin’s score-based rule system allows you to tweak the system VERY heavily, and its personal white/black lists ensure that important mail from known sources will always get through. My server has all of the RBL-based rules’ scores turned WAY up which helps a lot.
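To make the scoring idea concrete, here is a small Python sketch of a score-based filter with personal white/black lists. It only mimics the general approach; it is not SpamAssassin's code or configuration format, and the rule names, scores, threshold, and addresses are invented for the example.

```python
# Hypothetical rules: each name maps to (test function, score).
RULES = {
    "LISTED_ON_BLACKLIST": (lambda m: m.get("on_rbl", False), 4.0),
    "SUBJECT_ALL_CAPS": (lambda m: m["subject"].isupper(), 1.5),
    "MENTIONS_FREE": (lambda m: "free" in m["body"].lower(), 1.0),
}
THRESHOLD = 5.0  # assumed cutoff: mail scoring at or above it is spam

WHITELIST = {"customer@example.com"}
BLACKLIST = {"bulk@example.net"}


def classify(message, rules=RULES, threshold=THRESHOLD):
    """Return (is_spam, score, names of the rules that fired)."""
    sender = message["from"].lower()
    # Personal lists short-circuit the scoring entirely.
    if sender in WHITELIST:
        return False, 0.0, ["WHITELISTED"]
    if sender in BLACKLIST:
        return True, threshold, ["BLACKLISTED"]
    fired = [name for name, (test, _) in rules.items() if test(message)]
    score = sum(rules[name][1] for name in fired)
    return score >= threshold, score, fired


msg = {
    "from": "stranger@example.org",
    "subject": "FREE OFFER",
    "body": "Get your free sample today",
    "on_rbl": True,
}
result = classify(msg)

# A plain-looking message from a blacklisted server, scored two ways.
plain = {"from": "x@example.org", "subject": "hello", "body": "hi", "on_rbl": True}
tuned = dict(RULES, LISTED_ON_BLACKLIST=(RULES["LISTED_ON_BLACKLIST"][0], 6.0))
default_verdict = classify(plain)[0]
tuned_verdict = classify(plain, rules=tuned)[0]
```

In this toy version, turning a blacklist rule's score up means a single hit is enough to cross the threshold on its own, while whitelisted senders never reach the scoring step at all.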
The biggest problem with SpamAssassin (at least as we have it implemented with Qmail and qmail-scanner) is that the upgrade process for new versions is anything but seamless. Generally I set aside a weekend to upgrade the virus scanner (ClamAV), qmail-scanner and SpamAssassin in one go. Granted our config is patched pretty heavily and most of the upgrade issues are with qmail-scanner, so YMMV.
That said, SpamAssassin does a phenomenal job of blocking junk. I get maybe 3-5 a day that slip through on an email address that’s been active since 1997.
· October 18th, 2006 · 11:39 pm · Permalink
[…] Steph of LandLordMax fame recently posted about how to deal with email spam. This just happens to be the focus of my day job (we’re busy exploiting the synergies of a new paradigm of… actually, no, I don’t work in Dilbert’s office). I use spam filtering religiously on all my email accounts (4 at last count) and it saves me roughly 2,000 spam a day (my business email address is forwarded to by my college email address, which was widely circulated back when I was young and stupid). […]
· October 19th, 2006 · 12:34 am · Permalink
An option someone hooked me onto a little while ago is the concept of “Greylisting”.
Basically it just tells a mailserver that you’re busy and to resend the message in 5 minutes (or however long). Spam email servers aren’t built to deal with this kind of thing, since they’re blowing through email as fast as humanly possible…so supposedly this puts a huge damper on the spam you receive.
The only flaw I’ve seen with it (aside from non-compliant servers just not resending your mail at all) is that you’ve got a five-minute delay in your email. Perhaps you can combine this with some form of whitelisting to let certain people straight through the filtering?
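The greylisting decision described above, combined with the whitelist idea, can be sketched in a few lines of Python. This is a simplified model, not a real mail server plugin: the class name `Greylist`, the choice to key on (client IP, sender, recipient), and the example addresses are assumptions made for illustration.

```python
import time


class Greylist:
    """Toy greylist keyed on (client IP, sender, recipient)."""

    def __init__(self, delay_seconds=300, whitelist=()):
        self.delay = delay_seconds
        self.whitelist = {addr.lower() for addr in whitelist}
        self.first_seen = {}  # key -> time of the first delivery attempt

    def check(self, client_ip, sender, recipient, now=None):
        """Return 'accept' or 'tempfail' for one delivery attempt."""
        now = time.time() if now is None else now
        if sender.lower() in self.whitelist:
            return "accept"  # known senders skip the delay
        key = (client_ip, sender.lower(), recipient.lower())
        if key not in self.first_seen:
            self.first_seen[key] = now
            return "tempfail"  # ask the sending server to retry later
        if now - self.first_seen[key] >= self.delay:
            return "accept"  # a retry after the delay gets through
        return "tempfail"  # retried too soon


g = Greylist(delay_seconds=300, whitelist=["friend@example.com"])
first = g.check("192.0.2.10", "new@example.org", "me@example.com", now=0)
early = g.check("192.0.2.10", "new@example.org", "me@example.com", now=60)
retry = g.check("192.0.2.10", "new@example.org", "me@example.com", now=400)
known = g.check("192.0.2.99", "friend@example.com", "me@example.com", now=0)
```

Here the first and too-early attempts are deferred, the retry after five minutes is accepted, and the whitelisted sender is never delayed.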
· October 19th, 2006 · 2:01 am · Permalink
Howdy Steph,
The first step is changing to two email addresses, one which receives all support requests, pre-sale enquiries, feedback, and all enquiries originating from your website.
The other one should receive all emails from people you meet in person. For example, if you meet someone in the elevator who might be a potential client, you would give him this email address. This is also the one which is printed on your business cards etc. This one shouldn’t be available anywhere on your website.
The next step is adding a contact form to your website and removing all mailto: links and other mentions of the first email address. The contact form would email you whenever someone uses it, with a known string such as ‘Web Support:’ in the subject. Then you would configure your email client to send all email to that address to your junk folder, except messages with ‘Web Support:’ in the subject.
Since the second email is never revealed on the web, you won’t get any spam (or, only a little spam) on it. And since the first one is only supposed to be used from the contact form, you won’t have to worry about other emails.
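As a rough sketch of the contact-form scheme, the rule for the web-facing address boils down to a subject check. The function names and folder names below are hypothetical, and a real email client would express this as a filter rule rather than Python code.

```python
SUBJECT_TAG = "Web Support:"  # the known string the contact form adds


def build_subject(user_subject):
    """Subject line the (hypothetical) contact form would send."""
    return f"{SUBJECT_TAG} {user_subject.strip()}"


def folder_for(subject):
    """Pick a folder for mail arriving at the web-facing address."""
    return "Inbox" if subject.startswith(SUBJECT_TAG) else "Junk"


from_form = folder_for(build_subject("  Pricing question "))
from_spammer = folder_for("Cheap meds online")
```

Anything that did not come through the form lacks the tag and lands in the junk folder.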
Just the best I could come up with. Also, if you really hate this problem, why not scratch your own itch and build a decent spam-fighting product for your uISV? ;0)
· October 19th, 2006 · 4:33 am · Permalink
A combination of approaches gives the best results. Maintaining white / black lists ensures that you can kill messages you know are spam and ensure receipt of messages from trusted sources. Use Bayesian filtering to attempt to sort out the rest, and send a validation request for any message the filter junks that comes from a source not on the black list.
Joel had one of his interns implement Bayesian filtering for FogBugz; a paper is available.
· October 19th, 2006 · 4:55 am · Permalink
1000s a day!
I too recommend SpamBayes for Outlook. It works beautifully for me, catching around 300 spams a day with no errors.
· October 19th, 2006 · 7:08 am · Permalink
we use amavisd with spamassassin as spam-filter. works like a charm and is highly configurable. spamassassin doesn’t have to delete all the spam, if you don’t like it to; you can also just add the sa-checks as additional header in the mail and let your client do the sorting-out, for example.
i just installed another nice plugin for spamassassin, FuzzyOCR. this script uses a free ocr-app to check images in mails for words such as “commercial”, “free” and so on. works like a charm – all those f*ing image-spam-mails get caught and sorted out.
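For the "add a header and let the client sort it out" setup, the client side only has to read the headers the server adds. Below is a minimal Python sketch using the standard `email` module. The sample message is made up, and the exact header layout (`X-Spam-Flag`, and `score=` inside `X-Spam-Status`) is an assumption that may vary with the SpamAssassin or amavisd version and configuration.

```python
import re
from email import message_from_string

# Made-up sample message with headers in the assumed layout.
RAW = """\
From: someone@example.org
To: me@example.com
Subject: Hello
X-Spam-Flag: YES
X-Spam-Status: Yes, score=7.3 required=5.0 tests=BAYES_99,HTML_MESSAGE

Body text here.
"""


def spam_verdict(raw_message):
    """Return (flagged, score) read from the X-Spam-* headers, if present."""
    msg = message_from_string(raw_message)
    flagged = msg.get("X-Spam-Flag", "").strip().upper() == "YES"
    match = re.search(r"score=(-?\d+(?:\.\d+)?)", msg.get("X-Spam-Status", ""))
    score = float(match.group(1)) if match else None
    return flagged, score


verdict = spam_verdict(RAW)
clean = spam_verdict("From: a@example.org\nSubject: Hi\n\nNo spam headers.\n")
```

A message without these headers simply comes back as not flagged with no score, so the client can leave it in the inbox.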
· October 19th, 2006 · 12:39 pm · Permalink
Thank you everyone for the great feedback, it’s been really helpful!
Here are some additional thoughts. Personally, I don’t really know that much about email, nor do I really want to. Although it’s A core part of my business, it’s not THE core of my business. The fewer resources I need to allocate to handling this issue the better.
At this stage, as we’re still fairly small, I can’t afford to have someone playing with and configuring the mailserver. I looked at SpamAssassin and to me the biggest issue is that although it appears very effective, it requires a substantial amount of time and effort. As well, upgrades seem to be anything but seamless… This is not something I want to do, or allocate resources to. The benefits do not outweigh these higher costs. Maybe when I reach 10,000-20,000 spam emails a day, but for now it’s not worth it (our support and customer service emails are handled by another system).
Patrick, from MicroISV on a Shoestring, suggested an alternative solution called Popfile which he says is much easier to use. After the praise he gives it, I have no doubt! I looked into this option a little, and it looks interesting. I’m going to give SpamBayes a little more time before trying something else, but this will probably be my next option unless something else attracts my attention.
Another suggestion was grey listing. Although I appreciate this suggestion, it unfortunately puts the onus back again on the person trying to reach me, which I don’t really agree with.
As well, Ali suggested I limit the number of email addresses supplied for LandlordMax. This was fine when we were smaller, but as we grew we needed more differentiation. Also, since you might not be aware of how our support system works, none of the support emails come to me. We’ve already purchased a customer support system (HelpSpot, which I recommend). This is only for emails addressed to me from this blog and my company (and some internal email addresses from our project management system, support system, etc.). The spam we get for LandlordMax is actually filtered through HelpSpot’s Bayesian filter, which is working great now that it’s been trained.
So far I have to admit I like SpamBayes. It took minutes to install, didn’t require any server configurations, and is very effective. Because of the quantity of emails I receive, it didn’t take nearly as long as I thought it would to train. Within a day I’ve already noticed a significant drop in spam emails in my inbox. Of course there have been many false positives, but these are already starting to drop significantly! I suspect that within a week or so it’ll be at a point where I’ll be content enough to leave things as they are.
The less effort I have to spend to resolve this issue the better. This is not my company’s core competency, nor does this really bring in any additional revenue (it does however prevent me from losing revenue). This is like building a customer service system because it’s interesting rather than because it pays the bills. It’d be fun, but I can’t justify it.
I’ll definitely post a follow-up article in a month or so to let you all know how it’s going. And please do continue to offer suggestions, it’s really helping me! And I’m sure many of the other people who come here are finding it useful too! Thanks again for all the help 🙂
· January 6th, 2007 · 6:31 pm · Permalink
After doing a lot of experimenting with various spam filter systems, here is the solution that works best for me:
1. First, all emails get filtered through a whitelist. The software I’m using can scan through my archived emails and address book and add all email addresses on installation. After this, everyone I send an email to automatically gets whitelisted.
2. Then the rest of the emails go through a White rules filter, the exact opposite of a spam filter. Emails containing my name or company name are accepted (email addresses are easy to harvest, but very few spammers are able to get my name and address unless you give it to them). Emails sent from my website (using a secure contact form with hidden “to” email) are accepted. On my site I also ask customers contacting me by regular email to put a verification code in the subject line (if you click the link this code will be inserted automatically). All emails containing this code are accepted.
This covers everyone I have ever received email from or sent email to, plus practically all my customers who contact me through my website. What’s left are people contacting me out of the blue without using my site.
3. These emails are filtered through a Bayesian filter. The filter also allows for rule-based filtering. I’m using rules to filter out sexually explicit words in the subject and image spam (most Bayesian filters are not very good at detecting image spam).
4. Emails getting a very high spam score from the Bayesian filter get deleted automatically.
5. The rest of the emails are checked for spoofed from addresses (using SPF records).
6. The emails passing this final test will receive a challenge response message just to make sure no real emails ever get caught in the spam box without giving the sender an option to get his message through. Of course no email is ever challenged twice.
This combination works quite well because:
A.) Everyone already known to me gets through with no filtering. I provide options for everyone not already known to me to get through via my website. 99% of all legitimate emails will reach me through this whitelist/rules system.
B.) Most real emails not caught by my white list/rules will be accepted by my Bayesian filter and reach my inbox with no hassle or need for people to verify themselves.
C.) The Challenge response email assures no valid email ever will just disappear. At the same time the automatic deletion of forged emails and high spam score emails reduces the amount of challenges sent and minimizes the drawbacks of traditionally configured challenge response systems that indiscriminately will just send lots of emails to everyone not on your whitelist.
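The six steps above can be sketched as a single routing function in Python. This is only an outline under several assumptions: the Bayesian score and the SPF result are taken as precomputed inputs rather than calculated here, the thresholds are invented, and sending a repeat sender's mail to the junk folder instead of challenging again is a guess at behaviour the description leaves open.

```python
from dataclasses import dataclass, field


@dataclass
class Mail:
    sender: str
    subject: str
    body: str
    spam_score: float  # assumed output of some Bayesian filter, 0..1
    spf_pass: bool     # assumed result of an SPF lookup done elsewhere


@dataclass
class Pipeline:
    whitelist: set = field(default_factory=set)
    accept_phrases: tuple = ()   # e.g. your name or a verification code
    delete_above: float = 0.99   # hypothetical "very high" spam score
    junk_above: float = 0.5      # hypothetical spam threshold
    challenged: set = field(default_factory=set)

    def route(self, mail):
        """Return the action for one message, following steps 1 to 6."""
        sender = mail.sender.lower()
        text = f"{mail.subject}\n{mail.body}".lower()
        # Step 1: known correspondents go straight through.
        if sender in self.whitelist:
            return "inbox"
        # Step 2: "white rules" accept mail with trusted phrases.
        if any(p.lower() in text for p in self.accept_phrases):
            return "inbox"
        # Steps 3 and 4: Bayesian score; very high scores are deleted.
        if mail.spam_score >= self.delete_above:
            return "delete"
        if mail.spam_score < self.junk_above:
            return "inbox"
        # Step 5: forged sender addresses are deleted.
        if not mail.spf_pass:
            return "delete"
        # Step 6: challenge the sender once, never twice.
        if sender in self.challenged:
            return "junk"
        self.challenged.add(sender)
        return "challenge"


p = Pipeline(whitelist={"friend@example.com"}, accept_phrases=("CODE-1234",))
known = p.route(Mail("friend@example.com", "hi", "lunch?", 0.9, False))
coded = p.route(Mail("new@example.org", "CODE-1234 question", "...", 0.9, True))
obvious = p.route(Mail("x@example.net", "buy now", "...", 0.999, True))
forged = p.route(Mail("y@example.net", "offer", "...", 0.8, False))
unsure = p.route(Mail("z@example.net", "offer", "...", 0.8, True))
again = p.route(Mail("z@example.net", "offer again", "...", 0.8, True))
```

Ordering the cheap, certain checks first means the challenge step is reached only by the small remainder of mail that is suspicious but not provably forged.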
· January 8th, 2007 · 8:38 pm · Permalink
Hi Ronny,
That’s quite a spam filtering system you’ve built yourself. Can I assume then that you get a substantial amount of spam? I can’t imagine going through all this trouble for just a few hundred, or even a few thousand, spam emails per day…
Also, one thing I personally don’t like is the challenge-response system. This might work on a personal level, but when it comes to a company’s face, I have to disagree with it. The onus should be on the company, not the individual to contact you. Enough said, this was already debated above.
Otherwise I think you have a very powerful spam prevention system. Have you ever looked at creating a commercial product based on this?
· March 13th, 2008 · 1:20 pm · Permalink
[…] really frustrated with the levels of spam my personal emails were receiving so I wrote a blog post asking what others were doing to reduce their email spam. Someone suggested SpamBayes, a free open source solution, which worked great for a while but then […]