Spam, also known as Unsolicited Commercial Email (UCE), Unsolicited Bulk Mail (UBM), junk mail and irrelevant newsgroup cross-posting (the list goes on), has many different definitions, none of which is ever completely accurate. According to MAPS (Mail Abuse Prevention System), the definition of spam is:
An electronic message is "spam" IF: (1) the recipient's personal identity and context are irrelevant because the message is equally applicable to many other potential recipients; AND (2) the recipient has not verifiably granted deliberate, explicit, and still-revocable permission for it to be sent; AND (3) the transmission and reception of the message appears to the recipient to give a disproportionate benefit to the sender.
To the rest of us, spam is the annoying, sometimes illegal, often offensive email that makes its way into our inboxes without our asking for it. It consumes our time, our disk space and our bandwidth indiscriminately and has become one of the major headaches of the Internet Age. The first known spam was purportedly sent way back in 1978 by a Digital Equipment Corp. sales representative who, to promote the release of the new DEC-20, sent an invitation to all ARPANET addresses on the West Coast. Of course, no-one called it spam back then.
The types of spam are many and varied, from the infamous ‘Nigerian Money Scam’, through a huge number of advertisements from pornographic websites to septic tank cleansing and on to the more sophisticated ‘phishing’ techniques, designed to entice unwitting users to enter bank details and other personal information.
The one thing common to all of these is this: you know spam when you see it!
What are the spammer’s goals?
The primary goal for any spammer must be to make money (no-one does this for fun, right?). Whether they are selling their own product, advertising someone else’s, perpetrating a scam or selling lists of validated email addresses to other spammers, the end goal is to put cash in their pockets: your cash. Unfortunately, unlike other mass-marketing strategies, email spamming costs the spammer next to nothing. They often use poorly configured servers and Trojan-infected PCs, along with the end user’s bandwidth, CPU and hard-disk space, to get their message across. This is tantamount to a traditional mass-mailing campaigner sending out bulk mail C.O.D. or a telemarketer making collect calls! Because of this, the spammer needs only a very low return rate and can send out literally tens of millions of messages to make it worthwhile. For example, a campaign of 10 million messages which nets the sender only $1 for each successful hit, with a hit rate of only 0.1%, yields 10,000 hits and therefore $10,000 profit.
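The arithmetic behind that example can be checked with a few lines of Python. This is only a sketch of the simple model described above: the function name, the volume of 10 million messages and the postage figure used for comparison are illustrative assumptions, not measured data.

```python
def campaign_profit(messages_sent, hit_rate, revenue_per_hit, cost_per_message=0.0):
    """Net profit under the simple model: revenue from hits minus sending costs."""
    hits = messages_sent * hit_rate
    return hits * revenue_per_hit - messages_sent * cost_per_message


# Assumed volume: 10 million messages, 0.1% hit rate, $1 per hit,
# and a sending cost close enough to zero to ignore.
spam_profit = campaign_profit(10_000_000, 0.001, 1.00)

# The same campaign if each message cost a hypothetical $0.30 to send,
# as a rough stand-in for a conventional postal mailing.
postal_profit = campaign_profit(10_000_000, 0.001, 1.00, cost_per_message=0.30)

print(f"Spam campaign:   ${spam_profit:,.0f}")
print(f"Postal campaign: ${postal_profit:,.0f}")
```

With any realistic per-message cost the second figure is heavily negative, which is why such a low hit rate only pays when someone else bears the cost of delivery.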
What techniques do spammers use?
Professional spammers have a huge arsenal of tools and techniques at their fingertips to avoid detection, bypass blocking measures and ultimately persuade the recipient to buy a product or follow a link.
There are many ‘mass mailing’ tools freely available (these were among the first products to be advertised by spammers) which, coupled with databases containing millions of verified email addresses and lists of computers acting as ‘open relays’, mean that the act of sending the spam to a huge audience is a fairly simple task.
Here are some of the techniques commonly used by spammers:
· Domain spoofing
Spammers will often make the message appear to have come from your own domain.
· Poisoning or spoiling filters
Adding text to the message, often invisible because it is the same colour as the background or hidden inside HTML tags, to reduce the overall ‘score’ that a filtering product would otherwise give. Another well-known method is to use numbers instead of letters in certain words.
· Social engineering
Promising the growth of certain body parts or a dramatic increase in virility, or constructing a subject line so irresistible that the message must be opened.
· Directory harvesting
Sending messages to thousands of possible addresses at your domain and then collating non-delivery reports and server connection refusals to enable them to build a list of valid email addresses.
· Phishing attacks
Purporting to be messages sent from your bank. These types of attack have become much more prevalent and sophisticated lately and are designed to get the user to part with sensitive information that can be used for identity theft or for withdrawing money from on-line accounts.
A message may include attachments which, when launched, install a Trojan horse onto the user’s PC. This could search the hard-drive for email addresses and send copies of itself through its own SMTP engine to those found, as well as sending a report to the spammer to let them know that they can now control that machine.
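The number-for-letter substitution mentioned under ‘Poisoning or spoiling filters’ above is one trick that a filter can partly undo before it examines a message. The following is a minimal sketch of that idea, not the method of any particular product: the substitution table and function names are hypothetical, and a real filter would need a far larger table.

```python
# Hypothetical table of characters commonly swapped in for letters.
SUBSTITUTIONS = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a",
    "5": "s", "7": "t", "@": "a", "$": "s",
})


def normalise(text):
    """Lower-case the text and reverse the simple substitutions above."""
    return text.lower().translate(SUBSTITUTIONS)


def contains_blocked_word(text, blocked_words):
    """Return True if any blocked word appears once the text is normalised."""
    words = normalise(text).split()
    return any(word.strip(".,!?:;") in blocked_words for word in words)


print(normalise("Fr33 m0ney"))
print(contains_blocked_word("Buy V1AGRA now!", {"viagra"}))
```

Note the trade-off: normalising in this way also rewrites genuine numbers (a year or a phone number, for example), so it is best applied only for the purpose of matching and never to the message that is delivered.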
What is the threat/cost to businesses?
Apart from the fact that these messages take up valuable resources in the form of disk space, processing power and bandwidth, we also have to take into account lost productivity, from workers having to pick legitimate messages out of the ‘noise’ of spam to administrators spending their valuable time fixing crippled machines and answering users’ questions about these messages. Add to this the risk that companies run when offensive or pornographic material is stored on their systems, and the likelihood that users will complain about a hostile working environment, and the real cost of spam becomes increasingly hard to fathom.
What types of anti-spam solutions are available to me?
Anti-spam products can work in different ways:
· Bayesian filtering
Based on a probability theory published by Thomas Bayes in the 18th century, this method involves breaking the message down into tokens (words, phrases, headers etc.) and defining a probability of spam for each one. The message is then given an overall score.
· Heuristics
Similar to Bayesian filtering, heuristics tries to learn from known ‘good’ and ‘bad’ messages and apply that logic to new emails which are delivered.
· Word blocking
A simple list of words which, if found, classify the message as spam. This method can lead to a large number of false positives (genuine email identified wrongly as being spam), especially in certain industry sectors such as finance and pharmaceuticals.
· Local or DNS Blacklisting
Local blacklists filter the messages based on whether an email is from a particular sender or domain. DNS blacklisting checks a database to see if it contains a particular IP address. Both of these methods have limited use as the spammer of today will change these identifying features regularly. In addition, DNS blacklists can produce a high number of false positives.
· Checksum databases
Samples of the message are converted into checksums or signatures; these are then checked against databases of known spam. This method can be very accurate if the database is constructed by large numbers of networked humans. Again, you know spam when you see it. Using this approach generally results in very few false positives.
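To make the Bayesian filtering approach at the top of this list more concrete, here is a minimal sketch of how per-token probabilities might be learned and then combined into an overall score. It is an illustration under stated assumptions rather than the algorithm of any specific product: the tokeniser, the clamping of probabilities to the range 0.01 to 0.99, the neutral value of 0.5 for unseen tokens and the tiny training sets are all hypothetical choices.

```python
import math
import re
from collections import Counter


def tokenise(message):
    """Split a message into a set of lower-case word tokens."""
    return set(re.findall(r"[a-z0-9$']+", message.lower()))


def train(spam_messages, ham_messages):
    """Estimate, for each token, the probability that a message containing it is spam."""
    spam_counts = Counter(t for m in spam_messages for t in tokenise(m))
    ham_counts = Counter(t for m in ham_messages for t in tokenise(m))
    probabilities = {}
    for token in set(spam_counts) | set(ham_counts):
        spam_freq = spam_counts[token] / len(spam_messages)
        ham_freq = ham_counts[token] / len(ham_messages)
        p = spam_freq / (spam_freq + ham_freq)
        # Clamp so that no single token can make the result absolutely certain.
        probabilities[token] = min(0.99, max(0.01, p))
    return probabilities


def spam_score(message, probabilities, unknown=0.5):
    """Combine token probabilities into one score between 0 (good) and 1 (spam)."""
    log_spam = log_ham = 0.0
    for token in tokenise(message):
        p = probabilities.get(token, unknown)
        log_spam += math.log(p)
        log_ham += math.log(1.0 - p)
    # Equivalent to prod(p) / (prod(p) + prod(1 - p)), computed in log space.
    difference = max(-700.0, min(700.0, log_ham - log_spam))
    return 1.0 / (1.0 + math.exp(difference))


# Tiny, made-up training sets for illustration only.
spam = ["cheap pills buy now", "win money now", "cheap money offer"]
ham = ["meeting agenda for monday", "lunch on monday", "project budget offer review"]
probs = train(spam, ham)

print(spam_score("cheap pills now", probs))
print(spam_score("monday meeting agenda", probs))
```

A message is then classified by comparing its score with a threshold. Where that threshold sits determines the balance between spam that slips through and genuine mail that is wrongly blocked.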
There are also different deployment options for anti-spam solutions:
· Managed outsourced solutions
All email is routed through a third-party provider who scans the messages and only sends on the good ones for delivery to the end user. While this has the benefit of freeing up resources, overall control is taken away, and it can be a problem for companies that, for security reasons, do not allow their email to be transmitted through a third party.
· Hardware appliances
These generally sit in front of the mail servers. They can be very effective but are often costly and can require a lot of training to operate effectively.
· Server software
Installed on the mail server, this is a common way of providing anti-spam filtering. While the administrative overhead can be higher than with an outsourced solution, this can be mitigated by choosing a product which integrates with your email software and allows end-user administration.
· Desktop software
This is installed on the end-user PC and scans the inbox for incoming messages. Often used as a second line of defence, it can allow the user to define their own blacklists and rules for further filtering.
How do I choose an anti-spam solution?
The goal of an anti-spam solution is to free up valuable resources, both hardware and network capacity and the time that administrators and end users spend on the problem. Couple this with the desire to protect the business from litigation of various sorts and we can see that a good solution must be:
· Effective
A good anti-spam solution should block at least 95% of unwanted emails and, perhaps more importantly, have a very low false positive rate.
· Easy to use
The product should be easy to install and upgrade, with minimal effort required on the part of those who use it to keep it effective. Typing in all known variations of ‘dirty’ words is not a good use of an IT administrator’s time; neither is releasing many messages per day to end users because they have been wrongly blocked. Ideally the chosen solution should be native to, or integrate very well with, your current email system so that training costs are kept to a minimum. End users should be able to see which emails addressed to them have been blocked and release them if necessary.
· Well supported
There should be regular updates to the software which keep up with the spammers’ tricks. Technical support should also be readily available should you run into difficulties.
· Flexible
Features like whitelists and blacklists allow you to fine-tune the results. Reports sent to end users, from which they can release their own mail, remove the need for administrator intervention.
As we have already discussed, there are many different solutions available, so care should be taken when making the final choice. A survey by the Trans-Atlantic Consumer Dialogue (www.tacd.org) found that although 62% of respondents (13,029 people) said that they used a filter, only 17% (3,565 people) said that they were happy with the results. It therefore makes good sense to test your proposed solution thoroughly before purchasing it!
How do I test an anti-spam solution?
Testing anti-spam solutions before you buy is critical to the overall success of the project. Many vendors offer free trial periods, and you should use this time to assess the effectiveness of the product. Do not take the manufacturer’s claims for granted; test all aspects of the product yourself and solicit feedback from your users.
· Ideally you should test across all users in your live environment. If this is not possible, create a pilot group of as many users as possible, drawn from a wide cross-section of the business. Train these users to keep track of how many messages that they consider spam reach their mail files and how many (if any) genuine emails are wrongly blocked.
· Test the products that you are considering one at a time and not ‘in-line’ with each other as this can skew the results, even if you change the order of processing.
· Test for a reasonable amount of time, at least two weeks, before collating results.
· Once the testing period is complete, calculate how many spam messages got through to your test group and how many were blocked. Use the following calculation to work out your blocking percentage:
(Spam blocked) / (Spam blocked + Spam missed) x 100
So, if you have 180 spam messages blocked and miss 20,
180 / (180 + 20) x 100 = 90%
· Perform another calculation to ascertain your false positive rate.
(Falsely blocked / Total blocked) x 100
So, for 5 false positives out of 1000 blocked in total
(5 / 1000) x 100 = 0.5%
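If you are collating figures for several products or pilot groups, the two calculations above are easy to script. The sketch below simply restates the formulas as given; the function names are hypothetical, and the false positive rate follows this paper's definition (falsely blocked messages as a share of all blocked messages).

```python
def blocking_percentage(spam_blocked, spam_missed):
    """Share of all spam received that the product blocked, as a percentage."""
    total_spam = spam_blocked + spam_missed
    if total_spam == 0:
        raise ValueError("no spam was recorded during the test period")
    return 100 * spam_blocked / total_spam


def false_positive_percentage(falsely_blocked, total_blocked):
    """Share of all blocked messages that were genuine, as a percentage."""
    if total_blocked == 0:
        raise ValueError("no messages were blocked during the test period")
    return 100 * falsely_blocked / total_blocked


# The worked examples from the text.
blocked_rate = blocking_percentage(180, 20)
fp_rate = false_positive_percentage(5, 1000)

print(f"Blocking percentage: {blocked_rate:.1f}%")
print(f"False positive rate: {fp_rate:.1f}%")
```

A product scoring 90% in the first calculation would fall short of the 95% blocking target suggested earlier, so both figures should be considered together when comparing candidates.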