|
|
Methods to Combat SPAM |
![]() |
|---|
|
"Spam" is the term for junkmail, i.e. emails that are sent en masse to a huge number of recipients. They are as unwanted as advertising leaflets in your mailbox, and since it costs virtually nothing to send them, they have become a major Internet problem.
|
| ![]() |
|---|
|
The steadily increasing avalanche of email commonly referred to as "spam" is threatening to make email a less viable method of communication than it really is. Dealing with huge amounts of email is almost as time-consuming as getting stacks of real junk mail on your desk. Well, you don't have to cut open a lot of envelopes. But spam is sabotaging the email service, and the problem became acute in August 2003, when the "SobigF" virus was let loose. This webpage will (hopefully) provide some technical understanding of how spam works, and also some general advice on how to deal with it. |
There are, of course, other websites that deal with this subject, such as:
"Spam Filtering, Filters, SpamCop Spamming",
Procmail is a free open-source mail processor and delivery agent that both users and administrators can use to automatically process and deliver incoming messages. Procmail can also be used to re-process and re-deliver messages that are already in a mailbox. It only works on Unix systems. |
According to the Washington Post, Microsoft is working on programs designed to place a significant burden on those who want to send vast amounts of unsolicited e-mail. Under an initiative called "Penny Black," computers that send e-mail would be required to spend several seconds solving a complex math problem. Such a scenario would cause virtually no slowdown for average users, but spammers trying to send millions of e-mails would be faced with an enormous computational demand. Officials from Microsoft noted that the company is working on several other anti-spam programs and does not consider the Penny Black program to be a "silver bullet." For any solution to be effective, said Microsoft's George Webb, it must have "broad-based deployment across the e-mail system." |
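The "complex math problem" idea can be illustrated with a minimal proof-of-work sketch. This is not Microsoft's actual Penny Black design, which is not described here; it assumes a hashcash-style puzzle in which the sender must find a number (a "nonce") such that the SHA-256 hash of the message plus the nonce starts with a given number of zero bits. The function names and the difficulty value are hypothetical.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def solve_puzzle(message: str, difficulty: int) -> int:
    """Sender side: try nonces until the hash has enough leading zero bits.

    Each extra bit of difficulty roughly doubles the expected work.
    """
    for nonce in count():
        digest = hashlib.sha256(f"{message}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce


def verify_puzzle(message: str, nonce: int, difficulty: int) -> bool:
    """Receiver side: a single hash is enough to check the sender's work."""
    digest = hashlib.sha256(f"{message}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


if __name__ == "__main__":
    # Hypothetical message stamp; a real scheme would bind recipient and date.
    stamp = "to=someone@example.com"
    nonce = solve_puzzle(stamp, difficulty=16)
    print(nonce, verify_puzzle(stamp, nonce, 16))
```

The asymmetry is the point: solving costs the sender many hash attempts per message, while checking costs the receiver one. For a single mail this is unnoticeable, but it adds up quickly for someone sending millions.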
One needs, first of all, to distinguish between spam on the one hand, and malicious software on the other. Malicious software consists of small programs, such as "viruses", "worms" and "trojan horses", which are often distributed by way of spam mail. Spam can be (and certainly is!) generated by these software categories, but we will limit this discussion to just "spam". Malicious software is a broad subject that should be treated separately. "Spam" is the accepted term for mass-distributed junk mail, and, as demonstrated by the "SobigF" virus attack, it is not only annoying and time-consuming to deal with, it can also bring down mailservers and thus block email traffic!
Spam can cause mailservers to be:
A good remedy for this is difficult to implement. Many mailservers have filters that either reject or delete messages from certain other servers, but in order to do this, each incoming message has to be examined for its source address, which still takes time. The only way forward is to increase the communication handling capacity of the mailserver, and this is usually done by installing the anti-spam program on a separate computer with lots of RAM, which receives all incoming mail ahead of the mailserver. The second problem is easier. Today there are spam filters that can do a lot of things, such as:
On top of this, mailservers can help each other by filtering and not forwarding mails that meet certain set criteria for being spam. |
An unfortunate side effect of the tide of spam is that many people delete received messages that they suspect of being spam without reading them. What they look at are:
As most people have discovered by now, the sender address is not to be relied upon as a criterion for deleting a received mail, because:
So, a good and relevant subject line is pretty vital to prevent your e-mail being inadvertently dismissed as spam by some readers. It is even more important when you consider the increasing use of rule-based e-mail filters that use very unforgiving software to classify incoming messages as spam or not-spam. Thus, be sure to avoid certain subject lines. For instance, don't mention money in the subject line, e.g. "Buy this for only $50". Many of your readers will have spam filters that kill off anything with a dollar sign in the subject line. So, to make your life as a mail recipient easier:
Bouncing mails (i.e. returning them to the sending address) might not be a good idea, however, because:
It is usually better to leave the "bouncing" to the mailservers. They can make it look like your address doesn't exist. Read more about this in chapter 8 on this page. |
By keeping a low profile, and not brandishing your email address around, you can, to some extent, avoid spammers. But most of us want to be visible on the web, usually in the line of business. In many instances you will want to put your email address on your website, even if you provide a form as a contact method. This is useful because it increases the number of ways that someone can contact you. Why is that important? Because presumably you have a website which you want to show the world, and also to communicate over, and this communication goes both ways. If you have a commercial site, then the answer should be obvious: someone may want to purchase something. So, naturally, your website needs to contain email addresses. But, of course, the problem with directly including your email addresses on your webpage is spam harvesters. These programs search through the Internet, looking at websites and pages for email addresses to add to those million-address collections that you see advertised (mostly in spam) all over the web, usually for a price. There is no ironclad way to prevent these programs from scanning your website for email addresses. Changing your email address every few months is counter-productive, as you will discover when your friends and business acquaintances time and again have to update their address registers. You will be regarded as an annoying fellow. There are a number of techniques, however, to make it a little more difficult for the spam harvesters. The best of these is to code your email addresses in something called Unicode. Other methods use "email address cloaking", where the address is hidden using JavaScript, encryption and the like. But as the address harvesting programs used by spammers get increasingly smarter, they will decode most of these.
There are, however, small scripts like Privacy Notes Spambot Buster, which take a roundabout route via a server to get at the email address, and this method has so far been successful. When no mail address is available, direct or encoded, it can't be harvested. |
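As a minimal sketch of the character-code technique mentioned above, assuming that "coding in Unicode" means writing each character of the address as an HTML numeric character reference (such as `&#64;` for `@`): the function names and the example address below are hypothetical.

```python
import html


def encode_address(address: str) -> str:
    """Replace every character with its decimal HTML character reference."""
    return "".join(f"&#{ord(ch)};" for ch in address)


def mailto_link(address: str) -> str:
    """Build a mailto link in which the address never appears as plain text."""
    encoded = encode_address(address)
    return f'<a href="mailto:{encoded}">{encoded}</a>'


if __name__ == "__main__":
    encoded = encode_address("info@example.com")
    print(encoded)
    # A browser turns the references back into the readable address,
    # and so can any harvester that bothers to decode them:
    print(html.unescape(encoded))
```

The last two lines show why this only slows harvesters down: decoding the references is a one-line operation, so the protection lasts only as long as the harvesting programs do not attempt it.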
| 5. Is this a losing battle? |
|---|
Many people contend that spammers and virus-producers are ruining the usefulness of the Internet in general and email in particular. We will be so wary of viruses and tired of spam that we will not use the great facility that email is to its fullest extent. But this attitude is wrong, for two reasons:
|
Spammers can of course operate from mailservers in countries that do not have these kinds of laws. But then, such mailservers could easily be blacklisted by other mailservers, making them more or less inoperable. |
|
| 6. Limiting the Choice of Mailserver |
|---|
|
One method of identifying spammers, which has been implemented by most ISPs (Internet Service Providers) by now, is to allow access to the SMTP service only through the mailserver which belongs to the same ISP and modem pool that the sending client is connected to. It functions as shown in the example to the right. Let's assume that the sending client is connected, by way of a modem and the telephone network (fixed or mobile phone network, it does not matter), to the modem pool belonging to ISP1. He can still fetch his mail from his own mailserver, no matter where on the Internet it is. But when he tries to send email, he might get into difficulties if he has not configured his mail client program correctly. Outgoing mail, as well as incoming mail, has to go through a mailserver, but not through any mailbox, so the sender won't need a username or password for that; he will just use the SMTP protocol for his sending. But the ISP he is connected to will (likely) not permit him to use any other mailserver for his outgoing mail than the one in ISP1. If he tries to use the mailserver in ISP2, he will get an error message. Likewise, if he is connected to the modem pool in ISP2 and tries to use the mailserver in ISP1, he will get an error message. In this way, even if he tries to fake his sending address, the receiving mailserver will know that he is likely a subscriber at ISP1, and thus the sysop at the receiving end can address a letter of complaint to (in this case) ISP1. There is an exception to this rule. If the sender knows that the receiver has his mailbox at ISP3, he can in most instances access this mailserver (at ISP3) directly. But that won't be good enough for a spammer. |
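The rule described above can be sketched as a simple relay check. This is an illustration under stated assumptions, not how any particular mailserver is configured: the address ranges are documentation-only networks standing in for each ISP's modem pool, and `MODEM_POOLS` and `may_relay` are hypothetical names.

```python
import ipaddress

# Hypothetical address ranges handed out by each ISP's modem pool
# (reserved documentation networks, not real allocations).
MODEM_POOLS = {
    "ISP1": ipaddress.ip_network("192.0.2.0/24"),
    "ISP2": ipaddress.ip_network("198.51.100.0/24"),
}


def may_relay(server_isp: str, client_ip: str, recipient_domain: str,
              local_domains: set[str]) -> bool:
    """Decide whether a mailserver accepts a message from this client."""
    # Mail for a mailbox hosted on this very server is accepted from anyone
    # (the "ISP3" exception in the text).
    if recipient_domain.lower() in local_domains:
        return True
    # Everything else is relayed only for clients connected through
    # the server's own ISP.
    return ipaddress.ip_address(client_ip) in MODEM_POOLS[server_isp]


if __name__ == "__main__":
    # A client dialled in to ISP1 sends mail to some third-party domain.
    print(may_relay("ISP1", "192.0.2.17", "elsewhere.example", {"isp1.example"}))
    print(may_relay("ISP2", "192.0.2.17", "elsewhere.example", {"isp2.example"}))
```

Because the check uses the connection's network address rather than anything written in the message, a faked sending address does not help the spammer get past it.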
|
|---|

| 7. Regarding SPAM-filtering |
|---|
Today's spam filters can be very sophisticated, with a host of options. There are two kinds.
Spam-filters that work together with mailservers basically do two things:
|
"This mail is probably spam. The original message has been attached along with this report, so you can recognize or block similar unwanted mail in future. See http://spamassassin.org/tag/ for more details." One can naturally wonder: is this not an invasion of your mail privacy? Well, it probably is, but if you're not a crook, you will probably conclude that it is worth it. If you are a crook, you will probably be encrypting your email messages anyway.
Regarding deletion of mail attachments: many e-mail users have their mail programs set to automatically process executable attached files of various kinds, often without knowing it. This creates an inroad for viruses, which we are not dealing with on this page. But there are thus three good reasons for server-based anti-spam programs to remove some of these attachments:
|
Depending on the anti-SPAM-software installed at your mailserver, and on how it is configured, you could find one or more of the following message lines in your received mails. (20.10 points, 5 required) |
The criteria for setting these spam points vary for every situation, for every purpose and, of course, they also vary over time. It's a kind of war between the spam filters' counter-measures and the spammers' counter-counter-measures. It's educational to check these reports which are inserted into the mails. I have included two examples here, and we can see that both of them qualify as spam; the first one has 20.10 where the threshold was 5, and the second (to the right) gathered 6.80 points. In the first example, one can note that "address is webmail, but starts with a number" is very detrimental. Apparently one also has to be wary about including exclamation marks. Using HTML code in e-mail arouses suspicion, which is natural, since it is an obvious tool for creating attention-getting text. One should be wary about changing font size and color when composing mail. "MAILTO_TO_SPAM_ADDR" means that the mail contains one or more email links to known or suspected spammers' addresses. | Excessive use of uppercase is also noted (UPPERCASE_25_50); in this case the mail gathered 1.5 points just because the message body was 25-50% uppercase text. In the other example, one can note that many spaces in the subject line are not popular, nor is text that is regarded as "shouting", which is when you combine uppercase text with an ending of two or more exclamation marks. However, I can tell you that this mail did not contain more than one exclamation mark at any one place.
"DATE_IN_FUTURE_12_24 (1.9 points) Date: is 12 to 24 hours after Received: date" is rather interesting. One should not tamper with the computer's clock when sending email. Finally, note that it is regarded as suspicious behavior to compose e-mail with Microsoft's "FrontPage" or a similar program, which is not meant for e-mail but for producing web pages. So, there are things to consider when you are composing e-mails to persons protected by spam filters! |
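The point system discussed above can be sketched as a list of named rules, each worth a fixed number of points. The rule names and point values are taken from the report excerpts on this page. The tests themselves are simplified assumptions and not the real filter's logic (for instance, "lots of white space" is taken here to mean a run of five or more spaces).

```python
from datetime import datetime, timedelta


def uppercase_25_50(subject, body, sent, received):
    """True if 25-50% of the letters in the body are uppercase."""
    letters = [c for c in body if c.isalpha()]
    if not letters:
        return False
    share = sum(c.isupper() for c in letters) / len(letters)
    return 0.25 <= share <= 0.50


def subj_has_spaces(subject, body, sent, received):
    """Assumption: 'lots of white space' means five or more spaces in a row."""
    return " " * 5 in subject


def date_in_future_12_24(subject, body, sent, received):
    """True if the Date: header is 12 to 24 hours after the Received: time."""
    return timedelta(hours=12) <= sent - received <= timedelta(hours=24)


RULES = [
    ("UPPERCASE_25_50", 1.5, uppercase_25_50),
    ("SUBJ_HAS_SPACES", 1.4, subj_has_spaces),
    ("DATE_IN_FUTURE_12_24", 1.9, date_in_future_12_24),
]


def score(subject, body, sent, received):
    """Return the total points and the list of (rule name, points) that fired."""
    hits = [(name, points) for name, points, test in RULES
            if test(subject, body, sent, received)]
    return sum(points for _, points in hits), hits


if __name__ == "__main__":
    received = datetime(2003, 8, 20, 12, 0)
    total, hits = score("WIN" + " " * 5 + "NOW", "BUY now AND save",
                        received + timedelta(hours=18), received)
    print(f"({total:.2f} points, 5 required)")
    for name, points in hits:
        print(f"{name} ({points} points)")
```

A real filter has hundreds of such rules, which is how a single mail can collect 20 points; the structure, however, is just this sum compared against a threshold.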
Another example
SUBJ_HAS_SPACES (1.4 points) Subject contains lots of white space |
|
As you can see from this example, each received email is assigned points by the server-based spam filter, according to a number of criteria, and after some intricate calculations. When an email's total points reach a threshold, that email gets assigned spam status, and the email's subject line is often supplemented to reflect that, with a text reading for instance "spam-tagged". This makes it easier for the recipient to sort these mails away in his own anti-spam program. There are individual scripts available that can be placed in each recipient's folder on the mailserver, and executed as each mail arrives. The purpose of such a script is to delete right away all arriving mails that fulfill certain criteria, and thus considerably reduce the amount of mail that the recipient has to deal with manually. A proper script would include, first, a "whitelist" listing all mails that should be let through regardless of content, such as mails from certain senders. After that could come the "blacklist", where received mails are checked for sending addresses and deleted outright if these match a stored list of addresses. The remaining emails are then scrutinized and assigned points according to set criteria. |
After these script lines, one can add statements saying that mails with a spam level of 5 or higher should be deleted outright, mails with levels between 3 and 5 should be stored on the mailserver for a week or so (giving you a chance to check through them before deleting), and mails with spam levels below 3 should be let through to your personal anti-spam program, for you to check and delete manually. The procedure is graphically illustrated below. Emails that contain executable attachments usually get higher spam ratings than those that have non-executable attachments. Attachments containing viruses get even higher ratings. Both categories of attachments can get removed by the spam filter, which then includes a message in the mail, stating what it has done. If the mail (without its attachment) gets delivered (which is not certain, if you use the scripts above) you will thus get a mail with a text similar to the one at right.
Yes, it happens that emails get assigned negative spam points, which tend to counterbalance the spam score. These are used when non-spam emails are recognized, but this apparently does not protect these same mails from getting positive points as well. |
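The whitelist, blacklist and threshold steps described above can be summarized in a short sketch. It is written in Python for readability rather than as an actual procmail recipe, and the function name and example addresses are hypothetical; the thresholds 3 and 5 are the ones mentioned in the text.

```python
def triage(sender: str, spam_level: float,
           whitelist: set[str], blacklist: set[str]) -> str:
    """Return "deliver", "hold" or "delete" for one incoming mail."""
    sender = sender.lower()
    if sender in whitelist:      # let through regardless of content
        return "deliver"
    if sender in blacklist:      # known bad sender: deleted outright
        return "delete"
    if spam_level >= 5:          # clear spam: deleted outright
        return "delete"
    if spam_level >= 3:          # doubtful: kept on the server for a week or so
        return "hold"
    return "deliver"             # passed on to the recipient's own program


if __name__ == "__main__":
    friends = {"friend@example.org"}
    pests = {"offers@example.net"}
    print(triage("friend@example.org", 20.10, friends, pests))
    print(triage("stranger@example.com", 6.80, friends, pests))
    print(triage("stranger@example.com", 3.5, friends, pests))
```

The order of the checks matters: the whitelist comes first so that a trusted sender is never lost to an unlucky score, and a negative spam level simply falls through to delivery.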
Removed Mail Attachments
Content-Type: text/plain; name="warning1.txt"
Content-Disposition: inline; filename="warning1.txt"
Content-Transfer-Encoding: 7bit
MIME-Version: 1.0
X-Mailer: MIME-tools 5.411 (Entity 5.404)
WARNING: This e-mail has been altered by MIMEDefang. Following this paragraph are indications of the actual changes made. For more information about your site's MIMEDefang policy, contact MIMEDefang Administrator
And, of course, the attachment would be missing. |

|
The figure above illustrates what we have been talking about. Let's assume that we run the mailserver on Unix, and use a script procedure called "procmail". Then: 1) The arriving email gets picked up by the spam filter, which is usually installed on a separate computer. The spam filter does three things.
|
2) The mail is handed over to the mailserver, which sorts it into the appropriate mailbox. 3) If there are script files for this recipient, these are activated. The script called "forward" directs the mail to be processed by the script mentioned above. This script can sort the mail into three categories:
|
4) The recipient checks his mail, using his anti-spam program. 5) Using this program, he can instruct his mailserver to bounce and/or delete some mails. 6) After this process, he fetches the remaining emails using his regular email handler. 7) The recipient can, now and then, check and delete the spam-marked emails that are temporarily stored on his mailserver. Although this might sound tedious, the recipient is only involved in steps 4 to 7, and, if he uses good scripts on the server, he can very well skip steps 4, 5 and 7, and just use his mail handler in the same way as he did in the good old days, before spam. The "flow" of this process on the mailserver is illustrated in the figure at left. |
| 8. To Bounce or not to Bounce |
|---|
Anti-spam programs offer an opportunity to bounce received junk mail, in addition to deleting it. It is of course tempting to do just that, but one has to consider:
|
Does this scheme work? Not really. First off, the spammer has seen to it that the bounced mail, whether from a mailserver or from you, does not reach him, because he has faked the sending address as described above. Second, he doesn't care to read, or even check, the returned mail. Understandably, he would never have time to do anything like that. He will just delete all received mail en masse. But the curious person might want to know: if the spammer scrutinizes the header, could he then see whether the bounced mail got bounced because the address was really non-existent, or because it was bounced by a recipient? Yes, he could, if he would really go to that trouble (which he won't, if he is a reasonably(??) sane person). I performed a simple test, and sent two mails from mailserver A to mailserver B, one mail with a faked and one with a real address. Both addresses purportedly belonged to the domain "swedetrack". The mail with the faked address got bounced by mailserver A (and not the destination server B); the one with the real address I bounced myself, back to mailserver A (since A was set as SMTP server in my mail program). Both bounces were then fetched at the receiving mailserver C, and compared. |
The two mails have, of course, travelled over different routes, but there are more telling differences. 1) The faked-address mail that got bounced had the "From" field: "Internet Mail Delivery, postmaster (etc.)mailserver A" with the Subject line: "Delivery notification: Delivery has failed". 2) The real-address mail bounced by me had the "From" field: "Mailer-daemon@swedetrack.com", with the Subject line: "Returned mail: User unknown". So, clue number one: the real-address mail (not the bounce, but the mail itself!) reached its destination mailserver. It shouldn't have done that if the address was non-existent!
Clue number two: the faked-address bounce contains the text "Your message cannot be delivered to the following recipients: (etc.)". That revealing info does not appear in the bounce that I performed myself. |
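Clue number two can be checked mechanically. The following is only a sketch built on the single test described above: it assumes that a genuine server bounce contains the quoted phrase in its body, which held for the mailserver used in that test, but the wording of such notices varies between mailservers. The function name and the sample messages are hypothetical.

```python
from email import message_from_string

# Phrase seen in the bounce generated by the mailserver in the test above.
SERVER_BOUNCE_CLUE = "Your message cannot be delivered to the following recipients"


def looks_like_server_bounce(raw_message: str) -> bool:
    """True if any text part of the message contains the tell-tale phrase."""
    msg = message_from_string(raw_message)
    text = "".join(
        part.get_payload() for part in msg.walk() if not part.is_multipart()
    )
    return SERVER_BOUNCE_CLUE in text


if __name__ == "__main__":
    server_bounce = (
        "From: postmaster@mailserver-a.example\n"
        "Subject: Delivery notification: Delivery has failed\n"
        "\n"
        "Your message cannot be delivered to the following recipients:\n"
        "  nobody@swedetrack.com\n"
    )
    manual_bounce = (
        "From: Mailer-daemon@swedetrack.com\n"
        "Subject: Returned mail: User unknown\n"
        "\n"
        "User unknown\n"
    )
    print(looks_like_server_bounce(server_bounce))
    print(looks_like_server_bounce(manual_bounce))
```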
|
The figure at right illustrates what I have described in the text above.
1) Both mails get sent to server A.
5) The manually bounced mail gets fetched by the sender. |
| 9. How About Paying for Wanted Mails? |
|---|
| Instead of blocking unwanted messages, two California-based companies are offering services aimed at limiting spam by ensuring delivery of wanted messages, despite increasingly stringent Internet Service Provider (ISP) filtering programs. "IronPort" offers clients "bonded" e-mail, for which clients attest that their messages are directed at recipients who want to receive them. | "IronPort" works with ISPs to ensure delivery of those messages, but if unwanted mail is delivered, the client is liable for costs, even if complaints are not proven. Habeas offers a service, free to private users and licensed to businesses for up to $5,000 per month for bulk e-mailings, that embeds a haiku into messages. This embedded text identifies those messages as valid to those ISPs who agree to the service. | Analysts question whether these services, which guarantee legitimate e-mail and challenge the future of free e-mail, will succeed, and whether they will adequately address the problem of vast and increasing volumes of unsolicited e-mail. But it's up to everyone to pay for what he considers worth having. |
| 10. SPAM over Cellular Phones? |
|---|
It is so easy to be enthusiastic over technological advances that one tends to forget that there are always people who are prepared to use the new technology for personal gain, even if they destroy its usability for other people in the process. When supermarkets replaced small shops, this radically reduced prices, until shoplifters became such a problem that security measures had to be taken. The cost of that, and of all pilfering, has to be paid by the honest customers. E-mail over the Internet is a tremendous saver of time and money but, as has been seen, this is now being largely offset by the time-consuming task of scanning and deleting junk mail. |
What about the emerging broadband wireless technology?
GSM phones are now gradually being replaced by the enhanced 3G phones. When will spammers discover how easy it will be to deliver voice messages to your telephone answering machine? It will take about one minute of listening before you realize that this is not a new potential customer, but spam. With 30 such messages waiting on your answering machine, you will have wasted half an hour. It will be awfully difficult to design spam filters that can separate junk messages from those that you want to hear. The sending address can be faked. The only criteria that an anti-spam filter could check are the voice, wording and intonation of a message; there are no other useful criteria. |
Already today we have seen the emergence of junk SMS messages that occupy the narrow bandwidth allotted to SMS. They also needlessly occupy the message buffers of the cellphones, so that they have to be checked and deleted pretty regularly to leave buffer space for genuine SMS messages. One can object that sending such messages will cost the spammers money, unlike IP-based spam over the Internet, which is virtually free. That may be true, but then one forgets about VoIP, Voice over IP, which allows anyone to use the Internet IP protocol to send voice messages. It won't be free of charge to use the 3G technology, but cheap enough to tempt spammers. |
|
Last Updated: 2007-01-02
| Author: Ove Johnsson |
|---|