Extreme JS

JS Greenwood's WebLog on architecture, .NET, processes, and life...

A Solution to Spam

I've had the same e-mail address since 1998 or so.  I've never received more than five to ten real mails a day at that account, as I've always had others at the same time, whether at university, at work, on Hotmail, and so on.  I have to keep this account because all the books and articles that are out there give it as my contact details in the biography.  The problem is that the junk mail now dwarfs the real mail (I now only receive one or two real mails a week at that account).  Here are a few statistics:

  • An average of 250 junk mails per day are being received, totalling around 5.5MB
  • 67% of the mail is "spam" in terms of items received
  • 33% of the mail is viruses, in terms of items received
  • The average size of a piece of spam (not including viruses) is 11KB
  • The average size of a virus is 46KB
  • Microsoft's spam filter (Outlook 2003) catches approximately 30% of all junk mail when set to its least aggressive level
  • My own VBA macro when combined with AVG anti-virus and Outlook's junk mail filter catches 99% of junk mail and viruses
  • On a 600Kbps Internet connection, it takes me around 10 minutes to download and process the rules on my mail each day
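
As a rough consistency check on the figures above, the following minimal Python sketch recomputes the daily volume from the per-item averages. The variable names are illustrative only, and it assumes 1MB = 1024KB.

```python
# Figures taken from the statistics listed above.
junk_per_day = 250                     # junk mails received per day
spam_share, virus_share = 0.67, 0.33   # split by number of items
spam_kb, virus_kb = 11, 46             # average size of each kind, in KB

# Weighted average size per item, multiplied by the daily item count.
daily_kb = junk_per_day * (spam_share * spam_kb + virus_share * virus_kb)
daily_mb = daily_kb / 1024             # assumption: 1MB = 1024KB

print(f"{daily_mb:.1f} MB per day")    # prints "5.5 MB per day"
```

The result agrees with the "around 5.5MB" total quoted in the first bullet.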

That I've taken the time to write a junk mail filter that works better than Microsoft's shows how much this issue irritates me.  When I spotted that the junk mail problem was getting out of control, I started thinking about the key problems that led to this, and how to fix them...

  1. One of the main problems is that you can't guarantee the validity of a sender.  Anyone can send me mail, and they can claim to be anyone.  This is due to open relays, the fact that you can set any SMTP details you want on sending, and so on.  This problem will hopefully go away over the next couple of years with new standards that are being put in place to validate servers.  This will no doubt be circumvented by servers being hacked, however.
  2. The second problem is the biggest: other than the cost of electricity/ISP bills, it's free to send e-mails.  Having worked for a marketing company in the past, I know it's all a matter of numbers...  If it costs you 25 pence (cents) to send an item of mail, and you get a 1% return rate on that, you'd need to make 25 pounds (dollars) from each sale to break even.  But if it costs you a fraction of 1 penny (cent), then you can break even on a much lower return rate, so you can afford to mail a larger demographic of people and be less selective about the recipients.  This is why there is so much junk mail.
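
The break-even arithmetic in the second point can be sketched in a few lines of Python. The function names are made up for illustration, and the e-mail cost of 0.01 pence per message is an assumed figure standing in for "a fraction of 1 penny"; only the 25 pence, 1% and 25 pounds figures come from the text.

```python
def break_even_revenue(cost_per_item, return_rate):
    """Revenue needed from each sale so that sales cover the mailing cost."""
    return cost_per_item / return_rate


def break_even_return_rate(cost_per_item, revenue_per_sale):
    """Fraction of recipients who must buy for the mailing to pay for itself."""
    return cost_per_item / revenue_per_sale


# Postal mail: 0.25 pounds per item at a 1% return rate (figures from the text).
print(f"{break_even_revenue(0.25, 0.01):.2f} pounds per sale")  # 25.00

# E-mail: ASSUMED cost of 0.0001 pounds (0.01 pence) per message,
# with the same 25 pounds made on each sale.
rate = break_even_return_rate(0.0001, 25.0)
print(f"{rate:.4%} of recipients must buy")  # 0.0004%
```

Under that assumed cost, one sale per 250,000 messages is enough to break even, which is why the sender can afford to be unselective.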

The next generation of e-mail

The way to solve this second problem is to charge for sending mails, even if just a single penny (cent).  To achieve this, there will be a network of mail-authentication servers around the world.  These expose two "Web services" - a SignMessage service, and a CheckSignature service.  Everyone that wants to send an e-mail has to open an account with one of these providers.  Whenever an e-mail is sent from within Outlook/whatever, it would first call the SignMessage service, and get a signature (which would probably just be a GUID).  This would cost a trivial amount.  The message then has a header attached containing the signature.  When the message is received, the mail-server itself or the end-user's mail client sends the signature to the server-network, which validates whether the mail has been paid for or not.  If the signature doesn't validate, or there is none, the mail is destroyed.
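
To make the flow concrete, here is a minimal in-memory sketch in Python of one such provider and of the sending and receiving sides. Everything in it is hypothetical: the `StampAuthority` class, the `X-Mail-Stamp` header name, the prepaid-balance model and the one-penny price are assumptions for illustration, and a real service would be a networked Web service rather than a local object.

```python
import uuid
from email.message import EmailMessage

STAMP_HEADER = "X-Mail-Stamp"  # hypothetical header name


class StampAuthority:
    """In-memory stand-in for one mail-authentication provider."""

    def __init__(self, price_pence=1):
        self.price_pence = price_pence
        self.balances = {}   # account id -> prepaid balance in pence
        self.issued = set()  # signatures that have been paid for

    def open_account(self, account, deposit_pence):
        self.balances[account] = self.balances.get(account, 0) + deposit_pence

    def sign_message(self, account):
        """SignMessage: charge the sender and return a signature (a GUID)."""
        if self.balances.get(account, 0) < self.price_pence:
            raise ValueError("insufficient funds")
        self.balances[account] -= self.price_pence
        signature = str(uuid.uuid4())
        self.issued.add(signature)
        return signature

    def check_signature(self, signature):
        """CheckSignature: True only if this signature was paid for."""
        return signature in self.issued


def send(authority, account, body):
    """Sender side: pay for a signature and attach it as a header."""
    msg = EmailMessage()
    msg.set_content(body)
    msg[STAMP_HEADER] = authority.sign_message(account)
    return msg


def accept(authority, msg):
    """Receiver side: keep the mail only if its signature validates."""
    signature = msg.get(STAMP_HEADER)
    return signature is not None and authority.check_signature(str(signature))


authority = StampAuthority(price_pence=1)
authority.open_account("alice", deposit_pence=2)
signed = send(authority, "alice", "Hello")
unsigned = EmailMessage()
unsigned.set_content("Buy now")
print(accept(authority, signed))    # True
print(accept(authority, unsigned))  # False
```

This sketch does not tie a signature to the content of a message or limit how often it can be checked; the description above leaves both points open.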

There are two main ways this system could be funded - by charging for mails, or by a subscription/subsidy model:

  1. The nominal charge for signing the message could easily fund the infrastructure required to do this.
  2. Upon successful receipt of a message, a third Web service could be put in place to refund the sender.  If this were the case, an alternative funding model would be required:
    • The charges for all invalid (and therefore unrefunded) mails would go towards the funding
    • Governments could subsidise the network
    • ISPs could subsidise the network (as it would lower their bandwidth costs incurred from spam)
    • Individual users could "subscribe" to the service for a per-annum charge

To me, the obvious candidate for running this service would be Google; they've shown an interest in getting into the e-mail market, and they know how to create a massively scalable, high-availability system.

Posted: Jul 17 2004, 04:34 PM by jsgreenwood | with 8 comment(s)

Comments

Ashutosh Nilkanth said:

Interesting. BTW, have you looked at the Penny Black Project at MS Research: http://research.microsoft.com/research/sv/PennyBlack/
# July 17, 2004 3:40 PM

leon said:

an implementation consideration:

pretty heavyweight to run a web service for this, don't you think? there's already a scalable distributed database that can store cacheable cryptographic signatures, DNS.

in combination with a DNS blacklisting service, the cost of spamming (trying to crack a legit server, or buying domains to send from, and trying to make that work with your zombie spam-sending pool) could skyrocket exponentially.

i think incremental solutions like Caller-ID or SPF have a far greater possibility of realizing success than an all-out replacement. but who knows, maybe gmail will get big enough to have the market share to drive these kinds of changes.
# July 18, 2004 6:08 AM

leon said:

with regards to your current problem, i would try and see if whoever does your upstream mail can do spam and AV filtering at SMTP time.

works wonders for me, messages flagged as spam (in DNS blacklists or with a SpamAssassin rating of >=5) get rejected.

unfortunately i can't do AV filtering at SMTP time yet, but i have it happening while the queue is being processed and before it gets to my mailboxes, so i never see virus email.

i also do envelope sender verification and SPF checks.

i get roughly about 3-4 spam messages a day now, down from 80 before i implemented this.

good luck :)
# July 18, 2004 6:13 AM

Martel Firing said:

A very much simpler system is available right now. It requires no infrastructure changes, no registrations, no universal participation. It doesn't filter or even handle the mail, it just issues and validates stamps. And best of all - IT WORKS.

Rather than trying to describe it in detail here, please refer to my web site:
http://www.usebestmail.com

I'm the author of UseBestMail and it has just been introduced this month. Please take time to explore the web site and try out the software -- it's free (at least for the time being).

Your comments are welcome. I also have a blog at: usebestmail.blogspot.com where you're invited to leave comments. (FYI: I'm new to blogging)
# July 18, 2004 4:18 PM

JS Greenwood said:

Martel,

I'm not sure your solution is "simpler"; it seems to be much on a par, with messages being "signed", only there's the added complexity of throughput-based signing, too (which is a good idea). I admire your effort, and I hope it works well. I do have a couple of concerns, though. Most notably, what's in place to stop spammers from potentially opening hundreds of accounts with your system, and iterating through each of them, sending one mail at a time, enabling them to transmit high volumes?
# July 18, 2004 7:25 PM

JS Greenwood said:

Ashutosh - I'd not seen that, thanks for the link. Nice to see that other people are thinking similar thoughts. :)

Leon - Thanks for the feedback. I've been ardently resisting spending any money on solving my problem thus far, though. I think I might have to give in and pay for a provider that'll allow me to do cleverer things with incoming mail. On the subject of an all-out/incremental solution, this *is* incremental... all other mails could still be received in theory, but the signed ones are treated as "first class mail" (pre-approved). I don't see how DNS can really solve the problem - it can't really cache details for every message sent.

You have raised an interesting point, though - as long as client machines are vulnerable to being taken control of and used for sending mail (as in the case of e-mail based viruses), it would be the owner of that host-PC that footed the bill. Maybe that's a good thing, though - maybe then people would start treating the security of their PC as an important topic...
# July 18, 2004 7:36 PM

Martel Firing said:

Mr. Greenwood,
As to simplicity, you just download and go. None of the new infrastructure you described in your original post is needed, e.g.,
'a network of mail-authentication servers around the world. These expose two "Web services" - a SignMessage service, and a CheckSignature service. Everyone that wants to send an e-mail has to open an account with one of these providers.' This is possible because UseBestMail doesn't require a database of 'registered users' nor does it require cryptography or ID database schemes. But it does validate the sender's mail, without the need to censor or even handle it.

You can try it right now and I invite you to do so.

Your second question - (paraphrasing) "what keeps spammers from opening multiple accounts" - There are no accounts. UseBestMail monitors IP Addresses and for a given IP address the mail flow is controlled. It is unlikely that any spammer could have "hundreds" of IP addresses. But even if this is done (unlikely for some time), there are simple counter-measures available at the server level to foil the strategy. A series of articles on my web site describe these and other design choices and features of the system in detail.

So this brings up the question: Since you think it's a good idea to have stamped mail, why don't you try out a system that is on-line and working right now? I'd really like to know.
# July 19, 2004 1:29 PM

JS Greenwood said:

Martel,

The problem for me isn't really signing outgoing mail - it's the fact that that account is used really for first-contact incoming mail - I really need everyone that's sending me valid mail to use such a system. The majority of e-mails through that account are technical requests/discussions from readers of my work, and headhunters.

I will give it a try, though, as I am interested in the application. It will also be interesting to see how solutions like this stand up to hacking - whether it's spoofing IP addresses (as no one wants a reply, this is a real threat), remote controlling infected PCs to do the work for them, or any number of other approaches. I'll read the articles you've written on this, too, as soon as possible.

Thanks for the feedback... -J
# July 19, 2004 2:48 PM