Brother Turner
    Work for an ISP for 4 years, and you become an expert at several things:

  • Babysitting
  • Inane office politics
  • 18-hour work days
  • Sub-standard pay rates
  • Some UNIX administration and programming :)

    But above all else, you learn that SPAM is a Way Of Life™. And lately, it's a way of life that's getting freakin' out of control. While we try to fight SPAM with RBLs and SMTP Socket Listener Deny Lists ( say that 42.6 times fast ), it doesn't seem to be helping all that much ( DDoS attacks on our SMTP servers are a daily occurrence now... *sigh* ).

    Something that has piqued my interest as of late is the use of Bayesian Analysis to determine whether a message is or is not spam. This interest formed when a co-worker of mine pointed me to Paul Graham's page containing his thoughts on using Bayesian Math to make spam so ineffective that it becomes prohibitively expensive to use... pure joy to the 99.999% of the internet's users that despise spam.

    While I'm not naive enough to believe that spam will ever be completely stopped ( we all get shit in the postal mail every day... same thing, won't die ), I do believe this approach has merit above even the most finely tuned RBL. Why? Because Bayesian Analysis continually learns and isn't persuaded by humans, bullshit laws, or threats of litigation.
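    To make the "continually learns" part a bit more concrete, here is a minimal Python sketch of the general idea: count which words show up in known spam and known good mail, then combine the per-word probabilities with Bayes' rule. This is not my production setup or anyone's exact algorithm. The function names, the whitespace tokenizer, the 0.4 default for unseen words, and the 0.01/0.99 clamps are all illustrative assumptions.

```python
from collections import Counter


def train(messages):
    """Count how many messages contain each lowercase token."""
    counts = Counter()
    for text in messages:
        # set() so a word repeated in one message is only counted once
        counts.update(set(text.lower().split()))
    return counts


def token_spam_prob(token, spam_counts, ham_counts, n_spam, n_ham):
    """Estimate P(spam | token) from per-corpus message frequencies."""
    s = spam_counts[token] / n_spam
    h = ham_counts[token] / n_ham
    if s + h == 0:
        return 0.4  # assumed neutral-ish value for never-seen tokens
    # Clamp so a single token can never be absolutely certain (assumed bounds)
    return min(0.99, max(0.01, s / (s + h)))


def spam_score(text, spam_counts, ham_counts, n_spam, n_ham):
    """Combine token probabilities, treating tokens as independent."""
    spam_like = 1.0
    ham_like = 1.0
    for token in set(text.lower().split()):
        p = token_spam_prob(token, spam_counts, ham_counts, n_spam, n_ham)
        spam_like *= p
        ham_like *= 1.0 - p
    return spam_like / (spam_like + ham_like)


if __name__ == "__main__":
    # Tiny made-up corpora, purely for illustration
    spam = ["buy cheap pills now", "cheap pills online"]
    ham = ["meeting at noon", "lunch at noon tomorrow"]
    sc, hc = train(spam), train(ham)
    for msg in ("cheap pills", "meeting at noon"):
        print(msg, "->", round(spam_score(msg, sc, hc, len(spam), len(ham)), 4))
```

    The "learning" is nothing more than re-running train() as new mail gets sorted into the spam and good piles, which is why nobody needs to maintain a list by hand. A real filter would want a smarter tokenizer and log-space math for long messages.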

    So what's with this page? This is my attempt to help anyone else who wants to try this with their mail servers/accounts. I don't claim to be a statistician, but I am figuring out what does and does not work... and the more of us who try to stop spam, and with as many different approaches as possible, the more successful our labor will be.

 
