Introducing Spam Karma

UPDATED: 12/09/2004 15:46 JST From now on, please check the central Spam Karma page to get the latest updates and news on this plugin.

Yet another techy update for my fellow bloggers using WordPress.

Now that it’s reached version 1.4 and most (all?) major bugs have been ironed out, I feel it’s time to introduce the latest member of the ever-expanding WordPress plugin family:

Spam Karma is a mean critter that truly enjoys killing.

In fact it is so mean that we had to keep it in a special military-grade containment unit on this server.

Genetically engineered in the dark recesses of our Secret Spam Research Labs and trained through months of reflex conditioning and shock therapy, this thing, once unleashed on your comments, will only let go of its death grip after the last spam has been shredded to pieces.

We haven’t fed it for a week now, and it could smell spam miles away in its sleep.

But while it is a fierce and merciless spam killer, this plugin is also a perfect companion for your kids’ and friends’ comments. Only the unmistakable foul stench of spam will trigger its ire… while questionable, yet potentially legit, comments will always be given a chance to clear themselves before being irremediably disposed of.

If you are using WP Plugin Mgr, install is as easy as a click on the “Check Updates” button and a click on the “One-Click Install”… Yep, that’s all.
For those still stuck in the last century, a manual install archive is available here. Please, please, RTFM: it’s short, sweet and contains essential details.

Once installed, make sure you check the Options screen at least once (in wp-admin, click on Options >> Spam Karma).

I strongly recommend you check for updates (if you are using WPPM it will do it automatically for you) at least once a week so as to make sure you benefit from the latest bug fixes I might make.

Spam Karma v. 1.4 is now compatible with WordPress 1.2; however, due to the lack of certain functions in the WP 1.2 Plugin API, some of the features are missing (Options page integration, etc.). It is fully enabled for use with any fairly recent release of WP alpha 1.3.


Cool, but how does it work?

Layman’s Explanation

Spam Karma works by running every new comment through a battery of filters and checks, each of which increases or decreases the comment’s ‘Karma’ value. Depending on the final score, the comment is either:

  • Approved
  • Discarded silently as spam (no email is sent to you unless you specifically request it, but a digest is sent to you after every X deleted spams).
  • Placed in Moderation mode, with the possibility for the commenter to auto-moderate his own comment by proving he’s not a spammer (by filling in a Captcha or checking a confirmation email).
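
For the curious, the scoring idea above can be sketched in a few lines. This is only an illustration in Python (the plugin itself is PHP): the two filter functions, the score values and the two thresholds are invented for the example and are not Spam Karma's real settings.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Comment:
    author: str
    body: str
    seconds_to_fill: float  # time between loading the form and submitting it


# A filter looks at a comment and returns a karma adjustment (positive = good).
Filter = Callable[[Comment], float]


def too_fast(comment: Comment) -> float:
    # Hypothetical rule: a form submitted in under 3 seconds looks automated.
    return -5.0 if comment.seconds_to_fill < 3 else 1.0


def many_links(comment: Comment) -> float:
    # Hypothetical rule: penalize every link beyond the first two.
    links = comment.body.lower().count("http://")
    return -2.0 * max(0, links - 2)


# Hypothetical thresholds, chosen only for this example.
APPROVE_AT = 0.0
DELETE_BELOW = -8.0


def judge(comment: Comment, filters: List[Filter]) -> str:
    """Sum every filter's adjustment and map the total to one of three outcomes."""
    karma = sum(f(comment) for f in filters)
    if karma >= APPROVE_AT:
        return "approved"
    if karma < DELETE_BELOW:
        return "deleted"
    return "moderated"  # ambiguous: give the commenter a chance to clear it
```

Anything that lands between the two thresholds goes to moderation rather than being deleted, which is what keeps false positives out of the trash.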

This whole process ensures (in order of priority):

  • No deleted false positive (bad bad bad).
  • Extremely few moderated false positives (annoying): uses Captcha and email auto-moderation to keep these at a minimum.
  • No published spam.
  • Very little spam held in moderation (it must be destroyed directly: it is really annoying to have to moderate it).

Furthermore, Spam Karma works in an intelligent way to automatically update its filtering database and grow stronger with each spam it catches…

In short: blocks spam with no unnecessary annoyance, for you or your visitors. The way it should be.


The Detailed Explanation

For our more tech-oriented friends, here are a few more insights on the rather complex process used by Spam Karma to decide what’s spam and what’s not. Each of the following filters is given a weight that depends on many factors, ranging from user-controlled values (e.g.: after how many days is a post “old”?) to the credibility that can be given to a test (e.g.: a missing header is less important than a blacklisted IP).

Mostly, Spam Karma looks at the following things:

  • Whether the poster is logged in to the current blog, and what his user level is (e.g. automatically approve Admin posts).
  • Presence of HTML entities (e.g. &#123;, &#666; etc.).
  • Presence of an HTTP_VIA header.
  • Proper use of the posting form (hash value must be present).
  • Time taken to fill the comment (e.g.: if it’s less than a few seconds, most likely spam).
  • Posting granularity: first-time posters posting many comments at once vs. old-timers (with comments previously approved by the admin).
  • Previous diagnostic from WP’s built-in comment check (set on the ‘Discussion’ panel).
  • IP and regex match for URLs contained inside the comment (small weight only for non-URL text matching a URL regex).
  • Realtime Blacklist (RBL) Server check for IP and URLs.
  • Comment’s age (e.g. penalize comments on very old posts).
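
To give a feel for what a few of these checks could look like, here is a rough Python sketch (again, not the plugin's PHP code). The weights, the 30-day default and the function names are assumptions made up for the example. The RBL zone is only an example too: the list above does not say which servers are queried.

```python
import re
from datetime import datetime, timedelta

# Numeric HTML entities such as &#123; are rare in hand-typed comments.
ENTITY_RE = re.compile(r"&#\d+;")


def entity_penalty(body: str, weight: float = 0.5) -> float:
    # Hypothetical weight: each numeric entity costs a little karma.
    return -weight * len(ENTITY_RE.findall(body))


def proxy_penalty(headers: dict, weight: float = 1.0) -> float:
    # An HTTP_VIA header means the request went through a proxy: a weak signal.
    return -weight if "HTTP_VIA" in headers else 0.0


def old_post_penalty(post_date: datetime, now: datetime,
                     old_after_days: int = 30, weight: float = 2.0) -> float:
    # old_after_days stands in for the user-controlled "when is a post old?" value.
    return -weight if now - post_date > timedelta(days=old_after_days) else 0.0


def dnsbl_query_name(ipv4: str, zone: str = "sbl.spamhaus.org") -> str:
    # DNS-based blacklists are queried by reversing the IPv4 octets and
    # appending the list's zone; a successful lookup means "listed".
    return ".".join(reversed(ipv4.split("."))) + "." + zone
```

Each function returns a karma adjustment, so a low-credibility test (the proxy header) can simply be given a smaller weight than a high-credibility one (a blacklisted IP).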

In addition to these filters, Spam Karma uses different treatments and backup checks to ensure it becomes better at stopping further spam and that it never mistakenly deletes a legit comment:

  • Ambiguous comments (that can be neither deleted nor approved) are given a second check: the commenter is asked to solve a Captcha or use the email auto-moderation (an email containing a hash to unlock the comment is sent to the commenter’s email address). If confirmed, the comment’s Karma is bumped up and the comment is either published or held for further review. If not confirmed within a certain period, its Karma is lowered and it is either deleted or kept in moderation (if its Karma was sufficiently high to begin with).
  • When a comment is struck as spam, its IP and URL(s) are harvested and submitted to the Admin for inclusion in the blacklist. In the meantime, they are used as “auto-added” values, with a lesser weight than permanent blacklist entries.
  • When destroying a spam comment, it checks for recently posted comments that match similar values and retroactively moderates them (e.g.: a spammer could manage to slip X spams onto a blog, but upon reaching a certain suspicious threshold, all the comments would get retroactively moderated, then deleted).
  • Spam Karma uses a central DB to retrieve IP and URL updates. By default, it will query the DB automatically every 2 days (can be disabled). Central DB can be configured. Each install of Spam Karma can work as a sort of P2P relay in the update process (both fetching updates and publishing its own updated list for others to grab).
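
The email auto-moderation step mentioned in the list above can be pictured roughly as follows. This is a Python sketch under stated assumptions, not the plugin's code: the secret key, the function names, the karma bonus and the choice of HMAC-SHA256 are all hypothetical (the description only says the email contains "a hash").

```python
import hashlib
import hmac

SECRET_KEY = b"change-me"  # hypothetical per-blog secret, never sent to the commenter


def unlock_token(comment_id: int, email: str) -> str:
    """Build the hash that would be embedded in the confirmation email."""
    message = f"{comment_id}:{email.lower()}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def confirm(comment_id: int, email: str, token: str,
            karma: float, bonus: float = 5.0) -> float:
    """Return the comment's new karma: bumped up only if the token matches."""
    expected = unlock_token(comment_id, email)
    if hmac.compare_digest(expected, token):
        return karma + bonus
    return karma
```

Because the token depends on a secret only the blog knows, a spammer cannot forge it without actually receiving the email.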

Thanks and Acknowledgement
Many, many people have contributed, knowingly or not, to this plugin, with their ideas, code, help, testing, advice and support… I ended up rewriting most of the code I took from these plugins, but it nonetheless gave me a solid base to start with quickly. Thanks guys.


If you encounter any error or misclassification of comments (false positive, undetected spam), please contact me and preferably include the whole comment content, such as it appears in the admin screen (with the Spam Karma debug values).

Any comment or suggestion always welcome…

Filed under: Geek, WordPress

112 comments

  1. Thanks, Dave — wow, it works great! I’m so pleased. Hey, where’s the tip jar? I think the least I can do is buy you some beer/dog treats/imported Japanese video games.

  2. Has anyone had a problem where the comment text of a comment has been replaced with the word ‘object’? One of my plugins is doing this, and I’m suspicious of Spam Karma 🙂

  3. Hi, I have this msg every time I execute: http://www.e-tonilopez.com/wordpress/wp-content/plugins/spam-karma.php?spamk_setup

    *———————————————————————————–

    Auto-updating Blacklist Table.

    Populating WP-Blacklist Table from: ‘http://www.unknowngenius.com/blog/blacklist/’
    ERROR: Could not download from this Blacklist URL.http://www.unknowngenius.com/blog/blacklist/

    Populated WP-Blacklist table: imported 0 values, skipped 0 duplicates.

    Error: did not import anything…

    *———————————————————————————–

    Help please..

    Greetings

  4. If you were using WP 1.3 like Michael and had noticed strange behaviours in the rare cases where SK would flag a comment for moderation, please update: 1.9 fixed that. Actually, no matter what, if you are using an older version (current version: 1.10), you should update (and check the new Spam Karma page often enough for updates).

    For Tony and others who’ve been getting errors during DB updates: this is actually a problem with certain installs of PHP. If your host decided to configure PHP in “safe-mode”, then it is impossible for Spam Karma to connect to the mothership and automagically update its Blacklist DB at all. I have added a workaround for this in the latest update that will let you use a manually downloaded blacklist instead (just install the plugin and follow instructions). You might wanna consider asking your host to enable the “allow_url_fopen” flag, though, as it would make your life much easier (auto updates etc).

  5. Pingback: Dr Dave's Blog
  6. Thanks, Dave — wow, it works great! I’m so pleased. Hey, where’s the tip jar? I think the least I can do is buy you some beer/dog treats/imported Japanese video games.

    Lisa,

    Thanks a lot for the support… Well, just need to ask: http://unknowngenius.com/blog/archives/2004/12/09/buy-me-gifts-im-pretty/ 🙂

    I love dog treats (and beer), although I think I’m fine on the front of Japanese video games: I get up every morning and that’s usually enough.

  7. I have tried installing SK four times now. Twice via the plugin manager and twice manually. I only have access to my server via ftp, so I have to set the folder permissions via the file manager in cpanel. Kinda wonky, but it should work nonetheless.

    Anyways, when I installed manually, I seemed to have everything working except the captcha doesn’t show up. I get the graphic, but no text in it. There is also no email link at the bottom.

    When I install via the plugin manager, I get an internal server error when trying to go to the configuration page.

    Please advise if you have time. Thanks.

  8. I also get the internal server error when installing via the plugin manager but when I refresh the page, it works fine.
    I’m not sure what it’s all about.

  9. Looks like the karma filter can also block legitimate comments. I was trying to comment on a post about too many spams at a site which wouldn’t let me, complaining I had bad karma!

    This is stupid!

  10. Angsuman: The reason you had problems on other sites is the same reason you had problems on this very blog: your IP appears to be blacklisted by Spamhaus http://www.spamhaus.org/query/bl?ip=203.200.160.28
    There are many ways a legitimate user can end up on one of these blacklists, and admittedly they yield a lot of false positives, but this is a choice to be made by the owner of the blog. Unfortunately, in some cases, it is difficult to efficiently keep spammers away without setting restrictions that can affect legitimate users. At any rate, this is an option everybody is free to disable in Spam Karma, if they decide to, and I am in no way responsible for the blacklist itself.

    If you follow the instructions on the Spamhaus website, you should be able to have that ban removed easily.

  11. Could you please make it so that Spam Karma doesn’t delete the comments it thinks are spam?! I’m losing comments almost every day, and there’s no way to get them back 🙁

  12. Did the spammers who were rejected once manage to learn tricks to re-enter your site?

    Napo

  13. spam karma banned one of my best friends! she’s a legit wordpress user herself and is not at all IT savvy. i have no idea why spam karma would ban her as she is always polite and kind (she’s a grandmom forheavensakes).

    my spam karma settings at the time were on moderate and i did not touch the other settings as setup advised that if I didn’t know what i was doing to leave well enough alone.

    what went wrong?

  14. @Dr Dave
    I didn’t check back on your response until today. I figured it out myself sometime back why Spam Karma hates me 🙂
    My internet traffic from home goes through an IP of my hosting provider, which it uses for the whole city of Kolkata/Calcutta, India. Presumably someone used it to spam. More likely they had some virus/malware on their machine which did the damage. Actually there was one such incident which clogged all traffic for a few hours. So anyway, I am also forced to send my traffic from this unfortunate IP address. And as you can see there’s nothing much I can do about it. I can delist from SBL, but if someone from the entire city sends a spam it will surely be listed again.
    My question is: why not force a CAPTCHA in such cases? And why not give a more polite message? It feels like commenters are being incarcerated without trial 🙂
    I can tell you as a commenter and as a blogger (from 2001) the SK messages are terribly insulting and due to such false positive issues as above can hurt legitimate viewers of a site.
    Also realize that 99% of spammers use a bot of some kind to spam. Insulting messages don’t deter bots. And spammers are surely thick skinned too, otherwise they wouldn’t be spamming in the first place.
    So in the end real users get the brunt of Spam Karma language bombs, if I may say so 😉

    Angsuman Chakraborty

  15. To Whom It May Concern,
    I just attempted to leave a comment in response to another comment at Jinlynch.com and your stupid program blindly and without ANY reason and/or merit, deleted and would not allow my comment to be posted! Before unleashing parasitic and socially cancerous programs such as yours to regulate and control speech in an open and public forum, you should be sure it’s fair and just in its operation! I’ll be contacting my attorneys in the morning to pursue legal action against the creators of this hideous program. No society can truly be free when there are forces who have unilateral and uncontrolled power to regulate social commentary! See’ya in court.
