Another anti-spam technique for my blog…
I continue to analyze the blog comments that get caught by Akismet and look for patterns that I can use. One of the very common patterns is, of course, the huge number of links, and I’m working on an idea for that. Another thing I have noticed is that a large number of blog comments post their links in both html and bbcode format. I can use that.
You see, WordPress does not use bbcode, so those comments are worthless. It seems that the bots or humans are smart enough to recognize an input form, but not smart enough to know which system they’re posting to.
For that reason, I have enabled the “You can use these tags” option above the comment form. For those of you using phpBB who are interested in leaving a comment, please make a note of that. If you want to leave a link in a comment, please use html and not bbcode. Any comments posted with “URL” links are immediately trashed; they never even make it into the Akismet queue. Since I added that code I have blocked 11 comments that otherwise might have been allowed through.
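To make the idea concrete, here is a minimal sketch of a bbcode link check. It is written in Python purely for illustration and is not the code running on this blog; the function name `has_bbcode_url` and the exact pattern are assumptions about what such a test could look like.

```python
import re

# Matches an opening bbcode URL tag such as [url] or [url=http://...],
# in any letter case. The pattern is an illustrative assumption.
BBCODE_URL = re.compile(r"\[url(?:=[^\]]*)?\]", re.IGNORECASE)


def has_bbcode_url(comment_text: str) -> bool:
    """Return True if the comment contains a bbcode-style URL tag."""
    return BBCODE_URL.search(comment_text) is not None


if __name__ == "__main__":
    print(has_bbcode_url("[url=http://example.com]cheap stuff[/url]"))
    print(has_bbcode_url('<a href="http://example.com">a real link</a>'))
```

A comment that trips this check would be trashed before any other processing, which is why it never reaches the Akismet queue.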
Here are some spam stats for comments that have been blocked by my custom code since the beginning of the year:
5,224 Comments made without clicking the “Confirm” checkbox
452 Oversized comments rejected (one of those was my own…)
140 Comments rejected because they called the standard WP comment form directly without going through the link on my site
And, as mentioned, 11 comments rejected because of the inclusion of the URL tag since I added in that code a few days ago.
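For the curious, one common way to tell whether a submission came through a page on the site, rather than being posted straight at the comment handler, is to embed a signed token in the form and verify it on submission. The sketch below is only an illustration of that general idea in Python. It is an assumption, not a description of how this blog does it, and `SECRET_KEY`, `MAX_AGE_SECONDS`, `issue_token`, and `token_is_valid` are all hypothetical names.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET_KEY = b"change-me"   # hypothetical server-side secret
MAX_AGE_SECONDS = 3600      # hypothetical lifetime of a form token


def _sign(post_id: int, issued: int) -> str:
    """Sign the post id and issue time with the server secret."""
    message = f"{post_id}:{issued}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def issue_token(post_id: int, now: Optional[float] = None) -> str:
    """Create the token that gets embedded in the form when the page is served."""
    issued = int(time.time() if now is None else now)
    return f"{issued}:{_sign(post_id, issued)}"


def token_is_valid(post_id: int, token: str, now: Optional[float] = None) -> bool:
    """Accept only a fresh token that was issued for this post."""
    current = time.time() if now is None else now
    try:
        issued_text, signature = token.split(":", 1)
        issued = int(issued_text)
    except ValueError:
        return False  # missing or malformed token: the form was bypassed
    fresh = 0 <= current - issued <= MAX_AGE_SECONDS
    expected = _sign(post_id, issued)
    return fresh and hmac.compare_digest(signature.encode(), expected.encode())
```

A bot that posts directly to the handler has no token at all, so it falls into the "malformed" branch and is rejected.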
The reason these numbers are significant is that I personally review each and every comment held by Akismet for moderation rather than blindly clicking the “Delete All” button. One of my valued and frequent commenters has somehow managed to get on the spammer list, and every one of his comments has been tagged by Akismet. There doesn’t seem to be a way to put him on an approved list without changing my blog settings, so for now I continue to review my spam queue. And on some days it gets quite long.
At this point my specific measures have blocked over twice as many comments as Akismet has caught. Akismet has caught 2,683. I’ve blocked nearly twice that many just by adding the “confirm” checkbox alone.
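The comparison is easy to check from the figures quoted in this post. The short Python sketch below just redoes the arithmetic; the variable names are mine, and it assumes the Akismet count covers the same period as the custom-code counts.

```python
# Figures quoted in this post.
akismet_caught = 2683
blocked_by_custom_code = {
    "no confirm checkbox": 5224,
    "oversized comment": 452,
    "direct call to comment form": 140,
    "URL tag": 11,
}

custom_total = sum(blocked_by_custom_code.values())
ratio_total = custom_total / akismet_caught
ratio_checkbox = blocked_by_custom_code["no confirm checkbox"] / akismet_caught

print(custom_total)              # 5827
print(f"{ratio_total:.2f}")      # 2.17
print(f"{ratio_checkbox:.2f}")   # 1.95
```

So all of the custom checks together stopped a little over twice what Akismet caught, and the checkbox by itself accounts for almost all of that.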
Speaking of the confirmation checkbox, if you elect to leave a comment you’ll see that the confirmation checkbox process has changed a little bit.
I find myself wondering if those leaving comments are really humans or bots, you see, so I have created a little test. There are now four checkboxes instead of one on the comment form. Only one of those checkboxes should be clicked. If you click the wrong checkbox, your comment is trashed. If you click more than one checkbox then your comment is trashed. But don’t worry, the correct checkbox is marked.
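The rules above can be sketched in a few lines. This is a Python illustration of the logic only, not the code behind the comment form; the checkbox names in `BOX_NAMES` are hypothetical, and picking the correct box at random for each page view is an assumption.

```python
import secrets

# Hypothetical field names for the four checkboxes on the form.
BOX_NAMES = ["confirm_a", "confirm_b", "confirm_c", "confirm_d"]


def pick_correct_box() -> str:
    """Choose which checkbox will be labelled as the one to click."""
    return secrets.choice(BOX_NAMES)


def judge_checkboxes(checked: set, correct: str) -> str:
    """Apply the rules: exactly one box may be marked, and it must be the right one."""
    if not checked:
        return "rejected: no checkbox marked"
    if len(checked) > 1:
        return "rejected: more than one checkbox marked"
    if checked != {correct}:
        return "rejected: wrong checkbox marked"
    return "accepted"


if __name__ == "__main__":
    correct = pick_correct_box()
    print(judge_checkboxes({correct}, correct))       # a careful human
    print(judge_checkboxes(set(BOX_NAMES), correct))  # marks everything it sees
    print(judge_checkboxes(set(), correct))           # ignores the checkboxes
```

Keeping a separate rejection reason for each case is what makes it possible to see later which mistake was made.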
The battle continues…


Well, this is interesting.
I was wondering if “bots” simply look at the submission form and mark everything they see. In other words, do they read the form, or do they see a radio button (make a selection) and see a checkbox (mark it) and move on?
Since early this morning when I added the multiple checkboxes I have had several comments rejected because they marked all of the checkboxes. Five, in fact. That’s five more comments that were eliminated immediately. The bots are taking the bait and clicking all of the checkboxes without bothering to understand what they’re for. Ha, score one for Dave.
I found this interesting too… examine the following lines from my rejected comment log:
Post number 41 is one of my posts on the Page Permissions MOD. It currently has zero comments. So someone (or something) is trying really really hard to enter a comment. The main issue here is the amount of elapsed time between comment attempts… take a good look and you’ll see what I’m getting at.
Now if you’re a reader of this blog and are trying to comment on that post, I apologize for the difficulty that you’re having. That’s one of the challenges with making life more difficult for the spammers; you run the risk of making life more difficult for real people too. But given the timings, I am fairly confident that what is happening is a bot is entering a comment, then reviewing the screen to see if their comment got accepted, and when they discover it wasn’t they’re simply trying again. And getting rejected.
Comment by dave.rathbun — March 17, 2007 @ 12:56 pm
Further update: it seems that comment bots will either not see the confirmation checkbox (leaving it blank) or they see all of the comment checkboxes, so they mark them all.
I have had 41 comments rejected because no checkboxes were marked, and 23 rejected because all checkboxes were marked.
There were 3 additional comments that made it through the gauntlet. Akismet correctly caught two, and there was one legitimate comment. I am much happier with the signal to noise ratio now: 67 comments in 12 hours, and only 3 to review. That makes it much less likely that I will miss a real comment that gets routed to the Akismet queue.
This would actually be a very easy method to add to the phpBB registration page as well. I think I will do that.
And of course, I will report back here on my blog with the results.
Comment by dave.rathbun — March 18, 2007 @ 11:06 am
Hi Dave, what you are speaking about? What’s the blog spam?
Joke
Some time ago I developed Advanced Textual Confirmation. (Do you remember Textual Confirmation for phpBB?) It protects everything PHP-based. On my sites, it is: phpBB, WordPress, MediaWiki, PHPwiki, probably something else. Home page:
http://bbantispam.com/atc/
Comment by olpa — March 18, 2007 @ 10:37 pm
The technique of having a random checkbox certainly seems to work. Since midnight there have been 81 comments attempted. Of those, 57 marked every checkbox and were rejected. There were also 19 that were rejected because they didn’t mark any checkbox. That left 5.
Of those, one was rejected for using the url and href tags to enter links, another classic spammer technique. That left 4 comments, all captured by Akismet, and all spam.
I can live with that.
I plan to test this concept on a phpBB board that I have as a spammer target. It will be interesting to see if the reg-bots use the same technique as the comment spam bots. Comment bots seem to either ignore the checkbox that is required or simply mark every single checkbox that is available.
Comment by dave.rathbun — March 21, 2007 @ 9:22 am
My Akismet queue has been nearly empty all week. Why?
Since April 5 here are the log statistics:
All checkboxes marked: 717
Zero checkboxes marked: 690
Successful comment processed: 11
There were a few other comments rejected for other reasons, but I think that this has real promise. I am finally getting to a point where I can spend some time on phpBB MODs again, and I hope to incorporate my multi-checkbox test on the registration page very soon.
Comment by dave.rathbun — April 13, 2007 @ 9:27 am
hehe, nice to see these unique anti spam methods. bots definitely need to be adjusted to crack this
Comment by gratis forum — June 4, 2007 @ 8:09 am
When is this site finally becoming public?
Comment by Gratis Forum — December 22, 2007 @ 5:21 am
At this point it might not ever become public.
With the release of phpBB3 the interest in and demand for phpBB2 services is going to decline. I have already cancelled one project that I was working on, and at this point need to consider how much of my time I want to continue to dedicate to phpBB2 for other folks versus my own boards.
At this time, there is still no release date, and in fact the board may remain open only for private clients. But thanks for your interest… things could still change.
Comment by Dave Rathbun — January 9, 2008 @ 9:59 pm