Checkbox Challenge
I took the idea behind the “checkbox challenge” technique that I wrote to combat the spam bots on my blog and used it to start developing a phpBB registration MOD today.
Registration bots can be programmed to handle certain variables. For example, a while back I started developing a MOD that changed the “agreed=true” on the registration URL to “xyzzy=1239sb” instead. The variable name was created in the ACP, and the value was generated from the user IP and session. It was based on an idea from Pentapenguin; I just took it a bit further.
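To make the idea concrete, here is a minimal sketch in Python rather than the MOD's PHP. It is not the MOD's code: the function names, the shared secret, and the use of HMAC-SHA256 are all assumptions for illustration, since all that is described above is that the value came from the user IP and session.

```python
import hashlib
import hmac


def make_agreement_token(ip: str, session_id: str, secret: bytes, length: int = 8) -> str:
    """Derive a short per-visitor value from the IP and session (hypothetical scheme)."""
    message = f"{ip}|{session_id}".encode("utf-8")
    digest = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return digest[:length]


def check_agreement_token(submitted: str, ip: str, session_id: str, secret: bytes) -> bool:
    """Recompute the expected value and compare it with what the URL carried."""
    expected = make_agreement_token(ip, session_id, secret)
    # Compare as bytes so unexpected non-ASCII input cannot raise an error.
    return hmac.compare_digest(submitted.encode("utf-8"), expected.encode("utf-8"))
```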
And it didn’t work.
What I should say is that the code worked fine; it just didn’t stop the reg bots from doing their nasty deeds. I installed this MOD on one board, and while I didn’t track the frequency of spam registrations (it may have dropped some), they certainly didn’t stop. I am assuming that the bot is smart enough to try the standard URL and, if that fails, drop back to the registration page and “click” through to the registration form just like a human would. Bots can certainly do that.
And bots can apparently also easily defeat the phpBB CAPTCHA.
What bots cannot do is react to something that requires thought. There are a number of MODs that present the user who is attempting to register with a question whose answer they have to type in (not just pick from a list). I toyed with that idea as well. There are a couple of challenges that I see with this. First of all, the answer has to be either very obvious or available somewhere on the site. The “Textual Confirmation” MOD uses questions and answers that the board owner selects. The VIP MOD tells you to “hide” the registration code somewhere on your site. The user has to go find it before they can register. Both of those should work fairly well, and from reports I have seen they do.
I have users that cannot seem to read the “over or under 13” text. They’re going to find the VIP code hidden somewhere on my site?
But I digress. What the Checkbox Challenge will do is present the potential user with a series of checkboxes. One of those checkboxes will be marked in some way. That checkbox and only that checkbox must be marked in order to complete the registration.
When I first put a checkbox for “confirmation” on my blog comment form, all a user had to do was click a single checkbox. That stopped a lot of spam comments, but not enough. I found it hard to believe that humans were responsible for all of the spam, but I also found it hard to believe that bots were smart enough to click the checkbox. So I added three more decoy or “bait” checkboxes as well. The valid checkbox randomly moves from the first through the fourth position, and is marked with a couple of ** characters. If you scroll down to the bottom of this page you’ll see exactly what I’m talking about.
Even then it seemed that I was getting a lot more bot comments than I should. I finally realized what was going on when I added just one more check to my code. I had previously set it so that the confirmation checkbox had to be checked. The bots were, in fact, doing that. But it seems that they were also checking every other checkbox on the form!
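The form side of this can be sketched roughly as follows, in Python rather than the PHP that phpBB uses. The field names (`confirm_0` and so on), the `Challenge` type, and the helper names are made up for illustration; only the four boxes, the random position, and the ** marker come from the description above.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Challenge:
    field_names: List[str]  # one input name per checkbox, in display order
    valid_index: int        # position of the one checkbox that must be ticked


def build_challenge(num_boxes: int = 4, rng: Optional[random.Random] = None) -> Challenge:
    """Pick a random position for the valid checkbox among num_boxes checkboxes."""
    rng = rng or random.Random()
    names = [f"confirm_{i}" for i in range(num_boxes)]  # hypothetical field names
    return Challenge(field_names=names, valid_index=rng.randrange(num_boxes))


def render_labels(challenge: Challenge, marker: str = "**") -> List[Tuple[str, str]]:
    """Return (field name, label) pairs; only the valid checkbox gets the marker."""
    return [
        (name, marker if i == challenge.valid_index else "")
        for i, name in enumerate(challenge.field_names)
    ]
```

The server has to remember `valid_index` (for example in the session) so that it can check the submitted form against it later.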
Now it makes sense. I wasn’t rejecting those comments because I was only making sure that the correct checkbox was marked. Once I started rejecting comments that had more than one checkbox marked my spam comment count dropped to nearly zero.
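The difference between the two checks is small enough to show in a few lines. This is a hedged Python sketch with hypothetical names, not the code running on the blog: it assumes the server knows the checkbox field names, the position of the valid one, and the set of names that came back checked.

```python
from typing import Iterable, Sequence


def passes_old_check(checked: Iterable[str], field_names: Sequence[str], valid_index: int) -> bool:
    """Original rule: the valid checkbox is checked; bait checkboxes are ignored."""
    return field_names[valid_index] in set(checked)


def passes_new_check(checked: Iterable[str], field_names: Sequence[str], valid_index: int) -> bool:
    """Stricter rule: the valid checkbox is checked and no bait checkbox is."""
    # Ignore unrelated form fields; look only at the challenge checkboxes.
    checked_boxes = set(checked) & set(field_names)
    return checked_boxes == {field_names[valid_index]}
```

A bot that ticks every box passes the old check and fails the new one, which matches what the rejected comments showed.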
What my behavior analysis of the comment bots on my blog shows is that about 68% of them are not smart enough to click the checkbox. I’m sure there are some actual user comments in there too… I know I get bitten myself every now and then. Nearly 30% of the comments blocked are rejected because they click not one extra but every extra checkbox. That’s why I call them “bait” now.
I can see the logic behind this… the bot comes and is expecting certain basic components to the comment form. Anything else that it finds it simply reacts to. And reacts to every single component. That means the bot checks the one valid checkbox plus the three additional bait checkboxes. I think it’s significant that I have never (other than for my own test cases) seen a comment where only two or three checkboxes are marked.
It is always zero, one, or four.
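Sorting attempts into those buckets is simple to do in code. The sketch below is Python with made-up category names and assumes each logged attempt records how many challenge checkboxes were ticked and whether the valid one was among them; it is an illustration, not the analysis code behind the percentages above.

```python
from collections import Counter
from typing import Iterable, Tuple


def classify_attempt(checked_count: int, valid_checked: bool, total_boxes: int = 4) -> str:
    """Put one attempt into a bucket based on how many checkboxes were ticked."""
    if checked_count == 0:
        return "none_checked"   # never touched the challenge
    if checked_count == 1 and valid_checked:
        return "accepted"       # exactly the valid checkbox
    if checked_count == total_boxes:
        return "all_checked"    # took the bait: every checkbox ticked
    return "other"              # wrong single box, or two or three boxes


def tally(attempts: Iterable[Tuple[int, bool]], total_boxes: int = 4) -> Counter:
    """Count attempts per bucket; each attempt is (checked_count, valid_checked)."""
    return Counter(classify_attempt(count, valid, total_boxes) for count, valid in attempts)
```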
So that becomes the basis for my Checkbox Challenge MOD. It has a full Admin Control Panel (ACP) interface. The board owner can configure everything from the number of checkboxes displayed and the way the valid checkbox is marked to distinguish it from the bait to the names of the checkbox input fields on the form. Here’s a shot of the admin control panel as it stands so far:

The banning code has not been written yet. To be honest, I’m not a huge fan of “auto-banning” people, but I know that if I plan to release this MOD at phpbb.com that’s going to be one of the feature requests. I will put it in, even if I don’t use it myself.
I have combined this with my EZ Registration MOD so that the registration screen the potential user sees is quite simple:

I have one board that is not very active. Okay, it’s not active at all.
But it does have a decent number of links from Google and it does seem to be on a lot of spammer lists, as I get several registrations a day there. 100% of them are spam.
I installed the first version of the Checkbox Challenge there this afternoon. For the past month I have been getting an average of 4 spam registrations a day.
I hope to report success in a few days.


Great idea there. I hope you have the same success with phpBB as you’ve had with blocking spam from this blog.
Comment by eviL3 — April 15, 2007 @ 7:51 am
So far it looks good, thanks for the words of encouragement.
I installed the Checkbox Challenge MOD on one of my boards that gets essentially nothing but spam registrations, and an average of about 5 a day at that. Since I installed it, nothing. I wish I had thought of this yesterday… but I didn’t… today I altered the code so that it would log rejected attempts; otherwise I wouldn’t know that anything is happening.
As of about noon today when I added the log code I have rejected about one spambot registration every hour.
I also removed all of the banned emails during the testing… I want to let them all come and see if they can make it through. I plan to put it on another board tonight, and I have permission to add it to a third board that I host (that is owned by someone else) as well.
So far so good.
Comment by dave.rathbun — April 15, 2007 @ 6:47 pm
36 hours later, and zero spam registrations. There are two legitimate registrations (folks I asked to come test) and every other registration failed to click the proper checkbox and was rejected.
This is looking quite promising.
Comment by dave.rathbun — April 17, 2007 @ 9:44 am
I installed the MOD on two more boards. I also tweaked the log that is managed by the code, and am including the IP address used during registration as well.
Comment by dave.rathbun — April 18, 2007 @ 2:20 pm
I was wondering where I can find this MOD and how hard it is to install. I seem to get about 8-20 spam bots a day.
Comment by harry — July 6, 2007 @ 11:18 am
Hi, harry, sorry to hear about your bot issue. I think we all get that. You can download the preliminary version of this MOD from the development topic at phpbb.com.
I have made some improvements to the code on my own board(s) but have not written them up and updated the code that is available for download yet. The code that is there should be functional, even if it doesn’t have all of the features that the final version will have.
Comment by dave.rathbun — July 6, 2007 @ 3:09 pm