Some people wonder just how bad the spammer problem is with phpBB2. I can answer the question posed in the subject of this blog post in one word: Very.
Partly as an experiment, and partly to capture more seed data for the upcoming relaunch of the bbProtection service, I set up a phpBB2 board with no protection other than what is built into the software. I have enabled user activation and I have activated the visual confirmation. I launched the board on August 15. Within 48 hours I had my first spam registration and my first spam post. The honey pot process has started slowly, but I’m getting an average of four registrations a day so far. Nine of the 17 registrants have posted at least once (about 53%). None of the posts are anything you would want your children to see; it’s really nasty stuff.
The only MODs I’ve applied to this board are a MOD to capture the IP address during the registration process (in case the bot doesn’t post I still want to know where they’ve come from) and a MOD to add the “nofollow” attribute to every link. If Google finds this board I don’t want to be penalized for all of the nastiness on the other end of the outbound links.
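For readers curious what the “nofollow” idea looks like in practice, here is a minimal sketch in Python. It is not the phpBB MOD itself (that is PHP and lives inside the board software); the function name add_nofollow and the regex approach are my own assumptions for illustration, and a simple regex like this is not a full HTML parser.

```python
import re

# Matches the opening of an anchor tag, e.g. <a href="...">.
# The \b keeps it from matching other tags such as <abbr>.
ANCHOR_OPEN = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)


def add_nofollow(html: str) -> str:
    """Add rel="nofollow" to every <a> tag that has no rel attribute yet.

    Hypothetical helper for illustration only; it assumes reasonably
    tidy markup and does not handle '>' characters inside attribute values.
    """
    def fix(match: re.Match) -> str:
        attrs = match.group(1)
        if re.search(r"\brel\s*=", attrs, re.IGNORECASE):
            return match.group(0)  # leave an existing rel attribute alone
        return f'<a{attrs} rel="nofollow">'

    return ANCHOR_OPEN.sub(fix, html)


if __name__ == "__main__":
    print(add_nofollow('<a href="http://example.com">spammy link</a>'))
```

The point is only that every outbound link ends up carrying the attribute, so search engines are told not to pass any credit along to whatever the spammer linked to.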
I’ll be back in a month to post more statistics about the board. It should be interesting.
The media team for phpBB Londonvasion 2008 has posted the videos from the public day of the conference. Yay! I’m going to be hosting my own talk here on my site soon, but the entire group of presentations is available at vimeo.com. Here is my topic:
The talk includes a case study that regular (and long-term) blog readers are likely to be very familiar with: the Checkbox Challenge MOD for WordPress, phpBB Registrations, and even Comment Forms. During the talk there were a few times where I took questions from or had a discussion with the audience, but I think most of the “dead air” has been removed thanks to AdamR’s fine editing skills.
The bbProtection folks have launched a blog. The most recent post mentions that they’ve opened up the IRC channel for input from the user community.
At this point I would like to mention that I was invited to join the bbProtection team a few weeks ago and I accepted a limited role. I won’t be doing any coding (at least that’s the plan). My role is more of an enthusiastic user than anything else, I guess. I have offered my input as to the relative value of some of the features being considered and suggested some others. I hope to be able to provide some value as far as the database design and tuning, as that’s where my main expertise lies.
Why mention this now? Because if you do pop in to the IRC channel as discussed on the team blog, I may be there as one of the team members that you see. I don’t go into IRC every day but if I am signed on, I will be in the channel. If you have any concerns or comments about the service I would be happy to hear them, as would any of the other team members.
At Londonvasion 2008 I delivered a talk about various anti-spam techniques available for board owners. One of the challenges facing board owners today is that spammers are getting more creative at masking their true intentions. They post stuff that looks like legitimate content but contains cunningly masked spam. Unless a board owner takes the time to research the rest of the web, it can be difficult to determine if the same content is appearing on other boards.
That’s where a service like Akismet (for WordPress) or bbProtection (for bulletin boards) comes in.
I’ve provided a link to a pdf version of the presentation that I did at Londonvasion. If you don’t want to download the entire presentation, here’s a brief recap:
There are three different elements in the fight against spam, as outlined here:
Prevention means being able to keep spammers from getting on your board in the first place.
Detection means being able to quickly identify and react to spam if it is posted.
Elimination means being able to easily and thoroughly clean up the mess that a spammer has left behind.
One of the boards that I help administer is seeing a new form of spam that I call fake signatures. It’s very irritating, but quite creative. The people (or person?) doing this are joining the board and getting past the checkbox challenge, so I assume they’re human. They are posting what at first glance looks like legitimate content. But there are symptoms.
I have another site that doesn’t yet have an active phpBB board attached to it, but it does have a blog. So I added my Checkbox Challenge for Blog Comments and all is well. A few weeks ago I started getting all sorts of emails from my comment form, of all places, all plugging various blogspot blogs. $%@# spammers, don’t they realize that the only one who’s going to see the comment form content is me? As in one person?
I have since added my Checkbox Challenge to the contact form, and the spam has been 100% eliminated. I read more about it here as well.
But I don’t get it. Forum posts? Public content. Blog comments? Also public content. Comment forms? Nobody gets them but me. What a waste of time.
A few days ago someone commented on my blog concerning the timing of spam. When does it happen? During the day? At night? On weekends? So I started playing with numbers.
I like playing with numbers, in case you had not figured that out by now.
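If you want to play with the same kind of numbers on your own board, here is a rough sketch of the idea in Python. It assumes you already have the registration or post times as datetime values; the function name bucket_by_time and the sample timestamps are made up for illustration and are not data from my boards.

```python
from collections import Counter
from datetime import datetime


def bucket_by_time(timestamps):
    """Count events per hour of day and per weekday.

    Weekdays use Python's numbering: 0 is Monday, 6 is Sunday.
    Hypothetical helper; assumes all timestamps share one time zone.
    """
    by_hour = Counter(ts.hour for ts in timestamps)
    by_weekday = Counter(ts.weekday() for ts in timestamps)
    return by_hour, by_weekday


if __name__ == "__main__":
    # Invented sample times, for illustration only.
    sample = [
        datetime(2008, 8, 16, 3, 15),
        datetime(2008, 8, 16, 3, 40),
        datetime(2008, 8, 17, 14, 5),
    ]
    hours, weekdays = bucket_by_time(sample)
    print(hours.most_common())
    print(weekdays.most_common())
```

Sorting either counter with most_common() answers the “day, night, or weekend?” question directly, at least for whatever sample you feed it.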
For a very long time the #1 country-code top-level domain (TLD) source for spammer registrations has been Russia (.ru). One of the primary offenders is mail.ru, as I’m sure many of you are aware. For a while it seemed that .org addresses were the ones being abused, and then it was .info. Now it seems that China (.cn) is the new TLD of choice for spammers.
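A tally like this is easy to reproduce from a list of registration email addresses. The sketch below is in Python and is only an illustration: the helper names tld_of and tally_tlds are hypothetical, and the sample addresses are invented, not real registrations.

```python
from collections import Counter


def tld_of(email: str) -> str:
    """Return the last dot-separated label of the email's domain, lowercased."""
    domain = email.rsplit("@", 1)[-1]
    return domain.rsplit(".", 1)[-1].lower()


def tally_tlds(emails):
    """Count how many addresses fall under each top-level domain."""
    return Counter(tld_of(address) for address in emails)


if __name__ == "__main__":
    # Invented sample addresses, for illustration only.
    sample = ["a@mail.ru", "b@MAIL.RU", "c@example.cn", "d@example.org"]
    for tld, count in tally_tlds(sample).most_common():
        print(tld, count)
```

Note that this looks only at the final label, so it lumps everything under .ru together; counting full domains (mail.ru and friends) would just mean returning the whole domain from the helper instead.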
It was almost exactly one year ago today that I posted about what has become the final version of the Checkbox Challenge comment blocker for my WordPress blogs. At that time I was logging the comment attempts and the results of the challenge to a text file. Now I log things to a database so I can search them more easily. I had thousands of entries in the text file log before I switched, and I did not bother to convert them. (I just checked; the text file was used from January of 2007 to August of 2007 and contains over 22,000 lines of data.)
The database logging process was added on August 1, 2007. So I ran for about seven months without the database log, and now have run for just over six months with the database logging in place. The results are, frankly, both astounding and a bit scary.