September 19, 2008

Spammer “Tells”

Filed under: Anti-spam, bbProtection, phpBB — Dave Rathbun @ 6:11 am

If you have ever watched poker (or other games that involve bluffing) then you might have heard people talk about “tells” from other players. A “tell” is simply something that the person does – perhaps without even being aware of it – that gives away certain information. Spammers do the same thing. If I can find their tells then I can use that information against them, just like I could use that information to my advantage in a poker game.

Here are some “tells” that I have identified after analyzing my phpBB2 honey pot board with one month of spammer data.

Time Zone

A while back someone posted a MOD at phpbb.com that banned anyone that registered with the time zone of GMT – 12. If you check, you’ll find that GMT – 12 is in the middle of the ocean. Which reminds me of an old joke which I will paraphrase here:

Question: What do you call 10,000 spammers at the bottom of the ocean?
Answer: A good start!

:lol: Okay, maybe that one was just for me… on with the program…

So here are the statistics from my honey pot board for all users other than the original admin (that would be me) and the Anonymous user:

+---------------+----------+
| user_timezone | count(*) |
+---------------+----------+
|        -12.00 |      463 |
+---------------+----------+

Hm. That looks like a fairly significant tell to me. Every single spammer registered with the same time zone. Why do you suppose that is happening? Is it because that’s the default time zone? In fact, it’s not. On my honey pot board I set the board timezone to GMT – 5 which becomes the default for new user registrations. That means that spammer bots are specifically changing the time zone from -5 to -12 during their registration process. The only thing significant about -12 is that it’s the first option on the drop-down list. It would seem that the registration bots are making sure they select something, and in this case it’s something that nobody should really be selecting.
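
A count like the one above can be reproduced with a simple GROUP BY. The following is only a sketch, not the query actually run on the honey pot board: it uses an in-memory SQLite table as a stand-in for the phpBB2 users table, the rows are made up, and it assumes that the Anonymous user has user_id -1 and the original admin has user_id 2 (check your own board before relying on that).

```python
import sqlite3

# Hypothetical miniature of a phpBB2 users table; only the columns needed here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE phpbb_users (user_id INTEGER, user_timezone REAL)")
conn.executemany(
    "INSERT INTO phpbb_users VALUES (?, ?)",
    # Made-up rows: Anonymous, the admin, and three bot registrations.
    [(-1, 0.0), (2, -5.0), (3, -12.0), (4, -12.0), (5, -12.0)],
)


def timezone_counts(conn, excluded_ids=(-1, 2)):
    """Count registrations per time zone, skipping the excluded accounts."""
    placeholders = ",".join("?" for _ in excluded_ids)
    sql = (
        "SELECT user_timezone, COUNT(*) FROM phpbb_users "
        f"WHERE user_id NOT IN ({placeholders}) "
        "GROUP BY user_timezone ORDER BY COUNT(*) DESC"
    )
    return conn.execute(sql, tuple(excluded_ids)).fetchall()


print(timezone_counts(conn))
```

On a real board the same SELECT would run against MySQL; only the connection code would differ.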

Is this a bullet-proof tell? It’s hard to know for sure, but the odds seem favorable.

User Location

What about the user location field, are there any patterns there? Here are the top 10 locations provided by spammer registrations on my board:

+-------------+----------+
| user_from   | count(*) |
+-------------+----------+
| Sex Relaxxx |      141 |
| USA         |       36 |
| Russia      |       33 |
| adult       |       19 |
| US          |       18 |
| Canada      |       18 |
| Greece      |        8 |
|             |        6 |
| Jamaica     |        6 |
| Kazakhstan  |        5 |
+-------------+----------+

The first one seems to indicate a spammer, as does the fourth. It’s hard to say much about the others.

Then there are those that enter a complete web site in the location field. There are only 6 (out of 464) on my honey pot board that did this, and to be honest I have seen legitimate users do this as well, so it would be hard to classify this as a solid tell of a spammer.
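
For anyone who wants to find those registrations anyway, here is a rough sketch of a check for a location value that looks like a web address. The pattern and the short list of top-level domains are my own illustrative choices, not something taken from the honey pot data, and as noted above a match is a weak signal at best.

```python
import re

# Heuristic only: an explicit scheme, a leading "www.", or a bare host name
# ending in one of a few common top-level domains (illustrative list).
URL_LIKE = re.compile(
    r"(https?://|www\.)\S+|\b[\w-]+\.(com|net|org|info|biz|ru)\b",
    re.IGNORECASE,
)


def location_looks_like_url(user_from):
    """Return True if the location field appears to contain a web address."""
    return bool(URL_LIKE.search(user_from or ""))


for value in ["Canada", "Sex Relaxxx", "http://example.com", "www.example.org"]:
    print(repr(value), location_looks_like_url(value))
```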

Profile Websites

For many years I observed spammers that would try to register on my boards only to get their web sites listed in their profile, which would then be displayed as a link on the memberlist. The first anti-spam measure I took was to prevent inactive members from showing up (a very simple, common, and popular MOD that can be found at phpbb.com as well). The next step was to prevent a user from entering a web site until they had posted a few times.
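
The second measure boils down to a small rule applied when a profile is saved. This is a sketch of that rule only, not the code of any published MOD; the threshold of 3 posts and the function name are hypothetical, since "a few" is all that is specified here.

```python
MIN_POSTS_FOR_WEBSITE = 3  # hypothetical threshold; pick whatever "a few" means for you


def allowed_website(submitted_website, user_posts, min_posts=MIN_POSTS_FOR_WEBSITE):
    """Return the website value to store: blank until the member has posted enough."""
    if user_posts < min_posts:
        return ""
    return submitted_website.strip()


print(repr(allowed_website("http://example.com", user_posts=0)))
print(repr(allowed_website("http://example.com", user_posts=5)))
```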

However, things seem to have changed. These simple measures became so popular that I suspect spammers started doing things to work around them. One of the changes made, interestingly enough, involved putting a legitimate website into their profile. Would you believe that one of the most popular web sites entered by spammers now is google? :lol: Now I like to blame google for lots of things, but I doubt that they’re really behind all of the spammers joining my board.

Email Address

I have had plenty of posts where I called out specific email domains being used by spammers. I think it’s relatively easy to see patterns here. For example, these are the top 10 email domains used to register on my honey pot board:

+----------------------+----------+
| email_domain         | count(*) |
+----------------------+----------+
| serpdomains.com      |      142 |
| mail.ru              |      126 |
| gmail.com            |       33 |
| gawab.com            |       28 |
| dp-blog.com          |       25 |
| mymail-in.net        |       15 |
| gmx.us               |       15 |
| greatfreemail.net    |       12 |
| mp3bank.in           |        9 |
| paydayloancourse.com |        4 |
+----------------------+----------+

Notice who is number three on the list? That’s right, gmail. Along with spammer favorites like mail.ru and gawab.com I now have to deal with spammers using gmail accounts. It’s relatively easy to justify banning an email domain like anotherstupeddomain4bots.org (yes, I really got that, along with other domains in this post). I have heard of board owners that take the rather drastic step of banning all “free” email providers including hotmail and yahoo. I don’t think that’s a good step to take if you are trying to attract a wide range of members. Based on behavior I don’t have any problem adding certain domains to my banlist. I do have a problem with banning gmail and other free email accounts just because some spammers use their service.
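
A domain tally like the one above is easy to build from a list of registered addresses. A minimal sketch, assuming the addresses have already been pulled out of the users table; the helper names and the sample addresses are made up for illustration.

```python
from collections import Counter


def email_domain(address):
    """Return the lowercased part after the last '@', or '' if there is none."""
    _, sep, domain = address.strip().rpartition("@")
    return domain.lower() if sep else ""


def top_domains(addresses, n=10):
    """Return the n most common email domains as (domain, count) pairs."""
    counts = Counter(email_domain(a) for a in addresses)
    counts.pop("", None)  # ignore malformed addresses
    return counts.most_common(n)


sample = ["a@mail.ru", "b@MAIL.RU", "c@gmail.com", "not-an-address"]
print(top_domains(sample))
```

Lowercasing matters here because bots are not consistent about case, and otherwise the same domain would be counted more than once.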

Conclusion

Are any of these individual “tells” enough to block spammers? Maybe. Certain fields seem to have a higher success rate (time zone, for example) at predicting whether an account was created by a spammer or not. The problem with relying on an individual field like time zone is that it would be easy for a bot writer to change that behavior. In addition to that, I can’t be 100% sure that it’s not a legitimate user. For example, I just checked my biggest board and I have 21 users (a whopping 0.06%) that registered with the -12 time zone. Most of them have posted at least once and have survived, so they’re not spammers. If they were, I would have figured that out by now. In my opinion that means that I can’t really “auto-ban” anyone with that time zone, as attractive as that seemed at the beginning of this post.

Instead I have to look at patterns of behavior and combinations of fields. I can do that myself, or I can wait (impatiently! :) ) for the formal relaunch of the bbProtection service. The primary advantage of the bbProtection design is that it captures data from every subscriber and uses it to detect patterns from a much broader range of activity than any single board owner is likely to be able to do.
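
To make "combinations of fields" concrete, here is a toy scoring sketch. It is not how bbProtection works; the weights, the threshold, and the example lists are all hypothetical, chosen only to show that one tell on its own (such as the -12 time zone) need not trigger anything while several together do.

```python
from dataclasses import dataclass


@dataclass
class Registration:
    timezone: float
    location: str
    email: str


# Illustrative lists only, borrowed from the tables earlier in this post.
FLAGGED_LOCATIONS = {"sex relaxxx", "adult"}
BANNED_DOMAINS = {"serpdomains.com", "dp-blog.com"}


def spam_score(reg):
    """Add up hypothetical weights for each tell the registration shows."""
    score = 0
    if reg.timezone == -12.0:
        score += 2  # first entry in the drop-down list
    if reg.location.strip().lower() in FLAGGED_LOCATIONS:
        score += 2
    domain = reg.email.strip().rpartition("@")[2].lower()
    if domain in BANNED_DOMAINS:
        score += 3
    return score


def needs_review(reg, threshold=4):
    """Flag a registration only when several tells line up."""
    return spam_score(reg) >= threshold


bot = Registration(-12.0, "Sex Relaxxx", "x@serpdomains.com")
member = Registration(-12.0, "Canada", "someone@gmail.com")
print(spam_score(bot), needs_review(bot))
print(spam_score(member), needs_review(member))
```

With these made-up weights, the legitimate member who happens to have picked -12 scores 2 and is left alone, which is the whole point of not relying on a single field.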

This post concentrated on reviewing registration data. Are there patterns in posting behavior that I can identify? It turns out the answer is “Yes”, and that there are some sobering statistics that show just how deep and wide the spammer-bot problems go.

14 Comments

  1. This, ladies and gentlemen, is what I was talking about when I kept referring to behavior instead of content. :) This is how bbProtection works.

    Comment by Micheal — September 19, 2008 @ 10:09 am

  2. We delete all new users with timezone -12 on a daily basis.
    I have an admin screen to sort by timezone = -12.
    When we started this practice, there were a few legitimate users with -12, we reset them to the default.
    I probably should do an outright auto-ban but it’s easy to mass delete them.

    For whatever reason, we also get spammers with timezone -3.5 but far less of those.

    I don’t allow the website to be entered until you have activated your account. I haven’t decided if this makes it easier or harder to identify spammers.

    I also have inactive users hidden from the user list, as well as restricting that page to logged-in members only. This doesn’t do anything to stop spam registrations as far as I can tell but I don’t want my members seeing anything “weird”. :)

    Comment by Everett — September 19, 2008 @ 10:31 am

  3. Hi, Micheal… there’s a reason this post is marked with the “bbProtection” tag. :)

    Everett, the steps you are taking are fairly common. For a long time steps like those that you mention were enough for me too. Eventually I got hit by spammers that were also posting, and that becomes a bigger issue. I think it goes back to the three steps I outlined in my Anti-Spam presentation at Londonvasion: Prevention, Detection, and Elimination. I would far rather prevent a spammer from registering than simply hide him from view. Do you have anything on your registration page like the RAC (Registration Auth Code) or similar MOD?

    Comment by Dave Rathbun — September 19, 2008 @ 10:40 am

  4. Dave: I know. :P I was simply reinforcing the idea. :P

    Comment by Micheal — September 19, 2008 @ 10:52 am

  5. But drathbun! You missed one! The user-agents, such as ‘LibWWW’ and ‘LibWWW-perl’ and ‘Wget’ and ‘Curl’

    Those all indicated automated access of your web site, and should be blocked, I think.

    Comment by Dog Cow — September 19, 2008 @ 11:41 am

  6. Hi, Dog Cow, this was not intended to be an all-inclusive list. :) The items I listed are some of the more common and some have had MODs or tweaks published at phpbb.com (or other sites) to address them. I skipped mentioning open or anonymous proxies, user agents, and probably plenty of other possible “tells” as well.

    Comment by Dave Rathbun — September 19, 2008 @ 11:54 am

  7. Things like the user agent are user input though, and are easy to fake. I even have a firefox extension to do that. Useful to pretend being a search engine. :P

    Comment by eviL3 — September 19, 2008 @ 4:41 pm

  8. eviL3 : Think about what you just said, “user agent are user input though, and are easy to fake.”

    What is the Time zone? – User input!
    What is the location? – User input!
    What is the email address? – User input!

    :)

    Comment by Dog Cow — September 19, 2008 @ 6:25 pm

  9. I rarely have spammers post but I do have a spam MOD in place that blocks certain phrases. It rarely blocks anybody though because we delete them daily, often before they activate.

    I have been thinking about adding something on the register page to limit registrations. The RAC MOD looks like it would work. The first implementation listed in the thread on phpBB would not work for me – I have too many inexperienced users who would not be able to locate an answer to a question posted elsewhere on the site.

    Comment by Everett — September 20, 2008 @ 1:15 am

  10. Everett, if you would like the “checkbox challenge” MOD code it’s posted at phpbb.com in a relatively work-able state. It has been very effective for me. It uses the same interface that you use to confirm a comment here on this blog, so you’ve already seen it in action. :) Dog Cow is using it on his site to allow guest posting (after modifying it to work with the posting screen) and I am also using it to protect comment forms on various sites.

    Comment by Dave Rathbun — September 20, 2008 @ 7:40 am

  11. I added the RAC MOD this am. I’ll see if it cuts down on anything.
    If not, or if I get too many people who can’t figure out the answer to my so-easy-maybe-you-shouldn’t-be-a-member-if-you-don’t-know-this question, I may try the checkbox challenge.

    Comment by Everett — September 20, 2008 @ 10:42 am

  12. evil3: Indeed, think about what you said. Everything submitted to a site, except for the IP, is user-input and easily manipulated. That’s not the concern. The concern is what users do with that data. :) That’s where the behavior analysis comes in.

    Comment by Micheal — September 20, 2008 @ 7:50 pm

  13. Update on my install of the RAC MOD for those interested. After 24 hours, there have been ZERO fake registrations. Yey, less clean-up work.

    Comment by Everett — September 21, 2008 @ 12:01 pm

  14. Everett, you might be interested in this post if you have not read it yet. :)

    Comment by Dave Rathbun — September 21, 2008 @ 1:27 pm
