Advertising on Forums Part III: Measuring RFM
RFM: It’s Not What You Think…
In Part II of this series I wrote about my ideas regarding what a potential advertiser would be looking for. Advertisers don’t care how big your board is, at least to some degree. You just have to have the right segment of users. Think about how valuable a board that catered to millionaires would be. If you had even 200 verified millionaires as members you can bet advertisers would be calling.
So what’s next? After looking over your subject matter here’s the next question I think you will want to be able to answer:
What is the RFM factor of my users?
What is RFM? Well, it’s probably not what you think.
The RFM Customer Model
In my limited exposure to marketing one of the things that really stuck with me was the Recency / Frequency / Monetary model. Recency is defined as how recently a user has used a service. Frequency is defined as how often that user has used a service. Monetary is defined as how much the spent using a service. In the marketing world, the “sweet spot” users visit both recently and frequently and spend a lot along the way.
Given that most boards are offered as a free service I will drop the Monetary aspect of this model, which leaves me with two concepts to consider.
Measuring Recency
I can measure recency by looking at the user_lastvisit column of the phpbb_users table. Of course this works only for registered users that log in; more on that in a bit. For example, here is a graph showing recency statistics from my largest board. This graph is using live data so the numbers that I reference in the following paragraphs might not match exactly with what you see presented.
As I write this, there are 26739 registered users on this board. The red line on the graph shows that just over 4200 of them have visited in the last six weeks. Those numbers work out to almost 16% of my total activated population. I think that’s quite good, to be honest. The percentage will be going up in a month or so when I do a purge of users that have registered and activated but never logged in. My total users will go down, but that does not concern me if I can boost the “valuable” user percentage up.
The blue line on the graph shows a plot of “days since last visit”. The reason it jumps up and down so much is because volume on this board goes way down on weekends. The red line is a cumulative total of the blue line.
To go along with the graph here are some raw numbers from two of the larger boards that I host.1
| Total Users | Active Users | Posting Users | Recent Users | |||
|---|---|---|---|---|---|---|
| 4877 | 3518 | 73.13% | 937 | 26.63% | 255 | 7.25% |
| 27013 | 26739 | 98.99% | 14340 | 53.25% | 4209 | 15.74% |
Based on my experience a board will have between 5-20% recent users. If you are lucky you will see somewhere from 50-60% posting users. Ironically the bigger the board the lower these percentages often go. Based on a very rough guess I think that only 7-12% of the phpbb.com users are posting users.2 Obviously the recent user number would be much lower than that but I have no way to even guess what that might be. I would love to get a copy of the phpBB database in order to do some statistics, but I don’t see that happening. Still, it would be fun.
An active user is one that has activated their account. The main reason my activation percent is so high for the second board listed above is I have a weekly cron job that purges users that have never activated their account after 30 days. That means I don’t have a lot of inactive accounts sitting around that skew the numbers. A posting user is one with at least one post on their account. A “recent user” in my scenario is a user that has visited at least once in the past six weeks.
What About Guests?
One of the MODs that I talk a lot about on this blog is my Page Permissions MOD. It allows me to do quite a few useful things.
The feature that is appropriate for this discussion is the ability to track page views by guests versus logged-in users. Here are some percentages that you might find interesting.
| Page | Guest Views | Logged-In Users |
|---|---|---|
| viewtopic.php | 84% | 16% |
| viewforum.php | 68% | 32% |
| index.php | 59% | 41% |
| Total Board | 75% | 25% |
These statistics are based on an aggregate of over 31 million page views so this is more than casual data. The important piece of information here is that 75% of my total traffic is from guests. The graph I posted a while back in this post showed that I had about 4200 registered users that have visited in the last six weeks. That number can be supported with hard facts. Using that number and the ratio of guest / non-guest activity I can project that approximately 16800 unique users have visited my board in the past six weeks.
As I talk to people that use my board, I hear this over and over: “Oh, sure, I use the board all the time, I just don’t bother to log in.” What does that do to my numbers? It should change the ratio of guests to members, but I have no way to figure out how much. I won’t force people to log in to use the board which leaves me in the dark as to how many “members” use the board as guests.
Measuring Frequency
Tracking recency is easy. Unfortunately there is no history of frequency other than posting records in a standard phpBB board. In order to track visit frequency I would have to log each visit to a new table. I started thinking about writing that MOD and quickly realized how messy it could get. So I resorted to poking around in the database and doing what I like to do… playing with some numbers.
First number: 1939. Thats how many of the 4200 users that have visited in the last six weeks have posted at least once in that time frame. That works out to 46.2%. Hm, I am not sure what to do with that number. Moving on…
Second number: 6.98. That is the average number of posts entered by those 1822 users over the past six weeks. The highest number was 701
and the lowest was 1, as I might have expected since I required at least one post during the period to be included in the data set. I thought this might be interesting, but I am not sure what to do with it either. Moving on…
Third number: 580.5. That is the average number of unique posting users in each of the prior six weeks. Hm, now I feel like I am getting somewhere. Here are the actual numbers that I used to derive that value.3
+-----------+-----------------+-------------+-------------+ | post_week | week_start_date | total_users | total_posts | +-----------+-----------------+-------------+-------------+ | 2007-27 | 2007-07-08 | 577 | 2172 | | 2007-28 | 2007-07-15 | 586 | 2469 | | 2007-29 | 2007-07-22 | 581 | 2307 | | 2007-30 | 2007-07-29 | 580 | 2210 | | 2007-31 | 2007-08-05 | 592 | 2295 | | 2007-32 | 2007-08-12 | 567 | 2093 | +-----------+-----------------+-------------+-------------+
Does this help me answer any questions about frequency? I guess not. I don’t have any way to know how many of the 577 users from week 27 are the same as the 586 users of week 28. So I moved on to the next step, looking for returning users out of this set. I went back to the 1939 users I had identified before and created a temporary table4 to store some numbers for analysis. This table contained one row for each of those users, and on that row I stored the number of posts for that user for that week. After doing that, I counted how many users had posted in all six weeks, in only five weeks, four weeks, and so on, Here are the results.
+------+----------+---------+ | wk | count(*) | percent | +------+----------+---------+ | 1 | 1139 | 58.74 | | 2 | 407 | 20.99 | | 3 | 198 | 10.21 | | 4 | 85 | 4.38 | | 5 | 64 | 3.30 | | 6 | 46 | 2.37 | +------+----------+---------+
This is somewhat predictable, I guess. The majority of those users posted in only one of those weeks. only 46 users (out of 1939) have posted in every single of the past six weeks. Now what.
The bottom line is that unless I write some code to track frequency of member visits these numbers are, at best, a guess. Can I extrapolate from these numbers and conclude that 2.37% of my total visitors (16800) visit every single week? If so, I have about 400 really really loyal users. But I can’t be sure.
Summary
This was probably a more typical post as I got to dig into my data and play with some numbers. I don’t know if anyone else is interested, but sometimes these posts are to satisfy my own curiosity as much as anyone else.
In the RFM model I have thoroughly documented the “R”, had limited success with the “F”, and essentially ignored the “M” as it is inappropriate. It is what it is, and that’s what I can do without extra code.
The next question (and I’m sure you knew that was coming) is what can I do with these numbers? If I manage a board with very focused content and almost 17000 unique visitors (25% registered, 75% guests), what is that worth? What are some of the options I can consider? That’s where I will go next time.
Footnotes
1. Here is the query I used to generate the table of numbers showing recency statistics.
select count(user_id) as total_users , sum(user_active) as active_users , sum(sign(user_posts)) as posting_users , sum(case when user_lastvisit >= (unix_timestamp(now()) - (86400*42)) then 1 else 0 end) as recent_users from phpbb_users
If you are willing I would be interested to see what your numbers are. If you are not interested in sharing your total numbers (total user count) but could share your active percent, posting percent, and recent percent that would still be interesting.
2. I determined the number of posting users at phpbb.com via a very unscientific process. A few months back I went through the phpbb.com memberlist sorted by post count and determined how many pages of users had at least one post, and divided that by the total number of users. I can’t recreate that analysis now because it seems that phpBB3 does not offer the ability to sort the memberlist.
3. This is the query used to generate the third table that shows posting frequency.
select from_unixtime(post_time, '%Y-%V') as post_week
, min(from_unixtime(post_time,'%Y-%m-%d')) as week_start_date
, count(distinct p.poster_id) as total_users
, count(p.post_id) as total_posts \
from phpbb_posts p
where p.post_time between unix_timestamp('2007-07-08 00:00:00') and unix_timestamp('2007-08-18 23:59:59')
group by 1
4. Here is the structure of the recent_posters table that I used:
mysql> desc recent_posters; +-------------+-----------------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------------+-----------------------+------+-----+---------+-------+ | user_id | mediumint(8) | | PRI | 0 | | | total_posts | bigint(21) | | | 0 | | | w1_posts | mediumint(8) unsigned | YES | | NULL | | | w2_posts | mediumint(8) unsigned | YES | | NULL | | | w3_posts | mediumint(8) unsigned | YES | | NULL | | | w4_posts | mediumint(8) unsigned | YES | | NULL | | | w5_posts | mediumint(8) unsigned | YES | | NULL | | | w6_posts | mediumint(8) unsigned | YES | | NULL | | +-------------+-----------------------+------+-----+---------+-------+
I picked the last full six weeks and inserted users into the table, then updated their “weekly” post numbers with this query:
update recent_posters r
set w1_posts = (
select count(p.post_id)
from phpbb_posts p
where p.poster_id = r.user_id
and p.post_time between unix_timestamp('2007-07-08 00:00:00') and unix_timestamp('2007-07-14 23:59:59'))
,
w2_posts = (
select count(p.post_id)
from phpbb_posts p
where p.poster_id = r.user_id
and p.post_time between unix_timestamp('2007-07-15 00:00:00') and unix_timestamp('2007-07-21 23:59:59'))
,
w3_posts = ( ...
And the query that I used to determine how many users had posted in every single week? Ah, that’s going to be another post. And a challenge. ![]()

