About six or eight months ago I started noticing some really weird behavior from my server. It was always related to pages from my largest phpBB2 board. The first symptom was that the server would slow to a crawl. After connecting to the server via the web host manager or via PuTTY (if I could do either) I would see a system load of 50, 90, even 300+ at one point. When I say system load I am talking about the load average on the top line of the “top” command. In theory, with a four-CPU box a system load above 4.0 means there is more work waiting than the CPUs can handle, so a load of 300+ is quite… disturbing.
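To put those numbers in perspective, here is a minimal Python sketch that divides a load average by the number of CPUs. The function names are my own for illustration, and `os.getloadavg()` only works on Unix-like systems. Anything much above 1.0 per CPU suggests work is queuing up.

```python
import os


def load_per_cpu(load_average, cpu_count):
    """Return the load average divided by the number of CPUs."""
    return load_average / cpu_count


def current_load_per_cpu():
    """Read the 1-minute load average from the OS (Unix-like systems only)."""
    one_minute, _five_minutes, _fifteen_minutes = os.getloadavg()
    cpus = os.cpu_count() or 1  # cpu_count() can return None
    return load_per_cpu(one_minute, cpus)
```

With the figures above, `load_per_cpu(300.0, 4)` works out to 75 runnable tasks per CPU, which matches how unresponsive the server felt.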
I checked and changed my apache settings. I checked my database settings. I kept checking and checking and tweaking and nothing made a difference. I finally resorted to blocking IP addresses that exhibited this behavior.
I think I may finally have figured out what was causing my problem. Believe it or not, it was some code in phpBB2.
phpBB2 is often blamed for server issues. In this case it seems that it certainly was a contributing factor. There may also be something about my apache configuration so I’m not going to place all of the blame on phpBB2. But the fix is quite simple.
Extreme Server Load
Here is a thumbnail of a screen shot from my apache server status screen, click to see the larger image:
If you click the image you’ll see that the entire screen is filled with requests from a single IP address. Each request is for a different instance of viewforum.php which – coincidentally enough – is one of the pages that puts the heaviest load on my server due to the number and type of queries involved. When I looked at my apache logs I saw that these requests come in at the rate of five or more per second! This is not a typical human behavior.
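For anyone who wants to look for the same pattern in their own logs, here is a rough Python sketch. It assumes the access log uses the standard Common or Combined Log Format, and the function name, the regular expression, and the threshold of five are my own choices for illustration, not anything built into Apache.

```python
import re
from collections import Counter

# Matches the start of a Common/Combined Log Format line:
# ip ident user [timestamp] "METHOD path ..."
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(?:GET|POST|HEAD) (\S+)')


def busiest_seconds(lines, path_fragment="viewforum.php", threshold=5):
    """Return {(ip, timestamp): count} for every IP that requested
    path_fragment at least `threshold` times within one logged second."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and path_fragment in match.group(3):
            ip, timestamp, _path = match.groups()
            # Apache timestamps have one-second resolution, so the
            # (ip, timestamp) pair identifies one client in one second.
            hits[(ip, timestamp)] += 1
    return {key: count for key, count in hits.items() if count >= threshold}
```

Feeding it the lines of an access log (for example `busiest_seconds(open(path))`, where `path` is wherever your host keeps the log) should return only the clients that hammer viewforum.php in bursts.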
And to be honest, it is not a typical web-bot behavior either. Well-behaved bots do not overload your server with too many requests or they would soon find themselves out of a job. So what was causing this?
The first thought that I had was that someone was trying to use the “offline viewing” option from their browser. This option allows a user to specify a web site and a link depth. The browser will start with the specified page for the web site and download it. Then the browser will follow any included links up to the specified depth and also download them to the offline cache.
This was easy enough to test. I set up my browser to cache my board index page and set the link depth to three. But when I checked my apache log I didn’t see the same behavior, nor did my server load spike up. So I eliminated this as a potential source for the issue.
The next suggestion that I got was that this was an indication of a web-accelerator in use. Google offers a product that does this and Firefox has something called FasterFox. I took a look at both of these and looked for others as well. However, both of these products mentioned that they do not pre-fetch dynamic URLs. This means that a URL that includes an argument like viewforum.php?f=42 would not be pre-fetched!
I finally resorted to blocking users by IP address anytime I saw this behavior. That was not my preferred solution, but I had to do something to keep my server from doing a death spiral. In some cases this browsing behavior would take my server down for up to 30 minutes.
In order to soften the blow somewhat I didn’t just block the users outright. Instead I sent them to this page which you will understand if you’ve seen the Seinfeld TV show.
On An Unrelated Note…
<link rel="top" href="./index.php" title="title here" />
<link rel="search" href="./search.php" title="Search" />
<link rel="help" href="./faq.php" title="FAQ" />
<link rel="author" href="./memberlist.php" title="Memberlist" />
<link rel="prev" href="viewtopic.php?t=40050&view=previous" title="View previous topic" />
<link rel="next" href="viewtopic.php?t=40050&view=next" title="View next topic" />
<link rel="up" href="viewforum.php?f=99" title="..." />
<link rel="chapter forum" href="viewforum.php?f=1" title="..." />
<link rel="chapter forum" href="viewforum.php?f=63" title="..." />
<link rel="chapter forum" href="viewforum.php?f=2" title="..." />
<link rel="chapter forum" href="viewforum.php?f=61" title="..." />
<link rel="chapter forum" href="viewforum.php?f=4" title="..." />
<link rel="chapter forum" href="viewforum.php?f=40" title="..." />
<link rel="chapter forum" href="viewforum.php?f=48" title="..." />
<link rel="chapter forum" href="viewforum.php?f=78" title="..." />
<link rel="chapter forum" href="viewforum.php?f=7" title="..." />
<link rel="chapter forum" href="viewforum.php?f=96" title="..." />
<link rel="chapter forum" href="viewforum.php?f=65" title="..." />
<link rel="chapter forum" href="viewforum.php?f=8" title="..." />
<link rel="chapter forum" href="viewforum.php?f=47" title="..." />
<link rel="chapter forum" href="viewforum.php?f=9" title="..." />
<link rel="chapter forum" href="viewforum.php?f=38" title="..." />
<link rel="chapter forum" href="viewforum.php?f=10" title="..." />
As soon as I saw this I flashed back to the server load screen shot. Could this be the cause of my load? Is this why a single user would request to view every single forum on my board within seconds of each other?
It turns out that this code is a bit of a legacy at this point. It was designed to provide information used by the Mozilla “navigation bar”, a feature that I thought had been obsolete for a very long time. It seems that it has come back at some point, though. I found this reference to an add-on for Firefox that uses this information.
In theory this could be a good thing for webmasters. It allows them to set up chapters and titles and other information to provide more structure for their web site. But in the case of phpBB2 it doesn’t seem to work. They treat each forum as a chapter. If you have a lot of forums, then there are a lot of chapters. And if the add-on or plug-in attempts to pre-fetch the chapter information from each of those pages, well, that could cause the behavior that I have observed.
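To get a feel for how many extra requests a single page view could trigger, here is a small Python sketch that lists the `<link rel="...">` targets in a page. It uses only the standard library's `html.parser`. The class and function names are mine, and the idea that a client fetches every one of these links is my hypothesis, not something I have confirmed for any particular add-on.

```python
from html.parser import HTMLParser


class LinkRelCollector(HTMLParser):
    """Collect (rel, href) pairs from <link> tags in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> tags also end up here by default.
        if tag == "link":
            attributes = dict(attrs)
            if attributes.get("rel") and attributes.get("href"):
                self.links.append((attributes["rel"], attributes["href"]))


def chapter_links(html):
    """Return the href of every <link> whose rel includes 'chapter'."""
    parser = LinkRelCollector()
    parser.feed(html)
    parser.close()
    return [href for rel, href in parser.links if "chapter" in rel.split()]
```

Run against the source of one of my viewtopic pages, `len(chapter_links(page_source))` would come out close to the number of visible forums, which means one page view could turn into dozens of viewforum.php requests.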
Even with that in mind, I wonder who thought it was a good idea to link to the entire member list as the “author” of the board content?
For what it is worth, these links are not included in phpBB3. They are no longer generated for my phpBB2 boards either. In the days since I have removed the code that generates the navigation bar information I have not seen a single instance of a server load over 3.0, much less 30 or even 300. Since I was not online viewing the server load for 24 hours at a time I also pulled out every instance of a call to viewforum.php from my apache logs and looked at the time stamps and other information. I did not find one occurrence of this behavior for the past 72 hours.
I am now quite confident (perhaps not 100%, but very, very close) that I have discovered the root cause of this issue. Oh, and the issue that I was researching on viewtopic that caused me to view the source? I figured that out too, just in case you were worried.
The next question I found myself asking was, “Why me?” Why hasn’t anyone else mentioned this specific issue at phpbb.com? Perhaps there are several answers. First, someone else may have reported or observed the issue without any idea of what was causing it, much like me at the beginning. That also means I’m not sure which search terms I would use to find topics about this in the support forums.
Second, my board is quite large, both in terms of members and page views and – most importantly – forums. If you don’t have hundreds of people online at the same time you probably won’t see the issue. And even if you do have hundreds of people online at the same time, if you only have a few forums it’s still not going to be an issue.
I have almost 80 visible forums on this particular board at the time I write this. I generally have over 300 sessions active on my board at any given point in time and often have numbers that are much higher, as noted in this graph. This graph shows the highest number of users (total, guests, and members) online per hour over the past week on my largest board.
The darker part of the column indicates guest sessions and the lighter part indicates members that have logged in. I track this information by capturing board statistics every ten minutes via a cron job. These numbers represent the maximum number of sessions for each hour for the last seven days.
Here is a similar graph showing average users instead of the maximum:
As you can see, over the course of the last seven days I have averaged over 150 users online per hour over the entire day. The board never sleeps.
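The roll-up behind both graphs is simple enough to sketch in a few lines of Python. This is not the script I actually run. It assumes the ten-minute samples are already available as (timestamp, session count) pairs, and the function name is made up for the example.

```python
from collections import defaultdict
from datetime import datetime


def hourly_max_and_average(samples):
    """samples: iterable of (datetime, session_count) pairs, one per
    ten-minute capture. Returns {hour_start: (max, average)}."""
    by_hour = defaultdict(list)
    for taken_at, sessions in samples:
        # Truncate each timestamp to the start of its hour.
        hour_start = taken_at.replace(minute=0, second=0, microsecond=0)
        by_hour[hour_start].append(sessions)
    return {
        hour: (max(counts), sum(counts) / len(counts))
        for hour, counts in sorted(by_hour.items())
    }
```

The maximum feeds the first graph and the average feeds the second, so a single pass over the captured samples is enough for both.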
The combination of active users + total forums + some browser plug-in that pre-fetches content spelled disaster for my server. I first started noticing the issue about eight months ago, and it might have been occurring before that. The issue became very obvious over the past several months. After stumbling on the navigation bar code and then removing it I have not seen the issue since. While not 100% conclusive, I feel like the results are enough of an indication that I wanted to share them, thus this post.