March 24, 2008

Running a phpBB2 board? Get rid of this code, and do it now

Filed under: Performance Tuning, phpBB | Dave Rathbun @ 12:11 pm

About six or eight months ago I started noticing a really weird behavior from my server. It was always related to pages from my largest phpBB2 board. The first symptom was that the server would slow to a crawl. After connecting to the board via the web host manager or via PuTTY (if I could do either) I would see a system load of 50, 90, even 300+ at one point. When I say system load I am talking about the top-line number from the “top” command. In theory with a four-cpu box my system load should never be higher than 4.0, so a load of 300+ is quite… disturbing.

I checked and changed my apache settings. I checked my database settings. I kept checking and checking and tweaking and nothing made a difference. I finally resorted to blocking IP addresses that exhibited this behavior.

I think I may finally have figured out what was causing my problem. Believe it or not, it was some code in phpBB2.

phpBB2 is often blamed for server issues. In this case it seems that it certainly was a contributing factor. There may also be something about my apache configuration so I’m not going to place all of the blame on phpBB2. :) But the fix is quite simple.

Extreme Server Load

Here is a screen shot from my apache server status screen:

screen shot #1

The entire screen is filled with requests from a single IP address. Each request is for a different instance of viewforum.php which – coincidentally enough – is one of the pages that puts the heaviest load on my server due to the number and type of queries involved. When I looked at my apache logs I saw that these requests came in at the rate of five or more per second! This is not typical human behavior.
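
For readers who want to look for the same pattern in their own logs, here is a minimal Python sketch. It assumes the Apache common or combined access log format. The function name `find_bursts`, the threshold of five hits per second, and the sample addresses are illustrative choices and are not taken from the server described here.

```python
import re
from collections import Counter

# Matches the start of an Apache common/combined log line:
# client IP, timestamp (one-second resolution), and the requested path.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?:GET|POST|HEAD) (?P<path>\S+)'
)


def find_bursts(lines, page="viewforum.php", threshold=5):
    """Return {(ip, timestamp): count} for each second in which one IP
    requested `page` at least `threshold` times."""
    hits = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and page in m.group("path"):
            hits[(m.group("ip"), m.group("ts"))] += 1
    return {key: n for key, n in hits.items() if n >= threshold}


if __name__ == "__main__":
    # Made-up sample lines; the addresses come from documentation ranges.
    sample = [
        f'203.0.113.7 - - [24/Mar/2008:12:11:05 -0500] '
        f'"GET /viewforum.php?f={f} HTTP/1.1" 200 5120'
        for f in (1, 63, 2, 61, 4)
    ]
    sample.append(
        '198.51.100.2 - - [24/Mar/2008:12:11:05 -0500] '
        '"GET /index.php HTTP/1.1" 200 2048'
    )
    for (ip, ts), n in find_bursts(sample).items():
        print(f"{ip} requested viewforum.php {n} times at {ts}")
```

To run it against a real log, pass an open file object as `lines`. Because the log timestamp only has one-second resolution, grouping on the raw timestamp string is enough to count hits per second.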

And to be honest, it is not a typical web-bot behavior either. Well-behaved bots do not overload your server with too many requests or they would soon find themselves out of a job. So what was causing this?

Offline Viewing

The first thought that I had was that someone was trying to use the “offline viewing” option from their browser. This option allows a user to specify a web site and a link depth. The browser will start with the specified page for the web site and download it. Then the browser will follow any included links up to the specified depth and also download them to the offline cache.

This was easy enough to test. I set up my browser to cache my board index page and set the link depth to three. But when I checked my apache log I didn’t see the same behavior, nor did my server load spike up. So I eliminated this as a potential source for the issue.

Web Accelerators

The next suggestion that I got was that this was an indication of a web-accelerator in use. Google offers a product that does this and Firefox has something called FasterFox. I took a look at both of these and looked for others as well. However, both of these products mentioned that they do not pre-fetch dynamic URLs. This means that a URL that includes an argument like viewforum.php?f=42 would not be pre-fetched!

Now what?

I finally resorted to blocking users by IP address anytime I saw this behavior. That was not my preferred solution, but I had to do something to keep my server from doing a death spiral. In some cases this browsing behavior would take my server down for up to 30 minutes. :shock:

In order to soften the blow somewhat I didn’t just block the users outright. Instead I sent them to this page which you will understand if you’ve seen the Seinfeld TV show. ;-)

On An Unrelated Note…

As is often the case, the breakthrough presented itself while I was trying to solve something completely unrelated to this issue. I was doing some development on viewtopic, and one of my JavaScript scripts was not working. As part of the debugging process I did a “view source” on the page. What I saw gave me pause. Here it is (with some content removed for brevity):

<link rel="top" href="./index.php" title="title here" />
<link rel="search" href="./search.php" title="Search" />
<link rel="help" href="./faq.php" title="FAQ" />
<link rel="author" href="./memberlist.php" title="Memberlist" />
<link rel="prev" href="viewtopic.php?t=40050&view=previous" title="View previous topic" />
<link rel="next" href="viewtopic.php?t=40050&view=next" title="View next topic" />
<link rel="up" href="viewforum.php?f=99" title="..." />
<link rel="chapter forum" href="viewforum.php?f=1" title="..." />
<link rel="chapter forum" href="viewforum.php?f=63" title="..." />
<link rel="chapter forum" href="viewforum.php?f=2" title="..." />
<link rel="chapter forum" href="viewforum.php?f=61" title="..." />
<link rel="chapter forum" href="viewforum.php?f=4" title="..." />
<link rel="chapter forum" href="viewforum.php?f=40" title="..." />
<link rel="chapter forum" href="viewforum.php?f=48" title="..." />
<link rel="chapter forum" href="viewforum.php?f=78" title="..." />
<link rel="chapter forum" href="viewforum.php?f=7" title="..." />
<link rel="chapter forum" href="viewforum.php?f=96" title="..." />
<link rel="chapter forum" href="viewforum.php?f=65" title="..." />
<link rel="chapter forum" href="viewforum.php?f=8" title="..." />
<link rel="chapter forum" href="viewforum.php?f=47" title="..." />
<link rel="chapter forum" href="viewforum.php?f=9" title="..." />
<link rel="chapter forum" href="viewforum.php?f=38" title="..." />
<link rel="chapter forum" href="viewforum.php?f=10" title="..." />

As soon as I saw this I flashed back to the server load screen shot. Could this be the cause of my load? Is this why a single user would request to view every single forum on my board within seconds of each other?
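
To check how many of these links a page on your own board emits, a short sketch along the following lines would do. It uses Python's standard `html.parser` module. The class and function names are made up for this example, and this is not how the markup above was found (that was a plain “view source”).

```python
from html.parser import HTMLParser


class LinkRelCollector(HTMLParser):
    """Collect (rel, href) pairs from every <link> element in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> also arrive here.
        if tag == "link":
            attr = dict(attrs)
            if "rel" in attr and "href" in attr:
                self.links.append((attr["rel"], attr["href"]))


def navigation_links(html):
    """Return the list of (rel, href) pairs found in `html`."""
    parser = LinkRelCollector()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    # A shortened copy of the markup shown above.
    sample = """
    <link rel="up" href="viewforum.php?f=99" title="..." />
    <link rel="chapter forum" href="viewforum.php?f=1" title="..." />
    <link rel="chapter forum" href="viewforum.php?f=63" title="..." />
    """
    links = navigation_links(sample)
    chapters = [href for rel, href in links if "chapter" in rel.split()]
    print(len(chapters), "chapter links that a pre-fetching client might follow")
```

Every entry in `chapters` is a page that a client could request without the user ever clicking on it, so the length of that list gives a rough upper bound on the extra viewforum.php requests per page view.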

Navigation Bar

It turns out that this code is a bit of a legacy at this point. It was designed to provide information used by the mozilla “navigation bar” which is a feature that has been obsolete for a very long time. Or at least I thought it was, as it seems that it has come back at some point. I found this reference to an add-on for Firefox that uses this information.

In theory this could be a good thing for webmasters. It allows them to set up chapters and titles and other information to provide more structure for their web site. But in the case of phpBB2 it doesn’t seem to work well, because phpBB2 treats each forum as a chapter. If you have a lot of forums, then there are a lot of chapters. And if the add-on or plug-in attempts to pre-fetch the chapter information from each of those pages, well, that could cause the behavior that I have observed.

Even with that in mind, I wonder who thought it was a good idea to link to the entire member list as the “author” of the board content? ;-)

For what it is worth, these links are not included in phpBB3. They are no longer generated for my phpBB2 boards either. In the days since I have removed the code that generates the navigation bar information I have not seen a single instance of a server load over 3.0, much less 30 or even 300. Since I was not online viewing the server load for 24 hours at a time I also pulled out every instance of a call to viewforum.php from my apache logs and looked at the time stamps and other information. I did not find one occurrence of this behavior for the past 72 hours.

I am now quite confident (perhaps not 100%, but very, very close) that I have discovered the root cause of this issue. Oh, and the issue that I was researching on viewtopic that caused me to view the source? I figured that out too, just in case you were worried. :)

Why Me?

The next question I found myself asking was, “Why me?” Why hasn’t anyone else mentioned this specific issue at phpbb.com? Perhaps there are several answers. First, someone else may have reported or observed the issue without any idea of what was causing it, much like me at the beginning. So I’m not sure which search terms I would use to identify topics about this in the support forums.

Second, my board is quite large, both in terms of members and page views and – most importantly – forums. If you don’t have hundreds of people online at the same time you probably won’t see the issue. And even if you do have hundreds of people online at the same time, if you only have a few forums it’s still not going to be an issue.

I have almost 80 visible forums on this particular board at the time I write this. I generally have over 300 sessions active on my board at any given point in time and often have numbers that are much higher, as noted in this graph. This graph shows the highest number of users (total, guests, and members) online per hour over the past week on my largest board.

hourly online users

The darker part of the column indicates guest sessions and the lighter part indicates a member that has logged in. I track this information by capturing board statistics every ten minutes via a cron job. These numbers represent the maximum number of sessions for each hour for the last seven days.

Here is a similar graph showing average users instead of the maximum:

hourly average users

As you can see, over the course of the last seven days I have averaged over 150 users online per hour over the entire day. The board never sleeps. :)
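
Hourly maximum and average figures like these can be derived from ten-minute samples with a few lines of code. The sketch below is only an assumption about how such an aggregation could look and is not the actual cron job. The function name `hourly_stats` and the sample values are invented for illustration.

```python
from collections import defaultdict
from datetime import datetime


def hourly_stats(samples):
    """Group (timestamp, sessions) samples by hour.

    Returns {hour_start: (maximum, average)} in chronological order.
    """
    by_hour = defaultdict(list)
    for ts, sessions in samples:
        # Truncate the timestamp to the start of its hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        by_hour[hour].append(sessions)
    return {
        hour: (max(values), sum(values) / len(values))
        for hour, values in sorted(by_hour.items())
    }


if __name__ == "__main__":
    # Invented ten-minute samples for a single hour.
    samples = [
        (datetime(2008, 3, 24, 12, minute), count)
        for minute, count in zip(range(0, 60, 10), (310, 295, 340, 322, 301, 288))
    ]
    for hour, (peak, average) in hourly_stats(samples).items():
        print(f"{hour:%Y-%m-%d %H:00}  max={peak}  avg={average:.1f}")
```

Keeping both numbers per hour is what makes the two graphs possible: the maximum shows the peaks the server has to survive, while the average shows the sustained load.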

Conclusion

The combination of active users + total forums + some browser plug-in that pre-fetches content spelled disaster for my server. I first started noticing the issue about eight months ago, and it might have been occurring before that. The issue became very obvious over the past several months. After stumbling on the navigation bar code and then removing it I have not seen the issue since. While not 100% conclusive, I feel like the results are enough of an indication that I wanted to share them, thus this post. :)

8 Comments

  1. I had that code removed over a year ago, because it was for “legacy” as you said. I had no idea about the server issues, though!

    Comment by Dog Cow — March 24, 2008 @ 1:39 pm

  2. I have just installed a phpbb 2.0.23 board as the mod I require works only in version 2 :)
    I got a complaint about excess CPU usage and server load from my host for the first time since I started websites.

    I found your topic and have configured the changes as advised by you. I will check with my host if things are ok now. If you have any further advice on server load reduction pls let me know. I use the Smartor full album mod + wp united + a cms for the front end.

    Regards and thanks for the effort and your kindness.

    Jack

    Comment by Jack — April 10, 2008 @ 1:43 am

  3. Hi, Jack, and welcome to my blog.

    This change is important, but there are other tweaks that I’ve made to the search system that will also help. If you click the “Search” category on the right-hand side of the screen you’ll find most of them. However, it sounds like you’re running quite a combination of MODs and it may be that they were not designed to work well together… some MOD authors do not take performance into consideration when they write their code. :)

    Comment by Dave Rathbun — April 10, 2008 @ 8:50 pm

  4. Hi Dave,

    Thanks, I am forced to use phpbb2 for a particular mod which is not upgraded for version 3. I even offered professional fees but there was no reply.

    Yes, i am using cash mod, quiz, album, seo, url-rewrite, quick reply (which is not quick as VB) as it goes round and round before u see the post, and prediction.

    My server loads are quite high even with so few members. I am looking for professional help asap. I would appreciate if u can help. My email id is with you.

    Regards.

    Jack

    Comment by Jack — April 16, 2008 @ 2:47 am

  5. We reported this problem on the phpbb site awhile back. Had no idea this is what it was related to. Thanks

    Comment by JLA — October 9, 2008 @ 10:35 am

  6. JLA, if you had not removed the code before, I hope you have done so now. On a board your size it could make a big difference. I played around with various apache settings, mysql settings, o/s settings… nothing ever fixed my issue. It was sheer chance that I figured out what was causing my problem.

    Comment by Dave Rathbun — October 9, 2008 @ 8:44 pm

  7. Seeing you talk about these header links must have stuck with me as I realized the huge server load problems I was getting for a wordpress blog were caused by the rel=”archive” links in the header.

    All the linked pages were being downloaded by every caching proxy that understood the link so 80+ connections per second from a military or corporate caching proxy were not unusual. Removed the rel links and now I no longer have to throttle that site – yay.

    Perhaps that was part of your problem.

    Comment by Steve — December 19, 2008 @ 6:52 pm

  8. Hi, Steve, I think you basically just described the problem again for me. :) I have not looked at the issue related to Wordpress at all, however, thanks for the tip. I will be checking that out.

    Comment by Dave Rathbun — December 20, 2008 @ 1:06 am
