<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Welcome to the phpBB Doctor Blog &#187; Performance Tuning</title>
	<atom:link href="http://www.phpbbdoctor.com/blog/category/phpbb/performance-tuning/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.phpbbdoctor.com/blog</link>
	<description>Your premium source for custom modification services for phpBB</description>
	<lastBuildDate>Wed, 11 Jan 2012 21:30:50 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>New phpBB2 Modifications</title>
		<link>http://www.phpbbdoctor.com/blog/2011/11/02/new-phpbb2-modifications-coming/</link>
		<comments>http://www.phpbbdoctor.com/blog/2011/11/02/new-phpbb2-modifications-coming/#comments</comments>
		<pubDate>Wed, 02 Nov 2011 17:22:16 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[MOD Writing]]></category>
		<category><![CDATA[Performance Tuning]]></category>
		<category><![CDATA[phpBB]]></category>

		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/?p=360</guid>
		<description><![CDATA[What has Dave been working on lately? Not blog posts, obviously.   Here are the headlines&#8230;

Full-Text Search
I created a full-text index on the post subject and text over a year ago to see if maintaining that index would cause any performance issues. I&#8217;m happy to say that I have not seen any challenges from [...]]]></description>
			<content:encoded><![CDATA[<p>What has Dave been working on lately? Not blog posts, obviously. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  Here are the headlines&#8230;</p>
<ol>
<li>Full-Text Search<br />
I created a full-text index on the post subject and text over a year ago to see if maintaining that index would cause any performance issues. I&#8217;m happy to say that I have not seen any challenges from inserts / updates with this index in place. I&#8217;m going to be altering the search screen to allow the <a href="http://dev.mysql.com/doc/refman/5.0/en/fulltext-boolean.html">full syntax offered by MySQL</a> on this type of index and hope to release that in a few months. Some of the challenges I have not yet decided how to solve are things like limiting forums &#8211; either by security or user preference &#8211; and other criteria that can be entered on the standard search screen.
</li>
<li>Capture Post Revisions<br />
I&#8217;ve also added some code to capture post revisions. We&#8217;ve had a couple of folks that come back to our board and edit their post, removing all of the text and leaving only something like &#8220;&#8230;&#8221; instead. This destroys the continuity of the topic, and as a result we&#8217;re going to now track post revisions by capturing the post text history. If needed a moderator will be able to review and then restore a prior post, and ultimately lock that post from further editing. As with the full text search I have done fairly extensive testing on how this is implemented in order to ensure that performance does not suffer, and I&#8217;ll have a few blog posts about that process. This MOD is completed and I expect to roll it out onto the main board in a few weeks. (FWIW, I <a href="http://www.phpbbdoctor.com/blog/2009/11/02/storing-post-revisions-post-locking/">first talked about this post several years ago</a>, and am just now finally getting it completed.)
</li>
<li>Moderator Posts<br />
I&#8217;ve added a new field to the post table that allows a moderator to designate whether it&#8217;s a moderator post or a user post. For example, moderators can certainly participate in a normal board conversation as a regular person. But they may also add posts in their role as a moderator. This new feature will format those posts differently so they stand out, will automatically remove the &#8220;personal&#8221; aspects of a post such as signatures, and does not increment a moderator post count for this type of post. It is intended to be a way for moderators to be able to separate out their moderator posts from their board participation posts. This MOD is also completed and expected to be released shortly.
</li>
<li>Including External Content<br />
I&#8217;ve added some cron jobs that parse RSS feeds from several blogs owned by board members. Their blog posts are automatically set up as part of their signature (as &#8220;Latest Blog Posts&#8221;) and updated once an hour. For bloggers that our community wants to recognize, this is a great way for them to get additional exposure without having to manually update their signature every time they publish a new blog post. This part of the MOD is already in use on our board. Only board admins can currently enter blogger information, as we want to go through a review process and certify blogs rather than allowing just anybody to link to an external site. This was done by altering the administrator user edit form and leaving the regular user profile form alone.</p>
<p>As an extension to this, I&#8217;m also pulling in the content from the blog post and storing that in a hidden forum. As the blog posts are added to the forum they are obviously added to the full-text index because they&#8217;re part of the same table. I am also adding these posts to the standard phpBB2 search tables at the same time. That way if someone searches for term &#8220;X&#8221; and that&#8217;s found in an external blog post, they&#8217;ll see a link in their search results. The blog address is stored on the topic table and a different icon is used to show the user that they&#8217;re leaving our board and heading to an external site. I have all of the main work done; the last requirement is altering search.php so that it offers the ability to include / exclude external content and then react to that setting accordingly. I hope to get this completed in the next few weeks.
</li>
<li>Social Media Profile Links<br />
I&#8217;ve added Facebook, Twitter, and LinkedIn fields to user profiles. These are displayed along with the other profile links, using smaller 18&#215;18 pixel logos. I&#8217;m planning on going back and redoing the other profile links to use the same form factor but that part hasn&#8217;t been done yet. Here are the images I&#8217;ve made, using logos or other material provided by each service provider. <img src="http://www.forumtopics.com/busobj/templates/bob/images/icon_twitter.png" /> <img src="http://www.forumtopics.com/busobj/templates/bob/images/icon_facebook.png" /> <img src="http://www.forumtopics.com/busobj/templates/bob/images/icon_linkedin.png" />
</li>
</ol>
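<p>As a rough illustration of the full-text search item above, here is the kind of statement a boolean-mode search boils down to. This is only a sketch, not the MOD&#8217;s actual code: it assumes the standard phpBB2 <code>phpbb_posts_text</code> table with the default prefix, and the index name <code>post_fulltext</code> and the search terms are made up for the example. (On MySQL 5.0 a full-text index also requires a MyISAM table.)</p>
<pre>-- Hypothetical index name covering the post subject and text.
ALTER TABLE phpbb_posts_text
    ADD FULLTEXT INDEX post_fulltext (post_subject, post_text);

-- MATCH() should list the same columns as the index so the index can be used.
-- Boolean mode: + requires a word, - excludes one, quotes match a phrase.
SELECT  post_id
,       post_subject
FROM    phpbb_posts_text
WHERE   MATCH (post_subject, post_text)
        AGAINST ('+index -phpmyadmin "explain plan"' IN BOOLEAN MODE)
LIMIT   25;</pre>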
<p>One thing that many of these MODs have in common is my concern for performance. We&#8217;re over 750K posts now, and still running extremely well on a server that is hosting several dozen sites, although none of them as active as our big board. Every time I touch the code, performance is a primary goal. Another MOD that I&#8217;ve been planning is to port the phpBB3 posting form back to phpBB2 since it does a better job of supporting modern browsers as well as providing some additional formatting features. I haven&#8217;t even started on that yet, but I think it would be good. Now that I&#8217;ve personally switched to Chrome as my standard browser I&#8217;m noticing some interesting quirks. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>So that&#8217;s what I&#8217;ve been up to for the past few months. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_cool.gif' alt='8-)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.phpbbdoctor.com/blog/2011/11/02/new-phpbb2-modifications-coming/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Optimizing Random Users Via Last Visit Time</title>
		<link>http://www.phpbbdoctor.com/blog/2009/09/27/optimizing-random-users-via-last-visit-time/</link>
		<comments>http://www.phpbbdoctor.com/blog/2009/09/27/optimizing-random-users-via-last-visit-time/#comments</comments>
		<pubDate>Sun, 27 Sep 2009 16:33:39 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[MOD Writing]]></category>
		<category><![CDATA[Performance Tuning]]></category>
		<category><![CDATA[phpBB]]></category>
		<category><![CDATA[phpBB3]]></category>

		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/?p=336</guid>
		<description><![CDATA[While I have not started in-depth MODding on phpBB3 yet, I do read the phpBB3 MODders forum from time to time just to start to get the flavor of how things have changed. The other day a database (query) question came up and I suggested an answer that I originally thought was only slightly different [...]]]></description>
			<content:encoded><![CDATA[<p>While I have not started in-depth MODding on phpBB3 yet, I do read the phpBB3 MODders forum from time to time just to start to get the flavor of how things have changed. The other day a database (query) question came up and I suggested an answer that I originally thought was only slightly different from what had already been proposed. However, after being asked which of the two solutions would be the least CPU intensive I did a bit more investigating.</p>
<p>I discovered that one solution was clearly better than the other, but only if the proper index was created. </p>
<p><em>Disclaimer: I tested on phpBB2. The index that I created does not exist in a standard phpBB2, nor does it exist in a standard phpBB3 install, so I suspect this post applies to both.</em> <span id="more-336"></span></p>
<h3>Defining the Problem</h3>
<p>Here is a partial quote of the <a href="http://www.phpbb.com/community/viewtopic.php?f=71&#038;t=1793305">original question in the phpBB3 MOD forum</a>:</p>
<blockquote><p>I want to select 5 random memebers from last 30 active users.</p></blockquote>
<p>Simple enough, yes? First, get the last 30 members that have visited the board, then randomly select 5 of those. This could easily be done procedurally via php code, but it can also be done directly in the database with the correct SQL code.</p>
<h3>Order By Random</h3>
<p>The first suggestion given was this:</p>
<pre>$start = 0;
$number = 30;
$sql = 'SELECT *
    FROM ' . USERS_TABLE . '
    ORDER BY user_lastvisit  DESC, RAND()'  ;
    $result = $db->sql_query_limit($sql, $number, $start);</pre>
<p>With apologies to evil&lt;3 who posted this <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  there are a couple of things that I suggested changing. First is to avoid using the * to select every column in the table unless it&#8217;s absolutely needed. In this case, I made the assumption that the original poster was looking for a way to display a random five &#8220;recent visitors&#8221; somewhere on a page on the site. To do this doesn&#8217;t require every single bit of information about the user, just certain columns. If you select * then the entire row is returned. There are 73 columns in a standard phpBB3 users table, several of them varchar(255), and in my test installation all of the fields are mandatory. That means every single column has a value, even if it is just a space or other placeholder value. By setting up a specific list of columns to request in the query the amount of I/O is reduced. Less I/O should mean if all other things are equal the query will run faster because there are fewer bits and bytes to move around.</p>
<p>The other issue with the query as provided is that it&#8217;s not very efficient. It has two order by columns, one of which cannot possibly be indexed. (You can&#8217;t index something that doesn&#8217;t exist until the runtime of the query, so the rand() function result is impossible to tune.) Here is an abbreviated display for the explain plan for this query:</p>
<pre>+-------------+------+---------------+------+---------+------+-------+---------------------------------+
| select_type | type | possible_keys | key  | key_len | ref  | rows  | Extra                           |
+-------------+------+---------------+------+---------+------+-------+---------------------------------+
| SIMPLE      | ALL  | NULL          | NULL | NULL    | NULL | 43367 | Using temporary; Using filesort |
+-------------+------+---------------+------+---------+------+-------+---------------------------------+</pre>
<p>This shows that no indexes will be used for this query at all, which is not good. I ran this query five times in a row on my large user table and got execution times of 0.12, 0.12, 0.12, 0.13, and 0.12 seconds. </p>
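<p>A plan like the one above comes from prefixing the statement with <code>EXPLAIN</code>. As a minimal sketch, here is the first suggestion written as plain SQL with a narrower column list; <code>phpbb_users</code> assumes the default table prefix, and the two columns are only a guess at what a &#8220;recent visitors&#8221; block would need.</p>
<pre>-- Same ORDER BY as the suggested query, but only two columns instead of *.
EXPLAIN
SELECT  user_id
,       username
FROM    phpbb_users
ORDER BY user_lastvisit DESC, RAND()
LIMIT   30;</pre>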
<h3>Select Last 30 Then Random 5</h3>
<p>As you can see from the explain data above, I have 43,367 rows in my users table right now, which is fairly large. Instead of scanning the entire table, it would be much more efficient to get the last 30 visitors in one pass and then pick five members randomly from that list. I suggested this SQL to do that:</p>
<pre>SELECT  u.user_id
,       u.username
FROM    phpbb_users u
,       (SELECT user_id
         FROM   phpbb_users v
         ORDER BY user_lastvisit desc limit 30) v
WHERE   u.user_id = v.user_id
ORDER BY rand() limit 5;</pre>
<p>This is an interesting technique called &#8220;inline tables&#8221; as I am creating a new table on the fly by writing SQL code inside the FROM clause. Every database I have worked with supports this technique, so it should be portable. (I do not count Microsoft Access as a real database. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' />  ) What this SQL code will do is run the inline table to return a list of 30 users, then join that virtual table to the real table by <code>user_id</code> (which is a unique key) and randomly select five users from the joined result set.</p>
<p>Is it more efficient?</p>
<p>I ran this query five times (as I did with the other one) and got run times of 0.10, 0.12, 0.10, 0.10, and 0.11 seconds. It seems that it&#8217;s not really that much more effective, so is there really a clear winner?</p>
<h3>Index Key Columns</h3>
<p>I ran this query:</p>
<p><code>show indexes from phpbb_users</code></p>
<p><em>Side Note: I run my SQL directly on the database using the MySQL command line, rather than phpMyAdmin. If you use the GUI interface, then you can check for keys by looking at the appropriate screen instead of doing as I documented here.</em></p>
<p>The results of the query did not show an index on <code>user_lastvisit</code> which is crucial to this solution. Here is the explain plan for my query without the index:</p>
<pre>+-------------+--------+---------------+---------+-----------+-------+---------------------------------+
| select_type | type   | possible_keys | key     | ref       | rows  | Extra                           |
+-------------+--------+---------------+---------+-----------+-------+---------------------------------+
| PRIMARY     | ALL    | NULL          | NULL    | NULL      |    30 | Using temporary; Using filesort |
| PRIMARY     | eq_ref | PRIMARY       | PRIMARY | v.user_id |     1 |                                 |
| DERIVED     | ALL    | NULL          | NULL    | NULL      | 43367 | Using filesort                  |
+-------------+--------+---------------+---------+-----------+-------+---------------------------------+</pre>
<p>Notice that it also scans all 43,367 user rows. That&#8217;s okay. What isn&#8217;t okay is that it does so without the benefit of an index and it also has to do some additional work since more than one table is involved. It would seem that the first query should be more efficient since it only has one explain step and the second one has three.</p>
<p>However, the magic of database indexing can fix this. The driver for this entire question is the <code>user_lastvisit</code> column. After creating an index on this field (which is not indexed by default in either phpBB2 or phpBB3) here is the new explain plan.</p>
<pre>+-------------+--------+---------------+----------------+-----------+-------+---------------------------------+
| select_type | type   | possible_keys | key            | ref       | rows  | Extra                           |
+-------------+--------+---------------+----------------+-----------+-------+---------------------------------+
| PRIMARY     | ALL    | NULL          | NULL           | NULL      |    30 | Using temporary; Using filesort |
| PRIMARY     | eq_ref | PRIMARY       | PRIMARY        | v.user_id |     1 |                                 |
| DERIVED     | index  | NULL          | user_lastvisit | NULL      | 43367 |                                 |
+-------------+--------+---------------+----------------+-----------+-------+---------------------------------+</pre>
<p>Yup, still have to look at (or so the database optimizer thinks) all 43,367 users. But this time we do so with the benefit of an index. What is the impact?</p>
<p>The query without an index, remember, ran in 0.10, 0.12, 0.10, 0.10, and 0.11 seconds. After creating the index I ran the same query five times and got 0.00 seconds of execution time on every trial. </p>
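<p>For reference, an index like that can be created with a single statement. This is a sketch that assumes the default <code>phpbb_</code> table prefix; the index name is chosen to match the <code>key</code> column in the plan above, but any name works.</p>
<pre>-- Index the column that drives the inner ORDER BY ... LIMIT 30.
CREATE INDEX user_lastvisit ON phpbb_users (user_lastvisit);

-- Confirm that it now shows up in the index list.
SHOW INDEXES FROM phpbb_users;</pre>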
<p>Does the index help the first query? Interestingly enough, it does not. The explain plan is identical with or without the index, and the query execution times do not improve either. </p>
<p>However, it gets worse. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  It doesn&#8217;t perform as required. My query gets the last 30 users that have logged in and then randomly selects five of those. The ORDER BY clause happens after all of the select and join process is complete, so I am ordering by RAND() on at most 30 rows. <strong>The other suggestion will pick the same five users nearly every single time</strong>. Why? The secondary sort column (the &#8220;random factor&#8221;) will only come into play if two users have exactly the same last visit time (down to the second). When you have two columns in the ORDER BY clause, the first column is the primary sort and every row returned will first be sorted by that column. If there are ties in the first column then and only then will the second column be sorted.</p>
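<p>A tiny throwaway table makes the tie-breaking behavior easy to see. The table name and values here are invented for the demonstration.</p>
<pre>CREATE TEMPORARY TABLE demo_visits (user_id INT, user_lastvisit INT);
INSERT INTO demo_visits VALUES (1, 100), (2, 200), (3, 200), (4, 300);

-- User 4 always comes first and user 1 always last; only users 2 and 3,
-- who are tied at 200, can trade places from one run to the next.
SELECT  user_id
FROM    demo_visits
ORDER BY user_lastvisit DESC, RAND();</pre>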
<p>So the first solution suggested is the worst of both worlds: not only is it slower, it is also incorrect. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<h3>Conclusion</h3>
<p>There may be other solutions to this. I don&#8217;t mean to present this post as the ultimate answer to this question. What I hoped to accomplish with this post was to show how two solutions that look the same are not always equivalent. Subtle differences can have a huge impact on functionality. </p>
<p>The second lesson is that if you&#8217;re going to be asking the same question from your database over and over you should carefully consider indexing the columns used in the WHERE or ORDER BY clauses. I did some work for someone a while back (their board is one of the top ten phpBB boards on the &#8220;big boards&#8221; site). They wanted to display the top ten posters on their index. The code as written was taking over ten seconds just to run the query <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_eek.gif' alt=':shock:' class='wp-smiley' />  and then the php code / template process still had to complete. I rewrote the code and added an index on the <code>user_posts</code> column and the code ran in less than a hundredth of a second.</p>
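<p>The actual rewrite is not shown here, but the general shape of an index-backed &#8220;top ten&#8221; query might look something like the following sketch. It assumes the default <code>phpbb_users</code> table; the index name is arbitrary.</p>
<pre>CREATE INDEX user_posts ON phpbb_users (user_posts);

-- With the index in place MySQL can read the ten highest values
-- in index order instead of sorting the whole table.
SELECT  user_id
,       username
,       user_posts
FROM    phpbb_users
ORDER BY user_posts DESC
LIMIT   10;</pre>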
<p>On the other hand, too many indexes can also be a problem, so don&#8217;t go out and create an index for every single column in your database. Just the ones that truly matter. In this case, it makes a substantial difference. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_cool.gif' alt='8-)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.phpbbdoctor.com/blog/2009/09/27/optimizing-random-users-via-last-visit-time/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>A Public Service Announcement: Some Hosts Overload Their Servers</title>
		<link>http://www.phpbbdoctor.com/blog/2008/08/28/a-public-service-announcement-some-hosts-overload-their-servers/</link>
		<comments>http://www.phpbbdoctor.com/blog/2008/08/28/a-public-service-announcement-some-hosts-overload-their-servers/#comments</comments>
		<pubDate>Thu, 28 Aug 2008 07:06:20 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[Performance Tuning]]></category>
		<category><![CDATA[phpBB]]></category>
		<category><![CDATA[phpBB Doctor]]></category>

		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/?p=255</guid>
		<description><![CDATA[Some hosts overload their servers. No, really, it&#8217;s true.   I know some of you know this, but many people don&#8217;t and are quite surprised when they ask me see what I can do to improve the performance of their board. One of the first things I do before taking on a client with [...]]]></description>
			<content:encoded><![CDATA[<p>Some hosts overload their servers. No, really, it&#8217;s true. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' />  I know some of you know this, but many people don&#8217;t and are quite surprised when they ask me see what I can do to improve the performance of their board. One of the first things I do before taking on a client with this type of request is run a check to see how many other sites are hosted on the same server. If the number is over 100, I don&#8217;t bother doing much other than telling the potential client to get a better host. How do you find this out? It&#8217;s not hard, really.</p>
<p><span id="more-255"></span>One of my current clients is building out a new site. While I was online editing (using <em>vi</em> directly on the server, as I like to do) I found that just saving the files would sometimes take several seconds. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_eek.gif' alt=':shock:' class='wp-smiley' />  And this is for a 2K file, not something incredibly large. So the first thing I did was run the &#8220;top&#8221; command, and here&#8217;s what I saw:</p>
<pre>load average: 4.29, 3.06, 2.92</pre>
<p>On some boxes, a load of 3.0 might not be bad. The &#8220;big board&#8221; that I am frequently talking about runs on the same server as the phpBBDoctor blog. If I hit 2.0 I&#8217;m really surprised. And since I have a quad-cpu box, I don&#8217;t really get worried until I get closer to 4.0. Yet the client box I&#8217;m working on is regularly over 6.0 or even 8.0 during the day. The stats posted above were captured at 2am central time, so hardly a busy time of the day. Yet still the numbers are what I would consider quite high.</p>
<p>I gave that information to my client and suggested they needed to talk to their host about the server load. It should not take 3+ seconds to have a file saved from <em>vi</em> by any stretch.</p>
<p>Then, it&#8217;s on to my next tool. I take the IP address for the domain (easily found with a simple &#8220;ping&#8221; command in most cases). I then use a tool like that found at www.myipneighbors.com which will tell me how many other domains also respond to that IP. Generally hosts are using virtual hosting so they don&#8217;t run out of IP addresses. By doing this a host can essentially offer unlimited domains on the same box with only one IP address. Hmmm, unlimited domains. Does that start to sound like a bad omen? <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
<p>I went through this with a different client who was at a host that will remain unnamed but rhymes with why-power-web. Their very active site was on the same server that had over 500 sites, and that was the most I had personally seen. However, when I did the IP lookup on the server I was talking about above, here&#8217;s what I got: there are 737 sites sharing this IP address. 737! That&#8217;s a fairly large airplane, and way too many sites to host on a single server.</p>
<p>Conclusion? You get what you pay for. I am afraid that my client is going to be quite disappointed when they launch their new site on this new host.</p>
<p>Oh, and the host response to the first concern, about the server load being so high? It was something like, &#8220;Our boxes have multiple CPU&#8217;s so a high server load is not an issue.&#8221; They may have multiple CPU&#8217;s, but a high load is certainly an issue. We&#8217;ll have to see how long it takes them to fuss at my client because phpBB2 is &#8220;overloading&#8221; their already overloaded box.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.phpbbdoctor.com/blog/2008/08/28/a-public-service-announcement-some-hosts-overload-their-servers/feed/</wfw:commentRss>
		<slash:comments>17</slash:comments>
		</item>
		<item>
		<title>phpBB3 Caching Strategies for 3.0</title>
		<link>http://www.phpbbdoctor.com/blog/2008/08/21/phpbb3-caching-strategies-for-30/</link>
		<comments>http://www.phpbbdoctor.com/blog/2008/08/21/phpbb3-caching-strategies-for-30/#comments</comments>
		<pubDate>Thu, 21 Aug 2008 11:20:13 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[Performance Tuning]]></category>
		<category><![CDATA[phpBB]]></category>
		<category><![CDATA[phpBB3]]></category>

		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/?p=252</guid>
		<description><![CDATA[Through some sort of technical glitch most of this post was missing. I&#8217;ve updated it with the rest of the content. My apologies.
A few days ago I posted Part III in my ongoing series where I am comparing phpBB3 to phpBB-Dave (my customized phpBB2-based board) from a feature perspective. I hopefully have made it clear [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Through some sort of technical glitch most of this post was missing. I&#8217;ve updated it with the rest of the content. My apologies.</strong></p>
<p>A few days ago I posted <a href="http://www.phpbbdoctor.com/blog/2008/08/18/smackdown-round-iii-phpbb3-versus-phpbb-dave/">Part III in my ongoing series</a> where I am comparing phpBB3 to phpBB-Dave (my customized phpBB2-based board) from a feature perspective. I hopefully have made it clear that I am not claiming that my board is technically superior, as I am quite sure that it is not. The experimentation I did with the template engines several months back proved that, at the very least.  </p>
<p>Part III included a review of the caching process for each board. A vanilla phpBB2 board does not offer any caching (outside of an early form of template caching). My version is caching quite a bit of information, as does phpBB3. I&#8217;ve reproduced that specific table here, as it will help clarify the points made later in this post.</p>
<p>One interesting result of Londonvasion for me was that I found out that there are team members that stop by and read my blog (at least occasionally) that don&#8217;t leave comments. DavidMJ, one of the current members of the phpBB3 development team, is apparently one of those folks. He caught me on IRC a few nights ago and offered to help explain the caching routines from phpBB3 and I was quick to take him up on his offer. Here is an excerpt from our conversation; I found it extremely enlightening. </p>
<p><span id="more-252"></span></p>
<blockquote><p>[22:03] DavidMJ: drathbun: hey<br />
[22:03] drathbun: greetings<br />
[22:04] DavidMJ: I happened to catch your blog, was wondering if you wanted to know what we mean by &#8220;arbitrary&#8221; data wrt caching<br />
[22:04] drathbun: at some point, yes, would love to<br />
[22:04] drathbun: I hadn&#8217;t had time to investigate yet, but if you&#8217;re inclined to share, I&#8217;m listening <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /><br />
[22:04] DavidMJ: sure<br />
[22:04] DavidMJ: we make a distinction between caching queries and everything else<br />
[22:05] DavidMJ: so if we want to cache a query, all we have to do is make it so that we specify a TTL, the rest of the code is unchanged<br />
[22:06] DavidMJ: everything else falls under &#8220;arbitrary&#8221;, we provide a nice mechanism for saying &#8220;take this array/string/object and cache it for some amount of time, I will remember because I have given it a name&#8221;<br />
[22:06] drathbun: like smilies?<br />
[22:06] DavidMJ: yep<br />
[22:06] DavidMJ: so caching a query is totally unnamed while data is completely named<br />
[22:06] drathbun: aha<br />
[22:06] DavidMJ: it allows us to also make sure that we only cache old things wrt queries<br />
[22:07] drathbun: so as a stupid noobish question, what exactly is cached when you say query cache? the sql or the results?<br />
[22:07] DavidMJ: as we only will cache, and recall, something we have seen before<br />
[22:07] DavidMJ: technically, both<br />
[22:07] drathbun: ok<br />
[22:07] DavidMJ: we hash the sql to be able to know that _exact_ query<br />
[22:07] drathbun: so by caching the sql, you avoid the sql build step<br />
[22:07] DavidMJ: we store the entire results very efficiently in 3.2<br />
[22:07] DavidMJ: we store them quite well for 3.0<br />
[22:09] drathbun: I do some what you might consider fairly primitive caching now&#8230;<br />
[22:09] DavidMJ: drathbun: what do you do now?<br />
[22:11] drathbun: what I call my &#8220;primitive&#8221; cache is just a dump to a file of a series of assignment statements<br />
[22:11] drathbun: so things that are static, or nearly so, are included as needed rather than running queries on every page<br />
[22:11] drathbun: I figure I&#8217;ve eliminated somewhere on the order of 500,000 queries a day from my server<br />
[22:12] DavidMJ: drathbun: effective, but not as robust as the 3.0 mechanism <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /><br />
[22:12] drathbun: I&#8217;m sure it&#8217;s not <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /><br />
[22:12] drathbun: I have a cache-loader that checks the page being processed, and loads the cache related to those pages, also does the same for language files<br />
[22:13] DavidMJ: ah, that is a bit strange<br />
[22:13] drathbun: so I hope that the file I/O I added for the cache is offset by the reduced file I/O for unneeded language files<br />
[22:13] DavidMJ: what we do is we load up caches as needed<br />
[22:13] DavidMJ: we see if we recognize a query is in the cache, if so we load it<br />
[22:13] DavidMJ: this way, identical queries on multiple pages are cached once, loaded once<br />
[22:13] DavidMJ: given the same TTL, etc.<br />
[22:14] DavidMJ: it also totally hides the caching logic<br />
[22:14] DavidMJ: another nice trick is bypassing I/O alltogether<br />
[22:14] drathbun: there are so many customized queries that can be run because of board permissions and so on, I never really applied any thought to caching queries because I figured it would be a lot of work for little benefit<br />
[22:15] DavidMJ: 3.0 really does not need board permissions cached, it is all stored in a bitfield<br />
[22:15] DavidMJ: the bitfield is stored per forum and is always easy to get to, the lookup is quite fast&#8230;<br />
[22:16] DavidMJ: we have some issues when people do permission set up without using roles on huge boards<br />
[22:16] drathbun: but in theory, with 20 different people online, couldn&#8217;t you have 20 different permission settings?<br />
[22:16] drathbun: for the same forum?<br />
[22:16] DavidMJ: yep<br />
[22:16] DavidMJ: and it is stored with each user<br />
[22:16] drathbun: aha<br />
[22:16] DavidMJ: it is not cached anywhere, there is no need<br />
[22:16] * drathbun sees a lightbulb<br />
[22:16] DavidMJ: we grab the whole row anyway<br />
[22:17] drathbun: right<br />
[22:17] DavidMJ: so permissions are quite efficient<br />
[22:17] drathbun: so you already know the permissions when you get the user data<br />
[22:17] DavidMJ: yep<br />
[22:17] DavidMJ: 3.0 is light years ahead of 2.0 wrt organization <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p></blockquote>
<p>So first, thanks to DavidMJ for taking the time to explain the caching routine to me. At some point I will be reading some code, but I now have a much better understanding of what the phrases on the chart were intended to mean. The way user permissions are stored sounds incredibly efficient, for one thing. The idea of being able to cache and share both the SQL build output and the query results is also interesting. Clearly what is in 3.0 as far as caching is far, far above what I have implemented.</p>
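<p>To make the idea from the conversation concrete, here is a minimal sketch of a query cache keyed on the SQL text and its TTL, so that identical queries issued from different pages share one entry. It is written in Python purely for illustration and is not phpBB 3.0 code; the class name, the <code>run_query</code> callable, and the in-memory store are all assumptions.</p>

```python
import time


class QueryCache:
    """Hypothetical query cache; not phpBB's actual implementation."""

    def __init__(self, run_query, clock=time.time):
        self.run_query = run_query  # callable that really hits the database
        self.clock = clock          # injectable so expiry can be tested
        self.store = {}             # (sql, ttl) -> (expires_at, rows)
        self.db_hits = 0

    def fetch(self, sql, ttl):
        # Identical SQL with the same TTL maps to the same entry,
        # no matter which page asked for it.
        key = (sql, ttl)
        entry = self.store.get(key)
        if entry is not None and entry[0] > self.clock():
            return entry[1]
        rows = self.run_query(sql)
        self.db_hits += 1
        self.store[key] = (self.clock() + ttl, rows)
        return rows


def fake_db(sql):
    # Stand-in for a real database call.
    return [("smile", ":)"), ("wink", ";-)")]


cache = QueryCache(fake_db)
first = cache.fetch("SELECT * FROM smilies", ttl=3600)
second = cache.fetch("SELECT * FROM smilies", ttl=3600)
print(cache.db_hits)  # the second call is served from the cache
```

<p>The caller never sees whether the rows came from the cache or the database, which is the "hides the caching logic" point made in the chat.</p>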
<p>In case you missed it, &#8220;arbitrary data&#8221; is relatively static data like smilies. That&#8217;s what I thought they meant by database query caching. In that case, I do not do any query caching, only arbitrary data. So that means the new format for the feature comparison table is this:</p>
<h3>Caching</h3>
<table class="blogtable">
<tr>
<th colspan="4">Caching</th>
</tr>
<tr>
<td>Feature</td>
<td>phpBB2</td>
<td>phpBB3</td>
<td>phpBB-Dave</td>
</tr>
<tr class="alt">
<td>Database Query Caching:</td>
<td>No</td>
<td>Yes</td>
<td>No</td>
</tr>
<tr>
<td>Template Caching:</td>
<td>No</td>
<td>Yes</td>
<td>Yes</td>
</tr>
<tr class="alt">
<td>Arbitrary Data:</td>
<td>No</td>
<td>Yes</td>
<td>Yes</td>
</tr>
<tr>
<td>Manual Cache Refreshing:</td>
<td>No</td>
<td>Yes</td>
<td>Yes</td>
</tr>
</table>
<p>The change doesn&#8217;t alter the way I scored this category. It just helps me understand more about how the caching routines work, and I am quite happy that DavidMJ offered to educate me. </p>
<p>Oh, and my favorite quote from the conversation? DavidMJ is nothing, if not bold. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_lol.gif' alt=':lol:' class='wp-smiley' />  Here was a prediction he made about 3.2 during the conversation:</p>
<blockquote><p>DavidMJ: 3.2 will have robust and reliability guarantees beyond anything I have seen in modern forum software</p></blockquote>
<p>The thing is, I believe he along with the rest of the developer team can back that statement up and deliver on that prediction. I really do. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.phpbbdoctor.com/blog/2008/08/21/phpbb3-caching-strategies-for-30/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Why Auto-Group MODs are Bad For Your Board</title>
		<link>http://www.phpbbdoctor.com/blog/2008/04/03/why-auto-group-mods-are-bad-for-your-board/</link>
		<comments>http://www.phpbbdoctor.com/blog/2008/04/03/why-auto-group-mods-are-bad-for-your-board/#comments</comments>
		<pubDate>Fri, 04 Apr 2008 02:51:51 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[MOD Writing]]></category>
		<category><![CDATA[Performance Tuning]]></category>
		<category><![CDATA[phpBB]]></category>

		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/2008/04/03/why-auto-group-mods-are-bad-for-your-board/</guid>
		<description><![CDATA[Some time back I wrote the Forum Auth by Post Count MOD for phpBB2. I had to write it because so many people were using an &#8220;auto-group&#8221; MOD of some kind, and using it to grant permissions to view or post in specific forums. The general idea is good. In my opinion, the implementation is [...]]]></description>
			<content:encoded><![CDATA[<p>Some time back I wrote the Forum Auth by Post Count MOD for phpBB2. I had to write it because so many people were using an &#8220;auto-group&#8221; MOD of some kind, and using it to grant permissions to view or post in specific forums. The general idea is good. In my opinion, the implementation is bad and can have a substantial negative impact on your board.</p>
<p><span id="more-186"></span>One of my clients was using an auto-group MOD to require posters to post in the &#8220;Introduction&#8221; forum on her board before they posted anywhere else. There were also certain semi-private forums that were not available (visible) until a member had reached a certain post level. Now in phpBB2 you manage permissions in two ways: by user (bad idea) or by group (much better). But there is no &#8220;event&#8221; system in phpBB2 that will automatically add (or remove) someone from a group. This is where the auto-group MODs generally come into play.</p>
<h3>How They Work</h3>
<p>Auto-group MODs generally operate by checking a set of rules (a query) against the user&#8217;s post count after each operation that can impact that value (the post count). If a user&#8217;s post count has increased above the set threshold they are added to the group(s) they now qualify for. If the user&#8217;s post count has decreased below a set threshold, then they are removed. Sounds simple, right? So what is the issue?</p>
<h3>The Problem</h3>
<p>The issue is there are lots of things that can change a user&#8217;s post count. Let&#8217;s count them.</p>
<ol>
<li>A user enters a new post</li>
<li>A user deletes their own post</li>
<li>A moderator or admin deletes an individual post</li>
<li>A moderator or admin deletes an entire topic</li>
<li>The standard (automatic) pruning process removes a post</li>
<li>A board admin deletes an entire forum</li>
</ol>
<p>Is that enough? <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_lol.gif' alt=':lol:' class='wp-smiley' />  There might be more, but that list is certainly long enough to make my point for me. The good news is that most of the options above call the same code, so it&#8217;s not like there are lots of different places to add new code. That&#8217;s not the issue. The issue is what that code has to do. Every time the user&#8217;s post count changes, I have to:</p>
<ol>
<li>Get a list of groups with &#8220;auto-group&#8221; rules</li>
<li>Get a list of groups that I am already in</li>
<li>See if my new post count qualifies me for any new groups</li>
<li>See if my new post count requires me to be dropped from any current groups</li>
</ol>
<p>Once that determination has been made, I then have to execute queries to adjust my group membership(s). All of these add overhead to something (the posting process) that should be as smooth as possible. Posting is, after all, what we want people to do on our boards, right?</p>
<p>And here&#8217;s the worst part. These queries will execute each and every time I enter a post! Suppose that I need to accrue 100 posts before I am granted membership to a private group. From post one through post ninety-nine I pay the penalty of checking for new group membership. On post 100 I finally get to join, yippee! But guess what&#8230; every post from 101 and on is still going to check my group membership!</p>
<p>In my opinion, this is a very inefficient process, and should never be run on a busy board.</p>
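<p>The cost described above can be sketched with a small simulation. This is a Python illustration, not code from any actual auto-group MOD; the group name, the 100-post threshold, and the assumption of two lookup queries per check plus one write per membership change are all hypothetical.</p>

```python
def autogroup_check(post_count, rules, memberships):
    """Return (new_memberships, queries_run) for one post-count change.

    rules maps a hypothetical group name to its minimum post count.
    """
    queries = 2  # 1: load the auto-group rules, 2: load current memberships
    qualified = {g for g, min_posts in rules.items() if post_count >= min_posts}
    to_add = qualified - memberships
    to_drop = memberships.intersection(rules) - qualified
    queries += len(to_add) + len(to_drop)  # one write per membership change
    return (memberships | to_add) - to_drop, queries


rules = {"regulars": 100}
memberships = set()
total_queries = 0
for post_count in range(1, 151):  # the user's 1st through 150th post
    memberships, used = autogroup_check(post_count, rules, memberships)
    total_queries += used

print(total_queries)  # 150 posts x 2 lookups, plus 1 write at post 100
```

<p>Only one of those queries (the write at post 100) actually changed anything; the rest were spent confirming that nothing needed to change.</p>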
<h3>The Solution</h3>
<p>The Forum Auth by Post Count MOD that I wrote does not require anything to be done during the posting or pruning process. If you look at the root issue (I want to grant access to a forum only after a set post count is reached) the better solution is to add that information to the forum table. After all, in every case where you are requesting forum information (the forum name, for example) you can easily add new fields to the query. It is a simple matter to also request the &#8220;min_posts_to_view&#8221; field and compare it to my post count. How many extra queries does that take? Zero. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>So in a nutshell that is the design behind the Forum Auth by Post Count (FAPC? Nah, don&#8217;t go there&#8230; <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' />  ) modification that I wrote for phpBB2. Instead of overhead on every single post, I check one value that I have all the time ($userdata['user_posts']) against a value that I can easily retrieve every time I get the forum_id or forum_name from the phpbb_forums table and react accordingly. I want to talk more about the design of FAPC (see, it just doesn&#8217;t scan as an abbreviation) in my next post, because I think it uses a good approach. It isolates most of the code so phpBB2 upgrades are fairly easy, and it also makes it easy to reuse in a variety of places.</p>
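<p>As a rough illustration of that comparison (in Python rather than the MOD's PHP, and not the MOD's actual code), the check boils down to one comparison between two values that are already in memory. The forum rows, forum names, and the user below are made up for the example; only <code>user_posts</code> and <code>min_posts_to_view</code> come from the description above.</p>

```python
def can_view_forum(userdata, forum_row):
    """Compare the user's post count with the forum's threshold.

    Both values are assumed to be loaded already: userdata with the
    session, forum_row by the query that fetched the forum name.
    """
    return userdata["user_posts"] >= forum_row["min_posts_to_view"]


# Hypothetical rows, as if selected from phpbb_forums with one extra column.
forums = [
    {"forum_id": 1, "forum_name": "Introductions", "min_posts_to_view": 0},
    {"forum_id": 2, "forum_name": "Members Lounge", "min_posts_to_view": 100},
]
userdata = {"user_id": 42, "user_posts": 57}

visible = [f["forum_name"] for f in forums if can_view_forum(userdata, f)]
print(visible)
```

<p>No group tables are touched, and nothing runs when a post is added, deleted, or pruned.</p>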
<p>It&#8217;s not rocket science (but then again, only rocket science === rocket science) but it&#8217;s probably worth writing about. Stay tuned. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_cool.gif' alt='8)' class='wp-smiley' /> </p>
<p><strong>Related Links</strong></p>
<ul>
<li><a href="http://www.phpbb.com/community/viewtopic.php?f=15&#038;t=471142">Release Topic</a> at phpBB.com</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.phpbbdoctor.com/blog/2008/04/03/why-auto-group-mods-are-bad-for-your-board/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Template Performance Update: Check the /contrib Folder</title>
		<link>http://www.phpbbdoctor.com/blog/2008/03/27/template-performance-update-check-the-contrib-folder/</link>
		<comments>http://www.phpbbdoctor.com/blog/2008/03/27/template-performance-update-check-the-contrib-folder/#comments</comments>
		<pubDate>Thu, 27 Mar 2008 14:06:28 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[Performance Tuning]]></category>
		<category><![CDATA[phpBB]]></category>

		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/2008/03/27/template-performance-update-check-the-contrib-folder/</guid>
		<description><![CDATA[I am very excited that phpBB user Brainy has been working on an updated template engine for phpBB2. However, at the moment I can&#8217;t get it to work completely. Last night, when I was up late working on other things already anyway, I decided to finally take a look at the template options from the [...]]]></description>
			<content:encoded><![CDATA[<p>I am very excited that phpBB user Brainy has been working on an updated template engine for phpBB2. However, at the moment I can&#8217;t get it to work completely. Last night, when I was up late working on other things already anyway, I decided to finally take a look at the template options from the /contrib folder from a standard phpBB2 installation. I was glad I did. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p><span id="more-180"></span></p>
<h3>The Test</h3>
<p>I ran three trials. Each trial compared three template engines: Categories Hierarchy, eXtreme Styles, and the file cache code from the /contrib folder, which I called &#8220;phpbb-cache&#8221; for clarity. On the first trial I ran a single session with a two second refresh. On the second trial I ran 12 sessions with a five second refresh. On the last trial (the &#8220;load test&#8221;) I ran 24 sessions with a seven second refresh. The last one is really the most interesting, since it is extremely rare that my largest board has only a single active session.</p>
<p>The reason I ran these trials is I am rolling out a huge list of updates next week, and I really want to get off of eXtreme Styles. The two main reasons for that are (1) based on my testing eXtreme Styles does not scale well under load, and (2) an exploit was recently discovered in this code so I would rather drop it even though I have fixed the known exploit.</p>
<p>All trials were run using the same process and server that I have <a href="http://www.phpbbdoctor.com/blog/2008/03/01/performance-tuning-setting-up-to-test-template-engines-in-phpbb2/">outlined before</a>.</p>
<h3>What Am I Looking For</h3>
<p>I look at four numbers in my testing. First, I capture the specific time that it takes to parse the page using the pparse() function. But as has been pointed out elsewhere, that is not the complete picture. A template engine might parse well but do other things inefficiently. So I also track the page generation time, starting in common.php and ending in page_tail.php. That is the second number, and is probably the most effective at measuring total template throughput. Third, I track the database time (SQL time) for each page. In theory, the SQL times should be nearly identical for all template engines, since the database doesn&#8217;t change from one run to the next. Finally, I calculate the standard deviation of the total page times. What I am looking for here is consistency. The template engine with the lowest standard deviation is the most consistent, and therefore the better performer under system load.</p>
<p>I also try to ensure that I have a fair sample of values. In this sample I have over 500 refreshes for each template engine for each page being tested. (I am only testing pages that can be refreshed, for obvious reasons. I don&#8217;t want to have to write a script to test posting or other input screens.)</p>
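<p>For anyone who wants to see why the standard deviation matters alongside the average, here is a small Python sketch. The two lists of page times are invented for the example, and I am assuming the sample standard deviation; the original numbers may have been computed with a different formula.</p>

```python
import statistics


def summarize(page_times):
    """Average and sample standard deviation of a list of page times."""
    return {
        "average": statistics.mean(page_times),
        "std_dev": statistics.stdev(page_times),
    }


# Invented timings: both engines have the same average page time,
# but the second one swings far more from request to request.
steady = summarize([0.15, 0.16, 0.15, 0.14, 0.15])
spiky = summarize([0.05, 0.30, 0.08, 0.25, 0.07])

print(steady)
print(spiky)
```

<p>Judged on the average alone the two would tie; the standard deviation is what separates the consistent engine from the erratic one.</p>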
<h3>The Results</h3>
<p>Here are the results, sorted by page name and then by average page time. </p>
<table class="blogtable">
<tr>
<th>Page</th>
<th>Template Engine</th>
<th>Average pparse()</th>
<th>Average Page</th>
<th>Average SQL</th>
<th>Std Dev of Page Time</th>
</tr>
<tr class="alt">
<td>index.php</td>
<td>phpbb-cache</td>
<td>0.0067173108</td>
<td>0.1482119466</td>
<td>0.0190119559</td>
<td>0.0559101803</td>
</tr>
<tr>
<td>index.php</td>
<td>  ch</td>
<td>0.0085257116</td>
<td>0.1604409443</td>
<td>0.0202735289</td>
<td>0.0792683912</td>
</tr>
<tr class="alt">
<td>index.php</td>
<td>  xs</td>
<td>0.0056653028</td>
<td>0.1613536211</td>
<td>0.0160515404</td>
<td>0.0688333278</td>
</tr>
<tr>
<td>memberlist.php</td>
<td>phpbb-cache</td>
<td>0.0033270374</td>
<td>0.1347974076</td>
<td>0.0118314910</td>
<td>0.0467842036</td>
</tr>
<tr class="alt">
<td>memberlist.php</td>
<td>  ch</td>
<td>0.0031383230</td>
<td>0.1375231788</td>
<td>0.0110986107</td>
<td>0.0499126208</td>
</tr>
<tr>
<td>memberlist.php</td>
<td>  xs</td>
<td>0.0031607790</td>
<td>0.1506782902</td>
<td>0.0125334372</td>
<td>0.0560397755</td>
</tr>
<tr class="alt">
<td>viewforum.php</td>
<td>phpbb-cache</td>
<td>0.0123964617</td>
<td>0.2362841418</td>
<td>0.0781931222</td>
<td>0.0844987811</td>
</tr>
<tr>
<td>viewforum.php</td>
<td>  ch</td>
<td>0.0120772717</td>
<td>0.2460665215</td>
<td>0.0830869342</td>
<td>0.1043991992</td>
</tr>
<tr class="alt">
<td>viewforum.php</td>
<td>  xs</td>
<td>0.0130073584</td>
<td>0.2538506812</td>
<td>0.0802594930</td>
<td>0.0949254373</td>
</tr>
<tr>
<td>viewtopic.php</td>
<td>  ch</td>
<td>0.0201268410</td>
<td>0.2208466722</td>
<td>0.0265869996</td>
<td>0.1081425196</td>
</tr>
<tr class="alt">
<td>viewtopic.php</td>
<td>phpbb-cache</td>
<td>0.0244082441</td>
<td>0.2239469735</td>
<td>0.0287353790</td>
<td>0.1091093586</td>
</tr>
<tr>
<td>viewtopic.php</td>
<td>  xs</td>
<td>0.0237715527</td>
<td>0.2339795790</td>
<td>0.0263962279</td>
<td>0.1056839955</td>
</tr>
</table>
<p>I know, that&#8217;s a lot of numbers. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  However, my first observation is that phpbb-cache is the top-rated template for three of the four pages in the test. My second observation is that it is also the top-rated for three of the four standard deviation values. From these numbers I conclude that the basic file cache template engine from the contrib folder does a better job under load than either of the other two template engines that I tested.</p>
<p>I think that it&#8217;s ironic that the eXtreme Styles MOD benefited from the best SQL performance on the index page, yet still came in last place for total page generation time. I am not sure why the SQL times range as much as they do; I expected more consistency.</p>
<p>For the next step I added up the four average page times for each template engine (one per page), divided the sum by four, and called it an &#8220;Expected Page&#8221; time. Here are the results:</p>
<table class="blogtable">
<tr>
<th>Template Engine</th>
<th>Expected  Page Time</th>
</tr>
<tr class="alt">
<td>phpbb-cache</td>
<td>0.1858101174</td>
</tr>
<tr>
<td>ch</td>
<td>0.1912193292</td>
</tr>
<tr class="alt">
<td>xs</td>
<td>0.1999655429</td>
</tr>
</table>
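<p>The arithmetic behind that table can be reproduced with a few lines of Python. The values are the &#8220;Average Page&#8221; figures copied from the results table above; the variable names are my own.</p>

```python
# "Average Page" values copied from the results table, per engine and page.
average_page = {
    "phpbb-cache": {"index.php": 0.1482119466, "memberlist.php": 0.1347974076,
                    "viewforum.php": 0.2362841418, "viewtopic.php": 0.2239469735},
    "ch": {"index.php": 0.1604409443, "memberlist.php": 0.1375231788,
           "viewforum.php": 0.2460665215, "viewtopic.php": 0.2208466722},
    "xs": {"index.php": 0.1613536211, "memberlist.php": 0.1506782902,
           "viewforum.php": 0.2538506812, "viewtopic.php": 0.2339795790},
}

# Expected page time: the plain mean of the four per-page averages.
expected = {
    engine: sum(pages.values()) / len(pages)
    for engine, pages in average_page.items()
}

for engine, value in sorted(expected.items(), key=lambda item: item[1]):
    print(f"{engine:12s} {value:.10f}")
```

<p>Note that this weights all four pages equally, which matches the test (each session cycled through the four pages in turn) but not necessarily real traffic.</p>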
<p>As a reminder, these numbers were generated by having 24 concurrent sessions refreshing four pages (index, viewforum, viewtopic, and memberlist) one right after the other. This number is a general indication of how efficient each template engine is across the entire board. As you can see, the phpbb caching template did quite well.</p>
<h3>Conclusion</h3>
<p>Since I have a major upgrade planned for this weekend, and I want to get off of eXtreme Styles, I am going to switch over to the template_file_cache.php engine from the /contrib folder. I am using it for several reasons. First, it works very well. Second, I have not been able to get my adjusted categories hierarchy template engine to work with the attachment MOD, so I cannot consider it. The same issue is present (at least so far) with the new Speedy Template from Brainy.</p>
<p>Oh, and if you are wondering&#8230; the phpbb-cache template came in first place on every page on the single session test, and came in second place on every page (ch was first) on the 12 session load test.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.phpbbdoctor.com/blog/2008/03/27/template-performance-update-check-the-contrib-folder/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Template Engine Analysis Postponed&#8230;</title>
		<link>http://www.phpbbdoctor.com/blog/2008/03/26/template-engine-analysis-postponed/</link>
		<comments>http://www.phpbbdoctor.com/blog/2008/03/26/template-engine-analysis-postponed/#comments</comments>
		<pubDate>Wed, 26 Mar 2008 18:43:27 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[Performance Tuning]]></category>
		<category><![CDATA[phpBB]]></category>

		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/2008/03/26/template-engine-analysis-postponed/</guid>
		<description><![CDATA[As regular readers will know (I think I&#8217;m up to 3 of you now  ) I have been doing some testing and preliminary analysis on some different template engines for phpBB2. However, I have since discovered that the alternate template engines I have been evaluating work fine on most pages but fail on the [...]]]></description>
			<content:encoded><![CDATA[<p>As regular readers will know (I think I&#8217;m up to 3 of you now <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_lol.gif' alt=':lol:' class='wp-smiley' /> ) I have been doing some testing and preliminary analysis on some different template engines for phpBB2. However, I have since discovered that the alternate template engines I have been evaluating work fine on most pages but fail on the posting page. So at the moment I have tabled all further posting until I can get a template system that works for <strong>all</strong> phpBB functions, and not just the basic pages I was refreshing. It&#8217;s quite discouraging, as I was hoping to finish reviewing the numbers and roll out my selection on my largest board during our next update.</p>
<p>We&#8217;ll be sticking with eXtreme Styles for now. Further details will be posted once I get everything working.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.phpbbdoctor.com/blog/2008/03/26/template-engine-analysis-postponed/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Running a phpBB2 board? Get rid of this code, and do it now</title>
		<link>http://www.phpbbdoctor.com/blog/2008/03/24/running-a-phpbb2-board-get-rid-of-this-code-and-do-it-now/</link>
		<comments>http://www.phpbbdoctor.com/blog/2008/03/24/running-a-phpbb2-board-get-rid-of-this-code-and-do-it-now/#comments</comments>
		<pubDate>Mon, 24 Mar 2008 18:11:57 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[Performance Tuning]]></category>
		<category><![CDATA[phpBB]]></category>

		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/2008/03/24/running-a-phpbb2-board-get-rid-of-this-code-and-do-it-now/</guid>
		<description><![CDATA[About six or eight months ago I started noticing a really weird behavior from my server. It was always related to pages from my largest phpBB2 board. The first symptom was that the server would slow to a crawl. After connecting to the board via the web host manager or via PuTTY (if I could [...]]]></description>
			<content:encoded><![CDATA[<p>About six or eight months ago I started noticing a really weird behavior from my server. It was always related to pages from my largest phpBB2 board. The first symptom was that the server would slow to a crawl. After connecting to the board via the web host manager or via PuTTY (if I could do either) I would see a system load of 50, 90, even 300+ at one point. When I say system load I am talking about the top-line number from the &#8220;top&#8221; command. In theory with a four-cpu box my system load should never be higher that 4.0, so a load of 300+ is quite&#8230; disturbing.</p>
<p>I checked and changed my apache settings. I checked my database settings. I kept checking and checking and tweaking and nothing made a difference. I finally resorted to blocking IP addresses that exhibited this behavior.</p>
<p>I think I may finally have figured out what was causing my problem. Believe it or not, it was some code in phpBB2. <span id="more-176"></span></p>
<p>phpBB2 is often blamed for server issues. In this case it seems that it certainly was a contributing factor. There may also be something about my apache configuration so I&#8217;m not going to place all of the blame on phpBB2. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  But the fix is quite simple.</p>
<h3>Extreme Server Load</h3>
<p>Here is a thumbnail of a screen shot from my apache server status screen, click to see the larger image:</p>
<p><a href="/blog/images/server_ip_load.png"><img src="/blog/images/tn_server_ip_load.png" width="350" height="305" alt="screen shot #1" title="Server load from phpBB2" border="0" /></a></p>
<p>If you click the image you&#8217;ll see that the entire screen is filled with requests from a single IP address. Each request is for a different instance of viewforum.php which &#8211; coincidentally enough &#8211; is one of the pages that puts the heaviest load on my server due to the number and type of queries involved. When I looked at my apache logs I saw that these requests come in at the rate of five or more per second! This is not a typical human behavior.</p>
<p>And to be honest, it is not a typical web-bot behavior either. Well-behaved bots do not overload your server with too many requests or they would soon find themselves out of a job. So what was causing this?</p>
<h3>Offline Viewing</h3>
<p>The first thought that I had was that someone was trying to use the &#8220;offline viewing&#8221; option from their browser. This option allows a user to specify a web site and a link depth. The browser will start with the specified page for the web site and download it. Then the browser will follow any included links up to the specified depth and also download them to the offline cache. </p>
<p>This was easy enough to test. I set up my browser to cache my board index page and set the link depth to three. But when I checked my apache log I didn&#8217;t see the same behavior, nor did my server load spike up. So I eliminated this as a potential source for the issue.</p>
<h3>Web Accelerators</h3>
<p>The next suggestion that I got was that this was an indication of a web-accelerator in use. <a href="http://webaccelerator.google.com/">Google offers a product</a> that does this and Firefox has something called <a href="https://addons.mozilla.org/en-US/firefox/addon/1269">FasterFox</a>. I took a look at both of these and looked for others as well. However, both of these products mentioned that they do not pre-fetch dynamic URLs. This means that a URL that includes an argument like viewforum.php?f=42 <strong>would not be pre-fetched!</strong></p>
<p>Now what?</p>
<p>I finally resorted to blocking users by IP address anytime I saw this behavior. That was not my preferred solution, but I had to do something to keep my server from doing a death spiral. In some cases this browsing behavior would take my server down for up to 30 minutes. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_eek.gif' alt=':shock:' class='wp-smiley' /> </p>
<p>In order to soften the blow somewhat I didn&#8217;t just block the users outright. Instead I sent them to <a href="http://www.forumtopics.com/no_soup.php">this page</a> which you will understand if you&#8217;ve seen the Seinfeld TV show. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
<h3>On An Unrelated Note&#8230;</h3>
<p>As is often the case the breakthrough presented itself while trying to solve something that was completely unrelated to this issue. I was doing some development on viewtopic, and one of my javascript scripts was not working. As part of the debugging process I did a &#8220;view source&#8221; on the page. What I saw gave me pause. Here it is (with some content removed for brevity):</p>
<pre>&lt;link rel="top" href="./index.php" title="title here" /&gt;
&lt;link rel="search" href="./search.php" title="Search" /&gt;
&lt;link rel="help" href="./faq.php" title="FAQ" /&gt;
&lt;link rel="author" href="./memberlist.php" title="Memberlist" /&gt;
&lt;link rel="prev" href="viewtopic.php?t=40050&amp;view=previous" title="View previous topic" /&gt;
&lt;link rel="next" href="viewtopic.php?t=40050&amp;view=next" title="View next topic" /&gt;
&lt;link rel="up" href="viewforum.php?f=99" title="..." /&gt;
&lt;link rel="chapter forum" href="viewforum.php?f=1" title="..." /&gt;
&lt;link rel="chapter forum" href="viewforum.php?f=63" title="..." /&gt;
&lt;link rel="chapter forum" href="viewforum.php?f=2" title="..." /&gt;
&lt;link rel="chapter forum" href="viewforum.php?f=61" title="..." /&gt;
&lt;link rel="chapter forum" href="viewforum.php?f=4" title="..." /&gt;
&lt;link rel="chapter forum" href="viewforum.php?f=40" title="..." /&gt;
&lt;link rel="chapter forum" href="viewforum.php?f=48" title="..." /&gt;
&lt;link rel="chapter forum" href="viewforum.php?f=78" title="..." /&gt;
&lt;link rel="chapter forum" href="viewforum.php?f=7" title="..." /&gt;
&lt;link rel="chapter forum" href="viewforum.php?f=96" title="..." /&gt;
&lt;link rel="chapter forum" href="viewforum.php?f=65" title="..." /&gt;
&lt;link rel="chapter forum" href="viewforum.php?f=8" title="..." /&gt;
&lt;link rel="chapter forum" href="viewforum.php?f=47" title="..." /&gt;
&lt;link rel="chapter forum" href="viewforum.php?f=9" title="..." /&gt;
&lt;link rel="chapter forum" href="viewforum.php?f=38" title="..." /&gt;
&lt;link rel="chapter forum" href="viewforum.php?f=10" title="..." /&gt;</pre>
<p>As soon as I saw this I flashed back to the server load screen shot. Could this be the cause of my load? Is this why a single user would request to view every single forum on my board within seconds of each other?</p>
<h3>Navigation Bar</h3>
<p>It turns out that this code is a bit of a legacy at this point. It was designed to provide information used by the Mozilla &#8220;navigation bar&#8221;, a feature that has been obsolete for a very long time. Or at least I thought it was, as it seems that it has come back at some point. I found <a href="https://addons.mozilla.org/en-US/firefox/addon/1949">this reference</a> to an add-on for Firefox that uses this information.</p>
<p>In theory this could be a good thing for webmasters. It allows them to set up chapters and titles and other information to provide more structure for their web site. But in the case of phpBB2 it doesn&#8217;t seem to work. They treat each forum as a chapter. If you have a lot of forums, then there are a lot of chapters. And if the add-on or plug-in attempts to pre-fetch the chapter information from each of those pages, well, that could cause the behavior that I have observed.</p>
<p>Even with that in mind, I wonder who thought it was a good idea to link to the entire member list as the &#8220;author&#8221; of the board content? <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
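<p>To get a feel for the multiplier, here is a Python sketch that counts the &#8220;chapter forum&#8221; links in a page head. Whether a given add-on really fetches every one of them is my theory, not something this code proves. The sample markup is a shortened, hypothetical version of the source shown earlier, stored entity-escaped and decoded before parsing.</p>

```python
import html
from html.parser import HTMLParser


class LinkRelCounter(HTMLParser):
    """Collect the href of every link tag, grouped by its rel value."""

    def __init__(self):
        super().__init__()
        self.hrefs_by_rel = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing link tags are routed here by HTMLParser as well.
        if tag != "link":
            return
        attr = dict(attrs)
        rel, href = attr.get("rel"), attr.get("href")
        if rel and href:
            self.hrefs_by_rel.setdefault(rel, []).append(href)


# Shortened sample head, decoded from entities before parsing.
SAMPLE_HEAD = html.unescape(
    '&lt;link rel="up" href="viewforum.php?f=99" /&gt;'
    '&lt;link rel="chapter forum" href="viewforum.php?f=1" /&gt;'
    '&lt;link rel="chapter forum" href="viewforum.php?f=63" /&gt;'
    '&lt;link rel="chapter forum" href="viewforum.php?f=2" /&gt;'
)

counter = LinkRelCounter()
counter.feed(SAMPLE_HEAD)
counter.close()
chapters = counter.hrefs_by_rel.get("chapter forum", [])
print(len(chapters), "viewforum.php URLs exposed as chapters on one page")
```

<p>On a board with dozens of forums, every single page view advertises dozens of viewforum.php URLs to anything that chooses to follow them.</p>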
<p>For what it is worth, these links are not included in phpBB3. They are no longer generated for my phpBB2 boards either. In the days since I have removed the code that generates the navigation bar information I have not seen a single instance of a server load over 3.0, much less 30 or even 300. Since I was not online viewing the server load for 24 hours at a time I also pulled out every instance of a call to viewforum.php from my apache logs and looked at the time stamps and other information. I did not find one occurrence of this behavior for the past 72 hours.</p>
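<p>The log check can be sketched roughly like this in Python. It assumes the standard Apache common/combined log layout; the sample lines, the IP addresses (taken from the documentation ranges), and the threshold of five requests per second are made up for the example and are not my actual logs or script.</p>

```python
import re
from collections import Counter

# client IP, two ignored fields, [timestamp], then the request line.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(?:GET|POST) (\S+)')


def burst_counts(lines, page="viewforum.php"):
    """Count requests for `page` per (client IP, timestamp second)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and page in match.group(3):
            # Apache timestamps have one-second resolution, so the raw
            # timestamp string works as a per-second bucket.
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Hypothetical log lines: one client hammering six forums in one second.
sample = [
    '203.0.113.7 - - [24/Mar/2008:10:15:01 +0000] '
    '"GET /viewforum.php?f={} HTTP/1.1" 200 5120'.format(forum_id)
    for forum_id in (1, 63, 2, 61, 4, 40)
]
sample.append('198.51.100.2 - - [24/Mar/2008:10:15:01 +0000] '
              '"GET /viewforum.php?f=7 HTTP/1.1" 200 5120')
sample.append('198.51.100.2 - - [24/Mar/2008:10:15:02 +0000] '
              '"GET /index.php HTTP/1.1" 200 2048')

suspects = sorted({ip for (ip, _), n in burst_counts(sample).items() if n >= 5})
print(suspects)
```

<p>An empty list over several days of logs is the kind of result I was looking for after removing the navigation links.</p>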
<p>I am now quite confident (perhaps not 100%, but very, very close) that I have discovered the root cause of this issue. Oh, and the issue that I was researching on viewtopic that caused me to view the source? I figured that out too, just in case you were worried. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<h3>Why Me?</h3>
<p>The next question I found myself asking was, &#8220;Why me?&#8221; Why hasn&#8217;t anyone else mentioned this specific issue at phpbb.com? Perhaps there are several answers. First, someone else may have reported or observed the issue without any idea of what was causing it, much like me at the beginning. So I&#8217;m not sure which search terms I would use to identify topics about this in the support forums. </p>
<p>Second, my board is quite large, both in terms of members and page views and &#8211; most importantly &#8211; forums. If you don&#8217;t have hundreds of people online at the same time you probably won&#8217;t see the issue. And even if you do have hundreds of people online at the same time, if you only have a few forums it&#8217;s still not going to be an issue.</p>
<p>I have almost 80 visible forums on this particular board at the time I write this. I generally have over 300 sessions active on my board at any given point in time and often have numbers that are much higher, as noted in this graph. This graph shows the highest number of users (total, guests, and members) online per hour over the past week on my largest board.</p>
<p><img src="/blog/images/hourly_online_users.png" width="640" height="566" alt="hourly online users" title="Hourly Online User Statistics" border="0" /></p>
<p>The darker part of the column indicates guest sessions and the lighter part indicates a member that has logged in. I track this information by capturing board statistics every ten minutes via a cron job. These numbers represent the maximum number of sessions for each hour for the last seven days.</p>
<p>Here is a similar graph showing average users instead of the maximum:</p>
<p><img src="/blog/images/hourly_average_users.png" width="640" height="566" alt="hourly average users" title="Hourly Online Average User Statistics" border="0" /></p>
<p>As you can see, over the course of the last seven days I have averaged over 150 users online per hour over the entire day. The board never sleeps. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
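<p>For anyone who wants to build similar graphs, here is a minimal sketch of how ten-minute samples could be rolled up into an hourly maximum and an hourly average. It is not the script I run on my board: the function name <code>hourly_stats</code>, the sample format, and the numbers are all made up for illustration.</p>

```python
from collections import defaultdict


def hourly_stats(samples):
    """Roll ten-minute captures up into per-hour figures.

    samples: iterable of (hour_label, session_count) pairs, one per capture.
    Returns {hour_label: (maximum, average)}.
    """
    by_hour = defaultdict(list)
    for hour, count in samples:
        by_hour[hour].append(count)
    return {
        hour: (max(counts), sum(counts) / len(counts))
        for hour, counts in by_hour.items()
    }


# Made-up numbers for illustration only; not data from the board.
samples = [
    ("14:00", 310), ("14:00", 295), ("14:00", 340),
    ("15:00", 280), ("15:00", 300),
]
stats = hourly_stats(samples)
print(stats["14:00"])
```

<p>The maximum feeds the first graph and the average feeds the second, which is why the two can look quite different for the same hour.</p>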
<h3>Conclusion</h3>
<p>The combination of active users + total forums + some browser plug-in that pre-fetches content spelled disaster for my server. I first started noticing the issue about eight months ago, and it might have been occurring before that. The issue became very obvious over the past several months. After stumbling on the navigation bar code and then removing it I have not seen the issue since. While not 100% conclusive, I feel like the results are enough of an indication that I wanted to share them, thus this post. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p><strong>Related Links</strong></p>
<ul>
<li><a href="http://www.phpbbdoctor.com/modsteps.php?m=78&#038;l=1">Remove Navigation Links</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.phpbbdoctor.com/blog/2008/03/24/running-a-phpbb2-board-get-rid-of-this-code-and-do-it-now/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Template Engine Analysis Coming Soon&#8230;</title>
		<link>http://www.phpbbdoctor.com/blog/2008/03/17/template-engine-analysis-coming-soon/</link>
		<comments>http://www.phpbbdoctor.com/blog/2008/03/17/template-engine-analysis-coming-soon/#comments</comments>
		<pubDate>Mon, 17 Mar 2008 14:15:31 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[Performance Tuning]]></category>
		<category><![CDATA[phpBB]]></category>

		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/2008/03/17/template-engine-analysis-coming-soon/</guid>
		<description><![CDATA[I have been spending my free time working on my other board, so I have not managed to get the analysis done yet for the various template statistics that I captured. What I plan to do is keep exactly 1,000 page refreshes for each template for each trial. I will do this by taking the [...]]]></description>
			<content:encoded><![CDATA[<p>I have been spending my free time working on my other board, so I have not managed to get the analysis done yet for the various template statistics that I captured. What I plan to do is keep exactly 1,000 page refreshes for each template for each trial. I will do this by taking the middle 1,000 results and tossing the best and worst times. This should help provide a better indication of the true performance of each engine.</p>
<p>I do, however, want to complete this before I relaunch the board I am working on as I will include whichever template engine performs the best with that upgrade. So hopefully I will be back to this in the next week or so. Thanks for your patience.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.phpbbdoctor.com/blog/2008/03/17/template-engine-analysis-coming-soon/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Testing Template Engines in phpBB2 Part III: Data Collection Complete</title>
		<link>http://www.phpbbdoctor.com/blog/2008/03/08/testing-template-engines-in-phpbb2-part-iii-data-collection-complete/</link>
		<comments>http://www.phpbbdoctor.com/blog/2008/03/08/testing-template-engines-in-phpbb2-part-iii-data-collection-complete/#comments</comments>
		<pubDate>Sat, 08 Mar 2008 07:19:55 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[Performance Tuning]]></category>
		<category><![CDATA[phpBB]]></category>

		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/2008/03/08/testing-template-engines-in-phpbb2-part-iii-data-collection-complete/</guid>
		<description><![CDATA[I have completed the data collection phase of the template testing. I have run 10 different scenarios all on my isolated development server. A couple of scenarios are going to be tossed from the final results due to bugs in code or collection anomalies. 
So here is what I have to evaluate:



Trial Number
Description
Rows


1
Single concurrent session2 [...]]]></description>
			<content:encoded><![CDATA[<p>I have completed the data collection phase of the template testing. I have run 10 different scenarios all on my isolated development server. A couple of scenarios are going to be tossed from the final results due to bugs in code or collection anomalies. </p>
<p>So here is what I have to evaluate:</p>
<p><span id="more-169"></span></p>
<table class="blogtable">
<tr>
<th>Trial Number</th>
<th>Description</th>
<th>Rows</th>
</tr>
<tr>
<td valign="top">1</td>
<td valign="top">Single concurrent session<br />2 second meta refresh<br />Template Engines Tested:
<ul>
<li>phpBB2 Standard</li>
<li>eXtreme Styles 2.4.0</li>
<li>Speedy Template prototype</li>
</ul>
</td>
<td valign="top">13,620</td>
</tr>
<tr class="alt">
<td valign="top">2</td>
<td valign="top">Single concurrent session<br />2 second meta refresh<br />Template Engines Tested:
<ul>
<li>phpBB2 Standard</li>
<li>eXtreme Styles 2.4.0</li>
<li>Speedy Template prototype</li>
<li>Categories Hierarchy</li>
</ul>
</td>
<td valign="top">19,052</td>
</tr>
<tr>
<td valign="top">3</td>
<td valign="top">12 concurrent sessions<br />5 second meta refresh<br />Template Engines Tested:
<ul>
<li>phpBB2 Standard</li>
<li>eXtreme Styles 2.4.0</li>
<li>Speedy Template prototype</li>
<li>Categories Hierarchy</li>
</ul>
</td>
<td valign="top">18,812</td>
</tr>
<tr class="alt">
<td valign="top">4</td>
<td valign="top">22 concurrent sessions<br />5 second meta refresh<br />Template Engines Tested:
<ul>
<li>phpBB2 Standard</li>
<li>eXtreme Styles 2.4.0</li>
<li>Speedy Template prototype</li>
<li>Categories Hierarchy</li>
</ul>
</td>
<td valign="top">19,308</td>
</tr>
<tr>
<td valign="top">5</td>
<td valign="top">Single concurrent session<br />5 second meta refresh<br />Template Engines Tested:
<ul>
<li>phpBB2 Standard</li>
<li>eXtreme Styles 2.4.0</li>
<li>Speedy Template prototype</li>
<li>Categories Hierarchy</li>
</ul>
</td>
<td valign="top">23,441</td>
</tr>
<tr class="alt">
<td valign="top">6</td>
<td valign="top">Bugged code invalidated results, ignored</td>
<td valign="top">N/A</td>
</tr>
<tr>
<td valign="top">7</td>
<td valign="top">Single concurrent session<br />2 second meta refresh<br />Template Engines Tested:
<ul>
<li>phpBB2 Standard</li>
<li>eXtreme Styles 2.4.0</li>
<li>Speedy Template 0.1.0</li>
<li>Categories Hierarchy</li>
</ul>
</td>
<td valign="top">16,709</td>
</tr>
<tr class="alt">
<td valign="top">8</td>
<td valign="top">12 concurrent sessions<br />5 second meta refresh<br />Template Engines Tested: Same as above</td>
<td valign="top">17,832</td>
</tr>
<tr>
<td valign="top">9</td>
<td valign="top">22 concurrent sessions<br />5 second meta refresh<br />Template Engines Tested: Same as above</td>
<td valign="top">19,241</td>
</tr>
<tr class="alt">
<td valign="top">10</td>
<td valign="top">35 concurrent sessions<br />7 second meta refresh<br />Template Engines Tested: Same as above</td>
<td valign="top">21,111</td>
</tr>
</table>
<p>Session 1 did not include all four template engines and will not be used for the final analysis. Sessions 2 through 5 used a prototype version of the Speedy Template code and will not be used for the final analysis. Session 6 included bugged code from the Speedy Template prototype and will not be used. </p>
<p>Sessions 7, 8, 9, and 10 are where I will focus my attention at this point. What I plan to do is take each template + page combination and remove the top &#8220;n&#8221; and bottom &#8220;n&#8221; rows from the collection in order to leave exactly 1,000 page refreshes for each. By eliminating the outlier records at both ends I hope to remove any advantage or disadvantage due to a brief spike in server load. With four template engines, four pages, and four data collection sessions, I should end up with exactly 64,000 rows (4 &#215; 4 &#215; 4 &#215; 1,000) of data to work with. That even fits nicely within the row limit of my older version of Microsoft Excel. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
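<p>The trimming step described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name <code>trim_to_middle</code> and the fake timings are hypothetical, and when the number of extra rows is odd the sketch drops one more row from the slow end than from the fast end.</p>

```python
def trim_to_middle(timings, keep=1000):
    """Return the middle `keep` values of `timings` after sorting.

    Drops roughly equal numbers of the fastest and slowest samples.
    """
    if len(timings) < keep:
        raise ValueError("not enough samples to keep %d rows" % keep)
    ordered = sorted(timings)
    # Rows to discard from the fast end; the rest come off the slow end.
    drop_low = (len(ordered) - keep) // 2
    return ordered[drop_low:drop_low + keep]


# Hypothetical sample: 1,200 fake page render times in seconds.
sample = [0.001 * i for i in range(1200)]
middle = trim_to_middle(sample)
print(len(middle))

# Four engines x four pages x four sessions x 1,000 rows each.
total_rows = 4 * 4 * 4 * 1000
print(total_rows)
```

<p>Sorting first means the discarded rows are always the extremes, no matter what order the page refreshes were recorded in.</p>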
]]></content:encoded>
			<wfw:commentRss>http://www.phpbbdoctor.com/blog/2008/03/08/testing-template-engines-in-phpbb2-part-iii-data-collection-complete/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

