<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Storing Post Revisions / Post Locking</title>
	<atom:link href="http://www.phpbbdoctor.com/blog/2009/11/02/storing-post-revisions-post-locking/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.phpbbdoctor.com/blog/2009/11/02/storing-post-revisions-post-locking/</link>
	<description>Your premium source for custom modification services for phpBB</description>
	<lastBuildDate>Wed, 11 Jan 2012 20:39:04 -0600</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Dog Cow</title>
		<link>http://www.phpbbdoctor.com/blog/2009/11/02/storing-post-revisions-post-locking/comment-page-1/#comment-3099</link>
		<dc:creator>Dog Cow</dc:creator>
		<pubDate>Wed, 18 Nov 2009 19:46:54 +0000</pubDate>
		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/?p=338#comment-3099</guid>
		<description>Fair enough. :-)</description>
		<content:encoded><![CDATA[<p>Fair enough. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dave Rathbun</title>
		<link>http://www.phpbbdoctor.com/blog/2009/11/02/storing-post-revisions-post-locking/comment-page-1/#comment-3097</link>
		<dc:creator>Dave Rathbun</dc:creator>
		<pubDate>Wed, 18 Nov 2009 02:52:32 +0000</pubDate>
		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/?p=338#comment-3097</guid>
		<description>Because it&#039;s convenient.

This is all a big experiment, and I may come around to your way of thinking eventually, but right now it makes it very simple to deal with.</description>
		<content:encoded><![CDATA[<p>Because it&#8217;s convenient.</p>
<p>This is all a big experiment, and I may come around to your way of thinking eventually, but right now it makes it very simple to deal with.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dog Cow</title>
		<link>http://www.phpbbdoctor.com/blog/2009/11/02/storing-post-revisions-post-locking/comment-page-1/#comment-3096</link>
		<dc:creator>Dog Cow</dc:creator>
		<pubDate>Tue, 17 Nov 2009 16:24:19 +0000</pubDate>
		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/?p=338#comment-3096</guid>
		<description>Why have that data stored there if it is so infrequently accessed?

Since you could be storing it elsewhere, that is....</description>
		<content:encoded><![CDATA[<p>Why have that data stored there if it is so infrequently accessed?</p>
<p>Since you could be storing it elsewhere, that is&#8230;.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dave Rathbun</title>
		<link>http://www.phpbbdoctor.com/blog/2009/11/02/storing-post-revisions-post-locking/comment-page-1/#comment-3094</link>
		<dc:creator>Dave Rathbun</dc:creator>
		<pubDate>Tue, 17 Nov 2009 01:10:10 +0000</pubDate>
		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/?p=338#comment-3094</guid>
		<description>I read it again and I&#039;m not sure what the &quot;second question&quot; is. Can you humor me and clarify it? :-?</description>
		<content:encoded><![CDATA[<p>I read it again and I&#8217;m not sure what the &#8220;second question&#8221; is. Can you humor me and clarify it? <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_confused.gif' alt=':-?' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dog Cow</title>
		<link>http://www.phpbbdoctor.com/blog/2009/11/02/storing-post-revisions-post-locking/comment-page-1/#comment-3088</link>
		<dc:creator>Dog Cow</dc:creator>
		<pubDate>Wed, 04 Nov 2009 17:16:33 +0000</pubDate>
		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/?p=338#comment-3088</guid>
		<description>Ok, so what about my second question?</description>
		<content:encoded><![CDATA[<p>Ok, so what about my second question?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dave Rathbun</title>
		<link>http://www.phpbbdoctor.com/blog/2009/11/02/storing-post-revisions-post-locking/comment-page-1/#comment-3087</link>
		<dc:creator>Dave Rathbun</dc:creator>
		<pubDate>Wed, 04 Nov 2009 02:55:51 +0000</pubDate>
		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/?p=338#comment-3087</guid>
		<description>Bigger table != performance problem :) With a unique key of post_id + post_version, only one row (or one block, depending on how the database works) is retrieved. It doesn&#039;t matter if it&#039;s 100 rows or 100M rows; a unique key read should be nearly instant. Yes, a unique key index on a larger table is going to be larger than a unique key index on a smaller table, but it&#039;s not a linear performance hit. When I query a single post using the unique key on my post table with over 580,000 posts, the query time is not measurable. According to MySQL, everything comes back in 0.00 seconds. The query plan (from the EXPLAIN command) shows that the primary key index will be used, and because of the WHERE clause exactly one row will be returned. That&#039;s about as fast as it gets.</description>
		<content:encoded><![CDATA[<p>Bigger table != performance problem <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  With a unique key of post_id + post_version, only one row (or one block, depending on how the database works) is retrieved. It doesn&#8217;t matter if it&#8217;s 100 rows or 100M rows; a unique key read should be nearly instant. Yes, a unique key index on a larger table is going to be larger than a unique key index on a smaller table, but it&#8217;s not a linear performance hit. When I query a single post using the unique key on my post table with over 580,000 posts, the query time is not measurable. According to MySQL, everything comes back in 0.00 seconds. The query plan (from the EXPLAIN command) shows that the primary key index will be used, and because of the WHERE clause exactly one row will be returned. That&#8217;s about as fast as it gets.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dog Cow</title>
		<link>http://www.phpbbdoctor.com/blog/2009/11/02/storing-post-revisions-post-locking/comment-page-1/#comment-3086</link>
		<dc:creator>Dog Cow</dc:creator>
		<pubDate>Tue, 03 Nov 2009 16:44:31 +0000</pubDate>
		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/?p=338#comment-3086</guid>
		<description>There is certainly a performance hit every time the table grows. Why have that data stored there if it is so infrequently accessed?

I must admit, though, that I do like how every post action becomes an INSERT. Nice job on that one. :-)</description>
		<content:encoded><![CDATA[<p>There is certainly a performance hit every time the table grows. Why have that data stored there if it is so infrequently accessed?</p>
<p>I must admit, though, that I do like how every post action becomes an INSERT. Nice job on that one. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dave Rathbun</title>
		<link>http://www.phpbbdoctor.com/blog/2009/11/02/storing-post-revisions-post-locking/comment-page-1/#comment-3085</link>
		<dc:creator>Dave Rathbun</dc:creator>
		<pubDate>Tue, 03 Nov 2009 14:48:51 +0000</pubDate>
		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/?p=338#comment-3085</guid>
		<description>I had not read the topic; thanks for the link. As mentioned in the topic, performance is one concern. The way I have implemented it has no negative impact on performance and in fact may have a slight benefit. Inserts can be more efficient than updates since an insert does not have to find and lock an existing row. Since every post &quot;edit&quot; is in fact converted to an insert, there might be a benefit there. Note that I have not done any benchmarking to confirm this.

However, I don&#039;t like the idea of storing the data in a separate table. Only the post text is versioned; the post itself is not changed. Why not store it - in versioned form - in the posts text table? Since there is no performance impact from storing versions, and there are certainly advantages (less code to change, easier to retrieve and display the archived versions), I think it&#039;s a good choice.

Time will tell; I plan to roll out the collection portion of the code this week. Giving the moderators the ability to review / revert code is coming later.</description>
		<content:encoded><![CDATA[<p>I had not read the topic; thanks for the link. As mentioned in the topic, performance is one concern. The way I have implemented it has no negative impact on performance and in fact may have a slight benefit. Inserts can be more efficient than updates since an insert does not have to find and lock an existing row. Since every post &#8220;edit&#8221; is in fact converted to an insert, there might be a benefit there. Note that I have not done any benchmarking to confirm this.</p>
<p>However, I don&#8217;t like the idea of storing the data in a separate table. Only the post text is versioned; the post itself is not changed. Why not store it &#8211; in versioned form &#8211; in the posts text table? Since there is no performance impact from storing versions, and there are certainly advantages (less code to change, easier to retrieve and display the archived versions), I think it&#8217;s a good choice.</p>
<p>Time will tell; I plan to roll out the collection portion of the code this week. Giving the moderators the ability to review / revert code is coming later.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dog Cow</title>
		<link>http://www.phpbbdoctor.com/blog/2009/11/02/storing-post-revisions-post-locking/comment-page-1/#comment-3084</link>
		<dc:creator>Dog Cow</dc:creator>
		<pubDate>Mon, 02 Nov 2009 19:29:45 +0000</pubDate>
		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/?p=338#comment-3084</guid>
		<description>Secondly, I don&#039;t agree with how you have designed the versioning system. I think that upon every edit, you should be inserting a new row of data into a post_version table. This table would be denormalized and the data stored compressed as mentioned earlier. Some columns would be post_id, post_version, user_id, topic_id, post_date, post_text, etc...

Then you could make the primary key on post_id, post_version and query this table separately. Therefore, the posts table and posts_text table would store _only_ the most current post revision, and nothing else. You also would not need the additional field.</description>
		<content:encoded><![CDATA[<p>Secondly, I don&#8217;t agree with how you have designed the versioning system. I think that upon every edit, you should be inserting a new row of data into a post_version table. This table would be denormalized and the data stored compressed as mentioned earlier. Some columns would be post_id, post_version, user_id, topic_id, post_date, post_text, etc&#8230;</p>
<p>Then you could make the primary key on post_id, post_version and query this table separately. Therefore, the posts table and posts_text table would store _only_ the most current post revision, and nothing else. You also would not need the additional field.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dog Cow</title>
		<link>http://www.phpbbdoctor.com/blog/2009/11/02/storing-post-revisions-post-locking/comment-page-1/#comment-3083</link>
		<dc:creator>Dog Cow</dc:creator>
		<pubDate>Mon, 02 Nov 2009 19:26:19 +0000</pubDate>
		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/?p=338#comment-3083</guid>
		<description>I posted a topic to phpBB.com some months ago on this very subject, which you may have read: http://www.phpbb.com/community/viewtopic.php?f=64&amp;t=1573125

One of the problems is that current RDBMSs (that I know of) don&#039;t have a third dimension of time. They are just row/column-oriented. Thus, you&#039;d have to use a JOIN to another table. Having to deal with a JOIN just introduces another facet of complexity due to the inherent problem that JOINing two or more large tables presents. Quite frankly, when one is dealing with millions of rows and gigabytes of data, the only solution then becomes to remove all joins and query tables as key-value pairs; otherwise the memory and CPU involved become prohibitively expensive.

Now, one of the bonuses we have on our side is that of frequency of reference. It is just like having a collection of volumes in a library. One uses compact shelving to save space because these historical volumes are infrequently accessed. Likewise, we can store all versioned posts as compressed binary strings in a separate table, or even on a separate database server.</description>
		<content:encoded><![CDATA[<p>I posted a topic to phpBB.com some months ago on this very subject, which you may have read: <a href="http://www.phpbb.com/community/viewtopic.php?f=64&amp;t=1573125" rel="nofollow">http://www.phpbb.com/community/viewtopic.php?f=64&amp;t=1573125</a></p>
<p>One of the problems is that current RDBMSs (that I know of) don&#8217;t have a third dimension of time. They are just row/column-oriented. Thus, you&#8217;d have to use a JOIN to another table. Having to deal with a JOIN just introduces another facet of complexity due to the inherent problem that JOINing two or more large tables presents. Quite frankly, when one is dealing with millions of rows and gigabytes of data, the only solution then becomes to remove all joins and query tables as key-value pairs; otherwise the memory and CPU involved become prohibitively expensive.</p>
<p>Now, one of the bonuses we have on our side is that of frequency of reference. It is just like having a collection of volumes in a library. One uses compact shelving to save space because these historical volumes are infrequently accessed. Likewise, we can store all versioned posts as compressed binary strings in a separate table, or even on a separate database server.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

