<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Welcome to the phpBB Doctor Blog &#187; SQL Challenge</title>
	<atom:link href="http://www.phpbbdoctor.com/blog/category/sql-challenge/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.phpbbdoctor.com/blog</link>
	<description>Your premium source for custom modification services for phpBB</description>
	<lastBuildDate>Wed, 11 Jan 2012 21:30:50 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>SQL Challenge #2: Top-ten Members Analysis</title>
		<link>http://www.phpbbdoctor.com/blog/2007/09/28/sql-challenge-2-are-your-top-ten-members-still-active/</link>
		<comments>http://www.phpbbdoctor.com/blog/2007/09/28/sql-challenge-2-are-your-top-ten-members-still-active/#comments</comments>
		<pubDate>Fri, 28 Sep 2007 06:48:41 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[SQL Challenge]]></category>

		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/2007/08/23/sql-challenge-2-are-your-top-ten-members-still-active/</guid>
		<description><![CDATA[phpBB Doctor SQL Challenges
This is the second SQL Challenge from the phpBB Doctor. It is my hope that this series of posts will be interesting but educational as well. The concept is simple: I will post a challenge that must be solved with (unless stated otherwise) a single SQL statement. No php code, no &#8220;insert [...]]]></description>
			<content:encoded><![CDATA[<h3>phpBB Doctor SQL Challenges</h3>
<p>This is the second SQL Challenge from the phpBB Doctor. It is my hope that this series of posts will be both interesting and educational. The concept is simple: I will post a challenge that must be solved with (unless stated otherwise) a single SQL statement. No PHP code, no &#8220;insert into&#8230;&#8221; temporary tables, just one simple (or more likely not-so-simple!) SQL statement. There is often more than one way to solve the same problem, so I will be interested to see some of the solutions that are presented. Eventually I will post my original answer and optionally the best answers (judged by me) submitted to the challenge as well. Some of the challenges will use phpBB tables and data; others may use special data that I will create and make available for you to download.</p>
<p>What do you win? You win the knowledge that you solved a challenge. Is that enough for you? <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  </p>
<p><span id="more-137"></span></p>
<p>This one, in my opinion, is tough. In my solution I use several different SQL tricks that you may or may not have been exposed to. But give it a shot and see what you come up with. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_cool.gif' alt='8)' class='wp-smiley' /> </p>
<h3>The Setup</h3>
<p>For this challenge you should be able to use your own board. You need at least three months of posting data in order to attempt this challenge. The more data you have, the more interesting this query becomes, at least in my opinion. This challenge is based on a real-world question that I wanted to answer the other day. In many cases the &#8220;top ten&#8221; posters are in that position because they have remained active for a long time. My biggest board is now over five years old, and over time some of the top-ten posters have gradually been falling down (or off of) the list. So I thought it would be interesting to write a query that would identify the current top-ten users and then show me their average monthly post count for the last three months. That way I would get an indication of how many of my top-ten posters were still showing any level of activity.</p>
<h3>The Challenge</h3>
<p>You want to know the average number of posts per month made by your top-ten posters over the past three months. You need to determine the top-ten posters from your board inception date and then show the average number of posts each of them has made in the last three months. The technique that I used requires version 4.1 of MySQL at minimum.</p>
<p>You can assume that the value in user_posts is accurate and use it to determine your top-ten posters. In many cases the user_posts value is not 100% accurate, either because of pruning (which does not reduce the post count) or MODs (some board admins have forums that do not increment your post count). I won&#8217;t worry about that for this challenge; treat user_posts as the correct value.</p>
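<p>As a starting point, and only as a sketch of the first step rather than the full answer, the top-ten posters can be listed directly from user_posts. This assumes the default phpbb_ table prefix and the standard phpbb_users columns; adjust the names if your board differs.</p>
<pre>-- Top ten posters by lifetime post count (assumes the default phpbb_ prefix)
select	user_id
,	username
,	user_posts
from	phpbb_users
order by user_posts desc
limit	10;</pre>
<p>The remaining work is to combine this list with the last three months of posting data in the same statement.</p>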
<p>Here are the results that I got, with the names obscured for privacy:</p>
<pre>+--------------+-----------------+
| username     | avg(post_count) |
+--------------+-----------------+
| 1afa4b18ad00 |        202.0000 |
| 3ec682b8d52e |         25.0000 |
| 586c0d4f087d |        120.3333 |
| 87e25d48d6a8 |          0.0000 |
| 97ec7e2e5296 |         13.6667 |
| a719c7a5e035 |        115.3333 |
| c1909711326d |        119.6667 |
| cac58b5234e1 |        154.0000 |
| f6bc46b581ba |          0.0000 |
| f8b21c8ff704 |        179.0000 |
+--------------+-----------------+
</pre>
<p>As you can see, I got my answer. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  Two of my current top-ten users have not done a thing (average of 0.0 posts) in the last three months. If you are curious, the username is obscured by running it through the md5() function and then chopping off some of the characters. Ultimately this query did not tell me anything that I had not already been able to derive simply by looking at the data. But it does use some interesting SQL techniques.</p>
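<p>A minimal sketch of that kind of masking, assuming the first 12 characters of the hash are kept (which matches the width of the values above, although the exact expression is not given here) and the default phpbb_ table prefix:</p>
<pre>-- md5() returns a 32-character hex string; left() keeps the first 12
select	left(md5(username), 12) as username
from	phpbb_users
limit	10;</pre>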
]]></content:encoded>
			<wfw:commentRss>http://www.phpbbdoctor.com/blog/2007/09/28/sql-challenge-2-are-your-top-ten-members-still-active/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SQL Challenge #1 Solution</title>
		<link>http://www.phpbbdoctor.com/blog/2007/09/25/sql-challenge-1-solution/</link>
		<comments>http://www.phpbbdoctor.com/blog/2007/09/25/sql-challenge-1-solution/#comments</comments>
		<pubDate>Tue, 25 Sep 2007 20:14:23 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[SQL Challenge]]></category>

		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/2007/09/25/sql-challenge-1-solution/</guid>
		<description><![CDATA[In the first challenge in this series I presented some data that I had put together to analyze the posting history of users on my largest board. Rather than post the exact SQL used I thought it would be fun to leave it as an exercise for my blog readers. Today I&#8217;ll reveal my solution.

The [...]]]></description>
			<content:encoded><![CDATA[<p>In the <a href="http://www.phpbbdoctor.com/blog/2007/08/28/sql-challenge-1-frequent-poster-analysis/">first challenge</a> in this series I presented some data that I had put together to analyze the posting history of users on my largest board. Rather than post the exact SQL used I thought it would be fun to leave it as an exercise for my blog readers. Today I&#8217;ll reveal my solution.</p>
<p><span id="more-152"></span></p>
<h3>The Data</h3>
<p>Here&#8217;s what the data looked like:</p>
<pre>+-------------+----------+----------+----------+----------+----------+----------+
| total_posts | w1_posts | w2_posts | w3_posts | w4_posts | w5_posts | w6_posts |
+-------------+----------+----------+----------+----------+----------+----------+
|          56 |        3 |        4 |        8 |        9 |       16 |       16 |
|         589 |       39 |       53 |       89 |      106 |      152 |      150 |
|           7 |        1 |        0 |        0 |        0 |        3 |        3 |
|           2 |        0 |        1 |        0 |        0 |        1 |        0 |
|          10 |        0 |        1 |        3 |        3 |        1 |        2 |
|         266 |       36 |       37 |       61 |       43 |       34 |       55 |
|          14 |        1 |        1 |        3 |        0 |        1 |        8 |
|           5 |        0 |        0 |        0 |        0 |        0 |        5 |
|          15 |        2 |        3 |        2 |        1 |        3 |        4 |
|         236 |       58 |       53 |       49 |       31 |       19 |       26 |
|           5 |        0 |        2 |        1 |        0 |        2 |        0 |
|          46 |       27 |        0 |        2 |        9 |        8 |        0 |
|           3 |        2 |        0 |        0 |        0 |        0 |        1 |
|           4 |        2 |        2 |        0 |        0 |        0 |        0 |
+-------------+----------+----------+----------+----------+----------+----------+</pre>
<p>Here was the desired output:</p>
<pre>+------+----------+
| wk   | count(*) |
+------+----------+
|    1 |     1139 |
|    2 |      407 |
|    3 |      198 |
|    4 |       85 |
|    5 |       64 |
|    6 |       46 |
+------+----------+</pre>
<p>This challenge is really quite simple, as long as you know the proper function to use.</p>
<h3>I Saw The Sign&#8230;</h3>
<p>One of the fairly standard functions that I use is sign(). I say &#8220;fairly standard&#8221; because it is available on most of the databases I have worked with. It&#8217;s a very simple function but that doesn&#8217;t stop it from being quite versatile at the same time. The definition of sign() is as follows:</p>
<pre>sign(X) = 1 if X &gt; 0
sign(X) = 0 if X = 0
sign(X) = -1 if X &lt; 0
sign(X) is NULL if X is NULL</pre>
<p>So X can be any of an infinite number of values from minus infinity to plus infinity, and the sign() function will return a finite number of values. How does this help?</p>
<p>In this challenge all I want to know is whether someone posted. The weekly columns are unsigned, so a post count cannot be negative, and I am treating them as never being null. So the result of a sign() function will be exactly zero or one. Hmmm, binary. I like binary. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
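<p>A quick sanity check of those rules, which can be run from the mysql prompt without any tables:</p>
<pre>-- returns -1, 0, 1 and NULL respectively
select	sign(-5)
,	sign(0)
,	sign(42)
,	sign(null);</pre>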
<h3>The Solution</h3>
<p>So here is my solution that generates the output as shown above from the data I provided:</p>
<pre>select	sign(w1_posts)+sign(w2_posts)+sign(w3_posts)+sign(w4_posts)+sign(w5_posts)+sign(w6_posts) as wk
,	count(*)
from	sql_challenge_01
group by 1</pre>
<p>What the sign() function does in this case is reduce any positive post count to one and leave any zero post count at zero. Summing those values gives the number of weeks in which a user posted. To find out how many users fall into each of those groups, I include count(*) as the second column in my query and group by the first.</p>
<p>Let me take the first three rows of the data and show how this works:</p>
<pre>+-------------+----------+----------+----------+----------+----------+----------+
| total_posts | w1_posts | w2_posts | w3_posts | w4_posts | w5_posts | w6_posts |
+-------------+----------+----------+----------+----------+----------+----------+
|          56 |        3 |        4 |        8 |        9 |       16 |       16 |
|         589 |       39 |       53 |       89 |      106 |      152 |      150 |
|           7 |        1 |        0 |        0 |        0 |        3 |        3 |</pre>
<p>Each of the first two rows has a positive value in every weekly column, so the sign() function reduces each value to 1. The net result is 1+1+1+1+1+1, or 6. That means each of those users posted in every one of the last six weeks. The third row has 3 positive values and 3 zero values. The net is 1+0+0+0+1+1, or 3. That user has posted in three of the last six weeks. Simple, eh?</p>
<p>Note that I don't care which weeks the user has posted in, only how many weeks had at least one post. The column created in the solution SQL above can range from 0 to 6; the output starts at 1 because every row in this table has at least one week with posts. The "group by" clause pulls rows with the same value together and the count(*) counts the number of rows in each group.</p>
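<p>For comparison, here is a sketch of the same idea without sign(). In MySQL a comparison such as w1_posts &gt; 0 evaluates to 1 when true and 0 when false, so the comparisons can be added up directly. The coalesce() calls are an extra precaution that is not part of the solution above: they treat a null weekly count as zero, since the weekly columns in the challenge table are declared nullable.</p>
<pre>-- Count posting weeks per row using boolean arithmetic (MySQL-specific)
select	(coalesce(w1_posts, 0) &gt; 0)
+	(coalesce(w2_posts, 0) &gt; 0)
+	(coalesce(w3_posts, 0) &gt; 0)
+	(coalesce(w4_posts, 0) &gt; 0)
+	(coalesce(w5_posts, 0) &gt; 0)
+	(coalesce(w6_posts, 0) &gt; 0) as wk
,	count(*)
from	sql_challenge_01
group by 1;</pre>
<p>On data with no null values this should group the rows exactly as the sign() version does. It is less portable, though, because not every database lets you add comparison results together.</p>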
<h3>Another Sign</h3>
<p>I mentioned that I use this function frequently, so I thought I would share another use of it. I want to run one query that will show me how many users I have, how many <strong>activated</strong> users I have, and how many <strong>posting</strong> users I have. A user count is simple. The user_active field is already binary, so I can simply sum() that field to count activated users. The user_posts field can hold numbers from zero to infinity (it should not have negative numbers, but it can, and I ignore that in this solution). In order to count posting users, I need to reduce that infinite number of values down to something binary. Like this:</p>
<pre>select	count(user_id) as total_users
,	sum(user_active) as active_users
,	sum(sign(user_posts)) as posting_users
from 	phpbb_users;</pre>
<p>There are other ways this could have been done. But I believe this is probably the easiest.</p>
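<p>One of those other ways, sketched here for comparison, uses a case expression instead of sign(). It has the side effect of ignoring any negative user_posts values rather than subtracting them from the total.</p>
<pre>select	count(user_id) as total_users
,	sum(user_active) as active_users
,	sum(case when user_posts &gt; 0 then 1 else 0 end) as posting_users
from 	phpbb_users;</pre>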
<p>Did anyone else get something different? If so, please share. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  SQL Challenge #2 is much harder than this one, and will come out in the next few days. The rules are the same, solve the challenge using a single query, no php code is allowed. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_cool.gif' alt='8)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.phpbbdoctor.com/blog/2007/09/25/sql-challenge-1-solution/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SQL Challenge #1: Frequent Poster Analysis</title>
		<link>http://www.phpbbdoctor.com/blog/2007/08/28/sql-challenge-1-frequent-poster-analysis/</link>
		<comments>http://www.phpbbdoctor.com/blog/2007/08/28/sql-challenge-1-frequent-poster-analysis/#comments</comments>
		<pubDate>Tue, 28 Aug 2007 11:05:01 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[SQL Challenge]]></category>

		<guid isPermaLink="false">http://www.phpbbdoctor.com/blog/2007/08/26/sql-challenge-1-frequent-poster-analysis/</guid>
		<description><![CDATA[I have no idea if this will be interesting or not, but I want to try it for a while and see what happens. I am going to start a series of SQL Challenge posts. The first one is based on a query that I wrote during one of my posts in my &#8220;Advertising on [...]]]></description>
			<content:encoded><![CDATA[<p>I have no idea if this will be interesting or not, but I want to try it for a while and see what happens. I am going to start a series of SQL Challenge posts. The first one is based on a query that I wrote during one of my posts in my <a href="http://www.phpbbdoctor.com/blog/category/phpbb/advertising/">&#8220;Advertising on Forums&#8221;</a> series. Since it was based on a table that is not part of phpBB I have provided a download link for a script that will create a simple version of the table for you to put into your database and play around with, if you are so inclined.</p>
<p><span id="more-134"></span></p>
<h3>The Setup</h3>
<p>Here is the table:</p>
<pre>mysql> desc sql_challenge_01;
+-------------+-----------------------+------+-----+---------+-------+
| Field       | Type                  | Null | Key | Default | Extra |
+-------------+-----------------------+------+-----+---------+-------+
| total_posts | int(11)               |      |     | 0       |       |
| w1_posts    | mediumint(8) unsigned | YES  |     | NULL    |       |
| w2_posts    | mediumint(8) unsigned | YES  |     | NULL    |       |
| w3_posts    | mediumint(8) unsigned | YES  |     | NULL    |       |
| w4_posts    | mediumint(8) unsigned | YES  |     | NULL    |       |
| w5_posts    | mediumint(8) unsigned | YES  |     | NULL    |       |
| w6_posts    | mediumint(8) unsigned | YES  |     | NULL    |       |
+-------------+-----------------------+------+-----+---------+-------+</pre>
<p>My original work table included the user_id but I dropped that from the challenge version of the table since you won&#8217;t need to join to anything. Here are some sample rows from the table so you can get a feel for what it looks like:</p>
<pre>+-------------+----------+----------+----------+----------+----------+----------+
| total_posts | w1_posts | w2_posts | w3_posts | w4_posts | w5_posts | w6_posts |
+-------------+----------+----------+----------+----------+----------+----------+
|          56 |        3 |        4 |        8 |        9 |       16 |       16 |
|         589 |       39 |       53 |       89 |      106 |      152 |      150 |
|           7 |        1 |        0 |        0 |        0 |        3 |        3 |
|           2 |        0 |        1 |        0 |        0 |        1 |        0 |
|          10 |        0 |        1 |        3 |        3 |        1 |        2 |
|         266 |       36 |       37 |       61 |       43 |       34 |       55 |
|          14 |        1 |        1 |        3 |        0 |        1 |        8 |
|           5 |        0 |        0 |        0 |        0 |        0 |        5 |
|          15 |        2 |        3 |        2 |        1 |        3 |        4 |
|         236 |       58 |       53 |       49 |       31 |       19 |       26 |
|           5 |        0 |        2 |        1 |        0 |        2 |        0 |
|          46 |       27 |        0 |        2 |        9 |        8 |        0 |
|           3 |        2 |        0 |        0 |        0 |        0 |        1 |
|           4 |        2 |        2 |        0 |        0 |        0 |        0 |
+-------------+----------+----------+----------+----------+----------+----------+</pre>
<p>The various columns w1_posts, w2_posts, and so on contain the number of posts that a user made during that week. I built this table to help analyze how frequently users were visiting my board. In order to do that, I needed to know how many users had visited and posted in every one of the past six weeks, how many had posted in five of those weeks (any five), and so on.</p>
<h3>The Challenge</h3>
<p>Your challenge, should you decide to accept it, is to write one SQL statement (not php!) that returns a count of how many rows in this table represent users that posted in all six weeks, in any five weeks, in any four weeks, and so on. If you look carefully at the data, any single column could be zero. Checking for rows that include posts in all six weeks is easy. Checking for rows that include posts in five out of the six gets a bit more challenging, as any one of the six weeks could be zero. And so on from there.</p>
<p>When I say &#8220;in any five&#8221; I mean exactly five weeks. Anyone that posted in six weeks also posted in five, but that&#8217;s not where I went with this. &#8220;Any five&#8221; means just that.</p>
<p>Here is the expected output:</p>
<pre>+------+----------+
| wk   | count(*) |
+------+----------+
|    1 |     1139 |
|    2 |      407 |
|    3 |      198 |
|    4 |       85 |
|    5 |       64 |
|    6 |       46 |
+------+----------+</pre>
<p>So this tells me that 64 rows in the sample data had non-zero values in five out of the six weekly columns. 407 rows had non-zero values in 2 out of the six. And so on. The expected value for &#8220;wk&#8221; based on the sample rows shown above would be this:</p>
<pre>+-------------+----------+----------+----------+----------+----------+----------+------+
| total_posts | w1_posts | w2_posts | w3_posts | w4_posts | w5_posts | w6_posts | wk   |
+-------------+----------+----------+----------+----------+----------+----------+------+
|          56 |        3 |        4 |        8 |        9 |       16 |       16 |    6 |
|         589 |       39 |       53 |       89 |      106 |      152 |      150 |    6 |
|           7 |        1 |        0 |        0 |        0 |        3 |        3 |    3 |
|           2 |        0 |        1 |        0 |        0 |        1 |        0 |    2 |
|          10 |        0 |        1 |        3 |        3 |        1 |        2 |    5 |
|         266 |       36 |       37 |       61 |       43 |       34 |       55 |    6 |
|          14 |        1 |        1 |        3 |        0 |        1 |        8 |    5 |
|           5 |        0 |        0 |        0 |        0 |        0 |        5 |    1 |
|          15 |        2 |        3 |        2 |        1 |        3 |        4 |    6 |
|         236 |       58 |       53 |       49 |       31 |       19 |       26 |    6 |
|           5 |        0 |        2 |        1 |        0 |        2 |        0 |    3 |
|          46 |       27 |        0 |        2 |        9 |        8 |        0 |    4 |
|           3 |        2 |        0 |        0 |        0 |        0 |        1 |    2 |
|           4 |        2 |        2 |        0 |        0 |        0 |        0 |    2 |
+-------------+----------+----------+----------+----------+----------+----------+------+</pre>
<p>So, there you go. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  The first SQL Challenge from the phpBB Doctor. What do you win? You win the knowledge that you solved a puzzle. Is that enough for you? <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  It is my hope that this series of posts will be both interesting and educational. There is often more than one way to solve the same problem, so I will be interested to see some of the solutions that are presented. Eventually I will edit the post and include my original answer and optionally the best answers (judged by me) submitted to the challenge as well.</p>
<h3>The Material</h3>
<p>This is a zip file that contains a SQL script that will create and populate the table used in this challenge. </p>
<p><a id="p135" href="http://www.phpbbdoctor.com/blog/wp-content/uploads/2007/08/sql_challenge_01.zip">sql_challenge_01.zip</a></p>
<p>Some of the future challenges will use the standard phpBB tables instead of requiring you to download and install something. I used this for the first challenge because I thought it presented an interesting challenge. <img src='http://www.phpbbdoctor.com/blog/wp-includes/images/smilies/icon_cool.gif' alt='8)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.phpbbdoctor.com/blog/2007/08/28/sql-challenge-1-frequent-poster-analysis/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

