Home

Your premium source for custom modification services for phpBB

  logo

HomeForumsBlogMOD ManagerFAQSearchRegisterLogin

Comments July 12, 2009

MySQL Bug Breaks Banner System

Filed under: Database Tips, MOD Writing, phpBB — Dave Rathbun @ 1:42 pm CommentsComments (2) 

One of the reasons I wasn’t around much earlier this year was I was in the process of moving a bunch of sites over to a new server (including this one). In most cases the move went without a hitch. In one particular case there was an interesting bug that didn’t show up right away. It was related to the banner system I wrote for my largest board. Fortunately it was an error on the “good” side, so I didn’t make any sponsors angry.

The banner system is fairly complex, but at the most basic level there is a cron (scheduled) job that periodically decrements the sponsor view balance. Once the view balance hits zero, the sponsor’s banners are deactivated until they pay for another round. The code is quite simple:

$sql = 'UPDATE  ' . SPONSORS_TABLE . '
	SET     view_balance =  view_balance - ' . $decrement_views . '
	WHERE   sponsor_id = ' . $sponsor_data[$i]['sponsor_id'];

The value for $decrement_views is assigned earlier in the loop. The definition for the view_balance column is an unsigned integer (mediumint specifically) so it will not allow negative values. On my old server this worked perfectly. If $decrement_views was greater than view_balance the sponsor view balance was set to zero. My old server was running MySQL 4.1.

My new server is running 5.0, and this same code did not work. Unfortunately it did not generate a syntax or other runtime error. Instead it did the math wrong.

Integer Storage in MySQL

Before I talk more about the bug I think I should talk about how computers store numbers. This is not specific to MySQL, it can affect any system that stores numeric values. When I store a number I have a choice of adding the unsigned attribute. In MySQL it takes the following format. The first will store a tiny integer without a sign, and the second will store a tiny integer with a sign.

create table dave (new_column tinyint unsigned);

create table dave (new_column tinyint);

What is the difference? When numbers are stored they take space. A tinyint column in MySQL can store values from -128 to 127. An unsigned tinyint can store values from 0 to 255. How does this work, and why are the numbers different in each case?

A tinyint is stored in one byte or eight bits of information. With eight bits I have a range of 0000 0000 to 1111 1111. With an unsigned value I can use all eight bits for my number, so 0000 0000 = 0 and 1111 1111 is 1 + 2 + 4 + 8 + 16 + 32 + 64 + 128, or – if you do the math – 255. That’s how an unsigned tiny integer column can store a value ranging from 0 to 255. If, however, I want to use a signed value, the first bit becomes an indication of whether the value is negative or not. That means I only have seven bits left to determine the value. 0111 1111 becomes 1 + 2 + 4 + 8 + 32 + 64 which is 127, or the maximum signed value that can be stored in a signed tiny integer field. What happens when the eighth bit gets flipped to a 1? That’s an indication that the number is negative instead of positive. So while both signed and unsigned values take the same amount of space, a signed value is one order of magnitude smaller because the most significant bit (the leading bit) is used to indicate the sign of the value that is stored.

Put another way: a signed tinyint has seven available bits and therefore can store 27-1 or 127 as the maximum value. An unsigned tinyint has eight available bits and therefore can store 28-1 or 255 as the maximum value. Suppose I am looking at a number in memory and the bit values are 1000 0001. What is the value represented by these bits?

The fact is I can’t make that determination until I know if the value is signed or not. If the value is unsigned, the number represented by 1000 0001 is 129. If it’s signed, it gets complicated :) but the value returned would be -127. Keep in mind that I am using tinyint example values here.

How Unsigned Math Broke My Sponsor System

In my banner system I don’t try to track the page views down to exactly zero. A sponsor will pay for two million page views at a time. If they actually use two million and twelve I am not going to complain about the few extra views. :) So my system is designed to allow each sponsor more views in the interest of good will but mostly for system performance. In my system, each sponsor’s view balance is only updated once an hour. Let me repeat the code that I showed above:

$sql = 'UPDATE  ' . SPONSORS_TABLE . '
	SET     view_balance =  view_balance - ' . $decrement_views . '
	WHERE   sponsor_id = ' . $sponsor_data[$i]['sponsor_id'];

First (not shown) I get a total of the banner views that have accumulated over the past hour and store them into the $decrement_views variable in my php script. Next I execute the SQL script shown above for each sponsor with an active banner. Suppose sponsor number 12 has 1000 views left and they used 300 in the last hour. The SQL code resolves to this:

UPDATE phpbb_sponsors
SET view_balance = 1000 - 300
WHERE sponsor_id = 12;

After this statement is executed the sponsor has a balance of 700 views left. Suppose the same sponsor has 100 views in their balance and they used 300 more during the last hour. The SQL ends up looking like:

UPDATE phpbb_sponsors
SET view_balance = 100 - 300
WHERE sponsor_id = 12;

When 300 is subtracted from 100 it results in a negative number. Under my old version of MySQL that number was set to zero since the column is defined as unsigned and is not capable of storing a negative value. This is what broke during the upgrade to MySQL 5.

MySQL Bug Explained

The newer version of MySQL did the math as signed, which allowed the value to go negative, and then stored the results in the unsigned field. You might start to see the problem now. :) Instead of setting the sponsor view balance to zero for any negative result, the sponsor view balance was set to the maximum value that it could possibly hold because of the overflow. In this case, the view balance got set to 16,777,215 instead of zero. What is significant about that number? Here is a quote from the MySQL web page where it details the values that can be stored for any particular numeric column type…

MEDIUMINT[(M)] [UNSIGNED] [ZEROFILL]
A medium-sized integer. The signed range is -8388608 to 8388607. The unsigned range is 0 to 16777215.

In a nutshell, the old version of MySQL caught the overflow exception and set the value to zero. The newer version of MySQL did not handle the overflow and instead let the signed value stored the unsigned negative value. I’m sure that would have made my sponsors happy, but it certainly wasn’t how things were intended to work.

Fixing The Problem

There is a good lesson to be learned here. I was lazy :) in the way I wrote my earlier code. Rather than check to see if the number of accumulated views was higher than the balance remaining and handling that exception with code, I relied on the fact that MySQL would not (should not) store a negative result in an unsigned field. When the database behavior changed (it has been recognized as a bug by MySQL according to my research) my system broke.

I have fixed the SQL by using a case statement so that this error will never occur for me again. Here is the revised SQL:

$sql = 'UPDATE  ' . SPONSORS_TABLE . '
	SET     view_balance =  case
			when ' . $decrement_views . ' > view_balance then 0
			else view_balance - ' . $decrement_views . '
			end
	WHERE   sponsor_id = ' . $sponsor_data[$i]['sponsor_id'];

This updated code uses a case statement structure to check to make sure that the remaining balance is larger than the value to be decremented. If it is not, the value is simply set to zero.

Finally, now that I’ve explained signed versus unsigned it makes the following cartoon from xkcd.com more meaningful, doesn’t it? ;)

Related Links

2 Comments »

  1. Wow, that is crazy.

    Is it really most efficient to use a case statement and have all the logic in the SQL statement?
    I would normally do the calculation in PHP before hand and pass the appropriate value to mySQL.

    Comment by Everett — July 13, 2009 @ 1:24 pm

  2. In this case it’s more efficient to do it with a case statement. The reason is, if I didn’t do a case statment I would first have to retrieve the current view balance in order to be able to do the math in php and then run the update. With a case statement I do the check for the exception (negative balance) and the update all in one database query. :)

    Comment by Dave Rathbun — July 13, 2009 @ 3:53 pm

RSS feed for comments on this post.

Leave a comment

Tags allowed in comments:
<a href="" title=""> <acronym title=""> <blockquote cite=""> <code> <strong> <em> <u> <sup> <sub> <strike>

Confirm submission by clicking only the marked checkbox:

             *

Powered by WordPress