Building a Better Board Banner System Part III: Banner Instances and Cron Jobs
I started this series of posts talking about various ways to manage the update / select process used to manage a sequential counter. The primary goal in rewriting my board banner system was to ensure that it was both accurate and efficient. I accomplished that task by using the LAST_INSERT_ID() function provided by MySQL as detailed in the prior post. That function would allow me to update a counter and record the resulting value as a session variable, and then retrieve that session variable value without being worried about updates from other pages.
The next design delima I had to overcome was: How was I going to store and manage the banner data? The obvious choice was to create a table called phpbb_banners and store everything there, but that would have required me to do a lot more work. Don’t get me wrong; I did create that table. But that’s not the table that drives the banner display logic. That table is called phpbb_banner_instances and I will show what it looks like and explain why I created it next.
Basic Design Parameters
Before I go much further it’s probably time for me to share some of the basic design parameters that I had for this system. I wanted to allow advertisers to set start and end dates for their banners. These would be date values only, meaning all banners would expire or start at midnight based on the server clock. I also wanted advertisers to be able to weight their banners as either 25%, 50%, 75%, or 100% of their allocated page views. This would allow a conservative advertiser to stretch their advertising budget by leaving 50% (as an example) of their allocated page views to someone else.
For example, if I had two advertisers and one weighted their banner at 100% and the other at 50% then the first banner would be displayed twice as often as the second. I had to be able to track that, and this is the primary reason why I decided that I needed a second table.
At this time I was only going to allow banners to appear at the top of the page. That meant that I didn’t have to worry about banner locations (top, side, bottom, after first post, and so on). I also didn’t have to worry about keeping a banner from showing up twice on the same page, which would have been a bit awkward.
My banner advertisements are currently sold strictly on a PPV or Pay Per View model. I do not do any PPC (Pay Per Click) banners. That restriction made the system much easier to build since every advertiser would have the same contract, so to speak.
With these parameters in mind let me move on to the specifics of the database design.
Banner System Table Design
Here is my banner table as currently in use on my board.
+-----------------+----------------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +-----------------+----------------------+------+-----+---------+----------------+ | banner_id | smallint(5) unsigned | | PRI | NULL | auto_increment | | sponsor_id | smallint(5) unsigned | | MUL | 0 | | | filename | varchar(32) | | | | | | filetype | char(3) | | | | | | active_flag | tinyint(1) unsigned | | | 1 | | | open_new_window | tinyint(1) unsigned | | | 1 | | | weighting | tinyint(2) unsigned | | | 4 | | | total_view_ctr | int(11) unsigned | | | 0 | | | total_click_ctr | int(11) unsigned | | | 0 | | | width | smallint(3) unsigned | | | 0 | | | height | smallint(3) unsigned | | | 0 | | | click_url | varchar(255) | | | | | | start_date | int(11) unsigned | | | 0 | | | end_date | int(11) unsigned | | | 0 | | | insert_date | int(11) unsigned | | | 0 | | | update_date | int(11) unsigned | | | 0 | | | alt_text | varchar(64) | YES | | NULL | | +-----------------+----------------------+------+-----+---------+----------------+
I would think that the purpose for most of these columns would be fairly obvious. The banner_id is the unique identifier for the banner row. The sponsor_id is the foreign key to the sponsors table (the owner of the banner). The filename and filetype are used to reference the name and type of file (gif, jpg, png, swf, and so on) for this banner. The active_flag lets a sponsor mark a banner as being “out of circulation” so that it isn’t going to be displayed. The weighting is a value that allows the sponsor to determine how to allocate their allotment of daily page views and I will talk more about it in a moment. And… well, I think the rest of the fields are fairly self evident. If they’re not, post a question using the comment field and I will explain.
In summary, this table contains the base data for the banner. But it does not drive how the banner is displayed. That role is reserved for a table called phpbb_banner_instances which looks like this:
+-------------------+-----------------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------------------+-----------------------+------+-----+---------+-------+ | instance_id | smallint(5) unsigned | | PRI | 0 | | | banner_id | smallint(5) unsigned | | MUL | 0 | | | instance_view_ctr | mediumint(8) unsigned | | | 0 | | +-------------------+-----------------------+------+-----+---------+-------+
This table contains a primary key (instance_id), a foreign key to the banners table (banner_id, of course), and a view counter. Note that there is no click counter in this table? I will explain that more in a bit.
Picking the Next Instance to Display
The reason for this second table is to give me an effective way to manage the weighting. In my prior posts on this topic I mentioned having a set of banners to display. I wanted to be sure that I ran through the entire set without skipping anything. This table provides that set of banners, and allows me to set up a weighting to that a single banner can appear more than once in the rotation. The weight of the banner directly impacts the number of rows in the instance table, and makes that process extremely simple to manage.
But simplicity is not the only advantage, there is a performance benefit as well. By having the banner appear more than once I spread out the update statements. It is entirely possible that two people are going to be viewing the same banner at the same time. But if they are viewing separate instances of the same banner, their update statements will be hitting different rows in the table. Different updates means less contention (locking) and faster performance. This is why there is an instance_view_ctr field on this table. Each time an instance of a banner is viewed that counter is incremented.
There is no instance_click_ctr field. The odds of two (or more) people clicking the same banner at the same time are much smaller than the odds of viewing the same banner. For that reason I put the click counter on the banner table itself.
Caching The Instances
There is a bit of logic that goes into determining which banners are to be displayed. First, the banner has to belong to an advertiser in good standing, meaning they have to have paid their bills.
Second, the banner has to have been marked as “active” in order to be in the rotation. Third, the weighting needs to be considered as far as how many times the banner is to be displayed. Fourth, the start date and end date values need to be checked. Fifth, I have to make sure that the advertiser hasn’t run through their remaining page view balance. That’s a lot of work to be done on every page view!
So I don’t.
I don’t do it on every page view, that is. I have a cron job set up to manage this process. Once I have determined which banners are to be displayed they’re inserted into the instance table and at the same time a cache file is written to disk. The cache is used “as is” for at least one hour. Each hour a cron job wakes up and determines if any of the sponsors have run out of views, and if so, removes their banners from the rotation. At midnight there is extra processing that checks the start and end dates. (Remember at the top of this post I mentioned that all scheduled banners start and stop at midnight.)
Since the count information is handled live in the database (I don’t need to update the cache for that) the cache file is present while the updates are taking place. That means I can truncate the instances table and rebuild it without impacting the banners being displayed on the screen. Only after the table is repopulated successfully is the cache updated. So the cache is not only providing a performance boost; it’s also providing redundancy as well.
There is a potential bit of exposure… if an advertiser runs out of views immediately after the one hour cron job has completed, they will see their banners displayed for the rest of the hour, which will push them into a negative balance. But how much exposure is that for me, really? I currently have 46 banner instances sharing 100,000 page views a day. That means that over the course of a day each full-weight banner will see about 8,500 page views. That works out to about 350 per hour. Given that the minimum purchase I allow is 2,000,000 page views, having them run 3-500 into the negative area is not a problem. In fact, I think I will just call it good customer relations.
Conclusion
There is a lot more to this system than is covered by these two tables. I have a table to store the advertiser information. I have a table to store advertiser contacts and their role within the company. I have tables to store invoice information and more to store payment data. I have a table that stores various report definitions, and a subscription table so that advertisers can set up a series of scheduled reports that will be delivered to their email address.
The core features of the system that make it accurate and efficient are the banner_instances table and its associated cache, the cron job, and the MySQL LAST_INSERT_ID() function. Without those elements the system would look completely different. I doubt that it would perform as well. The system as detailed in this series of blog posts has been running without issues since March, 2008.
At this time I don’t have any intention to release this system as a MOD. However, I believe I have provided enough ideas and insight into how it was built that someone could provide such a MOD based on the information provided in these posts.


Thanks for the all the ideas so far. I’m using an off-the-shelf phpBB 2 mod (which, btw, should not be used without some changes, since it has big time security flaws!).
But if I ever get “serious” about advertising, I’ll make sure to refer to your docs when building the system.
Comment by Dog Cow — September 10, 2008 @ 12:25 pm