Continuous Integration Matters
Recently I’ve been spending quite a bit of time trying to work out the best way to help my team at work meet several impending deadlines for new projects. In doing so I have once again started looking at some agile practices that have largely been tossed to the sidelines. Two of the more important practices that I am preaching again are the old standbys of code standards and unit testing. While many people see these as barriers to progress, viewing them as yet more things they need to do, my hope is to give them the tools to see the truth: that these practices exist to empower developers.
One of the core issues that I see day in and day out is that testing almost always takes 2-3 times longer than originally expected. This is by no means the tester’s fault though. It has been shown over and over again that a waterfall-style approach to development is one of the least efficient ways of producing software. The reason is that when you put testing at the end of the process, you force everything to grind to a halt as you backtrack to fix bugs.
In fact, unit-tests are designed to help prevent this. Now while I won’t go into the specifics on what unit-tests are or why they are good, as that has been covered by many wiser men than myself, I will say that holding developers accountable for writing them is a key component to making this ecosystem work.
Enter Continuous Integration
This raises the question: what exactly is continuous integration? To quote the venerable Martin Fowler:
Continuous Integration is a software development practice where members of a team integrate their work frequently, usually each person integrates at least daily - leading to multiple integrations per day. Each integration is verified by an automated build (including test) to detect integration errors as quickly as possible. Many teams find that this approach leads to significantly reduced integration problems and allows a team to develop cohesive software more rapidly.
More specifically, it means automating an incremental build system that encourages developers to find and fix bugs as they create them. This is, after all, the moment when they are in the zone, working with that code, already immersed in the problem and most likely to be able to react quickly. Including some standards checks that warn developers when they are about to commit needlessly complex code can prompt them to refactor a small bit before continuing.
The important part of this whole process is that the developer be made aware of the issue as close to immediately as possible. If you’re in a month-long project and major testing doesn’t happen until the fourth week, then bugs uncovered in that fourth week are much more likely to push out the deadline. This is simply because it is harder to go back to code that you wrote weeks ago than it is to fix it immediately after writing it.
That is not to say that you will find every problem up front, but I can say from experience that whenever I spend the time to write unit tests and verify those tests before I push code, testing time is almost always minuscule in comparison to when I do not.
A recent example comes to mind. I was writing a small rules engine that would allow a CMS to control, based on some simple conditions and dynamic input, whether or not a small bit of code would show up in the page. During this process, I did not do my due diligence and write my unit tests as needed. Because of this, some simple rules went back and forth with the testers for 3 days.
Finally, realizing my error, I stopped, wrote out a set of 20 test cases for how I expected the rules to behave, ran them and found five failures. I spent 20 minutes working them out and finally got my code stable. At this point, not only was I confident that the code would work, but it went through the QA process with flying colors. The moral of the story: had I done the right thing up front and spent those 20 minutes working out the test cases, I would have saved 3 days of bugs bouncing back and forth between me and the testers.
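To give a feel for what those test cases can look like, here is a minimal sketch in Python. It is not the rules engine from the story: rule_matches() is a hypothetical stand-in that treats a rule as a dictionary of required values, and the field names are made up for illustration.

```python
import unittest


def rule_matches(rule, page_input):
    """Return True when every condition in the rule equals the page input.

    Hypothetical stand-in for a rules engine: a rule is a dict of required
    key/value pairs and page_input is the dynamic input for the page.
    """
    return all(page_input.get(key) == value for key, value in rule.items())


class RuleMatchesTest(unittest.TestCase):
    def test_all_conditions_met(self):
        rule = {"section": "news", "logged_in": True}
        page_input = {"section": "news", "logged_in": True, "lang": "en"}
        self.assertTrue(rule_matches(rule, page_input))

    def test_one_condition_fails(self):
        rule = {"section": "news", "logged_in": True}
        page_input = {"section": "news", "logged_in": False}
        self.assertFalse(rule_matches(rule, page_input))

    def test_missing_input_key(self):
        self.assertFalse(rule_matches({"section": "news"}, {}))

    def test_empty_rule_always_matches(self):
        self.assertTrue(rule_matches({}, {"section": "sports"}))
```

Saved in a file of your choosing, these run with python -m unittest. Each test writes down one expectation about how a rule should behave, which is exactly the conversation that otherwise happens over several days with the testers.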
Continuous Integration is about keeping developers accountable for doing these things and verifying that the code they write doesn’t break someone else’s code before it makes it to the testing phase.
What does it take to start?
In general, you need three things to start using continuous integration in your development cycle.
The first is the simplest: version control. This can be anything from Git to Subversion, all the way to CVS, the latter having fallen out of fashion for various reasons, though it is still in use. No matter what, every project and team should be using some sort of version control. It is your contingency plan if errors occur. It is how you find out what changed and who changed it.
Second, we have the CI server itself. This can be either easy or difficult, depending on your platform. .NET developers have Team Foundation Server available to them, which can be configured to provide this type of service automatically. The specific server is called Team Build, which will keep track of build successes and failures and help keep the developer informed of problems.
A more general solution for the rest of us developers is CruiseControl. This server has various plugins that allow it to work for Java, Ruby, Python, PHP and just about any other language out there. It contains reporting tools and very tight integration with Subversion. If you are in need of a solid, quick-start package, this is where I would start looking.
Lastly, you need someone to keep the developers motivated and encouraged to keep following these practices. It is very easy to “forget” to create a build process for the new application that you’re developing. It can definitely seem like more of an obstacle than an option if you are only looking at the immediate gains (though I argue confidence is a huge immediate gain). So it helps to have someone (or a group of someones) who keeps people excited and motivated to use these tools.
Actively show the successes that they have provided. Gather some metrics that show how many bugs were avoided in the testing phase because of the incremental builds, how testing time has decreased and how production bugs have gone down. All of these things are the benefits of failing early.
When should I start?
Right now!
Whether you are starting a new project or deep in the trenches of an old one, if nothing else, start writing unit tests. Taking a few hours to set up CruiseControl or some other build server, however, will be paid back in full very quickly. Once you have the server set up, make sure that whenever you create a new project, creating an automated build is on your list of first things to do, right alongside setting up your repository.
If you can have this set up and ready to go before you write your first line of code, then you are in a rock-solid position. Just remember, it is an active process. Continuous integration is a way to reinforce the good habits; it does not replace them. It is up to every developer to do his part.
Creating A Dynamic DNS Script With Slicehost
Note: The script used here is written in Python. I did not write this script. However, in the time-honored tradition of the Internet I am going to reproduce it here, as it was a little hard to find a solution via Google. The original script came from a Slicehost forum poster. I just want to make clear that I am not the author, just a fan of his work.
In all of my new-server craze, I’ve come to need a way to connect to it from wherever I am. Initially I thought that the best solution for this was to just use the DynDNS.com free service. It does exactly what I need with three exceptions.
- I have to use one of their canned domain names.
- After 1 month of inactivity, they will cancel your account.
- My current AirPort Extreme router does not support dyndns.com updating like my old LinkSys one did.
Now, most of these are either easily overcome or downright nit-picky. #2 and #3 could be easily overcome with a script that updates the IP address periodically. However, if I’m going to need a script for this, I might as well go to the next step and use a solution that lets me take care of #1 as well. Enter SliceHost.
I’ve been a very happy customer of SliceHost for a little over two years at this point. I started with a Gentoo slice and am currently rocking an Ubuntu install, simply so there is a little more fire-and-forget happening. I recently started using their wonderful DNS management system to add a post.drewbutler.name record that points over at the Posterous servers. So I figured, why not do the same thing but point it at my home server?
Setting Up A Type A Record On A Zone
In network terminology, your DNS is split into zones and records. The zone is the main domain name itself (drewbutler.name.) and the record is essentially the sub-domain (home). Now, I am going to say that this is an extreme simplification, however for our purposes, it is relatively accurate. For more in-depth information, please see the Wikipedia article related to DNS, zones and records.
I will assume for this exercise that you have your domain (zone) already set up and that we are simply adding a new sub-domain (record) to it. In the SliceManager, go to DNS -> Domains and click the Records link next to your domain. Once you’re in the records list for your zone, click on the New Record link. We’re going to create a Type A record: put your sub-domain in the Name field and your home IP address in the data field. You can use the dyndns checkip service to easily get your current IP address.

Hit update and you should be able to go to home.yourdomain.com and see your own server (assuming you have a web server set up). Otherwise, if you have a terminal available, run: ping -c 1 home.yourdomain.com and you should see your home IP address appear in parentheses.
While we’re in the DNS records, you should see your new subdomain “home” listed. Hover over the edit link and copy the URL there. It should be something like this:
https://manage.slicehost.com/zones/1234/records/123456/edit
The first number is your zone id and the second is the record id. Write down the record id (in this example, 123456) as we’ll need that in a little bit.
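If you would rather not pick the numbers out by eye, the same split can be sketched in a few lines of Python. The helper name parse_record_url() is my own, and the pattern assumes the URL keeps the /zones/<id>/records/<id>/ shape shown above.

```python
import re


def parse_record_url(url):
    """Return (zone_id, record_id) as strings from a record edit URL."""
    match = re.search(r"/zones/(\d+)/records/(\d+)", url)
    if match is None:
        raise ValueError("not a record edit URL: %s" % url)
    return match.group(1), match.group(2)


zone_id, record_id = parse_record_url(
    "https://manage.slicehost.com/zones/1234/records/123456/edit"
)
print(zone_id, record_id)  # prints: 1234 123456
```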
Using the SliceHost API to do some periodic updates for us.
First, if you haven’t done so, let’s enable the API and/or get the API key. Go to Account -> API Access in your SliceManager and click Enable API Access (if applicable). Now you should see your API password displayed. Copy and paste it into a document, write it down or whatever; we’ll need this in a bit as well.
For this next bit I am going to assume you are running some sort of *nix server and have two things installed: Python and Subversion. If you do not have them installed, please do so now. Both are very common packages and the installation should be very well documented for your distribution of Linux.
The script makes use of a great package called PyActiveResource. This is essentially a port of the ActiveResource module for Ruby, but in Python. At this moment, it seems the best way to get this package installed is via Subversion, so let’s do that.
svn checkout http://pyactiveresource.googlecode.com/svn/trunk/ pyactiveresource-read-only
cd pyactiveresource-read-only
sudo python setup.py install
Assuming there were no errors, this should be installed as a Python module now.
Next let’s set up a script to run. I’m gonna assume you are running Vim like I am, but please feel free to substitute that as appropriate.
sudo vim /home/drew/slicehost_dyndns.py
/home/drew/slicehost_dyndns.py
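# Written for Python 2 (print statement, urllib.urlopen).
import urllib, re, sys, os
from pyactiveresource.activeresource import ActiveResource

# Values copied from the SliceManager in the steps above.
api_key = 'xxxxxx'
api_url = 'https://%s@api.slicehost.com/' % api_key
record_id = '123456'

class Record(ActiveResource):
    _site = api_url

# Look up the DNS record we want to keep current.
results = Record.find(id=record_id)
if len(results) != 1:
    print "Can't find Record %s via SliceHost API." % record_id
    sys.exit(1)
salon = results[0]

# The last run of digits and dots in the checkip page is our public IP.
found_ip = (re.findall('[0-9.]+', urllib.urlopen('http://checkip.dyndns.org/').read())[-1])

# Only touch the record when the address has actually changed.
if salon.data != found_ip:
    salon.data = found_ip
    salon.save()
    print "IP Updated: " + found_ip
else:
    print "IP Unchanged: " + salon.data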
Update the api_key and record_id with the information we wrote down before and save it. Be sure to maintain the spacing on lines as best as possible. Python is very strict about the use of white-space. The script from the forum post was actually not spaced properly and therefore didn’t run. That is really the only change I made. Otherwise it worked perfectly.
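The one dense line in the script is the IP lookup, so here is that step pulled out on its own as a small sketch that runs without a network connection. The sample response text is my assumption of what the checkip page returns and the real page may differ; extract_ip() is a name I made up for illustration.

```python
import re

# Assumed sample of what the checkip service returns; the real page may differ.
SAMPLE_RESPONSE = (
    "<html><head><title>Current IP Check</title></head>"
    "<body>Current IP Address: 203.0.113.7</body></html>"
)


def extract_ip(html):
    """Return the last run of digits and dots in the page, as the script does."""
    candidates = re.findall(r"[0-9.]+", html)
    if not candidates:
        raise ValueError("no IP-like text found")
    return candidates[-1]


print(extract_ip(SAMPLE_RESPONSE))  # prints: 203.0.113.7
```

Note that the pattern is deliberately loose: it accepts any run of digits and dots, so it relies on the address being the last such run on the page. If the page ever gained a trailing version number or a sentence ending in a period after the address, a stricter pattern would be needed.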
From the command prompt, you should now be able to run
python /home/drew/slicehost_dyndns.py
and see either IP Updated or IP Unchanged. I suggest testing it at least once before moving forward. Then open your crontab for editing:
crontab -e
Add this line to have the script run every 10 minutes. As the script only does an actual update if the IP address changes, it is fairly innocuous.
*/10 * * * * python /home/drew/slicehost_dyndns.py
You now have a working dynamic DNS setup for your home server. I’ll be working on some updates to the script to do things like log when it actually changes the DNS entry. I figure it’ll be helpful to monitor how often the script really needs to run and whether there are any patterns in when the IP address changes. That might help in customizing the cron’s timing and thus decreasing load on the Slicehost servers. As it stands now though, this should be a very low impact solution, so I doubt they’ll notice.
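As a rough idea of what that logging update could look like, here is a sketch using Python’s standard logging module. This is not part of the original forum script: build_logger(), record_check() and the logger name are hypothetical, and the log file location is left up to you.

```python
import logging


def build_logger(log_path):
    """Create a logger that appends timestamped lines to log_path."""
    logger = logging.getLogger("slicehost_dyndns")
    logger.setLevel(logging.INFO)
    # Avoid stacking duplicate handlers if this is called more than once.
    if not logger.handlers:
        handler = logging.FileHandler(log_path)
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger


def record_check(logger, old_ip, new_ip):
    """Log only when the address changed; return True if it did."""
    if old_ip == new_ip:
        return False
    logger.info("IP changed from %s to %s", old_ip, new_ip)
    return True
```

Because each entry carries a timestamp, a few weeks of this log would show how often the address really changes, which is the information needed to tune the cron schedule.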
An Easier Way To Keep WordPress Up To Date
Lately there has been quite a stir at my job regarding WordPress. We have played a lot with deployment methods, including proxying it through another application to add an extra layer of features. Because of all of this I have had quite a few conversations with teammates about the easiest way to set up a new blog. As I feel I have a very quick and easy way not only to install WordPress, but also to keep it up to date, I figured I would share it here, in case that knowledge might be helpful to someone else.
A quick warning first: while nothing I am about to describe is necessarily difficult or hazardous, all of it requires that you have some sort of direct access to your server. In my explanations, I will be doing this through an SSH terminal connection. Some familiarity with, and access to, Subversion is also needed. Mostly, this is a solution that should appeal to other developers.
Alright, on with it then.
When you first go to install WordPress, you are generally directed to download a nice zip archive of the software. This is generally accepted as an extremely easy way to install WordPress itself. The trouble with this method comes later on, when you need to update your installation to a newer version to combat the inevitable bugs that plague all software. While WordPress does a great job of automating most of the process, you are generally advised to back up all of your themes and plug-ins, which you will need to reapply afterward, as the update process often overwrites some of these files.
The second issue comes from developers like myself who may tweak the WordPress system itself, perhaps to change how tags are encoded. These changes will generally have to be completely redone when a new version comes out, because they will be overwritten by new versions of the files.
I’m a developer, so let’s start thinking like a developer.
Under the download section of the WordPress site, let’s go to the Subversion access section. Just as they assume you have Subversion installed, so will I. If not, check out the Subversion website for information on installing it.
For the most part, we are going to follow all the instructions they have here, except for the part on which repository to use. They link to the trunk of their repository, which while generally stable, is still a development version and could have stability or performance issues. As of this writing, the current stable version of WordPress is 2.8.6, so that will be what we use in these examples. However, you can just replace the version number in the examples with whatever is current to stay up to date and avoid installing older versions.
Let’s begin.
From the command-line of the server:
# Go to the directory where your WordPress site will live.
$ cd /var/www/mysite
# Check out WordPress into the current directory
$ svn co http://core.svn.wordpress.org/tags/2.8.6/ .
That’s it. As I said, nothing difficult. The rest of the install process is just like in the Famous 5-Minute Install, only start with #2.
Now, what we have done here is basically check out the repository as if we were going to be programming in it. Normally a developer would make changes and commit them. We definitely won’t be following that part of the life-cycle. However, we now gain all the benefits of allowing Subversion to manage when changes occur, so when we need to update, it can tell us when conflicts occur and, as developers, we can manually merge them.
I still strongly suggest you keep backups of your site in case you break something really badly. However, these will be for worst-case scenarios: the hard disk crashed, or “I ran rm -Rf / on my whole computer” kinds of issues.
Keeping up to date
In order to update to a new version of WordPress, all we need to do now is use Subversion’s switch operation to change which tag we have checked out.
# Go to the directory where your WordPress site lives.
$ cd /var/www/mysite
# Switch the working copy to the new release tag
$ svn sw http://core.svn.wordpress.org/tags/2.8.7/ .
While as of this writing that is not a valid version of WordPress, one day when it is, that will be the line I use to upgrade to it. Of course, just change the version number to match whatever is current. An update from 2.8.5 to 2.8.6 yielded the following output:
$ svn sw http://core.svn.wordpress.org/tags/2.8.6/ .
U wp-includes/version.php
UU wp-includes/js/swfupload/plugins/swfupload.speed.js
U wp-includes/functions.php
U wp-includes/formatting.php
U readme.html
U wp-admin/press-this.php
Updated to revision 12288.
Had there been a conflict, where local changes would otherwise have been overwritten by this update, there would be a ‘C’ next to the file name instead of a ‘U’, which just means ‘updated.’ In that case you would look inside that file, where you’ll see both versions of the code, and decide how to repair it. In most cases, unless you make changes to the WordPress core, you will never see this.
This is what I consider the world’s most painless WordPress update. When a new version comes out, I simply switch to it and then log into /wp-admin. It will have me update the database, as it does, and I am done. It takes less than a minute to update in most cases.
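If you upgrade several sites this way, scanning the output for that ‘C’ can be scripted. The following is only a sketch: find_conflicts() is a hypothetical helper, it assumes the status letters come first on each line as in the output above, and the conflicted line in the sample is invented for illustration. Real svn output can vary between versions.

```python
STATUS_LETTERS = set("ADUCGER")


def find_conflicts(svn_output):
    """Return paths whose status column contains 'C' (conflict)."""
    conflicts = []
    for line in svn_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        status, path = parts
        # Skip summary lines such as "Updated to revision 12288."
        if len(status) > 4 or not set(status) <= STATUS_LETTERS:
            continue
        if "C" in status:
            conflicts.append(path.strip())
    return conflicts


# The "C" line below is made up to show what a conflict would look like.
sample = """U wp-includes/version.php
UU wp-includes/js/swfupload/plugins/swfupload.speed.js
C wp-includes/functions.php
Updated to revision 12288."""
print(find_conflicts(sample))  # prints: ['wp-includes/functions.php']
```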
Anyway, hope that is helpful. While I’m sure this is a bit much for many people, at least for developers, it is fairly fluid with our normal process and is by far the ideal way to stay up to date.
Error handling in PHP, or how we won the war.
One of my biggest gripes about PHP in comparison to other modern object-oriented languages is the relatively poor state of error handling. The mix-and-match style of triggered errors and exceptions is unbefitting of a modern language, especially one that is ready for the enterprise. Most modern languages deal with errors strictly through the throwing of exceptions, because it allows for a more robust method of triggering a response from the system. This is because a thrown exception can be caught. In PHP though, a triggered error cannot.
At least not technically.
Lucky for us there is a way to make PHP behave a little more civilly than it does by default. Through some clever uses of the error and exception handlers, we can make PHP just a little more robust.
Our Exception
First we start with our exception object. Personally, I’m a fan of explicit exception naming, so that I can more easily catch very particular exceptions. For this example we will use one that I will cleverly name PhpException.
class PhpException extends Exception {}
In this case we will not be modifying what the Exception object does, only giving it a new name to reference it by. If you’re new to exceptions, then trust me, this will make your life easier down the road when you’re catching an exception from a method that can fail in a couple of different ways. Being able to single out one exception type that you know how to handle is much better than catching everything and maybe covering up a real problem.
The error handler
In short, the error handler is a function or class method that you define as the recipient of any PHP error that is triggered in the system. Triggering an error is by far the most common way for PHP’s native functions to report a problem. The error handler is the first thing to run after the triggered error. It is given the error code, the message, the file the error occurred in and the line number. From there, as long as you do not return a boolean false, PHP will not follow its normal error pattern. In our case, what we want to do is provide some new options. So, we will wrap this error in our new PhpException class.
function handleError( $code, $message, $file, $line, $context )
{
    // Silencing a statement with @ causes this to be 0.
    // Let's be respectful and not throw errors if silenced.
    if ( ini_get( 'error_reporting' ) == 0 ) {
        return true;
    }
    $msgTpl = 'PHP Error (%u): %s in: %s:%u';
    $message = sprintf( $msgTpl, $code, $message, $file, $line );
    throw new PhpException( $message );
}
set_error_handler( 'handleError' );
At its core, this is a very simple method of converting a triggered error into an exception. It first checks to see if error reporting is off, due to a silence. If that is the case, it exits early, since we can ignore that error. Otherwise, it just turns the error into an exception object and throws it. With this done, we can now catch the uncatchable.
try {
    trigger_error( "You can't catch me!", E_USER_ERROR );
} catch ( PhpException $e ) {
    // Oh yes I can!
}
In the most basic of ways, we have made PHP errors catchable exceptions. This is an extremely desirable behavior. The only errors that will not be caught by this are the ones PHP never hands to a user-defined handler: E_ERROR, E_PARSE and the core and compile errors and warnings. We’ll get to parse errors later though.
Speaking of error types, what is wrong with the above example?
We are catching too much. Some of the error types that are thrown are warnings, and we do not want them stopping a code run like an exception will do. This is easy to fix though. What we’ll do is create two lists of error types, check which kind we have and act accordingly. For this, I am going to turn our error handler function into an EventHandler class, as it will give us some more flexibility later in the examples.
class EventHandler
{
    const MSG_TPL = 'PHP Error (%u): %s in: %s:%u';

    protected $warningLevels = array (
        E_WARNING,
        E_NOTICE,
        E_CORE_WARNING,
        E_COMPILE_WARNING,
        E_USER_WARNING,
        E_USER_NOTICE,
        E_STRICT,
    );

    protected $fatalLevels = array (
        E_ERROR,
        E_PARSE,
        E_CORE_ERROR,
        E_COMPILE_ERROR,
        E_USER_ERROR,
        E_RECOVERABLE_ERROR,
    );

    public static function start()
    {
        $handler = new EventHandler;
        set_error_handler( array ( $handler, 'handleError' ) );
    }

    public function handleError( $code, $message, $file, $line, $context )
    {
        // Silencing a statement with @ causes this to be 0.
        // Let's be respectful.
        if ( ini_get( 'error_reporting' ) == 0 ) {
            return true;
        }
        $message = sprintf( self::MSG_TPL, $code, $message, $file, $line );
        if ( in_array( $code, $this->warningLevels ) ) {
            error_log( $message );
            if ( ini_get( 'display_errors' ) ) {
                echo '<p>Warning! ' . $message . '</p>';
            }
            return true;
        }
        if ( in_array( $code, $this->fatalLevels ) ) {
            throw new PhpException( $message );
        }
        return true;
    }
}
EventHandler::start();
What you’ll notice here is that we now have an object to encapsulate our logic and we now distinguish between warning and fatal level errors. In this case, a warning will still write to the PHP error log as the default behavior would do and if display errors is on (like it should NOT be in your production environment) then it will also print that error to the screen. Aside from that, it works identically. Error is caused, if it is fatal, it becomes an exception.
What to do with those exceptions.
Well, we’ve turned all of our fatal errors into catchable exceptions, but what do we do with those now? In some cases we will know what to do and will catch them. In others, we may not anticipate them or we may not know how to handle them and will need to cause a 500 Internal Server error to occur. This is where the exception handler comes in.
Like its brother the error handler, the exception handler is designed to catch exceptions for you automatically. The difference between the two is that errors are caught immediately, while an exception will travel back up the call stack until it reaches the point you started your code in. At that point the exception handler will kick in and go all last action hero for you. The idea is that this will handle any failure logic you need, which can be as simple as logging the message in a certain way, as complex as showing a super pretty 500 error page, or perhaps a little of both.
Let’s add a couple methods to our EventHandler class.
public function handleException( $exception )
{
    header( 'Status: 500 Internal Server Error' );
    $this->sendNotice( $exception->getMessage() );
    $this->printError( $exception->getMessage() );
    exit;
}

protected function sendNotice( $message )
{
    $to = 'me@myemail.com';
    $subject = 'An error occurred on your website.';
    $from = 'errors@mywebsite.com';
    $message = wordwrap( $message, 70 );
    $headers = "From: %s\r\nReply-To: %s\r\nX-Mailer: PHP/%s";
    $headers = sprintf( $headers, $from, $from, PHP_VERSION );
    mail( $to, $subject, $message, $headers );
}

protected function printError( $message )
{
    // Clear the output buffer. Yes you should have started this
    // at some point to avoid fragments of unwanted
    // data on your page. See ob_start().
    ob_clean();
    // Load a template.
    echo "<h3>An error has occurred.!!</h3>";
    echo "<p>Thank you, come again.</p>";
    if ( ini_get( 'display_errors' ) ) {
        echo "<p>" . $message . "</p>";
    }
}
In this case I have chosen to e-mail out that an error occurred to myself and print a message to the user indicating that there was a problem. For the printError() part I usually pull in a full page template, in the site’s style, but for this example I am keeping it as brief as possible.
To enable this code, just add this line to the bottom of the start() method of the EventHandler class we’ve been making.
set_exception_handler( array ( $handler, 'handleException' ) );
Depending on how complicated your code gets for handling the exception, you may also want to consider wrapping the logic in that method in a try-catch block that catches all exceptions. The reason is that if you are generating a template and a secondary exception is thrown, you will see the dreaded “Exception thrown without a stack frame in Unknown on line 0” error. By wrapping that logic in a try-catch block you can catch it and have a simpler failover available. Usually mine is just log it and die(), as in that case I want to be as simple and brief as possible.
A brief example:
public function handleException( $exception )
{
    header( 'Status: 500 Internal Server Error' );
    try {
        $this->sendNotice( $exception->getMessage() );
        $this->printError( $exception->getMessage() );
    } catch ( Exception $e ) {
        $tpl = 'Something really bad happened. An exception was thrown '
             . 'in the exception handler. Message: %s, File: %s, Line %u';
        error_log( sprintf( $tpl,
            $e->getMessage(),
            $e->getFile(),
            $e->getLine() ) );
    }
    exit;
}
You can of course do much more than that. Hopefully this was enough to get you thinking about some new possibilities though.
Parse Errors
There is one last type of error that you can catch, and it is otherwise harder to see: the parse error. This happens when a file you include has a syntax error in it. Usually you’ll see a generic message on the page and a message in your error log. This means you may miss it if some bad code hits your production site.
Luckily PHP has a way to add a shutdown function that will run even when PHP is crashing. Add this to your EventHandler class.
public static function start()
{
    ....
    register_shutdown_function( array ( $handler, 'handleShutdown' ) );
}

public function handleShutdown()
{
    // error_get_last() returns null when no error was recorded.
    if ( !$error = error_get_last() ) {
        return;
    }
    $message = 'Uncaught ' . sprintf( self::MSG_TPL,
        $error['type'],
        $error['message'],
        $error['file'],
        $error['line'] );
    // There is no exception object here, so pass the formatted message.
    $this->sendNotice( $message );
    $this->printError( $message );
}
Now we can add a custom message and even a pretty 500 page if we so choose. No more white screens of death for your visitors, just because you forgot something. It happens, we all make mistakes. The point is to have events like these set up to make it less apparent or at least less horrible to our visitors.
Conclusion
The most important feature you can add to any application or framework is a robust way to handle things going wrong. Something always will, and the better the experience is for your visitors, the more likely it is that they will come back. On the same note, the better your error reporting, the easier time you will have finding and fixing those errors when they occur.
There are all sorts of ways that you can extend what I’ve shown you. The error handling routines I have written for the framework I maintain provide all sorts of other useful information for my company and my team. Of course that goes outside of the scope of this article, I just hope that this gets you thinking on ways to make your own systems the best they can be.
How do you handle errors in your projects?