Pragmatic tips for unit testing in PHP
Recently I've been hearing a lot of differing opinions about unit testing. Some people just don't believe in it at all. Others think the value it provides is only part of the picture, and that its cost should be weighed accordingly. Still others think that 100% test coverage should be the goal for high quality software. I personally think the cost of unit testing 100% of your code is worth it, but only if you can do it efficiently. And that's the catch: there's a learning curve to becoming proficient at unit testing, especially with PHPUnit, the most prominent unit testing framework for PHP. If you do it right, unit testing should never take more than 20-30% of your overall coding time.
Below are some things I keep in mind when coding and testing. I thought I'd share them here in case anyone else finds them useful.
Think about testability when you write code, not when you write the test
This may not be new to you, especially if you're familiar with dependency injection or test driven development. Your unit tests should be able to run in a vacuum: No database access, no memcache access, no headers can be sent, etc. (Sure those things can be tested separately, but should be a small subset of your code, not the overall application logic.) To do this, make sure you abstract those dependencies. For example, instead of instantiating a cache driver in the middle of your method, make a getter for it which you can mock:
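Here's a rough sketch of that idea. Every name in it (Cache, ArrayCache, ProfileService, loadFromDatabase) is hypothetical, and the test double is a plain subclass override so the snippet runs without PHPUnit. With PHPUnit you would mock getCache() instead.

```php
<?php

// Hypothetical cache contract; in production this would wrap Memcached or similar.
interface Cache
{
    public function get($key);
    public function set($key, $value);
}

// In-memory stand-in; never touches the network.
class ArrayCache implements Cache
{
    private $data = array();

    public function get($key)
    {
        return isset($this->data[$key]) ? $this->data[$key] : null;
    }

    public function set($key, $value)
    {
        $this->data[$key] = $value;
    }
}

class ProfileService
{
    private $cache;

    // The only place the cache driver is created, so a test can replace it.
    public function getCache()
    {
        if ($this->cache === null) {
            // Assumption: the real driver would be built here.
            $this->cache = new ArrayCache();
        }
        return $this->cache;
    }

    public function getName($userId)
    {
        $key  = 'name:' . $userId;
        $name = $this->getCache()->get($key);
        if ($name === null) {
            $name = $this->loadFromDatabase($userId);
            $this->getCache()->set($key, $name);
        }
        return $name;
    }

    // Hypothetical slow lookup that a unit test should not need to reach.
    protected function loadFromDatabase($userId)
    {
        return 'user-' . $userId;
    }
}

// In a test: override the getter to hand back a pre-seeded cache.
class TestableProfileService extends ProfileService
{
    public $fakeCache;

    public function getCache()
    {
        return $this->fakeCache;
    }
}

$fake = new ArrayCache();
$fake->set('name:7', 'Alice');

$service = new TestableProfileService();
$service->fakeCache = $fake;

echo $service->getName(7), "\n"; // served from the fake cache
```

Because getName() only ever reaches the cache through the getter, the test decides what the cache contains without touching the method itself.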
If you do this abstraction when you code, there's less refactoring to be done when writing your tests. Also, if your dependency doesn't make a network call upon instantiation, you can still test the getter using assertType(). Memcached's addServer() is an example of such a dependency: no connection is made until you perform an operation. If this is not the case, and you can't properly test the getter, you probably want to wrap the getter in @codeCoverageIgnore annotations like so:
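A minimal sketch, assuming a hypothetical RemoteCache driver that would connect as soon as it is constructed (ReportBuilder and the host name are made up as well). The two marker comments are the PHPUnit start/end form of the annotation; everything between them is left out of the coverage report.

```php
<?php

// Hypothetical driver; imagine the constructor opening a network connection.
class RemoteCache
{
    public $host;

    public function __construct($host)
    {
        $this->host = $host;
    }
}

class ReportBuilder
{
    private $cache;

    // @codeCoverageIgnoreStart
    public function getCache()
    {
        if ($this->cache === null) {
            $this->cache = new RemoteCache('cache.example.internal');
        }
        return $this->cache;
    }
    // @codeCoverageIgnoreEnd

    // Ordinary logic stays inside the coverage report.
    public function title($name)
    {
        return 'Report: ' . $name;
    }
}
```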
It doesn't make sense for code you never intend to test to count against your coverage percentage.
Keep documentation handy
PHPUnit is a big framework. There are a lot of things to remember, including all the assertion methods available, data providers, and mocking objects, among other things. Rather than always copying some other example, or bugging your co-worker, spend an hour or two, and RTFM. You'll be better off for it. A co-worker of mine kept a printout of the PHPUnit_Framework_TestCase::getMock() documentation (probably the hardest to remember) taped to his monitor. I prefer delicious bookmarking, but to each his own. Though I've been using PHPUnit for a couple of years now, I still refer to the documentation regularly.
Don't worry about test code re-usability
As good OOP programmers, we are often preoccupied with writing elegant, re-usable code, even in tests. I have recently given up on this in my unit tests, mostly due to scope issues in PHPUnit which I'll describe later. Rather than try to consolidate re-usable code, I'm more likely to copy and paste chunks throughout my test and know that they will work. I think the goal of tests is to write useful assertions, not necessarily elegant, re-usable components. Why do I care how many lines are in my test code? You might be thinking "Doesn't that make your tests harder to maintain?" Maybe so, but in the big picture, I don't think it adds much time, and once you've fixed something in one test, you can quickly find it elsewhere. It might not be pretty, but it is faster, and more pragmatic, at least until the tools are better.
Mocking scope issues
As I mentioned above, I've found (and reported) some strange behavior with scope and mocked objects. One pattern I see people use is a getMocks() helper in a test, which returns an array of commonly used objects, pre-mocked with the most used defaults already set. Sounds like a reasonable approach to testing a large class with lots of dependency needs, right? Wrong. What I've found is that many times (and my test results have been inconsistent, unfortunately), you cannot modify the return value of a mocked method after its expectation has already been set. No warning or error is thrown when you do this, but the original mocked value is returned instead of the second value you assigned. Re-mocking return values does seem to work when it's used for throwing exceptions, though. It appears to be a scope issue, but again, my tests have not always been consistent. The usual result is that you spend an hour pulling your hair out trying to figure out why your test is failing. I eventually said screw it, and just never set default behaviour for my mocked methods.
I now use a pattern where I have getters in my test for each mocked dependency, which mock each method I may need, but don't set any initial expectations yet. I'll then pass them all into the getter for the main class I'm trying to test, which attaches them for me. This way, on a per test basis, I set the expectations of each dependency as needed. This seems to work much more reliably:
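The shape of that pattern looks roughly like the sketch below. To keep it runnable without PHPUnit, a tiny hand-rolled Stub class stands in for PHPUnit's mock objects, and Greeter, GreeterTest and their methods are hypothetical. The point is only the structure: one getter per dependency that returns a bare double, one getter for the class under test, and behaviour set inside each test.

```php
<?php

// Hand-rolled test double: every method returns null until a test configures it.
class Stub
{
    private $returns = array();

    public function willReturn($method, $value)
    {
        $this->returns[$method] = $value;
        return $this;
    }

    public function __call($method, $args)
    {
        return isset($this->returns[$method]) ? $this->returns[$method] : null;
    }
}

// Hypothetical class under test with two dependencies.
class Greeter
{
    private $users;
    private $cache;

    public function __construct($users, $cache)
    {
        $this->users = $users;
        $this->cache = $cache;
    }

    public function greet($id)
    {
        $name = $this->cache->get($id);
        if ($name === null) {
            $name = $this->users->findName($id);
        }
        return 'Hello, ' . $name;
    }
}

class GreeterTest
{
    // One getter per dependency: a bare double, no default behaviour.
    private function getUsers()
    {
        return new Stub();
    }

    private function getCache()
    {
        return new Stub();
    }

    // Getter for the object under test; attaches whatever doubles it is given.
    private function getGreeter($users, $cache)
    {
        return new Greeter($users, $cache);
    }

    public function testCacheHit()
    {
        $users = $this->getUsers();
        $cache = $this->getCache()->willReturn('get', 'Ann');
        return $this->getGreeter($users, $cache)->greet(1);
    }

    public function testCacheMiss()
    {
        $users = $this->getUsers()->willReturn('findName', 'Bob');
        $cache = $this->getCache(); // get() returns null, i.e. a miss
        return $this->getGreeter($users, $cache)->greet(2);
    }
}

$test = new GreeterTest();
echo $test->testCacheHit(), "\n";  // prints "Hello, Ann"
echo $test->testCacheMiss(), "\n"; // prints "Hello, Bob"
```

Since no double ever carries a default, no test has to override a value that was set earlier, which is exactly the situation that caused trouble.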
Meaningful assertions
Another complaint I hear is that, especially when enforcing code coverage rules in your development shop, developers focus more on the required coverage than on the quality of the tests. It's important to think about what it is you're testing. If you just call a method to make sure you get your code coverage, you're doing it wrong. There are lots of assertions you can do: assert the type of the response, assert the content of the response, assert the message of the exception thrown, the exception code, the type of exception, how many times a mocked method was called, etc. But you want to be smart about it as well. We had one case where the original test for a Form framework was testing the exact template output. Since there was markup and JS in the output, every time a designer or JS developer modified it, the test would break. After a few breaks, I modified the test to use DomDocument parsing, and make sure the relevant form ids and attributes were present, which is really what we wanted to assert. If you don't write meaningful tests, don't bother writing them at all.
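To illustrate the DomDocument approach, here's a small sketch. The markup, ids and attribute values are invented for the example, and it uses plain boolean checks rather than PHPUnit assertions so it runs on its own (it does need PHP's DOM extension).

```php
<?php

// Hypothetical rendered form; wrappers, classes and scripts may change freely.
$html = '<div class="wrap"><form id="login" method="post" action="/login">'
      . '<input type="text" id="username" name="username">'
      . '<input type="password" id="password" name="password">'
      . '<script>var decorated = true;</script>'
      . '</form></div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // tolerate imperfect template markup
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Check what matters: the form exists and posts to the right place...
$form   = $xpath->query('//form[@id="login"]')->item(0);
$formOk = $form !== null
    && $form->getAttribute('method') === 'post'
    && $form->getAttribute('action') === '/login';

// ...and the expected fields are present with the right attributes.
$fieldsOk = $xpath->query('//input[@id="username"][@name="username"]')->length === 1
    && $xpath->query('//input[@id="password"][@type="password"]')->length === 1;

var_dump($formOk && $fieldsOk);
```

A designer can now add a wrapper div or change the inline script without breaking the test, while a renamed field id still fails it.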
Useful tools
Make sure it's easy to write and run your tests. PHPUnit has options for generating skeletons, for example, to get you started quickly. You can use --skeleton-class to generate a class from your test (if you're practicing TDD), or the reverse, --skeleton-test, to generate a test from your class.
Running tests should be easy too. One thing to keep in mind is that your include path must be correct. For example, when I'm developing a PEAR package, I may have a previous version already installed, but am developing in my repository checkout. I need to make sure my local files are not only in my include path, but that they are at the beginning. A handy wrapper like this should allow you to always run your development code easily:
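A minimal sketch of such a wrapper. The prependIncludePath() helper is a hypothetical name, and the hand-off to PHPUnit at the end assumes a PEAR-installed PHPUnit 3.x, so it is left commented out to keep the snippet runnable without it.

```php
<?php

// Put $dir at the front of the include path so local code wins over
// any installed copy of the same package. Returns the new include path.
function prependIncludePath($dir)
{
    $path = $dir . PATH_SEPARATOR . get_include_path();
    set_include_path($path);
    return $path;
}

// The directory this script lives in, e.g. the root of the checkout.
prependIncludePath(dirname(__FILE__));

// Assumption: PEAR-installed PHPUnit 3.x, whose command-line runner can be
// started from PHP. Uncomment to pass the script's arguments on to PHPUnit.
// require_once 'PHPUnit/TextUI/Command.php';
// PHPUnit_TextUI_Command::main();
```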
If I put this in the root of my repository, I can then call it like so:
php runTests.php --coverage-html coverage OpenID_AllTests
(If this is a PEAR package, you'll want to ignore runTests.php when generating your package.xml file.)
At Digg, our development environment tools are based on make, and it's super easy to run not only your PHPUnit tests and coverage, but also chain together running PHP_CodeSniffer for coding standards compliance:
make coverage phpcs
Facebook has a good post about hiring in general, which includes a section on writing good tools for your team. This idea certainly applies to your testing tools. The easier you make it for your team to write good unit tests, the better.
Conclusion
Unit testing in PHP is not that easy, in my opinion (compared to Python, anyway). Until you get good at it, that is, and learn to work around quirks in the tools. Then it's really not that hard. But everyone values unit testing differently. If you choose to invest in testing your code, perhaps you'll find some of the above useful. I used to not think much about unit testing, until I had a few experiences where unit tests caught problems in my code before I had run into them myself. At this point, I think unit testing my PHP code 100% is worth it, especially now that I can do it quickly and efficiently.
MogileFS, Zend Framework, Boobs, and Kittens
An upcoming side project of mine requires the use of MogileFS and Zend Framework. MogileFS is an open source distributed file system, meant to scale up to many millions of files without a single point of failure. It's currently used by the likes of Digg and last.fm.
Though I was already familiar with using MogileFS clients in PHP and Python at work, the operations team actually runs the servers. I wanted to get some experience with managing MogileFS itself (trackers, mogstored, MySQL), and hopefully gain a better understanding of how it all works together. While I was already familiar with Zend Framework, I'd never used it to serve images. So, I figured I should build a quick prototype using both. But what to build? I decided to take inspiration from the notorious http://explosionsandboobs.com, but put my own spin on it. The result?
Bear in mind that the above URL is on my inexpensive Slicehost VPS, and can be slow at times. This is a really simple application that allowed me to do a few things:
- scrape some content from Google image search
- store them in MogileFS
- render the stored images through ZF
The source is available here. The scraper is pretty straightforward. I just called it from the command line to save images to a directory. Once they were saved to disk, I used the mogrify tool from ImageMagick to scale the images down to a height of 400 px max, and visually removed the worst of the images (it's relatively SFW). Next, I inserted those images into MogileFS with insertImage.php. There are only two real action controllers, the index page and image renderer. The latter was really the only thing to learn on the PHP side of this project. The key there was disabling the layout and view rendering, as well as setting the appropriate image information in the response object (content-type, content-length, and the content itself). Since most of the logic is in the model, the action controllers are pretty skinny. I used memcache to store db queries and the images from MogileFS to help offset the performance of a cheap VPS. Since Zend Framework doesn't ship with a MogileFS client, I used the PEAR one.
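The image renderer boils down to something like the sketch below. buildImageResponse() is a hypothetical helper written in plain PHP so it runs without the framework, and the commented Zend Framework 1 calls are an approximation of how the result would be applied in an action controller, not the project's actual code.

```php
<?php

// Build the headers and body for serving raw image bytes.
// $bytes would come from MogileFS (or the cache in front of it).
function buildImageResponse($bytes, $mimeType)
{
    return array(
        'headers' => array(
            'Content-Type'   => $mimeType,
            'Content-Length' => (string) strlen($bytes),
        ),
        'body' => $bytes,
    );
}

// Assumption: inside a Zend Framework 1 action this would be applied roughly
// like so, after turning off the layout and the view renderer:
//   $this->_helper->layout()->disableLayout();
//   $this->_helper->viewRenderer->setNoRender(true);
//   $this->getResponse()
//        ->setHeader('Content-Type', $mimeType)
//        ->setHeader('Content-Length', strlen($bytes))
//        ->setBody($bytes);

// The 8-byte PNG signature stands in for real image data.
$response = buildImageResponse("\x89PNG\r\n\x1a\n", 'image/png');
print_r($response['headers']);
```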
That's it. Enjoy!
2009 Year in Review
I've been thinking about starting a blog for a while, and since it's the beginning of a new year, I thought I'd start out with a list of highlights from last year. So here's my 2009 Year in Review:
- In the beginning of the year I officially became a manager at Digg, though I still spend a majority of my time there coding.
- In March I became an uncle again with the birth of my brother Mike's first child, Nina Taylor Shupp.
- April 26th I ran my first 10.6 mile race as part of the Big Sur International Marathon. I finished in around 2 hours 40 minutes. The course, on CA 1 along the Pacific ocean, has a lot of hills, but is amazingly beautiful. I hope to do the full marathon next time.
- In early May, Digg launched its Facebook Connect integration. This was a long project for which I led the technical implementation. It went really well, and saw an initial increase in registrations upwards of 50%. It was also covered on TechCrunch.
- Later that month I gave an internal "Brown Bag Tech Talk" on OpenID. The slides are available here.
- In June, Peggy and I celebrated our second wedding anniversary just before heading out for a 16 day vacation in Europe. We visited Paris, Copenhagen, Lunde, Tallinn, Kiel, Hamburg, St. Petersburg, Helsinki, and Stockholm.
- In July, a few colleagues at Digg and I handed in our contribution to a forthcoming book on PHP unit testing. The book is a case study on using PHPUnit and continuous integration by Sebastian Bergmann. It's currently being translated into German, and should be out this year in both German and English.
- Later that month I spoke at OSCAMP 2009 on contributing to the PEAR project (the PHP Extension and Application Repository). The slides are available here.
- In August, I ran for and was elected to The PEAR Group, the governing body of PEAR.
- That same month, I played drums on a new album with The Whitehalls (formerly Heathrow), a band I've been playing with since shortly after moving up to San Francisco. The record is almost done, and should be out soon.
- In October I attended my first high school reunion. It was entertaining, and I really enjoyed seeing a few people, but for the most part it was pretty much exactly what I expected.
- We spent Thanksgiving up in Kohler, WI on Lake Michigan where some of Peg's family was gathering. One highlight of the trip was taking a Segway tour of Chicago's waterfront on the way back.
- December 1st I closed down my long running web hosting service, MerchBox, which I started back in 2000.
- Just after that saw the release of Digg's second generation API, another long project which I led initially, and contributed a good amount to. We also knocked out a fun reference application on a Sunday afternoon to help developers get up and running, which you can see here.
- Also in December, The Whitehalls participated in Silicon Valley Rocks 2009, a benefit for Music in Schools Today that raised $27,000.
- We spent Christmas and New Year's Eve back in northern Virginia, and were able to get in a fair amount of time at the Smithsonian. I wish I had appreciated it when I lived there.
- Closing out the month saw the acceptance into PEAR of two libraries I wrote, OpenID and Services_Digg2.
Cheers,
Bill Shupp
