
Debugging Zend_Test

Sometimes I have to debug unit tests, and usually this is a situation I try to avoid.

If I have to spend too much time debugging a test, it's usually a bad test, which usually means that it's too complex. However, with Zend_Test_PHPUnit_ControllerTestCase, it's often not the actual test but the framework. This is not just tedious for me; it's also not the most supportive fact when I ask my developers to write tests.

An example

The unit test fails with something like:

Failed asserting last module used <"error"> was "default".

Translated, this means the following:

  • The obvious: an error occurred.
  • The error was caught by our ErrorController.
  • Things I need to find out:
    • What error actually occurred?
    • Why did it occur?
    • Where did the error occur?

The last three questions are especially tricky and drive me nuts on a regular basis, because a unit test should never withhold these things from you. After all, we use these tests to catch bugs to begin with. Why make it harder for the developer to fix them?

In my example an error occurred, but debugging Zend_Test also kicks in when things supposedly go according to plan. Follow me to the real life example.

Real life example

I have an Api_IndexController where requests to my API are validated in its preDispatch().

Whenever a request is not validated, I will issue "HTTP/1.1 401 Unauthorized". For the sake of this example, this is exactly what happens.

class Api_IndexController extends Zend_Controller_Action
{
    protected $authorized = false;
    
    public function preDispatch()
    {
        // authorize the request
        // ...
    }
    public function profileAction()
    {
        if ($this->authorized === false) {
            $this->getResponse()->setRawHeader('HTTP/1.1 401 Unauthorized');
        }
        // ...
    }
}

Here's the relevant test case:

class Api_IndexControllerTest extends Zend_Test_PHPUnit_ControllerTestCase
{
    public function testUnAuthorizedHeader()
    {
        $this->dispatch('/api/profile'); // unauthorized
        $this->assertResponseCode(401);
    }
}

The result:

1) Api_IndexControllerTest::testUnAuthorizedHeader
Failed asserting response code "401"

/project/library/Zend/Test/PHPUnit/Constraint/ResponseHeader.php:230
/project/library/Zend/Test/PHPUnit/ControllerTestCase.php:773
/project/tests/controllers/api/IndexControllerTest.php:58

Not very useful, eh?

Debugging

Before you step through your application with print, echo and the occasional var_dump, here's a much better way of seeing what went wrong.

I'm using a custom Listener for PHPUnit, which works sort of like an observer. This allows me to see where I made a mistake without hacking around in Zend_Test.
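Conceptually, such a listener boils down to very little code. Here's a rough standalone sketch — in a real setup the class would implement PHPUnit's PHPUnit_Framework_TestListener interface and define all of its hooks; the class name and formatting below are mine, not the actual code of the package installed further down:

```php
// Sketch of a test listener. (Assumption: in a real setup this class would
// implement PHPUnit_Framework_TestListener; its methods are shown as plain
// methods here so the sketch stands alone.)
class ResponseDebugListener
{
    // called by PHPUnit whenever a test fails
    public function addFailure($test, Exception $e, $time)
    {
        // only Zend_Test's ControllerTestCase carries a response object
        if (method_exists($test, 'getResponse')) {
            echo "Test '" . $test->getName() . "' failed.\n";
            echo $this->formatResponse($test->getResponse());
        }
    }

    // render status code, headers and body of the Zend response object
    public function formatResponse($response)
    {
        $out  = "RESPONSE\n\n";
        $out .= "Status Code: " . $response->getHttpResponseCode() . "\n\n";
        $out .= "Headers:\n\n";
        foreach ($response->getHeaders() as $header) {
            $out .= sprintf(
                "     %s - %s (replace: %s)\n",
                $header['name'],
                $header['value'],
                var_export($header['replace'], true)
            );
        }
        $out .= "\nBody:\n\n" . $response->getBody() . "\n";
        return $out;
    }
}
```

The only Zend-specific part is reading the response object off the test case; everything else is plain string formatting.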

Here is how it works

Discover my PEAR channel:

sudo pear channel-discover till.pearfarm.org

Install:

sudo pear install till.pearfarm.org/Lagged_Test_PHPUnit_ControllerTestCase_Listener-alpha
downloading Lagged_Test_PHPUnit_ControllerTestCase_Listener-0.1.0.tgz ...
Starting to download Lagged_Test_PHPUnit_ControllerTestCase_Listener-0.1.0.tgz (2,493 bytes)
....done: 2,493 bytes
install ok: channel://till.pearfarm.org/Lagged_Test_PHPUnit_ControllerTestCase_Listener-0.1.0

If you happen to not like PEAR (What's wrong with you? ;-)), the code is also on github.

Usage

This is my phpunit.xml:

<?xml version="1.0" encoding="utf-8"?>
<phpunit bootstrap="./TestInit.php" colors="true" syntaxCheck="true">
    <testsuites>
    ...
    </testsuites>
    <listeners>
        <listener class="Lagged_Test_PHPUnit_ControllerTestCase_Listener" file="Lagged/Test/PHPUnit/ControllerTestCase/Listener.php" />
    </listeners>
</phpunit>

Output

Whenever I run my test suite and a test fails, it will add something like this to the output of PHPUnit:

PHPUnit 3.4.15 by Sebastian Bergmann.

..Test 'testUnAuthorizedHeader' failed.
RESPONSE

Status Code: 200

Headers:

     Cache-Control - public, max-age=120 (replace: 1)
     Content-Type - application/json (replace: 1)
     X-Ohai - WADDAP (replace: false)

Body:

{"status":"error","msg":"Not authorized"}

F..

Time: 5 seconds, Memory: 20.50Mb

There was 1 failure:

1) Api_IndexControllerTest::testUnAuthorizedHeader
Failed asserting response code "401"

/project/library/Zend/Test/PHPUnit/Constraint/ResponseHeader.php:230
/project/library/Zend/Test/PHPUnit/ControllerTestCase.php:773
/project/tests/controllers/api/IndexControllerTest.php:58

FAILURES!
Tests: 5, Assertions: 12, Failures: 1.

Analyze

Analyzing the output, I realize that my status code was never set, even though I used a setRawHeader() call to set it. It turns out the raw header is not parsed, so the status code in Zend_Controller_Response_Abstract is never updated.

IMHO this is a bug, or at least a limitation, of the framework or Zend_Test.

The quickfix is to do the following in my action:

$this->getResponse()->setHttpResponseCode(401);

Fin

That's all. Quick, but not so dirty. As you may have noticed, I got away without hacking Zend_Test or PHPUnit.

The listener pattern provides us with very powerful hooks into our test suite. If you look at the source code, you'll see it also contains methods for skipped tests, errors, and test suite start and end.

start-stop-daemon, Gearman and a little PHP

The scope of this blog entry is to give you a quick and dirty demo for start-stop-daemon together with a short use case on Gearman (all on Ubuntu). In this example, I'm using the start-stop-daemon to handle my Gearman workers through an init.d script.

Gearman

Gearman is a queue! But unlike, for example, most of the backends to Zend_Queue, Gearman provides a little more than just a message queue to send — well — messages from sender to receiver. With Gearman it's trivial to register functions (tasks) on the server in order to start jobs and get stuff done.

For me the biggest advantages of Gearman are that it's easy to scale (add a server, start more workers) and that you can get work done in another language without building an API of some sort in between. Gearman is that API.
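To make the "register functions on the server" part concrete, here's a minimal worker sketch using the pecl/gearman extension. The server address, the function name reverse and the task itself are made up for this example:

```php
// a trivial task: reverse the workload string (plain PHP, no Gearman needed)
function reverse_workload($workload)
{
    return strrev($workload);
}

// register the task as a function on the Gearman server and wait for jobs
if (extension_loaded('gearman')) {
    $worker = new GearmanWorker();
    $worker->addServer('127.0.0.1', 4730); // gearmand's default port

    $worker->addFunction('reverse', function (GearmanJob $job) {
        return reverse_workload($job->workload());
    });

    // process jobs until the connection goes away
    while ($worker->work());
}
```

A client in any language can then call reverse on the same server; that is the sense in which Gearman itself is the API between your languages.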

Back to start-stop-daemon

start-stop-daemon is a facility to start and stop programs at system start and shutdown. On recent Ubuntus, most of the scripts located in /etc/init.d/ make use of it already. It provides a simple high-level API to system calls — such as stopping a process, starting it in the background, running it under a user — and the glue, such as writing a pid file.

My gearman start script
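The script itself boils down to the following sketch. The PHP worker path, pidfile name and user are assumptions; adjust them to your setup before use:

```shell
#!/bin/sh
# Sketch of an /etc/init.d/gearman-workers script. Paths, the worker
# command and the user are placeholders.

DAEMON=/usr/bin/php
WORKER=/var/www/workers/worker.php
PIDFILE=/var/run/gearman-worker.pid
RUNAS=www-data

case "${1:-}" in
  start)
    echo "Starting gearman worker"
    start-stop-daemon --start --background --make-pidfile \
      --pidfile "$PIDFILE" --chuid "$RUNAS" --exec "$DAEMON" -- "$WORKER"
    ;;
  stop)
    echo "Stopping gearman worker"
    start-stop-daemon --stop --pidfile "$PIDFILE"
    rm -f "$PIDFILE"
    ;;
  restart)
    "$0" stop
    "$0" start
    ;;
  *)
    echo "Usage: $0 {start|stop|restart}"
    ;;
esac
```

Note that --make-pidfile is needed here because the PHP worker doesn't write a pidfile of its own.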

Once adjusted, register it with the rc-system: update-rc.d script defaults. This will take care of the script being run during the boot process and before shutdown is completed.

A little more detail

The script may be called with /etc/init.d/script start|stop|restart (the pipes designate "or").

Upon start, we write a pidfile to /var/run and start the process. The same pidfile is used on stop — simple as that. The rest of it is hidden behind start-stop-daemon which takes care of the ugly rest for us.

DB_CouchDB_Replicator

Update, 2010-03-04: I just rolled a 0.0.2 release. In case you had 0.0.1 installed, just use pear upgrade-all to get it automatically. This release is trying to fix a random hang while reading documents from the source server.

I also opened a repository on Github.

---

As some may have guessed from a previous blog post we are currently running a test setup with CouchDB lounge. My current objective is to migrate our 200 million documents to it, and this is where I am essentially stuck this week.

No replication, no bulk docs

The lounge currently does not support replication (to it) or saving documents via bulk requests, so in essence migrating a lot of data into it is slow and tedious.

I have yet to figure out if there is a faster way (Maybe parallelization?), but DB_CouchDB_Replicator is the result of my current efforts.

I think I gave up on parallelization for now because it looked like hammering the lounge with a single worker was already enough, but generally I didn't have time to experiment much with it. It could have been my network connection too. Feedback in this area is very, very appreciated.

DB_CouchDB_Replicator

DB_CouchDB_Replicator is a small PHP script which takes two arguments, --source and --target. Both accept values in the style of http://username:password@hostname:port/db, and it attempts to move all documents from source to target.

Since long-running operations on the Internet are bound to fail, I also added a --resume switch, so it should be fairly easy to resume. While it's running, it outputs a progress bar, so you also get an idea of where it's currently at and how much more time it will eat up.

These switches may change, and I may add more — so keep an eye on --help. Also, keep in mind, that this is very alpha and I give no guarantees.

Installation

Installation is simple! :-)

apt-get install php-pear
pear config-set preferred_state alpha
pear channel-discover till.pearfarm.org
pear install till.pearfarm.org/DB_CouchDB_Replicator

Once installed, the replicator resides in /usr/local/bin or /usr/bin and is called couchdb-replicator.

Fin

The code is not yet on github, but will eventually end up there. All feedback is welcome!

Quo vadis PEAR?

To PEAR or not to PEAR — PEAR2 is taking a while and I sometimes think that everyone associated with PEAR is busy elsewhere. Since a little competition never hurt, I'm especially excited about these recent developments.

With the release of Pirum, I'm really excited to see two public PEAR channels that aim to make PEAR a standard to deploy and manage your applications and libraries. One is PEARhub and the other is PEAR Farm. I think I'm gonna stick with PEAR Farm for a while, so this blog entry focuses on things I noticed when I first played with it.

PEAR vs PEAR Farm

A lot of people mistake these new channels for something they are not. They think that this will eventually replace PEAR. I don't think it will — ever.

While I really welcome the idea of people pushing PEAR's channel format as a standard to distribute apps and libraries (PHPUnit, ezComponents, Zend Framework, Symfony, etc.), it's also very obvious that these open channels will never be the same.

For an idea of what I mean — take a look at open code repositories around the web and especially when it comes to PHP, it's very obvious that while there's a lot of code, most of it is utter crap. (There I said it!)

And no one wants to rely on it when reliability is an objective.

The PEAR Coding Standards were not invented because it's so great to tell people how to write code, but because a lot of people need this guidance. While they have a lot of ideas about how to implement a feature or a cool algorithm, their passion does not extend to test coverage or even a little documentation. And that's where these sometimes frowned-upon coding standards come in handy, because they ensure that the code in PEAR is maintainable — which is really just the tip of the iceberg for professional software engineers, and I'll save the rest for another blog post.

Let's get to it

Regardless of the fact that there are always only a few steps, PEAR setups tend to not work for, or look complicated to, a lot of people. Here are a few tips on how to get started. We'll assume pear itself is installed (apt-get install php-pear).

PEAR Farm

Instead of what is written in the FAQ, here are my suggested steps to get started. ;-)

  • pear channel-discover pearfarm.pearfarm.org
  • pear install pearfarm.pearfarm.org/pearfarm-beta

Spec files and package.xml

The package.xml is the configuration file for a PEAR package. It's pretty long, and since XML is so verbose, it's a turn-off to many. PEAR Farm suggests creating a spec instead, so it can create a package.xml for you. This is pretty convenient, but there are also a few gotchas.

  • pearfarm init (inside the code repo)

If you're doing it for the first time, it will issue a warning about a configuration file being created — ignore it and move on.

Continue by editing the .spec file with description, summary, maintainer and whatever else is a suggested edit inside it.

Then, feel free to create a package.xml:

  • pearfarm build

This will leave you with a package.xml file in the same directory. Mission accomplished!
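The generated package.xml describes every file in the package along with a role. The <contents /> section looks roughly like this — the file names are placeholders, not output of an actual pearfarm build:

```xml
<contents>
    <dir name="/">
        <file name="Foo/Bar.php" role="php" />
        <file name="docs/README" role="doc" />
        <file name="tests/BarTest.php" role="test" />
    </dir>
</contents>
```

Keep this section in mind; a couple of the gotchas below are about exactly these entries.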

Gotchas & Tricks

These are a few things you should edit before you pearfarm push:

  • If you use git and happen to have a .gitignore, pearfarm will add .gitignore to the package.xml as well.

  • pearfarm adds a baseinstalldir attribute to the topmost <dir /> entity. Assuming your package is Foo_Bar and you have Foo/Bar.php in your repository, it would install into /usr/share/php/Foo_Bar/Foo/Bar.php instead of Foo/Bar.php. I'd suggest you remove it and instead namespace in the package name right away — YourName_FooBar.

  • Does it work? Always pear package-validate before you push!

  • Does it install correctly? Feel free to pear install Package-x.y.z.tgz and check if the files installed ok (pear list-files pearfarm.pearfarm.org/Package)

  • A package.xml can have files with different roles; while the standard is php, there's also doc (for documentation) and test (for tests). When the package.xml is generated from the .spec, all files get php by default. That's why I'd go through the list in <contents /> to double-check that all files have appropriate roles set.

  • Another thing to take care of (imho) would be <dependencies /> — something that could probably be automated, but otherwise read up on it in the manual.

  • Need to rebuild the package from your updated package.xml? Use: pear package!

  • Changelog? Why yes, you can! The package.xml supports it and it would be nice if PEAR Farm eventually displays it. Here's how (after <phprelease />):

    <changelog>
        <release>
            <version>0.0.2</version>
            <stability>alpha</stability>
            <date>2010-01-29</date>
            <notes>bugfix in makeRequest() (curl), corrected endpoint</notes>
        </release>
        <release>
            <version>0.0.1</version>
            <stability>alpha</stability>
            <date>2010-01-27</date>
            <notes>initial release</notes>
        </release>
    </changelog>

Most of these things could be improved in pearfarm, and I'm pretty sure they will be. Eventually! :)

The End

And that's all for this time. If you want to see my packages on the PEAR Farm, follow this link.

PHP: So you'd like to migrate from MySQL to CouchDB? - Part II

This is part II of my introductory series on moving from a relational database (management) system to CouchDB. I will be using MySQL as an example. Part I of this series is available here.

Recap

In part I, I introduced CouchDB by explaining its basics. I continued by showing a simple request to create a document using curl (on the shell) and expanded how the same request could be written in PHP (using ext/curl) — or in HTTP_Request2 or with phpillow.

Part II will focus on the most basic operations and help you build a small wrapper for CouchDB. The code will be available on Github.

Getting data into CouchDB

As I explained before — CouchDB is all HTTP.

But for starters, we'll take a shortcut here and briefly talk about CouchDB's nifty administration frontend called Futon. Think of Futon as a really easy-to-use phpMyAdmin. By default, Futon is included in your CouchDB installation.

Features

Futon allows you to do virtually anything you need:

  • create a database
  • create a document
  • create a view
  • build the view
  • view ;-) (documents, views)
  • update all of the above
  • delete all of the above
  • replicate with other CouchDB servers

Assuming the installation completed successfully, and CouchDB runs on 127.0.0.1:5984, the URL to your very own Futon is http://127.0.0.1:5984/_utils/. And it doesn't hurt to create a bookmark while you are at it.

Why GUI?

Purists will always argue that using a GUI is slower than for example hacking your requests from the shell.

That may be correct once you are a level 99 CouchDB super hero, but since we all start at level 1, Futon is a great way to interact with CouchDB to learn the basics. And besides, even level 99 CouchDB super heroes sometimes like to just click and see, and not type in crazy hacker commands to get it done.

I'll encourage everyone to check it out.

Read, write, update and delete.

Sometimes also referred to as CRUD (create, read, update, delete) — read, write, update and delete are the basics most web applications do.

Since most of you have done a "write a blog in X"-tutorial before — and I personally am tired of writing blogs in different languages or with different backends — let's use another example.

Think about a small guestbook application: it does all of the above — a fancy guest will even do update. For the sake of simplicity, I'll skip the frontend in this example and we'll work on the backend, essentially creating a small wrapper class for CouchDB.

Operations

By now — "CouchDB is all HTTP" — should sound all familiar. So in turn, all these CRUD operations in CouchDB translate to the following HTTP request methods:

  • write/create - PUT or POST
  • read - GET
  • update - PUT or POST
  • delete - DELETE

On write

Whenever you supply an ID of a new document along with the document, you should use PUT.

When you don't care about the document ID, use POST instead, and CouchDB will generate a unique ID for you.

This unique ID will not look like an auto-incremented integer, but we should not cling to this concept anyway. Without diving into too much advanced context now: the auto_increment feature in MySQL is a little flawed in general, and especially in a distributed context. More on this (maybe) in a later part of this series — in the meantime, check out Joshua Schachter's post.
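The CRUD-to-HTTP mapping above, plus the PUT/POST rule, can be sketched as a first slice of our wrapper. The class and method names here are mine for illustration, not part of any official API:

```php
// First slice of a tiny CouchDB wrapper: translate CRUD to HTTP.
// (Class and method names are made up for this sketch.)
class CouchWrapper
{
    protected $host;

    public function __construct($host = 'http://127.0.0.1:5984')
    {
        $this->host = rtrim($host, '/');
    }

    // map a CRUD operation to the HTTP request method CouchDB expects
    public function methodFor($operation, $hasId = false)
    {
        switch ($operation) {
            case 'create':
                // with a document ID we PUT to /db/id, without we POST to /db
                return $hasId ? 'PUT' : 'POST';
            case 'read':
                return 'GET';
            case 'update':
                return 'PUT';
            case 'delete':
                return 'DELETE';
        }
        throw new InvalidArgumentException("Unknown operation: {$operation}");
    }

    // build the URL for a database, and optionally for a single document
    public function urlFor($db, $id = null)
    {
        $url = $this->host . '/' . $db;
        if ($id !== null) {
            $url .= '/' . rawurlencode($id);
        }
        return $url;
    }
}
```

Actually sending the request — via ext/curl or HTTP_Request2 — then only needs the method and the URL this class produces.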

On update

By default, CouchDB keeps a revision ID of each document. To many this is a pretty cool feature — out of the box, so to speak. But there are two very important and fundamental things to be aware of.

  1. CouchDB will keep previous revisions of a document around until you run compact on the database. Think of compact as a house keeping chore. It will wipe your database clean of previous revisions and even optimize the indices (in case you had a lot of data changing operations in the mean time). For CouchDB revisions are especially important in a distributed context (think replication — more on this later) and while it's cool to have them, they should not be used as a feature and be exposed to the user of your application.

  2. Whenever we update a document, we always have to provide the previous (or current) revision of the document. This sounds strange, but the reasons are simple — in case another update got in between, all we have to do is provide the necessary interfaces and workflows in our application to alert the user and avoid a possible conflict.
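In code, point 2 means carrying the _rev of the revision you read along with your changes. The helper name below is made up; _id and _rev are CouchDB's own field names:

```php
// merge our changes into the current document and keep its _rev;
// without the _rev, CouchDB answers the update with a conflict
function prepare_update(array $current, array $changes)
{
    $doc = array_merge($current, $changes);
    // always send the revision we read, never a guessed one
    $doc['_rev'] = $current['_rev'];
    return json_encode($doc);
}
```

The resulting JSON is what you would PUT back to the document's URL.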

CouchDB and the HTTP standard

CouchDB's API adheres to the above 99.999999999% of the time; it only breaks the pattern once. The exception to the rule is bulk-requesting multiple documents: strictly speaking a GET operation, but CouchDB will allow you to POST in this case.

The reason for this is that the length of a GET request is limited (by RFC), and if we exceeded it by requesting too many document IDs, we would hit a hard limit. To get around this, the operation is a POST — but more on this later.

Requirements

For the following examples we assume a working CouchDB installation and a database called guestbook. No admins are set up — we can read and write without permission.

For simplicity, we imagine a form with the following fields:

  • author
  • entry

... and in addition to those two keys that may be populated by the user we add:

  • type (always: guestbook)
  • added (a date of some kind)

... the last two are not absolutely necessary, but will come in handy in future iterations of this tutorial.
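Putting the four keys together — the helper name and the date format are my choice, not prescribed by anything:

```php
// assemble a guestbook document from the two user-supplied fields
// plus our two bookkeeping keys
function make_guestbook_entry($author, $entry)
{
    return array(
        'author' => $author,
        'entry'  => $entry,
        'type'   => 'guestbook',         // constant type, handy for views later
        'added'  => date('Y-m-d H:i:s'), // "a date of some kind"
    );
}
```

This array, run through json_encode(), is exactly the document body we'll send to CouchDB.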

Also, on the tech side, we need a webserver with PHP and HTTP_Request2 (pear install HTTP_Request2-alpha) in the include_path. :-)