PHP: So you'd like to migrate from MySQL to CouchDB? - Part II

If you enjoyed this article, please leave a comment, rss subscribe to my RSS feed and/or follow me on Twitter. Thank you very much!

This is part II of my introductory series to move from MySQL a relational database (management) system to CouchDB. I will be using MySQL as an example. Part I of this series is available here.

Recap

In part I, I introduced CouchDB by explaining its basics. I continued by showing a simple request to create a document using curl (on the shell) and expanded how the same request could be written in PHP (using ext/curl) — or in HTTP_Request2 or with phpillow.

Part II will focus on the most basic operations and help you build a small wrapper for CouchDB. The code will be available on Github.

Getting data into CouchDB

As I explained before — CouchDB is all HTTP.

But for starters, we'll take a shortcut here and briefly talk about CouchDB's nifty administration frontend called futon. Think of Futon as a really easy to use phpMyAdmin. And by default futon is included in your CouchDB installation.

Features

Futon will allow you to virtually do anything you need:

  • create a database
  • create a document
  • create a view
  • build the view
  • view ;-) (documents, views)
  • update all of the above
  • delete all of the above
  • replicate with other CouchDB servers

Assuming the installation completed successfully, and CouchDB runs on 127.0.0.1:5984, the URL to your very own Futon is http://127.0.0.1:5984/_utils/. And it doesn't hurt to create a bookmark while you are at it.

Why GUI?

Purists will always argue that using a GUI is slower than for example hacking your requests from the shell.

That may be correct once you are a level 99 CouchDB super hero, but since we all start at level 1, Futon is a great way to interact with CouchDB to learn the basics. And besides, even level 99 CouchDB super heroes sometimes like to just click and see, and not type in crazy hacker commands to get it done.

I'll encourage everyone to check it out.

Read, write, update and delete.

Sometimes also referred to as CRUD (create, read, update, delete) — read, write, update and delete are the basics most web applications do.

Since most of you have done a "write a blog in X"-tutorial before — and I personally am tired of writing blogs in different languages or with different backends — let's use another example.

Think about a small guestbook application, it does all of the above — a fancy guest will even do update. For the sake of simplicity, I'll skip on the frontend in this example and we'll work on the backend and essentially create a small wrapper class for CouchDB.

Operations

By now — "CouchDB is all HTTP" — should sound all familiar. So in turn, all these CRUD operations in CouchDB translate to the following HTTP request methods:

  • write/create - PUT or POST
  • read - GET
  • update - PUT or POST
  • delete - DELETE

On write

Whenever you supply an ID of a new document along with the document, you should use PUT.

When you don't care about the document ID, use POST instead, and CouchDB will generate a unique ID for you.

This unique ID will not look like an autoincremented integer, but we should not cling to this concept anyway. Without diving into too advanced context now, but the auto_increment feature in MySQL is a little flawed in general and in a distributed context especially. More on this (maybe) in a later part of this series — in the mean-time, check out Joshua Schachter's post.

On update

By default, CouchDB keeps a revision ID of each document. To many this is a pretty cool feature — out of the box, so to speak. But there are two very important and fundamental things to be aware of.

  1. CouchDB will keep previous revisions of a document around until you run compact on the database. Think of compact as a house keeping chore. It will wipe your database clean of previous revisions and even optimize the indices (in case you had a lot of data changing operations in the mean time). For CouchDB revisions are especially important in a distributed context (think replication — more on this later) and while it's cool to have them, they should not be used as a feature and be exposed to the user of your application.

  2. In case we decide a document, we always have to provide the previous (or current) revision of the document. This sounds strange, but the reasons are simple — in case another update gets in between all we have to do is provide the necessary interfaces and workflows in our application to alert the user and avoid a possible conflict.

CouchDB and the HTTP standard

CouchDB's API adheres to the above in 99.999999999% of the time. And it only breaks the pattern once. The exception to the rule is that when you bulk request multiple documents, which is strictly speaking a GET operation CouchDB will allow you to post in this case.

The reason for this is that the length of GET request is limited (by RFC) and if we exceeded this by requesting too many document IDs, we would hit a hard limit. To get around this, the operation is POST — but more on this later.

Requirements

For the following examples we assume a working CouchDB installation and a database called guestbook. No admins are set up — we can read and write without permission.

For simplicty, we imagine a form with the following fields:

  • author
  • entry

... and in addition to those two keys that may be populated by the user we add:

  • type (always: guestbook)
  • added (a date of some kind)

... the last two are not absolutely necessary, but will come handy in future iterations of this tutorial.

Also, on the tech side, we need a webserver with PHP and HTTP_Request2 (pear install HTTP_Request2-alpha) in your include_path. :-)

Gimmick!

In case you guys want to take a short cut, check this out:

Cloudant gave me an invite code for their service. It's valid for 30 accounts, in case it doesn't work for you, feel free to leave a comment.

This invite allows you to play with a hosted CouchDB without installing it yourself.

Instructions

  1. Go to http://cloudant.com/start/
  2. Use the invite code blyinio
  3. Cloudant's FAQ: http://cloudant.com/#faq

Let's get to the Code

For this example, I invented a new wrapper for CouchDB called ArmChair! ;-) ArmChair is a very short (and simple) PHP class. It uses HTTP_Request2 for all HTTP operations. ArmChair extends the HTTP_Request2 class, which is why the methods setUri(), send(), setBody(), etc. are all available in $this context.

And in case I lost anyone with the above, they need to read up on object oriented programming and PHP before they continue. ;-)

The current version of ArmChair is 0.1.0 and not meant for production use. It will evolve as this series continues. Make sure to stay in the 0.1.0 tag on Github so the code I use in this tutorial, matches the release.

So together with the example curl requests for the most basic operations, the code from ArmChair.

Creating an entry

Here's a simple POST request to add new entry in our guestbook:

curl -X POST http://127.0.0.1:5984/guestbook \
-d '{"entry":"Great guestbook","author":"till","type":"guestbook","added":"2009-11-02 22:22:00"}'

ArmChair's version:

<?php
$armchair = new ArmChair('http://127.0.0.1:5984/guestbook');

$entry = array(
    'entry' => 'Great guestbook!',
    'author' => 'Till',
    'type' => 'guestbook',
    'date' => '2009-11-02 22:22:00',
);

$document = $armchair->addDocument($entry);
var_dump($document);

The relevant source code:

public function addDocument(array $data)
{
    if (isset($data['_id'])) {
        $id = urlencode($data['_id']);
        unset($data['_id']);
        $this->setUri($this->server . '/' . $id);
        $this->setMethod(HTTP_Request2::METHOD_PUT);
    } else {
        $this->setUri($this->server);
        $this->setMethod(HTTP_Request2::METHOD_POST);
    }
    $this->setBody(json_encode($data));

    $response = $this->send();
    return $this->parseResponse($response);
}

In a nutshell, when in case we supply an _id key, we get to select the ID the document is saved as (PUT). Otherwise, CouchDB will take care of it (POST).

In our example, we create a document with a random ID.

Reading all entries

curl -X GET http://127.0.0.1:5984/guestbook/_all_docs

(Note: _all_docs is not exactly suitable for production. Instead, we should create a view to retrieve all documents later. Since this is only part II of my series, I'll skip on views here and will get to it later.)

ArmChair's version:

<?php
$armchair  = new ArmChair('http://127.0.0.1:5984/guestbook');
$documents = $armchair->get();
var_dump($documents);

The relevant source code:

public function get($id = null)
{
    if ($id === null) {
        $this->setUri($this->server . '/_all_docs');
    } else {
        $id = urlencode($id);
        $this->setUri($this->server . '/' . $id);
    }
    $this->setMethod(HTTP_Request2::METHOD_GET);
    $response = $this->send();
    return $this->parseResponse($response);
}

In simple words — we can use this method to retrieve all documents, are a single one. It all depends on if we supply an $id parameter when we call $armchair->get().

Deleting an entry

curl -X DELETE http://127.0.0.1:5984/guestbook/ID?rev=1-ea2f7bfea40efed2bcd89a8f17e903bb

ArmChair's version of delete:

<?php
$armchair = new ArmChair('http://127.0.0.1:5984/guestbook');
$armchair->deleteDocument('ID', '1-ea2f7bfea40efed2bcd89a8f17e903bb');

The source code:

public function deleteDocument($id, $rev)
{
    $id = trim($id);
    if (empty($id)) {
        return false;
    }
    $id  = urlencode($id);
    $rev = urlencode($rev);
    $this->setUri($this->server . '/' . $id . '?rev=' . $rev);
    $this->setMethod(HTTP_Request2::METHOD_DELETE);

    $response = $this->send();
    return $this->parseResponse($response);
}

Complete

Again, the complete source code is available on Github.

Disclaimer

And again. This class is not very robust (in terms of error handling, etc.), or feature complete. Of course anyone may use this as an entry to CouchDB and to get started, but it not be used in production — at least not until I say so. ;-)

Conclusion

I encourage everyone to browse the ArmChair code on Github. Besides creating, reading and deleting it supports updating documents.

The next part of this series will focus on view creation — in order to pass on /_all_docs and play with the equivalents of LIMIT and ORDER BY in CouchDB.

In this part of the series, I showed off powerful yet simple CRUD operations with CouchDB. Stay tuned for more!

| More