CouchDB on Ubuntu on AWS


Here's a little HowTo on setting up CouchDB on an AWS EC2 instance. The setup is not specific to AWS or EC2, though: it should work on any other Ubuntu server, and I suppose on Debian as well.

Getting started

The following steps are a rough sketch of how to get started. I suggest that you familiarize yourself with what all of these things do, but if you want to skip the reading and just get started, this should work anyway.

  • you (obviously) need an AWS account, and you need to log into the AWS console
  • you need a custom security group (make sure to open it up for HTTP traffic)
  • create an EBS volume (Take a deep breath and think about the size of the volume. You don't want to run into space issues right away, but allocated storage costs you money even when idle: e.g. 400 GB =~ 40 USD per month, excluding I/O.)
  • create a keypair (It'll prompt you to download a foobar.pem; I placed mine on my local machine in ~/.ssh/ and ran chmod 400 on it.)
  • get an elastic IP
  • start the instance
    • select an AMI (I selected alestic's 64-bit server Ubuntu 9.04 AMI.)
    • assign your own security group AND the default one
    • select your keypair
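To get a feeling for the storage bill before picking a volume size, the arithmetic from the EBS step can be sketched as a tiny shell function. The default rate of 0.10 USD per GB-month is an assumption derived from the "400 GB =~ 40 USD" figure above, and the function name is made up for this example; check the current price list for your region.

```shell
#!/bin/sh
# Estimate the monthly storage cost of an EBS volume in USD.
# The default rate (0.10 USD per GB-month) is an assumption taken from the
# "400 GB =~ 40 USD" example; it does not include I/O charges.
ebs_monthly_cost_usd() {
    size_gb=$1
    rate=${2:-0.10}
    awk -v gb="$size_gb" -v rate="$rate" 'BEGIN { printf "%.2f\n", gb * rate }'
}

ebs_monthly_cost_usd 400   # the 400 GB volume from the example
ebs_monthly_cost_usd 50    # a smaller volume for comparison
```

The second argument overrides the rate, e.g. ebs_monthly_cost_usd 400 0.12.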

Woo! We made it that far.

The instance should boot, and once this is done (green indicates all went well), we want to attach the previously created EBS volume to said instance and associate the elastic IP with it.
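I did both steps in the AWS console. If you prefer the command line, the same two steps might look roughly like the sketch below. It assumes the aws command line tool, which is not part of the original setup, and all IDs, the device name /dev/sdf and the function name are placeholders. The function only prints the commands so you can review them first. For addresses allocated in a VPC you would likely need --allocation-id instead of --public-ip.

```shell
#!/bin/sh
# Print the AWS CLI calls that attach an EBS volume and associate an
# elastic IP with an instance. Nothing is executed here.
# Arguments (all placeholders you supply): instance ID, volume ID,
# elastic IP, and optionally the device name (assumed default: /dev/sdf).
print_attach_commands() {
    instance_id=$1 volume_id=$2 elastic_ip=$3 device=${4:-/dev/sdf}
    echo "aws ec2 attach-volume --instance-id $instance_id --volume-id $volume_id --device $device"
    echo "aws ec2 associate-address --instance-id $instance_id --public-ip $elastic_ip"
}

# Review the output; pipe it to sh once it looks right.
print_attach_commands i-INSTANCE vol-VOLUME W.X.Y.Z
```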

Once these steps are complete, go to the instance screen, click on your running instance and then click on "Connect". It'll show you the ssh command to connect to your instance. It should be similar to this:

ssh -i .ssh/foobar.pem root@ec2-W-X-Y-Z.compute-1.amazonaws.com

The W-X-Y-Z part is most likely replaced with your elastic IP.

This process is not very automated yet, but at least you have an instance up and running. The next step is to try to log in and see if the EBS volume was attached. If all went well, you should have /mnt.
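A quick way to check this from the shell is to look for /mnt in the mount table. The following is a minimal sketch, assuming a Linux system with /proc/mounts. The function name is hypothetical, and the optional second argument only exists so the check can be tried against a sample file.

```shell
#!/bin/sh
# Succeed if DIR is listed as a mount point in the mount table.
# The table defaults to /proc/mounts (Linux); pass another file to test.
is_mounted() {
    dir=$1
    mounts=${2:-/proc/mounts}
    awk -v dir="$dir" '$2 == dir { found = 1 } END { exit !found }' "$mounts"
}

if is_mounted /mnt; then
    echo "/mnt is mounted"
    df -h /mnt    # shows the backing device and the available space
else
    echo "/mnt is not a mount point; check that the volume is attached" >&2
fi
```

The df output is worth a look, since it tells you which device actually backs /mnt.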

Setup

To ease the setup, and because I've been trying this with different instances etc., I wrote a small script to handle the setup process. This script basically installs checkinstall along with the requirements noted in CouchDB's README file. checkinstall is — in case you didn't know of it yet — the tool to keep your sanity when you deal with outdated software on Ubuntu (or Debian).

A Linux rant

[Feel free to skip over this.]

checkinstall is one of those things I discovered when I was almost ready to give up on Ubuntu in the cloud.

My main grudge when it comes to Linux is that the package managers really suck (sorry, but it's true). You probably think differently because your shiny apt-get install foobar always works, but my beef is with customizable software, management and general up-to-date-ness.

For example, the most recent CouchDB package (at the time of writing this blog entry, 2009/08/27) is 0.8, which is almost exactly a year old. I can only hope that some people get their act together and update CouchDB for Karmic Koala, aka Ubuntu 9.10 (I've been promised they will), but CouchDB is really just the tip of the iceberg. For me, this list continues with PHP, memcached and nginx, to name the better-known candidates.

Just btw — who comes up with the release names? ;-)

Do not get me started on the fact that /usr/local on my aptitude-driven Ubuntu is always empty. Why all software is installed into the base system and pollutes it is just beyond me.

Do not get me started (part dos) on weird dependencies some of these stoic binary packages impose on me.

Of course people may argue that you can customize whatever you want and always install software from source, but if you've ever done this, you must agree that source installs tend to become unmaintainable.

How do I remove the software ...

  • in case part of it poses a security issue
  • for a clean update
  • ... and so on.

The bottom line: if you have worked with ports (especially on FreeBSD) for the last 10 years, this feels like a dozen steps in the wrong direction.

checkinstall to the rescue

checkinstall replaces make install when you compile software from source and automatically registers a package, which you can later remove with a single dpkg command. It's that easy, and it almost feels like FreeBSD ports (I miss you. ;-(). It can also record dependencies for the package, and it'll even attempt to provide them. If you get around to using checkinstall, make sure to read the manual on all its goodness.

In a nutshell, the advantage is what I ranted about above: a clean, maintainable system and recent software. Following this procedure, I can run current releases (e.g. trunk) or even nightly builds without going nuts and polluting my system.

Installation

Without further ado, here's the script I use to set up CouchDB on my AWS instances:

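#!/bin/sh
# Abort on the first failing command so a broken step is not silently skipped.
set -e

# Packaging tool plus the build requirements from CouchDB's README.
apt-get install -y checkinstall
apt-get install -y automake autoconf libtool help2man
apt-get install -y build-essential erlang libicu-dev libmozjs-dev libcurl4-openssl-dev

# Fetch and unpack the source.
mkdir -p ~/build
cd ~/build
wget http://apache.easy-webs.de/couchdb/0.9.1/apache-couchdb-0.9.1.tar.gz
tar zxvf apache-couchdb-0.9.1.tar.gz
cd apache-couchdb-0.9.1/

# Build with everything installed under /mnt/couchdb.
./configure --prefix=/mnt/couchdb
make

# Install through checkinstall so dpkg knows about the package.
# The maintainer address is a placeholder; use your own.
checkinstall -y -D --pkgname=apache-couchdb \
--pkgversion=0.9.1 --maintainer=you@example.com \
--pakdir=/mnt --pkglicense=Apache

# Hook CouchDB into init and logrotate.
ln -s /mnt/couchdb/etc/init.d/couchdb /etc/init.d/couchdb
update-rc.d couchdb defaults
ln -s /mnt/couchdb/etc/logrotate.d/couchdb /etc/logrotate.d/couchdb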

The last three lines symlink the start script for CouchDB and a configuration file for log rotation into the correct places, and add CouchDB to the system startup. If all goes well, this should be good. (Note: This assumes that the EBS volume stays with the server and is not detached from the instance.)
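Since the script leans heavily on "if all goes well", a small pre-flight check can catch a missing tool before anything is downloaded or built. This is only a sketch and not part of my script. The function name is made up, and the list of commands is my reading of what the script calls before the build tools are installed.

```shell
#!/bin/sh
# Report every command from the argument list that is not on PATH.
# Returns non-zero if at least one of them is missing.
require_cmds() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing command: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Tools the setup script relies on before it installs anything itself.
if require_cmds apt-get wget tar update-rc.d; then
    echo "all required commands found"
fi
```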

See the registered package on your system

dpkg --get-selections | grep couchdb

See all installed files

dpkg -L apache-couchdb
/.
/usr
/usr/share
/usr/share/doc
/usr/share/doc/apache-couchdb
/usr/share/doc/apache-couchdb/AUTHORS
/usr/share/doc/apache-couchdb/NEWS
/usr/share/doc/apache-couchdb/THANKS
/usr/share/doc/apache-couchdb/BUGS.gz
/usr/share/doc/apache-couchdb/CHANGES
/usr/share/doc/apache-couchdb/LICENSE
/usr/share/doc/apache-couchdb/README
/usr/share/doc/apache-couchdb/BUGS
/usr/share/doc/apache-couchdb/README.gz
/mnt
/mnt/couchdb
/mnt/couchdb/lib
/mnt/couchdb/lib/couchdb
/mnt/couchdb/lib/couchdb/erlang
...

Need to remove CouchDB?

If you ever need to remove CouchDB, this is the command:

dpkg -r apache-couchdb

Configuration

  • go to /mnt/couchdb/etc/couchdb/local.ini and add:

    [httpd]
    port = 80
    bind_address = 0.0.0.0

  • feel free to add users in /mnt/couchdb/etc/couchdb/local.ini (see [admins])

  • open /mnt/couchdb/etc/default/couchdb and set COUCHDB_USER to root
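If you would rather script the local.ini change than edit the file by hand, here is a minimal sketch of a helper that sets a key inside an ini section. The function name is hypothetical. It assumes the stock local.ini layout, where the [httpd] section already exists with its keys commented out using a semicolon, and it rewrites the file in place, so keep a backup.

```shell
#!/bin/sh
# Set KEY = VALUE inside [SECTION] of an ini file. An existing assignment
# (even a commented-out one) is replaced; a missing key or section is added.
set_ini_value() {
    file=$1 section=$2 key=$3 value=$4
    tmp=$(mktemp) || return 1
    awk -v target="[$section]" -v key="$key" -v value="$value" '
        function emit() { print key " = " value; done = 1 }
        /^\[/ {
            # Leaving the target section without having seen the key: add it.
            if (in_section && !done) emit()
            in_section = ($0 == target)
            if (in_section) seen = 1
        }
        # Replace the first assignment of the key, even if commented out.
        in_section && !done && $0 ~ ("^[;# ]*" key " *=") { emit(); next }
        { print }
        END {
            if (in_section && !done) emit()
            if (!seen) { print target; emit() }
        }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Apply the two settings from this article if the file is writable.
ini=/mnt/couchdb/etc/couchdb/local.ini
if [ -w "$ini" ]; then
    set_ini_value "$ini" httpd port 80
    set_ini_value "$ini" httpd bind_address 0.0.0.0
fi
```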

The bind_address is set to 0.0.0.0 on purpose, so that CouchDB listens on all IPs on the server (on port 80). On a dedicated CouchDB server this should not be an issue. If the server is not dedicated to CouchDB, there might be conflicts, since CouchDB will obviously take over port 80 on every address. The idea behind this setting is to avoid hard-coding IPs: CouchDB only listens on 127.0.0.1 by default, the private instance IP might change, and the Elastic IP is not directly assigned to your server. (Hat tip to Jan and Mathias.)

Also, to skip editing the defaults for COUCHDB_USER, you can probably pass it to configure directly: ./configure --... COUCHDB_USER=root.

All done?

Start CouchDB with service couchdb start. Double-check that couchdb is running (ps aux):

...
root      1139  0.0  0.0   4020   632 ?        S    12:38   0:00 /bin/sh -e /mnt/couchdb/bin/couchdb -c \"/mnt/couchdb/etc/couchdb/default.ini\" -c \"/mnt/couchdb/etc
root      1151  0.0  0.0   4020   352 ?        S    12:38   0:00 /bin/sh -e /mnt/couchdb/bin/couchdb -c \"/mnt/couchdb/etc/couchdb/default.ini\" -c \"/mnt/couchdb/etc
root      1152  0.0  0.1 115040 13648 ?        Sl   12:38   0:00 /usr/lib/erlang/erts-5.6.5/bin/beam.smp -Bd -K true -- -root /usr/lib/erlang -progname erl -- -home /
root      1196  0.0  0.0   3784   496 ?        Ss   12:38   0:00 heart -pid 1152 -ht 11
...

Go to your elastic IP, and you should see the following:

{"couchdb":"Welcome","version":"0.9.1"}
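The same check can be done from a script instead of a browser. This is a rough sketch, assuming curl is installed. The function names are made up for this example, and the host you pass in is your own elastic IP.

```shell
#!/bin/sh
# Succeed if the given text looks like CouchDB's welcome document.
is_couch_welcome() {
    printf '%s' "$1" | grep -q '"couchdb" *: *"Welcome"'
}

# Fetch http://HOST/ and check the response; HOST is your elastic IP.
check_couch() {
    body=$(curl -s --max-time 5 "http://$1/") || return 1
    is_couch_welcome "$body"
}

# Offline demonstration with the response shown above.
if is_couch_welcome '{"couchdb":"Welcome","version":"0.9.1"}'; then
    echo "CouchDB answered"
fi
```

On the real server you would call check_couch W.X.Y.Z and look at the exit status.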

Time to relax. :-)

Conclusion

For me, cloud computing is pretty interesting, and I'm just getting started. What's particularly interesting to me (even outside the cloud) — and I bet everyone who does a little administration can relate to it — is automatic setup and deployment of applications with as little intervention as possible.

setup.sh is of course not very solid yet, but if you are getting into the game, for example using a service such as RightScale, it only takes small additions to turn this into a fabulous RightScript to automate server setup. :-)

I'll keep you posted!
