Quo vadis, CouchDB?


Update, 2011-12-21: Couchbase posted their review of 2011 (the other day) — TL;DR: Couchbase Single Server (their Apache CouchDB distribution) is discontinued and its documentation (and its buildtools) will be contributed to Apache CouchDB.


When Ubuntu1 dropped CouchDB two weeks ago, there were a couple of things which annoy (present tense) me a lot. Add to that the general echo from various media outlets and blogs which pronounced CouchDB dead, and a general misconception about how this situation, or CouchDB in general, is dealt with.

Some people said I am caremad about CouchDB and that is probably true. Let me try to work through these things without offending more people.

Ubuntu1

What annoy[ed,s] me about this situation is that I wrote a chapter about Ubuntu1 in my CouchDB book. And while I realize that as soon as a book is published the information is outdated, I also want to say that I could have used the space for another project.

I talked to a couple of people about CouchDB at Ubuntu1 on IRC and no one made it sound like they were having huge, or for that matter any, issues.

Of course I work for neither Canonical nor Couchbase, and I haven't signed any NDAs. But looking back a week or two, my well-educated guess is that not even the people at Couchbase knew there were fundamental issues with CouchDB and Ubuntu1.

The NDA-part is of course an assumption: don't quote me on it.

Transparency

Scumbag Ubuntu1 drops CouchDB and doesn't say why. — myself on Twitter

First off: I'm not really sorry. I was abusing a meme and if you read my Twitter bio, you should not take things personally.

I also should have known better since it's not like I expect anything transparent from Canonical. (Just said it.)

When people are compelled to write a press release and put it out like that, they should expect a backlash. The reason why I reacted harshly is that Canonical didn't share any valuable information on why they discontinued using CouchDB except for: it doesn't scale.

And I'm not aware of anything more concrete to date.

Helpful criticism — how does it work?

Please take a look at the following email: https://lists.launchpad.net/u1db-discuss/msg00043.html

This email contains a lot of criticism. And it's all valid as well.

CouchDB feedback

Other examples:

These are great emails because they contain extremely valuable feedback.

Deal with it!

In my (humble) opinion, these kinds of emails are exactly what is necessary in CouchDB-land, and in many other open source projects: criticism and a little time to reflect on not so awesome features. And then moving on to make it better. If the feedback cycle doesn't happen, there's no development or evolution, just stagnation.

And in retrospect I wish more people would share their opinion on CouchDB and this situation more often. Since I'm personally invested in CouchDB, it's hard to say certain things. Honesty is sometimes brutal, but it's necessary.

In summary, a CouchDB user like Ubuntu1 (or Canonical) doesn't have the civic duty to give feedback, but to desert a project while pretending to be an Open Source vendor, and not talking to the community of the project or sharing your issues in public, that is extremely unhelpful.

Overall it strikes me that the only thing known to date about Canonical's collaboration with CouchDB is the support for OAuth in CouchDB. And most people don't even know about that (or wouldn't know how to use it). It worries me personally not to know what kind of problems Canonical ran into, because they seem so messed up that they couldn't be discussed in public.

CouchDB doesn't scale

One thing I was able to extract is: CouchDB doesn't scale.

Thanks! But no thanks.

I wrote a book on CouchDB and I pretty much used it all, or at least looked at it very, very closely. I also get plenty of experience with CouchDB through my job. Indeed, there are many situations where CouchDB doesn't scale or where it becomes extremely hard to make it scale. Situations where the user is better off putting data somewhere else.

I (and I'm assuming others) enjoy learning the reasons why things break, so we can take this experience and use it going forward. If this doesn't happen we might as well all subscribe to the koolaid of a closed source vendor and purchase update subscriptions, install security packs and happily live ever after.

A patch to make CouchDB scale?

Another piece of information I gathered from the various emails is that Canonical maintained CouchDB-specific patches for Ubuntu1. However, it's unknown what the purpose of these patches was: for example, whether they made CouchDB scale (magically) for Ubuntu1 or whether the patchset added a new feature.

What I'd really like to know is why these patches were not discussed in the open and why no one worked with the project on incorporating them into upstream. The upstream is the Apache CouchDB project.

This is another example of where communication went horribly wrong or just didn't happen.

A CouchDB company

I'm a little torn here and I don't want to offend anyone (further) especially since I know a couple Couchbase'rs or original CouchOne'rs (Hello to Jan, JChris and Mikeal) in person, but seriously: a lot of people realized that CouchOne stopped being The CouchDB company a long time ago.

This is not to say that the CouchDB project members who are employed by CouchOne/Couchbase are not dedicated to CouchDB. But if I take a look at the mobile strategy and the more or less recent integration of CouchDB with Membase/Memcache, I have to note that these strategies are far away from Apache CouchDB. From big data (whatever that means) to mobile and back.

The conclusion is that the majority of work done will not be merged into Apache CouchDB and this is one of the reasons why the Apache CouchDB project hasn't evolved much in a long time.

Not all changes can go upstream

I realize that when a company has a different strategy, not everything they do can be sent upstream. After all, most if not all companies operate in a world where money is to be made and goals are to be met. Nothing wrong there.

But let's take a look at the one project which could have been dedicated to Apache CouchDB: the documentation project.

CouchOne hired an ex-MySQL'er to write really great documentation for CouchDB. The documentation made sense, it was up to date with releases, contained lots of examples and what not. But it was never contributed to the open source project. The documentation is still online today, though it's now the documentation of the Couchbase Server HTTP API.

Wakey, wakey!

So in my opinion the biggest news is not that Canonical stopped using CouchDB and it's also not outrageous to think that there can be one CouchDB company. The biggest news is that Couchbase officially said: "It's not us!".

Having said that, and also not knowing much about Canonical's setup and scale, I still fail to even remotely understand why they didn't work all along with Cloudant, who specialize in making CouchDB scale.

CouchDB and Evolution

Of course it is unfair to single them (Couchbase employees) out like that. For the record, there are pretty vivid projects such as GeoCouch which are also funded by Couchbase and while being devoted to the project, these guys also have to meet goals for their company.

Add to that the fact that other CouchDB contributors have not driven substantial user-facing changes in Apache CouchDB either. CouchDB is still a very technical project with a lot of technical issues to solve. The upside to this situation is that while other NoSQL vendors add new buzzwords to each and every CHANGELOG, CouchDB is very conservative and stability driven. I appreciate that a lot.

User-facing changes, on the other hand, are just as important for the health of a project. Subtle changes aside, today's talks on, for example, querying CouchDB are extremely similar to the talks given a year or two ago. Whatever happens in this regard is not visible to users at all.

Take URL rewriting, virtualhosts and range queries as examples of features. I question:

  • their usefulness for 80% of the users
  • their quality
  • the state of their documentation

Users need to be able to grasp what's on the roadmap for CouchDB. There needs to be a way for not so technical users to provide feedback which is actually incorporated into the project. And all of this needs to work without opening issues in a monster like Jira.

Since no one bothers currently, this is not going to happen soon.

Pretty candid stuff.

Marketing

In terms of marketing and with a lack of an official CouchDB company, the CouchDB project has taken a PostgreSQL-attitude in the last two years.

In a nutshell:

We don't give a damn if you don't realize that our database is better than this other database.

This is a little dangerous for the project itself because when I look at the cash other NoSQL vendors pour into marketing for their NoSQL database, I quickly realize that with this lack of support the project can go away pretty soon.

CouchDB being an Apache project doesn't save me or anyone either: clean intellectual property, deserted, forever.

The various larger companies (let's say Cloudant and Meebo) are basically occupied with their own forks, with maybe too little reason to merge anything back to upstream yet. There are independent contributors such as Enki Multimedia who contribute to core but also to sub projects like CouchApp.

And then, there's Couchbase, which is trying to tie CouchDB behind Memcached. And from what I can tell, it pretty much abandons HTTP and other slower CouchDB principles in the process.

Is CouchDB alive and kicking?

You saw it coming: it depends!

Dear Jan, I'm still thinking about the email you wrote while I write my own blog entry. And honestly, that email and the general response raised more questions for myself and others than it answered.

I'd like to emphasize a difference I see (thanks, Lukas):

Core

Is the core of Apache CouchDB alive? — It's not dead.

  • Yes, because some companies drive a lot of stability into CouchDB.
  • No, because there's little or no innovation happening right now.

Ecosystem

There is a lot of innovation going on in CouchDB's ecosystem.

Most notably, the following projects come to mind:

  • BigCouch
  • Couchappspora
  • CouchDB-lucene
  • Doctrine ODM in PHP (and I'm sure there are similar projects in other languages)
  • ElasticSearch's river
  • erica
  • GeoCouch
  • Lounge (and lode)
  • various JavaScript libraries to connect CouchDB with CouchApps or node.js
  • various open data projects (like refuge.io)

Need more? Check out CouchDB in the wild which I think is more or less up to date.

Hate it or love it, there is plenty of innovation going on. And many (if not all) CouchDB committers are a part of it.

The innovation just doesn't happen in CouchDB's core.

Fin

My closing words are that I don't plan on migrating anywhere else. If anything, we have mostly migrated to BigCouch.

For Apache CouchDB, I think it's important that someone fills that void. That can be a company, a BDFL or more engaging project leaders (plural). I think this is required so the project stays lively.

Because I would really like to see the project survive.
