
Women in Tech

One thing annoyed me a lot in 2011: the general topic of Women in Tech.

I'm not annoyed because I don't like women or don't want them to attend conferences. I'm annoyed because nothing happens.

Status Quo

For the most part it's guys at conferences who discuss what can be done about it.

For example, I don't remember how exactly we got into this discussion at Funconf, but at the time we even had two or three women in the room when this was brought up. And I guess two women in a room with 15-20 guys is an exceptionally high rate for a tech conference.

Side-note: I really wish I had asked them at the time how comfortable they are with this topic to begin with.

Outside conferences, the discussion happens in blogs and on Twitter. For example, there are frequently women in my Twitter timeline (I don't want to single anyone out) who mention things like the female-to-male ratio of attendees at tech conferences ("Too many guys!") and usually conclude that there are too many guys giving talks.

Of course they have every right to mention this, but are women even submitting talks to these conferences? It often seems like a chicken-and-egg problem to me.

Rhetorical question: Am I an asshole for pointing out that complaining gets you nowhere?

Issues

Drifting off the gender topic, there are in fact many other issues in the tech world.

For example, let's take a brief look at another sensitive subject: the ratio of white and non-white attendees at tech conferences (in the western world). I would be blind if I said there is no racism. Of course it's omnipresent, but that doesn't make everyone a racist.

Racial issues aside, there are countless other examples where people might not feel welcome or at home when they attend a tech conference. Change has to come on many different levels, and while some people might say "Boys will be boys", that doesn't mean we shouldn't be a little more aware.

Making change

Getting back to my original topic: of course part of changing the game is that conferences will also have to cater to women.

I'm honestly not sure what exactly needs to be done. Part of it would be to drop panel discussions about "Women in Tech". I got the suggestion that this is not just annoying for male attendees but a reason for potential female attendees to avoid a conference as well.

Leadership

So apart from conferences changing, I think the key is: women need to get involved.

First off: it's tough to go to places where you are a minority. I've done that myself, I can relate. I also realize there are women at tech conferences who do this already. But others who are currently more vocal on Twitter or blogs need to follow them and do the same.

Words are powerful. But they won't take us all the way. Actions are required: please lead by example and change will follow.

More input

One of the areas where conferences need female expertise is (obviously) running a conference. This may sound a little snarky, but I doubt that guys will get it right otherwise. Women need to shape conferences from the top in order to change them. Join up, or roll your own.

Another important part is giving talks. It's simple: if you'd like to see more female speakers at conferences, you should submit a talk. I find it a little unbearable when the most vocal people demand more female speakers at conferences but do not submit talks themselves to begin with.

Fin

For the past few years, men have complained that there are no women at tech conferences. And when men try to answer why, I think they are just guessing. If women know why, then we should start to discuss a solution.

If women don't know why, then maybe they need to ask themselves.

That's me venting — from 10,000 ft. Hit me up if you want to discuss any of this.

From Subversion to GIT (and beyond!)

Here's a more or less simple way to migrate from Subversion to GIT(hub). This includes mapping commits and tags and what not!

Authors

If multiple people contributed to your project, this is probably the toughest part. If you're not migrating from, let's say, Google Code but from PHP's Subversion repository, then it's pretty simple: the username is the email address.

I found a nifty bash script to get it done (and adjusted it a little bit):

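#!/usr/bin/env bash
# List every unique committer from the Subversion log and map each one
# to a "user = user <user@php.net>" line for git-svn's authors file.
authors=$(svn log -q | grep -e '^r' | awk 'BEGIN { FS = "|" } ; { print $2 }' | sort | uniq)
# Unquoted on purpose: word splitting strips the padding awk leaves around names.
for author in ${authors}; do
  echo "${author} = ${author} <${author}@php.net>";
done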

Since I migrated my project already, I didn't have the Subversion tree handy. That's why I used another package I maintain to demo this.

This is how you run it (assumes you have chmod +x'd it before):

# ./authors.sh
cvs2svn = cvs2svn <...>
cweiske = cweiske <...>
danielc = danielc <...>
gwynne = gwynne <...>
kguest = kguest <...>
pmjones = pmjones <...>
rasmus = rasmus <...>
till = till <...>

If you redirect the output to authors.txt, you're done.

Note: In case people don't have the email address you used on their Github account, they can always add it later on. Github allows you to use multiple email addresses, which is pretty handy for stuff like open source and work-related repositories.
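For projects where the username is not a php.net account, the same idea can be sketched as a small function that reads the log from stdin and takes the email domain as a parameter. This is only a sketch: the function name svn_authors_map and the domain example.org are made up for illustration, and usernames that don't map cleanly to a mailbox still need fixing by hand.

```shell
#!/usr/bin/env bash
# Sketch: turn `svn log -q` output (read from stdin) into authors.txt lines.
# svn_authors_map is a hypothetical helper, not part of the original script.
svn_authors_map() {
  local domain="${1:-php.net}"   # email domain, defaults to php.net
  # Revision lines look like: r123 | till | 2011-12-01 10:00:00 +0100 (...)
  grep -E '^r[0-9]+ \|' \
    | awk -F'|' '{ gsub(/^[ \t]+|[ \t]+$/, "", $2); print $2 }' \
    | sort -u \
    | while read -r author; do
        printf '%s = %s <%s@%s>\n' "$author" "$author" "$author" "$domain"
      done
}

# Usage (inside a Subversion working copy):
#   svn log -q | svn_authors_map example.org > authors.txt
```

Reading from stdin keeps the function easy to try on a saved log file before running it against the real repository.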

git clone

This part took me a long time to figure out, especially because of the semi-weird setup in Google Code. The wiki is in Subversion as well, so the repository root is not the root where the software lives. This is probably a non-issue if you want to migrate the wiki as well, but I don't see why you would clutter master with it. Instead, I'd suggest migrating the wiki content into a separate branch.

Without further ado, this works:

# git svn clone --authors-file=./authors.txt --no-metadata --prefix=svn/ \
--tags=Services_ProjectHoneyPot/tags \
--trunk=Services_ProjectHoneyPot/trunk \
--branches=Services_ProjectHoneyPot/branches \
http://services-projecthoneypot.googlecode.com/svn/ \
Services_ProjectHoneyPot/

The final steps are to add a new remote and push master to it. Done!
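Those final steps could look roughly like the sketch below. The function name publish_repo, the remote name origin and the URL in the usage line are placeholders; it assumes the new remote repository already exists and is empty.

```shell
#!/usr/bin/env bash
# Sketch: publish the converted repository to a new, empty remote.
# publish_repo is a hypothetical helper; run it inside the cloned repository.
publish_repo() {
  local remote_url="$1"
  git remote add origin "$remote_url"
  # -u makes the local master track origin/master for later pulls and pushes.
  git push -u origin master
  # Tags are not pushed by default; this is a no-op if you have no GIT tags yet.
  git push origin --tags
}

# Usage:
#   cd Services_ProjectHoneyPot && publish_repo git@github.com:USER/REPO.git
```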

You can see the result on Github.

A shortcut with Google Code

I facepalm'd myself when I saw that I could have converted my project on Google Code from Subversion to GIT.

This seems a lot easier since it would allow me to just push the branch to Github without cloning etc. I'm not sure how it would spin off the wiki-content and how author information is preserved, but I suggest you try it out in case you want to migrate your project off of Google Code.

Doing it the other way is not time wasted since I had to figure out the steps regardless.

Summary

There seem to be literally a million ways to migrate (from Subversion) to GIT. I hope this example got you one step closer to your objective.

The biggest problem when migrating is that people in Subversion-land often screw up tags by committing to them (I'm guilty as well). I try to avoid that in open source, but as far as work is concerned, we sometimes need to hotfix a tag and just redeploy instead of running through the process of recreating a new tag (which is sometimes super tedious when a large Subversion repository is involved).

When I migrated a repository the other day, I noticed that these tags became branches in GIT. The good news is, I won't be able to do this anymore with GIT (Yay!), which is good because it forces me to create a more robust and streamlined process to get code into production. But how do I fix this problem during the migration?

Fix up your tags

If you happen to run into this problem, my suggestion is to migrate trunk only and then re-create the tags in GIT.

GIT allows me to create a tag based on a branch or based on a commit. Both options are simple and much better than installing a couple of Python and/or Ruby scripts to fix your tree, which either end up not working or require a PhD in math to understand.

To create a tag from a branch, I check out the branch and tag it. This may work best during a migration, and of course it depends on how many tags need to be (re-)created and whether you had tags at all. Creating a tag based on a commit comes in handy when you forgot to create a tag at all: for example, you fixed a bug for a new release and then ended up refactoring a whole lot more.
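The branch-based variant could be sketched like this. The helper name tag_from_branch and the branch and tag names in the usage line are placeholders; what the former Subversion tags are actually called in your clone depends on the options you passed to git svn.

```shell
#!/usr/bin/env bash
# Sketch: re-create a tag from a branch by checking it out and tagging it.
# tag_from_branch is a hypothetical helper; run it inside the repository.
tag_from_branch() {
  local branch="$1" tag="$2"
  git checkout -q "$branch"
  # -a creates an annotated tag, -m supplies its message non-interactively.
  git tag -a "$tag" -m "Tag $tag (re-created from $branch)"
  # Go back to whatever was checked out before.
  git checkout -q -
}

# Usage:
#   tag_from_branch release-0.5.3 0.5.3
```

Once the tag exists and has been checked, the branch it came from can be deleted.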

In order to get the history of your GIT repository, try git log:

# git log --pretty=oneline
1d973dfe6f6e361e6f54953f374d60289bb0abea add AllTests
f53404579f5416058937941d0609df4720717cae  * update package.xml for 0.6.0 release
d5b42eef2035d26b1e1d119ff44a09efa418685e  * refactored Services_ProjectHoneyPot_Response:    * no static anymore    * type-hinting all around
82d7e8d229109565d42f98c6548354f85734583c make skip more robust
b9e77a427eb546bacce600f5bc41546e85c148d7 prep package.xml for 0.6.0
6cecfbc19c00f0bf06b800c297e23f00cee650ef  * remove response-format mambojambo
036a9d9509adb456114f601c60d188839c012004 make test more robust
713c4ec91c28e19fbb33d6bb853ced0bdeb3f321  * update harvester's IP (this fails always)
755a2bba8f8506525a9cd2a1f11b266b7d26bbe6 throw exception if not a boolean
2aa21913946e2b4b3db949233a118dbbe2e34bf4  * all set*() are a fluent interface now  * update from Net_DNS_Resolver to Net_DNS2_Resolver  * dumb down setResolver(): Net_DNS2_Resolver is created in __const
81f544d880fc7b7a6321be9420b911817b567bd1 update docblock
dbe74da67c5fa1f1209fe85d3050041ca2a2de6b  * update docblock  * fix cs, whitespace
...

If I wanted to create a tag based on a certain commit (e.g. see last revision in the previous listing), I'd run the following command:

# git tag -a 0.5.4 dbe74da67c5fa1f1209fe85d3050041ca2a2de6b

Pro-tip: GIT allows you to create tags based on part of the hash, too. Try "dbe74da", it should work as well.

That's all.

Things to learn

Moving from Subversion to GIT doesn't require too much to relearn. Instead of just commit, you also push and pull. These commands will get you pretty far if you just care for the speed and not for the rest.

Since I'm hoping you want more, here are a couple of things to look into:

  • branching
  • merging
  • git add -p
  • git remote add

Especially branching and merging are almost painless with GIT. I highly, highly recommend you make heavy use of it.
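To get a feel for it, here is a small sketch that plays through one branch-and-merge cycle in a throwaway repository, so nothing touches a real project. The file name, branch name and identity are made up for the demo.

```shell
#!/usr/bin/env bash
# Sketch: one branch/merge round trip in a scratch repository.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q
git symbolic-ref HEAD refs/heads/master   # use "master" whatever the local default is
git config user.name "Demo User"          # identity only for this scratch repo
git config user.email "demo@example.org"

echo "first" > notes.txt
git add notes.txt
git commit -q -m "initial commit"

# Branching: create a topic branch and commit on it.
git checkout -q -b feature-x
echo "second" >> notes.txt
git commit -q -am "work on feature-x"

# Merging: switch back and merge; --no-ff keeps a merge commit in the history.
git checkout -q master
git merge -q --no-ff -m "merge feature-x" feature-x
git branch -d feature-x

git log --oneline --graph
```

The --no-ff flag is a matter of taste: without it GIT would simply fast-forward master here and the history would stay linear.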

While GIT is sometimes a brainf*ck (e.g. submodules, commit really stages, subtree, absence of _switch_), the many benefits usually outweigh the downsides. The one and only thing I truly miss currently is svn:externals. However, I'm hoping to master subtree one day and then I'll be a very happy camper.

Fin

That's all.

Quo vadis, CouchDB?

Update, 2011-12-21: Couchbase posted their review of 2011 (the other day) — TL;DR: Couchbase Single Server (their Apache CouchDB distribution) is discontinued and its documentation (and its buildtools) will be contributed to Apache CouchDB.


When Ubuntu1 dropped CouchDB two weeks ago, there were a couple of things which annoy (present tense) me a lot. Add to that the general echo from various media outlets and blogs which pronounced CouchDB dead, and a general misconception about how this situation, or CouchDB in general, is dealt with.

Some people said I am caremad about CouchDB and that is probably true. Let me try to work through these things without offending more people.

Ubuntu1

What annoy[ed,s] me about this situation is that I wrote a chapter about Ubuntu1 in my CouchDB book. And while I realize that as soon as a book is published the information is outdated, I also want to say that I could have used the space for another project.

I talked to a couple of people about CouchDB at Ubuntu1 on IRC and no one made it sound like they are having huge or for that matter any issues.

Of course I work for neither Canonical nor Couchbase, and I haven't signed any NDAs etc. But looking back a week or two, my well-educated guess is that not even the people at Couchbase knew there were fundamental issues with CouchDB and Ubuntu1.

The NDA-part is of course an assumption: don't quote me on it.

Transparency

Scumbag Ubuntu1 drops CouchDB and doesn't say why. — myself on Twitter

First off: I'm not really sorry. I was abusing a meme and if you read my Twitter bio, you should not take things personally.

I also should have known better since it's not like I expect anything transparent from Canonical. (Just said it.)

When people are compelled to write a press release and put it out like that, they should expect a backlash. The reason why I reacted harshly is that Canonical didn't share any valuable information on why they discontinued using CouchDB except for: it doesn't scale.

And I'm not aware of anything concrete to date.

Helpful criticism — how does it work?

Please take a look at the following email: https://lists.launchpad.net/u1db-discuss/msg00043.html

This email contains a lot of criticism. And it's all valid as well.

CouchDB feedback

Other examples:

These are great emails because they contain extremely valuable feedback.

Deal with it!

In my (humble) opinion, these kinds of emails are exactly what is necessary in CouchDB-land and in many other open source projects: criticism and a little time to reflect on not so awesome features. And then moving on to make it better. If the feedback cycle doesn't happen, there's no development or evolution, just stagnation.

And in retrospect I wish more people would share their opinion on CouchDB and this situation more often. Since I'm personally invested in CouchDB, it's hard to say certain things. Honesty is sometimes brutal, but it's necessary.

In summary, a CouchDB user like Ubuntu1 (or Canonical) doesn't have the civic duty to give feedback, but to desert a project while pretending to be an Open Source vendor, and not talking to the community of the project or sharing your issues in public, that is extremely unhelpful.

Overall it strikes me that the only thing to date known about Canonical's collaboration with CouchDB is the support for OAuth in CouchDB. And most people don't even know about that (or wouldn't know how to use it). It worries me personally to not know the kind of problems Canonical ran into because they seem so messed up that they couldn't be discussed in public.

CouchDB doesn't scale

One thing I was able to extract is: CouchDB doesn't scale.

Thanks! But no thanks.

I wrote a book on CouchDB and I pretty much used it all, or at least looked at it very, very closely. I also get plenty of experience with CouchDB due to my job. Indeed, there are many situations where CouchDB doesn't scale or where it becomes extremely hard to make it scale. Situations where the user is better off putting data somewhere else.

I (and I'm assuming others) enjoy learning the reasons why things break, so we can take this experience and use it going forward. If this doesn't happen we might as well all subscribe to the koolaid of a closed source vendor and purchase update subscriptions, install security packs and happily live ever after.

A patch to make CouchDB scale?

Another piece of information I gathered from the various emails is that Canonical maintained CouchDB-specific patches for Ubuntu1. However, it's unknown what the purpose of these patches was: for example, whether they (magically) made CouchDB scale for Ubuntu1 or whether the patchset added a new feature.

What I'd really like to know is why these patches were not discussed in the open and why no one worked with the project on incorporating them into upstream. The upstream is the Apache CouchDB project.

This is another example of where communication went horribly wrong or just didn't happen.

A CouchDB company

I'm a little torn here and I don't want to offend anyone (further) especially since I know a couple Couchbase'rs or original CouchOne'rs (Hello to Jan, JChris and Mikeal) in person, but seriously: a lot of people realized that CouchOne stopped being The CouchDB company a long time ago.

This is not to say that the CouchDB project members who are employed by CouchOne/Couchbase are not dedicated to CouchDB. But if I take a look at the mobile strategy and the more or less recent integration of CouchDB with Membase/Memcache, I must notice that these strategies are far away from Apache CouchDB. Big data (whatever that means), to mobile and back.

The conclusion is that the majority of work done will not be merged into Apache CouchDB and this is one of the reasons why the Apache CouchDB project hasn't evolved much in a long time.

Not all changes can go upstream

I realize that when a company has a different strategy, not everything they do can be sent upstream. After all, most if not all companies operate in a world where money is to be made and goals are to be met. Nothing wrong there.

But let's take a look at the one project which could have been dedicated to Apache CouchDB: the documentation project.

CouchOne hired an ex-MySQL'er to write really great documentation for CouchDB. The documentation made sense, it was up to date with releases, contained lots of examples and what not. But it was never contributed to the open source project. The documentation is still online today, though it's now the documentation of the Couchbase Server HTTP API.

Wakey, wakey!

So in my opinion the biggest news is not that Canonical stopped using CouchDB and it's also not outrageous to think that there can be one CouchDB company. The biggest news is that Couchbase officially said: "It's not us!".

Having said that and also not knowing much about Canonical's setup and scale, I still fail to even remotely understand why they didn't work with Cloudant, who have specialized in making CouchDB scale all along.

CouchDB and Evolution

Of course it is unfair to single them (Couchbase employees) out like that. For the record, there are pretty vivid projects such as GeoCouch which are also funded by Couchbase and while being devoted to the project, these guys also have to meet goals for their company.

Add to that, that other CouchDB contributors involved have not driven substantial user-facing changes in Apache CouchDB either. CouchDB is still a very technical project with a lot of technical issues to solve. The upside to this situation is that while other NoSQL vendors add new buzzwords to each and every CHANGELOG, CouchDB is very conservative and stability driven. I appreciate that a lot.

User-facing changes, on the other hand, are just as important for the health of a project. Subtle changes aside, today's talks on, for example, querying CouchDB are extremely similar to the talks given a year or two ago. Whatever happens in this regard is not visible to users at all.

Take URL rewriting, virtualhosts and range queries as examples for features. I question:

  • the usefulness for 80% of the users
  • the rather questionable quality
  • the state of their documentation

Users need to have the ability to grasp what's on the roadmap for CouchDB. There needs to be a way for not so technical users to provide feedback which is actually incorporated into the project, and it has to be something other than opening issues in a monster like Jira.

Since no one bothers currently, this is not going to happen soon.

Pretty candid stuff.

Marketing

In terms of marketing and with a lack of an official CouchDB company, the CouchDB project has taken a PostgreSQL-attitude in the last two years.

In a nutshell:

We don't give a damn if you don't realize that our database is better than this other database.

This is a little dangerous for the project itself: when I look at the cash other NoSQL vendors pour into marketing their databases, I quickly realize that with this lack of support the project could go away pretty soon.

CouchDB being an Apache project doesn't save me or anyone either: clean intellectual property, deserted, for forever.

The various larger companies (let's say Cloudant and Meebo) are basically busy with their own forks, with maybe too little reason to merge anything back upstream yet. There are independent contributors such as Enki Multimedia who contribute to core but also to sub-projects like CouchApp.

And then there's Couchbase, which is trying to tie CouchDB behind Memcached and, from what I can tell, pretty much abandons HTTP and other slower CouchDB principles in the process.

Is CouchDB alive and kicking?

You saw it coming: it depends!

Dear Jan, I'm still thinking about the email you wrote while I write my own blog entry. And honestly, that email and the general response raised more questions for myself and others than it answered.

I'd like to emphasize a difference I see (thanks, Lukas):

Core

Is the core of Apache CouchDB alive? — It's not dead.

  • Yes, because some companies drive a lot of stability into CouchDB.
  • No, because there's little or no innovation happening right now.

Ecosystem

There is a lot of innovation going on in CouchDB's ecosystem.

Most notably, the following projects come to mind:

  • BigCouch
  • Couchappspora
  • CouchDB-lucene
  • Doctrine ODM in PHP (and I'm sure there are similar projects in other languages)
  • ElasticSearch's river
  • erica
  • GeoCouch
  • Lounge (and lode)
  • various JavaScript libraries to connect CouchDB with CouchApps or node.js
  • various open data projects (like refuge.io)

Need more? Check out CouchDB in the wild which I think is more or less up to date.

Hate it or love it: there is plenty of innovation going on. And many (if not all) CouchDB committers are a part of it.

The innovation just doesn't happen in CouchDB's core.

Fin

My closing words are that I don't plan on migrating anywhere else. If anything, we have mostly migrated to BigCouch.

For Apache CouchDB, I think it's important that someone fills that void. That can be either a company, a BDFL or more engaging project leaders (plural). I think this is required so the project continues vividly.

Because I would really like to see the project survive.

Cooking PHPUnit (and a chef-solo example on top)

I'm sure most of you noticed that with the recent upgrade of PHPUnit to version 3.6, a lot of breakage was introduced in various projects.

And Zend Framework 1.x, for example, won't update to the latest version either. When I ranted on Twitter, someone sent me Christer Edvartsen's blog post on how to set up multiple versions of PHPUnit. It's really neat since it walks you through the setup step by step and you learn about things such as --installroot on the way. --installroot in particular is something I never ever saw before and I've been using PEAR for more than a few years now. So kudos to Christer for introducing me to it.

The only thing to add from my side would be: why are you guys not aggregated on planet-php?

Cooking with Chef

Another reason why I decided to write this blog entry was that I created a chef-recipe based on Christer's blog entry.

If you have followed my blog for a while, you might have noticed that I'm a huge fan of automation. I just moved one of our development servers the other day and had one of these moments where something just paid off. Taking for granted that I can spin up fully operational EC2 instances in minutes, I also had our development stack installed and configured in an instant.

My recipe basically follows Christer's instructions and because I distribute phpunit's command along with it, editing of the file is no longer required: when the chef run completes, phpunit34 is installed and ready to be used.

Get started

I'm doing the following commands as root — my setup is in /root/chef-setup.

Install chef(-solo) and clone my cookbooks

shell# gem install --no-ri --no-rdoc chef
... 
shell# git clone git://github.com/till/easybib-cookbooks.git
...

Chef configuration

Then set up a node.json file which chef-solo will need to run:

{
  "run_list": [
    "recipe[phpunit]"
  ]
}

Then create a solo.rb:

file_cache_path "/var/chef-solo"
cookbook_path ["/root/chef-setup/easybib-cookbooks"]
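If you prefer not to create the two files by hand, they can be written from the shell in one go. This is just a sketch: the function name write_chef_solo_config is made up, and it assumes the cookbooks were cloned into the setup directory as shown earlier.

```shell
#!/usr/bin/env bash
# Sketch: write node.json and solo.rb into the chef setup directory.
# write_chef_solo_config is a hypothetical helper name.
write_chef_solo_config() {
  local setup_dir="${1:-/root/chef-setup}"
  mkdir -p "$setup_dir"

  # Quoted delimiter: the JSON is written literally, nothing gets expanded.
  cat > "$setup_dir/node.json" <<'JSON'
{
  "run_list": [
    "recipe[phpunit]"
  ]
}
JSON

  # Unquoted delimiter: $setup_dir is expanded into the cookbook path.
  cat > "$setup_dir/solo.rb" <<RUBY
file_cache_path "/var/chef-solo"
cookbook_path ["$setup_dir/easybib-cookbooks"]
RUBY
}

# Usage:
#   write_chef_solo_config /root/chef-setup
```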

Chef run

Finally we start chef-solo with the following command:

shell# chef-solo -c /root/chef-setup/solo.rb -j /root/chef-setup/node.json -l debug
...

The command runs chef-solo (which is part of the gem we installed) and reads the basic configuration from the solo.rb-file. This file contains the location of the cookbooks (remember git clone ...) and a path to cache files. You don't need to create anything, it should be all taken care of.

The node.json-part allows us to set node-specific values. The prime example is the run-list, but it allows you to set attributes as well. Attributes contain values for variables used in recipes, but are not used in this example.

Last but not least: -l debug — a lot of useful output, but we usually run with -l warn. And if this is interesting enough for you, I suggest the other blog entries I wrote on this topic.

Did it work?

Depending on the location of your pear setup — usually /usr/bin/pear or /usr/local/bin/pear — the phpunit34 script is created in the same path:

shell# which phpunit34
/usr/local/bin/phpunit34

Yay!
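If you want to script that check, for example at the end of a provisioning run, a sketch could compare the directories of both commands. The helper name same_bin_dir is made up; it only relies on command -v and dirname.

```shell
#!/usr/bin/env bash
# Sketch: succeed only if both commands exist and live in the same directory.
# same_bin_dir is a hypothetical helper name.
same_bin_dir() {
  local a b
  a=$(command -v "$1") || return 1   # first command not found in PATH
  b=$(command -v "$2") || return 1   # second command not found in PATH
  [ "$(dirname "$a")" = "$(dirname "$b")" ]
}

# Usage:
#   same_bin_dir pear phpunit34 && echo "phpunit34 sits next to pear"
```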

Fin

This feels like hitting two birds with one stone. Though just as a figure of speech! I object to violence against birds.

It might be overkill to set up chef to just install phpunit 3.4 by itself, but I think this serves as a stellar example of how you can leverage the power of chef to get more done. Writing a couple more recipes to install and configure the rest of your stack shouldn't be too hard.

If you'd like to see anything in particular: I'll take requests via email, Twitter or in the comments.