Managing software deployments of your PHP applications I

Disclaimer: I've been doing mostly PHP and Zend Framework based projects in the past two years, but the information from this article is general and should be applicable to most setups, even (to a certain extent) to non-PHP-based projects.

Inspired by Padraic's posting spree the other week, here's another attempt to provide you with some hands-on usefulness. I'm open to all feedback, and sorry for the length!

What is deployment and how do you manage deployments?

Wikipedia says:

Software deployment is all of the activities that make a software system available for use.

... and goes on:

The general deployment process consists of several interrelated activities with possible transitions between them. These activities can occur at the producer site or at the consumer site or both. Because every software system is unique, the precise processes or procedures within each activity can hardly be defined.

In general, there are different approaches to software deployment. Most people are probably not aware of a deployment process at all. They edit files and push them live. In most cases, live (sometimes referred to as the production environment) is the webhosting account; for consistency, the environment setup for larger projects also includes a development and a staging environment.

Taking the facts into account, we can summarize those efforts into three cases:

  1. Deploym..? I'm a skilled surgeon and shell ninja! I like to edit all my files online!
  2. I have a local WAMP, MAMP or LAMP and then FTP the files online.
  3. We have a defined process and use SVN, PEAR, phing or similar.

Why should I manage deployments?

There are a few reasons as to why it's good for you™ to come up with a release schedule to manage your deployments.

  1. Establishing a release schedule allows you to project the time necessary to implement and test new features and items.
  2. Planning in advance also makes it more likely that you actually meet your deadlines.
  3. A schedule enables code testing (and other QA measures) before it's live. For example, assuming the schedule says to release a new version every two weeks, the net time (ten business days) could be divided up into eight business days for development, and two business days for testing. (Adjust as needed!) Take note — the schedule excludes weekends!
  4. A release schedule helps the development team to turn down, without feeling bad or guilty, all those extra last-minute changes which break things and cause grey hair. The established practice and process has to be used by all people involved, and developers are not to blame if someone else forgets that.
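
To make the arithmetic from item 3 concrete, here is a minimal sketch in PHP of how the dates of such a two-week cycle could be worked out. The function name addBusinessDays() and the start date are made up for illustration, public holidays are ignored, and PHP 7 or later is assumed.

```php
<?php
/**
 * Return the date that lies $days business days (Monday to Friday)
 * after $start. Public holidays are NOT taken into account.
 */
function addBusinessDays(DateTimeImmutable $start, int $days): DateTimeImmutable
{
    $date = $start;
    while ($days > 0) {
        $date = $date->modify('+1 day');
        // format('N') is the ISO-8601 weekday: 1 = Monday ... 7 = Sunday
        if ((int) $date->format('N') < 6) {
            $days--;
        }
    }
    return $date;
}

// Hypothetical cycle that starts on a Monday.
$cycleStart    = new DateTimeImmutable('2009-01-05');
$testingStarts = addBusinessDays($cycleStart, 8);  // after eight days of development
$release       = addBusinessDays($cycleStart, 10); // after two more days of testing

echo 'Testing starts: ' . $testingStarts->format('Y-m-d') . "\n";
echo 'Release: ' . $release->format('Y-m-d') . "\n";
```

Adjust the 8 and 10 to whatever split between development and testing your team agrees on.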

PHP FastCGI woes!

Those of you who run high-traffic websites have probably tried php-cgi/fcgi somewhere down the road. And most of us have gone back to Apache.

But now (actually since the middle of 2007) there's light at the end of the tunnel. I read a blog post by Evert Pot last night (Apache speed and reverse proxies). Evert noted that he tried to use Lighttpd and php-fcgi, all the infamous tricks with spawn-fcgi.sh, etc., and failed.

He referenced my own blog post where I shared a similar experience; on a side note, I'm very glad I'm not the only one who's had these issues. One of the commenters on Evert's blog suggested that he use a project called php-fpm, which I had never heard of until then.

Drum roll!

So anyway, php-fpm is the work of Andrei Nigmatulin and it seems to be the end to all those problems. I spent a few hours last night reading up on it (with the help of Google Translate) and doing a test install, and it seemed pretty cool. I emailed Andrei and suggested that he add links to Google Translate from all pages, but he set up a wiki instead. Wee!

I spent two hours over the course of this day moving the pages into the wiki. And the result is:

Quo vadis?

First off, let me just add that the wiki is a work in progress, and you are welcome to contribute! A lot of the English is straight from Google Translate, which is naturally not perfect.

So far, I've been moving all Russian pages to English ones, and I'm hoping Andrei feels guilty (:-)) when he sees my pages and adds Russian back in. I've also emailed someone who translated the brief HowTo into Chinese!

Furthermore, I have not yet tried php-fpm but I'm excited and will let you know what the results are. If you are a step ahead of me (Well, I've overslept this thing since 2007!), please share your experience in the comment section!

Fixing up anti-spam plugins in Wordpress (and other apps) for Mosso

A lot of companies moved their web applications, or parts of them, to the cloud in 2008. Some people have had issues; for others (and AWS in particular), it's been one success story.

Because some of us like to focus on the business side and not run servers ourselves, providers like Mosso (a division of Rackspace) and MediaTemple offer scalable webhosting environments available to everyone.

Some of them call their offering cloud, others call it grid. Apparently it's the same. And I'm sure I am oversimplifying the services they offer (and I mean no disrespect), but scalable webhosting is what it really is.

Mosso in particular caught people's attention because they had a lot of issues in the beginning. Since most of us know there is no such thing as 100% uptime for 100 USD/month, I don't want to poke them too hard for it.

One important thing to take into account when moving into the cloud is that on the configuration side, any virtual solution is slightly different from regular webhosting.

In particular, one of the issues which my friend Allen Stern ran into when he moved to Mosso was that due to the virtual nature of the entire setup, none of his anti-spam plugins in Drupal and Wordpress worked. The reason is that the IP populated in $_SERVER['REMOTE_ADDR'] is always the IP of Mosso's load balancer, which runs in front of the server farm and distributes all traffic to servers where resources are available.

Mosso instead populates the $_SERVER['HTTP_X_CLUSTER_CLIENT_IP'] header, but because the majority of PHP developers are used to one very specific setup (the LAMP stack), they rarely spend time thinking about other environments, and so most plugins only ever look at REMOTE_ADDR.

In this case, the plugins will soon blacklist Mosso's load balancer and you will end up with a lot of comments which you will need to moderate. This blacklisting makes those plugins (e.g. Akismet, Mollom) useless.

While I'm certainly amazed that Mosso could not fix this at the server level, here are a couple of solutions (free of charge) which their customers can use to fix the problem themselves.

PHP to the rescue

For everyone involved, there are multiple solutions to this problem.

The hack!

Mass-replace REMOTE_ADDR with HTTP_X_CLUSTER_CLIENT_IP.

The disadvantage is that if you run software such as Wordpress, you will lose the easy update feature since you edited all files.

Still semi-dirty

Find a file (e.g. a configuration file) which is included by the software everywhere and add the following line into it: $_SERVER['REMOTE_ADDR'] = $_SERVER['HTTP_X_CLUSTER_CLIENT_IP'];

There is no real disadvantage here; the only thing you need to keep in mind is that you probably need to re-add this line to the file in case the software itself updates the file and overwrites your changes.
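
Here is a slightly more defensive sketch of that one-liner. It assumes PHP 7 or later, and fixRemoteAddr() is a hypothetical helper name, not something Mosso or any of the plugins provide. It only touches REMOTE_ADDR when the header is actually set and contains a valid IP address, so the same file keeps working on a regular server without a load balancer.

```php
<?php
/**
 * Return a copy of $server in which REMOTE_ADDR is replaced by the
 * client IP from the load balancer's header, but only if that header
 * is present and holds a syntactically valid IP address.
 */
function fixRemoteAddr(array $server, string $header = 'HTTP_X_CLUSTER_CLIENT_IP'): array
{
    if (isset($server[$header])
        && filter_var($server[$header], FILTER_VALIDATE_IP) !== false
    ) {
        $server['REMOTE_ADDR'] = $server[$header];
    }
    return $server;
}

// Apply the fix to the current request.
$_SERVER = fixRemoteAddr($_SERVER);
```

Keep in mind that a client can send this header itself, so this is only reasonable when every request is known to pass through the load balancer.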

The clean solution!

My favorite is to put the above statement into its own file (e.g. ip-fix.php) and use auto_prepend_file to fix the IP everywhere, period. The great advantage here is that this fix (while probably not the best in terms of performance) is more or less independent of the server you run (although .htaccess requires Apache, or at least htscanner) and of all the updates and changes you make to the software.

In a nutshell, you should paste the following into a .htaccess file:

php_value auto_prepend_file /complete/path/to/ip-fix.php

Would you trade an arm for a leg?

All three solutions are of course less than ideal because they require the customer to fix something that should be fixed on the server side. For example, Mosso could patch Apache to override the header, or use a webserver such as nginx which does it out of the box.

According to my buddy Allen, it worked for him, and Mosso wants to roll out my work-around for all customers. (Just by the way Mosso — I'm always available for consulting! :-D)

Other providers

I do know that this is an issue with other providers as well. And while Mosso uses HTTP_X_CLUSTER_CLIENT_IP, all you need to find out is where your provider hides the real IP address in order to apply this workaround to your environment. And that's all.

Here is an idea of how to go about it:

  1. Go to http://whatismyip.com and write down your IP address.
  2. Create a .php file with the following contents in it, and upload it: <?php phpinfo(); ?>
  3. Open the URL of the file in your browser and look for your IP address.
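
Instead of scanning the phpinfo() output by eye, the lookup in step 3 could also be scripted. A rough sketch, assuming PHP 7 or later: findIpKeys() is a hypothetical helper, and 203.0.113.7 is only a placeholder for the address you wrote down in step 1.

```php
<?php
/**
 * Return the keys of $server whose (string) value contains $ip.
 * This is a plain substring match, so double-check the result by hand.
 */
function findIpKeys(array $server, string $ip): array
{
    $matches = [];
    foreach ($server as $key => $value) {
        if (is_string($value) && strpos($value, $ip) !== false) {
            $matches[] = $key;
        }
    }
    return $matches;
}

// Placeholder: replace with the IP address from step 1.
$myIp = '203.0.113.7';

echo implode(', ', findIpKeys($_SERVER, $myIp)) . "\n";
```

Upload the file, open it in your browser, and it lists every $_SERVER entry that carries your address. Delete it again afterwards.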

In case there is no other IP-related header populated, you will need to rely on the client-side to get this IP and/or utilize captchas to defend yourself from spam. Or, of course, move providers. ;-)

Drobo with DroboShare on XP, Vista, MacOSX, Ubuntu

I bought a Drobo for myself about seven months ago and I couldn't be any happier. My files are backed up on a RAID system, and I've still got plenty of space to waste. My world is OK.

Some friends of mine recently bought one of the new Drobo units with a DroboShare. The DroboShare costs $200 (USD) and is a glorified Linux server which exports your Drobo using Samba to all clients on the network.

My friends are using Windows and MacOSX to connect, so after some initial problems where they were running FAT32 and the Drobo decided to go unlabeled, we decided to format the unit and use HFS+ instead.

Unlabeled?

I googled this and to my surprise there is no information available; Data Robotics keeps it all pretty well hidden behind case numbers in their ticketing system. It would be nice if they provided more details on why a Drobo unit would end up in an unlabeled state.

HFS, or what's your flavour?

Our reasons to select HFS+ are:

  • It's a modern filesystem (vs. FAT32) with journaling.
  • If all fails, you can hook it up to the Mac and use DiskWarrior to recover the volume.

If you are not using a Mac and stay in a Windows-only environment, it makes more sense to select NTFS; in a Linux-only environment you are safe with ext3. I would select a filesystem which still allows you to hook up the Drobo to any of your clients in order to be able to easily recover the volumes in case they decide to stop working.

Network write issues

When we set up the Drobo we tried to copy 2 GB from various clients to it. What took between 7 and 9 minutes on most clients was estimated at 30 hours on Tiger. ;-)

To rule out an issue with the Drobo, we briefly tested the performance from various systems. The candidates included Windows XP, Windows Vista, MacOSX 10.5.4 (Leopard), Ubuntu 8.10 and MacOSX 10.4.11 (Tiger). The Drobo performed well on all systems -- except for Tiger.

Researching the network issue on Google, I found various people who reported all kinds of network issues with 10.4.11, and since there are other random crashes on the same workstation, we declared the Drobo to be working and performing well. The workstation is subject to a system overhaul next week.

Ubuntu

Setting up the Drobo on Ubuntu is pretty easy -- granted, there is no Drobo Dashboard, and drobo-utils only supports units which are connected directly to the workstation via USB or Firewire.

To set up the share, you could mount \\DroboShare\DROBO via Samba and provide the same credentials you use on Windows/Mac when the Drobo Dashboard prompts you for a login to the DroboShare. An alternative with a GUI is to use Places > Connect To Server.

The DroboShare announces itself on the network and registers as (well) DroboShare in DNS/WINS (netbios?) -- if you decide to use its IP to set up shares and so on, make sure to assign a static IP to the DroboShare (the MAC address is on the bottom side of it) so the IP doesn't change when you restart your router or the lease expires.

Hope this helps!

Seven Things -- Tagged by Chuck Burgess

Thanks to Chuck, I've been tagged — rejoice. (Edit: I've also been tagged by Greg.) Now because I'm a web2 slut, it'll be hard to tell anyone seven things which they don't know already, but here we go anyway!

Seven things

  1. Little did you know, but my middle name is Felix.
  2. I'm deadly afraid of rodents. =(
  3. By birth, I'm a real commie. I was born in Karl-Marx-City, East Germany, before the wall came down.
  4. I can go without computer and Internet for two weeks!
  5. I'm a liberal at heart and a member of the liberal party of Germany.
  6. Back at school, I argued myself through my history exam without using a single date, ever.
  7. I got interested in the Internet because my mother bought a book on HTML for her job.

I hereby tag

This is pretty hard because I'm trying not to tag people twice. So bear with me.

  1. Christian Weiske — because he's pretty brilliant, and a friend.
  2. Thilo Utke — who does Ruby (on Rails), but talks about testing without a prejudice for programming languages.
  3. Thomas Bruederli — for founding RoundCube, the best webmail to date!
  4. Just — who doesn't do PHP either, but takes awesome photos and writes about street and urban art in Berlin, and Europe.
  5. Helgi — for all the hard work^H^H^H^H^H^H^H^H^H^H^commits, very early in the morning.
  6. Colin Percival, because he does a lot of awesome stuff (portsnap, freebsd-update, depenguinator, ...) for FreeBSD!
  7. Last but not least, my good friend Allen! — Because he's one of my best friends and because I think he can tag more interesting people.

The rules

  • Link your original tagger(s), and list these rules on your blog.
  • Share seven facts about yourself in the post - some random, some weird.
  • Tag seven people at the end of your post by leaving their names and the links to their blogs.
  • Let them know they've been tagged by leaving a comment on their blogs and/or Twitter.