PHP performance III -- Running nginx

Sunday, May 31. 2009

Since parts one and two were uber-successful, here's an update on my Zend Framework PHP performance situation. I've also had this post sitting around since the beginning of May and I figured if I don't post it now, I never will.

Disclaimer: All numbers (aka pseudo benchmarks) were not taken on a full moon and are (of course) very relative to our server hardware (e.g. DELL 1950, 8 GB RAM) and environment. The application we run is Zend Framework-based and currently handles between 150,000 and 200,000 visitors per day.

Why switch at all?

In January of this year (2009), we started investigating the 2.2 branch of the Apache webserver. Because we had used Apache 1.3 forever, we never felt the need to upgrade to Apache 2.0 or 2.2. After all, you're probably familiar with the "if it's not broken, don't fix it" approach.

Late last year we ran into a couple of (maybe) rather FreeBSD-specific issues with PHP and its opcode cache APC. I am by no means an expert on the entire situation, but from reading mailing lists and investigating on the server, this seemed to be expected behavior. In a nutshell: Apache 1.3 and a large opcode cache on a newer version of FreeBSD (7) were bound to fail with larger amounts of traffic.

We tried bumping up a few settings (pv entries), but we just ran into the same issue again and again.

Because the architectures of Apache 1.3 and 2.2 are so different from one another (and upgrading to 2.2 was the proposed solution), I went on to explore the upgrade. And once I completed the switch to Apache 2.2, my issues went away.

So far, so good!

Performance?

On the performance side we experienced rather mediocre results.

While our benchmarks showed that a static file could be served at around 300 requests per second (on a pretty standard Apache 2.2 install, sans a couple of unnecessary modules), PHP (mod_php) performed at a fraction of that, averaging between 20 and 23 requests per second.

Myth: Hardware is cheap(, developer time is not)!

Before some people yell at me for trying to optimize my web server, one needs to take the costs of scaling (to 100 requests per second) into account.

One of those servers currently runs at 2,600.00 USD. At 20 to 23 requests per second per server, it takes five servers to reach 100 (lousy) requests per second, so the four extra machines add up to an additional 10,400.00 USD. Chances are, of course, that the hardware is slightly less expensive since DELL gives great rebates, but the 8 GB of (server) RAM and the SAS disks by themselves are melting budgets away.
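The arithmetic behind the 10,400 USD figure can be sketched in a few lines of Python. This is only a back-of-the-envelope calculation: it assumes that throughput scales linearly with the number of servers and that one server is already in place. The function name `extra_servers` is made up for illustration.

```python
import math

SERVER_PRICE_USD = 2600   # price per server quoted in the post
TARGET_RPS = 100          # target throughput from the post
MEASURED_RPS = (20, 23)   # mod_php range measured in the post


def extra_servers(target_rps, rps_per_server, existing=1):
    """Servers to buy on top of the existing ones (assumes linear scaling)."""
    needed = math.ceil(target_rps / rps_per_server)
    return max(needed - existing, 0)


for rps in MEASURED_RPS:
    extra = extra_servers(TARGET_RPS, rps)
    cost = extra * SERVER_PRICE_USD
    print(f"{rps} req/s per server: {extra} extra servers, {cost:,} USD")
```

Both ends of the measured range land on four additional servers, which is where the price tag above comes from.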

And on top of all the hardware costs, you need to add setup, maintenance and running costs (rack space, electricity) for an additional four servers. Suddenly, developer time is cheap. ;-)

So what do we do? Nginx to the rescue?!


Continue reading "PHP performance III -- Running nginx"

MySQL: Using indices correctly

Tuesday, May 5. 2009

The objective was to select the sessions in a table that are older than two days.

Table setup

This is the definition:

CREATE TABLE `session` (
  `id` varchar(32) NOT NULL DEFAULT '',
  `data` text NOT NULL,
  `user` int(11) DEFAULT NULL,
  `created` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `updated` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  PRIMARY KEY (`id`),
  KEY `user_id` (`user`),
  KEY `rec_datemod` (`updated`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Right and wrong

Wrong query:

SELECT * FROM session
WHERE DATE_ADD(`updated`, INTERVAL 2 DAY) < NOW()

Correct query:

SELECT * FROM session
WHERE `updated` < DATE_SUB(NOW(), INTERVAL 2 DAY)

You wonder why?

When executing the first query, MySQL scans the entire table and calculates the date for each row. Only then does it compare the value to NOW() and return the row if it matches. This is somewhat OK (not really!) up to a certain amount of traffic on the table. In my case, I have 500,000 (five hundred thousand) active sessions (aka rows) in the table, which makes the query slower and slower and slower.

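Because of the full table scan, this can also effectively lock the table (even though it's InnoDB) and block it from further updates: a plain SELECT is a non-locking read, but as soon as the same condition is used in a DELETE or UPDATE, InnoDB locks every row it has to scan.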

The second query (obviously) works around that: the column is left untouched and compared to a value that is calculated once, so MySQL can use the KEY on updated.
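The difference between the two queries can be reproduced without a MySQL server. The sketch below is not MySQL: it uses SQLite (through Python's built-in sqlite3 module) as a stand-in, with a trimmed-down copy of the table and SQLite's datetime() in place of DATE_ADD(). The helper `plan` and the two timestamps are made up for illustration. The principle it shows is the same, though: once the indexed column is wrapped in a function, the planner falls back to a full scan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE session (
        id      TEXT PRIMARY KEY,
        data    TEXT NOT NULL,
        updated TEXT NOT NULL
    );
    CREATE INDEX rec_datemod ON session (updated);
""")


def plan(sql, params=()):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return "; ".join(row[3] for row in rows)


now = "2009-05-05 00:00:00"     # hypothetical current time
cutoff = "2009-05-03 00:00:00"  # the same moment minus two days

# Column wrapped in a function: the index on `updated` cannot be used.
wrong = plan(
    "SELECT * FROM session WHERE datetime(updated, '+2 days') < ?", (now,)
)
# Bare column compared to a precomputed value: the index can be used.
right = plan("SELECT * FROM session WHERE updated < ?", (cutoff,))

print("wrong:", wrong)
print("right:", right)
```

The exact wording of the plan differs between SQLite versions, but the first query should be reported as a scan of the table and the second as a search using rec_datemod. On MySQL, running EXPLAIN on both queries is the way to check the same thing.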

Conclusion

The first lesson is to always use EXPLAIN!

Furthermore, I know some of you will shiver, but phpMyAdmin is actually a pretty useful tool in these circumstances. When the website stalls, you log into phpMyAdmin and figure out what's running (the "Processes" tab, when you're logged in with a privileged account). If you're a shell-ninja, just execute SHOW PROCESSLIST (in mysql) and push whatever runs the longest to EXPLAIN.
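The "push whatever runs the longest to EXPLAIN" step could be scripted roughly as follows. This is a sketch only: it does not connect to MySQL, but works on rows shaped like a subset of the SHOW FULL PROCESSLIST columns (Id, Command, Time, Info). The `Process` type, the function name and the sample rows are all made up for illustration.

```python
from typing import NamedTuple, Optional


class Process(NamedTuple):
    """One row of SHOW FULL PROCESSLIST (subset of its columns)."""
    id: int
    command: str         # e.g. "Query" or "Sleep"
    time: int            # seconds spent in the current state
    info: Optional[str]  # statement text, None for idle connections


def longest_running_query(rows):
    """Return the longest-running active statement, or None if all are idle."""
    active = [r for r in rows if r.command == "Query" and r.info]
    return max(active, key=lambda r: r.time, default=None)


# Hypothetical sample rows, not taken from a real server.
sample = [
    Process(11, "Sleep", 300, None),
    Process(12, "Query", 42,
            "SELECT * FROM session "
            "WHERE DATE_ADD(`updated`, INTERVAL 2 DAY) < NOW()"),
    Process(13, "Query", 0, "SHOW FULL PROCESSLIST"),
]

worst = longest_running_query(sample)
if worst is not None:
    # This is the statement to paste into the mysql client.
    print("EXPLAIN " + worst.info)
```

Idle connections are skipped on purpose: a connection that has been sleeping for five minutes has the largest Time value, but there is nothing to EXPLAIN.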

The slow query log is also something you should read up on.

Measuring CouchDB performance

Monday, February 23. 2009

The overall document-oriented approach of CouchDB and the free-form way of saving data are probably the things that appeal to most of us when we first read about this new database.

Most of the people who have been introduced to CouchDB so far quickly made the decision to use it in production despite the early beta'ish state of the project. We all hate normalization, we all want a faster and more responsive database, and some of us want multiple nodes and inter-node replication. CouchDB manages to sell all of these quite well. And of course, there are plenty of other reasons.


Continue reading "Measuring CouchDB performance"

PHP FastCGI woes!

Monday, January 26. 2009

Those of you who run high-traffic websites have probably tried php-cgi/fcgi at some point. And most of us have gone back to Apache.

But now (actually since the middle of 2007) there's light at the end of the tunnel. I read a blog post by Evert Pot last night (Apache speed and reverse proxies). Evert noted that he tried to use Lighttpd and php-fcgi, all the infamous tricks with spawn-fcgi.sh, etc., and failed.

He referenced my own blog post where I shared a similar experience; on a side note, I'm very glad I'm not the only one who's had these issues. One of the commenters on Evert's blog suggested that he use a project called php-fpm, which I had never heard of until then.

Drum roll!

So anyway, php-fpm is the work of Andrei Nigmatulin and it seems to be the end to all those problems. I spent a few hours last night reading up on it (with the help of Google Translate) and doing a test install, and it seemed pretty cool. I emailed Andrei and suggested that he add links to Google Translate to all pages, but he set up a wiki instead. Wee!

I spent two hours over the course of this day moving the pages into the wiki. And the result is:

Quo vadis?

First off, let me just add that the wiki is a work in progress, and you are welcome to contribute! A lot of the English is straight Google Translate, which is naturally not perfect.

So far, I've been moving all Russian pages to English ones. I'm hoping Andrei feels guilty (:-)) when he sees my pages and adds the Russian back in. I've also emailed someone who translated the brief HowTo into Chinese!

Furthermore, I have not yet tried php-fpm, but I'm excited and will let you know what the results are. If you are a step ahead of me (well, I've overslept this thing since 2007!), please share your experience in the comment section!