PHP: So you'd like to migrate from MySQL to CouchDB? - Part III

Monday, May 17. 2010

This is part three of a beginner series for people with a MySQL/PHP background. Apologies for the delay; this blog entry has been in draft since 13 December of last year (2009).

Recap

Part I introduced the CouchDB basics, which included basic requests using PHP and cURL. Part II focused on create, read, update and delete operations in CouchDB. I also introduced my nifty PHP CouchDB wrapper called ArmChair!

ArmChair is my own very simple and (hopefully) easy-to-use approach to accessing CouchDB from PHP. The objective is to develop it with each part of this series to make it a more comprehensive solution.

Part III

Part three will target basic view functions in CouchDB — think of views as a WHERE-clause in MySQL. They are similar, but also not. :-)

Map-Reduce-Thingy

If you read up on CouchDB before coming to this blog, you have probably heard of map-reduce, there or maybe elsewhere. A lot of people attribute Google's success to map-reduce, because it lets them process a lot of data in parallel (across multiple cores and/or machines) in relatively little time.

I guess the PageRank in Google Search or Google Analytics are examples of where it could be used.

In the following, I'll try to explain what map-reduce is. For people without a science degree. (And that includes me!)

Map

Generally, map-reduce is a way to process data. It's made of two things, map and reduce.

The idea is that the map-function is very robust and it allows data to be broken up into smaller pieces so it can be processed in parallel. In most cases the order data is processed in doesn't really matter. What generally counts is that it is processed at all. And since map allows us to run the processing in parallel, it's easier to scale out. (That's the secret sauce!)

And when I write scale-out, I'm not suggesting you build a cluster of 1000 servers in order to process a couple thousand documents. In this case it's already sufficient to utilize all cores in my own computer by running the map task in parallel.

In CouchDB, the result of map is a list of keys and values.

Reduce

Reduce is called once the map-part is done. It's an optional step in terms of CouchDB — not every map requires a reduce to follow.

Real world example

  • take a simple photo application (such as flickr) with comments
  • use map to sort through the comments and emit the names of users who left one
  • use reduce to get only unique users and see how many comments each of them left

In SQL:

SELECT user, count(*) FROM comments GROUP BY user
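The same idea as a minimal sketch in plain Python. This is not how CouchDB itself runs views, and the sample comments and function names are made up for illustration. It only shows the two steps: map emits one key/value pair per comment, reduce collapses all values that share a key.

```python
from collections import defaultdict

# Hypothetical sample data: one dict per comment document.
comments = [
    {"photo": "sunset.jpg", "user": "alice", "text": "Nice!"},
    {"photo": "sunset.jpg", "user": "bob", "text": "Great colours."},
    {"photo": "beach.jpg", "user": "alice", "text": "Where is this?"},
]


def map_comment(doc):
    """Map step: emit one (key, value) pair per comment."""
    yield doc["user"], 1


def reduce_counts(values):
    """Reduce step: collapse all values emitted for one key."""
    return sum(values)


def map_reduce(docs):
    """Run map over every document, group by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_comment(doc):
            grouped[key].append(value)
    return {key: reduce_counts(values) for key, values in grouped.items()}


result = map_reduce(comments)
print(result)
```

Because map_comment only ever looks at a single document, the documents could be split into chunks and mapped on different cores; only the grouping and the reduce need to see the combined output.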

Why the fuss?

Just so people don't feel offended: map-reduce is slightly more complicated than my example SQL query, but it's also not some secret-super-duper thing. Its strength is really parallelization, which requires the ability to break data into chunks to process them. The end.



The Slicehost-Cogent-Outage, or How to setup a relay with Postfix

Wednesday, May 5. 2010

Our problem is that an application hosted on Slicehost uses an external mailserver, which is located in Europe. Since neither Slicehost/Rackspace nor Cogent seems to be able to fix the situation after almost two days, here's a quick workaround.

The idea is that our relay will collect emails and send them whenever the connection permits.

Postfix install

This is pretty simple:

sudo aptitude install postfix

Configuration

main.cf

Edit /etc/postfix/main.cf (this is my entire main.cf):

relayhost = **MY-EXTERNAL-MAILSERVER (SMTP)**
smtp_sasl_auth_enable = yes
smtp_sasl_password_maps = hash:/etc/postfix/sasl/passwd
smtp_sasl_security_options =
smtpd_banner = $myhostname ESMTP $mail_name (Ubuntu)
biff = no
append_dot_mydomain = no
readme_directory = no
myhostname = **MY-SLICE-HOSTNAME**
mynetworks = 127.0.0.0/8, **MY-SLICE-IP**
alias_maps = hash:/etc/aliases
alias_database = hash:/etc/aliases
myorigin = /etc/mailname
mydestination =
mailbox_size_limit = 0
recipient_delimiter = +
inet_interfaces = loopback-only

Obviously you need to replace the **ENTRIES** in caps (a total of three).

/etc/postfix/sasl/passwd

Create the file:

sudo touch /etc/postfix/sasl/passwd
sudo chmod 600 /etc/postfix/sasl/passwd

Enter something like the following in it (vi /etc/postfix/sasl/passwd):

mail.example.org username:password
mail2.example.org user@example.org:password

Replace mail/mail2 with your actual SMTP server. The SMTP server will have to allow plain-text login for this to work.

Run the following to finish up:

sudo postmap /etc/postfix/sasl/passwd

Adjust

sudo /etc/init.d/postfix check
sudo /etc/init.d/postfix reload

Then configure your application to not use SMTP auth. If it runs on the same server, your SMTP server is now at 127.0.0.1:25.

Please note in the above main.cf, I configured postfix to only listen on the loopback.
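For illustration, this is roughly what "no SMTP auth against 127.0.0.1:25" looks like from an application's point of view. It's a minimal sketch using Python's standard smtplib; the addresses are placeholders and the function names are my own, so adapt it to whatever your application is written in.

```python
import smtplib
from email.message import EmailMessage

RELAY_HOST = "127.0.0.1"  # the local Postfix relay configured above
RELAY_PORT = 25


def build_message(sender, recipient, subject, body):
    """Assemble a plain-text email."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_via_relay(msg, host=RELAY_HOST, port=RELAY_PORT):
    """Hand the message to the local relay.

    There is no login() call here: the application talks to Postfix
    unauthenticated, and Postfix authenticates against the external
    mailserver using the credentials in /etc/postfix/sasl/passwd.
    """
    with smtplib.SMTP(host, port, timeout=10) as smtp:
        smtp.send_message(msg)


# Usage (placeholder addresses, needs the relay to be running):
# send_via_relay(build_message("app@example.org", "me@example.org",
#                              "Relay test", "Hello through Postfix."))
```

If the connection to the external mailserver is down, Postfix keeps the message in its queue and retries later, so the application itself never has to wait for Europe.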

Fin

This is an excerpt from an email:

Received: from [slice.ip] (helo=slice.host)
by mailserver.in.europe with esmtpa (Exim 4.69)
(envelope-from <email@example.org>)
id 1O9gAL-0005xQ-ES; Wed, 05 May 2010 17:05:37 +0200
Received: from [slice.ip] (localhost [127.0.0.1])
by slice.host (Postfix) with ESMTP id DE2DB11428C;
Wed,  5 May 2010 15:05:44 +0000 (UTC)

And that's all, kids!


start-stop-daemon, Gearman and a little PHP

Thursday, April 22. 2010

The scope of this blog entry is to give you a quick and dirty demo for start-stop-daemon together with a short use case on Gearman (all on Ubuntu). In this example, I'm using the start-stop-daemon to handle my Gearman workers through an init.d script.

Gearman

Gearman is a queue! But unlike, for example, most of the backends to Zend_Queue, Gearman provides a little more than just a message queue to send (well) messages from sender to receiver. With Gearman it's trivial to register functions (tasks) on the server in order to start a job and get stuff done.

For me the biggest advantages of Gearman are that it's easy to scale (add a server, start more workers) and that you can get work done in another language without building an API of some sort in between. Gearman is that API.

Back to start-stop-daemon

start-stop-daemon is a facility to start and stop programs on system start and shutdown. On recent Ubuntus most of the scripts located in /etc/init.d/ make use of it already. It provides a simple high-level API to system calls, such as stopping a process, starting it in the background or running it under a given user, plus the glue, such as writing a pid file.

My gearman start script

Once adjusted, register it with the rc-system: update-rc.d script defaults. This will take care of the script being run during the boot process and before shutdown is completed.

A little more detail

The script may be called with /etc/init.d/script start|stop|restart (the pipes denote "or").

Upon start, we write a pidfile to /var/run and start the process. The same pidfile is used on stop, simple as that. Everything else is hidden behind start-stop-daemon, which takes care of the ugly details for us.
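To make the pidfile idea a little more concrete, here is a rough Python sketch of the bookkeeping involved. This is not start-stop-daemon's actual implementation, just an illustration of the principle; the function names are made up and the pidfile path is whatever you pass in.

```python
import os
import signal


def write_pidfile(path, pid):
    """On start: remember which process we launched."""
    with open(path, "w") as handle:
        handle.write(f"{pid}\n")


def read_pidfile(path):
    """Read the process id back from the pidfile."""
    with open(path) as handle:
        return int(handle.read().strip())


def is_running(pid):
    """Signal 0 checks whether the process exists without touching it."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists, but belongs to another user
    return True


def stop(path, sig=signal.SIGTERM):
    """On stop: signal the process named in the pidfile, then clean up."""
    pid = read_pidfile(path)
    if is_running(pid):
        os.kill(pid, sig)
    os.remove(path)
```

The real tool does a good deal more on top of this, which is exactly why it's nice to not write it yourself.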



Redis on Ubuntu (9.04)

Friday, March 19. 2010

A small howto to get the latest redis-server and a web interface on Ubuntu.

Installation

wget http://ftp.de.debian.org/debian/pool/main/r/redis/redis-server_1.2.5-1_amd64.deb
sudo dpkg -i redis-server_1.2.5-1_amd64.deb
/etc/init.d/redis-server start

... redis should listen on localhost:6379.

You may need to get i386 instead of amd64 if you run 32bit.
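If you want to verify that something is actually listening on localhost:6379, a quick check could look like the following. It's a small sketch using only Python's standard socket module; it tests that the port accepts connections and nothing more, so it doesn't prove the listener is really Redis.

```python
import socket


def port_is_open(host="127.0.0.1", port=6379, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    state = "listening" if port_is_open() else "not reachable"
    print(f"redis-server on 127.0.0.1:6379 is {state}")
```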

Tweaks

You may need to add the following to /etc/sysctl.conf:

vm.overcommit_memory = 1

... that is, especially if you run in a virtual environment (e.g. inside Xen).

All other configs are in /etc/redis/redis.conf.

Web

Because web interfaces are so simple, I decided to get redweb.

Dependencies

wget http://ftp.us.debian.org/debian/pool/main/p/python-support/python-support_1.0.7_all.deb
sudo dpkg -i python-support_1.0.7_all.deb
wget http://ftp.us.debian.org/debian/pool/main/p/python-redis/python-redis_1.34.1-1_all.deb
sudo dpkg -i python-redis_1.34.1-1_all.deb

So, on Ubuntu, python-support is at 0.8.4 currently, but we'll need something equal to or greater than 0.9.0. This is why I update python-support from Debian.

Installation

git clone http://github.com/tnm/redweb ./redweb-git

Patch redweb-git/redweb/redweb.py with:

index e79a062..e278fca 100644
--- a/redweb/redweb.py
+++ b/redweb/redweb.py
@@ -15,6 +15,8 @@ __author__ = 'Ted Nyman'
 __version__ = '0.2.2'
 __license__ = 'MIT'

+import sys
+sys.path.append('/path/to/redweb-git/')

 from bottle import route, request, response, view, send_file, run
 import redis

Run!

cd redweb-git/redweb/
python redweb.py

... this is a bit annoying. If you do python redweb/redweb.py, it'll complain about missing files.

Then browse to http://127.0.0.1:8080.

Fin

So this is my redis-server howto — nice and simple.

And once you have Redis up and running, feel free to browse over to Rediska and use their session handling for Zend Framework. Setup is pretty simple and it works like a charm. :-) I'd suggest you use their trunk code, which is hosted on GitHub, as it will contain a few improvements and a small bugfix which I did.

For more on Rediska, watch this space. ;-)


Fan Error

Monday, October 19. 2009

A Fan Error in this case is not when your Facebook fan page is down. I received this message after my Lenovo X61s notebook decided to quit and I restarted it. The screen said "Fan Error", and the notebook refused to continue the boot process.

A rescue party

Of course this is the last thing you want on a Sunday evening, but in true GTD fashion, I wanted to fix it right away. Here's how.

Precaution

In order to not electrocute myself, I removed the battery and unplugged the notebook.

Get in there!

I basically unscrewed every screw there is at the bottom of the notebook, until it would let me remove the upper part of the casing and keyboard.

FAN ERROR

Then I tried to carefully clean the inside of my notebook of the dust and dirt that had accumulated over the 14 months since I purchased it. I think I had dust (and what not) from North America, Europe and South America in there. It was kinda gross. It really didn't look pretty. And that is despite all efforts to not eat and drink near it.

Fan

When I got to the fan, it wouldn't really move. Hence the fan error!

I forced it a little and white dust came out of it. So I decided to take more drastic measures and sucked it clean using my Dyson. In the beginning it wouldn't really move, but it took only a minute to resolve that. (Word of advice: If you are not super careful, the Dyson will try to suck in whatever it gets. So make sure to not vacuum the insides of your notebook. ;-))

Reassembly

Reassembly is pretty simple. The case clicks, and then you fill in the screws. IBM/Lenovo were smart enough to only use screws of the same type. There were ten in total (or maybe nine), and they are all gone. So that must have worked.

Conclusion

Don't try this, unless you have to. And know what you are doing. This blog entry comes with no guarantees or extended warranty. Being able to fix little things yourself feels good though.
