Thoughts on RightScale


RightScale provides all kinds of things: a pre-configured MySQL master-slave setup (with automatic EBS/S3 backups), a full LAMP stack, Rails app servers, pre-configured server templates for virtually everything else, and a nifty auto-scaling feature.

We decided to leverage RightScale when we planned our move to AWS a couple months ago in order to not have to build everything ourselves. I've been writing this blog entry for the past five weeks and here are some observations, thoughts and tips.

RightScale

First off, whatever you think, do, or have done so far, let me assure you: there's always a RightScale way of doing things. For (maybe) general sanity (and definitely your own), I suggest you don't always do it their way.

One example of the RightScale way is that all the so-called RightScripts will attempt to start services on reboot for you, instead of registering them with the init system (e.g., on Ubuntu, update-rc.d foo defaults) when they are set up.

You may argue that RightScale's approach gives you a more detailed log of what happened during the boot sequence, but at the same time it provides more potential for errors and introduces another layer around what the operating system provides, and what generally works pretty well already.
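For comparison, here is a minimal sketch of the init-system route on Ubuntu. The service name foo is the one from the example above; the register_service helper and the DRY_RUN switch are my own assumptions for illustration, not something RightScale ships.

```shell
#!/bin/bash
# Sketch: register a service with the init system once, at setup time,
# instead of starting it from a boot script on every reboot.
# register_service and DRY_RUN are hypothetical names.

register_service() {
    local name="$1"

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only print what would be executed.
        echo "update-rc.d ${name} defaults"
    else
        update-rc.d "${name}" defaults
    fi
}

# Show the command without touching the system.
DRY_RUN=1 register_service foo
```

Once a service is registered like this, the operating system brings it back after a reboot and no extra boot script has to succeed for that to happen.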

PHP and RightScale

RightScale's sales team knows how to charm people, and when I say charm, I do not mean scam (just for clarity)! :-)

The demos are very impressive and the client showcases no less so. Where they really need to excel, though, is PHP-related demos, because not everyone in the world runs Ruby on Rails yet. No, really: there's still us PHP people and also folks who run Python, Java and so on.

Coming from the sales pitch, I felt a little disappointed because a standard PHP setup on RightScale is as standard as you would have thought three years ago: mod_php, Apache2 and so on. The configuration itself is a downer as well, with a lot of unnecessary settings and generally not so speedy choices. Then remember that neither CentOS nor Ubuntu is exactly up to date on packages, and add another constraint to the mix: Ubuntu is on 8.04, which is one and a half years in the past as I write this entry.

And even though I can relate to RightScale's position (supporting customers with all kinds of different software is a burden, and messy to say the least), I am also not a fan.

Scaling up

The largest advantage of a service provider such as RightScale is that they turn raw EC2 instances into usable servers out of the box. For example, setting up a LAMP stack yourself takes time, even though it's a trivial task for many. With RightScale, it's a matter of a couple clicks: select image, start, provide input variables and done.

Aside from enhanced AMIs, RightScale's advantage is auto-scaling. Auto-scaling has been done a couple times before. There is more than one service provider that leverages EC2 and provides scalability on top. Then take a look at Scalr, which is open source, and then recently Amazon themselves added their own Elastic Load Balancer.

In general, I think auto-scaling is something everyone gets, and wants, but of course it's not that dead simple to implement. And especially when you move to a new platform, it's a perfect trade-off to sacrifice some flexibility and money for a warm and fuzzy "works out of the box" feeling.

PHP on RightScale

Deployment basics

Before we do anything else, let's get a deployment set up.

  • Log into RightScale's UI and add a deployment in Manage > Deployments.
  • Add a new server, select the most recent PHP frontend template, and clone it right away.
  • Go back to your deployment and adjust the parameters on the "Inputs" tab.

General parameters to look out for are the Apache-related ones and the ha-proxy configuration (RightScale's standard load balancer). Also make sure you have all keys and security groups set. Last but not least, if you deploy from Subversion, enter a username, password and the repository URL.

For PHP warriors, I suggest you run PHP_CompatInfo and PHP_Depend on your application to figure out the basics needed.

Then familiarize yourself with what Ubuntu 8.04 offers with a standard PHP install and, last but not least, add whatever is not included to PHP_MODULES_LIST on the "Inputs" tab. Also, don't forget to double-check OPT_PHP_ENABLE. ;-)

I realize that Ubuntu's year-old PHP install is not the most efficient install you can do, but the objective here is to get started. I'll show you later how to optimize the different steps on the way.

Making it work.

Follow these steps:

  • Double-check and make sure you have some SSH keys and a security group set up (open port 80, etc.).
  • Get an elastic IP — I like them static. In case you need to terminate the instance a couple of times, it's nice to have.
  • Launch the instance.
  • Set up a DNS entry. (You don't have to use DNSMadeEasy, but it could be helpful.)

Stranded?

Chances are, your instance might strand a couple times before you get it off the ground. It can strand in booting, or when it terminates. If it's stranded while terminating, just force the termination again.

But what does that even mean? Stranded means that the instance ran into an error during the boot or shutdown process. Once stranded there is no automatic recovery since the server (or RightScale) stops checking itself. Sometimes you can force operational on the server's "Info" tab.

Reasons to be stranded are:

  • Nothing obvious. (Yeah, it's true sometimes.)
  • Bugs in your scripts.
  • Missing/wrong inputs.

To debug this, check out the "Audit Entries" related to the server. If it's not a network-related hiccup, they are usually very, very verbose and informative about what went wrong or is missing.

RightScale provides more insight on the stranded state in their support area. One of the fixes suggested to us during a support call was to sometimes avoid "Inputs" and directly hardcode data into RightScripts because there may be issues. I can't say that I particularly like that.

This is something that took me a while to find because RightScale's UI is not always what one would call intuitive.

Aside from obvious issues, a script will sometimes leave your instance stranded if something just takes too long to execute. E.g., imagine you register your instances in an external DNS, it takes a little longer than usual, and you are stranded. Whatever you script, don't assume it'll run top to bottom within 10 seconds. Always double-check.
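One way to make a slow step less fatal is to retry it a few times before giving up. The following is a minimal sketch and not a RightScale feature: retry is a hypothetical helper, and the register_dns.sh script in the usage comment is made up as well.

```shell
#!/bin/bash
# Sketch: retry a slow or flaky step a few times before giving up.
# retry and its arguments are hypothetical helpers for illustration.

# usage: retry <attempts> <delay-in-seconds> <command> [args...]
retry() {
    local attempts="$1" delay="$2"
    shift 2
    local n=1

    while true; do
        # Run the command; stop as soon as it succeeds.
        if "$@"; then
            return 0
        fi
        if [ "$n" -ge "$attempts" ]; then
            echo "giving up after ${n} attempts: $*" >&2
            return 1
        fi
        echo "attempt ${n} failed, retrying in ${delay}s: $*" >&2
        sleep "$delay"
        n=$((n + 1))
    done
}

# Example (assumes GNU timeout is installed and register_dns.sh exists):
# give every attempt 60 seconds, and try up to 5 times, 10 seconds apart.
# retry 5 10 timeout 60 ./register_dns.sh
```

The messages go to stderr, so they should end up next to the rest of the script output when you debug a stranded instance.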

At the end of the day I wish RightScale would offer a VMware-like image to stage the deployment process on localhost. Trying out various things online is very time-consuming and exhausting because the boot process of an instance can take a while. For example, our app server takes roughly 8 minutes to become operational.

In case of an error one needs to terminate the instance, then edit the designated RightScript, maybe update its revision in the ServerTemplate (Pro-tip: select HEAD initially.) and relaunch, and wait.

My server takes hours and hours to start

First off, cloud computing is not Formula One. If you envision an autoscaling array as a rack of servers which start up and shut down in a matter of seconds, you are mistaken. In fact, even though the instance is virtual, it still needs to be started, customized and configured before it's usable.

Depending on what's expected later on, the configuration part can take as long as 10 minutes, or longer. So for example, a checkout from Subversion is not fast. This is something to take into account and something to plan for in advance — because the traffic might be gone once the servers are ready to handle it.
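To find out where the minutes actually go, it helps to time the individual steps of a boot script. This is only a sketch: timed is a hypothetical helper, and the Subversion command in the usage comment (including SVN_URL) is an assumption about what your script does.

```shell
#!/bin/bash
# Sketch: measure how long a single boot step takes.
# timed is a hypothetical helper, not part of RightScale.

# usage: timed <label> <command> [args...]
timed() {
    local label="$1"
    shift
    local start end status=0

    start=$(date +%s)
    # Capture the exit code without tripping "set -e".
    "$@" || status=$?
    end=$(date +%s)

    echo "${label} took $((end - start))s (exit ${status})" >&2
    return "$status"
}

# Example (SVN_URL is an assumed variable):
# timed "svn checkout" svn checkout "$SVN_URL" /home/webapps/foobar/current
```

The helper passes the exit code of the wrapped command through, so a failing step still fails the script.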

Great success!

Once your instance is started, feel free to log in from the RightScale UI: click on the instance ID (not the server template's name) and click on "SSH Console".

Post launch

Auto-scaling is sweet. You need to realize that it still takes time to launch and configure instances. Granted, it's not a week, as if you had ordered a server from Dell or Supermicro, but it's still a 10-12 minute stretch. It's not instant, so keep that in mind when you provision your array. By the time an instance has started, your traffic might very well be gone, and you will have suffered downtime and lost business.

Configuration

Once the instance is online, the configuration should look familiar:

  • Apache2 is installed in a standard Linux layout.
  • RightScale's own Apache additions are in /etc/apache2/rightscale.d/.
  • PHP's configurations are located in /etc/php5/apache2/.
  • ha-proxy and apache2 should both run.
  • Your application is in /home/webapps/foobar/current (and that's the document root, too).
    • foobar is set on the "Inputs" tab (APPLICATION).
    • current is a symbolic link to the checkout.
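The current symlink is what makes switching between checkouts cheap. Here is a minimal sketch of how such a switch could look. The activate_release helper and the release directory names (r100, r101) are my assumptions; I have not checked how RightScale's own scripts name or rotate the checkouts.

```shell
#!/bin/bash
# Sketch: point "current" at another checkout inside the application root.
# activate_release and the release names are hypothetical.

# usage: activate_release <app-root> <release-dir-name>
activate_release() {
    local app_root="$1" release="$2"

    if [ ! -d "${app_root}/${release}" ]; then
        echo "no such release: ${app_root}/${release}" >&2
        return 1
    fi

    # -s: symbolic link, -f: replace an existing link,
    # -n: treat an existing "current" link as a file, don't follow it.
    ln -sfn "${release}" "${app_root}/current"
}

# Example:
# activate_release /home/webapps/foobar r101
```

Because the document root is the link and not the checkout itself, Apache's configuration does not have to change when a new checkout is activated.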

That's all? Nah!

RightScale's loadbalancer template is set up with an Apache(2) which has a virtualhost on port 80 and one on port 8888.

The ha-proxy itself listens on port 85. Port 80 routes incoming traffic to ha-proxy (localhost:85), which distributes it across the cluster. If you just have one instance, it'll come out on localhost:8888, otherwise, on another app server.

I didn't understand yet why ha-proxy doesn't listen on 80 — but I'm sure there is a reason.

The reason for this (slightly more complicated) setup is to be able to push a maintenance page ("Sorry we are down...") before the proxy and the rest of the website. All at the expense of routing all traffic through the notoriously speedy Apache2 (Note: sarcasm!).

Customize

Raw AWS and RightScale

Contrary to what you may think you can still use both — AWS and RightScale.

You can start instances directly using AWS' console, and in/from RightScale. Both will show up in the other interface. So for example since not all of our servers auto-scale, etc., we decided to set some up using AWS directly. I also find that for the basics the AWS console sometimes outsmarts the RightScale UI.

In this case we use Alestic's very up to date AMIs as a base for our own.

Clone it

Whenever you make a change to a RightScript or Server Template, don't forget to clone it. Cloning allows customization, and it also makes sure that your changes are kept around and what you expect next time you run the server.

Cloning doesn't require one to re-do everything, just the bits that need to be customized. Typically, one would clone the Server Template, and then clone/replace the scripts that need to be adjusted. Of course RightScripts can be written from scratch, but starting off with an example is usually easier.

New software

Assume one wants a more recent PHP install on Ubuntu (e.g. 5.3.x). The easiest way would be to use a small tool called checkinstall to create packages and distribute those from S3 or another master server during the installation process of the EC2 instance.

I've blogged about checkinstall before, so I won't go into greater detail here.

PHP

Here is a script to automatically build a PHP package, which you can later distribute:

#!/bin/bash -ex

export php_ver=5.3.1RC1
export php_dl=http://downloads.php.net/johannes
export store=/mnt
export php_file=${store}/php-${php_ver}.tar.gz

# dependencies
apt-get install -y checkinstall

# download PHP (skipped when the tarball is already in place)
if [ -e "${php_file}" ]; then
    echo "Skipping download."
else
    wget -O "${php_file}" "${php_dl}/php-${php_ver}.tar.gz"
fi

# untar, configure and make
tar -C "${store}" -zxvf "${php_file}"

cd "${store}/php-${php_ver}"
./configure --prefix="${store}" \
    --disable-pdo --without-sqlite --without-sqlite3 \
    --disable-posix --without-iconv \
    --with-openssl --with-pcre-regex \
    --with-curl

make

# checkinstall (build package, don't install)
checkinstall -D --pkgname="php-custom-${php_ver}" \
    --pkgversion="${php_ver}" --maintainer="[email protected]" \
    --pakdir="${store}" --pkglicense=PHP \
    --nodoc \
    --install=no \
    --delspec=no

[ Depending on what you need, you should review the ./configure line above and adjust this for your own. ]

This leaves you with a .deb file. To install it, do this:

dpkg -i php-custom-5.3.1rc1_5.3.1RC1-1_amd64.deb

Here is a small script to fetch and install the file:

#!/bin/bash -ex
export master=http://your.s3.url
export deb=php-custom-5.3.1rc1_5.3.1RC1-1_amd64.deb

# fetch the package from the master server and install it
wget -O "/root/${deb}" "${master}/${deb}"
cd /root/ && dpkg -i "${deb}"

Installing additional dependencies may be required. (This depends on the AMI or ServerTemplate which you use as a base for your setup.)

Because I'm on RightScale, I can paste it into a RightScript and create what RightScale calls "Inputs":

#!/bin/bash -ex
export master=$DEPLOY_MASTER
export deb=$PHP_DEP

# same as above, but the values come from RightScale "Inputs"
wget -O "/root/${deb}" "${master}/${deb}"
cd /root/ && dpkg -i "${deb}"

The uppercase variables are automatically turned into those "Inputs", and they can be managed in the "Inputs" tab on your deployment (or array) dashboard. This allows you to avoid hardcoding in scripts, by managing them from the RightScale UI.
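Since missing inputs are one of the reasons an instance ends up stranded, I find it useful to check them first thing in a script. This is a minimal sketch under the assumption that Inputs arrive as ordinary shell variables, as in the script above; require_inputs is a hypothetical helper of mine.

```shell
#!/bin/bash
# Sketch: fail early, with a readable message, when an input is missing.
# require_inputs is a hypothetical helper, not part of RightScale.

# usage: require_inputs <VARIABLE_NAME>...
require_inputs() {
    local missing=0 name value

    for name in "$@"; do
        # Indirect expansion: read the variable whose name is in $name.
        value="${!name:-}"
        if [ -z "$value" ]; then
            echo "missing input: ${name}" >&2
            missing=1
        fi
    done

    return "$missing"
}

# Example, at the top of the install script:
# require_inputs DEPLOY_MASTER PHP_DEP || exit 1
```

All missing names are reported in one run, so a single look at the audit entries tells you everything that needs to be filled in.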

Nginx

My nginx setup is similar to what I do for php above. The only addition is that I download all configuration files from our master server for deployment and customize a couple variables inside the configuration (e.g. root, etc.) using simple sed calls.

For example, I have the following in another RightScript:

sed -i "s,WWW_USER,${WWW_USER}," /etc/nginx/nginx.conf

The above replaces the string WWW_USER in the file /etc/nginx/nginx.conf with the variable $WWW_USER ("Inputs"). It's as simple as that.

(I picked , as the delimiter so that I can use one style of expression everywhere, including when the replacement is a path. That way I don't have to escape every / in something like /foo/bar/, and it is much easier to read.)
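To illustrate the delimiter choice, here is a small sketch that fills two placeholders, one of them with a path. render_config and the DOC_ROOT placeholder are hypothetical and only serve the example; the sketch assumes the values contain neither a comma nor an ampersand, since both are special to sed here.

```shell
#!/bin/bash
# Sketch: fill placeholders in a config template read from stdin.
# render_config and DOC_ROOT are hypothetical names for illustration.

# usage: render_config <www-user> <document-root> < template > config
render_config() {
    local www_user="$1" doc_root="$2"

    # "," as delimiter: the slashes in the path need no escaping.
    sed -e "s,WWW_USER,${www_user},g" \
        -e "s,DOC_ROOT,${doc_root},g"
}

printf 'user WWW_USER;\nroot DOC_ROOT;\n' \
    | render_config www-data /home/webapps/foobar/current
```

This prints "user www-data;" and "root /home/webapps/foobar/current;" on two lines. With / as the delimiter, the same path would have to be written as \/home\/webapps\/foobar\/current.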

Want MOAR?

I started a small repository on GitHub where I add/maintain some bash scripts to get servers operational. They should serve as a start for your own. And it's also pretty easy to turn them into RightScripts.

RightScale Gotchas

Support

First off, RightScale has a hefty setup fee, which basically buys you a couple hours of support at a rate of 300 USD.

Then, unless you are on a super-duper Gold support package, support will respond within eight hours. In our case, they maxed it out most times and also kept exceeding it. For 500 USD a month, I'd expect a faster turn-around in general, and even if the SLA allows eight hours, it should not always be maxed out. Supposedly they will add more people soon, but in the meantime this is one of the major setbacks on this platform: once you are stuck and open a ticket, you are stuck for a day or two.

Bugs

So far, I have found a couple bugs with RightScale. Most notably, the interface was a work in progress for many, many weeks. Especially when they switched to the current one, it seemed like a beta test. But unfortunately not as in Google or Flickr beta, more as in really beta.

User Interface

My latest UI bug is that Inputs do not get updated when you try to run operational scripts on a deployment or array, which makes using the operational scripts to deploy a new release almost impossible.

A small but no less annoying UI bug is that Safari and Chromium work fine in the beginning, and then all of a sudden, for no obvious reason, the ajaxy menus stop responding. The suggested fix is to use Firefox or Internet Explorer.

RightScripts

We've discovered bugs in various RightScripts. I don't recall all the details, but the latest was in the script which registers your app servers with the loadbalancer: they just wouldn't register. Also, usage-wise, these scripts seem not very stable or fault tolerant (which in the end gets you stranded more often).

Loadbalancers

Related to the start-up process — it also seems that RightScale's setup is not always very fault tolerant. For example, RightScale suggests two loadbalancers where all the application servers register themselves. This setup sounds to me like it would be useful for failover.

But when you start an app server and it fails to register with one of them, it stalls and hangs indefinitely: stranded. The RightScript being used is not smart or flexible enough to handle that. And since it cannot recover from that state without intervention, you need to re-run the scripts to register with the loadbalancers whenever they are available again.

This is one of those major WTFs.

One of the work-arounds here is to immediately adjust the DNS when a loadbalancer goes down. But this adds complexity in other areas as well.
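What I would have expected from the registration step is roughly the following: try every loadbalancer, carry on when one of them is unreachable, and only fail when none could be reached. This is a sketch of that idea and not RightScale's script. register_with is a stand-in that merely simulates lb2.example.com being down, and both host names are made up.

```shell
#!/bin/bash
# Sketch: register with several loadbalancers and tolerate partial failure.
# register_with and the host names are hypothetical placeholders.

register_with() {
    # Stand-in for the real registration call against one loadbalancer.
    # Here it only simulates lb2.example.com being unreachable.
    [ "$1" != "lb2.example.com" ]
}

# usage: register_with_any <loadbalancer>...
register_with_any() {
    local registered=0 lb

    for lb in "$@"; do
        if register_with "$lb"; then
            registered=$((registered + 1))
        else
            echo "could not register with ${lb}, continuing" >&2
        fi
    done

    # Succeed as long as at least one loadbalancer knows about us.
    [ "$registered" -gt 0 ]
}

register_with_any lb1.example.com lb2.example.com
```

The app server would then become operational behind the loadbalancer that is up, and the failed registration can be repeated later without rebooting anything.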

Documentation

Documentation can be pretty sparse at times with RightScale. Even though there are screencasts and tutorials, they rarely cover the edge cases or include all the issues you can run into. It's a lot of trial and error.

Customize within reason

Only customize when necessary.

E.g., for my own sanity, I decided to use the instance's EIP in RightScale's ha-proxy script instead of the instance's ID (i-8b6hda, i-7262js, etc.). Not a good idea. Even though every instance I had worked with up to this point had an elastic IP, it didn't work. They either failed to register on the load balancer, or registered and showed up as down.

RightScale couldn't tell me why, but supposedly ha-proxy choked on it because it wasn't unique enough. I'm not sure if that is an issue, since I was assuming an IP always is, but it magically started to work again once we switched back to the default.

Conclusion

I hope I gave you an impression of what you have to expect from RightScale and offered some ways to deal with it — in case you have to. Most of these basics are also applicable to a setup using puppet or chef. I'll post some thoughts and get into those in a later blog post.

As for "to RightScale, or not to RightScale": while there are a lot of advantages, I'll have to retreat and meditate on this more than a few times before there is a next time. My biggest concern is how all those issues can sneak into RightScale while they have a lot of customers. And why are we the only ones to notice?

I don't want to say, "Don't do it!". But right now I obviously cannot recommend RightScale all the way either.
