Among all the different technologies in our stack, we also use MySQL. While we still run MySQL (or Percona-Server) ourselves, we selected a managed solution to power parts of our production infrastructure: a Multi-AZ setup with Amazon’s RDS.

AZ is Amazon-speak for “availability zone”, essentially a datacenter. RDS stands for: Relational Database Service.

Judging from my experience with our own setups where EBS is in the mix, I have to say that Amazon does an outstanding job hiding these potential issues with RDS from us. Looking at the price tag of the setup can be intimidating at first, but as far as TCO is concerned, RDS is the complete package: managed from every which way and painless for the most part.

RDS in a nutshell

RDS is pretty awesome — it’s basically a highly available MySQL setup with backups and optional goodness like read-slaves. RDS is one of the best services as far as Amazon Webservices are concerned: 90% of what anyone would need from RDS, Amazon allows you to do with a couple clicks. For tuning, it gets a little less shiny and maybe even messy, but even changing parameters is possible.

Another pretty important feature is growing and shrinking RDS. Change your storage and either apply the change right away or wait for your next maintenance window. It should be noted that these changes are rarely instant (or “right away”), which doesn’t make it any less awesome. So even though for example resizing the storage is not an instant operation (of course), it still puts a whole new definition into the word elastic.

The package

A standard RDS setup gives you a managed database instance with a DNS hostname and database credentials to log in.

Access from instances is granted using database security groups, which work just like the regular security groups (on AWS). In non-AWS-language, this translates to firewall policies.

Pricing

As far as pricing is concerned, AWS is always a little tough to understand: the baseline is 0.88 USD per hour for a multi-az deployment which totals to 633.6 USD a month (large instance class). Since we opted for reservation (a 1,200 USD one time fee for a three (3) year term), we were able to drop that price to 0.56 USD per hour.

Aside from instance costs there are storage costs as well: 0.20 USD per GB (100 GB will cost you and me about 20 USD) and 0.10 USD per million I/O requests (aka the “i/o rate”). On our multi-az RDS we selected 100 GB for total storage initially but since we currently use only about 60 GB, we just end up paying about 12 USD per billing period.

While storage costs are somewhat easy to predict, the “i/o rate” is not. But it’s also not a major factor. I’m unable to provide exact numbers currently because we have three RDS servers (1 multi-az deployment, 1 read-slave and another single-az deployment) and the numbers are aggregated on the billing statement but our total is 368,915,692 IOs which runs at roughly 36 USD per month.

Vendor lockin

Anyway — if RDS is awesome, what’s the catch? A closed environment.

The primary advantage and disadvantage of RDS is that we don’t get access to the server and our own backups.

Of course there are backups and we can use them to restore (or rollback) our RDS setup from within AWS. There are options using the AWS console and I believe using their API as well. But in the end there is no way to export this backup and load it into a non-RDS-setup. And add to that: replicating from or into RDS is not possible either. Which makes migrations and backups an unnecessary pain in the butt.

Aside from not getting access to our own backup, we also don’t get access to the actual instances. Which makes sense for AWS, but it means we need to rely on in my opinion questionable metrics like Cloudwatch. Questionable because there is no way for the customer to verify the data. AWS uses their own metrics (and scale) and it’s often not obvious to me how well Cloudwatch works even on regular AWS EC2 instance.

I’ve seen instances which became unavailable, but Cloudwatch is reporting A-OK (green). I’m not sure how beta Cloudwatch is, but we decided on Silverline (more on that in another blog post) for instance monitoring. Since Silverline requires a local client, it’s unfortunately not an option for RDS.

What’s pain?

Aside from the monitoring and backup quirks, one of the real pain points of Amazon RDS is that a lot of the collective MySQL knowledge is not available to us. The knowledge which is manifested in books, blogs, various monitoring solutions and outstanding tools like Percona’s backup tools are not available to people who run Amazon RDS setups.

Whatever the (technical) reasons for all this may be, they pain me at least a little and should be discussed when Amazon RDS is evaluated to replace MySQL infrastructure.

MySQL and backups

I mentioned backups! First off, I hope ya’ll are doing them! ;-)

For many, the preferred way to do MySQL-backups is mysqldump. It’s pretty simple to use:

This command essentially dumps all (-A) databases from your MySQL server.

mysqldump is an OK solution as long as you take a few things in mind:

  • Create your backups during a period where there is the least activity — typically the night.
  • There will be a slight bump, but hope that your database is small enough so no one notices.

With a larger database or most databases with lots of read and write activity, this is almost impossible to achieve. For a snapshot to be consistent, table locks are used and that usually stalls access to any application which relies on your database.

Of course there is Percona’s xtrabackup (which is outstanding), but with RDS, that is not an option either.

Read-slave to the rescue

Typically people will use a read-slave with MySQL to offload read queries from the master. I haven’t done any tests on how far these typically lag behind with Amazon RDS, but I am going to use my RDS read-slave for something else: backup.

Adding a read-slave is easy:

  1. Log into the AWS Console and go to RDS.
  2. Select your RDS server and click ‘add read replica’ above

The operation will likely take a while. The timeframe depends on the type of instance and the amount of storage provisioned. In my case I selected a small instance and it assigned 100 GB of storage to match my RDS. Small instances boot notoriously long — the entire operation completed in a little over ten minutes.

On a side-note: read-replicas allow you to scale RDS beyond availability zones (AZ). But you should be aware that traffic across different AZ is billed to the customer.

Costs

A small instance costs roughly 76 USD/month (excluding storage, I/O rate and bandwidth), which by itself is not bad for a fully managed server which I basically setup with two or three clicks. Since we plan to do backup on a regular basis, we will buy a coupon to reserve the instance which cuts down costs tremendously and generally makes the AWS cloud very affordable.

Amazon RDS, quo vadis?

I mentioned a little vendor-lockin with the service and the little visibility from the outside.

In theory, this should not matter — however there are more than a few issues you should be aware of. I don’t want to mention them to stomp on Amazon — RDS is still in beta after all. But you should be aware of them to get a complete picture.

Pretty questionable is the way some of these issues are handle: not at all or in private messages. AWS is not always at fault here since I imagine pretty often the customer forgets to update the ticket when the issue is only temporary because their focus shifts to other areas.

But one of the core problem with customer service problem all over AWS is that customers have to resort to posting on a forum with no guaranteed response or have to buy a support contract which includes answers like “we fixed it”. The first response is usually that more details are needed (Maybe customer accounts on the forum are not linked to AWS accounts on the inside?) and off it goes into private mode.

My wish is that these situations across all AWS services are handled more transparent in the future so people see development and evolution of the service which means that a trust-worthy platform is be build.

Fin

I’ve been thinking about my final statement for a while. If anything right now, I would be more in favour of Amazon RDS.

Amazon RDS is an extremely interesting product — the beta-tag is even more impressive. It’ll be interesting to see what it will offer once Amazon pronounces it stable.

As for the future of our RDS-setups: they are not gonna go away soon. One of our objectives for 2012 is stabilizing across all products and infrastructure underneath. I think this will include some sort of integration with our external monitoring system to at least try to make sense of things like Cloudwatch and to be able to correlate with other events from all over production.