NetworkManager (for resolv.conf and firewalld) on CentOS 7

As I spiral further into Linux server administration, there's certainly a lot to learn. A lot of it leaves me wanting BSD, but since that's not an option, ... here we go.

NetworkManager

NetworkManager on Linux (or CentOS specifically) manages the network. Whatever content I found (blog posts, knowledge base articles) usually suggests that you uninstall it first. A common problem is that people are unable to manage /etc/resolv.conf, because changes they make to that file get overwritten again.

Internals

The NetworkManager gets everything it needs from a few configuration files.

These are located in: /etc/sysconfig/network-scripts/
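
A quick look into that directory (output trimmed; your device names will differ):

$ ls /etc/sysconfig/network-scripts/
ifcfg-eth0  ifcfg-lo  ifdown  ifdown-eth  ifup  ifup-eth  ...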

They're easy enough to manage with automation (Ansible, Chef, Salt), and here's how you get a grip on DNS.

As an example, the host I'm dealing with has an eth0 device. Its configuration lives in that directory, in an ifcfg-eth0 file, and its contents are the following:

TYPE="Ethernet"
PROXY_METHOD="none"
BROWSER_ONLY="no"
BOOTPROTO="none"
DEFROUTE="yes"
IPV4_FAILURE_FATAL="no"
NAME="eth0"
UUID="63b28d0a-41f0-4e3a-bf30-c05c98772dbb"
DEVICE="eth0"
ONBOOT="yes"
IPADDR="172.21.0.12"
PREFIX="24"
GATEWAY="172.21.0.1"
IPV6_PRIVACY="no"
ZONE=public
DNS1="172.21.0.1"

Most of this speaks for itself, but there are a few titbits in here.

Managing DNS and resolv.conf

In order to (statically) manage the nameservers used by this host, I put the following into the file:

DNS1="172.21.0.1"

If I needed multiple DNS servers (e.g. for fallback):

DNS1="172.21.0.1"
DNS2="172.21.0.2"
DNS3="172.21.0.3"

In order to apply this, you can use a hammer and reboot — or use your best friend (sarcasm) systemd:

$ systemctl restart NetworkManager
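
NetworkManager then regenerates /etc/resolv.conf from these settings. On this host the result looks roughly like this (a sketch of the expected outcome):

$ cat /etc/resolv.conf
# Generated by NetworkManager
nameserver 172.21.0.1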

Done!

Introducing firewalld

firewalld is another interesting component. It breaks your firewall down into zones, services and sources. (And a few other things.) It's not half bad, even though pf is still superior. Its biggest advantage is that it hides iptables from me (mostly). And it allows me to define rules in structured XML, which is still easier to read and assert on than iptables -nL.

In order to, for example, put my eth0 device into the public zone, I put this into ifcfg-eth0:

ZONE=public

This also implies that I can't put the device into another zone at the same time; that would conflict. But that makes sense. Of course this can be changed, and different devices can be put into different zones. I believe public may be an implicit default.
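
To double-check which zone a device ended up in, firewall-cmd can be asked directly (the exact output depends on your setup):

$ firewall-cmd --get-zone-of-interface=eth0
public
$ firewall-cmd --zone=public --list-all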

FIN

Thanks for reading!

Ansible Molecule drivers

(Hello again. I haven't blogged in a while, but since I'm growing weary of platforms such as Medium, here we go.)

I've recently spent ~~too much~~ a lot of time with Ansible. Once I got into the rhythm of playbooks, roles and maybe modules/libraries, I desperately needed a way to test my YAML. And by testing, I didn't mean the annoying linting that Ansible ships with, but actual (integration) tests to verify everything works.


Enter Molecule (and TestInfra)

Molecule seems to be the go-to in the Ansible universe. It's an odd project — primarily because it's so undocumented (and I don't mean Python class docs, but human-readable examples).

One of the fun things about Molecule is drivers. Drivers allow me to use Docker, VirtualBox or a cloud service (like AWS, Azure, DO, Hetzner ;-)) to start instances that my playbook is run on (and then TestInfra runs assertions and collects the results). In a nutshell, this is what Molecule does — think of it as test-kitchen.

Drivers and switching them

Drivers are crucial to a scenario. You can't, or at least shouldn't, create a scenario and then switch it to another driver. When a scenario is initialised with a driver, Molecule creates create.yml and destroy.yml (playbook) files. These are highly specific to the driver, and Molecule doesn't play well when they are incorrect or missing.
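
In practice that means one scenario per driver, created up front. For example (a rough sketch; the exact flags differ between Molecule versions):

$ molecule init scenario --scenario-name docker --driver-name docker
$ molecule test --scenario-name docker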

It took me too long to figure this out. Hence, I'm leaving a note here.

Fin

I promise I'll blog more. Again. Thanks for reading!

What's wrong with composer and your .travis.yml?

I'm a huge advocate of CI, and of one service in particular: Travis-CI.

Travis-CI runs a continuous integration platform for both open source and commercial products. In a nutshell: Travis-CI listens for a commit to a Github repository and runs your test suite. Simple as that, no Jenkins required.

At Imagine Easy we happily take advantage of both. :)

So what's wrong?

For some reason, every other open source project (and probably a lot of closed source projects) uses Travis-CI wrong, in a way that will eventually break your builds.

When exactly? Whenever Travis-CI promotes composer to be a first-class citizen on the platform and attempts to run composer install automatically for you.

There may be breakage, but there may also be a slowdown, because by then you may end up with not one but two composer install runs before your tests actually run.

Here's what needs fixing

A lot of projects use composer like so:

language: php
before_script: composer install --dev
script: phpunit

Here's what you have to adjust

language: php
install: composer install --dev
script: phpunit

install vs. before_script

I had never seen the install target before, not consciously at least. And since I don't do a lot of CI for Ruby projects, I wasn't exposed to it there either. On a Ruby build, Travis-CI will automatically run bundler for you, using said install target.

Order of execution

In a nutshell, here are the relevant targets and how they execute:

  1. before_install
  2. install
  3. before_script
  4. script

The future

The future is that Travis-CI will do the following:

  1. before_install will self-update the composer(.phar)
  2. install will run composer install
  3. There is also the rumour of a composer_opts (or similar) setting, so you can provide options like --prefer-source to the install step without having to write your own install target
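
In shell terms, a PHP build would then roughly boil down to this (my sketch, not official Travis-CI behaviour):

composer self-update     # before_install: update composer(.phar) itself
composer install --dev   # install: fetch dependencies
phpunit                  # script: run the test suite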

Fin

Is any of this your fault? I don't think so, since the documentation leaves a lot to be desired. Scanning it while writing this blog post, I can't find a mention of the install target on the pages related to building PHP projects.

Long story short: go update your configurations now! ;)

I've started with doctrine/cache and doctrine/dbal, and will make it a habit to send a PR each time I see a configuration which is not what it should be.

Speeding up composer on AWS OpsWorks

At EasyBib, we're heavy users of composer and AWS OpsWorks. Since we recently moved a lot of our applications to a continuous deployment model, the benefits of speeding up the deployment process (~4-5 minutes) became more obvious.

Composer install

Whenever we run composer install, there are a lot of round-trips between the server, our Satis and GitHub (or Amazon S3).

One of my first ideas was to get around a continuous reinstall by symlinking the vendor directory between releases. This doesn't work consistently, for two reasons:

What's a release?

OpsWorks, or Chef in particular, calls deployed code releases.

A release is the checkout/clone/download of your application and lives in /srv/www:

srv/
└── www
    └── my_app
        ├── current -> /srv/www/my_app/releases/20131008134950
        ├── releases
        └── shared

The releases directory contains your application code, and the latest release is always symlinked into place as current.

Atomic deploys

  1. Deploys need to be atomic. We don't want to break whatever is currently online, not even for a fraction of a second.
  2. We have to be able to roll back deployments.

Symlinking the vendor directory between releases doesn't work because it would break the code currently running (who knows how long the composer install or the application server restart takes), and it would require an additional safety net to roll back a failed deploy successfully.

Ruby & Chef to the rescue

Whenever a deployment is run, Chef allows us to hook into the process using deploy hooks. These hooks are documented for OpsWorks as well.

The available hooks are:

  • before migrate
  • before symlink (!)
  • before restart
  • after restart

In order to use them, create a deploy directory in your application and put a couple ruby files in there:

  • before_migrate.rb
  • before_symlink.rb
  • before_restart.rb
  • after_restart.rb
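
The resulting layout looks something like this (my_app is just a placeholder name):

my_app/
└── deploy
    ├── before_migrate.rb
    ├── before_symlink.rb
    ├── before_restart.rb
    └── after_restart.rb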

If you're a little in the know about Rails, these hooks will look familiar.

The migration hook is typically used to run database migrations — something we don't do and probably never will. ;-) But rest assured: at this point the checkout of your application is complete; in other words, the code is on the instance.

The symlink hook is what we use to run composer install and get the web app up to speed; we'll take a closer look in a second.

Before restart is a hook used to run commands before your application server reloads — for example something like purging cache directories, whatever you want to get in order before /etc/init.d/php-fpm reload is executed to revive APC.

And last but not least, after restart — used by our applications to send an annotation to NewRelic that we successfully deployed a new release.

Before symlink

So up until now, the before_symlink.rb looked like this:

composer_command = "/usr/local/bin/php"
composer_command << " #{release_path}/composer.phar"
composer_command << " --no-dev"
composer_command << " --prefer-source"
composer_command << " --optimize-autoloader"
composer_command << " install"

run "cd #{release_path} && #{composer_command}"

Note: release_path is a variable automatically available/populated in the scope of this script. If you need more, your node attributes are available as well.

Anyway, reading Scaling Symfony2 with AWS OpsWorks inspired me to attempt to copy my vendors around. But instead of doing it in a recipe, I decided to use one of the available deploy hooks:

app_current = ::File.expand_path("#{release_path}/../../current")
vendor_dir  = "#{app_current}/vendor"

deploy_user  = "www-data"
deploy_group = "www-data"

release_vendor = "#{release_path}/vendor"

::FileUtils.cp_r vendor_dir, release_vendor if ::File.exists?(vendor_dir)
::FileUtils.chown_R deploy_user, deploy_group, release_vendor if ::File.exists?(release_vendor)

composer_command = "/usr/local/bin/php"
composer_command << " #{release_path}/composer.phar"
composer_command << " --no-dev"
composer_command << " --prefer-source"
composer_command << " --optimize-autoloader"
composer_command << " install"

run "cd #{release_path} && #{composer_command}"

Step by step:

  • copy the current release's vendor to the new release (if it exists)
  • chown all files to the webserver (if the new vendor exists)

This allows the deploy hook to complete, even if we're on a fresh instance.

Benchmarks?

Effectively, this cut deployment from 4-5 minutes to 2-3 minutes. With a tailwind, a 50% improvement.

FIN

That's all. Happy deploying!

SQL MAX() and GROUP BY for CouchDB

While rewriting a couple of SQL statements for CouchDB, we got stuck when we wanted to do a SELECT MAX(...), id ... GROUP BY id in CouchDB.

MySQL

Imagine the following SQL table with data:

mysql> SHOW FIELDS FROM deploys;
+-------------+-------------+------+-----+---------+-------+
| Field       | Type        | Null | Key | Default | Extra |
+-------------+-------------+------+-----+---------+-------+
| project     | varchar(10) | NO   |     | NULL    |       |
| deploy_time | datetime    | NO   |     | NULL    |       |
+-------------+-------------+------+-----+---------+-------+
2 rows in set (0.01 sec)

In order to get the latest deploy for each project, I'd issue:

mysql> SELECT MAX(deploy_time), project FROM deploys GROUP BY project;
+---------------------+----------+
| MAX(deploy_time)    | project  |
+---------------------+----------+
| 2013-10-04 22:01:26 | project1 |
| 2013-10-04 22:02:17 | project2 |
| 2013-10-04 22:02:20 | project3 |
+---------------------+----------+
3 rows in set (0.00 sec)

Simple. But what do you do in CouchDB?

CouchDB

My documents look like this:

{
  "_id": "hash",
  "project": "github-user/repo/branch",
  "deploy_time": {
     "date": "2013-10-04 22:02:20",
     /* ... */
  },
  /* ... */
}

So, after more than a couple hours trying to wrap our heads around map-reduce in CouchDB, it's working.

Here's the view's map function:

function (doc) {
  if (doc.project) {
    emit(doc.project, doc.deploy_time.date);
  }
}

This produces nice key/value pairs — in fact, multiple per project.

And because the map function returns multiple rows per project, we need to reduce our set.

So here is the reduce:

function (keys, values, rereduce) {
  var max_time = 0, max_value, time_parsed;
  values.forEach(function(deploy_date) {
    time_parsed = Date.parse(deploy_date.replace(/ /gi,'T'));
    if (max_time < time_parsed) {
      max_time = time_parsed;
      max_value = deploy_date;
    }
  });
  return max_value;
}

The little .replace(/ /gi,'T') took especially long to figure out. Special thanks to Cloudant's señor JS-date-engineer Adam Kocoloski for helping out. ;-)

Step by step

  • iterate over values
  • fix each value (add the "T") to make Spidermonkey comply
  • parse, and compare
  • return the "latest" in the end

A word of advice: to save yourself some trouble, install a local copy of Spidermonkey and test your view code there, not in your browser.
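
For the date gotcha above, a quick session in the js shell makes the problem obvious (a sketch; the comments describe what to expect):

$ js
js> Date.parse("2013-10-04 22:02:20")                      // NaN: SpiderMonkey won't accept the space
js> Date.parse("2013-10-04 22:02:20".replace(/ /gi, 'T'))  // parses to epoch milliseconds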

Open the view in your browser: http://localhost/db/_design/deploys/_view/last_deploys?group=true
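
Or fetch the same URL from the shell:

$ curl 'http://localhost/db/_design/deploys/_view/last_deploys?group=true'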

{
  "rows":[
    {
      "key":"testowner/testname/testbranch",
      "value":"2013-09-13 11:41:03"
    },
    {
      "key":"testowner/testname/testbranch2",
      "value":"2013-09-12 16:48:39"
    }
  ]
}

Yay, works!

FIN

That's all for today.