Onehub is built on the Amazon Web Services platform, and as a result we are able to quickly and easily provision new servers to meet demand. We have servers for load distribution, web serving, queue processing, transcoding, administration, and other tasks. Consequently, every new server we launch needs a specific configuration to fulfill its role in our infrastructure. After careful consideration, we settled on Puppet to manage the configuration of our servers. Puppet does the majority of the heavy lifting (installing packages, writing out configuration files, and so on), but we still needed something to group and instruct these Puppet instances. By combining a number of open source tools, we were able to vastly simplify the management of our cloud and bring the familiar cap deploy to our infrastructure.
Assembling The Pieces
At the center of our mash-up is iClassify, a small Rails application that registers Puppet instances as nodes in a database. iClassify provides a YAML description of an individual server to the Puppetmaster, the central Puppet server that compiles the configuration for individual nodes. To make registering these nodes easier, we extracted the icagent tool from iClassify and made it available as both a gem and an RPM. Thanks to Facter, each new node comes with a host of attributes that describe the server, which lets configurations be tailored to it specifically. Finally, we use Capistrano to glue everything together.
Setting The Stage
Before any servers can be launched, iClassify and Puppetmaster must be up and running on servers that are reachable by every machine that will register with them; it is handy if both services are on the same machine. For simplicity, our Puppetmaster configuration enables autosigning of certificates for all machines with *.internal addresses. Beware: this setting can be dangerous if your machines need to communicate over the internet as opposed to a private LAN.
The Puppetmaster needs to be configured to use iClassify as its external node terminus. Once set up, Puppetmaster will query iClassify for the classes and variables (tags and attributes, respectively) to compile into the configuration (catalog) for each individual node.
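To make the tag-to-class and attribute-to-variable mapping concrete, here is a minimal Ruby sketch of the kind of YAML document an external node source hands back to Puppet. It is not iClassify's actual code: the helper name node_definition and the sample tags and attributes are made up for illustration, and only the "classes" and "parameters" keys follow Puppet's external node format.

```ruby
require "yaml"

# Hypothetical helper (not part of iClassify): turn a node's tags and
# attributes into the YAML shape Puppet reads from an external node source.
def node_definition(tags, attributes)
  {
    "classes"    => tags.sort,   # tags become Puppet classes
    "parameters" => attributes   # attributes become node variables
  }.to_yaml
end

# Example node: two tags and one custom attribute (illustrative values).
puts node_definition(%w[webserver base], { "puppet_env" => "staging" })
```

Puppet then compiles the catalog for that node from the listed classes, with each parameter available as a variable in the manifests.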
Starting The Show
At the base of all new servers is our 'Stem Cell' image. This Amazon Machine Image has only two packages beyond a basic Linux distribution: Puppet and icagent. At boot it is configured to run both of them.
First, icagent:
$ /usr/share/icagent/bin/icagent -d /usr/share/icagent/recipes/
This command runs icagent's recipes against the server and then submits all of the gathered information to the iClassify server along with a UUID.
Second, Puppet:
$ /etc/init.d/puppet start
The newly registered node will default to receiving the base Puppet configuration, which includes a few handy packages and Puppet modules we use on all of our machines.
Issuing Orders
With our new machine up and running, we need to assign it a role. On the iClassify server, the newly registered node will show up at the top of the server listing. It is a good idea to edit it to have a more descriptive name, and then add the tags (which map to Puppet classes) that define its role. In addition to the standard attributes, we add a custom puppet_env attribute to each node that lets us correlate servers with our application's environments (i.e. master, staging, or production). This is critically important, as it lets us selectively disable cron tasks and other jobs that we would not want to run against test data (like billing!).
Now that the machine is tagged, it will eventually run all of the specified Puppet recipes and be ready to receive the 'live' tag. However, we're going to take this a bit further with some nifty Capistrano scripting.
Stringing Up The Puppets
Using the iclassify-interface gem, we can take a simple Capistrano deploy script and adapt it to coordinate our Puppet instances.
This example uses a standard Capistrano deploy scheme. Please note that the deploy path from Capistrano should match the module location for the different Puppet environments. Our Puppet instances run in the same categories as our application (i.e. puppetenv == railsenv).
Now, deploying new Puppet recipes is as simple as running $ cap staging deploy: the code is checked out of version control and Puppet is restarted on all the servers in that environment. We can also use this same trick in our application's deploy script.
When it comes time to deploy the application, the list of servers is dynamically generated, saving us from having to tediously edit the deploy.rb file.
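The idea of a dynamically generated server list can be sketched in plain Ruby. This is only an illustration under stated assumptions: the Node struct, the servers_for helper, and the sample hosts are hypothetical stand-ins for whatever an iClassify search returns, and the sketch does not use the iclassify-interface gem's real API.

```ruby
# Hypothetical stand-in for a node record returned by an iClassify search.
Node = Struct.new(:hostname, :tags, :attributes)

# Return the hostnames of nodes that belong to the given environment
# and carry every one of the requested tags.
def servers_for(nodes, env:, tags:)
  nodes.select { |n|
    n.attributes["puppet_env"] == env && (tags - n.tags).empty?
  }.map(&:hostname)
end

# Illustrative inventory; in practice this would come from iClassify.
NODES = [
  Node.new("web1.internal", %w[webserver live], { "puppet_env" => "staging" }),
  Node.new("web2.internal", %w[webserver],      { "puppet_env" => "staging" }),
  Node.new("web3.internal", %w[webserver live], { "puppet_env" => "production" }),
]

p servers_for(NODES, env: "staging", tags: %w[webserver live])
# prints ["web1.internal"]
```

In a Capistrano 2 style deploy script, a list like this could be splatted into a role declaration, for example role :web, *servers_for(...), so that no hostname is ever hard-coded in deploy.rb.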
The Final Act
While the simplified deployment of our configurations is a huge feature in itself, the true beauty comes in making 'smart' Puppet recipes. For example, we have an instance of nginx that functions as a load-balancer: it proxies to our web servers. Nginx needs to know which 'webservers' are available for it to proxy to at any given time. Inside our nginx manifest we can define a search for the Puppetmaster to perform against iClassify to populate this list:
$nginx_lb_search = "puppet_env:${puppet_env} tag:webserver tag:live"
We simply (1) start a new server, (2) tag it, (3) deploy our application, (4) tag it live, and (5) deploy Puppet. The first few times there were a few kinks (hey, that's what staging environments are for!), but with all of the pieces strung together, we have automated an amazing amount of tedious and error-prone work. A new webserver will boot, install all of our base packages, receive the application and start its processes, then get automatically included in the load balancer's round-robin list.
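As a rough illustration of that last step, the hostnames returned by the search could be rendered into an nginx upstream block, which nginx balances round-robin by default. The sketch below is plain Ruby rather than our actual Puppet template; the nginx_upstream helper, the upstream name, and the port are assumptions made for the example.

```ruby
# Render an nginx upstream block from a list of backend hostnames.
# Hypothetical helper: the upstream name and port are illustrative only.
def nginx_upstream(name, hosts, port: 80)
  # Sort so the generated file is stable between Puppet runs.
  servers = hosts.sort.map { |h| "  server #{h}:#{port};" }
  (["upstream #{name} {"] + servers + ["}"]).join("\n")
end

puts nginx_upstream("webservers", %w[web2.internal web1.internal], port: 8080)
```

Sorting the hosts keeps the rendered file identical when the search returns the same nodes in a different order, so nginx is only reloaded when the set of live webservers actually changes. Note that nginx rejects an upstream block with no servers, so a real template would need to handle an empty search result.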