My Workflow with Chef

Something I wish somebody had published before I went and figured it out myself. I started with a single repository for everything, quickly realizing that maintaining such a repository would be a pain. I then split the community cookbooks and my own cookbooks into two directories within the repository, so I could update the community versions with knife cookbook install.

As soon as multiple projects share some cookbooks, this approach also becomes hard to manage. After a lot of reading and exploring, I ended up with the following.

One Repository per cookbook

This is an approach that, as far as I can tell, evolved from the use of Vagrant and Berkshelf in recent months. If you start with these tools, it makes sense to manage every cookbook in its own repository. That way you can easily change and test things without having to worry about the entire repository.

For personal cookbooks I have the habit of prefixing the name, so tools like Berkshelf and knife won't suddenly pull versions from the community repositories just because the name is identical.

Use a 'Meta' cookbook for roles and configuration

Beating a dead horse here, but still: this is something I wish the Chef Server did better. Because of the lack of versioning for all the configuration on the Chef Server (environments, roles etc.), I mostly end up pulling all configuration into meta cookbooks that just pull together all required cookbooks and apply configuration through attributes.

For example, you would create a cookbook 'webserver' that includes Apache and applies the configuration in attributes/apache.rb.

With this, you can actually test configuration changes before pushing them live. Adding new nodes is then just a matter of setting one entry in the node's run list to recipe[webserver].
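
Such a meta cookbook can be sketched like this; the attribute names are illustrative, as the real apache2 community cookbook exposes its own set:

```ruby
# webserver/metadata.rb
name    'webserver'
version '0.1.0'
depends 'apache2'

# webserver/attributes/apache.rb
# site-specific overrides, applied on top of the community defaults
default['apache']['keepalive'] = 'On'
default['apache']['contact']   = 'ops@example.com'

# webserver/recipes/default.rb
include_recipe 'apache2'
```

Since the cookbook is versioned, a change to these attributes can be converged and tested in Vagrant before the new version is released to any live node.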

Berkshelf for Version Management

Berkshelf is a nice tool, albeit somewhat buggy at times. It helps with version management for a cookbook. Make sure you don't use both chef_api :config and site :opscode in one Berksfile: choose your "master" source, and configure individual special cookbooks to pull from a different one. Using both sources lets Berkshelf choose where to get a cookbook from, which I have found to be very funky.
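
A Berksfile following that rule might look like this; the cookbook names and the Git URL are placeholders:

```ruby
# Berksfile
site :opscode                  # the single "master" source

cookbook 'apache2'
cookbook 'mysql'

# one special cookbook explicitly pinned to a different source
cookbook 'acme-webserver',
  git: 'git@git.example.com:cookbooks/acme-webserver.git'
```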

Also, don't use version constraints in metadata.rb. Berkshelf 2 can't use them, and in my experience the Chef Server tends to lock up trying to resolve them. Use the Berksfile for constraints, and then berks apply them to an environment on the Chef Server that is used by your destination nodes.

And don't add the Berksfile.lock to the repo: after writing it once, Berkshelf 2 can't read it anymore, so you'll end up constantly deleting it… Occasionally deleting ~/.berkshelf/cookbooks also seems to be a good idea, to prevent Berkshelf from using it as a cache.

I hope Berkshelf 3 is released soon and fixes the oddities and bugs that currently exist. I haven't been brave enough to use the beta yet.

Vagrant for Testing and Development

Use Vagrant. Use one of the provisionerless boxes from opscode/bento, and add the Omnibus and Berkshelf plugins to Vagrant. That way you have a 'clean' base box with a Chef version of your choice.

The Vagrant box never talks to the Chef Server, so if you need data bags, use chef-solo-search.
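
A Vagrantfile for this setup could look roughly like the following; the box name and run list are placeholders, and the omnibus/berkshelf settings assume the vagrant-omnibus and vagrant-berkshelf plugins are installed:

```ruby
# Vagrantfile
Vagrant.configure('2') do |config|
  config.vm.box = 'opscode-centos-6.5'    # provisionerless bento box

  config.omnibus.chef_version = :latest   # vagrant-omnibus: install Chef on first boot
  config.berkshelf.enabled    = true      # vagrant-berkshelf: resolve the Berksfile

  config.vm.provision :chef_solo do |chef|
    chef.run_list = ['recipe[webserver]']
  end
end
```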

Foodcritic / knife cookbook test / Unit Test

As a bare minimum, use foodcritic and knife cookbook test to verify your code. I use chef-minitest for unit testing. You really want to use these facilities; you'll thank yourself the first time you need to change something after not having touched a cookbook in a few weeks.

Tie everything into Jenkins

At work I have a Jenkins instance that watches the cookbook repositories, runs foodcritic and knife cookbook test on them, and finally vagrant destroy -f && vagrant up. So before I berks upload, I can verify my changes on Jenkins against a clean installation.

The Steps

This is what the workflow entails at the end:

  • Bump the version in metadata.rb
  • Work on the cookbook
  • Write unit tests
  • vagrant provision to test the changes locally; rinse, repeat
  • git commit to the central repository, wait for Jenkins to finish the entire run
  • berks upload to upload the cookbook
  • berks apply to send the cookbook into the wild
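
As a shell transcript, the whole cycle looks roughly like this; the commit message and the environment name are placeholders:

```shell
vi metadata.rb                # bump the version
# ...hack on the cookbook, write tests...
vagrant provision             # converge the changes onto the running box
git commit -am 'webserver 0.2.0' && git push
# ...wait for Jenkins to go green...
berks upload                  # push the cookbook to the Chef Server
berks apply production        # lock the new versions into the environment
```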

A few Chef Practices

Here is some advice on deploying Chef in your environment that may help you avoid some of its pitfalls.

Try not to Fork Community Recipes

I did it, and it took me a lot of work to get back to the community versions. Once you fork a cookbook by changing the community version directly, merging upstream changes becomes your task. Unless you do a lot of merging, you will drift away from the community versions rather quickly, and all the maintenance burden will be yours.

A better way is to write wrapper cookbooks that just add to the community version: create a new cookbook, add a depends 'nginx' to its metadata.rb, and write a recipe that starts with include_recipe 'nginx', after which you can make your changes.
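
A wrapper around the community nginx cookbook could be sketched as follows; the cookbook name and the template are placeholders:

```ruby
# acme-nginx/metadata.rb
name    'acme-nginx'
version '0.1.0'
depends 'nginx'

# acme-nginx/recipes/default.rb
include_recipe 'nginx'

# local additions go after the include, e.g. an extra config fragment
template '/etc/nginx/conf.d/acme.conf' do
  source   'acme.conf.erb'
  mode     '0644'
  notifies :reload, 'service[nginx]'
end
```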

Some community cookbooks, though, are extremely weak, and sometimes don't really warrant a wrapper cookbook.

Another advantage is that you can actually upstream fixes you make to a community cookbook without having to worry about the local changes that are specific to your environment.

Do not use roles for run_lists

Today I use Chef roles and environments just for configuration. Roles and environments have one big flaw: they are not versioned in Chef. So if you add a new cookbook to the run list of your "Webserver" role on a development server, the production environment will receive that change too.

Instead, create another cookbook that uses depends in metadata.rb and include_recipe to build your run lists. That way you can version the run-list cookbook and don't have to worry about changes spilling into environments you didn't intend. These run-list cookbooks can also be used to set defaults, which then become versionable as well.

Test your Work

This is probably a somewhat alien concept to a lot of the more traditional admin folk, but once you start with Chef, your infrastructure is code. And code has to be tested.

Create a development environment that suits your needs, and use chef-minitest to verify your results. Do not test in your production environment.

The better your test coverage becomes, the less scary changes to your infrastructure are. If you write a large cookbook today and skip the tests, you will be scared to touch the thing in half a year, once you're no longer familiar with the code.
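
A minimal chef-minitest spec might look like this; it assumes the minitest-chef-handler layout, and the recipe and package names are illustrative:

```ruby
# webserver/files/default/tests/minitest/default_test.rb
describe_recipe 'webserver::default' do
  it 'installs the apache package' do
    package('httpd').must_be_installed
  end

  it 'keeps the apache service running' do
    service('httpd').must_be_running
  end
end
```

The handler runs these specs on the node at the end of every converge, so a failing assertion shows up right in the Vagrant or Jenkins run.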

Other minor tips

  • If you don't already use Berkshelf, keep your cookbooks in a separate directory in your repo, and set cookbook_path in your knife.rb
  • Where applicable, use a dedicated Git repository for each of your cookbooks
  • Put a universal .chef/knife.rb into your repository
  • Don't put site-specific configuration into your cookbooks. Use the meta ones, or roles and environments
  • shef -z is a good way to work interactively with a Chef Server
  • Don't run chef-client as a daemon; it is a memory hog and grows over time. I prefer the cron variant.
  • If at all possible, consider different distributions in your recipes, and try not to specialize on one. Quite a few community cookbooks are impossible to use on RHEL variants because they were developed on Ubuntu.
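
A universal .chef/knife.rb as suggested above could look like this; the server URL and key locations are placeholders:

```ruby
# .chef/knife.rb -- checked into the repository, works from any clone
current_dir = File.dirname(__FILE__)

node_name       ENV['USER']
client_key      "#{ENV['HOME']}/.chef/#{ENV['USER']}.pem"
chef_server_url 'https://chef.example.com'
cookbook_path   ["#{current_dir}/../cookbooks",
                 "#{current_dir}/../site-cookbooks"]
```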

Use Chef with Kickstart / Cobbler

If a full-stack deployment of Chef to manage your infrastructure seems a little nerve-wracking to start with, there are ways to incorporate it into your current workflow in a less invasive manner.

I wanted a Kickstart environment capable of deploying a number of different distributions. To get the system off the ground, I decided to go with Cobbler, as the alternative solutions didn't seem mature enough, or were too distribution-specific at the time.

The problem then is how to configure the different distributions so the resulting installations have a common setup and feel. Cobbler has some mechanisms for this, but I of course decided to go with Chef.

Getting Chef-Solo onto a fresh install

I've tried various ways to get Chef onto a system in the past, from distribution-supplied packages to gem install. The issue is that you end up with various versions of Chef and various distribution-specific bugs (Ubuntu's random Ruby segfaults, for example).

Recently, Opscode started to create a full-stack package of Chef called "Omnibus Chef". These packages ship with Ruby and everything else required for Chef to run.

In the Cobbler configuration, there is a snippet that looks like this:

# Install Omnibus Chef
curl -L https://www.opscode.com/chef/install.sh | bash

# Create Chef Solo Config
mkdir -p /etc/chef/
cat <<EOBM > /etc/chef/solo.rb
file_cache_path "/var/chef-solo/cache"
cookbook_path ["/var/chef-solo/cookbooks", "/var/chef-solo/site-cookbooks"]
role_path "/var/chef-solo/roles"
data_bag_path "/var/chef-solo/data_bags"
EOBM

# Clone Chef Cookbooks for chef-solo
rm -rf /var/chef-solo
/usr/bin/git clone http://<git-server>/git/chef.git /var/chef-solo

# chef solo needs fqdn to be set properly
# something that can't be guaranteed during install
/bin/hostname localhost

# Run Chef solo
/opt/chef/bin/chef-solo \
    -o 'recipe[acme::cobbler-install]' \
    -c /etc/chef/solo.rb \
    -L /var/log/chef-client.log

This way you hand over control of the system's configuration to Chef as early as possible, and don't have to maintain shell scripts or Cobbler templates for the different distributions.

Testing the whole thing

To test the entire stack from Cobbler to Chef, I've built a script that uses Cobbler's XMLRPC interface to switch distributions after the Chef minitests have finished successfully. A little `rc.local' script tests the cookbooks and, on success, switches the distribution, scrubs the disk and reboots. On failure, the system just stops, waiting for somebody to fix the cookbooks and tests.
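
The switching step can be sketched over Cobbler's XMLRPC API like this, using Ruby's xmlrpc client; the host, credentials, system and profile names are all placeholders:

```ruby
# switch_distro.rb -- sketch of the distribution switch via Cobbler's XMLRPC API
require 'xmlrpc/client'

cobbler = XMLRPC::Client.new2('http://cobbler.example.com/cobbler_api')
token   = cobbler.call('login', 'cobbler', 'secret')

# point the test system at the next profile and re-sync Cobbler
handle = cobbler.call('get_system_handle', 'testnode', token)
cobbler.call('modify_system', handle, 'profile', 'centos6-x86_64', token)
cobbler.call('save_system', handle, token)
cobbler.call('sync', token)
```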

Chef

What is Chef, and what is the big deal

In Marketing words:

Chef is an open-source systems integration framework built specifically for automating the cloud. No matter how complex the realities of your business, Chef makes it easy to deploy servers and scale applications throughout your entire infrastructure. Because it combines the fundamental elements of configuration management and service oriented architectures with the full power of Ruby, Chef makes it easy to create an elegant, fully automated infrastructure.

Infrastructure automation is not a new thing. CFEngine, for example, has existed for years, though it never had the impact that Chef or Puppet have.

For me, the big deal these days is primarily the ability to create a reproducible environment. Yes, I am lazy, and yes, I like to automate everything I can. But the fact that I can rely on my systems behaving identically after every installation is, for me personally, the biggest deal.

I'll try to cover some more Chef-related topics here in the future.

Why Chef

I guess this boils down largely to personal preference. The features that got me into Chef over Puppet are as follows.

The DSL of Chef just looks nice to me

A statement in Chef to deploy a template looks like this:

template "#{node[:phpfpm][:pooldir]}/claus.conf" do
    source "fpm-claus.conf.erb"
    mode 00644
    owner "root"
    group "root"
    notifies :restart, "service[#{node[:phpfpm][:service]}]"
end

While in Puppet it looks like:

file {"/usr/local/bin/jvm_options.sh":
    mode    => "664",
    owner   => "root",
    group   => "root",
    content => template("jvm/options.erb"),
    notify  => Service[apache2]
}

As i said, personal preference.

The ability to seamlessly switch between Chef DSL and Ruby

Chef's DSL is an extension of Ruby, so while writing recipes you can seamlessly incorporate Ruby code.

For example a loop to install a bunch of packages:

%w{spamassassin procmail razor fetchmail python-spf}.each do |pkg|
    package pkg do
        action :install
    end
end

Interaction between Nodes

This is among the primary reasons I prefer Chef. A good example to illustrate it is the Nagios cookbook: the server recipe can search for all nodes and create a configuration for them, while the client recipe finds all Nagios servers and allows access from them.

Another example could be a load balancer cookbook that uses a role search across all nodes to identify its web servers, and creates the configuration accordingly:

search(:node, "role:web-fe-group-a") do |r|
    # Configure LB
end

Data Bags

A data bag is a collection of JSON objects stored on the Chef Server that can be searched and used in recipes.

These can be used to create users, for example. A data bag item for a user could look like this:

{
    "comment": "Example User",
    "groups": "users",
    "gid": 1041,
    "id": "example",
    "shell": "/bin/false",
    "uid": 4131
}

To use this in a recipe, you would do something like:

search(:users, 'groups:users') do |u|
    user u['id'] do
        uid u['uid']
        gid u['gid']
        shell u['shell']
        comment u['comment']
        supports :manage_home => true
        home "/home/#{u['id']}"
    end
end

My History with Chef

The first commit to my personal Chef repository is from 2011. I got started with Chef by using the free trial on opscode.com, which allows the management of up to 5 nodes for free. This is probably the best way to get acquainted with Chef development, as setting up the full stack can be a hassle.

I then moved to littlechef to handle my systems. Littlechef is Chef Solo with some extensions to manage a bunch of nodes. For learning Chef, it is also a nice alternative to setting up the full stack.

Today I run a mix of `chef-solo' in a Cobbler environment, and Chef with the community server, to handle the entire life cycle of an installation.