This is the first article in a three-part series on managing LAMP environments (Linux, Apache, MySQL, and PHP) with Chef (a configuration management tool).
The series covers using Chef to provision a development environment on a virtual machine with Vagrant and VirtualBox, and a production environment in the cloud with Amazon EC2. Prior knowledge of how to manually configure a LAMP stack is assumed: this tutorial shows how to automate the process, but doesn’t explain the configuration options in any detail.
Articles in this series:
- Part 1: Introduction (this article)
- Part 2: Provisioning a LAMP stack
- Part 3: Configuring and deploying a web application
It can be a real hassle managing the different computing environments for developing web applications. At a minimum, you’ve got a development workstation and a production server. Often you have many of each, plus additional environments for testing, staging, etc. These environments need to match each other as closely as possible to minimize unpleasant surprises, and they need to be kept in sync as changes are made. If you’re maintaining all the environments by hand, something as simple as adding a new PHP extension can be a pain, and major upgrades can cause serious headaches.
This is where configuration management (CM) comes in. A CM tool like Chef can install and configure system software according to recipes you define in source code. This lets you manage your infrastructure in much the same way you manage your code, using version control, automated builds, issue tracking, etc.
Why Chef instead of Puppet? Puppet is fine too, and I know some smart people who are using it. Puppet has been around longer and has a larger user base, but Chef is gaining quickly (especially among the DevOps and web developer crowd). I prefer Chef’s “deterministic” approach, which lets me tell it exactly what to do in what order, over Puppet’s “declarative” approach, in which you describe the final system state and let Puppet decide how to get there. I also like some of Chef’s unique features, like encrypted data bags. Here’s one article with a relatively balanced comparison of Chef and Puppet.
Vagrant is a tool for managing virtualized development environments in VirtualBox virtual machines. It can automatically create and provision development VMs using a Vagrant config file and the Chef provisioning scripts. In short, Vagrant makes it very easy to set up dev VMs.
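To give a rough idea of what a Vagrant config file looks like, here is a minimal sketch of a Vagrantfile that hands provisioning over to Chef Solo. The box name, the forwarded port, and the `cookbooks` directory are placeholders I picked for illustration, and the exact syntax depends on which Vagrant version you have installed.

```ruby
# Vagrantfile (a sketch; box name and cookbook layout are assumptions)
Vagrant.configure("2") do |config|
  # Hypothetical base box; substitute one that matches your production OS.
  config.vm.box = "example/base-box"

  # Forward the guest's port 80 to port 8080 on the host,
  # so the site can be opened in a browser on the workstation.
  config.vm.network "forwarded_port", guest: 80, host: 8080

  # Provision the VM with Chef Solo, reading cookbooks from ./cookbooks.
  config.vm.provision "chef_solo" do |chef|
    chef.cookbooks_path = "cookbooks"
    chef.add_recipe "apache2"
  end
end
```

With a file like this in the project root, `vagrant up` creates the VM and runs the provisioner, and `vagrant destroy` throws the VM away.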
Using a virtual machine for development has several advantages:
- Development environments can use the exact same OS and system software as the production environment (so-called dev/prod parity).
- Homogeneous environments are easier to maintain.
- If you mess something up, you can simply destroy the VM and spin up a new one.
- Multiple VMs can coexist on the same host machine, with conflicting environments for different applications.
- Virtual dev environments are more portable and distributed, making them more flexible and robust.
All of these benefits can help increase developer productivity and lead to happier developers.
EC2 (Elastic Compute Cloud) is Amazon’s cloud computing service, which provides resizable server instances on demand. There are other similar services, but this tutorial focuses on EC2 because it seems to be the most popular.
Configuration management and cloud computing are soul mates. When you use them both together, you can provision new server instances, both hardware and software, with a single command. That means you can do it automatically, enabling all the auto-scaling magic that cloud computing is famous for.
It’s also handy to be able to spin up cloud instances for brief development or testing use, and then tear them down when you’re done with them. This can be a cost-effective way to get short-term access to a large number of server instances.
Key Chef concepts
As you know by now, Chef is a configuration management tool for installing and configuring servers according to source code recipes. These recipes are written in Ruby, mostly using a domain-specific language (DSL) that is custom to Chef. If you don’t know Ruby, don’t worry. You can go a long way just by customizing the examples in this tutorial, and it’s not hard to learn just enough Ruby for Chef.
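To make the recipe idea concrete, here is a minimal sketch of one. It assumes a Debian or Ubuntu node (hence the `apache2` package name); the cookbook name, the config file path, and the `demo.conf.erb` template are made up for illustration.

```ruby
# recipes/default.rb in a hypothetical "lamp_demo" cookbook

# Install Apache from the distribution's package repository.
package "apache2"

# Make sure the service starts on boot and is running now.
service "apache2" do
  action [:enable, :start]
end

# Render a config file from a template shipped with the cookbook
# (demo.conf.erb is a hypothetical template), and reload Apache
# whenever the rendered file changes.
template "/etc/apache2/sites-available/demo.conf" do
  source "demo.conf.erb"
  owner "root"
  group "root"
  mode "0644"
  notifies :reload, "service[apache2]"
end
```

It is ordinary Ruby, but most of what you write is calls to Chef's resource DSL (`package`, `service`, `template`) rather than general-purpose code.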
There are three main players in Chef infrastructure:
- The Chef server, the central repository where all the configurations are stored.
- One or more Chef clients, which are agents that connect to the Chef server and execute the commands that configure the node on which they are installed.
- One or more management workstations, from which admins can manage the nodes and configurations using a command-line tool called
“Nodes” are the whole point of the CM exercise: they are the servers you want to configure. Software packages are installed and configured on nodes by way of “cookbooks”, which are composed of various Ruby scripts and other supporting files. Nodes can have “roles” (e.g. “webserver” or “database”), which define a collection of recipes to be applied to those nodes. Nodes have settings called “attributes”, which can serve as variables in recipes, and which are set based on the node’s role, environment, and node-specific configuration.
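One way to picture how attributes from several sources end up as a single set of values is as layered hashes that get merged. The sketch below is plain Ruby, not Chef's API, and the layering order (role, then environment, then node-specific) is a simplification I'm assuming for illustration; Chef's real precedence rules are more detailed. All attribute names are hypothetical.

```ruby
# Recursively merge two attribute hashes; values in `higher` win on conflict.
def deep_merge(lower, higher)
  lower.merge(higher) do |_key, low, high|
    low.is_a?(Hash) && high.is_a?(Hash) ? deep_merge(low, high) : high
  end
end

# Merge any number of attribute layers, lowest priority first.
def resolve_attributes(*layers)
  layers.reduce({}) { |merged, layer| deep_merge(merged, layer) }
end

# Hypothetical attribute sources for one node.
role_attrs = { "apache" => { "listen_port" => 80, "keepalive" => true } }
env_attrs  = { "php"    => { "display_errors" => true } }
node_attrs = { "apache" => { "listen_port" => 8080 } }

attrs = resolve_attributes(role_attrs, env_attrs, node_attrs)
puts attrs["apache"]["listen_port"]  # prints 8080 (the node-specific value wins)
puts attrs["apache"]["keepalive"]    # prints true (inherited from the role)
```

A recipe can then read the merged values instead of hard-coding them, which is what lets the same cookbook serve both a dev VM and a production server.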
Breaking down the terminology
Chef has a mouthful of terminology you need to hear at least once, to keep from getting lost in the documentation. But don’t try to memorize all this stuff now. Just skim the definitions and then refer back here as needed. It should all become clearer as you work through the examples in this tutorial.
The following excerpt is from Introduction to Cookbooks in the Chef manual:
- Cookbooks contain recipes, attribute files, and other configuration information.
- Resources are the building blocks of recipes and describe a discrete piece of a Node’s configuration.
- Resources are idempotent. Applying the same resource to a node twice should have the same result.
- Providers take the necessary actions required to ensure the Node’s state matches the resource description.
- Attributes provide tunable parameters that can be used within a recipe as well as information about the node.
- Roles provide a way to describe a particular function or type of node. Roles have run lists and attributes.
- Environments provide the means of managing different infrastructure spaces within one Chef instance.
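The resource, provider, and idempotence definitions are easier to digest with a toy model. This is plain Ruby with made-up names (`PackageResource`, `converge`), not Chef's actual classes: the resource declares the desired state, and the provider acts only when the node's current state differs.

```ruby
# Hypothetical resource: declares the desired state of one package.
PackageResource = Struct.new(:name, :version)

# Hypothetical provider: compares the desired state with the node's current
# state (a Hash of package name => version) and acts only when they differ.
# Returns true if it changed the node, false if nothing needed doing.
def converge(node, resource)
  return false if node[resource.name] == resource.version # already correct

  node[resource.name] = resource.version # stand-in for a real install
  true
end

node = {}
php  = PackageResource.new("php5", "5.3")

first_run  = converge(node, php) # the package is missing, so this run installs it
second_run = converge(node, php) # the state already matches, so nothing happens

puts "first run changed node: #{first_run}"   # prints "first run changed node: true"
puts "second run changed node: #{second_run}" # prints "second run changed node: false"
```

Because the second run is a no-op, it is safe to run Chef on a node as often as you like.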
And here’s an excerpt from Test-Driven Infrastructure with Chef:
At its simplest, the process of developing infrastructure with Chef looks like this:
- Declare policy using resources
- Collect resources into recipes
- Package recipes and supporting code into cookbooks
- Apply cookbooks to nodes using roles
- Run Chef to configure nodes according to their assigned roles
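The "apply cookbooks to nodes using roles" step boils down to expanding a node's run list into an ordered list of recipes. Here is a rough sketch of that expansion in plain Ruby. The `recipe[...]` and `role[...]` notation follows Chef's run list syntax, but the role definitions and the `expand_run_list` helper are hypothetical, and Chef's own implementation is more involved.

```ruby
# Hypothetical role definitions: role name => ordered run list.
ROLES = {
  "base"      => ["recipe[apt]"],
  "webserver" => ["role[base]", "recipe[apache2]", "recipe[php]"],
  "database"  => ["role[base]", "recipe[mysql::server]"]
}.freeze

# Expand a run list into an ordered list of recipe names, resolving nested
# roles and keeping only the first occurrence of each recipe.
def expand_run_list(run_list, roles = ROLES)
  expanded = run_list.flat_map do |item|
    if (m = item.match(/\Arecipe\[(.+)\]\z/))
      [m[1]]
    elsif (m = item.match(/\Arole\[(.+)\]\z/))
      expand_run_list(roles.fetch(m[1]), roles)
    else
      raise ArgumentError, "unrecognized run list item: #{item}"
    end
  end
  expanded.uniq
end

# A single node playing both roles gets each recipe once, in order.
p expand_run_list(["role[webserver]", "role[database]"])
# prints ["apt", "apache2", "php", "mysql::server"]
```

Order matters here, which fits the "tell it exactly what to do in what order" style of Chef mentioned earlier.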
A picture is worth a thousand words
It might help to engage the other side of your brain at this point. For a graphical overview, take a look at the visual introduction to Chef put together by Kate Leroux of Urbanspoon.
Here is some related work by others that covers similar ground to this tutorial.
Application cookbook: A Chef cookbook intended to streamline the deployment of web applications. It seems to be built with Ruby on Rails in mind, but it supports PHP applications along with other languages and frameworks. Part 3 of this tutorial stole several ideas from an earlier version of the application cookbook.
Chef wiki article on building a LAMP stack: Focuses on building a LAMP stack using an older version of the application cookbook. It’s a bit dated, but it still has some good information.
Enough talk. Let’s build something.
Continue with Part 2: Provisioning a LAMP stack.