As programmers, we adopt new tools to make our lives easier and to speed up development. Web development is an area of extremely rapid innovation, with many incredible libraries and packages. I have worked on a number of Django-powered websites, and I wanted to share my process to help those who are just getting started or who want to improve their setup.
I plan on covering an end-to-end project, from development to deployment on AWS, so there is a lot to cover. To make this more manageable, I’ll break the series into multiple posts. In this segment we will be setting up the local environment.
It may happen that you have an idea for the next billion dollar, social, 2.0 cloud service and you want to just start coding immediately. So you download the tools you need and start hacking. Soon, you need to work with others and incompatibilities arise. Then you push to production and nothing seems to work. In our exuberance to build things, we sometimes forget the engineering part of our work: it’s all about the process.
Firstly, embrace virtualization wholeheartedly. Virtualization, for the unfamiliar, allows you to run a full operating system within another operating system. What does this mean? It means you and your team can develop on Windows, OS X, or Linux but have all your code running within a consistent environment where the operating system and package versions are controlled. You no longer need to worry about compatibility across development machines, staging, or production. The immediate cost is a slight learning curve, but the long-term payoff of consistent environments is that bugs are found more quickly and rarely reach production.
To use virtual machines (VMs) in our workflow, we make use of VirtualBox, which is a tool from Oracle (previously Sun). VirtualBox is a “hypervisor”, so called because it can supervise multiple operating systems. It interfaces those operating systems with the host system, and makes things like network and file access possible. However, dealing with VirtualBox directly can be a bit cumbersome and error prone, so we use a cool tool called Vagrant to make development easier. Vagrant wraps around VirtualBox and uses a “Vagrantfile” to automatically set up and provision our virtual machines. This not only simplifies things but also makes them much more repeatable, and we can put our Vagrantfile under version control to further button down our setup.
Vagrant::Config.run do |config|
  config.vm.box = "lucid32"

  # The url from where the 'config.vm.box' box will be fetched if it
  # doesn't already exist on the user's system.
  config.vm.box_url = "http://files.vagrantup.com/lucid32.box"

  # Assign this VM to a host only network IP, allowing you to access it
  # via the IP.
  config.vm.network :hostonly, "126.96.36.199"

  # Forward a port from the guest to the host, which allows for outside
  # computers to access the VM, whereas host only networking does not.
  config.vm.forward_port 80, 8080
  config.vm.forward_port 81, 7000

  # Share an additional folder to the guest VM.
  config.vm.share_folder("v2-data", "/project", "./")
end
The file is written in Ruby, but it is well documented. Ours is fairly basic: it instructs Vagrant to download a copy of the Ubuntu Lucid 32-bit image as the base of our VM, to make it available at the host-only IP set in the file, to forward some ports (so we can access our web server), and to share our project folder. The goal is to mask the fact that we are developing inside a virtual machine. By mapping our project folder into the VM, all our code is available to both us and the VM, and by forwarding the ports we can access localhost just like we normally would but have the connection forwarded to our VM.
To get started using these tools, we need to get VirtualBox and Vagrant, and then bring our VM up. This is fairly straightforward. First, download VirtualBox from the official website. Vagrant can then be installed using an OS-specific package from the Vagrant website.
Once this is done, we can manipulate our VM using vagrant:
$ vagrant up      # Starts the VM
$ vagrant halt    # Stops the VM
$ vagrant status  # Shows the status
$ vagrant reload  # Restarts the VM
$ vagrant ssh     # SSH into the VM
Those are the basic commands that are needed. You can read the documentation for advanced features and commands.
Once vagrant up finishes, a headless VM has been configured and started in the background, ready for use.
Bootstrapping our Environment
At this point, we have managed to bring up a headless Ubuntu system. Now we need to set up our project so we can start developing. There are a number of ways to do this. The one we are all most familiar with is manually SSH’ing in and tweaking things until they work. While this works, it is often error prone and difficult to repeat. There are a number of tools designed to automate this, including Chef, Puppet, and Fabric. Chef and Puppet are very sophisticated: they allow for centralized management, client-server configurations, and roles, and they are extremely flexible. However, those strengths add a fair bit of complexity, and the tools can be difficult to get started with.
Somewhere between fully automated and completely manual is Fabric. Fabric is a command line tool for system administration. It allows you to define tasks in Python and have them run on any number of machines. This is my personal choice for small projects, as it is very simple to get started with and has a shallow learning curve. Fabric is very flexible in its use, but I prefer to define a set of “environments” and design my tasks to work in any environment. Environments are simple: each one establishes the list of hosts that should be configured and tweaks various settings. I use a “vagrant”, a “staging”, and a “production” environment. If you hate the idea of using virtualization, you could write a new “local” environment that forgoes VMs. In addition to environments, we take our routine commands and implement them as Fabric tasks. Examples are bootstrapping new hosts, updating the running code, and restarting web servers. Then we can simply compose our environments and tasks from the command line, and issue a command that basically says “download the latest code, and restart the production web servers”.
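To make the composition idea concrete, here is a minimal sketch in plain Python. It is a model of the pattern only and does not use Fabric's API; the host names, the task names update_code and restart_web, and the plan helper are all hypothetical, chosen for illustration.

```python
# A plain-Python model of "environments + tasks". This is NOT Fabric's API;
# every host name and task name below is a hypothetical placeholder.

ENVIRONMENTS = {
    "vagrant": {"hosts": ["vagrant-vm"]},
    "staging": {"hosts": ["staging.example.com"]},
    "production": {"hosts": ["web1.example.com", "web2.example.com"]},
}


def update_code(host):
    """Describe the 'download the latest code' step for one host."""
    return "update code on %s" % host


def restart_web(host):
    """Describe the 'restart the web server' step for one host."""
    return "restart web server on %s" % host


TASKS = {"update_code": update_code, "restart_web": restart_web}


def plan(words):
    """Turn ['production', 'update_code', ...] into an ordered list of steps.

    The first word selects the environment; each remaining word names a task.
    Each task is applied to every host before the next task starts.
    """
    env_name, task_names = words[0], words[1:]
    hosts = ENVIRONMENTS[env_name]["hosts"]
    return [TASKS[name](host) for name in task_names for host in hosts]
```

Calling plain `plan(["production", "update_code", "restart_web"])` yields the update step for both production hosts followed by the restart step for both, which mirrors the “download the latest code, and restart the production web servers” command described above. Because tasks only receive a host, the same task works unchanged in any environment.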
Just as Vagrant used a Vagrantfile, Fabric uses a fabfile (fabfile.py specifically). Here is the one for our project. Don’t be overwhelmed: there is a lot there, but most of it is very mechanical. The command we are immediately interested in is setup_vagrant:
def setup_vagrant():
    "Bootstraps the Vagrant environment"
    require('hosts', provided_by=[vagrant])
    sub_stop_processes()         # Stop everything
    sub_install_packages()       # Get the installed packages
    sub_build_packages()         # Build some stuff
    sub_get_virtualenv()         # Download virtualenv
    sub_make_virtualenv()        # Build the virtualenv
    sub_vagrant_link_project()   # Links the project in
    sub_get_requirements()       # Get the requirements (pip install)
    sub_get_admin_media()        # Copy Django admin media over
    sudo("usermod -aG vagrant www-data")  # Add www-data to the vagrant group
    sub_copy_memcached_config()  # Copies the memcache config
    sub_start_processes()        # Start everything
This is the basic command that we issue when we want to provision our Vagrant instance. As the function names and comments indicate, we stop all the running services first, install and build the packages we need, get everything in order, and then start the processes back up. The installed packages are pretty basic: NTP to keep our time updated, Python for obvious reasons, git, gcc, and memcached. The versions in the package manager are relatively up to date, so we don’t need to build them ourselves. The packages we do need to build are Nginx, which is a web server, and uWSGI, which is a Python application server. We use Nginx because it is very performant at serving static content and handling a large number of connections. For dynamic content, Nginx passes the request to uWSGI, which is a fast and stable application server that works very well for Django. Together, these are the basic ingredients of any front-end server.
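To give a feel for what an application server such as uWSGI actually calls when a dynamic request arrives, here is a minimal sketch of a WSGI application using only the standard library. The names application and call_app are hypothetical and are not part of this project's code; Django provides its own callable of the same shape.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A tiny WSGI app: report the requested path as plain text."""
    body = ("Hello from %s\n" % environ.get("PATH_INFO", "/")).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app in-process, roughly as an application server would."""
    environ = {}
    setup_testing_defaults(environ)  # fill in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

In this picture, Nginx answers requests for static files itself and hands everything else to the application server, which invokes the callable once per request and relays the status, headers, and body back.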
In addition to the basics, there are two nifty tools that are integrated in our setup flow. The first is VirtualEnv, which is used to build an isolated Python environment. What that means is that we can have a Python environment with total control over the versions of the libraries that are installed. It is common to work on projects that have different version requirements (Django 1.2 vs Django 1.3), and these conflicts are very difficult to resolve if dependencies are installed at a system level. Instead of installing requirements into our system site-packages directory, VirtualEnv creates a “virtual environment” (shocker, I know), and installs libraries in there.
When we execute sub_get_virtualenv() it downloads the VirtualEnv package, and sub_make_virtualenv() creates a new virtual environment for us at /server/env.example.com. Because of the isolation this provides, we could easily have another project at /server/env.foobar.com with conflicting dependencies. For the developer, once this is set up it is completely transparent, so there is really no reason not to do it. As we will see, we can use this to control the versions we deploy to production to ensure we don’t run into conflicts.
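As a rough illustration of what creating an isolated environment involves, here is a minimal sketch that uses the venv module from the Python 3 standard library. This is a swapped-in technique: the fabfile in this post downloads and runs the standalone VirtualEnv tool, and the make_env helper below is hypothetical.

```python
import os
import venv


def make_env(env_dir):
    """Create an isolated Python environment and return its config file path.

    system_site_packages=False keeps system-level libraries out of the
    environment. with_pip=False keeps this sketch fast and offline; a real
    setup would normally want pip available inside the environment.
    """
    venv.create(env_dir, system_site_packages=False, with_pip=False)
    return os.path.join(env_dir, "pyvenv.cfg")
```

Calling make_env once per project directory (for example /server/env.example.com and /server/env.foobar.com) gives each project its own site-packages, which is what lets conflicting dependencies coexist on one machine.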
Once we have our virtual environment, we need to install the frameworks we use for development, the main one being Django. Continuing our trend of automation, we use pip, which is a package installer for Python. Not coincidentally, pip understands virtual environments and can install directly into our folder instead of at the system level. When Fabric calls sub_get_requirements() we are invoking pip with our requirements.txt file. For our simple example, we rely on Django; South, which provides migrations for our models; and pylibmc, which is a memcache client.
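One way to picture how pip ends up installing into the environment rather than the system is to look at the command line involved. The sketch below only builds that command; the pip_install_command helper is hypothetical and is not taken from the project's fabfile, which may invoke pip differently.

```python
import os


def pip_install_command(env_dir, requirements="requirements.txt"):
    """Build the command that installs a requirements file into env_dir.

    Running pip through the environment's own interpreter is what sends
    the packages into that environment instead of the system site-packages.
    """
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    python = os.path.join(env_dir, bin_dir, "python")
    return [python, "-m", "pip", "install", "-r", requirements]
```

The resulting list can be handed to subprocess.check_call on the machine that hosts the environment. Pinning exact versions in requirements.txt is what makes the install repeatable from one machine to the next.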
There is a lot to internalize, and although we are adding complexity to the configuration and setup of projects, this is a one-time cost. Instead of needing to remember how to set things up or documenting the steps haphazardly, we have a self-documenting fabfile. As long as you avoid by-hand configuration and always add each step to the fabfile, you never need to worry about forgetting one. All this enables us to provision a system with a single command:
$ fab vagrant setup_vagrant default_project
This will select “vagrant” as our environment, meaning the only host we are provisioning is our local VM. Then, by chaining setup_vagrant, we will run all the commands to set up our environment. Lastly, when default_project runs it will create a blank generic Django project for us.
Now, if we point our browser to http://localhost:8080, we see our blank Django project being served.
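If you would rather script this check than open a browser, here is a minimal sketch that tests whether anything is listening on the forwarded port. The port_is_open helper is hypothetical; it only confirms that a TCP connection succeeds, not that the page renders correctly.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable hosts, and timeouts.
        return False
```

With the Vagrantfile above, `port_is_open("localhost", 8080)` checks host port 8080, which is forwarded to port 80 inside the VM where the web server listens.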
Let’s summarize the commands that were necessary for us to set all of this up:
$ vagrant up
$ fab vagrant setup_vagrant default_project
By putting our effort into defining our environment in a tangible way, we can make use of tools such as Vagrant and Fabric to make setting up easy and portable. If we need to onboard a new developer, we can now have the entire stack up and running in 20 minutes. I won’t deny that there is a learning curve involved in picking up all these tools, but the benefits truly do outweigh the costs.
I hope that our setup can be used as an example of how to add rigor and automation to the development process. There are some additional details that I did not discuss, but everything is available in the git project. In the next part, we will dig into setting up our Django project and doing some development.