I started the series by setting up your local environment, using Vagrant and Fabric to bootstrap quickly. In the second part, we reviewed some conventions for Django development as well as useful tools and tricks. In this final part of the series we will cover a simple deployment to Amazon EC2.
Getting started with AWS
The first step in deploying to Amazon EC2 is to set up an account with Amazon Web Services. This is fairly straightforward: go to http://aws.amazon.com/, click on “Create an AWS Account”, and follow the steps. It may take a few hours for the account to become active, but then you will be able to log in to the AWS Management Console. From there you have access to all the AWS services.
For our simple application, open the console and click the EC2 tab. The Amazon Elastic Compute Cloud (EC2) allows you to rent server infrastructure on a pay-as-you-go basis. This is great for startups or projects where a large investment in server infrastructure is not possible, and it is perfect for our project, as we will deploy onto a single server. There are several key concepts in EC2:
- Instances : An instance in EC2 is a single host. There are a variety of instance types, ranging from micro to high-memory large instances. Each instance type has a different amount of CPU power, RAM, I/O capacity, and disk space. The more powerful machines cost more on an hourly basis.
- Elastic Block Store : EBS is a service which allows for persistent storage. A typical EC2 instance comes with “instance storage”, which is very large and high speed, but no data backup is provided. If the instance dies, then the instance store is lost. EBS instead provides disks which are backed up and will persist if an instance dies. EBS volumes are not necessary, but may be useful depending on the application.
- AMI : AMI is short for Amazon Machine Image, and it is basically a “snapshot” of a running machine. When a new instance is started, it uses an AMI as its base image. This image may have any operating system or software pre-installed. Typically, you would start an instance with something like Ubuntu or CentOS with a default install, and then customize it from there. If you want, you can create your own AMI from an existing setup.
- Security Groups : Security groups allow you to set a firewall configuration policy for a set of hosts in EC2. This increases the security of your servers and allows you to run software that listens on the network while limiting connections to only those originating from your servers. A common mistake is to not configure the security groups, and thus make it impossible to access your web server.
- Elastic IPs : Every Amazon EC2 instance has a public and a private DNS name. The private DNS name can only be used by hosts on EC2 in the same region. This is used for servers to communicate with each other without the data leaving the data center. It is important that things such as SQL databases and Memcached instances be accessed using the private DNS name to minimize latency and bandwidth cost. The public DNS name is accessible by hosts on the Internet; however, the DNS name has no guarantee of stability, and the IP address may change at any time. Elastic IPs allow you to provide a stable IP address to an EC2 host. An Elastic IP is first allocated and then associated with a specific host.
- Key Pairs : EC2 instances are typically configured using SSH if they are running Linux, or Remote Desktop if they are running Windows. EC2 maintains a set of named key pairs which are used to communicate with your hosts. When a host is created, you may specify a key pair to use, and all SSH communication must provide the key to log in.
There is a lot to digest when getting started with Amazon EC2, but most of it is fairly basic. To get us started, create a new key pair. This will download a .pem file that should be kept safe. This file can be used to access your hosts, so do not make it public. Next, create a security group. Click the security group, go to the “Inbound” tab, and enable access to your web server on port 80 from all IPs (specified as 0.0.0.0/0 in CIDR format). Also enable incoming SSH connections on port 22.
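To make the CIDR notation concrete, here is a minimal Python sketch using only the standard library's ipaddress module. It shows why 0.0.0.0/0 matches every IPv4 source address, while a narrower block would restrict access to a single network. The helper name is_allowed and the example addresses (taken from the reserved documentation ranges) are illustrative assumptions, not part of any AWS tool.

```python
import ipaddress


def is_allowed(source_ip, cidr):
    """Return True if source_ip falls inside the given CIDR block."""
    return ipaddress.ip_address(source_ip) in ipaddress.ip_network(cidr)


# 0.0.0.0/0 has a zero-length prefix, so every IPv4 address matches.
print(is_allowed("198.51.100.7", "0.0.0.0/0"))

# A /24 block only matches addresses that share its first 24 bits.
print(is_allowed("198.51.100.7", "203.0.113.0/24"))
print(is_allowed("203.0.113.42", "203.0.113.0/24"))
```

Opening port 80 to 0.0.0.0/0 is what a public web server needs. For port 22 you may prefer a narrower block that covers only the networks you administer from.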
Once that is all set up, you are ready to create your first instance. Go to Instances, and click on “Launch Instance”. The first step is to select an AMI. Since in part 1 we set up Vagrant to develop within a Linux environment running Ubuntu 10.04, we want to use the same thing on EC2. The 32-bit AMI that uses the instance storage is ami-e4d42d8d. Next we select the type of instance we want, in this case a single small instance. We can continue on and select a name, key pair, and security group. Eventually we get to review our setup:
Once you click launch, the instance will begin booting and will be available in a few minutes. When the status is “Running”, you can select the instance and find its public DNS name. To log in, you use your key pair, and execute something like:
$ chmod 600 testdjango.pem
$ ssh -i testdjango.pem ubuntu@ec2-...compute-1.amazonaws.com
If you can log in, then you have succeeded! If not, check the status of your host and make sure the security group is properly configured to allow incoming connections. At this point, we have a blank host up and running, but we need to bootstrap it to run our Django server.
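The chmod 600 step matters because ssh refuses to use a private key file that other users can read. As a rough sketch, assuming a POSIX system, the same restriction and a check for it can be expressed in Python with the standard library. The function names restrict_key_file and is_private are hypothetical, and the demonstration runs on a throwaway temporary file rather than a real key.

```python
import os
import stat
import tempfile


def restrict_key_file(path):
    """Equivalent of `chmod 600 path`: the owner may read and write, nobody else."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


def is_private(path):
    """Return True if group and others have no permissions on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0


# Demonstrate on a temporary file instead of a real .pem file.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    demo_path = handle.name

os.chmod(demo_path, 0o644)        # world-readable, as a careless download might be
before = is_private(demo_path)    # group and others can still read the file
restrict_key_file(demo_path)
after = is_private(demo_path)     # only the owner has access now
os.remove(demo_path)
```

A check like is_private could run at the start of a deployment script to fail early with a clear message, instead of surfacing later as a rejected SSH connection.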
Bootstrapping with Fabric
Once we create a new host, we need to bootstrap it with our environment. This is normally the most difficult part of using a service such as EC2. Unless you are using automated tools, bootstrapping is tedious and error-prone. However, because we have invested time in defining our Fabric file, we can bootstrap new hosts painlessly.
To start, we need to modify our fabfile to specify the hosts. You should modify the HOSTS variable at the top of the file to include the EC2 public hostname. Next, we need to provide the key file that is used to communicate with the host. Hosts on EC2 accept SSH connections only if a valid identity file is provided. By default, our Fabric file will use the pem file at config/aws/testdjango.pem. You should replace this with your actual key that was downloaded when a new key pair was generated.
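As a rough illustration of what this configuration might look like, here is a minimal sketch in the style of a Fabric 1.x fabfile. The HOSTS variable, the production task name, the ubuntu login, and the key path come from this post; the hostname value is a placeholder, and the body of production is an assumption about how such a task is commonly written, not a copy of the actual fabfile. The fallback object exists only so the sketch runs without Fabric installed.

```python
from types import SimpleNamespace

try:
    from fabric.api import env  # Fabric 1.x shared settings object
except ImportError:
    env = SimpleNamespace()     # stand-in so the sketch runs without Fabric 1.x

# Public DNS names of your EC2 instances (placeholder, replace with your own).
HOSTS = ["your-instance.example.com"]

# Key path used in this post; point it at the key pair file you downloaded.
KEY_FILE = "config/aws/testdjango.pem"


def production():
    """Select the production hosts for the tasks that follow on the command line."""
    env.user = "ubuntu"            # login used in the ssh example above
    env.hosts = list(HOSTS)        # hosts the later tasks will run against
    env.key_filename = KEY_FILE    # identity file passed to SSH
```

With a task like this, running fab production bootstrap would first call production to set the connection details, and then run bootstrap against each host in env.hosts.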
Next, we need to enable our hosts to get the code from GitHub. To do this, we need to generate a set of “deploy keys”, which are SSH keys that you provide to GitHub so that your hosts can clone your code. Generating the keys is straightforward:
$ cd config; ssh-keygen -f id_rsa -t rsa -N ''
Once you have the key, go to the GitHub page for your project, click on “Admin”, and in the Deploy Keys section, you need to upload the contents of id_rsa.pub. This will enable the EC2 servers to clone a copy of your repo to serve the site.
Once we have specified the hosts and set up all our keys, we are ready to bootstrap. We can just issue the command:
$ fab production bootstrap
This will select the production environment, and run the bootstrap command on all the hosts. This is similar to how we did:
$ fab vagrant setup_vagrant
The setup_vagrant command shares most of its subroutines with bootstrap, but there are a few minor differences due to the VirtualBox environment. Once the bootstrap command is finished, we should be able to point our browser at the public hostname of our EC2 instance and see our site running live.
This covered the simplest case of bootstrapping in production. To build a staging environment, we can just define a different set of hosts for staging, and run the same command, selecting staging instead of production.
Updating code and pushing to production
We’ve covered everything necessary to configure our servers and get our site running live. However, websites are inherently iterative. As such, a common task is deploying the latest version of code to the running site with minimal interruption. One way to do this would be to rerun bootstrap, but this is rather invasive and may cause a few minutes of downtime.
Instead, we can use a set of Fabric commands to make the process easy and minimize downtime. The critical commands are the following:
- cut_staging : Merges the code from master into the staging branch. This allows the latest development code to be deployed to a staging environment.
- cut_release : Merges the code from staging into release. This allows code that has been tested in a staging environment to be deployed to production.
- pull : Performs a Git pull, so that the server has the latest code from the proper branch.
- reload : Reloads both Nginx and uWSGI so that any new code changes take effect.
- syncdb : Performs a syncdb and migrate command so that DB schemas are brought up to date.
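The post does not show how cut_staging and cut_release are implemented, so the following is only a plausible sketch of the Git steps such tasks might run. It builds the commands as argument lists without executing them. The helper merge_commands, the remote name origin, and the exact sequence of steps are all assumptions made for illustration.

```python
def merge_commands(source, target):
    """Build the git commands that merge `source` into `target` and publish it."""
    return [
        ["git", "checkout", target],        # switch to the branch being updated
        ["git", "merge", source],           # bring in the tested changes
        ["git", "push", "origin", target],  # publish so the servers can pull
        ["git", "checkout", source],        # return to the branch we started from
    ]


def cut_staging():
    """Commands that merge master into staging."""
    return merge_commands("master", "staging")


def cut_release():
    """Commands that merge staging into release."""
    return merge_commands("staging", "release")


for argv in cut_release():
    print(" ".join(argv))
```

To actually run the steps, each list could be passed to subprocess.check_call, which stops at the first failing command so a merge conflict never gets pushed.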
Because Fabric commands can be composed, we can easily use a single command to perform all the relevant steps. The most common is to push code from the development (master) branch to staging. We can use the following to do so:
$ fab cut_staging staging pull reload syncdb
We can then test our changes in staging and look for any bugs. Once we are confident in our code, we can easily deploy to production:
$ fab cut_release production pull reload syncdb
This will basically perform the same steps, but merge into the release branch and run on the production hosts. At this point, our latest code changes are available in production.
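The order of the words on the command line matters here. The sketch below is a simplified model, not Fabric's actual implementation: it assumes tasks are processed left to right, that an environment task such as production only selects hosts, that tasks named before any environment run once locally, and that tasks named afterwards run once per selected host. The function run_sequence and the host names are hypothetical.

```python
def run_sequence(task_names, environments):
    """Model a `fab` command line as a list of (task, host) pairs.

    host is None for tasks that run once on the local machine.
    """
    hosts = []
    executed = []
    for name in task_names:
        if name in environments:
            # An environment task only selects hosts for what follows.
            hosts = environments[name]
        elif not hosts:
            # No environment selected yet: run once, locally.
            executed.append((name, None))
        else:
            # Run the task on every selected host before moving on.
            executed.extend((name, host) for host in hosts)
    return executed


plan = run_sequence(
    ["cut_release", "production", "pull", "reload", "syncdb"],
    {"production": ["prod-1", "prod-2"]},
)
for task, host in plan:
    print(task, host or "(local)")
```

Under this model, cut_release runs once on your machine, and then pull, reload, and syncdb each run on both production hosts in turn.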
I’ve tried to cover all the basics required to get our site deployed to EC2, but there are still many things needed to button down our deployment. For most deployments, you will probably want to make use of a database to provide persistence. Setting up a DB is outside the scope of this post, but one could either set up an EC2 instance and manually configure MySQL or PostgreSQL, or use Amazon RDS to provide a database as a service.
Additionally, most sites will benefit from using caching, particularly memcache. Memcache is installed by default on all of our machines, but the Django settings files need to be updated to use the proper hostnames.
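As a hedged example of that settings change, the fragment below shows one way to point Django at a Memcached host by its private DNS name. It assumes Django 1.3 or later, where the CACHES setting was introduced. Note that the MemcachedCache backend was removed in Django 4.1, where PyMemcacheCache or PyLibMCCache take its place. The hostname is a hypothetical placeholder, so substitute the private DNS name of your own instance.

```python
# Fragment for a Django settings module.

# Hypothetical private DNS name of the Memcached host; use your instance's value.
MEMCACHED_HOST = "ip-10-0-0-12.ec2.internal"
MEMCACHED_PORT = 11211  # Memcached's default port

CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.memcached.MemcachedCache",
        "LOCATION": "%s:%d" % (MEMCACHED_HOST, MEMCACHED_PORT),
    }
}
```

Using the private name rather than the public one keeps cache traffic inside the data center, which is the latency and bandwidth point made in the Elastic IPs discussion earlier.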
Lastly, if you would like to run your site on EC2 with nicer URLs, you have two options. You can create an Elastic IP and assign it to your instance, which will give it a stable public IP, and then update your DNS A record to point to the EIP. Alternatively, you can just set up a DNS CNAME record pointing to the public hostname of the EC2 instance. The advantage of the first approach is that Elastic IPs can be remapped very quickly, whereas in the case of a host failure changing a CNAME may take a substantial amount of time.
In the final part of this series I covered how to deploy our code to Amazon EC2. Our deployment is simplified by the tools we used and by our foresight in maintaining consistent development and production environments. This allows us to iterate rapidly without worrying about incompatibilities or environmental issues. We can use Fabric not only to maintain and grow our staging and production fleets, but also to do simple code deployments. Lastly, we briefly covered some of the considerations needed for a properly buttoned-down deployment.
I wrote this series of blog posts for a number of reasons. I wanted to swap out the information I had built up in my mind for my own reference. It also serves as documentation for team members working on projects so that they may fully grasp our setup and process. Lastly, I hoped to share our methodology and structure so that those who are new to Django may adopt sensible conventions for projects as well as more rigorous engineering processes.
Regardless of your background or exposure to Django, I hope some of this information was valuable. If you have any questions about the setup, please ping me (Twitter works well: @ArmonDadgar).