What You Really Learn Migrating from DigitalOcean to AWS: Configuration Management
Let me tell you how we moved production services from DigitalOcean to AWS. We started by thinking about configuration management, and that led to writing code to build things for us, enforcing policies, and building rules for environments.
In some of these moves we could afford some downtime, so this story won’t apply to everyone, but some of you will relate.
Let me share with you some parts of our problem and why we needed to move:
- Development and production environments have different resource needs: privacy, VPNs, and even CPU cycles.
- We were running all of our services with a single provider. After exploring Amazon, expanding into it felt wise, which means running our services across at least two suppliers.
- We’ll keep DigitalOcean as the sandbox; their limited SSD capacities don’t make for great or scalable production storage. Production will run on the new instance types and block storage available on Amazon.
I think our approach solves some pretty basic problems when creating services in DigitalOcean or AWS. To really make use of Amazon, there were more configurations and concepts we needed to manage: Amazon VPC and Security Groups. For now, think of them as a set of rules and policies to work out.
The rules and policies for Amazon are:
- Set up a new VPC container and name it something relevant
- Define the private networks and subnets to use in your VPC
- Set up new Security Groups for your services and servers, with useful names
- Configure each Security Group to allow only the relevant HTTP/HTTPS, MySQL, and Redis traffic. No ‘ANY ANY’ type rules please. That’s crazy.
- Create Name and Version tags on EC2 instances to identify things
- Create SSH key pairs to use when bootstrapping config management
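The rules above can be sketched out with the AWS CLI. This is a minimal sketch only: the CIDR blocks, names, and resource IDs below are placeholders I’ve made up, and in practice you’d capture the real IDs from each command’s output before using them in the next step.

```shell
# Create the VPC and give it a relevant Name tag (IDs are hypothetical)
aws ec2 create-vpc --cidr-block 10.0.0.0/16
aws ec2 create-tags --resources vpc-12345678 --tags Key=Name,Value=Armada

# Define a public and a private subnet inside the VPC
aws ec2 create-subnet --vpc-id vpc-12345678 --cidr-block 10.0.1.0/24   # public
aws ec2 create-subnet --vpc-id vpc-12345678 --cidr-block 10.0.2.0/24   # private

# Security Groups with useful names
aws ec2 create-security-group --group-name "Web Public" \
    --description "HTTP/HTTPS from anywhere" --vpc-id vpc-12345678
aws ec2 create-security-group --group-name "Data Internal" \
    --description "MySQL and Redis from the web tier" --vpc-id vpc-12345678

# Only the relevant ports -- no ANY ANY rules
aws ec2 authorize-security-group-ingress --group-id sg-webpub01 \
    --protocol tcp --port 80 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-id sg-webpub01 \
    --protocol tcp --port 443 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-id sg-datain01 \
    --protocol tcp --port 3306 --cidr 10.0.1.0/24   # MySQL from public subnet
aws ec2 authorize-security-group-ingress --group-id sg-datain01 \
    --protocol tcp --port 6379 --cidr 10.0.1.0/24   # Redis from public subnet

# SSH key pair for bootstrapping config management
aws ec2 create-key-pair --key-name armada-bootstrap \
    --query 'KeyMaterial' --output text > armada-bootstrap.pem
chmod 600 armada-bootstrap.pem
```

These commands require AWS credentials, so treat this as a reference fragment rather than a paste-and-run script.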
I’ll provision 4 m3.medium instances into our new VPC, named “Armada”, with 2 instances on a private subnet and 2 instances on a public subnet. These subnets are just for classification right now. I’ll make sure to use the SSH key pair I created as the config management key, so that I can apply my server policies to these instances.
The Security Groups are already set up, so I just assign the “Web Public” SG to the public-subnet instances and the “Data Internal” SG to the private subnet. Basically, the web servers allow HTTP/HTTPS from anywhere and are allowed to talk Redis and MySQL to the servers in the other subnet.
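Launching those four instances with the AWS CLI could look something like this; the AMI, subnet, security-group, and instance IDs are placeholders, not real values from our account:

```shell
# Two instances on the public subnet with the "Web Public" SG
aws ec2 run-instances --image-id ami-00000000 --count 2 \
    --instance-type m3.medium --key-name armada-bootstrap \
    --subnet-id subnet-pub00001 --security-group-ids sg-webpub01

# Two instances on the private subnet with the "Data Internal" SG
aws ec2 run-instances --image-id ami-00000000 --count 2 \
    --instance-type m3.medium --key-name armada-bootstrap \
    --subnet-id subnet-prv00001 --security-group-ids sg-datain01

# Name and Version tags so we can identify things later
aws ec2 create-tags --resources i-aaaa1111 \
    --tags Key=Name,Value=armada-web-1 Key=Version,Value=1.0
```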
Here’s how we define Server Policies in configuration management:
- 2 MySQL database nodes: the master on Instance A and the slave on Instance B. We’ll pop in the IP address info from Amazon’s EC2 console here.
- We need 2 database users, 2 system users, and 2 databases created on the systems. We’ll just define the DB user and password (maybe we’ll use Clojure one day to manage the passwords), and we’ll install public SSH keys for the system users.
- 2 Nginx web servers running Oracle Java 8 and PHP 5.6, with tons of great modules installed.
- Set environment variables for the databases and IP addresses; we can get the IPs from Amazon’s console.
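As a sketch, those environment variables might look like the following. Every address, database name, and user here is a hypothetical stand-in for values you’d copy out of the EC2 console or your own policy:

```shell
# Hypothetical private-subnet addresses copied from the EC2 console
export DB_MASTER_HOST="10.0.2.10"   # MySQL master, Instance A
export DB_SLAVE_HOST="10.0.2.11"    # MySQL slave, Instance B
export REDIS_HOST="10.0.2.12"
export DB_NAME="armada_production"  # placeholder database name
export DB_USER="armada_app"         # password managed in config management

echo "app will connect to mysql://$DB_USER@$DB_MASTER_HOST/$DB_NAME"
```

In practice these would live in the config management system rather than a shell profile, so every node gets the same values.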
So we’ve done our prep work, we have our data, and the work is in our configuration management system. Let’s bootstrap the instances into the config manager (in this case Puppet or Chef, it doesn’t matter), apply our server policies, and grab a snack. In the next 5 to 10 minutes I’ll have an nginx welcome page and empty database services running, and I can log in to these things to move in data or poke around if necessary.
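The bootstrap step looks slightly different per tool. Both one-liners below are sketches, with hypothetical hostnames, node names, and run lists:

```shell
# Chef: bootstrap a node over SSH using the key pair created earlier
knife bootstrap 10.0.1.10 --ssh-user ubuntu --sudo \
    -i armada-bootstrap.pem -N armada-web-1 -r 'role[webserver]'

# Puppet: point the agent at the master and apply the catalog once
puppet agent --test --server puppet.example.com
```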
When it’s time to go live, here’s what we do:
- Warm up the services on AWS by dropping the apps onto the web servers using our system accounts and Git.
- Run two commands to dump and copy our MySQL and Redis data to the Amazon systems.
- Load the data into the Amazon systems using our system accounts
- Update our DNS settings to the Amazon Infrastructure
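The dump-and-copy step can be as simple as the commands below. Hostnames, users, and paths are placeholders, and this assumes we can afford the downtime mentioned earlier (no writes land on the old systems mid-copy):

```shell
# MySQL: dump from the DigitalOcean box and load into the Amazon master
mysqldump -h old-db.example.com -u dbuser -p armada_production > dump.sql
mysql -h 10.0.2.10 -u dbuser -p armada_production < dump.sql

# Redis: pull an RDB snapshot, copy it over, and restart Redis with it
redis-cli -h old-redis.example.com --rdb /tmp/dump.rdb
scp -i armada-bootstrap.pem /tmp/dump.rdb ubuntu@10.0.2.12:/var/lib/redis/dump.rdb
ssh -i armada-bootstrap.pem ubuntu@10.0.2.12 'sudo service redis-server restart'
```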
All in all, this whole thing from provisioning to migration took about a day. The major part was really the DNS updates, as we had configured TTLs of 8 hours in some cases. But even if I drop that down to 1 hour or 5 minutes, if there’s ever a crisis (say AWS is partitioned from the net), how do I move my production to HP Cloud? We have some ideas on that, and we’ll run through that experience and post some thoughts as well.
Some of my Lessons Learned through the whole thing:
- VPC and EC2-Classic are different; always go with VPC as a rule if you’re starting out in Amazon. It’s the equivalent of learning to drive a manual: if you can drive stick, you can basically drive anything. EC2-Classic offers some conveniences that may make sense for you later, but it’s case by case.
- Configuration management of any type is just ideal. I’ve shared some of these configurations, and many have been shared with me through community sites. It’s great for getting started deploying new services we’d never heard of, or wanted to try but didn’t have time to install and configure through a HOWTO or README.
- DigitalOcean, like Rackspace, allows for private interfaces, which let you do some of what VPC does, but it’s up to you to manage that, and it’s not nearly as mature as the VPC concept. Configuration management will help you make up for the lack of maturity in an infrastructure partner’s offering. Just DIY.
- It’s an ecosystem out there. Without the config management people sharing, I would be way too intimidated to start out, build systems, and run production on them on my own. The knowledge shared on EC2 blogs, the AWS docs, DigitalOcean’s community, and the millions of other HOWTOs for running system commands make this cloud thing possible for everyone. The IaaS folks made it cheap for us to play and learn. It goes back to that innovating and creativity thing.
- I could replicate these things using the built-in replication features of Redis and MySQL, but that time hasn’t come yet. It’s all definitely possible with the power of configuration management and the new cloud management platforms.