Deploying HA WordPress on AWS using Terraform and Salt
— Architecture, Cloud, Hashicorp — 6 min read
Love it or hate it, WordPress is here to stay. It remains the go-to for creating websites, thanks to the massive ecosystem of plugins, themes and accumulated experience around it.
Whenever I am learning something new, I have found scripted deployments a great way to understand a product from a different perspective, as they force deeper thought about all the moving parts. This is the first of several posts on automating AWS deployments. I have experience with CloudFormation, but in the quest to keep widening my skillset, I have started working with Terraform instead. I will be using Salt to automate the deployment of WordPress onto the AWS infrastructure.
This article will get you up and running with a High Availability (HA) deployment of WordPress, ready for the next big thing™. It assumes some familiarity with AWS and WordPress.
Design is all about trade-offs
Designing IT architecture is definitely not black and white; it is a complicated hairball of assumptions, estimates, previous experience and constraints (knowledge, financial and environmental, to name a few). Beyond the simplest of design briefs, give two architects the same brief and you will get two different designs. All design is inherently opinionated.
What does this have to do with our WordPress HA solution? It sounds simple on the surface, but is it?
For example, what do you mean by HA? Stays up during maintenance? Resilient to an AWS failure? Resilient to a WordPress issue (a bad plugin, a malformed update)?
What are your constraints? How much is your monthly spend budget? What expertise do you have to keep this solution fed and watered? Realistically, how much downtime can you tolerate? (It might be more than you think.)
Out of the above, how much is certain? How much is an estimate (informed or a guess)? How much do you think you need?
When approaching this project, I tried to break down each functional component and evaluate the best AWS product for the task, judging each option against the AWS Well-Architected Framework pillars rather than just what 'sounds right'.

| System Component | Low-Fi Solution | Generally Used Solution | 'Premium' / Exceptional Requirements Solution | What would I choose? |
| --- | --- | --- | --- | --- |
| Edge | DNS Round Robin (Route53) + Health Monitor | AWS ELB (Application Load Balancer) | AWS ELB (Classic or ALB) | For most sites, AWS ELB in Application Mode. It is cost-effective, scales well, requires minimal integration effort and is conceptually well understood. Classic may be needed where throughput is required at all costs. Round robin would be required when non-TCP traffic needs to be balanced. |
| Compute | EC2 instances | EC2 instances in Auto Scaling group | EC2 instances in Auto Scaling group | Using an Auto Scaling group is a no-brainer here. The extra learning curve is worth the operational convenience. It won't attract additional cost unless misconfigured. |
| Database | EC2-hosted MySQL instance | RDS Multi-AZ deployment | RDS Multi-AZ + cross-region read replica | I would stump for a Multi-AZ deployment here. If you need to be tolerant of AWS region failure, then you will need to consider cross-region read replicas (with a custom failover mechanism to promote a read replica and amend the WordPress configuration). |
| WordPress Root Storage | N/A, bake into AMI | Push / pull from S3 | EFS | EFS all the way. Higher cost but operationally slicker and less prone to errors. Syncing to S3 using a cron job or similar could have strange concurrency issues if filesystem updates occur on multiple WordPress hosts within a short time frame. Cost-efficient when considering the operational advantage, especially when using the IA storage class. |
| Object Storage | No object storage, just serve static resources from EC2 instances | S3 + CloudFront | S3 + CloudFront | S3 all the way here, allowing media to be surfaced via the CloudFront delivery network. |
Terraform Quickstart
Terraform is delightfully simple to get started with: it is easy to install and use, and the syntax is clean.
I'm not going to write a step-by-step guide to getting Terraform installed and running here; head over to my channel for tutorials covering Windows and Linux. Instead, I want to cover the principles of the Terraform workflow and how to use modules. I also highly recommend the tutorial track from HashiCorp (Terraform's creator).
Terraform has a simple architecture. It consists of the `terraform` tool which, when run within a directory, either dry-runs (`plan`), deploys (`apply`) or removes (`destroy`) the infrastructure defined in one or more modules. These are the three core commands needed to manage this deployment. There are no special requirements for where the tool is installed or run from, other than connectivity to your provider (in this case, internet access to AWS).
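To make that concrete, a typical session looks like this (run from whichever directory holds your `.tf` files):

```shell
# Download providers and any modules the configuration references
> terraform init

# Dry-run: show what would change without touching anything
> terraform plan

# Create or update the infrastructure to match the configuration
> terraform apply

# Tear everything in this configuration down again
> terraform destroy
```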
Apart from the automation benefits of using an Infrastructure as Code tool like Terraform, we should be moving away from reinventing the wheel and thinking more like developers, using libraries wherever we can. In Terraform, infrastructure definitions can be wrapped up into a module so they can be reused elsewhere, just like a programming library. Terraform has access to a large repository of ready-made, battle-tested modules in the Terraform Registry, and this tutorial will make extensive use of the AWS modules available there. I have no problem admitting that the authors of these modules likely know AWS and Terraform better than me!
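Before moving on, here is a flavour of what consuming a Registry module looks like (a minimal sketch using the community `terraform-aws-modules/vpc/aws` module; the name, CIDR and subnet values are illustrative, not the ones this project uses):

```hcl
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 2.0" # pin is illustrative; pick the current release

  name = "wordpressha"
  cidr = "10.0.0.0/16"

  # Spread subnets across AZs to support the HA design
  azs             = ["eu-west-1a", "eu-west-1b"]
  public_subnets  = ["10.0.1.0/24", "10.0.2.0/24"]
  private_subnets = ["10.0.101.0/24", "10.0.102.0/24"]
}
```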
This tutorial will use a simplified folder structure which I would adapt for real-world usage; there are various ways to structure a production deployment, so check out the recommended reading at the end for links. To this end, our structure is going to look like this:
```
├── main.tf
├── outputs.tf
├── modules
│   ├── compute
│   │   ├── main.tf
│   │   ├── userdata.tmpl
│   │   └── variables.tf
│   ├── database
│   │   ├── main.tf
│   │   ├── outputs.tf
│   │   └── variables.tf
│   ├── efs
│   │   ├── main.tf
│   │   ├── output.tf
│   │   └── variables.tf
│   ├── media
│   │   ├── main.tf
│   │   ├── outputs.tf
│   │   └── variables.tf
│   ├── network
│   │   ├── main.tf
│   │   └── outputs.tf
│   └── seeder
│       ├── main.tf
│       └── variables.tf
├── packer
│   ├── aws_vars.json
│   └── template.json
├── salt_tree
│   └── srv
│       ├── pillar
│       └── salt
├── terraform.tfvars
├── wordpress.auto.tfvars
├── private.auto.tfvars
└── wordpressha.pem
```
This source tree contains:
- `main.tf` contains the entrypoint that all our modules will be called from (see the sketch after this list),
- `modules` contains our sections of functionality as per our design analysis above,
- `packer` contains the template for our AMI base image. See Creating a LAMP AMI using Packer and Salt on how to use this,
- `salt_tree` is used by both Packer and Terraform to configure our WordPress installation on the deployed EC2 instances. You could easily swap this out for a different tool, e.g. Chef or Puppet, and change the provisioner in the Terraform code accordingly,
- `wordpress.auto.tfvars` contains our configuration values to stand up the solution. Empty fields will need completing before running Terraform,
- `private.auto.tfvars` contains our secrets. It requires:
  - aws_access_key (string)
  - aws_secret_key (string)
  - ec2_private_key (file path string relative to the Terraform root)

For example:

```
aws_access_key  = "QWERTYWIBBLE"
aws_secret_key  = "WkjwoEWSECRET"
ec2_private_key = "mysshkey.pem"
```
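As promised above, here is roughly how the entrypoint wires the modules together (a minimal sketch: `var.aws_region` and the module argument and output names are illustrative placeholders, not the exact ones used in the repository):

```hcl
provider "aws" {
  region     = var.aws_region
  access_key = var.aws_access_key
  secret_key = var.aws_secret_key
}

module "network" {
  source = "./modules/network"
}

module "database" {
  source = "./modules/database"

  # Hypothetical wiring: hand the database module the network details it needs
  vpc_id          = module.network.vpc_id
  private_subnets = module.network.private_subnets
}

module "efs" {
  source = "./modules/efs"

  private_subnets = module.network.private_subnets
}
```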
Deploying to AWS 🎉
That's the theory over; if you've got this far, you're on the home stretch!
Now updated for Terraform 0.12!
Assuming you have installed Terraform and Packer correctly, checkout the code from my Git repository at [https://gitlab.com/fluffy-clouds-and-lines/ha-wordpress-using-terraform-and-salt.git](https://gitlab.com/fluffy-clouds-and-lines/ha-wordpress-using-terraform-and-salt.git).
Before proceeding with the Terraform run, we need an SSH keypair (Terraform cannot currently create one for us). To create your keypair (a CLI alternative follows these steps):
- Log on to your AWS Console,
- Change to your target region and open EC2,
- Network & Security > Key Pairs > Create Key Pair,
- Name the keypair 'wordpressha' and copy the downloaded wordpressha.pem to the directory the Terraform code has been checked out into,
- On Linux, change the permissions to 400 (read-only by the owner).
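If you prefer the command line, the same keypair can be created with the AWS CLI (assuming it is installed and configured for your target region):

```shell
# Create the keypair in AWS and save the private key locally
> aws ec2 create-key-pair --key-name wordpressha \
    --query 'KeyMaterial' --output text > wordpressha.pem

# Restrict permissions so SSH will accept the key
> chmod 400 wordpressha.pem
```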
Next, ensure that `wordpress.auto.tfvars` is completed and `private.auto.tfvars` is created and completed. Once done, execute:
```shell
> terraform init
# Terraform modules for RDS and VPC don't resolve dependencies correctly, so explicitly build the VPC first
> terraform apply -target=module.network
# Deploy seeder dependencies
> terraform apply -target=module.database -target=module.efs
# Deploy seeder
> terraform apply -target=module.seeder
# Build WordPress node template
> packer build -var-file=./packer/aws_vars.json ./packer/template.json
# Deploy all to make state consistent
> terraform apply
```
This should take around 15 minutes end to end. This will;
- Build our custom AMI image with all our LAMP (Linux, Apache, MySQL, PHP) dependencies baked in,
- Download the external Terraform modules,
- Build the AWS VPC,
- Deploy the S3 bucket and CloudFront distribution,
- Create the application load balancer and auto-scaling group,
- Deploy the RDS MySQL Database instance,
- Create the Elastic Filesystem,
- Deploy the 'WordPress seeder'. This mounts the EFS and installs WordPress so that nodes that are started as part of the auto-scaling group already have the WordPress installation available to them,
- Publish an A record to Route53, linked to the ALB.
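That final Route53 step is a one-resource affair; it looks something like this (a sketch: `var.zone_id` and the `module.alb` output names are assumptions rather than the repository's exact identifiers):

```hcl
resource "aws_route53_record" "site" {
  zone_id = var.zone_id
  name    = "nextamazing.site"
  type    = "A"

  # An alias record resolves straight to the ALB, so no static IP is needed
  alias {
    name                   = module.alb.dns_name
    zone_id                = module.alb.zone_id
    evaluate_target_health = true
  }
}
```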
You should then be able to browse to http://nextamazing.site/ and see your completed installation.
Don't like the use of `-target`? Yes, it's bad;

> This targeting capability is provided for exceptional circumstances, such as recovering from mistakes or working around Terraform limitations. It is not recommended to use `-target` for routine operations, since this can lead to undetected configuration drift and confusion about how the true state of resources relates to configuration.
See below for suggestions on how to make this work in a real-life scenario; you really shouldn't take this approach in production usage.
More Design Decisions
There are a few more design decisions that need to be made before this could be considered 'production ready':
- The site should really be running on HTTPS, whether via an AWS-managed certificate (from ACM) or an externally signed CSR made available to the load balancer through ACM or IAM (see the sketch after this list),
- Although the infrastructure is in place for asset delivery via CloudFront, it is not set up in WordPress as part of this Terraform run. There are several options, both free and paid, that will achieve this, e.g. plugins or custom cron jobs,
- How will you maintain backups? At present, the RDS snapshot defaults will be used. How will you back up your WordPress installation?
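For the HTTPS point, the ACM route might look something like this (a sketch only: the `aws_lb.wordpress` and `aws_lb_target_group.wordpress` references are placeholders, since this project creates its load balancer via a module, and a real setup would also need the certificate's DNS validation records):

```hcl
resource "aws_acm_certificate" "site" {
  domain_name       = "nextamazing.site"
  validation_method = "DNS"
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.wordpress.arn
  port              = 443
  protocol          = "HTTPS"
  certificate_arn   = aws_acm_certificate.site.arn

  # Forward HTTPS traffic to the same target group the HTTP listener uses
  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.wordpress.arn
  }
}
```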
Wrapping up...
That's it for now. This should have given you a good introduction to using Terraform to deploy a full solution on AWS. Earlier I mentioned some simplifications made for the purposes of this blog article; a couple of things to consider:
- The major creative license I have taken here is to create one large module whose separate components need to be called in a specific order to achieve the end result. As some of the modules are dependent on each other (though there is no reason why you couldn't run `terraform apply` twice and have a successful deployment), I would suggest either breaking this up into distinct modules, e.g. `base`, `seeder` and `wordpress`, or using a tool like Terragrunt,
- One of the thought leaders in the IaC space, Gruntwork, has developed Terragrunt to improve your Terraform workflow and mitigate potential issues when running in production. One of the big advantages is being able to compartmentalise Terraform state (the record of what Terraform has deployed) into smaller chunks, to reduce the impact of state corruption or loss (a definite possibility). This tool is worth considering in a large, multi-module deployment like this (a sketch follows).
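To give a flavour of the Terragrunt approach mentioned above, each component gets its own directory and its own state, with dependencies declared explicitly (a minimal sketch; the paths and output names are assumptions based on this project's layout):

```hcl
# live/seeder/terragrunt.hcl
terraform {
  source = "../../modules//seeder"
}

# Terragrunt resolves this ordering for us, replacing the -target dance
dependency "database" {
  config_path = "../database"
}

dependency "efs" {
  config_path = "../efs"
}

inputs = {
  db_endpoint = dependency.database.outputs.endpoint
  efs_id      = dependency.efs.outputs.efs_id
}
```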
Recommended Reading
Terraform Learning Track (HashiCorp)
https://learn.hashicorp.com/terraform/
Terragrunt Documentation
https://github.com/gruntwork-io/terragrunt
How things can go wrong with Terraform state
https://charity.wtf/2016/03/30/terraform-vpc-and-why-you-want-a-tfstate-file-per-env/