Experiences with the new AWS Application Load Balancer

Originally posted on Andrew’s blog @ cloudconsultancy.info

Summary

Recently I had an opportunity to test drive the AWS Application Load Balancer, as my client had a requirement to make their websocket application fault tolerant. The implementation was a complete Windows stack and utilised ADFS 2.0 for SAML authentication; however, this should not affect other implementations.

The AWS Application Load Balancer is a fairly new feature which provides layer 7 load balancing and support for HTTP/2 as well as websockets. In this blog post I will include examples of the configuration I used to implement it, as well as some of the troubleshooting I needed to do along the way.

The Application Load Balancer is an AWS resource independent of the classic ELB, and is defined as aws elbv2 in the CLI, with a number of different properties.

Benefits of Application Load Balancer include:

  • Content-based routing, ie route /store to a different set of instances from /apiv2
  • Support for websockets
  • Support for HTTP/2, over HTTPS only (much higher throughput, as it's a single multiplexed stream – great for mobile and other high-latency apps)
  • Lower cost – roughly 10% cheaper than a Classic Load Balancer
  • Cross-zone load balancing is always enabled for an ALB

Some changes that I’ve noticed:

  • The load balancing algorithm used by the Application Load Balancer is currently round robin.
  • Cross-zone load balancing is always enabled for an Application Load Balancer and is disabled by default for a Classic Load Balancer.
  • With an Application Load Balancer, the idle timeout value applies only to front-end (client) connections, not to the connection between the load balancer and the back-end server, which prevents the load balancer from cycling that connection.
  • The Application Load Balancer is exactly that, and operates at Layer 7; if you want to perform SSL bridging, use a Classic Load Balancer with TCP listeners and configure the SSL certificates on your server endpoints.
  • A cookie-expiration-period value of 0 (deferring session timeout to the application) is not supported. I ended up having to configure the stickiness.lb_cookie.duration_seconds value; I'd suggest making this 1 minute longer than the application session timeout, in my example a value of 1860 (see the example after this list).
  • The X-Forwarded-For header is still supported and should be utilised if you need to track client IP addresses, particularly useful if going through a proxy server.
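As a sketch of that stickiness change (target group ARN illustrative), the attribute can be set via the CLI:

aws elbv2 modify-target-group-attributes ^
    --target-group-arn <target-group-arn> ^
    --attributes Key=stickiness.enabled,Value=true Key=stickiness.type,Value=lb_cookie Key=stickiness.lb_cookie.duration_seconds,Value=1860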

For more detailed information from AWS see http://docs.aws.amazon.com/elasticloadbalancing/latest/userguide/how-elastic-load-balancing-works.html.

Importing SSL Certificate into AWS – Windows

(http://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_server-certs.html )

  1. Convert the existing PKCS#12 (.pfx) key into .pem format for AWS

You’ll need openssl for this, the pfx and the password for the SSL certificate.

I like to use chocolatey as my Windows package manager – similar to yum or apt-get, but for Windows – which is a saviour for downloading packages and managing dependencies in order to support automation. But enough of that, check it out @ https://chocolatey.org/

Once choco is installed I simply execute the following from an elevated command prompt.

choco install openssl.light

Thereafter I run the following two commands, which break out the private and public keys (during which you'll be prompted for the password):

openssl pkcs12 -in keyStore.pfx -out SomePrivateKey.key -nodes -nocerts

openssl pkcs12 -in keyStore.pfx -out SomePublic.cert -nodes -nokeys

NB: I've found that sometimes copying and pasting the key material doesn't work when trying to convert keys.

  2. Next you'll need to break out the trust chain into one contiguous file, like the following:
-----BEGIN CERTIFICATE-----
Intermediate certificate 2
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
Intermediate certificate 1
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
Optional: Root certificate
-----END CERTIFICATE-----

Save the file for future use, e.g. thawte_trust_chain.txt.

The example above is for a Thawte trust chain with the following properties:

"thawte Primary Root CA" – thumbprint 91 c6 d6 ee 3e 8a c8 63 84 e5 48 c2 99 29 5c 75 6c 81 7b 81

with intermediate

"thawte SSL CA - G2" – thumbprint 2e a7 1c 36 7d 17 8c 84 3f d2 1d b4 fd b6 30 ba 54 a2 0d c5

Ordinarily you'll only have a root and an intermediate CA, although sometimes there will be a second intermediate CA.

Ensure that your certificates are base 64 encoded when you export them.

  3. Finally, authenticate to the AWS CLI (v1.11.14+ to support the aws elbv2 functions) by running "aws configure" and supplying your access and secret keys, region and output format, then upload the certificate. Please note that this brings together the elements created above, including the trust chain and the public and private keys.
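The original command isn't reproduced here; a sketch of the upload step using the files created above (certificate name illustrative) would be:

aws iam upload-server-certificate ^
    --server-certificate-name service.example.com ^
    --certificate-body file://SomePublic.cert ^
    --private-key file://SomePrivateKey.key ^
    --certificate-chain file://thawte_trust_chain.txt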

If you get an error like the one below:

A client error (MalformedCertificate) occurred when calling the UploadServerCertificate operation: Unable to validate certificate chain. The certificate chain must start with the immediate signing certificate, followed by any intermediaries in order. The index within the chain of the invalid certificate is: 2

check the contents of the original root and intermediate keys, as they probably still contain the Bag Attributes headers from the export, ie

Bag Attributes
    localKeyID: 01 00 00 00
    friendlyName: serviceSSL
subject=/C=AU/ST=New South Wales/L=Sydney/O=Some Company/OU=IT/CN=service.example.com
issuer=/C=US/O=thawte, Inc./CN=thawte SSL CA - G2
Bag Attributes
    friendlyName: thawte
subject=/C=US/O=thawte, Inc./OU=Certification Services Division/OU=(c) 2006 thawte, Inc. - For authorized use only/CN=thawte Primary Root CA
issuer=/C=US/O=thawte, Inc./OU=Certification Services Division/OU=(c) 2006 thawte, Inc. - For authorized use only/CN=thawte Primary Root CA

AWS Application LB Configuration

Follow this gist with comments embedded; the comments are based on some gotchas encountered during configuration.
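The gist itself isn't reproduced here; as a minimal sketch of the moving parts (all names, IDs and ARNs illustrative), the ALB, target group and HTTPS listener can be created as follows:

aws elbv2 create-load-balancer --name my-alb ^
    --subnets subnet-aaaa1111 subnet-bbbb2222 --security-groups sg-0123abcd

aws elbv2 create-target-group --name my-targets ^
    --protocol HTTPS --port 443 --vpc-id vpc-0123abcd

aws elbv2 register-targets --target-group-arn <target-group-arn> ^
    --targets Id=i-0aaa1111 Id=i-0bbb2222

aws elbv2 create-listener --load-balancer-arn <alb-arn> --protocol HTTPS --port 443 ^
    --certificates CertificateArn=<server-certificate-arn> ^
    --default-actions Type=forward,TargetGroupArn=<target-group-arn>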

You should now be good to go. The load balancer takes a little while to warm up, but will then be available within multiple availability zones.

If you have issues connecting to the ALB, validate connectivity directly to the server using curl. Again chocolatey comes in handy: "choco install curl".

Also double-check the security group registered against the ALB, and confirm your NACLs.

WebServer configuration

You'll need to import the SSL certificate into the local computer certificate store. Some of the third-party issuing CAs (Ensign, Thawte, etc.) may not have their intermediate CA in the computer's trusted certificate stores, especially if the server was built in a network isolated from the internet, so after installing the SSL certificate on the server make sure the trust chain is correct.

You won't need to update the local hosts file on the servers to point to the load-balanced address.

Implementation using CNAMEs

In large enterprises where I've worked there have been long lead times associated with fairly simple DNS changes, which defeats some of the agility provided by cloud computing. A pattern I've often seen adopted is to use multiple CNAMEs to work around such lead times. Generally you'll have a subdomain somewhere that the Ops team has more control over, or shorter lead times for. Within the target domain (example.com) create a CNAME pointing to an address within the ops-managed domain (aws.corp.internal), and have a CNAME created within that zone pointing to the ALB address, ie

service.example.com -> service.aws.corp.internal -> elbarn.region.elb.amazonaws.com

With this approach I can update service.aws.corp.internal to reflect a new service which I've built behind a new ALB, and avoid the enterprise change lead times associated with a change in example.com.
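If the ops-managed zone happens to live in Route 53, repointing that middle CNAME is a single call plus a small change batch (zone ID, names and ALB address illustrative):

aws route53 change-resource-record-sets --hosted-zone-id Z1EXAMPLE --change-batch file://repoint.json

where repoint.json contains:

{
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "service.aws.corp.internal",
      "Type": "CNAME",
      "TTL": 60,
      "ResourceRecords": [{ "Value": "my-new-alb-123456789.ap-southeast-2.elb.amazonaws.com" }]
    }
  }]
}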

Viewing AWS CloudFormation and bootstrap logs in CloudWatch

Mature cloud platforms such as AWS and Azure have simplified infrastructure provisioning with toolsets such as CloudFormation and Azure Resource Manager (ARM) to provide an easy way to create and manage a collection of related infrastructure resources. Both tool sets allow developers and system administrators to use JavaScript Object Notation (JSON) to specify resources to provision, as well as provide the means to bootstrap systems, effectively allowing for single click fully configured environment deployments.

While these toolsets are an excellent means to prevent RSI from performing repetitive monotonous tasks, the initial writing and testing of templates and scripts can be incredibly time consuming. Troubleshooting and debugging bootstrap scripts usually involves logging into hosts and checking log files. These hosts are often behind firewalls, resulting in the need to use jump hosts which may be MFA integrated, all resulting in a reduced appetite for infrastructure as code.

One of my favourite things about Azure is the ability to watch the ARM provisioning and host bootstrapping process through the console. Unless there’s a need to rerun a script on a host and watch it in real time, troubleshooting the deployment failure can be performed by viewing the ARM deployment history or viewing the relevant Virtual Machine Extension. Examples can be seen below:

This screenshot shows the ARM resources have been deployed successfully.

This screenshot shows the DSC extension status, with more deployment details on the right pane.

While this seems simple enough in Azure, I found it a little less straightforward in AWS. Like Azure, bootstrap logs for the instance reside on the host itself; however, the logs aren't shown in the console by default. Although there's a blog post on AWS about viewing CloudFormation logs in CloudWatch, it was tailored to Linux instances. Keen for a similar experience to Azure, I decided to put together the following instructions to have bootstrap logs appear in CloudWatch.

To enable CloudWatch for instances dynamically, the first step is to create an IAM role that can be attached to EC2 instances when they’re launched, providing them with access to CloudWatch. The following JSON code shows a sample policy I’ve used to define my IAM role.
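The policy itself isn't reproduced here; a minimal sketch granting the CloudWatch Logs permissions the plugin needs (scoped broadly for illustration) could look like:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "logs:DescribeLogGroups",
        "logs:DescribeLogStreams"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    }
  ]
}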

The next task is to create a script that can be used at the start of the bootstrap process to dynamically enable the CloudWatch plugin on the EC2 instance. The plugin is disabled by default and when enabled, requires a restart of the EC2Config service. I used the following script:
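(The embedded script isn't shown here; the sketch below illustrates the approach, assuming the default EC2Config paths.)

# Enable the CloudWatch plugin in EC2Config's settings file, then kill the
# EC2Config process so the change takes effect (see the note below on recovery)
$configFile = "C:\Program Files\Amazon\Ec2ConfigService\Settings\Config.xml"
[xml]$config = Get-Content $configFile

# Flip the AWS.EC2.Windows.CloudWatch plugin from Disabled to Enabled
$plugin = $config.Ec2ConfigurationSettings.Plugins.Plugin |
    Where-Object { $_.Name -eq "AWS.EC2.Windows.CloudWatch" }
$plugin.State = "Enabled"
$config.Save($configFile)

Stop-Process -Name "Ec2Config" -Force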

It’s worth noting that the EC2Config service is set to recover by default and therefore starts by itself after the process is killed.

Now that we’ve got a script to enable the CloudWatch plugin, we need to change the default CloudWatch config file on the host prior to enabling the CloudWatch plugin. The default CloudWatch config file is AWS.EC2.Windows.CloudWatch.json and contains details of all the logs that should be monitored as well as defining CloudWatch log groups and log streams. Because there’s a considerable number of changes made to the default file to achieve the desired result, I prefer to create and store a customised version of the file in S3. As part of the bootstrap process, I download it to the host and place it in the default location. My customised CloudWatch config file looks like the following:

Let’s take a closer look at what’s happening here. The first three components are windows event logs I’m choosing to monitor:
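(The snippet hasn't survived here; a sketch of three such components, assuming the System and Application logs alongside the DSC operational log mentioned below, looks like this.)

{
  "Id": "SystemEventLog",
  "FullName": "AWS.EC2.Windows.CloudWatch.EventLog.EventLogInputComponent,AWS.EC2.Windows.CloudWatch",
  "Parameters": {
    "LogName": "System",
    "Levels": "7"
  }
},
{
  "Id": "ApplicationEventLog",
  "FullName": "AWS.EC2.Windows.CloudWatch.EventLog.EventLogInputComponent,AWS.EC2.Windows.CloudWatch",
  "Parameters": {
    "LogName": "Application",
    "Levels": "7"
  }
},
{
  "Id": "DSCEventLog",
  "FullName": "AWS.EC2.Windows.CloudWatch.EventLog.EventLogInputComponent,AWS.EC2.Windows.CloudWatch",
  "Parameters": {
    "LogName": "Microsoft-Windows-Dsc/Operational",
    "Levels": "7"
  }
}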

You'll notice I've included the Desired State Configuration (DSC) event logs, as DSC is my configuration management tool of choice when it comes to Windows. When defining a Windows event log, a level needs to be specified, indicating the verbosity of the output. The values are as follows:

1 – Only error messages uploaded.
2 – Only warning messages uploaded.
4 – Only information messages uploaded.

You can add values together to include more than one type of message. For example, 3 means that error messages (1) and warning messages (2) get uploaded. A value of 7 means that error messages (1), warning messages (2), and information messages (4) get uploaded. For those familiar with Linux permissions, this probably looks very familiar! 🙂

To monitor other windows event logs, you can create additional components in the JSON template. The value of “LogName” can be found by viewing the properties of the event log file, as shown below:

[Screenshot: event log properties showing the LogName value]

The next two components monitor the two logs that are relevant to the bootstrap process:
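(Again the snippet isn't shown here; a sketch of the two custom log components follows, with default paths assumed and timestamp formats illustrative – adjust them to the actual log contents. Note the TimeZoneKind values matching the local/UTC behaviour described below.)

{
  "Id": "CfnInitLog",
  "FullName": "AWS.EC2.Windows.CloudWatch.CustomLog.CustomLogInputComponent,AWS.EC2.Windows.CloudWatch",
  "Parameters": {
    "LogDirectoryPath": "C:\\cfn\\log",
    "TimestampFormat": "yyyy-MM-dd HH:mm:ss,fff",
    "Encoding": "UTF-8",
    "Filter": "cfn-init.log",
    "CultureName": "en-US",
    "TimeZoneKind": "Local"
  }
},
{
  "Id": "Ec2ConfigLog",
  "FullName": "AWS.EC2.Windows.CloudWatch.CustomLog.CustomLogInputComponent,AWS.EC2.Windows.CloudWatch",
  "Parameters": {
    "LogDirectoryPath": "C:\\Program Files\\Amazon\\Ec2ConfigService\\Logs",
    "TimestampFormat": "yyyy-MM-ddTHH:mm:ss.fffZ",
    "Encoding": "ASCII",
    "Filter": "EC2ConfigLog.txt",
    "CultureName": "en-US",
    "TimeZoneKind": "UTC"
  }
}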

Once again, a lot of this is self explanatory. The "LogDirectoryPath" specifies the absolute directory path to the relevant log file, and the filter specifies the log filename to be monitored. The tricky thing here was getting the "TimestampFormat" parameter correct. I used this article on MSDN plus trial and error to work this out. Additionally, it's worth noting that cfn-init.log's timestamp is the local time of the EC2 instance, while EC2ConfigLog.txt takes on UTC time. Getting this right ensures you have the correct timestamps in CloudWatch.

Next, we need to define the log groups in CloudWatch that will hold the log streams. I’ve got three separate Log Groups defined:
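(Sketch below: only cfninit-Log-Group is named later in this post, so the other group names and the region are illustrative; AccessKey and SecretKey are left blank because the IAM role supplies the credentials.)

{
  "Id": "CloudWatchWindowsEventLogs",
  "FullName": "AWS.EC2.Windows.CloudWatch.CloudWatchLogsOutput,AWS.EC2.Windows.CloudWatch",
  "Parameters": {
    "AccessKey": "",
    "SecretKey": "",
    "Region": "ap-southeast-2",
    "LogGroup": "WindowsEventLogs-Group",
    "LogStream": "{instance_id}"
  }
},
{
  "Id": "CloudWatchCfnInitLog",
  "FullName": "AWS.EC2.Windows.CloudWatch.CloudWatchLogsOutput,AWS.EC2.Windows.CloudWatch",
  "Parameters": {
    "AccessKey": "",
    "SecretKey": "",
    "Region": "ap-southeast-2",
    "LogGroup": "cfninit-Log-Group",
    "LogStream": "{instance_id}"
  }
},
{
  "Id": "CloudWatchEc2ConfigLog",
  "FullName": "AWS.EC2.Windows.CloudWatch.CloudWatchLogsOutput,AWS.EC2.Windows.CloudWatch",
  "Parameters": {
    "AccessKey": "",
    "SecretKey": "",
    "Region": "ap-southeast-2",
    "LogGroup": "Ec2Config-Log-Group",
    "LogStream": "{instance_id}"
  }
}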

You’ll also notice that the Log Streams are named after the instance ID. Each instance that is launched will create a separate log stream in each log group that can be identified by its instance ID.

Finally, the flows are defined:
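(As a sketch, wiring the component IDs from the snippets above to their output groups:)

"Flows": {
  "Flows": [
    "(SystemEventLog,ApplicationEventLog,DSCEventLog),CloudWatchWindowsEventLogs",
    "CfnInitLog,CloudWatchCfnInitLog",
    "Ec2ConfigLog,CloudWatchEc2ConfigLog"
  ]
}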

This section specifies which logs are assigned to which Log Group. I've put all the Windows event logs in a single Log Group, as it's easy to search based on the event log name. It's not as easy to differentiate between cfn-init.log and EC2ConfigLog.txt entries, so I've split those out into their own Log Groups.

So how do we get this customised CloudWatch config file into place? My preferred method is to upload the file, along with the set-cloudwatch.ps1 script, to a bucket in S3, then pull them down and run the PowerShell script as part of the bootstrap process. I've included a subset of my standard CloudFormation template below, showing the monitoring config key that's part of the ConfigSet.
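(The subset isn't reproduced here; a sketch of what the monitoring config might contain, with the bucket name and config set layout assumed, is below.)

"AWS::CloudFormation::Init": {
  "configSets": {
    "config": [ "monitoring", "app" ]
  },
  "monitoring": {
    "files": {
      "c:\\cfn\\scripts\\set-cloudwatch.ps1": {
        "source": "https://s3-ap-southeast-2.amazonaws.com/my-bootstrap-bucket/set-cloudwatch.ps1"
      },
      "c:\\Program Files\\Amazon\\Ec2ConfigService\\Settings\\AWS.EC2.Windows.CloudWatch.json": {
        "source": "https://s3-ap-southeast-2.amazonaws.com/my-bootstrap-bucket/AWS.EC2.Windows.CloudWatch.json"
      }
    },
    "commands": {
      "00-enable-cloudwatch": {
        "command": "powershell.exe -ExecutionPolicy Unrestricted -File c:\\cfn\\scripts\\set-cloudwatch.ps1",
        "waitAfterCompletion": "0"
      }
    }
  }
}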

What does this look like in the end? Here we can see the log groups specified have been created:

[Screenshot: the specified log groups created in CloudWatch]

If we drill further down into the cfninit-Log-Group, we can see the instance ID of the recently provisioned host. Finally, if we click on the Instance ID, we can see the cfn-init.log file in all its glory. Yippie!

[Screenshot: the cfn-init.log entries in the instance's log stream]

Hummm, looks like my provisioning failed because a file is non-existent. Bootstrap monitoring has clearly served its purpose! Now all that’s left to do is to teardown the infrastructure, remediate the issue, and reprovision!

The next step to reducing the amount of repetitive tasks in the infrastructure as code development process is a custom pipeline to orchestrate the provisioning and teardown workflow… More on that in another blog!

AWS CloudFormation AWS::RDS::OptionGroup Unknown option: Mirroring

Amazon recently announced Multi-AZ support for SQL Server in Sydney, which provides high availability for SQL RDS instances using SQL Server mirroring technology. In an effort to make life simpler for myself, I figured I'd write a CloudFormation template for future provisioning requests; however, it wasn't as straightforward as I'd expected.

I began by trying to guess my way through the JSON resources, based on what I already knew from MySQL deployments. I figured the MultiAZ property was still relevant, so I hacked together a template and attempted to provision the stack, which failed, indicating the following error:

CREATE_FAILED        |  Invalid Parameter Combination: MultiAZ property cannot be used with SQL Server DB instances, use the Mirroring option in an option group associated with the DB instance instead.

The CloudFormation output clearly provides some guidance on the correct parameters required to enable mirroring in SQL Server. I had a bit of trouble tracking down documentation for the mirroring option, but after crawling the web for sample templates, I managed to put together the correct CloudFormation template, which can be seen below.
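(The full template isn't reproduced here; the key pieces are an AWS::RDS::OptionGroup carrying the Mirroring option and a DB instance referencing it. A sketch, with engine version, instance class and sizes illustrative:)

"SQLOptionGroup": {
  "Type": "AWS::RDS::OptionGroup",
  "Properties": {
    "EngineName": "sqlserver-se",
    "MajorEngineVersion": "11.00",
    "OptionGroupDescription": "SQL Server mirroring option group",
    "OptionConfigurations": [
      { "OptionName": "Mirroring" }
    ]
  }
},
"SQLInstance": {
  "Type": "AWS::RDS::DBInstance",
  "Properties": {
    "Engine": "sqlserver-se",
    "LicenseModel": "license-included",
    "DBInstanceClass": "db.m3.xlarge",
    "AllocatedStorage": "200",
    "MasterUsername": "sqladmin",
    "MasterUserPassword": { "Ref": "MasterPassword" },
    "OptionGroupName": { "Ref": "SQLOptionGroup" }
  }
}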

Excited and thinking I had finalised my template, I attempted to create the stack in ap-southeast-2 (Sydney, Australia), only for it to fail with a different error this time…

CREATE_FAILED        | Unknown option: Mirroring

Finding this output strange, I attempted to run the CloudFormation template in eu-west-1, which completed successfully.

With no options left, I decided to contact AWS Support who indicated that this is a known issue in the ap-southeast-2 region, which is also evident when attempting to create an option group in the GUI and the dropdown box is greyed out, as shown below.

What it should look like:

What it currently looks like:

The suggested workaround is to manually create the SQL RDS instance in the GUI which provides the mirroring option in the deployment wizard. Although the limitation is being treated with priority, there’s no ETA for resolution at the moment.

Hopefully this assists someone out there banging their head against the wall trying to get this to work!

Create AWS CloudFormation Templates with Visual Studio

Background

AWS CloudFormation is a wonderful service for automating your AWS builds – my colleagues have done a number of detailed walk-throughs in other blog posts.

AWS also provides a toolkit for Visual Studio as an extension of the IDE.  To get started, configure the extension with your AWS IAM Access Key ID and Secret Key and you will be able to use the new AWS explorer pane to explore all AWS services such as VPC, EC2, RDS, etc.

Installing the toolkit also automatically installs the AWS SDK for .NET, which includes libraries for developing apps that use AWS services from .NET classes. With the AWS SDK supported on the .NET platform, building applications and infrastructure that leverage AWS services using .NET becomes much easier.


Create and deploy your CloudFormation template in Visual Studio

To create a new CloudFormation template in Visual Studio, you simply add a new project: select File — New — Project, navigate to Templates — AWS and select AWS CloudFormation Project.


Once the project is created, you will be presented with the goodness of Visual Studio including Intellisense!


To deploy the template, right-click the template and select Deploy to AWS CloudFormation.


Troubleshooting notes

I came across an error whenever I deployed a new AWS CloudFormation template created in Visual Studio (I am using Visual Studio 2012 Premium edition). The error indicated a syntax error; after validating my template, it was clear that it was not a formatting error.

Deploying the same template directly in the AWS console or via the AWS PowerShell cmdlet (Test-CFNTemplate) rendered the same result:


Test-CFNTemplate : Template format error: JSON not well-formed. (line 1, column 2)
At line:1 char:1
+ Test-CFNTemplate -TemplateURL "https://s3-ap-southeast-1.amazonaws.com/vicperdan ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidOperation: (Amazon.PowerShe...NTemplateCmdlet:TestCFNTemplateCmdlet) [Test-CFNTemplate], InvalidOperationException
+ FullyQualifiedErrorId : Amazon.CloudFormation.AmazonCloudFormationException,Amazon.PowerShell.Cmdlets.CFN.TestCFNTemplateCmdlet

Finding the solution took some searching, until I found a post indicating that this is caused by the default encoding used by Visual Studio: UTF-8 with BOM (Byte-Order Mark). Changing the encoding to UTF-8 without BOM fixed the issue. This can be changed by selecting File — Advanced Save Options in Visual Studio.
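If you have a pile of templates to fix, the same change can be scripted; this PowerShell snippet (path illustrative) re-saves a file as UTF-8 without a BOM:

# Read the template and write it back without a byte-order mark
$path = "C:\source\MyStack\cloudformation.template"
$text = Get-Content -Path $path -Raw
[System.IO.File]::WriteAllText($path, $text, (New-Object System.Text.UTF8Encoding($false)))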


Automate your Cloud Operations Part 2: AWS CloudFormation

Stacking the AWS CloudFormation

Part 1 of the Automate your Cloud Operations series gave us a basic understanding of how to automate an AWS stack using CloudFormation.

This post shows how to layer another stack on top of an existing AWS CloudFormation stack using AWS CloudFormation, instead of modifying the base template. AWS resources can be added to an existing VPC by consuming the outputs the main VPC stack publishes about its resources, rather than having to modify the main template.
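As a minimal sketch of the mechanics (resource and output names assumed), the base stack publishes the IDs of its resources as outputs:

"Outputs" : {
  "VPCId" : {
    "Description" : "VPC ID of the base stack",
    "Value" : { "Ref" : "VPC" }
  },
  "PublicSubnet1" : {
    "Description" : "ID of the first public subnet",
    "Value" : { "Ref" : "PublicSubnet1" }
  }
}

and the layered stack declares matching parameters, whose values you supply from those outputs when creating the new stack:

"Parameters" : {
  "VPCId" : { "Type" : "String" },
  "PublicSubnet1" : { "Type" : "String" }
}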

This allows us to compartmentalise and separate out the components of our AWS infrastructure, and to version the infrastructure code for each component independently.

Note: The templates I use in this post are for educational purposes only and may not be suitable for production workloads :).

The diagram below helps to illustrate the concept:

[Diagram: the bastion stack layered on top of the base VPC stack]

Bastion Stack

Previously (in Part 1) we created the initial stack, which provides our base VPC. Next, we will provision the bastion stack, which creates a bastion host on top of our base VPC. Below are the components of the bastion stack:

  • Create an IAM user that can find out information about the stack, with permissions to create KeyPairs and perform related actions
  • Create a bastion host instance with an AWS Security Group allowing SSH access via port 22
  • Use CloudFormation Init to install packages, create files and run commands on the bastion host instance; also take the credentials created for the IAM user and set them up for use by the scripts
  • Use EC2 UserData to run the cfn-init command that actually does the above via a bash script
  • The condition handle: completion of the instance is dependent on the scripts running properly; if the scripts fail, the CloudFormation stack will error out and fail

Below is the CloudFormation template to build the bastion stack:

Following are the high level steps to layer the bastion stack on top of the initial stack:

I put together the following video on how to use the template:

NAT Stack

It is important to design the VPC with security in mind. I recommend designing your security zones and network segregation; I have written a blog post regarding how to secure an Azure network, and the same approach can also be implemented in an AWS environment using VPCs, subnets and security groups. At the very minimum we will segregate the private and public subnets in our VPC.

A NAT instance will be added to the "public" subnets of our initial VPC so that future private instances can use it for communication outside the VPC. We will use exactly the same method as we did for the bastion stack.

The diagram below helps to illustrate the concept:

[Diagram: the NAT stack layered on top of the base VPC stack]

The components of the NAT stack:

  • An Elastic IP address (EIP) for the NAT instance
  • A Security Group for the NAT instance: allowing ingress TCP and UDP on ports 0-65535 from the internal subnet; allowing egress TCP on ports 22, 80, 443 and 9418 to anywhere, egress UDP on port 123 to the Internet, and egress on ports 0-65535 to the internal subnet
  • The NAT instance
  • A private route table
  • A private route using the NAT instance as the default route for all traffic (see the sketch below)
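As a sketch of that last piece (logical resource names assumed), sending all private traffic via the NAT instance looks like:

"PrivateRoute" : {
  "Type" : "AWS::EC2::Route",
  "Properties" : {
    "RouteTableId" : { "Ref" : "PrivateRouteTable" },
    "DestinationCidrBlock" : "0.0.0.0/0",
    "InstanceId" : { "Ref" : "NATInstance" }
  }
}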

Following is the CloudFormation template to build the stack:

The steps to layer this stack are similar to the previous ones:

Hopefully after reading Part 1 and Part 2 of these blog posts, readers will have gained a basic understanding of how to automate AWS cloud operations using AWS CloudFormation.

Please contact Kloud Solutions if you need help with automating an AWS production environment.

http://www.wasita.net

Automate your Cloud Operations Part 1: AWS CloudFormation

Operations

What is Operations?

In the IT world, Operations refers to a team or department within IT which is responsible for the running of a business' IT systems and infrastructure.

So what kind of activities does this team perform on a day-to-day basis?

Building, modifying, provisioning and updating systems, software and infrastructure to keep them available, performant and secure, which ensures that users can be as productive as possible.

When moving to public cloud platforms the areas of focus for Operations are:

  • Cost reduction: design it properly and apply good practices when managing it (scale down / switch off)
  • Smarter operation: use of automation and APIs
  • Agility: faster provisioning of infrastructure or environments by automating everything
  • Better uptime: plan for failover, and design effective DR solutions more cost-effectively

If Cloud is the new normal then Automation is the new normal.

For this blog post we will focus on automation using AWS CloudFormation. The template I use in this post is for educational purposes only and may not be suitable for production workloads :).

AWS CloudFormation

AWS CloudFormation provides developers and system administrators (DevOps) with an easy way to create and manage a collection of related AWS resources, including provisioning and updating them in an orderly and predictable fashion. AWS provides various CloudFormation templates, snippets and reference implementations.

Let's talk about versioning before diving deeper into CloudFormation. It is extremely important to version your AWS infrastructure in the same way as you version your software. Versioning will help you to track changes within your infrastructure by identifying:

  • What changed?
  • Who changed it?
  • When was it changed?
  • Why was it changed?

You can tie this version to a service management or project delivery tool if you wish.

You should also put your templates into source control. Personally I am using Github to version my infrastructure code, but any system such as Team Foundation Server (TFS) will do.

AWS Infrastructure

The diagram below illustrates the basic AWS infrastructure we will build and automate for this blog post:

[Diagram: the initial VPC stack]

Initial Stack

Firstly we will create the initial stack. Below are the components for the initial stack:

  • A VPC with a CIDR block of 192.168.0.0/16 : 65,536 IPs
  • Three public subnets across 3 Availability Zones: 192.168.10.0/24, 192.168.11.0/24, 192.168.12.0/24
  • An Internet Gateway attached to the VPC to allow public Internet access. This is a routing construct for the VPC and not an EC2 instance
  • Routes and route tables for the three public subnets so EC2 instances in those subnets can communicate with the Internet
  • Default Network ACLs to allow all communication inside of the VPC.

Below is the CloudFormation template to build the initial stack.

The template can be downloaded here: https://s3-ap-southeast-2.amazonaws.com/andreaswasita/cloudformation_template/demo/lab1-vpc_ELB_combined.template

I put together the following video on how to use the template:

Understanding a CloudFormation template

AWS CloudFormation is pretty neat and FREE. You only need to pay for the AWS resources provisioned by the CloudFormation template.

The next bit is understanding the structure of the template. Typically a CloudFormation template has five sections:

  • Headers
  • Parameters
  • Mappings
  • Resources
  • Outputs

Headers: the template format version and a description of the template.

Parameters: values supplied at provision time, much like command-line options.

Mappings: case-statement-style conditional lookups, for example a region-to-AMI map.

Resources: all the resources to be provisioned.

Outputs: values returned by the stack, for example for consumption by other stacks or scripts.
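The original inline examples haven't survived here; a minimal skeleton showing all five sections (all values illustrative) might look like:

{
  "AWSTemplateFormatVersion" : "2010-09-09",
  "Description" : "Example: initial VPC stack",
  "Parameters" : {
    "KeyName" : {
      "Type" : "String",
      "Description" : "Name of an existing EC2 KeyPair"
    }
  },
  "Mappings" : {
    "RegionMap" : {
      "ap-southeast-2" : { "AMI" : "ami-12345678" }
    }
  },
  "Resources" : {
    "VPC" : {
      "Type" : "AWS::EC2::VPC",
      "Properties" : { "CidrBlock" : "192.168.0.0/16" }
    }
  },
  "Outputs" : {
    "VPCId" : {
      "Description" : "ID of the new VPC",
      "Value" : { "Ref" : "VPC" }
    }
  }
}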

Note: Not all AWS resources can be provisioned using AWS CloudFormation, and it has a number of limitations.

In Part 2 we will dive deeper into AWS CloudFormation, automating EC2 and including the configuration of the NAT and bastion host instances.

http://www.wasita.net

Updating your AWS bootstrap

In Bootstrapping AWS we looked at what's required to kick off a brand new installation with your latest build. But it's two weeks later now – and you're about to release version 2 of the application. Using the CloudFormation script we created first, it's actually quite easy.

In the first build script, there was a reference in the CloudFormation Metadata to the website source – being {“Ref” : “BuildNumber”}.

"Parameters" : {
  "BuildNumber" : {
  "Type" : "Number"
  }
}

So the process is as follows.

  1. Bump the BuildNumber parameter from 1 to 2
  2. Execute your CloudFormation script as a stack update
  3. Perform a termination operation using PowerShell – OK, the below is a little brutal – but it does achieve the desired result.
    Get-ASAutoScalingInstance | where-object {$_.AutoScalingGroupName -eq 'Blog-WebServerScalingGroup-AHAW1FUMED64'} | foreach {Stop-EC2Instance -instance $_.InstanceId -Terminate}

What's just happened is that we've updated the CloudFormation metadata to reference the new source zip file. Then we've terminated all the instances in our scaling group. Autoscale kicks in and ensures the minimum number of instances are running. It starts up two new instances with the new CloudFormation BuildNumber and Release 2 source. You could of course terminate these one at a time, maintaining uptime, but this is quicker to demonstrate.

The net result of this is that we end up with two brand new instances kicked off using release 2. But if you're running a lot of servers, this can be a little time consuming, and there's potentially a lot of extra hourly dollars you're being charged for. (Bring on micro charging!) But there's a better way already baked into the EC2 toolset on the base AMIs. It's called cfn-hup: a service that polls the CloudFormation metadata for changes, and then executes some actions when a change is detected. Let's use our existing update process and roll out a better method using cfn-hup via CloudFormation.

So to start – we need to configure cfn-hup on our instances.  But in line with the objectives of bootstrapping, we’ll do it via a launch configuration and CloudFormation, so that we can roll it out easily.

Returning to our original CloudFormation script and the instance metadata, we first of all need to ensure that cfn-hup has the configuration we require when we start it. cfn-hup looks in the c:\cfn folder for a file called cfn-hup.conf. That file tells the service where to find the metadata, and how often to look at it. We need to create that file. So returning to our CloudFormation script, we add under AWS::CloudFormation::Init -> "config" a new "files" element like this

"files" : {
  "c:\\cfn\\cfn-hup.conf" : {
    "content" : { "Fn::Join" : ["", [
      "[main]\n",
      "stack=", { "Ref" : "AWS::StackId" }, "\n",
      "region=", { "Ref" : "AWS::Region" }, "\n",
      "interval=1"
    ]]}
  },
  "c:\\cfn\\hooks.d\\cfn-auto-reloader.conf" : {
    "content": { "Fn::Join" : ["", [
      "[cfn-auto-reloader-hook]\n",
      "triggers=post.update\n",
      "path=Resources.WebServerLaunchConfiguration.Metadata.AWS::CloudFormation::Init\n",
      "action=cfn-init.exe -v -s ", { "Ref" : "AWS::StackId" },
                         " -r WebServerLaunchConfiguration",
                         " --region ", { "Ref" : "AWS::Region" }, "\n"
      ]]}
    }
  }

The first part creates the cfn-hup configuration.  Using CloudFormation pseudo parameters, we’re able to pass in the region and the stack to tell cfn-hup where to look.  And we use the interval to tell us how frequently (in minutes) to check for updates.

The next file we add is a hook into any changes identified by cfn-hup. We're going to trigger an action after any detected update in elements below the "path" – ie Resources.WebServerLaunchConfiguration.Metadata. Note that BuildNumber is below Metadata in the original CloudFormation script. If you look closely, you'll see the action we fire is one of the same actions we kicked off when we first bootstrapped the instance – ie cfn-init. This means we get to leverage all the cfn-init goodness we used earlier, plus leverage any additional magic in the future.

A key thing to note here is that the script in UserData will not be re-executed.  This is important because we don’t want to configure the website all over again.  Userdata only executes once per instance by default.

If you were to run this update into CloudFormation now, it’s still not going to execute though.  We need to ensure that the cfn-hup service is running.  We do that by adding a “services” section under config.  (At the same level as the “files” action we previously added)

"services" : {
  "windows" : {
    "cfn-hup" : {
      "enabled" : "true",
      "ensureRunning" : "true",
      "files" : ["c:\\cfn\\cfn-hup.conf", "c:\\cfn\\hooks.d\\cfn-auto-reloader.conf"]
    }
  }
}

The above script declares that the cfn-hup service is to be enabled and running. We also declare that the service depends on, and will monitor changes to, the "files" listed. Any changes to those files will initiate a cfn-hup restart. Now we are ready to roll it out.

We perform the roll out by executing the same steps we executed earlier.   Perform an update and terminate the instances.  The build number can be left the same.  Once you’ve done this, auto-scale will bring up new instances with the new launch configuration that will monitor the metadata, and update themselves without further manual intervention. 

So next fortnight – when you roll out version 3, you only need to perform an update by re-executing the CloudFormation script, and giving it the new build number.  The running instances (however many there may be) will perform an in place upgrade to themselves – without needing to terminate or create new instances.
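As a sketch (stack name assumed from the scaling group name used earlier, and CAPABILITY_IAM included because the stack creates IAM resources), that update is a single CLI call:

aws cloudformation update-stack ^
    --stack-name Blog ^
    --use-previous-template ^
    --parameters ParameterKey=BuildNumber,ParameterValue=3 ^
    --capabilities CAPABILITY_IAM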

Bootstrapping on AWS

The Task

This post is going to look at the process of hosting a highly available corporate website using a Windows Server 2012 Amazon Machine Image (AMI), bootstrapping the installation of Internet Information Services (IIS), UrlRewrite, and our website. We don't need a golden image as we release software every week. We also want to make sure that it is a high availability solution, so we need to look at scaling groups and repeatability.

High availability cloud formation stack for web servers

Our high availability solution will contain one load balancer and a minimum of two Elastic Compute Cloud (EC2) instances across multiple Availability Zones (AZs). In this case, our stack will be created using CloudFormation, and by utilising this, we'll be able to repeat the process across multiple environments. We're also going to make a couple of assumptions.

  1. The website source code has been built and zipped into files with the build number being the name – ie – 1.zip
  2. The above build has been uploaded into Simple Storage Service (S3)
  3. Appropriate permissions for the instance to access to S3 and the objects have already been applied

The Process

We start with a blank CloudFormation template

{
  "AWSTemplateFormatVersion" : "2010-09-09",
  "Description" : "",
  "Parameters" : {
  },
  "Resources" : {
  },
  "Outputs" : {
  }
}

We start with the fundamentals – the first thing we need is an elastic load balancer. We define this under the “Resources” section of the template.

...
  "Resources" : {
    "WebLoadBalancer" : {
      "Type" : "AWS::ElasticLoadBalancing::LoadBalancer",
      "Properties" : {
        "AvailabilityZones" : {"Fn::GetAZs" : ""},
        "Listeners" : [{
          "LoadBalancerPort" : "80",
          "InstancePort" : "80",
          "Protocol" : "HTTP"
        }]
      }
    }
  }

The second thing we need is instances that are going to respond to the load balancer. A key point here, though, is that even though CloudFormation allows us to create instances directly, we don't want to do that. By doing that, we lose the ability that the AWS magic sauce provides to ensure that a minimum number of instances are running. So instead we need two pieces: a scaling group to manage the availability of the instances, and a launch configuration to allow the scaling group to launch new instances. We'll start with the scaling group.

"WebServerScalingGroup" : {
    "Type" : "AWS::AutoScaling::AutoScalingGroup",
    "Properties" : {
      "AvailabilityZones" : { "Fn::GetAZs" : "" },
      "LaunchConfigurationName" : { "Ref" : "WebServerLaunchConfiguration" },
      "MinSize" : "2",
      "MaxSize" : "4",
      "LoadBalancerNames" : [{"Ref" : "WebLoadBalancer"}]
    }
  }

The above piece of code creates a scaling group across all availability zones (Fn::GetAZs) with a minimum of two instances and a maximum of four. It will respond to requests via the WebLoadBalancer that we defined earlier and, when required, will launch new instances using the WebServerLaunchConfiguration, which we will define next. And this is where it starts to get tricky. First, we're going to use a standard base Windows Server 2012 AMI, so that when AWS upgrade it, we can just plug in the later version. Out of the box this is missing IIS, and also – critically for us – the UrlRewrite module. The IIS installation is relatively easy as Windows provides management through PowerShell, but the installation of UrlRewrite and our source code becomes a little more tricky. We'll start with the basic CloudFormation script and, using the userdata, bootstrap the IIS installation.

  "WebServerLaunchConfiguration" : {
    "Type" : "AWS::AutoScaling::LaunchConfiguration",
    "Properties" : {
      "IamInstanceProfile" : { "Ref" : "WebServerProfile" },
      "ImageId" : "ami-1223b028",
      "InstanceType" : "m1.small",
      "KeyName" : "production",
      "SecurityGroups" : ["web", "rdp"],
      "UserData" : { "Fn::Base64" : { "Fn::Join" : ["", [
        "<script>\n",
          "powershell.exe add-windowsfeature web-webserver -includeallsubfeature -logpath $env:temp\\webserver_addrole.log \n",
          "powershell.exe add-windowsfeature web-mgmt-tools -includeallsubfeature -logpath $env:temp\\mgmttools_addrole.log \n",
          "cfn-init.exe -v -s ", {"Ref" : "AWS::StackId"}, " -r WebServerLaunchConfiguration --region ", {"Ref" : "AWS::Region"}, "\n",
        "</script>\n",
        "<powershell>\n",
          "new-website -name Test -port 80 -physicalpath c:\\inetpub\\Test -ApplicationPool \".NET v4.5\" -force \n",
          "remove-website -name \"Default Web Site\" \n",
          "start-website -name Test \n",
        "</powershell>"
      ]]}
      "Metadata" : {
        "BuildNumber" : { "Ref" : "BuildNumber" },
        "AWS::CloudFormation::Authentication" : {
          "default" : {
            "type" : "s3",
            "buckets" : ["test-production-artifacts"],
            "roleName" : { "Ref" : "BuildAccessRole" }
           }
        },
        "AWS::CloudFormation::Init" : {
          "config" : {
            "sources" : {
              "c:\\inetpub\\test" : {"Fn::Join" : ["",[
                "https://artifacts.s3.amazonaws.com/", {"Ref":"BuildNumber"},".zip"
              ]]}
            },
            "packages" : {
              "msi" : {
                "urlrewrite" : "http://download.microsoft.com/download/6/7/D/67D80164-7DD0-48AF-86E3-DE7A182D6815/rewrite_2.0_rtw_x64.msi"
              }
            },
          }
        } 
      }
    }	
  },

The above template is a fairly standard launch configuration for a base installation. The customisation comes when we start looking at the UserData property. User data has to be passed as a Base64 string – hence the cryptic Fn::Base64 command at the top. (It would be nice if AWS could just accept straight strings and encode them if necessary.) Code that is enclosed in <script> tags is executed as a batch file, and code enclosed in <powershell> is executed via PowerShell. Both the batch and the PowerShell scripts are executed with elevated privileges. I've used both types mainly for illustration.

The first part of the script installs the web server components and all their sub-features, and logs to the administrator's temp folder. This is standard Windows PowerShell scripting of IIS. The second component is cfn-init. This is a helper script which parses the AWS::CloudFormation::Init section of the metadata provided. In the Metadata section, we include a BuildNumber which is attached to each instance and provided as a parameter to the CloudFormation script.

There is also a "sources" element, which is a link to a zip file built dynamically based on the build number – ie https://artifacts.s3.amazonaws.com/1.zip representing build 1. The cfn-init script knows to expand the zip file into the location provided by the name of the property. The next element is the packages property, and the urlrewrite property provides a downloadable MSI for the UrlRewrite module. cfn-init will execute this MSI, and install the module into our previously configured server.

After the completion of the script components – being the configuration of IIS, the installation of the source code, and the installation of the module, the bootstrap process moves onto the powershell section of the script.  Again – back to basic IIS configuration, we create a new website pointing to the source download folder, delete the old one, and start it up.  The final part is to add the build number as a parameter to the “parameters” section.  This represents the name of the file (without the .zip extension).

  "Parameters" : {
    "BuildNumber" : {
      "Type" : "Number"
    }
  },

So to release a new build, you can now copy 2.zip into S3 and run a CloudFormation update, passing in 2 as the build parameter. Once done, drop each of the servers, and the auto-scaling group will automagically create brand new servers – but this time, they'll install 2.zip into the website.

Next post will look at automagically updating the existing servers when the build number is updated.