A Way to Keep Logs Safe on Disposable Servers

Automatic replacement of failed cloud infrastructure is a life-saver. Having instances recover themselves with no ops team intervention can be a life-saver too, not to mention a sleep-saver. Relieved of the responsibility of restoring service, the only outstanding task is often to explain what happened.

What if the thing that failed was an EC2 application server running Red Hat, and the logs were on the server’s now-replaced volumes? The contents of /var/log are gone, and while we might be capturing them in a log aggregator like Splunk or a syslog system of some sort, those aren’t always simple to compile into a report or send to an application vendor for a post mortem. I had this problem recently, and the solution was to move logs off the instance using systemd path units. This replaces the older inotify-tools approach, which isn’t available in the repositories of EC2 Red Hat instances.

Using path unit configuration we configure three things:

  1. A path unit that sets out what directory we’re going to watch for changes
  2. A service file that describes a script to run when that path has any changes
  3. A script that conditionally copies or moves contents off to a file share or somewhere safe, so they survive if our instance is retired, whether that was deliberate or a nasty surprise

The path watcher

Create a file in /etc/systemd/system called something descriptive like logsaver.path, with the following contents.

[Unit]
Description=Triggers a service that keeps historical log files out of the blast radius of an instance replacement.
Documentation=man:systemd.path

[Path]
PathModified=/var/log/fragileapp

[Install]
WantedBy=multi-user.target

The [Unit] section holds some metadata about what we’re trying to achieve, [Path] defines the directory being watched, and [Install] describes which target pulls the unit in.

The service

The next step is to create a file in /etc/systemd/system called logsaver.service (a path unit triggers the service with the matching name by default), with the following contents:

[Unit]
Description=Starts our log rescue script
Documentation=man:systemd.service

[Service]
Type=oneshot
ExecStart=/home/kloud/rescueRotatedLogs.sh

Again there’s some metadata saying what we’re doing, and a service that runs the script specified in ExecStart once each time it’s triggered (Type=oneshot).

Now that we’ve got a path to watch and the location of a script, let’s write the actual script.

The script

In our rescueRotatedLogs.sh script, we can do the following:

#!/bin/bash
# Move rotated log files with a datestamp like 2015-04-03 in their names somewhere safer
for file in $(ls /var/log/fragileapp | grep -E "20[0-9]{2}\-(0[1-9]|1[0-2])\-([0-2][0-9]|3[0-1])")
do
  mv /var/log/fragileapp/"$file" /mnt/aMountpointToAShareOrDFSNamespace
done

I know what you’re thinking: regex! But it doesn’t need to be complicated. Linux application logs are usually straightforward: an application.log plus a bunch of files like 2015-03-22-application.log.gz (or some other format) that have been rotated out while the app writes to the main log. The rotated files are the ones the vendor is going to need for diagnosis, or the post-incident review committee is going to want, and the regex just finds anything with an even vaguely date-like string in the name (it will match future years like 2099 too, but hopefully the instance didn’t crash because the app writes weird dates).

Once we have this all set up, we kick it off by starting our path watcher:

systemctl start logsaver.path

If you try this on a test system, you’ll notice that all the rotated logs with an ISO 8601 ‘yyyy-mm-dd’ formatted date in their names just got sent to the share. You could replace:

mv /var/log/fragileapp/"$file" /mnt/aMountpointToAShareOrDFSNamespace

with

aws s3 cp /var/log/fragileapp/"$file" s3://AnS3Bucket --sse AES256

if you have the AWS CLI installed on your instances and you want to be fully cloud (and who doesn’t).
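And if the host running the rescue script has Python and boto3 available, a hedged sketch of the same upload is below; the bucket name and key prefix are placeholders rather than anything from the original setup.

#!/usr/bin/env python3
# Minimal boto3 sketch: upload a rescued log file to S3 with server-side
# encryption, the equivalent of the aws s3 cp --sse AES256 command above.
import os
import boto3

s3 = boto3.client('s3')

def rescue_to_s3(path, bucket='AnS3Bucket', prefix='fragileapp/'):
    # ExtraArgs mirrors the --sse AES256 flag
    s3.upload_file(path, bucket, prefix + os.path.basename(path),
                   ExtraArgs={'ServerSideEncryption': 'AES256'})

rescue_to_s3('/var/log/fragileapp/2015-03-22-application.log.gz')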

Or maybe you’re multi-cloud:

az storage blob upload --container-name $aContainerYouMadeBefore --file $file --name $blob_name

You could also replace the regex grep with simply moving any zipped files with a .zip or .tar.gz extension, or anything with an extension from .1 to .12 to handle logrotate-rotated files. Implementing this, we always have the historical logs of our disposable instance in a non-disposable place, and we’re better positioned to understand what happened to our servers without being overly attached to them.

API Mocking for Developers

APIs are the most common way to exchange messages in a microservices architecture world. There are two different approaches to API development. One is called Model First and the other is called Design First. Usually the latter, also known as Spec-Driven Development (SDD), is preferred over the former.

When is the Model First approach useful? Running legacy API applications is a good example. If those systems are well documented, API spec documents can be extracted easily with tools built around Swagger (the spec has since been renamed OpenAPI). There are many implementations of Swagger, Swashbuckle for example, so it’s very easy to extract an API spec document using those tools.

What if we are developing a new API application? Should we develop the application and then extract its API spec document as above? Of course we can; there’s no problem with that. But what should we do when the API spec is updated? The application has to be updated first and the updated spec extracted again, which can be expensive. In this case, the Design First approach is useful because we can settle the whole API spec before the actual implementation, which reduces time and cost and gives better productivity. Here’s a rough outline of the SDD workflow:

  1. Design API spec based on the requirements.
  2. Simulate the API spec.
  3. Gather feedback from API consumers.
  4. Validate the API spec against the requirements.
  5. Publish the API spec.

There is an interesting point here: how can we run or simulate the API without the actual implementation? This is where API mocking comes in. By mocking APIs, the front-end developers, mobile developers or other back-end developers who actually consume the APIs can simulate the expected results, regardless of whether the APIs have been implemented yet. Mocking the APIs also makes it easier for those developers to give feedback.

In this post, we will have a look at how API mocking works in different API management tools, namely MuleSoft API Manager, Azure API Management and Amazon API Gateway, and discuss their pros and cons.

MuleSoft API Manager + RAML

MuleSoft API Manager is a strong supporter of the RAML (RESTful API Modelling Language) spec. RAML spec version 0.8 is the most popular, though version 1.0 has recently been released. RAML follows the YAML format, which makes the spec more readable for API developers. Here’s a simple RAML-based API spec document.

According to the spec document, the API has two endpoints, /products and /products/{productId}, and they define GET, POST, PATCH and DELETE methods respectively. You may already have picked which part of the spec document is used for mocking: each response body has an example node, and these nodes are what MuleSoft API Manager uses for mocking. Let’s see how we can mock both API endpoints.

First of all, log in to the Anypoint website. You can create a free trial account if you want.

Navigate to API Manager by clicking either the button or icon.

Click the Add new API button to add a new API and fill in the form fields.

After creating an API, click the Edit in API designer link to define APIs.

We have already got the API spec in RAML, so just import it.

The designer screen consists of three sections – left, centre and right. The centre section shows the RAML document itself, while the right section displays how the RAML is visualised as an API document. Now, click the Mocking Service button at the top-right corner to activate mocking. If mocking is enabled, the original URL is commented out and a new mocking URL is added like below:

Of course, when mocking is deactivated, the mocking URL disappears and the original URL is uncommented. With this mocking URL, we can actually send requests through Postman for simulation. In addition, developers consuming this API don’t have to wait until it’s implemented; they can just use it, and the API developer can develop in parallel with front-end/mobile developers. This is a screenshot of using Postman to send API requests.

As defined in the example nodes of the RAML spec document, the pre-populated result is returned. In the same way, the other endpoints and methods return their mocked results.

So far, we have looked at how MuleSoft API Manager handles RAML and API mocking. It’s very easy to use: we just imported a RAML spec document and switched on the mocking feature, and that’s it. Mocking can’t get easier than this. It also supports the Swagger 2.0 spec; when we import a Swagger document, it is automatically converted to a RAML 1.0 document. However, in the API designer, we still have to work in RAML. It would be great if MuleSoft API Manager supported editing Swagger spec documents out of the box sooner rather than later, given Swagger is the de facto standard nowadays.

There are a couple of downsides to using API Manager. We can’t have precise control over individual endpoints and methods.

  • “I want to mock only the GET method on this specific endpoint!”
  • “Nah, it’s not possible here. Try another service.”

Also, the API designer changes the base URL of the API when we activate the mocking feature. That can be a problem for the other developers consuming the API during development: they have to change the API URL once the API is implemented.

Now, let’s move on to a service that supports Swagger natively.

Azure API Management + Swagger

Swagger has now become OpenAPI, and spec version 2.0 is the most popular; a new spec version 3.0 has recently been released in preview. Swagger is versatile – it supports both YAML and JSON formats. Therefore, we can design the spec in YAML and save it as a JSON file so that Azure API Management can import the spec document natively. Here’s the same API document written in YAML based on the Swagger spec.

It looks very similar to RAML and includes an examples node in each endpoint and/or method, so we can easily activate the mocking feature. Here’s how.

First of all, we need to create an API Management instance in the Azure Portal. Provisioning takes about half an hour. Once the instance is ready, click the APIs - PREVIEW blade on the left-hand side to see the list of APIs.

We can also see tiles for API registration. Click the Open API specification tile to register the API.

Upload a Swagger definition file in JSON format, say swagger.json, and enter an appropriate name and URL suffix. We just use shop as the suffix for now.

So, that’s it! We just uploaded the swagger.json file and that completes the API definition in API Management. Now we need to mock an endpoint. Unlike MuleSoft API Manager, Azure API Management can handle mocking on endpoints and methods individually. Mocking is set in the Inbound processing tile, as it intercepts the request before it hits the back-end side. Mocking can also be set at a global level rather than per endpoint. For now, we simply set up mocking only on the /products endpoint and the GET method.

Select /products - GET at the left-hand side and click the pencil icon at the top-right corner of the Inbound Processing tile. Then we’re able to see the screen below:

Click the Mocking tab, select the Static responses option on the Mocking behavior item, and choose the 200 OK option of the Sample or schema responses item, followed by Save. We can expect the content defined under the examples node. Once saved, the screen will display something like this:

In order to call APIs through API Management, we have to send a subscription key header in every request. Go to the Products - PREVIEW blade and add the API that we’ve defined above.

Go to the Users - PREVIEW blade to get the Subscription Key.

We’re ready now. In Postman, we can see a result like this:

The subscription key is sent through the header with a key of Ocp-Apim-Subscription-Key, and the result is what we expected: the example already defined in the Swagger definition.

So far, we have used Azure API Management with Swagger definition for mocking. The main differences between MuleSoft API Manager and Azure API Management are:

  • Azure API Management doesn’t change the mocking endpoint URL, while MuleSoft API Manager does. That’s actually very important for front-end/mobile developers because they aren’t bothered with changing the base URL – the same URL returns either the mocked result or the actual result depending on the setting.
  • Azure API Management can mock endpoints and methods individually, so we only mock the necessary ones.

However, the downside of Azure API Management is cost. It costs more than USD 60/month on the Developer pricing tier, which is the cheapest. MuleSoft API Manager is effectively free from the mocking perspective (of course, we have to pay for other MuleSoft services).

What other service can we use together with Swagger? Of course there’s AWS. Let’s have a look.

Amazon API Gateway + Swagger

Amazon API Gateway also uses Swagger for its service. Log in to the API Gateway console and import the Swagger definition used above.

Once imported, we can see all the API endpoints. We choose the /products endpoint with the GET method, select the Mock option for the Integration type item, and click Save.

Now we’re at the Method Execution screen. At the rightmost side of the screen it says Mock Endpoint, which is where this API will hit. Click the Integration Request tile.

This confirms we have a mocked endpoint. Right below, select the When there are no templates defined (recommended) option for the Request body passthrough item.

Go back to the Method Execution screen and click the Integration Response tile.

There’s already a definition of the HTTP status code 200 in Swagger, which is automatically shown on the screen. Click the triangle icon at the left-hand side.

Open the Body Mapping Templates section and click the application/json item. This reveals the sample data input field, where we need to enter a JSON object as the response example by hand.

I couldn’t find any other way to populate the sample data automatically; in particular, an array-type response object can’t be auto-generated. This is questionable – why doesn’t Amazon API Gateway accept a perfectly valid object definition? If we want to avoid this situation, we have to update our Swagger definition, which becomes vendor dependent.

Once the sample data is updated, save it and go back to the Method Execution screen again. We are ready to use the mocking feature. Wait – one more step. We need to publish the API for public access.

Uh oh… we found another issue here. In order to publish (or deploy) the API, we have to set up the Integration Type of all endpoints and methods INDIVIDUALLY! We can’t skip a single endpoint/method, and we can’t set up a global mocking feature either. This is a simple example with only four endpoints/methods, but in a real-life scenario there can be hundreds of endpoints/methods in one application. How could we set up each one individually then? There’s no way to do that through the console.
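One way to take some of the sting out of this, not shown in the console walkthrough above, is to script the mock setup against the API Gateway REST API with boto3. The sketch below is an assumption-laden example: the REST API id, stage name and sample payload are placeholders, and it simply applies a MOCK integration and a canned 200 response to every method it finds.

import json
import boto3

apigw = boto3.client('apigateway', region_name='ap-southeast-2')

REST_API_ID = 'abc123'   # hypothetical API id from the Swagger import
STAGE_NAME = 'mock'
SAMPLE_BODY = json.dumps([{"productId": 1, "name": "Sample product"}])  # assumed example payload

resources = apigw.get_resources(restApiId=REST_API_ID, limit=500)

for resource in resources['items']:
    for method in resource.get('resourceMethods', {}):
        # Point the method at API Gateway's built-in mock back end
        apigw.put_integration(
            restApiId=REST_API_ID,
            resourceId=resource['id'],
            httpMethod=method,
            type='MOCK',
            requestTemplates={'application/json': '{"statusCode": 200}'}
        )
        # Return the sample payload for the 200 response
        apigw.put_integration_response(
            restApiId=REST_API_ID,
            resourceId=resource['id'],
            httpMethod=method,
            statusCode='200',
            responseTemplates={'application/json': SAMPLE_BODY}
        )

# Deploy once all endpoints are mocked
apigw.create_deployment(restApiId=REST_API_ID, stageName=STAGE_NAME)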

Anyway, once deployment completes, API Gateway gives us a URL to access the API Gateway endpoints. Use Postman to see the result:

So far, we have looked at Amazon API Gateway for mocking. Let’s wrap up this post.

  • Global API mocking: MuleSoft API Manager provides a one-click button, and Azure API Management can apply a mocking policy at a global level, while Amazon API Gateway doesn’t provide such a feature.
  • Individual API mocking: both Azure API Management and Amazon API Gateway are good at handling individual endpoints/methods, while MuleSoft API Manager can’t do it. Amazon API Gateway also doesn’t return mocked data well, especially for array-type responses, whereas Azure API Management supports this perfectly.
  • Upload automation of API definitions: with Amazon API Gateway we have to manually update several places after uploading Swagger definition files, such as the examples used as mocked data. On the other hand, both Azure API Management and MuleSoft API Manager handle this perfectly; there’s no manual work needed after uploading the definitions.
  • Cost of API mocking: Azure API Management is horrible from the cost perspective. MuleSoft provides a virtually free account for mocking, and Amazon API Gateway offers a free tier for the first 12 months.

We have briefly looked at how we can mock our APIs using their spec documents. As we saw above, we don’t need to write any code for this; mocking relies purely on the spec document. We also looked at how mocking is done on each platform – MuleSoft API Manager, Azure API Management and Amazon API Gateway – and discussed the merits and demerits of each service from the mocking perspective.

Watching the watcher – Monitoring the EC2Config Service

The EC2Config service is a nifty Windows service provided by Amazon that performs many important chores on instances based on AWS Windows Server 2003-2012 R2 AMIs. These tasks include (but are not limited to):

  • Initial start-up tasks when the instance is first started (e.g. executing the user data, setting a random Administrator account password, etc.)
  • Displaying wallpaper information on the desktop background
  • Running Sysprep and shutting down the instance

More details about this service can be found on Amazon’s webpage.

Another important aspect of the EC2Config service is that it can be configured to send performance metrics to CloudWatch. Examples of these metrics are Available Memory, Free Disk Space and Page File Usage, to name a few. The problem we faced is that sometimes this service will either stop or fail to start due to a misconfigured configuration file. Having this service running all the time was critical for monitoring and compliance reasons.

To make sure this service was running and publishing metrics to CloudWatch, we came up with a simple solution: a Python Lambda function that queries the Windows performance metrics for the last 10 minutes (the function is scheduled to run every 30 minutes, configurable through the Lambda trigger) and sends an alert if the metric is missing.

Following is the code written for this purpose. The salient features of the code are:

  1. The function lambda_handler is invoked by Lambda
  2. Variables are initialised; currently these are hard-coded in the function, but they can also be parametrised using the Environment Variables feature of a Lambda function
  3. EC2 and CloudWatch clients are initialised
  4. Running instances are retrieved based on the “running” filter
  5. If an instance has been running for less than the requested period, it is ignored (this avoids false alarms for instances started in the last few minutes)
  6. The CloudWatch metric ‘Available Memory’ for the instance is retrieved for the last 10 minutes. This can be substituted with any other metric name; also take note of the Dimension of the metric
  7. The datapoint result is inspected; if no datapoint is found, the instance is added to a list (later used for the alert)
  8. If the list has any entries, an alert is sent via an SNS topic

#
#
# AWS Lambda Python script to query for Cloudwatch metrics for all running 
# EC2 instance and if unavailable send a message through an SNS topic
# to check for EC2Config service
#
# Required IAM permissions:
#   ec2:DescribeInstances
#   sns:Publish
#   cloudwatch:GetMetricStatistics
#
# Setup:
# Check these in the code (Search *1 and *2): 
#   *1: Confirm details of the parameters
#   *2: Confirm details of the dimensions
#

from __future__ import print_function
import boto3
from calendar import timegm
from datetime import datetime, timedelta

def check_tag_present(instance, tag_name, tag_value):
    for tag in instance.tags:
        if tag['Key'] == tag_name:
            if tag['Value'] == tag_value:
                return True

    return False

def send_alert(list_instances, topic_arn):
    if topic_arn == "":
        return

    instances = ""

    for s in list_instances:
        instances += s
        instances += "\n"

    subject = "Warning: Missing CloudWatch metric data"
    message = "Warning: Missing CloudWatch metric data for the following instance id(s): \n\n" + instances + "\n\nCheck the EC2Config service is running and the config file in C:\\Program Files\\Amazon\\Ec2ConfigService\\Settings is correct."
    client = boto3.client('sns')
    response = client.publish(TargetArn=topic_arn,Message=message,Subject=subject)
    print ("*** Sending alert ***")

def lambda_handler(event, context):
    
    # *1-Provide the following information
    _instancetagname = 'Environment' # Main filter Tag key
    _instancetagvalue = 'Prod'       # Main filter Tag value
    _period = int(10)                # Period in minutes
    _namespace = 'WindowsPlatform'   # Namespace of metric
    _metricname = 'Available Memory' # Metric name
    _unit = 'Megabytes'              # Unit
    _topicarn =  ''                  # SNS Topic ARN to write message to
    _region = "ap-southeast-2"       # Region

    ec2 = boto3.resource('ec2',_region)
    cw = boto3.client('cloudwatch',_region)

    filters = [{'Name':'instance-state-name','Values':['running']}]

    instances = ec2.instances.filter(Filters=filters)

    now = datetime.now()

    print('Reading Cloud watch metric for last %s min\n' %(_period))

    start_time = datetime.utcnow() - timedelta(minutes=_period)
    end_time = datetime.utcnow()

    print ("List of running instances:")

    list_instances=[]

    for instance in instances:
        
        if check_tag_present(instance, _instancetagname, _instancetagvalue)==False:            
            continue #Tag/Value missing, ignoring instance

        print ("Checking ", instance.id)

        i=1
        
        date_s=instance.launch_time
        date_s=date_s.replace(tzinfo=None)
        new_dt = datetime.utcnow() - date_s

        instance_name = [tag['Value'] for tag in instance.tags if tag['Key'] == 'Name'][0]
        minutessince = int(new_dt.total_seconds() / 60)
        
        if minutessince < _period:
            print ("Not looking for data on this instance as uptime is less than requested period.\n")
            continue

        metrics = cw.get_metric_statistics(
            Namespace=_namespace,
            MetricName=_metricname,            
            Dimensions=[{'Name': 'InstanceId','Value': instance.id}], 
            StartTime=start_time,
            EndTime=end_time,
            Period=300,
            Statistics=['Maximum'],
            Unit=_unit
        )
        
        datapoints = metrics['Datapoints']

        for datapoint in datapoints:
            if datapoint['Maximum']:
                print (i,")\nDatapoint Data:",datapoint['Maximum'],"\nTimeStamp: ",datapoint['Timestamp'],"\n")
                i+=1
            else:
                print ("Cloudwatch has no Maximum metrics for",_metricname,"instance id: ", instance.id)

        if i == 1: #No data point found
            print ("Cloudwatch has no metrics for",_metricname," for instance id: ", instance.id)
            list_instances.append(instance_name + " (" + instance.id + ")")
            
        print ("=================================================\n")

    if len(list_instances) > 0:
        send_alert(list_instances, _topicarn)

Please note: the function needs permissions to execute, so the following policy should be attached to the Lambda function’s role:

{
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "Stmt1493179460000",
        "Effect": "Allow",
        "Action": ["ec2:DescribeInstances"],
        "Resource": ["*"]
    },
    {
        "Sid": "Stmt1493179541000",
        "Effect": "Allow",
        "Action": ["sns:Publish"],
        "Resource": ["*"]
    },
    {
        "Sid": "Stmt1493179652000",
        "Effect": "Allow",
        "Action": ["cloudwatch:GetMetricStatistics"],
        "Resource": ["*"]
    }]
}
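As a side note, the 30-minute schedule mentioned earlier can also be wired up in code. Below is a rough boto3 sketch using a CloudWatch Events rule; the function ARN, rule name and region are placeholders, not values from this setup.

import boto3

events = boto3.client('events', region_name='ap-southeast-2')
lambda_client = boto3.client('lambda', region_name='ap-southeast-2')

FUNCTION_ARN = 'arn:aws:lambda:ap-southeast-2:123456789012:function:ec2config-metric-check'  # hypothetical
RULE_NAME = 'ec2config-metric-check-every-30min'

# Rule that fires every 30 minutes
rule = events.put_rule(Name=RULE_NAME, ScheduleExpression='rate(30 minutes)', State='ENABLED')

# Allow CloudWatch Events to invoke the function
lambda_client.add_permission(
    FunctionName=FUNCTION_ARN,
    StatementId='allow-cloudwatch-events',
    Action='lambda:InvokeFunction',
    Principal='events.amazonaws.com',
    SourceArn=rule['RuleArn']
)

# Point the rule at the Lambda function
events.put_targets(Rule=RULE_NAME, Targets=[{'Id': '1', 'Arn': FUNCTION_ARN}])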

VPC (Virtual Private Cloud) Configuration

Introduction

This blog is Part 01 of a 02-part series on custom VPC configurations.

Part 01 discusses the following scenario:

  • Creating a VPC with 02 subnets (Public and Private)
  • Creating a bastion host server in the public subnet
  • Allowing the bastion host to connect to the servers in the private subnet using RDP

Part 02 will discuss the following:

  • Configuring NAT instances
  • Configuring VPC peering
  • Configuring VPC Flow Logs

What is a VPC

A VPC can be described as a logical data centre where AWS resources can be deployed.

The logical data centre can be connected to your physical data centre through VPN or Direct Connect. Details: https://blog.kloud.com.au/2014/04/10/quickly-connect-to-your-aws-vpc-via-vpn/

This section deals with the following tasks:

  • Creating the VPC
  • Creating Subnets
  • Configuring Subnets for internet access

1 Creating the VPC

The following steps should be followed to configure the VPC. We could use the wizard to create a VPC, but this document will focus on the detailed method where every configuration parameter is defined by the user.

Step 01.01: Log on to the AWS console

Step 01.02: Click on VPC

Step 01.03: Select Your VPCs

Step 01.04: Select Create VPC

Step 01.05: Enter the following details in the Create VPC window

  • Enter the details of the Name tag
  • Enter the CIDR block. Keep in mind that the block size cannot be larger than /16.

Step 01.06: Click on Yes, Create

We have now created a VPC. The following resources are also created automatically:

  • Routing table for the VPC
  • Default VPC Security Group
  • Network ACL for the VPC

Default Routing Table (Route Table Id = rtb-ab1cc9d3)

Check the routing table created for the VPC. If you look at its routes, you will see the following:

  • Destination: 10.0.0.0/16
  • Target: Local
  • Status: Active
  • Propagated: No

This route ensures that all the subnets in the VPC can reach each other. All subnets created in the VPC are assigned to the default route table, so it’s best practice not to change it; for any route modifications, a new route table can be created and assigned to specific subnets.

Default Network Access Control List (NACL Id = acl-ded45ca7)

Mentioned below is a snapshot of the default NACL created when the VPC was created.

Default Security Group for the VPC (Group Id = sg-5c088122)

Mentioned below is a snapshot of the default security group created when the VPC was created.

Now we need to create subnets, keeping in mind that the considered scenario needs 02 subnets (01 Public and 01 Private).

2 Creating Subnets

Step 02.01: Go to the VPC dashboard and select Subnets

Step 02.02: Click on Create Subnet

Step 02.03: Enter the following details in the Create Subnet window

  • Name tag: “10.0.1.0/24 – us-east-1a” (subnet IPv4 CIDR block – availability zone)
  • VPC: select the newly created VPC = vpc-cd54beb4 | MyVPC
  • Availability Zone: us-east-1a
  • IPv4 CIDR block: 10.0.1.0/24

Step 02.04: Click on Yes, Create

Now we have created subnet 10.0.1.0/24

We will use the same steps to create another subnet, 10.0.2.0/24, in availability zone us-east-1b.

  • Name tag: “10.0.2.0/24 – us-east-1b” (subnet IPv4 CIDR block – availability zone)
  • VPC: select the newly created VPC = vpc-cd54beb4 | MyVPC
  • Availability Zone: us-east-1b
  • IPv4 CIDR block: 10.0.2.0/24

3 Configuring Subnets

Now that we have 02 subnets, we need to configure 10.0.1.0/24 as the public subnet and 10.0.2.0/24 as the private subnet. The following tasks need to be performed:

  • Internet gateway creation and configuration
  • Route table creation and configuration
  • Auto-assign public IP configuration

3.1 Internet Gateway Creation and Configuration (IGW Config)

Internet gateways, as the name suggests, provide access to the internet. An internet gateway is attached to a VPC, and the routing table is configured to direct all internet-bound traffic to it.

Mentioned below are the steps for creating and configuring the internet gateway.

Step 03.01.01: Select Internet Gateways from the VPC dashboard and click on Create Internet Gateway

Step 03.01.02: Enter the name tag and click on Yes, Create

The internet gateway is created but not attached to any VPC (Internet Gateway Id = igw-90c467f6).

Step 03.01.03: Select the internet gateway and click on Attach to VPC

Step 03.01.04: Select your VPC and click on Yes, Attach

We have now attached the Internet Gateway to the VPC. Now we need to configure the route tables for internet access.

3.2 Route Table Creation and Configuration (RTBL Config)

A default route table (id rtb-ab1cc9d3) was created along with the VPC. It’s best practice to create a separate route table for internet access.

Step 03.02.01: Click on the Route Tables section in the VPC dashboard and click Create Route Table

Step 03.02.02: Enter the following details in the Create Route Table window and click on Yes, Create

  • Name tag: a relevant name, e.g. InternetAccessRouteTbl
  • VPC: your VPC = vpc-cd54beb4 | MyVPC

Step 03.02.03: Select the newly created route table (Route Table Id = rtb-3b78ad43 | InternetAccessRouteTbl), click Routes and then Edit

Step 03.02.04: Click on Add Another Route

Step 03.02.05: Enter the following values in the route and click on Save

  • Destination: 0.0.0.0/0
  • Target: your Internet Gateway Id = igw-90c467f6 (in my case)

The route table needs subnet associations. The subnets we want to make public should be associated with this route table; in our case, we associate subnet 10.0.1.0/24 with it.

Step 03.02.06: Click on Subnet Associations

You should see the message “You do not have any subnet associations”.

Step 03.02.07: Click on Edit

Step 03.02.08: Select the subnet you want to configure as a public subnet, in our case 10.0.1.0/24, and click on Save

3.3 Auto-Assign Public IP Configuration

By default, neither of the subnets created (10.0.1.0/24 and 10.0.2.0/24) will assign public IP addresses to the instances deployed in them.

We need to configure the public subnet (10.0.1.0/24) to assign public IPs automatically.

Step 03.03.01: Go to the Subnets section in the VPC dashboard

Step 03.03.02: Select the public subnet

Step 03.03.03: Click on Subnet Actions

Step 03.03.04: Select Modify auto-assign IP Settings

Step 03.03.05: Check Enable auto-assign public IPv4 addresses in the Modify auto-assign IP settings window and click on Save

After this configuration, any EC2 instance deployed in the 10.0.1.0/24 subnet will be assigned a public IP.
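For completeness, everything in sections 1 to 3 can also be scripted. The following boto3 sketch is illustrative only; the CIDR blocks, availability zones and tags mirror the walkthrough above, but adjust them to your own environment.

import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')

# VPC with the /16 CIDR used in the walkthrough
vpc_id = ec2.create_vpc(CidrBlock='10.0.0.0/16')['Vpc']['VpcId']
ec2.create_tags(Resources=[vpc_id], Tags=[{'Key': 'Name', 'Value': 'MyVPC'}])

# Public and private subnets
public_subnet = ec2.create_subnet(VpcId=vpc_id, CidrBlock='10.0.1.0/24',
                                  AvailabilityZone='us-east-1a')['Subnet']['SubnetId']
private_subnet = ec2.create_subnet(VpcId=vpc_id, CidrBlock='10.0.2.0/24',
                                   AvailabilityZone='us-east-1b')['Subnet']['SubnetId']

# Internet gateway, attached to the VPC
igw_id = ec2.create_internet_gateway()['InternetGateway']['InternetGatewayId']
ec2.attach_internet_gateway(InternetGatewayId=igw_id, VpcId=vpc_id)

# Separate route table for internet access, associated with the public subnet only
rtb_id = ec2.create_route_table(VpcId=vpc_id)['RouteTable']['RouteTableId']
ec2.create_route(RouteTableId=rtb_id, DestinationCidrBlock='0.0.0.0/0', GatewayId=igw_id)
ec2.associate_route_table(RouteTableId=rtb_id, SubnetId=public_subnet)

# Auto-assign public IPv4 addresses in the public subnet
ec2.modify_subnet_attribute(SubnetId=public_subnet, MapPublicIpOnLaunch={'Value': True})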

4 Security

A security group acts as a virtual firewall that controls the traffic for one or more instances. When you launch an instance, you associate one or more security groups with it. You add rules to each security group that allow traffic to or from its associated instances. You can modify the rules for a security group at any time; the new rules are automatically applied to all instances that are associated with the security group. When deciding whether to allow traffic to reach an instance, AWS evaluates all the rules from all the security groups that are associated with the instance.

We will create 02 security groups:

  • Public-Private (will contain the access rules from the public subnet to the private subnet)
  • Public-Internet (will contain the ports allowed from the internet to the public subnet)

Step 4.1: Click on Security Groups in the Network and Security section

Step 4.2: Click on Create Security Group

Step 4.3: Enter the following details in the Create Security Group window and click on Create

  • Security group name: Public-Private
  • Description: rules between the private subnet and the public subnet
  • VPC: select the VPC we created in this exercise
  • Click on Add Rules to add the following rules
    • Type = RDP, Protocol = TCP, Port Range = 3389, Source = Custom: 10.0.1.0/24
    • Type = All ICMP – IPv4, Protocol = ICMP, Port Range = 0 – 65535, Source = Custom: 10.0.1.0/24

Step 4.4: Enter the following details in the Create Security Group window and click on Create

  • Security group name: Public-Internet
  • Description: rules between the public subnet and the internet
  • VPC: select the VPC we created in this exercise
  • Click on Add Rules to add the following rules
    • Type = RDP, Protocol = TCP, Port Range = 3389, Source = Anywhere
    • Type = All ICMP – IPv4, Protocol = ICMP, Port Range = 0 – 65535, Source = Anywhere
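The same two security groups can be created programmatically. Here’s a hedged boto3 sketch; the VPC id is a placeholder and the rules mirror the ones above.

import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')
vpc_id = 'vpc-cd54beb4'  # replace with your VPC id

public_private = ec2.create_security_group(
    GroupName='Public-Private',
    Description='Rules between Private subnet and Public subnet',
    VpcId=vpc_id)['GroupId']

ec2.authorize_security_group_ingress(
    GroupId=public_private,
    IpPermissions=[
        {'IpProtocol': 'tcp', 'FromPort': 3389, 'ToPort': 3389,
         'IpRanges': [{'CidrIp': '10.0.1.0/24'}]},    # RDP from the public subnet
        {'IpProtocol': 'icmp', 'FromPort': -1, 'ToPort': -1,
         'IpRanges': [{'CidrIp': '10.0.1.0/24'}]},    # all ICMP from the public subnet
    ])

public_internet = ec2.create_security_group(
    GroupName='Public-Internet',
    Description='Rules between Public subnet and the internet',
    VpcId=vpc_id)['GroupId']

ec2.authorize_security_group_ingress(
    GroupId=public_internet,
    IpPermissions=[
        {'IpProtocol': 'tcp', 'FromPort': 3389, 'ToPort': 3389,
         'IpRanges': [{'CidrIp': '0.0.0.0/0'}]},      # RDP from anywhere
        {'IpProtocol': 'icmp', 'FromPort': -1, 'ToPort': -1,
         'IpRanges': [{'CidrIp': '0.0.0.0/0'}]},      # all ICMP from anywhere
    ])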

5 EC2 Installation

Now we will deploy 02 EC2 instances: one named PrivateInstance in the 10.0.2.0/24 subnet and one named PublicInstance in the 10.0.1.0/24 subnet.

Public instance configuration:

  • Instance name: PublicInstance
  • Network: MyVPC
  • Subnet: 10.0.1.0/24
  • Auto-assign public IP: use subnet setting (enabled)
  • Security group: Public-Internet security group
  • IAM role: as per requirement

Private instance configuration:

  • Instance name: PrivateInstance
  • Network: MyVPC
  • Subnet: 10.0.2.0/24
  • Auto-assign public IP: use subnet setting (disabled)
  • Security group: Public-Private security group
  • IAM role: as per requirement

Once the deployment of the EC2 instances is complete, you can connect to the PublicInstance through RDP and from there connect on to the PrivateInstance.
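If you prefer to launch the two instances from code, a rough boto3 sketch follows; the AMI id, key pair, subnet ids and security group ids are placeholders to be replaced with the values created earlier.

import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')

def launch(name, subnet_id, sg_id):
    # Public IP assignment follows the subnet's auto-assign setting
    return ec2.run_instances(
        ImageId='ami-xxxxxxxx',          # your Windows AMI
        InstanceType='t2.micro',
        KeyName='my-keypair',
        MinCount=1, MaxCount=1,
        SubnetId=subnet_id,
        SecurityGroupIds=[sg_id],
        TagSpecifications=[{'ResourceType': 'instance',
                            'Tags': [{'Key': 'Name', 'Value': name}]}]
    )['Instances'][0]['InstanceId']

public_instance = launch('PublicInstance', 'subnet-public-id', 'sg-public-internet-id')
private_instance = launch('PrivateInstance', 'subnet-private-id', 'sg-public-private-id')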

Patching EC2 through SSM

Why Patch Manager?

AWS SSM Patch Manager is an automated tool that helps you simplify your operating system patching process, including selecting the patches you want to deploy, the timing for patch roll-outs, controlling instance reboots, and many other tasks. You can define auto-approval rules for patches with an added ability to black-list or white-list specific patches, control how the patches are deployed on the target instances (e.g. stop services before applying the patch), and schedule the automatic roll out through maintenance windows.

These capabilities help you automate your patch maintenance process to save you time and reduce the risk of non-compliance. All capabilities of EC2 Systems Manager, including Patch Manager, are available free of charge so you only pay for the resources you manage.

This article can be used to configure patching for instances hosted on the AWS platform.

You will need some prerequisite knowledge of the EC2 and IAM sections of AWS. If so, then please read on.

The configuration has three major sections:

  • EC2 instance configuration for patching
  • Default Patching Baseline Configuration
  • Maintenance Window configuration.

1 Instance Configuration

We will start with the first section, which is configuring the instances to be patched. This requires the following tasks:

  1. Create an Amazon EC2 role for patching with two policies attached
    • AmazonEC2RoleforSSM
    • AmazonSSMFullAccess
  2. Assign the role to the EC2 instances
  3. Configure tags to enable patching in groups

Important: the machines to be patched should be able to contact Windows Update services. The article below lists the URLs that should be accessible for proper patch management.

https://technet.microsoft.com/en-us/library/cc708605(v=ws.10).aspx

Mentioned below are the detailed steps for creating an IAM role for the instances to be patched using Patch Manager.

Step 1: Select IAM —> Roles and click on Create New Role

Step 2: Select Role Type —> Amazon EC2

Step 3: Under Attach Policy, select the following and click Next

  • AmazonEC2RoleforSSM
  • AmazonSSMFullAccess

Step 4: Enter the role name and select Create Role (at the bottom of the page)

Now you have gone through the first step in your patch management journey.

Instances should be configured to use the role created above (or any role that has the AmazonEC2RoleforSSM and AmazonSSMFullAccess policies attached to it) to ensure proper patch management.

We need to group our AWS-hosted servers, because no one in the right frame of mind wants to patch all the servers in one go.

To accomplish that we need to use Patch Groups (explained later).

For example:

We can configure Patch Manager to patch EC2 instances with the Patch Group tag value Group01 on Wednesday and instances with the Patch Group tag value Group02 on Friday.

To utilize patch groups, all EC2 instances should be tagged to support cumulative patch management based on Patch Groups.
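For reference, the tagging can also be done with a quick boto3 call; the instance id and group value below are placeholders (note the tag key SSM expects is literally Patch Group).

import boto3

ec2 = boto3.client('ec2', region_name='ap-southeast-2')

# Tag the instances that should be patched together, e.g. on Wednesday
ec2.create_tags(
    Resources=['i-0123456789abcdef0'],                 # replace with your instance ids
    Tags=[{'Key': 'Patch Group', 'Value': 'Group01'}]  # matched against the baseline's patch groups
)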

Congratulations, you have completed the first section of the configuration. Keep following, just two to go.

2 Default Patch Baseline Configuration

Patches are categorised using the following attributes:

  • Product type: e.g. Windows version
  • Classification: CriticalUpdates, SecurityUpdates, ServicePacks, UpdateRollups
  • Severity: Critical, Important, Low, etc.

Patches are prioritized based on the above factors.

A patch baseline can be used to configure the following using rules:

  • Products to be included in patching
  • Classification of patches
  • Severity of patches
  • Auto-approval delay: the time to wait (in days) before automatic approval

The patch baseline is configured as follows.

Step 01: Select EC2 —> Patch Baselines (under the Systems Manager Services section)

Step 02: Click on Create Patch Baseline

Step 03: Fill in the details of the baseline and click on Create

Go to Patch Baselines and make the newly created baseline your default.
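If you’d rather script this step, here’s a hedged boto3 sketch of creating a Windows patch baseline and registering it as the default; the name, approval rule and values are illustrative assumptions only.

import boto3

ssm = boto3.client('ssm', region_name='ap-southeast-2')

baseline = ssm.create_patch_baseline(
    Name='Windows-Prod-Baseline',
    OperatingSystem='WINDOWS',
    ApprovalRules={'PatchRules': [{
        'PatchFilterGroup': {'PatchFilters': [
            {'Key': 'CLASSIFICATION', 'Values': ['CriticalUpdates', 'SecurityUpdates']},
            {'Key': 'MSRC_SEVERITY', 'Values': ['Critical', 'Important']},
        ]},
        'ApproveAfterDays': 7   # auto-approval delay
    }]},
    Description='Critical and security updates, auto-approved after 7 days'
)

# Make it the default baseline used by AWS-ApplyPatchBaseline
ssm.register_default_patch_baseline(BaselineId=baseline['BaselineId'])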

At this point, the instances to be patched are configured and the patch policies are in place. In the next section we tell AWS the when (date and time) and the what (task) of the patching cycle.

3 Maintenance Windows Configuration

As the name suggests, maintenance windows give us the option to run tasks on EC2 instances on a specified schedule.

What we wish to accomplish with maintenance windows is to run a command (AWS-ApplyPatchBaseline) on a given schedule and on a subset of our servers. This is where all the above configuration gels together to make patching work.

Configuring maintenance windows consists of the following tasks:

  • IAM role for Maintenance Windows
  • Creating the Maintenance Window itself
  • Registering Targets (Selecting servers for the activity)
  • Registering Tasks (Selecting tasks to be executed)

Mentioned below are the detailed steps for configuring all the above.

Step 01: Create a role with the following policy attached

  • AmazonSSMMaintenanceWindowRole

Step 02: Enter the role name and role description

Step 03: Click on the role and copy the Role ARN

Step 04: Click on Edit Trust Relationships

Step 05: Add the following value under the Principal section of the JSON policy, as shown below

"Service": "ssm.amazonaws.com"

Step 06: Click on Update Trust Relationships (at the bottom of the page)

At this point the IAM role for the maintenance window has been configured. The next section details the configuration of the maintenance window.

Step 01: Click on EC2 and select Maintenance Windows (under the Systems Manager Shared Resources section)

Step 02: Enter the details of the maintenance window and click on Create Maintenance Window

At this point the Maintenance Window has been created. The next task is to Register Targets and Register Tasks for this maintenance window.

Step 01: Select the maintenance window created and click on Actions

Step 02: Select Register Targets

Step 03: Enter the owner information and select the tag name and tag value

Step 04: Select Register Targets

At this point the targets for the maintenance window have been configured. This leaves us with the last activity in the configuration which is to register the tasks to be executed in the maintenance window.

Step 01: Select the maintenance window and click on Actions

Step 02: Select Register Task

Step 03: Select AWS-ApplyPatchBaseline from the Document section

Step 04: Click on Registered Targets and select the instances based on the Patch Group tag

Step 05: Select the operation, Scan or Install, based on the desired function (keep in mind that Install will result in a server restart)

Step 06: Select the MaintenanceWindowsRole

Step 07: Click on Register Task
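The maintenance window, target and task registration can also be driven through the SSM API. The boto3 sketch below is an assumption-heavy outline; the schedule, tag value, role ARN and concurrency settings are placeholders to adapt.

import boto3

ssm = boto3.client('ssm', region_name='ap-southeast-2')

window = ssm.create_maintenance_window(
    Name='Wednesday-Patching',
    Schedule='cron(0 14 ? * WED *)',   # every Wednesday, 14:00 UTC
    Duration=3,                        # hours
    Cutoff=1,                          # stop scheduling new tasks 1 hour before the end
    AllowUnassociatedTargets=False
)

target = ssm.register_target_with_maintenance_window(
    WindowId=window['WindowId'],
    ResourceType='INSTANCE',
    Targets=[{'Key': 'tag:Patch Group', 'Values': ['Group01']}]
)

ssm.register_task_with_maintenance_window(
    WindowId=window['WindowId'],
    Targets=[{'Key': 'WindowTargetIds', 'Values': [target['WindowTargetId']]}],
    TaskArn='AWS-ApplyPatchBaseline',
    ServiceRoleArn='arn:aws:iam::123456789012:role/MaintenanceWindowsRole',  # hypothetical role ARN
    TaskType='RUN_COMMAND',
    MaxConcurrency='2',
    MaxErrors='1',
    TaskParameters={'Operation': {'Values': ['Install']}}   # or 'Scan'
)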

After completing the configuration, the registered task will run on the registered targets based on the schedule specified in the maintenance window.

The status of the maintenance window can be seen in the History section.

Hopefully this guide gets you through the initial patching configuration for your EC2 instances in AWS.

The configuration can also be done using the AWS CLI; let’s leave that for another blog for now.

Thanks for Reading.

Re-execute the UserData script in an AWS Windows Instance

First published at https://nivleshc.wordpress.com

Bootstrapping is an awesome way of customising your instances in AWS (similar capability exists in Azure).

To enable bootstrapping, while configuring the instance to launch, in Step 3: Configure Instance Details scroll down to the bottom and expand Advanced Details.

You will notice a User data text box. This is where you can provide your bootstrap script. The script will be run when your instance is first launched.


I went ahead and entered my script in the text box and proceeded to complete my instance configuration. Once my instance was running, I initiated a Remote Desktop connection to it to confirm that my script had run. Unfortunately, I couldn’t see any customisations (which meant my script hadn’t run).

Thinking that the instance had not been able to access the user data, I opened up Internet Explorer and browsed to the following URL (an internal URL that can be used to access the user data):

http://169.254.169.254/latest/user-data/

I was able to successfully access the user-data, which meant that there were no issues with that.  However when checking the content, I noticed a typo! Aha, that was the reason why my customisations didn’t happen.
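If you prefer to check this from a script rather than a browser, a tiny Python sketch run on the instance itself does the same thing (this assumes the classic instance metadata endpoint is reachable, which it is by default):

from urllib.request import urlopen

# Fetch the user data the instance was launched with, same as browsing the URL above
user_data = urlopen('http://169.254.169.254/latest/user-data/', timeout=2).read().decode()
print(user_data)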

Unfortunately, according to AWS, user-data is only executed during launch (for those that would like to read, here is the official AWS documentation). To get the fixed bootstrap script to run, I would have to terminate my instance and launch a new one with the corrected script (I tried re-booting my windows instance after correcting my typo, however it didn’t run).

I wasn’t very happy about terminating my current instance and then launching a new one since, for those that might not be aware, AWS EC2 compute charges are rounded up to the next hour. That means that if I terminated my current instance and launched a new one, I would be charged for 2 x 1 hour sessions instead of just 1 x 1 hour!

So I set about trying to find another solution. And guess what, I did find it 🙂

Reading through the volumes of documentation on AWS, I found that when Windows Instances are provisioned, the service that does the customisations using user-data is called EC2Config. This service runs the initial startup tasks when the instance is first started and then disables them. HOWEVER, there is a way to re-enable the startup tasks later on 🙂 Here is the document that gives more information on EC2Config.

The Amazon Windows AMIs include a utility called EC2ConfigService Settings. This allows you to configure EC2Config to execute the user-data on next service startup. The utility can be found under All Programs (or you can search for it).


Once open, under General you will see the following option:

Enable UserData execution for next service start (automatically enabled at Sysprep) eg. <script></script> or <powershell></powershell>


Tick this option and then press OK. Then restart your Windows Instance.

After your Windows Instance restarts, EC2Config will execute the userData (bootstrap script) and then automatically remove the tick from the above option so that the userData is not executed on subsequent restarts (or service starts).

There you go. A simple way to re-run your bootstrap scripts on an AWS Windows Instance without having to terminate the current instance and launch a new one.

There are other options available in the EC2ConfigService Settings that you can explore as well 🙂

A Closer Look at Amazon Chime

In news this quarter, AWS have released a web conferencing cloud service to add to their existing ‘Business Productivity‘ services, which already include Amazon WorkDocs and Amazon WorkMail. My thought was to help you gauge where this sits in relation to Skype for Business. I don’t want to turn this into a Microsoft versus Amazon review, but I do want you to understand which product names ‘somewhat’ align with each other.

Exchange = WorkMail

SharePoint/OneDrive for Business  =  WorkDocs

Skype for Business  = Chime

The Microsoft products are reasonably well known in the world, so I’ll give you a quick one-liner about the Amazon products:

WorkMail “Hosted Email”

WorkDocs “Hosted files accessible via web, PC, mobile devices with editing and sharing capability”

So what is Chime?

Chime isn’t exactly a one-to-one feature match for Skype for Business, so be wary of articles conveying that sentiment, as they haven’t really done their homework. Chime can be called a web meeting, online meeting, or online video conferencing platform. Unlike Skype for Business, Chime is not a PBX replacement. So what does it offer?

  • Presence
  • 1-1 and group IM
  • Persistent chat / chat rooms
  • Transfer files
  • Group HD video calling
  • Schedule meetings using Outlook or Google calendar
  • Join meeting from desktop or browser
  • Join meeting anonymously
  • Meeting controls for presenter
  • Desktop sharing
  • Remote desktop control
  • Record audio and video of meetings
  • Allow participants to dial-in with no Chime client (PSTN conferencing bridge)
  • Enable legacy H.323 endpoints to join meetings (H.323 bridge)
  • Customisable / personalised meeting space URL

The cloud-hosted products I see as similar to Chime are WebEx, Google Hangouts, GoToMeeting and ReadyTalk, to name just a few. As here at Kloud we are Microsoft Gold Partners and have a Unified Communications team that delivers Skype for Business solutions rather than the other products I’ve mentioned, I will tell you a few things that differentiate SfB from Chime.

  • Direct Inward Dial (DID) user assignment
  • Call forwarding and voicemail
  • Automated Call Distribution (ACD)
  • Outbound PSTN dialling and route based on policy assignment
  • Interactive Voice Response (IVR)
  • Hunt Groups / Call Pickup Groups
  • Shared Line Appearance
  • SIP IP-Phone compatibility
  • Attendant / Reception call pickup
  • On-premises, hybrid and hosted deployment models

Basically, Skype for Business will do all the things that replace a PBX solution. This isn’t a cheap shot at Amazon, because that isn’t where they are positioning their product. What I hope to have done is clarify any misconception about where the product sits in the market and how it relates to features in a well-known product like Microsoft Skype for Business.

For the price Amazon are offering Chime at, it is very competitive against other hosted products in the online meeting market. Their ‘Rolls-Royce’ plan is USD $15 per user per month. If you’re not invested in the Office 365 E3/E5 licence ecosystem and you need an online meeting platform at low cost, then Chime might be right for you. Amazon offer a 30-day free trial that is a great way to test it out.

https://chime.aws

Dynamically rename an AWS Windows host deployed via a syspreped AMI

One of my customers presented me with a unique problem last week. They needed to rename a Windows Server 2016 host deployed using a custom AMI without rebooting during the bootstrap process. This lack of a reboot rules out the simple option of using the PowerShell Rename-Computer Cmdlet. While there are a number of methods to do this, one option we came up with is dynamically updating the sysprep unattended answer file using a PowerShell script prior to the unattended install running during first boot of a sysprepped instance.

To begin, we looked at the Windows Server 2016 sysprep process on an EC2 instance to get an understanding of the process required. It’s important to note that this process is slightly different to Server 2012 on AWS, as EC2Launch has replaced EC2Config. The high level process we need to perform is the following:

  1. Deploy a Windows Server 2016 AMI
  2. Connect to the Windows instance and customise it as required
  3. Create a startup PowerShell script to set the hostname that will be invoked on boot before the unattended installation begins
  4. Run InitializeInstance.ps1 -Schedule to register a scheduled task that initializes the instance on next boot
  5. Modify SysprepInstance.ps1 to replace the cmdline registry key after sysprep is complete and prior to shutdown
  6. Run SysprepInstance.ps1 to sysprep the machine.

Steps 1 and 2 are fairly self-explanatory, so let’s start by taking a look at step 3.

After a host is sysprepped, the value for the cmdline registry key located in HKLM:\System\Setup is populated with the windeploy.exe path, indicating that it should be invoked after reboot to initiate the unattended installation. Our aim is to ensure that the answer file (unattend.xml) is modified to include the computer name we want the host to be named prior to windeploy.exe executing. To do this, I’ve created the following PowerShell script named startup.ps1 and placed it in C:\Scripts to ensure my hostname is based on the internal IP address of my instance.

Once this script is in place, we can move to step 4, where we schedule the InitializeInstance.ps1 script to run on first boot. This can be done by running InitializeInstance.ps1 -Schedule located in C:\ProgramData\Amazon\EC2-Windows\Launch\Scripts

Step 5 requires a modification of the SysprepInstance.ps1 file to ensure the shutdown after Sysprep is delayed to allow modification of the cmdline registry entry mentioned in step 3. The aim here is to replace the windeploy.exe path with our startup script, then shutdown the host. The modifications to the PowerShell script to accommodate this are made after the “#Finally, perform Sysprep” comment.

Finally, run the SysprepInstance.ps1 script.

Once the host changes to the stopped state, you can now create an AMI and use this as your image.

Configuring AWS Web Application Firewall

In a previous blog we discussed Site Delivery with AWS CloudFront CDN; one aspect not covered there was WAF (Web Application Firewall).

What is Web Application Firewall?

AWS WAF is a web application firewall that helps protect your web applications from common web exploits that could affect application availability, compromise security, or consume excessive resources. AWS WAF gives you control over which traffic to allow or block to your web applications by defining customizable web security rules. You can use AWS WAF to create custom rules that block common attack patterns, such as SQL injection or cross-site scripting, and rules that are designed for your specific application. New rules can be deployed within minutes, letting you respond quickly to changing traffic patterns. Also, AWS WAF includes a full-featured API that you can use to automate the creation, deployment, and maintenance of web security rules.

When to create your WAF?

Since this blog is related to CloudFront, and WAF is tightly integrated with CloudFront, that will be our focus. Your WAF rules will run and be applied in all edge locations you specify during your CloudFront configuration.

It is recommended that you create your WAF before deploying your CloudFront distribution: since CloudFront takes a while to deploy, any changes applied after deployment take a similar time to propagate, even after you attach the WAF to the CloudFront distribution (this is done in “Choose Resource” during WAF configuration, shown below).

Although there is no general rule here, it is up to the organisation or administrator to apply WAF rules before or after the deployment of the CloudFront distribution.

Main WAF Features

In terms of security, WAF protects your applications from the following attacks:

  • SQL Injection
  • DDoS attacks
  • Cross Site Scripting

In terms of visibility, WAF gives you the ability to monitor requests and attacks through CloudWatch integration (not covered in this blog). It gives you raw data on location, IP addresses and so on.

How to set up WAF?

Setting up WAF can be done in a few ways: you could either use a CloudFormation template or configure the settings on the WAF page.

Since each organisation is different and requirements change based on applications and websites, the configuration presented in this blog is general practice and recommendation; you will still need to tailor the WAF rules to your own needs.

WAF Conditions

For the rules to function, you need to set up filter conditions for your application or website ACL.

I already have WAF set up in my AWS account; here’s a sample of how conditions look.


If you have no conditions set up yet, you will see something like “There is no IP Match conditions, please create one”.

To create a condition, have a look at the following images:

Here, we’re creating a filter that checks whether the HTTP method contains a threat after HTML decoding.


Once you’ve selected your filters, click on “Add Filter”. The filter will be added to the list of filters, and once you’re done adding all your filters, create your condition.


You follow the same procedure to create your other conditions, for example for SQL injection.

WAF Rules

When you are done with configuring conditions, you can create a rule and attach it to your web ACL. You can attach multiple rules to an ACL.

Creating a rule – here’s where you specify the conditions you created in the previous step.


From the list of rules, select the rule you have created from the drop-down menu, and attach it to the ACL.


In the next steps you will have the option to choose your AWS resource, in this case one of my CloudFront distributions. Review and create your web ACL.


Once you click on Create, go to your CloudFront distribution and check its status; it should show “In Progress”.
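For those who prefer code over the console, the same flow can be driven through the classic WAF API (the global endpoint used with CloudFront). The boto3 sketch below is illustrative only; the names and metric names are placeholders, and it builds one SQL injection condition, one rule and one web ACL.

import boto3

waf = boto3.client('waf')  # global endpoint, used for CloudFront distributions

def token():
    return waf.get_change_token()['ChangeToken']

# Condition: SQL injection patterns in the query string after URL decoding
condition = waf.create_sql_injection_match_set(Name='sql-injection-condition', ChangeToken=token())
sqli_id = condition['SqlInjectionMatchSet']['SqlInjectionMatchSetId']
waf.update_sql_injection_match_set(
    SqlInjectionMatchSetId=sqli_id,
    ChangeToken=token(),
    Updates=[{'Action': 'INSERT',
              'SqlInjectionMatchTuple': {'FieldToMatch': {'Type': 'QUERY_STRING'},
                                         'TextTransformation': 'URL_DECODE'}}])

# Rule referencing the condition
rule = waf.create_rule(Name='block-sqli', MetricName='BlockSqli', ChangeToken=token())
rule_id = rule['Rule']['RuleId']
waf.update_rule(RuleId=rule_id, ChangeToken=token(),
                Updates=[{'Action': 'INSERT',
                          'Predicate': {'Negated': False, 'Type': 'SqlInjectionMatch',
                                        'DataId': sqli_id}}])

# Web ACL that allows by default and blocks requests matching the rule
acl = waf.create_web_acl(Name='my-web-acl', MetricName='MyWebAcl',
                         DefaultAction={'Type': 'ALLOW'}, ChangeToken=token())
waf.update_web_acl(WebACLId=acl['WebACL']['WebACLId'], ChangeToken=token(),
                   Updates=[{'Action': 'INSERT',
                             'ActivatedRule': {'Priority': 1, 'RuleId': rule_id,
                                               'Action': {'Type': 'BLOCK'}}}])
# The resulting web ACL id can then be attached to the CloudFront distribution
# (the WebACLId field in the distribution config).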

WAF Sample

Since there isn’t a single way to create WAF rules, if you’re not sure where to begin, AWS gives you a good starting point: a CloudFormation template that creates sample WAF rules for you.

This WAF sample includes the following (found here):

  • A manual IP rule that contains an empty IP match set that must be updated manually with IP addresses to be blocked.
  • An auto IP rule that contains an empty IP match condition for optionally implementing an automated AWS Lambda function, such as is shown in How to Import IP Address Reputation Lists to Automatically Update AWS WAF IP Blacklists and How to Use AWS WAF to Block IP Addresses That Generate Bad Requests.
  • A SQL injection rule and condition to match SQL injection-like patterns in URI, query string, and body.
  • A cross-site scripting rule and condition to match XSS-like patterns in URI and query string.
  • A size-constraint rule and condition to match requests with URI or query string >= 8192 bytes which may assist in mitigating against buffer overflow type attacks.
  • ByteHeader rules and conditions (split into two sets) to match user agents that include spiders for non–English-speaking countries that are commonly blocked in a robots.txt file, such as sogou, baidu, and etaospider, and tools that you might choose to monitor use of, such as wget and cURL. Note that the WordPress user agent is included because it is used commonly by compromised systems in reflective attacks against non–WordPress sites.
  • ByteUri rules and conditions (split into two sets) to match request strings containing install, update.php, wp-config.php, and internal functions including $password, $user_id, and $session.
  • A whitelist IP condition (empty) is included and added as an exception to the ByteURIRule2 rule as an example of how to block unwanted user agents, unless they match a list of known good IP addresses

Follow this link to create a stack in the Sydney region.
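If you'd rather launch the sample from the CLI, something like this should work; the stack name is arbitrary and the template URL is a placeholder for the one behind the link above.

# Launch the AWS-provided sample WAF stack in the Sydney region
aws cloudformation create-stack \
  --stack-name waf-sample-rules \
  --region ap-southeast-2 \
  --template-url <url-of-the-aws-waf-sample-template>

# Check progress until the sample conditions, rules, and Web ACL have been created
aws cloudformation describe-stacks --stack-name waf-sample-rules \
  --region ap-southeast-2 --query 'Stacks[0].StackStatus'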

I recommend that you review the filters, conditions, and rules created with this sample Web ACL. You can easily edit the conditions to suit your own applications and websites.

Conclusion

In conclusion, there are certain aspects of WAF that need to be considered, such as choosing an appropriate WAF solution, managing its availability, and making sure it can keep up with your applications.

One of the best features of AWS WAF is its integration with CloudFront, which means it can be used to protect websites even if they're not hosted in AWS.

I hope you found this blog informative. Please feel free to add your comments below.

Thanks for reading.

Experiences with the new AWS Application Load Balancer

Originally posted on Andrew’s blog @ cloudconsultancy.info

Summary

Recently I had an opportunity to test drive the AWS Application Load Balancer, as my client had a requirement to make their WebSocket application fault tolerant. The implementation was a complete Windows stack and utilised ADFS 2.0 for SAML authentication; however, this shouldn't affect other implementations.

The AWS Application Load Balancer is a fairly new feature which provides layer 7 load balancing and support for HTTP/2 as well as WebSockets. In this blog post I'll include examples of the configuration I used, as well as some of the troubleshooting steps I needed to work through.

The Application Load Balancer is an AWS resource independent of the Classic ELB, managed via aws elbv2 in the CLI, with a number of different properties.

Benefits of Application Load Balancer include:

  • Content-based routing, i.e. route /store to a different set of instances from /apiv2
  • Support for WebSockets
  • Support for HTTP/2, over HTTPS only (much larger throughput, since it's a single multiplexed stream, which is great for mobile and other high-latency apps)
  • Roughly 10% cheaper than a Classic Load Balancer
  • Cross-zone load balancing is always enabled for an ALB

Some changes that I’ve noticed:

  • The load balancing algorithm used by the Application Load Balancer is currently round robin.
  • Cross-zone load balancing is always enabled for an Application Load Balancer and is disabled by default for a Classic Load Balancer.
  • With an Application Load Balancer, the idle timeout value applies only to front-end connections, not to the LB-to-server connection; this prevents the LB from cycling the back-end connection.
  • The Application Load Balancer is exactly that: it operates at Layer 7, so if you want to do SSL bridging, use a Classic Load Balancer in TCP mode and configure the SSL certificates on your server endpoints.
  • A cookie-expiration-period value of 0 (deferring session expiry to the application) is not supported. I ended up configuring the stickiness.lb_cookie.duration_seconds attribute instead; I'd suggest making this one minute longer than the application session timeout, in my example a value of 1860 (see the CLI sketch below).
  • The X-Forwarded-For header is still supported and should be used if you need to track client IP addresses, which is particularly useful when requests pass through a proxy server.

For more detailed information from AWS see http://docs.aws.amazon.com/elasticloadbalancing/latest/userguide/how-elastic-load-balancing-works.html.
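For reference, here's roughly how the stickiness setting mentioned above can be applied with the aws elbv2 CLI. This is a minimal sketch; the target group ARN is a placeholder.

# Enable load-balancer-generated cookie stickiness and set the duration to 1860 seconds
aws elbv2 modify-target-group-attributes \
  --target-group-arn <target-group-arn> \
  --attributes Key=stickiness.enabled,Value=true \
               Key=stickiness.type,Value=lb_cookie \
               Key=stickiness.lb_cookie.duration_seconds,Value=1860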

Importing SSL Certificate into AWS – Windows

(http://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_server-certs.html )

  1. Convert the existing PKCS#12 (.pfx) key into .pem format for AWS

You’ll need openssl for this, the pfx and the password for the SSL certificate.

I like to use Chocolatey as my Windows package manager (similar to yum or apt-get, but for Windows); it's a saviour for downloading packages and managing dependencies to support automation. Check it out @ https://chocolatey.org/

Once choco is installed I simply execute the following from an elevated command prompt.

“choco install openssl.light”

Thereafter I run the following two commands, which break out the private key and the public certificate (you'll be prompted for the password during each):

openssl pkcs12 -in keyStore.pfx -out SomePrivateKey.key -nocerts -nodes

openssl pkcs12 -in keyStore.pfx -out SomePublic.cert -clcerts -nokeys

NB: I’ve found that sometimes copy and paste doesn’t work when trying to convert keys.

  2. Next, you'll also need to break out the trust chain into one contiguous file, like the following:
-----BEGIN CERTIFICATE-----

Intermediate certificate 2

-----END CERTIFICATE-----

-----BEGIN CERTIFICATE-----

Intermediate certificate 1

-----END CERTIFICATE-----

-----BEGIN CERTIFICATE-----

Optional: Root certificate

-----END CERTIFICATE-----

Save the file for future use, e.g. as thawte_trust_chain.txt.

The example above is for a Thawte trust chain with the following properties:

"thawte Primary Root CA" Thumbprint 91 c6 d6 ee 3e 8a c8 63 84 e5 48 c2 99 29 5c 75 6c 81 7b 81

With intermediate

"thawte SSL CA - G2" Thumbprint 2e a7 1c 36 7d 17 8c 84 3f d2 1d b4 fd b6 30 ba 54 a2 0d c5

Ordinarily you'll only have a root and an intermediate CA, although sometimes there will be a second intermediate CA.

Ensure that your certificates are base 64 encoded when you export them.

  3. Finally, authenticate to the AWS CLI (v1.11.14+ is required for the aws elbv2 commands) by running "aws configure" and supplying your access key, secret key, region and output format, then execute the certificate upload. Note that this brings together several of the elements above: the trust chain, and the public and private keys.
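As a sketch, the upload to IAM using the files from the earlier steps would look something like this (the certificate name is up to you):

# Upload the certificate, private key and trust chain to IAM as a server certificate
aws iam upload-server-certificate \
  --server-certificate-name serviceSSL \
  --certificate-body file://SomePublic.cert \
  --private-key file://SomePrivateKey.key \
  --certificate-chain file://thawte_trust_chain.txt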

If you get an error like the one below:

A client error (MalformedCertificate) occurred when calling the UploadServerCertificate operation: Unable to validate certificate chain. The certificate chain must start with the immediate signing certificate, followed by any intermediaries in order. The index within the chain of the invalid certificate is: 2

Please check the contents of the original root and intermediate certificate files, as they probably still contain the Bag Attributes headers and other metadata from the export, e.g.:

Bag Attributes

localKeyID: 01 00 00 00

friendlyName: serviceSSL

subject=/C=AU/ST=New South Wales/L=Sydney/O=Some Company/OU=IT/CN=service.example.com

issuer=/C=US/O=thawte, Inc./CN=thawte SSL CA - G2

Bag Attributes

friendlyName: thawte

subject=/C=US/O=thawte, Inc./OU=Certification Services Division/OU=(c) 2006 thawte, Inc. - For authorized use only/CN=thawte Primary Root CA

issuer=/C=US/O=thawte, Inc./OU=Certification Services Division/OU=(c) 2006 thawte, Inc. - For authorized use only/CN=thawte Primary Root CA

AWS Application LB Configuration

Follow this gist, which has comments embedded based on some gotchas I hit during configuration.
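As a rough sketch of the elbv2 calls involved (all names, IDs and ARNs below are placeholders, not taken from the gist):

# Create the ALB across at least two subnets in different availability zones
aws elbv2 create-load-balancer --name service-alb \
  --subnets subnet-aaaa1111 subnet-bbbb2222 --security-groups sg-cccc3333

# Create a target group for the web servers and register the instances in it
aws elbv2 create-target-group --name service-targets \
  --protocol HTTPS --port 443 --vpc-id vpc-dddd4444
aws elbv2 register-targets --target-group-arn <target-group-arn> \
  --targets Id=i-0123456789abcdef0 Id=i-0fedcba9876543210

# Add an HTTPS listener that uses the IAM server certificate uploaded earlier
aws elbv2 create-listener --load-balancer-arn <load-balancer-arn> \
  --protocol HTTPS --port 443 \
  --certificates CertificateArn=<iam-server-certificate-arn> \
  --ssl-policy ELBSecurityPolicy-2016-08 \
  --default-actions Type=forward,TargetGroupArn=<target-group-arn>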

You should now be good to go; the load balancer takes a little while to warm up, but will be available across multiple availability zones.

If you have issues connecting to the ALB, validate connectivity directly to the server using curl. Again Chocolatey comes in handy: "choco install curl".

Also double-check the security group attached to the ALB and confirm your NACLs.
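A quick way to run those checks, assuming the instances listen on 443 and expect the public host header:

# Hit an instance directly, ignoring the certificate-name mismatch, with the expected Host header
curl -vk https://<instance-private-ip>/ -H "Host: service.example.com"

# Then test the ALB itself once the targets report healthy
curl -vk https://<alb-dns-name>/ -H "Host: service.example.com"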

WebServer configuration

You'll need to import the SSL certificate into the local computer certificate store. Some of the third-party issuing CAs (Ensign, Thawte, etc.) may not have their intermediate CA in the computer's trusted root CA store, especially if the server was built in a network isolated from the internet, so after installing the SSL certificate on the server make sure the trust chain is correct.

You won’t need to update local hosts file on the servers to point to the load balanced address.

Implementation using CNAMEs

In large enterprises where I've worked there are often long lead times for fairly simple DNS changes, which defeats some of the agility provided by cloud computing. A pattern I've often seen adopted is to use multiple CNAMEs to work around those lead times. Generally you'll have a subdomain somewhere that the ops team has more control over, or shorter lead times for. Within the target domain (example.com), create a CNAME pointing to an address within the ops-managed domain (aws.corp.internal), and have a CNAME created within that zone pointing to the ALB address, i.e.

Service.example.com -> service.aws.corp.internal -> elbarn.region.elb.amazonaws.com

With this approach I can update service.aws.corp.internal to point at a new service that I've built behind a new load balancer, avoiding the enterprise change lead times associated with a change in .example.com.
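If the ops-managed zone happens to live in Route 53, that swap is a single UPSERT; the zone ID below is a placeholder and the record names just follow the pattern above.

# Repoint the ops-managed CNAME at the new ALB without touching example.com
aws route53 change-resource-record-sets \
  --hosted-zone-id <aws-corp-internal-zone-id> \
  --change-batch '{
    "Comment": "Point the service at the new ALB",
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "service.aws.corp.internal",
        "Type": "CNAME",
        "TTL": 60,
        "ResourceRecords": [{ "Value": "elbarn.region.elb.amazonaws.com" }]
      }
    }]
  }'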