Supercharge your CloudFormation templates with Jinja2 Templating Engine

If you are working in an AWS public cloud environment, chances are that you have authored a number of CloudFormation templates over the years to define your infrastructure as code. As powerful as this tool is, it has a glaring shortcoming: the templates are fairly static, with no inline template expansion feature (think GCP Cloud Deployment Manager). Due to this limitation, many teams end up copy-pasting similar templates to cater for minor differences such as environment (dev, test, prod etc.) and resource names (S3 bucket names etc.)

Enter Jinja2, a modern and powerful templating language for Python. In this blog post I will demonstrate a way to use Jinja2 to enable dynamic expressions and perform variable substitution in your CloudFormation templates.

First, let's get the prerequisites out of the way. To use Jinja2, we need to install Python, pip and of course Jinja2.

Install Python

sudo yum install python

Install pip

curl "https://bootstrap.pypa.io/get-pip.py" -o "get-pip.py"
sudo python get-pip.py

Install Jinja2

pip install Jinja2

To invoke Jinja2, we will use a simple Python wrapper script.

vi j2.py

Copy the following contents to the file j2.py

import os
import sys
import jinja2

# Read a Jinja2 template from stdin, render it with the current environment
# variables exposed as 'env', and write the result to stdout.
sys.stdout.write(jinja2.Template(sys.stdin.read()).render(env=os.environ))

Save and exit the editor

Now let’s create a simple CloudFormation template and transform it through Jinja2:


vi template1.yaml

Copy the following contents to the file template1.yaml


---
AWSTemplateFormatVersion: '2010-09-09'
Description: Simple S3 bucket for {{ env['ENVIRONMENT_NAME'] }}
Resources:
  S3Bucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: InstallFiles-{{ env['AWS_ACCOUNT_NUMBER'] }}

As you can see, it's the most basic CloudFormation template, with one exception: we are using Jinja2 variables to substitute values from environment variables. Now let's run this template through Jinja2.

Let's first export the environment variables:


export ENVIRONMENT_NAME=Development

export AWS_ACCOUNT_NUMBER=1234567890

Run the following command:


cat template1.yaml | python j2.py

The result of this command will be as follows:


---
AWSTemplateFormatVersion: '2010-09-09'
Description: Simple S3 bucket for Development
Resources:
  S3Bucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: InstallFiles-1234567890

As you can see, Jinja2 has expanded the variables in the template. This provides us with a powerful mechanism to insert environment variables into our CloudFormation templates.

Let's take another example: what if we wanted to create multiple S3 buckets in an automated manner? Generally in such a case we would have to copy-paste the S3 resource block. With Jinja2, this becomes a matter of adding a simple "for" loop:


vi template2.yaml

Copy the following contents to the file template2.yaml


---
AWSTemplateFormatVersion: '2010-09-09'
Description: Simple S3 bucket for {{ env['ENVIRONMENT_NAME'] }}
Resources:
{% for i in range(1,3) %}
  S3Bucket{{ i }}:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: InstallFiles-{{ env['AWS_ACCOUNT_NUMBER'] }}-{{ i }}
{% endfor %}

Run the following command:


cat template2.yaml | python j2.py

The result of this command will be as follows:


---
AWSTemplateFormatVersion: '2010-09-09'
Description: Simple S3 bucket for Development
Resources:
  S3Bucket1:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: InstallFiles-1234567890-1
  S3Bucket2:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: InstallFiles-1234567890-2

As you can see, the resulting template has two S3 resource blocks. The output of the command can be redirected to another template file to be used later in stack creation.
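As a minimal sketch (assuming boto3 is installed and AWS credentials are configured), the rendering and the stack creation can also be wrapped in one short script; the stack name below is just an example:

import os
import boto3
import jinja2

# Render the Jinja2 template using environment variables, same as j2.py does
with open('template2.yaml') as f:
    rendered = jinja2.Template(f.read()).render(env=os.environ)

# Create the stack directly from the rendered template body
cloudformation = boto3.client('cloudformation')
cloudformation.create_stack(
    StackName='jinja2-demo-stack',  # example stack name
    TemplateBody=rendered,
)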

I am sure you will appreciate the possibilities Jinja2 brings to enhance your CloudFormation templates. Do note that I have barely scratched the surface of this topic, and I highly recommend having a look at the Template Designer Documentation found at http://jinja.pocoo.org/docs/2.10/templates/ to explore more possibilities. If you are using Ansible, note that Ansible uses Jinja2 templating to enable dynamic expressions and access to variables. In that case you can get rid of the Python wrapper script mentioned in this article and use Ansible directly for template expansion.

Replacing the service desk with bots using Amazon Lex and Amazon Connect (Part 2)

Welcome back! Hopefully you had the chance to follow along in part 1 where we started creating our Lex chatbot. In part 2, we attempt to make the conversation more human-like and begin integrating data validation on our slots to ensure we’re getting the correct input.

Creating the Lambda initialisation and validation function

As data validation requires compute, we’ll need to start by creating an AWS Lambda function. Head over to the AWS console, then navigate to the AWS Lambda page. Once you’re there, select Create Function and choose to Author from Scratch then specify the following:

  • Name: ResetPWCheck
  • Runtime: Python 2.7 (it’s really a matter of preference)
  • Role: I use an existing out-of-the-box role, “Lambda_basic_execution”, as I only need access to CloudWatch logs for debugging.

Once you’ve populated all the fields, go ahead and select Create Function. The script we’ll be using is provided (further down) in this blog, however before we go through the script in detail, there are two items worth mentioning.

Input Events and Response Formats

It’s well worth familiarising yourself with the page on Lambda Function Input Event and Response Formats in the Lex Developer guide. Every time input is provided to Lex, it invokes the Lambda initialisation and validation function. For example, when I tell my chatbot “I need to reset my password”, the Lambda function is invoked and the following event is passed:

Amazon Lex expects a response from the Lambda function in JSON format that provides it with the next dialog action.

Persisting Variables with Session Attributes

There are many ways to determine within your Lambda function where you’re up to in your chat dialog, however my preferred method is to pass state information within the SessionAttributes object of the input event and response as a key/value pair. The SessionAttributes can persist between invocations of the Lambda function (every time input is provided to the chatbot), however you must remember to collect and pass the attributes between input and responses to ensure they persist.

Input Validation Code

With that out of the way, let’s begin looking at the code. The script below is what I’ve used, which you can simply copy and paste, assuming you’re using the same slot and intent names in your Lex bot that were used in Part 1.

Let’s break it down.

When the lambda function is first invoked, we check to see if any state is set in the sessionAttributes. If not, we can assume this is the first time the lambda function is invoked and as a result, provide a welcoming response while requesting the User’s ID. To ensure the user isn’t welcomed again, we set a session state so the Lambda function knows to move to User ID validation when next invoked. This is done by setting the “Completed” : “None” key/value pair in the response SessionAttributes.

Next, we check the User ID. You’ll notice the chkUserId function checks for two things: that the slot is populated and, if it is, the length of the field. Because the slot type is AMAZON.NUMBER, any non-numeric characters that are entered will be rejected by the slot. If this occurs, the slot will be left empty, hence this is something we’re looking out for. We also want to ensure the User ID is 6 digits, otherwise it is considered invalid. If the input is correct, we set the session state key/value pair to indicate User ID validation is complete and allow the dialog to continue; otherwise we ask the user to re-enter their User ID.
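As a minimal sketch of the User ID check described above (not the exact script used, but following the standard Lex dialog action response format and the slot and intent names from Part 1):

def chk_user_id(user_id):
    """Return True if the UserID slot is populated and is exactly 6 digits."""
    return user_id is not None and len(str(user_id)) == 6

def elicit_slot(session_attributes, intent_name, slots, slot_to_elicit, message):
    """Build a Lex response asking the user to (re-)supply a slot value."""
    return {
        'sessionAttributes': session_attributes,
        'dialogAction': {
            'type': 'ElicitSlot',
            'intentName': intent_name,
            'slots': slots,
            'slotToElicit': slot_to_elicit,
            'message': {'contentType': 'PlainText', 'content': message},
        },
    }

def delegate(session_attributes, slots):
    """Hand control back to Lex to continue the configured dialog."""
    return {
        'sessionAttributes': session_attributes,
        'dialogAction': {'type': 'Delegate', 'slots': slots},
    }

def lambda_handler(event, context):
    session_attributes = event.get('sessionAttributes') or {}
    slots = event['currentIntent']['slots']

    if not chk_user_id(slots.get('UserID')):
        return elicit_slot(session_attributes, 'ResetPW', slots, 'UserID',
                           'That user ID does not look right. '
                           'Please enter your 6 digit employee number.')

    # Track dialog state between invocations via session attributes
    session_attributes['Completed'] = 'UserID'
    return delegate(session_attributes, slots)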

Next, we check the Date of Birth. Because the slot type is strict regarding input, we don’t do much validation here. An utterance for this slot type generally maps to a complete date: YYYY-MM-DD. For validation purposes, we’re just looking for an empty slot. Like the User ID check, we set the session state and allow the dialog to continue if all looks good.

Finally, we check the last slot which is the Month Started. Assuming the input for the month started is correct, we then confirm the intent by reading all the slot values back to the user and asking if it’s correct. You’ll notice here that there’s a bit of logic to determine if the user is using voice or text to interact with Lex. If voice is used, we use Speech Synthesis Markup Language (SSML) to ensure the UserID value is read as digits, rather than as the full number.
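As a rough illustration of that voice/text logic (again, not the exact code), the confirmation message could be built from the outputDialogMode field of the Lex input event, with SSML used to read the User ID digit by digit:

def confirmation_message(user_id, output_dialog_mode):
    """Read UserID as digits over voice, or as plain text in a chat window."""
    if output_dialog_mode == 'Voice':
        content = ('<speak>Just to confirm, your user ID is '
                   '<say-as interpret-as="digits">{}</say-as>. '
                   'Is this correct?</speak>').format(user_id)
        return {'contentType': 'SSML', 'content': content}
    return {'contentType': 'PlainText',
            'content': 'Just to confirm, your user ID is {}. '
                       'Is this correct?'.format(user_id)}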

If the user is happy with the slot values, the validation completes and Lex then moves to the next Lambda function to fulfil the intent (next blog). If the user isn’t happy with the slot values, the lambda function exits telling the user to call back and try again.

Okay, now that our Lambda function is finished, we need to enable it as a code hook for initialisation and validation. Head over to your Lex bot, select the “ResetPW” intent, then tick the box under Lambda initialisation and validation and select your Lambda function. A prompt will be given to provide permissions to allow your Lex bot to invoke the lambda function. Select OK.

Let’s hit Build on the chatbot, and test it out.

So, we’ve managed to make the conversation a bit more human-like and we can now detect invalid input. If you use the microphone to chat with your bot, you’ll notice the User ID value is read as digits. That’s it for this blog. In the next blog, we’ll integrate Active Directory and actually get a password reset and sent via SNS to a mobile phone.

Replacing the service desk with bots using Amazon Lex and Amazon Connect (Part 1)

“What! Is this guy for real? Does he really think he can replace the front line of IT with pre-canned conversations?” I must admit, it’s a bold statement. The IT Service Desk has been around for years and has been the foot in the door for most of us in the IT industry. It’s the face of IT operations and plays an important role in ensuring an organisation’s staff can perform to the best of their abilities. But what if we could take some of the repetitive tasks the service desk performs and automate them? Not only would we be saving on head count costs, we would be able to ensure correct policies and procedures are followed to uphold security and compliance. The aim of this blog is to provide a working example of the automation of one service desk scenario to show how easy and accessible the technology is, and how it can be applied to various use cases.
To make it easier to follow along, I’ve broken this blog up into a number of parts. Part 1 will focus on the high-level architecture for the scenario and begin creating the Lex chatbot.

Scenario

Arguably, the most common service desk request is the password reset. While this is a pretty simple issue for the service desk to resolve, many service desk staff seem to skip over, or not realise, the importance of user verification. Both the simple resolution and the strict verification requirement make this a prime scenario to automate.

Architecture

So what does the architecture look like? The diagram below depicts the expected process flow. Let’s step through each item in the diagram.


Amazon Connect

The process begins when the user calls the service desk and requests to have their password reset. In our architecture, the service desk uses Amazon Connect which is a cloud based customer contact centre service, allowing you to create contact flows, manage agents, and track performance metrics. We’re also able to plug in an Amazon Lex chatbot to handle user requests and offload the call to a human if the chatbot is unable to understand the user’s intent.

Amazon Lex

After the user has stated their request to change their password, we need to begin the user verification process. Their intent is recognised by our Amazon Lex chatbot, which initiates the dialog for the user verification process to ensure they are who they really say they are.

AWS Lambda

After the user has provided verification information, AWS Lambda, which is a compute on demand service, is used to validate the user’s input and verify it against internal records. To do this, Lambda interrogates Active Directory to validate the user.

Amazon SNS

Once the user has been validated, their password is reset to a random string in Active Directory and the password is messaged to the user’s phone using Amazon’s Simple Notification Service. This completes the interaction.

Building our Chatbot

Before we get into the details, it’s worth mentioning that the aim of this blog is to convey the technology capability. There are many ways of enhancing the solution or improving validation of user input that I’ve skipped over, so while this isn’t a finished production-ready product, it’s certainly a good foundation to begin building an automated call centre.

To begin, let’s start with building our chatbot in Amazon Lex. In the Amazon Console, navigate to Amazon Lex. You’ll notice it’s only available in Ireland and US East. As Amazon Connect and my Active Directory environment are also in US East, that’s the region I’ve chosen.

Go ahead and select Create Bot, then choose to create your own Custom Bot. I’ve named mine “UserAdministration”. Choose an Output voice and set the session timeout to 5 minutes. An IAM Role will automatically be created on your behalf to allow your bot to use Amazon Polly for speech. For COPPA, select No, then select Create.

Once the bot has been created, we need to identify the user action expected to be performed, which is known as an intent. A bot can have multiple intents, but for our scenario we’re only creating one, which is the password reset intent. Go ahead and select Create Intent, then in the Add Intent window, select Create new intent. My intent name is “ResetPW”. Select Add, which should add the intent to your bot. We now need to specify some expected sample utterances, which are phrases the user can use to trigger the Reset Password intent. There are quite a few that could be listed here, but I’m going to limit mine to the following:

  • I forgot my password
  • I need to reset my password
  • Can you please reset my password

The next section is the configuration for the Lambda validation function. Let’s skip past this for the time being and move onto the slots. Slots are used to collect information from the user. In our case, we need to collect verification information to ensure the user is who they say they are. The verification information collected is going to vary between environments. I’m looking to collect the following to verify against Active Directory:

  • User ID – In my case, this is a 6-digit employee number that is also the sAMAccountName in Active Directory
  • User’s birthday – This is a custom attribute in my Active Directory
  • Month started – This is a custom attribute in my Active Directory

In addition to this, it’s also worth collecting and verifying the user’s mobile number. This can be done by passing the caller ID information from Amazon Connect, however we’ll skip this, as the bulk of our testing will be text chat and we need to ensure we have a consistent experience.

To define a slot, we need to specify three items:

  • Name of the slot – Think of this as the variable name.
  • Slot type – The data type expected. This is used to train the machine learning model to recognise the value for the slot.
  • Prompt – How the user is prompted to provide the value sought.

Many slot types are provided by Amazon, two of which have been used in this scenario. For “MonthStarted”, I’ve decided to create my own custom slot type, as the built-in “AMAZON.Month” slot type wasn’t strictly enforcing recognisable months. To create your own slot type, press the plus symbol on the left-hand side of the page next to Slot Types, then provide a name and description for your slot type. Select Restrict to Slot values and Synonyms, then enter each month and its abbreviation. Once completed, click Add slot to intent.
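If you’d rather script the slot type than click through the console, a rough equivalent using the Lex Model Building API via boto3 might look like the following (the month values and synonyms are assumptions based on the walkthrough):

import boto3

lex_models = boto3.client('lex-models')

# Month names with common abbreviations as synonyms (assumed values)
months = {
    'January': ['Jan'], 'February': ['Feb'], 'March': ['Mar'],
    'April': ['Apr'], 'May': [], 'June': ['Jun'],
    'July': ['Jul'], 'August': ['Aug'], 'September': ['Sep', 'Sept'],
    'October': ['Oct'], 'November': ['Nov'], 'December': ['Dec'],
}

lex_models.put_slot_type(
    name='MonthStarted',
    description='Months of the year, restricted to slot values and synonyms',
    enumerationValues=[
        {'value': month, 'synonyms': synonyms} if synonyms else {'value': month}
        for month, synonyms in months.items()
    ],
    valueSelectionStrategy='TOP_RESOLUTION',  # the scripted equivalent of "Restrict to Slot values and Synonyms"
)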

Once the custom slot type has been configured, it’s time to set up the slots for the intent. The screenshot below shows the slots that have been configured and the expected order to prompt the user.

The last step (for this blog post) is to have the bot verify that the information collected is correct. Tick the Confirmation Prompt box and in the Confirm text box provided, enter the following:

Just to confirm, your user ID is {UserID}, your Date of Birth is {DOB} and the month you started is {MonthStarted}. Is this correct?

For the Cancel text box, enter the following:

Sorry about that. Please call back and try again.

Be sure to leave the fulfillment to Return parameters to client and hit Save Intent.

Great! We’ve built the bare basics of our bot. It doesn’t really do much yet, but let’s take it for a spin anyway and get a feel for what to expect. In the top right-hand corner, there’s a build button. Go ahead and click the button. This takes some time, as building a bot triggers machine learning and creates the models for your bot. Once completed, the bot should be available to text or voice chat on the right side of the page. As you move through the prompts, you can see at the bottom the slots getting populated with the expected format. i.e. 14th April 1983 is converted to 1983-04-14.

So at the moment, our bot doesn’t do much but collect the information we need. Admittedly, the conversation is a bit robotic as well. In the next few blogs, we’ll give the bot a bit more of a personality, we’ll do some input validation, and we’ll begin to integrate with Active Directory. Once we’ve got our bot working as expected, we’ll bolt on Amazon Connect to allow users to dial in and converse with our new bot.

Re-execute the UserData script in an AWS Windows Instance

First published at https://nivleshc.wordpress.com

Bootstrapping is an awesome way of customising your instances in AWS (similar capability exists in Azure).

To enable bootstrapping, while configuring the launch instance, in Step 3: Configure Instance Details scroll down to the bottom and then expand Advanced Details.

You will notice a User data text box. This is where you can provide your bootstrap script. The script will be run when your instance is first launched.

AWS_BootstrapScript

I went ahead and entered my script in the text box and proceeded to complete my instance configuration. Once my instance was running, I initiated a Remote Desktop connection to it to confirm that my script had run. Unfortunately, I couldn’t see any customisations (which meant my script didn’t run).

Thinking that the instance had not been able to access the user data, I opened up Internet Explorer and then browsed to the following URL (this is an internal URL that can be used to access the user-data):

http://169.254.169.254/latest/user-data/

I was able to successfully access the user-data, which meant that there were no issues with that. However, when checking the content, I noticed a typo! Aha, that was the reason why my customisations didn’t happen.
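As a small aside, the same user-data check can be scripted from inside the instance; a minimal Python sketch against the metadata URL above:

import urllib.request

# Instance metadata endpoint; only reachable from within the EC2 instance itself
USER_DATA_URL = 'http://169.254.169.254/latest/user-data/'

with urllib.request.urlopen(USER_DATA_URL, timeout=2) as response:
    print(response.read().decode('utf-8'))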

Unfortunately, according to AWS, user-data is only executed during launch (for those that would like to read more, here is the official AWS documentation). To get the fixed bootstrap script to run, I would have to terminate my instance and launch a new one with the corrected script (I tried rebooting my Windows instance after correcting the typo, however it didn’t run).

I wasn’t very happy about terminating my current instance and launching a new one since, for those that might not be aware, AWS EC2 compute charges are rounded up to the next hour. This means that if I terminated my current instance and launched a new one, I would be charged for 2 x 1 hour sessions instead of just 1 x 1 hour!

So I set about trying to find another solution. And guess what, I did find it 🙂

Reading through the volumes of documentation on AWS, I found that when Windows Instances are provisioned, the service that does the customisations using user-data is called EC2Config. This service runs the initial startup tasks when the instance is first started and then disables them. HOWEVER, there is a way to re-enable the startup tasks later on 🙂 Here is the document that gives more information on EC2Config.

The Amazon Windows AMIs include a utility called EC2ConfigService Settings. This allows you to configure EC2Config to execute the user-data on next service startup. The utility can be found under All Programs (or you can search for it).

AWS_EC2ConfigSettings_AllApps

AWS_EC2ConfigSettings_Search

Once open, under General you will see the following option:

Enable UserData execution for next service start (automatically enabled at Sysprep) eg. <script></script> or <powershell></powershell>

AWS_EC2ConfigSettings

Tick this option and then press OK. Then restart your Windows Instance.

After your Windows Instance restarts, EC2Config will execute the userData (bootstrap script) and then automatically remove the tick from the above option so that the userData is not executed on subsequent restarts (or service starts).

There you go. A simple way to re-run your bootstrap scripts on an AWS Windows Instance without having to terminate the current instance and launch a new one.

There are other options available in the EC2ConfigService Settings that you can explore as well 🙂

A Closer Look at Amazon Chime

In news this quarter, AWS have released a web conferencing cloud service to add to their existing ‘Business Productivity‘ services, which already include Amazon WorkDocs and Amazon WorkMail. So my thought was to help you gauge where this sits in relation to Skype for Business. I don’t want to turn this into a Microsoft versus Amazon review, but I do want you to understand the product names that ‘somewhat’ align with each other.

Exchange = WorkMail

SharePoint/OneDrive for Business = WorkDocs

Skype for Business = Chime

The Microsoft products are reasonably well known in the world, so I’ll give you a quick one-liner about the Amazon products:

WorkMail “Hosted Email”

WorkDocs “Hosted files accessible via web, PC, mobile devices with editing and sharing capability”

So what is Chime?

Chime isn’t exactly a one-to-one feature set for Skype for Business so be wary of articles conveying this sentiment as they haven’t really done their homework. Chime can be called either a web meeting, online meeting, or online video conferencing platform. Unlike Skype for Business, Chime is not a PBX replacement. So what does it offer?

  • Presence
  • 1-1 and group IM
  • Persistent chat / chat rooms
  • Transfer files
  • Group HD video calling
  • Schedule meetings using Outlook or Google calendar
  • Join meeting from desktop or browser
  • Join meeting anonymously
  • Meeting controls for presenter
  • Desktop sharing
  • Remote desktop control
  • Record audio and video of meetings
  • Allow participants to dial-in with no Chime client (PSTN conferencing bridge)
  • Enable legacy H.323 endpoints to join meetings (H.323 bridge)
  • Customisable / personalised meeting space URL

The cloud-hosted products that I see as similar to Chime are WebEx, Google Hangouts, GoToMeeting and ReadyTalk, to name just a few. As here at Kloud we are Microsoft Gold Partners and have a Unified Communications team that delivers Skype for Business solutions, and not the other products I have previously mentioned, I will tell you a few things that differentiate SFB from Chime.

  • Direct Inward Dial (DID) user assignment
  • Call forwarding and voicemail
  • Automated Call Distribution (ACD)
  • Outbound PSTN dialling and route based on policy assignment
  • Interactive Voice Response (IVR)
  • Hunt Groups / Call Pickup Groups
  • Shared Line Appearance
  • SIP IP-Phone compatibility
  • Attendant / Reception call pickup
  • On-premises, hybrid and hosted deployment models

Basically, Skype for Business will do all the things that replace a PBX solution. Now this isn’t a cheap shot at Amazon, because that isn’t where they are positioning their product. What I hope to have done is clarify any misconceptions about where the product sits in the market and how it relates to features in a well-known product like Microsoft Skype for Business.

For the price that Amazon are offering Chime at in the online meeting market, it is very competitive against some other hosted products. Their ‘Rolls-Royce’ plan is purchased for USD $15 per user per month. If you’re not invested in the Office 365 E3/E5 licence ecosystem and you need an online meeting platform at a low cost, then Chime might be right for you. Amazon offer a 30-day free trial, which is a great way to test it out.

https://chime.aws


Dynamically rename an AWS Windows host deployed via a sysprepped AMI

One of my customers presented me with a unique problem last week. They needed to rename a Windows Server 2016 host deployed using a custom AMI without rebooting during the bootstrap process. This lack of a reboot rules out the simple option of using the PowerShell Rename-Computer Cmdlet. While there are a number of methods to do this, one option we came up with is dynamically updating the sysprep unattended answer file using a PowerShell script prior to the unattended install running during first boot of a sysprepped instance.

To begin, we looked at the Windows Server 2016 sysprep process on an EC2 instance to get an understanding of the process required. It’s important to note that this process is slightly different to Server 2012 on AWS, as EC2Launch has replaced EC2Config. The high level process we need to perform is the following:

  1. Deploy a Windows Server 2016 AMI
  2. Connect to the Windows instance and customise it as required
  3. Create a startup PowerShell script to set the hostname that will be invoked on boot before the unattended installation begins
  4. Run InitializeInstance.ps1 -Schedule to register a scheduled task that initializes the instance on next boot
  5. Modify SysprepInstance.ps1 to replace the cmdline registry key after sysprep is complete and prior to shutdown
  6. Run SysprepInstance.ps1 to sysprep the machine.

Steps 1 and 2 are fairly self-explanatory, so let’s start by taking a look at step 3.

After a host is sysprepped, the value for the cmdline registry key located in HKLM:\System\Setup is populated with the windeploy.exe path, indicating that it should be invoked after reboot to initiate the unattended installation. Our aim is to ensure that the answer file (unattend.xml) is modified to include the computer name we want the host to be named prior to windeploy.exe executing. To do this, I’ve created the following PowerShell script named startup.ps1 and placed it in C:\Scripts to ensure my hostname is based on the internal IP address of my instance.

Once this script is in place, we can move to step 4, where we schedule the InitializeInstance.ps1 script to run on first boot. This can be done by running InitializeInstance.ps1 -Schedule located in C:\ProgramData\Amazon\EC2-Windows\Launch\Scripts

Step 5 requires a modification of the SysprepInstance.ps1 file to ensure the shutdown after Sysprep is delayed to allow modification of the cmdline registry entry mentioned in step 3. The aim here is to replace the windeploy.exe path with our startup script, then shutdown the host. The modifications to the PowerShell script to accommodate this are made after the “#Finally, perform Sysprep” comment.

Finally, run the SysprepInstance.ps1 script.

Once the host changes to the stopped state, you can now create an AMI and use this as your image.

Experiences with the new AWS Application Load Balancer

Originally posted on Andrew’s blog @ cloudconsultancy.info

Summary

Recently I had an opportunity to test drive the AWS Application Load Balancer, as my client had a requirement for making their WebSocket application fault tolerant. The implementation was a complete Windows stack and utilised ADFS 2.0 for SAML authentication, however this should not affect other implementations.

The AWS Application Load Balancer is a fairly new feature which provides layer 7 load balancing and support for HTTP/2 as well as WebSockets. In this blog post I will include examples of the configuration that I used, as well as some of the troubleshooting steps I needed to resolve issues.

The Application Load Balancer is an independent AWS resource from the Classic ELB and is defined as aws elbv2, with a number of different properties.

Benefits of Application Load Balancer include:

  • Content-based routing, i.e. route /store to a different set of instances from /apiv2
  • Support for WebSockets
  • Support for HTTP/2 over HTTPS only (much larger throughput as it’s a single multiplexed stream, meaning it’s great for mobile and other high-latency apps)
  • Roughly 10% cheaper than the Classic Load Balancer
  • Cross-zone load balancing is always enabled for an ALB

Some changes that I’ve noticed:

  • The load balancing algorithm used by the Application Load Balancer is currently round robin.
  • Cross-zone load balancing is always enabled for an Application Load Balancer and is disabled by default for a Classic Load Balancer.
  • With an Application Load Balancer, the idle timeout value applies only to front-end connections and not the LB-to-server connection, which prevents the LB cycling the connection.
  • The Application Load Balancer is exactly that and performs at layer 7, so if you want to perform SSL bridging use a Classic Load Balancer with TCP and configure SSL certs on your server endpoint.
  • A cookie-expiration-period value of 0 is not supported to defer session timeout to the application. I ended up having to configure the stickiness.lb_cookie.duration_seconds value. I’d suggest making this 1 minute longer than the application session timeout, in my example a value of 1860 (see the sketch after this list).
  • The X-Forwarded-For header is still supported and should be utilised if you need to track client IP addresses, particularly useful if going through a proxy server.
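As referenced in the list above, here is a hedged sketch of setting that stickiness duration with boto3 (the target group ARN is a placeholder):

import boto3

elbv2 = boto3.client('elbv2')

# Enable load-balancer-generated cookie stickiness and set the duration to 1860
# seconds (one minute longer than an assumed 30 minute application session timeout).
elbv2.modify_target_group_attributes(
    TargetGroupArn='arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/example/abc123',
    Attributes=[
        {'Key': 'stickiness.enabled', 'Value': 'true'},
        {'Key': 'stickiness.type', 'Value': 'lb_cookie'},
        {'Key': 'stickiness.lb_cookie.duration_seconds', 'Value': '1860'},
    ],
)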

For more detailed information from AWS see http://docs.aws.amazon.com/elasticloadbalancing/latest/userguide/how-elastic-load-balancing-works.html.

Importing SSL Certificate into AWS – Windows

(http://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_server-certs.html )

  1. Convert the existing PKCS#12 (.pfx) file into .pem format for AWS

You’ll need openssl for this, the pfx and the password for the SSL certificate.

I like to use chocolatey as my Windows package manager, similar to yum or apt-get for Windows, which is a saviour for downloading package and managing dependencies in order to support automation, but enough of that, check it out @ https://chocolatey.org/

Once choco is installed I simply execute the following from an elevated command prompt.

“choco install openssl.light”

Thereafter I run the following two commands, which break out the private and public keys (during which you’ll be prompted for the password):

openssl pkcs12 -in keyStore.pfx -out SomePrivateKey.key --nodes --nocerts

openssl pkcs12 -in keyStore.pfx -out SomePublic.cert --nodes --nokeys

NB: I’ve found that sometimes copy and paste doesn’t work when trying to convert keys.

  2. Next you’ll need to also break out the trust chain into one contiguous file, like the following.
-----BEGIN CERTIFICATE-----

Intermediate certificate 2

-----END CERTIFICATE-----

-----BEGIN CERTIFICATE-----

Intermediate certificate 1

-----END CERTIFICATE-----

-----BEGIN CERTIFICATE-----

Optional: Root certificate

-----END CERTIFICATE-----

Save the file for future use, e.g. thawte_trust_chain.txt

The example attached above is for a Thawte trust chain with the following properties:

“thawte Primary Root CA” Thumbprint 91 c6 d6 ee 3e 8a c8 63 84 e5 48 c2 99 29 5c 75 6c 81 7b 81

With the intermediate:

“thawte SSL CA - G2” Thumbprint 2e a7 1c 36 7d 17 8c 84 3f d2 1d b4 fd b6 30 ba 54 a2 0d c5

Ordinarily you’ll only have a root and intermediate CA, although sometimes there will be a second intermediate CA.

Ensure that your certificates are Base64 encoded when you export them.

  3. Finally, execute the upload after authenticating to the AWS CLI (v1.11.14+ to support the aws elbv2 functions): run “aws configure”, supplying your access and secret keys and configuring the region and output format. Please note that the upload includes some of the above elements, including the trust chain and the public and private keys.
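A hedged boto3 equivalent of the certificate upload, reusing the file names from the examples above (the certificate name is an assumption), might look like this:

import boto3

iam = boto3.client('iam')

def read(path):
    with open(path) as f:
        return f.read()

# Upload the server certificate so it can be selected on the ALB HTTPS listener
iam.upload_server_certificate(
    ServerCertificateName='serviceSSL',              # assumed certificate name
    CertificateBody=read('SomePublic.cert'),         # public certificate (PEM)
    PrivateKey=read('SomePrivateKey.key'),           # private key (PEM)
    CertificateChain=read('thawte_trust_chain.txt'), # intermediates, immediate signer first
)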

If you get the error as below

A client error (MalformedCertificate) occurred when calling the UploadServerCertificate operation: Unable to validate certificate chain. The certificate chain must start with the immediate signing certificate, followed by any intermediaries in order. The index within the chain of the invalid certificate is: 2”

Please check the contents of the original root and intermediate keys, as they probably still have the Bag Attributes headers and possibly some intermediate metadata, i.e.

Bag Attributes

localKeyID: 01 00 00 00

friendlyName: serviceSSL

subject=/C=AU/ST=New South Wales/L=Sydney/O=Some Company/OU=IT/CN=service.example.com

issuer=/C=US/O=thawte, Inc./CN=thawte SSL CA - G2

Bag Attributes

friendlyName: thawte

subject=/C=US/O=thawte, Inc./OU=Certification Services Division/OU=(c) 2006 thawte, Inc. - For authorized use only/CN=thawte Primary Root CA

issuer=/C=US/O=thawte, Inc./OU=Certification Services Division/OU=(c) 2006 thawte, Inc. - For authorized use only/CN=thawte Primary Root CA

AWS Application LB Configuration

Follow this gist with comments embedded. Comments provided based on some gotchas during configuration.

You should now be good to go. The load balancer takes a little while to warm up, however it will be available within multiple availability zones.

If you have issues connecting to the ALB, validate connectivity directly to the server using curl. Again, chocolatey comes in handy: “choco install curl”.

Also double-check your security group registered against the ALB and confirm the NACLs.

WebServer configuration

You’ll need to import the SSL certificate into the local computer certificate store. Some of the third-party issuing CAs (Ensign, Thawte, etc.) may not have the intermediate CA within the computer’s trusted Root CA store, especially if the server was built in a network isolated from the internet, so make sure after installing the SSL certificate on the server that the trust chain is correct.

You won’t need to update the local hosts file on the servers to point to the load-balanced address.

Implementation using CNAMEs

In large enterprises where I’ve worked there have been long lead times associated with fairly simple DNS changes, which defeats some of the agility provided by cloud computing. A pattern I’ve often seen adopted is to use multiple CNAMEs to work around such lead times. Generally you’ll have a subdomain somewhere that the Ops team has more control over or shorter lead times for. Within the target domain (example.com) create a CNAME pointing to an address within the ops-managed domain (aws.corp.internal), and have a CNAME created within that zone to point to the ALB address, i.e.

Service.example.com -> service.aws.corp.internal -> elbarn.region.elb.amazonaws.com

With this approach I can update service.aws.corp.internal to reflect a new service which I’ve built via a new ELB and avoid the enterprise change lead times associated with a change in example.com.

Site Delivery with AWS CloudFront CDN

Nowadays, most companies are using some sort of Content Delivery Network (CDN) to improve the performance and availability of their sites; options include Azure CDN, CloudFlare, CloudFront, Varnish, and so on.

In this blog however, I will demonstrate how you can deliver your entire website through AWS’s CloudFront. This blog will not go through other CDN services. This blog also assumes you have knowledge of AWS services, DNS, and CDN.

What is CloudFront?

Amazon CloudFront is a global content delivery network (CDN) service that accelerates delivery of your websites, APIs, video content or other web assets. It integrates with other Amazon Web Services products to give developers and businesses an easy way to accelerate content to end users with no minimum usage commitments.

CloudFront delivers the contents of your websites through global datacentres known as “Edge Locations”.

Assuming the webserver is located in New York and you’re accessing the website from Melbourne, the latency will be greater than for someone accessing the website from London.

CloudFront’s Edge Locations will serve the content of a website depending on location. That is, if you’re trying to access a New York based website from Melbourne, you will be directed to the closest Edge Location available for users from Australia. There are two Edge Locations in Australia, one in Melbourne and one in Sydney.

How does CloudFront deliver content?

Please note that content isn’t delivered from the first request (whether you use CloudFront or any other CDN solution). That is, when the first user accesses a page from Melbourne (first request), the contents of the page haven’t been cached yet and are fetched from the webserver. The second user who accesses the website (second request) will get the contents from the Edge Location.

Here’s how:

drawing1

The main features of CloudFront are:

  • Edge Location: This is the location where content will be cached. This is separate from an AWS Region or Availability Zone.
  • Origin: This is the origin of all the files that the CDN will distribute. This can be an S3 bucket, an EC2 instance, an ELB or Route 53.
  • Distribution: The name given to the CDN, which consists of a collection of Edge Locations.
  • Web Distribution: Typically used for websites.
  • RTMP: Typically used for media streaming (not covered in this blog).

In this blog, we will be covering “Web Distribution”.

There are multiple ways to “define” your origin. You could either upload your contents to an S3 bucket, or let CloudFront cache objects from your webserver.

I advise you to keep the same naming conventions you have previously used for your website.

There’s no real difference between choosing an S3 bucket, or your webserver to deliver contents, or even an Elastic Load Balancer.

What matters, however, is that should you choose for CloudFront to cache objects from your origin, you may need to change your hostname. Alternatively, if your DNS registrar allows it, you can make an APEX DNS change.

Before we dive deep into setting up and configuring CloudFront, know that it is a fairly simple process, and we will be using the CloudFront GUI to achieve this.

Alternatively you can use other third party tools like S3 Browser, Cloudberry Explorer, or even CloudFormation if you have several websites you’d like to enable CloudFront for. These tools are excluded from this blog.

Setup CloudFront with S3

I do not recommend this approach, because S3 is designed as a storage service and not a content delivery service; under load it will not provide you with optimum performance. Nevertheless, here are the steps:

  1. Create your bucket
  2. Upload your files
  3. Make your content public (this is achieved through Permissions; simply choose grantee “Everyone”). A scripted equivalent is sketched below.
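As mentioned above, a scripted equivalent of these three steps with boto3 might look like the following (the bucket name and file are placeholders):

import boto3

s3 = boto3.client('s3')

bucket = 'iknowtech-cdn-demo'  # hypothetical bucket name

# 1. Create the bucket
s3.create_bucket(Bucket=bucket)

# 2. Upload a file and 3. make it publicly readable
s3.upload_file('nasa_blue_marble.jpg', bucket, 'nasa_blue_marble.jpg',
               ExtraArgs={'ACL': 'public-read', 'ContentType': 'image/jpeg'})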

Configuring CloudFront with S3

As aforementioned, configuring CloudFront is very straightforward. Here are the steps for doing so.

I will explain the different settings at the end of each image.

Choose your origin (domain name, S3 bucket, ELB…). If you have an S3 bucket or an ELB already configured, they will show in the drop down menu.

You could simply follow the selected options in the image for optimal performance and configuration of your CloudFront distribution.

5-cdn

  • Origin Path: This is optional and usually does not need to be specified. It is basically a directory in your bucket from which you’re telling CloudFront to request the content.
  • Origin ID: This is automatically populated, but you can change it. Its only function is for you to distinguish origins if you have multiple origins in the same distributions.
  • Restrict Bucket Access: This is for users to access your CloudFront URL e.g. 123456.cloudfront.net rather than the S3 URL.
  • Origin Access Identity: This is required if you want your users to always access your Amazon S3 content using CloudFront URLs. You can use the same Access Identity for all your distributions. In fact, it is recommended you do so to make life simpler.
  • Grant Read Permissions on Bucket: This applies on the “Origin Access Identity” so CloudFront can access objects in your Amazon S3 bucket. This is automatically applied.
  • Viewer Protocol Policy: This is to specify how users should access your origin domain name. If you have a website that accepts both HTTP and HTTPS, then choose that. CloudFront will fetch the contents based on this viewer policy. That is, if a user typed in http://iknowtech.com.au then CloudFront will fetch content over HTTP. If HTTPS is used, then CloudFront will fetch contents over HTTPS. If your website only accepts HTTPS, then choose that option.
  • Allowed HTTP Methods: This is basically used for commerce websites or websites with login forms which require data from end users, for better performance. You can keep the default “GET, HEAD”. Nevertheless, make sure to configure your webserver to handle “DELETE” appropriately, otherwise users might be able to delete content.
  • Cached HTTP Methods: You will have an additional “Options”, if you choose the specified HTTP Methods shown above. This is to specify the methods in which you want CloudFront to do caching.

In the second part of the configuration:

6-cdn

  • Price Class: This is to specify which regions of available Edge Locations you want to “serve” your website from.
  • AWS WAF Web ACL: Web Application Firewall (WAF) is a set of ACL rules which you create to protect your website from attacks, e.g. SQL Injections etc. I highlighted that on purpose as there will be another blog for that alone.
  • CNAME: If you don’t want to use CloudFront’s URL, e.g. 123456.cloudfront.net, and instead you want to use your own domain, then specifying a CNAME is a good idea, and you can specify up to 100 CNAMEs. Nevertheless, there may be a catch. Most DNS hosting services may not allow you to edit the APEX zone of your records. If you create a CNAME for http://www.domain.com to point to 123456.cloudfront.net, then any requests coming from http://domain.com will not go through CloudFront. And if you have a redirection set up in your webserver for any request coming from http://www.domain.com to go to http://domain.com, then there’s no point configuring CloudFront.
  • SSL Certificates: You could either use CloudFront’s certificate, and it is a wildcard certificate of *.cloudfront.net, or you can request to use your own domain’s certificate.
  • Supported HTTP Versions: What you need to know is that CloudFront always forwards requests to the origin using HTTP/1.1. This is also based on the viewer policy; most modern websites support all the HTTP versions shown above. HTTP/2 is usually faster. Read more here for more info on HTTP/2 support. In theory this sounds ideal; technically, however, nothing much changes in the CloudFront backend.
  • Logging: Choose to have logging on. Logs are saved in an S3 bucket.
  • Bucket for Logs: Specify the bucket you want to save the logs onto.
  • Log Prefix: Choose a prefix for your logs. I like to include the domain name for each log of each domain.
  • Cookie logging: Not quite important to have it turned on.
  • Enable IPv6: You can have IPv6 enabled, however as of this writing this is still being deployed.
  • Distribution State: You can choose to deploy/create your CloudFront distribution with either in an enabled or disabled state.

Once you’ve completed the steps above, click on “Create Distribution”. It might take anywhere from 10 to 30 minutes for CloudFront to be deployed. Average waiting time is 15 minutes.

Setup and Configure your DNS Records

Once the Distribution is started, head over to “Distributions” in CloudFront, then click on the Distribution ID, take note of the domain name: d2hpa1xghjdn8b.cloudfront.net.

Head over to your DNS records and add the CNAME (or CNAMEs) you have specified in earlier steps to point to d2hpa1xghjdn8b.cloudfront.net.

Do not wait until the Distribution is complete; add the DNS records while the Distribution is being deployed. This will at least give time for your DNS to propagate, since CloudFront takes anywhere between 10 and 30 minutes to be deployed.
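If your DNS happens to be hosted in Route 53, the CNAME can be added with a short boto3 call; a sketch with a placeholder hosted zone ID:

import boto3

route53 = boto3.client('route53')

# Point cdn.iknowtech.com.au at the CloudFront distribution's domain name
route53.change_resource_record_sets(
    HostedZoneId='Z1EXAMPLE',  # placeholder hosted zone ID
    ChangeBatch={
        'Comment': 'CNAME to CloudFront distribution',
        'Changes': [{
            'Action': 'UPSERT',
            'ResourceRecordSet': {
                'Name': 'cdn.iknowtech.com.au.',
                'Type': 'CNAME',
                'TTL': 300,
                'ResourceRecords': [{'Value': 'd2hpa1xghjdn8b.cloudfront.net'}],
            },
        }],
    },
)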

17-dns

18-dns

If you’re delivering the website through an ELB, then you can use the ELB’s CNAME to point your site to it.

Here’s what will appear eventually once the CloudFront Distribution is complete. Notice the URLs: http://d80u8wk4w5p58.cloudfront.net/nasa_blue_marble.jpg (I may remove this link in the future).

8-cdn

You can also access it via: http://cdn.iknowtech.com.au/nasa_blue_marble.jpg (I may also remove this link in the future).

10-cdn

Configuring CloudFront with Custom Origin

Creating a CloudFront Distribution based on Custom Origin, that is to allow CloudFront to cache directly from your domain, is pretty much the same as above, with some differences, as shown below. Every other setting is the same as above.

9-cdn

The changes, as you can see relate to SSL Protocols, HTTP and HTTPS ports.

  • Origin SSL Protocols: This is to specify which SSL/TLS protocols CloudFront will use when establishing a connection to your origin. If you don’t have SSLv3, keep it disabled. If you do, and your origin does not support TLSv1, TLSv1.1 and TLSv1.2, then choose SSLv3.
  • Origin Protocol Policy: This is the same as Viewer Protocol Policy discussed above. If you choose “Match Viewer” then it will work with both HTTP and HTTPS. Obviously, it also depends on how your website is set up.
  • HTTP and HTTPS ports: Leave default ports.

Configure CloudFront with WordPress

If you have a WordPress site, this is probably the easiest way to configure CloudFront. Through the use of plugins, you can change the hostname.

  1. Install the W3 Total Cache plugin in your WordPress page.
  2. Enable CDN and choose CloudFront. This is found in the “General Settings” tab of the W3 plugin.
  3. While scrolling down to CDN, you may enable other forms of “caching” found in settings.
  4. After saving your changes, click on “CDN” tab of the W3 plugin.
  5. Key in the required information. I suggest you create an IAM user with permission to CloudFront to be used here.

Note that I used cdn2.iknowtech.com.au because I had already used cdn.iknowtech.com.au. CloudFront will detect this setting and give you an error if you try and use the same CNAME.

12-cdn

Once your settings are saved, here’s how it’ll look.

Note the URLs: http://d2hpa1xghjdn8b.cloudfront.net (I may remove this link in the future).

13-cdn

You can also access it via: http://cdn2.iknowtech.com.au (I may also remove this link in the future).

14-cdn


To make sure your CDN is working, you can perform some tests using any of the following: gtmetrix.com, pingdom.com or webpagetest.org.

Here are the results from gtmetrix, tested for iknowtech.com.au

15-cdn

The same result for cdn2.iknowtech.com.au.

Notice the page load time after the second request.

16-cdn

And that’s it. That’s all you need to know to create a CloudFront Distribution.

Conclusion

CloudFront is definitely a product to use if you’re looking for a CDN. It is true that there are many other CDN products out there, but CloudFront is one of the easiest and most highly available CDNs on the market.

Before you actually utilise CloudFront or any other CDN solutions, just be mindful of your hostnames. You need your primary domain or record to be cached.

I hope you found this blog informative. Please feel free to post your comments below.

Thanks for reading.


Ubuntu security hardening for the cloud

Hardening Ubuntu Server Security For Use in the Cloud

The following describes a few simple means of improving Ubuntu Server security for use in the cloud. Many of the optimisations discussed below apply equally to other Linux-based distributions, although the commands and settings will vary somewhat.

Azure cloud specific recommendations

  1. Use private key and certificate based SSH authentication exclusively and never use passwords.
  2. Never employ common usernames such as root , admin or administrator.
  3. Change the default public SSH port away from 22.

AWS cloud specific recommendations

AWS makes available a small list of recommendations for securing Linux in their cloud security whitepaper.

Ubuntu / Linux specific recommendations

1. Disable the use of all insecure protocols (FTP, Telnet, RSH and HTTP) and replace them with their encrypted counterparts such as sFTP, SSH, SCP and HTTPS

sudo apt-get remove xinetd nis tftpd telnetd rsh-server

2. Uninstall all unnecessary packages

dpkg --get-selections | grep -v deinstall
dpkg --get-selections | grep postgres
sudo apt-get remove packageName

For more information: http://askubuntu.com/questions/17823/how-to-list-all-installed-packages

3. Run the most recent kernel version available for your distribution

For more information: https://wiki.ubuntu.com/Kernel/LTSEnablementStack

4. Disable root SSH shell access

Open the following file…

sudo vim /etc/ssh/sshd_config

… then change the following value to no.

PermitRootLogin yes

For more information: http://askubuntu.com/questions/27559/how-do-i-disable-remote-ssh-login-as-root-from-a-server

5. Grant shell access to as few users as possible and limit their permissions

Limiting shell access is an important means of securing a system. Shell access is inherently dangerous because of the risk of unlawful privilege escalation, as with any operating system; stolen credentials are a concern too.

Open the following file…

sudo vim /etc/ssh/sshd_config

… then add an entry for each user to be allowed.

AllowUsers jim tom sally

For more information: http://www.cyberciti.biz/faq/howto-limit-what-users-can-log-onto-system-via-ssh/

6. Limit or change the IP addresses SSH listens on

Open the following file…

sudo vim /etc/ssh/sshd_config

… then add the following.

ListenAddress <IP ADDRESS>

For more information:

http://askubuntu.com/questions/82280/how-do-i-get-ssh-to-listen-on-a-new-ip-without-restarting-the-machine

7. Restrict all forms of access to the host by individual IPs or address ranges

TCP wrapper based access lists can be included in the following files.

/etc/hosts.allow
/etc/hosts.deny

Note: Any changes to your hosts.allow and hosts.deny files take immediate effect, no restarts are needed.

Patterns

ALL : 123.12.

Would match all hosts in the 123.12.0.0 network.

ALL : 192.168.0.1/255.255.255.0

An IP address and subnet mask can be used in a rule.

sshd : /etc/sshd.deny

If the client list begins with a slash (/), it is treated as a filename. In the above rule, TCP wrappers looks up the file sshd.deny for all SSH connections.

sshd : ALL EXCEPT 192.168.0.15

This will allow SSH connections from only the machine with IP address 192.168.0.15 and block all other connection attempts. You can use the options allow or deny to allow or restrict access on a per-client basis in either of the files.

in.telnetd : 192.168.5.5 : deny
in.telnetd : 192.168.5.6 : allow

Warning: While restricting system shell access by IP address, be very careful not to lose access to the system by locking the administrative user out!

For more information: https://debian-administration.org/article/87/Keeping_SSH_access_secure

8. Check listening network ports

Check listening ports and uninstall or disable all unessential or insecure protocols and daemons.

netstat -tulpn

9. Install Fail2ban

Fail2ban is a means of dealing with unwanted system access attempts over any protocol against a Linux host. It uses rule sets to automate variable-length IP bans of sources exhibiting configurable activity patterns such as spam, (D)DoS or brute-force attacks.

“Fail2Ban is an intrusion prevention software framework that protects computer servers from brute-force attacks. Written in the Python programming language, it is able to run on POSIX systems that have an interface to a packet-control system or firewall installed locally, for example, iptables or TCP Wrapper.” – Wikipedia

For more information: https://www.digitalocean.com/community/tutorials/how-to-protect-ssh-with-fail2ban-on-ubuntu-14-04

10. Improve the robustness of TCP/IP

Add the following to harden your networking configuration in a file such as 10-network-security.conf:

sudo vim /etc/sysctl.d/10-network-security.conf

# Ignore ICMP broadcast requests
net.ipv4.icmp_echo_ignore_broadcasts = 1

# Disable source packet routing
net.ipv4.conf.all.accept_source_route = 0
net.ipv6.conf.all.accept_source_route = 0 
net.ipv4.conf.default.accept_source_route = 0
net.ipv6.conf.default.accept_source_route = 0

# Ignore send redirects
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0

# Block SYN attacks
net.ipv4.tcp_max_syn_backlog = 2048
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 5

# Log Martians
net.ipv4.conf.all.log_martians = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1

# Ignore ICMP redirects
net.ipv4.conf.all.accept_redirects = 0
net.ipv6.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0 
net.ipv6.conf.default.accept_redirects = 0

# Ignore Directed pings
net.ipv4.icmp_echo_ignore_all = 1

And load the new rules as follows.

sudo service procps start

For more information: https://blog.mattbrock.co.uk/hardening-the-security-on-ubuntu-server-14-04/

11. If you are serving web traffic install mod-security

Web application firewalls can be helpful in warning of and fending off a range of attack vectors including SQL injection, (D)DOS, cross-site scripting (XSS) and many others.

“ModSecurity is an open source, cross-platform web application firewall (WAF) module. Known as the “Swiss Army Knife” of WAFs, it enables web application defenders to gain visibility into HTTP(S) traffic and provides a power rules language and API to implement advanced protections.”

For more information: https://modsecurity.org/

12. Install a firewall such as IPtables

IPtables is a highly configurable and very powerful Linux firewall which has a great deal to offer in terms of bolstering host-based security.

iptables is a user-space application program that allows a system administrator to configure the tables provided by the Linux kernel firewall (implemented as different Netfilter modules) and the chains and rules it stores.” – Wikipedia.

For more information: https://help.ubuntu.com/community/IptablesHowTo

13. Keep all packages up to date at all times and install security updates as soon as possible

 sudo apt-get update        # Fetches the list of available updates
 sudo apt-get upgrade       # Strictly upgrades the current packages
 sudo apt-get dist-upgrade  # Installs updates (new ones)

14. Install multifactor authentication for shell access

Nowadays it’s possible to use multi-factor authentication for shell access thanks to Google Authenticator.

For more information: https://www.digitalocean.com/community/tutorials/how-to-set-up-multi-factor-authentication-for-ssh-on-ubuntu-14-04

15. Add a second level of authentication behind every web based login page

Stolen passwords are a common problem, whether as a result of a vulnerable web application, an SQL injection, a compromised end-user computer or something else altogether. Adding a second layer of protection using .htaccess authentication, with credentials stored on the filesystem rather than in a database, is great added security.

For more information: http://stackoverflow.com/questions/6441578/how-secure-is-htaccess-password-protection

Viewing AWS CloudFormation and bootstrap logs in CloudWatch

Mature cloud platforms such as AWS and Azure have simplified infrastructure provisioning with toolsets such as CloudFormation and Azure Resource Manager (ARM) to provide an easy way to create and manage a collection of related infrastructure resources. Both tool sets allow developers and system administrators to use JavaScript Object Notation (JSON) to specify resources to provision, as well as provide the means to bootstrap systems, effectively allowing for single click fully configured environment deployments.

While these toolsets are an excellent means to prevent RSI from performing repetitive monotonous tasks, the initial writing and testing of templates and scripts can be incredibly time consuming. Troubleshooting and debugging bootstrap scripts usually involves logging into hosts and checking log files. These hosts are often behind firewalls, resulting in the need to use jump hosts which may be MFA integrated, all resulting in a reduced appetite for infrastructure as code.

One of my favourite things about Azure is the ability to watch the ARM provisioning and host bootstrapping process through the console. Unless there’s a need to rerun a script on a host and watch it in real time, troubleshooting the deployment failure can be performed by viewing the ARM deployment history or viewing the relevant Virtual Machine Extension. Examples can be seen below:

This screenshot shows the ARM resources have been deployed successfully.


This screenshot shows the DSC extension status, with more deployment details on the right pane.


While this seems simple enough in Azure, I found it a little less straightforward in AWS. Like Azure, bootstrap logs for the instance reside on the host itself, however the logs aren’t shown in the console by default. Although there’s a blog post on AWS showing how to view CloudFormation logs in CloudWatch, it was tailored to Linux instances. Keen for a similar experience to Azure, I decided to put together the following instructions to have bootstrap logs appear in CloudWatch.

To enable CloudWatch for instances dynamically, the first step is to create an IAM role that can be attached to EC2 instances when they’re launched, providing them with access to CloudWatch. The following JSON code shows a sample policy I’ve used to define my IAM role.
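The sample policy itself is a JSON document attached to the role; as a rough boto3 sketch of what an equivalent role might look like (role, policy and profile names are placeholders, and the permissions are limited to the CloudWatch Logs actions the plugin needs):

import json
import boto3

iam = boto3.client('iam')

assume_role_policy = {
    'Version': '2012-10-17',
    'Statement': [{
        'Effect': 'Allow',
        'Principal': {'Service': 'ec2.amazonaws.com'},
        'Action': 'sts:AssumeRole',
    }],
}

logs_policy = {
    'Version': '2012-10-17',
    'Statement': [{
        'Effect': 'Allow',
        'Action': [
            'logs:CreateLogGroup',
            'logs:CreateLogStream',
            'logs:PutLogEvents',
            'logs:DescribeLogStreams',
        ],
        'Resource': '*',
    }],
}

# Role that EC2 instances assume (via an instance profile) to write to CloudWatch Logs
iam.create_role(RoleName='EC2-CloudWatch-Logs',
                AssumeRolePolicyDocument=json.dumps(assume_role_policy))
iam.put_role_policy(RoleName='EC2-CloudWatch-Logs',
                    PolicyName='cloudwatch-logs-write',
                    PolicyDocument=json.dumps(logs_policy))
iam.create_instance_profile(InstanceProfileName='EC2-CloudWatch-Logs')
iam.add_role_to_instance_profile(InstanceProfileName='EC2-CloudWatch-Logs',
                                 RoleName='EC2-CloudWatch-Logs')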

The next task is to create a script that can be used at the start of the bootstrap process to dynamically enable the CloudWatch plugin on the EC2 instance. The plugin is disabled by default and when enabled, requires a restart of the EC2Config service. I used the following script:

It’s worth noting that the EC2Config service is set to recover by default and therefore starts by itself after the process is killed.

Now that we’ve got a script to enable the CloudWatch plugin, we need to change the default CloudWatch config file on the host prior to enabling the CloudWatch plugin. The default CloudWatch config file is AWS.EC2.Windows.CloudWatch.json and contains details of all the logs that should be monitored as well as defining CloudWatch log groups and log streams. Because there’s a considerable number of changes made to the default file to achieve the desired result, I prefer to create and store a customised version of the file in S3. As part of the bootstrap process, I download it to the host and place it in the default location. My customised CloudWatch config file looks like the following:

Let’s take a closer look at what’s happening here. The first three components are windows event logs I’m choosing to monitor:

You’ll notice I’ve included the Desired State Configuration (DSC) event logs, as DSC is my preferred configuration management tool of choice when it comes to Windows. When defining a windows event log, a level needs to be specified, indicating the verbosity of the output. The values are as follows:

1 – Only error messages uploaded.
2 – Only warning messages uploaded.
4 – Only information messages uploaded.

You can add values together to include more than one type of message. For example, 3 means that error messages (1) and warning messages (2) get uploaded. A value of 7 means that error messages (1), warning messages (2), and information messages (4) get uploaded. For those familiar with Linux permissions, this probably looks very familiar! 🙂
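In other words, the level value behaves like a bit mask, so the combinations can be computed with a bitwise OR; a quick Python illustration:

ERROR, WARNING, INFORMATION = 1, 2, 4

print(ERROR | WARNING)                # 3: errors and warnings
print(ERROR | WARNING | INFORMATION)  # 7: errors, warnings and information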

To monitor other windows event logs, you can create additional components in the JSON template. The value of “LogName” can be found by viewing the properties of the event log file, as shown below:

img_57c431da7dfb4

The next two components monitor the two logs that are relevant to the bootstrap process:

Once again, a lot of this is self explanatory. The “LogDirectoryPath” specifies the absolute directory path to the relevant log file, and the filter specifies the log filename to be monitored. The tricky thing here was getting the “TimeStampFormat” parameter correct. I used this article on MSDN plus trial and error to work this out. Additionally, it’s worth noting that cfn-init.log’s timestamp is the local time of the EC2 instance, while EC2ConfigLog.txt takes on UTC time. Getting this right ensures you have the correct timestamps in CloudWatch.

Next, we need to define the log groups in CloudWatch that will hold the log streams. I’ve got three separate Log Groups defined:

You’ll also notice that the Log Streams are named after the instance ID. Each instance that is launched will create a separate log stream in each log group that can be identified by its instance ID.

Finally, the flows are defined:

This section specifies which logs are assigned to which Log Group. I’ve put all the WindowsEventLogs in a single Log Group, as it’s easy to search based on the event log name. Not as easy to differentiate between cfn-init.log and EC2ConfigLog.txt entries, so I’ve split them out.

So how do we get this customised CloudWatch config file into place? My preferred method is to upload the file with the set-cloudwatch.ps1 script to a bucket in S3, then pull them down and run the PowerShell script as part of the bootstrap process. I’ve included a subset of my standard cloudformation template below, showing the monitoring config key that’s part of the ConfigSet.

What does this look like in the end? Here we can see the log groups specified have been created:

img_57c4d1ddcd273

If we drill further down into the cfninit-Log-Group, we can see the instance ID of the recently provisioned host. Finally, if we click on the Instance ID, we can see the cfn-init.log file in all its glory. Yippie!

img_57c4d5c4c3b89

Hummm, looks like my provisioning failed because a file is non-existent. Bootstrap monitoring has clearly served its purpose! Now all that’s left to do is to teardown the infrastructure, remediate the issue, and reprovision!

The next step to reducing the amount of repetitive tasks in the infrastructure as code development process is a custom pipeline to orchestrate the provisioning and teardown workflow… More on that in another blog!