Preparing your Docker container for Azure App Services

Like other cloud platforms, Azure is starting to leverage containers to provide flexible managed environments in which to run applications. App Service on Linux is one such case: it allows us to bring our own home-baked Docker images containing all the tools we need to make our apps work.

This service is still in preview and obviously has a few limitations:

  • Only one container per service instance, in contrast to Azure Container Instances.
  • No VNET integration.
  • SSH server required to attach to the container.
  • Single port configuration.
  • No ability to limit the container’s memory or processor.

Having said this, we do get a good 50% discount for the time being which is not a bad thing.

The basics

In this post I will cover how to set up an SSH server in our Docker images so that we can inspect and debug our containers hosted in Azure App Service for Linux.

It is important to note that running SSH in containers is a widely discouraged practice and should be avoided in most cases. Azure App Services mitigates the risk by only granting SSH port access to the Kudu infrastructure, which we tunnel through. However, we don’t need SSH when we are not running in the App Services engine, so we can protect ourselves by only enabling SSH when a flag such as an ENABLE_SSH environment variable is present.

Running an SSH daemon alongside our App also means that we will have more than one process per container. For cases like these, Docker allows us to enable an init manager per container that makes sure no orphaned child processes are left behind on container exit. Since this feature requires docker run rights that App Services does not grant for security reasons, we must package and configure this binary ourselves when building the Docker image.

Building our Docker image

TLDR: docker pull xynova/appservice-ssh

The basic structure looks like the following:

The SSH configuration

The /ssh-config/sshd_config specifies the SSH server configuration required by App Services to establish connectivity with the container:

  • The daemon needs to listen on port 2222.
  • Password authentication must be enabled.
  • The root user must be able to login.
  • The Ciphers and MACs security settings must be the ones displayed below.
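As a quick sanity check of the first three requirements above, a small shell function can scan a candidate sshd_config before it is baked into the image. This is only a minimal sketch: the function name check_sshd_config is hypothetical, and the Ciphers and MACs lines are deliberately not checked here.

```shell
# Hypothetical helper: verify that an sshd_config contains the three
# settings listed above. Returns 0 when all are present, 1 otherwise.
check_sshd_config() {
  cfg="$1"
  status=0
  for wanted in "Port 2222" "PasswordAuthentication yes" "PermitRootLogin yes"; do
    key=${wanted% *}     # text before the space, e.g. "Port"
    value=${wanted#* }   # text after the space, e.g. "2222"
    # Match "<key> <value>" on its own line; commented-out lines do not match.
    if ! grep -Eiq "^[[:space:]]*${key}[[:space:]]+${value}[[:space:]]*$" "$cfg"; then
      echo "missing or wrong: ${wanted}" >&2
      status=1
    fi
  done
  return "$status"
}
```

It could be run against the file from this post, for example check_sshd_config ssh-config/sshd_config, before calling docker build.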

The container startup script

The script manages the application startup:

If the ENABLE_SSH environment variable equals true then the setup_ssh() function sets up the following:

  • Change the root user password to Docker! (required by App Services).
  • Generate the SSH host keys required by SSH clients to authenticate the SSH server.
  • Start the SSH daemon in the background.

App Services requires the container to have an application listening on the configurable public service port (80 by default). Without this listener, the container will be flagged as unhealthy and restarted indefinitely. The start_app() function, as its name implies, runs a web server (http-echo) that listens on port 80 and simply prints all incoming request headers back out in the response.
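The startup logic described above can be sketched roughly as follows. This is not the original script: the sshd path, the use of chpasswd and ssh-keygen -A, and the bare http-echo invocation are assumptions based on a stock Alpine image with openssh installed. The helper ssh_enabled is a hypothetical name introduced for this sketch.

```shell
#!/bin/sh
# Sketch of a container startup script (assumed layout, not the original).

ssh_enabled() {
  # SSH is opt-in: only the exact value "true" turns it on.
  [ "${ENABLE_SSH:-false}" = "true" ]
}

setup_ssh() {
  # App Services expects the root password to be "Docker!".
  echo 'root:Docker!' | chpasswd
  # Generate any missing host keys so clients can authenticate the server.
  ssh-keygen -A
  # sshd detaches into the background by default; the binary path is assumed.
  /usr/sbin/sshd -f /ssh-config/sshd_config
}

start_app() {
  # Replace the shell with the web server so it receives signals from tini.
  # The http-echo location and its arguments are assumptions.
  exec http-echo
}

main() {
  if ssh_enabled; then
    setup_ssh
  fi
  start_app
}
```

Inside the image the script would end with a call to main. It is left out here so that the functions can be sourced and inspected without starting anything.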

The Dockerfile

There is nothing too fancy about the Dockerfile either. I use the multistage build feature to compile the http-echo server and then copy it across to an alpine image in the PACKAGING STAGE. This second stage also installs openssh, tini and sets up additional configs.

Note that the init process manager is started through the ENTRYPOINT ["/sbin/tini","--"] clause, which in turn receives the monitored script as an argument.

Let us build the container image by executing docker build -t xynova/appservice-ssh docker-ssh. You are free to tag the container and push it to your own Docker repository.

Trying it out

First we create our App Service on Linux instance and set the custom Docker container we will use (xynova/appservice-ssh if you want to use mine). Then we set the ENABLE_SSH=true environment variable to activate the SSH server on container startup.

Now we can make a GET request to the App Service URL to trigger a container download and activation. If everything works, you should see something like the following:

One thing to notice here is the X-Arr-Ssl header. This header is passed down by the Azure App Service internal load balancer when the App is being browsed over SSL. You can check for this header if you want to trigger HTTP to HTTPS redirections.

Moving on, we jump into the Kudu dashboard as follows:

Select the SSH option from the Debug console (the Bash option will take you to the Kudu container instead).

DONE! We are now inside the container.

Happy hacking!

Static Security Analysis of Container Images with CoreOS Clair

Container security is (or should be) a concern to anyone running software on Docker Containers. Gone are the days when running random Images found on the internet was commonplace. Security guides for Containers are common now: examples from Microsoft and others can be found easily online.

The two leading Container Orchestrators also offer their own security guides: Kubernetes Security Best Practices and Docker security.

Container Image Origin

One of the biggest factors in Container security is the origin of Container Images:

  1. It is recommended to run your own private Registry to distribute Images.
  2. It is recommended to scan these Images against known vulnerabilities.

Running a private Registry is easy these days (Azure Container Registry for instance).

I will concentrate on the scanning of Images in the remainder of this post and show how to look for common vulnerabilities using CoreOS Clair. Clair is probably the most advanced non-commercial scanning solution for Containers at the moment, though it requires some elbow grease to run this way. It’s important to note that the GUI and Enterprise features are not free and are sold under the Quay brand.

As security scanning is recommended as part of the build process through your favorite CI/CD pipeline, we will see how to configure Visual Studio Team Services (VSTS) to leverage Clair.

Installing CoreOS Clair

First we are going to run Clair in minikube for the sake of experimenting. I have used Fedora 26 and Ubuntu. I will assume you have minikube, kubectl and docker installed and that you have started a minikube cluster by issuing the “minikube start” command.

Clair is distributed through a docker image or you can also compile it yourself by cloning the following Github repository:

In any case, we will run the following commands to clone the repository, and make sure we are on the release 2.0 branch to benefit from the latest features (tested on Fedora 26):

~/Documents/github|⇒ git clone
Cloning into 'clair'...
remote: Counting objects: 8445, done.
remote: Compressing objects: 100% (5/5), done.
remote: Total 8445 (delta 0), reused 2 (delta 0), pack-reused 8440
Receiving objects: 100% (8445/8445), 20.25 MiB | 2.96 MiB/s, done.
Resolving deltas: 100% (3206/3206), done.
Checking out files: 100% (2630/2630), done.

rafb@overdrive:~/Documents/github|⇒ cd clair
⇒ git fetch
⇒ git checkout -b release-2.0 origin/release-2.0
Branch release-2.0 set up to track remote branch release-2.0 from origin.
Switched to a new branch 'release-2.0'

The Clair repository comes with a Kubernetes deployment found in the contrib/k8s subdirectory as shown below. It’s the only thing we are after in the repository as we will run the Container Image distributed by Quay:

⇒ ls -l contrib/k8s
total 8
-rw-r--r-- 1 rafb staff 1443 Aug 15 14:18 clair-kubernetes.yaml
-rw-r--r-- 1 rafb staff 2781 Aug 15 14:18 config.yaml

We will modify these two files slightly to run Clair version 2.0 (for some reason the github master branch carries an older version of the configuration file syntax – as highlighted in this github issue).

In the config.yaml, we will change the postgresql source from:

source: postgres://postgres:password@postgres:5432/postgres?sslmode=disable

to:

source: host=postgres port=5432 user=postgres password=password sslmode=disable

In clair-kubernetes.yaml, we will change the version of the Clair image from latest to 2.0.1:





Once these changes have been made, we can deploy Clair to our minikube cluster by running these two commands back to back:

kubectl create secret generic clairsecret --from-file=./config.yaml 
kubectl create -f clair-kubernetes.yaml

By looking at the startup logs for Clair, we can see it fetches a vulnerability list at startup time:

[rbigeard@flanger ~]$ kubectl get pods
NAME                   READY   STATUS    RESTARTS   AGE
clair-postgres-l3vmn   1/1     Running   1          7d
clair-snmp2            1/1     Running   4          7d
[rbigeard@flanger ~]$ kubectl logs clair-snmp2
{"Event":"fetching vulnerability updates","Level":"info","Location":"updater.go:213","Time":"2017-08-14 06:37:33.069829"}
{"Event":"Start fetching vulnerabilities","Level":"info","Location":"ubuntu.go:88","Time":"2017-08-14 06:37:33.069960","package":"Ubuntu"} 
{"Event":"Start fetching vulnerabilities","Level":"info","Location":"oracle.go:119","Time":"2017-08-14 06:37:33.092898","package":"Oracle Linux"} 
{"Event":"Start fetching vulnerabilities","Level":"info","Location":"rhel.go:92","Time":"2017-08-14 06:37:33.094731","package":"RHEL"}
{"Event":"Start fetching vulnerabilities","Level":"info","Location":"alpine.go:52","Time":"2017-08-14 06:37:33.097375","package":"Alpine"}

Scanning Images through Clair Integrations

Clair is just a backend and we therefore need a frontend to “feed” Images to it. There are a number of frontends listed on this page. They range from full Enterprise-ready GUI frontends to simple command line utilities.

I have chosen to use “klar” for this post. It is a simple command line tool that can be easily integrated into a CI/CD pipeline (more on this in the next section). To install klar, you can compile it yourself or download a release.

Once installed, it’s very easy to use, and parameters are passed using environment variables. In the following example, CLAIR_OUTPUT is set to “High” so that we only see the most dangerous vulnerabilities. CLAIR_ADDR is the address of Clair running on my minikube cluster.

Note that since I am pulling an image from an Azure Container Registry instance, I have specified DOCKER_USER and DOCKER_PASSWORD variables in my environment.


Analysing 3 layers 
Found 26 vulnerabilities 
CVE-2017-8804: [High]  
The xdr_bytes and xdr_string functions in the GNU C Library (aka glibc or libc6) 2.25 mishandle failures of buffer deserialization, which allows remote attackers to cause a denial of service (virtual memory allocation, or memory consumption if an overcommit setting is not used) via a crafted UDP packet to port 111, a related issue to CVE-2017-8779. 
CVE-2017-10685: [High]  
In ncurses 6.0, there is a format string vulnerability in the fmt_entry function. A crafted input will lead to a remote arbitrary code execution attack. 
CVE-2017-10684: [High]  
In ncurses 6.0, there is a stack-based buffer overflow in the fmt_entry function. A crafted input will lead to a remote arbitrary code execution attack. 
CVE-2016-2779: [High]  
runuser in util-linux allows local users to escape to the parent session via a crafted TIOCSTI ioctl call, which pushes characters to the terminal's input buffer. 
Unknown: 2 
Negligible: 15 
Low: 1 
Medium: 4 
High: 4

So Clair is showing us the four “High” level common vulnerabilities found in the nginx image that I pulled from Docker Hub. At the time of writing, this is consistent with the details listed on Docker Hub. It’s not necessarily a deal breaker, as those vulnerabilities are only potentially exploitable by local users on the Container host, which means we would need to protect the VMs that are running Containers well!

Automating the Scanning of Images in Azure using a CI/CD pipeline

As a proof-of-concept, I created a “vulnerability scanning” Task in a build pipeline in VSTS.

Conceptually, the chain is as follows:

Container image scanning VSTS pipeline

I created an Ubuntu Linux VM and built my own VSTS agent following published instructions after which I installed klar.

I then built a Kubernetes cluster in Azure Container Service (ACS) (see my previous post on the subject which includes a script to automate the deployment of Kubernetes on ACS), and deployed Clair to it, as shown in the previous section.

Little gotcha here: my Linux VSTS agent and my Kubernetes cluster in ACS ran in two different VNets so I had to enable VNet peering between them.

Once those elements are in place, we can create a git repo with a shell script that calls klar and a build process in VSTS with a task that will execute the script in question:

Scanning Task in a VSTS Build

The content of the script is very simple (this would obviously have to be improved for a production environment, but you get the idea):

CLAIR_ADDR=http://X.X.X.X:30060 klar Ubuntu

Once we run this task in VSTS, we get the list of vulnerabilities in the output which can be used to “break” the build based on certain vulnerabilities being discovered.
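One way to “break” the build is to parse the severity summary that klar prints at the end of its text output (as in the sample output earlier in this post) and exit with a non-zero status when a limit is exceeded. The following is a minimal sketch under that assumption: count_severity, check_scan and the MAX_HIGH variable are hypothetical names, not part of klar.

```shell
# Count the findings of a given severity, based on the summary lines at the
# end of klar's text output (for example "High: 4"). Reads from stdin.
count_severity() {
  awk -F': *' -v sev="$1" '$1 == sev { n = $2 + 0 } END { print n + 0 }'
}

# Return non-zero when the number of High findings exceeds MAX_HIGH
# (a hypothetical variable for this sketch, defaulting to 0).
check_scan() {
  high=$(count_severity High)
  if [ "$high" -gt "${MAX_HIGH:-0}" ]; then
    echo "Build broken: $high High vulnerabilities found" >&2
    return 1
  fi
}
```

In the build script the klar call would then be piped through the check, for example CLAIR_ADDR=http://X.X.X.X:30060 klar Ubuntu | check_scan, so that the task fails whenever High findings are reported.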

Build output view in VSTS


Hopefully you have picked up some ideas about how to ensure the Container Images you run in your environments are secure, or at least about which potential issues you need to mitigate. A build task similar to the one described here could very well be part of a broader build process you use to build Container Images from scratch.

Moving from Azure VMs to Azure VM Scale Sets – VM Image Build


I have previously blogged about using Visual Studio Team Services (VSTS) to securely build and deploy solutions to Virtual Machines running in Azure.

In this and the following posts, I am going to take my existing build process and modify it so I can make use of VM Scale Sets to host my API solution. This switch is to allow the API to scale under load.

My current setup is very much fit for purpose for the limited trial it’s been used in, but I know I’ll see at least 150 times the traffic when I am running at full scale in production. While my trial environment barely scratches the surface in terms of consumed resources, I don’t want to have to capacity plan to the nth degree for production.

Shifting to VM Scale Sets with autoscale enabled will help me greatly in this respect!

Current State…


Google Cloud Platform: an entrée

The opening of a Google Cloud Platform region in Sydney about 2 months ago triggered my interest in learning more about the platform and understanding how its offering will affect the local market moving forward.

So far, I have concentrated mainly on GCP’s IaaS offering by digging information out of videos and documentation and by venturing through the portal and Cloud Shell. I would like to share my first findings and highlight a few features that, in my opinion, make it worth having a closer look.

Virtual Networks are global

Virtual Private Clouds (VPC) are global by default; this means that workloads in any GCP region can be one trace-route hop away from each other in the same private network. Firewall rules can also be applied in a global scope, simplifying preparation activities for regional failover.

Global HTTP Load Balancing is another feature that allows a single entry-point address to direct traffic to the most appropriate backend around the world. This is a very interesting advantage over DNS-based solutions because Global Load Balancing can react instantaneously rather than waiting for cached DNS records to expire.

Subnets and Availability Zones are independent 

Google Cloud Platform subnets cover an entire region. Regions still have multiple Availability Zones but they are not directly bound to a particular subnet. This comes in handy when we want to move a Virtual Machine across AZs but keep the same IP address.

Subnets also allow Private Google API access to be turned on or off with a simple switch. Private access allows Virtual Machines without Internet access to reach Google APIs and Services using their internal IPs.

Live Migration across Availability Zones

GCP supports Live Migration within a region. This feature keeps machines up and running during events like infrastructure maintenance, host and security upgrades, failed hardware, etc.

A very nice addition to this feature is the ability to migrate a Virtual Machine into a different AZ with a single command:

$ gcloud compute instances move example-instance  \
  --zone <ZONEA> --destination-zone <ZONEB>

Notice the internal IP is preserved:

The Snapshot service is also global

Moving instances across regions is not as straightforward as moving them between Availability Zones. However, since Compute Engine’s Snapshot service is global, the process is still quite simple.

  1. I create a Snapshot from the VM instance’s disk.
  2. I create a new Disk from the Snapshot, but I place it in the AZ of the target region I want to move the VM to.
  3. Then I can create a new VM using the Disk.
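The three steps above can be sketched with the gcloud CLI as follows. This is only an illustration: the function names, the "-snap" and "-copy" naming scheme and the dry-run wrapper are assumptions made for this sketch, and by default the commands are printed rather than executed.

```shell
# Print each command; only execute it when DRY_RUN is explicitly set to 0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# Hypothetical helper: recreate VM $1 (disk $2, currently in zone $3)
# in zone $4 of another region by going through a global Snapshot.
move_vm_via_snapshot() {
  vm="$1"; disk="$2"; src_zone="$3"; dst_zone="$4"
  snap="${disk}-snap"

  # 1. Create a Snapshot from the existing disk.
  run gcloud compute disks snapshot "$disk" \
    --zone "$src_zone" --snapshot-names "$snap"
  # 2. Create a new disk from the Snapshot in the target zone.
  run gcloud compute disks create "${disk}-copy" \
    --zone "$dst_zone" --source-snapshot "$snap"
  # 3. Create a new VM that boots from the new disk.
  run gcloud compute instances create "${vm}-copy" \
    --zone "$dst_zone" --disk "name=${disk}-copy,boot=yes"
}
```

Calling move_vm_via_snapshot example-instance example-disk <ZONEA> <ZONEB> prints the three commands for review. Setting DRY_RUN=0 would run them against the currently configured project.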

An interesting consequence of Snapshots being global is that it allows us to use them as a data transfer alternative between regions that results in no ingress-egress charges.

You can attach VMs to multiple VPCs

Although still in beta, GCP allows us to attach multiple NICs to a machine and have each interface connect to a different VPC.

Aside from the usual security benefits of perimeter and DMZ isolation, this feature gives us the ability to share third-party appliances across different projects: for example having all Internet ingress and egress traffic inspected and filtered by a common custom firewall box in the account.

Cloud Shell comes with batteries included

Cloud Shell is just awesome. Apart from its outgoing connections being restricted to ports 20, 21, 22, 80, 443, 2375, 2376, 3306, 8080, 9600, and 50051, it is a very handy tool that you can use to quickly put together PoCs.

  • You get your own Debian VM with tmux multi tab support.
  • Docker up and running to build and test containers.
  • Full apt-get capabilities.
  • You can upload files into it directly from your desktop.
  • A brand new integrated code editor if you don’t like using vim, nano and so on.
  • Lastly, it has a web preview feature allowing you to run your own web server on ports 8080 to 8084 to test your PoC from the internet.

SSH is managed for you

GCP SSH key management is one of my favourite features so far. SSH key pairs are created and managed for you whenever you connect to an instance from the browser or with the gcloud command-line tool. User access is controlled by Identity and Access Management (IAM) roles, with GCP creating and applying short-lived SSH key pairs on the fly when necessary.

Custom instances, custom pricing

Although a custom machine type can be viewed as something that covers a very niche use case, it can in fact help us price the right instance RAM and CPU for the job at hand. Having said this, we also get the option to buy plenty of RAM and CPU that we will never need (see below).

Discounts, discounts and more discounts

I wouldn’t put my head in the lion’s mouth about pricing at this time, but there are a large number of Cloud cost analysis reports that categorise GCP as being cheaper than the competition. Having said this, I still believe it comes down to having the right implementation and setup: you might not manage the infrastructure directly in the Cloud but you should definitely manage your costs.

GCP offers sustained-use discounts for instances that have run for more than a given share of the billing month (25%, 50%, 75% and 100%), and it recently released 1 and 3 year committed-use discounts, which can reach up to 57% off the original instance price. Finally, Preemptible instances (similar to AWS Spot Instances) can be discounted by up to 80% from list price.

Another very nice feature to help manage costs is the Compute sizing recommendations. These recommendations are generated based on system metrics and can help identify workloads that can be resized to make more appropriate use of resources.

Interesting times ahead

Google has been making big progress with its platform in the last two years. According to some analyses it still has some ground to cover to reach its competitors’ level but, as we just saw, GCP comes with some very interesting cards up its sleeve.

One thing is for sure… interesting times lie ahead.

Happy window shopping!


Making application configuration files dynamic with confd and Azure Redis

Service discovery and hot reconfiguration are common problems we face in cloud development nowadays. In some cases we can rely on an orchestration engine like Kubernetes to do all the work for us. In other cases we can leverage a configuration management system and do the orchestration ourselves. However, there are still some cases where either of these solutions is impractical or just too complex for the immediate problem… and you don’t have a Consul cluster at hand either :(.

confd to the rescue

confd is a binary written in Go that allows us to make configuration files dynamic by providing a templating engine driven by backend data stores like etcd, Consul, DynamoDB, Redis, Vault and ZooKeeper. It is commonly used to allow classic load balancers like Nginx and HAProxy to automatically reconfigure themselves when new healthy upstream services come online under different IP addresses.

NOTE: For the sake of simplicity I will use a very simple example to demonstrate how to use confd to remotely reconfigure an Nginx route by listening to changes performed against an Azure Redis Cache backend. However, this idea can be extrapolated to solve service discovery problems whereby application instances continuously report their health and location to a Service Registry (in our case Azure Redis) that is monitored by the Load Balancer service in order to reconfigure itself if necessary.

Just as a side note, confd was created by Kelsey Hightower (now Staff Developer Advocate, Google Cloud Platform) in the early Docker and CoreOS days. If you haven’t heard of Kelsey I totally recommend you YouTube around for him to watch any of his talks.


Azure Redis Cache

Redis, our Service Discovery data store, will be listening on your Azure Redis Cache endpoint (where XXXX-XXXX-XXXX is your DNS prefix). confd will monitor changes on the /myapp/suggestions/drink cache key and then update the Nginx configuration accordingly.

Container images

confd + nginx container image
confd’s support for Redis backend using a password is still not available under the stable or alpha release as of August 2017. I explain how to easily compile the binary and include it in an Nginx container in a previous post.

TLDR: docker pull xynova/nginx-confd

socat container image
confd is currently unable to connect to Redis through TLS (required by Azure Redis Cache). To overcome this limitation we will use a protocol translation tool called socat which I also talk about in a previous post.

TLDR: docker pull xynova/socat

Preparing confd templates
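A confd configuration tree normally consists of a template resource under conf.d and a template file under templates. The sketch below writes a minimal pair for the /myapp/suggestions/drink key. The file names, the Nginx destination path, the reload command and the response text are assumptions chosen for illustration, not the original files from this post.

```shell
# Write a minimal confd configuration tree under the directory given as $1.
write_confd_config() {
  root="$1"
  mkdir -p "$root/conf.d" "$root/templates"

  # Template resource: which keys to watch and which file to render.
  cat > "$root/conf.d/drink.toml" <<'EOF'
[template]
src = "drink.conf.tmpl"
dest = "/etc/nginx/conf.d/default.conf"
keys = ["/myapp/suggestions/drink"]
reload_cmd = "nginx -s reload"
EOF

  # Template: getv substitutes the current value of the watched key.
  cat > "$root/templates/drink.conf.tmpl" <<'EOF'
server {
  listen 80;
  location / {
    default_type text/plain;
    return 200 'How about some {{getv "/myapp/suggestions/drink"}}?';
  }
}
EOF
}
```

Running write_confd_config ./confd produces a directory that can be mounted into the container under the /etc/confd path.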

Driving Nginx configuration with Azure Redis

We first start a xynova/nginx-confd container and mount our prepared confd configurations as a volume under the /etc/confd path. We are also binding port 80 to 8080 on localhost so that we can access Nginx by browsing to http://localhost:8080.

The interactive session logs show us that confd fails to connect to Redis because there is no Redis service inside the container.

To fix this we bring in xynova/socat to create a tunnel that confd can use to talk to Azure Redis Cache in the cloud. We open a new terminal and type the following (note: replace XXXX-XXXX-XXXX with your own Azure Redis prefix).

Notice that by specifying --net container:nginx option, I am instructing the xynova/socat container to join the xynova/nginx-confd container network namespace. This is the way we get containers to share their own private localhost sandbox.

Now looking back at our interactive logs we can see that confd is talking to Azure Redis but it cannot find the /myapp/suggestions/drink cache key.

Let’s just set a value for that key:

confd is now happily synchronized with Azure Redis and the Nginx service is up and running.

We now browse to http://localhost:8080 and test our container composition:

Covfefe… should we fix that?
We just set the /myapp/suggestions/drink key to coffee.

Watch how confd notices the change and proceeds to update the target config files.

Now if we refresh our browser we see:

Happy hacking.


Build from source and package into a minimal image with the new Docker Multi-Stage Build feature

confd is a binary written in Go that can help us make configuration files dynamic. It achieves this by providing a templating engine that is driven by backend data stores like etcd, Consul, DynamoDB, Redis, Vault and ZooKeeper.

A few days ago I started putting together a BYO load-balancing PoC where I wanted to use confd and Nginx. I realised however that some features that I needed from confd were not yet released. Not a problem; I was able to compile the master branch and package the resulting binary into an Nginx container all in one go, and without even having Golang installed on my machine. Here is how:

confd + Nginx with Docker Multi-Stage builds

First I will create my container startup script  docker-confd/
This script launches nginx and confd in the container but tracks both processes so that I can exit the container if either of them fail.

Normally you want to have only one process per container. In my particular case I have inter-process signaling between confd and Nginx and therefore it is easier for me to keep both processes together.
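The idea of tracking both processes and exiting when either of them fails can be sketched with a small polling loop. This is not the original startup script: the supervise function is a hypothetical name, and the one-second polling interval is an arbitrary choice.

```shell
# Run two commands in the background and return as soon as either exits,
# stopping the survivor so that the container can terminate.
supervise() {
  sh -c "$1" &
  pid_a=$!
  sh -c "$2" &
  pid_b=$!

  # Poll until one of the two PIDs is gone.
  while kill -0 "$pid_a" 2>/dev/null && kill -0 "$pid_b" 2>/dev/null; do
    sleep 1
  done

  # Stop whichever process is still running and report failure.
  kill "$pid_a" "$pid_b" 2>/dev/null
  wait
  return 1
}
```

In the image this could be called as supervise "nginx -g 'daemon off;'" "confd -watch -backend redis -node 127.0.0.1:6379", where the exact confd arguments are an assumption and depend on the backend in use.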

Now I create my Multi-Stage build Dockerfile:  docker-confd/Dockerfile
I denote a build stage by using the AS <STAGE-NAME> keyword: FROM golang:1.8.3-alpine3.6 AS confd-build-stage. I can reference the stage by name further down when I am copying the resulting binary into the Nginx container.

Now I build my image by executing docker build -t confd-nginx-local docker-confd-nginx.

DONE! Just about 15MB extra on top of the Nginx Alpine base image.

Read more about the Multi-Stage Build feature on the Docker website.

Happy hacking.

SSL Tunneling with socat in Docker to safely access Azure Redis on port 6379

Redis Cache is an advanced key-value store that we should have all come across in one way or another by now. Azure, AWS and many other cloud providers have fully managed offerings for it, which is “THE” way we want to consume it.  As a little bit of insight, Redis itself was designed for use within a trusted private network and does not support encrypted connections. Public offerings like Azure use TLS reverse proxies to overcome this limitation and provide security around the service.

However some Redis client libraries out there do not talk TLS. This becomes a  problem when they are part of other tools that you want to compose your applications with.

Solution? We bring in something that can help us do protocol translation.

socat – Multipurpose relay (SOcket CAT)

Socat is a command line based utility that establishes two bidirectional byte streams and transfers data between them. Because the streams can be constructed from a large set of different types of data sinks and sources (see address types), and because lots of address options may be applied to the streams, socat can be used for many different purposes.

In short: it is a tool that can establish a communication between two points and manage protocol translation between them.

An interesting fact is that socat is currently used to port forward docker exec onto nodes in Kubernetes. It does this by creating a tunnel from the API server to Nodes.

Packaging socat into a Docker container

One of the great benefits of Docker is that it allows you to work in sandbox environments. These environments are then fully transportable and can eventually become part of your overall application.

The following procedure prepares a container that includes the socat binary and common certificate authorities required for public TLS certificate chain validation.

We first create our  docker-socat/Dockerfile

Now we build a local Docker image by executing docker build -t socat-local docker-socat. You are free to push this image to a Docker Registry at this point.

Creating TLS tunnel into Azure Redis

To access Azure Redis we need two things:

  1. The FQDN:
    where all the X’s represent your dns name.
  2. The access key, found under the Access Keys menu of your Cache instance. I will call it THE-XXXX-PASSWORD

Let’s start our socat tunnel by spinning up the container we just built an image for. Notice I am binding port 6379 to my Desktop so that I can connect to the tunnel from localhost:6379 on my machine.

Now let’s have a look at the  arguments I am passing in to socat (which is automatically invoked thanks to the ENTRYPOINT ["socat"] instruction we included when building the container image).

  1. -v
    For checking logs when running docker logs socat
  2. TCP-LISTEN:6379,fork,reuseaddr
    – Start a socket listener on port 6379
    – fork to allow for subsequent connections (otherwise a one off)
    – reuseaddr to allow socat to restart and use the same port (in case a previous one is still held by the OS)

    – Create a TLS connect tunnel to the Azure Redis Cache.
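Put together, the arguments above form a listen address and a TLS connect address. The sketch below only assembles and prints them. The redis.cache.windows.net suffix, the TLS port 6380 and the helper names are assumptions based on the usual public Azure Redis Cache endpoint naming, so check them against the values shown in your own Cache instance.

```shell
# Build the socat address pair for a TLS tunnel to Azure Redis Cache.
# $1 is the DNS prefix of the cache; domain suffix and port are assumed.
redis_tunnel_args() {
  prefix="$1"
  echo "TCP-LISTEN:6379,fork,reuseaddr OPENSSL:${prefix}.redis.cache.windows.net:6380"
}

# Print the docker command that would start the tunnel (nothing is executed).
print_tunnel_command() {
  echo "docker run -d --name socat -p 6379:6379 socat-local -v $(redis_tunnel_args "$1")"
}
```

Running print_tunnel_command XXXX-XXXX-XXXX shows the full command line for review before it is pasted into a terminal.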

Testing connectivity to Azure Redis

Now I will just test my tunnel using redis-cli which I can also use from a container.  In this case THE-XXXX-PASSWORD is the Redis Access Key.

The thing to notice here is the --net host flag. This instructs Docker not to create a new virtual NIC and namespace to isolate the container network but instead to use the Host’s (my desktop) interface. This means that localhost in the container is really localhost on my Desktop.

If everything is set up properly and outgoing connections on port 6379 are allowed, you should get a PONG message back from Redis.

Happy hacking!

Running Containers on Azure

Running Containers in public cloud environments brings advantages beyond the realm of “fat” virtual machines: easy deployments through a registry of Images, better use of resources, orchestration are but a few examples.

Azure is embracing containers in a big way (Brendan Burns, one of the primary instigators of Kubernetes while at Google, joined Microsoft last year, which might have contributed to it!).

Running Containers nowadays is almost always synonymous with running an orchestrator which allows for automatic deployments of multi-Container workloads.

Here we will explore the use of Kubernetes and Docker Swarm which are arguably the two most popular orchestration frameworks for Containers at the moment. Please note that Mesos is also supported on Azure but Mesos is an entirely different beast which goes beyond the realm of just Containers.

This post is not about comparing the merits of Docker Swarm and Kubernetes, but rather a practical introduction to running both on Azure, as of August 2017.

VMs vs ACS vs Container Instances

When it comes to running containers on Azure, you have the choice of running them on Virtual Machines you create yourself or via the Azure Container Service which takes care of creating the underlying infrastructure for you.

We will explore both ways of doing things with a Kubernetes example running on ACS and a Docker Swarm example running on VMs created by Docker Machine. Note that at the time of writing, ACS does not support the relatively new Swarm mode of Docker, but things move extremely fast in the Container space… for instance, Azure Container Instances are a brand new service allowing users to run Containers directly, billed on a per second basis.

In any case, both Docker Swarm and Kubernetes offer a powerful way of managing the lifecycle of Containerised application environments alongside storage and network services.


Kubernetes

Kubernetes is a one-stop solution for running a Container cluster and deploying applications to said cluster.

Although the WordPress example found in the official Kubernetes documentation is not specifically geared at Azure, it will run easily on it thanks to the level of standardisation Kubernetes (aka K8s) has achieved.

This example deploys a MySQL instance Container with a persistent volume to store data and a separate Container which combines WordPress and Apache, sporting its own persistent volume.

The PowerShell script below leverages the “az acs create” Azure CLI 2.0 command to spin up a two-node Kubernetes cluster. Yes, one command is all it takes to create a K8s cluster! If you need ten agent nodes, just change the “--agent-count” value to 10.

It is invoked as follows:

The azureSshKey parameter points to a private SSH key (the corresponding public key must exist as well) and kubeWordPressLocation is the path to the git clone of the Kubernetes WordPress example.

Note that you need SSH connectivity to the Kubernetes Master, i.e. TCP port 22.

Following the creation of the cluster, the script leverages kubectl (run “az acs kubernetes install-cli” to install it) to deploy the aforementioned WordPress example.
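A quick way to confirm that port 22 is reachable before running anything else is a plain TCP connection test. The sketch below is only an illustration: the function name is made up for this example, and the host you pass in would be the address of your own Kubernetes Master.

```python
import socket


def can_connect(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

A `False` result usually points at a firewall or network security rule between your workstation and the master rather than at the cluster itself.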

“The” script:

Note that if you specify an OMS workspace ID and key, the script will install the OMS agent on the underlying VMs to monitor the infrastructure and the Containers (more on this later).
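Since the cluster creation boils down to a single Azure CLI call, here is a minimal sketch of how that call could be assembled. This is not the PowerShell script referred to above. The resource group, cluster name and key path are hypothetical, and passing a public key path to `--ssh-key-value` is an assumption you should check against your CLI version. The function only builds the argument list and does not execute anything.

```python
import shlex


def build_acs_create_command(resource_group, cluster_name, ssh_public_key_path,
                             agent_count=2):
    """Return the argument list for an `az acs create` call (nothing is executed)."""
    return [
        "az", "acs", "create",
        "--orchestrator-type", "kubernetes",
        "--resource-group", resource_group,
        "--name", cluster_name,
        "--agent-count", str(agent_count),   # number of agent nodes
        "--ssh-key-value", ssh_public_key_path,
    ]


if __name__ == "__main__":
    # Hypothetical names: replace them with your own resource group and cluster.
    command = build_acs_create_command(
        "k8s-demo-rg", "k8s-demo", "/home/me/.ssh/id_rsa.pub", agent_count=10)
    print(shlex.join(command))
```

Keeping the command as a list makes it easy to hand to `subprocess.run` later without worrying about shell quoting.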

Once the example has been deployed, we can check the status of the cluster using the following command (be patient as it takes nearly 20 minutes for the service to be ready):


Note the value for EXTERNAL-IP. Entering this value in the location bar of your browser after both containers are running gives you access to the familiar WordPress install screen:


This simple K8s visualizer, written a while back by Brendan Burns, shows the various containers running on the cluster:


Docker Swarm

Docker Swarm is the “other” Container orchestrator, delivered by Docker themselves. Since Docker version 1.12, the so-called “Swarm mode” does not require a discovery service like Consul, as discovery is handled by the Swarm itself. As mentioned in the introduction, Azure Container Service does not support Swarm mode yet, so we will run our Swarm on VMs created by Docker Machine.

It is now possible to use Docker Compose against Docker Swarm in Swarm mode in order to deploy complex applications, making the trifecta Swarm/Machine/Compose a complete solution competing with Kubernetes.


The new-ish Docker Swarm mode handles service discovery itself

As with the Kubernetes example, I have created a script which automates the steps to create a Swarm and then deploys a simple nginx container on it.

As the Swarm mode is not supported by ACS yet, I have leveraged Docker Machine, the Docker-sanctioned tool to create “docker-ready” VMs in all kinds of public or private clouds.

The following PowerShell script (easily translatable to a Linux flavour of shell) leverages the Azure CLI 2.0, docker machine and docker to create a Swarm and deploy an nginx Container with an Azure load balancer in front. Docker machine and docker were installed on my Windows 10 desktop using chocolatey.
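To give an idea of the sequence of steps involved, here is a rough sketch of the commands such a script would typically chain together. It is not the original PowerShell script: the node names, IP address and token are placeholders, the load balancer step is left out, and it assumes the SSH user on each VM is allowed to run docker. The function only returns the commands as argument lists and runs nothing.

```python
def swarm_commands(subscription_id, manager, workers, manager_ip, join_token):
    """Return the commands, in order, as argument lists. Nothing is executed."""
    commands = []
    # 1. One Docker-ready VM per node, created with the docker-machine Azure driver.
    for node in [manager] + list(workers):
        commands.append(["docker-machine", "create", "--driver", "azure",
                         "--azure-subscription-id", subscription_id, node])
    # 2. Initialise Swarm mode on the manager.
    commands.append(["docker-machine", "ssh", manager,
                     f"docker swarm init --advertise-addr {manager_ip}"])
    # 3. Each worker joins with the token issued by the manager
    #    (2377 is the default Swarm management port).
    for node in workers:
        commands.append(["docker-machine", "ssh", node,
                         f"docker swarm join --token {join_token} {manager_ip}:2377"])
    # 4. Deploy nginx as a replicated service published on port 80.
    commands.append(["docker-machine", "ssh", manager,
                     "docker service create --name web --replicas 2 "
                     "--publish 80:80 nginx"])
    return commands


if __name__ == "__main__":
    # Placeholder values for illustration only.
    for cmd in swarm_commands("<subscription-id>", "swarm-manager",
                              ["swarm-worker-1", "swarm-worker-2"],
                              "10.0.0.4", "<join-token>"):
        print(" ".join(cmd))
```

In a real run the join token is not known up front. It would be read from the manager (for example with `docker swarm join-token -q worker`) after step 2.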

Kubernetes vs Swarm

Compared to Kubernetes on ACS, Docker Swarm takes a bit more effort and is not as compact, but would probably be more portable as it does not leverage specific Azure capabilities, save for the load balancer. Since Brendan Burns joined Microsoft last year, it is understandable that Kubernetes is seeing additional focus.

It is a fairly recent development but deploying multi-Container applications on a Swarm can be done via a docker stack deploy using a docker compose file version 3 (compose files for older revisions need some work as seen here).

Azure Registry Service

I will explore this in more detail in a separate post, but a post about Docker Containers would not be complete without some thoughts about where the source of Images for your deployments lives (namely the registry).

Azure offers a Container Registry service (ACR) to store and retrieve Images, which is an easy option to use and is well integrated with the rest of Azure.

It might be obvious to point out, but deploying Containers based on Images downloaded from the internet can be a security hazard, so creating and vetting your own Images is a must-have in enterprise environments, hence the criticality of running your own Registry service.

Monitoring containers with OMS

My last interesting bit for Containers on Azure is around monitoring. If you deploy the Microsoft Operations Management Suite (OMS) agent on the underlying VMs running Containers and have the Docker marketplace extension for Azure as well as an OMS account, it is very easy to get useful monitoring information. It is also possible to deploy OMS as a Container but I found that I wasn’t getting metrics for the underlying VM that way.


This allows you to grab not only performance metrics from containers and the underlying infrastructure but also logs from applications, here the web logs for instance:


I hope you found this post interesting and I am looking forward to the next one on this topic.

Getting started with Azure Cloud Shell

A few weeks back I noticed that I now had the option for the Azure Cloud Shell in the Azure Portal.

What is Azure Cloud Shell?

Essentially, rather than having the Azure CLI installed on your local workstation, you can now launch it from the Portal, and 5 GB of storage is automatically assigned to it as part of the setup. So you can now create, manage and delete Azure resources using a centrally hosted CLI session. Each time you start your shell, your homedrive will mount and your profile, scripts and whatever else you’ve stored in it will be available to you. Nice. Let’s do it.

Getting Started

Log in to the Azure Portal and click on the Cloud Shell icon.

As this is the first time you’ve accessed it, you will not have any storage associated with your Azure Cloud Shell. You will be prompted for storage information.

Azure Files must reside in the same region as the machine being mounted to. Cloud Shell machines currently (July 2017) exist in the below regions:

Area            Region
Americas        East US, South Central US, West US
Europe          North Europe, West Europe
Asia Pacific    India Central, Southeast Asia

I hit the Advanced Settings to specify creation of a new Resource Group, Storage Account and File Share.

The UI doesn’t check for uniqueness of the configuration settings until they are written, so you might need a couple of attempts with the naming of your storage account. As you can see below, my attempt to use azcloudshell as a “Storage Account Name” failed, unsurprisingly, because the name was already taken.
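Uniqueness can only be checked by Azure itself, but the format rules for storage account names (3 to 24 characters, lowercase letters and digits only) can be checked locally before you submit the form. A small sketch, with a helper name made up for this example:

```python
import re

# Storage account names: 3 to 24 characters, lowercase letters and digits only.
_STORAGE_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")


def is_valid_storage_account_name(name):
    """Check the name format only. Global uniqueness still has to be checked by Azure."""
    return _STORAGE_ACCOUNT_NAME.fullmatch(name) is not None
```

A name like azcloudshell passes this format check and can still be rejected because somebody else already owns it.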

Providing unique values for these options let the initial creation go through nicely. I now had a homedrive created for my profile and any files I create and store for my sessions.

As for the commands you can use with the Azure CLI, have a look here for the full list that you can use to create, manage and delete your Azure resources.

Personally I’m currently doing a lot with Azure Functions. A list of the full range of Azure Functions CLI commands is available here.

The next thing I looked to do was to put my scripts etc into the clouddrive. I just navigated to the new StorageAccount that I created as part of this and uploaded via the browser.

Below you can see the file I uploaded on the right which appears in the directory in the middle pane.

Using the Azure CLI I changed directories and could see my uploaded file.

And that is pretty much it. Continue as you would with the CLI, but just now with it all centrally stored. Sweet.

Azure Build Pipeline using ARM Templates and Visual Studio Team Services


When you have to deploy resources within Azure, you can easily log in to the Azure Portal and start deploying them. However, with the number of components needed to build a working solution, this can quickly become time consuming. You may also need to deploy the same solution in a Development, Test, and Production environment and then make some changes to the environment along the way.

There is a lot of talk about DevOps and Infrastructure as Code (IaC) in the IT industry at the moment. A significant part of DevOps is automation. Let’s see how we can automate our deployments using Infrastructure as Code.

There is a range of different tools available for these tasks. For this example we will use the following.

ARM Template

Our starting point is to create an ARM Template (JSON format) for our environment. The resources being deployed for this example are:

  • VNET and subnet
  • Internal Load Balancer
  • Availability Set
  • Storage Account (for diagnostics)
  • 2 x Virtual Machines (using Managed Disks) assigned to our LB.

Information for Managed Disks can be found here –

The ARM Template and parameters file are available here

The two files used are:

  • ARM Template – VSTSDemoDeploy.json
  • Parameters file – VSTSDemoParameters.json

More information for authoring ARM Templates can be found here –

Create our Local Git Repo

Launch a command prompt and change to the root of the C drive, which is where we want to clone our VSTSDemo folder to.

Run the following command

git clone


You will now see a VSTSDemo folder in the root of the C drive. Open the folder and delete the .git folder (it may be hidden).

Our next step is to initialise our local folder as a Git project.

Enter the following Git command from the VSTSDemo folder

git init

Building the Pipeline with VSTS

If you do not already have an account for VSTS then you can sign up for a free account here –

Now we need to create a project in VSTS. Sign in if you have not already done so.

Click New Team Project.

Give your project a name, fill in the description, set the version control to Git and the Work item process to Agile, click Create.

Once your Project has finished creating, expand “or push an existing repository from command line”.

This gives us the commands that we need to run. Before running them we need to check the status of our local repository. From the command line, run this command from the VSTSDemo directory

git status


We can see that our branch has untracked files, so we need to add them to our repo. To do this, run

git add .

Now we need to commit the staged files. To do this, run

git commit -m "Initial check-in."

We can now run the commands supplied by VSTS at our command prompt. First run

git remote add origin

Where xxxxx is your VSTS account name and yyyyy is your VSTS Project name

Now run

git push -u origin --all

Sign in to VSTS when/if prompted

You will see something like the below when completed if successful.

Refresh your VSTS page and you will now see that Code has been committed.

Now we need to create the build definition. Click on Build & Release, then click New definition and choose empty process.

Check that the sources are correct.

When deploying we will also need to deploy the Resource Group that will contain the resources. To do this click on Add Task. Select Azure Resource Group Deployment and click Add.

Click the tick box next to the Azure Resource Group Deployment and fill in the required settings.

  • Azure Subscription – will need to click the Authorize button
  • Resource Group
  • Location
  • Template – VSTSDemoDeploy.json
  • Template Parameters – VSTSDemoParameters.json
  • Deployment Mode – Incremental

An important note about the Deployment Mode: see the description below. Choose carefully!
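The reason to choose carefully is that the two modes treat existing resources very differently. Incremental leaves resources that are in the Resource Group but not in the template untouched, whereas Complete deletes them. The sketch below is a deliberately simplified model of that behaviour using plain dictionaries (the resource names and settings are invented). It is not how Azure Resource Manager is implemented.

```python
def deploy(existing, template, mode="Incremental"):
    """Return the resource group contents after deploying `template`.

    existing / template: dicts mapping resource name -> settings.
    """
    if mode not in ("Incremental", "Complete"):
        raise ValueError(f"unknown deployment mode: {mode}")
    # Complete starts from nothing, so resources absent from the template are deleted.
    # Incremental starts from what is already there and leaves it alone.
    result = {} if mode == "Complete" else dict(existing)
    # In both modes, resources named in the template are created or updated.
    result.update(template)
    return result


if __name__ == "__main__":
    existing = {"vm1": "Standard_A1", "old-storage": "Standard_LRS"}
    template = {"vm1": "Standard_A2", "vm2": "Standard_A2"}
    print(deploy(existing, template, "Incremental"))  # old-storage survives
    print(deploy(existing, template, "Complete"))     # old-storage is removed
```

With Incremental, which is what this walkthrough uses, anything created by hand in the same Resource Group survives each build.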

Now click on the Triggers tab and enable Continuous Integration

Click Save.

We now have a build pipeline. Let’s use it to deploy our environment. From the Build & Release page click on the Build Definition

Click Queue new build

Click OK on the Queue build page.

You will see the below when the build begins.

Wait for the build to finish.

Let’s log in to the Azure Subscription and take a look at our new resources.

Looks like everything is there.

Make a change – Scale Up

Now let’s make a change by increasing the size of the VMs.

From within VSTS click on the Code tab and edit our VSTSDemoParameters file. Let’s change the Virtual Machine Size to something bigger – Standard A2 for instance. Click Commit when done.

Add a meaningful comment and click Commit.
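The same edit can also be made locally and pushed with Git instead of using the web editor. As a hedged illustration, the snippet below changes one value in an ARM parameters document. The parameter name `virtualMachineSize` and the starting size are assumptions, since the actual contents of VSTSDemoParameters.json are not shown here.

```python
import json


def set_parameter(parameters_text, name, value):
    """Return the parameters file text with one parameter value replaced."""
    document = json.loads(parameters_text)
    if name not in document["parameters"]:
        raise KeyError(f"parameter not found: {name}")
    document["parameters"][name]["value"] = value
    return json.dumps(document, indent=2)


# Hypothetical parameters file content, for illustration only.
EXAMPLE = """{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "virtualMachineSize": {"value": "Standard_A1"}
  }
}"""

if __name__ == "__main__":
    print(set_parameter(EXAMPLE, "virtualMachineSize", "Standard_A2"))
```

Committing and pushing the changed file would trigger the build in the same way as the Commit button does.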

We can see that a new build has started. That is our Continuous Integration and deployment working: it will build any changes we make automatically. Your VMs will restart once the build starts, as they are resizing.

Wait for the build to finish.

Let’s check our VMs from the Azure Portal to see the new size.

Our instance sizes are now Standard A2.

Make another change – Scale Out

Instead of using larger VM sizes, this time we need to increase the number of VMs from 2 to 4.

From within VSTS click on the Code tab and edit our VSTSDemoDeploy file. Let’s change the numberOfInstances variable from 2 to 4. Click Commit when finished, which will kick off a new build.
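A single number can drive the VM count because ARM templates can repeat a resource with a copy loop. Assuming the template names each VM by concatenating a prefix with `copyIndex()` (the template itself is not shown here, so the prefix below is hypothetical), the expansion works roughly like this:

```python
def expand_copy(name_prefix, count, offset=0):
    """Mimic concat(name_prefix, copyIndex(offset)) for each iteration of a copy loop."""
    # copyIndex() is zero-based; an optional offset is added to each index.
    return [f"{name_prefix}{index + offset}" for index in range(count)]


if __name__ == "__main__":
    before = expand_copy("vstsdemo-vm", 2)
    after = expand_copy("vstsdemo-vm", 4)
    # Resources that only appear after the change are the ones the build creates.
    print([name for name in after if name not in before])
```

Because the first two names are unchanged, an Incremental deployment leaves the existing VMs in place and only adds the new ones.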

Once the build finishes, check your Azure Subscription and you should now have 4 VMs instead of 2.

If we check our Availability Set, all VMs are members.

Lastly, we can check our Load Balancer Backend Pools: all VMs are members.


VSTS and ARM Templates can make deployments of your environments a lot quicker and easier. They also make managing additional deployments along the life cycle of your application an easier task. This method can be used to deploy any resources that are deployable using ARM Templates, whether IaaS or PaaS.