Cloud Operations – Key Service Operation Principles – Consideration

Below are some good IT Service Management Operational Principles to consider when migrating applications into Cloud.  These will help to align your operational goals and organisation’s strategic initiatives.

Principle #1

Organisation’s IT Service Management will govern and lead all IT services utilising strategic processes and technology enablers based on industry best practices.

Implications / Outcomes

  • The selected process and technology will be fit for purpose
  • Suppliers and Service Partners will be required to integrate with strategic processes and technologies
  • Process re-engineering including training will be required
  • Everyone uses or integrates with a central platform
  • Process efficiency through effective process integration
  • Reduced operating cost
  • Ensures contestability of services for Organisation

Principle #2

Contestability between IT Service providers is a key outcome for service management across IT@Organisation, where it does not have a negative impact on the customer experience.

Implications / Outcomes

  • Avoid vendor lock-in
  • Requires strategic platforms
  • Sometimes greater complexity
  • More ownership of process by Organisation
  • Better cost outcomes through competition
  • Improved performance, incumbent advantage is earned
  • Access to innovation
  • Access to capacity

Principle #3

The Organisation’s IT operating model will be based on the principles of Customer-centricity (Organisation’s business and IT), consistency and quality of service and continual improvement of process maturity.

Implications / Outcomes

  • More extensive process integration
  • Possible constraints – cost, time, resources, agility
  • Additional internal expertise
  • Governance as a focal point
  • Continual improvement
  • Improved process alignment with business alignment
  • Quantitative, demonstrable benefits
  • Improved customer satisfaction

Principle #4

Organisation will retain and own all IP for Organisation’s Service Management knowledge assets and processes.

Implications / Outcomes

  • Strong asset, capacity, knowledge management
  • Service provider governance
  • Improved internal capability
  • Service provider independence
  • Reduced risk
  • Exploitation of skills and experience gained
  • Encourage self-healing culture

Principle #5

Changes to existing Organisation processes and procedures will only be made where those changes are necessary to deliver benefits from the Cloud platform.

Implications / Outcomes

  • Vendors adapt to Organisation’s processes
  • Existing process needs to be critically assessed
  • Reduced exposure to risk
  • Reduced levels of disruption
  • Faster adoption of new processes through familiarity
  • Faster Implementation due to less change

Principle #6

Before beginning process design, ownership of the process and its outcomes, resource availability, cost benefit analysis and performance measurements will be defined and agreed.

Implications / Outcomes

  • Ownership of process is known
  • The process is appropriately resourced
  • Alignment of activities with desired outcomes
  • Improved process effectiveness
  • Reduced risk of failure
  • Resourcing cost

 

Summary

Please note that there will be practical implications to organisation’s service management processes (typically – incident management, problem management, capacity management, service restoration, change management, configuration management and release management). Also these are some of the good principles to consider and can be customised as per organisational strategy and priorities.

 

Sending Events from IoT Devices to Azure IoT Hub using HTTPS and REST

Overview

Different IoT Devices have different capabilities. Whether it is a Micro-controller or Single Board Computer your options will vary. In this post I detailed using MQTT to send messages from an IoT Device to an Azure IoT Hub as well as using the AzureIoT PowerShell Module.

For a current project I needed to send the events from an IoT Device that runs Linux and had Python support. The Azure IoT Hub includes an HTTPS REST endpoint. For this particular application using the HTTPS REST endpoint is going to be much easier than compiling the Azure SDK for the particular flavour of Linux running on my device.

Python isn’t my language of choice so first I got it working in PowerShell then converted it to Python. I detail both scripts here as a guide for anyone else trying to do something similar but also for myself as I know I’m going to need these snippets in the future.

Prerequisites

You’ll need to have configured an;

Follow this post to get started.

PowerShell Device to Cloud Events using HTTPS and REST Script

Here is the PowerShell version of the script. Update Line 3 for your DeviceID, Line 5 for your IoT Hub Name and LIne 11 for your SAS Token.

Using Device Explorer to Monitor the Device on the associated IoT Hub I can see that the message is received.

Device Explorer

Python Device to Cloud Events using HTTPS and REST Script

Here is my Python version of the same script. Again update Line 5 for your IoT DeviceID, Line 7 for your IoT Hub and Line 12 for the SAS Token.

And in Device Explorer we can see the message is received.

Device Explorer Python

Summary

When you have a device that has the ability to run Python you can use the IoT Hub HTTPS REST API to send messages from the Client to Cloud negating the need to build and compile the Azure IoT SDK to generate client libraries.

Agile Teams to High Performing Machines

Agile teams are often under scrutiny as they find their feet and as their sponsors and stakeholders realign expectations. Teams can struggle due to many reasons. I won’t list them here, you’ll find many root causes online and may have a few yourself.

Accordingly, this article is for Scrum Masters, Delivery Managers or Project Managers who may work to help turn struggling teams into high performing machines. The key to success here is measures, measures and measures. 

I have a technique I use to performance manage agile teams involving specific Key Performance Indicators (KPIs). To date it’s worked rather well. My overall approach is as follows:

  • Present the KPIs to the team and rationalise them. Ensure you have the team buy-in.
  • Have the team initially set targets against each KPI. It’s OK to be conservative. Goals should be achievable in the current state and subsequently improved upon.
  • Each sprint, issue a mid-sprint report, detailing how the team is tracking against KPIs. Use On Target and Warning, respectively to indicate where the team has to up it’s game.
  • Provide a KPI de-brief as part of the retrospective. Provide insight into why and KPIs were not satisfied.
  • Work with the team on setting the KPIs for the next sprint at the retrospective

I use a total of five KPIs, as follows:

  • Total team hours worked (logged) in scrum
  • Total [business] value delivered vs projected
  • Estimation variance (accuracy in estimation)
  • Scope vs Baseline (effectiveness in managing workload/scope)
  • Micro-Velocity (business value the team can generate in one hour)

I’ve provided a Agile Team Performance Tracker for you to use that tracks some of the data required to use these measures. Here’s an example dashboard you can build using the tracker (click to enlarge):

dashboard

In this article, I’d like to cover some of these measures in detail, including how tracking these measures can start to affect positive change in team performance. These measures have served me well and help to provide clarity to those involved.

Estimation Variance

Estimation variance is a measure I use to track estimation efficiency over time. It relies on the team providing hours based estimates for work items but is attributable to your points based estimates. As a team matures and gets used to estimation. I expect the time invested to more accurately reflect what was estimated.

I define this KPI as a +/-X% value.

So for example, if the current Estimation Variance is +/-20%, it means the target for team hourly estimates, on average for work items in this sprint, should be tracking no more than 20% above or below logged hours for those work items. I calculate the estimation variance as follows:

[estimation variance] = ( ([estimated time] / [actual time]) / [estimated time] ) x 100

If the value is less than the negative threshold, it means the team is under-estimating. If the value is more than the positive threshold, it means the team is over-estimating. Either way, if you’re outside the threshold, it’s bad news.

“But why is over-estimating an issue?” you may ask yourself? “An estimate is just an estimate. The team can simply move more items from the backlog into the sprint“. Remember that estimates are used as baselining for future estimates and planning activities. A lack of discipline in this area may impede release dates for your epics.

You can use this measure against each of the points tiers your team uses. For example:

agile1

n this example, the team is under-estimating bigger ticket items (5’s and 8’s), so targeted efforts can be made next estimation session to bring this within target threshold. Overall though in this example the team is tracking pretty well – the overall variance of -4.30% could well be within target KPI for this sprint.

Scope vs Baseline

Scope vs Baseline is a measure used to assess the team’s effectiveness at managing scope. Let’s consider the following 9-day sprint burndown:

agile2.PNG

The baseline represents the blue line. This is the projected burn-down based on the scope locked in at the start of the sprint. The scope is the orange line, representing the total scope yet to be delivered on each day of the sprint.

Obviously, a strong team tracks against or below the baseline, and will take on additional scope to stay aligned to the baseline without falling too far below it. Teams that overcommit/underdeliver will ‘flatline (not burn down) and track above the baseline, and even worse may increase scope when tracking above the baseline.

The Scope vs Baseline measure is tracked daily, with KPI calculation an average across all days in the sprint.

I define this KPI as a +/-X% value.

So for example, if the current Scope vs Baseline is +/-10%, it means the actual should not track on average more than 10% above or below the baseline. I calculate the estimation variance as follows:

[scope vs baseline] = ( [actual / projected] * 100 ) – 100

Here’s an example based on the burndown chart above:

agile3

The variance column stores the value for the daily calculation. The result is the Scope vs Baseline KPI (+4.89%). We see the value ramp up into the positive towards the end of sprint, representing our team’s challenge closing out it’s last work items. We also see the team tracking at -60% below the baseline on day 5, which subsequently triggers a scope increase to track against the baseline – a behaviour indicative of a good performing team.

Micro-Velocity

Velocity is the most well known measure. If it goes up, the team is well oiled and delivering business value. If it goes down, it’s the by-product of team attrition, communication breakdowns or other distractions.

Velocity is a relative measure, so whether it’s good or bad depends on the team and the measures taken in past sprints.

What I do is create a variation on the velocity measure that is defined as follows:

[micro velocity] = SUM([points done]) / SUM([hours worked])

I use a daily calculation of macro-velocity (vs past iterations) to determine the impact team attrition and on-boarding new users will have within a single sprint.


In conclusion, using some measures as KPIs on top of (but dependent on) the reports provided by the likes of Jira and Visual Studio Online can really help a team to make informed decisions on how to continuously improve. Hopefully some of these measures may be useful to you.

IT Service Management (ITSM) – Process Maturity Evaluation

ITSM Process Maturity Evaluation

Why are we doing this?

It is useful to measure Service Management process maturity for a number of reasons:

  • To understand the strengths and weaknesses of existing processes.
  • As a guide to planning continual process improvement.
  • As a mechanism to measure the impact of process improvement activities.

What is it?

The Process Maturity Evaluation is a method of assessing the current state of Service Management processes and procedures, it needs to:

  • Have the right balance between detail and effort.
  • Be flexible enough to be applied to any Service Management Process.
  • Capture input from all stakeholders, not just Service Management.
  • Deliver actionable information to drive improvement.
  • Can be used to drive Continual Service Improvement.

How does it work?

  • Using a set of standard process maturity questions, we assess the current state of each process/procedure, evaluating 9 aspects of maturity (typically these are pre-requirements, management intent, process capability, internal integration, products, quality control, management information, external integration and customer interface. These may vary depending on organisation types and priorities), grading on a scale from 0-5.
  • Based on this we then identify and execute a process improvement plan to address the maturity issues that we have identified.
  • Using the same approach, we re-evaluate process maturity to measure the improvement achieved and to identify the next cycle of improvement.

Process Maturity Evaluation Cycle

process maturity cycle.jpg

Sample Process Maturity Questions and Model

process questions example

Sample Process Maturity Evaluation Results (Visuals)

process maturity example.jpg

Summary

ITSM process maturity questions can vary depending on organisation types and priorities and also the baseline questions can be changed to track improvements based in initial scores. Hope you found this useful.

 

Using AWS EC2 Instances to train a Convolutional Neural Network to identify Cows and Horses

First published at https://nivleshc.wordpress.com

Background

Machine Learning (ML) and Artificial Intelligence (AI) has been a hobby of mine for years now. After playing with it approximately 8 years back, I let it lapse till early this year, and boy oh boy, how things have matured! There are products in the market these days that use some form of ML – some examples are Apple’s Siri, Google Assistant, Amazon Alexa.

Computational power has increased to the point where calcuations that took months can now be done within days. However, the biggest change has come about due to the vast amounts of data that the models can be trained on. More data means better accuracy in models.

If you have taken any programming course, you would remember the hello world program. This is a foundation program, which introduces you to the language and gives you the confidence to continue on. The hello world for ML is identifying cats and dogs. Almost every online course I have taken, this is the first project that you build.

For anyone wanting a background on Machine Learning, I would highly recommend Andrew Ng’s https://www.coursera.org/learn/machine-learning in Coursera. However, be warned, it has a lot of maths 🙂 If you are able to get through it, you will get a very good foundational knowledge on ML.

If theory is not your cup of tea, another way to approach ML is to just implement it and learn as you go. You don’t need to get a PhD in ML to start implementing it. This is the philosophy behind Jeremy Howard’s and Rachel Thomas’s http://www.fast.ai. They take you through the implementation steps and introduce you to the theory on a need to know basis, in essence you are doing a top down approach.

I am still a few lessons away from finishing the fast.ai course however, I have learnt so much and I cannot recommend it enough.

In this blog, I will take you through the steps to implement a Convolutional Neural Network (CNN) that will be able to pick out horses from cows. CNNs are quite complicated in nature so we won’t go into the nitty-gritty details on creating them from scratch. Instead, we will use the foundational libraries from fast.ai’s lesson 1 and modify it abit, so that instead of identifying cats and dogs, we will use it to identify cows and horses.

In the process, I will introduce you to a tool that will help you scrape Google for your own image dataset.

Most important of all, I will show you how the amount of data used to train your CNN model affects its accuracy.

So, put your seatbelts on and lets get started!

 

1. Setting up the AWS EC2 Instance

ML requires a lot of processing power. To get really good throughput, it is recommended to use GPUs instead of CPUs. If you were to build a kit to try this at home, it can easily cost you a few thousands of dollars, not to mention the bill for the cooling and electricity usage.

However, with Cloud Computing, we don’t need to go out and buy the whole kit, instead we can just rent it for as long as we want. This provides a much affordable way to learn ML.

In this blog, we will be using AWS EC2 instances. For the cheapest GPU cores, we will use a p2.xlarge instance. Be warned, these cost $0.90/hr, so I would suggest turning them off after using them, otherwise you will surely rack up a huge bill.

Reshma has done a fantastic job of putting together the instructions on setting up an AWS Instance for running fast.ai course lessons. I will be using her instructions, with a few modifications. Reshma’s instructions can be found here.

Ok lets begin.

  • Login to your AWS Console
  • Go to the EC2 section
  • On the top left menu, you will see EC2 Dashboard. Click on Limits under it
  • AWS_Dashboard_EC2_Limits
  • Now, on the right you will see all the type of EC2 instances you are allowed to run. Search for p2.xlarge instances. These have a current limit of zero, meaning you cannot launch them. Click on Request limit increase and then fill out the form to justify why you want a p2.xlarge instance. Once done, click on Submit. In my case, within a few minutes, I received an email saying that my limit increase had been approved.
  • Click on EC2 Dashboard from the left menu
  • Click on Launch Instance
  • In the next screen, in the left hand side menu, click on Community AMIs
  • On the right side of the screen, search for fast.ai
  • From the results, select fastai-part1v2-p2
  • In the next screen (Instance Type) filter by GPU compute and choose p2.xlarge
  • In the next screen configure the instance details. Ensure you get a public IP address (Auto-assign Pubic IP) because you will be connecting to this instance over the internet. Once done, click Next: Add Storage
  • In the next screen, you don’t need to do anything. Just be aware that the community AMI comes with a 80GB harddisk (at $0.10/GB/Month, this will amount to $8/Month). Click Next
  • In the next screen, add any tags for the EC2 Instance. To give the instance a name, you can set the Key to Name and the Value to fastai. Click Next
  • For security groups, all you need to do is allow SSH to the instance. You can leave the source as 0.0.0.0/0 (this allows connections to the EC2 instance from any public IP address). However, if you want to be super secure, you can set the source to your current ip address. However, doing this means that should your public ip address change (hardly any ISPs give you a static IP address, unless you pay extra), you will have to go back into the AWS Console and update the source in the security group. Click Next
  • In the next section, check that all details are correct and then click on Launch. You will be asked for your key pair. You can either choose an existing key pair or create a new one. Ensure you keep the key pair in a safe place because whoever possesses it can connect to your EC2 instance.
  • Now, sit back and relax, Within a few minutes, your EC2 instance will be ready. You can monitor the progress in the EC2 Dashboard

DON’T FORGET TO SHUTDOWN THE INSTANCE WHEN NOT USING IT. AT $0.90/hr, IT MIGHT NOT SEEM MUCH, HOWEVER THE COST CAN EASILY ACCUMULATE TO SOMETHING QUITE EXPENSIVE

2. Creating the dataset

To train our Convolutional Neural Network (CNN), we need to get lots of images of cows and horses. This got me thinking. Why not get it off Google? But, then this provided another challenge. How do I download all the images? Surely I don’t want to be sitting there right clicking each search result and saving it!

After some googling, I landed on https://github.com/hardikvasa/google-images-download. It does exactly as to what I wanted. It will do a google image search using a keyword and download the results.

Install it using the instructions provided in the link above. By default, it only downloads 100 images. As CNNs need lots more, I would suggest installing chromedriver. The instructions to do this is in the Troubleshooting section under ## Installing the chromedriver (with Selenium)

To download 1000 images of cows and horses, use the following command line (for some reason the tool only downloads around 800 images)

  • the downloaded images will be stored in the subfolder cows/downloaded and horses/downloaded in the /Users/x/Documents/images folder.
  • keyword denotes what we are searching for in google. For cows, we will use cow because we want a single cow’s photo. The same for horses.
  • –chromedriver provides the path to where the chromedriver has been stored
  • the images will be in jpg format
googleimagesdownload --keywords "cow" --format jpg --output_directory "/Users/x/Documents/images/" --image_directory "cows/downloaded" --limit 1000 --chromedriver /Users/x/Documents/tools/chromedriver
googleimagesdownload --keywords "horse" --format jpg --output_directory "/Users/x/Documents/images/" --image_directory "horses/downloaded" --limit 1000 --chromedriver /Users/x/Documents/tools/chromedriver

3. Finding and Removing Corrupt Images

One disadvantage of using googleimagedownload script is that, at times a downloaded image cannot be opened. This will cause issues when our CNN tried to use it for training/validating.  To ensure our CNN does not encounter any issues, we will do some housekeeping before hand and remove all corrupt images (images that cannot be opened).

I wrote the following python script to find and move the corrupt images to a separate folder. The script uses the matplotlib library (the same library used by the fast.ai CNN framework) If you don’t have it, you will need to download it from https://matplotlib.org/users/installing.html.

The script assumes that within the root folder, there is a subfolder called downloaded which contains all the images. It also assumes there is a subfolder called corrupt within the root folder. This is where the corrupt images will be moved to. Set the root_folder_path to the parent folder of the folder where the images are stored.

#this script will go through the downloaded images and find those that cannot be opened. These will be moved to the corrupt folder.

#load libraries
import matplotlib.pyplot as plt
import os

#image folder
root_folder_path = '/Users/x/Documents/images/cows/'
image_folder_path = root_folder_path + 'downloaded/'
corrupt_folder_path = root_folder_path + 'corrupt' #folder were the corrupt images will be moved to

#get a list of all files in the img folder
image_files = os.listdir(f'{image_folder_path}')

print (f'Total Image Files Found: {len(image_files)}')
num_image_moved = 0

#lets go through each image file and see if we can read it
for imageFile in image_files:
 filePath = image_folder_path + imageFile
 #print(f'Reading {filePath}')
 try:
 valid_img = plt.imread(f'{filePath}')
 except:
 print (f'Error reading {filePath}. File will be moved to corrupt folder')
 os.rename(filePath,os.path.join(corrupt_folder_path,imageFile))
 num_image_moved += 1

print (f'Moved {num_image_moved} images to corrupt folder')

For some unknown reason, the script, at times, moves good images into the corrupt folder as well. I would suggest that you go through the corrupt images and see if you can open them (there won’t be many in the corrupt folder). If you can, just manually move them back into the downloaded folder.

To make the images easier to handle, lets rename them using the following format.

  • For the images in the cows/downloaded folder rename them to a format CowXXX.jpg where XXX is a number starting from 1
  • For the images in the horses/downloaded folder rename them to a format HorseXXX.jpg where XXX is a number starting from 1

 

4. Transferring the images to the AWS EC2 Instance

In the following sections, I am using ssh and scp which come builtin with MacOS. For Windows, you can use putty for ssh and WinSCP for scp

A CNN (or any other Neural Network model) is trained using a set of images. Once training has finished, to find how accurate the model is, we give it a set of validation images (these are different to those it was trained on, however we know what these images are of) and ask it to identify the images. We then compare the results with what the actual image was, to find the accuracy.

 

In this blog, we will first train our CNN on a small set of images.

Do the following

  • create a subfolder inside the cows folder and name it train
  • create a subfolder inside the cows folder and name it valid
  • move 100 images from the cows/downloaded folder into the cows/train folder
  • move 20 images from the cows/downloaded folder into the cows/valid folder

Make sure the images in the cows/train folder are not the same as those in cows/valid folder

Do the same for the horses images, so basically

  • create a subfolder inside the horses folder and name it train
  • create a subfolder inside the horses folder and name it valid
  • move 100 images from the horses/downloaded folder into the horses/train folder
  • move 20 images from the horses/downloaded folder into the horses/valid folder

Now connect to the AWS EC2 instance the following command line

ssh -i key.pem ubuntu@public-ip

where

  • key.pem is the key pair that was used to create the AWS EC2 instance (if the key pair is not in the current folder then provide the full path to it)
  • public-ip is the public ip address for your AWS EC2 instance (this can be obtained from the EC2 Dashboard)

Once connected, use the following commands to create the required folders

cd data
mkdir cowshorses
mkdir cowhorses/train
mkdir cowhorses/valid
mkdir cowhorses/train/cows
mkdir cowhorses/train/horses
mkdir cowhorses/valid/cows
mkdir cowhorses/valid/horses

Close your ssh session by typing exit

Run the following commands to transfer the images from your local computer to the AWS EC2 instance

To transfer the cows training set
scp -i key.pem /Users/x/Documents/images/cows/train/*  ubuntu@public-ip::~/data/cowshorses/train/cows

To transfer the horses training set
scp -i key.pem /Users/x/Documents/images/horses/train/*  ubuntu@public-ip::~/data/cowshorses/train/horses

To transfer the cows validation set
scp -i key.pem /Users/x/Documents/images/cows/valid/*  ubuntu@public-ip::~/data/cowshorses/valid/cows

To transfer the horses validation set
scp -i key.pem /Users/x/Documents/images/horses/valid/*  ubuntu@public-ip::~/data/cowshorses/valid/horses

5. Starting the Jupyter Notebook

Jupyter Notebooks are one of the most popular tools used by ML and data scientists. For those that aren’t familiar with Jupyter Notebooks, in a nutshell, it a web page that contains descriptions and interactive code. The user can run the code live from within the document. This is possible because Jupyter Notebook’s execute the code on the server it is running on and then displays the result in the web page. For more information, you can check out http://jupyter.org

In our case, we will be running the Jupyter Notebook on the AWS EC2 instance. However, we will be accessing it through our local computer. For security reasons, we will not publish our Jupyter Notebook to the whole wide world (lol that does spell www).

Instead, we will use the following ssh command to bind our local computer’s tcp port 8888 to the AWS EC2 instance’s tcp port 8888 (this is the port on which the Jupyter Notebook will be running) when we connect to it. This will allow us to access the Jupyter Notebook as if it is running locally on our computer, however the connection will be tunnelled to the AWS EC2 instance.

ssh  -i key.pem ubuntu@public-ip -L8888:localhost:8888

Next, run the following commands to start an instance of Jupyter Notebook

cd fastai
jupyter notebook

After the Jupyter Notebook starts, it will provide a URL to access it, along with the token to authenticate with. Copy it and then paste it into a browser on your local computer.

You will now be able to access the fastai Jupyter Notebook.

Follow the steps below to open Lesson 1.

  • click on the courses folder
  • once inside the courses folder,  click on the  dl1 folder

In the next screen, find the file lesson1.ipynb and double-click it. This will launch the lesson1 Jupyter Notebook in another tab.

Give yourself a big round of applause for reaching so far!

Now, start from the top of lesson1 and go through the first three code sections and execute them. To execute the code, put the mouse pointer in the code section and then press Shift+Enter.

In the next section, change the path to where we moved the cows and horses pictures to. It should look like below

PATH = "data/cowshorses/"

Then, execute this code section.

Skip the following sections

  • Extra steps if NOT using Crestle or Paperspace or our scripts
  • Extra steps if using Crestle

Just a word of caution. The original Jupyter Notebook is meant to distinguish between cats and dogs. However, since we are using it to distinguish between cows and horses, whenever you see a mention of cats, change it to cows and whenever you see a mention of dogs, change it to horses.

The following lines don’t need any changing, so just execute them as they are

os.listdir(PATH)
os.listdir(f'{PATH}valid')

In the next line, replace cats with cows so that you end up with the following

files = !ls {PATH}valid/cows | head
files

Execute the above code. A list of the first 10 cow image files will be displayed.

Next, lets see what the first cow image looks like.

In the next line, change cats to cows to get the following.

img = plt.imread(f'{PATH}valid/cows/{files[0]}')
plt.imshow(img);

Execute the code and you will see the cow image displayed.

Execute the next two code sections. Leave the section after that commented out.

Now, instead of creating a CNN model from scratch, we will use one that was pre-trained on ImageNet which had 1.2 million images and 1000 classes. So it already knows quite a lot about how to distinguish objects. To make it suitable to what we want to do, we will now train it further on our images of cows and horses.

The following defines which model to use and provides the data to train on (the CNN model that we will be using is called resnet34). Execute the below code section.

data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(resnet34, sz))
learn = ConvLearner.pretrained(resnet34, data, precompute=True)

And now for the best part! Lets train the model and give it a learning rate of 0.01.

learn.fit(0.01, 1)

After you execute the above code, the model will be trained on the cows and horses images that were provided in the train folders. The model will then be tested for accuracy by getting it to identify the images contained in the valid folders. Since we already know what the images are of, we can use this to calculate the model’s accuracy.

When I ran the above code, I got an accuracy of 0.75. This is quite good since it means the model can identify cows from horses 75% of the time. Not to forget, we used only 100 cows and 100 horses images to train it, and it didn’t even take that long to train it !

Now, lets see what happens when we give it loads more images to train on.

BTW to get more insights into the results from the trained model,  you can go through all the sections between the lines learning.fit(0.01,1) and Choosing a learning rate.

Another take at training the model

From all the literature I have been reading, one point keeps on repeating. More data means better models. Lets put this to the test.

This time around we will give the model ALL the images we downloaded.

Do the following.

  • on your local computer, move the photos back to the downloaded folder
    • move photos from cows/train to cows/downloaded
    • move photos from cows/valid to cows/downloaded
    • move photos from horses/train to horses/downloaded
    • move photos from horses/valid to horses/downloaded
  • on your local computer, move 100 photos of cows to cows/valid folder and the rest to the cows/train folder
    • move 100 photos from cows/downloaded to cows/valid folder
    • move the rest of the photos from cows/downloaded to cows/train folder
  • on your local computer, move 100 photos for horses to horses/valid and the rest to horses/train folder
    • move 100 photos from horses/downloaded to horses/valid folder
    • move the rest of the photos from horses/downloaded to horses/train folder
  • on the AWS EC2 instance, delete all the photos under the following folders
    • /data/cowshorses/train/cows
    • /data/cowshorses/train/horses
    • /data/cowshorses/valid/cows
    • /data/cowshorses/valid/horses

Use the following commands to copy the images from the local computer to the AWS EC2 Instance

To transfer the cows training set
scp -i key.pem /Users/x/Documents/images/cows/train/*  ubuntu@public-ip::~/data/cowshorses/train/cows

To transfer the horses training set
scp -i key.pem /Users/x/Documents/images/horses/train/*  ubuntu@public-ip::~/data/cowshorses/train/horses

To transfer the cows validation set
scp -i key.pem /Users/x/Documents/images/cows/valid/*  ubuntu@public-ip::~/data/cowshorses/valid/cows

To transfer the horses validation set
scp -i key.pem /Users/x/Documents/images/horses/valid/*  ubuntu@public-ip::~/data/cowshorses/valid/horses

Now that everything has been prepared, re-run the Jupyter Notebook, as stated under Starting Jupyter Notebook above (ensure you start from the top of the Notebook).

When I trained the model on ALL the images (less those in the valid folder) I got an accuracy of 0.95 ! Wow that is soo amazing! I didn’t do anything other than increase the amount of images in the training set.

Final thoughts

In a future blog post, I will show you how you can use the trained model to identify cows and horses from a unlabelled set of photos.

For now, I would highly recommend that you use the above mentioned image downloader to scrape Google for some other datasets. Then use the above instructions to train the model on those images and see what kind of accuracy you can achieve (maybe try identifying chickens and ducks?)

As mentioned before, once finished, don’t forget to shut down your AWS EC2 instance. If you don’t need it anymore, you can terminate it, to save on storage costs as well.

If you are keen about ML, you can check out the courses at http://www.fast.ai (they are free)

If you want to dabble in the maths behind ML, as perviously mentioned, Andrew Ng’s https://www.coursera.org/learn/machine-learning is one of the finest.

Lastly, if you are keen to take on some ML challenges, check out https://www.kaggle.com They have lots and lots competitions running all the time, some of which pay out actual money. There are lots of resources as well and you can learn off others on the site.

Till the next time, Enjoy 😉

Auto-redirect ADFS 4.0 home realm discovery based on client IP

As I mentioned in my previous post here that I will explain how to auto-redirect the home realm discovery page to an ADFS namespace (claims provider trust) based on client’s IP so here I am.

Let’s say you have many ADFS servers (claims providers trusts) linked to a central ADFS 4.0 server and you want to auto-redirect the user to a linked ADFS server login page based on user’s IP instead of letting the user to choose a respective ADFS server from the list on the home realm discovery page as explained in the below request flow diagram.

You can do so by doing some customization as mentioned below:

  1. Create a database of IP ranges mapped to ADFS namespaces
  2. Develop a Web API which returns the relevant ADFS namespace based on request IP
  3. Add custom code in onload.js file on the central ADFS 4.0 server to call the Web API and do the redirection

It is assumed that all the boxes including Central ADFS, linked ADFS, Web Server, SQL Server are setup. All the nitties and gritties are sorted out in terms of firewall rules, DNS lookups, SSL certificates. If not then you can get help from an infrastructure guy on that.

Lets perform the required action on SQL, Web and ADFS Server.

SQL Server

Perform the following actions on the SQL Server:

  1. Create a new database
  2. Create a new table called Registration as shown below

  1. Insert some records in the table for the linked ADFS server IP range, for example

Start IP: 172.31.117.1, End IP: 172.31.117.254, Redirect Name: http://adfs.adminlab.com/adfs/services/trust

Web Server

Perform the following actions for the Web API development and deployment:

  1. Create a new ASP.NET MVC Web API project using Visual Studio
  2. Create a new class called Redirect.cs as shown below (I would have used the same name as database table name ‘Registration’ but it’s OK for now)

  1. Insert a new Web API controller class called ResolverController.cs as shown below. What we are doing here is getting the request IP address and getting the IP ranges from the database, comparing the request IP address with the IP ranges from the database by converting both to long IP address. If the request IP is in range then returning the redirect object.

  1. Add a connection string in the web.config named DbConnectionString pointing to the database we created above.
  2. Deploy this web API project to the web server IIS
  3. Configure the HTTPS binding as well for this web API project using the SSL certificate
  4. Note down the URL of the web API, something like ‘https://{Web-Server-Web-API-URL}/api/resolver/get’, this will be used in the onload.js

Central ADFS 4.0 Server

Perform the following actions on the central ADFS 4.0 server:

  1. Run the following PowerShell command to export current theme to a location

Export-AdfsWebTheme -Name default -DirectoryPath D:\Themes\Custom

  1. Run the following PowerShell command to create a new custom theme based on current theme

New-AdfsWebTheme -Name custom -SourceName default 

  1. Update onload.js file extracted in step 1 at D:\Themes\Custom\theme\script with following code added at the end of the file. What we are doing here is calling the web API which returns the matched Redirect object with RedirectName as ADFS namespace and setting the HRD.selection as that redirect name.

  1. Run the following PowerShell command to update back the onload.js file in the theme

Set-AdfsWebTheme -TargetName custom -AdditionalFileResource @{Uri=’/adfs/portal/script/onload.js’;path=”D:\Themes\Custom\theme\script\onload.js”} 

  1. Run the following PowerShell command to make the custom theme as your default theme

Set-AdfsWebConfig -ActiveThemeName custom -HRDCookieEnabled $false

Now when you test from your linked ADFS server or a client machine linked to the linked ADFS server (which is linked to a central ADFS server), the auto-redirect kicks in from onload.js and forwards it to web API which gets the client IP and matches it with relevant ADFS where the request came from and redirects the user to the relevant ADFS login page, instead of user selecting the relevant ADFS namespace from the available list on home realm discovery page.

If the relevant match is not found, the default home realm discovery page with list of available ADFS namespaces is shown.

Standard Operational Checks for IT Service Management Processes – Once Implemented

Why Operational Process Checks are Required in Service Management?

  • In order to sustain and measure how effectively we are executing our processes, we require to have process operational checks once implemented.
  • This will allow us to identify inefficiencies and subsequently to improve the current Service Management processes.
  • These checks will also provide input into Continual Service Improvement (CSI) Programme.

checklist.jpg

Operational Checks for Service Management Processes – Once Implemented

Ops checks SM.jpg

Standard Guidelines

  • The operational process checks will be managed by IT Service Management
  • Through process governance meetings, agreement can be established on who will update a section of a process or the whole document.
  • IT Service Management will need to be involved in all process governance meetings.
  • IT Service Management will conduct an internal audit on all Service Management processes at least once annually.

Review and Measure at Regular Intervals

It is very important that IT Service Management reviews all processes and measurements at least once every year to make appropriate changes to obtain their target IT and business goals.

 taylor.jpg

 

Thank you……..

Utilising Azure Speech to Text Cognitive Services with PowerShell

Introduction

Yesterday I posted about using Azure Cognitive Services to convert text to speech. I also eluded that I’ve been leveraging Cognitive Services to do the conversion from Speech to Text. I detail that in this post.

Just as with the Text to Speech we will need an API key to use Cognitive Services. You can get one from Azure Cognitive Services here.

Source Audio File

I created an audio file in Audacity  for testing purposes. In my real application it is direct spoken text, but that’s a topic for another time.  I set the project rate to 16000hz for the conversion source file then exported the file as a .wav file.

Capture Audio

The Script

The Script below needs to be updated for your input file (line 2) and your API Key (line 7). Run it liine by line in VSCode or PowerShell ISE.

Summary

That’s it. Pretty simple once you have a reference script to work with. Enjoy.

Converted

 

Major Incident Management – Inputs and Outputs

Definition

A major incident is an incident that results in significant disruption to the business and demands a response beyond the routine incident management process.

The Major Incident Management Process applies globally to all Customers and includes Incidents resulting in a service outage.  This process is triggered by Incidents directly raised by Users or via referral from the Event Management Process, which are classified as Major Incidents in the Incident Management Process by the Service Desk.

Entry Criteria

Major Incident Management Process commences when an Incident has been identified and recorded as a Major Incident in the Incident Management Tool by the Service Desk.

Inputs

The inputs to the Major Incident Management Process include:

  • Incidents identified as Major Incident from the Incident Management Process raised directly by Users or generated via referral from the Event Management Process.
  • Knowledge Management Database (known errors/ existing resolutions/ accepted workarounds)
  • Details of impacted Configuration Items (CIs) from Configuration Management Database (CMDB)
  • Major Incident Assignment Scripts & Organisational Major Incident Communication Process

Exit Criteria

The Major Incident Management Process is complete when:

  • Major Incident is resolved or self-resolved and the Incident record is closed with a workaround/permanent resolution. In case of a workaround a Problem record will be raised for Root Cause Analysis and permanent resolution.
  • Incident is downgraded and routed to the Incident Management Process for resolution.

Outputs

The expected outputs from the Major Incident Management Process are:

  • Successfully implemented Change through the Change Management Process.
  • Problem record raised in Problem Management tool for Root Cause Analysis (RCA).
  • Priority downgraded Incident routed to Incident Management Process for resolution.
  • Restored Service.
  • Updated Knowledge Base/Known Error Database.
  • Closed Incident record with accurate details of the Incident attributes and the steps taken for resolution.
  • Notification through various channels (email, call etc.) on the initiation, resolution process and closure of Major Incident to various stakeholders.
  • Updated Daily Operation Report.
  • Accurately recorded service and/or component outage details (e.g. start, end, duration, outage classification, etc.).

Key Activities

  • Major Incident identified from Incident Management Process
  • Assign Incident to Incident Resolver Team
  • Notify Resolver & Organisational Incident Management Teams
  • Service Restoration
  • Notifications and Communications
  • Conduct Investigation & Diagnosis
  • Resolution
  • SME inputs
  • Develop Resolution/Workaround
  • Resolution Type (Permanent)
  • Root Cause Analysis
  • Change Record
  • Confirmation of Status of Incident
  • Daily Operations Update
  • Closure of Incident

Summary

It is very important to establish a Major Incident Management Process in conjunction with Incident Management, Problem Management, Change Management and Communication Management processes. A key process in ITSM to restore service in the event of a major incident.

Automate network share migrations to Sharepoint Online using ShareGate PowerShell

Sharegate supports PowerShell scripting which can be used to automate and schedule migrations. In this post, I am going to demonstrate an example of end to end automation to migrate network Shares to SharePoint Online. The process effectively reduces the task of executing migrations to “just flicking a switch”.

Pre-Migration

The following pre-migration activities were conducted before the actual migration:

  1. Analysis of Network Shares
  2. Discussions with stakeholders from different business units to identify content needs
  3. Pilot migrations to identify average throughput capability of migration environment
  4. Identification of acceptable data filtration criteria, and prepare Sharegate migration template files based on business requirements
  5. Derive a migration plan from above steps

Migration Automation flow

The diagram represents a high-level flow of the process:

 

The migration automation was implemented to execute the following steps:

  1. Migration team indicates that migration(s) are ready to be initiated by updating the list item(s) in the SharePoint list
  2. Updated item(s) are detected by a PowerShell script polling the SharePoint list
  3. The list item data is downloaded as a CSV file. It is one CSV file per list item. The list item status is updated to “started”, so that it would not be read again
  4. The CSV file(s) are picked up by another migration PowerShell script to initiate migration using Sharegate
  5. The required migration template is selected based on the options specified in the migration list item / csv to create a migration package
  6. The prepared migration task is queued for migration with Sharegate, and migration is executed
  7. Information mails are “queued” to be dispatched to migration team
  8. Emails are sent out to the recipients
  9. The migration reports are extracted out as CSV and stored at a network location.

Environment Setup

Software Components

The following software components were utilized for the implementing the automation:

  1. SharePoint Online Management shell
  2. SharePoint PnP PowerShell
  3. Sharegate migration Tool

Environment considerations

Master and Migration terminals hosted as Virtual machines – Each terminal is a windows 10 virtual machine. The use of virtual machines provides following advantages over using desktops:

  • VMs are generally deployed directly in datacenters, hence, near the data source.
  • Are available all the time and are not affected by power outages or manual shutdowns
  • Can be easily scaled up or down based on the project requirements
  • Benefit from having better internet connectivity and separate internet routing can be drawn up based on requirements

Single Master Terminal – A single master terminal is useful to centrally manage all other migration terminals. Using a single master terminal offers following advantages:

  • Single point of entry to migration process
  • Acts as central store for scripts, templates, aggregated reports
  • Acts as a single agent to execute non-sequential tasks such as sending out communication mails

Multiple Migration terminals – It would be optimal to initiate parallel migrations and use multiple machines (terminals) to expedite overall throughput by parallel runs in an available migration window (generally non-business hours).  Sharegate has option to use either 1 or 5 licenses at once during migration. We utilized 5 ShareGate licenses on 5 separate migration terminals.

PowerShell Remoting – Using PowerShell remoting allows opening remote PowerShell sessions to other windows machines. This will allow the migration team to control and manage migrations using just one terminal (Master Terminal) and simplify monitoring of simultaneous migration tasks. More information about PowerShell remoting can be found here.

PowerShell execution policy – The scripts running on migration terminals will be stored at a network location in Master Terminal. This will allow changing / updating scripts on the fly without copying the script over to other migration terminals. The script execution policy of the PowerShell window will need to be set as “Bypass” to allow execution of scripts stored in network location (for quick reference, the command is “Set-ExecutionPolicy -ExecutionPolicy Bypass”.

Windows Scheduled Tasks – The PowerShell scripts are scheduled as tasks through Windows Task schedulers and these tasks could be managed remotely using scripts running on the migration terminals. The scripts are stored at a network location in master terminal.

Basic task in a windows scheduler

PowerShell script file configured to run as a Task

Hardware specifications

Master terminal (Manage migrations)

  • 2 cores, 4 GB RAM, 100 GB HDD
  • Used for managing scripts execution tasks on other terminals (start, stop, disable, enable)
  • Used for centrally storing all scripts and ShareGate property mapping and migration templates
  • Used for Initiating mails (configured as basic tasks in task scheduler)
  • Used for detecting and downloading migration configuration of tasks ready to be initiated (configured as basic tasks in task scheduler)
  • Windows 10 virtual machine installed with the required software.
  • Script execution policy set as “Bypass”

Migration terminals (Execute migrations)

  • 8 cores, 16 GB RAM, 100 GB HDD
  • Used for processing migration tasks (configured as basic tasks in windows task scheduler)
  • Multiple migration terminals may be set up based on the available Sharegate licenses
  • Windows 10 Virtual machines each installed with the required software.
  • Activated Sharegate license on each of the migration terminals
  • PowerShell remoting needs to be enabled
  • Script execution policy set as “Bypass”

Migration Process

Initiate queueing of Migrations

Before migration, migration team must perform manual pre – migration tasks (if any as defined by the migration process as defined and agreed with stake holders). Some of the pre-migration tasks / checks may be:

  • Inform other teams about a possible network surge
  • Confirming if another activity is consuming bandwith (scheduled updates)
  • Inform the impacted business users about the migration – this would be generally set up as the communication plan
  • Freezing the source data store as “read-only”

A list was created on a SharePoint online site to enable users to indicate that the migration is ready to be processed. The updates in this list shall trigger the actual migration downstream. The migration plan is pre-populated in this list as a part of migration planning phase. The migration team can then update one of the fields (ReadyToMigrate in this case) to initiate the migration. Migration status is also updated back to this list by the automation process or skip a planned migration (if so desired).

The list provides as a single point of entry to initiate and monitor migrations. In other words, we are abstracting out migration processing with this list and can be an effective tool for migration and communication teams.

The list was created with the following columns:

  • Source => Network Share root path
  • Destination site => https://yourteanant.sharepoint.com/sites/<Sitename>
  • Destination Library => Destination library on the site
  • Ready to migrate => Indicates that the migration is ready to be triggered
  • Migrate all data => Indicate if all data from the source is to be migrated (default is No). Only filtered data based on the predefined options will be migrated. (more on filtered options can be found here)
  • Started => updated by automation when the migration package has been downloaded
  • Migrated => updated by automation after migration completion
  • Terminal Name => updated by automation specifying the terminal being used to migrate the task

 

Migration configuration list

 

After the migration team is ready to initiate the migration, the field “ReadyToMigrate” for the migration item in the SharePoint list is updated to “Yes”.


“Flicking the switch”

 

Script to create the migration configuration list

The script below creres the source list in sharepoint online.

Script to store credentials

The file stores the credentials and can be used subsequent scripts.

Queuing the migration tasks

A PowerShell script is executed to poll the migration configuration list in SharePoint at regular intervals to determine if a migration task is ready to be initiated. The available migration configurations are then downloaded as CSV files, one item / file and stored in a migration packages folder on the master terminal. Each CSV file maps to one migration task to be executed by a migration terminal and ensures that the same migration task is not executed by more than one terminal. It is important that this script runs on a single terminal to ensure only one migration is executed for one source item.

 

 

Execute Migration

The downloaded migration configuration CSV files are detected by migration script tasks executing on each of the migration terminals. Based on the specified source, destination and migration options the following tasks are executed:

  1. Reads the item from configuration list to retrieve updated data based on item ID
  2. Verify Source. Additionally, sends a failure mail if source is invalid or not available
  3. Revalidates if a migration is already initiated by another terminal
  4. Updates the “TerminalName” field in the SharePoint list to indicate an initiated migration
  5. Checks if the destination site is created. Creates if not already available
  6. Checks if the destination library is created. Creates if not already available
  7. Triggers an information mail informing migration start
  8. Loads the required configurations based on the required migration outcome. The migration configurations specify migration options such as cut over dates, source data filters, security and metadata. More about this can be found here.
  9. Initiates the migration task
  10. Extracts the migration report and stores as CSV
  11. Extracts the secondary migration report as CSV to derive paths of all files successfully migrated. These CSV can be read by an optional downstream process.
  12. Triggers an information mail informing migration is completed
  13. Checks for another queued migration to repeat the procedure.

The automatioin script is given below –

 


Additional Scripts

Send Mails

The script triggers emails to required recipients.

This script polls a folder:   ‘\masterterminal\c$\AutomatedMigrationData\mails\input’ to check any files to be send out as emails. The csv files sepcify subject and body to be send out as emails to recipients configured in the script. Processed CSV files are moved to final folder.

 

Manage migration tasks (scheduled tasks)

The PowerShell script utilizes PowerShell remoting to manage windows Task scheduler tasks configured on other terminals.

 


Conclusion

The migration automation process as described above helps in automating the migration project and reduces manual overhead during the migration. Since the scripts utilize pre-configured migration options / templates, the outcome is consistent with the plan. Controlling and monitoring migration tasks utilizing a SharePoint list introduces transparency in the system and abstracts the migration complexity. Business stakeholders can review migration status easily from the SharePoint list and this ensures an effective communication channel. Automated mails informing about migration status provide additional information about the migration. The migration tasks are executed in parallel across multiple migration machines which aids in a better utilization of available migration window.