Publish Content to the Azure CDN with URLRewrite

Content Delivery Network

A content delivery network (CDN) is a distributed network of servers that cache and serve content from edge nodes closer to the user’s browser. By utilising this functionality, websites can offload much of their static content delivery to those servers, saving valuable web processing and bandwidth for core business-related activities and giving the user a better online experience.

A CDN should be considered for delivering content for Internet workloads that exhibit:

  • Static or slow changing content
  • Content shared by many users
  • Geographically dispersed users
  • Ad-hoc or irregular usage (which therefore doesn’t get the benefit of the browser cache)
  • Expensive or saturated bandwidth connections

One or more of these characteristics is an indicator that a CDN may be a worthwhile investment, but the investment need not be a large one.

Azure CDN

The Windows Azure CDN is a network of servers deployed at strategically placed locations around the globe. They cache and deliver Windows Azure blobs and the static content output of compute instances. The CDN can be enabled through the Windows Azure Platform Management Portal as an add-on feature to your subscription. This article shows how to cheaply and easily leverage the Azure CDN in your architecture without any development, migration or publishing effort.

Before looking at Azure, it’s worth taking a step back to look at some of the foundational components that can be used to build a CDN. IIS 7 and later has a powerful module plug-in capability for affecting the request/response pipeline, which can be leveraged to implement content serving farms (in fact I bet these components are used under the Azure CDN).

URLRewrite

One of the most powerful and underrated plugins is URLRewrite 2.0, which has a config-driven regular expression rules engine for mapping incoming requests to new URLs and for modifying the content of the outgoing response. While this is a similar concept to the ASP.NET URL routing used heavily in MVC, it works at a lower level and doesn’t actually need ASP.NET at all.

Start out by downloading and installing URLRewrite for IIS 7 (or 8), which will add a URLRewrite icon to your IIS management console.

  • Create a local website with some content in a folder like:
    website/content/Kloud.jpg
  • Create a folder (or virtual directory) called “cdn”.

We can now create a rule that delivers the “Content” folder directly from a new content URL.

  • In IIS click on the cdn folder and double click the URLRewrite icon and create a new “Blank” inbound rule.

  • Add a new rule called “cdn” as a case-insensitive regular expression rule to match all incoming URLs (.*)
  • and rewrite them to the content folder
    (/Content/{R:1})
  • Click Apply.

This will create a web.config file in the cdn folder containing the rewrite rules (which can be edited directly if you prefer that to the GUI wizard). Now the web site has two URLs from which the content can be retrieved:
http://localhost/cdn/kloud.jpg and http://localhost/Content/kloud.jpg
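
To make the mapping concrete, here is a minimal Python sketch of what the “cdn” rule does to a request. It is only a simulation of the rule described above, not IIS configuration; the function name is hypothetical, and it assumes the rule sees the part of the URL that follows the cdn folder.

```python
import re

# Stand-in for the "cdn" inbound rule: pattern (.*), matched case-insensitively.
CDN_RULE = re.compile(r"(.*)", re.IGNORECASE)


def rewrite_cdn_path(path_under_cdn: str) -> str:
    """Map the part of the URL after /cdn/ onto the Content folder."""
    match = CDN_RULE.fullmatch(path_under_cdn)
    if match is None:
        raise ValueError(f"path does not match the rule: {path_under_cdn!r}")
    # {R:1} in the rewrite target is the first capture group of the pattern.
    return "/Content/" + match.group(1)


print(rewrite_cdn_path("kloud.jpg"))  # /Content/kloud.jpg
```

A request for /cdn/kloud.jpg is therefore served from the same file as /Content/kloud.jpg, which is why both URLs return the image.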

While that’s a handy way to create vanity URLs, the same could be achieved with a simple virtual directory. The real power of the Rewrite Module is in partnership with the Application Request Routing module.

Application Request Routing

Application Request Routing is a module designed to provide the functionality to build large content caching farms from IIS servers. ARR 2.5 (at time of writing) is available as a set of 4 dependent modules through the Web Platform Installer, and also as a direct install (this will become important later).

In our case we’ll use just one part: the ability to request “off box” for the rewritten URL. That is, we can turn our virtual directory into a reverse proxy simply by editing our rewrite rules appropriately.

  • Open the URLRewrite configuration for the cdn folder and “Disable Rule” on the cdn rule created previously
  • Now click Add rule and choose the “Reverse Proxy” rule and click OK to the warning about setting up a proxy rule.
  • Here we will define a server that will be serving the content, for instance this blog kloudsolutions.files.wordpress.com (but any site capable of serving content will do).


(Don’t worry about outbound rules; those will come later.)

  • Accept the default rules (match all URLs and append query string) and click Apply, which will create a new reverse proxy inbound rule.

Now browse to a URL under the local cdn folder, and try it with different query strings too.

The local IIS should now be forwarding requests and serving content on behalf of the Kloud blog, which is interesting but not altogether useful unless some other value adding service is provided like caching.
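
As a rough illustration of the forwarding just described, the Python sketch below builds the off-box URL that a request under the cdn folder would be sent to. It simulates the rule rather than reproducing ARR; the function name and the use of plain http are assumptions, while the host and the sample path are the ones used in this article.

```python
from urllib.parse import urlunsplit

# The content server named in the text; any site capable of serving content will do.
UPSTREAM_HOST = "kloudsolutions.files.wordpress.com"


def proxy_target(path_under_cdn: str, query: str = "") -> str:
    """Build the upstream URL for a request received under /cdn/."""
    path = "/" + path_under_cdn.lstrip("/")
    # The query string is appended unchanged, as in the default proxy rule.
    return urlunsplit(("http", UPSTREAM_HOST, path, query, ""))


print(proxy_target("2011/12/121211_0222_lync2010mob1.png", "w=50"))
```

The path and the query string pass through untouched; only the host changes, which is all a reverse proxy rule needs to do.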

Azure WebRole CDN Folder

The first release of the CDN required static content to be pre-uploaded to an Azure Blob store marked for publishing via the CDN. That meant a big upfront effort to identify and manually publish the content appropriate for CDN publishing. The March 2011 update to Azure quietly released some updates to the CDN that made publishing content much easier and much more powerful.

“Developers can now use the Windows Azure Web and VM roles as “origin” for objects to be delivered at scale via the Windows Azure CDN”

That means if a user’s browser requests some content from the CDN that isn’t yet in the Blob store then Azure will try to hit your WebRole on the /cdn folder to deliver the content before giving the user a 404 not found error.
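
The origin-pull behaviour can be pictured with a toy model. The Python sketch below is not how the Azure CDN is implemented; the class, the in-memory dictionary and the fake WebRole are all hypothetical and only illustrate the order of events: serve from the edge if the object is there, otherwise ask the origin under /cdn/, and return a 404 only if the origin has nothing either.

```python
from typing import Callable, Dict, List, Optional, Tuple


class EdgeNode:
    """Toy CDN edge that pulls missing objects from an origin."""

    def __init__(self, origin_fetch: Callable[[str], Optional[bytes]]) -> None:
        self._origin_fetch = origin_fetch
        self._cache: Dict[str, bytes] = {}

    def get(self, path: str) -> Tuple[int, Optional[bytes]]:
        if path in self._cache:
            return 200, self._cache[path]  # served from the edge
        body = self._origin_fetch("/cdn/" + path)  # fall back to the WebRole
        if body is None:
            return 404, None
        self._cache[path] = body  # later requests never reach the origin
        return 200, body


origin_hits: List[str] = []


def web_role(url_path: str) -> Optional[bytes]:
    """Hypothetical origin: records each request and serves one known file."""
    origin_hits.append(url_path)
    return {"/cdn/kloud.jpg": b"jpeg-bytes"}.get(url_path)


edge = EdgeNode(web_role)
first = edge.get("kloud.jpg")    # pulled from the origin
second = edge.get("kloud.jpg")   # served from the edge cache
missing = edge.get("nope.jpg")   # not at the origin either
print(first[0], second[0], missing[0], origin_hits)
```

The model leaves out expiry, so a cached object lives forever here; a real edge node drops content once it expires.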

So what if we combine those two ideas and use an Azure Web Role to reverse proxy the requests from the Azure CDN to a content server? To do this we’ll need to install and configure ARR and URLRewrite on a WebRole. This can be done and has been documented in a separate blog post, so it won’t be repeated here.

  • Open Visual Studio and create a new Windows Azure Project called “ContentProxy” and add a WCF WebRole to it (that’s the simplest role to remove all the code from).
  • Take the web.config file from the CDN folder we created locally and copy it into a CDN folder in the Azure project.
  • Add the install files as described in the blog and remove everything else you can.

Now we have a content proxy deployed to Azure listening on the CDN folder for requests. Now there’s some final configuration required to publish a CDN end point and wire it up to our WebRole.

Azure CDN Endpoint

  • Browse to and log in to the Azure Management Portal.
  • Click on Hosted Services, Storage Accounts and CDN.
  • Click on the CDN menu item and then New Endpoint for your subscription.

  • Create a new CDN end point and choose the hosted service running the content proxy and provide the CDN source URL for retrieving content.

  • This will create a new publicly available endpoint, like the one below, which can be used to get your dynamic content off the CDN:
    http://az190064.vo.msecnd.net/2011/12/121211_0222_lync2010mob1.png?w=50

Now by changing the definition or the URL Rewrite rules you can cache and deliver via CDN any content from anywhere on the Internet including your own corporate web serving infrastructure. The effect will be that a piece of content will be requested from the host server only once when requested by a browser and then every subsequent request will come off the edge nodes of the Azure CDN until it expires from the CDN.

….with no code!

Cloud Revolution – The Blind Entrepreneur

[Following from Cloud Revolution]

Profitable since the first quarter, 375 million page views per month, $4 million in annual revenue, 75 employees and $30 million in venture funding. Certainly sounds like a successful business, and it is.

I Can Has Cheezburger has leveraged the initial success of that hit web site by launching hundreds of other similar but slightly different sites based on the original. Many of these sites fail to attract traffic and are shut down within weeks, but every now and then one works, like “Fail Blog” or “Totally Looks Like”. With many copycat sites launching daily, chasing the latest “meme” and clipping the ticket on advertising for every page view, there’s plenty of motivation to continue.

Traditional innovation business models involve coming up with a great idea, getting seed capital and driving the business for a year, only to watch it fail and leave the inventor bankrupt for his troubles. CEO Ben Huh is one of a new breed of trial-and-error innovators powered by the “scale fast” and “fail fast” business model enabled by the Cloud. By constantly trying out new variants of sites and seeing what works, and more importantly what doesn’t, he is tightening up the innovation cycle and lowering its cost.

Could a cringe-worthy blog on cat pictures really change the world? Quite possibly, yes. Mr Huh is disappointed with the way newspapers have been presenting news online and is using all the lessons learnt in publishing cat pictures, along with his new $30M of capital, to reinvent the world of online journalism. He will no doubt try many new ways to present news and most will fail, but eventually something is going to stick and we could then be looking at the next Rupert Murdoch.

We know from Richard Dawkins’ “The Blind Watchmaker” that many small incremental changes combined with a process of selection will eventually turn an amoeba into a human. When applied to the Internet and powered by the pay-as-you-go model of the Cloud, the rate of change can be dialled up to 11 and the consequence of failure is near zero. The result will be revolutionary and highly effective ways of presenting information over the Internet for the benefit of humanity, and especially Mr Huh!

Cloud Revolution – The Late Night Scientist

[Following from Cloud Revolution]

It’s late at night in January and an email has just arrived that has someone very excited. A medical student has won an auction, but this is no eBay auction; it’s an Amazon EC2 instance auction. She has been preparing for this moment for over a year, and it will make or break her post-doctoral research paper.

She is continuing the valuable work done by her predecessors in the field of Parkinson’s research and its relationship to dopamine levels in the brain. Previous studies were painstakingly conducted by carefully tracking, monitoring and documenting the lives of hundreds of thousands of patients presenting with dopamine-affecting afflictions, such as methamphetamine addiction, over the course of 10-15 years.

Our scientist is hoping to show a relationship to other causes of dopamine depletion, and to do it she has been scouring the huge catalogue of free and paid-for databases available in the Cloud from the likes of the WHO and Science Direct. Bringing together datasets as wide-ranging as meteorology, geography, drug addiction, depression and death rates, she has amassed 20TB of data that needs to be processed to find correlations. Trial runs on small subsets have shown that it will take around 30,000 hours for one CPU to get through the full set. She can’t afford to wait the 3 years it would take to run on her own machine and, limited by a $5,000 grant, she can’t buy the hardware needed to do it herself. Instead she puts in a bid for 5000 Amazon EC2 Spot instances at 4c/hour.

Humming in the darkness under a football-field-sized roof in Ashburn, Virginia are thousands of computers running Amazon.com, Instagram, Reddit, Quora and Foursquare. But now it is late in the evening: sleeping shoppers and users, the end of the post-Christmas sales and the GFC have all conspired to bring Internet traffic to a record low. One by one computers are freed up into a pool until there are 5000 available and the deal is struck. 6 hours and $1200 later she has her answer.
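
The numbers in this story can be checked with a few lines of Python. This is back-of-the-envelope arithmetic only, using the figures quoted above; it assumes the workload splits perfectly across instances and that the bid price is what is actually paid for every instance-hour.

```python
CPU_HOURS = 30_000             # single-CPU estimate from the trial runs
INSTANCES = 5_000              # Spot instances bid for
PRICE_CENTS_PER_HOUR = 4       # bid price per instance-hour
GRANT_DOLLARS = 5_000
HOURS_PER_YEAR = 24 * 365

# Assumes perfect parallelism: every instance does an equal share of the work.
wall_clock_hours = CPU_HOURS / INSTANCES
# Cost depends only on total instance-hours, kept in whole cents to avoid rounding.
total_cost_dollars = CPU_HOURS * PRICE_CENTS_PER_HOUR / 100
single_machine_years = CPU_HOURS / HOURS_PER_YEAR

print(f"wall clock: {wall_clock_hours:.0f} hours")
print(f"cost: ${total_cost_dollars:.0f} of a ${GRANT_DOLLARS} grant")
print(f"on one CPU: {single_machine_years:.1f} years")
```

The cost is the same however many instances share the work; what the auction buys is time, turning more than three years on one CPU into an evening.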

This is an example of the new paradigm of data-intensive scientific discovery and it’s happening right now. It effectively time-shifts 16 years of research into 6 hours of processing by utilising data and computing power that already exist. While many organizations are battling with the concept of how to secure their data in the cloud, others have seen the opportunity and make their data available freely or as a chargeable service.

There are many problems that don’t need to be solved now, or even today. In fact some problems have remained unsolved for years but may now be tackled using enormous amounts of otherwise idle computing capacity at prices previously unheard of to scientists. This new tool of human evolution will be used to map the neurons in the brain, solve the riddle of Parkinson’s disease, chip away at the list of cancers and discover unexpected relationships and correlations across massive datasets of medical information.

And it’s all available to anyone with a hunch or a hypothesis they want to test.

Cloud Revolution – The New Cottage Industry

[Following from Cloud Revolution]

In a small office in the back of a house in Melbourne is a business taking on the big players in their own backyard. It runs a photo library and acts as an intermediary between a carefully chosen group of photographers and a select genre of magazines and publications.

A new photo shoot has just been uploaded from Hungary to the hosting site and a notification email has been sent. The shoot is first quality controlled, categorised and submitted for key wording using one of the new “mechanical turk” cloud services that provide an electronic portal into large, on demand, pools of human labour.

The business of photo syndication is perfectly suited to an Internet only shop front where communication, marketing and product delivery are all electronic. Where reputation is based on service level and quality of product, success can be achieved by targeting and excelling at a very narrow vertical market.

The monthly outgoings are less than $AU150 including hosting, CRM, campaign management, marketing, invoicing, accounting and laptop. With such low overheads it doesn’t take many sales to turn a profit. All you need is an eye for good photographers, a supply of good photo shoots and a contact list of regular customers with a need for your niche.

This is a window into the frontline of the new Software as a Service, “Pay As You Go” enabled cottage industry. It appears to all intents and purposes to be a large professional operation, leveraging the very same tools, but provisioned at a smaller scale.

Unhindered by technological barriers to entry, cottage industries are finding a new resurgence on a level playing field with their much larger competitors. These new kids on the industrial block are unconcerned about size and don’t need internal economies of scale to be profitable.

Cloud Revolution

There has been much focus recently on cloud computing and its adoption (or not) by the enterprise. Although most cloud vendors are understandably keeping their cards close to the chest, by all reports the enterprise uptake has been slow.

The problems, as I see them thus far, have been:

  1. Enterprises are notoriously bad at costing internal IT infrastructure. For 20 years the IT department has sat comfortably in the corner of the office, occasionally ducking in and out of that off-limits humming room with no windows, speaking a secret coded language of CAT-5 cables and firewalls. It has long enjoyed its place as an unavoidable cost of doing business, a supplier of basic employee needs like the water cooler and instant coffee, thereby flying under the radar of accountability and business cost alignment.
  2. When a cloud computing alternative is investigated, who is called upon to do the cost comparison? The IT Infrastructure department! With a vested interest in its own survival, it is unlikely to cost in the financial benefits realised by its own demise. The cost benefits of cloud alternatives to an enterprise are often understated and don’t include the material benefits of reduced personnel, electricity, bandwidth and expensive floor space.
  3. Moving to the cloud takes effort, human effort, and that’s expensive both in raw hourly rates and opportunity cost that could be spent building new features into your products to keep ahead of your existing competition. The human effort required to move workloads to the cloud costs more than the immediate savings.
  4. The Cloud has not been around long enough to outlive the incumbent hardware cycle. Hardware is purchased and depreciated on a 3-4 year capital expenditure and depreciation cycle. The sunk costs in existing hardware deployments need to be ridden out before new alternatives are considered.
  5. Large enterprises already enjoy reasonable economies of scale in existing computing and virtualisation infrastructure. While they certainly can’t match the pricing of a cloud vendor, the price they currently pay is not different enough to motivate immediate corrective action.
  6. The perceived security of having data physically located within the four walls of the organization is a difficult mindset to overcome. “I like to be able to touch my data” is an emotional reaction akin to keeping money under the mattress that doesn’t stand up to technical debate but is deeply rooted in our human nature and our attitude to apparent physical security.

All of these problems are temporary and will inevitably make their way through the process of organisational change. The movement of existing on-premise computing workloads to cheaper processing alternatives is economically inevitable. So with cautious uptake of cloud computing by enterprise, is all this cloud talk really just hype? Is there really a cloud revolution or is it just the Internet in new clothes? Halving an IT budget and moving unpredictable capital expenditure to predictable operating expenditure is certainly great for business, but it’s not that great a leap forward in human evolution.

Cloud is an enabling technology, a familiar tool wielded by a new set of hands for which traditional computing was financially out of reach. It’s here that utility computing is being leveraged to bring new opportunities and leaps forward in innovation. Unencumbered by legacy infrastructure or attitudes the new cloud-enabled industry is stealing away time and market share on the discounted cloud infrastructure deployed waiting for enterprise to arrive.

There is certainly a “Cloud Revolution” underway, and like all great revolutions it is starting from the bottom up, which I will cover by looking a little closer at the following three case studies:

  • The New Cottage Industry
  • The Late Night Scientist
  • The Blind Entrepreneur

Cloud Power

While there are many definitions of cloud computing published on the Internet most fall short on perspective. Cloud computing is touted as everything from just a rebadge of existing technology to a whole new paradigm shift in information technology. To determine the truth, it’s worthwhile taking a step back to get some perspective on computing technology and where we are in the information technology maturity cycle.

History Repeating

We’ve been here before albeit with other technologies. If you owned a factory before the industrial revolution you had a couple of choices, either you build next to a river and tap power from a waterwheel or you use grain fed horsepower. That proved to be quite restrictive and expensive for most factories which often continued to use human power as a more convenient alternative.

Then the industrial revolution brought a leap of technology with the new on-premise steam engine to generate the power for your factory. This freed factories to be able to build closer to the population centres required to run the factory and closer to raw materials used in production. This represents a “mountain to Mohammed” economic leap forward, bringing the energy source to the factory rather than the factory to the natural energy source.

But eventually even here economics sniffs out inefficiencies. Every factory had a large upfront capital expenditure on a power plant, large areas of floor space were dedicated to power generation and expensive specialist engineers were employed on the payroll to support the power generation unit (sound familiar?).

Companies such as General Electric and peers recognised an opportunity to centralise power generation and transport it in the higher value form of electricity to where it was required. The new electric utility “featured reliable central generation, efficient distribution, a successful end use (in 1882, the light bulb), and a competitive price”. By outsourcing power generation, factories could use that free floor space, capital and labour for their core business while paying for only as much power as they use. By centralising generation, power companies were able to realise economies of scale by using larger, more efficient power generation to drive down prices.

Cloud Computing

So what is my definition of Cloud Computing? Simply: “The utility model applied to compute”. The way I see it, this new utility will earn its place on the Monopoly board of the future alongside the other utilities we now assume to be provided off-premise by infrastructure and utility companies.

So cloud computing isn’t an entirely new technology; rather it is existing technology repackaged and sold in small pay as you go units to many customers. Customers are free from the burden of sunk capital expenditure and providers are free to on sell the service to many customers in a way that will encourage better capital utilisation, such as auctioned or off-peak usage charges.

Betting the house

So if Cloud Computing is the new Electricity then the days of the little on-premise steam engine (data centre) should be numbered. This is a game changer for companies providing small servers and the software that runs on them. There’s a very strong and fast move toward small, personal, always-connected devices connected to large centralised computing infrastructure. Little wonder then that one of the leaders in information technology for the last few decades, Microsoft, is “betting the company” on two technologies, mobile phones and cloud computing. If history is any predictor then there should be a veritable gold rush of companies trying to fill the new cloud space. In fact Microsoft has dropped over $US1 billion building cloud infrastructure just trying to catch up with the incumbent providers Amazon and Google. They are adding 10,000 servers, enough to run all of Facebook, every month.

There are also many other vendors rushing for the new goldmine of outsourced compute, some brand new companies, others rebadging or shifting up the value chain from hosting or down the value chain from software. Again the history of trains, gas and electricity tells us one thing for sure: there are already many more players than there need to be, and there will be a period of consolidation after the boom. The trick is to pick a horse that will last the distance. The difference here is that if you pick the wrong one you’ll be left with a bigger problem than a train ticket you can’t use.

New Possibilities

With new technology come new possibilities. Cloud computing arrives at a time when personal hand-held devices are just becoming sensory and always connected. The huge explosion of information possible from these devices is really only consumable by the largest of data centres. Such processing power was previously only available to the largest multinational companies, with huge capital expenditure programs but sadly little appetite or scope for innovation. Now, however, that same processing power is available to any kid with a good idea and a credit card. Larry and Sergey had to scratch up their life savings to purchase 40GB of hard disk to index the Internet; today that barrier to entry is gone.

Penguins and Polar Bears

So what of the large corporates, how is Cloud Computing affecting them? At a recent Gartner conference a good analogy was made. Cloud computing is like an iceberg, snapped off from the mainland and floating in the ocean. On it are penguins and polar bears all looking at each other waiting for someone else to jump.

While there are economic benefits to cloud computing, the human aversion to risk and change especially in large corporate companies will see many polar bears afloat paralysed by fear. Meanwhile the smaller more agile penguins are more likely to make the leaps necessary to capitalise on new technology. Corporates will inevitably take on cloud computing via a number of different routes. There will be the proactive companies who plan and fund innovative cloud development and deployment projects. There will be the reactive companies, who are forced into deploying new cloud services to compete with more competitive players taking big profits and market share. And there will be the reluctant companies who find themselves with cloud deployed services coming through the back door as they assimilate those smaller more agile competitors.

Which Cloud adoption strategy are you?
