Calling WCF client proxies in Azure Functions

Azure Functions allows developers to write discrete units of work and run them without having to deal with hosting or application infrastructure concerns. Azure Functions is Microsoft’s answer to serverless computing on the Azure platform and, together with Azure Service Bus, Azure Logic Apps and Azure API Management (to name just a few), has become an essential part of the Azure iPaaS offering.

The problem

Integration solutions often require connecting legacy systems using ageing protocols such as SOAP and WS-*. It’s not all REST, hypermedia and OData out there in the enterprise integration world. Development frameworks like WCF help us deliver solutions rapidly by abstracting much of the boilerplate code away from us. Often these frameworks rely on custom configuration sections that are not available when developing solutions in Azure Functions. In Azure Functions (as of today at least) we only have access to the generic appSettings and connectionStrings sections of the configuration.

How do we bridge the gap and use the old boilerplate code we are familiar with in the new world of serverless integration?

So let’s set the scene. Your organisation consumes a number of legacy B2B services exposed as SOAP web services. You want to be able to consume these services from an Azure Function but definitely do not want to be writing any low-level SOAP protocol code. You want to use the generated WCF client proxy so that you implement the correct message contracts, transport and security protocols.

In this post we will show you how to use a generated WCF client proxy from an Azure Function.

Start by generating the WCF client proxy in a class library project using Add Service Reference, provide details of the WSDL and build the project.

add_service_reference

Examine the generated bindings to determine the binding we need and what policies to configure in code within our Azure Function.

bindings

In our sample service above we need to create a basic HTTP binding and configure basic authentication.

Create an Azure Function App using an appropriate template for your requirements and follow these steps to call your WCF client proxy:

Add the System.ServiceModel NuGet package to the function via the project.json file so we can create and configure the WCF bindings in our function
project_json
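For reference, a minimal project.json takes roughly the shape below. The package id and version shown are illustrative only (the post simply refers to the System.ServiceModel package); use whichever System.ServiceModel package and version suit your function runtime.

{
  "frameworks": {
    "net46": {
      "dependencies": {
        "System.ServiceModel.Primitives": "4.4.0"
      }
    }
  }
}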

Add the WCF client proxy assembly to the ./bin folder of our function. Use Kudu to create the folder and then upload your assembly using the View Files panel.

upload_wcf_client_assembly

In your function, add references to both the System.ServiceModel assembly and your WCF client proxy assembly using the #r directive

When creating an instance of the WCF client proxy, instead of specifying the endpoint and binding in a config file, create these in code and pass to the constructor of the client proxy.

Your function will look something like this:
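The snippet below is a minimal sketch of such a function rather than the code from this post. It assumes an HTTP-triggered C# script (run.csx), a hypothetical generated proxy class BusinessService.B2BServiceClient with a GetOrderStatusAsync operation, and app settings named B2BService_EndpointUrl, B2BService_Username and B2BService_Password.

#r "System.ServiceModel"
#r "BusinessService.Proxy.dll"   // the WCF client proxy assembly uploaded to ./bin

using System;
using System.Net;
using System.Net.Http;
using System.ServiceModel;
using System.Threading.Tasks;

public static async Task<HttpResponseMessage> Run(HttpRequestMessage req, TraceWriter log)
{
    // Recreate in code what the .config file would normally provide: binding and endpoint
    var binding = new BasicHttpBinding(BasicHttpSecurityMode.Transport);
    binding.Security.Transport.ClientCredentialType = HttpClientCredentialType.Basic;

    var endpoint = new EndpointAddress(Environment.GetEnvironmentVariable("B2BService_EndpointUrl"));

    // Pass the binding and endpoint to the generated proxy's constructor
    var client = new BusinessService.B2BServiceClient(binding, endpoint);
    client.ClientCredentials.UserName.UserName = Environment.GetEnvironmentVariable("B2BService_Username");
    client.ClientCredentials.UserName.Password = Environment.GetEnvironmentVariable("B2BService_Password");

    log.Info("Calling the B2B service via the WCF client proxy");

    // Call the service through the generated proxy; no hand-rolled SOAP required
    var result = await client.GetOrderStatusAsync("12345");
    return req.CreateResponse(HttpStatusCode.OK, result);
}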

Lastly, add endpoint address and client credentials to appSettings of your Azure Function App.

Test the function using the built-in test harness to check that the function executes successfully.

test_func

 

Conclusion

The suite of integration services available on the Azure Platform is developing rapidly, and composing your future integration platform on Azure is a compelling option in a maturing iPaaS marketplace.

In this post we have seen how we can continue to deliver legacy integration solutions using emerging integration-platform-as-a-service offerings.

Azure API Management Step by Step – Use Cases

Use Cases

In this second post about Azure API Management, let’s discuss use cases. Why “use cases”?

Use cases help to manage complexity, since each focuses on one specific usage aspect at a time. I am grouping and versioning the use cases to facilitate your learning and to help keep track of future changes. You are welcome to use these diagrams to demonstrate Azure API Management features.

API on-boarding is a key aspect of API governance and the first thing to discuss. How can I publish my existing and future API back-ends to API Management?

API description formats like the Swagger Specification (aka the Open API Initiative, https://openapis.org/) are fundamental to properly implementing automation and devops in your APIM initiative. APIs can be imported using Swagger, created manually, or created as part of a custom automation/integration process.
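As a hedged example of the automation route, the AzureRM API Management cmdlets of the time could import a Swagger definition along these lines; the resource group, service name, API id and Swagger URL below are placeholders, and parameter names should be verified against your installed module version.

$ctx = New-AzureRmApiManagementContext -ResourceGroupName "my-apim-rg" -ServiceName "my-apim-instance"

Import-AzureRmApiManagementApi -Context $ctx `
    -ApiId "orders-api" `
    -SpecificationFormat "Swagger" `
    -SpecificationUrl "https://example.com/swagger/v1/swagger.json" `
    -Path "orders"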

Azure API Management administrators can group APIs by product, allowing a subscription workflow. Product visibility is linked to user groups, providing restricted access to APIs. You can manage your API policies as code through a dedicated Git source control repository available to your APIM instance. Secrets and constants used by policies are managed by a key/value (string) service called properties.

apim-use-cases-adm-api-onboarding

The Azure API Management platform provides a rich developer portal. Developers can create an account/profile, discover APIs and subscribe to products. The main features provided are API documentation, source code samples in multiple languages, a console to try APIs, API subscription key management and analytics.

apim-use-cases-developer

The management and operation of the platform plays an important role in daily tasks. For enterprises, user groups and users (developers) can be fully integrated with Active Directory. Analytics dashboards and reports are available. Email notifications and templates are customisable. The APIM REST API and PowerShell commands cover most platform features, including exporting analytics reports.

apim-use-cases-administrator

The security administration use cases group different configurations. Delegation allows custom development of portal sign-in, sign-up and product subscription. OAuth 2.0 and OpenID provider registrations are used by the developer portal console to generate the required tokens when trying APIs. Client certificate upload and management is done here or through automation. The developer portal identity configuration brings out-of-the-box integration with social providers. Git source control settings/management and APIM REST API tokens are available as well.

apim-use-cases-adm-security

Administrators can customize the developer portal using built-in content management system functionality. Custom pages and modern JavaScript development are now allowed. The blogs feature allows out-of-the-box blog/post publish/unpublish functionality. Applications submitted by developers can be published/unpublished by an administrator to be displayed in the developer portal.

apim-use-cases-adm-developer-poral

In summary, Azure API Management is a mature, live platform with a few new features under development, bringing strong integration with the Azure cloud. Click here for the roadmap.

In my next post, I will deep-dive into API on-boarding strategies.

Thanks for reading @jorgearteiro

Posts: 1) Introduction  2) Use Cases

Azure API Management Step by Step

Introduction

As a speaker and cloud consultant, I have learned and received a lot of feedback about Azure API management platform from customers and community members. I will share some of my learnings in this series of blog posts. Let’s get started!

apim-image

APIs – Application programming interfaces are everywhere! They are already part of many companies’ strategies. But how could we consolidate internal and external APIs? How could you productize and monetize them for your company?

We often build APIs to be consumed by a unique application. However, we could also build these APIs to be shared. If you write HTTP APIs around a single and specific business requirement, you can encourage API re-usability and adoption. Bleeding edge technologies like containers and serverless architecture are pushing this approach even further.

API strategy and governance come into play to help build a gateway on top of your APIs. Companies are developing MVPs (minimum viable products) and time to market is fundamental. For example, we do not have time to write authentication, caching and analytics over and over again. Azure API Management can help you make this happen.

apim-consolidation

apim-consolidation-operations

This demo API Management instance that I created for Kloud Solutions illustrates how you could create a unified API endpoint to expose your APIs. Multiple “Services” are published there with a single authentication layer. If your Email Service back-end implementation uses an external API, like SendGrid, you can inject this authentication at the API Management gateway layer, making it transparent for end users.

Azure API Management provides a highly scalable, multi-regional gateway that can be deployed in any Azure region around the world. It is a fully PaaS (platform-as-a-service) API management solution, where you do not have to manage any infrastructure. This, combined with other Azure offerings like App Services (Web Apps, API Apps, Logic Apps and Functions), provides an enterprise-grade platform to deliver any API strategy.

apim-diagram

Looking at the diagram above, we can decompose API Management into three main components:

  • Developer Portal – Customizable web site exclusive to your company to allow internal and external developers to engage, discover and consume APIs.
  • Gateway (proxy) – The engine of APIM, where policies can be applied to your inbound, back-end and outbound traffic. It’s very scalable and allows multi-regional deployment, Azure Virtual Network VPN, Azure Active Directory integration and a native caching solution. Policies are written in XML and C# expressions to define complex rules like rate limits, quotas, caching, JWT token validation, authentication, XML-to-JSON and JSON-to-XML transformations, URL rewriting, CORS, IP restrictions, setting headers, etc. (a sample policy is sketched after this list).
  • Administration Portal (aka Publisher Portal) – Administration of your APIM instance can be done via the portal. Automation and devops teams can use the APIM management REST API and/or PowerShell commands to fully integrate APIM into your onboarding, build and release processes.
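To make the policy idea concrete, here is an illustrative inbound policy fragment of the kind described above. The rate-limit values and the {{tenant-id}} named value (property) are placeholders for this sketch, not settings taken from the demo instance:

<policies>
  <inbound>
    <base />
    <!-- Throttle callers to 100 calls per 60 seconds -->
    <rate-limit calls="100" renewal-period="60" />
    <!-- Validate the caller's JWT; {{tenant-id}} is resolved from a named value (property) -->
    <validate-jwt header-name="Authorization" failed-validation-httpcode="401">
      <openid-config url="https://login.microsoftonline.com/{{tenant-id}}/.well-known/openid-configuration" />
    </validate-jwt>
  </inbound>
  <backend>
    <base />
  </backend>
  <outbound>
    <base />
  </outbound>
</policies>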

Please keep in mind that this strategy can apply to any environment and architecture where HTTP APIs are exposed, whether they are new microservices or older legacy applications.

Feel free to create a user at https://kloud.portal.azure-api.net. I will try to keep this Azure API Management instance usable for demo purposes only, no guarantees. Then you can create your own development instance from the Azure Portal later.

In my next post, I will talk about API Management use cases and give you a broader view of how deep this platform can go. Click here.

Thanks for reading! @jorgearteiro

Posts: 1) Introduction  2) Use Cases

Connecting Azure Mobile Apps and Web Apps to existing on-premises infrastructure using Azure Hybrid Connections

This article describes a very easy and secure way of connecting your Azure Mobile Apps and Azure Web Apps to your existing on-premises infrastructure. The option is called Hybrid Connections and is a feature of Azure BizTalk Services. The feature does not require any VPN configuration or opening inbound ports on the firewall. After completing a few simple configuration steps in the Azure Portal and installing the Hybrid Connection Manager (HCM) program on your on-premises servers, you are ready to consume your on-premises services from your cloud app right away.

Hybrid Connection

Hybrid Connections require you to install a relay agent called the Hybrid Connection Manager (HCM) in your network. The HCM needs to be able to connect to Azure and also to your application. This solution is especially useful with a remote network such as your on-premises network, or even another cloud-hosted network, because it does not require an internet-accessible endpoint. The HCM only runs on Windows, and you can have up to 5 instances running to provide high availability. Hybrid Connections only support TCP though, and each HC endpoint has to map to a specific host:port combination.

There are a few use-cases when you would want to setup a Hybrid Connection:

  • The scenario where a simple, quick and secure integration is required with some of your existing on-premises services
  • When you move some of your old Web Applications to the cloud, while keeping other existing back-end services on-premises without the need to open Inbound connections on the Firewall
  • Hybrid Connections currently depend on a BizTalk account, which has pricing levels that start free and then get progressively more expensive based on the amount you need. For small data loads, the free option can be very attractive
  • When it comes to working across many networks, there is no other feature like Hybrid Connections which can enable you to access resources in well over 100 separate networks
  • If you are a developer and simply want to run a site in Azure and have it access the database on the workstation under your desk, then the easiest thing to use is Hybrid Connections.

 Benefits of Hybrid Connections:

  • Quick easy setup
  • No need to setup a Virtual Network, VPN, etc.
  • Does not require any open Inbound ports
  • Can be shared between multiple Web and Mobile Apps
  • Good for exposing single on-premises resources to consumers in Azure
  • Supports multiple HCMs in a single network, distributing the load using round-robin

Disadvantages of Hybrid Connections:

  • Performance limitations
  • Supports only fixed TCP ports
  • Supports only TCP
  • In Preview (no SLA) and not available in all Azure Regions

Security

Hybrid Connections use Shared Access Signature (SAS) authorization to secure the connections from the Azure applications and the on-premises Hybrid Connection Manager to the Hybrid Connection. Separate connection keys are created for the application and the on-premises Hybrid Connection Manager. These connection keys can be rolled over and revoked independently.

TCP ports

Hybrid Connections require only outbound TCP or HTTP connectivity from your private network. You do not need to open any firewall ports or change your network perimeter configuration to allow any inbound connectivity into your network.

The following TCP ports are used by Hybrid Connections:

  • 9350-9354: Used for data transmission. The Service Bus relay manager probes port 9350 to determine whether TCP connectivity is available; if it is, it assumes that port 9352 is also available. Data traffic goes over port 9352. Allow outbound connections to these ports.
  • 5671: When port 9352 is used for data traffic, port 5671 is used as the control channel. Allow outbound connections to this port.
  • 80, 443: Used for some data requests to Azure. If ports 9352 and 5671 are not usable, ports 80 and 443 are the fallback ports used for data transmission and the control channel. Allow outbound connections to these ports. Note: it is not recommended to rely on these as the fallback ports in place of the other TCP ports, because HTTP/WebSocket is then used as the protocol instead of native TCP for the data channels, which can result in lower performance.

If you would like to read more about Hybrid Connections, please read this article

Example: Accessing on-premise SQL Database from the Web App

I’ve captured the process of setting up a Hybrid Connection using an example where a test Web Application accesses an on-premises SQL database. Below is the diagram that illustrates the solution.

 

Hybrid Connection SQL

To start with Hybrid Connections you need an App Service running in your Azure subscription. Create a sample test application, or publish one of your old apps that uses a SQL database and has a connection string set up in Web.config.

Go to Azure Portal and find your App Service there.

Pic 3

Then click the “Networking” tab in the menu on the right.

On the next screen click “Configure your hybrid connection endpoints.”

Pic 4

Click “Add” to add a new connection. Provide any name for the connection and enter a hostname. The name of this host should be the same as it is defined in your local on-premises network. This is the name that HCM will try to resolve while processing requests inside your company network.

Pic 5

The next step is to create a new BizTalk Service account. You can choose any region and price tier. I would recommend starting with the Free tier as it is good enough for many applications.

Pic 6

Then click OK twice and Azure will start working on the creation of your new Hybrid Connection. Once it is done, you will see that the Hybrid Connection is there but is not connected yet.

Pic 7

At this point, we need to go to a server located inside your network. Log in to the portal there and navigate to Hybrid Connections. Click on the new connection, and then click on the “Listener Setup” icon. It will start the installation of the HCM.

Pic 8.png

You will see a series of prompts and progress windows. Please choose the automatic setup: “Install and configure now”. It guarantees that the shared key will be set up properly.

Pic 9Pic 10

Pic 11

Once the system finishes the setup process, within a few minutes you will see that the new connection is in a Connected state and has a green tick on the left. This indicates that the installation was successful and that HCM can contact the App Service in the cloud.

Pic 12

Now we need to make sure that the connection string in the Web.config file of your Web App is pointing to the correct host. The hostname must be set exactly as it is resolved inside your local network. In our example, it is “lablob01”. The SQL server must be installed on this host. In this scenario our connection string will look like this:
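The snippet below is a sketch of such a connection string; the server name “lablob01” matches the Hybrid Connection hostname above, while the database name and credentials are purely illustrative.

<connectionStrings>
  <add name="DefaultConnection"
       connectionString="Server=lablob01;Database=TestDb;User Id=webappuser;Password=your-password;"
       providerName="System.Data.SqlClient" />
</connectionStrings>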

As you can see, the hostname (or the “Server” parameter in the connection string) looks the same as if you had written it in an application that runs on the local network. If you have old web applications running on-premises, they need almost no change in this respect.

Caveat

While working on Hybrid Connections, and particularly on integrating with Microsoft SQL Server on-premises, we found a serious bug in the Hybrid Connection Manager. At the time of writing this post, the HCM version is 0.6.3 and the bug is still there. There is a quick workaround suggested in this post.

You can find yourself in a situation where all the steps above have been completed and all indicators are green, but your Web Application is still throwing error 500 with an exception that tells you something about an arithmetic overflow while accessing the SQL database.

Exception: System.Data.Entity.Core.EntityException: The underlying provider failed on Open. —> System.OverflowException: Arithmetic operation resulted in an overflow. at System.Data.SqlClient.TdsParser.ConsumePreLoginHandshake(SqlAuthenticationMethod authType, Boolean encrypt, Boolean trustServerCert, Boolean integratedSecurity, Boolean& marsCapable, Boolean& fedAuthRequired)

or (64 bit Web app)

System.OverflowException: Array dimensions exceeded supported range, at System.Data.SqlClient.TdsParser.ConsumePreLoginHandshake

Well, apparently all you need to do is write a single value to the registry on the server where you have installed HCM. The value is below:

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL]
“SendExtraRecord”=dword:00000002

Conclusion

Azure Hybrid Connections provide a very easy and secure way of accessing on-premises infrastructure. The feature allows moving existing applications to the cloud without breaking the internal ecosystem of back-end services already running in your company. While there are many limitations, the ease and convenience of a Hybrid Connection, along with its security, offer integrators a great opportunity to connect cloud applications with existing back-office tools without completely rebuilding infrastructure in the cloud.

 

Connecting to and Using the Azure MFA Web Service SDK Server SOAP API with Powershell

Background

A colleague and I are validating a number of scenarios for a customer who is looking to deploy Azure MFA Server. One of the requirements from an Identity Management perspective is the ability to interact with the MFA Server for user information. That led us on the exploration of what was possible and how best to approach it.

The title of this post has pretty much given it away as to how. But why? As Azure MFA Server is a product that Microsoft acquired through the acquisition of PhoneFactor, the usual methods of interacting with applications and services in the Microsoft stack don’t apply. There is practically no information on how to use PowerShell to interact with Azure MFA Server. So this blog post details what we’ve learned and how we have been able to get information out of Azure MFA Server using PowerShell.

This post covers;

  • Connecting to the Azure MFA Web Service SDK
  • Searching for users in the MFA Database
  • Returning information about users in the MFA Database
  • Making a test call to a user's phone via the MFA Server

Prerequisites

There are a number of prerequisites that I’m not covering here as you can quickly locate many guides to installing/configuring Azure MFA Server.

  • You’ll need to have an Azure MFA environment (obviously)
  • Download the Azure MFA Web Service SDK
  • Install and Configure the Azure MFA Web Service SDK
  • If you aren’t using a Public SSL Cert on the Azure MFA Web Service SDK Server you will need to export the certificate from the Azure MFA Web Service SDK Server and import it to the Trusted Root Certificate Store on the workstation you’ll be using Powershell on to connect to the MFA environment.

 

Connecting to the Azure MFA Web Service SDK

Now that you’ve met the prerequisites listed above you can use PowerShell to connect to the API. The URL is the DNS name of the Azure MFA Web Service SDK Server followed by the SDK SOAP endpoint, e.g. https://mfa.yourdomain.com.au/MultiFactorAuthWebServiceSdk/PfWsSdk.asmx?WSDL

Try out the URL in your browser and, using an account that exists in the MFA Server, authenticate to the Azure MFA Web Service SDK Server. If it is set up correctly (including your SSL certificate) you will see the following.

 

The simple script below will perform the same thing, but via PowerShell. Update it with your domain, username, password and the URL of your MFA Web Service SDK Server.
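Something along these lines should do it; this is a minimal sketch and the account, password and URL are placeholders for your own environment.

$username = "yourdomain\mfa-sdk-user"
$password = ConvertTo-SecureString "yourpassword" -AsPlainText -Force
$credential = New-Object System.Management.Automation.PSCredential ($username, $password)

# Generate a .NET web service proxy for the SDK SOAP endpoint at runtime
$wsdl = "https://mfa.yourdomain.com.au/MultiFactorAuthWebServiceSdk/PfWsSdk.asmx?WSDL"
$mfaSdk = New-WebServiceProxy -Uri $wsdl -Credential $credential

# List the SDK operations (FindUsers_4, GetPhone, GetUserSettings, GetUserDevices, etc.)
$mfaSdk | Get-Member -MemberType Method | Select-Object Name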

Searching for users in the MFA Database

Now that we’ve set up a web service proxy connection to the MFA Web Service SOAP API endpoint, we can start getting some information out of it. Searching for users uses the ‘FindUsers_4’ call. It has many parameters that can be set to alter the results. I’ve kept it simple here and used ‘*’ as the criteria to return all users in the MFA database. Alter it for your purposes.

Returning information about users in the MFA Database

Using slightly different criteria to the search criteria above, I returned one entry and set the $mfauser variable to it. I then use that in the GetPhone, GetUserSettings and GetUserDevices calls, as shown below, to retrieve all the information about them.

Making a test call to a user's phone via the MFA Server

Finally, rather than just consuming information from the MFA environment, let’s actually trigger something. Selecting an identity from our test environment that had the mobile phone number of a colleague associated with it, I triggered MFA Server to call them to authorise their session (which they obviously hadn’t requested). I may have done this a few times from the other side of the room, watching with amusement as their phone rang requesting authentication approval 🙂

Full script snippets below.

Hope that helps someone else.

Follow Darren on Twitter @darrenjrobinson

 

 

SharePoint Web Service Caching using WCF Custom Channel

The New SharePoint App Architecture

To be honest, I’ve had this code for a while now sitting in the blog cupboard. It was originally intended to be used as part of a global content management system delivering content from a single SharePoint authoring point to many distributed content delivery systems. Since then, events have conspired to bring it back out of the cupboard but for a slightly different purpose. As I said in my previous blog, two things have happened in the world of SharePoint:

  • SharePoint went “Online” which means it is hosted centrally in a data centre (most likely) far-far away.
  • SharePoint Online got picky about what you can run in the SharePoint application and “spat out” other application functionality to be hosted elsewhere.
    (This incidentally, not only makes security sense for Microsoft but also economic sense as you will most likely pay for your SharePoint license and some other hosting environment for your apps as well!)

So back to the problem; in short, what used to look like this for most on-premises SharePoint deployments:

Has become a more challenging architecture like this:

Where the SharePoint Online environment, the app hosting environment and the user may all be separated by significant distance. To be fair, Microsoft did see this coming. After a less-than-successful attempt at moving SharePoint 2007 online with “BPOS”, and then a slightly more successful attempt (using the “Sandbox”) with SharePoint 2010, there was a realisation that the only truly flexible solution to multi-tenant customisation support is to let you run outside SharePoint and call back in through services. That’s why so much work has gone into the SharePoint Web Services and in particular the expanded SharePoint 2013 REST interface.

This blog looks at a solution for bringing the set of SharePoint Web Services (.asmx) closer to the consumer and (if I get time) I’ll have a go at doing something similar with the REST interface.

SharePoint Web Services

The inspiration for building a SharePoint Service cache came from watching how one of the Microsoft provided caches works under the hood. SharePoint Workspace (or SkyDrive Pro) uses the SharePoint Web Services to maintain a remote copy of your SharePoint data. Sync a library with SkyDrive Pro and double click on a folder while Fiddler is running to see what happens.

  • _vti_bin/Webs.asmx (GetWeb)
    Hey SharePoint tell me about this library, and am I still allowed to sync it?
  • _vti_bin/UserGroup.asmx (GetCurrentUserInfo)
    Ah and this person trying to sync, do they still have access to this library?
  • _vti_bin/Lists.asmx (GetListCollection)
    Ok tell me about the Lists and Views you have
  • _vti_bin/Lists.asmx (GetListItemChangesWithKnowledge)
    Oh now that’s interesting

Knowledge is a term that turns up in the world of the Sync Framework and reveals the heritage of these services. Incidentally, knowledge looks something like this:

<syncScope>{9FA5B692-0736-4722-9C2E-880F3CDDDC2C}</syncScope>
<knowledge>
  <sync:syncKnowledge xmlns="http://schemas.microsoft.com/2008/03/sync/" xmlns:sync="http://schemas.microsoft.com/2008/03/sync/">
    <idFormatGroup>
      <replicaIdFormat sync:isVariable="0" sync:maxLength="16"/>
      <itemIdFormat sync:isVariable="0" sync:maxLength="16"/>
      <changeUnitIdFormat sync:isVariable="0" sync:maxLength="1"/>
    </idFormatGroup>
    <replicaKeyMap>
      <replicaKeyMapEntry sync:replicaId="XkTpcTZbtEKiPUeWmKjV+Q==" sync:replicaKey="0"/>
      <replicaKeyMapEntry sync:replicaId="N72lfr+UT/qOkqOR0vmh1A==" sync:replicaKey="1"/>
    </replicaKeyMap>
    <clockVector>
      <clockVectorElement sync:replicaKey="0" sync:tickCount="13009622840318"/>
      <clockVectorElement sync:replicaKey="1" sync:tickCount="17943711"/>
    </clockVector>
  </sync:syncKnowledge>
</knowledge>

We don’t have to fully understand it, but it’s enough to say that the client is keeping a token of the last state and making a cheap call to SharePoint to determine if there are changes before showing the user a SkyDrive Pro folder with all green ticks. SharePoint, for its part, is maintaining a change list of all changes to the library which can be checked and walked down by clients and applied as a diff to the client copy of the library to keep it up to date. Wouldn’t it be good if we could implement something similar for SharePoint Web services? Intercept the normal web service call, check if the SharePoint data has changed since last time and respond locally with a cached response if there were no changes.

Well, we can.

WCF Client Cache

The solution works by implementing a “Cached Query and Results” (again see my previous blog) using a WCF Custom Channel to do the caching inline without the client being aware of it. By implementing such a channel, the client can “configure in” the cache generically without affecting the native web service API or the clients. This generic solution is covered well in this article by Paolo Salvatori and this source code.

I won’t revisit the implementation details, but we will be using this component, updating it and adding some features that enable better caching outcomes for systems that expose change list tokens like SharePoint does.

At this point, if you want to understand the internals of the cache implementation read on, or skip to the SharePoint example configuration below.

Caching Restrictions

At the heart of the WCF Client Cache is the ability to plug in a number of different cache providers. By far the most powerful and scalable is to use AppFabric Cache. The use of AppFabric gives us some powerful building blocks and is now available either on-premises as a service, as a service in Azure, or as a component running in an existing WebRole or WorkerRole in Azure. The beauty of a distributed cache like AppFabric is:

  • Cache clients can subscribe to in memory copies of the cache delivering high performance for highly used cache entries
  • Cache clients can be kept consistent, or nearly consistent across a farm.

We won’t be using any cache features that are not available in the Azure AppFabric Cache service (such as notifications and regions). While that might seem restrictive, it actually simplifies the solution space and keeps the solution cloud deployable. But the impact of that choice is that cache invalidation becomes more difficult. Ideally we’d like to be able to knock out invalid cache entries when the change token changes, but without Cache Regions, enumerating cache entries is not supported. For that reason, and for simplicity, we won’t strictly use cache invalidation, but rather “cache deprecation”.

The idea here is to use the fact that a cache is not a database; we don’t need to clean up and delete old entries to save space. All good caches are self-cleaning and will typically use an LRU algorithm to get rid of old items that are no longer needed. Cache deprecation means we don’t bother deleting old cache entries, they can remain in the cache as long as they are used, however we will move on from old to new entries by carefully choosing a cache key that represents change.

This has a couple of benefits

  • No need to implement a cache invalidation event where we need to find and eliminate old entries from the cache.
  • Old cache entries can continue to be used if needed which means no cache invalidation storm where the server is hammered for new data.

Improvement 1: Client Cache Key Generation

The WCF Client Cache does a good job of allowing clients to choose a cache key generation option. But it is still restricted to looking at the incoming request and generating a key based on the parameters passed in. Generation of a good cache key is important to clients and may require more information than just the incoming request. The WCF Client Cache now supports a plug-in model which can be used to pass the responsibility of cache key generation back to the calling application, by implementing, registering and configuring a class that supports the IKeyGenerate interface.

That’s a pretty powerful plug in point, and in a system where we can cheaply determine if there have been changes on the source system, we can use that point to generate a cache key that represents changes on the server by making outgoing requests to the server to ask for the current “change” state.

But such a system isn’t much good if every client request results in a request to the Change Service and potentially another request to the Service itself. Remember the whole point of this is to remove chattiness and hide away the effect of latency, but we’ve just added another call across the network. OK, now this is where it gets mind bending. What if we also cache the request for the change token too?

What this enables us to do is cache the change tokens for some fixed period of time (n) minutes and cache the Service responses forever (they will eventually fall out or get pushed out of the cache). The responses received by the client will be at most n-minutes behind the changes on the server because when a change is detected on the server (moving from Token n to Token n+1) the WCF Client Cache will go looking for new cache entries.

Token(n):
  • RequestA → CacheKey: Token(n)RequestA → CacheData: ResponseA
  • RequestB → CacheKey: Token(n)RequestB → CacheData: ResponseB
  • RequestC → CacheKey: Token(n)RequestC → CacheData: ResponseC

Token(n+1):
  • RequestA → CacheKey: Token(n+1)RequestA → CacheData: ResponseA’
  • RequestB → CacheKey: Token(n+1)RequestB → CacheData: ResponseB’
  • RequestC → CacheKey: Token(n+1)RequestC → CacheData: ResponseC’

Of course the cost of this is that if there are a lot of changes on the server (ie a lot of different Tokens) then we’ll leave a trail of unused cache entries.

Improvement 2: MaxStale and Async Requests

MaxStale is a concept borrowed from the HTTP 1.1 request protocol.

“If max-stale is assigned a value, then the client is willing to accept a response that has exceeded its expiration time by no more than the specified number of seconds.”
When we say an item expires in 1 minute, do we really mean that at exactly T+1 that item is totally unusable? Normally there is some tolerance for the use of old items. MaxStale is the way to express that there is some tolerance for using old, expired items. This is subtly different to making the expiry longer because it gives us an opportunity to return stale data and concurrently freshen up the stale data by issuing an async request for data. Incidentally, using a large MaxStale value has the added benefit of being able to protect against service outages, as old data will be returned rather than failing the service call.

Improvement 3: Locking

Where we have heavily used keys (like the changetoken) or where the cost of sourcing data from the service is high, we have the possibility of multiple detections of the same cache entry expiration by cache clients each of which will issue a request to the service for data. For heavily used cache keys such as the change tokens, it’s worthwhile implementing a lock mechanism to prevent multiple service requests.

Example: SharePoint User Profile Cache

As an example we’ll create a “Client” SharePoint test application which is simply a web reference to the UserProfileService on SharePoint. It will run locally and request user profile information for 30 users (via the web service UserProfileService.asmx) from a remote “Server” SharePoint site hosted in Office 365 Data Centre in Asia. We will use the WCFClientCache to inject caching under the hood (remember the client code has no idea the caching has been injected as it is added using the .config file) for the calls to SharePoint and respond if possible from the cache.

So enough theory, let’s see it in action by load testing with 4 concurrent users for 4 minutes.

Baseline

First up a baseline no-caching load test.
          <cachingBehavior enabled="false"/>

AVG:0.42

The test shows a response time of between 0.37 and 0.65 seconds, averaging 0.42 seconds. This is pretty much what we would expect from a remote server half way around the world.

1 Minute Cache

Now let’s add in the cache of 1 minute.
<cachingBehavior enabled="true" timeout="00:01:00"
    header="true" keyLock="false" cacheType="MemoryCache"
    keyCreationMethod="MessageBody" regionName="Messages" maxBufferSize="524288"/>



AVG:0.13

The test shows a response time of between 0.012 and 0.51 seconds, averaging 0.13 seconds. From the diagram we can clearly see the short duration of the items coming from cache and, every minute, a “cache storm” where items expire from the cache and must be refreshed from the server, causing an increase in response time. Despite executing some 800 tests, only around 120 actually made it through to the server (i.e. 30 requests are required to fill the cache every minute of the 4-minute test; the rest of the requests come from cache).

MaxStale 30 seconds

Now let’s add some tolerance for stale items with MaxStale of 30 seconds, plenty of time to get a fresh one from the server.

<cachingBehavior enabled="true" timeout="00:01:00"
    maxStale="00:00:30" header="true" keyLock="false" cacheType="MemoryCache"
    keyCreationMethod="MessageBody" regionName="Messages" maxBufferSize="524288"/>

AVG:0.032

The test shows a response time of between 0.012 and 0.51 seconds, averaging 0.032 seconds! Now we are getting close. Despite having a 1-minute cache on items, there is no bump at the 1-minute points, since the cache storm has been mitigated by returning stale data while new fresh data is retrieved from the server for next time. The question is: can we do any better than that?

Change Tokens

What about changes on the server? The problem with the above solution is that we are lagging behind the changes on the server by 1 minute of cache and up to 30 seconds of staleness. Wouldn’t it be nice to get new data when it changes on the server? Here we need to introduce operation-level configuration and the plug-in key provider UserTokenPrefixed, which will check the SharePoint User Profile change token. As a reminder, we are configuring in the following between client and server.

          <cachingBehavior enabled="true" timeout="00:01:00" maxStale="00:00:30" header="true" keyLock="false" cacheType="MemoryCache" keyCreationMethod="MessageBody" regionName="Messages" maxBufferSize="524288">
            <operations>
              <!--UserProfile operations-->          
              <!--User change token is cached for a short period as we regularly go back and check so we can track server changes closely
              <operation action="http://microsoft.com/webservices/SharePointPortalServer/UserProfileChangeService/UserProfileChangeToken" 

        enabled="true" timeout="00:00:10" maxStale="00:00:00" keyLock="false" cacheType="MemoryCache" keyCreationMethod="MessageBody"/>
              <!--User Profile items are cached for a looong time with the change token on the cachekey-->
              <operation action="http://microsoft.com/webservices/SharePointPortalServer/UserProfileService/GetUserProfileByIndex" 

        enabled="true" timeout="00:01:00" maxStale="00:00:30" keyLock="false" cacheType="MemoryCache" keyCreationMethod="UserTokenPrefixed"/>

During this test I deliberately browsed to a user profile in SharePoint and updated a mobile phone number at around the 2-minute mark. Less than 10 seconds later the new change token was retrieved from the server; it was different to the previous one that had been used to generate the cache keys. This caused a cache miss on all cached items until the cache filled again.

AVG:0.061

The test shows a response time of between 0.012 and 0.51 and averaging 0.061 seconds even though we are running at most 10 seconds behind the changes made on the server. Can we do any better than that?

KeyLock

Maybe. A close look at the above “Virtual User Activity Chart” shows 4 long blue bars where the change of token on the server was detected by each of the 4 users simultaneously. High-pressure items like this can really benefit from some locking to ensure only one of the 4 users makes the change token request and everyone else uses that response. That’s what “KeyLock” is for. So let’s try again:
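The only change from the previous configuration is turning on keyLock for the heavily shared change token operation, something along these lines:

<operation action="http://microsoft.com/webservices/SharePointPortalServer/UserProfileChangeService/UserProfileChangeToken"
           enabled="true" timeout="00:00:10" maxStale="00:00:00" keyLock="true" cacheType="MemoryCache" keyCreationMethod="MessageBody"/>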


AVG:0.049

Here we can see that the 4 long concurrent bars representing getting the new server token and a fresh piece of data are no longer present, as only one of the clients made that request. The test shows a response time of between 0.012 and 0.51 seconds, averaging 0.049 seconds, while delivering content lagged by just 10 seconds from the server.

Conclusion

What we have built here is a sort of SharePoint Workspace for Web Services. And it’s not specific to SharePoint. The lesson from this is twofold:

  • Don’t panic if your server is now a long way away, there may be simple unobtrusive ways to cater for the latency introduced by remote services (or indeed otherwise under-performing services).
  • When designing services consider adding an operation to expose a server change token indicator (like SharePoint does) to help remote clients stay in sync.

WCF is all a bit 2010 now. SharePoint has gone REST (although I deliberately chose UserProfile as it is not adequately implemented in the SharePoint REST services yet). The plug-in model of Web API is actually simpler and provides the same power to influence both the server and client side. So next stop, we’ll look at some Web API alternatives to do the same.

Source Code is here

SharePoint Online Web Service Authentication using WCF Client-side behaviour

With the release of SharePoint 2013 and the ever increasing numbers taking up the SharePoint Online offering, it’s a good time to start looking at some of the challenges when moving to these platforms.

SharePoint has traditionally been a presentation technology with its own unique SharePoint development model, utilising SharePoint Designer and custom Web Part development. With the latest release, SharePoint 2013, that development model has been challenged by a new autonomous development model where the complexities and constraints of SharePoint as a development and deployment platform have been replaced by a service-oriented integration platform for multiple independently hosted applications.

SharePoint has always supported a rich and expanding set of services with every release. As part of the move to the new development model, a lot of work has gone into improving the “client side” access into SharePoint by wrapping services in client-side APIs and delivering some as easy-to-consume REST/oData endpoints, but the functionality of those services is still limited and has not yet evolved to match the power of the native SharePoint server API nor the native SharePoint Web Services.

When moving from on-premises SharePoint to SharePoint Online, web service clients that were previously written against the SharePoint Web Services (like UserProfile.asmx or Lists.asmx) will stop working because the authentication model has changed. A web service client that previously used Active Directory accounts to access SharePoint will need to become “claims aware” to use SharePoint Online. This involves syncing Active Directory accounts up to the MS Online Active Directory and authenticating at the MS Online Security Token Service (STS) before accessing SharePoint.

We hit this problem where a client had built a set of jobs, utilities and components that request and update SharePoint through services, which then broke when migrating to SharePoint Online. The solution to updating these clients is well covered by this blog by Wictor Wilén. What this involves is opening up the source code for each client and inserting the appropriate authentication code, using the MsOnlineClaimsHelper to get the authentication cookies before calling the service.

The problem is we really didn’t want to pick through someone else’s code and find all the places to insert the authentication code. Surely there is an easier way to inject this boilerplate code without having to recode all the clients? It turns out there is, if those clients are written using WCF. What we can do is write a WCF client-side behaviour to detect when authentication needs to be done, make the appropriate requests to the MS Online STS, attach the cookies and then allow the SharePoint service request to proceed as normal.
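Conceptually, such a behaviour is just an IClientMessageInspector that attaches the authentication cookies to each outgoing request. The sketch below illustrates the idea only and is not the library’s actual implementation; GetFedAuthCookieHeader() is a hypothetical placeholder for the STS conversation that MsOnlineClaimsHelper performs.

using System.Net;
using System.ServiceModel;
using System.ServiceModel.Channels;
using System.ServiceModel.Description;
using System.ServiceModel.Dispatcher;

public class CookieBehaviour : IEndpointBehavior, IClientMessageInspector
{
    public object BeforeSendRequest(ref Message request, IClientChannel channel)
    {
        // Find (or create) the HTTP property of the outgoing message
        HttpRequestMessageProperty httpRequest;
        object property;
        if (request.Properties.TryGetValue(HttpRequestMessageProperty.Name, out property))
        {
            httpRequest = (HttpRequestMessageProperty)property;
        }
        else
        {
            httpRequest = new HttpRequestMessageProperty();
            request.Properties.Add(HttpRequestMessageProperty.Name, httpRequest);
        }

        // Attach the FedAuth/rtFa cookies so SharePoint Online accepts the call
        httpRequest.Headers[HttpRequestHeader.Cookie] = GetFedAuthCookieHeader();
        return null;
    }

    public void AfterReceiveReply(ref Message reply, object correlationState) { }

    public void ApplyClientBehavior(ServiceEndpoint endpoint, ClientRuntime clientRuntime)
    {
        clientRuntime.MessageInspectors.Add(this);
    }

    public void AddBindingParameters(ServiceEndpoint endpoint, BindingParameterCollection bindingParameters) { }
    public void ApplyDispatchBehavior(ServiceEndpoint endpoint, EndpointDispatcher endpointDispatcher) { }
    public void Validate(ServiceEndpoint endpoint) { }

    private static string GetFedAuthCookieHeader()
    {
        // Placeholder: authenticate against the MS Online STS (as the claims helper does)
        // and return a header value of the form "FedAuth=...; rtFa=..."
        throw new System.NotImplementedException();
    }
}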

To use the ClaimsHelper, first set up the Uris of the SharePoint resources you want to access and the corresponding username/passwords using a CredentialCache and set it to the MsOnlineClaimsHelper.CredentialCache.

            CredentialCache credentialCache = new CredentialCache();
            credentialCache.Add(_hostUri, "Basic", new NetworkCredential("peter.reid@kloud.com.au", "blah"));
            MsOnlineClaimsHelper.CredentialCache = credentialCache;
Then there are two ways to use this library. Either manually request the cookies or attach to the outgoing request like this:

            UserProfile.UserProfileServiceSoapClient userProfile = new UserProfile.UserProfileServiceSoapClient("UserProfileServiceSoap", _hostUri + "_vti_bin/userprofileservice.asmx");
            using (new OperationContextScope(userProfile.InnerChannel))
            {
                _helper.AddAuthCookies(OperationContext.Current);
                UserProfile.PropertyData[] data = userProfile.GetUserProfileByName(userName);
                return Guid.Parse(GetPropertyValue(data, "UserProfile_GUID"));
            }
Or automatically do the same by adding the WCF client side behaviour like this:

 <!--use behaviour to implement authentication-->
    <behaviors>
      <endpointBehaviors>
        <behavior name="CookieBehavior">
          <CookieBehaviorExtension/>
        </behavior>
      </endpointBehaviors>
    </behaviors>
    <extensions>
      <behaviorExtensions>
        <add name="CookieBehaviorExtension" type="ClaimsHelper.CookieBehaviourExtension, ClaimsHelper, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null"/>
      </behaviorExtensions>
    </extensions>
    <client>
      <endpoint behaviorConfiguration="CookieBehavior" binding="basicHttpBinding" bindingConfiguration="BasicBinding"
               contract="UserProfile.UserProfileServiceSoap" name="UserProfileServiceSoap" />
    </client>

This enables you to use a standard web service client call, knowing that the authentication conversation is happening under the hood in the WCF client-side behaviour.

            UserProfile.UserProfileServiceSoapClient userProfile
                    = new UserProfile.UserProfileServiceSoapClient("UserProfileServiceSoap", _hostUri + "_vti_bin/userprofileservice.asmx");
            UserProfile.PropertyData[] data = userProfile.GetUserProfileByName(userName);
            return Guid.Parse(GetPropertyValue(data, "UserProfile_GUID"));

Check out the code
We will use this library and build on this functionality in later blogs that address some other issues when moving to SharePoint Online.

Build a RESTful Router using WCF (Browse Azure Table Storage with Excel)

In the Visualise Azure Table Storage with Excel and Fiddler post I described how to use Fiddler as a handy proxy that can be used to inject the protocol differences between standard oData and the non-standard oData supported by Azure Table storage. While Fiddler is a handy desktop tool to illustrate the solution, it is not really appropriate as part of a system architecture. A simple request to host the same Azure Table Storage to oData functionality as a hosted service led me down a long and winding road in the world of WCF, but one worth documenting.

What Fiddler really provides is a local proxy through which requests route before leaving the client and responses route through before being returned to the browser. A hosted solution requires something slightly different, a router with the ability to take requests from one URL and serve them from another URL. Additionally we need the router to do content inspection and URL rewriting so that the content returned to the client directs requests back through the router.

So take a look at a couple of design options and building blocks to make sure there’s nothing obvious that already does it:

IIS, URLRewrite and ARR

I covered briefly the use of Application Request Routing as a router in the Publish Content to the Azure CDN with UrlRewrite blog. URLRewrite and ARR working together make a great router. It’s a powerful and high-performance way to direct traffic intended for one location to another location while modifying the request and potentially the response along the way. While the URLRewrite part is powerful, it is not possible, within the constraints of the regexp-based rules and simple plugin model, to implement the kind of rich functionality required to create the Azure security key we need.

Fail!

URLRewrite, ARR and ASP.NET HttpModule

An HttpModule very neatly plugs into the IIS pipeline and gives us the opportunity to write powerful .NET code to affect the request on the way through, and potentially (when the requesting Url is changed), hand off to ARR to forward the request “off-box” and route the response back through. There are a few tricks to getting this to work.

Firstly working in the IIS pipeline can be a tricky business. It’s important to remember you are working with streams of data, not strings or collections. The act of reading the stream can have an impact on those that come after you in the processing stack. And when operating between IIS native pipeline (where ARR operates) and .NET integrated pipeline (where HttpModule operates), it’s not obvious when the stream is being read. A simple statement such as this:

HttpContext context = HttpContext.Current;
HttpRequest request = context.Request;
string value = request.Params["Name"];

Actually permanently affects the incoming stream in a way that makes it unable to be processed back at the native pipeline level. Certain .NET statements lift the stream out of IIS at certain points and surfaces it in a .NET interface, which prohibits further stream processing at the native level. To address this issue, .NET 4.0 now has the request.InsertEntityBody(). This method call will put a copy of the request back into the native pipeline in IIS so both APIs can operate on the streams. That means that after any change to the request a call to request.InsertEntityBody() needs to be made to pass that change back to IIS.

There are further complications when working with streams. Implementing the addition of the oData paging <link> requires the module to look in the response stream for the headers and render a new link in the response body using those values. That means buffering the response, reading just the headers and then performing a replace before flushing the stream. It can be done (I think) but the question is, do we need all the complexity of IIS and URLRewrite, ARR and HttpModule?

Maybe

WCF

WCF abstracts the notion of communication away from the hosting platform, message protocol and formats. Sure, you can still use IIS to host a WCF service, but you don’t have to. WCF can be hosted in its own service, listening on certain ports and acting on messages arriving on them. WCF can be used as a transport and message agnostic router which provides everything that IIS, URLRewrite, ARR and HttpModule do together. So WCF looks promising and is worth a closer look.

WCF 4.0 Routing Service

First stop on the WCF trail is the new and very promising sounding WCF 4.0 Routing Service! This looks like it’s exactly what we need, but before going much further there is a statement in the documentation that says “The Routing Service does not currently support routing of WCF REST services. To route REST calls, consider using System.Web.Routing or Application Request Routing”. Huh! Back to ARR? Why would WCF single out REST as not supported? In the interests of not fighting the available technology, rather than look much further into getting around that limitation, I decided to believe the doco, bin it and move on.

Fail

WCF Router

What about building our own router using WCF.

Michelle covered the basics in her two-part blog “Building a WCF Router”, which has all the basic building blocks and concepts we can build upon. With a start like that, surely this can’t be too complicated? Next, Bing and Google madly to see if anyone has done it already, which reveals these valid attempts:

  • There’s a different approach here but it ties the router solution to the underlying data structure so isn’t as generic as we need, but affirms that others are trying to solve the same problem.
  • This is as close as it gets, but it doesn’t rewrite the responses so they point back to the router.

We’ll use this basic construct and add Routing, Rewriting and WCF Behaviours to plug in the additional features (including oData/Azure Table Storage authentication functionality later). But before implementing any further let’s pause for a moment and look at how each of these WCF components comes together to solve the problem.

In designing the Router I wanted to store no state at the Router. That is, everything required to act on a message has to be available in the message or the available message context. That means there are certain points where we need to save Urls and set properties on messages to carry additional information.

  • A request comes in from a client to the Router, but before reaching the Router Rewrite Behaviour (on the service side) picks up the RouterUrl and saves it for later
  • The Router decodes the Url and works out the destination service to send to, but before reaching the service the Rewrite Behaviour (on the client side) saves the Service Url for later
  • The destination service processes the request and responds with some data, but before arriving at the router the Rewrite Behaviour (on the client side) tunnels the saved Service Url into the message properties
  • The Router takes the response and replies to the Client, but before arriving at the Client the message contents are rewritten so that any Service Urls are replaced with matching Router Urls.

Router

So it’s time to start coding starting with the simplest possible router which has a single method that accepts any Message and responds with a Message.

[ServiceContract]
public interface IRouter
{
  [OperationContract(Action = "*", ReplyAction = "*", Name = "*")]
  [WebGet()]
  Message ProcessMessage(Message requestMessage);
}

RouteTo

First question we need to resolve is how to tell the router where to route to. This is often done with Http headers or similar, but since we are hoping to support vanilla REST and oData clients (like Excel) there is no opportunity to insert headers so I’ve chosen to use the URL. Specifically the first Uri path segment after the domain is mapped to the service URL.

For instance: http://router.cloudapp.net/netflix/v2/Catalog => https://odata.netflix.com/v2/Catalog
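A sketch of that mapping, assuming a simple in-memory route table keyed on the first path segment (the real router fills this table via the registration endpoint described next):

using System;
using System.Collections.Concurrent;

public static class RouteTable
{
    private static readonly ConcurrentDictionary<string, Uri> routes =
        new ConcurrentDictionary<string, Uri>(StringComparer.OrdinalIgnoreCase);

    public static void Register(string name, Uri routeTo) { routes[name] = routeTo; }

    // e.g. http://router.cloudapp.net/netflix/v2/Catalog => https://odata.netflix.com/v2/Catalog
    public static Uri Resolve(Uri incoming)
    {
        string name = incoming.Segments.Length > 1 ? incoming.Segments[1].Trim('/') : string.Empty;

        Uri serviceUri;
        if (!routes.TryGetValue(name, out serviceUri))
            return null;

        // Strip the route name from the path and re-root the remainder on the service Uri
        string remainder = incoming.AbsolutePath.Substring(name.Length + 1);
        return new Uri(serviceUri, remainder + incoming.Query);
    }
}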

Registration

Next we come up with a strategy for registering and unregistering routes. Ideally I’d like to support registration at runtime (using REST), so I take control of a couple of special Router URLs.
http://router.cloudapp.net/register?name=netflix&routeTo=odata.netflix.com&type=Route
Will enable the previous route; and similarly for unregister
http://router.cloudapp.net/unregister?name=abc
(The Type parameter is used to describe the type of routing and will be described later.)

Routing

The routing part is very simple in WCF. Before passing a message onto the WCF runtime with ProcessMessage, simply rewrite the “To” Header by replacing the routerUrl with the destination service Url and WCF will do the rest (no Application Request Routing complications here).

if (serviceUri != null)
{
//map the Uri arriving at the Router to the corresponding Uri at the service 
UrlRewriter urlRewriter = new UrlRewriter(serviceUri, request.Headers.To);
request.Headers.To = new Uri(urlRewriter.ReplaceWithService(request.Headers.To.ToString()));
}
//now the Router makes the Service request
Message reply = router.ProcessMessage(request);
return reply;

Rewrite Behaviour

So we can successfully forward messages onto configured endpoints and get responses. The problem is the responses (as is the case for web pages and REST/oData) will contain Uri’s for the originating service so that subsequent requests will go direct to the service and not via the router. What we need to do is look for any service Uri’s in the returned content and replace with the Router address. This can be done with a WCF behaviour using IDispatchMessageInspector and IClientMessageInspector to pick up the Urls from the message and meet up in BeforeSendReply to modify the reply just before sending it back to the client.

//rewrite all the urls to point to the proxy url
    reply = Common.ReplaceInMessage(reply, serviceRequestPath, proxyRequestPath);

With any WCF solution the devil is in the config. To make the behaviour magic happen we use configuration to add the service behaviour before the Router

<service name="WcfRouter.Router">
  <endpoint name="routerEndpoint" behaviorConfiguration="RouteAndRewrite" binding="webHttpBinding" contract="WcfRouter.IRouter"/>
</service>

And add the client behaviour after the Router

<client>
  <endpoint name="Route" behaviorConfiguration="Route" binding="webHttpBinding" contract="WcfRouter.IRouter"/>
</client>

Routing Types

To support the different possible routing behaviours we use the “Type” parameter on registration. The value of Type is used to select the name of the client endpoint, which in turn chooses the set of behaviours to be applied on the service side of the router.

<client>
  <endpoint name="Route" behaviorConfiguration="Route" binding="webHttpBinding" contract="WcfRouter.IRouter"/>
  <endpoint name="RouteAndRewrite" behaviorConfiguration="RouteAndRewrite" binding="webHttpBinding" contract="WcfRouter.IRouter"/>
</client>

Where Type=”Route” selects an empty behaviour set

<endpointBehaviors>
  <behavior name="Route"></behavior>

And Type="RouteAndRewrite" selects the RewriteBehaviorExtension

  <behavior name="RouteAndRewrite">
    <RewriteBehaviorExtension/>
  </behavior>
</endpointBehaviors>

Now we've built a generic rewriting router with a registration model. But the real goal here is the ability to register and plug in additional WCF behaviour extensions, so…

Azure Table Storage Auth

Back to the original problem, which is that Azure Table Storage requires authentication keys to be sent with the request. That means the client endpoint we use to request data from table storage needs an additional behaviour to modify the request outgoing from the router to table storage.

This is a slightly different behaviour: here the router is acting as a service client, so it modifies the request using an implementation of IClientMessageInspector.BeforeSendRequest and adds headers onto the outgoing web request.

GenerateAzureAuthHeaders(storageCredentials, request.Headers.To, out dateHeader, out authorizationHeader);
httpRequest.Headers.Add(DateHeader, dateHeader);
httpRequest.Headers.Add(AuthorizationHeader, authorizationHeader);
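For reference, a hedged sketch of what a helper like GenerateAzureAuthHeaders might do for the Table service's SharedKeyLite scheme. This is a simplified illustration based on the documented signing rules, not the project's actual implementation:

using System;
using System.Globalization;
using System.Security.Cryptography;
using System.Text;

public static class SharedKeyLite
{
    // Builds the date and Authorization header values for a Table service request.
    // The date value is sent as the x-ms-date (or Date) header.
    public static void GenerateAzureAuthHeaders(string accountName, string accountKey, Uri requestUri,
                                                out string dateHeader, out string authorizationHeader)
    {
        // RFC1123 date, e.g. "Mon, 01 Jan 2024 00:00:00 GMT"
        dateHeader = DateTime.UtcNow.ToString("R", CultureInfo.InvariantCulture);

        // SharedKeyLite (Table service) signs the date and the canonicalized resource:
        //   /{account}{path-of-request}
        string canonicalizedResource = "/" + accountName + requestUri.AbsolutePath;
        string stringToSign = dateHeader + "\n" + canonicalizedResource;

        using (var hmac = new HMACSHA256(Convert.FromBase64String(accountKey)))
        {
            string signature = Convert.ToBase64String(
                hmac.ComputeHash(Encoding.UTF8.GetBytes(stringToSign)));
            authorizationHeader = "SharedKeyLite " + accountName + ":" + signature;
        }
    }
}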

oData Link Element

One last piece of the puzzle is the oData <link> element. This is an additional element that needs to be added to the reply from table storage to implement paging. It replaces the header-based implementation used on table storage replies with the standard oData method of a link Uri to the next page. Again a custom behaviour can help here on the router client side, but this time implementing IClientMessageInspector.AfterReceiveReply. This method gives us the opportunity to replace the message contents returned from table storage with a new oData-compliant one including link elements (which will be rewritten to be routed through the proxy by our earlier rewrite behaviour).

linkElement = "<link rel=\"next\" href=\"" + WebUtility.HtmlEncode(newUri) + "\"/>";
// Do the replace and recreate the message
reply = ReplaceInMessage(reply, FeedEndElement, linkElement + "\n" + FeedEndElement);

Transfer Encoding

While all that looks dead simple, there is one very important note when implementing this extra Azure behaviour (and it took a lot of debugging). The responses coming back from Azure table storage have a header that can be seen in Fiddler:

Transfer-Encoding: chunked

When using WCF to change Messages, it is important to create a new copy of the Message and copy across all of the Properties and Headers from the old message. It turns out that the act of copying the message means it is no longer "chunked", so that header must be removed before passing the Message back through WCF. If you don't remove the header, no Message body is returned and no error is raised either.

httpResponse.Headers.Remove(TransferEncodingHeader);
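A minimal sketch of that copy-and-strip step, assuming a hypothetical helper and using the standard WCF Message APIs (the real behaviour will differ in how it builds the new body):

using System.Net;
using System.ServiceModel.Channels;
using System.Xml;

// Rebuild the reply around new body XML, carry over its headers and properties,
// then strip the Transfer-Encoding header because the copied body is no longer chunked.
static Message RecreateReply(Message oldReply, XmlReader newBody)
{
    Message newReply = Message.CreateMessage(oldReply.Version, null, newBody);
    newReply.Headers.CopyHeadersFrom(oldReply);
    newReply.Properties.CopyProperties(oldReply.Properties);

    object prop;
    if (newReply.Properties.TryGetValue(HttpResponseMessageProperty.Name, out prop))
    {
        // Without this, WCF returns an empty body and no error.
        ((HttpResponseMessageProperty)prop).Headers.Remove("Transfer-Encoding");
    }
    return newReply;
}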

Registration

To properly handle the new AzureAuth routing ability, we need one more piece of information: the StorageKey to be associated with a given route, like so.

http://router.cloudapp.net/register?name=backtester&routeTo=backtester.table.core.windows.net&type=AzureAuth&connectionstring=DefaultEndpointsProtocol=http;AccountName=backtester;AccountKey=blah

Hosting

The power of WCF is that it can be hosted pretty much anywhere. For the most flexibility it is best hosted on its own, where it can take over the entire Url space so any Url can be served. IIS is also an option, but requires the use of an annoying .svc extension in your router Urls. I've done both and the Router service is happy either way. It can be deployed into Azure as a WebRole, a WorkerRole (with some additional effort) or, for the purposes of this example, hosted in Azure Web Sites for anyone to try.

Just browse to http://tablerouter.azurewebsites.net

Try out the Netflix router with Urls like http://tablerouter.azurewebsites.net/Router.svc/netflix/Catalog/Languages(). Here we are browsing the oData of the Netflix catalog but using our Router to serve the content.

Now try the pre-registered quote route http://tablerouter.azurewebsites.net/Router.svc/quotes/Quotes()

Then finally get an Azure Table Storage Url and Connection String and register your own route.

Browsing

This brings us back to the original request. How to view Azure Table Storage using Excel. As per the original Fiddler based blog, follow the same steps.

  • Use Excel 2013 or Excel 2010 with PowerPivot
  • Data=>From Other Data Sources=>oData Feed
  • Follow the wizard with your registered route
  • And view the data!

A note on Security

What we have done with the oData proxy is to remove the need for vanilla oData clients (like Excel) to know and understand the nuances of the Azure Table Storage security keys. But to do that we've had to give the Router the ability to impersonate a secure client by giving it your security key information, which is kept in memory on the Router. You now have a publicly accessible Url to your data.

Code

The code is available here with some additional bits and pieces for hosting in WebRoles and WorkerRoles

Add oData Paging to Azure Table Storage

After reading my previous blog, one of my Kloud colleagues raised a good point:

“What about continuation tokens?”

“Hmm, not sure, let me check. They should just pass through.”

Not so much.

Continuation tokens are used by the Azure Table Storage API as a way to move through a result set which may be larger than the maximum (1000 rows) or has been limited using a $top parameter. Sure enough, a quick check shows that the previous solution is limited to 1000 rows in Excel. It seems Microsoft diverged from the oData spec in this area too, which is a shame; but not insurmountable.

It’s time to break out Fiddler again to see what’s happening under the hood. Responses from Azure Table Storage contain HTTP headers x-ms-continuation-NextPartitionKey and x-ms-continuation-NextRowKey.

The intention is that the requesting client will take these HTTP headers and pass them back on a subsequent paging request on the query string as NextPartitionKey and NextRowKey.

oData on the other hand uses a much neater solution. In the returned oData XML there are entry elements and link elements. The link rel="next" elements contain the full URL that the client should use to get the next page of data. The convention is to use a skiptoken, but we can actually tunnel any type of token through, because it is the whole URL that the oData client will use to get the next page. Here's how it looks on the NetFlix oData interface, which uses an integer skiptoken to page through the dataset.

To bridge the gap between the Azure Table APIs and the oData API we need to look at every response from Azure Table Storage, take the continuation keys from the response headers and form them into a link element appended after the returned data entry elements. This again requires an intermediate proxy between your oData client and Azure Table Storage, for which we'll use Fiddler again for now.

This is pretty easy to implement (with some trial and error) by extending the same Fiddler extension we used earlier; it is available as a DLL here or as source code here. Incidentally the same (or similar) code could be reused in a hosted proxy, which I might have a go at next time.
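For illustration, a hedged sketch of what that response tampering might look like in a Fiddler extension (Fiddler's IAutoTamper hooks are real, but the header-to-link translation below is a simplified assumption rather than the published extension's code):

using System.Net;
using Fiddler;

public class ContinuationLinkTamper : IAutoTamper
{
    public void AutoTamperRequestBefore(Session oSession)
    {
        // Buffer table storage responses so the body can be rewritten
        // before it is returned to the client.
        if (oSession.hostname.EndsWith(".table.core.windows.net"))
            oSession.bBufferResponse = true;
    }

    public void AutoTamperResponseBefore(Session oSession)
    {
        if (!oSession.hostname.EndsWith(".table.core.windows.net")) return;

        string nextPartition = oSession.oResponse.headers["x-ms-continuation-NextPartitionKey"];
        string nextRow = oSession.oResponse.headers["x-ms-continuation-NextRowKey"];
        if (string.IsNullOrEmpty(nextPartition)) return;

        // Translate the continuation headers into an oData <link rel="next"> element
        // pointing back at the same URL with the tokens on the query string.
        string separator = oSession.fullUrl.Contains("?") ? "&" : "?";
        string nextUrl = oSession.fullUrl + separator +
            "NextPartitionKey=" + nextPartition + "&NextRowKey=" + nextRow;
        string link = "<link rel=\"next\" href=\"" + WebUtility.HtmlEncode(nextUrl) + "\"/>";

        oSession.utilDecodeResponse(); // remove chunking/compression before editing the body
        string body = oSession.GetResponseBodyAsString();
        oSession.utilSetResponseBody(body.Replace("</feed>", link + "\n</feed>"));
    }

    // Remaining extension members are not needed for this sketch.
    public void AutoTamperRequestAfter(Session oSession) { }
    public void AutoTamperResponseAfter(Session oSession) { }
    public void OnBeforeReturningError(Session oSession) { }
    public void OnLoad() { }
    public void OnBeforeUnload() { }
}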

Now, running Excel over a large data set in Azure Table Storage, we see the interaction between Excel and Table Storage as it walks down the pages requesting data using the default maximum page size of 1000 rows (you can elect to use smaller page sizes using the $top=n oData query string).

Now we've got more than 1000 rows in our Excel table!

Thanks Dave for reminding me about continuation tokens.

Visualise Azure Table Storage with Excel and Fiddler

Today I came across an interesting problem.

I'm a big fan of Table Storage but its potential is yet to be realised because the tool support just isn't a match for databases. We've got a solution which lays down a lot of data into Azure Table Storage, but the options to view that data are limited. There are plenty of available viewers, including Visual Studio, Azure Storage Explorer and others. The problem with all of these viewers is they are limited to plain old tabular data views.

What if we want to “visualise” the data? Is there a tool to provide a graphical view of my Table Storage data?

Sure, we could open Visual Studio, grab a charting control and deploy a WebRole to build images from the data, but I'm no fan of writing code and prefer to build solutions on the shoulders of others. This, naturally, got me thinking about Excel!

Microsoft, in their wisdom and with a view to the future of an "Open Data" world, used oData to interface with Table Storage. oData is a REST protocol built for interacting with data stores over HTTP. The entire Azure Data Market is presented through oData and there are some great examples of presenting data through oData (like the Netflix API).

With that said, shouldn’t it be easy to point Excel at the oData interface and chart the data?

Yes and No.

Yes, Excel has the ability to interface with oData, both through the PowerPivot add-in and natively in the latest version, Excel 2013.

But if you point directly at your Table Storage URL endpoint then you'll get an access denied message. The problem is there is no way to express your table storage credentials to connect to the data store. And it's more complicated than just adding the storage key onto the request, because the Azure implementation of Table Storage requires that authentication headers carry a hash of some request parameters signed with the storage key. (This is sensible since your secure storage key never needs to cross the wire.) The concepts of "SharedKey" and "SharedKeyLite" are well described here.

So what we really need is a local proxy that will intercept the requests from Excel or other clients and insert the required SharedKeyLite Authorization header. One of the best and most powerful local proxy servers is … Fiddler. Fiddler provides an “Extension” plugin architecture that gives us just the hooks required to implement our authentication logic as requests are being intercepted.

If you're interested in how to build such a Fiddler extension, build the source; otherwise just grab the assembly and drop it into \Program Files (x86)\Fiddler2\Scripts. Next time you start Fiddler there will be a new Tools menu item, "Azure Auth…", to set up your account and storage keys.

After that, the extension will watch for any requests to youraccount.table.core.windows.net, intercept them and add the required Authorization headers for Azure Table Storage. To test it out, launch a browser (for IE you'll want to turn off the default feed reading view as per this article) and browse to the URL http://youraccount.table.core.windows.net/Tables(). This will return all the tables available in your account.

Now try Excel with one of your tables using Data->From Other Sources->From oData Data Feed

And Choose your table

Finish the Wizard and then choose one of the Excel data presentations

Now you can chart your Azure Table Storage data with no code!

Note: updated now to include oData paging.