Demystifying Managed Service Identities on Azure

Managed service identities (MSIs) are a great feature of Azure that are being gradually enabled on a number of different resource types. But when I’m talking to developers, operations engineers, and other Azure customers, I often find that there is some confusion and uncertainty about what they do. In this post I will explain what MSIs are and are not, where they make sense to use, and give some general advice on how to work with them.

What Do Managed Service Identities Do?

A managed service identity allows an Azure resource to identify itself to Azure Active Directory without needing to present any explicit credentials. Let’s explain that a little more.

In many situations, you may have Azure resources that need to securely communicate with other resources. For example, you may have an application running on Azure App Service that needs to retrieve some secrets from a Key Vault. Before MSIs existed, you would need to create an identity for the application in Azure AD, set up credentials for that application (also known as creating a service principal), configure the application to know these credentials, and then communicate with Azure AD to exchange the credentials for a short-lived token that Key Vault will accept. This requires quite a lot of upfront setup, and can be difficult to achieve within a fully automated deployment pipeline. Additionally, to maintain a high level of security, the credentials should be changed (rotated) regularly, and this requires even more manual effort.

With an MSI, in contrast, the App Service automatically gets its own identity in Azure AD, and there is a built-in way that the app can use its identity to retrieve a token. We don’t need to maintain any AD applications, create any credentials, or handle the rotation of these credentials ourselves. Azure takes care of it for us.

It can do this because Azure can identify the resource – it already knows where a given App Service or virtual machine ‘lives’ inside the Azure environment, so it can use this information to allow the application to identify itself to Azure AD without the need for exchanging credentials.

What Do Managed Service Identities Not Do?

Inbound requests: One of the biggest points of confusion about MSIs is whether they are used for inbound requests to the resource or for outbound requests from the resource. MSIs are for the latter – when a resource needs to make an outbound request, it can identify itself with an MSI and pass its identity along to the resource it’s requesting access to.

MSIs pair nicely with other features of Azure resources that allow for Azure AD tokens to be used for their own inbound requests. For example, Azure Key Vault accepts requests with an Azure AD token attached, and it evaluates which parts of Key Vault can be accessed based on the identity of the caller. An MSI can be used in conjunction with this feature to allow an Azure resource to directly access a Key Vault-managed secret.

Authorisation: Another important point is that MSIs are only directly involved in authentication, not in authorisation. In other words, an MSI allows Azure AD to determine what the resource or application is, but that by itself says nothing about what the resource can do. For many Azure resources, authorisation is handled by Azure’s own Identity and Access Management (IAM) system. Key Vault is one exception – it maintains its own access control system, which is managed outside of Azure’s IAM. For non-Azure resources, we could communicate with any authorisation system that understands Azure AD tokens; an MSI will then just be another way of getting a valid token that an authorisation system can accept.

Another important point to be aware of is that the target resource doesn’t need to run within the same Azure subscription, or even within Azure at all. Any service that understands Azure Active Directory tokens should work with tokens for MSIs.

How to Use MSIs

Now that we know what MSIs can do, let’s have a look at how to use them. Generally there will be three main parts to working with an MSI: enabling the MSI; granting it rights to a target resource; and using it.

  1. Enabling an MSI on a resource. Before a resource can identify itself to Azure AD, it needs to be configured to expose an MSI. The way that you do this will depend on the specific resource type you’re enabling the MSI on. In App Services, an MSI can be enabled through the Azure Portal, through an ARM template, or through the Azure CLI, as documented here. For virtual machines, an MSI can be enabled through the Azure Portal or through an ARM template. Other MSI-enabled services have their own ways of doing this.

  2. Granting rights to the target resource. Once the resource has an MSI enabled, we can grant it rights to do something. The way that we do this is different depending on the type of target resource. For example, Key Vault requires that you configure its Access Policies, while to use the Event Hubs or the Azure Resource Manager APIs you need to use Azure’s IAM system. Other target resource types will have their own way of handling access control.

  3. Using the MSI to issue tokens. Finally, now that the resource’s MSI is enabled and has been granted rights to a target resource, it can be used to obtain tokens so that requests can be made to that target resource. Once again, the approach will be different depending on the resource type. For App Services, there is an HTTP endpoint within the App Service’s private environment that can be used to get a token, and there is also a .NET library that will handle the API calls if you’re using a supported platform. For virtual machines, there is also an HTTP endpoint that can similarly be used to obtain a token. Of course, you don’t need to specify any credentials when you call these endpoints – they’re only available within that App Service or virtual machine, and Azure handles all of the credentials for you. A sketch of calling the App Service endpoint follows below this list.
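As a rough sketch of what this looks like from code, here’s how an application running on App Service might call the local token endpoint from Node.js. The MSI_ENDPOINT and MSI_SECRET environment variables are injected by App Service when the MSI is enabled; the resource URI shown (Key Vault’s) is just an example, and the API version reflects what was current at the time of writing:

```typescript
import * as http from 'http';

// Minimal sketch: request a token from the App Service MSI endpoint.
// No credentials are stored in our code or configuration.
function getMsiToken(resource: string): Promise<string> {
    const endpoint = process.env.MSI_ENDPOINT;
    const secret = process.env.MSI_SECRET;
    if (!endpoint || !secret) {
        return Promise.reject(new Error('MSI is not enabled on this App Service.'));
    }

    const url = `${endpoint}?resource=${encodeURIComponent(resource)}&api-version=2017-09-01`;
    return new Promise<string>((resolve, reject) => {
        http.get(url, { headers: { Secret: secret } }, res => {
            let body = '';
            res.on('data', chunk => (body += chunk));
            res.on('end', () => resolve(JSON.parse(body).access_token));
        }).on('error', reject);
    });
}

// The token can then be attached as a bearer token on requests to the
// target resource - for example, Key Vault.
getMsiToken('https://vault.azure.net')
    .then(token => console.log(`Received a token of length ${token.length}`));
```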

Finding an MSI’s Details and Listing MSIs

There may be situations where we need to find our MSI’s details, such as the principal ID used to represent the application in Azure AD. For example, we may need to manually configure an external service to authorise our application to access it. As of April 2018, the Azure Portal shows MSIs when adding role assignments, but the Azure AD blade doesn’t seem to provide any way to view a list of MSIs. They are effectively hidden from the list of Azure AD applications. However, there are a couple of other ways we can find an MSI.

If we want to find a specific resource’s MSI details then we can go to the Azure Resource Explorer and find our resource. The JSON details for the resource will generally include an identity property, which in turn includes a principalId:

Screenshot 1

That principalId is the object ID of the service principal in Azure AD, and it’s the value that can be used for role assignments.

Another way to find and list MSIs is to use the Azure AD PowerShell cmdlets. The Get-AzureRmADServicePrincipal cmdlet will return a complete list of service principals in your Azure AD directory, including any MSIs. MSIs have service principal names starting with https://identity.azure.net, and the ApplicationId is the client ID of the service principal:

Screenshot 2
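If you want to narrow that list down to just the MSIs, something along these lines should work (a quick sketch using the AzureRM cmdlets; property names may vary slightly between module versions):

```powershell
# List only the service principals that represent managed service identities.
Get-AzureRmADServicePrincipal |
    Where-Object { $_.ServicePrincipalNames -match 'identity\.azure\.net' } |
    Select-Object DisplayName, ApplicationId
```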

Now that we’ve seen how to work with an MSI, let’s look at which Azure resources actually support creating and using them.

Resource Types with MSI and AAD Support

As of April 2018, there are only a small number of Azure services with support for creating MSIs, and all of them are currently in preview. Additionally, while it’s not yet listed on that page, Azure API Management also supports MSIs – primarily for handling Key Vault integration for SSL certificates.

One important note is that for App Services, MSIs are currently incompatible with deployment slots – only the production slot gets assigned an MSI. Hopefully this will be resolved before MSIs become fully available and supported.

As I mentioned above, MSIs are really just a feature that allows a resource to assume an identity that Azure AD will accept. However, in order to actually use MSIs within Azure, it’s also helpful to look at which resource types support receiving requests with Azure AD authentication, and therefore support receiving MSIs on incoming requests. Microsoft maintain a list of these resource types here.

Example Scenarios

Now that we understand what MSIs are and how they can be used with AAD-enabled services, let’s look at a few example real-world scenarios where they can be used.

Virtual Machines and Key Vault

Azure Key Vault is a secure data store for secrets, keys, and certificates. Key Vault requires that every request is authenticated with Azure AD. As an example of how this might be used with an MSI, imagine we have an application running on a virtual machine that needs to retrieve a database connection string from Key Vault. Once the VM is configured with an MSI and the MSI is granted Key Vault access rights, the application can request a token and can then get the connection string without needing to maintain any credentials to access Key Vault.
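As a sketch of what that might look like from code running on the VM, here’s the general shape of the two calls involved. The vault name and secret name are hypothetical, and the API versions shown are illustrative of what was available at the time:

```typescript
import * as http from 'http';
import * as https from 'https';

// Sketch: get a token for Key Vault from the VM's local instance metadata endpoint.
// No credentials are needed - the endpoint is only reachable from inside the VM.
function getVmMsiToken(resource: string): Promise<string> {
    const url = 'http://169.254.169.254/metadata/identity/oauth2/token' +
        `?api-version=2018-02-01&resource=${encodeURIComponent(resource)}`;
    return new Promise((resolve, reject) => {
        http.get(url, { headers: { Metadata: 'true' } }, res => {
            let body = '';
            res.on('data', c => (body += c));
            res.on('end', () => resolve(JSON.parse(body).access_token));
        }).on('error', reject);
    });
}

// Sketch: use the token to read a secret. 'my-vault' and 'sql-connection-string'
// are hypothetical names - substitute your own vault and secret.
function getSecret(token: string): Promise<string> {
    const url = 'https://my-vault.vault.azure.net/secrets/sql-connection-string?api-version=2016-10-01';
    return new Promise((resolve, reject) => {
        https.get(url, { headers: { Authorization: `Bearer ${token}` } }, res => {
            let body = '';
            res.on('data', c => (body += c));
            res.on('end', () => resolve(JSON.parse(body).value));
        }).on('error', reject);
    });
}

getVmMsiToken('https://vault.azure.net')
    .then(getSecret)
    .then(() => console.log('Retrieved the connection string from Key Vault.'));
```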

API Management and Key Vault

Another great example of an MSI being used with Key Vault is Azure API Management. API Management creates a public domain name for the API gateway, to which we can assign a custom domain name and SSL certificate. We can store the SSL certificate inside Key Vault, and then give Azure API Management an MSI and access to that Key Vault secret. Once it has this, API Management can automatically retrieve the SSL certificate for the custom domain name straight from Key Vault, simplifying the certificate installation process and improving security by ensuring that the certificate is not directly passed around.

Azure Functions and Azure Resource Manager

Azure Resource Manager (ARM) is the deployment and resource management system used by Azure. ARM itself supports AAD authentication. Imagine we have an Azure Function that needs to scan our Azure subscription to find resources that have recently been created. In order to do this, the function needs to log into ARM and get a list of resources. Our Azure Functions app can expose an MSI, and so once that MSI has been granted reader rights on the resource group, the function can get a token to make ARM requests and get the list without needing to maintain any credentials.

App Services and Event Hubs/Service Bus

Event Hubs is a managed event stream. Communication to both publish onto, and subscribe to events from, the stream can be secured using Azure AD. An example scenario where MSIs would help here is when an application running on Azure App Service needs to publish events to an Event Hub. Once the App Service has been configured with an MSI, and Event Hubs has been configured to grant that MSI publishing permissions, the application can retrieve an Azure AD token and use it to post messages without having to maintain keys.

Service Bus provides a number of features related to messaging and queuing, including queues and topics (similar to queues but with multiple subscribers). As with Event Hubs, an application could use its MSI to post messages to a queue or to read messages from a topic subscription, without having to maintain keys.

App Services and Azure SQL

Azure SQL is a managed relational database, and it supports Azure AD authentication for incoming connections. A database can be configured to allow Azure AD users and applications to read or write specific types of data, to execute stored procedures, and to manage the database itself. When coupled with an App Service with an MSI, Azure SQL’s AAD support is very powerful – it reduces the need to provision and manage database credentials, and ensures that only a given application can log into a database with a given user account. Tomas Restrepo has written a great blog post explaining how to use Azure SQL with App Services and MSIs.

Summary

In this post we’ve looked into the details of managed service identities (MSIs) in Azure. MSIs provide some great security and management benefits for applications and systems hosted on Azure, and enable high levels of automation in our deployments. While they aren’t particularly complicated to understand, there are a few subtleties to be aware of. As long as you understand that MSIs are for authentication of a resource making an outbound request, and that authorisation is a separate thing that needs to be managed independently, you will be able to take advantage of MSIs with the services that already support them, as well as the services that may soon get MSI and AAD support.

Cosmos DB Server-Side Programming with TypeScript – Part 6: Build and Deployment

So far in this series we’ve been compiling our server-side TypeScript code to JavaScript locally on our own machines, and then copying and pasting it into the Azure Portal. However, an important part of building a modern application – especially a cloud-based one – is having a reliable automated build and deployment process. There are a number of reasons why this is important, ranging from ensuring that code isn’t built on an individual developer’s machine – where environmental variations or differences can cause different outputs – through to running a suite of tests on every build and release. In this post we will look at how Cosmos DB server-side code can be built and released in a fully automated process.

This post is part of a series:

  • Part 1 gives an overview of the server side programmability model, the reasons why you might want to consider server-side code in Cosmos DB, and some key things to watch out for.
  • Part 2 deals with user-defined functions, the simplest type of server-side programming, which allow for adding simple computation to queries.
  • Part 3 talks about stored procedures. These provide a lot of powerful features for creating, modifying, deleting, and querying across documents – including in a transactional way.
  • Part 4 introduces triggers. Triggers come in two types – pre-triggers and post-triggers – and allow for behaviour like validating and modifying documents as they are inserted or updated, and creating secondary effects as a result of changes to documents in a collection.
  • Part 5 discusses unit testing your server-side scripts. Unit testing is a key part of building a production-grade application, and even though some of your code runs inside Cosmos DB, your business logic can still be tested.
  • Finally, part 6 (this post) explains how server-side scripts can be built and deployed into a Cosmos DB collection within an automated build and release pipeline, using Microsoft Visual Studio Team Services (VSTS).

Build and Release Systems

There are a number of services and systems that provide build and release automation. These range from systems you need to install and manage yourself, such as Atlassian Bamboo, Jenkins, and Octopus Deploy, through to managed systems like Amazon CodePipeline/CodeBuild, Travis CI, and AppVeyor. In our case, we will use Microsoft’s Visual Studio Team Services (VSTS), which is a managed (hosted) service that provides both build and release pipeline features. However, the steps we use here can easily be adapted to other tools.

I will assume that you have a VSTS account, that you have loaded the code into a source code repository that VSTS can access, and that you have some familiarity with the VSTS build and release system.

Throughout this post, we will use the same code that we used in part 5 of this series, where we built and tested our stored procedure. The exact same process can be used for triggers and user-defined functions as well. I’ll assume that you have a copy of the code from part 5 – if you want to download it, you can get it from the GitHub repository for that post. If you want to refer to the finished version of the whole project, you can access it on GitHub here.

Defining our Build Process

Before we start configuring anything, let’s think about what we want to achieve with our build process. I find it helpful to think about the start point and end point of the build. We know that when we start the build, we will have our code within a Git repository. When we finish, we want to have two things: a build artifact in the form of a JavaScript file that is ready to deploy to Cosmos DB; and a list of unit test results. Additionally, the build should pass if all of the steps ran successfully and the tests passed, and it should fail if any step or any test failed.

Now that we have the start and end points defined, let’s think about what we need to do to get us there.

  • We need to install our NPM packages. On VSTS, every time we run a build our build environment will be reset, so we can’t rely on any files being there from a previous build. So the first step in our build pipeline will be to run npm install.
  • We need to build our code so it’s ready to be tested, and then we need to run the unit tests. In part 5 of this series we created an NPM script to help with this when we run locally – and we can reuse the same script here. So our second build step will be to run npm run test.
  • Once our tests have run, we need to report their results to VSTS so it can visualise them for us. We’ll look at how to do this below. Importantly, VSTS won’t fail the build automatically if there are any test failures, so we’ll look at how to do this ourselves shortly.
  • If we get to this point in the build then our code is successfully passing the tests, so now we can create the real release build. Again we have already defined an NPM script for this, so we can reuse that work and call npm run build.
  • Finally, we can publish the release JavaScript file as a build artifact, which makes it available to our release pipeline.

We’ll soon see how we can actually configure this. But before we can write our build process, we need to figure out how we’ll report the results of our unit tests back to VSTS.

Reporting Test Results

When we run unit tests from inside a VSTS build, the unit test runner needs some way to report the results back to VSTS. There are some built-in integrations with common tools like VSTest (for testing .NET code). For Jasmine, we need to use a reporter that we configure ourselves. The jasmine-tfs-reporter NPM package does this for us – its reporter will emit a specially formatted results file, and we’ll tell VSTS to look at this.

Let’s open up our package.json file and add the following line into the devDependencies section:
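It will look something like this – the version number is only a placeholder, so check NPM for the current version:

```json
"jasmine-tfs-reporter": "^1.0.2"
```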

Run npm install to install the package.

Next, create a file named spec/vstsReporter.ts and add the following lines, which will configure Jasmine to send its results to the reporter we just installed:

Finally, let’s edit the jasmine.json file. We’ll add a new helpers section, which will tell Jasmine to run that script before it starts running our tests. Here’s the new jasmine.json file we’ll use:
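Something like the following should do the trick. The spec_dir, file patterns, and helper path here assume the compiled output layout we set up in part 5 (compiled specs under output/test), so adjust them if your folders differ:

```json
{
  "spec_dir": "output/test",
  "spec_files": ["spec/**/*[sS]pec.js"],
  "helpers": ["spec/vstsReporter.js"],
  "stopSpecOnExpectationFailure": false,
  "random": false
}
```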

Now run npm run test. You should see that a new testresults folder has been created, and it contains an XML file that VSTS can understand.

That’s the last piece of the puzzle we need to have VSTS build our code. Now let’s see how we can make VSTS actually run all of these steps.

Creating the Build Configuration

VSTS has a great feature – currently in preview – that allows us to specify our build definition in a YAML file, check it into our source control system, and have the build system execute it. More information on this feature is available in a previous blog post I wrote. We’ll make use of this feature here to write our build process.

Create a new file named build.yaml. This file will define all of our build steps. Paste the following contents into the file:
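Here’s a sketch of what that file can look like. The YAML build feature was still in preview at the time of writing, so treat the task names and options as indicative rather than definitive and adjust them to whatever the current syntax requires:

```yaml
queue: Hosted VS2017

steps:
# Restore the NPM packages.
- script: npm install
  displayName: Install NPM packages

# Compile and run the unit tests; any test failure fails this step.
- script: npm run test
  displayName: Run unit tests

# Publish the test results, even if the previous step failed.
- task: PublishTestResults@2
  condition: succeededOrFailed()
  inputs:
    testResultsFiles: 'testresults/*.xml'

# Create the release build of the stored procedure.
- script: npm run build
  displayName: Build release script

# Publish the compiled JavaScript as a build artifact named 'drop'.
- task: PublishBuildArtifacts@1
  inputs:
    pathtoPublish: output/build
    artifactName: drop
```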

This YAML file tells VSTS to do the following:

  • Run the npm install command.
  • Run the npm run test command. If we get any test failures, this command will cause VSTS to detect an error.
  • Regardless of whether an error was detected, take the test results that have been saved into the testresults folder and publish them. (Publishing just means showing them within the build; they won’t be publicly available.)
  • If everything worked up till now, run npm run build to build the releasable JavaScript file.
  • Publish the releasable JavaScript file as a build artifact, so it’s available to the release pipeline that we’ll configure shortly.

Commit this file and push it to your Git repository. In VSTS, we can now set up a new build configuration, point it to the YAML file, and let it run. After it finishes, you should see something like this:

release-1

We can see that four tests ran and passed. If we click on the Artifacts tab, we can view the artifacts that were published:

release-2

And by clicking the Explore button and expanding the drop folder, we can see the exact file that was created:

release-3

You can even download the file from here, and confirm that it looks like what we expect to be able to send to Cosmos DB. So, now we have our code being built and tested! The next step is to actually deploy it to Cosmos DB.

Deciding on a Release Process

Cosmos DB can be used in many different types of applications, and the way that we deploy our scripts can differ as well. In some applications, like those that are heavily server-based and have initialisation logic, we might provision our database, collections, and scripts through our application code. In other systems, like serverless applications, we want to provision everything we need during our deployment process so that our application can immediately start to work. This means there are several patterns we can adopt for installing our scripts.

Pattern 1: Use Application Initialisation Logic

If we have an Azure App Service, Cloud Service, or another type of application that provides initialisation lifecycle events, we can use the initialisation code to provision our Cosmos DB database and collection, and to install our stored procedures, triggers, and UDFs. The Cosmos DB client SDKs provide a variety of helpful methods to do this. For example, the .NET and .NET Core SDKs provide this functionality. If the platform you are using doesn’t have an SDK, you can also use the REST API provided by Cosmos DB.

This approach is also likely to be useful if we dynamically provision databases and collections while our application runs. We can also use this approach if we have an application warmup sequence where the existence of the collection can be confirmed and any missing pieces can be added.

Pattern 2: Initialise Serverless Applications with a Custom Function

When we’re using serverless technologies like Azure Functions or Azure Logic Apps, we may not have the opportunity to initialise our application the first time it loads. We could check the existence of our Cosmos DB resources whenever we are executing our logic, but this is quite wasteful and inefficient. One pattern that can be used is to write a special ‘initialisation’ function that is called from our release pipeline. This can be used to prepare the necessary Cosmos DB resources, so that by the time our callers execute our code, the necessary resources are already present. However, this presents some challenges, including the fact that it necessitates mixing our deployment logic and code with our main application code.

Pattern 3: Deploying from VSTS

The approach that I will adopt in this post is to deploy the Cosmos DB resources from our release pipeline in VSTS. This keeps our release process separate from our main application code, and gives us the flexibility to use the Cosmos DB resources at any point in our application logic. This may not suit all applications, but for many applications that use Cosmos DB, this type of workflow will work well.

There is a lot more to release configuration than I’ll be able to discuss here – that could easily be its own blog series. I’ll keep this particular post focused just on installing server-side code onto a collection.

Defining the Release Process

As with builds, it’s helpful to think through the process we want the release to follow. Again, we’ll think first about the start and end points. When we start the release pipeline, we will have the build that we want to release (which will include our compiled JavaScript script). For now, I’ll also assume that you have a resource group containing a Cosmos DB account with an existing database and collection, and that you know the account key. In a future post I will explain how some of this process can also be automated, but that is outside the scope of this series. Once the release process finishes, we expect that the collection will have the server-side resource installed and ready to use.

VSTS doesn’t have built-in support for Cosmos DB. However, we can easily use a custom PowerShell script to install Cosmos DB scripts on our collection. I’ve written such a script, and it’s available for download here. The script uses the Cosmos DB API to deploy stored procedures, triggers, and user-defined functions to a collection.

We need to include this script in our build artifacts so that we can use it from our deployment process. So, download the file and save it into a deploy folder in the project’s source repository. Then we need to tell the VSTS build process to publish it as an artifact, so open the build.yaml file and add this to the end of the file, being careful to align the spaces and indentation with the sections above it:
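One way to do that is to add another artifact-publishing step along these lines – again, treat this as a sketch, and note that the artifact name just needs to match whatever your release steps expect:

```yaml
# Publish the PowerShell deployment script as its own build artifact.
- task: PublishBuildArtifacts@1
  inputs:
    pathtoPublish: deploy
    artifactName: deploy
```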

Commit these changes, and then run a new build.

Now we can set up a release definition in VSTS and link it to our build configuration so it can receive the build artifacts. We only need one step currently, which will deploy our stored procedure using the PowerShell script we included as a build artifact. Of course, a real release process is likely to do a lot more, including deploying your application. For now, though, let’s just add a single PowerShell step, and configure it to run an inline script with the following contents:
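The inline script I’ve used looks roughly like this. The PowerShell file name and artifact layout are ones I’ve assumed for this example, and – as discussed below – you shouldn’t really hard-code the account details like this:

```powershell
# Load the deployment functions from the build artifact.
. "$(System.DefaultWorkingDirectory)/CosmosServer-CI/deploy/DeployCosmosDBScripts.ps1"

# Install (or update) the stored procedure on the target collection.
DeployStoredProcedure `
    -AccountName 'mycosmosaccount' `
    -AccountKey 'YOUR_ACCOUNT_KEY' `
    -DatabaseName 'Orders' `
    -CollectionName 'Orders' `
    -StoredProcedureName 'getGroupedOrders' `
    -SourceFilePath "$(System.DefaultWorkingDirectory)/CosmosServer-CI/drop/sp-getGroupedOrders.js"
```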

This inline script does the following:

  • It loads in the PowerShell file from our build artifact, so that the functions within that file are available for us to use.
  • It then runs the DeployStoredProcedure function, which is defined in that PowerShell file. We pass in some parameters so the function can contact Cosmos DB:
    • AccountName – this is the name of your Cosmos DB account.
    • AccountKey – this is the key that VSTS can use to talk to Cosmos DB’s API. You can get this from the Azure Portal – open up the Cosmos DB account and click the Keys tab.
    • DatabaseName – this is the name of the database (in our case, Orders).
    • CollectionName – this is the name of the collection (in our case again, Orders).
    • StoredProcedureName – this is the name we want our stored procedure to have in Cosmos DB. This doesn’t need to match the name of the function inside our code file, but I recommend it does to keep things clear.
    • SourceFilePath – this is the path to the JavaScript file that contains our script.

Note that in the script above I’ve assumed that the build configuration’s name is CosmosServer-CI, so that appears in the two file paths. If you have a build configuration that uses a different name, you’ll need to replace it. Also, I strongly recommend you don’t hard-code the account name, account key, database name, and collection name like I’ve done here – you would instead use VSTS variables and have them dynamically inserted by VSTS. Similarly, the account key should be specified as a secret variable so that it is encrypted. There are also other ways to handle this, including creating the Cosmos DB account and collection within your deployment process, and dynamically retrieving the account key. This is beyond the scope of this series, but in a future blog post I plan to discuss some ways to achieve this.

After configuring our release process, it will look something like this:

release-4

Now that we’ve configured our release process we can create a new release and let it run. If everything has been configured properly, we should see the release complete successfully:

release-5

And if we check the collection through the Azure Portal, we can see the stored procedure has been deployed:

release-6

This is pretty cool. It means that whenever we commit a change to our stored procedure’s TypeScript file, it can automatically be compiled, tested, and deployed to Cosmos DB – without any human intervention. We could now adapt the exact same process to deploy our triggers (using the DeployTrigger function in the PowerShell script) and UDFs (using the DeployUserDefinedFunction function). Additionally, we can easily make our build and deployments into true continuous integration (CI) and continuous deployment (CD) pipelines by setting up automated builds and releases within VSTS.

Summary

Over this series of posts, we’ve explored Cosmos DB’s server-side programming capabilities. We’ve written a number of server-side scripts including a UDF, a stored procedure, and two triggers. We’ve written them in TypeScript to ensure that we’re using strongly typed objects when we interact with Cosmos DB and within our own code. We’ve also seen how we can unit test our code using Jasmine. Finally, in this post, we’ve looked at how our server-side scripts can be built and deployed using VSTS and the Cosmos DB API.

I hope you’ve found this series useful! If you have any questions or similar topics that you’d like to know more about, please post them in the comments below.

Key Takeaways

  • Having an automated build and release pipeline is very important to ensure reliable, consistent, and safe delivery of software. This should include our Cosmos DB server-side scripts.
  • It’s relatively easy to adapt the work we’ve already done with our build scripts to work on a build server. Generally it will simply be a matter of executing npm install and then npm run build to create a releasable build of our code.
  • We can also run our unit tests by simply executing npm run test.
  • Test results from Jasmine can be published into VSTS using the jasmine-tfs-reporter package. Other integrations are available for other build servers too.
  • Deploying our server-side scripts onto Cosmos DB can be handled in different ways for different applications. With many applications, having server-side code deployed within an existing release process is a good idea.
  • VSTS doesn’t have built-in support for Cosmos DB, but I have provided a PowerShell script that can be used to install stored procedures, triggers, and UDFs.
  • You can view the code for this post on GitHub.

Cosmos DB Server-Side Programming with TypeScript – Part 5: Unit Testing

Over the last four parts of this series, we’ve discussed how we can write server-side code for Cosmos DB, and the types of situations where it makes sense to do so. If you’re building a small sample application, you now have enough knowledge to go and build out UDFs, stored procedures, and triggers. But if you’re writing production-grade applications, there are two other major topics that need discussion: how to unit test your server-side code, and how to build and deploy it to Cosmos DB in an automated and predictable manner. In this part, we’ll discuss testing. In the next part, we’ll discuss build and deployment.

This post is part of a series:

  • Part 1 gives an overview of the server side programmability model, the reasons why you might want to consider server-side code in Cosmos DB, and some key things to watch out for.
  • Part 2 deals with user-defined functions, the simplest type of server-side programming, which allow for adding simple computation to queries.
  • Part 3 talks about stored procedures. These provide a lot of powerful features for creating, modifying, deleting, and querying across documents – including in a transactional way.
  • Part 4 introduces triggers. Triggers come in two types – pre-triggers and post-triggers – and allow for behaviour like validating and modifying documents as they are inserted or updated, and creating secondary effects as a result of changes to documents in a collection.
  • Part 5 (this post) discusses unit testing your server-side scripts. Unit testing is a key part of building a production-grade application, and even though some of your code runs inside Cosmos DB, your business logic can still be tested.
  • Finally, part 6 explains how server-side scripts can be built and deployed into a Cosmos DB collection within an automated build and release pipeline, using Microsoft Visual Studio Team Services (VSTS).

Unit Testing Cosmos DB Server-Side Code

Testing JavaScript code can be complex, and there are many different ways to do it and different tools that can be used. In this post I will outline one possible approach for unit testing. There are other ways that we could also test our Cosmos DB server-side code, and your situation may be a bit different to the one I describe here. Some developers and teams place different priorities on some of the aspects of testing, so this isn’t a ‘one size fits all’ approach. In this post, the testing approach we will build out allows for:

  • Mocks: mocking allows us to pass in mocked versions of our dependencies so that we can test how our code behaves independently of a working external system. In the case of Cosmos DB, this is very important: the getContext() method, which we’ve looked at throughout this series, provides us with access to objects that represent the request, response, and collection. Our code needs to be tested without actually running inside Cosmos DB, so we mock out the objects it sends us.
  • Spies: spies are often a special type of mock. They allow us to inspect the calls that have been made to the object to ensure that we are triggering the methods and side-effects that we expect.
  • Type safety: as in the rest of this series, it’s important to use strongly typed objects where possible so that we get the full benefit of the TypeScript compiler’s type system.
  • Working within the allowed subset of JavaScript: although Cosmos DB server-side code is built using the JavaScript language, it doesn’t provide all of the features of JavaScript. This is particularly important when testing our code, because many test libraries make assumptions about how the code will be run and the level of JavaScript support that will be available. We need to work within the subset of JavaScript that Cosmos DB supports.

I will assume some familiarity with these concepts, but even if they’re new to you, you should be able to follow along. Also, please note that this series only deals with unit testing. Integration testing your server-side code is another topic, although it should be relatively straightforward to write integration tests against a Cosmos DB server-side script.

Challenges of Testing Cosmos DB Server-Side Code

Cosmos DB ultimately executes JavaScript code, and so we will use JavaScript testing frameworks to write and run our unit tests. Many of the popular JavaScript and TypeScript testing frameworks and helpers are designed specifically for developers who write browser-based JavaScript or Node.js applications. Cosmos DB has some properties that can make these frameworks difficult to work with.

Specifically, Cosmos DB doesn’t support modules. Modules in JavaScript allow for individual JavaScript files to expose a public interface to other blocks of code in different files. When I was preparing for this blog post I spent a lot of time trying to figure out a way to handle the myriad testing and mocking frameworks that assume modules are able to be used in our code. Ultimately I came to the conclusion that it doesn’t really matter if we use modules inside our TypeScript files as long as the module code doesn’t make it into our release JavaScript files. This means that we’ll have to build our code twice – once for testing (which includes the module information we need), and again for release (which doesn’t include modules). This isn’t uncommon – many development environments have separate ‘Debug’ and ‘Release’ build configurations, for example – and we can use some tricks to achieve our goals while still getting the benefit of a good design-time experience.

Defining Our Tests

We’ll be working with the stored procedure that we built out in part 3 of this series. The same concepts could be applied to unit testing triggers, and also to user-defined functions (UDFs) – and UDFs are generally easier to test as they don’t have any context variables to mock out.

Looking back at the stored procedure, its purpose is to return the list of customers who have ordered any of a specified list of product IDs, grouped by product ID, so an initial set of test cases might be as follows:

  1. If the productIds parameter is empty, the method should return an empty array.
  2. If the productIds parameter contains one item, it should execute a query against the collection containing the item’s identifier as a parameter.
  3. If the productIds parameter contains one item, the method should return a single CustomersGroupedByProduct object in the output array, which should contain the productId that was passed in, and whatever customerIds the mocked collection query returned.
  4. If the method is called with a valid productIds array, and the queryDocuments method on the collection returns false, an error should be returned by the function.

You might have others you want to focus on, and you may want to split some of these out – but for now we’ll work with these so we can see how things work. Also, in this post I’ll assume that you’ve got a copy of the stored procedure from part 3 ready to go – if you haven’t, you can download it from the GitHub repository for that part.

If you want to see the finished version of the whole project, including the tests, you can access it on GitHub here.

Setting up TypeScript Configurations

The first change we’ll need to make is to change our TypeScript configuration around a bit. Currently we only have one tsconfig.json file that we use to build. Now we’ll need to add a second file. The two files will be used for different situations:

  • tsconfig.json will be the one we use for local builds, and for running unit tests.
  • tsconfig.build.json will be the one we use for creating release builds.

First, open up the tsconfig.json file that we already have in the repository. We need to change it to the following:
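Here’s roughly what the updated file looks like. The compiler options other than module, outDir, and the include list should be carried over from the earlier parts of the series, so keep whatever you already have for those:

```json
{
  "compilerOptions": {
    "target": "es5",
    "module": "commonjs",
    "outDir": "output/test"
  },
  "include": [
    "src/**/*.ts",
    "spec/**/*.ts"
  ]
}
```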

The key changes we’re making are:

  • We’re now including files from the spec folder in our build. This folder will contain the tests that we’ll be writing shortly.
  • We’ve added the line "module": "commonjs". This tells TypeScript that we want to compile our code with module support. Again, this tsconfig.json will only be used when we run our builds locally or for running tests, so we’ll later make sure that the module-related code doesn’t make its way into our release builds.
  • We’ve changed from using outFile to outDir, and set the output directory to output/test. When we use modules like we’re doing here, we can’t use the outFile setting to combine our files together, but this won’t matter for our local builds and for testing. We also put the output files into a test subfolder of the output folder so that we keep things organised.

Now we need to create a tsconfig.build.json file with the following contents:
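Here’s a sketch of that file, following the same conventions – the output file name matches the compiled script we’ll see later in this post:

```json
{
  "compilerOptions": {
    "target": "es5",
    "module": "none",
    "outFile": "output/build/sp-getGroupedOrders.js"
  },
  "include": [
    "src/**/*.ready.ts"
  ]
}
```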

This looks more like the original tsconfig.json file we had, but there are a few minor differences:

  • The include element now looks for files matching the pattern *.ready.ts. We’ll look at what this means later.
  • The module setting is explicitly set to none. As we’ll see later, this isn’t sufficient to get everything we need, but it’s good to be explicit here for clarity.
  • The outFile setting – which we can use here because module is set to none – is going to emit a JavaScript file within the build subfolder of the output folder.

Now let’s add the testing framework.

Adding a Testing Framework

In this post we’ll use Jasmine, a testing framework for JavaScript. We can import it using NPM. Open up the package.json file and replace it with this:
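It will end up looking something like this. The version numbers are only placeholders for what was current at the time of writing, so use the latest versions available, and carry over any other entries you already have:

```json
{
  "private": true,
  "scripts": {
    "build": "tsc -p tsconfig.build.json",
    "test": "tsc && jasmine --config=jasmine.json"
  },
  "devDependencies": {
    "@types/jasmine": "^2.8.6",
    "jasmine": "^3.1.0",
    "moq.ts": "^2.7.0",
    "typescript": "^2.8.1"
  }
}
```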

There are a few changes to our previous version:

  • We’ve now imported the jasmine module, as well as the Jasmine type definitions, into our project; and we’ve imported moq.ts, a mocking library, which we’ll discuss below.
  • We’ve also added a new test script, which will run a build and then execute Jasmine, passing in a configuration file that we will create shortly.

Run npm install from a command line/terminal to restore the packages, and then create a new file named jasmine.json with the following contents:

We’ll understand a little more about this file as we go on, but for now, we just need to understand that this file defines the Jasmine specification files that we’ll be testing against. Now let’s add our Jasmine test specification so we can see this in action.

Starting Our Test Specification

Let’s start by writing a simple test. Create a folder named spec, and within it, create a file named getGroupedOrdersImpl.spec.ts. Add the following code to it:
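Here’s a sketch of what this first test can look like. I’m assuming a signature along the lines of getGroupedOrdersImpl(productIds, collection), returning the grouped results array, so adjust the import path and the call to match the actual function from part 3:

```typescript
import { getGroupedOrdersImpl } from '../src/getGroupedOrders';

describe('getGroupedOrdersImpl', () => {
    it('should return an empty array', () => {
        // With no product IDs there is nothing to query, so we don't need a
        // real collection - passing null is enough for this case.
        const result = getGroupedOrdersImpl([], null);
        expect(result).toEqual([]);
    });
});
```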

This code does the following:

  • It sets up a new Jasmine spec named getGroupedOrdersImpl. We’ve named the spec after the method we’re testing for clarity, but the names don’t need to match – you could name the spec whatever you want.
  • Within that spec, we have a test case named should return an empty array.
  • That test executes the getGroupedOrdersImpl function, passing in an empty array, and a null object to represent the Collection.
  • Then the test confirms that the result of that function call is an empty array.

This is a fairly simple test – we’ll see a slightly more complex one in a moment. For now, though, let’s get this running.

There’s one step we need to do before we can execute our test. If we tried to run it now, Jasmine would complain that it can’t find the getGroupedOrdersImpl method. This is because of the way that JavaScript modules work. Our code needs to export its externally accessible methods so that the Jasmine test can see it. Normally, exporting a module from a Cosmos DB JavaScript file will mean that Cosmos DB doesn’t accept the file anymore – we’ll see a solution to that shortly.

Open up the src/getGroupedOrders.ts file, and add the following at the very bottom of the file:
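This is just a standard export statement – the same line we’ll come back to later in this post:

```typescript
export { getGroupedOrdersImpl };
```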

The export statement sets up the necessary TypeScript compilation instruction to allow our Jasmine test spec to reach this method.

Now let’s run our test. Execute npm run test, which will compile our stored procedure (including the export), compile the test file, and then execute Jasmine. You should see that Jasmine executes the test and shows 1 spec, 0 failures, indicating that our test successfully ran and passed. Now let’s add some more sophisticated tests.

Adding Tests with Mocks and Spies

When we’re testing code that interacts with external services, we often will want to use mock objects to represent those external dependencies. Most mocking frameworks allow us to specify the behaviour of those mocks, so we can simulate various conditions and types of responses from the external system. Additionally, we can use spies to observe how our code calls the external system.

Jasmine provides a built-in mocking framework, including spy support. However, the Jasmine mocks don’t support TypeScript types, and so we lose the benefit of type safety. In my opinion this is an important downside, and so instead we will use the moq.ts mocking framework. You’ll see we have already installed it in the package.json.

Since we’ve already got it available to us, we just need to add this line to the top of our spec/getGroupedOrdersImpl.spec.ts file:
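The import looks something like this (Times is only needed once we start verifying calls, as we do below):

```typescript
import { Mock, It, Times } from 'moq.ts';
```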

This tells TypeScript to import the relevant mocking types from the moq.ts module. Now we can use the mocks in our tests.

Let’s set up another test, in the same file, as follows:

This test does a little more than the last one:

  • It sets up a mock of the ICollection interface.
  • This mock will send back a hard-coded string (self-link) when the getSelfLink() method is called.
  • It also provides mock behaviour for the queryDocuments method. When the method is called, it invokes the callback function, passing back a list of documents with a single empty string, and then returns true to indicate that the query was accepted.
  • The mock.object() method is used to convert the mock into an instance that can be provided to the getGroupedOrdersImpl function, which then uses it in place of the real Cosmos DB collection. This means we can test out how our code will behave, and we can emulate the behaviour of Cosmos DB as we wish.
  • Finally, we call mock.verify to ensure that the getGroupedOrdersImpl function executed the queryDocuments method on the mock collection exactly once.

You can run npm run test again now, and verify that it shows 2 specs, 0 failures, indicating that our new test has successfully passed.

Now let’s fill out the rest of the spec file – here’s the complete file with all of our test cases included:

You can execute the tests again by calling npm run test. Try tweaking the tests so that they fail, then re-run them and see what happens.

Building and Running

All of the work we’ve just done means that we can run our tests. However, if we try to build our code to submit to Cosmos DB, it won’t work anymore. This is because the export statement we added to make our tests work will emit code that Cosmos DB’s JavaScript engine doesn’t understand.

We can remove this code at build time by using a preprocessor. This will remove the export statement – or anything else we want to take out – from the TypeScript file. The resulting cleaned file is the one that then gets sent to the TypeScript compiler, and it emits a Cosmos DB-friendly JavaScript file.

To achieve this, we need to chain together a few pieces. First, let’s open up the src/getGroupedOrders.ts file. Replace the line that says export { getGroupedOrdersImpl } with this section:

The extra lines we’ve added are preprocessor directives. TypeScript itself doesn’t understand these directives, so we need to use an NPM package to do this. The one I’ve used here is jspreproc. It will look through the file, handle the directives it finds in specially formatted comments, and then emit the resulting cleaned file. Unfortunately, the preprocessor only works on a single file at a time. This is OK for our situation, as we have all of our stored procedure code in one file, but we might not do that in every situation. Therefore, I have also used the foreach-cli NPM package to search for all of the *.ts files within our src folder and process them. It saves the cleaned files with a .ready.ts extension, which our tsconfig.build.json file refers to.

Open the package.json file and replace it with the following contents:

Now we can run npm install to install all of the packages we’re using. You can then run npm run test to run the Jasmine tests, and npm run build to build the releasable JavaScript file. This is emitted into the output/build/sp-getGroupedOrders.js file, and if you inspect that file, you’ll see it doesn’t have any trace of module exports. It looks just like it did back in part 3, which means we can send it to Cosmos DB without any trouble.

Summary

In this post, we’ve built out the necessary infrastructure to test our Cosmos DB server-side code. We’ve used Jasmine to run our tests, and moq.ts to mock out the Cosmos DB server objects in a type-safe manner. We also adjusted our build script so that we can compile a clean copy of our stored procedure (or trigger, or UDF) while keeping the necessary export statements to enable our tests to work. In the final post of this series, we’ll look at how we can automate the build and deployment of our server-side code using VSTS, and integrate it into a continuous integration and continuous deployment pipeline.

Key Takeaways

  • It’s important to test Cosmos DB server-side code. Stored procedures, triggers, and UDFs contain business logic and should be treated as a fully fledged part of our application code, with the same quality criteria we would apply to other types of source code.
  • Because Cosmos DB server-side code is written in JavaScript, it is testable using JavaScript and TypeScript testing frameworks and libraries. However, the lack of support for modules means that we have to be careful in how we use these since they may emit release code that Cosmos DB won’t accept.
  • We can use Jasmine for testing. Jasmine also has a mocking framework, but it is not strongly typed.
  • We can get strong typing using a TypeScript mocking library like moq.ts.
  • By structuring our code correctly – using a single entry-point function, which calls out to getContext() and then sends the necessary objects into a function that implements our actual logic – we can easily mock and spy on our calls to the Cosmos DB server-side libraries.
  • We need to export the functions we are testing using the export statement. This makes them available to the Jasmine test spec.
  • However, these export statements need to be removed before we can compile our release version. We can use a preprocessor to remove those statements.
  • You can view the code for this post on GitHub.

Cosmos DB Server-Side Programming with TypeScript – Part 4: Triggers

Triggers are the third type of server-side code in Cosmos DB. Triggers allow for logic to be run while an operation is running on a document. When a document is to be created, modified, or deleted, our custom logic can be executed – either before or after the operation takes place – allowing us to validate documents, transform documents, and even create secondary documents or perform other operations on the collection. As with stored procedures, this all takes place within the scope of an implicit transaction. In this post, we’ll discuss the two types of triggers (pre- and post-triggers), and how we can ensure they are executed when we want them to be. We’ll also look at how we can validate, modify, and cause secondary effects from triggers.

This post is part of a series of posts about server-side programming for Cosmos DB:

  • Part 1 gives an overview of the server side programmability model, the reasons why you might want to consider server-side code in Cosmos DB, and some key things to watch out for.
  • Part 2 deals with user-defined functions, the simplest type of server-side programming, which allow for adding simple computation to queries.
  • Part 3 talks about stored procedures. These provide a lot of powerful features for creating, modifying, deleting, and querying across documents – including in a transactional way.
  • Part 4 (this post) introduces triggers. Triggers come in two types – pre-triggers and post-triggers – and allow for behaviour like validating and modifying documents as they are inserted or updated, and creating secondary effects as a result of changes to documents in a collection.
  • Part 5 discusses unit testing your server-side scripts. Unit testing is a key part of building a production-grade application, and even though some of your code runs inside Cosmos DB, your business logic can still be tested.
  • Finally, part 6 explains how server-side scripts can be built and deployed into a Cosmos DB collection within an automated build and release pipeline, using Microsoft Visual Studio Team Services (VSTS).

Triggers

Triggers let us run custom logic within the scope of an operation already in progress against our collection. This is different to stored procedures, which are explicitly invoked when we want to use them. Triggers give us a lot of power, as they allow us to intercept documents before they are created, modified, or deleted. We can then write our own business logic to perform some custom validation of the operation or document, modify it in some form, or perform another action on the collection instead of, or as well as, the originally requested action.

Similarly to stored procedures, triggers run within an implicit transactional scope – but unlike stored procedures, the transaction isn’t just isolated to the actions we perform within our code. It includes the operation that caused the trigger. This means that if we throw an unhandled error inside a trigger, the whole operation – including the original caller’s request to insert or modify a document – will be rolled back.

Trigger Behaviour

Triggers have two ways they can be configured: which operations should trigger them and when they should run. We set both of these options at the time when we deploy or update the trigger, using the Trigger Operation and Trigger Type settings respectively. These two settings dictate exactly when a trigger should run.

The trigger operation specifies the types of document operations that should cause the trigger to fire. This can be set to Create (i.e. fire the trigger only when documents are inserted), Delete (i.e. fire the trigger only when documents are deleted), Replace (i.e. fire the trigger only when documents are replaced), or All (fire when any of these operations occur).

The trigger type specifies when the trigger should run with respect to the operation. Pre-triggers run before the operation is performed against the collection, and post-triggers run after the operation is performed. Regardless of when we choose for the trigger to run, the trigger still takes place within the transaction, and the whole transaction can be cancelled by throwing any unhandled error within either type of trigger.

Note that Cosmos DB also runs a validity check before it starts executing any pre-triggers, too. This validity check ensures that the operation is able to be performed – for example, a deletion operation can’t be performed on a document that doesn’t exist, and an insertion operation can’t be performed if a document with the specified ID already exists. For this reason, we won’t see a pre-trigger fire for invalid requests. The following figure gives an overview of how this flow works, and where triggers can execute our custom code:

Trigger Flow

If we want to filter to only handle certain sub-types of documents, for example based on their properties, then this can be done within the trigger logic. We can inspect the contents of the document and apply logic selectively.

Working with the Context

Just like in stored procedures, when we write triggers we have access to the context object through the getContext() method. In triggers, we’ll typically make use of the Collection object (getContext().getCollection()), and depending on the type of trigger we’ll also use the Request object (getContext().getRequest()) and potentially the Response object (getContext().getResponse()).

Request

The Request object gives us the ability to read and modify the original request that caused this trigger to fire. We might choose to inspect the document that the user tried to manipulate, perhaps looking for required fields or imposing our own validation logic. Or we can also modify the document if we need to. We’ll discuss how we do this below.

If we’ve set up a trigger to fire on all operations, we might sometimes need to find out which operation type was used for the request. The request.getOperationType() function can be used for this purpose. Please note that the getOperationType() function returns a string, and somewhat confusingly, the possible values it can return are slightly different to the operation types we discussed above. The function can return one of these strings: Create, Replace, Upsert, or Delete.
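As a sketch of how this fits together, here’s a simple pre-trigger that inspects the operation type and the incoming document. The ‘status’ property is just a hypothetical field used for illustration, and the Cosmos DB server-side type declarations we’ve used throughout the series are assumed to be in place:

```typescript
function validateStatusPreTrigger() {
    const request = getContext().getRequest();

    // getOperationType() returns 'Create', 'Replace', 'Upsert', or 'Delete'.
    if (request.getOperationType() === 'Delete') {
        return; // nothing to validate for deletions in this example
    }

    const doc = request.getBody();
    if (!doc.status) {
        // Throwing an unhandled error rolls back the whole transaction,
        // including the original insert or replace.
        throw new Error('Documents must include a status property.');
    }

    // We can also modify the incoming document before Cosmos DB writes it.
    doc.statusUpdatedAt = new Date().toISOString();
    request.setBody(doc);
}
```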

Collection

The Collection object can be used in much the same way as in a stored procedure, so refer to part 3 of this series for more details on how this works. In summary, we can retrieve documents by their ID (readDocument()), perform queries on the documents in the collection (queryDocuments()), and pass in parameterised queries (using another overload of the queryDocuments() function). We can also insert documents (createDocument()), modify existing documents (replaceDocument()), upsert documents (upsertDocument()), and delete documents (deleteDocument()). We can even change documents that aren’t directly being modified by the operation that caused this trigger.

Response

The Response object is only available in post-triggers. If we try to use it from within a pre-trigger, an exception will be thrown by the Cosmos DB runtime. Within post-triggers, the Response represents the document after the operation was completed by Cosmos DB. This can be a little unclear, so here’s an example. Let’s imagine we’re inserting a simple document, and not including our own ID – we want Cosmos DB to add it for us. Here’s the document we’re going to insert from our client:
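For example, suppose the client sends a simple document like this (the properties are entirely hypothetical):

```json
{
  "type": "order",
  "customerId": "123"
}
```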

If we intercepted this document within a pre-trigger, by using the Request object, we’d be able to inspect and modify the document as it appears above. Once the pre-trigger has finished its work, Cosmos DB will actually perform the insert, and at that time it appends a number of properties to the document automatically for its own tracking purposes. These properties are then available to us within the Response object in a post-trigger. So this means that, within a post-trigger, we can inspect the Request – which will show the original document as it appears above – and we can also inspect the Response, where the document will look something like this:

We can even use the response.setBody() function to make a change to this response object if we want to, and the change will be propagated to the document in the collection before control returns back to the client.
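To make this concrete, here’s a minimal post-trigger sketch (the trigger name and the extra property are hypothetical) that reads both the original request body and the stored document, then uses setBody() to persist an extra property:

declare function getContext(): any; // provided by the DocumentDB server-side type definitions in a real project

function stampInsertedTime() {
    const context = getContext();
    const request = context.getRequest();
    const response = context.getResponse();

    const originalDocument = request.getBody(); // the document exactly as the client sent it
    const storedDocument = response.getBody();  // the document after the operation, including id, _ts, _etag, and so on

    // Add a property of our own and write it back; the change is persisted
    // before control returns to the client.
    storedDocument.insertedTime = new Date().toISOString();
    response.setBody(storedDocument);
}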

You might be wondering how a delete operation behaves. If we examine the body through either the Request or the Response during a deletion, we’ll see that it contains the contents of the document being deleted. This means we can easily perform validation logic based on those contents, such as cancelling the deletion if the document shouldn’t be removed. Note that request.setBody() cannot be used within a deletion; interestingly, response.setBody() does appear to work, but doesn’t actually do anything (at least that I can find!).

The combination of the Request, Response, and Collection objects gives us a lot of power to run custom logic at all points along the process. Now that we know what we can do with triggers, let’s talk about how we cause them to be fired.

Firing a Trigger

There is a non-obvious aspect to using triggers within Cosmos DB: at the time we perform a collection operation, we must explicitly request that the trigger be fired. If you’re used to triggers in other databases like SQL Server, this might seem like a rather strange limitation. But – as with so much about Cosmos DB – it is because Cosmos DB prioritises performance above many other considerations. Including triggers within an operation will cause that operation to use more request units, so given the focus on performance, it makes some sense that if a trigger isn’t needed for a transaction then it shouldn’t be executed. It does make our lives as developers a little more difficult though, because we have to remember to request that triggers be included.

Another important caveat to be aware of is that at most one pre-trigger and one post-trigger can be executed for any given operation. Confusingly, the Cosmos DB .NET client library actually makes it seem like we can execute multiple triggers of a given type because it accepts a list of triggers to fire, but in reality we can only specify one. I assume that this restriction may be lifted in future, but for now if we want to have multiple pieces of logic executing as triggers, we have to combine them into one trigger.

It’s also important to note that triggers can’t be called from within server-side code. This means that we can’t nest triggers – for example, we can’t have an operation call trigger A, which in turn would insert a document and expect trigger B to be called. Similarly, if we have a stored procedure inserting a document, it can’t expect a trigger to be called either.

And there’s one further limitation – the Azure Portal doesn’t provide a way to fire triggers within Document Explorer. This makes it a little more challenging to test triggers when you’re working on them. I typically do one of three things:

  • I write a basic C# console application or similar, and have it use the Cosmos DB client library to perform the operation and fire the trigger; or
  • I use the REST API to execute a document operation directly – this is more involved, as we need to authenticate and structure the request manually; or
  • I use a tool named DocumentDB Studio, which is a sample application that uses the Cosmos DB client library. It provides the ability to specify triggers to be fired.

Later in this post we’ll look at DocumentDB Studio. For now, let’s quickly see how to ask the Cosmos DB .NET client library to fire a trigger during an operation. In a C# application (with a reference to the Cosmos DB client library already in place – i.e. one of the .NET Framework or .NET Standard/.NET Core NuGet packages) we can write something like this:
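The original snippet was C# and isn’t shown here; as a rough equivalent in TypeScript – the language used throughout this series – here’s a sketch using the Node.js documentdb package, where the key detail is that the trigger names are passed through the preTriggerInclude and postTriggerInclude request options (the endpoint, key, and document are placeholders):

import { DocumentClient } from "documentdb";

const client = new DocumentClient("https://myaccount.documents.azure.com:443/", {
    masterKey: "<account key>"
});

const collectionLink = "dbs/Orders/colls/Orders";
const order = { id: "ORD1", type: "order", customer: { id: "123" }, items: [] };

client.createDocument(
    collectionLink,
    order,
    { preTriggerInclude: "validateOrder", postTriggerInclude: "addRefund" },
    (error, created) => {
        if (error) {
            console.error("Create failed:", error);
        } else {
            console.log("Created document:", created.id);
        }
    });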

Now that we’ve seen when and how triggers can be fired, let’s go through a few common use cases for them.

Validation

A common application for triggers is to validate documents when they are being modified or inserted. For example, we might want to validate that a particular field is present, or that the value of a field conforms to our expectations. We can write sophisticated validation logic using JavaScript or TypeScript, and then Cosmos DB can execute this when it needs to. If we find a validation failure, we can throw an error and this will cause the transaction to be rolled back by Cosmos DB. We’ll work through an example of this scenario later in this post.

Transformation

We might also need to make a change to a document as it’s being inserted or updated. For example, we might want to add some custom metadata to each document, or to transform documents from an old schema into a new schema. The setBody() function on the Request object can be used for this purpose. For example, we can write code like the following:
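For instance, a pre-trigger along these lines (the trigger name and the property it adds are purely illustrative) could stamp extra metadata onto every incoming document:

declare function getContext(): any; // provided by the DocumentDB server-side type definitions in a real project

function addMetadata() {
    const request = getContext().getRequest();
    const documentToSave = request.getBody();

    // Add or transform properties as required - this one is purely illustrative.
    documentToSave.insertedByTrigger = true;

    // Write the modified document back so that this is what Cosmos DB persists.
    request.setBody(documentToSave);
}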

Secondary Effects

Sometimes we might need to create or modify a different document in the collection as a result of an operation. For example, we might have to create an audit log entry for each document that is deleted, update a metadata document to include some new information as the result of an insert, or automatically write a second document when we see a certain type of document or operation occurring. We’ll see an example of this later in this post.

Using Triggers for Aggregation

If we need to perform aggregation of data within a collection, it may be tempting to consider using a trigger to calculate running aggregates as data comes into, or is changed within, the collection. However, as noted in part 1 of this series, this is not a good idea in many cases. Although triggers are transactional, transactions are not serialised, and therefore race conditions can emerge. For example, if we have a document being inserted by transaction A, and a post-trigger is calculating an aggregate across the collection as a result of that insertion, it won’t include the data within transaction B, which is occurring simultaneously. These scenarios are extremely difficult to test and may result in unexpected behaviour and inconsistent results. In some situations using optimistic concurrency (with the _etag property of the running aggregate document) may be able to resolve this, but it is very difficult to get this right.

Instead of calculating a running aggregate within a trigger, it is usually better to calculate aggregates at query time. Cosmos DB recently added support for aggregations within its SQL API, so queries using functions such as SUM() and COUNT() will work across your data. For applications that also need to group data – calculating aggregates per group – an upcoming addition to Cosmos DB will allow GROUP BY to be used within SQL queries. In the meantime, a stored procedure – similar to the one from part 3 of this series – can be used to emulate this behaviour.

Now that we’ve talked through how triggers work and what they can do, let’s try building some.

Defining our Triggers

In this post, we’ll walk through creating two triggers – one pre-trigger and one post-trigger.

Our pre-trigger will let us do some validation and manipulation of incoming documents. In part 2 of this series, we built a UDF to allow for a change in the way that customer IDs are specified on order documents. Now we want to take the next step and ensure that any new order documents use the new format right from the start. This means that we want to allow this document to be inserted:

But this one should not be able to be inserted as-is:

So there are four code paths that we need to consider in our pre-trigger:

  1. The document is not an order – in this case, the trigger shouldn’t do anything and should let the document be inserted successfully.
  2. The document is an order and uses the correct format for the customer ID – in this case, the trigger should let the document be inserted successfully without any modification.
  3. The document is an order and uses the old format for the customer ID – in this case, we want to modify the document so that it uses the correct format for the customer ID.
  4. The document is an order and doesn’t include the customer ID at all – this is an error, and the transaction should be rolled back.

Our post-trigger will be for a different type of business rule. When an order comes through containing an item with a negative quantity, we should consider this to be a refund. In our application, refunds need a second type of document to be created automatically. For example, when we see this order:

Then we should save the order as it was provided, and also create a document like this:

These two example triggers will let us explore many of the features of triggers that we discussed above.

Preparing Folders

Let’s set up some folder structures and our configuration files. If you want to see the finished versions of these triggers, you can access them on GitHub – here is the pre-trigger and here is the post-trigger.

In a slight departure from what we did in parts 2 and 3 of this series, we’ll be building two triggers in this part. First, create a folder named pre for the pre-trigger. Within that, create a package.json and use the following:

In the same folder, add a tsconfig.json file with the following contents:

And create a src folder with an empty file named validateOrder.ts inside it. We’ll fill this in later.

Next, we’ll create more or less the same folder structure for our post-trigger, inside a folder named post. The two differences will be: (1) that the trigger code file will be named addRefund.ts; and (2) the tsconfig.json should have the outFile line replaced with the correct filename, like this:


Our folder structure should now look like this:

  • /
    • /pre
      • package.json
      • tsconfig.json
      • /src
        • validateOrder.ts
    • /post
      • package.json
      • tsconfig.json
      • /src
        • addRefund.ts

Writing the Pre-Trigger

Now let’s fill in our pre-trigger. Open up the empty validateOrder.ts file and add this code:
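The finished trigger is in the GitHub repository linked above; a sketch along the lines described below looks something like this (getContext() is declared inline only to keep the sketch self-contained):

declare function getContext(): any; // provided by the DocumentDB server-side type definitions in a real project

function validateOrder() {
    const request = getContext().getRequest();

    const modifiedDocument = validateOrderImpl(
        request.getBody() as OrderDocument,
        request.getOperationType());

    if (modifiedDocument) {
        request.setBody(modifiedDocument);
    }
}

// Returns a transformed document if the incoming document needs to be changed,
// or undefined if it can be saved as-is. Throws if the document is invalid,
// which rolls back the whole transaction.
function validateOrderImpl(document: OrderDocument, operationType: string): OrderDocument | undefined {
    // Deletions don't need to be validated - the document is going away anyway.
    if (operationType === "Delete") {
        return undefined;
    }

    // Code path 1: not an order - nothing to do.
    if (document.type !== DocumentTypes.Order) {
        return undefined;
    }

    // Code path 2: already using the new customer.id format - nothing to do.
    if (document.customer && document.customer.id) {
        return undefined;
    }

    // Code path 3: old customerId format - transform it into the new format.
    if (document.customerId) {
        document.customer = { id: document.customerId };
        delete document.customerId;
        return document;
    }

    // Code path 4: no customer ID at all - reject the operation.
    throw new Error("Customer ID is missing. Customer ID must be provided within the customer.id property.");
}

const enum DocumentTypes {
    Order = "order"
}

interface BaseDocument {
    id?: string;
    type: string;
}

interface OrderDocument extends BaseDocument {
    customerId?: string;
    customer?: { id: string };
}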

There’s quite a lot here, so let’s examine it – this time from the bottom up:

  • We declare a TypeScript interface named OrderDocument, which describes the shape of our order documents. We also include a base interface named BaseDocument to keep things consistent with our post-trigger, described in the next section. As in part 2 of this series, we include both possible ways of expressing the customer ID so that we can reference these two properties within our script.
  • We declare a TypeScript const enum named DocumentTypes. This is where we’ll define our possible order types. For now we only have one. TypeScript will substitute any reference to the values of these enums for their actual values – so in this case, any reference to DocumentTypes.Order will be replaced with the value of the string "order".
  • Our validateOrderImpl function contains our core implementation. Before we do anything else, we check the operation type being performed; if it’s a Delete then we don’t worry about doing the validation, but for Create and Replace operations we do. We then pull out the document to validate and perform the actual validation. If the customerId property is set, we transform the document so that it conforms to our expectation. If the customer ID is not provided at all then we throw an error.
  • Finally, our validateOrder method handles the interaction with the getContext() function and calls into the validateOrderImpl() function. As is the case in earlier parts of this series, we do this so that we can make this code more testable when we get to part 5 of this series.

That’s all there is to our pre-trigger. Now let’s look at our post-trigger.

Writing the Post-Trigger

Open up the empty addRefund.ts file and add this in:
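Again, the finished version is on GitHub; a sketch matching the description below might look like this (the item property names, productId and quantity, are assumptions for illustration):

declare function getContext(): any; // provided by the DocumentDB server-side type definitions in a real project

function addRefund() {
    const context = getContext();
    const request = context.getRequest();
    const collection = context.getCollection();

    addRefundImpl(
        request.getBody() as OrderDocument,
        request.getOperationType(),
        collection);
}

function addRefundImpl(document: OrderDocument, operationType: string, collection: any): void {
    // Only orders that are being created or replaced can produce refunds.
    if (operationType === "Delete" || document.type !== DocumentTypes.Order) {
        return;
    }

    // Any item with a negative quantity is treated as a refund.
    const refundedProductIds = document.items
        .filter(item => item.quantity < 0)
        .map(item => item.productId);

    if (refundedProductIds.length === 0) {
        return;
    }

    const refund: RefundDocument = {
        id: `refund-${document.id}`,
        type: DocumentTypes.Refund,
        orderId: document.id,
        refundedProductIds: refundedProductIds
    };

    // Upsert the refund document into the same collection. If this isn't accepted,
    // throwing an error rolls back the whole transaction, including the original order.
    const accepted = collection.upsertDocument(
        collection.getSelfLink(),
        refund,
        (error: any) => { if (error) { throw error; } });

    if (!accepted) {
        throw new Error("The refund document could not be created.");
    }
}

const enum DocumentTypes {
    Order = "order",
    Refund = "refund"
}

interface BaseDocument {
    id: string;
    type: string;
}

interface OrderDocument extends BaseDocument {
    items: { productId: string; quantity: number }[];
}

interface RefundDocument extends BaseDocument {
    orderId: string;
    refundedProductIds: string[];
}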

Let’s explain this, again working from the bottom up:

  • Firstly, we declare interfaces for our OrderDocument and our RefundDocument, both of which will inherit (extend) the BaseDocument interface.
  • We also declare our const enum for DocumentTypes, and in this case we need to track both orders and refunds, so both of these are members of this enum.
  • Then we define our addRefundImpl method. It implements the core business rules we discussed – it checks the type of document, and if it’s an order, it checks to see whether there are any items with a negative quantity. If there are, it creates a refund document and upserts that into the same collection. Like the pre-trigger, this trigger also only runs its logic when the operation type is Create or Replace, but not Delete.
  • Finally, the addRefund method is our entry point method, which mostly just wraps the addRefundImpl method and makes it easier for us to test this code out later.

Compiling the Triggers

Because we have two separate triggers, we need to compile them separately. First, open a command prompt or terminal to the pre folder, and execute npm run build. Then switch to the post folder and run npm run build again. You should end up with two output files.

In the pre/output/trig-validateOrder.js file, you should have a JavaScript file that looks like the following:

And in the post/output/trig-addRefund.js file, you should have the following JavaScript:

Now that we’ve got compiled versions of our triggers, let’s deploy them.

Deploying the Triggers

As in parts 2 and 3 of this series, we’ll deploy using the Azure Portal for now. In part 6 of this series we’ll look at some better ways to deploy our code to Cosmos DB.

Open the Azure Portal to your Cosmos DB account, select Script Explorer, and then click the Create Trigger button. Now let’s add the contents of the trig-validateOrder.js file into the big text box, enter the name validateOrder, and make sure that the Trigger Type drop-down is set to Pre and that the Trigger Operation drop-down is set to All, like this:

cosmos-trigger-1

Click Save, close the blade, and click Create Trigger to create our second trigger. This one will be named addRefund; ensure that Trigger Type is set to Post, and that Trigger Operation is still set to All, like this:

cosmos-trigger-2

Click Save. Our triggers are now deployed! Let’s test them.

Testing the Triggers

Testing triggers can be a little tricky, because the Azure Portal doesn’t have a way to fire triggers when we work with documents through Document Explorer. This means we have to use another approach to try out our triggers.

DocumentDB Studio is an open-source tool for working with Cosmos DB data. There are several such tools available now, including Azure Storage Explorer and Visual Studio, but other than DocumentDB Studio, I haven’t found any that allow for firing triggers during document operations. Unfortunately DocumentDB Studio is only available for Windows, so if you’re on a Mac or Linux, you may have to look at another approach to fire the triggers, like writing a test console application using the Cosmos DB SDK.

Install the latest release of DocumentDB Studio from the GitHub releases page, and go to File and then Add Account to log in using your Cosmos DB account endpoint and key. This will authenticate you to Cosmos DB and should populate the databases and collections list in the left pane.

Expand out the account, and go to the Orders database and Orders collection. Right-click the collection name and click Create Document:

cosmos-trig-1

Paste the contents of the following file into the document textbox:

We can instruct DocumentDB Studio to fire triggers by clicking on the RequestOptions tab and deselecting the Use default checkbox. Then in the PreTrigger field, enter validateOrder and in the PostTrigger field, enter addRefund:

cosmos-trig-2

Now that we’ve prepared our request, click the Execute button on the toolbar, and we should see an error returned by Cosmos DB:

cosmos-trig-3

We can see the error we’re getting is Customer ID is missing. Customer ID must be provided within the customer.id property. This is exactly what we wanted – the trigger has noticed that neither the customerId field nor the customer.id field is provided, so it throws an error and rolls back the transaction.

Now let’s try a second document. This time, the customer ID is provided in the old format:

This time, you should see that the order is created, but that it’s been transformed by the validateOrder trigger – the customer ID is now within the customer.id field:

cosmos-trig-4

Now let’s try a third document. This one has a negative quantity on one of the items:

This document should be inserted correctly. To see the refund document that was also created, we need to refresh the collection – right-click the Orders collection on the left pane and click Refresh Documents feed:

cosmos-trig-5

You should see that a new document, refund-ORD6, has been created. The refund document lists the refunded product IDs, which in this case is just PROD1:

cosmos-trig-6

Summary

Triggers can be used to run our own custom code within a Cosmos DB document transaction. We can do a range of actions from within triggers such as validating the document’s contents, modifying the document, querying other documents within the collection, and creating or modifying other documents. Triggers run within the context of the operation’s transaction, meaning that we can cause the whole operation to be rolled back by throwing an unhandled error. Although they come with some caveats and limitations, triggers are a very powerful feature and enable a lot of useful custom functionality for our applications. In the next part of this series, we will discuss how to test our server-side code – triggers, stored procedures, and functions – so that we can be confident that it is doing what we expect.

Key Takeaways

  • Triggers come in two types – pre-triggers, which run before the operation, and post-triggers, which run after the operation. Both types can intercept and change the document, and can throw an error to cancel the operation and roll back the transaction.
  • Triggers can be configured to fire on specific operation types, or to run on all operation types.
  • The getOperationType() method on the Request object can be used to identify which operation type is underway, although note that the return values of this method don’t correspond to the list of operation types that we can set for the trigger.
  • Cosmos DB must be explicitly instructed to fire a trigger when we perform an operation. We can do this through the client SDK or REST API when we perform the operation.
  • Only a single trigger of each type (pre- and post-operation) can be called for a given operation.
  • The Collection object can be used to query and modify documents within the collection.
  • The Request object’s getBody() method can be used to examine the original request, and in pre-triggers, setBody() can modify the request before it’s processed by Cosmos DB.
  • The Response object can only be used within post-triggers. It allows both reading the document after it was inserted (including auto-generated properties like _ts) and modifying the document.
  • Triggers cannot be nested – i.e. we cannot have a trigger perform an operation on the collection and request a second trigger be fired from that operation. Similarly a stored procedure can’t fire a trigger.
  • Testing triggers can be a little difficult, since the Azure Portal doesn’t provide a good way to do this currently. We can test using our own custom code, the REST API, or by using DocumentDB Studio.
  • You can view the code for this post on GitHub. The pre-trigger is here, and the post-trigger is here.

Cosmos DB Server-Side Programming with TypeScript – Part 3: Stored Procedures

Stored procedures, the second type of server-side code that can run within Cosmos DB, provide the ability to execute blocks of functionality from inside the database engine. Typically we use stored procedures for discrete tasks that can be encapsulated within a single invocation. In this post, we will discuss some situations where stored procedures can be used and the actions and queries that they can perform. We’ll then start to work through the server-side API model, and look at how we can work with the incoming stored procedure invocation’s request and response as well as the Cosmos DB collection itself. Then we’ll build a simple stored procedure using TypeScript.

This post is part of a series of posts about server-side programming for Cosmos DB:

  • Part 1 gives an overview of the server side programmability model, the reasons why you might want to consider server-side code in Cosmos DB, and some key things to watch out for.
  • Part 2 deals with user-defined functions, the simplest type of server-side programming, which allow for adding simple computation to queries.
  • Part 3 (this post) talks about stored procedures. These provide a lot of powerful features for creating, modifying, deleting, and querying across documents – including in a transactional way.
  • Part 4 introduces triggers. Triggers come in two types – pre-triggers and post-triggers – and allow for behaviour like validating and modifying documents as they are inserted or updated, and creating secondary effects as a result of changes to documents in a collection.
  • Part 5 discusses unit testing your server-side scripts. Unit testing is a key part of building a production-grade application, and even though some of your code runs inside Cosmos DB, your business logic can still be tested.
  • Finally, part 6 explains how server-side scripts can be built and deployed into a Cosmos DB collection within an automated build and release pipeline, using Microsoft Visual Studio Team Services (VSTS).

Stored Procedures

Stored procedures let us encapsulate blocks of functionality, and then later invoke them with a single call. As with the other server-side code types, the code inside the stored procedure runs inside the Cosmos DB database engine. Within a stored procedure we can perform a range of different actions, including querying documents as well as creating, updating, and deleting documents. These actions are performed within the collection that the stored procedure is installed in. Of course, we can augment these collection-based actions with custom logic that we write ourselves in JavaScript or – as is the case in this series – TypeScript.

Stored procedures are extremely versatile, and can be used for a number of different tasks including:

  • Encapsulating a complex set of queries and executing them as one logical operation – we will work with this example below.
  • Retrieving data for a report using a complex set of queries, and combining the results into a single response that can be bound to a report UI.
  • Generating mock data and inserting it into the collection.
  • Doing a batch insert, update, upsert, or delete of multiple documents, taking advantage of the transactional processing of stored procedures.

Of course, if you are building a simple application without the need for complex queries, you may be able to achieve everything you need with just the Cosmos DB client-side SDKs. However, stored procedures do give us some power and flexibility that is not possible with purely client-based development, including transaction semantics.

Transactions and Error Handling

Cosmos DB’s client programming model does not provide transactions that span multiple operations or documents. However, stored procedures do run within an implicit transaction. This means that if you modify multiple documents within a stored procedure, then either all of those changes will be saved or – in the event of an error – none of them will be saved. Transactions provide four guarantees: atomicity, consistency, isolation, and durability, also known as ACID. More information on Cosmos DB transactions is available here.

The Cosmos DB engine handles committing and rolling back transactions automatically. If a stored procedure completes without any errors being thrown then the transaction will be committed. However, if even one unhandled error is thrown, the transaction will be rolled back and no changes will be made.

Working with the Context

Unlike user-defined functions, which we discussed in part 1 of this series, stored procedures allow us to access and make changes to the collection that they run within. We can also return results from stored procedures. Both of these types of actions require that we work with the context object.

Within a stored procedure, we can make a call to the getContext() function. This returns back an object with three functions.

  • getContext().getRequest() is used to access the request details. This is mostly helpful for triggers, and we will discuss this in part 4 of this series.
  • getContext().getResponse() lets us set the response that we should send back. For stored procedures, this is the way that we will return data back to the client if the stored procedure has something to return.
  • getContext().getCollection() gives us access to the Cosmos DB collection that the stored procedure runs within. In turn, this will let us read and write documents.

Each of the calls above corresponds to a type – Context, Request, Response, and Collection, respectively. Each of these types, in turn, provide a set of functions for interacting with the object. For example, getContext().getCollection().queryDocuments() lets us run a query against documents in the collection, and getContext().getResponse().setBody() lets us specify the output that we want the stored procedure to return. We’ll see more of these functions as we go through this series.
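As a trivial example, a stored procedure that only returns a value to its caller needs nothing more than the Response object (the procedure name is hypothetical):

declare function getContext(): any; // provided by the DocumentDB server-side type definitions in a real project

function helloWorld() {
    const response = getContext().getResponse();
    response.setBody({ message: "Hello from inside Cosmos DB!" });
}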

Also note that the double-underscore (__) is automatically mapped to the object returned by getContext().getCollection(). In this series we won’t use this shortcut because I want to be more explicit, especially to help with testing when we get to part 5.

Type Definitions

Human-readable documentation for the types and their members is provided by Microsoft here. Of course, one of the main reasons we’re using TypeScript in this series is so that we get type checking and strong typing against the Cosmos DB object model, so a human-readable description isn’t really sufficient. In TypeScript, this is done through type definitions – descriptions of types and their members that TypeScript can use to power its type system.

While Microsoft doesn’t provide first-party TypeScript type definitions for Cosmos DB, an open-source project named DefinitelyTyped provides and publishes these definitions here. (Note that the type definitions use the old name for Cosmos DB – DocumentDB – but they are still valid and being updated.)

Queries

One of the main things we’ll frequently do from inside stored procedures is execute queries against Cosmos DB. This is how we can retrieve data and perform custom logic on it within the stored procedure. Cosmos DB provides an integrated JavaScript query syntax for executing queries. The syntax is documented here and lets us write queries like this:

which will map to the following SQL query:
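As an illustration of the two forms, here’s an integrated query with roughly the SQL it corresponds to shown as a comment (the filter predicate is just an example):

declare function getContext(): any; // provided by the DocumentDB server-side type definitions in a real project

function getOrdersUsingIntegratedQuery() {
    const collection = getContext().getCollection();

    // Integrated JavaScript query: filter the collection with a predicate function.
    const accepted = collection.filter(
        function (doc: any) { return doc.type === "order"; },
        function (error: any, results: any[]) {
            if (error) { throw error; }
            // work with the matching documents here
        });

    if (!accepted) {
        throw new Error("The query was not accepted by the server.");
    }

    // The filter above corresponds to a SQL query along the lines of:
    //   SELECT * FROM root r WHERE r.type = "order"
}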

However, there are some limitations to this query syntax. We can’t perform aggregations, and we can’t do queries using user-defined functions. These limitations may be lifted in future, but for now, in this series we will use the SQL syntax instead so that we can get the full power of Cosmos DB’s query engine. We can use this type of syntax to make a SQL-based query:
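A sketch of such a query, using the string overload of queryDocuments (the query text is illustrative):

declare function getContext(): any; // provided by the DocumentDB server-side type definitions in a real project

function getAllOrders() {
    const collection = getContext().getCollection();

    const accepted = collection.queryDocuments(
        collection.getSelfLink(),
        "SELECT * FROM c WHERE c.type = 'order'",
        {},
        (error: any, results: any[]) => {
            if (error) { throw error; }
            // work with the matching documents here
        });

    if (!accepted) {
        throw new Error("The query was not accepted by the server.");
    }
}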


In your own stored procedures and triggers you can decide which approach – integrated JavaScript or SQL – makes more sense.

We can also retrieve a single document by its ID, like this:
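For example (the document ID is a placeholder):

declare function getContext(): any; // provided by the DocumentDB server-side type definitions in a real project

function readSingleOrder() {
    const collection = getContext().getCollection();

    // Document links take the form <collection link>/docs/<document id>.
    const documentLink = `${collection.getAltLink()}/docs/ORD1`;

    const accepted = collection.readDocument(documentLink, {}, (error: any, document: any) => {
        if (error) { throw error; }
        // work with the document here
    });

    if (!accepted) {
        throw new Error("The readDocument call was not accepted.");
    }
}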


Another important consideration when executing queries is that Cosmos DB limits the amount of time that a stored procedure can run for. We can test whether our stored procedure is approaching that time limit by inspecting the return value we receive when we submit the query. If we receive a false response, it means the query wasn’t accepted – and that it’s probably time to wrap up our logic before we get forcibly shut off. In some stored procedures, receiving a false like this may mean that you simply throw an error and consider the whole stored procedure execution to have failed.

Parameterised Queries

In the last section, we discussed using the collection.queryDocuments function to execute a SQL query against the collection. While this technically works, once we start including parameters in our queries then we’d need to concatenate them into the query string. This is a very, very bad idea – it opens us up to a class of security vulnerabilities called SQL injection attacks.

When we’re writing SQL queries with parameters, we should instead use the overload of the collection.queryDocuments function that accepts an IParameterizedQuery instead of a string. By doing this, we pass our parameters explicitly and ensure that they are handled and escaped appropriately by the database engine. Here’s an example of how we can do this from our TypeScript code:
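Adapting the previous sketch to pass an IParameterizedQuery (the query text and parameter are illustrative; IParameterizedQuery is redeclared inline only to keep the sketch self-contained):

declare function getContext(): any; // provided by the DocumentDB server-side type definitions in a real project
interface IParameterizedQuery { query: string; parameters: { name: string; value: any }[]; }

function findOrdersForCustomer(customerId: string) {
    const collection = getContext().getCollection();

    const query: IParameterizedQuery = {
        query: "SELECT * FROM c WHERE c.type = 'order' AND c.customer.id = @customerId",
        parameters: [{ name: "@customerId", value: customerId }]
    };

    const accepted = collection.queryDocuments(
        collection.getSelfLink(),
        query,
        {},
        (error: any, results: any[]) => {
            if (error) { throw error; }
            // work with the matching order documents here
        });

    if (!accepted) {
        throw new Error("The query was not accepted by the server.");
    }
}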


Updating Documents

We can also make changes to documents within the collection. There are several functions on the Collection type to help with this, including:

  • createDocument inserts a new document into the collection.
  • replaceDocument updates an existing document in the collection. You must provide the document link to use this function.
  • deleteDocument deletes a document from the collection.
  • upsertDocument inserts a document if it doesn’t already exist, or replaces the existing document if it does.

There are also functions to deal with attachments to documents, but we won’t work with those in this series.

These functions that work with existing documents also take an optional parameter to specify the etag of the document. This allows us to take advantage of optimistic concurrency. Optimistic concurrency is very useful, but it is outside the scope of this series.
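For example, a sketch of replacing an existing document – the status property is illustrative, and the document passed in is assumed to have been read from the collection already so that its _self link is populated:

declare function getContext(): any; // provided by the DocumentDB server-side type definitions in a real project

function closeOrder(existingOrder: any) {
    const collection = getContext().getCollection();

    existingOrder.status = "closed"; // an illustrative change

    const accepted = collection.replaceDocument(
        existingOrder._self,         // the document link of the document being replaced
        existingOrder,
        (error: any) => { if (error) { throw error; } });

    if (!accepted) {
        throw new Error("The replaceDocument call was not accepted.");
    }
}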

Structuring Your Stored Procedure Code

Stored procedures will often become more complex than UDFs, and may incorporate business logic as well as interaction with the Cosmos DB collection inside the code. When we’re writing a production-grade application it is important that we structure our code correctly so that each part is testable, and has a clearly defined responsibility and interactions with other components. In Cosmos DB, we are interacting with the collection in order to execute queries and update documents, and these side effects are important to test as well. We’ll discuss testing in more detail in part 5, but even before we worry about testing, it’s a good idea to structure our code properly.

When I write Cosmos DB stored procedures, I find it’s helpful to have a simple ‘entry point’ function. This entry point does the interaction with the getContext() function and retrieves the Collection, Request, and Response objects as required. These, along with any other parameters, are then passed into an internal implementation function, which in turn may invoke other functions to do other parts of the logic. By structuring the functions in this way we can ensure that each function has a clear purpose, and that the external components can be mocked and/or spied upon during our tests.

Writing our stored procedure in TypeScript also gives us the ability to store our functions in different .ts files if we want. This is helpful when we have long and complicated stored procedures, or if we want to keep interfaces and functions in separate files. This is largely a choice of style, since TypeScript’s compiler will simply combine all of the functions together at compilation time and will emit a single JavaScript output file (because we have set the outFile property in our tsconfig.json file). One important note on this though – if you have functions spread across multiple files, it is important to pay attention to the order in which the functions get emitted. The stored procedure’s ‘entry point’ function must appear first in the output JavaScript file. TypeScript can be instructed to do this by explicitly listing the entry point function’s file first in the include directive within the tsconfig.json file, and then having a wildcard * to catch the remaining files, like this:

Calling Stored Procedures

Once a stored procedure is written, we need to call it in order to check that it’s working, and then to use it in our real applications. There are several ways we can call our stored procedure and pass in the arguments it expects.

  • The Azure Portal provides a test interface for invoking stored procedures. This is how we will test the stored procedure we write below.
  • The client SDKs provide platform-specific features for invoking stored procedures. For example, the .NET client library for Cosmos DB provides the DocumentClient.ExecuteStoredProcedureAsync function, which accepts the ID of a stored procedure and any arguments that it might be expecting.
  • The Cosmos DB REST API also allows for invoking stored procedures directly.

Again, the exact way you call stored procedures may depend on the Cosmos DB API you are targeting – in this series we are using the SQL API, and the invocation mechanisms for MongoDB, Gremlin, and the other APIs may be different.

Now that we have talked about the different aspects of writing stored procedures, we can think about how we might use a stored procedure in our sample scenario.

Defining our Stored Procedure

In our hypothetical ordering application, we have a Cosmos DB collection of documents representing orders. Each order document contains a customer’s ID (as we saw in part 2 of this series), and also contains a set of items that the customer ordered. An order item consists of the product ID the customer ordered and the quantity of that product.

Because Cosmos DB is a schemaless database, our order documents may coexist with many other documents of different types. Therefore, we also include a type property on each document to indicate that it is an order. This type discriminator pattern is quite common in schemaless databases.

An example of an order document in our collection is:

For this stored procedure we want to pass in a set of product IDs, and get back a grouped list of IDs for customers who have ordered any of those products. This is similar to doing a GROUP BY in a relational database – but currently Cosmos DB doesn’t provide this sort of grouping feature, and so we are using a stored procedure to fill in the gap. Doing this from the client SDK would require multiple queries against the collection, but by using a stored procedure we can just make one call.

At a high level, our stored procedure logic looks like this:

  1. Accept a list of product IDs as an argument.
  2. Iterate through the product IDs.
  3. For each product ID, run a query to retrieve the customer IDs for the customers that have ordered that product, filtering to only query on order documents. The query we’ll run looks like this:

  4. Once we have all of the customer IDs for each product in our list, create a JSON object to represent the results like this:


Preparing a Folder

Now we can start writing our stored procedure. If you want to compare against my completed stored procedure, you can access it on GitHub.

In part 2 of this series we covered how to set up the Cosmos DB account and collection, so I won’t go through that again. We also will reuse the same folder structure as we did in part 2, so you can refer to that post if you’re following along.

There’s one major difference this time though. In our package.json file, we need to add a second entry into the devDependencies list to tell NPM that we want to include the TypeScript type definitions for Cosmos DB. Our package.json file will look like this:

Open a command prompt or terminal window, and run npm install within this folder. This will initialise TypeScript and the type definitions.

We’ll also adjust the tsconfig.json file to emit a file with the name sp-getGroupedOrders.js:

Writing the Stored Procedure

Now let’s create a file named src/getGroupedOrders.ts. This is going to contain our stored procedure code. Let’s start with adding a basic function that will act as our entry point:
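The finished stored procedure is on GitHub (linked above); based on the description that follows, the entry point looks something like this:

declare function getContext(): any; // provided by the DocumentDB server-side type definitions in a real project
declare function getGroupedOrdersImpl(productIds: string[], collection: any): any; // written later in this post

function getGroupedOrders(productIds: string[]): void {
    const context = getContext();
    const collection = context.getCollection();
    const response = context.getResponse();

    // getGroupedOrdersImpl does the real work; this function just wires up the context.
    const results = getGroupedOrdersImpl(productIds, collection);

    response.setBody(results);
}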


As discussed above, this is a pattern I try to follow when I write Cosmos DB server-side code. It helps to keep the interaction with the getContext() function isolated to this one place, and then all subsequent functions will work with explicit objects that we’ll pass around. This will help when we come to test this code later. You can see that this function calls the getGroupedOrdersImpl function, which does the actual work we need done – we’ll write this shortly.

Before then, though, let’s write a basic interface that will represent our response objects:
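A sketch of that interface, with property names assumed from the description below – one entry per product, holding the IDs of the customers who ordered it:

interface CustomersGroupedByProduct {
    productId: string;
    customerIds: string[];
}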


Our getGroupedOrdersImpl function will accept an array of product IDs and the collection in which to query them, and it will return an array of these CustomersGroupedByProducts. Of course, since CustomersGroupedByProduct is a TypeScript interface, we can use it within our functions for type safety, but it will be stripped out when we compile the code into JavaScript.

Now we can add a basic shell of an implementation for our getGroupedOrdersImpl function. As you type this, notice that (if you’re in an editor that supports it, like Visual Studio Code) you get IntelliSense and statement completion thanks to the TypeScript definitions:


This function prepares a variable called outputArray, which will contain all of our product/customer groupings. Then we have some placeholder code to perform our actual queries, which we’ll fill in shortly. Finally, this function returns the output array.

Now we can fill in the placeholder code. Where we have REPLACEME in the function, replace it with this:
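Rather than showing just the replacement fragment, here’s a sketch of the whole getGroupedOrdersImpl function with the placeholder filled in – the query text is illustrative (see the GitHub version for the exact query), and the two interface declarations are repeated only so the sketch stands alone:

interface IParameterizedQuery { query: string; parameters: { name: string; value: any }[]; }
interface CustomersGroupedByProduct { productId: string; customerIds: string[]; }

function getGroupedOrdersImpl(productIds: string[], collection: any): CustomersGroupedByProduct[] {
    const outputArray: CustomersGroupedByProduct[] = [];

    for (const productId of productIds) {
        // A parameterised query that finds the customer IDs on order documents containing this product.
        const query: IParameterizedQuery = {
            query: "SELECT VALUE c.customer.id FROM c WHERE c.type = 'order' AND ARRAY_CONTAINS(c.items, { productId: @productId }, true)",
            parameters: [{ name: "@productId", value: productId }]
        };

        // The callback pushes this product's customer IDs onto the output array.
        const accepted = collection.queryDocuments(
            collection.getSelfLink(),
            query,
            {},
            (error: any, customerIds: string[]) => {
                if (error) { throw error; }
                outputArray.push({ productId: productId, customerIds: customerIds });
            });

        // If the query wasn't accepted, fail the whole stored procedure.
        if (!accepted) {
            throw new Error("The query was not accepted by the server.");
        }
    }

    return outputArray;
}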

There’s a lot going on here, so let’s break it down:

  • The first part sets up a new IParameterizedQuery, which lets us execute a SQL query using parameters. As discussed above, this is a much more secure way to handle parameters than string concatenation. The query will find all orders containing the product ID we’re looking for, and will return back the customer ID.
  • Next, the query callback function is prepared. This is what will be called when the query results are available. In this function we pull out the results and push them onto our outputArray, ready to return to the calling function.
  • Then we try to execute the query against the collection by using the collection.queryDocuments() function. This function returns a boolean to indicate whether the query was accepted. If it wasn’t, we consider this to be an error and immediately throw an error ourselves.

That’s it! The full stored procedure file looks like this:

Here’s what your folder structure should now look like:

  • /
    • package.json
    • tsconfig.json
    • src/
      • getGroupedOrders.ts

Compiling the Stored Procedure

As in part 2, we can now compile our function to JavaScript. Open up a command prompt or terminal, and enter npm run build. You should see that a new output folder has been created, and inside that is a file named sp-getGroupedOrders.js. If you open this, you’ll see the JavaScript version of our function, ready to submit to Cosmos DB. This has all of the type information removed, but the core logic remains the same. Here’s what it should look like:

Deploying the Stored Procedure

Now let’s deploy the stored procedure to our Cosmos DB collection. Open the Azure Portal, browse to the Cosmos DB account, and then go to Script Explorer. Click Create Stored Procedure.

cosmos-sp-1

Enter getGroupedOrders as the ID, and then paste the contents of the compiled sp-getGroupedOrders.js JavaScript file into the body.

cosmos-sp-2

Click Save to install the stored procedure to Cosmos DB. (Once again, this isn’t the best way to install a stored procedure – we’ll look at better ways in part 6 of this series.)

Testing the Stored Procedure

Now let’s insert some sample data so that we can try the stored procedure out. Let’s insert these sample documents using Document Explorer, as described in part 2.

Here are the three sample documents we’ll use:

Now go back into Script Explorer, open the stored procedure, and notice that there is a test panel below the script body textbox. We can enter our stored procedure’s parameters into the Inputs field. Let’s do that now. Enter [["P1", "P2", "P10"]] – be careful to include the double square brackets around the array. Then click the Save & Execute button, and you should see the results.

cosmos-sp-3

If we reformat them, our results look like the following. We can see that we have an array containing an object for each product ID we passed into the query, and each object has a list of customer IDs who ordered that product:

So our stored procedure works! We’ve now successfully encapsulated the logic involved in querying for customers that have ordered any of a set of products.

Summary

Stored procedures give us a way to encapsulate queries and operations to be performed on a Cosmos DB collection, and to invoke them as a single unit. Stored procedures run within an implicit transaction, so any unhandled errors will result in the changes being rolled back. Unlike in UDFs, we are also able to access the collection within a stored procedure by using the getContext() function, and by retrieving the Response and Collection objects. This allows us to return rich data, including objects and arrays, as well as to interact with the collection and its documents. In the next part of this series we will discuss triggers, the third type of server-side programming available in Cosmos DB, which allow us to intercept changes happening to documents within the collection.

Key Takeaways

  • Stored procedures encapsulate logic, queries, and actions upon documents within the collection.
  • Stored procedures provide transactional isolation, and all stored procedures run within a transaction scope.
  • The getContext() function allows us to access the Response and Collection objects.
  • TypeScript definitions are available to help when writing code against these objects.
  • Cosmos DB provides an integrated query syntax, which is great for simple queries, but doesn’t cover all possible queries that can be executed against the collection.
  • Arbitrary SQL queries can also be executed. If these contain parameters then the IParameterizedQuery interface should be used to ensure safe coding practices are adhered to.
  • The order of functions inside the stored procedure’s file matters. The first function will be the one that Cosmos DB treats as the entry point.
  • You can view the code for this post on GitHub.

Cosmos DB Server-Side Programming with TypeScript – Part 2: User-Defined Functions

User-defined functions (UDFs) in Cosmos DB allow for simple calculations and computations to be performed on values, entities, and documents. In this post I will introduce UDFs, and then provide detailed steps to set up a basic UDF written in TypeScript. Many of these same steps will be applicable to stored procedures and triggers, which we’ll look at in future posts.

This is the second part of a series of blog posts on server-side development using Cosmos DB with TypeScript.

  • Part 1 gives an overview of the server side programmability model, the reasons why you might want to consider server-side code in Cosmos DB, and some key things to watch out for.
  • Part 2 (this post) deals with user-defined functions, the simplest type of server-side programming, which allow for adding simple computation to queries.
  • Part 3 talks about stored procedures. These provide a lot of powerful features for creating, modifying, deleting, and querying across documents – including in a transactional way.
  • Part 4 introduces triggers. Triggers come in two types – pre-triggers and post-triggers – and allow for behaviour like validating and modifying documents as they are inserted or updated, and creating secondary effects as a result of changes to documents in a collection.
  • Part 5 discusses unit testing your server-side scripts. Unit testing is a key part of building a production-grade application, and even though some of your code runs inside Cosmos DB, your business logic can still be tested.
  • Finally, part 6 explains how server-side scripts can be built and deployed into a Cosmos DB collection within an automated build and release pipeline, using Microsoft Visual Studio Team Services (VSTS).

User-Defined Functions

UDFs are the simplest type of server-side development available for Cosmos DB. UDFs generally accept one or more parameters and return a value. They cannot access Cosmos DB’s internal resources, and cannot read or write documents from the collection, so they are really only intended for simple types of computation. They can be used within queries, including in the SELECT and WHERE clauses.

UDFs are simple enough that types are almost not necessary, but for consistency we will use TypeScript for these too. This will also allow us to work through the setup of a TypeScript project, which we’ll reuse for the next parts of this series.

One note on terminology: the word function can get somewhat overloaded here, since it can refer to the Cosmos DB concept of a UDF, or to the TypeScript and JavaScript concept of a function. This can get confusing, especially since a UDF can contain multiple JavaScript functions within its definition. For consistency I will use UDF when I’m referring to the Cosmos DB concept, and function when referring to the JavaScript or TypeScript concept.

Parameters

UDFs can accept zero or more parameters. Realistically, though, most UDFs will accept at least one parameter, since UDFs almost always operate on a piece of data of some kind. The UDF parameters can be of any type, and since we are running within Cosmos DB, they will likely be either a primitive type (e.g. a single string, number, array, etc), a complex type (e.g. a custom JavaScript object, itself comprised of primitive types and other complex types), or even an entire document (which is really just a complex type). This gives us a lot of flexibility. We can have UDFs that do all sorts of things, including:

  • Accept a single string as a parameter. Do some string parsing on it, then return the parsed value.
  • Accept a single string as well as another parameter. Based on the value of the second parameter, change the parsing behaviour, then return the parsed value.
  • Accept an array of values as a parameter. Combine the values using some custom logic that you define, then return the combined value.
  • Accept no parameters. Return a piece of custom data based on the current date and time.
  • Accept a complex type as a parameter. Do some parsing of the document and then return a single output.

Invoking a UDF

A UDF can be invoked from within the SELECT and WHERE clauses of a SQL query. To invoke a UDF, you need to include the prefix udf. before the function name, like this:

SELECT udf.parseString(c.stringField) FROM c

In this example, udf. is a prefix indicating that we want to call a UDF, and parseString is the name of the UDF we want to call. Note that this is the identifier that Cosmos DB uses for the UDF, and it is not necessarily the name of the JavaScript function that implements the UDF. (However, I strongly recommend that you keep these consistent, and I will do so throughout this series.)

You can pass in multiple parameters to the UDF by comma delimiting them, like this:

SELECT udf.parseString(c.stringField, 1234) FROM c
SELECT udf.parseString(c.stringField1, c.stringField2) FROM c

To pass a hard-coded array into a UDF, you can simply use square brackets, like this:

SELECT udf.parseArray(["arrayValue1", "arrayValue2", "arrayValue3"])

Now that we’ve talked through some of the key things to know about UDFs let’s try writing one, using our sample scenario from part 1.

Defining our UDF

Let’s imagine that our order system was built several years ago, and our business has now evolved significantly. As a result, we are in the middle of changing our order schema to represent customer IDs in different ways. Cosmos DB makes this easy by not enforcing a schema, so we can simply switch to the new schema when we’re ready.

Our old way of representing a customer ID was like this:

Now, though, we are representing customers with additional metadata, like this:

However, we still want to be able to easily use a customer’s ID within our queries. We need a way to dynamically figure out the customer’s ID for an order, and this needs to work across our old and new schemas. This is a great candidate for a UDF. Let’s deploy a Cosmos DB account and set up this UDF.

Setting Up a Cosmos DB Account and Collection

First, we’ll deploy a Cosmos DB account and set up a database and collection using the Azure Portal. (Later in this series, we will discuss how this can be made more automatable.) Log into the Azure Portal and click New, then choose Cosmos DB. Click Create.

cosmos-udf-1

We need to specify a globally unique name for our Cosmos DB account – I have used johnorders, but you can use whatever you want. Make sure to select the SQL option in the API drop-down list. You can specify any subscription, resource group, and location that you want. Click Create, and Cosmos DB will provision the account – this takes around 5-10 minutes.

Once the account is created, open it in the Portal. Click Add Collection to add a new collection.

cosmos-udf-2

Let’s name the collection Orders, and it can go into a database also named Orders. Provision it with a fixed (10GB) capacity, and 400 RU/s throughput (the minimum).

cosmos-udf-3

Note that this collection will cost money to run, so you’ll want to remember to delete the collection when you’re finished. You can leave an empty Cosmos DB account in place for no charge, though.

Preparing a Folder

TypeScript requires that we provide it with some instructions on how to compile our code. We’ll also use Node Package Manager (NPM) to tie all of our steps together, and so need to prepare a few things before we can write our UDF’s code.

Note that in parts 2, 3, and 4 of this series, we will write each server-side component as if it was its own independent application. This is to keep each of these posts easy to follow in a standalone way. However, in parts 5 and 6, we will combine these into a single folder structure. This will more accurately represent a real-world application, in which you are likely to have more than one server-side component.

Create a new folder on your local machine and add a file named package.json into it. This contains the NPM configuration we need. The contents of the file should be as follows:

The package.json file does the following:

  • It defines the package dependencies we have when we develop our code. Currently we only have one – typescript.
  • It also defines a script that we can execute from within NPM. Currently we only have one – build, which will build our UDF into JavaScript using TypeScript’s tsc command.

At a command prompt, open the folder we’ve been working in. Run npm install, which will find and install the TypeScript compiler. If you don’t have NPM installed, install it by following the instructions here.

Next, create a file named tsconfig.json. This is the TypeScript project configuration. This should contain the following:

The tsconfig.json instructs TypeScript to do the following:

  • Find and compile all of the files with the .ts extension inside the src folder (we haven’t created this yet!).
  • Target ECMAScript 2015, which is the version of JavaScript that Cosmos DB supports. If we use more modern features of TypeScript, it will handle the details of how to emit these as ECMAScript 2015-compatible code.
  • Save the output JavaScript to a single file named output/udf-getCustomerId.js. This filename is arbitrary and it could be any name we want, but I find it helpful to use the convention of  {kind}-{objectName}.js, especially as we add more code in later parts of this series.
    Note that the outFile directive means that, even if we included multiple source TypeScript files, TypeScript will save the complete compiled script into a single output file. This is in keeping with the requirement that Cosmos DB imposes that a server-side component has to be specified in a single file.

Writing the UDF

Now we can write our actual UDF code! If you want to compare against my completed UDF, you can access it on GitHub.

Create a folder named src, and within that, create a file named getCustomerId.ts. (Once again, this file doesn’t need to be named this way, but I find it helpful to use the UDF’s name for the filename.) Here’s the contents of this file:
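The finished UDF is on GitHub (linked above); based on the explanation below, a sketch of this file looks like this:

function getCustomerId(document: OrderDocument): string {
    // New schema: the customer ID lives inside a customer object.
    if (document.customer && document.customer.id) {
        return document.customer.id;
    }

    // Old schema: the customer ID is a top-level property.
    if (document.customerId) {
        return document.customerId;
    }

    throw new Error(`Document with id ${document.id} does not contain customer ID in recognised format.`);
}

interface OrderDocument {
    id?: string;
    customerId?: string;
    customer?: { id: string };
}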

Briefly, here’s an explanation of what this file does:
  • It declares a function named getCustomerId, which accepts a single parameter of type OrderDocument and returns a string. This is the function that represents our UDF.
  • The function inspects the document provided, and depending on which version of the schema it follows, it pulls the customer ID out of the appropriate field.
  • If the customer ID isn’t in either of the places it expects to find them, it throws an error. This will be further thrown up to the client by Cosmos DB.
  • Finally, it declares an interface named OrderDocument. This represents the shape of the data we’re expecting to store in our collection, and it has both of the ways of representing customer IDs.
    Note that we are using an interface and not a class, because this data type has no meaning to Cosmos DB – it’s only for use at development and build time.
    Also note that we could put this into its own orderDocument.ts file if we wanted to keep things separated out.

At the end of this, your folder should look something like this:

  • /
    • package.json
    • tsconfig.json
    •  src/
      • getCustomerId.ts

You can access a copy of this as a GitHub repository here.

We have now written our first UDF! We’re almost ready to run it – but before then, we need to compile it.

Compiling the UDF

At the command line run npm run build. This will run the build script we defined inside the package.json file, which in turn simply runs the tsc (TypeScript compiler) command-line application. tsc will find the tsconfig.json file and knows what to do with it.

Once it’s finished, you should see a new output folder containing a file named udf-getCustomerId.js. This is our fully compiled UDF! It should look like the following:

If you compare this to the code we wrote in TypeScript, you’ll see that it is almost the same – except all of the type information (variable types and the interface) have been stripped away. This means that we get the type safety benefits of TypeScript at authoring and compilation time, but the file we provide to Cosmos DB is just a regular JavaScript file.

Deploying the UDF

Now we can deploy the UDF. Back in the Azure Portal, open Script Explorer under the Collections section, and then click Create User Defined Function.

cosmos-udf-4.png

Enter getCustomerId in the ID field. This will be the name we address the UDF by when we start to call it from our queries. Note that you don’t have to use the same ID field here as the JavaScript function name – in fact, the JavaScript function can be named anything you like. For clarity, though, I find it helpful to keep everything in sync.

Now we can copy and paste the contents of the udf-getCustomerId.js file into the large script text box.

cosmos-udf-5

Click Save to install the UDF to Cosmos DB.

Testing the UDF

Finally, let’s test the UDF! We’ll need to add a couple of pieces of sample data. Click Document Explorer under the Collections section, and then click the Create button. Paste in this sample document:

Click Save, and then close the blade and create a second document with the following contents:

This gives us enough to test with. Now click Query Explorer under the Collections section. Enter the following query:

SELECT c.id, udf.getCustomerId(c) AS customerId FROM c

This query does the following:

  • Refers to each document within the current collection (c).
  • Runs the getCustomerId UDF, passing in the document contents. Note that to refer to a UDF, you must prefix the name of the UDF with udf..
  • Projects out the document ID (id) and the customerId as customerId.

You should see the following output:

That’s exactly what we wanted to see – the UDF has pulled out the correct field for each document.

As a point of interest, notice the Request Charge on the query results. Try running the query a few times, and you should see that it fluctuates a little – but is generally around 3.5 RUs.

Now let’s try passing in an invalid input into our UDF. Run this query:

SELECT udf.getCustomerId('123') FROM c

Cosmos DB will give you back the error that our UDF threw, because the input data (the string '123') didn’t match either of the schemas it expected:

Encountered exception while executing Javascript.
  Exception = Error: Document with id undefined does not contain customer ID in recognised format.
  Stack trace: Error: Document with id undefined does not contain customer ID in recognised format.
    at getCustomerId (getCustomerId.js:11:5)
    at __docDbMain (getCustomerId.js:15:5)
    at Global code (getCustomerId.js:1:2)

So we’ve now tested out the old customer ID format, the new customer ID format, and some invalid input, and the UDF behaves as we expect.

Summary

UDFs provide us with a way to encapsulate simple computational logic, and to expose this within queries. Although we can’t refer to other documents or external data sources, UDFs are a good way to expose certain types of custom business logic. In this post, we’ve created a simple UDF for Cosmos DB, and tested it using a couple of simple documents. In the next part of this series we’ll move on to stored procedures, which allow for considerably more complexity in our server-side code.

Key Takeaways

  • UDFs are intended for simple computation.
  • They can be used within queries, including in the SELECT and WHERE clauses.
  • UDFs cannot access anything within the Cosmos DB collection, nor can they access any external resources.
  • They can accept one or more parameters, which must be provided when calling the UDF.
  • The name of the JavaScript function does not have to match the name of the UDF, but to keep your sanity, I highly recommend keeping them consistent.
  • UDFs can be invoked from within a query by using the udf. prefix, e.g. SELECT udf.getCustomerId(c) FROM c.
  • You can view the code for this post on GitHub.

 

Cosmos DB Server-Side Programming with TypeScript – Part 1: Introduction

Cosmos DB is a NoSQL database provided as part of Microsoft’s Azure platform. Designed for very high performance and scalability, Cosmos DB is rapidly becoming one of the default data storage options I recommend for new green-field applications and microservices. It is a fairly opinionated database, with some guidelines that you need to follow to take full advantage of its scalability and performance, but it also provides a number of features to enable sophisticated and powerful applications to be built on top of its engine.

One such feature is its server-side programmability model. Cosmos DB allows for stored procedures, triggers, and user-defined functions to run within its database engine. Interestingly, these are written using JavaScript and uploaded to the Cosmos DB collection in which they will run. Server-side programming gives a lot of extra power to a Cosmos DB-based application, including the ability to run transactions across multiple documents within the collection. In fact, server-side programming is the only way to get transaction semantics within Cosmos DB.

In this series of blog posts, we will explore server-side programming in Cosmos DB, and we will use TypeScript to write the server-side code. I will focus on how to build real-world applications, including adding unit tests to ensure the code behaves as expected, and incorporating the build and deployment of Cosmos DB server-side code into your CI/CD process. The series is split into six parts:

  • Part 1 (this post) gives an overview of the server-side programmability model, the reasons why you might want to consider server-side code in Cosmos DB, and some key things to watch out for.
  • Part 2 deals with user-defined functions, the simplest type of server-side programming, which allow for adding simple computation to queries.
  • Part 3 talks about stored procedures. These provide a lot of powerful features for creating, modifying, deleting, and querying across documents – including in a transactional way.
  • Part 4 introduces triggers. Triggers come in two types – pre-triggers and post-triggers – and allow for behaviour like validating and modifying documents as they are inserted or updated, and creating secondary effects as a result of changes to documents in a collection.
  • Part 5 discusses unit testing your server-side scripts. Unit testing is a key part of building a production-grade application, and even though some of your code runs inside Cosmos DB, your business logic can still be tested.
  • Finally, part 6 explains how server-side scripts can be built and deployed into a Cosmos DB collection within an automated build and release pipeline, using Microsoft Visual Studio Team Services (VSTS).

In this series I presume some basic knowledge of Cosmos DB. If you’re completely new to Cosmos DB then I recommend reading Microsoft’s overview, and following along with one of the quick starts. A passing familiarity with TypeScript will also be helpful, but even if you don’t know how to use TypeScript, I’ll try to cover the key points you need to know to get started.

Using TypeScript

TypeScript is a language that compiles (or, technically, transpiles) into JavaScript. It provides a number of nice features and improvements over JavaScript, the main one being type safety. This means that the TypeScript compiler can check that your code is accessing the correct types and members as you write it. Writing code in TypeScript allows for a better level of certainty that your code is actually going to work, as well as providing some very nice development-time features such as IntelliSense. It also helps to have strong typing when unit testing, and particularly when mocking out external interfaces and classes within tests.

Because TypeScript compiles into JavaScript code, any JavaScript runtime will be able to run code that was written in TypeScript. This includes Cosmos DB’s JavaScript engine, Chakra. Even though Cosmos DB doesn’t know about TypeScript or support it directly, we can still take advantage of many of the features that TypeScript provides, and then compile our script into JavaScript before handing it over to Cosmos DB for execution.

TypeScript also lets us separate out our code into multiple .ts files, keeping it tidy and well-organised. Cosmos DB, however, requires that our code be in a single .js file – but thankfully, TypeScript can be configured to combine our code when it compiles it.

When working with external libraries and APIs within TypeScript, we need to use type definitions. These specify the details of the types we will use. While the Cosmos DB team doesn’t provide first-party type definitions for their server-side API, there are publicly accessible, open-source type definitions available from the DefinitelyTyped repository. We will use these later in this series.
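
For example, the server-side definitions can be installed from NPM as a development dependency. At the time of writing the package is published under the name shown below, but check the DefinitelyTyped repository for the current package name:

npm install --save-dev @types/documentdb-server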

Note that Cosmos DB supports the ECMAScript 2015 version of JavaScript. TypeScript can be configured to emit JavaScript in several different versions, including ECMAScript 2015 code, so this is not a problem for us. We’ll see how to do this in part 2 of this series.
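
As a preview, the relevant compiler options look something like the sketch below. The output file name here is just a placeholder, and we’ll build up the full tsconfig.json in part 2:

{
    "compilerOptions": {
        "target": "es2015",
        "outFile": "output/udf-getCustomerId.js"
    },
    "include": [
        "src/**/*.ts"
    ]
}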

Impact of Server-Side Programming on Request Units

When using Cosmos DB, we don’t provision CPU cores or disk speed or memory. Instead, Cosmos DB uses request units as its currency. A Cosmos DB collection is provisioned with a certain number of request units per second (RU/s), which can be scaled as necessary to cope with your application’s demands. The RU/s provisioned dictates the monetary cost of running the collection. For example, a simple collection with a light query load might be provisioned with 1000 RU/s, which (as of January 2018) costs approximately USD$60 per month. For more information on Cosmos DB’s request unit model see here, and for the latest pricing information, see here.

Server-side code running within Cosmos DB can easily consume a lot of request units, potentially exhausting your allowance for that second and forcing your application to have to retry operations against Cosmos DB. Furthermore, the cost of running queries server-side may sometimes be higher than the cost of running the equivalent query using the standard client-side APIs, due to the resources it takes to start up a stored procedure or function from JavaScript. Cosmos DB does have some optimisations to reduce the cost of running JavaScript code – for example, internally Cosmos DB compiles the JavaScript code to bytecode and then caches this bytecode so that it doesn’t need to recompile it on every invocation. However, running arbitrary code will usually be more expensive than just using the client-side query APIs, and this means that server-side code may not be appropriate if you don’t actually need the benefits it provides.

Additionally, the request unit usage for a given piece of server-side code is not fully predictable or consistent – in my own testing, I’ve seen the exact same piece of code, working on the same data set, take anywhere from 3.2 RUs through to 4.8 RUs to execute. This is in contrast to the rest of Cosmos DB, where request unit usage is very predictable.

Nevertheless, for some scenarios, such as bulk inserts of multiple documents or generating sample data, it may take fewer request units to run code server-side than client-side. It is important to benchmark your code and compare the possible approaches to fully understand the best option for your requirements.

Consistency: Transactions and Indexes

Cosmos DB’s client-side programming model does not provide for transactional consistency across multiple documents. For example, you may have two documents to insert or update, and require that either both operations succeed or – if there is a problem with writing one of them – that both of them should fail. The Cosmos DB server-side programmability model allows for this behaviour to be implemented, because all server-side code runs within an implicit transaction. This means that we get ACID transaction semantics automatically whenever we execute a stored procedure or trigger. Cosmos DB runs stored procedures and triggers on the primary replica that is used to host the data, which allows it to give this level of transactional isolation while still allowing for high performance operations within the transaction.

Note, however, that transactions are not serialised. This means that other transactions may be happening simultaneously on other documents within the collection in parallel with your transaction. This is important because it means that functionality like real-time aggregation of data may not always be looking at a consistent view of the world, and you can get race conditions. This is simply due to the way Cosmos DB works, and is not something that we can easily program around.

A further nuance to be aware of is that, in Cosmos DB, indexes are updated asynchronously. This means that if you query the collection within a trigger, you may not see the document currently being inserted or updated. Again, this makes it challenging to do certain types of queries (such as aggregations), but is a byproduct of Cosmos DB’s emphasis on enabling high performance and throughput.

API Models

Cosmos DB provides several API models to access data in your databases: SQL to use a SQL-based syntax; MongoDB for using the MongoDB client libraries and tools; Table to use the Azure Storage table API; and Gremlin to use the Gremlin graph protocol. All of these ultimately store data in the same way though, and all of them allow for Cosmos DB’s server-side programmability model.

In this series I will focus purely on the SQL API, but most of the same concepts can be applied to the other API models.

Restrictions

Cosmos DB has placed some restrictions on the types of operations that can be run from within the server. This is mostly to optimise the performance and security of the service.

Time restriction: each server-side operation has a fixed amount of time within which it must execute. The exact amount of time is not documented, but the server-side API provides some features to indicate when your script is approaching its limit. We will discuss this more in later parts of the series. It is important to keep this restriction in mind when designing your stored procedures and triggers, and to avoid writing server-side code that makes high volumes of queries. Instead, if you batch the work up across multiple stored procedure calls, you are more likely to have all of your code execute successfully.

Limited set of functionality: although it executes arbitrary JavaScript, Cosmos DB’s server-side programming model is designed for basic computation and for interacting with the Cosmos DB collection itself. We cannot call external web APIs, communicate with other collections, or import any complex JavaScript libraries.

Limited fluent query capability: Cosmos DB’s server-side API has a fluent JavaScript-based query syntax to perform various types of queries, filters, projections, etc on the underlying collection. However, this API does not support all of the rich functionality that the SQL grammar provides, nor does it allow for the same types of queries as the other API models, such as Gremlin’s graph query capability.

One example feature that is missing from the server-side query API is aggregation. The Cosmos DB SQL dialect allows for aggregate queries using functions such as SUM, MIN, MAX, and COUNT. These cannot be performed using the fluent server-side query API. However, SQL queries can be executed from within server-side code, so this is not a serious limitation and really just affects the way the code is written, not the functionality that is exposed.
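
For example, a count of the documents in the collection can still be obtained from server-side code by issuing a standard SQL query like the following, even though the fluent API has no equivalent:

SELECT VALUE COUNT(1) FROM c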

Single JavaScript file: a single stored procedure, trigger, or user-defined function is represented by a JavaScript function, which in turn may call other functions. However, all of the code must be placed in a single file. JavaScript modules and other similar features are not supported. We will see how to split our functionality into multiple TypeScript files, while still emitting a single JavaScript file, later in this series.

Following Along

In this series, you will be able to follow along and create each type of server-side programming entity in Cosmos DB: a user-defined function, a stored procedure, and a trigger. We will build up a set of server-side code, and then in parts 5 and 6 of this series we will look at how to get these ready for a production deployment by testing and automatically building and deploying them to Cosmos DB.

You can follow along whether you use Windows, macOS, or Linux to develop. There are just a few prerequisites:

  • A good text editor: I use Visual Studio Code, which comes with the TypeScript programming extension, but you can use anything you like.
  • Node Package Manager (NPM): you can install this here if you don’t already have it.
  • An Azure subscription. Alternatively, you can use the Cosmos DB emulator to run this locally and at no charge, but you will need to adapt the instructions slightly.

Sample Scenario

A common use for a database is to store information about orders that customers make for products. Orders typically contain some basic overall information, such as an order ID, date, customer ID, and a set of order items – references to products and the quantities ordered. In this series we will work with a simple hypothetical order database implemented in Cosmos DB.

We will use the SQL API, and we will use a non-partitioned collection. Note that partitioned collections behave the same way as non-partitioned collections when it comes to server-side programmability, but they also have a few nuances in their behaviour that we won’t go through here.

Summary

Server-side programming in Cosmos DB is extremely powerful. It gives us the ability to write functions, stored procedures, and triggers that execute within the database engine, and allow for features that are simply not possible through the client-side programming model. However, there are limitations and things to be aware of, including the potentially high cost of running some types of operations from the server. The features also do not provide the same degree of flexibility and power as their counterparts in SQL Server and other relational databases. Nevertheless, the server-side programming model in Cosmos DB is enormously useful for certain types of situations.

By using TypeScript to add type safety, and by adding unit tests and good continuous integration and continuous deployment practices, we can build advanced behaviour into our production-grade applications – all while taking advantage of the high performance and scale capabilities of Cosmos DB.

In the next part of this blog series, we will start writing some server-side code – first by building a user-defined function.

VSTS Build Definitions as YAML Part 2: How?

In the last post, I described why you might want to define your build definition as a YAML file using the new YAML Build Definitions feature in VSTS. In this post, we will walk through an example of a simple VSTS YAML build definition for a .NET Core application.

Application Setup

Our example application will be a blank ASP.NET Core web application with a unit test project. I created these using Visual Studio for Mac’s ASP.NET Core Web application template, and added a blank unit test project. The template adds a single unit test with no logic in it. For our purposes here we don’t need any actual code beyond this. Here’s the project layout in Visual Studio:

Visual Studio Project

I set up a project on VSTS with a Git repository, and I pushed my code into that repository, in a /src folder. Once pushed, here’s how the code looks in VSTS:

Code in VSTS

Enable YAML Build Definitions

As of the time of writing (November 2017), YAML build definitions are currently a preview feature in VSTS. This means we have to explicitly turn them on. To do this, click on your profile picture in the top right of the screen, and then click Preview Features.

VSTS Preview Features option

Switch the drop-down to For this account [youraccountname], and turn the Build YAML Definitions feature to On.

Enable Build YAML Definitions feature in VSTS

Design the Build Process

Now we need to decide how we want to build our application. This will form the initial set of build steps we’ll run. Because we’re building a .NET Core application, we need to do the following steps:

  • We need to restore the NuGet packages (dotnet restore).
  • We need to build our code (dotnet build).
  • We need to run our unit tests (dotnet test).
  • We need to publish our application (dotnet publish).
  • Finally, we need to collect our application files into a build artifact.

As it happens, we can actually collapse this down to a slightly shorter set of steps (for example, the dotnet build command also runs an implicit dotnet restore), but for now we’ll keep each of these steps separate so we can be very explicit in our build definition. For a real application, we would likely try to simplify and optimise this process.

I created a new build folder at the same level as the src folder, and added a file in there called build.yaml.

VSTS also allows you to create a file in the root of the repository called .vsts-ci.yml. When this file is pushed to VSTS, it will create a build configuration for you automatically. Personally I don’t like this convention, partly because I want the file to live in the build folder, and partly because I’m not generally a fan of this kind of ‘magic’ and would rather do things manually.

Once I’d created a placeholder build.yaml file, here’s how things looked in my repository:

Build folder in VSTS repository

Writing the YAML

Now we can actually start writing our build definition!

In future updates to VSTS, we will be able to export an existing build definition as a YAML file. For now, though, we have to write them by hand. Personally I prefer that anyway, as it helps me to understand what’s going on and gives me a lot of control.

First, we need to put a steps: line at the top of our YAML file. This will indicate that we’re about to define a sequence of build steps. Note that VSTS also lets you define multiple build phases, and steps within those phases. That’s a slightly more advanced feature, and I won’t go into that here.

Next, we want to add an actual build step to call dotnet restore. VSTS provides a task called .NET Core, which has the internal task name DotNetCoreCLI. This will do exactly what we want. Here’s how we define this step:
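
A minimal version of the step looks something like this:

- task: DotNetCoreCLI@2
  displayName: Restore NuGet Packages
  inputs:
    command: restore
    projects: src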

Let’s break this down a bit. Also, make sure to pay attention to indentation – this is very important in YAML.

- task: DotNetCoreCLI@2 indicates that this is a build task, and that it is of type DotNetCoreCLI. This task is fully defined in Microsoft’s VSTS Tasks repository. Looking at that JSON file, we can see that the version of the task is 2.1.8 (at the time of writing). We only need to specify the major version within our step definition, and we do that just after the @ symbol.

displayName: Restore NuGet Packages specifies the user-displayable name of the step, which will appear on the build logs.

inputs: specifies the properties that the task takes as inputs. This will vary from task to task, and the task definition will be one source you can use to find the correct names and values for these inputs.

command: restore tells the .NET Core task that it should run the dotnet restore command.

projects: src tells the .NET Core task that it should run the command with the src folder as an additional argument. This means that this task is the equivalent of running dotnet restore src from the command line.

The other .NET Core tasks are similar, so I won’t include them here – but you can see the full YAML file below.

Finally, we have a build step to publish the artifacts that we’ve generated. Here’s how we define this step:
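
Assuming the dotnet publish step has written its output to the build agent’s artifact staging directory (a built-in VSTS variable), the step looks something like this:

- task: PublishBuildArtifacts@1
  displayName: Publish Build Artifact
  inputs:
    pathToPublish: $(Build.ArtifactStagingDirectory)
    artifactName: deploy
    artifactType: container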

This uses the PublishBuildArtifacts task. If we consult the definition for this task on the Microsoft GitHub repository, we can see that it accepts several arguments. The ones we’re setting are:

  • pathToPublish is the path on the build agent where the dotnet publish step has saved its output. (As you will see in the full YAML file below, I manually overrode this in the dotnet publish step.)
  • artifactName is the name that is given to the build artifact. As we only have one, I’ve kept the name fairly generic and just called it deploy. In other projects, you might have multiple artifacts and then give them more meaningful names.
  • artifactType is set to container, which is the internal ID for the Artifact publish location: Visual Studio Team Services/TFS option.

Here is the complete build.yaml file:
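
A sketch of the whole file, putting the steps above together, is below. Your exact arguments – such as the paths passed to each dotnet command and the publish output location – may differ slightly:

steps:
- task: DotNetCoreCLI@2
  displayName: Restore NuGet Packages
  inputs:
    command: restore
    projects: src
- task: DotNetCoreCLI@2
  displayName: Build Solution
  inputs:
    command: build
    projects: src
- task: DotNetCoreCLI@2
  displayName: Run Unit Tests
  inputs:
    command: test
    projects: src
- task: DotNetCoreCLI@2
  displayName: Publish Application
  inputs:
    command: publish
    projects: src
    arguments: --output $(Build.ArtifactStagingDirectory)
- task: PublishBuildArtifacts@1
  displayName: Publish Build Artifact
  inputs:
    pathToPublish: $(Build.ArtifactStagingDirectory)
    artifactName: deploy
    artifactType: container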

Set up a Build Configuration and Run

Now we can set up a build configuration in VSTS and tell it to use the YAML file. This is a one-time operation. In VSTS’s Build and Release section, go to the Builds tab, and then click New Definition.

Create new build definition in VSTS

You should see YAML as a template type. If you don’t see this option, check you enabled the feature as described above.

YAML build template in VSTS

We’ll configure our build configuration with the Hosted VS2017 queue (this just means that our builds will run on a Microsoft-managed build agent, which has Visual Studio 2017 installed). We also have to specify the relative path to the YAML file in the repository, which in our case is build/build.yaml.

Build configuration in VSTS

Now we can save and queue a build. Here’s the final output from the build:

Successful build

(Yes, this is build number 4 – I made a couple of silly syntax errors in the YAML file and had to retry a few times!)

As you can see, the tasks all ran successfully and the test passed. Under the Artifacts tab, we can also see that the deploy artifact was created:

Build artifact

Tips and Other Resources

This is still a preview feature, so there are still some known issues and things to watch out for.

Documentation

There is a limited amount of documentation available for creating and using this feature. The official documentation links so far are:

In particular, the properties for each task are not documented, and you need to consult the task’s task.json file to understand how to structure the YAML syntax. Many of the built-in tasks are defined in Microsoft’s GitHub repository, and this is a great starting point, but more comprehensive documentation would definitely be helpful.

Examples

There aren’t very many example templates available yet. That is why I wrote this article. I also recommend Marcus Felling’s recent blog post, where he provides a more complex example of a YAML build definition.

Tooling

As I mentioned above, there is limited tooling available currently. The VSTS team have indicated that they will soon provide the ability to export an existing build definition as YAML, which will help a lot when trying to generate and understand YAML build definitions. My personal preference will still be to craft them by hand, but this would be a useful feature to help bootstrap new templates.

Similarly, there currently doesn’t appear to be any validation of the parameters passed to each task. If you misspell a property name, you won’t get an error – but the task won’t behave as you expect. Hopefully this experience will be improved over time.

Error Messages

The error messages that you get from the build system aren’t always very clear. One error message I’ve seen frequently is Mapping values are not allowed in this context. Whenever I’ve had this, it’s been because I did something wrong with my YAML indentation. Hopefully this saves somebody some time!

Releases

This feature is only available for VSTS build configurations right now. Release configurations will also be getting the YAML treatment, but it seems this won’t be for another few months. This will be an exciting development, as in my experience, release definitions can be even more complex than build configurations, and would benefit from all of the peer review, versioning, and branching goodness I described above.

Variable Binding and Evaluation Order

While you can use VSTS variables in many places within the YAML build definitions, there appear to be some properties that can’t be bound to a variable. One I’ve encountered is when trying to link an Azure subscription to a property.

For example, in one of my build configurations, I want to publish a Docker image to an Azure Container Registry. In order to do this I need to pass in the Azure service endpoint details. However, if I specify this as a variable, I get an error – I have to hard-code the service endpoint’s identifier into the YAML file. This is not something I want to do, and will become a particular issue when release definitions can be defined in YAML, so I hope this gets fixed soon.

Summary

Build definitions and build scripts are an integral part of your application’s source code, and should be treated as such. By storing your build definition as a YAML file and placing it into your repository, you can begin to improve the quality of your build pipeline, and take advantage of source control features like diffing, versioning, branching, and pull request review.

VSTS Build Definitions as YAML Part 1: What and Why?

Visual Studio Team Services (VSTS) has recently gained the ability to create build definitions as YAML files. This feature is currently in preview. In this post, I’ll explain why this is a great addition to the VSTS platform and why you might want to define your builds in this way. In the next post I’ll work through an example of using this feature, and I’ll also provide some tips and links to documentation and guidance that I found helpful when constructing some build definitions myself.

What Are Build Definitions?

If you use a build server of some kind (and you almost certainly should!), you need to tell the build server how to actually build your software. VSTS has the concept of a build configuration, which specifies how and when to build your application. The ‘how’ part of this is the build definition. Typically, a build definition will outline how the system should take your source code, apply some operations to it (like compiling the code into binaries), and emit build artifacts. These artifacts usually then get passed through to a release configuration, which will deploy them to your release environment.

In a simple static website, the build definition might simply be one step that copies some files from your source control system into a build artifact. In a .NET Core application, you will generally use the dotnet command-line tool to build, test, and publish your application. Other application frameworks and languages will have their own way of building their artifacts, and in a non-trivial application, the steps involved in building the application might get fairly complex, and may even trigger PowerShell or Bash scripts to allow for further flexibility and advanced control flow and logic.

Until now, VSTS has really only allowed us to create and modify build definitions in its web editor. This is a great way to get started with a new build definition, and you can browse the catalog of available steps, add them into your build definition, and configure them as necessary. You can also define and use variables to allow for reuse of information across steps. However, there are some major drawbacks to defining your build definition in this way.

Why Store Build Definitions in YAML?

Build definitions are really just another type of source code. A build definition is just a specification of how your application should be built. The exact list of steps, and the sequence in which they run, is the way in which we are defining part of our system’s behaviour. If we adopt a DevOps mindset, then we want to make sure we treat all of our system as source code, including our build definitions, and we want to take this source code seriously.

In November 2017, Microsoft announced that VSTS now has the ability to run builds that have been defined as a YAML file, instead of through the visual editor. The YAML file gets checked into the source control system, exactly the same way as any other file. This is similar to the way Travis CI allows for build definitions to be specified in YAML files, and is great news for those of us who want to treat our build definitions as code. It gives us many advantages.

Versioning

We can keep build definitions versioned, and can track the history of a build definition over time. This is great for auditability, as well as ensuring that we have the ability to consult or roll back to a previous version if we accidentally make a breaking change.

Keeping Build Definitions with Code

We can store the build definition alongside the actual code that it builds, meaning that we are keeping everything tidy and in one place. Until now, if we wanted to fully understand the way the application was built, we’d have to look at the code repository as well as the VSTS build definition. This made the overall process harder to understand and reason about. In my opinion, the fewer places we have to remember to check or update during changes, the better.

Branching

Related to the last point, we can also make use of important features of our source control system, like branching. If your team uses GitHub Flow or a similar Git-based branching strategy, then this is particularly advantageous.

Let’s take the example of adding a new unit test project to a .NET Core application. Until now, the way you might do this is to set up a feature branch on which you develop the new unit test project. At some point, you’ll want to add the execution of these tests to your build definition. This would require some careful timing and consideration. If you update the build definition before your branch is merged, then any builds you run in the meantime will likely fail – they’ll be trying to run a unit test project that doesn’t exist outside of your feature branch yet. Alternatively, you can use a draft version of your build configuration, but then you need to plan exactly when to publish that draft.

If we specify our build definition in a YAML file, then this change to the build process is simply another change that happens on our feature branch. The feature branch’s YAML file will contain a new step to run the unit tests, but the master branch will not yet have that step defined in its YAML file. When the build configuration runs on the master branch before we merge, it will get the old version of the build definition and will not try to run our new unit tests. But any builds on our feature branch, or the master branch once our feature branch is merged, will include the new tests.

This is very powerful, especially in the early stages of a project where you are adding and changing build steps frequently.

Peer Review

Taking the above point even further, we can review a change to a build definition YAML file in the same way that we would review any other code changes. If we use pull requests (PRs) to merge our changes back into the master branch then we can see the change in the build definition right within the PR, and our team members can give them the same rigorous level of review as they do our other code files. Similarly, they can make sure that changes in the application code are reflected in the build definition, and vice versa.

Reuse

Another advantage of storing build definitions in a simple format like YAML is being able to copy and paste the files, or sections from the files, and reuse them in other projects. Until now, copying a step from one build definition to another was very difficult, and required manually setting up the steps. Now that we can define our build steps in YAML files, it’s often simply a matter of copying and pasting.

Linting

A common technique in many programming environments is linting, which involves running some sort of analysis over a file to check for potential errors or policy violations. Once our build definition is defined in YAML, we could perform this same type of analysis if we wanted to write a tool to do it. For example, we might write a custom linter to check that we haven’t accidentally added any credentials or connection strings into our build definition, or that we haven’t mistakenly used the wrong variable syntax. YAML is a standard format and is easily parsable across many different platforms, so writing a simple linter to check for your particular policies is not a complex endeavour.

Abstraction and Declarative Instruction

I’m a big fan of declarative programming – expressing your intent to the computer. In declarative programming, software figures out the right way to proceed based on your high-level instructions. This is the idea behind many different types of ‘desired-state’ automation, and in abstractions like LINQ in C#. This can be contrasted with an imperative approach, where you specify an explicit sequence of steps to achieve that intent.

One approach that I’ve seen some teams adopt in their build servers is to use PowerShell or Bash scripts to do the entire build. This provided the benefits I outlined above, since those script files could be checked into source control. However, by dropping down to raw script files, this meant that these teams couldn’t take advantage of all of the built-in build steps that VSTS provides, or the ecosystem of custom steps that can be added to the VSTS instance.

Build definitions in VSTS are a great example of a declarative approach. They are essentially an abstraction of a sequence of steps. If you design your build process well, it should be possible for a newcomer to look at your build definition and determine what the sequence of steps are, why they are happening, and what the side-effects and outcomes will be. By using abstractions like VSTS build tasks, rather than hand-writing imperative build logic as a sequence of command-line steps, you are helping to increase the readability of your code – and ultimately, you may increase the performance and quality by allowing the software to translate your instructions into actions.

YAML build definitions give us all of the benefits of keeping our build definitions in source control, while still allowing us to make use of the full power of the VSTS build system.

Inline Documentation

YAML files allow for comments to be added, and this is a very helpful way to document your build process. Frequently, build definitions can get quite complex, with multiple steps required to do something that appears rather trivial, so having the ability to document the process right inside the definition itself is very helpful.

Summary

Hopefully through this post, I’ve convinced you that storing your build definition in a YAML file is a much tidier and saner approach than using the VSTS web UI to define how your application is built. In the next post, I’ll walk through an example of how I set up a simple build definition in YAML, and provide some tips that I found useful along the way.

Monitoring Azure Storage Queues with Application Insights and Azure Monitor

Azure Queues provides an easy queuing system for cloud-based applications. Queues allow for loose coupling between application components, and applications that use queues can take advantage of features like peek-locking and multiple retry attempts to enable application resiliency and high availability. Additionally, when Azure Queues are used with Azure Functions or Azure WebJobs, the built-in poison queue support allows for messages that repeatedly fail processing attempts to be moved to a dedicated queue for later inspection.

An important part of operating a queue-based application is monitoring the length of queues. This can tell you whether the back-end parts of the application are responding, whether they are keeping up with the amount of work they are being given, and whether there are messages that are causing problems. Most applications will have messages being added to and removed from queues as part of their regular operation. Over time, an operations team will begin to understand the normal range for each queue’s length. When a queue goes out of this range, it’s important to be alerted so that corrective action can be taken.

Azure Queues don’t have a built-in queue length monitoring system. Azure Application Insights allows for the collection of large volumes of data from an application, but it does not support monitoring queue lengths with its built-in functionality. In this post, we will create a serverless integration between Azure Queues and Application Insights using an Azure Function. This will allow us to use Application Insights to monitor queue lengths and set up Azure Monitor alert emails if the queue length exceeds a given threshold.

Solution Architecture

There are several ways that Application Insights could be integrated with Azure Queues. In this post we will use Azure Functions. Azure Functions is a serverless platform, allowing for blocks of code to be executed on demand or at regular intervals. We will write an Azure Function to poll the length of a set of queues, and publish these values to Application Insights. Then we will use Application Insights’ built-in analytics and alerting tools to monitor the queue lengths.

Base Application Setup

For this sample, we will use the Azure Portal to create the resources we need. You don’t even need Visual Studio to follow along. I will assume some basic familiarity with Azure.

First, we’ll need an Azure Storage account for our queues. In our sample application, we already have a storage account with two queues to monitor:

  • processorders: this is a queue that an API publishes to, and a back-end WebJob reads from the queue and processes its items. The queue contains orders that need to be processed.
  • processorders-poison: this is a queue that WebJobs has created automatically. Any messages that cannot be processed by the WebJob (by default after five attempts) will be moved into this queue for manual handling.

Next, we will create an Azure Functions app. When we create this through the Azure Portal, the portal helpfully asks if we want to create an Azure Storage account to store diagnostic logs and other metadata. We will choose to use our existing storage account, but if you prefer, you can have a different storage account than the one your queues are in. Additionally, the portal offers to create an Application Insights account. We will accept this, but you can create it separately later if you want.

Once all of these components have been deployed, we are ready to write our function.

Azure Function

Now we can write an Azure Function to connect to the queues and check their length.

Open the Azure Functions account and click the + button next to the Functions menu. Select a Timer trigger. We will use C# for this example. Click the Create this function button.

By default, the function will run every five minutes. That might be sufficient for many applications. If you need to run the function on a different frequency, you can edit the schedule element in the function.json file and specify a cron expression.
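
For example, a six-field cron expression like the following (seconds, minutes, hours, day of month, month, day of week) would run the function once every minute:

"schedule": "0 * * * * *"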

Next, paste the following code over the top of the existing function:
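
A sketch of this code is below – the queue names are hard-coded for simplicity, so adjust them to match your own queues:

#r "System.Configuration"
#r "Microsoft.WindowsAzure.Storage"

using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;

public static void Run(TimerInfo myTimer, TraceWriter log)
{
    log.Info($"C# Timer trigger function executed at: {DateTime.Now}");

    // The queues to monitor - adjust these names for your own application.
    var queueNames = new[] { "processorders", "processorders-poison" };

    // Connect to the storage account used by the function app.
    var connectionString = System.Configuration.ConfigurationManager.AppSettings["AzureWebJobsStorage"];
    var storageAccount = CloudStorageAccount.Parse(connectionString);
    var queueClient = storageAccount.CreateCloudQueueClient();

    foreach (var queueName in queueNames)
    {
        // Retrieve the approximate number of messages on the queue.
        var queue = queueClient.GetQueueReference(queueName);
        queue.FetchAttributes();
        var length = queue.ApproximateMessageCount;

        log.Info($"{queueName}: {length}");
    }
}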

This code connects to an Azure Storage account and retrieves the length of each queue specified. The key parts here are:

var connectionString = System.Configuration.ConfigurationManager.AppSettings["AzureWebJobsStorage"];

Azure Functions has an application setting called AzureWebJobsStorage. By default this refers to the storage account created when we provisioned the functions app. If you wanted to monitor a queue in another account, you could reference the storage account connection string here.

var queue = queueClient.GetQueueReference(queueName);
queue.FetchAttributes();
var length = queue.ApproximateMessageCount;

When you obtain a reference to a queue, you must explicitly fetch the queue attributes in order to read the ApproximateMessageCount. As the name suggests, this count may not be completely accurate, especially in situations where messages are being added and removed at a high rate. For our purposes, an approximate message count is enough for us to monitor.

log.Info($"{queueName}: {length}");

For now, this line will let us view the length of the queues within the Azure Functions log window. Later, we will switch this out to log to Application Insights instead.

Click Save and run. You should see something like the following appear in the log output window below the code editor:

2017-09-07T00:35:00.028 Function started (Id=57547b15-4c3e-42e7-a1de-1240fdf57b36)
2017-09-07T00:35:00.028 C# Timer trigger function executed at: 9/7/2017 12:35:00 AM
2017-09-07T00:35:00.028 processorders: 1
2017-09-07T00:35:00.028 processorders-poison: 0
2017-09-07T00:35:00.028 Function completed (Success, Id=57547b15-4c3e-42e7-a1de-1240fdf57b36, Duration=9ms)

Now we have our function polling the queue lengths. The next step is to publish these into Application Insights.

Integrating into Azure Functions

Azure Functions has built-in integration with Application Insights for logging each function execution. In this case, however, we want to save our own custom metrics, which is not currently supported by the built-in integration. Thankfully, integrating the full Application Insights SDK into our function is very easy.

First, we need to add a project.json file to our function app. To do this, click the View files tab on the right pane of the function app. Then click the + Add button, and name your new file project.json. Paste in the following:
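
A minimal version of the file looks like this – the package version shown is just an example, so use the current stable release of the Microsoft.ApplicationInsights package:

{
  "frameworks": {
    "net46": {
      "dependencies": {
        "Microsoft.ApplicationInsights": "2.4.0"
      }
    }
  }
}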

This adds a NuGet reference to the Microsoft.ApplicationInsights package, which allows us to use the full SDK from our function.

Next, we need to update our function so that it writes the queue length to Application Insights. Click on the run.csx file on the right-hand pane, and replace the current function code with the following:
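
Again as a sketch, with the same hard-coded queue names as before, the updated function looks something like this:

#r "System.Configuration"
#r "Microsoft.WindowsAzure.Storage"

using System;
using Microsoft.ApplicationInsights;
using Microsoft.ApplicationInsights.Extensibility;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;

// Set up a single TelemetryClient, using the instrumentation key configured for the function app.
private static string key = TelemetryConfiguration.Active.InstrumentationKey = System.Environment.GetEnvironmentVariable("APPINSIGHTS_INSTRUMENTATIONKEY", EnvironmentVariableTarget.Process);
private static TelemetryClient telemetry = new TelemetryClient() { InstrumentationKey = key };

public static void Run(TimerInfo myTimer, TraceWriter log)
{
    // The queues to monitor - adjust these names for your own application.
    var queueNames = new[] { "processorders", "processorders-poison" };

    var connectionString = System.Configuration.ConfigurationManager.AppSettings["AzureWebJobsStorage"];
    var storageAccount = CloudStorageAccount.Parse(connectionString);
    var queueClient = storageAccount.CreateCloudQueueClient();

    foreach (var queueName in queueNames)
    {
        var queue = queueClient.GetQueueReference(queueName);
        queue.FetchAttributes();
        var length = queue.ApproximateMessageCount;

        log.Info($"{queueName}: {length}");

        // Publish the queue length into Application Insights as a custom metric.
        telemetry.TrackMetric($"Queue length - {queueName}", (double)length);
    }
}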

The key new parts that we have just added are:

private static string key = TelemetryConfiguration.Active.InstrumentationKey = System.Environment.GetEnvironmentVariable("APPINSIGHTS_INSTRUMENTATIONKEY", EnvironmentVariableTarget.Process);
private static TelemetryClient telemetry = new TelemetryClient() { InstrumentationKey = key };

This sets up an Application Insights TelemetryClient instance that we can use to publish our metrics. Note that in order for Application Insights to route the metrics to the right place, it needs an instrumentation key. Azure Functions’ built-in integration with Application Insights means that we can simply reference the instrumentation key it has set up for us. If you did not set up Application Insights at the same time as the function app, you can configure this separately and set your instrumentation key here.

Now we can publish our custom metrics into Application Insights. Note that Application Insights has several different types of custom diagnostics that can be tracked. In this case, we use metrics since they provide the ability to track numerical values over time, and set up alerts as appropriate. We have added the following line to our foreach loop, which publishes the queue length into Application Insights.

telemetry.TrackMetric($"Queue length - {queueName}", (double)length);

Click Save and run. Once the function executes successfully, wait a few minutes before continuing with the next step – Application Insights takes a little time (usually less than five minutes) to ingest new data.

Exploring Metrics in Application Insights

In the Azure Portal, open your Application Insights account and click Metrics Explorer in the menu. Then click the + Add chart button, and expand the Custom metric collection. You should see the new queue length metrics listed.

Select the metrics, and as you do, notice that a new chart is added to the main panel showing the queue length count. In our case, we can see the processorders queue count fluctuate between one and five messages, while the processorders-poison queue stays empty. Set the Aggregation property to Max to better see how the queue fluctuates.

You may also want to click the Time range button in the main panel, and set it to Last 30 minutes, to fully see how the queue length changes during your testing.

Setting up Alerts

Now that we can see our metrics appearing in Application Insights, we can set up alerts to notify us whenever a queue exceeds some length we specify. For our processorders queue, this might be 10 messages – we know from our operations history that if we have more than ten messages waiting to be processed, our WebJob isn’t processing them fast enough. For our processorders-poison queue, though, we never want to have any messages appear in this queue, so we can have an alert if more than zero messages are on the queue. Your thresholds may differ for your application, depending on your requirements.

Click the Alert rules button in the main panel. Azure Monitor opens. Click the + Add metric alert button. (You may need to select the Application Insights account in the Resource drop-down list if this is disabled.)

On the Add rule pane, set the values as follows:

  • Name: Use a descriptive name here, such as `QueueProcessOrdersLength`.
  • Metric: Select the appropriate `Queue length – queuename` metric.
  • Condition: Set this to the value and time period you require. In our case, I have set the rule to `Greater than 10 over the last 5 minutes`.
  • Notify via: Specify how you want to be notified of the alert. Azure Monitor can send emails, call a webhook URL, or even start a Logic App. In our case, I have opted to receive an email.

Click OK to save the rule.

If the queue count exceeds your specified limit, you will receive an email shortly afterwards with details:

Summary

Monitoring the length of your queues is a critical part of operating a modern application, and getting alerted when a queue is becoming excessively long can help to identify application failures early, and thereby avoid downtime and SLA violations. Even though Azure Storage doesn’t provide a built-in monitoring mechanism, one can easily be created using Azure Functions, Application Insights, and Azure Monitor.