Cosmos DB is a fantastic database service for many different types of applications. But it can also be quite expensive, especially if you have a number of instances of your database to maintain. For example, in some enterprise development teams you may need to have dev, test, UAT, staging, and production instances of your application and its components. Assuming you’re following best practices and keeping these isolated from each other, that means you’re running at least five Cosmos DB collections. It’s easy for someone to accidentally leave one of these Cosmos DB instances provisioned at a higher throughput than you expect, and before long you’re racking up large bills, especially if the higher throughput is left overnight or over a weekend.

In this post I’ll describe an approach I’ve been using recently to ensure the Cosmos DB collections in my subscriptions aren’t causing costs to escalate. I’ve created an Azure Function that will run on a regular basis. It uses a managed service identity to identify the Cosmos DB accounts throughout my whole Azure subscription, and then it looks at each collection in each account to check that they are set at the expected throughput. If it finds anything over-provisioned, it sends an email so that I can investigate what’s happening. You can run the same function to help you identify over-provisioned collections too.

Step 1: Create Function App

First, we need to set up an Azure Functions app. You can do this in many different ways; for simplicity, we’ll use the Azure Portal for everything here.

Click Create a Resource on the left pane of the portal, and then choose Serverless Function App. Enter the information it prompts for – a globally unique function app name, a subscription, a region, and a resource group – and click Create.

Screen Shot 2018-07-23 at 9.07.43 pm

Step 2: Enable a Managed Service Identity

Once we have our function app ready, we need to give it a managed service identity. This will allow us to connect to our Azure subscription and list the Cosmos DB accounts within it, but without us having to maintain any keys or secrets. For more information on managed service identities, check out my previous post.

Open up the Function Apps blade in the portal, open your app, and click Platform Features, then Managed service identity:

Screen Shot 2018-07-23 at 9.09.51 pm
Switch the feature to On and click Save.

Step 3: Create Authorisation Rules

Now we have an identity for our function, we need to grant it access to the parts of our Azure subscription we want it to examine for us. In my case I’ll grant it the rights over my whole subscription, but you could just give it rights on a single resource group, or even just a single Cosmos DB account. Equally you can give it access across multiple subscriptions and it will look through them all.

Open up the Subscriptions blade and choose the subscription you want it to look over. Click Access Control (IAM):

Screen Shot 2018-07-23 at 9.13.33 pm copy

Click the Add button to create a new role assignment.

The minimum role we need to grant the function app is called Cosmos DB Account Reader Role. This allows the function to discover the Cosmos DB accounts, and to retrieve the read-only keys for those accounts, as described here. The function app can’t use this role to make any changes to the accounts.

Finally, enter the name of your function app, click it, and click Save:

Screen Shot 2018-07-23 at 9.14.46 pm
This will create the role assignment. Your function app is now authorised to enumerate and access Cosmos DB accounts throughout the subscription.

Step 4: Add the Function

Next, we can actually create our function. Go back into the function app and click the + button next to Functions. We’ll choose to create a custom function:

Screen Shot 2018-07-23 at 9.34.58 pm

Then choose a timer trigger:

Screen Shot 2018-07-23 at 9.35.27 pm

Choose C# for the language, and enter the name CosmosChecker. (Feel free to use a name with more panache if you want.) Leave the timer settings alone for now:

Screen Shot 2018-07-23 at 9.36.46 pm

Your function will open up with some placeholder code. We’ll ignore this for now. Click the View files button on the right side of the page, and then click the Add button. Create a file named project.json, and then open it and paste in the following, then click Save:

This will add the necessary package references that we need to find and access our Cosmos DB collections, and then to send alert emails using SendGrid.

Now click on the run.csx file and paste in the following file:

I won’t go through the entire script here, but I have added comments to try to make its purpose a little clearer.

Finally, click on the function.json file and replace the contents with the following:

This will configure the function app with the necessary timer, as well as an output binding to send an email. We’ll discuss most of these settings later, but one important setting to note is the schedule setting. The value I’ve got above means the function will run every hour. You can change it to other values using CRON expressions, such as:

  • Run every day at 9.30am UTC: 0 30 9 * * *
  • Run every four hours: 0 0 */4 * * *
  • Run once a week: 0 0 * * 0

You can decide how frequently you want this to run and replace the schedule with the appropriate value from above.

Step 5: Get a SendGrid Account

We’re using SendGrid to send email alerts. SendGrid has built-in integration with Azure Functions so it’s a good choice, although you’re obviously welcome to switch out for anything else if you’d prefer. You might want an SMS message to be sent via Twilio, or a message to be sent to Slack via the Slack webhook API, for example.

If you don’t already have a SendGrid account you can sign up for a free account on their website. Once you’ve got your account, you’ll need to create an API key and have it ready for the next step.

Step 6: Configure Function App Settings

Click on your function app name and then click on Application settings:

Screen Shot 2018-07-23 at 9.39.34 pm

Scroll down to the Application settings section. We’ll need to enter three settings here:

  1. Setting name: SendGridKey. This should have a value of your SendGrid API key from step 5.
  2. Setting name: AlertToAddress. This should be the email address that you want alerts to be sent to.
  3. Setting name: AlertFromAddress. This should be the email address that you want alerts to be sent from. This can be the same as the ‘to’ address if you want.

Your Application settings section should look something like this:
Screen Shot 2018-07-23 at 9.31.26 pm

Step 7: Run the Function

Now we can run the function! Click on the function name again (CosmosChecker), and then click the Run button. You can expand out the Logs pane at the bottom of the screen if you want to watch it run:

Screen Shot 2018-07-23 at 9.42.36 pm

Depending on how many Cosmos DB accounts and collections you have, it may take a minute or two to complete.

If you’ve got any collections provisioned over 2000 RU/s, you should receive an email telling you this fact:

Screen Shot 2018-07-23 at 9.48.37 pm.png

Configuring Alert Policies

By default, the function is configured to alert whenever it sees a Cosmos DB collection provisioned over 2000 RU/s. However, your situation may be quite different to mine. For example, you may want to be alerted whenever you have any collections provisioned over 1000 RU/s. Or, you may have production applications that should be provisioned up to 100,000 RU/s, but you only want development and test collections provisioned at 2000 RU/s.

You can configure alert policies in two ways.

First, if you have a specific collection that should have a specific policy applied to it – like the production collection I mentioned that should be allowed to go to 100,000 RU/s – then you can create another application setting. Give it the name MaximumThroughput:{account_name}:{database_name}:{collection_name}, and set the value to the limit you want for that collection.

For example, a collection named customers in a database named customerdb in an account named myaccount-prod would have a setting named MaximumThroughput:myaccount-prod:customerdb:customers. The value would be 100000, assuming you wanted the function to check this collection against a limit of 100,000 RU/s.

Second, by default the function has a default quota of 2000 RU/s. You can adjust this to whatever value you want by altering the value on line 17 of the function code file (run.csx).

ARM Template

If you want to deploy this function for yourself, you can also use an ARM template I have prepared. This performs all the steps listed above except step 3, which you still need to do manually.
 

Of course, you are also welcome to adjust the actual logic involved in checking the accounts and collections to suit your own needs. The full code is available on GitHub and you are welcome to take and modify it as much as you like! I hope this helps to avoid some nasty bill shocks.

Category:
Azure Platform
Tags:
, , ,

Join the conversation! 1 Comment

  1. Yeah, welcome back the John Downs blog

Comments are closed.