Cosmos DB Server-Side Programming with TypeScript – Part 3: Stored Procedures

Stored procedures, the second type of server-side code that can run within Cosmos DB, provide the ability to execute blocks of functionality from inside the database engine. Typically we use stored procedures for discrete tasks that can be encapsulated within a single invocation. In this post, we will discuss some situations where stored procedures can be used and the actions and queries that they can perform. We’ll then start to work through the server-side API model, and look at how we can work with the incoming stored procedure invocation’s request and response as well as the Cosmos DB collection itself. Then we’ll build a simple stored procedure using TypeScript.

This post is part of a series of posts about server-side programming for Cosmos DB:

  • Part 1 gives an overview of the server side programmability model, the reasons why you might want to consider server-side code in Cosmos DB, and some key things to watch out for.
  • Part 2 deals with user-defined functions, the simplest type of server-side programming, which allow for adding simple computation to queries.
  • Part 3 (this post) talks about stored procedures. These provide a lot of powerful features for creating, modifying, deleting, and querying across documents – including in a transactional way.
  • Part 4 (coming soon) introduces triggers. Triggers come in two types – pre-triggers and post-triggers – and allow for behaviour like validating and modifying documents as they are inserted or updated, and creating secondary effects as a result of changes to documents in a collection.
  • Part 5 (coming soon) discusses unit testing your server-side scripts. Unit testing is a key part of building a production-grade application, and even though some of your code runs inside Cosmos DB, your business logic can still be tested.
  • Finally, part 6 (coming soon) explains how server-side scripts can be built and deployed into a Cosmos DB collection within an automated build and release pipeline, using Microsoft Visual Studio Team Services (VSTS).

Stored Procedures

Stored procedures let us encapsulate blocks of functionality, and then later invoke them with a single call. As with the other server-side code types, the code inside the stored procedure runs inside the Cosmos DB database engine. Within a stored procedure we can perform a range of different actions, including querying documents as well as creating, updating, and deleting documents. These actions are done within the collection that the stored procedure is installed in. Of course, we can augment these collection-based actions with custom logic that we write ourselves in JavaScript or – as is the case in this series – TypeScript.

Stored procedures are extremely versatile, and can be used for a number of different tasks including:

  • Encapsulating a complex set of queries and executing them as one logical operation – we will work with this example below.
  • Retrieving data for a report using a complex set of queries, and combining the results into a single response that can be bound to a report UI.
  • Generating mock data and inserting it into the collection.
  • Doing a batch insert, update, upsert, or delete of multiple documents, taking advantage of the transactional processing of stored procedures.

Of course, if you are building a simple application without the need for complex queries, you may be able to achieve everything you need with just the Cosmos DB client-side SDKs. However, stored procedures do give us some power and flexibility that is not possible with purely client-based development, including transaction semantics.

Transactions and Error Handling

Cosmos DB’s client programming model does not provide for transactional consistency. However, stored procedures do run within an implicit transaction. This means that if you modify multiple documents within a stored procedure, then either all of those changes will be saved or – in the event of an error – none of them will be saved. Transactions provide four guarantees (atomicity, consistency, isolation, and durability, also known as ACID). More information on Cosmos DB transactions is available here.

The Cosmos DB engine handles committing and rolling back transactions automatically. If a stored procedure completes without any errors being thrown then the transaction will be committed. However, if even one unhandled error is thrown, the transaction will be rolled back and no changes will be made.

Working with the Context

Unlike user-defined functions, which we discussed in part 1 of this series, stored procedures allow us to access and make changes to the collection that they run within. We can also return results from stored procedures. Both of these types of actions require that we work with the context object.

Within a stored procedure, we can make a call to the getContext() function. This returns back an object with three functions.

  • getContext().getRequest() is used to access the request details. This is mostly helpful for triggers, and we will discuss this in part 4 of this series.
  • getContext().getResponse() lets us set the response that we should send back. For stored procedures, this is the way that we will return data back to the client if the stored procedure has something to return.
  • getContext().getCollection() gives us access to the Cosmos DB collection that the stored procedure runs within. In turn, this will let us read and write documents.

The getContext() call and the three functions above correspond to four types – Context, Request, Response, and Collection, respectively. Each of these types, in turn, provides a set of functions for interacting with the object. For example, getContext().getCollection().queryDocuments() lets us run a query against documents in the collection, and getContext().getResponse().setBody() lets us specify the output that we want the stored procedure to return. We’ll see more of these functions as we go through this series.

Also note that the double-underscore variable (__) is automatically mapped to the result of getContext().getCollection(). In this series we won’t use this shortcut because I want to be more explicit, especially to help with testing when we get to part 5.
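To make this concrete, here’s a minimal sketch of a stored procedure that uses the context to return a simple response to the caller:

function helloWorld(): void {
    const response = getContext().getResponse();

    // Return a simple object to the caller of the stored procedure.
    response.setBody({ message: 'Hello from inside Cosmos DB!' });
}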

Type Definitions

Human-readable documentation for the types and their members is provided by Microsoft here. Of course, one of the main reasons we’re using TypeScript in this series is so that we get type checking and strong typing against the Cosmos DB object model, so a human-readable description isn’t really sufficient. In TypeScript, this is done through type definitions – descriptions of types and their members that TypeScript can use to power its type system.

While Microsoft doesn’t provide first-party TypeScript type definitions for Cosmos DB, an open-source project named DefinitelyTyped provides and publishes these definitions here. (Note that the type definitions use the old name for Cosmos DB – DocumentDB – but they are still valid and being updated.)

Queries

One of the main things we’ll frequently do from inside stored procedures is execute queries against Cosmos DB. This is how we can retrieve data and perform custom logic on it within the stored procedure. Cosmos DB provides an integrated JavaScript query syntax for executing queries. The syntax is documented here, and lets us write queries like the following sketch, which filters to our hypothetical order documents and projects out their customer IDs:
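const collection = getContext().getCollection();

const accepted = collection.chain()
    .filter(function (doc: any) { return doc.type === 'order'; })        // becomes the WHERE clause
    .map(function (doc: any) { return { customerId: doc.customerId }; }) // becomes the SELECT clause
    .value({}, function (error: IFeedCallbackError, results: any[]) {
        if (error) {
            throw new Error('Error running query: ' + error.body);
        }

        getContext().getResponse().setBody(results);
    });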


which will map to the following SQL query:
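SELECT doc.customerId
FROM docs doc
WHERE doc.type = "order"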


However, there are some limitations to this query syntax. We can’t perform aggregations, and we can’t do queries using user-defined functions. These limitations may be lifted in future, but for now, in this series we will use the SQL syntax instead so that we can get the full power of Cosmos DB’s query engine. Here’s a sketch of how we might run a SQL-based query from our TypeScript code – the ICollection and IFeedCallbackError types below come from the type definitions we discussed above:
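function getOrders(collection: ICollection): void {
    const accepted = collection.queryDocuments(
        collection.getSelfLink(),
        'SELECT * FROM c WHERE c.type = "order"',
        {},
        (error: IFeedCallbackError, documents: any[]) => {
            if (error) {
                throw new Error('Error running query: ' + error.body);
            }

            // Do something with the query results here.
            getContext().getResponse().setBody(documents);
        });

    if (!accepted) {
        throw new Error('The query was not accepted by the server.');
    }
}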


In your own stored procedures and triggers you can decide which approach – integrated JavaScript or SQL – makes more sense.

We can also read a single document by its ID if we want to. Here’s a sketch of how this might look, building the document’s link from the collection’s link:
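function getOrderById(collection: ICollection, orderId: string): void {
    // Document links take the form <collection link>/docs/<document id>.
    const documentLink = collection.getAltLink() + '/docs/' + orderId;

    const accepted = collection.readDocument(
        documentLink,
        {},
        (error, document) => {
            if (error) {
                throw new Error('Error reading document: ' + error.body);
            }

            getContext().getResponse().setBody(document);
        });

    if (!accepted) {
        throw new Error('The read was not accepted by the server.');
    }
}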


Another important consideration when executing queries is that Cosmos DB limits the amount of time that a stored procedure can run for. We can test whether our stored procedure is approaching that time limit by inspecting the return value we receive when we submit the query. If we receive a false response, it means the query wasn’t accepted – and that it’s probably time to wrap up our logic before we get forcibly shut off. In some stored procedures, receiving a false like this may mean that you simply throw an error and consider the whole stored procedure execution to have failed.

Parameterised Queries

In the last section, we discussed using the collection.queryDocuments function to execute a SQL query against the collection. While this technically works, once we start including parameters in our queries then we’d need to concatenate them into the query string. This is a very, very bad idea – it opens us up to a class of security vulnerabilities called SQL injection attacks.

When we’re writing SQL queries with parameters, we should instead use the overload of the collection.queryDocuments function that accepts an IParameterizedQuery instead of a string. By doing this, we pass our parameters explicitly and ensure that they are handled and escaped appropriately by the database engine. Here’s a sketch of how we can do this from our TypeScript code, finding the orders placed by a given customer:
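function getOrdersForCustomer(collection: ICollection, customerId: string): void {
    // Pass the customer ID as an explicit query parameter rather than
    // concatenating it into the query string.
    const query: IParameterizedQuery = {
        query: 'SELECT * FROM c WHERE c.type = "order" AND c.customerId = @customerId',
        parameters: [
            { name: '@customerId', value: customerId }
        ]
    };

    const accepted = collection.queryDocuments(
        collection.getSelfLink(),
        query,
        {},
        (error: IFeedCallbackError, documents: any[]) => {
            if (error) {
                throw new Error('Error running query: ' + error.body);
            }

            getContext().getResponse().setBody(documents);
        });

    if (!accepted) {
        throw new Error('The query was not accepted by the server.');
    }
}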


Updating Documents

We can also make changes to documents within the collection. There are several functions on the Collection type to help with this, including:

  • createDocument inserts a new document into the collection.
  • replaceDocument updates an existing document in the collection. You must provide the document link to use this function.
  • deleteDocument deletes a document from the collection.
  • upsertDocument inserts a document if it doesn’t already exist, or updates the existing document if it does.

There are also functions to deal with attachments to documents, but we won’t work with those in this series.

The functions that work with existing documents also take an optional parameter to specify the etag of the document. This allows us to take advantage of optimistic concurrency. Optimistic concurrency is very useful, but is outside the scope of this series.
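As a brief sketch, here’s how we might insert a new document from within a stored procedure:

function insertOrder(collection: ICollection, order: Object): void {
    const accepted = collection.createDocument(
        collection.getSelfLink(),
        order,
        {},
        (error, resource) => {
            if (error) {
                throw new Error('Error creating document: ' + error.body);
            }

            // resource contains the newly created document, including its system properties.
            getContext().getResponse().setBody(resource);
        });

    if (!accepted) {
        throw new Error('The create was not accepted by the server.');
    }
}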

Structuring Your Stored Procedure Code

Stored procedures will often become more complex than UDFs, and may incorporate business logic as well as interaction with the Cosmos DB collection inside the code. When we’re writing a production-grade application it is important that we structure our code correctly so that each part is testable, and has a clearly defined responsibility and interactions with other components. In Cosmos DB, we are interacting with the collection in order to execute queries and update documents, and these side effects are important to test as well. We’ll discuss testing in more detail in part 5, but even before we worry about testing, it’s a good idea to structure our code properly.

When I write Cosmos DB stored procedures, I find it’s helpful to have a simple ‘entry point’ function. This entry point does the interaction with the getContext() function and retrieves the Collection, Request, and Response objects as required. These, along with any other parameters, are then passed into an internal implementation function, which in turn may invoke other functions to do other parts of the logic. By structuring the functions in this way we can ensure that each function has a clear purpose, and that the external components can be mocked and/or spied upon during our tests.

Writing our stored procedure in TypeScript also gives us the ability to store our functions in different .ts files if we want. This is helpful when we have long and complicated stored procedures, or if we want to keep interfaces and functions in separate files. This is largely a choice of style, since TypeScript’s compiler will simply combine all of the functions together at compilation time and will emit a single JavaScript output file (because we have set the outFile property in our tsconfig.json file). One important note on this though – if you have functions spread across multiple files, it is important to pay attention to the order in which the functions get emitted. The stored procedure’s ‘entry point’ function must appear first in the output JavaScript file. TypeScript can be instructed to do this by explicitly listing the entry point function’s file first in the include directive within the tsconfig.json file, and then having a wildcard * to catch the remaining files, like this sketch:
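{
    "compilerOptions": {
        "target": "es2015",
        "outFile": "output/sp-getGroupedOrders.js"
    },
    "include": [
        "src/getGroupedOrders.ts",
        "src/*"
    ]
}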


Calling Stored Procedures

Once a stored procedure is written, we need to call it in order to check that it’s working, and then to use it in our real applications. There are several ways we can call our stored procedure and pass in the arguments it expects.

  • The Azure Portal provides a test interface for invoking stored procedures. This is how we will test the stored procedure we write below.
  • The client SDKs provide platform-specific features for invoking stored procedures. For example, the .NET client library for Cosmos DB provides the DocumentClient.ExecuteStoredProcedureAsync function, which accepts the ID of a stored procedure and any arguments that it might be expecting.
  • The Cosmos DB REST API also allows for invoking stored procedures directly.

Again, the exact way you call stored procedures may depend on the Cosmos DB API you are targeting – in this series we are using the SQL API, and the invocation mechanisms for MongoDB, Gremlin, and the other APIs may be different.

Now that we have talked about the different aspects of writing stored procedures, we can think about how we might use a stored procedure in our sample scenario.

Defining our Stored Procedure

In our hypothetical ordering application, we have a Cosmos DB collection of documents representing orders. Each order document contains a customer’s ID (as we saw in part 2 of this series), and also contains a set of items that the customer ordered. An order item consists of the product ID the customer ordered and the quantity of that product.

Because Cosmos DB is a schemaless database, our order documents may coexist with many other documents of different types. Therefore, we also include a type property on the document to indicate that it is an order. This type discriminator pattern is quite common in schemaless databases.

An example of an order document in our collection looks something like this (we’ll assume these field names throughout the post):
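{
    "id": "order-1",
    "type": "order",
    "customerId": "C1",
    "orderedItems": [
        { "productId": "P1", "quantity": 1 },
        { "productId": "P2", "quantity": 3 }
    ]
}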

For this stored procedure we want to pass in a set of product IDs, and get back a grouped list of IDs for customers who have ordered any of those products. This is similar to doing a GROUP BY in a relational database – but currently Cosmos DB doesn’t provide this sort of grouping feature, and so we are using a stored procedure to fill in the gap. Doing this from the client SDK would require multiple queries against the collection, but by using a stored procedure we can just make one call.

At a high level, our stored procedure logic looks like this:

  1. Accept a list of product IDs as an argument.
  2. Iterate through the product IDs.
  3. For each product ID, run a query to retrieve the customer IDs for the customers that have ordered that product, filtering to only query on order documents. The query we’ll run looks something like this, using the document schema above:
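SELECT VALUE o.customerId
FROM o
JOIN i IN o.orderedItems
WHERE o.type = "order" AND i.productId = @productId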


  4. Once we have all of the customer IDs for each product in our list, create a JSON object to represent the results, something like this:
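[
    {
        "productId": "P1",
        "customerIds": [ "C1", "C2" ]
    },
    {
        "productId": "P2",
        "customerIds": [ "C1" ]
    }
]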

Preparing a Folder

Now we can start writing our stored procedure. If you want to compare against my completed stored procedure, you can access it on GitHub.

In part 2 of this series we covered how to set up the Cosmos DB account and collection, so I won’t go through that again. We also will reuse the same folder structure as we did in part 2, so you can refer to that post if you’re following along.

There’s one major difference this time though. In our package.json file, we need to add a second entry into the devDependencies list to tell NPM that we want to include the TypeScript type definitions for Cosmos DB. Our package.json file will look something like this (the package name and version numbers are illustrative):
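{
    "name": "cosmos-server-side-scripts",
    "version": "1.0.0",
    "devDependencies": {
        "@types/documentdb-server": "^1.0.1",
        "typescript": "^2.6.2"
    },
    "scripts": {
        "build": "tsc"
    }
}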


Open a command prompt or terminal window, and run npm install within this folder. This will download and install TypeScript and the Cosmos DB type definitions.

We’ll also adjust the tsconfig.json file to emit a file with the name sp-getGroupedOrders.js. It will look something like this:
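{
    "compilerOptions": {
        "target": "es2015",
        "outFile": "output/sp-getGroupedOrders.js"
    },
    "include": [
        "src/*"
    ]
}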

Writing the Stored Procedure

Now let’s create a file named src/getGroupedOrders.ts. This is going to contain our stored procedure code. Let’s start with adding a basic function that will act as our entry point – something like this:
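function getGroupedOrders(productIds: string[]): void {
    const context = getContext();
    const collection = context.getCollection();

    const groupedOrders = getGroupedOrdersImpl(productIds, collection);

    context.getResponse().setBody(groupedOrders);
}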

As discussed above, this is a pattern I try to follow when I write Cosmos DB server-side code. It helps to keep the interaction with the getContext() function isolated to this one place, and then all subsequent functions will work with explicit objects that we’ll pass around. This will help when we come to test this code later. You can see that this function calls the getGroupedOrdersImpl function, which does the actual work we need done – we’ll write this shortly.

Before then, though, let’s write a basic interface that will represent our response objects:
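interface CustomersGroupedByProduct {
    productId: string;
    customerIds: string[];
}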


Our getGroupedOrdersImpl function will accept an array of product IDs and the collection in which to query them, and it will return an array of these CustomersGroupedByProduct objects. Of course, since CustomersGroupedByProduct is a TypeScript interface, we can use it within our functions for type safety, but it will be stripped out when we compile the code into JavaScript.

Now we can add a basic shell of an implementation for our getGroupedOrdersImpl function, something like the below. As you type this, notice that (if you’re in an editor that supports it, like Visual Studio Code) you get IntelliSense and statement completion thanks to the TypeScript definitions:
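function getGroupedOrdersImpl(productIds: string[], collection: ICollection): CustomersGroupedByProduct[] {
    const outputArray: CustomersGroupedByProduct[] = [];

    for (const productId of productIds) {
        // REPLACEME: run the query for this product ID and add the results to outputArray.
    }

    return outputArray;
}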

This function prepares a variable called outputArray, which will contain all of our product/customer groupings. Then we have some placeholder code to perform our actual queries, which we’ll fill in shortly. Finally, this function returns the output array.

Now we can fill in the placeholder code. Where we have REPLACEME in the function, replace it with something like this:
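const query: IParameterizedQuery = {
    query: 'SELECT VALUE o.customerId FROM o JOIN i IN o.orderedItems WHERE o.type = "order" AND i.productId = @productId',
    parameters: [
        { name: '@productId', value: productId }
    ]
};

const queryCallback = (error: IFeedCallbackError, customerIds: string[]) => {
    if (error) {
        throw new Error('Error running query: ' + error.body);
    }

    // Push the customer IDs for this product onto our output array.
    outputArray.push({
        productId: productId,
        customerIds: customerIds
    });
};

const accepted = collection.queryDocuments(
    collection.getSelfLink(),
    query,
    {},
    queryCallback);

if (!accepted) {
    throw new Error('The query was not accepted by the server.');
}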

There’s a lot going on here, so let’s break it down:

  • The first part sets up a new IParameterizedQuery, which lets us execute a SQL query using parameters. As discussed above, this is a much more secure way to handle parameters than string concatenation. The query will find all orders containing the product ID we’re looking for, and will return back the customer ID.
  • Next, the query callback function is prepared. This is what will be called when the query results are available. In this function we pull out the results and push them onto our outputArray, ready to return to the calling function.
  • Then we try to execute the query against the collection by using the collection.queryDocuments() function. This function returns a boolean to indicate whether the query was accepted. If it wasn’t, we consider this to be an error and immediately throw an error ourselves.

That’s it! Putting the pieces together, the full stored procedure file looks something like this:
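function getGroupedOrders(productIds: string[]): void {
    const context = getContext();
    const collection = context.getCollection();

    const groupedOrders = getGroupedOrdersImpl(productIds, collection);

    context.getResponse().setBody(groupedOrders);
}

function getGroupedOrdersImpl(productIds: string[], collection: ICollection): CustomersGroupedByProduct[] {
    const outputArray: CustomersGroupedByProduct[] = [];

    for (const productId of productIds) {
        const query: IParameterizedQuery = {
            query: 'SELECT VALUE o.customerId FROM o JOIN i IN o.orderedItems WHERE o.type = "order" AND i.productId = @productId',
            parameters: [
                { name: '@productId', value: productId }
            ]
        };

        const queryCallback = (error: IFeedCallbackError, customerIds: string[]) => {
            if (error) {
                throw new Error('Error running query: ' + error.body);
            }

            // Push the customer IDs for this product onto our output array.
            outputArray.push({
                productId: productId,
                customerIds: customerIds
            });
        };

        const accepted = collection.queryDocuments(
            collection.getSelfLink(),
            query,
            {},
            queryCallback);

        if (!accepted) {
            throw new Error('The query was not accepted by the server.');
        }
    }

    return outputArray;
}

interface CustomersGroupedByProduct {
    productId: string;
    customerIds: string[];
}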

Here’s what your folder structure should now look like:

  • /
    • package.json
    • tsconfig.json
    • src/
      • getGroupedOrders.ts

Compiling the Stored Procedure

As in part 2, we can now compile our function to JavaScript. Open up a command prompt or terminal, and enter npm run build. You should see that a new output folder has been created, and inside that is a file named sp-getGroupedOrders.js. If you open this, you’ll see the JavaScript version of our function, ready to submit to Cosmos DB. This has all of the type information removed, but the core logic remains the same. Here’s roughly what it should look like:
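function getGroupedOrders(productIds) {
    const context = getContext();
    const collection = context.getCollection();
    const groupedOrders = getGroupedOrdersImpl(productIds, collection);
    context.getResponse().setBody(groupedOrders);
}
function getGroupedOrdersImpl(productIds, collection) {
    const outputArray = [];
    for (const productId of productIds) {
        const query = {
            query: 'SELECT VALUE o.customerId FROM o JOIN i IN o.orderedItems WHERE o.type = "order" AND i.productId = @productId',
            parameters: [
                { name: '@productId', value: productId }
            ]
        };
        const queryCallback = (error, customerIds) => {
            if (error) {
                throw new Error('Error running query: ' + error.body);
            }
            // Push the customer IDs for this product onto our output array.
            outputArray.push({
                productId: productId,
                customerIds: customerIds
            });
        };
        const accepted = collection.queryDocuments(collection.getSelfLink(), query, {}, queryCallback);
        if (!accepted) {
            throw new Error('The query was not accepted by the server.');
        }
    }
    return outputArray;
}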

Deploying the Stored Procedure

Now let’s deploy the stored procedure to our Cosmos DB collection. Open the Azure Portal, browse to the Cosmos DB account, and then go to Script Explorer. Click Create Stored Procedure.

cosmos-sp-1

Enter getGroupedOrders as the ID, and then paste the contents of the compiled sp-getGroupedOrders.js JavaScript file into the body.

cosmos-sp-2

Click Save to install the stored procedure to Cosmos DB. (Once again, this isn’t the best way to install a stored procedure – we’ll look at better ways in part 6 of this series.)

Testing the Stored Procedure

Now let’s insert some sample data so that we can try the stored procedure out. We’ll insert these sample documents using Document Explorer, as described in part 2.

Here are the three sample documents we’ll use, following the same schema as before:
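{
    "id": "order-1",
    "type": "order",
    "customerId": "C1",
    "orderedItems": [
        { "productId": "P1", "quantity": 1 },
        { "productId": "P2", "quantity": 3 }
    ]
}

{
    "id": "order-2",
    "type": "order",
    "customerId": "C2",
    "orderedItems": [
        { "productId": "P1", "quantity": 2 }
    ]
}

{
    "id": "order-3",
    "type": "order",
    "customerId": "C3",
    "orderedItems": [
        { "productId": "P3", "quantity": 1 }
    ]
}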

Now go back into Script Explorer, open the stored procedure, and notice that there is a test panel below the script body textbox. We can enter our stored procedure’s parameters into the Inputs field. Let’s do that now. Enter [["P1", "P2", "P10"]] – be careful to include the double square brackets around the array. Then click the Save & Execute button, and you should see the results.

cosmos-sp-3

If we reformat them, our results look like the following. We can see that we have an array containing an object for each product ID we passed into the query, and each object has a list of customer IDs who ordered that product:
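[
    { "productId": "P1", "customerIds": [ "C1", "C2" ] },
    { "productId": "P2", "customerIds": [ "C1" ] },
    { "productId": "P10", "customerIds": [] }
]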

So our stored procedure works! We’ve now successfully encapsulated the logic involved in querying for customers that have ordered any of a set of products.

Summary

Stored procedures give us a way to encapsulate queries and operations to be performed on a Cosmos DB collection, and to invoke them as a single unit. Stored procedures run within an implicit transaction, so any unhandled errors will result in the changes being rolled back. Unlike in UDFs, we are also able to access the collection within a stored procedure by using the getContext() function, and by retrieving the Response and Collection objects. This allows us to return rich data, including objects and arrays, as well as to interact with the collection and its documents. In the next part of this series we will discuss triggers, the third type of server-side programming available in Cosmos DB, which allow us to intercept changes happening to documents within the collection.

Key Takeaways

  • Stored procedures encapsulate logic, queries, and actions upon documents within the collection.
  • Stored procedures provide transactional isolation, and all stored procedures run within a transaction scope.
  • The getContext() function allows us to access the Response and Collection objects.
  • TypeScript definitions are available to help when writing code against these objects.
  • Cosmos DB provides an integrated query syntax, which is great for simple queries, but doesn’t cover all possible queries that can be executed against the collection.
  • Arbitrary SQL queries can also be executed. If these contain parameters then the IParameterizedQuery interface should be used to ensure safe coding practices are adhered to.
  • The order of functions inside the stored procedure’s file matters. The first function will be the one that Cosmos DB treats as the entry point.
  • You can view the code for this post on GitHub.

Easy Filtering of IoT Data Streams with Azure Stream Analytics and JSON reference data

siliconvalve

I am currently working on a next-gen widget dispenser solution that is gradually being rolled out to trial sites across Australia. The dispenser hardware is a modern platform that provides telemetry data that can be used for various purposes by the locations at which the dispenser is deployed and potentially by other third parties.

In addition to these next-gen dispensers we already have existing dispenser hardware at the locations that emits telemetry that we already use for other purposes in our solution. To our benefit both the new and existing hardware emits the same format telemetry data 🙂

A sample telemetry entry is shown below.

We take all of the telemetry data from new and old hardware at all our sites and feed it into an Azure Event Hub, which allows us to perform multiple actions, such as archival of the data to Blob Storage using Azure Event Hub Capture.

View original post 468 more words

Cosmos DB Server-Side Programming with TypeScript – Part 2: User-Defined Functions

User-defined functions (UDFs) in Cosmos DB allow for simple calculations and computations to be performed on values, entities, and documents. In this post I will introduce UDFs, and then provide detailed steps to set up a basic UDF written in TypeScript. Many of these same steps will be applicable to stored procedures and triggers, which we’ll look at in future posts.

This is the second part of a series of blog posts on server-side development using Cosmos DB with TypeScript.

  • Part 1 gives an overview of the server side programmability model, the reasons why you might want to consider server-side code in Cosmos DB, and some key things to watch out for.
  • Part 2 (this post) deals with user-defined functions, the simplest type of server-side programming, which allow for adding simple computation to queries.
  • Part 3 talks about stored procedures. These provide a lot of powerful features for creating, modifying, deleting, and querying across documents – including in a transactional way.
  • Part 4 (coming soon) introduces triggers. Triggers come in two types – pre-triggers and post-triggers – and allow for behaviour like validating and modifying documents as they are inserted or updated, and creating secondary effects as a result of changes to documents in a collection.
  • Part 5 (coming soon) discusses unit testing your server-side scripts. Unit testing is a key part of building a production-grade application, and even though some of your code runs inside Cosmos DB, your business logic can still be tested.
  • Finally, part 6 (coming soon) explains how server-side scripts can be built and deployed into a Cosmos DB collection within an automated build and release pipeline, using Microsoft Visual Studio Team Services (VSTS).

User-Defined Functions

UDFs are the simplest type of server-side development available for Cosmos DB. UDFs generally accept one or more parameters and return a value. They cannot access Cosmos DB’s internal resources, and cannot read or write documents from the collection, so they are really only intended for simple types of computation. They can be used within queries, including in the SELECT and WHERE clauses.

UDFs are simple enough that types are almost not necessary, but for consistency we will use TypeScript for these too. This will also allow us to work through the setup of a TypeScript project, which we’ll reuse for the next parts of this series.

One note on terminology: the word function can get somewhat overloaded here, since it can refer to the Cosmos DB concept of a UDF, or to the TypeScript and JavaScript concept of a function. This can get confusing, especially since a UDF can contain multiple JavaScript functions within its definition. For consistency I will use UDF when I’m referring to the Cosmos DB concept, and function when referring to the JavaScript or TypeScript concept.

Parameters

UDFs can accept zero or more parameters. Realistically, though, most UDFs will accept at least one parameter, since UDFs almost always operate on a piece of data of some kind. The UDF parameters can be of any type, and since we are running within Cosmos DB, they will likely be either a primitive type (e.g. a single string, number, array, etc), a complex type (e.g. a custom JavaScript object, itself comprised of primitive types and other complex types), or even an entire document (which is really just a complex type). This gives us a lot of flexibility. We can have UDFs that do all sorts of things, including:

  • Accept a single string as a parameter. Do some string parsing on it, then return the parsed value.
  • Accept a single string as well as another parameter. Based on the value of the second parameter, change the parsing behaviour, then return the parsed value.
  • Accept an array of values as a parameter. Combine the values using some custom logic that you define, then return the combined value.
  • Accept no parameters. Return a piece of custom data based on the current date and time.
  • Accept a complex type as a parameter. Do some parsing of the document and then return a single output.

Invoking a UDF

A UDF can be invoked from within the SELECT and WHERE clauses of a SQL query. To invoke a UDF, you need to include the prefix udf. before the function name, like this:

SELECT udf.parseString(c.stringField) FROM c

In this example, udf. is a prefix indicating that we want to call a UDF, and parseString is the name of the UDF we want to call. Note that this is the identifier that Cosmos DB uses for the UDF, and is not necessarily the name of the JavaScript function that implements the UDF. (However, I strongly recommend that you keep these consistent, and will do so throughout this series.)

You can pass in multiple parameters to the UDF by comma delimiting them, like this:

SELECT udf.parseString(c.stringField, 1234) FROM c
SELECT udf.parseString(c.stringField1, c.stringField2) FROM c

To pass a hard-coded array into a UDF, you can simply use square brackets, like this:

SELECT udf.parseArray(["arrayValue1", "arrayValue2", "arrayValue3"])

Now that we’ve talked through some of the key things to know about UDFs, let’s try writing one, using our sample scenario from part 1.

Defining our UDF

Let’s imagine that our order system was built several years ago, and our business has now evolved significantly. As a result, we are in the middle of changing our order schema to represent customer IDs in different ways. Cosmos DB makes this easy by not enforcing a schema, so we can simply switch to the new schema when we’re ready.

Our old way of representing a customer ID was something like this (the exact field names here are illustrative):
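{
    "id": "order-1",
    "customerId": "cust-123"
}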

Now, though, we are representing customers with additional metadata, something like this:
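{
    "id": "order-2",
    "customer": {
        "id": "cust-456",
        "name": "John Smith"
    }
}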

However, we still want to be able to easily use a customer’s ID within our queries. We need a way to dynamically figure out the customer’s ID for an order, and this needs to work across our old and new schemas. This is a great candidate for a UDF. Let’s deploy a Cosmos DB account and set up this UDF.

Setting Up a Cosmos DB Account and Collection

First, we’ll deploy a Cosmos DB account and set up a database and collection using the Azure Portal. (Later in this series, we will discuss how this can be made more automatable.) Log into the Azure Portal and click New, then choose Cosmos DB. Click Create.

cosmos-udf-1

We need to specify a globally unique name for our Cosmos DB account – I have used johnorders, but you can use whatever you want. Make sure to select the SQL option in the API drop-down list. You can specify any subscription, resource group, and location that you want. Click Create, and Cosmos DB will provision the account – this takes around 5-10 minutes.

Once the account is created, open it in the Portal. Click Add Collection to add a new collection.

cosmos-udf-2

Let’s name the collection Orders, and it can go into a database also named Orders. Provision it with a fixed (10GB) capacity, and 400 RU/s throughput (the minimum).

cosmos-udf-3

Note that this collection will cost money to run, so you’ll want to remember to delete the collection when you’re finished. You can leave an empty Cosmos DB account for no charge, though.

Preparing a Folder

TypeScript requires that we provide it with some instructions on how to compile our code. We’ll also use Node Package Manager (NPM) to tie all of our steps together, and so need to prepare a few things before we can write our UDF’s code.

Note that in parts 2, 3, and 4 of this series, we will write each server-side component as if it was its own independent application. This is to keep each of these posts easy to follow in a standalone way. However, in parts 5 and 6, we will combine these into a single folder structure. This will more accurately represent a real-world application, in which you are likely to have more than one server-side component.

Create a new folder on your local machine and add a file named package.json into it. This contains the NPM configuration we need. The contents of the file should look something like this (the package name and version numbers are illustrative):
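{
    "name": "cosmos-server-side-scripts",
    "version": "1.0.0",
    "devDependencies": {
        "typescript": "^2.6.2"
    },
    "scripts": {
        "build": "tsc"
    }
}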

The package.json file does the following:

  • It defines the package dependencies we have when we develop our code. Currently we only have one – typescript.
  • It also defines a script that we can execute from within NPM. Currently we only have one – build, which will build our UDF into JavaScript using TypeScript’s tsc command.

At a command prompt, open the folder we’ve been working in. Run npm install, which will find and install the TypeScript compiler. If you don’t have NPM installed, install it by following the instructions here.

Next, create a file named tsconfig.json. This is the TypeScript project configuration. This should contain something like the following:
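{
    "compilerOptions": {
        "target": "es2015",
        "outFile": "output/udf-getCustomerId.js"
    },
    "include": [
        "src/*"
    ]
}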

The tsconfig.json instructs TypeScript to do the following:

  • Find and compile all of the files with the .ts extension inside the src folder (we haven’t created this yet!).
  • Target ECMAScript 2015, which is the version of JavaScript that Cosmos DB supports. If we use more modern features of TypeScript, it will handle the details of how to emit these as ECMAScript 2015-compatible code.
  • Save the output JavaScript to a single file named output/udf-getCustomerId.js. This filename is arbitrary and it could be any name we want, but I find it helpful to use the convention of {kind}-{objectName}.js, especially as we add more code in later parts of this series.
    Note that the outFile directive means that, even if we included multiple source TypeScript files, TypeScript will save the complete compiled script into a single output file. This is in keeping with the requirement that Cosmos DB imposes that a server-side component has to be specified in a single file.

Writing the UDF

Now we can write our actual UDF code! If you want to compare against my completed UDF, you can access it on GitHub.

Create a folder named src, and within that, create a file named getCustomerId.ts. (Once again, this file doesn’t need to be named this way, but I find it helpful to use the UDF’s name for the filename.) Here’s roughly what this file contains, assuming the schemas shown above:
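function getCustomerId(document: OrderDocument): string {
    // Old schema: the customer ID is in a top-level customerId property.
    if (document.customerId) {
        return document.customerId;
    }

    // New schema: the customer ID is nested inside a customer object.
    if (document.customer && document.customer.id) {
        return document.customer.id;
    }

    throw new Error('Document with id ' + document.id + ' does not contain customer ID in recognised format.');
}

interface OrderDocument {
    id: string;
    customerId?: string;
    customer?: {
        id: string;
    };
}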

Briefly, here’s an explanation of what this file does:
  • It declares a function named getCustomerId, which accepts a single parameter of type OrderDocument and returns a string. This is the function that represents our UDF.
  • The function inspects the document provided, and depending on which version of the schema it follows, it pulls the customer ID out of the appropriate field.
  • If the customer ID isn’t in either of the places it expects to find them, it throws an error. This will be further thrown up to the client by Cosmos DB.
  • Finally, it declares an interface named OrderDocument. This represents the shape of the data we’re expecting to store in our collection, and it has both of the ways of representing customer IDs.
    Note that we are using an interface and not a class, because this data type has no meaning to Cosmos DB – it’s only for use at development and build time.
    Also note that we could put this into its own orderDocument.ts file if we wanted to keep things separated out.

At the end of this, your folder should look something like this:

  • /
    • package.json
    • tsconfig.json
    • src/
      • getCustomerId.ts

You can access a copy of this as a GitHub repository here.

We have now written our first UDF! We’re almost ready to run it – but before then, we need to compile it.

Compiling the UDF

At the command line run npm run build. This will run the build script we defined inside the package.json file, which in turn simply runs the tsc (TypeScript compiler) command-line application. tsc will find the tsconfig.json file and knows what to do with it.

Once it’s finished, you should see a new output folder containing a file named udf-getCustomerId.js. This is our fully compiled UDF! It should look something like the following:
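function getCustomerId(document) {
    // Old schema: the customer ID is in a top-level customerId property.
    if (document.customerId) {
        return document.customerId;
    }
    // New schema: the customer ID is nested inside a customer object.
    if (document.customer && document.customer.id) {
        return document.customer.id;
    }
    throw new Error('Document with id ' + document.id + ' does not contain customer ID in recognised format.');
}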

If you compare this to the code we wrote in TypeScript, you’ll see that it is almost the same – except all of the type information (variable types and the interface) have been stripped away. This means that we get the type safety benefits of TypeScript at authoring and compilation time, but the file we provide to Cosmos DB is just a regular JavaScript file.

Deploying the UDF

Now we can deploy the UDF. Back in the Azure Portal, open Script Explorer under the Collections section, and then click Create User Defined Function.

cosmos-udf-4.png

Enter getCustomerId in the ID field. This will be the name we address the UDF by when we start to call it from our queries. Note that you don’t have to use the same ID here as the JavaScript function name – in fact, the JavaScript function can be named anything you like. For clarity, though, I find it helpful to keep everything in sync.

Now we can copy and paste the contents of the udf-getCustomerId.js file into the large script text box.

cosmos-udf-5

Click Save to install the UDF to Cosmos DB.

Testing the UDF

Finally, let’s test the UDF! We’ll need to add a couple of pieces of sample data. Click Document Explorer under the Collections section, and then click the Create button. Paste in this sample document, which uses our old schema:
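{
    "id": "order-1",
    "customerId": "cust-123"
}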

Click Save, and then close the blade and create a second document, this time using the new schema:
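{
    "id": "order-2",
    "customer": {
        "id": "cust-456",
        "name": "John Smith"
    }
}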

This gives us enough to test with. Now click Query Explorer under the Collections section. Enter the following query:

SELECT c.id, udf.getCustomerId(c) AS customerId FROM c

This query does the following:

  • Refers to each document within the current collection (c).
  • Runs the getCustomerId UDF, passing in the document contents. Note that to refer to a UDF, you must prefix the name of the UDF with udf..
  • Projects out the document ID (id) and the customerId as customerId.

You should see output something like the following:
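[
    {
        "id": "order-1",
        "customerId": "cust-123"
    },
    {
        "id": "order-2",
        "customerId": "cust-456"
    }
]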

That’s exactly what we wanted to see – the UDF has pulled out the correct field for each document.

As a point of interest, notice the Request Charge on the query results. Try running the query a few times, and you should see that it fluctuates a little – but is generally around 3.5 RUs.

Now let’s try passing in an invalid input into our UDF. Run this query:

SELECT udf.getCustomerId('123') FROM c

Cosmos DB will give you back the error that our UDF threw because the input data (123) didn’t match either of the schemas it expected:

Encountered exception while executing Javascript.
  Exception = Error: Document with id undefined does not contain customer ID in recognised format.
  Stack trace: Error: Document with id undefined does not contain customer ID in recognised format.
    at getCustomerId (getCustomerId.js:11:5)
    at __docDbMain (getCustomerId.js:15:5)
    at Global code (getCustomerId.js:1:2)

So we’ve now tested out the old customer ID format, the new customer ID format, and some invalid input, and the UDF behaves as we expect.

Summary

UDFs provide us with a way to encapsulate simple computational logic, and to expose this within queries. Although we can’t refer to other documents or external data sources, UDFs are a good way to expose certain types of custom business logic. In this post, we’ve created a simple UDF for Cosmos DB, and tested it using a couple of simple documents. In the next part of this series we’ll move on to stored procedures, which allow for considerably more complexity in our server-side code.

Key Takeaways

  • UDFs are intended for simple computation.
  • They can be used within queries, including in the SELECT and WHERE clauses.
  • UDFs cannot access anything within the Cosmos DB collection, nor can they access any external resources.
  • They can accept one or more parameters, which must be provided when calling the UDF.
  • The name of the JavaScript function does not have to match the name of the UDF, but to keep your sanity, I highly recommend keeping them consistent.
  • UDFs can be invoked from within a query by using the udf. prefix, e.g. SELECT udf.getCustomerId(c) FROM c.
  • You can view the code for this post on GitHub.


Cosmos DB Server-Side Programming with TypeScript – Part 1: Introduction

Cosmos DB is a NoSQL database provided as part of Microsoft’s Azure platform. Designed for very high performance and scalability, Cosmos DB is rapidly becoming one of the default data storage options I recommend for new green-field applications and microservices. It is a fairly opinionated database, with some guidelines that you need to follow to take full advantage of its scalability and performance, but it also provides a number of features to enable sophisticated and powerful applications to be built on top of its engine.

One such feature is its server-side programmability model. Cosmos DB allows for stored procedures, triggers, and user-defined functions to run within its database engine. Interestingly, these are written using JavaScript and uploaded to the Cosmos DB collection in which they will run. Server-side programming gives a lot of extra power to a Cosmos DB-based application, including the ability to run transactions across multiple documents within the collection. In fact, server-side programming is the only way to get transaction semantics within Cosmos DB.

In this series of blog posts, we will explore server-side programming in Cosmos DB, and we will use TypeScript to write the server-side code. I will focus on how to build real-world applications, including adding unit tests to ensure the code behaves as expected, and incorporating the build and deployment of Cosmos DB server-side code into your CI/CD process. The series is split into six parts:

  • Part 1 (this post) gives an overview of the server side programmability model, the reasons why you might want to consider server-side code in Cosmos DB, and some key things to watch out for.
  • Part 2 deals with user-defined functions, the simplest type of server-side programming, which allow for adding simple computation to queries.
  • Part 3 talks about stored procedures. These provide a lot of powerful features for creating, modifying, deleting, and querying across documents – including in a transactional way.
  • Part 4 (coming soon) introduces triggers. Triggers come in two types – pre-triggers and post-triggers – and allow for behaviour like validating and modifying documents as they are inserted or updated, and creating secondary effects as a result of changes to documents in a collection.
  • Part 5 (coming soon) discusses unit testing your server-side scripts. Unit testing is a key part of building a production-grade application, and even though some of your code runs inside Cosmos DB, your business logic can still be tested.
  • Finally, part 6 (coming soon) explains how server-side scripts can be built and deployed into a Cosmos DB collection within an automated build and release pipeline, using Microsoft Visual Studio Team Services (VSTS).

In this series I presume some basic knowledge of Cosmos DB. If you’re completely new to Cosmos DB then I recommend reading Microsoft’s overview, and following along with one of the quick starts. A passing familiarity with TypeScript will also be helpful, but even if you don’t know how to use TypeScript, I’ll try to cover the key points you need to know to get started.

Using TypeScript

TypeScript is a language that compiles (or, technically, transpiles) into JavaScript. It provides a number of nice features and improvements over JavaScript, the main one being type safety. This means that the TypeScript compiler can check that your code is accessing the correct types and members as you write it. Writing code in TypeScript allows for a better level of certainty that your code is actually going to work, as well as providing some very nice development-time features such as IntelliSense. It also helps to have strong typing when unit testing, and particularly when mocking out external interfaces and classes within tests.

Because TypeScript compiles into JavaScript code, any JavaScript runtime will be able to run code that was written in TypeScript. This includes Cosmos DB’s JavaScript engine, Chakra. Even though Cosmos DB doesn’t know about TypeScript or support it directly, we can still take advantage of many of the features that TypeScript provides, and then compile our script into JavaScript before handing it over to Cosmos DB for execution.

TypeScript also lets us separate out our code into multiple .ts files, keeping it tidy and well-organised. Cosmos DB requires that our code be in a single .js file, though – but thankfully, TypeScript can be configured to combine our code when it compiles it.

When working with external libraries and APIs within TypeScript, we need to use type definitions. These specify the details of the types we will use. While the Cosmos DB team doesn’t provide first-party type definitions for their server-side API, there are publicly accessible, open-source type definitions available from the DefinitelyTyped repository. We will use these later in this series.

Note that Cosmos DB supports the ECMAScript 2015 version of JavaScript. TypeScript can be configured to emit JavaScript in several different versions, including ECMAScript 2015 code, so this is not a problem for us. We’ll see how to do this in part 2 of this series.

Impact of Server-Side Programming on Request Units

When using Cosmos DB, we don’t provision CPU cores or disk speed or memory. Instead, Cosmos DB uses request units as its currency. A Cosmos DB collection is provisioned with a certain number of request units per second (RU/s), which can be scaled as necessary to cope with your application’s demands. The RU/s provisioned dictates the monetary cost of running the collection. For example, a simple collection with a light query load might be provisioned with 1000 RU/s, which (as of January 2018) costs approximately USD$60 per month. For more information on Cosmos DB’s request unit model see here, and for the latest pricing information, see here.

Server-side code running within Cosmos DB can easily consume a lot of request units, potentially exhausting your allowance for that second and forcing your application to have to retry operations against Cosmos DB. Furthermore, the cost of running queries server-side may sometimes be higher than the cost of running the equivalent query using the standard client-side APIs, due to the resources it takes to start up a stored procedure or function from JavaScript. Cosmos DB does have some optimisations to reduce the cost of running JavaScript code – for example, internally Cosmos DB compiles the JavaScript code to bytecode and then caches this bytecode so that it doesn’t need to recompile it on every invocation. However, running arbitrary code will usually be more expensive than just using the client-side query APIs, and this means that server-side code may not be appropriate if you don’t actually need the benefits it provides.

Additionally, the request unit usage for a given piece of server-side code is not fully predictable or consistent – in my own testing, I’ve seen the exact same piece of code, working on the same data set, take anywhere from 3.2 RUs through to 4.8 RUs to execute. This is in contrast to the rest of Cosmos DB, where request unit usage is very predictable.

Nevertheless, for some scenarios, such as bulk inserts of multiple documents, or generating sample data, it may take fewer request units to run code server-side than client-side. It is important to benchmark your code and compare the possible approaches to fully understand the best option for your requirements.

Consistency: Transactions and Indexes

Cosmos DB’s client-side programming model does not provide for transactional consistency across multiple documents. For example, you may have two documents to insert or update, and require that either both operations succeed or – if there is a problem with writing one of them – that both of them should fail. The Cosmos DB server-side programmability model allows for this behaviour to be implemented, because all server-side code runs within an implicit transaction. This means that we get ACID transaction semantics automatically whenever we execute a stored procedure or trigger. Cosmos DB runs stored procedures and triggers on the primary replica that is used to host the data, which allows it to give this level of transactional isolation while still allowing for high performance operations within the transaction.

Note, however, that transactions are not serialised. This means that other transactions may be happening simultaneously on other documents within the collection in parallel with your transaction. This is important because it means that functionality like real-time aggregation of data may not always be looking at a consistent view of the world, and you can get race conditions. This is simply due to the way Cosmos DB works, and is not something that we can easily program around.

A further nuance to be aware of is that, in Cosmos DB, indexes are updated asynchronously. This means that if you query the collection within a trigger, you may not see the document currently being inserted or updated. Again, this makes it challenging to do certain types of queries (such as aggregations), but is a byproduct of Cosmos DB’s emphasis on enabling high performance and throughput.

API Models

Cosmos DB provides several API models to access data in your databases: SQL to use a SQL-based syntax; MongoDB for using the MongoDB client libraries and tools; Table to use the Azure Storage table API; and Gremlin to use the Gremlin graph protocol. All of these ultimately store data in the same way though, and all of them allow for Cosmos DB’s server-side programmability model.

In this series I will focus purely on the SQL API, but most of the same concepts can be applied to the other API models.

Restrictions

Cosmos DB has placed some restrictions on the types of operations that can be run from within the server. This is mostly to optimise the performance and security of the service.

Time restriction: each server-side operation has a fixed amount of time that it must execute within. The exact amount of time is not documented, but the server-side API provides some features to indicate when your script is approaching its limit. We will discuss this more in later parts of the series. It is important to allow for this restriction when designing your stored procedures and triggers, and to avoid writing server-side code that will make high volumes of queries. Instead, if you batch these up across multiple stored procedure calls, you are more likely to have all of your code execute successfully.

Limited set of functionality: although it executes arbitrary JavaScript, Cosmos DB’s server-side programming model is designed for basic computation and for interacting with the Cosmos DB collection itself. We cannot call external web APIs, communicate with other collections, or import any complex JavaScript libraries.

Limited fluent query capability: Cosmos DB’s server-side API has a fluent JavaScript-based query syntax to perform various types of queries, filters, projections, etc on the underlying collection. However, this API does not support all of the rich functionality that the SQL grammar provides, nor does it allow for the same types of queries as the other API models, such as Gremlin’s graph query capability.

One example feature that is missing from the server-side query API is aggregation. The Cosmos DB SQL dialect allows for queries such as SUM, MIN, MAX, and COUNT. These cannot be performed using the fluent server-side query API. However, SQL queries can be executed from within server-side code, so this is not a serious limitation and really just affects the way the code is written, not the functionality that is exposed.

Single JavaScript file: a single stored procedure, trigger, or user-defined function is represented by a JavaScript function, which in turn may call other functions. However, all of the code must be placed in a single file. JavaScript modules and other similar features are not supported. We will see how to split our functionality into multiple TypeScript files, while still emitting a single JavaScript file, later in this series.

Following Along

In this series, you will be able to follow along and create each type of server-side programming entity in Cosmos DB: a user-defined function, a stored procedure, and a trigger. We will build up a set of server-side code, and then in parts 5 and 6 of this series we will look at how to get these ready for a production deployment by testing and automatically building and deploying them to Cosmos DB.

You can follow along whether you use Windows, macOS, or Linux to develop. There are just a few prerequisites:

  • A good text editor: I use Visual Studio Code, which comes with the TypeScript programming extension, but you can use anything you like.
  • Node Package Manager (NPM): you can install this here if you don’t already have it.
  • An Azure subscription. Alternatively, you can use the Cosmos DB emulator to run this locally and at no charge, but you will need to adapt the instructions slightly.

Sample Scenario

A common use for a database is to store information about orders that customers make for products. Orders typically contain some basic overall information, such as an order ID, date, customer ID, and a set of order items – references to products and the quantities ordered. In this series we will work with a simple hypothetical order database implemented in Cosmos DB.

We will use the SQL API, and we will use a non-partitioned collection. Note that partitioned collections behave the same way as non-partitioned collections when it comes to server-side programmability, but they also have a few nuances in their behaviour that we won’t go through here.

Summary

Server-side programming in Cosmos DB is extremely powerful. It gives us the ability to write functions, stored procedures, and triggers that execute within the database engine, and allow for features that are simply not possible through the client-side programming model. However, there are limitations and things to be aware of, including the potentially high cost of running some types of operations from the server. The features also do not provide the same degree of flexibility and power as their counterparts in SQL Server and other relational databases. Nevertheless, the server-side programming model in Cosmos DB is enormously useful for certain types of situations.

By using TypeScript to add type safety, and by adding unit tests and good continuous integration and continuous deployment practices, we can build advanced behaviour into our production-grade applications – all while taking advantage of the high performance and scale capabilities of Cosmos DB.

In the next part of this blog series, we will start writing some server-side code – first by building a user-defined function.

Use Azure Health to track active incidents in your Subscriptions

siliconvalve

Yesterday afternoon while doing some work I ran into an issue in Azure. Initially I thought this issue was due to a bug in my (new) code and went to my usual debugging helper Application Insights to review what was going on.

The below graphs a little old, but you can see a clear spike on the left of the graphs which is where we started seeing issues and which gave me a clue that something was not right!

App Insights views

Initially I thought this was a compute issue as the graphs are for a VM-hosted REST API (left) and a Functions-based backend (right).

At this point there was no service status indicating an issue so I dug a little deeper and reviewed the detailed Exception information from Application Insights and realised that the source of the problem was the underlying Service Bus and Event Hub features that we use to glue…

View original post 268 more words

Azure AD Domain Services

I recently had what I thought was a rather unique requirement from a customer.

The requirement was to build Azure IaaS virtual machines and have them joined to a managed domain, while also being able to authenticate to the virtual machines using Azure AD credentials.

The answer is Azure AD Domain Services!

Azure AD Domain Services provides managed domain services such as domain join, group policy and Kerberos/NTLM authentication without the need for you to deploy and manage domain controllers in the cloud. For more information see https://docs.microsoft.com/en-us/azure/active-directory-domain-services/active-directory-ds-overview

It is not without its limitations though; the main things to call out are that configuring domain trusts and applying schema extensions is not possible with Azure AD Domain Services. For a full list of limitations see: https://docs.microsoft.com/en-us/azure/active-directory-domain-services/active-directory-ds-comparison

Unfortunately, at this point in time you cannot use ARM templates to configure Azure AD Domain Services, so you are limited to the Azure Portal or PowerShell. I am not going to bore you with the details of the deployment steps, as they are quite simple and you can easily follow the steps supplied in the Microsoft documentation: https://docs.microsoft.com/en-us/azure/active-directory-domain-services/active-directory-ds-enable-using-powershell
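For a flavour of the PowerShell route, here is a very rough sketch; the resource property names and API version below are my assumptions, so treat the linked documentation as authoritative:

#Rough sketch only; property names and API version are assumptions,
#follow the Microsoft documentation linked above for the authoritative steps.
#Register the Azure AD Domain Services resource provider
Register-AzureRmResourceProvider -ProviderNamespace Microsoft.AAD

$subscriptionId = "<subscription id>"
$resourceGroup = "<resource group>"
$managedDomain = "contoso.com"
#The subnet dedicated to Azure AD Domain Services (see learning 5 below)
$subnetId = "/subscriptions/$subscriptionId/resourceGroups/$resourceGroup" +
    "/providers/Microsoft.Network/virtualNetworks/<vnet>/subnets/DomainServices"

#Create the managed domain as a Microsoft.AAD/DomainServices resource
New-AzureRmResource `
    -ResourceId "/subscriptions/$subscriptionId/resourceGroups/$resourceGroup/providers/Microsoft.AAD/DomainServices/$managedDomain" `
    -Location "australiaeast" `
    -Properties @{ "DomainName" = $managedDomain; "SubnetId" = $subnetId } `
    -ApiVersion "2017-06-01" -Force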

What I would like to do is point out the following learnings that I discovered during my deployment.

  1. In order to utilise Azure AD credentials that are synchronised from on-premises, synchronisation of NTLM/Kerberos credential hashes must be enabled in Azure AD Connect; this is not enabled by default.
  2. If there are any cloud-only user accounts, all users who need to use Azure AD Domain Services must change their passwords after Azure AD Domain Services is provisioned. The password change process causes the credential hashes for Kerberos and NTLM authentication to be generated in Azure AD.
  3. Once a cloud-only user account has changed its password, you will need to wait a minimum of 20 minutes before you will be able to use Azure AD Domain Services (this got me, as I was impatient).
  4. Speaking of patience, the provisioning process for Azure AD Domain Services takes about an hour.
  5. Have a dedicated subnet for Azure AD Domain Services to avoid any connectivity issues that may occur with NSGs/firewalls.
  6. You can only have one managed domain connected to your Azure Active Directory tenant.

That’s it! Hopefully this has helped you get a better understanding of Azure AD Domain Services and will assist with a smooth deployment.

Understanding Azure’s Container PaaS Capabilities

siliconvalve

If you’ve been using Azure over the past twelve months, you can’t help but have the feeling that it’s become a bit like this…

Containers... Containers Everywhere

… and you’d be right.

To be fair, though, Containers have been one of the hot topics in computing in general, and certainly one of those attracting the most interest at my recent Azure Open Source Roadshows.

One thing that has struck me, though, is that people are not clear on the purpose of all the services in Azure that list ‘Containers’ as a capability, so in this post I am going to review the Azure Platform-as-a-Service offerings that have Container capabilities and cover what each service can be used for.

First, before we begin, let’s quickly get some fundamentals under our belts.

What is a Container?

Containers provide encapsulation and isolation for workloads and remove the need for a complete Operating System image…


Exchange Online & Splunk – Automating the solution

NOTES FROM THE FIELD:

I have recently been consulting on what I think is a pretty cool engagement to integrate some Office 365 mailbox data into the Splunk reporting platform.

I initially thought about using a .csv export methodology; however, through trial and error (more error than trial, if I’m being honest), and after realising that this method still required some manual interaction, I decided to embark on finding a fully automated solution.

The final solution comprises the following components:

  • Splunk HTTP event collector
    • Splunk hostname
    • Token from HTTP event collector config page
  • Azure automation account
    • Azure Run As Account
    • Azure Runbook
    • Exchange Online credentials (registered to the Azure automation account)

I’m not going to run through the creation of the automation account or the required credentials, as these had already been created. However, there is a great guide to configuring the solution I have used for this customer at https://www.splunk.com/blog/2017/10/05/splunking-microsoft-cloud-data-part-3.html

The PowerShell script we are using achieves the following:

  • Connect to Azure and Exchange Online – Azure run as account authentication
  • Configure variables for connection to Splunk HTTP event collector
  • Collect mailbox data from the Exchange Online environment
  • Split the mailbox data into parts for faster processing
  • Specify SSL/TLS protocol settings for self-signed cert in test environment
  • Create a JSON object to be posted to the Splunk environment
  • HTTP POST the data directly to Splunk

The Code:

#Clear existing PS sessions
Get-PSSession | Remove-PSSession | Out-Null

#Split function to break the mailbox data into parts
function Split-Array {
    param($inArray, [int]$parts, [int]$size)
    if ($parts) {
        $PartSize = [Math]::Ceiling($inArray.count / $parts)
    }
    if ($size) {
        $PartSize = $size
        $parts = [Math]::Ceiling($inArray.count / $size)
    }
    $outArray = New-Object 'System.Collections.Generic.List[psobject]'
    for ($i = 1; $i -le $parts; $i++) {
        $start = ($i - 1) * $PartSize
        $end = ($i * $PartSize) - 1
        if ($end -ge $inArray.count) { $end = $inArray.count - 1 }
        $outArray.Add(@($inArray[$start..$end]))
    }
    return ,$outArray
}

function Connect-ExchangeOnline {
    param(
        $Creds
    )
    #Connect to Exchange Online and import only the cmdlets we need
    $Session = New-PSSession -ConfigurationName Microsoft.Exchange -ConnectionUri https://outlook.office365.com/powershell-liveid/ -Credential $Creds -Authentication Basic -AllowRedirection
    $Commands = @("Add-MailboxPermission", "Add-RecipientPermission", "Remove-RecipientPermission", "Remove-MailboxPermission", "Get-MailboxPermission", "Get-User", "Get-DistributionGroupMember", "Get-DistributionGroup", "Get-Mailbox")
    Import-PSSession -Session $Session -DisableNameChecking:$true -AllowClobber:$true -CommandName $Commands | Out-Null
}

#Create variables
$SplunkHost = "Your Splunk hostname or IP Address"
$SplunkEventCollectorPort = "8088"
$SplunkEventCollectorToken = "Splunk Token from HTTP Event Collector"
$servicePrincipalConnection = Get-AutomationConnection -Name 'AzureRunAsConnection'
$credentials = Get-AutomationPSCredential -Name 'Exchange Online'

#Connect to Azure using the Run As account's service principal
Add-AzureRMAccount -ServicePrincipal -Tenant $servicePrincipalConnection.TenantID -ApplicationId $servicePrincipalConnection.ApplicationID -CertificateThumbprint $servicePrincipalConnection.CertificateThumbprint

#Connect to Exchange Online
Connect-ExchangeOnline -Creds $credentials

#Collect the mailbox data
$mailboxes = Get-Mailbox -ResultSize Unlimited | Select-Object -Property DisplayName, PrimarySMTPAddress, IsMailboxEnabled, ForwardingSmtpAddress, GrantSendOnBehalfTo, ProhibitSendReceiveQuota, AddressBookPolicy

#Get the current date & time and convert it to the Australia/Brisbane timezone
$time = Get-Date -Format s
$bnetime = [System.TimeZoneInfo]::ConvertTimeBySystemTimeZoneId($time, [System.TimeZoneInfo]::Local.Id, 'E. Australia Standard Time')

#Add a Time column to the output
$mailboxes = $mailboxes | Select-Object @{expression = {$bnetime}; Name = 'Time'}, DisplayName, PrimarySMTPAddress, IsMailboxEnabled, ForwardingSmtpAddress, GrantSendOnBehalfTo, ProhibitSendReceiveQuota, AddressBookPolicy

#Split the mailbox data into five parts for faster processing
$recipients = Split-Array -inArray $mailboxes -parts 5

#Allow older SSL/TLS protocols and bypass certificate validation for the
#self-signed certificate used in the test environment
$AllProtocols = [System.Net.SecurityProtocolType]'Ssl3,Tls,Tls11,Tls12'
[System.Net.ServicePointManager]::SecurityProtocol = $AllProtocols
[System.Net.ServicePointManager]::ServerCertificateValidationCallback = {$true}

#Create JSON objects and HTTP POST them to the Splunk HTTP Event Collector
foreach ($recipient in $recipients) {
    foreach ($r in $recipient) {
        #Build the JSON string to post to Splunk
        $StringToPost = "{ `"Time`": `"$($r.Time)`", `"DisplayName`": `"$($r.DisplayName)`", `"PrimarySMTPAddress`": `"$($r.PrimarySmtpAddress)`", `"IsMailboxEnabled`": `"$($r.IsMailboxEnabled)`", `"ForwardingSmtpAddress`": `"$($r.ForwardingSmtpAddress)`", `"GrantSendOnBehalfTo`": `"$($r.GrantSendOnBehalfTo)`", `"ProhibitSendReceiveQuota`": `"$($r.ProhibitSendReceiveQuota)`", `"AddressBookPolicy`": `"$($r.AddressBookPolicy)`" }"
        $uri = "https://" + $SplunkHost + ":" + $SplunkEventCollectorPort + "/services/collector/raw"
        $header = @{"Authorization" = "Splunk " + $SplunkEventCollectorToken}
        #POST the event directly to Splunk
        Invoke-RestMethod -Method Post -Uri $uri -Body $StringToPost -Headers $header
    }
}

#Clean up the remote sessions
Get-PSSession | Remove-PSSession | Out-Null

The final output that can be seen in Splunk looks like the following:

11/13/17 12:28:22.000 PM
{
   AddressBookPolicy:
   DisplayName: Shane Fisher
   ForwardingSmtpAddress:
   GrantSendOnBehalfTo:
   IsMailboxEnabled: True
   PrimarySMTPAddress: shane.fisher@xxxxxxxx.com.au
   ProhibitSendReceiveQuota: 50 GB (53,687,091,200 bytes)
   Time: 11/13/2017 12:28:22
}

I hope this helps some of you out there.

Cheers,

Shane.

 

 

 

Connect SharePoint Online and SQL Server On-Premises with BCS/SharePoint Apps using Hybrid Connection and WCF Services

SharePoint Online cannot directly connect to on-premises data sources such as SQL Server. A recommended approach is to use a hybrid deployment with SharePoint 2013/2016, but this adds infrastructure and maintenance overhead. To overcome this, in this blog I am going to describe how to use Azure PaaS workloads to connect to on-premises data sources using BCS.

Using an Azure Hybrid Connection (refer to this post) and BCS with an Azure Web App hosting a WCF endpoint, we can expose on-premises SQL data to SharePoint Online and the cloud via external content types (ECTs) or SharePoint Hosted Apps.

Below are two approaches by which BCS can connect these data sources to SharePoint.
1. Azure Web App hosting WCF Service and External Lists
2. Azure Web App hosting WCF Data Service and Hosted Apps

Azure WCF Service Web App and External Lists
[Diagram: SPOAzureBCSHybrid]
Pros: The advantage of using this approach is the reusability of External Content Types (ECT). ECTs can be used across multiple lists and sites in the same site collection. ECTs can also be used for complex associations across multiple types of data.

Cons: Some shortcomings of this approach are:
– Dependency on pass-through authentication for users, and/or the need to implement custom authentication against the WCF service by passing SQL authentication credentials
– Added development effort to build and host the WCF service

High-Level Steps:
1. Create a WCF Solution using Visual Studio
2. Use ADO.Net and WCF Service calls to fetch data using web methods. Implement at least two web methods – one to return all items and one to return a specific item
3. Update Web.Config of the WCF service with required configuration for data calls
4. Create an Azure Web App
5. Publish the WCF Service to the Azure Web App and get the single WSDL signature from the service (for example, via its ?singleWsdl endpoint)
6. Create an External Content Type in SharePoint Designer using the WSDL signature
7. Add GetItems and GetItem finder operations to the ECT
8. Create an External List from ECT

Azure Web App hosting WCF Data Service and Hosted Apps
[Diagram: SPOAzureAppsHybrid]
Pros: The advantage of using a WCF Data Service is that the OData methods map directly to the schema of the SQL table, which makes the service easy to build and maintain. Additionally, using SharePoint hosted apps isolates the CRUD operations from the host web, decreasing the overhead of external content types and external lists.

Cons: The disadvantage of this approach is that the data is scoped within the app and cannot be exposed to host web components, so interaction is limited to the app web only. There is a customization requirement to expose and operate on this data in the app web.

High-Level Steps:
1. Create a WCF service project using Visual Studio
2. Install the EntityFramework Nuget package
3. Add a WCF data service file and implement EntityFrameworkDataService instead of DataService
4. Override the “InitializeService” method to set the entity set access rules and the maximum protocol version for the service
5. Add an ADO.Net Entity Data Model project and configure it to fetch data from the SQL tables you want
6. Update Web.config with required configuration for data calls
7. Create an Azure Web App and enable SSL on it
8. Publish the WCF Service to Azure Web App
9. Next create a new SharePoint hosted app solution in Visual Studio
10. In the SharePoint hosted app solution, add an External Content Type and select the Azure Web App hosting the WCF data service as the source
11. After the External Content Type is created, create an External List using the ECT created above
12. The external list is now added to the hosted app, which can then be referenced in the app default page and app part


In this blog, we have seen two choices for hosting BCS connectivity services via Azure PaaS workloads, the advantages and disadvantages of each, and the broad-level steps to configure them.

Resolving “User not found” issue while assigning permissions using SharePoint CSOM

I was recently working on a SharePoint Online project where we were trying to automate library creation and provide the required permissions on those libraries. We hit an issue while modifying permissions on SharePoint libraries with CSOM code when the “Created By” user had left the company.

In this post I will outline the cause and the resolution, as there was no online reference for resolving this error.

Issue: The CSOM code was throwing a “User not found” error even when creating a User object from the web.EnsureUser() method.

Cause: The User object returned by the web.EnsureUser() method was empty but not null, and hence couldn’t be used when adding permissions after breaking role inheritance.

Resolution: The resolution was to explicitly load the user object, catch the exception thrown while loading, and set a flag to false that could later be checked to prevent the add method from erroring out. Yeah, this is a roundabout way of overcoming the issue but it works. Hopefully it will save you some hours.

Below is the code that could be used to do that.
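A minimal sketch of that approach, shown here in PowerShell against the SharePoint CSOM assemblies (the assembly paths, URLs, and account names are placeholder assumptions):

#Sketch of the described workaround; paths, URLs, and logins are placeholders
Add-Type -Path "C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\16\ISAPI\Microsoft.SharePoint.Client.dll"
Add-Type -Path "C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\16\ISAPI\Microsoft.SharePoint.Client.Runtime.dll"

$securePassword = ConvertTo-SecureString "<password>" -AsPlainText -Force
$ctx = New-Object Microsoft.SharePoint.Client.ClientContext("https://tenant.sharepoint.com/sites/demo")
$ctx.Credentials = New-Object Microsoft.SharePoint.Client.SharePointOnlineCredentials("admin@tenant.com", $securePassword)

$userIsValid = $true
$user = $ctx.Web.EnsureUser("i:0#.f|membership|departed.user@tenant.com")
#Explicitly load the user and execute the query; a user who has left
#the company surfaces as a "User not found" exception at this point
$ctx.Load($user)
try {
    $ctx.ExecuteQuery()
}
catch {
    $userIsValid = $false
}

if ($userIsValid) {
    #Only now is it safe to use $user, e.g. in a role assignment
    #added after breaking role inheritance on the library
}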