A customer I am working with at the moment is in the (very) early stages of discussing how to gather profile photos and apply them across their internal systems. In this particular case, we are assuming that the photos themselves do not exist. Sure, there are ID card photos of startled staff taken on day one of their employment, but people being people, they would rather not be forever digitally represented by their former selves, particularly not the version of themselves which had an ID photo taken in a poorly lit, unused meeting room 7 years ago, before they got that gym membership.
There are many technical considerations when embarking on something like this: Where will we store the data? What file formats do we need to support? What is the maximum size and resolution for photos across our systems? But before answering any of these questions, we should first think about how we are actually going to gather these photos and, once we have them, how we ensure that they comply with whatever business rules we wish to apply to them. Not very long ago, the answer to this question would have been ‘hire a grad’ (sorry grads), but we live in the future now, and we have artificial intelligence to do our bidding, so let’s take a look at how we do just that.
The Rules
Let’s make up some rules which might be applicable to corporate profile photos.
- The photo should be of your face
- The photo should contain only one person
- The face should be framed in the photo similarly to a passport photo
- The photo should not contain any adult content
The APIs
Our rules can be satisfied using two of Microsoft’s Cognitive Services APIs, namely the Face API and the Computer Vision API. Both of these APIs have free tiers which more than satisfy the requirements for this little demo, and the paid tiers are very reasonably priced. To sign up for the APIs, head over to portal.azure.com and click New (A), Intelligence (B), then Cognitive Services API (C).
And then fill in the relevant details.
We are using both the Face API and the Computer Vision API in this example, so the above steps will be repeated for each API.
Once you have completed this process, you will find details of your new accounts in the portal under “Cognitive Services accounts”. This is going to give you the details you’ll need to interact with the APIs.
Now that we have an account set up and ready to go, we can start to play! Because I am an infrastructure guy rather than a dev, I will be using PowerShell for this demonstration. Let’s work through the rules we defined earlier.
Rule #1: The photo should be of your face
To check our image against this rule, we want a response from the Face API to simply confirm that the image is of a face. As I already own a face, I will use my own for this example. The image “C:\temp\DanThom_Profile.jpg” which is referenced in the code snippet is the same image used as my profile photo on this blog.
Executing the above code will give you a simple true/false set against the variable $FaceDetected. This gives us some confidence that the uploaded photo is in fact a face – it doesn’t necessarily mean it’s my face, but I will talk about that a little later.
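As a rough illustration of the check described above, here is a minimal sketch in Python (not the PowerShell used for this post). It assumes the Face API detect call returns a JSON list with one entry per detected face. The endpoint URL, including the `westus` region, and the function names are assumptions for illustration; use the endpoint and key shown for your own account.

```python
import json
import urllib.request

# Assumed endpoint: the region prefix depends on where your account was created.
FACE_DETECT_URL = "https://westus.api.cognitive.microsoft.com/face/v1.0/detect"


def detect_faces(image_path, subscription_key, url=FACE_DETECT_URL):
    """POST the raw image bytes to the detect endpoint and return the parsed JSON list."""
    with open(image_path, "rb") as image_file:
        image_bytes = image_file.read()
    request = urllib.request.Request(
        url,
        data=image_bytes,  # supplying a body makes this a POST request
        headers={
            "Ocp-Apim-Subscription-Key": subscription_key,
            "Content-Type": "application/octet-stream",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def face_detected(faces):
    """Rule #1: true when the response lists at least one detected face."""
    return len(faces) >= 1
```

With a valid key, `face_detected(detect_faces(r"C:\temp\DanThom_Profile.jpg", key))` would play the role of $FaceDetected.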
Rule #2: The photo should contain only one person
We’re going to reuse the same API and almost the same code to validate this rule. When I use the snippet below to feed the API a crudely photoshopped version of my original photo with my face duplicated, the variable $MultipleFaces is set to true. Feeding in the original photo sets the variable to false.
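The same idea can be sketched in Python (again, not the PowerShell used for this post), assuming the detect response has already been parsed into a list with one entry per detected face. The function name is hypothetical.

```python
def multiple_faces(faces):
    """Rule #2: true when the detect response lists more than one face."""
    return len(faces) > 1


# Hypothetical parsed responses: one face, then the duplicated-face photo.
single = [{"faceRectangle": {"top": 80, "left": 180, "width": 279, "height": 279}}]
doubled = single + [{"faceRectangle": {"top": 80, "left": 10, "width": 150, "height": 150}}]

print(multiple_faces(single))   # False
print(multiple_faces(doubled))  # True
```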
Rule #3: The face should be framed in the photo similarly to a passport photo
For this rule, we will use a combination of the Computer Vision API and the Face API. The Face API is going to give us some data about how many pixels are occupied by the face in the photo, and we’re simply using the Computer Vision API to get the dimensions of the photo. I appreciate there are many other ways you can retrieve this data without having to call out to an external API, but seeing as we’re playing with these APIs today, why not?
The following snippet of code will get the dimensions of the photo, get the width of the Face Rectangle (the width of the detected face), then work out the percentage of the width of the photo which is consumed by the face. My profile picture is a good example of good framing, and the width of my face consumes 43.59375% of the width of the photo. Based on this, I’m going to say a ‘good’ photo ranges somewhere between 35% and 65%. The following code snippet will work out whether the picture meets this criterion, and return a true/false for the variable $GoodFraming.
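The arithmetic is simple enough to sketch in Python (not the PowerShell used for this post). This assumes the Face API entry carries a `faceRectangle` with a `width`, and the Computer Vision response carries a `metadata` block with the image `width`. The pixel values below are hypothetical, chosen only because they reproduce the 43.59375% figure; the real dimensions of the photo are not given.

```python
def face_width_percent(face, image_metadata):
    """Percentage of the photo's width occupied by the detected face."""
    return 100 * face["faceRectangle"]["width"] / image_metadata["width"]


def good_framing(face, image_metadata, minimum=35.0, maximum=65.0):
    """Rule #3: true when the face fills between 35% and 65% of the photo's width."""
    return minimum <= face_width_percent(face, image_metadata) <= maximum


# Hypothetical values: a 279 pixel wide face in a 640 pixel wide photo.
face = {"faceRectangle": {"top": 120, "left": 180, "width": 279, "height": 279}}
metadata = {"width": 640, "height": 640, "format": "Jpeg"}

print(face_width_percent(face, metadata))  # 43.59375
print(good_framing(face, metadata))        # True
```

Only the width is compared here, as in the description above; a fuller check might also look at height or how centred the rectangle is.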
Rule #4: The photo should not contain any adult content
We are all decent human beings, so it seems like this shouldn’t be a concern, but the reality is if you work in a larger organisation, someone may choose to perform a ‘mic drop’ by updating their profile picture to something unsavoury in their last weeks of employment. Luckily the Computer Vision API also has adult content detection. The following code snippet will return a simple true/false against the variable $NSFW.
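A minimal Python sketch of that check (not the PowerShell used for this post) could look like the following. It assumes the Computer Vision response nests its flags under an `adult` object as `isAdultContent` and `isRacyContent`. Whether racy content should also count as NSFW is a policy choice the post does not specify, so it is left as an optional, hypothetical switch.

```python
def is_nsfw(analysis, include_racy=False):
    """Rule #4: true when the image is flagged as adult (optionally also racy)."""
    adult = analysis["adult"]
    if adult["isAdultContent"]:
        return True
    return bool(include_racy and adult["isRacyContent"])


# Hypothetical parsed response for a harmless profile photo.
analysis = {
    "adult": {
        "isAdultContent": False,
        "isRacyContent": False,
        "adultScore": 0.01,
        "racyScore": 0.0132,
    }
}

print(is_nsfw(analysis))  # False
```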
Interestingly, the ‘visualFeatures=Adult’ query returns a true/false for ‘isAdultContent’ and for ‘isRacyContent’ as well as numerical results for ‘adultScore’ and ‘racyScore’. I was wondering what might be considered ‘racy’, so I fed the API the following image.
As it turns out, old mate Bill gets himself a racyScore of 0.2335. My profile picture gets 0.0132, and an actual racing car got 0.0087. On those numbers, Bill Gates is roughly eighteen times as racy as I am, and off the charts compared to a racing car (about 27 times).
Other Cool Things
There are all sorts of other neat things these APIs can return, which would be even more helpful in vetting corporate profile pictures. Things such as the landmarks of the face, whether or not the person is wearing sunglasses, the individual’s gender, the photo’s similarity to other photos, or whether or not the person is a celebrity would all be helpful in a fully developed solution.
Conclusion
Hopefully the above examples have provided a little insight into how the Microsoft Cognitive Services APIs might be useful for business applications. I find it amazing how much affordable artificial cognitive power is now easily available to anyone interested, even with minimal coding skills.
You can see from the examples above how easily you could scale something like this. You could offer users an application where they take their own profile picture and have it immediately reviewed by Microsoft Cognitive Services. The picture is then either approved or rejected on the spot. On rejection, the user gets the option to submit the photo for manual approval (hooray! one of the grads got their job back!) or to discard it and try again.
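To make that workflow a little more concrete, here is a hedged Python sketch (not the PowerShell used for this post) that combines the four rules into a single decision. It assumes the parsed Face API list and Computer Vision response have the shapes used by those services (`faceRectangle.width`, `metadata.width`, `adult.isAdultContent`). The function name and failure messages are hypothetical.

```python
def review_photo(faces, analysis, minimum=35.0, maximum=65.0):
    """Apply the four rules and return (approved, list_of_failed_rules)."""
    failures = []
    if len(faces) == 0:
        failures.append("no face detected")
    elif len(faces) > 1:
        failures.append("more than one face")
    else:
        # Framing only makes sense when exactly one face was found.
        face_width = faces[0]["faceRectangle"]["width"]
        percent = 100 * face_width / analysis["metadata"]["width"]
        if not minimum <= percent <= maximum:
            failures.append("face poorly framed")
    if analysis["adult"]["isAdultContent"]:
        failures.append("adult content")
    return (not failures, failures)
```

An application could approve the photo when the first element is true, and otherwise show the failed rules alongside the options to request manual approval or try again.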
I have yet to see any of the Microsoft Cognitive Services implemented in any of the businesses I have been involved with, but I suspect in the coming years we will be seeing more and more of it, and that is certainly something I look forward to.