The Consumerisation of Artificial Intelligence

We live in very exciting times. Computing capabilities have grown exponentially over the last two centuries, to the point where we can easily use highly complex artificial intelligence functionality today. It all began with Charles Babbage’s design for the first mechanical computer in the early 19th century. The concepts and work that resulted were iterated upon, which enabled the invention of the first programmable digital computer in the 1940s. Fast forward 70 years, and Artificial Intelligence is evolving into something that people can easily use without a degree in mathematics, access to enormous amounts of processing hardware or access to enormous amounts of data.

Where we are now

There is still a long way to go; however, progress is being made faster than ever before. Instead of being able to solve just mathematical problems, we can now detect emotion, perform facial recognition, identify landmarks or celebrities in images, understand spoken language and provide textual descriptions of images.

More importantly, though, these advanced services are now available to a broad general audience. With cloud technology providing these complex services, we are now at the tipping point where vast numbers of people are empowered to create solutions or applications that leverage this power.

The vehicle for this is predominantly (but not limited to) the exposure of Application Programming Interfaces (APIs) for consumption by integrators or solution providers. The simplicity of these APIs lowers the barrier to operations that were previously complex and hard to leverage.

Offerings

There are quite a few vendors providing offerings in this space: Google, Microsoft, AWS and Salesforce, to name a few. All provide different levels and types of access to these advanced services. Microsoft is well positioned here with its Azure cloud and Cognitive Services technology stack. The remainder of this article will concentrate on the Microsoft offerings.

It is not all easy

While high-level artificial intelligence services are readily available today, more involved technology, such as machine learning, is what is used to build these consumer-friendly services. To show the spectrum on offer, the following image shows the broad Microsoft suite, known as the Microsoft AI Platform, with the least complex offerings on the left, moving to the more complex towards the right.

  • Cognitive Services
    A set of APIs built using pre-trained models and example data, with easy-to-consume interfaces. The APIs allow developers to build more intelligence into their applications with little to no knowledge of how to build specific machine learning or artificial intelligence solutions.
  • Machine learning and bots
    This features a toolset called “Cortana Intelligence Suite” that provides a relatively easy interface for building and training machine learning models. In addition, a “Bot Framework” is available that developers can leverage to build applications with intelligent bots. These tools are a little more advanced than the Cognitive Services in that a higher degree of investment and development effort is required.
  • Cognitive Toolkit
    This provides a framework to access the lower levels of machine learning through neural networks. It is predominantly script driven and very complex. This toolset provides the basis from which Cognitive Services, Cortana Intelligence Suite and the Bot Framework are all built. Large amounts of sample data and a good understanding of the mathematics behind machine learning are required. As such, the barrier to entry is typically quite high.

An example

Now that you have an idea of the scope of what is provided, we will show an actual example of how easy it is to consume the powerful services that Cognitive Services have to offer.

Setup

To utilise Microsoft Cognitive Services, you must first provision a service you wish to use in Azure. To do that you simply go into the Azure portal, and add a specific Cognitive Service to your selected resource group. The following image shows a list of some of the possible Cognitive Services you may add.

Once this is done, select the service and click on the ‘Keys’ option to bring up the ‘Keys’ blade. This lists the access keys, which are what allow you to use the service and are how your usage is measured. In the image below, we have set up an instance of the ‘Computer Vision’ Cognitive Service, which will allow us to analyse an image (amongst other things).

You are now all set up to use the Cognitive Service.

Let’s have a play

For this example, we will analyse the following image which is located at https://www.planwallpaper.com/static/images/Child-Girl-with-Sunflowers-Images.jpg

In order to analyse this image, we send a request to the Cognitive Service Vision API at the following URL: https://southeastasia.api.cognitive.microsoft.com/vision/v1.0/analyze?visualFeatures=Categories,tags,description,adult&language=en

In that URL, you can see that there are some options specified, specifically “visualFeatures=Categories,tags,description,adult”. This tells the API that we want the response to include the following: categories, tags associated with the image, a plain text description, and an indication of whether there is any adult content.

As part of that request, we need to specify the access key that was provisioned when we first set up this API. This is provided as part of the HTTP header collection, in a header named “Ocp-Apim-Subscription-Key” whose value is one of the keys listed in the Azure portal for that service.

In addition, we need to tell the service where the image to analyse is located. We do this by including the following content as part of the body:

{ "url": "https://www.planwallpaper.com/static/images/Child-Girl-with-Sunflowers-Images.jpg" }

This request is then posted to the API endpoint via a standard HTTP POST.
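As a rough sketch, the request described above could be assembled in Python using only the standard library. The helper names build_analyze_request and analyze_image are illustrative assumptions, not part of the service; the endpoint, query options, header name and body shape are taken from the steps above, and the region prefix in the endpoint will differ if your service was provisioned elsewhere.

```python
import json
import urllib.parse
import urllib.request

# Endpoint from the article; the region prefix depends on where the service lives.
ANALYZE_ENDPOINT = "https://southeastasia.api.cognitive.microsoft.com/vision/v1.0/analyze"


def build_analyze_request(subscription_key, image_url):
    """Build (but do not send) the HTTP POST request. Hypothetical helper."""
    query = urllib.parse.urlencode(
        {"visualFeatures": "Categories,tags,description,adult", "language": "en"},
        safe=",",  # leave the commas in the feature list unescaped
    )
    # The body tells the service where the image to analyse is located.
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        f"{ANALYZE_ENDPOINT}?{query}",
        data=body,
        headers={
            # One of the keys from the 'Keys' blade in the Azure portal.
            "Ocp-Apim-Subscription-Key": subscription_key,
            # The body is JSON, so declare it as such.
            "Content-Type": "application/json",
        },
        method="POST",
    )


def analyze_image(subscription_key, image_url):
    """Send the request and return the parsed JSON response. Hypothetical helper."""
    request = build_analyze_request(subscription_key, image_url)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling analyze_image with one of your own access keys and the image URL above would perform the POST and return the response as a Python dictionary. Building the request separately from sending it makes it easy to inspect the URL, headers and body before any call is made.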

Once this is done, the following response is received (with some sections removed for brevity):

{
    "categories": [
        {
            "name": "people_young",
            "score": 0.765625
        }
    ],
    "adult": {
        "isAdultContent": false,
        "isRacyContent": false,
        "adultScore": 0.0080434931442141533,
        "racyScore": 0.012916238978505135
    },
    "tags": [
        {
            "name": "outdoor",
            "confidence": 0.99058502912521362
        },
        {
            "name": "tree",
            "confidence": 0.98921263217926025
        },
        {
            "name": "flower",
            "confidence": 0.91467201709747314
        },
        {
            "name": "person",
            "confidence": 0.90480440855026245
        },
        {
            "name": "plant",
            "confidence": 0.89624994993209839
        },
        {
            "name": "little",
            "confidence": 0.87906831502914429
        },
        {
            "name": "yellow",
            "confidence": 0.87603884935379028
        }
    ],
    "description": {
        "tags": [
            "outdoor",
            "flower",
            "person",
            "plant",
            "little",
….more responses here
            "grass",
            "yellow",
            "holding",
            "garden"
        ],
        "captions": [
            {
                "text": "a little girl wearing a yellow flower",
                "confidence": 0.89666165480076254
            }
        ]
    },
}

Looking at this response, there is a wealth of information that was taken from the image. We can see that:

  • The category was “people_young”, with a confidence that this is the correct answer of 0.765625 or approximately 77%.
  • There is no adult or racy content within the image. “Racy” content is somewhat subjective, but it generally refers to an image that may contain provocative, though not necessarily pornographic, material. “Adult” content, on the other hand, does refer to pornographic material.
  • The tags show that the image contains items relating to outdoor, tree, flower, person, plant, little and yellow, all with a relatively high degree of confidence.
  • The description tags closely match the overall tags, and the description also contains a plain text caption of “a little girl wearing a yellow flower” with a confidence score of 0.89666165480076254 or approximately 90%.

From a simple API call, we have been able to determine a vast amount of information with varying confidence levels. This allows the developer or consumer to use only the information they consider accurate enough, based on the confidence level. In addition, we have been able to convert a static image into plain text with no manual intervention required!

This is just one example from the set of APIs in the Computer Vision family. There are many more families of APIs, such as text analysis, face recognition, emotion recognition and a range of others.
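To illustrate using only the results you trust, here is a minimal sketch that filters the tags by confidence and picks the most confident caption. The function names and the 0.9 threshold are assumptions chosen for illustration; the sample data is a trimmed copy of the response shown above.

```python
def confident_tags(analysis, threshold=0.9):
    """Return the names of tags whose confidence meets the threshold."""
    return [
        tag["name"]
        for tag in analysis.get("tags", [])
        if tag["confidence"] >= threshold
    ]


def best_caption(analysis):
    """Return the text of the highest-confidence caption, or None if there is none."""
    captions = analysis.get("description", {}).get("captions", [])
    if not captions:
        return None
    return max(captions, key=lambda caption: caption["confidence"])["text"]


# Trimmed copy of the response shown above.
sample = {
    "tags": [
        {"name": "outdoor", "confidence": 0.99058502912521362},
        {"name": "tree", "confidence": 0.98921263217926025},
        {"name": "flower", "confidence": 0.91467201709747314},
        {"name": "person", "confidence": 0.90480440855026245},
        {"name": "plant", "confidence": 0.89624994993209839},
        {"name": "yellow", "confidence": 0.87603884935379028},
    ],
    "description": {
        "captions": [
            {
                "text": "a little girl wearing a yellow flower",
                "confidence": 0.89666165480076254,
            }
        ]
    },
}

print(confident_tags(sample))  # ['outdoor', 'tree', 'flower', 'person']
print(best_caption(sample))    # a little girl wearing a yellow flower
```

Raising or lowering the threshold trades completeness for accuracy, so the right value depends on how the tags will be used.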

Conclusion

The development of artificial intelligence and the application of machine learning have come a long way in a relatively short amount of time. This applies not just to the development of the algorithms and associated models, but also to the ease of consumption of these technologies. Using an industry standard HTTP request, we can leverage the huge amount of work done by specialists in the artificial intelligence space, harnessing the large scale of the cloud.

With such a vast amount of resources available in such an easy way, it is no longer about how to harness this technology, but where we can apply it to make our applications and systems more intelligent. End users see immediate benefit as applications seamlessly become more “human like” or intelligent, and as a result, those users become more engaged.

So, the challenge now is to decide how best to augment or enhance your applications with this new-found intelligence. For example:

  • Applications such as feedback forms could easily be enhanced to determine if content was offensive, or show whether content is positive or negative.
  • An organisation’s internal CMS could have automatic content moderation to determine if content is offensive or adult in nature and flag it for approval.
  • Uploaded images could be automatically tagged or categorised when stored, for easy searchability without manual intervention.
  • Text in images could be automatically recognised and stored as metadata about the image.
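As one illustration, the content moderation idea could be sketched as a small check over the “adult” section of the response shown earlier. The function name needs_approval and the 0.5 racy score threshold are assumptions made for this example, not values recommended by the service.

```python
def needs_approval(analysis, racy_threshold=0.5):
    """Return True if the image should be flagged for manual approval.

    Flags the image when the service marks it as adult or racy, or when the
    racy score reaches an (assumed) threshold chosen by the application.
    """
    adult = analysis.get("adult", {})
    return bool(
        adult.get("isAdultContent")
        or adult.get("isRacyContent")
        or adult.get("racyScore", 0.0) >= racy_threshold
    )


# The "adult" section from the response shown earlier.
sunflower_image = {
    "adult": {
        "isAdultContent": False,
        "isRacyContent": False,
        "adultScore": 0.0080434931442141533,
        "racyScore": 0.012916238978505135,
    }
}

print(needs_approval(sunflower_image))  # False
```

A CMS could run a check like this when content is uploaded and route anything flagged to a moderator instead of publishing it straight away.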

These are the simple use cases. Combining these services into specific application scenarios seamlessly can take your applications to a new level of customer engagement. The rest is up to you.

The examples and information presented here are just the tip of the iceberg, though. For more information, head over to https://azure.microsoft.com/en-us/services/cognitive-services/directory/ to see the full list of Cognitive Services available. In addition, you can try interactive versions of a few of the Cognitive Services to get a feel for what they can do.
