AI-900

The AI Fundamentals exam is an opportunity to demonstrate knowledge of common ML and AI workloads and how to implement them on Azure.

Use this exam cram to familiarise yourself with the general technology concepts prior to sitting your certification exam.

Each testing domain is listed along with the key review points that you will need to be familiar with in order to be successful.

Exam AI-900: Microsoft Azure AI Fundamentals Exam Prep

Watch this 50-minute video to remind yourself of the key concepts of Artificial Intelligence and boost your chances of certification success.

Section 1: Identify features of common AI workloads.

1.1 – Identify prediction/forecasting workloads

Review

AI is the creation of software that imitates human behaviour and capabilities.

Key elements include:

  • Machine learning – This is often the foundation for an AI system, and is the way we “teach” a computer model to make predictions and draw conclusions from data.
  • Anomaly detection – The capability to automatically detect errors or unusual activity in a system.
  • Computer vision – The capability of software to interpret the world visually through cameras, video, and images.
  • Natural language processing – The capability for a computer to interpret written or spoken language, and respond in kind.
  • Conversational AI – The capability of a software “agent” to participate in a conversation.

So how do machines learn?
The answer is, from data. In today’s world, we create huge volumes of data as we go about our everyday lives. From the text messages, emails, and social media posts we send to the photographs and videos we take on our phones, we generate massive amounts of information. More data still is created by millions of sensors in our homes, cars, cities, public transport infrastructure, and factories.

Data scientists can use all of that data to train machine learning models that can make predictions and inferences based on the relationships they find in the data.

1.2 – Identify features of anomaly detection workloads

Review

Anomaly detection – a machine learning based technique that analyses data over time and identifies unusual changes.

Examples include a credit card system that detects unusual usage, an automated production line that identifies failures, or even a Formula 1 racing car that uses sensors to warn of potential mechanical failures.
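To make the idea concrete, here is a minimal sketch using scikit-learn’s IsolationForest (a general-purpose illustration rather than an Azure service; the transaction amounts are invented):

```python
# A minimal anomaly detection sketch using scikit-learn's IsolationForest.
# The transaction amounts are invented purely for illustration.
import numpy as np
from sklearn.ensemble import IsolationForest

# Typical daily card spend, plus one unusually large transaction
amounts = np.array([[25.0], [18.5], [32.0], [27.5], [21.0], [950.0]])

model = IsolationForest(contamination=0.1, random_state=0).fit(amounts)
flags = model.predict(amounts)  # 1 = normal, -1 = anomaly
print(flags)                    # expected: the 950.0 transaction is flagged as -1
```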

1.3 – Identify computer vision workloads

Review

Computer Vision is an area of AI that deals with visual processing.

Most computer vision solutions are based on machine learning models that can be applied to visual input from cameras, videos, or images. The following describes common computer vision tasks.

  • Image classification involves training a machine learning model to classify images based on their contents. For example, in a traffic monitoring solution you might use an image classification model to classify images based on the type of vehicle they contain, such as taxis, buses, cyclists, and so on.
  • Object detection machine learning models are trained to classify individual objects within an image, and identify their location with a bounding box. For example, a traffic monitoring solution might use object detection to identify the location of different classes of vehicle.
  • Semantic segmentation is an advanced machine learning technique in which individual pixels in the image are classified according to the object to which they belong. For example, a traffic monitoring solution might overlay traffic images with “mask” layers to highlight different vehicles using specific colours.
  • Image analysis solutions combine machine learning models with advanced image analysis techniques to extract information from images, including “tags” that could help catalog the image or even descriptive captions that summarize the scene shown in the image.
  • Face detection is a specialised form of object detection that locates human faces in an image. This can be combined with classification and facial geometry analysis techniques to infer details such as age and emotional state; and even recognise individuals based on their facial features.
  • Optical character recognition is a technique used to detect and read text in images. You can use OCR to read text in photographs or to extract information from scanned documents such as letters, invoices, or forms.

1.4 – Identify natural language processing or knowledge mining workloads

Review

Natural language processing (NLP) is the area of AI that deals with creating software that understands written and spoken language.

In Microsoft Azure, you can use the following cognitive services to build natural language processing solutions:

  • Text Analytics – use this service to analyse text documents and extract key phrases, detect entities (such as places, dates, and people), and evaluate sentiment (how positive or negative a document is).
  • Translator Text – use this service to translate text between more than 60 languages.
  • Speech – use this service to recognize and synthesize speech, and to translate spoken languages.
  • Language Understanding Intelligent Service (LUIS) – use this service to train a language model that can understand spoken or text-based commands.

1.5 – Identify conversational AI workloads

Review

Conversational AI is the term used to describe solutions where AI agents participate in conversations with humans. Most commonly, conversational AI solutions use bots to manage dialogs with users. These dialogs can take place through web site interfaces, email, social media platforms, messaging systems, phone calls, and other channels.

To create conversational AI solutions on Microsoft Azure, you can use the following services:

  • QnA Maker – This cognitive service enables you to quickly build a knowledge base of questions and answers that can form the basis of a dialog between a human and an AI agent.
  • Azure Bot Service – This service provides a platform for creating, publishing, and managing bots. Developers can use the Bot Framework to create a bot and manage it with Azure Bot Service – integrating back-end services like QnA Maker and LUIS, and connecting to channels for web chat, email, Microsoft Teams, and others.

Section 2: Identify guiding principles for responsible AI.

2.1 – Describe considerations for fairness in an AI solution

Review

AI systems should treat all people fairly. For example, suppose you create a machine learning model to support a loan approval application for a bank. The model should make predictions of whether or not the loan should be approved without incorporating any bias based on gender, ethnicity, or other factors that might result in an unfair advantage or disadvantage to specific groups of applicants.

2.2 – Describe considerations for reliability and safety in an AI solution

Review

AI systems should perform reliably and safely. For example, consider an AI-based software system for an autonomous vehicle, or a machine learning model that diagnoses patient symptoms and recommends prescriptions. Unreliability in these kinds of systems can result in substantial risk to human life.

AI-based software applications must be subjected to rigorous testing and deployment management processes to ensure that they work as expected before release.

2.3 – Describe considerations for privacy and security in an AI solution

Review

AI systems should be secure and respect privacy. The machine learning models on which AI systems are based rely on large volumes of data, which may contain personal details that must be kept private. Even after the models are trained and the system is in production, it uses new data to make predictions or take action that may be subject to privacy or security concerns.

2.4 – Describe considerations for inclusiveness in an AI solution

Review

AI systems should empower everyone and engage people. AI should bring benefits to all parts of society, regardless of physical ability, gender, sexual orientation, ethnicity, or other factors.

2.5 – Describe considerations for transparency in an AI solution

Review

AI systems should be understandable. Users should be made fully aware of the purpose of the system, how it works, and what limitations may be expected.

2.6 – Describe considerations for accountability in an AI solution

Review

People should be accountable for AI systems. Designers and developers of AI-based solutions should work within a framework of governance and organizational principles that ensures solutions meet clearly defined ethical and legal standards.

Section 3: Identify common machine learning types.

3.1 – Identify regression machine learning scenarios

Review

Regression is a form of machine learning that is used to predict a numeric label based on an item’s features. For example, an automobile sales company might use the characteristics of a car (such as engine size, number of seats, mileage, and so on) to predict its likely selling price.

In this case, the characteristics of the car are the features, and the selling price is the label.

Regression is an example of a supervised machine learning technique in which you train a model using data that includes both the features and known values for the label, so that the model learns to fit the feature combinations to the label. Then, after training has been completed, you can use the trained model to predict labels for new items for which the label is unknown.
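A minimal sketch of that supervised workflow using scikit-learn (the car data is invented for illustration):

```python
# Supervised regression: predict a numeric label (price) from features.
from sklearn.linear_model import LinearRegression

# Features: [engine size (litres), number of seats, mileage (thousands)]
X = [[1.6, 5, 60], [2.0, 5, 30], [1.2, 4, 90], [3.0, 5, 20]]
y = [8000, 15000, 4500, 22000]        # known selling prices (the labels)

model = LinearRegression().fit(X, y)  # learn the feature-to-label mapping
print(model.predict([[1.8, 5, 45]]))  # predict the price of an unseen car
```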

3.2 – Identify classification machine learning scenarios

Review

Classification is a form of machine learning that is used to predict which category, or class, an item belongs to. For example, a health clinic might use the characteristics of a patient (such as age, weight, blood pressure, and so on) to predict whether the patient is at risk of diabetes.

In this case, the characteristics of the patient are the features, and the label is a classification of either 0 or 1, representing non-diabetic or diabetic.

Classification is an example of a supervised machine learning technique in which you train a model using data that includes both the features and known values for the label, so that the model learns to fit the feature combinations to the label. Then, after training has been completed, you can use the trained model to predict labels for new items for which the label is unknown.
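The same pattern applies to classification; an illustrative scikit-learn sketch with invented patient data:

```python
# Supervised classification: predict a class label (0 = non-diabetic, 1 = diabetic).
from sklearn.linear_model import LogisticRegression

# Features: [age, weight (kg), systolic blood pressure] - values invented
X = [[25, 60, 110], [50, 95, 150], [33, 70, 120], [61, 102, 160]]
y = [0, 1, 0, 1]                               # known diagnoses (the labels)

model = LogisticRegression().fit(X, y)
print(model.predict([[45, 88, 140]]))          # predicted class for a new patient
print(model.predict_proba([[45, 88, 140]]))    # probability of each class
```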

3.3 – Identify clustering machine learning scenarios

Review

Clustering is a form of machine learning that is used to group similar items into clusters based on their features. For example, a researcher might take measurements of penguins, and group them based on similarities in their proportions.

Clustering is an example of unsupervised machine learning, in which you train a model to separate items into clusters based purely on their characteristics, or features. There is no previously known cluster value (or label) from which to train the model.
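A minimal clustering sketch in scikit-learn using k-means (the penguin measurements are invented):

```python
# Unsupervised clustering: group items by feature similarity, with no labels.
from sklearn.cluster import KMeans

# Features: [bill length (mm), flipper length (mm)] - invented measurements
X = [[39, 181], [40, 186], [46, 211], [48, 215], [50, 222], [38, 184]]

model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(model.labels_)   # cluster assignment for each penguin, e.g. [0 0 1 1 1 0]
```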

Section 4: Describe core machine learning concepts.

4.1 – Identify features and labels in a dataset for machine learning

Review

Machine learning is a technique that uses mathematics and statistics to create a model that can predict unknown values.

For example, suppose a cycle rental business used historic data to train a model that predicts daily rental demand in order to make sure sufficient staff and cycles are available.

To do this, they could create a machine learning model that takes information such as the day of the week and anticipated weather conditions as inputs, and predicts the expected number of rentals as an output.

Mathematically, you can think of machine learning as a way of defining a function (f) that operates on one or more features (x) to calculate a predicted label (y):

f(x)=y

In this example, the details about a given day (day of the week, weather, and so on) are the features (x), the number of rentals for that day is the label (y), and the function (f) that calculates the number of rentals based on the information about the day is encapsulated in a machine learning model.
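Expressed in code, a sketch of the same idea (the rental data is invented, and any regression algorithm could stand in for f):

```python
# Features (x): [day of week (0=Mon), expected temperature C, raining (0/1)]
# Label   (y): number of rentals that day.  Data invented for illustration.
from sklearn.tree import DecisionTreeRegressor

X = [[0, 18, 0], [1, 12, 1], [5, 24, 0], [6, 22, 0], [2, 9, 1]]
y = [210, 90, 480, 460, 75]

f = DecisionTreeRegressor().fit(X, y)   # the model encapsulates f
print(f.predict([[5, 21, 0]]))          # y: predicted rentals for a Saturday
```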

4.2 – Describe how training and validation datasets are used in machine learning

Review

When we train a machine learning model we typically split the data into three parts.

  • Training
  • Validating
  • Testing

In every iteration of training, we use the training dataset as examples from which the model can extract and generalise patterns.

After training, we use the validation dataset to check the performance of the model. If the results are not satisfactory we tweak the model and begin the process again.

When training completes, we use the test dataset to check the performance of the model.

Note that the test dataset can only be used once.
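A common way to produce the three splits, sketched with scikit-learn’s train_test_split (the 60/20/20 proportions below are a convention, not a rule):

```python
# Splitting a dataset into training, validation, and test sets.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(-1, 1)   # stand-in features
y = np.arange(100)                  # stand-in labels

# First carve off 20% as the held-back test set (used once, at the very end)
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
# Then split the remainder into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))   # 60 20 20
```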

4.3 – Describe how machine learning algorithms are used for model training

Review

Machine learning algorithms are pieces of code that help people explore, analyse and find meaning in complex data sets. Each algorithm is a finite set of unambiguous step-by-step instructions that a machine can follow to achieve a certain goal. In a machine learning model, the goal is to establish or discover patterns that people can use to make predictions or categorise information.

Different algorithms analyse data in different ways. They’re often grouped by the machine learning techniques that they’re used for: supervised learning, unsupervised learning and reinforcement learning. The most commonly used algorithms use regression and classification to predict target categories, find unusual data points, predict values and discover similarities.

Examples of machine learning algorithms

  • Linear regression algorithms show or predict the relationship between two variables or factors by fitting a continuous straight line to the data. The line is often calculated using the Squared Error Cost function. Linear regression is one of the most popular types of regression analysis.
  • Logistic regression algorithms fit a continuous S-shaped curve to the data. Logistic regression is another popular type of regression analysis.
  • Naïve Bayes algorithms calculate the probability that an event will occur, based on the occurrence of a related event.
  • Support Vector Machines draw a hyperplane between the closest data points of different classes (the support vectors), maximising the margin between the classes to differentiate them more clearly.
  • Decision tree algorithms split the data into two or more homogeneous sets. They use if–then rules to separate the data based on the most significant differentiator between data points.
  • K-Nearest neighbour algorithms store all available data points and classify each new data point based on the data points that are closest to it, as measured by a distance function.
  • Random forest algorithms are based on decision trees, but instead of creating one tree they create a forest of trees, each built from a random sample of the data. They then aggregate the votes from the different trees in the forest to determine the final class of the test object.
  • Gradient boosting algorithms produce a prediction model that bundles weak prediction models – typically decision trees – through an ensembling process that improves the overall performance of the model.
  • K-Means algorithms classify data into clusters – where K equals the number of clusters. The data points inside of each cluster are homogeneous, and they’re heterogeneous to data points in other clusters.
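Most of the algorithms listed above are available off the shelf; a compact sketch of several of them in scikit-learn, fitted to toy data purely to show the common interface:

```python
# Several of the algorithms listed above, as implemented in scikit-learn.
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.cluster import KMeans

X, y = [[0, 0], [1, 1], [2, 2], [3, 3]], [0, 0, 1, 1]   # toy data

for clf in (GaussianNB(), SVC(), DecisionTreeClassifier(),
            KNeighborsClassifier(n_neighbors=1),
            RandomForestClassifier(), GradientBoostingClassifier()):
    print(type(clf).__name__, clf.fit(X, y).predict([[2, 2]]))

print("clusters:", KMeans(n_clusters=2, n_init=10).fit(X).labels_)
```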

Section 5: Identify core tasks in creating a machine learning solution.

5.1 – Describe common features of data ingestion and preparation

Review

Data ingestion is the process in which data is extracted from one or more sources and then prepared for training machine learning models. It is time intensive, especially if done manually and if you have large amounts of data from multiple sources. Automating this effort frees up resources and ensures your models use the most recent and applicable data.
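As a simple illustration (the file and column names are hypothetical), a pandas-based ingestion and preparation step might look like this:

```python
# A minimal ingestion-and-preparation sketch with pandas; file names and
# column names are hypothetical.
import pandas as pd

rentals = pd.read_csv("daily_rentals.csv")          # one source...
weather = pd.read_json("weather_history.json")      # ...and another

df = rentals.merge(weather, on="date")              # combine sources
df = df.dropna(subset=["rentals"])                  # drop rows missing the label
df["temp_c"] = df["temp_c"].fillna(df["temp_c"].median())  # impute a feature
df.to_parquet("training_data.parquet")              # hand off for training
```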

5.2 – Describe common features of model training and evaluation

Review

When we train a machine learning model we typically split the data into three parts.

  • Training
  • Validating
  • Testing

In every iteration of training, we use the training dataset as examples from which the model can extract and generalise patterns.

After training, we use the validation dataset to check the performance of the model. If the results are not satisfactory we tweak the model and begin the process again.

When training completes, we use the test dataset to check the performance of the model.

Note that the test dataset can only be used once.

5.3 – Describe common features of model deployment and management

Review

In Azure Machine Learning, you can deploy a model as a service to Azure Container Instances (ACI) or to an Azure Kubernetes Service (AKS) cluster. For production scenarios, an AKS deployment is recommended, for which you must create an inference cluster compute target.
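A minimal deployment sketch using the v1 azureml-core Python SDK; the registered model name, entry script, and environment file are assumptions for illustration:

```python
# Deploying a registered model as an ACI web service (azureml-core v1 sketch).
from azureml.core import Workspace, Environment, Model
from azureml.core.model import InferenceConfig
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()                 # reads your workspace config.json
model = Model(ws, name="demand-forecast")    # hypothetical registered model
env = Environment.from_conda_specification("infer-env", "environment.yml")
inference_config = InferenceConfig(entry_script="score.py", environment=env)

# ACI suits dev/test; for production, use AksWebservice.deploy_configuration()
# with an AKS inference cluster compute target instead.
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
service = Model.deploy(ws, "rental-predictor", [model], inference_config, deployment_config)
service.wait_for_deployment(show_output=True)
print(service.scoring_uri)                   # endpoint client apps will call
```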

Section 6: Describe capabilities of no-code machine learning with Azure Machine Learning studio.

6.1 – Describe automated ML UI

Review

Data scientists expend a lot of effort exploring and pre-processing data, and trying various types of model-training algorithms to produce accurate models, which is time consuming, and often makes inefficient use of expensive compute hardware.

Azure Machine Learning is a cloud-based platform for building and operating machine learning solutions in Azure. It includes a wide range of features and capabilities that help data scientists prepare data, train models, publish predictive services, and monitor their usage. Most importantly, it helps data scientists increase their efficiency by automating many of the time-consuming tasks associated with training models; and it enables them to use cloud-based compute resources that scale effectively to handle large volumes of data while incurring costs only when actually used.

6.2 – Describe Azure Machine Learning designer

Review

The Azure Machine Learning designer is a drag-and-drop tool that lets you create machine learning models without a single line of code.

It focuses on the following core tasks.

  • Pipeline creation
  • Data import
  • Data preparation
  • Training
  • Evaluating
  • Deployment

Section 7: Identify common types of computer vision solution.

7.1 – Identify features of image classification solutions

Review

Computer Vision has the ability to analyse an image, evaluate the objects that are detected, and generate a human-readable phrase or sentence that can describe what was detected in the image. Depending on the image contents, the service may return multiple results, or phrases. Each returned phrase will have an associated confidence score, indicating how confident the algorithm is in the supplied description. The highest confidence phrases will be listed first.

To help you understand this concept, consider an image of the Empire State Building in New York. The phrases returned by the service are listed below in order of confidence.

  • A black and white photo of a city
  • A black and white photo of a large city
  • A large white building in a city

The image descriptions generated by Computer Vision are based on a set of thousands of recognizable objects, which can be used to suggest tags for the image. These tags can be associated with the image as metadata that summarizes attributes of the image; and can be particularly useful if you want to index an image along with a set of key terms that might be used to search for images with specific attributes or contents.
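A sketch of generating such captions with the azure-cognitiveservices-vision-computervision package; the endpoint, key, and image URL are placeholders:

```python
# Image captioning sketch with the Computer Vision SDK.
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from msrest.authentication import CognitiveServicesCredentials

client = ComputerVisionClient("https://<resource>.cognitiveservices.azure.com/",
                              CognitiveServicesCredentials("<key>"))

analysis = client.describe_image("https://example.com/empire-state.jpg")
for caption in analysis.captions:        # highest-confidence phrases first
    print(f"{caption.text} ({caption.confidence:.2f})")
```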

7.2 – Identify features of object detection solutions

Review

The object detection capability is similar to tagging, in that the service can identify common objects; but rather than simply providing tags for the recognised objects, the service also returns what are known as bounding box coordinates.

Not only will you get the type of object, but you will also receive a set of coordinates that indicate the top, left, width, and height of the detected object, which you can use to identify its location within the image.
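A sketch of retrieving those bounding boxes with the same Computer Vision SDK (placeholder endpoint, key, and image URL):

```python
# Object detection sketch with the Computer Vision SDK.
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from msrest.authentication import CognitiveServicesCredentials

client = ComputerVisionClient("https://<resource>.cognitiveservices.azure.com/",
                              CognitiveServicesCredentials("<key>"))

result = client.detect_objects("https://example.com/street.jpg")
for obj in result.objects:
    r = obj.rectangle   # bounding box: left (x), top (y), width (w), height (h)
    print(obj.object_property, obj.confidence, (r.x, r.y, r.w, r.h))
```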

7.3 – Identify features of optical character recognition solutions

Review

The ability for computer systems to process written or printed text is an area of artificial intelligence (AI) where computer vision intersects with natural language processing. You need computer vision capabilities to “read” the text, and then you need natural language processing capabilities to make sense of it.

The basic foundation of processing printed text is optical character recognition (OCR), in which a model can be trained to recognise individual shapes as letters, numerals, punctuation, or other elements of text.
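With the Computer Vision SDK, OCR via the Read API is asynchronous: you submit the image, then poll for the result. A sketch with placeholder endpoint, key, and URL:

```python
# OCR sketch using the Computer Vision Read API.
import time
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from azure.cognitiveservices.vision.computervision.models import OperationStatusCodes
from msrest.authentication import CognitiveServicesCredentials

client = ComputerVisionClient("https://<resource>.cognitiveservices.azure.com/",
                              CognitiveServicesCredentials("<key>"))

# Submit the image, then poll for the asynchronous result
operation = client.read("https://example.com/scanned-letter.jpg", raw=True)
operation_id = operation.headers["Operation-Location"].split("/")[-1]

while True:
    result = client.get_read_result(operation_id)
    if result.status not in (OperationStatusCodes.running, OperationStatusCodes.not_started):
        break
    time.sleep(1)

if result.status == OperationStatusCodes.succeeded:
    for page in result.analyze_result.read_results:
        for line in page.lines:
            print(line.text)
```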

7.4 – Identify features of facial detection, facial recognition, and facial analysis solutions

Review

Face detection and analysis is an area of artificial intelligence (AI) in which we use algorithms to locate and analyse human faces in images or video content.

Face detection involves identifying regions of an image that contain a human face, typically by returning bounding box coordinates that form a rectangle around the face.

Facial analysis moves beyond simple face detection and returns other information, such as facial landmarks (nose, eyes, eyebrows, lips, and others).

These facial landmarks can be used as features with which to train a machine learning model from which you can infer information about a person, such as their perceived age or perceived emotional state.

A further application of facial analysis is to train a machine learning model to identify known individuals from their facial features. This usage is more generally known as facial recognition, and involves using multiple images of each person you want to recognize to train a model so that it can detect those individuals in new images on which it wasn’t trained.
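A face detection and analysis sketch using the azure-cognitiveservices-vision-face package (placeholder endpoint, key, and URL; note that the set of attributes the service exposes has changed over time):

```python
# Face detection and analysis sketch with the Face SDK.
from azure.cognitiveservices.vision.face import FaceClient
from msrest.authentication import CognitiveServicesCredentials

face_client = FaceClient("https://<resource>.cognitiveservices.azure.com/",
                         CognitiveServicesCredentials("<key>"))

faces = face_client.face.detect_with_url(
    url="https://example.com/people.jpg",
    return_face_attributes=["age"])        # request an analysis attribute

for face in faces:
    rect = face.face_rectangle             # bounding box around the face
    print(rect.left, rect.top, rect.width, rect.height)
    print("perceived age:", face.face_attributes.age)
```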

Section 8: Identify Azure tools and services for computer vision tasks.

8.1 – Identify capabilities of the Computer Vision service

Review

Computer vision is one of the core areas of artificial intelligence and focuses on creating solutions that enable AI-enabled applications to “see” the world and make sense of it.

Of course, computers don’t have biological eyes that work the way ours do, but they are capable of processing images, either from a live camera feed or from digital photographs or videos. This ability to process images is the key to creating software that can emulate human visual perception.

To an AI application, an image is just an array of pixel values. These numeric values can be used as features to train machine learning models that make predictions about the image and its contents.

In Microsoft Azure, the Computer Vision cognitive service uses pre-trained models to analyse images, enabling software developers to easily build applications that can:

  • Interpret an image and suggest an appropriate caption.
  • Suggest relevant tags that could be used to index an image.
  • Categorize an image.
  • Identify objects in an image.
  • Detect faces and people in an image.
  • Recognize celebrities and landmarks in an image.
  • Read text in an image.

8.2 – Identify capabilities of the Custom Vision service

Review

Creating an object detection solution with Custom Vision consists of three main tasks. First you must upload and tag images, then you can train the model, and finally you must publish the model so that client applications can use it to generate predictions.

Image tagging – Before you can train an object detection model, you must tag the classes and bounding box coordinates in a set of training images. This process can be time-consuming, but the Custom Vision portal provides a graphical interface that makes it straightforward. The interface will automatically suggest areas of the image where discrete objects are detected, and you can apply a class label to these suggested bounding boxes or drag to adjust the bounding box area.

Key considerations when tagging training images for object detection are ensuring that you have sufficient images of the objects in question, preferably from multiple angles; and making sure that the bounding boxes are defined tightly around each object.

Model training and evaluation – This is an iterative process in which the Custom Vision service repeatedly trains the model using some of the data, but holds some back to evaluate the model. At the end of the training process, the performance of the trained model is indicated by the following evaluation metrics:

  • Precision: What percentage of the class predictions made by the model were correct?
  • Recall: What percentage of the actual class instances did the model correctly identify?
  • Mean Average Precision (mAP): An overall metric that takes into account both precision and recall across all classes.
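The precision and recall definitions can be checked with a small worked example (scikit-learn is used here purely for illustration):

```python
# Computing precision and recall for a single class with scikit-learn.
from sklearn.metrics import precision_score, recall_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # actual labels (four positives)
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]   # model predictions

print(precision_score(y_true, y_pred))  # 3 of 4 positive predictions correct = 0.75
print(recall_score(y_true, y_pred))     # 3 of 4 actual positives found = 0.75
```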

Using the model for prediction – After you’ve trained the model, and you’re satisfied with its evaluated performance, you can publish the model to your prediction resource.

8.3 – Identify capabilities of the Face service

Review

Microsoft Azure provides multiple cognitive services that you can use to detect and analyse faces, including:

  • Computer Vision, which offers face detection and some basic face analysis, such as determining age.
  • Video Indexer, which you can use to detect and identify faces in a video.
  • Face, which offers pre-built algorithms that can detect, recognize, and analyze faces.

Of these, Face offers the widest range of facial analysis capabilities and currently supports the following functionality:

  • Face Detection
  • Face Verification
  • Find Similar Faces
  • Group faces based on similarities
  • Identify people

8.4 – Identify capabilities of the Form Recognizer service

Review

A common problem in many organizations is the need to process receipt or invoice data. For example, a company might require expense claims to be submitted electronically with scanned receipts, or invoices might need to be digitized and routed to the correct accounts department.

It’s relatively easy to scan receipts to create digital images or PDF documents, and it’s possible to use optical character recognition (OCR) technologies to extract the text contents from the digitized documents. However, typically someone still needs to review the extracted text to make sense of the information it contains.

The Form Recognizer service in Azure provides intelligent form processing capabilities that you can use to automate the processing of data in documents such as forms, invoices, and receipts. It combines state-of-the-art optical character recognition (OCR) with predictive models that can interpret form data by:

  • Matching field names to values.
  • Processing tables of data.
  • Identifying specific types of field, such as dates, telephone numbers, addresses, totals, and others.
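A receipt-processing sketch using the azure-ai-formrecognizer package (placeholder endpoint, key, and receipt URL; the extracted field names depend on the model):

```python
# Extracting receipt fields with the Form Recognizer SDK.
from azure.ai.formrecognizer import FormRecognizerClient
from azure.core.credentials import AzureKeyCredential

client = FormRecognizerClient("https://<resource>.cognitiveservices.azure.com/",
                              AzureKeyCredential("<key>"))

poller = client.begin_recognize_receipts_from_url("https://example.com/receipt.jpg")
for receipt in poller.result():
    for name, field in receipt.fields.items():   # e.g. MerchantName, Total
        print(name, field.value, f"(confidence {field.confidence})")
```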

Section 9: Identify features of common NLP Workload Scenarios.

9.1 – Identify features and uses for key phrase extraction

Review

The Key Phrase Extraction skill evaluates unstructured text, and for each record, returns a list of key phrases. This skill uses the machine learning models provided by Text Analytics in Cognitive Services.

This capability is useful if you need to quickly identify the main talking points in the record. For example, given input text “The food was delicious and there were wonderful staff”, the service returns “food” and “wonderful staff”.
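A minimal key phrase extraction sketch with the azure-ai-textanalytics package (placeholder endpoint and key):

```python
# Key phrase extraction sketch with the Text Analytics SDK.
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

client = TextAnalyticsClient("https://<resource>.cognitiveservices.azure.com/",
                             AzureKeyCredential("<key>"))

docs = ["The food was delicious and there were wonderful staff."]
result = client.extract_key_phrases(docs)[0]
print(result.key_phrases)   # e.g. ['food', 'wonderful staff']
```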

9.2 – Identify features and uses for entity recognition

Review

The Entity Recognition skill extracts entities of different types from text. These entities fall under 14 distinct categories, ranging from people and organizations to URLs and phone numbers.

Named Entity Recognition (NER) can identify and categorize entities in your text as people, places, organizations, quantities, and more. Well-known entities are also recognized and linked to more information on the web.
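An entity recognition sketch with the same azure-ai-textanalytics package (placeholder endpoint and key; the example sentence is invented):

```python
# Named entity recognition sketch with the Text Analytics SDK.
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

client = TextAnalyticsClient("https://<resource>.cognitiveservices.azure.com/",
                             AzureKeyCredential("<key>"))

docs = ["Satya Nadella visited the Microsoft office in London on 14 March."]
for entity in client.recognize_entities(docs)[0].entities:
    print(entity.text, entity.category, entity.confidence_score)
```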

9.3 – Identify features and uses for sentiment analysis

Review

Use sentiment analysis to find out what people think of your brand or topic by mining the text for clues about positive or negative sentiment.

The feature provides sentiment labels (such as “negative”, “neutral” and “positive”) based on the highest confidence score found by the service at the sentence and document level. It also returns confidence scores between 0 and 1 for each document, and for each sentence within it, for positive, neutral and negative sentiment.
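A sentiment analysis sketch with the azure-ai-textanalytics package (placeholder endpoint and key):

```python
# Sentiment analysis sketch with the Text Analytics SDK.
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

client = TextAnalyticsClient("https://<resource>.cognitiveservices.azure.com/",
                             AzureKeyCredential("<key>"))

result = client.analyze_sentiment(["The hotel was lovely but the checkout was slow."])[0]
print(result.sentiment)               # document-level label, e.g. 'mixed'
print(result.confidence_scores)       # positive / neutral / negative scores, 0-1
for sentence in result.sentences:     # sentence-level sentiment
    print(sentence.text, sentence.sentiment)
```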

9.4 – Identify features and uses for speech recognition and synthesis

Review

Increasingly, we expect artificial intelligence (AI) solutions to accept vocal commands and provide spoken responses. Consider the growing number of home and auto systems that you can control by speaking to them – issuing commands such as “turn off the lights”, and soliciting verbal answers to questions such as “will it rain today?”

To enable this kind of interaction, the AI system must support two capabilities:

  • Speech recognition – the ability to detect and interpret spoken input.
  • Speech synthesis – the ability to generate spoken output.

Speech recognition is concerned with taking the spoken word and converting it into data that can be processed – often by transcribing it into a text representation. The spoken words can be in the form of a recorded voice in an audio file, or live audio from a microphone. Speech patterns are analysed in the audio to determine recognisable patterns that are mapped to words.

Speech synthesis is in many respects the reverse of speech recognition. It is concerned with vocalising data, usually by converting text to speech.
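Both capabilities are exposed by the azure-cognitiveservices-speech package; a minimal sketch with placeholder key and region, using the default microphone and speaker:

```python
# Speech recognition and synthesis sketch with the Speech SDK.
import azure.cognitiveservices.speech as speechsdk

config = speechsdk.SpeechConfig(subscription="<key>", region="<region>")

# Speech recognition: transcribe a single utterance from the default microphone
recognizer = speechsdk.SpeechRecognizer(speech_config=config)
result = recognizer.recognize_once()
print("Recognised:", result.text)

# Speech synthesis: vocalise a text response through the default speaker
synthesizer = speechsdk.SpeechSynthesizer(speech_config=config)
synthesizer.speak_text_async("It will be sunny today.").get()
```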

9.5 – Identify features and uses for translation

Review

As organisations and individuals increasingly need to collaborate with people in other cultures and geographic locations, the removal of language barriers has become a significant problem.

One solution is to find bilingual, or even multilingual, people to translate between languages. However, the scarcity of such skills, and the number of possible language combinations, can make this approach difficult to scale. Increasingly, automated translation, sometimes known as machine translation, is being employed to solve this problem.

Text translation can be used to translate documents from one language to another, translate email communications that come from foreign governments, and even provide the ability to translate web pages on the Internet. You will often see a Translate option for posts on social media sites, and the Bing search engine can offer to translate entire web pages that are returned in search results.

Speech translation is used to translate between spoken languages, sometimes directly (speech-to-speech translation) and sometimes by translating to an intermediary text format (speech-to-text translation).

Section 10: Identify Azure tools and services for NLP workloads.

10.1 – Identify capabilities of the Text Analytics service

Review

Analysing text is a process where you evaluate different aspects of a document or phrase, in order to gain insights into the content of that text. For the most part, humans are able to read some text and understand the meaning behind it. Even without considering grammar rules for the language the text is written in, specific insights can be identified in the text.

As an example, you might read some text and identify some key phrases that indicate the main talking points of the text. You might also recognize names of people or well-known landmarks such as the Eiffel Tower. Although difficult at times, you might also be able to get a sense for how the person was feeling when they wrote the text, also commonly known as sentiment.

Text analytics is a process where an artificial intelligence (AI) algorithm, running on a computer, evaluates these same attributes in text, to determine specific insights. A person will typically rely on their own experiences and knowledge to achieve the insights. A computer must be provided with similar knowledge to be able to perform the task. There are some commonly used techniques that can be used to build software to analyse text, including:

  • Statistical analysis of terms used in the text. For example, removing common “stop words” (words like “the” or “a”, which reveal little semantic information about the text), and performing frequency analysis of the remaining words (counting how often each word appears) can provide clues about the main subject of the text.
  • Extending frequency analysis to multi-term phrases, commonly known as N-grams (a two-word phrase is a bi-gram, a three-word phrase is a tri-gram, and so on).
  • Applying stemming or lemmatization algorithms to normalize words before counting them – for example, so that words like “power”, “powered”, and “powerful” are interpreted as being the same word.
  • Applying linguistic structure rules to analyze sentences – for example, breaking down sentences into tree-like structures such as a noun phrase, which itself contains nouns, verbs, adjectives, and so on.
  • Encoding words or terms as numeric features that can be used to train a machine learning model. For example, to classify a text document based on the terms it contains. This technique is often used to perform sentiment analysis, in which a document is classified as positive or negative.
  • Creating vectorized models that capture semantic relationships between words by assigning them to locations in n-dimensional space. This modeling technique might, for example, assign values to the words “flower” and “plant” that locate them close to one another, while “skateboard” might be given a value that positions it much further away.

While these techniques can be used to great effect, programming them can be complex. In Microsoft Azure, the Text Analytics cognitive service can help simplify application development by using pre-trained models that can:

  • Determine the language of a document or text (for example, French or English).
  • Perform sentiment analysis on text to determine a positive or negative sentiment.
  • Extract key phrases from text that might indicate its main talking points.
  • Identify and categorize entities in the text. Entities can be people, places, organizations, or even everyday items such as dates, times, quantities, and so on.
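The first of these, language detection, follows the same client pattern as the examples in Section 9; a minimal sketch (placeholder endpoint and key):

```python
# Language detection sketch with the Text Analytics SDK.
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

client = TextAnalyticsClient("https://<resource>.cognitiveservices.azure.com/",
                             AzureKeyCredential("<key>"))

result = client.detect_language(["Bonjour tout le monde"])[0]
print(result.primary_language.name, result.primary_language.confidence_score)
```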

10.2 – Identify capabilities of the Language Understanding service (LUIS)

Review

In 1950, the British mathematician Alan Turing devised the Imitation Game, which has become known as the Turing Test and hypothesizes that if a dialog is natural enough, you may not know whether you’re conversing with a human or a computer. As artificial intelligence (AI) grows ever more sophisticated, this kind of conversational interaction with applications and digital assistants is becoming more and more common, and in specific scenarios can result in human-like interactions with AI agents. Common scenarios for this kind of solution include customer support applications, reservation systems, and home automation among others.

To realize the aspiration of the imitation game, computers need not only to be able to accept language as input (either in text or audio format), but also to be able to interpret the semantic meaning of the input – in other words, understand what is being said.

On Microsoft Azure, language understanding is supported through the Language Understanding Intelligent Service, more commonly known as Language Understanding. To work with Language Understanding, you need to take into account three core concepts: utterances, entities, and intents.

  • An utterance is an example of something a user might say, and which your application must interpret.
  • An entity is an item to which an utterance refers. For example, “fan” in the utterance “Switch the fan on”.
  • An intent represents the purpose, or goal, expressed in a user’s utterance. For example, for the utterance “Switch the fan on”, the intent is to turn the device on; so in your Language Understanding application, you might define a TurnOn intent that is related to such utterances.
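To make the three concepts concrete, here is the shape of a prediction such an application might produce for that utterance (an illustrative Python structure, not the literal LUIS API response):

```python
# Illustrative only: how an utterance maps to an intent and entities.
prediction = {
    "utterance": "Switch the fan on",   # what the user actually said
    "top_intent": "TurnOn",             # the goal the model inferred
    "entities": {"device": "fan"},      # the item the utterance refers to
}

if prediction["top_intent"] == "TurnOn":
    print(f"Turning on the {prediction['entities']['device']}")
```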

10.3 – Identify capabilities of the Speech service

Review

Increasingly, we expect artificial intelligence (AI) solutions to accept vocal commands and provide spoken responses. Consider the growing number of home and auto systems that you can control by speaking to them – issuing commands such as “turn off the lights”, and soliciting verbal answers to questions such as “will it rain today?”

To enable this kind of interaction, the AI system must support two capabilities:

  • Speech recognition – the ability to detect and interpret spoken input.
  • Speech synthesis – the ability to generate spoken output.

Speech recognition is concerned with taking the spoken word and converting it into data that can be processed – often by transcribing it into a text representation. The spoken words can be in the form of a recorded voice in an audio file, or live audio from a microphone. Speech patterns are analysed in the audio to determine recognisable patterns that are mapped to words.

Speech synthesis is in many respects the reverse of speech recognition. It is concerned with vocalising data, usually by converting text to speech.

10.4 – Identify capabilities of the Translator Text service

Review

As organisations and individuals increasingly need to collaborate with people in other cultures and geographic locations, the removal of language barriers has become a significant problem.

One solution is to find bilingual, or even multilingual, people to translate between languages. However, the scarcity of such skills, and the number of possible language combinations, can make this approach difficult to scale. Increasingly, automated translation, sometimes known as machine translation, is being employed to solve this problem.

Text translation can be used to translate documents from one language to another, translate email communications that come from foreign governments, and even provide the ability to translate web pages on the Internet. You will often see a Translate option for posts on social media sites, and the Bing search engine can offer to translate entire web pages that are returned in search results.

Speech translation is used to translate between spoken languages, sometimes directly (speech-to-speech translation) and sometimes by translating to an intermediary text format (speech-to-text translation).
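A text translation sketch calling the Translator Text REST API (v3) with the requests library; the key and region are placeholders:

```python
# Translator Text REST API (v3) sketch.
import requests

response = requests.post(
    "https://api.cognitive.microsofttranslator.com/translate",
    params={"api-version": "3.0", "from": "en", "to": ["fr", "de"]},
    headers={"Ocp-Apim-Subscription-Key": "<key>",
             "Ocp-Apim-Subscription-Region": "<region>",
             "Content-Type": "application/json"},
    json=[{"text": "Hello, how are you?"}])

for translation in response.json()[0]["translations"]:
    print(translation["to"], translation["text"])
```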

Section 11: Identify common use cases for conversational AI.

11.1 – Identify features and uses for webchat bots

Review

In today’s connected world, people use a variety of technologies to communicate. For example:

  • Voice calls
  • Messaging services
  • Online chat applications
  • Email
  • Social media platforms
  • Collaborative workplace tools

We’ve become so used to ubiquitous connectivity, that we expect the organizations we deal with to be easily contactable and immediately responsive through the channels we already use. Additionally, we expect these organizations to engage with us individually, and be able to answer complex questions at a personal level.

11.2 – Identify common characteristics of conversational AI solutions

Review

Increasingly, organizations are turning to artificial intelligence (AI) solutions that make use of AI agents, commonly known as bots, to provide a first line of automated support through the full range of channels that we use to communicate.

Conversations typically take the form of messages exchanged in turns; and one of the most common kinds of conversational exchange is a question followed by an answer. This pattern forms the basis for many user support bots, and can often be based on existing FAQ documentation. To implement this kind of solution, you need:

  • A knowledge base of question and answer pairs – usually with some built-in natural language processing model to enable questions that can be phrased in multiple ways to be understood with the same semantic meaning.
  • A bot service that provides an interface to the knowledge base through one or more channels.

Section 12: Identify Azure services for conversational AI.

12.1 – Identify capabilities of the QnA Maker service

Review

QnA Maker is a cloud-based Natural Language Processing (NLP) service that allows you to create a natural conversational layer over your data. It is used to find the most appropriate answer for any input from your custom knowledge base (KB) of information.

QnA Maker is commonly used to build conversational client applications, which include social media applications, chat bots, and speech-enabled desktop applications.
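A sketch of querying a published knowledge base over REST; the host, knowledge base ID, and endpoint key are placeholders:

```python
# Querying a published QnA Maker knowledge base over REST.
import requests

url = "https://<resource>.azurewebsites.net/qnamaker/knowledgebases/<kb-id>/generateAnswer"
headers = {"Authorization": "EndpointKey <endpoint-key>",
           "Content-Type": "application/json"}

response = requests.post(url, headers=headers,
                         json={"question": "What are your opening hours?"})
for answer in response.json()["answers"]:
    print(answer["score"], answer["answer"])   # best matches, with confidence
```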

When to use QnA Maker

  • When you have static information – Use QnA Maker when you have static information in your knowledge base of answers. This knowledge base is custom to your needs, which you’ve built with documents such as PDFs and URLs.
  • When you want to provide the same answer to a request, question, or command – when different users submit the same question, the same answer is returned.
  • When you want to filter static information based on meta-information – add metadata tags to provide additional filtering options relevant to your client application’s users and the information. Common metadata information includes chit-chat, content type or format, content purpose, and content freshness.
  • When you want to manage a bot conversation that includes static information – your knowledge base takes a user’s conversational text or command and answers it. If the answer is part of a pre-determined conversation flow, represented in your knowledge base with multi-turn context, the bot can easily provide this flow.

12.2 – Identify capabilities of the Azure Bot service

Review

The Bot Framework, along with the Azure Bot Service, provides tools to build, test, deploy, and manage intelligent bots, all in one place. The Bot Framework includes a modular and extensible SDK for building bots, as well as tools, templates, and related AI services. With this framework, developers can create bots that use speech, understand natural language, handle questions and answers, and more.

When your bot is ready to be delivered to users, you can connect it to multiple channels; making it possible for users to interact with it through web chat, email, Microsoft Teams, and other common communication media.

Users can submit questions to the bot through any of its channels, and receive an appropriate answer from the knowledge base on which the bot is based.
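A minimal sketch of bot logic with the botbuilder-core Python package: an ActivityHandler that simply echoes the user’s message back on whatever channel it arrived through. In a real bot, this is where you would call a back-end service such as QnA Maker:

```python
# Minimal bot logic sketch with the Bot Framework SDK for Python.
from botbuilder.core import ActivityHandler, TurnContext

class EchoBot(ActivityHandler):
    async def on_message_activity(self, turn_context: TurnContext):
        # In a real bot, query QnA Maker or LUIS here instead of echoing
        await turn_context.send_activity(f"You said: {turn_context.activity.text}")
```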