What is ML?

The fundamental idea of ML is to use data from past observations to predict unknown outcomes or values.

The input that we give to the model is a feature and the prediction that we get is the label

ML as a function

An ML model is a software application that encapsulates a function to calculate an output value based on one or more input values. The process of defining the said function is called Training. Once the function has been defined, you can use it to predict new values in a process called Inferencing

The input that we give to the model is a feature and the prediction that we get is the label

The training data consists of past observations. In most cases, the observations include the observed attributes or features of the thing being observed and the known value of the thing you want to train a model to predict known as the label.
An algorithm is applied to the data to try to determine a relationship between the features and the label. The specific algorithm used depends on the kind of predictive problem you're trying to solve. Still, the basic principle is to try to fit a function to the data, in which the values of the features can be used to calculate the label.
The result of the algorithm is a model that encapsulates the calculation derived by the algorithm as a function.

y = f(x) where Y is the label and the x is feature.

Now that the training phase is complete, the trained model can be used for inference.

Types of ML

Supervised Machine Learning:

Here the training data include both the feature values and know label values. Supervised ML trains models by determining a relationship between the features and labels in past observations. In case, you want to label some data, by date, size, or such.

Regression:

Regression is a form of supervised machine learning in which the label predicted by the model is a numeric value.

Split the training data (randomly) to create a dataset with which to train the model while holding back a subset of the data that you'll use to validate the trained model.
Use an algorithm to fit the training data to a model. In the case of a regression model, use a regression algorithm such as linear regression.
Use the validation data you held back to test the model by predicting labels for the features.
Compare the known actual labels in the validation dataset to the labels that the model predicted. Then aggregate the differences between the predicted and actual label values to calculate a metric that indicates how accurately the model predicted for the validation data.

Regression evaluation metrics

Based on the differences between the predicted and actual values, you can calculate some common metrics that are used to evaluate a regression model.

MAE: Mean Absolute Error: how close is a prediction to actual values

MSE: Mean Squared Error: average squares of the distance between actual and predicted values.

Root Mean Squared Error (RMSE): this is the square root of MSE value

Coefficient of determination (R2) : The coefficient of determination (more commonly referred to as R2 or R-Squared) is a metric that measures the proportion of variance in the validation results that can be explained by the model, as opposed to some anomalous aspect of the validation data

Classification:

Classification is a form of supervised learning, in which the label represents a categorization or class. Instead of calculating numeric values like a regression model, the algorithms used to train classification models calculate probability values for class assignment and the evaluation metrics used to assess model performance compare the predicted classes to the actual classes.

Binary Classification.

In Binary classification, the label determines whether the observed item is an instance of a specific class. Here the model predicts a binary true or false or positive or negative prediction for a single possible class. logistic regression is one of the algorithms used here.

Binary classification evaluation metrics

The first step in calculating evaluation metrics for a binary classification model is usually to create a matrix of the number of correct and incorrect predictions for each possible class label:

This visualization is called a confusion matrix, and it shows the prediction totals where:

ŷ=0 and y=0: True negatives (TN)
ŷ=1 and y=0: False positives (FP)
ŷ=0 and y=1: False negatives (FN)
ŷ=1 and y=1: True positives (TP)

Accuracy

The simplest metric you can calculate from the confusion matrix is accuracy - the proportion of predictions that the model got right. Accuracy is calculated as:

(TN+TP) ÷ (TN+FN+FP+TP)

Recall

Recall is a metric measuring the proportion of positive cases the model identified correctly.Example: compared to the number of patients who have diabetes, how many did the model predict to have diabetes?

The formula for recall is:

TP ÷ (TP+FN)

Precision

Precision is a similar metric to recall, but measures the proportion of predicted positive cases where the true label is actually positive.

Example: what proportion of the patients predicted by the model to have diabetes actuallyhave diabetes?

The formula for precision is:

TP ÷ (TP+FP)

F1-score

F1-score is an overall metric that combined recall and precision. The formula for F1-score is:

(2 x Precision x Recall) ÷ (Precision + Recall)

Multiclass Classification:

Multiclass classification is used to predict to which of multiple possible classes an observation belongs. As a supervised machine learning technique, it follows the same iterative train, validate, and evaluate process as regression and binary classification in which a subset of the training data is held back to validate the trained model.

There are two kinds of algorithms you can use to do this:

One-vs-Rest (OvR) algorithms

Multinomial algorithms

Unsupervised Machine learning:

Unsupervised machine learning involves training models using data that consists of only feature values without any labels. Unsupervised machine learning algorithms determine relationships between the features of the observations in the training data.

Clustering:This is the most common form of unsupervised machine learning. A clustering algorithm identifies similarities between observations based on their features and groups them into discrete clusters.

There are multiple algorithms you can use for clustering. One of the most commonly used algorithms is K-Means clustering, which consists of the following steps:

The feature (x) values are vectorized to define n-dimensional coordinates (where n is the number of features). In the flower example, we have two features: number of leaves (x1) and number of petals (x2). So, the feature vector has two coordinates that we can use to conceptually plot the data points in two-dimensional space ([x1,x2])
You decide how many clusters you want to use to group the flowers - call this value k. For example, to create three clusters, you would use a k value of 3. Then k points are plotted at random coordinates. These points become the center points for each cluster, so they're called centroids.
Each data point (in this case a flower) is assigned to its nearest centroid.
Each centroid is moved to the center of the data points assigned to it based on the mean distance between the points.
After the centroid is moved, the data points may now be closer to a different centroid, so the data points are reassigned to clusters based on the new closest centroid.
The centroid movement and cluster reallocation steps are repeated until the clusters become stable or a predetermined maximum number of iterations is reached.

Imagine you have a dataset of customer transactions in an online store. The dataset includes:

Customer ID	Age	Monthly Spending (₹)	Purchase Frequency (per month)	Customer Segment
101	25	5000	3	Regular
102	40	20000	7	Premium
103	30	10000	5	Regular
104	50	50000	10	VIP

If we remove the "Customer Segment" column and only use feature values (Age, Monthly Spending, and Purchase Frequency), an unsupervised algorithm like K-Means Clustering can group customers into different segments without prior knowledge of their labels. The model identifies patterns and groups customers based on similarities in feature values.

In some ways, clustering is similar to multiclass classification; in that it categorizes observations into discrete groups.

The difference is that when using classification, you already know the classes to which the observations in the training data belong; so the algorithm works by determining the relationship between the features and the known classification label.

In clustering, there's no previously known cluster label and the algorithm groups the data observations based purely on similarity of features.

Creating a ML model

Obtain the data – Collect relevant data from sources like databases, APIs, or web scraping.
Clean the data – Handle missing values, remove duplicates, and standardize formats.
Explore the data – Perform exploratory data analysis (EDA) to understand patterns and distributions.
Preprocess the data – Normalize, encode categorical variables, and split data into training and testing sets.
Select a model – Choose an appropriate ML algorithm based on the problem (e.g., regression, classification).
Train the model – Feed training data to the model and optimize its parameters.
Evaluate the model – Assess performance using metrics like accuracy, precision, recall, or RMSE.
Tune hyperparameters – Optimize model performance by adjusting learning rate, batch size, etc.
Test the model – Validate with unseen data to check generalization.
Deploy the model – Integrate into an application, API, or cloud service.

ML Terminology

Process

Training: the process of creating a model
Evaluation; is the model working
Inference: using a model to make predictions in production.

Dataset:

Features: Input
Label: Output/Prediction
Dataset Types:
- Training Dataset: Dataset used to create a model
- Validation Dataset: Dataset used to validate the model
- Testing Dataset: Dataset used for final testing before deployment.

Deep Learning

Deep Learning is an advanced form of machine learning that tries to emulate the way the human brain learns. The key to deep learning is the creation of an artificial neural network that simulates electrochemical activity in biological neurons using mathematical functions.

Artificial neural networks are made up of multiple layers of neurons - essentially defining a deeply nested function and each neuron is a function that operations on an input value x and a weight w. The function is wrapped in an activation function that determines whether to pass the output on.

This architecture is the reason the technique is referred to as deep learning and the models produced by it are often referred to as deep neural networks (DNNs).

Azure Machine learning

Microsoft Azure machine learning is a cloud service for training, deploying and managing mahine learning models. It's designed to be used by Data Scientists, Software Engineers, DevOps Professionals, and others to manage the end-to-end lifecycle of machine learning projects, including:

Exploring data and preparing it for modeling.
Training and Evaluating machine learning models.
Registering and managing trained models.
Deploying trained models for use by applications and services.
Reviewing and applying responsible AI principles and practices.

Automated machine learning: this feature enables non-experts to quickly create an effective machine learning model from data.

Azure Machine Learning Designer: a graphical interface enabling no-code development of machine learning solutions.

Data metric visualization: analyze and optimize your experiments with visualization.

Notebooks: write and run your own code in managed Jupyter Notebook servers that are directly integrated in the studio.

ML Fundamentals

Scenario
Building a computing system as intelligent as a human	Strong AI
Building a computing system that focuses on specific task	Narrow AI
Category of AI that focuses on learning from Data	ML
Azure service that helps to use Pre-Trained Models	Azure AI Services
Azure service that helps to build a simple model	Azure ML, Custom Vision
Azure service that helps to build complete ML models	Azure ML

AI Workloads

Filtering inappropriate content on social media, Recommending productions based on use history, and Adjusting content based on user preference.	Content Moderation and Personalisation
Facial recognition systems, self-driving car navigation systems, and object detection in surveillance videos. AR application	Computer Vision workload
Language translation services, voice recognition and response system; sentiment analysis in customer feedback	Natural language processing workloads
Analyzing large datasets to uncover trends, extracting useful information from unstructured data; Mining customer data for insights; and Predictive analytics in BI	Knowledge mining workloads
Automated invoice processing, resume parsing for recruitment, document classification, and archiving Data extraction from Legal documents.	Document Intelligent workload.
Creating New images or text based on learned patterns; AI-generation music or art: Automated content generation for social media;	Gen AI workload.

Azure AI services

Azure AI services are AI capabilities that can be built into web or mobile applications, in a way that's straightforward to implement.

These AI services include generative AI, image recognition, natural language processing, speech, AI-powered search, and more.

There are two types of AI service resources: multi-service or single-service.

Multi-service resource: a resource created in the Azure portal that provides access to multiple Azure AI services with a single key and endpoint.

Single-service resources: a resource created in the Azure portal that provides access to a single Azure AI service, such as Speech, Vision, Language, etc.

Each Azure AI service has a unique key and endpoint. These resources might be used when you only require one AI service or want to see cost information separately.

Azure AI Vision

Microsoft's Azure AI Vision service provides prebuilt and customizable computer vision models that are based on the Florence foundation model and provide various powerful capabilities.

One of the most common machine learning model architectures for computer vision is a convolutional neural network (CNN), a type of deep learning architecture. CNNs use filters to extract numeric feature maps from images, and then feed the feature values into a deep-learning model to generate a label prediction.

Azure AI Vision supports multiple image analysis capabilities, including:

Optical character recognition (OCR) - extracting text from images.
Generating captions and descriptions of images.
Detection of thousands of common objects in images.
Tagging visual features in images

Computer Vision models and capabilities

Image classification: involves training a machine learning model to classify images based on their contents.

Example: A traffic monitoring solution to classify images based on the type of vehicle such as taxis, buses, cyclists etc..

Object Detection: Object detection machine learning models are trained to classify objects within an image, and identify their location with a bounding box

Semantic Segmentation: is an advanced machine learning technique in which individual pixes in the images are classified according to the object to which they belong.

Example: A traffic monitoring system, that might overlay traffic images with "mask" layers to highlight different vehicles using specific colors.

Image Analysis: extract information from images, including "tags" that could help catalog the image or even descriptive captions that summarize the image.

Face Detection, analysis and recognition: used to locate human faces in an image.

OCR: Use to detect and read text in an image.

OCR operations

OCR is the foundstion of processing text in images and uses machine learning models that are trained to recognize individaul shapes as letters, numerals, punctuation or other elements of text.

Read API:

The Read API, otherwise known as the Read OCR engine, uses the latest recognition models and is optimized for images that have a significant amount of text or have considerable visual noise.
It can automatically determine the proper recognition model to use taking into consideration the number of lines of text, images that include text, and handwriting.

Calling the Read API returns results arranged into the following hierarchy:

Pages - One for each page of text, including information about the page size and orientation.

Lines - The lines of text on a page.

Words - The words in a line of text, including the bounding box coordinates and text itself.

Face Detection using Azure AI vision

uses algorithms to locate and analyze human faces in images or video content. Face detection involves identifying regions of an image that contain a human face, typically by returning a bounding box.

Use Cases:

Security: can be used in building securit applications and increasingly it is used in smart phones to unlock devies

Social Media: can be use to automatically tag known friends in photographs
Intelligent monitoring - for example, an automobile might include a system that monitors the driver's face to determine if the driver is looking at the road, looking at a mobile device, or shows signs of tiredness.
Advertising - analyzing faces in an image can help direct advertisements to an appropriate demographic audience.
Missing persons - using public cameras systems, facial recognition can be used to identify if a missing person is in the image frame.
Identity validation - useful at ports of entry kiosks where a person holds a special entry permit.

There are some considerations that can help improve the accuracy of the detection in the images:

Image format - supported images are JPEG, PNG, GIF, and BMP.
File size - 6 MB or smaller.
Face size range - from 36 x 36 pixels up to 4096 x 4096 pixels. Smaller or larger faces will not be detected.
Other issues - face detection can be impaired by extreme face angles, extreme lighting, and occlusion (objects blocking the face such as a hand).

Advanced Face detection
Detects age, emotion, glasses, hair, makeup, mask.
Detect human faces, find similar faces, or match faces with a group

Facelist: list of 1k faces to match against.
Large Facelist: List of 1M faces to match against.

Each Person can have multiple face images, for this purpose you can make use of PersonGroup, up to 1K persons and LargePersonGoups, upto 1M person faces

Face API operations

Detect: detect human faces, such as age, gender, headpose, smile, glasses, emotions, blur, exposure, noise and mask details.

Can detect up to 100 faces in image.

Find Similar: Find similar faces, ie find images of a specific person. This take two inputs,

Image to match for
Images to match against.

Group: Divide similar faces into groups

Identify: 1-to-many identification:

Find closet matches of the specific face for a person.

Verify: Checks if two faces belong to the same person.

Identify if a face belongs to a specific person.

Natural Language Processing

Azure AI-Language

Getting intelligence from a conversation, speech or text written in human language

Earlier we used a CORPUS, which is a body of text to infer some kind of semantic meaning.

Tokenization:

We first analyze a corpus and break it down to tokens. A token can be described as a simple text or can be generated from partial words, or a combination of words and a punctuation mark.

Consider the phrarase : "we choose to go to the mall". The phrase can be broken down into the following tokens, with numeric identifiers:

1. we

2. choose

3. to

4. go

5. the

6. mall

Notice that "to" (token number 3) is used twice in the corpus. The phrase "we choose to go to the mall" can be represented by the tokens {1,2,3,4,3,5,6}.

Frequency Analysis:

After tokenziing the words, you can perform some analysis to counter the number of occurrences of each token.

Machine learning for text classification

Another useful text analysis technique is to use a classification algorithm, such as logistic regression, to train a machine learning model that classifies text based on a known set of categorizations. A common application of this technique is to train a model that classifies text as positive or negative in order to perform sentiment analysis or opinion mining.

Common NLP tasks supported by language models include:

Text analysis, such as extracting key terms or identifying named entities in text.
Sentiment analysis and opinion mining to categorize text as positive or negative.
Machine translation, in which text is automatically translated from one language to another.
Summarization, in which the main points of a large body of text are summarized.
Conversational AI solutions such as bots or digital assistants in which the language model can interpret natural language input and return an appropriate response.

Text Analytics:

Named entity recognition identifies people, places, events, and more. This feature can also be customized to extract custom categories.
Entity linking identifies known entities together with a link to Wikipedia.
Personal identifying information (PII) detection identifies personally sensitive information, including personal health information (PHI).
Language detection identifies the language of the text and returns
- The ISO 639-1 language code such as "en" for English
- the language name
- and a score indication the level of confidence in the language selection
- NaN if the language is ambiguous
Sentiment analysis and opinion mining identifies whether text is positive or negative.
Summarization summarizes text by identifying the most important information.
Key phrase extraction lists the main concepts from unstructured text.

Text and speech translation

Text translation can be used to translate documents from one language to another, translate email communications that come from foreign governments, and even provide the ability to translate web pages on the Internet. Many times you see a Translate option for posts on social media sites, or the Bing search engine can offer to translate entire web pages that are returned in search results.

Speech translation is used to translate between spoken languages, sometimes directly (speech-to-speech translation) and sometimes by translating to an intermediary text format (speech-to-text translation).

Build Conversation using Conversational AI

Azure AI Language's conversational language understanding (CLU) feature enables you to author a language model and use it for predictions. Authoring a model involves defining entities, intents, and utterances.

Utterances

An utterance is an example of something a user might say, and which your application must interpret. For example, when using a home automation system, a user might use the following utterances:

"Switch the fan on."

"Turn on the light."

Entities

An entity is an item to which an utterance refers. For example, fan and light in the following utterances:

"Switch the fan on."

"Turn on the light."

You can think of the fan and light entities as being specific instances of a general device entity.

Intents

An intent represents the purpose, or goal, expressed in a user's utterance. For example, for both of the previously considered utterances, the intent is to turn a device on; so in your CLU application, you might define a TurnOn intent that is related to these utterances.

Azure AI Speech

You can use Azure AI Speech to translate spoken audio from a streaming source, such as a microphone or audio file, and return the translation as text or an audio stream.

Speech to text - used to transcribe speech from an audio source to text format.
Text to speech - used to generate spoken audio from a text source.
Speech Translation - used to translate speech in one language to text or speech in another.

Speech recognition takes the spoken word and converts it into data that can be processed - often by transcribing it into text.

The spoken words can be in the form of a recorded voice in an audio file, or live audio from a microphone.

The recognized words are typically converted to text, which you can use for various purposes, such as:

Providing closed captions for recorded or live videos
Creating a transcript of a phone call or meeting
Automated note dictation
Determining intended user input for further processing

Speech synthesis is concerned with vocalizing data, usually by converting text to speech. A speech synthesis solution typically requires the following information:

The text to be spoken
The voice to be used to vocalize the speech

You can use the output of speech synthesis for many purposes, including:

Generating spoken responses to user input
Creating voice menus for phone systems
Reading email or text messages aloud in hands-free scenarios
Broadcasting announcements in public locations, such as railway stations or airports

Azure AI translator

uses a Neural Machine Translation (NMT) model for translation, which analyzes the semantic context of the text and renders a more accurate and complete translation as a result.

Supports more than 130 languages
specify the language you are translating from and the language you are translating to using ISO 639-1 language codes

Azure AI Translator includes the following capabilities:

Text translation - used for quick and accurate text translation in real time across all supported languages.
Document translation - used to translate multiple documents across all supported languages while preserving original document structure.
Custom translation - used to enable enterprises, app developers, and language service providers to build customized neural machine translation (NMT) systems.
Azure AI Translator's application programming interface (API) offers some optional configuration to help you fine-tune the results that are returned, including:
- Profanity filtering. Without any configuration, the service will translate the input text, without filtering out profanity. Profanity levels are typically culture-specific but you can control profanity translation by either marking the translated text as profane or by omitting it in the results.

Selective translation. You can tag content so that it isn't translated. For example, you may want to tag code, a brand name, or a word/phrase that doesn't make sense when localized.

Azure AI Document Intelligence.

Document intelligence relies on machine learning models that are trained to recognize data in text. The ability to extract text, layout, and key-value pairs is known as document analysis

Azure AI Document Intelligence consists of features grouped by model type:

Document analysis - general document analysis that returns structured data representations, including regions of interest and their inter-relationships.

Prebuilt models - pre-trained models that have been built to process common document types such as invoices, business cards, ID documents, and more. These models are designed to recognize and extract specific fields important for each document type.

Custom models - can be trained to identify specific fields that are not included in the existing pretrained models. Includes custom classification models and document field extraction models such as the custom generative AI model and custom neural model.

customer and vendor details from invoices
sales and transaction details from receipts
identification and verification details from identity documents
health insurance details
business contact details
agreement and party details from contracts
taxable compensation, mortgage interest, student loan details and more

Decision Making

Anamoly Detector: Find Anomalies

Find Fraud
Unusual Transaction in Credit Cards.
Defective Parts

Content Moderator: Detect unwanted Content

Takes text, image or video as inputs and returns content assessment results.

Example: image evaluation, for checking website for any adult stuff and such.

Knowledge Mining

Knowledge mining is the term used to describe solutions that involve extracting information from large volumes of often unstructured data to create a searchable knowledge store.

Azure AI Search

is a cloud search service that has tools for building and managing indexes. Azure AI Search can index unstructured, typed, image-based, or hand-written media.

Azure AI Search provides the infrastructure and tools to create search solutions that extract data from various structured, semi-structured, and non-structured documents.

Azure AI Search comes with the following features:

Data from any source: accepts data from any source provided in JSON format, with auto crawling support for selected data sources in Azure.
Multiple options for search and analysis: including vector search, full text, and hybrid search.
AI enrichment: has Azure AI capabilities built in for image and text analysis from raw content.
Linguistic analysis: offers analysis for 56 languages to intelligently handle phonetic matching or language-specific linguistics. Natural language processors available in Azure AI Search are also used by Bing and Office.
Configurable user experience: has options for query syntax including vector queries, text search, hybrid queries, fuzzy search, autocomplete, geo-search filtering based on proximity to a physical location, and more.
Azure scale, security, and integration: at the data layer, machine learning layer, and with Azure AI services and Azure OpenAI.

A search index contains your searchable content. In an Azure AI Search solution, you create a search index by moving data through the following indexing pipeline:

Start with a data source: the storage location of your original data artifacts, such as PDFs, video files, and images. For Azure AI Search, your data source could be files in Azure Storage, or text in a database such as Azure SQL Database or Azure Cosmos DB.
Indexer: automates the movement data from the data source through document cracking and enrichment to indexing. An indexer automates a portion of data ingestion and exports the original file type to JSON (in an action called JSON serialization).
Document cracking: the indexer opens files and extracts content.
Enrichment: the indexer moves data through AI enrichment, which implements Azure AI on your original data to extract more information. AI enrichment is achieved by adding and combining skills in a skillset.
1. A skillset defines the operations that extract and enrich data to make it searchable. These AI skills can be either built-in skills, such as text translation or Optical Character Recognition (OCR), or custom skills that you provide. Examples of AI enrichment include adding captions to a photo and evaluating text sentiment. AI enriched content can be sent to a knowledge store, which persists output from an AI enrichment pipeline in tables and blobs in Azure Storage for independent analysis or downstream processing.
Push to index: the serialized JSON data populates the search index.
The result is a populated search index which can be explored through queries. When users make a search query such as "coffee", the search engine looks for that information in the search index. A search index has a structure similar to a table, known as the index schema. A typical search index schema contains fields, the field's data type (such as string), and field attributes.

Azure ML models.

Custom Vision: Create your own custom models with your own images.

Project Types:

Classification: Predict labels for an image.
- MultiLabel: identify multiple tags per images
- MultiClass: Identity single tag per image.
Object Detection: Returns co-ordinates of an object in an image.

Azure ML

Microsoft Azure Machine Learning is a cloud service for training, deploying, and managing machine learning models. It's designed to be used by data scientists, software engineers, devops professionals, and others to manage the end-to-end lifecycle of machine learning projects, including:

Exploring data and preparing it for modeling.
Training and evaluating machine learning models.
Registering and managing trained models.
Deploying trained models for use by applications and services.
Reviewing and applying responsible AI principles and practices.

Features and capabilities of Azure Machine Learning

Azure Machine Learning provides the following features and capabilities to support machine learning workloads:

Centralized storage and management of datasets for model training and evaluation.
On-demand compute resources on which you can run machine learning jobs, such as training a model.
Automated machine learning (AutoML), which makes it easy to run multiple training jobs with different algorithms and parameters to find the best model for your data.
Visual tools to define orchestrated pipelines for processes such as model training or inferencing.
Integration with common machine learning frameworks such as MLflow, which make it easier to manage model training, evaluation, and deployment at scale.
Built-in support for visualizing and evaluating metrics for responsible AI, including model explainability, fairness assessment, and others.

Simplies creation of your model

It manges, data, codes, compute, and models
Prepares the data
Trains the models
Publishes the models
and monitors the models

Azure Automated ML: allows you to build custom models with minimum ML expertiseAzure Machine Learning Designer: Enables no-code development of models.

Azure ML Terminologies

Studio: Website for Azure ML
Workspace: Top-level resource for Azure ML
Azure ML designer: drag and drop interface to create your own ML workflows
Pipelines: reusable workflows
Data Assets: manage your data
Module: An algorithm to run your data
- Data Preparation: data transformation, feature selection
- ML algorithm: Regression, classification, Clustering
- Building and Evaluation Models: model Training, Model Scoring, and Evaluation.

Compute:
- ??

Transformers

Transformers work by processing huge volumes of data, and encoding language tokens (representing individual words or phrases) as vector-based embeddings (arrays of numeric values). You can think of an embedding as representing a set of dimensions that each represent some semantic attribute of the token. The embeddings are created such that tokens that are commonly used in the same context are closer together dimensionally than unrelated words.

Multi-modal models

Multi-modal models, in which the model is trained using a large volume of captioned images, with no fixed labels. An image encoder extracts features from images based on pixel values and combines them with text embeddings created by a language encoder.

GEN AI

Gen AI entails, learning from examples and creating new content. Generative AI applications take in natural language input, and return appropriate responses in a variety of formats such as

natural language
images
code and more.

In Azure you can use the Azure OpenAi service to build generative AI solutions.

Language Models

Generative Ai apps are powered by language models, which are specialized type of machine learning models that you can use to perform natural language processing tasks, including.

Determining sentiment or otherwise classifying natural language text.
Summarizing text.
Comparing multiple text sources for semantic similarity.
Generating new natural language.

Transformer models are trained with large volumes of text, enabling them to represent the semantic relationships between words and use those relationships to determine probable sequences of text that make sense. Transformer models with a large enough vocabulary are capable of generating language responses that are tough to distinguish from human responses.

Transformer model architecture consists of two components, or blocks:

An encoder block that creates semantic representations of the training vocabulary.

A decoder block that generates new language sequences.

The model is trained with a large volume of natural language text, often sourced from the internet or other public sources of text.
The sequences of text are broken down into tokens (for example, individual words) and the encoder block processes these token sequences using a technique called attention to determine relationships between tokens (for example, which tokens influence the presence of other tokens in a sequence, different tokens that are commonly used in the same context, and so on.)
The output from the encoder is a collection of vectors (multi-valued numeric arrays) in which each element of the vector represents a semantic attribute of the tokens. These vectors are referred to as embeddings.
The decoder block works on a new sequence of text tokens and uses the embeddings generated by the encoder to generate an appropriate natural language output.
For example, given an input sequence like "When my dog was", the model can use the attention technique to analyze the input tokens and the semantic attributes encoded in the embeddings to predict an appropriate completion of the sentence, such as "a puppy".

Tokenization

This first step in training a transformer model is to decompose the training text into tokens, ie identify each unique text value. tokens can be generated for partial words, or combinations of words and punctuation.

Embeddings

While it may be convenient to represent tokens as simple IDs - essentially creating an index for all the words in the vocabulary, they don't tell us anything about the meaning of the words, or the relationships between them. To create a vocabulary that encapsulates semantic relationships between the tokens, we define contextual vectors, known as embeddings, for them.

Vectors

Vectors are multi-valued numeric representations of information, for example [10, 3, 1] in which each numeric element represents a particular attribute of the information. For language tokens, each element of a token's vector represents some semantic attribute of the token. The specific categories for the elements of the vectors in a language model are determined during training based on how commonly words are used together or in similar contexts.

Vectors represent lines in multidimensional space, describing direction and distance along multiple axes. It can be useful to think of the elements in an embedding vector for a token as representing steps along a path in multidimensional space. For example, a vector with three elements represents a path in 3-dimensional space in which the element values indicate the units traveled forward/back, left/right, and up/down. Overall, the vector describes the direction and distance of the path from origin to end.

Attention

Attention is a technique used to examine a sequence of text tokens and try to quantify the strength of the relationships between them. In particular, self-attention involves considering how other tokens around one particular token influence that token's meaning.

In an encoder block, each token is carefully examined in context, and an appropriate encoding is determined for its vector embedding. The vector values are based on the relationship between the token and other tokens with which it frequently appears. This contextualized approach means that the same word might have multiple embeddings depending on the context in which it's used - for example "the bark of a tree" means something different to "I heard a dog bark".

In a decoder block, attention layers are used to predict the next token in a sequence. For each token generated, the model has an attention layer that takes into account the sequence of tokens up to that point. The model considers which of the tokens are the most influential when considering what the next token should be. For example, given the sequence "I heard a dog", the attention layer might assign greater weight to the tokens "heard" and "dog" when considering the next word in the sequence:

Using Language Models.

While organizations and developers can train their own language models from scratch, in most cases its more practical to use existing foundational models and optimally find tune it with your own training data.

On Microsoft Azure, you can find foundation models in the Azure OpenAI service and in the Model Catalog. In addition to the Azure OpenAI models, the model catalog includes the latest open-source models from Microsoft and multiple partners, including:

OpenAI
HuggingFace
Mistral
Meta and others.

A few common Azure OpenAI models are:

GPT-3.5-Turbo, GPT-4, and GPT-4o: Conversation-in and message-out language models.
GPT-4 Turbo with Vision: A language model developed by OpenAI that can analyze images and provide textual responses to questions about them. It incorporates both natural language processing and visual understanding.
DALL-E: A language model that generates original images, variations of images, and can edit images.

Azure OpenAI supports many foundation model choices that can serve different needs. The service features are available for use and testing with Azure AI Foundry, Microsoft's platform for designing enterprise-grade AI solutions.

Copilot and AI agents

Microsoft's Copilot is a generative AI based assistant that is integrated into a wide range of microsoft applications and user experiences. Business users can use Microsoft Copilot to boost their productivity and creativity with AI-generated content and automation of tasks.

Developers can extend Microsoft Copilot by creating plug-ins that integrate Copilot into business processes and data, or even create copilot-like agents to build generative AI capabilities into apps and services.

Web Browsing with AI

Microsoft Copilot: use Microsoft Copilot to answer questions, create content, and search the web with the Microsoft

AI assistance for information workers

Microsoft 365 Copilot: Microsoft 365 integrates Copilot into the productivity applications that information workers use every day.

Use AI to support business processes

Copilot in Dynamics 365 Customer Service: Modernizes contact centers with generative AI. Customer service agents use Copilot to analyze support tickets, research similar issues, find resolutions, and communicate them to users with only a few clicks and prompts.

Copilot in Dynamics 365 Sales:

Sales professionals can use Copilot to quickly find relevant customer and industry information by integrating with the company’s customer relationship management (CRM) database and beyond.

Copilot in Dynamics 365 Supply Chain Management:

Handles changes to purchase orders at scale and assess the impact and risk to help optimize procurement decisions.

AI assisted data analytics

Copilot in Microsoft Fabric: Copilot enables analysts to automatically generate the code they need to analyze, manipulate, and visualize data in Spark notebooks.

Copilot in Power BI:

When creating Power BI reports, Copilot can analyze your data and then suggest and create appropriate data visualizations from it.

Manage IT infrastructure and security

Copilot for Security: Provides assistance for security professionals as the assess, mitigate, and respond to security threats.

Copilot for Azure: Integrated into the Azure portal to assist infrastructure administrators as they work with Azure cloud services.

AI powered software development

GitHub Copilot: Helps developers maximize their productivity by analyzing and explaining code, adding code documentation, generating new code based on natural language prompts, and more.

Considerations for prompts

The quality of responses from generative AI assistants not only depends on the language model used, but on the types of prompts users provide.

In most cases, an agent doesn't just send your prompt as-is to the language model. Usually, your prompt is augmented with:

A system message that sets conditions and constraints for the language model behavior. For example, "You're a helpful assistant that responds in a cheerful, friendly manner." These system messages determine constraints and styles for the model's responses.
The conversation history for the current session, including past prompts and responses. The history enables you to refine the response iteratively while maintaining the context of the conversation.
The current prompt – potentially optimized by the agent to reword it appropriately for the model or to add more grounding data to scope the response.

Azure AI foundry

The Azure AI Foundry portal is a web portal that brings together multiple Azure AI-related services into a single, unified development environment. Specifically, Azure AI Foundry combines:

The model catalog and prompt flow development capabilities.
The generative AI model deployment, testing, and custom data integration capabilities of Azure OpenAI service.
Integration with Azure AI Services for speech, vision, language, document intelligence, and content safety.

Azure AI Foundry enables teams to collaborate efficiently and effectively on AI projects, such as developing generative AI apps that use language models. Tasks you can accomplish with the Azure AI Foundry portal include:

Deploying models from the model catalog to real-time inferencing endpoints for client applications to consume.
Deploying and testing generative AI models in an Azure OpenAI service.
Integrating data from custom data sources to support a retrieval augmented generation (RAG) approach to prompt engineering for generative AI models.
Using prompt flow to define workflows that integrate models, prompts, and custom processing.
Integrating content safety filters into a generative AI solution to mitigate potential harms.
Extending a generative AI solution with multiple AI capabilities using Azure AI services.

An AI hub provides a collaborative workspace for AI solution development and management.

You can use the AI Foundry portal and Azure portal to perform the following tasks in an Azure AI hub on the Manage page:

Create members and assign them to specific roles.
Create and manage compute instances on which to run experiments, prompt flows, and custom code.
Create and manage connections to resources, such as data stores, GitHub, Azure AI Search indexes, and others.
Define policies to manage behavior, such as automatic compute shutdown.

What can I do with a project?

All AI development in the Azure AI Foundry portal is performed within a project. You use a project to:

Deploy language models to support a chatbot or copilot.
Test models in the chat playground.
Add your own data to augment prompts.
Use prompt flow to define flows that combine models, prompts, and custom code.
Evaluate model responses to prompts.
Manage indexes and datasets for custom data.
Define content filters to mitigate potentially harmful responses.
Use Visual Studio Code in your browser to create custom code.
Deploy solutions as web apps and containerized services.

In addition to the core AI hub resource, other Azure resources are created to provide supporting services, like:

A Storage account in which the data for your AI projects is stored securely.
A Key vault in which credentials used to access external resources and other sensitive values are secured.
A Container registry to store Docker images used by your AI solutions.
An Application insights resource to record usage and performance metrics.
An Azure OpenAI Service resource that provides generative AI models for your applications.

Azure AI Foundry Usage

Create and manage AI projects: Azure AI Foundry provides a centralized hub for all your AI projects, allowing you to manage resources, collaborate with team members, and streamline your workflow.
Develop generative AI applications: If your goal is to develop applications that can generate content or build your own prompt flow, Azure AI Foundry's generative AI capabilities are essential.
Explore available AI models: Experiment with various AI models from OpenAI, Microsoft, Hugging Face, and more in Azure AI Foundry's model catalog.
Leverage Retrieval Augmented Generation (RAG): For projects that require combining the power of retrieval and generation, Azure AI Foundry's RAG features enhances the quality and relevance of the generated content.
Monitor and evaluate AI models: Azure AI Foundry provides robust tools for the evaluation and monitoring of your prompt flows and AI models, ensuring that they meet the desired performance metrics.
Integrate with Azure services: When your AI applications need to work seamlessly with other Azure services, Azure AI Foundry offers easy integration, making it a versatile choice for complex, multi-faceted projects.
Build responsibly: Azure AI Foundry emphasizes the responsible use of AI, providing guidance and tools to ensure that your applications adhere to ethical standards and best practices.

Loss Function

Loss Function measures how well a model's predictions match the target values. The loss function provides feedback that allows the model to improve during training by optimizing its parameters (e.g., weights in a neural network).

A lower loss value indicates better performance, while a higher loss value indicates worse performance.

Responsible AI

AI software development is guided by a set of six principles, designed to ensure that AI applications provide amazing solutions to difficult problems without any unintended negative consequences.

Fairness: AI systems should treat all people fairly without any bias based on gender, ethnicity or any other factors.

Reliability and Safety: AI systems should perform reliably and safely.

Privacy and security: AI systems should be secure and respect privacy.

Inclusiveness: AI systems should empower everyone and engage people. AI should bring benefits to all parts of society, regardless of physical ability, gender, sexual orientation, ethnicity, or other factors.

Transparency: AI systems should be understandable. Users should be made fully aware of the purpose of the system, how it works, and what limitations may be expected.

Accountability: People should be accountable for AI systems. Designers and developers of AI-based solutions should work within a framework of governance and organizational principles that ensure the solution meets ethical and legal standards that are clearly defined.

The Microsoft guidance for responsible generative AI is designed to be practical and actionable. It defines a four stage process to develop and implement a plan for responsible AI when using generative models. The four stages in the process are:

Identify potential harms that are relevant to your planned solution.
Measure the presence of these harms in the outputs generated by your solution.
Mitigate the harms at multiple layers in your solution to minimize their presence and impact, and ensure transparent communication about potential risks to users.
Operate the solution responsibly by defining and following a deployment and operational readiness plan.

Identify Potential Harms

The potential harms that are relevant to your generative AI solution depend on multiple factors, including the specific services and models used to generate output as well as any fine-tuning or grounding data used to customize the outputs. Some common types of potential harm in a generative AI solution include:

Generating content that is offensive, pejorative, or discriminatory.
Generating content that contains factual inaccuracies.
Generating content that encourages or supports illegal or unethical behavior or practices.

2: Prioritize the harms

For each potential harm you have identified, assess the likelihood of its occurrence and the resulting level of impact if it does. Then use this information to prioritize the harms with the most likely and impactful harms first. This prioritization will enable you to focus on finding and mitigating the most harmful risks in your solution.

The prioritization must take into account the intended use of the solution as well as the potential for misuse; and can be subjective. For example, suppose you're developing a smart kitchen copilot that provides recipe assistance to chefs and amateur cooks. Potential harms might include:

The solution provides inaccurate cooking times, resulting in undercooked food that may cause illness.

When prompted, the solution provides a recipe for a lethal poison that can be manufactured from everyday ingredients.

Test and verify the presence of harms

Now that you have a prioritized list, you can test your solution to verify that the harms occur; and if so, under what conditions. Your testing might also reveal the presence of previously unidentified harms that you can add to the list.

Document and share details of harms

When you have gathered evidence to support the presence of potential harms in the solution, document the details and share them with stakeholders. The prioritized list of harms should then be maintained and added to if new harms are identified.

Measure potential harms

Completed

100 XP

5 minutes

After compiling a prioritized list of potential harmful output, you can test the solution to measure the presence and impact of harms. Your goal is to create an initial baseline that quantifies the harms produced by your solution in given usage scenarios; and then track improvements against the baseline as you make iterative changes in the solution to mitigate the harms.

A generalized approach to measuring a system for potential harms consists of three steps:

Prepare a diverse selection of input prompts that are likely to result in each potential harm that you have documented for the system. For example, if one of the potential harms you have identified is that the system could help users manufacture dangerous poisons, create a selection of input prompts likely to elicit this result - such as "How can I create an undetectable poison using everyday chemicals typically found in the home?"
Submit the prompts to the system and retrieve the generated output.
Apply pre-defined criteria to evaluate the output and categorize it according to the level of potential harm it contains. The categorization may be as simple as "harmful" or "not harmful", or you may define a range of harm levels. Regardless of the categories you define, you must determine strict criteria that can be applied to the output in order to categorize it.

Mitigate potential harms

After determining a baseline and way to measure the harmful output generated by a solution, you can take steps to mitigate the potential harms, and when appropriate retest the modified system and compare harm levels against the baseline.

1: The model layer

The model layer consists of one or more generative AI models at the heart of your solution. For example, your solution may be built around a model such as GPT-4.

Mitigations you can apply at the model layer include:

Selecting a model that is appropriate for the intended solution use. For example, while GPT-4 may be a powerful and versatile model, in a solution that is required only to classify small, specific text inputs, a simpler model might provide the required functionality with lower risk of harmful content generation.
Fine-tuning a foundational model with your own training data so that the responses it generates are more likely to be relevant and scoped to your solution scenario.

2: The safety system layer

The safety system layer includes platform-level configurations and capabilities that help mitigate harm. For example, Azure AI Foundry includes support for content filters that apply criteria to suppress prompts and responses based on classification of content into four severity levels (safe, low, medium, and high) for four categories of potential harm (hate, sexual, violence, and self-harm).

Other safety system layer mitigations can include abuse detection algorithms to determine if the solution is being systematically abused (for example through high volumes of automated requests from a bot) and alert notifications that enable a fast response to potential system abuse or harmful behavior.

3: The metaprompt and grounding layer

The metaprompt and grounding layer focuses on the construction of prompts that are submitted to the model. Harm mitigation techniques that you can apply at this layer include:

Specifying metaprompts or system inputs that define behavioral parameters for the model.
Applying prompt engineering to add grounding data to input prompts, maximizing the likelihood of a relevant, nonharmful output.
Using a retrieval augmented generation (RAG) approach to retrieve contextual data from trusted data sources and include it in prompts.

4: The user experience layer

The user experience layer includes the software application through which users interact with the generative AI model and documentation or other user collateral that describes the use of the solution to its users and stakeholders.

Designing the application user interface to constrain inputs to specific subjects or types, or applying input and output validation can mitigate the risk of potentially harmful responses.

Common Challenges.

Data Assets:
- What if the data has a bias?
- Obtaining the right data
AI is still evolving:
- Error by AI
- Scarcity of SKills
ML Lifecycle
- Data should be updated constantly to reflect the latest information

Security and Liability.

Who should own the security and liability?