AWS AI Practitioner - Preparation / Last Minute Revision Sheet

Are you ready to take the AWS AI Practitioner Certification Exam?

Below is the list of all the material and prep notes that helped me pass the exam.

I hope you find it helpful.


KEY NOTES: PLEASE READ FIRST

  1. These are my personal notes from the free AWS Skill Builder course I took for this exam. You can take the same course on AWS Skill Builder.
  2. The list is not exhaustive and may not follow a specific sequence (these are personal notes that follow my own way of noting and remembering), so use it as an additional aid in your planning.
  3. My approach was to write everything down while learning and then review it; whenever I could not recall a topic, I went back and studied it in more depth.
  4. The material below should be read according to its indentation levels. If you don't see proper indentation, message me directly and I can email it to you.
  5. I hope it helps your preparation in whatever way it can and, most importantly, I WISH YOU GOOD LUCK.




Domain-level revision notes from the course on AWS Skill Builder

Domain 1: Fundamentals of AI and ML

Known Data -> Features -> Algorithm -> Output

Adjustments

Inference

ML models can be trained on various types of data.

Structured data on RDS, S3 or Redshift

S3 is primary source of training data

Semi-structured data = DynamoDB & DocumentDB

Unstructured data - tokenization

Timeseries - sequential data

Model Training - Algorithm

Inference: 2 options

  Real time

    Low latency

    High throughput

    Persistent endpoint

  Batch Transform

    Offline

    Large datasets

    Infrequent use

ML Types

  Supervised Learning

    Amazon SageMaker Ground Truth -> Amazon Mechanical Turk

  Unsupervised Learning

  Reinforcement Learning

    Reward - AWS DeepRacer

Overfitting

  Model does well on training data but not on new data outside it

Underfitting

  Model cannot capture meaningful patterns. It performs poorly on both training data and new inputs
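
As a rough illustration of the two definitions above, the sketch below labels a model from its training and validation error. The function name and both thresholds are hypothetical values picked for this example; real projects choose them per problem.

```python
def diagnose_fit(train_error, validation_error, high_error=0.2, gap=0.1):
    """Label a model from its error on training data vs. unseen data.

    high_error and gap are illustrative thresholds, not standard values.
    """
    # Poor results even on the training data: the model has not learned the pattern.
    if train_error > high_error:
        return "underfitting"
    # Good on training data but much worse on new data: the model memorized the training set.
    if validation_error - train_error > gap:
        return "overfitting"
    return "reasonable fit"


print(diagnose_fit(train_error=0.02, validation_error=0.30))
print(diagnose_fit(train_error=0.35, validation_error=0.37))
```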

Bias and fairness

  Diversity of training data

  Feature importance

  Fairness constraints

Deep Learning

  Neural Networks

  Input Layer -> Hidden Layers -> Output Layer

Machine Learning vs Deep Learning

Consider alternatives when

  Costs outweigh the benefits

  Models cannot meet the interpretability requirements

  Systems must be deterministic rather than probabilistic

ML models are probabilistic

Supervised Learning

  Classification

    Binary - Diabetic or not diabetic

    Multiclass

  Regression

    Simple linear regression

    Multiple linear regression

    Logistic regression (predicts a probability, so it is commonly used for binary classification)
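
To make the binary "diabetic or not diabetic" case concrete, here is a minimal sketch of how logistic regression turns a feature into a probability and then a class. The single feature (glucose), the weight, the bias, and the 0.5 threshold are all made-up values for illustration, not a trained model.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_diabetic(glucose, weight=0.05, bias=-6.0, threshold=0.5):
    """Return (probability, label) for one input.

    weight and bias are hypothetical; in practice they are learned from labeled data.
    """
    probability = sigmoid(weight * glucose + bias)
    return probability, probability >= threshold


for glucose in (100, 150):
    probability, is_diabetic = predict_diabetic(glucose)
    print(glucose, round(probability, 3), is_diabetic)
```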

Unsupervised Learning

  Clustering

    Define features

    Similarity function

    Number of clusters

  Anomaly detection

    Data points that diverge

Amazon Rekognition

  Facial comparison and analysis

  Text detection

  Object detection and labelling

  Content moderation

  Can detect explicit content in images and videos

Amazon Textract

  Extract text from scanned documents

Amazon Comprehend

  Extract key phrases, entities and sentiment.

  Key use case: detecting PII data

Amazon Lex

  Conversational voice and text

Amazon Transcribe

  Converts speech to text

Amazon Polly

  Converts text to speech

Amazon Kendra

  Intelligent document search

Amazon Personalize

  Personalized product recommendations

Amazon Translate

  Translates between 75 languages

Amazon Forecast

  Predicts future points in time-series data

Amazon Fraud Detector

  Detects fraud and fraudulent activities

Amazon Bedrock

Amazon SageMaker

ML Pipeline

Identify Business Goal -> Frame ML Problem -> Collect Data -> Pre-process Data -> Engineer Features -> Train, Tune, Evaluate -> Deploy -> Monitor

Collect Data

  AWS Glue

    Cloud-optimized ETL service

    Contains its own data catalog

    Built-in transformations

  AWS Glue DataBrew

    Point-and-click data transformation

    200+ transformations

  Amazon SageMaker Ground Truth

    Uses ML to label your training data

    Can automatically label

Amazon SageMaker Canvas

  Import, prepare, transform, visualize and analyze

Amazon SageMaker Feature Store

  Processes raw data into features by using a processing workflow

Amazon SageMaker Experiments

  Visual interface

Amazon SageMaker automatic model tuning

Deploy

  Batch inference

  Real-time inference

  Self-managed

  Hosted

Amazon SageMaker inference

  Batch Transform

    Offline inference

    Large datasets

  Asynchronous

    Long processing times

    Large payloads

  Serverless

    Intermittent traffic

    Periods of no traffic

  Real-time

    Live predictions

    Sustained traffic

    Low latency

    Consistent

Monitor the model

  Configure alerts to notify and initiate actions if any drift is detected

  Data drift / concept drift

Amazon SageMaker Model Monitor

MLOps

  Amazon SageMaker Model Building Pipelines

  Repository options

    AWS CodeCommit

    Amazon SageMaker Feature Store

    Amazon SageMaker Model Registry

    Third-party repository

  Orchestration options

    Amazon SageMaker Pipelines

    Amazon Managed Workflows for Apache Airflow

    AWS Step Functions

Accuracy = (True Positives + True Negatives) / Total

Precision = True Positives / (True Positives + False Positives)

Recall = True Positives / (True Positives + False Negatives)

F1 = 2 * Precision * Recall / (Precision + Recall)

False Positive Rate (FPR) = False Positives / (True Negatives + False Positives)

True Negative Rate (TNR) = True Negatives / (True Negatives + False Positives)

Area Under the Curve (AUC)
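
The formulas above can be checked with a few lines of Python. This is a minimal sketch that works from the four confusion-matrix counts; the function name and the sample counts are invented for illustration, and it does not guard against division by zero.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the listed metrics from confusion-matrix counts.

    tp/fp/tn/fn = true positives, false positives, true negatives, false negatives.
    Assumes every denominator is non-zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "false_positive_rate": fp / (tn + fp),
        "true_negative_rate": tn / (tn + fp),
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that the false positive rate and the true negative rate always add up to 1, because they share the same denominator.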

Regression Model Errors

  Mean squared error (MSE)

  Root mean squared error (RMSE)

  Mean absolute error (MAE)
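
For revision, the three regression errors can be computed as in the sketch below. The function name and the sample values are made up for this example.

```python
import math


def regression_errors(actual, predicted):
    """Return MSE, RMSE, and MAE for two equal-length lists of numbers."""
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / len(errors)   # average squared error
    mae = sum(abs(e) for e in errors) / len(errors)  # average absolute error
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae}


# Hypothetical actual vs. predicted values.
result = regression_errors([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
print(result)
```

Because errors are squared before averaging, MSE and RMSE penalize large misses more heavily than MAE does.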


Domain 2: Fundamentals of Generative AI


AI - ML - DL - GAI

Model

In-context learning

Prompts, prompt tuning, prompt engineering

Every NLP model has a tokenizer that converts text into token IDs.

Vector - ordered list of numbers.

Ability to encode relationships and capture associations

Embeddings

Numerical vector representations of tokens that capture the semantic meaning of each token
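
A toy sketch of the ideas above: text becomes token IDs, each ID maps to a vector, and related tokens end up with similar vectors. The vocabulary and the 3-dimensional embedding values are invented for illustration; real models learn embeddings with hundreds or thousands of dimensions.

```python
import math

# Hypothetical vocabulary and embeddings, hand-written for this example.
vocab = {"king": 0, "queen": 1, "apple": 2}
embeddings = [
    [0.90, 0.80, 0.10],  # king
    [0.85, 0.90, 0.15],  # queen
    [0.10, 0.20, 0.95],  # apple
]


def tokenize(text):
    """Convert whitespace-separated words into token IDs, skipping unknown words."""
    return [vocab[word] for word in text.lower().split() if word in vocab]


def cosine_similarity(a, b):
    """Similarity of two vectors: close to 1 means they point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


ids = tokenize("King queen apple")
king, queen, apple = (embeddings[i] for i in ids)
print(ids)
print(cosine_similarity(king, queen) > cosine_similarity(king, apple))
```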

Self-attention


LLMs

Deep learning foundation models

Transformers

Unimodal or multimodal

Multimodal use cases

Multimodal tasks

Diffusion Models

Forward Diffusion

Reverse Diffusion

Stable Diffusion

Works in a reduced-definition latent space rather than in the pixel space of the image

SageMaker + Amazon Q Developer

Amazon Nimble Studio and Amazon Sumerian

Gen AI Architectures

Generative Adversarial Networks GANs

Variational autoencoders VAE

Transformers


AI Project lifecycle

Identify use case

Experiment and select

Adapt, align and augment

Evaluate

Deploy and integrate

Monitor


Interpretability

Intrinsic analysis

Post hoc analysis


Traditional ML model outputs are deterministic (same input, same output)

Gen AI outputs are non-deterministic


Gen AI Performance metrics

Recall-Oriented Understudy for Gisting Evaluation (ROUGE)

Bilingual Evaluation Understudy (BLEU)


Transfer learning


SageMaker JumpStart


Domain 3: Applications of Foundation Models


Considerations

Architecture

Complexity

Availability

Compatibility

Explainability

Interpretability


Inference

It is the process of generating an output from an input that you provided to the model.

Input = Prompt and inference parameters

Randomness and Diversity

Temperature (lower value = favors high-probability outputs; higher value = allows more low-probability outputs)

Top K (lower value = smaller pool of candidate tokens)

Top P

Length

Response Length

Penalties

Stop sequences
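
To see how temperature, Top K, and Top P interact, here is a simplified sketch of next-token sampling. It is an illustration of the general technique, not how any particular Amazon Bedrock model implements it; the function name and the toy logits are assumptions made for this example.

```python
import math
import random


def sample_next_token(logits, temperature=1.0, top_k=None, top_p=None, rng=random):
    """Pick one token from a {token: logit} dict.

    temperature must be > 0. Values below 1 sharpen the distribution,
    values above 1 flatten it.
    """
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    highest = max(scaled.values())
    # Softmax (shifted by the max for numerical stability).
    exps = {tok: math.exp(value - highest) for tok, value in scaled.items()}
    total = sum(exps.values())
    ranked = sorted(
        ((tok, e / total) for tok, e in exps.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )

    # Top K: keep only the K most likely tokens.
    if top_k is not None:
        ranked = ranked[:top_k]

    # Top P: keep the smallest set of tokens whose probabilities reach P.
    if top_p is not None:
        kept, cumulative = [], 0.0
        for tok, prob in ranked:
            kept.append((tok, prob))
            cumulative += prob
            if cumulative >= top_p:
                break
        ranked = kept

    tokens = [tok for tok, _ in ranked]
    weights = [prob for _, prob in ranked]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical logits for three candidate tokens.
logits = {"cat": 2.0, "dog": 1.0, "fish": 0.1}
print(sample_next_token(logits, top_k=1))
print(sample_next_token(logits, temperature=2.0))
```

With `top_k=1` the result is always the most likely token; raising the temperature makes the less likely tokens show up more often.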

Prompt

A specific set of inputs to guide LLMs to generate an appropriate output or completion

Retrieval Augmented Generation (RAG)

Prompt enrichment and appending external data to your prompt

Vector Database

Collection of data stored as mathematical representations
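
The RAG idea (retrieve related text from a vector store, then append it to the prompt) can be sketched in plain Python. Everything here is a stand-in: the documents, the 3-number "embeddings", and the query vector are hand-written assumptions, whereas a real system would call an embedding model and a vector database.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Toy "vector database": (text, hypothetical embedding) pairs.
documents = [
    ("Refunds are processed within 5 business days.", [0.9, 0.1, 0.0]),
    ("Our office is closed on public holidays.", [0.1, 0.9, 0.1]),
    ("Shipping is free for orders over $50.", [0.2, 0.1, 0.9]),
]


def retrieve(query_vector, k=1):
    """Return the k document texts most similar to the query vector."""
    ranked = sorted(documents, key=lambda doc: cosine(query_vector, doc[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question, query_vector):
    """Enrich the prompt by appending the retrieved context."""
    context = "\n".join(retrieve(query_vector))
    return f"Use the context to answer.\n\nContext:\n{context}\n\nQuestion: {question}"


# The query vector is assumed to come from the same embedding model as the documents.
prompt = build_prompt("How long do refunds take?", [0.8, 0.2, 0.1])
print(prompt)
```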


AWS Services for Vector search databases

Amazon OpenSearch Service

Amazon OpenSearch Serverless

Amazon Aurora PostgreSQL

Amazon RDS PostgreSQL

Amazon Aurora

Amazon Neptune

Amazon DocumentDB [with MongoDB compatibility]


Amazon Bedrock Agents

Orchestrate prompt completion workflows


Prompt

Zero shot prompting

Few shot prompting

Prompt Template

Chain-of-thought prompting

Prompt tuning
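
The prompting styles listed above differ mainly in what text is placed before the model's turn. The sketch below builds each kind of prompt as a plain string; the template wording, function names, and the sentiment examples are illustrative choices, not a prescribed format.

```python
def zero_shot_prompt(task, text):
    """Zero-shot: only the instruction and the input, no examples."""
    return f"{task}\n\nInput: {text}\nOutput:"


def few_shot_prompt(task, examples, text):
    """Few-shot: a handful of worked examples before the real input."""
    shots = "\n\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{task}\n\n{shots}\n\nInput: {text}\nOutput:"


def chain_of_thought_prompt(task, text):
    """Chain-of-thought: ask the model to reason through intermediate steps."""
    return f"{task}\n\nInput: {text}\nLet's think step by step."


task = "Classify the sentiment of the review as positive or negative."
examples = [("Loved it, works great.", "positive"), ("Broke after a day.", "negative")]

print(zero_shot_prompt(task, "Fast delivery and good quality."))
print()
print(few_shot_prompt(task, examples, "Fast delivery and good quality."))
```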


Latent space

The encoded knowledge of language in LLMs or the stored patterns of data that capture relationships and reconstruct the language from the patterns when prompted

Statistical database


Prompt Engineering risks and limitations

Exposure

Prompt Injection

Jailbreaking

Hijacking

Poisoning


Training process for foundation models

Pretraining - Self-supervised learning

Fine-tuning - Supervised learning (risk: catastrophic forgetting)

Continuous pre-training

Fine-tuning techniques

Parameter-efficient fine-tuning (PEFT)

Low-Rank Adaptation (LoRA)

Representation fine-tuning (ReFT)

Multitask fine-tuning

Domain adaptation fine-tuning

Reinforcement learning from human feedback (RLHF)
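
A quick piece of arithmetic shows why LoRA counts as parameter-efficient. Instead of updating a full d x k weight matrix, LoRA trains two small matrices of shapes d x r and r x k. The layer size and rank below are example values chosen for illustration.

```python
def lora_trainable_params(d, k, r):
    """Compare trainable parameters: full fine-tuning vs. LoRA with rank r."""
    full = d * k          # every weight in the d x k matrix is updated
    lora = r * (d + k)    # only the d x r and r x k adapter matrices are trained
    return full, lora


# Example: one 4096 x 4096 layer with rank 8 (illustrative numbers).
full, lora = lora_trainable_params(d=4096, k=4096, r=8)
print(f"full fine-tuning: {full:,} parameters")
print(f"LoRA (r=8):       {lora:,} parameters ({100 * lora / full:.2f}% of full)")
```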


Data preparation for fine-tuning

Prepare your training data

Select prompts

Calculate loss

Update weights

Define evaluation steps


Data preparation AWS Services

Amazon SageMaker Canvas

Open-source frameworks

Amazon SageMaker Studio - integration with Amazon EMR, can use JupyterLab

AWS Glue

Amazon SageMaker Feature Store

Amazon SageMaker Clarify - if you have bias in your data

Amazon SageMaker Ground Truth - manage data labelling

Model performance

One option to reduce inference latency is to use a smaller LLM, but this might reduce the quality of its output

Gen AI Performance Metrics

Recall-Oriented Understudy for Gisting Evaluation (ROUGE)

Automatic summarization tasks

Machine translation software

Bilingual Evaluation Understudy (BLEU)

Used for translation tasks

General Language Understanding Evaluation (GLUE)

Compare against benchmarks set by the experts

Assess model generalization across multiple tasks

Holistic Evaluation of Language Models (HELM)

Help improve model transparency

Massive Multitask Language Understanding (MMLU)

Evaluates knowledge and problem solving capabilities of the model

Tested against history, mathematics, laws, computer science and more

Beyond the Imitation Game Benchmark (BIG-bench)

Focuses on tasks that are beyond the capabilities of the current language models
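
To get a feel for what ROUGE and BLEU measure, the sketch below computes unigram overlap between a generated text and a reference. It is a deliberately simplified illustration: ROUGE-1 recall is shown as defined for single words, while the precision function is only the unigram building block of BLEU (full BLEU also uses longer n-grams and a brevity penalty). The function names and sentences are made up for this example.

```python
from collections import Counter


def unigram_overlap(candidate, reference):
    """Count words shared by both texts, respecting how often each word occurs."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    return sum((cand & ref).values())


def rouge1_recall(candidate, reference):
    """Share of reference words that the candidate recovered."""
    return unigram_overlap(candidate, reference) / len(reference.split())


def unigram_precision(candidate, reference):
    """Share of candidate words that appear in the reference."""
    return unigram_overlap(candidate, reference) / len(candidate.split())


reference = "the cat sat on the mat"
candidate = "the cat sat"
print(rouge1_recall(candidate, reference))
print(unigram_precision(candidate, reference))
```

Here the short candidate scores perfectly on precision but misses half of the reference, which is why recall-oriented and precision-oriented metrics are reported for different tasks.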


AWS Services for model evaluation

Amazon SageMaker JumpStart

Amazon SageMaker Clarify


Domain 4: Guidelines for Responsible AI

Responsible AI

Fairness

Explainability

Robustness

Privacy and security

Governance

Transparency


Effects of bias and variance

Demographic disparities

Inaccuracy

Overfitting

Underfitting

User Trust


Responsible datasets

Inclusivity

Diversity

Balanced datasets

Privacy protection

Consent and transparency

Regular audits


Responsible practices

Environmental considerations

Sustainability

Transparency

Accountability

Stakeholder engagement


AWS services for responsible AI

Amazon SageMaker Clarify

Detect bias

Explainability

SageMaker Processing jobs


SageMaker pre-training bias analysis

Class imbalance

Label imbalance

Demographic disparity

Difference in positive proportions

Specificity difference

Recall difference

Accuracy difference

Treatment equality
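
Two of the metrics above are simple enough to compute by hand. The sketch below follows the commonly documented definitions: class imbalance compares group sizes, and a difference in positive proportions compares the rate of positive outcomes between two groups. The function names and the counts are hypothetical, and this is not the SageMaker Clarify API itself.

```python
def class_imbalance(n_advantaged, n_disadvantaged):
    """Ranges from -1 to 1; 0 means both groups are equally represented."""
    return (n_advantaged - n_disadvantaged) / (n_advantaged + n_disadvantaged)


def difference_in_positive_proportions(pos_a, n_a, pos_d, n_d):
    """Positive-outcome rate of group a minus that of group d; 0 means parity."""
    return pos_a / n_a - pos_d / n_d


# Hypothetical dataset: 800 records in group a (400 positive), 200 in group d (50 positive).
ci = class_imbalance(800, 200)
dpp = difference_in_positive_proportions(400, 800, 50, 200)
print(f"class imbalance: {ci:.2f}")
print(f"difference in positive proportions: {dpp:.2f}")
```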


Gen AI Risks

Hallucinations

Intellectual Property

Bias

Toxicity

Data privacy


Guardrails for Amazon Bedrock

Hate

Insults

Sexual

Violence

+ Denied topics


Model transparency

Interpretability - Deep analysis

Explainability - black box analysis

AI Service Card

Amazon SageMaker Model Cards

SageMaker provides

Feature attributions - SHAP Values

Partial dependence plots

Amazon Augmented AI (A2I) - sends data to human reviewers to review random predictions.

Use your own reviewers or use Amazon Mechanical Turk


Domain 5: Security, Compliance, and Governance for AI Solutions


IAM Identity Center

Workforce users, Workforce identities

Logging with CloudTrail

Captures API calls and related events

Integrated with SageMaker

Amazon SageMaker Role Manager

Preconfigured permissions for 12 activities


Encryption at rest

Amazon SageMaker

Data is encrypted by default on ML storage volumes

Notebook instances, SageMaker jobs, and endpoints


AWS Key Management Service - KMS

Amazon Macie

Identifies and alerts you to sensitive data

Remove PII during ingestion


AI System Vulnerabilities

Training Data

Input Data

Output Data

Models

Inversion

Theft

LLMs

Prompt Injection


Amazon SageMaker Model Monitor

Capture data

Create a baseline

Define data quality monitoring jobs

Evaluate statistics
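
The four steps above (capture data, create a baseline, define monitoring, evaluate statistics) can be pictured with a small sketch. This is a conceptual illustration only and does not use the SageMaker Model Monitor API; the function names, the mean-shift rule, and the threshold of 2 standard deviations are assumptions made for this example.

```python
import statistics


def create_baseline(values):
    """Summarize the training-time distribution of one feature."""
    return {"mean": statistics.mean(values), "stdev": statistics.stdev(values)}


def has_drifted(baseline, live_values, threshold=2.0):
    """Flag drift when the live mean moves more than `threshold` baseline stdevs."""
    shift = abs(statistics.mean(live_values) - baseline["mean"])
    return shift > threshold * baseline["stdev"]


# Hypothetical feature values captured at training time and from live traffic.
baseline = create_baseline([10, 12, 11, 9, 10, 11])
print(has_drifted(baseline, [10, 11, 12]))
print(has_drifted(baseline, [18, 20, 19]))
```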


Amazon SageMaker Model Registry

Amazon SageMaker Model Cards

Amazon SageMaker ML Lineage Tracking

Amazon SageMaker Feature Store

Amazon SageMaker Model Dashboard


Emerging AI compliance standards

ISO 42001 and ISO 23894

EU Artificial Intelligence Act

NIST AI Risk Management Framework (RMF)


AI Risk Management

Probability of occurrence

Severity of occurrence


Algorithmic Accountability Act

Transparency and explainability

Monitor for Bias


AWS Audit Manager

Audits AWS usage to assess compliance

Choose a framework

Gen AI

Customer frameworks

Collect evidence and add to audit report


Guardrails for Amazon Bedrock

Apply guardrails to any foundation model and agents for Amazon Bedrock

Configure harmful content filtering

Define and disallow denied topics

PII data


AWS Config

Continuously monitors and records configurations

AWS Config rules

Conformance packs

Operational best practices for AI and ML

Security best practices for Amazon SageMaker


Amazon Inspector

Works at application level

Performs automated security assessments on your applications


AWS Trusted Advisor

Provides guidance to help you

Reduce cost

Increase performance

Improve security


Data Governance

Curation

Discovery and understanding

Protection

Define roles

Data steward

Data owner

IT Roles


AWS Glue DataBrew for data governance

Data profiling

Data Lineage

AWS Glue Data Catalog

AWS Glue Data Quality


Curation

Data Quality Management

Data Integration

Data Management

Protection

Data Security

Data Compliance

Data Lifecycle management



GENERAL LINKS - For Revision


What are Transformers in Artificial Intelligence? -> aws.amazon.com/what-is/transformers-in-artificial-intelligence/

What are Foundation Models? -> aws.amazon.com/what-is/foundation-models/

What is Artificial Intelligence (AI)? -> aws.amazon.com/what-is/artificial-intelligence/

What is Machine Learning? -> aws.amazon.com/what-is/machine-learning/

What is Deep Learning? -> aws.amazon.com/what-is/deep-learning/

What is Generative AI? -> aws.amazon.com/what-is/generative-ai/

What’s the Difference Between Supervised and Unsupervised Learning? -> aws.amazon.com/compare/the-difference-between-machine-learning-supervised-and-unsupervised/

Machine Learning Concepts -> docs.aws.amazon.com/machine-learning/latest/dg/machine-learning-concepts.html

AWS AI Use Case Explorer -> aws.amazon.com/machine-learning/ai-use-cases/?use-cases

What is Amazon SageMaker? -> docs.aws.amazon.com/sagemaker/latest/dg/whatis.html

AWS Services - Machine Learning (ML) and Artificial Intelligence (AI) -> docs.aws.amazon.com/whitepapers/latest/aws-overview/machine-learning.html

AWS Deploy Serverless ML -> aws.amazon.com/blogs/machine-learning/deploy-a-serverless-ml-inference-endpoint-of-large-language-models-using-fastapi-aws-lambda-and-aws-cdk/

Amazon SageMaker - API Gateway - AWS Lambda -> aws.amazon.com/blogs/machine-learning/call-an-amazon-sagemaker-model-endpoint-using-amazon-api-gateway-and-aws-lambda/

Inference parameters -> docs.aws.amazon.com/bedrock/latest/userguide/inference-parameters.html

Amazon Bedrock or Amazon SageMaker? -> docs.aws.amazon.com/decision-guides/latest/bedrock-or-sagemaker/bedrock-or-sagemaker.html

Choosing a generative AI service -> docs.aws.amazon.com/decision-guides/latest/generative-ai-on-aws-how-to-choose/guide.html

Amazon Bedrock Agents -> aws.amazon.com/bedrock/agents/

What is RAG? - Retrieval-Augmented Generation AI Explained - AWS (amazon.com)

docs.aws.amazon.com/awscloudtrail/latest/userguide/how-cloudtrail-works.html

docs.aws.amazon.com/bedrock/latest/userguide/usingVPC.html

aws.amazon.com/blogs/machine-learning/use-aws-privatelink-to-set-up-private-access-to-amazon-bedrock/
