Let's talk AI - Intro for everyone
While I am preparing for my AI Practitioner certification on AWS, I was asked by different people: "where do I start learning about AI or GenAI?". So this is for you if you know the basics and have used ChatGPT, or maybe worked with Copilot, but never really understood how this works in the background and what AI does for us.
Let's define AI
AI, or Artificial Intelligence, is the broader term that is often used even when we actually talk about Machine Learning (ML), Deep Learning (DL) or Generative AI (GenAI).
AI as such mimics human behavior (that's the most basic definition). AI mimics/performs tasks that typically require human intelligence, such as reasoning, learning, perception, problem solving and decision-making.
AI is the umbrella term for the other technologies mentioned above.
A good explanation is also here: What is AI? - Artificial Intelligence Explained - AWS (amazon.com)
Use-cases for AI include things like vision or speech: think about facial recognition, image recognition or computer vision (e.g. autonomous vehicles), as well as fraud detection.
Machine Learning
Machine Learning is a sub-technique of AI that learns from data. The main thing to understand: ML always solves a specific problem, which is why it is also called narrow AI. You provide data and expect a certain outcome. The data is leveraged to improve performance on a given task and to make predictions based on the data used to train the model.
Use-cases that you might already know:
Regression - Used e.g. for forecasting (see the small sketch after this list)
Classification - Think about image recognition cases (cat vs dog kind of thing)
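To make the regression/forecasting idea concrete, here is a minimal sketch using scikit-learn (assumed to be installed); the numbers are made-up illustration data, not a real dataset.

```python
# A minimal regression sketch with scikit-learn: forecasting a value from past data.
import numpy as np
from sklearn.linear_model import LinearRegression

months = np.array([[1], [2], [3], [4], [5], [6]])    # feature: month number
sales = np.array([100, 120, 135, 160, 170, 195])     # target: units sold per month

model = LinearRegression()
model.fit(months, sales)                              # learn the trend from the data
print("Forecast for month 7:", model.predict([[7]])[0])
```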
Difference between ML and AI is explained here: AI vs Machine Learning - Difference Between Artificial Intelligence and ML - AWS (amazon.com)
ML Workflow
How does an AI project work?
Start with the problem
If you want to use a model to get insights from your data, you first need to start with the business case for the ML problem. So there needs to be a need/problem that you define.
Get the data right
Once you have identified the problem, you need the data. Most AI/ML projects spend about 80% of their time on getting the right data, cleaning it, labeling it, testing and refining it.
Split the data into 80% for the training of your model and 20% into an evaluation/testing dataset to verify that the model works.
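Here is a sketch of that 80/20 split, assuming scikit-learn is installed; X and y below are toy stand-ins for your cleaned, labeled dataset.

```python
# Splitting a dataset into 80% training and 20% evaluation/testing data.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(-1, 1)   # 100 toy samples, one feature each
y = (X.ravel() > 50).astype(int)    # toy labels

X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2,     # 20% held back for evaluation/testing
    random_state=42,   # fixed seed so the split is reproducible
)
print(len(X_train), "training samples,", len(X_test), "test samples")  # 80 / 20
```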
Define the model
Once you have the data, you want to pick the right model and test it. Sometimes a foundation model might solve your problem, sometimes you need to fine-tune the model or augment it (RAG), and sometimes you need to build it from scratch. Of course, building the model on your own is the most expensive option, so think twice here. Training a model involves lots of CPU/GPU power, and therefore the cost rises a lot.
Validation/Testing
An important step is to verify that the outcomes you get meet the expectations and solve the problem for your business, or generate the expected outcome. If not, review the data and the model and go back to training and validation. There are different ways to validate a model (automatic validation, human/business users or a 3rd party).
If you are working as an ML Engineer, most likely you will end up using tools like AWS SageMaker or Bedrock. While Bedrock is a managed-service kind of solution, SageMaker is much more customizable for ML/AutoML cases. Bedrock is great if you think about GenAI cases, including a playground to test out new models.
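To give you a feel for Bedrock, here is a minimal sketch of calling a foundation model via boto3's Converse API. Assumptions: boto3 is installed, AWS credentials are configured, and the model ID is just an example that would have to be enabled in your account and region.

```python
# Calling a foundation model on Amazon Bedrock with boto3 (model ID is a placeholder).
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",   # example model ID
    messages=[{"role": "user", "content": [{"text": "Explain ML in one sentence."}]}],
)
print(response["output"]["message"]["content"][0]["text"])
```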
ML Architecture
Data Layer
For ML to work, you need to have data available. Like in big-data solutions, this data is stored on storage services such as AWS S3 or Azure Blob Storage.
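As a tiny illustration of the data layer, here is a sketch of putting a training file into Amazon S3 with boto3. The file and bucket names are placeholders, and credentials are assumed to be configured.

```python
# Uploading a (cleaned) training dataset to S3 so the ML layers above can use it.
import boto3

s3 = boto3.client("s3")
s3.upload_file(
    Filename="training_data.csv",        # local file with your data
    Bucket="my-ml-data-bucket",          # placeholder bucket name
    Key="datasets/training_data.csv",    # path inside the bucket
)
```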
ML Framework/Algorithm Layer
Data Scientists and ML Engineers work to find the right model based on the business requirement/use-case.
Model Layer
Implementation and training of a model (think tweaking of parameters and functions).
Application Layer
How to serve/present the results to the user.
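One common way to serve results is to put the trained model behind a small HTTP API. Here is a hedged sketch using Flask (assumed installed); "model.pkl" is a placeholder for a model you trained and saved earlier.

```python
# Application layer sketch: a tiny prediction endpoint around a saved model.
import pickle
from flask import Flask, jsonify, request

app = Flask(__name__)
with open("model.pkl", "rb") as f:
    model = pickle.load(f)   # load the trained model once at startup

@app.route("/predict", methods=["POST"])
def predict():
    features = request.json["features"]   # e.g. {"features": [[5.1, 3.5, 1.4, 0.2]]}
    prediction = model.predict(features)
    return jsonify({"prediction": prediction.tolist()})

if __name__ == "__main__":
    app.run(port=5000)
```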
What is Deep Learning?
Deep learning functions similarly to our human brain. It's a bit abstract and hard to grasp if explained by a scientist, but let me try to make that complex thing simple.
I would like to reference AWS here again, as they explain the layers in a simple way.
This is how AWS explains it:
Input Layer
An artificial neural network has several nodes that input data into it. These nodes make up the input layer of the system.
Hidden Layer
The input layer processes and passes the data to layers further in the neural network. These hidden layers process information at different levels, adapting their behavior as they receive new information. Deep learning networks have hundreds of hidden layers that they can use to analyze a problem from several different angles.
For example, if you were given an image of an unknown animal that you had to classify, you would compare it with animals you already know. You would look at the shape of its eyes and ears, its size, the number of legs, and its fur pattern, and try to identify patterns like these.
The hidden layers in deep neural networks work in the same way. If a deep learning algorithm is trying to classify an animal image, each of its hidden layers processes a different feature of the animal and tries to accurately categorize it.
Output Layer
The output layer consists of the nodes that output the data. Deep learning models that output "yes" or "no" answers have only two nodes in the output layer. On the other hand, those that output a wider range of answers have more nodes.
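If you prefer code over words, here is a tiny PyTorch sketch of those three layer types (assuming PyTorch is installed; the layer sizes are arbitrary examples, not anything AWS prescribes).

```python
# A minimal neural network with an input layer, hidden layers and an output layer.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(10, 32),   # input layer: 10 input features feed into the network
    nn.ReLU(),
    nn.Linear(32, 32),   # hidden layer: learns intermediate patterns/features
    nn.ReLU(),
    nn.Linear(32, 2),    # output layer: 2 nodes, e.g. a "yes"/"no" style answer
)

sample = torch.randn(1, 10)   # one made-up input sample
print(model(sample))          # raw scores for the two output nodes
```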
Still not easy to get, right?
I have one more great source that explains it simply: Deep Learning: Overview of Neurons and Activation Functions | by Stacey Ronaghan | Medium
Now you will say "are you kidding, I thought this was simple!". Well, it's not simple, sorry.
You just need to understand that in deep learning we often work with unlabeled data (which is referred to as "unsupervised learning"), while in classic ML we typically work with labeled data (therefore it's called "supervised learning"). So in deep learning we let the model figure out the logic within the data. Why is this now such a big thing? Well, the models can be used to identify patterns in data (e.g. fraud detection), and it is the base for GenAI (think about predicting the next word in an LLM (Large Language Model), e.g. ChatGPT).
Traditional ML is limited as it does not work with many layers, but in deep learning you can have many, many layers and can train a model on massive amounts of unstructured data. So in ML you needed to label the data; in DL you don't.
Deep learning is already making our daily tasks easier; think about autocompletion on your mobile phone when typing messages (it predicts words or corrects them with a spell-check).
Deep learning works best if the quality of the data you put in is good. So even if you can use loads of data, you still want to use good data to train the model.
Backpropagation
Deep learning also learns over time using techniques like backpropagation (Backpropagation - Wikipedia).
At this point you should understand that DL is already very much about statistics and math and works with complex algorithms. I am not a math expert at all, but to explain backpropagation simply:
The algorithm goes backwards from the output layer and evaluates the errors using a loss function. The more loss you have, the worse your model works; that's also a way to evaluate your model and check how well it works. Over time it should have fewer errors.
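Here is a very small NumPy sketch of that idea: compute the loss, look at how it changes with the weight, and nudge the weight "backwards" so the error shrinks over time. Real networks do this across many layers; this toy example uses a single weight purely for illustration.

```python
# Toy gradient-descent loop: the loss should get smaller with every step.
import numpy as np

x, target = 2.0, 10.0     # one input and the answer we want
w = 0.5                   # start with a bad weight
learning_rate = 0.05

for step in range(20):
    prediction = w * x                        # forward pass
    loss = (prediction - target) ** 2         # loss function: squared error
    gradient = 2 * (prediction - target) * x  # how the loss changes if w changes
    w -= learning_rate * gradient             # step "backwards" to reduce the loss
    if step % 5 == 0:
        print(f"step {step}: loss={loss:.3f}, w={w:.3f}")
```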
Use-Cases for Deep Learning
Alright so DL is complex, but what is it used for?
Use-cases include things like:
Computer Vision - image classification, object detection, image segmentation
NLP (Natural Language Processing) - text classification, sentiment analysis, translation, language generation (see the short example after this list)
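To see an NLP use-case in a few lines, here is a quick sentiment-analysis sketch using the Hugging Face "transformers" library (assumed installed; the first run downloads a default pretrained model).

```python
# Sentiment analysis with a pretrained model via the transformers pipeline.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
print(classifier("I really enjoyed this article about AI!"))
# -> something like [{'label': 'POSITIVE', 'score': 0.99...}]
```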
What is GenAI?
Well everyone knows GenAI - right?
I think no. Why? Lets get into it.
Everyone might know that ChatGPT can create text or an image based on the user's input. It uses foundation models in the background to generate outputs based on the user's inputs.
The foundation models are pre-trained.
Do you know what GPT stands for? Generative Pre-trained Transformer.
Besides text and images, GenAI can also serve other use-cases.
The foundation models work on top of unlabeled data and are based on neural networks (you know what that is from the DL part). These foundation models can be fine-tuned for specific use-cases.
Large Language Model (LLM)
LLMs are used for text generation and can create complete sentences instead of word-by-word completion. They are very strong in text generation and text processing. LLMs work with encoders and decoders to predict the next word with a certain probability.
Transformer-based LLMs can generate and understand human-like text. They are trained on massive amounts of data from the internet, from books and from other texts, and they understand the relationships between the different words and phrases.
Examples are OpenAI's GPT models (which power ChatGPT) or Google's BERT.
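To make "predict the next word with a certain probability" concrete, here is a hedged sketch that inspects GPT-2's next-token probabilities using the Hugging Face "transformers" library (assumed installed; the small GPT-2 model downloads on first run).

```python
# Show the most likely next tokens, and their probabilities, for a short prompt.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("The weather today is", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]   # scores for the next token only
probs = torch.softmax(logits, dim=-1)        # turn scores into probabilities

top = torch.topk(probs, 5)                   # five most likely next tokens
for p, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id.item()):>10s}  {p.item():.3f}")
```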
Diffusion Models (Stable Diffusion)
Diffusion models are trained by adding noise to images; the model then learns to reverse that noise, which allows it to re-create or generate images (this is how Stable Diffusion works).
This process enables use-cases such as generating new images or editing existing ones.
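Here is a toy NumPy sketch of just the "add noise" part of that idea; a real diffusion model learns to reverse these steps, which this illustration does not show.

```python
# Gradually mixing an image with random noise (the forward diffusion process).
import numpy as np

image = np.random.rand(64, 64)            # stand-in for a real grayscale image
noisy = image.copy()
for step in range(10):
    noise = np.random.normal(0, 0.1, image.shape)
    noisy = np.clip(noisy + noise, 0, 1)  # each step adds a bit more noise
print("Pixel spread before:", round(image.std(), 3), "after:", round(noisy.std(), 3))
```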
Multi-Modal Models (think GPT-4o)
Multi-modal models such as Gemini or GPT-4o can combine the above-mentioned capabilities, e.g. generate a video including subtitles/text.
Let me know if that helps and if you want to go one step deeper and talk about the different models, agents or use-cases. Stay tuned for the next article on AI.