Where is AI, where am I?

Vali Jafarov

Lead Full-stack Engineer | Swisscom | Java, Kotlin, Angular, Machine Learning | Open for a new opportunity

Published: December 17, 2023

This article is meant to help you catch up with AI, even if it moves ahead again while you are reading.

You see news or an announcement about AI almost every day. This has been triggered by advancements in ChatGPT and by the ability to train on large datasets, which results in better AI suggestions.

Google’s Transformer architecture was a breakthrough: it allows a neural network to learn context and meaning by tracking relationships in sequential data (like words in the same sentence), and it processes the sequence in parallel, which has accelerated training. It is also the last letter of ChatGPT, which is based on the Transformer architecture.
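To make "tracking relationships in sequential data" a bit more concrete, here is a minimal sketch of single-head scaled dot-product self-attention, the core operation of the Transformer. The sizes, the random weights, and the function names are assumptions chosen for illustration; a real model learns the weights and stacks many such layers.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum before exponentiating for numerical stability.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention over a sequence x of shape (seq_len, d_model)."""
    q = x @ w_q  # queries: what each position is looking for
    k = x @ w_k  # keys: what each position offers
    v = x @ w_v  # values: the content that gets mixed together
    # Every position is compared with every other position in one matrix product,
    # which is why the whole sequence can be processed in parallel.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

# Hypothetical toy input: 4 "words", each an 8-dimensional vector.
rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

output, weights = self_attention(x, w_q, w_k, w_v)
```

Each row of `weights` sums to 1 and says how strongly one position attends to every other position in the same sequence.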

Latest applications of AI

The latest advancements have opened big doors for various AI productivity applications, which you can see below.

ChatGPT

Nowadays it is the first application that comes to mind when we say AI.

So let’s look at the most important part of ChatGPT: how does it work?

There are 3 stages in the training of ChatGPT (Chat Generative Pre-trained Transformer):

Stage 1: Generative Pre-Training

The transformer is trained on lots of text data from all over the internet: websites, books, articles, etc. The basis for many capabilities, including language modeling, summarization, translation, and sentiment analysis, is laid in this intensive stage.
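The language-modeling part of this stage comes down to predicting the next token from the preceding text. The sketch below illustrates that objective with a deliberately tiny stand-in: a bigram model built from word counts on a made-up corpus. This is an assumption for illustration only; GPT models use a Transformer over subword tokens, not count tables.

```python
import math
from collections import Counter, defaultdict

# Made-up toy corpus; real pre-training uses text from all over the internet.
corpus = "the cat sat on the mat the cat ate".split()

# Count how often each token follows each previous token.
follow_counts = defaultdict(Counter)
for prev_tok, next_tok in zip(corpus, corpus[1:]):
    follow_counts[prev_tok][next_tok] += 1

def next_token_probs(prev_tok):
    """Estimated probability of each possible next token after prev_tok."""
    counts = follow_counts[prev_tok]
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}

def average_nll(tokens):
    """Average negative log-likelihood of each token given the previous one."""
    losses = [-math.log(next_token_probs(p)[n]) for p, n in zip(tokens, tokens[1:])]
    return sum(losses) / len(losses)

probs_after_the = next_token_probs("the")  # "cat" follows "the" twice, "mat" once
loss = average_nll(corpus)
```

Pre-training lowers this kind of loss over a very large corpus, so the model gets better at guessing what comes next.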

Generative Pre-Training stage (credit: Pradeep Menon)

Stage 2: Supervised Fine-Tuning (SFT)

The model gets trained on specific tasks that are relevant to what the user is looking for, including conversational chat. The main aim is to match users’ expectations.

Supervised Fine-Tuning (SFT) stage (credit: Pradeep Menon)

The 1st step in this stage is to form sets of crafted conversations. These conversations are created by one human agent chatting with another human agent who pretends to be a chatbot giving “ideal” responses.
Then the training corpus is built by using the conversation history as input and the ideal next response as output. Both are converted into sequences of tokens, which are used to update the Base GPT model’s parameters.
The Base GPT model is trained on this corpus using the Stochastic Gradient Descent (SGD) algorithm. Training proceeds in rounds; after each round, feedback is given to the model to correct its errors. This process repeats until the model gets really good at conversation. SGD is the optimization algorithm that keeps tweaking the parameters of a model until the cost function is minimized.
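The idea of SGD described above can be sketched on a much smaller problem than a language model. The example below fits a straight line to synthetic data; the data, the learning rate, and the batch size are all assumptions picked for illustration, not values used for ChatGPT.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data for illustration: y = 3x + 2 plus a little noise.
x = rng.uniform(-1.0, 1.0, size=200)
y = 3.0 * x + 2.0 + rng.normal(scale=0.05, size=200)

w, b = 0.0, 0.0        # model parameters, starting from zero
learning_rate = 0.1    # hypothetical step size
batch_size = 16        # hypothetical mini-batch size

def cost(w, b):
    """Mean squared error of the line w*x + b over the full data set."""
    return float(np.mean((w * x + b - y) ** 2))

initial_cost = cost(w, b)
for epoch in range(200):
    order = rng.permutation(len(x))  # visit the examples in a new random order
    for start in range(0, len(x), batch_size):
        idx = order[start:start + batch_size]
        error = w * x[idx] + b - y[idx]
        # Gradients of the mean squared error on this mini-batch only.
        grad_w = 2.0 * np.mean(error * x[idx])
        grad_b = 2.0 * np.mean(error)
        # Nudge the parameters a small step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
final_cost = cost(w, b)
```

After training, `w` and `b` end up close to 3 and 2. Fine-tuning a language model follows the same loop, only with far more parameters and a cost measured on the predicted next response.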

During this stage, the parameters of the ChatGPT base model are updated to capture task-specific information that it did not have before SFT.

Stage 3: Reinforcement Learning from Human Feedback (RLHF)

In reinforcement learning, an agent interacts with its environment and learns to make decisions by getting rewarded or punished. In this stage, the model produces several responses to the same request and a human ranks them. The ranked responses are paired up so that a reward model can learn which of two responses is better, and the reward model then gives ChatGPT a high score when it responds with the best option.
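As a rough sketch of how a ranking can become a training signal for a reward model: the ranking is split into (better, worse) pairs, and the reward model is penalized when it scores the worse response higher. The pairwise loss used here is one common formulation and is an assumption; the article does not say which loss ChatGPT's reward model uses. The response names are placeholders.

```python
import math
from itertools import combinations

# Hypothetical responses to one request, ordered by a human from best to worst.
ranked = ["response A", "response B", "response C"]

# combinations() keeps the input order, so every pair is (better, worse).
pairs = list(combinations(ranked, 2))

def pairwise_ranking_loss(score_better, score_worse):
    """-log(sigmoid(score_better - score_worse)), written in a stable form."""
    margin = score_better - score_worse
    return math.log1p(math.exp(-margin))

# The loss is small when the reward model agrees with the human ranking ...
loss_agree = pairwise_ranking_loss(2.0, 0.0)
# ... and large when it prefers the response the human ranked lower.
loss_disagree = pairwise_ranking_loss(0.0, 2.0)
```

Minimizing this loss over many ranked pairs pushes the reward model to give higher scores to the responses humans prefer.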

But there is a trap in this stage known as Goodhart’s Law:

“when a measure becomes a target, it ceases to be a good measure.”

To deal with this issue, an extra step has been added to the stage: the Kullback-Leibler (KL) divergence, a measure of the difference between two probability distributions. It tells how much information is lost when one distribution is used to approximate the other, and it is commonly used in Machine Learning tasks. The model is penalized when the KL divergence between its outputs and those of the SFT model gets too high, which keeps it from drifting too far while chasing the reward. This is how ChatGPT mitigates the trap.

Finally, after this stage is complete, ChatGPT is ready to answer your questions. It only knows what it has been taught, so if you ask it about something it hasn’t been trained on, it will still produce an answer from the patterns it has learned, which may be wrong.
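For discrete distributions the KL divergence is KL(P || Q) = sum over i of P(i) * log(P(i) / Q(i)). The sketch below computes it and then subtracts it from a reward, which is one common way to apply such a penalty. The weight `beta` and the example probabilities are assumptions for illustration.

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) for two discrete distributions given as lists of probabilities.

    Assumes q is nonzero wherever p is nonzero.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def penalized_reward(reward, policy_probs, reference_probs, beta=0.1):
    """Reward minus a KL penalty; beta is a hypothetical penalty weight."""
    return reward - beta * kl_divergence(policy_probs, reference_probs)

reference = [0.5, 0.5]  # hypothetical next-token probabilities of the reference model
drifted = [0.9, 0.1]    # hypothetical probabilities after chasing the reward

kl_same = kl_divergence(reference, reference)   # identical distributions
kl_drifted = kl_divergence(reference, drifted)  # distributions that differ
```

Identical distributions give a divergence of 0, so the reward is untouched; the further the tuned model moves from the reference, the more of its reward is taken away.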

Gemini

It is Google’s long-awaited answer to ChatGPT. According to Google, it is its “most capable” AI model ever, and it was trained on video, images, audio and text. It has been available to developers through Google Cloud’s API since December 13, 2023. Google’s Bard, a chatbot similar to ChatGPT, is already powered by Gemini Pro, and the suggested replies on the keyboard of Pixel 8 Pro smartphones are powered by Gemini Nano. Gemini is expected to be introduced into the main Google products in 2024.

Google says Gemini Ultra scores 90% on the Massive Multitask Language Understanding (MMLU) benchmark, higher than any other model including GPT-4. It was also tested for safety using a data set of toxic model prompts developed by the Allen Institute for AI.

Alexei Efros, a professor at UC Berkeley who specializes in the visual capabilities of AI, says Google’s general approach with Gemini appears promising. However, he says that “…the problem with all these proprietary models, we don’t really know what’s inside”.

Now let’s look at the latest applications in the Computer Vision field of AI.

StyleGAN

The Generative Adversarial Network (GAN) is a class of Machine Learning frameworks and a prominent approach to generative AI that pits one neural network against another.

StyleGAN can generate two images and then combine them by taking low-level features from one and high-level features from the other. The generator uses a mixing regularization technique, which causes some percentage of both to appear in the output image. After each convolution layer, per-pixel noise is added.
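A toy sketch of the two ideas in this paragraph, style mixing and per-pixel noise, is shown below. It only manipulates arrays and does not generate images; the layer count, the crossover point, the style size, and the fixed noise strength are assumptions for illustration (in the real generator the noise scaling is learned).

```python
import numpy as np

rng = np.random.default_rng(0)
num_layers, style_dim = 8, 4  # hypothetical sizes

# Two style vectors, standing in for the styles of two source images.
style_a = rng.normal(size=style_dim)
style_b = rng.normal(size=style_dim)

def mix_styles(style_a, style_b, num_layers, crossover):
    """Use style_a for layers before `crossover` and style_b from there on."""
    return np.stack(
        [style_a if layer < crossover else style_b for layer in range(num_layers)]
    )

# Early layers follow source A, later layers follow source B.
per_layer_styles = mix_styles(style_a, style_b, num_layers, crossover=3)

def add_pixel_noise(feature_map, strength, rng):
    """Add one random value per pixel, shared across all channels."""
    noise = rng.normal(size=(1,) + feature_map.shape[1:])
    return feature_map + strength * noise

features = np.zeros((2, 4, 4))  # (channels, height, width) placeholder feature map
noisy = add_pixel_noise(features, strength=0.1, rng=rng)
```

Moving the crossover point changes how much of the result is driven by each source, which is the knob behind "some percentage of both" appearing in the output.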

The StyleGAN method works by gradually increasing the resolution, which ensures that the network evolves slowly, learning a simple problem first before progressing to more complex ones. Instead of generating a single image, it generates multiple ones, and this technique allows styles or features to be dissociated from each other.
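The gradual increase in resolution can be pictured as a simple schedule in which each training phase doubles the image size. The start and end resolutions below are assumptions for illustration.

```python
def resolution_schedule(start=4, final=1024):
    """List the image resolutions visited when doubling from start up to final."""
    resolutions = []
    res = start
    while res <= final:
        resolutions.append(res)
        res *= 2
    return resolutions

schedule = resolution_schedule()
```

Training on the small resolutions first lets the network settle the coarse structure before it has to deal with fine detail.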

Change in StyleGAN results by resolution & time (credit: fritz.ai)

StyleGAN makes it possible to train on and reconstruct historical pictures.

Reconstruction of William Shakespeare using StyleGAN (source: Nathan Shipley)

Verification

Entrupy has developed an app to detect imitation and fake bags. It verifies a product by checking many traits such as color, stitching, and leather patterns.

AI device that can detect fake, imitation handbags (credit: El Pais)

Video solutions

Dragonfruit, an SF-based start-up, has developed a stock-out solution that applies computer vision technology to monitor products on store shelves and sends real-time alerts via email, text, or integrations into the user’s system.

Bard lets you chat about YouTube videos, while Microsoft Copilot is able to summarize an entire video. Neural Radiance Fields (NeRFs) allow detailed 3D scenes to be generated from a series of 2D images.

Pika Labs has recently released a text-to-video offering.

Video-to-text advancements have the potential to compete with video commentators.

Animate Anyone can turn pictures into videos.

Advancements in the audio modality show the potential for Spotify to be flooded with AI music that surpasses Bruno Mars and Taylor Swift.

ChatGPT’s ability to identify actions from camera pictures reveals how much data will be generated by AI-powered CCTV cameras. This is one of plenty of reasons why AI needs to be regulated, and the EU is preparing the 1st AI Law. Let’s look at it.

1st AI Law

On December 9, 2023, the Council and EU Parliament negotiators reached a provisional agreement on the EU AI Act, which aims at an AI landscape that is ethical, safe, and trustworthy.

It is a risk-based regulation, categorizing risks into 4 levels:

Minimal or No Risks: The majority of AI systems with negligible risks can continue without regulation.
Limited Risks: AI systems with manageable risks are subject to light transparency obligations to empower users with informed decision-making.
High Risks: A broad spectrum of high-risk AI systems will be authorised but with stringent requirements and obligations to access the EU market.
Unacceptable Risks: Systems deemed to pose unacceptable risks, including cognitive manipulation, predictive policing, emotion recognition in workplaces and schools, social scoring, and certain remote biometric identification systems, will be banned, with limited exceptions.

Top skills for AI roles

Now let’s review the top requirements for current AI roles. It should give you a final idea of where you stand relative to Artificial Intelligence. Experience in the area and a university degree in Computer Science/AI/Mathematics are the top qualifications asked for these roles.

The top skills wanted for top AI roles today:

Machine Learning Engineer

Python/Java/Scala
TensorFlow/PyTorch/PySpark/Scikit-learn
Spark/Hadoop/SQL
CI/CD, testing

Data Scientist

Spark/SQL/Hadoop/Pig/Hive/MapReduce
Python/Scala
NumPy/Pandas/TensorFlow/PyTorch/PySpark/Scikit-learn

NLP Engineer

NLTK/SpaCy/Gensim
GPT-3/Llama/T5/BERT
Transformers/PyTorch/JAX
Python/Scala

Robotics Engineer

ROS/ROS2
Python/C++

References

[1] StyleGAN: Use machine learning to generate and customize realistic images

[2] AI-pocalypse Now

[3] Discover how ChatGPT is trained!

[4] Google Just Launched Gemini

[5] World’s First AI Law
