Where is AI, where am I?
Vali Jafarov
Lead Full-stack Engineer | Swisscom | Java, Kotlin, Angular, Machine Learning | Open for a new opportunity
This article is a chance to catch up with AI, even if the field moves on while you are reading.
News and announcements about AI appear almost every day. The wave was triggered by the advances behind ChatGPT and by the ability to train on large datasets, which results in better AI suggestions.
Google’s Transformer architecture was a breakthrough: it lets a neural network learn context and meaning by tracking relationships in sequential data (such as the words in a sentence), and it processes that data in parallel, which has accelerated training. The final “T” in ChatGPT stands for Transformer, the architecture it is based on.
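To make "tracking relationships between words" a bit more concrete, here is a minimal sketch of scaled dot-product attention, the core operation of the Transformer. The toy vectors are made up, and the plain Python loops are only for readability; real implementations do the same arithmetic as matrix multiplications so that all positions are handled in parallel.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # How strongly this position relates to every other position
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
                  for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors
        mixed = [sum(w * v[j] for w, v in zip(weights, values))
                 for j in range(len(values[0]))]
        outputs.append(mixed)
    return outputs

# Three toy 2-dimensional "word" vectors (made-up numbers)
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed_tokens = attention(tokens, tokens, tokens)
```

Each output vector is a blend of all the input vectors, weighted by how closely they match, which is how a word picks up context from its neighbours.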
Latest applications of AI
Recent advancements have opened the door to a range of AI applications for productivity, covered below.
ChatGPT
Nowadays it is the first application that comes to mind when we say AI.
So let’s look at the most important part of ChatGPT: how does it work?
There are 3 stages in training of ChatGPT (Chat Generative Pre-trained Transformer):
Stage 1: Generative Pre-Training
The transformer is trained on large amounts of text from all over the internet: websites, books, articles, and so on. Many capabilities, including language modeling, summarization, translation, and sentiment analysis, are built up in this intensive stage.
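The idea behind this stage is predicting the next word from the text seen so far. The sketch below is a toy stand-in, not how ChatGPT is trained: it replaces the transformer with a simple table of word-pair counts, and the one-line corpus is made up.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count which word follows which in the training text."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
guess = predict_next(model, "the")  # "cat" follows "the" most often here
```

A real model learns far richer patterns than word pairs, but the objective is the same: given what came before, guess what comes next.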
Stage 2: Supervised Fine-Tuning (SFT)
The model is trained on specific tasks relevant to what users are looking for, including conversational chat. The main aim is to match users’ expectations.
During this stage, the parameters of the ChatGPT base model are updated to capture task-specific information that the model did not have before SFT.
Stage 3: Reinforcement Learning from Human Feedback (RLHF)
In reinforcement learning, an agent interacts with its environment and learns to make decisions by being rewarded or penalized. In this stage, candidate responses to a request are collected and a human ranks them. A reward model is trained on these request-response pairs to learn which response is better, and it gives ChatGPT a high score when it responds with the best option.
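One way to turn human rankings into a training signal is a pairwise ranking loss. The article's sources do not spell out the exact loss used for ChatGPT, so treat the form below as an assumption: it is a commonly described choice for reward models, and the scores passed in are made-up numbers.

```python
import math

def pairwise_ranking_loss(score_preferred, score_rejected):
    """-log(sigmoid(preferred - rejected)): small when the reward model
    scores the human-preferred response higher than the rejected one."""
    diff = score_preferred - score_rejected
    return math.log1p(math.exp(-diff))

# Reward model agrees with the human ranking: low loss
agrees = pairwise_ranking_loss(2.0, 0.0)
# Reward model disagrees with the human ranking: high loss
disagrees = pairwise_ranking_loss(0.0, 2.0)
```

Minimising this loss pushes the reward model to score responses in the same order the human ranked them.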
But there is a trap in this stage known as Goodhart’s Law:
“when a measure becomes a target, it ceases to be a good measure.”
To deal with this issue, an extra step has been added to the stage: Kullback-Leibler (KL) divergence, a measure of the difference between two probability distributions. It tells how much information is lost when one distribution is used to approximate the other, and it is commonly used in Machine Learning tasks. When the KL divergence is too high, it signals that the model is in trouble. This step is how the trap is addressed in ChatGPT.
Finally, once this stage is complete, ChatGPT is ready to answer your questions. It only knows what it has been taught, so if you ask about something it has not been trained on, it will give you a more or less random answer built from the patterns it learned in training.
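For discrete distributions, KL divergence is the sum of p(x) * log(p(x) / q(x)) over all outcomes. Here is a small sketch of it, together with one plausible way to use it as a penalty on the reward. The `penalized_reward` helper, the `beta` weight, and the example distributions are assumptions for illustration, not ChatGPT's actual values.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same outcomes.
    Assumes every q value is positive wherever p is positive."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def penalized_reward(reward, p_tuned, p_reference, beta=0.1):
    """Hypothetical helper: subtract a KL penalty so the tuned model is
    discouraged from drifting far from the reference model."""
    return reward - beta * kl_divergence(p_tuned, p_reference)

reference = [0.5, 0.3, 0.2]      # made-up next-word probabilities
close = [0.45, 0.35, 0.2]        # tuned model that stayed close
far = [0.98, 0.01, 0.01]         # tuned model that drifted a lot

small_gap = kl_divergence(close, reference)
large_gap = kl_divergence(far, reference)
```

With the same raw reward, the model that drifted further ends up with the lower penalized reward, which is what keeps it from simply gaming the reward model.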
Gemini
It is Google’s long-awaited answer to ChatGPT. It is the “most capable” AI model ever, according to Google, and it was trained on video, images, audio, and text. It has been available to developers through Google Cloud’s API since December 13, 2023. Google’s Bard, a chatbot similar to ChatGPT, is already powered by Gemini Pro, and the suggested replies on the keyboard of Pixel 8 Pro smartphones are powered by the smaller Gemini Nano. Gemini is expected to be introduced into the main Google products in 2024.
Google says Gemini Ultra scores 90% on the Massive Multitask Language Understanding (MMLU) benchmark, higher than any other model including GPT-4. Gemini was also tested using a dataset of toxic model prompts developed by the Allen Institute for AI.
Alexei Efros, a professor at UC Berkeley who specializes in the visual capabilities of AI, says Google’s general approach with Gemini appears promising. However, he says that “…the problem with all these proprietary models, we don’t really know what’s inside”.
Now let’s look at the latest applications in the Computer Vision field of AI.
StyleGAN
The Generative Adversarial Network (GAN) is a class of Machine Learning frameworks and a prominent approach to generative AI that pits one neural network against another.
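The "one network against another" idea can be sketched with two loss functions: a discriminator that tries to tell real images from generated ones, and a generator that tries to fool it. This is a minimal sketch of the standard binary cross-entropy form (with the non-saturating generator loss); the probabilities passed in are made-up numbers, not outputs of a real network.

```python
import math

def discriminator_loss(d_real, d_fake):
    """d_real / d_fake: the discriminator's probability that a real or a
    generated image is real. Low when it gets both right."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def generator_loss(d_fake):
    """Low when the discriminator believes the generated image is real."""
    return -math.log(d_fake)

# A discriminator that separates real from fake well
sharp = discriminator_loss(d_real=0.9, d_fake=0.1)
# A discriminator that is only guessing
guessing = discriminator_loss(d_real=0.5, d_fake=0.5)
```

Training alternates between the two: improving the discriminator makes the generator's job harder, and improving the generator makes the discriminator's job harder.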
StyleGAN generates two images and combines them by taking low-level features from one and high-level features from the other. The generator uses a mixing regularization technique, so that some percentage of both appears in the output image. After each convolution layer, per-pixel noise is added.
The StyleGAN method works by gradually increasing the resolution, so the network evolves slowly, first learning a simple problem before progressing to more complex ones. Instead of generating a single image, it generates multiple ones, and this technique allows styles or features to be dissociated from each other.
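A rough sketch of these two ideas, under simplifying assumptions: styles are represented by plain labels instead of learned vectors, a feature map is a small list of lists, and the helper names `mix_styles` and `add_pixel_noise` are hypothetical. It is meant to show the mechanics, not the actual StyleGAN code.

```python
import random

def mix_styles(style_a, style_b, num_layers, crossover=None, rng=random):
    """Use style_a for the early (coarse) layers and style_b for the
    later (fine) layers, switching at a crossover layer."""
    if crossover is None:
        crossover = rng.randint(1, num_layers - 1)
    return [style_a if layer < crossover else style_b
            for layer in range(num_layers)]

def add_pixel_noise(feature_map, strength, rng=random):
    """Add independent Gaussian noise to every pixel of a feature map."""
    return [[value + strength * rng.gauss(0.0, 1.0) for value in row]
            for row in feature_map]

per_layer_styles = mix_styles("A", "B", num_layers=6, crossover=2)
noisy = add_pixel_noise([[0.0, 0.0], [0.0, 0.0]], strength=0.1)
```

Moving the crossover point earlier or later changes how much of each source image shows up in the result.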
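The "gradually increasing the resolution" part can be pictured as a schedule in which the image size doubles from stage to stage. The helper below is a hypothetical illustration, and the default start and final sizes are assumptions chosen for the example.

```python
def resolution_schedule(start=4, final=1024):
    """Return the image sizes visited when doubling from start to final."""
    resolutions = []
    size = start
    while size <= final:
        resolutions.append(size)
        size *= 2
    return resolutions

stages = resolution_schedule()
```

The network first learns the easy, blurry version of the problem at the smallest size and only later has to deal with fine detail.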
StyleGAN can be trained on historical pictures and used to reconstruct them.
Verification
Entrupy has developed an app that detects imitation and fake bags. It verifies a product by checking many traits, such as color, stitching, and leather patterns.
Video solutions
Dragonfruit, an SF-based start-up, has developed a stock-out solution that applies computer vision to monitor products on store shelves and sends real-time alerts via email, text, or integrations into the user’s system.
Bard lets you chat with YouTube videos, while Microsoft Copilot can summarize an entire video. Neural Radiance Fields (NeRFs) allow detailed 3D scenes to be generated from a series of 2D images.
Pika Labs has recently released a text-to-video offering.
Video-to-text advancements have the potential to compete with video commentators.
Animate Anyone can turn pictures into videos.
Advancements in the audio modality show the potential for Spotify to be flooded with AI music that surpasses Bruno Mars and Taylor Swift.
ChatGPT’s ability to identify actions from camera pictures shows how much data AI-powered CCTV cameras will generate. This is one of many reasons why AI needs to be regulated, and the EU is preparing the first AI law. Let’s look at it.
1st AI Law
On December 9, 2023, the Council and EU Parliament negotiators reached a provisional agreement on the EU AI Act, which aims for a global AI landscape that is ethical, safe, and trustworthy.
It is a risk-based regulation that categorizes risks into 4 levels: unacceptable, high, limited, and minimal risk.
Top skills for AI roles
Now let’s review the top requirements for current AI roles. This should give you a final idea of where you stand relative to Artificial Intelligence. Experience in the area and a university degree in Computer Science, AI, or Mathematics are the top qualifications asked for these roles.
The top skills wanted for top AI roles today:
Machine Learning Engineer
Data Scientist
NLP Engineer
Robotics Engineer