Monthly AI Recap - June 2024

Last month, we focused on GPT-4o and the overall trend toward faster, more lightweight models.

Today, we'll talk about:

- The revolution in AI consumption

- The forefront of edge-device GenAI and the clever technical solutions behind it

- A new member of the Claude 3 family

- NVIDIA's current state and where they seem to be heading


Apple Intelligence

AI for everyone?

The best available AI is already ubiquitous. GPT-4o, OpenAI's most recent model, is available to free users.

The number of requests is limited, but it should be enough if you're not a hardcore user.

Still, many people have only heard about GenAI; perhaps some use it occasionally.


So here's the thing - OpenAI's platform is used by around 180 million people.

Soon, all iPhone users will be consciously using GenAI. That's a staggering 1.3 billion people.


It won't happen overnight, as the release will be rolled out in stages over a longer period.

First, the development cycles are much more straightforward this way, as Apple can form dedicated teams with a fixed roadmap.

Second, Apple will start releasing the new features in the US first, as they're not yet compliant with various data-protection regulations in the EU.


The AI will be part of the following components across Apple's operating systems:


iOS 18

- Advanced photo editing via commands

- Smart content summaries in Safari

- Custom emojis (Genmoji)

- Audio transcription

- Siri and GPT-4o integration


macOS Sequoia

- Advanced writing and image editing tools

- New gaming functionalities (possibly AI upscaling)

- Improved system settings


iPadOS

- Accessibility features like eye tracking

- Advanced note-taking


That's revolution #1 wrapped up.


How data moves

The second revolution is directly tied to the first, but it plays out in a totally different realm: the technological one.

Apple will be one of the first major companies to run GenAI on edge devices at this scale, as the new functionalities will cover all of the most popular Apple products.

Funnily enough, most innovations result from the company's approach to user privacy.

Prioritizing on-device processing allows Apple to keep as much of the data as possible on the device.


There is a problem, though. AI models use RAM to process data and generate outputs. A lot of RAM.


A 7-billion-parameter model requires about 14 GB of RAM to run (at 16-bit precision, that's 2 bytes per parameter). GPT-4 is rumored to have 1.76 trillion parameters. I'll let that sink in.
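As a quick back-of-the-envelope check (our own sketch, assuming 16-bit weights and ignoring activations and runtime overhead):

```python
# Rough RAM estimate for holding model weights at 16-bit precision (2 bytes per parameter).
# Real usage is higher: activations, the KV cache, and runtime overhead come on top.

def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    return num_params * bytes_per_param / 1e9

print(f"7B model:    ~{weight_memory_gb(7e9):.0f} GB")      # ~14 GB
print(f"1.76T model: ~{weight_memory_gb(1.76e12):.0f} GB")  # ~3,520 GB
```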


The models that can run on the device will have around 3 billion parameters. Of course, that light weight comes at the cost of much lower performance, so more complex tasks will have to be handed off to much more powerful models.


Here's how Apple is going to tackle that problem - they'll introduce a GenAI pipeline:

  1. A Dynamic Model Selection feature will interpret the user query and decide which of the available models should process it (a rough sketch of this routing logic follows the list).
  2. The default model, suited for smaller tasks, will run on-device, e.g. OpenELM.
  3. The next ones in line will be hosted in Apple's Private Cloud Compute: more capable, but still not the biggest, e.g. Ferret-UI (we wrote about it here and here).
  4. Finally, if the query is advanced enough, it will be redirected straight to GPT-4o.
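
Apple hasn't published the routing logic itself, so here is only a minimal sketch of how such a cascade could look. The tier names, the `estimate_complexity` scorer, and the thresholds below are illustrative assumptions, not Apple's implementation.

```python
# Illustrative sketch of a cascaded GenAI pipeline (not Apple's actual code).
# A query gets a complexity score and is routed to the cheapest model tier that can handle it.

from dataclasses import dataclass


@dataclass
class ModelTier:
    name: str
    location: str          # "on-device", "private-cloud", or "third-party"
    max_complexity: float  # highest complexity score this tier should handle


# Hypothetical tiers mirroring the pipeline described above.
TIERS = [
    ModelTier("on-device ~3B model (OpenELM-class)", "on-device", 0.3),
    ModelTier("Private Cloud Compute model", "private-cloud", 0.7),
    ModelTier("GPT-4o", "third-party", 1.0),
]


def estimate_complexity(query: str) -> float:
    """Placeholder scorer; a real system would use a trained classifier."""
    score = min(len(query) / 500, 1.0)
    if any(k in query.lower() for k in ("explain", "write code", "analyze")):
        score = max(score, 0.8)
    return score


def route(query: str) -> ModelTier:
    score = estimate_complexity(query)
    for tier in TIERS:
        if score <= tier.max_complexity:
            return tier
    return TIERS[-1]


print(route("Set a timer for 10 minutes").name)             # stays on-device
print(route("Explain quantum entanglement in depth").name)  # escalates to GPT-4o
```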


A word about how the Apple team managed to produce models small enough to run on a phone yet good enough to be useful.


They used quantization to reduce the precision of the weights, cutting the model's size to roughly a quarter (e.g., from 6 GB to 1.5 GB).
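
For intuition, here's a toy sketch of symmetric 4-bit quantization (not Apple's actual scheme): dropping from 16-bit floats to 4-bit integers plus a scale factor cuts weight storage by roughly 4x, at the cost of some precision.

```python
import numpy as np

# Toy symmetric 4-bit quantization of a weight matrix (illustrative only, not Apple's scheme).
# In practice two 4-bit values are packed per byte; int8 storage is kept here for clarity.

def quantize_4bit(weights: np.ndarray):
    scale = float(np.abs(weights).max()) / 7           # int4 range is [-8, 7]
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(1024, 1024).astype(np.float16)
q, scale = quantize_4bit(w)
err = np.abs(w.astype(np.float32) - dequantize(q, scale)).mean()
print(f"mean absolute round-trip error: {err:.4f}")
```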


To maintain the efficacy of such a small model, they fine-tuned it with LoRA (Low-Rank Adaptation), training only a small set of additional weights specialized for a given task.
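
For intuition, here's a minimal LoRA-style layer in PyTorch: the original weight matrix stays frozen and only two small low-rank matrices are trained per task. This is a generic sketch of the technique, not Apple's adapter code.

```python
import torch
import torch.nn as nn

# Minimal LoRA-style adapter (generic sketch of the technique, not Apple's implementation).
# The frozen base weight W is augmented with a trainable low-rank update B @ A, so only
# r * (d_in + d_out) parameters are trained per task instead of d_in * d_out.

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():     # freeze the original weights
            p.requires_grad_(False)
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scaling

layer = LoRALinear(nn.Linear(512, 512))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable params: {trainable}")  # 8,192 vs 262,144 weights in the frozen base layer
```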


We're waiting for the rollout of the first features. The AI-augmented UI will definitely be a milestone.


Claude at the forefront

Anthropic is following the same speed trend in AI models. The newest addition to the Claude 3 family, Claude 3.5 Sonnet, is another example of the shift from an experimental stage to a user- and business-oriented product stage.


It's focused on productivity, speed, and efficiency improvements that make implementation and deployment easier.


Compared to Claude 3 Opus, the core features are:

- better performance in various benchmarks,

- double the speed,

- an 80% cost reduction for developers putting it into production.


There's also a nod to “regular chat users.” The Artifacts feature displays AI-generated content in a separate window next to the chat.
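
For developers who want to try the model, it's available through the Anthropic API. A minimal call looks roughly like this, assuming the official `anthropic` Python SDK and the `claude-3-5-sonnet-20240620` model identifier current at the time of writing:

```python
# Minimal example of calling Claude 3.5 Sonnet via the official anthropic SDK.
# Requires `pip install anthropic` and an ANTHROPIC_API_KEY environment variable.

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=512,
    messages=[
        {"role": "user", "content": "Summarize the key trade-offs of on-device LLMs."}
    ],
)
print(message.content[0].text)
```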

NVIDIA - products for the biggest players

Ok, so we're here. NVIDIA has officially crossed 3 TRILLION DOLLARS in market capitalization and has become the most valuable company in the world.

A month ago, we wrote about their GPU chip production. Here, it's important to mention another big branch - simulations.


Starting with Earth-2.


Remember Google Earth? Quite impressive, isn't it? Now, Earth-2 is bigger, better, and more ambitious.

The website states it's a "full-stack, open platform that accelerates climate and weather predictions with interactive, AI-augmented, high-resolution simulation."

It can be used as a starting point for many models or simulations developed by scientific institutes. The demo is available for free here -> https://www.nvidia.com/en-us/high-performance-computing/earth-2/demo/


Next up: intelligent robots.


Robotics has been part of NVIDIA's portfolio for some time, but lately it's been developed much further.


The Isaac platform is a complex set of applications, libraries, models, and dedicated workflows and pipelines, all built to train intelligent robots in simulated environments before transferring that "intelligence" to real robots operating in the physical world.


BYD Electronics, Siemens, Teradyne Robotics, Alphabet's Intrinsic, Hexagon, and Husqvarna all use it.


The platform is modular, so any organization can tailor components to their needs and specific workflows.


It's being integrated with Omniverse Microservices, a tool for generating gigantic amounts of synthetic data for simulations, such as sensor data.


One last thing—they've also released the Open Synthetic Data Generation Pipeline. It's a family of models that generate synthetic data to train LLMs.


That's it for this month's recap. We'll see you next week with a continuation of the Mixture of Agents approach.




For more ML and AI insights, subscribe or follow Sparkbit on LinkedIn.

If you're looking to start an AI project, you can book a free consultation with our CTO here: https://calendly.com/jedrek_sparkbit/ai-consultation



Author: Kornel Kania, AI Delivery Consultant at Sparkbit

