Google I/O Updates, training a 65B model on a single GPU, HF Transformers Agent, and some interesting papers.
Image generated using Lexica.art


Welcome to the 3rd edition of the AI Matters newsletter. This edition explores progress in the AI world since last week.

But before we get too deep into the newsletter, a few things:

1. I'm floored by the response and encouragement I've received from my LinkedIn network. Thanks to everyone who has been supporting this experiment.

2. I would also like to share that many organizations, including the ACM, already use the newsletter name AI Matters. Great minds think alike. :) So, I will be renaming the newsletter next week. If you have an opinion on what it should be called, please make your voice heard.


Table of Contents

1. Google I/O updates

2. Cool Projects

3. Interesting Papers


Google I/O updates

Google I/O 2023 was all about AI. Sundar Pichai, Google's CEO, and his co-presenters mentioned AI 143 times over the two-hour keynote. I listened to the keynote, and here are the topics and announcements I found most notable:

1. Bard

Google is removing the Bard waitlist and making it available in many countries worldwide. It's also going multi-modal with images. Adobe Firefly is now directly integrated (private beta?) into Bard for image generation based on a prompt.

Tools, extensions, and partners were mentioned. I assume tools/extensions will be similar to OpenAI's plugin concept. Bard can now annotate source code with citations.

With these announcements, Bard seems to achieve general feature parity for important aspects with ChatGPT. However, based on my interactions with both, Bard still has room for improvement in the quality of responses.

2. PaLM 2

PaLM 2, the next version of the PaLM foundation model, was announced during the keynote. It will come in different sizes (number of parameters), is supposedly better at logic, reasoning, and multi-lingual text, and can apparently generate, debug, and explain code in 20 programming languages. It will power about 25 products across Google's lineup.

3. Codey

A foundation model trained on Google's documentation and code, fine-tuned on Google Cloud user behaviors and patterns. Based on the documentation, it's integrated into the developer environment and can generate and complete code, provide chat assistance on various GCP topics and best practices, and search across documents.

One aspect that caught my attention is that we can train Codey with custom code, and Google will keep it private. I can't wait to try this out.

4. Imagen

Imagen, a text-to-image diffusion model similar to Lexica (my favorite), DreamStudio by stability.ai, and MidJourney, was announced.

5. Chirp

Chirp, a speech-to-text model, was announced.

6. Fine-tuned PaLM versions

  • Sec-PaLM: A version of PaLM 2 targeted at security use cases.
  • Med-PaLM 2: A version of PaLM 2 fine-tuned on medical knowledge that is apparently 9x more accurate than the base model at medical reasoning. It has also reached expert-level performance on medical licensing exams.

7. MusicLM

MusicLM is a text-to-music model.

8. Gemini

Next-generation foundation model currently in training. A key feature of this model is its multi-modal capabilities.

9. Deep integration of Generative AI into products

Duet AI for Google Workspace, Duet AI for Google Cloud, and Duet AI for AppSheet provide deep AI integration across Google's broader product ecosystem, such as Google Slides, Docs, Vertex AI, etc. "Help me write" is a feature that lets the user type a prompt and generate text within Gmail. Another example I appreciated was the ability to generate speaker notes for slides.

The Google Labs team's demo of how Generative AI is being integrated into Google Search looked like a hybrid of the current search experience and a chatbot.

10. Google Cloud

Vertex AI lets you fine-tune foundation models such as PaLM 2 on dedicated clusters, thereby guaranteeing the privacy of your data. Using Generative AI Studio, it looks like models can be fine-tuned on a user's private data through a no-code interface and then deployed straight from the UI. I was impressed by this.
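
As a concrete illustration of the programmatic path (the no-code studio does the equivalent through the UI), here is a minimal, hedged sketch of calling a PaLM 2 text model through the Vertex AI Python SDK. The project ID is a placeholder, and the module path and model name ("text-bison@001") are my assumptions based on Google's documentation; they may differ depending on the SDK version (earlier releases exposed these classes under vertexai.preview.language_models).

```python
# Hedged sketch: calling a PaLM 2 text model via the Vertex AI Python SDK.
# Assumes the google-cloud-aiplatform package; module path and model name are
# assumptions and may vary by SDK version (e.g. vertexai.preview.language_models).
import vertexai
from vertexai.language_models import TextGenerationModel

vertexai.init(project="my-gcp-project", location="us-central1")  # placeholder project

model = TextGenerationModel.from_pretrained("text-bison@001")  # PaLM 2 for text
response = model.predict(
    "Summarize the key AI announcements from Google I/O 2023 in three bullets.",
    temperature=0.2,
    max_output_tokens=256,
)
print(response.text)
```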

The A3 GPU supercomputer was announced, and its specs are mind-blowing. This blog page claims 26 exaFLOPS of performance. A3 VMs pack 8 H100 GPUs with 3.6 TB/s of bisectional bandwidth between the GPUs. It looks like these VMs are meant for training, while the G2 VMs are meant for inference.

11. Project Tailwind

Project Tailwind allows a Google user to fine-tune a model based on private documents stored on Google Drive!

12. Identifying synthetic content

Google is working towards identifying synthetic content. For example, Google Images will include metadata that indicates when an image first appeared. Google is also building watermarking into AI-generated images.

13. Prompt-to-Wallpapers on Android

Android is getting Generative AI wallpapers that you can create from a text prompt using text-to-image diffusion models.

14. Studio Bot

Studio Bot, an AI coding assistant for Android developers built into Android Studio, was announced.

15. WebGPU

The latest Chrome version has WebGPU built in. WebGPU accelerates in-browser workloads and is claimed to speed up AI libraries like TensorFlow.js by as much as 100x. This could open up a whole new set of apps that run inference directly in the browser. We will have to wait and see how this plays out.


Cool Projects

  1. This is the biggest thing that happened since last week: we can now fine-tune a 65B model on a single GPU. It requires 48GB of GPU memory, which means a consumer-grade GPU like a 4090 isn't going to work; we will still need an enterprise-grade GPU or a Quadro. But we would only need one of those, and we can rent one for about $1 an hour from LambdaLabs, for example. So I'm guessing that for less than $100, we should be able to fine-tune a model, maybe even less. I haven't tried this yet. Tim says we can fine-tune a 7B model in about 3 hours. This is tremendous progress!! There was a similar announcement from another user on Twitter. (See the fine-tuning sketch after this list.)
  2. Larry Laake pointed out Anthropic's Constitutional AI approach, which differs from the approach OpenAI takes (RLHF). Philosophically, this looks like a more automated approach. I feel that it may be a while before this vision is fully realized.
  3. Hugging Face announced "Transformers Agent", a natural-language wrapper around a curated set of Hugging Face models. It's quite simple to use if you are a Python programmer (see the usage sketch after this list).
  4. Scale AI, the company chosen by the White House a few days ago to evaluate the big players in the AI industry, announced Scale Donovan and Scale EGP. Donovan is targeted at the Defense industry. They have an impressive demo. EGP is targeted at Enterprises.
  5. Wendy's and Google are using AI to take drive-through orders. Last month, I thought this was a use case that could be implemented with the current tech stack (LLMs, text-to-speech, and speech-to-text models). And now, it's a reality.
  6. China has an AI news anchor.
  7. A very interesting thread on how much time and money it costs to train an LLM.
  8. An LLM for healthcare trained on the NHS-UK Conditions dataset and the UK's National Institute for Health and Care Excellence (NICE) guidance.
  9. Eva is a database for AI apps.
  10. Microsoft Guidance, a GitHub project to simplify prompt-based programming.
  11. Salesforce announced TableauGPT. This allows users to interact with their data!
  12. Interested in learning more about LLMs? Here's a course on it: https://fullstackdeeplearning.com/llm-bootcamp/spring-2023/.
  13. Here's a how-to on fine-tuning the RedPajama model.
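
To make item 1 above concrete, here is a minimal sketch of what 4-bit, QLoRA-style fine-tuning looks like with the Hugging Face transformers + peft + bitsandbytes stack. This is my rough approximation, not the exact recipe from the announcement; the model ID, adapter hyperparameters, and target modules are illustrative assumptions, and the helpers shown require fairly recent library versions.

```python
# Hedged sketch: QLoRA-style fine-tuning of a large causal LM on one GPU.
# The 65B base weights are loaded in 4-bit (NF4) so they fit in ~48 GB, and only
# small LoRA adapters are trained. Model ID and hyperparameters are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "huggyllama/llama-65b"  # assumption: any causal LM checkpoint works

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Freeze the quantized base model and attach trainable low-rank adapters.
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a tiny fraction of parameters train

# From here, fine-tune with the usual transformers Trainer on your instruction
# dataset; the small adapter weights are what you save and share.
```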
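
And here is roughly what using Transformers Agent (item 3 above) looks like, based on Hugging Face's announcement; the StarCoder inference endpoint and the image file are illustrative, not something I've run end to end.

```python
# Hedged sketch of Transformers Agent: a natural-language request is translated
# into Python that calls curated Hugging Face tools, then executed.
from PIL import Image
from transformers import HfAgent

agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder")

image = Image.open("photo.jpg")  # illustrative input image
caption = agent.run("Caption the following image.", image=image)
audio = agent.run("Read the following text out loud.", text=caption)
```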


Interesting Papers

  • ImageBind: Meta has a multi-modal (multi-sensor) model that weaves images/video, text, audio, IMU readings (a sensor that detects motion and orientation), depth (3D maps), and thermal (heat-map) data into a single embedding space. This is quite impressive, as AI can now correlate audio, video, and other modalities. Their blog post says, and I quote:

"With the capability to use several modalities for input queries and retrieve outputs across other modalities, ImageBind shows new possibilities for creators. Imagine that someone could take a video recording of an ocean sunset and instantly add the perfect audio clip to enhance it, while an image of a brindle Shih Tzu could yield essays or depth models of similar dogs. Or when a model like Make-A-Video produces a video of a carnival, ImageBind can suggest background noise to accompany it, creating an immersive experience"


Reminder: please subscribe to the AI Matters newsletter (soon to be renamed to something else) and share it with your network. Thank you!


Please let me know your thoughts on this edition in the comments section. Did you like it? Too much info in one article? Did I miss anything you encountered in the last week?

#innovation #artificialintelligence #technology #news


A follow-up comment from the author, Praveen Cherukuri: By the way, one notable development from yesterday was Sam Altman's Senate testimony. I haven't been able to listen to the full testimony, but if you are interested, you can find it at https://www.youtube.com/watch?v=P_ACcQxJIsg.
