Welcome back to issue #4 of the Synthetic Thought: AI Digest newsletter. As you might have noticed, I have renamed this newsletter from "AI Matters" to "Synthetic Thought: AI Digest", mainly to avoid name collisions with other newsletters out there.
I took some time off after the last publication and did not publish an issue last week. However, I have included all the events, projects, papers, and notable topics that I would have covered last week in this week's issue.
- OpEd: An attempt at an AI-generated newsletter
- Cool Projects
- Notable Papers
I have been using various AI image generation tools (Lexica and DreamStudio) to generate the image that goes with each issue. I wanted to see if I could get an AI assist to make this newsletter generation process easier, maybe even fully generate it with an AI. Here's the process I wanted to use:
- Generate a prompt using ChatGPT 4
- Feed the generated prompt back into ChatGPT 4 to generate the actual newsletter.
- Use the same generated prompt with Bard and Bing.
- Lastly, try step #2 using "Browse with Bing" feature that allows ChatGPT 4 to surf the internet, click on links, and gather knowledge.
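The two-stage idea in steps 1 and 2 can be sketched as a tiny pipeline. Here, `complete` is a hypothetical stand-in for whatever chat-model call you use (it is not a real API); the point is just the shape of the meta-prompt flow:

```python
# Hypothetical stand-in for a chat-model call (e.g., ChatGPT 4); swap in a real client.
def complete(prompt):
    return f"[model output for: {prompt[:40]}...]"

def generate_newsletter(topic, week):
    # Stage 1: ask the model to write the prompt itself.
    meta_prompt = (f"Write a detailed prompt that would make an AI produce a weekly "
                   f"{topic} newsletter covering the week of {week}.")
    generated_prompt = complete(meta_prompt)
    # Stage 2: feed the generated prompt back in to produce the actual newsletter.
    return complete(generated_prompt)

draft = generate_newsletter("AI news", "May 17")
print(draft)
```

Step 3 is then just running `generated_prompt` through Bard and Bing instead of the same model.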
Overall, I have to say I was pretty disappointed with the results from all of them (ChatGPT 4 with and without the Browse with Bing plugin, Bing, and Bard). Here are some reasons why:
I started with the following prompt:
First of all, they all failed at the prompt-generation step and jumped straight into writing the newsletter. Their responses included items from before May 17th, contained some inaccurate information, and were too thin for a newsletter. I spent a bit more time tweaking the prompt to see if I could coax a reasonable response out of them. I probably gave up too quickly, but it feels like they're not ready to generate an entire newsletter on their own at this point. In the future, I'd like to prompt-engineer further, or fine-tune using previous editions of this newsletter as examples, and see if that helps.
- Skybox AI generates a 3D world based on a text prompt. I asked it to generate "A Hindu temple on a mountain floating above a beautiful beach in concave shape" and it created this world. Pretty amazing!
- Genmo is a text-to-3D model and image-to-3D model generator. I saw the demo but had to get on a waitlist to try it out.
- Confused about all the different LLMs out there and not sure which one to pick? I ran into this Open LLM Leaderboard. According to their page, this board "aims to track, rank and evaluate LLMs and chatbots as they are released". Hopefully, this makes the process a bit easier.
- Meta announced support for an impressive 1,100 languages in speech-to-text and text-to-speech capabilities as part of its Massively Multilingual Speech (MMS) project. In addition, they've announced language identification capabilities for 4,000 languages.
- RedPajama-3B now runs on a range of consumer devices, including iPhones.
- Every research paper on arXiv is now available as an embedding! This is significant. Check out this tweet by Will Depue announcing it. Through his tweet, I also learned about the Alexandria Index, an attempt to embed all of the Internet. Embeddings let us search, cluster, recommend, detect anomalies, classify, measure diversity, and more. Creating embeddings of large datasets accelerates AI workflows and pipelines. Thank you, Will!
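To make the "search over embeddings" idea concrete, here's a minimal sketch using toy 3-dimensional vectors standing in for real model embeddings: rank documents by cosine similarity to a query vector and keep the top matches. (Real systems use high-dimensional embeddings and approximate nearest-neighbor indexes, not a linear scan.)

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vec, doc_vecs, top_k=2):
    # Score every document against the query, highest similarity first.
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: -scores[i])[:top_k]

# Toy "embeddings"; in practice these come from an embedding model.
docs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
query = [1.0, 0.05, 0.0]
print(search(query, docs))  # → [0, 1]: the two documents nearest the query
```

The same similarity scores drive clustering and anomaly detection; only what you do with the rankings changes.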
- Voyager plays Minecraft autonomously using GPT-4 and excels at it. According to their website, "It obtains 3.3x more unique items, travels 2.3x longer distances, and unlocks key tech tree milestones up to 15.3x faster than prior SOTA".
- Want to renovate a room and need ideas? Try Reimagine XL. Of course, you are not limited to just furniture. You can upload any image and generate a number of variations. Try it out, and let me know how it goes.
- The Generative Fill feature in Adobe Photoshop is just mindblowing. Watch the intro video on their page. It's very cool.
- MindEye is another mind-blowing project (I know, I already said that about the previous one :) ). It reconstructs an image from brain activity! Look at the examples on their page. Another fascinating project.
- Another Brain+AI project that's amazing: this one helps a paralyzed man walk again. They're calling it the Brain-Spine Interface. The end of that article has videos showing a man walking with the help of this Brain-Spine Interface.
- This tweet explores how specifying the lens type in your prompt impacts the generated image in Midjourney. Another type of prompt "engineering".
- Gorilla integrates 1,600+ APIs into LLMs so that natural language queries are translated into accurate API calls. A general challenge with LLMs is that they sometimes hallucinate, which doesn't work well when calling APIs. The Gorilla team claims to reduce hallucinations substantially.
- Break-A-Scene extracts multiple items/concepts from a single image and generates variations of those concepts in other images. Most current methods extract only a single concept, and even then require multiple images, so this is significant progress.
- This professor (I'm guessing he's a professor) asked his undergrad students to use ChatGPT for an assignment, then had them grade the output by looking for hallucinated information. Apparently, all 63 essays contained hallucinations. The lesson here, in my opinion: don't trust the AI blindly. Always double-check.
- DINOv2 is a computer vision model that can perform high-quality segmentation, depth estimation, classification, and image retrieval using the Self-Supervised Learning approach. Try out the demo. More on the history and evolution of these models at InfoQ.
- If you enjoy reading about advances in hardware architectures, here's the spec for Nvidia's Grace Hopper architecture. This Superchip, as Nvidia calls it, uses an NVLink interconnect running at 900GB/s, 7x the bandwidth of x16 PCIe Gen5 lanes! If you do read it, let me know what other features stood out for you. There are too many to mention here.
- Neuralangelo reconstructs high-fidelity 3D surfaces from just video captures. Its reconstruction quality is much higher than the current state of the art.
- SoundStorm generates realistic voices and dialogues from an introductory voice prompt. I couldn't tell the difference between the original voice and the synthesized one(s). I can see some very cool applications for this, and can also see how it could be misused, for example to bypass voice-based biometric identification systems. The authors acknowledge some of these safety issues at the end of the linked page.
- There were a number of sessions at the Microsoft Build conference that were interesting. I did manage to watch Andrej Karpathy's State of GPT. Highly recommend watching it.
- This paper discusses LLMs' bilingual capabilities. They apparently translate very well. However, translation quality seems to correlate with model size: larger models perform better, according to this paper.
- DragGAN allows users to manipulate generated images. Users click and drag features of an image and adjust various parameters to regenerate images until desired properties are achieved. Their website has some cool demos. I encourage checking out those demos to get a sense of how this works.
- FrugalGPT claims to reduce costs by up to 98% by picking the right LLM for each query, optimizing for both cost and accuracy.
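One of the strategies behind this kind of cost reduction is an LLM cascade: try the cheapest model first and escalate only when a scorer deems the answer unreliable. The sketch below is my own toy illustration of that idea, not FrugalGPT's actual code; the `score` function and the stub "models" are hypothetical.

```python
# Toy LLM cascade: models ordered cheapest-first; escalate on low-confidence answers.
def cascade(query, models, score, threshold=0.8):
    for name, call, cost in models:
        answer = call(query)
        if score(query, answer) >= threshold:
            return name, answer, cost  # confident enough: stop early and save cost
    return name, answer, cost  # fall back to the last (most capable) model's answer

# Stub "models" and scorer for illustration only.
cheap  = ("small-llm", lambda q: "short answer", 0.001)
strong = ("large-llm", lambda q: "detailed answer", 0.03)
score  = lambda q, a: 0.9 if "detailed" in a else 0.5

print(cascade("What is RLHF?", [cheap, strong], score))
# → ('large-llm', 'detailed answer', 0.03): the cheap model scored 0.5, below threshold
```

In practice, queries the cheap model handles confidently never touch the expensive model, which is where the bulk of the savings comes from.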
- Current LLMs are resource-intensive because they take the full historical context into account; this is also part of why they are so good at NLP. However, a new architecture offers an approach "that combines the efficient parallelizable training of Transformers with the efficient inference of RNNs".
- Sophia helps train AI models 2x faster than the traditional Adam optimizer.
- Prompt engineering is necessary to steer Large Language Models (LLMs) and get the best out of them. Automatic Prompt Engineer (APE) demonstrates a way to use LLMs themselves to generate the required prompt. The paper shows that automatically generated instructions can match or outperform human-written prompts on many tasks.
Reminder: please subscribe to the Synthetic Thought: AI Digest newsletter (renamed from AI Matters) and share it in your network. Thank you!
Please let me know your thoughts on this edition in the comments section. Did you like it? Too much info in one article? Did I miss anything you encountered in the last week?