AI Code Generators, Misinformation in AI models, Javascript Agents library, Model editing, and current news
Image generated using OpenAI GPT-4, Lexica.art, Instagram filters, and Stability.ai Uncrop


Dear Readers,

Welcome to the 9th issue of Synthetic Thought: AI Digest. I use many sources to keep up with what's happening in the industry, and sometimes that includes information from other content creators. To be fair to them, starting with this issue, I will attribute content to its original authors whenever I use their information.

Most of the developer-related content is towards the end of this issue.


OpEd: How this week's cover image was generated. Hint: using a series of AI tools

As you might have noticed, this week's cover image was generated using several AI tools. Here's roughly what the process was.

  1. Used GPT-4 to generate a prompt for Midjourney. The prompt was "Generate a mid journey prompt to generate an image representing future, technology, intelligence, and creativity".
  2. Pasted the prompt into Midjourney, Lexica.art, and DreamStudio. I generated several images and tweaked the prompt for each tool until I got something reasonably interesting.
  3. Uploaded the generated image to Instagram, applied Instagram filters, and played with the filters until the image got a bit more interesting.
  4. I didn't find an easy way to export the image from Instagram, so I screenshotted the refined image, pasted it into Stability.ai's Uncrop tool, and resampled it to half the size and resolution of LinkedIn's cover image.

That, roughly, was the process; it took a bit of tweaking and experimentation at each step, and it is just one way to do it. If you want to do something similar, I'd recommend exploring various tools, understanding their strengths and weaknesses, and settling on a process that works for you.


This week's updates:

  • A thread on some cool image-to-video clips using Runway. The easiest approach seems to be to prompt Midjourney with text and use the generated image(s) to feed Runway to generate a video.
  • Using the Interpolating Images approach, a series of images can be generated by specifying source and target images. The generated images form a continuum as you progress from source to target. Source.
  • The Brain2Music paper identifies or reconstructs music that a human subject hears, using fMRI. Source.
  • An update to ChatGPT allows custom instructions that persist across multiple chats. Folks who have had to specify the same prompts/instructions repeatedly can now set them once and forget them.
  • Llama-2 was announced since our last edition. Its license allows commercial use, but it is not open source. A couple of other caveats to be mindful of: 1) you can't use it to train other language models; 2) organizations with >700M users have to apply for a special license. Covered in this post.
  • llama2.ai lets you play with the latest Llama model from Meta using a ChatBot-like web interface.
  • llama-playground allows you to experiment with the latest Llama update on your MacBook. This is for folks who can drop to a shell prompt and execute commands.
  • InstructPix2Pix takes an original image (left) and regenerates it based on a provided text instruction. Unfortunately, the model took a while to execute and I lost track of the exact prompt; it was something along the lines of "modify this image to represent lava spewing out of the mountain".

Original image on left regenerated using a text prompt and InstructPix2Pix model. Result on the right.
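The image-interpolation item above boils down to a simple idea: walk in small steps from a source representation to a target representation and decode an image at each step. Real tools do this in a model's latent/embedding space; the sketch below uses toy 3-dimensional vectors in plain Python, and the function names (`lerp`, `interpolation_path`) are illustrative, not from any specific tool.

```python
def lerp(a, b, t):
    """Linearly interpolate between vectors a and b at fraction t in [0, 1]."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]

def interpolation_path(source, target, steps):
    """Return `steps` vectors forming a continuum from source to target."""
    return [lerp(source, target, i / (steps - 1)) for i in range(steps)]

# Toy 3-dimensional "latents" standing in for real image embeddings.
src, tgt = [0.0, 0.0, 0.0], [1.0, 2.0, 3.0]
path = interpolation_path(src, tgt, steps=5)
# In a real system, each vector in `path` would be decoded back into an image.
```

In practice, tools often use spherical rather than linear interpolation so intermediate latents stay on the distribution the decoder expects, but the step-by-step structure is the same.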

  • This paper helps with watermarking language model-generated content so its provenance can be established later. This is one aspect of AI safety that is of great current interest. Source.
  • Meta open-sourced AudioCraft, a suite of tools for music, sound generation, and compression. I had to navigate a few links to get to their GitHub repo. Here's a link to it.
  • Music To Image generates an image representing a piece of music. I found this very interesting. The app sends audio to another model (LP-Music-Caps) that generates a text caption from the audio, then feeds that caption to Stable Diffusion XL to create an image. I decided to extend this one step further: I took the first prompt from the AudioCraft blog ("Whistling with wind blowing"), used AudioCraft to generate audio from it, and fed that audio to this model, which internally routes it through LP-Music-Caps and on to Stable Diffusion. Why? Because I can. :) So here are all the hops in short form: text -> audio -> text -> image. The image it generated was:


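To give a feel for how the watermarking item above can work, here is a minimal sketch of one well-known idea from that line of research: deterministically partition the vocabulary into a "green" subset seeded by the previous token, bias generation toward green tokens, and later detect the watermark by measuring the green fraction of a text. The toy vocabulary, function names, and hash-based partition are my illustrative assumptions, not the paper's exact scheme.

```python
import hashlib

VOCAB = ["the", "cat", "sat", "on", "a", "mat", "dog", "ran"]

def green_list(prev_token, fraction=0.5):
    # Deterministically rank the vocabulary by a hash seeded on the
    # previous token, and take the top `fraction` as the "green" subset.
    ranked = sorted(
        VOCAB,
        key=lambda w: hashlib.sha256((prev_token + "|" + w).encode()).hexdigest())
    return set(ranked[: int(len(VOCAB) * fraction)])

def watermarked_sample(start, length):
    # Toy generator that always emits a green token; a real scheme only
    # biases the model's logits toward the green list.
    tokens = [start]
    for _ in range(length):
        tokens.append(sorted(green_list(tokens[-1]))[0])
    return tokens

def green_fraction(tokens):
    # Detector: what fraction of tokens fall in their predecessor's green list?
    hits = sum(1 for prev, cur in zip(tokens, tokens[1:]) if cur in green_list(prev))
    return hits / max(len(tokens) - 1, 1)
```

Unwatermarked text lands near a green fraction of ~0.5, while watermarked output sits well above it; a statistical test on that gap is what establishes provenance.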

  • Hugging Face released Agents.js, which can be thought of as a Javascript-based orchestration tool with tight integration into the HF ecosystem of models. It can run in browser and Node environments.
  • This tweet (or should I call them Xs now?) reports some new models on the leaderboard.
  • In the last edition, we discussed some architectural patterns. Eugene Yan has an interesting and different representation of patterns for building LLM Systems & Products. Check it out.
  • Med-Flamingo is an LLM fine-tuned on medical textbooks, images, and a Biomed dataset. This tweet summarizes it well.
  • PointOdyssey provides a dataset, and a way to synthesize data, for fine-grained tracking in lengthy video clips.
  • Found this interesting comparison of two AI app generators built on LLMs. My initial attempt while writing this issue led to errors, and I had to refocus on completing this article. I will be trying out both, and if I find anything worth reporting, you will find it in the next issue.
  • Similarly, MetaGPT takes a one-line mission statement for a software product and generates code. Internally, it uses agents representing Product Managers, Architects, and the various other team members you would typically find in a software dev team. One more tool on my list to try.
  • Rift is another code generator that can now edit code on the fly based on prompts.
  • A summary of Llama2 with a nice infographic based on the original paper.
  • Given how focused the whole world has been on supply chains and provenance ever since COVID struck, LLMs should also be looked at from a supply chain perspective. Many models published on Hugging Face and elsewhere start from some popular base model, and folks build on top of it (fine-tuning, etc.). When this is done repeatedly, we can lose track of all the changes that went into a model. This article talks about how that chain of models can be poisoned with misinformation. It also refers to a tool called ROME that enables editing a model's facts by identifying the specific weights related to the knowledge in question; this can be used to correct or to misinform the model, depending on the user's intent.
  • Along the same lines, this paper discusses the ripple effects that editing a model has on other, related knowledge, and how current practices aren't sufficient for consistent model updates. The authors conclude that in-context editing is likely the best approach. Source.
  • Robotic Transformer 2 combines vision and natural language commands. The model can reason and perform actions, even on previously unseen data.
  • Found this leaderboard for Embedding Models and an associated video explaining tradeoffs when choosing one.
  • An article that discusses design trade-offs involved in building your own ChatBot.
  • A guide on fine-tuning Llama 2 using your own set of instructions.
  • Two new models from Stability.ai called Stable Beluga 1 (Llama 1-based) and Stable Beluga 2 (Llama 2-based) were announced. Both models were released under a non-commercial license.
  • A new benchmark in the field of medicine to evaluate multi-modal capabilities (text, images, genetics, etc) called MultiMedBench was introduced. Source.
  • Prompt Engineering is increasingly becoming an essential skill. PromptsRoyale allows you to generate multiple prompts and run a battle to test which ones are the best.
  • Nvidia outlines developing a Pallet Detection Model and generating synthetic data for it. Pallet detection can be important in a range of manufacturing use cases.
  • Generally, language models are prompted using text. This paper discusses prompting using speech in different languages. While many have built apps that accomplish this, the paper discusses doing it natively using model constructs. Source.
  • This article outlines training a language model from scratch using TensorFlow and TPUs.
  • AzureML now has direct integration into Hugging Face Hub models. Models can be deployed directly without leaving the Azure web interface.
  • Much recent research and development effort has gone into quantization techniques that make models lighter for inference. This paper discusses scaling laws around quantization: according to the paper, if you compare a 30B 8-bit model with a 60B 4-bit model, the 4-bit model has better (zero-shot) accuracy. Source.
  • Andrej Karpathy has a 500-line repo that can train and run inference on a model, including Llama2. Very impressive.
  • Retentive Network is being positioned as a successor to the Transformer architecture, and it looks great on paper, though the authors acknowledge that more work needs to be done. Transformers have been successful because of their reasoning abilities and emergent properties, and I hope future studies will explore those areas for this architecture.
  • llm-toys repo has quantized models that are fine-tuned for various language tasks such as paraphrasing, changing tone, summarization, etc.
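At its core, the ROME tool mentioned above performs a rank-one update on an MLP weight matrix so that a chosen "key" vector (standing for the fact being edited) maps to a new "value" (the edited fact), while directions orthogonal to the key are left untouched. Below is a minimal sketch of just that linear-algebra step; the 2x2 matrix and the key/value vectors are toy assumptions, and none of ROME's actual fact-localization machinery is shown.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def rank_one_edit(W, k, v_new):
    """Return W' = W + (v_new - W k) k^T / (k^T k).

    After the edit, W' maps key k to v_new, and any vector orthogonal
    to k is mapped exactly as before."""
    Wk = matvec(W, k)
    kk = sum(x * x for x in k)
    return [[W[i][j] + (v_new[i] - Wk[i]) * k[j] / kk for j in range(len(k))]
            for i in range(len(W))]

# Toy 2x2 "MLP weight": the key encodes a fact lookup, the value its answer.
W = [[1.0, 0.0], [0.0, 1.0]]
k = [1.0, 0.0]          # hypothetical key vector for the fact being edited
v_new = [2.0, 3.0]      # the new "fact" we want the layer to return
W_edited = rank_one_edit(W, k, v_new)
```

The ripple-effects paper in the next item is essentially asking what happens to all the *other* keys that are not quite orthogonal to k after many such edits.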
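A quick back-of-the-envelope check shows why the quantization comparison above is apples-to-apples: a 30B-parameter model at 8 bits and a 60B-parameter model at 4 bits occupy the same weight memory, so the accuracy difference is purely about how that fixed budget is spent. (The helper below ignores real-world overheads such as quantization scales and activation memory.)

```python
def model_bytes(n_params, bits_per_param):
    """Approximate weight-memory footprint in bytes."""
    return n_params * bits_per_param / 8

GB = 1e9
m30b_8bit = model_bytes(30e9, 8) / GB   # 30.0 GB
m60b_4bit = model_bytes(60e9, 4) / GB   # 30.0 GB
# Same memory budget, but the paper finds the 60B 4-bit model wins on
# zero-shot accuracy: at a fixed footprint, more parameters at lower
# precision beat fewer parameters at higher precision.
```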


Please subscribe to the Synthetic Thought: AI Digest newsletter and share this with your network. Thank you!

And I encourage you to let me know your thoughts on this edition in the comments section.

#innovation #artificialintelligence #technology #news #ai #datascience #machinelearning #deeplearning #technews #techcommunity #aiinsights #digitaltransformation #techupdates #futuretech #subscribe #stayinformed #aiknowledge #techdiscoveries #techrevolution #aicommunity #techenthusiasts #techinfluencers
