Discovery of a new algorithm and an antibiotic using AI, Apple's new AR headset, and recent papers published on AI
Image generated using ChatGPT 4 generated text prompt fed into DreamStudio

Discovery of a new algorithm and an antibiotic using AI, Apple's new AR headset, and recent papers published on AI

Hello, Dear Readers!

Welcome to the 5th issue of Synthetic Thought: AI Digest. I hope this finds you well, safe, and eager to dive into the latest insights from the remarkable world of artificial intelligence. So grab a cup of your favorite brew and settle for an enlightening read. As always, I aim to make you think, question, and hopefully learn something new.


Table of Contents

  • OpEd: AI as an autonomous problem solver
  • Notable topics this week
  • Papers


OpEd: AI as an autonomous problem solver

This week I ran into a couple of AI solutions with a common theme: letting AI explore a space to find the most optimal solution on its own.

  1. DeepMind discovers a new sorting algorithm all by itself.
  2. A new antibiotic was discovered to fight a drug-resistant superbug.

This is almost like brute-forcing all possible combinations, eliminating options based on some criteria (or loss function), and arriving at a plausible solution. While most ML approaches work like this, we see this being applied to real-world problems beyond algos.


Notable topics this week

  • NVIDIA announced their DGX GH200, a 100TB GPU Memory system. Yes, 100TB of GPU memory! Model optimizations, programming language optimizations, and new research every day keep reducing the need for Large Models. However, hardware keeps accommodating larger and larger models (multi-trillion parameter models?) with every iteration. Much of the innovation came from the Open Source realm after ChatGPT's introduction. All these innovations will only accelerate the space further. AGI is probably not too far off.
  • Apple released its $3,500 AR headset called Vision Pro. I found this YouTube video to be an excellent summary of its capabilities.
  • It looks like Apple is going to be leveraging an LLM of some kind to enhance the autocorrect feature on its keyboard.
  • Those of you embarking on embedding a language model into your product(s) need to pay attention to "Prompt Injection Attacks". I found this article on Prompt Injection Attacks to be very helpful. Those of you, familiar with SQL Injection will be able to relate to this. However, Prompt Injection Attacks aren't as easy to thwart as SQL Injection attacks, as you would have to first reason about the input, potentially doubling your LLM usage costs.
  • LLaMA 7B model running on a MacBook using 0% CPU and 38 GPU cores at 40 tokens/sec—quite an accomplishment. Inference on consumer devices will soon become a norm. This page outlines 2-6 bit quantization that enables running language models on consumer hardware.
  • LTM-1 is an LLM with a monstrous 5 million prompt tokens for context.
  • Recognize Anything Model recognizes multiple objects in a picture, highlights those objects, and provides a list of tags as output in English and Chinese. It can optionally describe the scene too.
  • Salesforce introduced CodeTF, a Python library that allows for pluggable Code LLMs and offers:

  1. Fast inference using pre-quantized models
  2. Fine-tuning
  3. Support for nl2code, code summarization, code completion, code translation, code refinement, clone detection, defect prediction
  4. Pre-processed benchmark datasets
  5. Easy model evaluation on benchmarks
  6. Support for Foundation Code Models (CodeBERT, CodeT5, CodeGen, CodeT5+, Incoder, StarCoder, etc.)
  7. Fine-tuned checkpoints
  8. Utilities to manipulate source code, such AST Parsers etc


Papers

  • Explainpaper summarizes research papers. I haven't checked it out yet. But it's on my list as this will allow me to go through more papers in the same amount of time, so I can better keep up with what's going on in the AI space. If you try it out, love to hear from you on how you like it.
  • Video-LLaMA enables LLMs with audio and video understanding. So now you can use prompts about the audio and video content.
  • A diffusion model that fills in a person's image into a scene, and if you don't provide a person's image to be filled in, the AI can get creative and insert an arbitrary person into the image all on its own. Quite fascinating. Check out their project homepage to get an idea of all the possibilities this research can be applied.
  • MusicGen by Facebook generates music from a text prompt. You can try it on HuggingFace. A simple "Electronic music for running" prompt took about 80 secs to generate a clip. And the music did match the prompt very well.
  • OmniMotion is an excellent object-tracking mechanism in Videos, even when an object is occluded. If you have used the likes of OpenCV OpticalFlow, sometimes you lose track of the object under various conditions. This seems to be better than the current state of the art. Worth checking out.
  • ChatDB enables LLMs to interact with a general-purpose SQL database as memory, to respond to a natural language prompt. Their illustration shows "Chain-of-Memory", a series of SQL operations performed by LLM to satisfy a user request.


Subscribe to the?Synthetic Thought: AI Digest?newsletter and share it with your network. Thank you!

Please let me know your thoughts on this edition in the comments section. Did you like it? Too much info in one article? Did I miss anything you encountered in the last week?

#innovation??#artificialintelligence??#technology??#news

要查看或添加评论,请登录

Praveen Cherukuri的更多文章

社区洞察

其他会员也浏览了