2-Min AI Newsletter #16

Featured Post

Meet DALL-E-Bot: An Artificial Intelligence (AI) Based Robotics System That Gives Web-Scale Diffusion Models An Embodiment To Realise The Scenes That They Imagine

If you follow artificial intelligence and machine learning news, it is hard to go a day without reading or hearing about a new application of diffusion models. The success of models such as DALL-E and Stable Diffusion has drawn enormous attention to these applications.

What if we could go deeper? What if we use these generated images to train another AI model to achieve a task? How about teaching a robot to do something? That’s the question DALL-E-Bot tries to answer.

DALL-E-Bot tackles the object rearrangement problem. Since diffusion models can generate realistic images, the authors wanted to examine how well they arrange objects in a scene in a natural way. For example, passing the prompt “kitchen tabletop with utensils” to DALL-E generates a realistic-looking image in which the utensils and the plate are neatly placed. Based on this observation, DALL-E-Bot uses a diffusion model to generate the goal for the robot. Once the robot sees this image, it knows what the final object arrangement should look like.
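To make the "generated image as goal" idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the authors' implementation: `generate_goal_layout` is a hypothetical stand-in that returns a hard-coded layout instead of calling a diffusion model and an object detector, and `Placement` and `plan_moves` are names invented for this example. Matching objects by label is also a simplification chosen for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """An object label with a normalized tabletop position (0..1)."""
    label: str
    x: float
    y: float


def generate_goal_layout(prompt: str) -> list[Placement]:
    """Hypothetical stand-in for 'generate an image, then detect objects in it'.

    A real system would query a text-to-image model with `prompt` and run
    a detector on the result. Here the layout is hard-coded (an assumption)
    so the sketch runs without any model.
    """
    return [
        Placement("plate", 0.50, 0.50),
        Placement("fork", 0.35, 0.50),
        Placement("knife", 0.65, 0.50),
    ]


def plan_moves(current: list[Placement], goal: list[Placement]):
    """Pair each observed object with the goal object of the same label.

    Returns (label, from_xy, to_xy) for every object that must move.
    Objects that do not appear in the goal layout are left where they are.
    """
    goal_by_label = {p.label: p for p in goal}
    moves = []
    for obj in current:
        target = goal_by_label.get(obj.label)
        if target is None:
            continue  # nothing in the goal image corresponds to this object
        if (obj.x, obj.y) != (target.x, target.y):
            moves.append((obj.label, (obj.x, obj.y), (target.x, target.y)))
    return moves


if __name__ == "__main__":
    goal = generate_goal_layout("kitchen tabletop with utensils")
    observed = [
        Placement("plate", 0.50, 0.50),  # already in place
        Placement("fork", 0.90, 0.10),   # needs to move next to the plate
        Placement("spoon", 0.10, 0.10),  # not part of the goal layout
    ]
    for label, src, dst in plan_moves(observed, goal):
        print(f"move {label} from {src} to {dst}")
```

The point of the sketch is the division of labor: the generative model supplies the target arrangement, and the robot side only has to compare that target with what it currently observes.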

Baidu AI Researchers Propose ERNIE-ViLG 2.0, A Large-Scale Chinese Text-To-Image Diffusion Model That Gradually Improves Image Quality

This AI Watches You Walk to Diagnose Parkinson’s, MS: Video plus algorithms could make gait analysis cheaper

Amazon AI Researchers Propose A New Deep Learning-Based Method For Adapting An MDE Model Trained On One Labeled Dataset To Another, Unlabeled Dataset

OpenAI just launched the DALL·E API so developers can integrate DALL·E directly into their own apps and products

A New MLOps System Called ALaaS (Active-Learning-as-a-Service) Adopts the Philosophy of Machine-Learning-as-Service and Implements a Server-Client Architecture

Google plans giant AI language model supporting world’s 1,000 most spoken languages

Harvard Researchers Propose a Self-Supervised Deep Learning Algorithm for Fast and Scalable Search of Whole-Slide Images

Top AI Tools/Platforms To Perform Machine Learning (ML) Model Monitoring

Aporia and ClearML Launch New Full-Stack MLOps Platform Partnership

MLOps platform Galileo lands $18M to launch a free service

Weights & Biases Bolsters Developer-First MLOps Platform with Major Updates, Including ML Workload Orchestration and Enterprise ML Lifecycle Management

Apache DolphinScheduler in MLOps: Create Machine Learning Workflows Quickly

High Fidelity Neural Audio Compression

Pop2Piano: Pop Audio-based Piano Cover Generation

Text-Only Training for Image Captioning using Noise-Injected CLIP

Lightweight and High-Fidelity End-to-End Text-to-Speech...

Vox-Fusion: Dense Tracking and Mapping...

DiffusionDB: A Large-scale Prompt Gallery Dataset for Text-to-Image...
