maadaa AI News & Open Datasets: Nvidia’s Next-Gen AI Chip, AI-Generated Shows and Movies & More
(maadaa AI News Weekly: May 28 ~ June 3)
1. Nvidia’s Next-Gen AI Chip “Rubin” Unveiled
News:
Nvidia has revealed its next-generation AI chip, called Rubin, ahead of Computex in Taiwan this week. The chip will use 8 stacks of HBM4 memory and is set to be available in 2026.
Key Points:
Why It Matters?
The increased computational power and frequent updates offered by Rubin will enable the training of larger language models and multimodal AI on more diverse datasets. Rubin’s capabilities will drive AI research by allowing more experimentation with massive training datasets across domains, advancing the state of the art.
2. AI Content Creation: Navigating Legal and Ethical Challenges
News:
A new streaming service, Showrunner, aims to let users create their own AI-generated animated shows by providing prompts to control dialogue, characters, and shots. It claims to be “the Netflix of AI” but faces skepticism over feasibility and potential legal issues regarding copyrighted characters.
Key Points:
Why It Matters?
Showrunner could enhance training datasets for generative AI models by providing many user-generated prompts and the corresponding AI-generated content across diverse genres and styles. This user-generated data could help models learn to better understand natural language prompts and generate content from them.
3. Sony Embraces AI Revolution: Lights, Camera, Artificial Intelligence
News:
Sony Pictures plans to incorporate generative AI into movie and TV production to increase efficiency and reduce costs. CEO Tony Vinciquerra stated they will “use AI to produce films for theaters and television in more efficient ways, using AI primarily.”
Key Points:
Why It Matters?
This news highlights the growing adoption of AI in the film industry, which could enhance training datasets for generative AI models. By incorporating AI into actual movie production, Sony can generate valuable data on AI’s capabilities, limitations, and integration with human creativity.
4. Additional News:
Shared Open and Commercial Datasets
Open Dataset #1: MovieGraphs Dataset
The MovieGraphs dataset aims to annotate social interactions and character relationships in movies through graphs. It provides rich semantic information on film content, facilitating in-depth analysis of complex social interactions in movies.
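As a rough illustration of what a graph annotation of this kind might look like, the sketch below stores characters as nodes and labeled, directed edges for relationships and interactions. The class name, field names, and edge labels are all hypothetical and chosen only for illustration; they are not the actual MovieGraphs schema.

```python
from dataclasses import dataclass, field


@dataclass
class ClipGraph:
    """Hypothetical per-clip graph; NOT the actual MovieGraphs format."""
    characters: set = field(default_factory=set)
    # Directed, labeled edges stored as (source, label, target, kind).
    edges: list = field(default_factory=list)

    def add_edge(self, source, label, target, kind):
        # Register both endpoints as characters, then record the edge.
        self.characters.update((source, target))
        self.edges.append((source, label, target, kind))

    def edges_of_kind(self, kind):
        # Return (source, label, target) for edges of the requested kind.
        return [(s, l, t) for s, l, t, k in self.edges if k == kind]


def build_example():
    # Made-up characters and labels, for illustration only.
    g = ClipGraph()
    g.add_edge("Alice", "parent of", "Bob", "relationship")
    g.add_edge("Alice", "reassures", "Bob", "interaction")
    g.add_edge("Bob", "thanks", "Alice", "interaction")
    return g
```

Separating long-lived relationships from momentary interactions in the edge kind makes it easy to query either one, for example `build_example().edges_of_kind("interaction")`.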
Open Dataset #2: MovieNet Dataset
The MovieNet dataset is a comprehensive dataset designed for multiple tasks in film analysis, including scene segmentation, character identification, and visual storytelling. It includes many annotated movies with detailed scene descriptions, character bounding boxes, and narrative elements.
Commercial Dataset #3: Video Object Instance Segmentation Dataset
Volume: 5K
The Video Object Instance Segmentation Dataset from maadaa.ai consists of video clips collected from the Internet, with an average length of around 10 seconds and a resolution above 1920 x 1080.
Annotation Type: Instance Segmentation
The video scenes cover both indoor and outdoor settings, including parks, shopping streets, offices, supermarkets, and shopping malls. More than twenty object categories are labeled with instance segmentation, such as person, bicycle, car, bus, cat, dog, and cup.
Application Scenarios: Video Entertainment; Visual Understanding.
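For readers unfamiliar with video instance segmentation labels, here is a minimal sketch of how such annotations could be represented and summarized. The record layout (frame index, instance ID, category, polygon) is an assumption made for illustration and is not the actual delivery format of the maadaa.ai dataset.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class InstanceAnnotation:
    """Hypothetical annotation record; field names are illustrative only."""
    frame_index: int
    instance_id: int   # assumed to stay constant for one object across frames
    category: str
    polygon: list      # [(x, y), ...] mask outline in pixel coordinates


def polygon_area(polygon):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def instances_per_category(annotations):
    """Count distinct object instances per category across all frames."""
    unique = {(a.category, a.instance_id) for a in annotations}
    return Counter(category for category, _ in unique)


def example_annotations():
    # Made-up data: one person tracked over two frames, one dog in one frame.
    square = [(0, 0), (10, 0), (10, 10), (0, 10)]
    triangle = [(0, 0), (4, 0), (0, 3)]
    return [
        InstanceAnnotation(0, 1, "person", square),
        InstanceAnnotation(1, 1, "person", square),
        InstanceAnnotation(0, 2, "dog", triangle),
    ]
```

Counting by (category, instance ID) rather than by record means an object that appears in many frames is counted once, which is the distinction between instance segmentation and plain per-frame semantic segmentation.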
Commercial Dataset #4: Multi-modal Generative AI Large Datasets — Licensed
maadaa.ai’s large dataset is developed for state-of-the-art multi-modal large language models and includes various structured datasets such as image-text pairs, video-text pairs, and e-books in Markdown. The data is licensed in accordance with international copyright authorization rules, bringing authentic and diverse material to generative AI model training.