AI-powered news roundup: Edition 13
Siili Solutions
We help you find what’s essential. Then we build it. MAKE IT REAL.
We’ve surpassed 5,000 subscribers! We appreciate your ongoing support!
Our bi-weekly AI news roundup is designed to keep you informed on the latest (and most important) developments, all in under 5 minutes.
In this edition:
1. OpenAI introduces Canvas, a new interface for writing and coding with ChatGPT
Source: OpenAI
OpenAI is rolling out a new interface called Canvas for ChatGPT Plus and Team users, offering a collaborative workspace for writing and coding projects that goes beyond simple chat interactions. Canvas provides tools for real-time collaboration with ChatGPT, allowing users to highlight sections, receive inline feedback, and directly edit text or code. With shortcuts for adjusting writing length, changing reading levels, or debugging code, it gives users finer control over their projects and a more productive workflow.
While Canvas bears some similarity to Claude's Artifacts, it is better suited to those who need immediate, hands-on collaboration with an AI in writing or coding environments; it thrives in scenarios where constant interaction and iteration are needed. Claude's Artifacts, on the other hand, is geared more toward long-term projects where preserving and referring back to stages of development is crucial, without the need for real-time interaction.
Built on GPT-4o, Canvas opens automatically when ChatGPT detects it could help, making it easier to track changes in code or refine writing. Coding-specific tools include reviewing code, adding logs, fixing bugs, and translating code between languages such as Python, Java, and C++. For writers, it offers suggestions, length adjustments, and polishing for grammar and clarity.
Currently available for Plus and Team users, Canvas will be expanded to Enterprise, Edu, and Free users soon. This innovative tool reflects OpenAI's efforts to enhance collaboration between humans and AI, providing a more interactive and tailored experience for creators.
2. OpenAI to launch AI agents in 2025: What to expect
Source: Tom's Guide
At OpenAI’s recent DevDay event, CEO Sam Altman announced that AI agents will launch in 2025. These agents will be autonomous AI systems capable of performing tasks without direct human input. During the event, an AI assistant demonstrated this potential by successfully placing a phone order on its own. Altman emphasized that this milestone brings us closer to Artificial General Intelligence (AGI), where AI systems can execute tasks like self-publishing a book or managing projects.
The development of these agents is powered by OpenAI’s o1 models, which enable AI to reason and plan before taking action. While these advances are significant, the key challenge remains ensuring the agents stay aligned with human values and safety, preventing unintended consequences.
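OpenAI has not shipped an agent API yet, but the o1 reasoning models behind this work are already callable through the standard chat completions endpoint. A minimal sketch in Python, assuming your account has access to the o1-preview model (the phone-ordering demo itself used capabilities that are not publicly exposed):

    # pip install openai
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # o1 models plan multi-step work internally before answering; this is the
    # reasoning capability the upcoming agents are said to build on.
    response = client.chat.completions.create(
        model="o1-preview",  # assumption: your account has o1 access
        messages=[{
            "role": "user",
            "content": "Plan the steps needed to place a pizza order by phone, "
                       "then write the exact call script.",
        }],
    )
    print(response.choices[0].message.content)

Note that the o1 preview models currently accept only user messages (no system prompt), which is why the request above contains a single message.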
AI agents represent the next evolution in AI, letting users delegate complex tasks and compressing work that would typically take weeks or months into hours. OpenAI plans to release these agents gradually, starting with early integration into ChatGPT and eventually scaling to handle multiple roles for individuals and businesses alike. Safety and alignment with human goals will remain a primary focus before wide deployment.
3. Nvidia releases open-source AI model to rival GPT-4
Source: VentureBeat
In a groundbreaking move, Nvidia has unveiled its NVLM 1.0 family of open-source multimodal AI models, headlined by NVLM-D-72B. With 72 billion parameters, the model is designed to compete with industry giants like OpenAI’s GPT-4 and Google’s models. NVLM-D-72B excels in both visual and text tasks, such as interpreting memes, solving math problems, and analyzing complex images, while improving text-only performance by an average of 4.3 points on key benchmarks.
What sets Nvidia’s model apart is its open-source availability, breaking the trend of proprietary systems. This decision gives researchers and developers unprecedented access to cutting-edge technology, sparking excitement across the AI community. By providing the model weights and promising the release of training code, Nvidia is fostering an era of increased collaboration and innovation.
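Because the weights are public, anyone with sufficient GPU capacity can load them. A minimal sketch using Hugging Face Transformers, assuming the checkpoint is published as nvidia/NVLM-D-72B (a 72-billion-parameter model needs several high-memory GPUs):

    # pip install transformers torch accelerate
    import torch
    from transformers import AutoModel, AutoTokenizer

    MODEL_ID = "nvidia/NVLM-D-72B"  # assumption: Nvidia's checkpoint name on Hugging Face

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModel.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # halves memory versus float32
        low_cpu_mem_usage=True,
        device_map="auto",           # shard the weights across available GPUs
        trust_remote_code=True,      # the repo ships its own multimodal model class
    ).eval()
    # From here, text and image inference follow the usage examples on the model card.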
NVLM 1.0’s open-source nature has the potential to reshape the AI landscape, making powerful tools available to smaller organizations and independent researchers. However, it also raises ethical concerns about the misuse of advanced AI. Nvidia’s release may prompt other tech leaders to follow suit, accelerating AI advancements while pushing for responsible use. The full impact of this move remains to be seen, but it’s clear that Nvidia is challenging the status quo in AI development.
4. Tesla unveils Robotaxi and Robovan at "We, Robot" event
Source: TechCrunch
Tesla's highly anticipated "We, Robot" event has concluded, showcasing two major announcements that could reshape autonomous transport. First, the Cybercab, a two-seater robotaxi priced under $30,000, was revealed as Tesla's next step in making autonomous vehicles more accessible. In a surprise twist, CEO Elon Musk also introduced the Robovan, a larger vehicle designed to transport up to 20 people.
The event, streamed live on platforms like X and YouTube, marked the latest in Tesla’s push towards full autonomy, a goal Musk has championed for years. Despite delays, including the cancellation of a promised $25,000 EV and setbacks in Tesla’s Full Self-Driving (FSD) system, Musk remains committed to bringing autonomous robotaxis to the road. The Cybercab’s compact, futuristic design—similar to the Cybertruck—features no steering wheel or pedals, aiming to function as a fully driverless vehicle.
Industry veteran Anthony Levandowski, co-founder of Google’s self-driving car program, expressed support for Tesla’s vision in a post-event interview, signaling broad interest in Tesla’s autonomous ambitions. Though Tesla's FSD is still not fully autonomous, Musk’s long-term goal of launching a fleet of robotaxis continues to push the boundaries of electric and autonomous vehicle innovation.
5. Google upgrades Google Lens with AI-powered video search and shopping features
Source: Google Blog
Google has introduced a major upgrade to its Lens app, adding AI-powered video search capabilities. English-speaking Android and iOS users can now record videos through Lens and ask questions about the objects they capture. Powered by Google’s customized Gemini AI model, the feature analyzes the footage and provides context-specific answers via Google’s AI Overviews.
For example, users can record a video of fish swimming and ask why they’re moving in circles. Lens will then offer an explanation and relevant resources. This new functionality, available through the Google Search Labs program, marks a leap in Google’s AI integration, allowing deeper interaction with real-world surroundings.
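Lens itself is a consumer app with no developer surface, but a similar video question-answering flow is available to developers through the Gemini API's file uploads. A minimal sketch in Python, assuming a Google AI Studio API key and a local clip named fish.mp4 (both placeholders):

    # pip install google-generativeai
    import time
    import google.generativeai as genai

    genai.configure(api_key="YOUR_API_KEY")  # placeholder key

    # Upload the clip; video files are processed server-side before they can be used.
    video = genai.upload_file(path="fish.mp4")
    while video.state.name == "PROCESSING":
        time.sleep(5)
        video = genai.get_file(video.name)

    model = genai.GenerativeModel("gemini-1.5-flash")
    response = model.generate_content([video, "Why are these fish swimming in circles?"])
    print(response.text)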
In addition to video analysis, Lens now supports multi-modal search, combining images and text, and has enhanced shopping features. When Lens identifies a product, users can instantly view details like price, reviews, and availability, with shopping ads included in the results.
Though competitors like Meta and OpenAI are working on similar real-time AI video features, Google Lens has taken the lead by launching this powerful new tool. However, the feature is asynchronous rather than truly live (Lens analyzes the recorded clip after capture) and is currently limited to early-access users, with its full potential yet to be proven.
6. Adobe unveils AI-powered video generation for Firefly
Source: TechCrunch
Adobe has introduced video generation capabilities to its Firefly AI platform, enabling users to create AI-generated videos directly from text or images. Accessible via Adobe’s website, the beta version allows users to generate five-second videos using Firefly's text-to-video and image-to-video models. In the Premiere Pro beta app, the new Generative Extend feature lets users extend video clips by up to two seconds, smoothly continuing camera movements and audio.
The Generative Extend tool has impressed early testers, proving more practical than Firefly's basic video generation models, which still lag behind competitors like Runway's Gen-3 Alpha. Adobe’s AI strategy focuses on complementing creative workflows rather than replacing them. Firefly’s tools aim to speed up editing, addressing problems like incomplete footage, rather than generating new content from scratch.
Adobe’s balanced approach reflects the company’s efforts to cater to its meticulous creative audience, easing concerns that AI could replace traditional methods. Adobe has also committed to fair practices, compensating artists whose work is used as training data and embedding provenance metadata in AI-generated content.
As Adobe continues refining its AI offerings, it remains focused on empowering creators to leverage AI for more efficient, creative work, rather than viewing it as a threat.