Microsoft Phi-4, ChatGPT Vision, Grok Free: What’s Changing?


Welcome to this edition of the DSA Newsletter! In this edition, we’ve got game-changing updates, cutting-edge tools, and exclusive training resources. Ready to master AI and take your skills to the next level? Dive in to explore the best of AI, and don’t forget to check out our AI Training Course below!

What’s new in AI:

  • ChatGPT Advanced Voice Mode Gains Vision Capabilities: Now capable of analyzing visuals alongside voice interactions.
  • Grok on X: Faster, Smarter, and Now for All
  • Anthropic’s Claude 3.5 Haiku Now Generally Available
  • Microsoft Releases Small, Powerful Phi-4 Model
  • OpenAI’s Canvas Goes Public with New Features
  • AI Training: Turn Your Screenshots into Working Prototypes
  • 7 New Tools and 5 AI Jobs You Should Know

Master AI with our exclusive training sessions! Learn to harness cutting-edge tools, build prototypes, and become a leader in the AI space.

Explore AI Training Now. Have questions or ideas? Reply to this email and let’s chat!


ChatGPT Advanced Voice Mode gains vision capabilities

The Brief: OpenAI just launched a major upgrade to ChatGPT’s Advanced Voice Mode on Day 6 of its live stream event, enabling the AI to analyze and respond to live video input and screen sharing during conversations.

The details:

  • Users can show live videos or share their screens while using Advanced Voice Mode, and ChatGPT can understand and discuss the visual context in real time.
  • The feature works through a new video icon in the mobile app, with screen sharing available through a separate menu option.
  • The updates are available to ChatGPT Plus, Pro, and Team subscribers, with Enterprise and Edu users gaining access in January.
  • OpenAI also introduced a festive new voice option, allowing users to chat with Santa as a limited-time seasonal addition through early January.

Why it matters: Seven months after its initial demo, OpenAI is finally delivering on the promise of visual understanding in conversational AI, moving ChatGPT beyond text and voice into true multimodal interaction. It’s been a big week for vision, with Gemini and ChatGPT Advanced Voice gaining some extremely powerful new capabilities.


Grok on X: Faster, Smarter, and Now for All!


The Brief: Grok AI, now sharper than ever, rolls out to all X users for free. Packed with groundbreaking features like real-time web search, citations, and the new Aurora image generator, Grok elevates the AI experience for both casual users and enterprises.

The details:

  • Faster & More Accurate: Grok’s updated model is 3x faster, with enhanced instruction-following and multilingual support.
  • New Features:
      • Web Search & Citations: Provides real-time, accurate answers with sources.
      • Aurora Image Generator: Delivers stunning photorealistic visuals and memes.
      • “Draw Me” Feature: Lets users reimagine themselves using their X profile data.
  • Enterprise Access: New API models (grok-2-1212, grok-2-vision-1212) are now available, with lower pricing and free credits for developers.
  • Interactive Insights: The Grok button offers context and analysis directly on X posts, enriching discussions.

Why It Matters: Grok’s upgrades democratize access to high-functioning AI, blending creativity, knowledge, and utility. By integrating seamlessly into X, Grok fosters smarter interactions, creative explorations, and deeper engagement.

Try Grok on X for free and explore its powerful features like unfiltered reasoning, coding assistance, and stunning image generation. Premium+ users unlock even more capabilities! Learn More & Sign Up


Anthropic’s Claude 3.5 Haiku is now generally available

The Brief: Anthropic quietly rolled out its fastest AI model, Claude 3.5 Haiku, to all Claude users on web and mobile platforms, expanding from its previous API-only availability, though no official announcement has been made.

The details:

  • Haiku 3.5 was released in November along with Claude’s computer use feature, and it beats the previous top model, 3 Opus, on key benchmarks.
  • The model excels at coding tasks and data processing, offering impressive speed and performance with high accuracy.
  • Haiku features a 200K context window, which is larger than competing models, while also integrating with Artifacts for a real-time content workspace.
  • The initial release drew criticism for Haiku’s API pricing, which was increased 4x over 3 Haiku to $1 per million input tokens and $5 per million output tokens.
  • Free users can now access Haiku with daily message limits, while Pro subscribers ($20/month) get expanded usage and priority access.
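To make the API pricing above concrete, here is a minimal Python sketch of the arithmetic. It uses only the two prices quoted in this section ($1 per million input tokens, $5 per million output tokens). The function name and the sample workload are hypothetical, and actual billing may differ from this simple estimate.

```python
# Prices quoted above for Claude 3.5 Haiku, in USD per million tokens.
INPUT_USD_PER_MTOK = 1.00
OUTPUT_USD_PER_MTOK = 5.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate API cost in USD from token counts (illustrative helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical workload: 2M input tokens and 500k output tokens.
cost = estimate_cost_usd(2_000_000, 500_000)
print(f"${cost:.2f}")  # prints "$4.50"

# The "4x" increase quoted above implies the same workload would have
# cost a quarter as much at the earlier 3 Haiku prices.
print(f"${cost / 4:.2f}")
```

Output tokens cost five times as much as input tokens, so response length tends to drive the bill more than prompt length.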

Why it matters: It’s been a relatively quiet holiday season of releases for Anthropic compared to rivals. Although Haiku is impressive compared to previous generations, it doesn’t feel like a huge needle mover during a big week of AI releases, and it might take a launch of a top-tier 3.5 Opus to steal the spotlight from Google and OpenAI.


Microsoft releases small, powerful Phi-4

The Brief: Microsoft just released Phi-4, a 14B parameter small language model that outperforms massive competitors like GPT-4o and Gemini Pro 1.5 in areas like mathematical reasoning despite a drastic size difference.

The details:

  • Phi-4 outperforms models like Gemini Pro 1.5 on several math and complex reasoning benchmarks despite being a fraction of the size.
  • Phi-4 even surpasses its teacher model, GPT-4o, on graduate-level STEM Q&A and math competition problems.
  • Microsoft trained Phi-4 primarily on synthetic data, using AI to generate and validate approximately 400B tokens of high-quality training material.
  • The model also features an upgraded mechanism that can process longer inputs of up to 4,000 tokens, double the capacity of Phi-3.
  • Phi-4 is available in a limited research preview on Azure AI Foundry, and a wider release is planned for Hugging Face.

Why it matters: Microsoft’s Phi models continue to challenge the ‘bigger is better’ trend in AI, showing that smaller models can match or exceed the capabilities of larger ones, particularly in specialized areas. The AI future may not be about raw size but smarter architecture and training approaches that do more with less.


OpenAI’s Canvas goes public with new features

The Brief: OpenAI just made Canvas available to all users, with the collaborative split-screen writing and coding interface gaining new features like Python execution and usability inside custom GPTs.

The details:

  • Canvas now integrates natively with GPT-4o, allowing users to trigger the interface through prompts rather than manual model selection.
  • The tool features a split-screen layout with the chat on one side, a live editing workspace on the other, and inline feedback and revision tools.
  • New Python integration enables direct code execution within the interface, supporting real-time debugging and output visualization.
  • Custom GPTs can also now leverage Canvas capabilities by default, with options to enable the feature for existing custom assistants.
  • Other key features include enhanced editing tools for writing (reading level, length adjustments) and advanced coding tools (code reviews, debugging).
  • OpenAI previously introduced Canvas in October as an early beta for Plus and Team users, and all accounts now gain access with the full rollout.

Why it matters: While this Canvas release may not be as hyped as the Sora launch, it represents a powerful shift in how users interact with ChatGPT, bringing more nuanced collaboration into conversations. Canvas’ Custom GPT integration is also a welcome sight and could breathe life into the somewhat forgotten aspect of the platform.


Apple Intelligence gets a big upgrade with iOS 18.2

The Brief: Apple just rolled out its biggest Apple Intelligence update yet, bringing AI-powered emoji creation, image generation capabilities, Visual Intelligence with camera control, and more, alongside the broader integration of ChatGPT.

The details:

  • Genmoji is now live and allows users to create custom AI-generated emojis from text descriptions or photos with options to add accessories and themes.
  • Image Playground adds AI image creation across the system, with dedicated app access and integration into apps like Messages and Keynote.
  • Visual Intelligence debuts as an iPhone 16-exclusive feature, using Camera Control to analyze surroundings and provide info through Google or ChatGPT.
  • Apple Intelligence also expands to new regions with localized English support, including the UK, Australia, Canada, and others.
  • As revealed in the Day 5 livestream, Siri gains ChatGPT integration, letting users tap OpenAI’s capabilities directly without switching apps.

Why it matters: Apple Intelligence has been underwhelming so far, to say the least, but the ChatGPT integration brings the system closer to what users likely envisioned when upgrading their iPhones for the new AI tools. However, we’ll have to wait until 2025 for agentic Siri capabilities that can handle more complex actions.


Cognition launches Devin AI developer assistant

The Brief: Cognition Labs has officially launched Devin, its AI developer assistant, targeting engineering teams and offering capabilities ranging from bug fixes to automated PR creation.

The details:

  • Devin integrates directly with development workflows through Slack, GitHub, and IDE extensions (beta), starting at $500/month for unlimited team access.
  • Teams can assign work to Devin through simple Slack tags, with the AI handling testing and providing status updates upon completion.
  • The AI assistant can handle tasks like frontend bug fixes, backlog PR creation, and codebase refactoring, allowing engineers to focus on higher-priority work.
  • Devin’s capabilities were demoed through open-source contributions, including bug fixes for Anthropic’s MCP and feature additions to popular libraries.
  • Devin previously went viral in March after autonomously opening a support ticket and adjusting its code based on the information provided.

Why it matters: Devin’s early demos felt like the start of a new paradigm, but competition in AI coding has intensified since. It’s clear that the future of development will largely be a collaborative effort between humans and AI, and $500/month might be a small price to pay for enterprises offloading significant work. Try it here.


Pika drops major 2.0 video upgrade

The Brief: Pika Labs just released version 2.0 of its AI video generator, introducing a new ‘Ingredients’ tool that lets users incorporate their own images into AI-generated videos, alongside improved motion, prompting, and animation features.

The details:

  • A new ‘Scene Ingredients’ system allows users to upload and mix characters, objects, and backgrounds that the AI automatically recognizes and animates.
  • Pika’s updated model shows impressive realism, smooth movement, and prompt/image adherence, giving users more control over outputs.
  • The new video generator also features a significant update to text alignment, showcasing the ability to craft realistic branded scenes and advertising content.
  • Pika has already attracted over 11M users and secured $80M in funding, and the new version follows its viral ‘effects’ launch in October.

Why it matters: Pika’s new upgrades are wild, continuing to move video outputs out of the ‘slot machine’ luck phase into a more customizable, personalized experience. While we patiently waited for Sora, the AI video scene leveled up in a major way, with Pika, Luma, Runway, Kling, Hailuo, and others dulling the impact of OpenAI’s latest release.


Anthropic analyzes real-world AI use with Clio

The Brief: Anthropic introduced Clio, a new system that reveals patterns in how people actually use AI assistants worldwide, providing detailed insights into real-world AI adoption while maintaining user privacy.

The details:

  • Clio analyzes millions of conversations by summarizing and clustering them while removing identifying information in a secure environment.
  • The system then organizes these clusters into hierarchies, allowing researchers to explore patterns in usage without needing access to sensitive data.
  • Analysis of 1M Claude conversations showed that coding and business use cases dominate, with web development representing over 10% of interactions.
  • The system also uncovered unexpected use cases like dream interpretation, soccer match analysis, and tabletop gaming assistance.
  • Usage patterns vary significantly by language and region, such as a higher prevalence of economic and social issue chats in non-English conversations.

Why it matters: AI assistants are becoming increasingly integrated into our daily lives, but each person leverages them in a different way, making this a fascinating window into how the tech is being used. Understanding the dominant real-world use cases can both help improve user experience and align development with actual user needs.


One orange gift that’s fit for all

The Brief: Bring the best gift to your holiday festivities this year with the rabbit r1, an AI companion designed to grow more delightful over time thanks to continuous updates and new features that keep it fresh and engaging.

Whether young, young at heart, or in-between, gift r1 to:

  • Feed curiosity with unlimited answers to any questions at a moment’s notice
  • Elevate travels with bi-directional translation in 100+ languages and personalized itineraries
  • Nerd out with a fully customizable UI, the ability to create custom agents, and a playground of out-of-the-box AI features

Take advantage of the only discount of the year and secure r1s for everyone this holiday season.


Replit launches ‘Assistant’ for coding

The Brief: Replit just officially launched its upgraded AI development suite, removing its Agent from early access and introducing a new Assistant tool, alongside a slew of other major platform improvements.

The details:

  • A new Assistant tool focuses on improvements and quick fixes to existing projects, with streamlined editing through simple prompts.
  • Users can now attach images or paste URLs to guide the design process, and Agents can use React to produce more polished and flexible visual outputs.
  • Both tools integrate directly with Replit’s infrastructure, providing access to databases and deployment tools without third-party services.
  • The platform also introduced unlimited usage with a subscription-based model, with built-in credits and Agent checkpoints for more transparent billing.

Why it matters: The competition in AI development has gotten intense, and tools like Replit continue to erase barriers, with builders able to create anything they can dream up. Both beginners and experienced devs now have no shortage of AI-fueled options to bring ideas to life and streamline existing projects.


ChatGPT gains ‘Projects’ for chat organization

The Brief: OpenAI launched Projects for ChatGPT on Day 7 of its ‘12 Days of OpenAI’ event, a new organizational system that lets users group conversations, files, and custom instructions into individual workspaces with shared context.

The details:

  • The feature introduces project-specific folders where users can bundle related chats, documents, and custom AI instructions across conversations.
  • Each Project automatically leverages GPT-4o while maintaining access to core features like Canvas, DALL-E, and web search capabilities.
  • The system is rolling out first to Plus, Pro, and Team subscribers, with Enterprise and Education users gaining access in January.
  • Projects can be created and managed through the web interface and Windows app, while mobile and Mac users can view and chat with existing Projects.

Why it matters: While this isn’t the most groundbreaking feature (Anthropic released Projects for Claude in June), it’s important for user workflows, avoiding the dreaded need to refresh entire context and instruction prompts when starting new chats.


AI TRAINING

Turn your screenshots into working prototypes

The Brief: Claude Artifacts lets you create functional prototypes directly from screenshots or photos, bringing your ideas to life in minutes.

Step-by-step:

  1. Access Claude through your browser or mobile app.
  2. Take a clear screenshot or photo of your design.
  3. Upload it with a detailed prompt requesting a React prototype with specific features and styling.
  4. Enhance your prototype by requesting additional functionality and refinements.

Pro tip: Use high-contrast images and break down complex interfaces into smaller components for better results.

>>.<<

Transform words into cinematic magic with Sora

The Brief: OpenAI’s newly launched Sora AI video generator allows you to turn your text descriptions into realistic videos without cameras, actors, or editing software.

Step-by-step:

  1. Access Sora (accessible via a paid ChatGPT account).
  2. Write a detailed prompt describing your desired video (e.g., “A majestic albino jaguar drinks from a crystal-clear stream.”)
  3. Choose your settings: aspect ratio (16:9, 1:1, or 9:16), resolution (480p to 1080p), and duration (5–20s).
  4. Generate and enhance your video using remix, re-cut, or blend features.

Pro tip: Test your concepts with shorter durations and lower resolutions first, then upgrade settings for your final version.
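The settings in step 3 and the pro tip above can be sketched as a small planning helper in Python. This is not a Sora API: the settings are chosen in Sora’s own interface, and the class and field names below are hypothetical. The sketch only checks a planned generation against the options listed above and pairs a cheap draft pass with a final pass.

```python
from dataclasses import dataclass

# Options as listed in step 3 above.
ASPECT_RATIOS = ("16:9", "1:1", "9:16")
MIN_RESOLUTION_P, MAX_RESOLUTION_P = 480, 1080
MIN_DURATION_S, MAX_DURATION_S = 5, 20


@dataclass(frozen=True)
class VideoSettings:
    """Hypothetical container for one planned generation."""
    aspect_ratio: str = "16:9"
    resolution_p: int = MIN_RESOLUTION_P  # default to the cheapest draft
    duration_s: int = MIN_DURATION_S

    def __post_init__(self) -> None:
        # Reject anything outside the ranges quoted in the steps above.
        if self.aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"aspect ratio must be one of {ASPECT_RATIOS}")
        if not MIN_RESOLUTION_P <= self.resolution_p <= MAX_RESOLUTION_P:
            raise ValueError("resolution must be between 480p and 1080p")
        if not MIN_DURATION_S <= self.duration_s <= MAX_DURATION_S:
            raise ValueError("duration must be between 5 and 20 seconds")


# Draft first (short, low resolution), then upgrade for the final version.
draft = VideoSettings()
final = VideoSettings(aspect_ratio="9:16", resolution_p=1080, duration_s=20)
print(draft)
print(final)
```

Defaulting every field to the lowest setting follows the pro tip: the first pass is the cheap test unless you ask for more.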

>>.<<

Practice job interviews with ChatGPT

The Brief: ChatGPT’s Advanced Voice Mode can be turned into a personalized interview coach, conducting mock interviews and providing real-time feedback.

Step-by-step:

  1. Open ChatGPT’s Advanced Voice Mode on your mobile device.
  2. Set up your specific interview scenario and industry context.
  3. Engage in a realistic mock interview with industry-focused questions.
  4. Get immediate feedback on your responses and presentation.

Pro tip: If you need more time to formulate your responses, you can customize how the AI responds in Custom Instructions.

>>.<<

Turn AI passion into a consulting career

The Brief: Innovating with AI’s new program, AI Consultancy Project, transforms AI enthusiasts into professional consultants, tapping into a market projected to reach $54.7B by 2032.

The 6-month program delivers:

  • Proven frameworks for client acquisition and service delivery
  • A step-by-step path to six-figure consulting income
  • Students who land their first AI client in as little as 3 days

Click here to request early access to The AI Consultancy Project.


NEW TOOLS & JOBS


Trending AI Tools

  • Gemini Stream Realtime: Interact with Gemini in real-time using text, voice, video, or screen sharing.
  • AI Santa by Tavus: Video chat with Santa in real-time across 30 languages
  • Reddit Answers: AI-powered search tool that lets you find human perspectives, recommendations, and info powered by Reddit communities
  • Doctronic AI: Instant, accurate care from home with an AI consultation followed by video visits with licensed doctors
  • Paperguide AI writer: Easily write well-researched articles and academic papers with AI
  • iMerch AI: Next-gen AI e-commerce tool offering intelligent product recommendations and personalized product lists
  • Depth AI: Answer complex questions on large and messy codebases, onboard new engineers quickly, and ship code faster

New AI Job Opportunities


QUICK HITS

Google announced Android XR, a new Gemini-powered operating system for mixed reality systems, with Samsung set to launch the first compatible headset codenamed ‘Project Moohan’ in 2025.

ChatGPT head of product Nick Turley discussed the platform’s future in an interview with The Verge, saying that chat-based interactions may soon feel as “outdated as ’90s instant messaging.”

Amazon Prime Video launched a new ‘AI Topics’ beta feature, using machine learning to group and recommend content based on viewers’ interests and watching habits.

xAI rolled out an upgraded version of Grok-2 to all X platform users, featuring tripled speed, improved multilingual capabilities, and integration of web search and advanced image generation features.

Meta’s FAIR released a suite of new AI research projects including Meta Motivo for embodied agent control and Meta Video Seal for video watermarking, alongside improved models for memory scaling and social intelligence.

OpenAI cofounder Ilya Sutskever warned that AI has reached ‘peak data’ during the NeurIPS conference, predicting a shift from current training methods to more autonomous, reasoning-based systems that will become increasingly unpredictable.

Google unveiled NotebookLM Plus with interactive audio features and Gemini 2.0 Flash integration, allowing users to verbally engage with AI hosts during Audio Overviews and access expanded enterprise capabilities.

OpenAI published new email correspondences and a timeline of events with Elon Musk, claiming that Musk initially wanted the company to be a for-profit entity, despite his active lawsuits against it.

DeepSeek released VL2, a new vision-language model family leveraging MoE architecture that performs similarly to rival models despite smaller sizes.

Anonymous-chatbot, which was previously used to test GPT-4o, has returned to the LM Arena, sparking rumors of a potential GPT-4.5 or upgraded OpenAI model coming soon.


SPONSOR US

Get your product in front of 800k+ AI enthusiasts

Our newsletter is read by thousands of tech executives, investors, engineers, managers, and business owners around the world. Get in touch today.

Want to sponsor us and get in front of 750k+ AI enthusiasts? Get in touch.

Looking for our Expert ChatGPT Prompt Guide? Download free.

Interested in podcasts? Check out ours here.

Go deeper? Join the TD8 AI University.

