AI-Powered news roundup: Edition 21

AI-Powered news roundup: Edition 21

Our bi-weekly news roundup is designed to keep you informed on the latest developments in AI, all in under 5 minutes.


In this edition:

  1. DeepSeek’s meteoric rise: A new challenger in AI
  2. OpenAI launches o3-mini: A fast, cost-effective reasoning model
  3. EU’s AI Act takes effect, bans “unacceptable risk” AI
  4. DeepSeek’s R1 model shakes up Nvidia’s stock
  5. OpenAI unveils Operator: A new AI agent for web automation
  6. Perplexity unveils AI assistant for daily tasks
  7. Anthropic introduces citations API to reduce AI errors


1. DeepSeek’s meteoric rise: A new challenger in AI

Sources: TechCrunch, ChinaTalk

Chinese AI lab DeepSeek has burst onto the global stage after its chatbot app topped the Apple and Google Play store charts. Backed by hedge fund High-Flyer Capital Management, DeepSeek quickly built powerful AI models using cost-efficient training techniques, raising concerns about U.S. dominance in the AI sector and the demand for AI chips.

DeepSeek's V3 model, launched in December 2024, claims superior performance over Meta’s Llama and even OpenAI’s GPT-4o, while its R1 "reasoning" model sets a new benchmark in self-fact-checking AI. However, as a Chinese company, its models must align with government regulations, limiting responses on sensitive topics.

Source: Deepseek

Despite pricing its AI models well below market rates—and even offering some for free—DeepSeek has attracted widespread adoption. With Microsoft integrating DeepSeek into its Azure AI Foundry and industry giants like OpenAI and Nvidia reacting to its rapid rise, DeepSeek’s future remains uncertain.



2. OpenAI launches o3-mini: A fast, cost-effective reasoning model

Source: TechCrunch

OpenAI has released o3-mini, the latest in its "o" reasoning model family, designed to be faster, more efficient, and cost-effective. Fine-tuned for STEM fields, including programming, math, and science, o3-mini is positioned as an alternative to its predecessor, o1, while maintaining accuracy at a lower cost.

OpenAI claims that o3-mini reduces major mistakes by 39% compared to o1-mini and delivers responses 24% faster. Available in ChatGPT and via API, the model allows users to adjust the "reasoning effort" to balance speed and accuracy. OpenAI also touts o3-mini as a safer model, outperforming GPT-4o in safety evaluations.

Despite its strengths, o3-mini does not consistently outperform DeepSeek’s R1 reasoning model, especially in advanced scientific domains. Still, with pricing 63% cheaper than o1-mini, OpenAI is betting that affordability and efficiency will drive adoption—while also countering the rising influence of Chinese AI competitors.


3. EU’s AI Act takes effect, bans “unacceptable risk” AI

Source: European Commission

The European Union’s AI Act has reached its first compliance deadline, empowering regulators to ban AI systems deemed to pose an “unacceptable risk”. As of February 2, AI applications that manipulate behavior, exploit vulnerabilities, or conduct real-time biometric surveillance in public spaces are now illegal within the bloc. Companies violating the law could face fines of up to €35 million ($36 million) or 7% of global revenue.

Some notable banned use cases include:

  • AI-powered social scoring based on personal behavior
  • Emotion recognition at workplaces and schools
  • Predictive policing based on appearance
  • Biometric data collection without proper oversight

While companies like Google, Amazon, and OpenAI voluntarily pledged to align with the AI Act early, Apple, Meta, and Mistral AI did not sign the EU’s AI Pact. However, exemptions exist—law enforcement may still use biometric AI in public spaces under strict conditions.

With further enforcement measures rolling out in August, businesses face growing regulatory pressure. How will AI companies adapt to Europe’s tough new rules?


4. DeepSeek’s R1 model shakes up Nvidia’s stock

Source: TechCrunch

Chinese AI startup DeepSeek made waves last week with the release of its R1 reasoning model, which rivals U.S. counterparts while using less compute power. While a win for AI efficiency, R1’s success sent Nvidia’s stock tumbling nearly 17%, wiping $600 billion off its market cap. Investors fear that DeepSeek’s breakthrough signals a reduced dependence on high-end AI chips, threatening Nvidia’s dominance.

Source: Yahoo Finance

Nvidia responded by highlighting that AI inference still relies on its GPUs and framed DeepSeek’s approach as an example of Test Time Scaling, a technique that optimizes AI models post-training.

The release comes amid shifting U.S. AI policies. Former president Joe Biden’s AI chip export ban aimed at China was reversed by Donald Trump, who instead launched Project Stargate, a $500 billion AI infrastructure initiative.

DeepSeek’s R1 demonstrates that AI innovation isn’t just about hardware. As U.S.-China AI competition intensifies, is America focusing on the right battleground?


OpenAI Unveils Operator: A New AI Agent for Web Automation

Source: OpenAI

OpenAI has launched Operator, an AI-powered web automation agent capable of navigating websites, booking travel, shopping online, and more—all without developer-facing APIs. Available first to ChatGPT Pro ($200/month) users in the U.S., Operator will expand to Plus, Team, and Enterprise tiers over time.

Powered by OpenAI’s Computer-Using Agent (CUA) model, Operator mimics human-like web interactions, clicking buttons, filling out forms, and navigating menus. To mitigate risks, OpenAI has implemented strict safeguards, requiring user confirmation before finalizing critical actions like purchases or emails.

While promising, Operator has limitations. It struggles with CAPTCHAs, complex UI, and password fields, and won’t yet perform emailing or calendar management tasks. Security remains a key concern, but OpenAI asserts that automated monitoring prevents misuse.


6. Perplexity Unveils AI Assistant for Daily Tasks

Source: Perplexity

AI-powered search engine Perplexity has launched Perplexity Assistant, an AI agent designed to handle multi-app tasks like hailing rides, setting calendar events, and even using your phone’s camera to answer questions about objects around you. Integrated with Perplexity’s web-powered search, the assistant maintains context across actions—allowing users to research restaurants and automatically book a table.

Available in 15 languages, Perplexity Assistant is currently free for all users. However, the rollout comes with some caveats. CEO Aravind Srinivas admitted that some actions might not always work, similar to past Perplexity features that launched with glitches and delays.

The assistant’s debut follows Perplexity’s recent expansion, including its Sonar API for enterprise AI search and the acquisition of professional networking platform Read.cv. But legal battles with publishers remain a challenge—News Corp and NY Post have sued Perplexity over alleged content scraping.


7. Anthropic introduces citations API to reduce AI errors

Source: Anthropic

Anthropic has launched Citations, a new API feature that enables its AI models to ground responses in specific source documents. This enhancement allows models to provide detailed references to exact sentences and passages, increasing the verifiability and trustworthiness of AI-generated outputs.

Key Features:

  • Improved Verification: Citations addresses the challenge of verifying AI-generated responses by automatically linking claims to their original sources, reducing the need for complex prompt engineering. Internal evaluations indicate that this built-in capability increases recall accuracy by up to 15% compared to custom implementations.
  • Diverse Applications: The feature is particularly beneficial for:
  • Seamless Integration: Citations processes user-provided source documents (PDFs and plain text files) by chunking them into sentences. These chunks, along with user-provided context, are then passed to the model with the user's query. Claude analyzes the query and generates a response that includes precise citations based on the provided chunks and context for any claims derived from the source material.

Citations uses Anthropic's standard token-based pricing model. While it may use additional input tokens to process documents, users will not pay for output tokens that return the quoted text itself.

Thomson Reuters, utilizing Claude for their AI platform CoCounsel, reported that Citations makes citing and linking to primary sources much easier to build, maintain, and deploy to users. Endex, another client, noted a reduction in source hallucinations and formatting issues from 10% to 0% and a 20% increase in references per response after implementing Citations.

Citations is now available for the new Claude 3.5 Sonnet and Claude 3.5 Haiku models. Developers can start using Citations by exploring Anthropic's documentation.



Are you as passionate about AI as we are? Explore open positions at Siili and work on exciting, cutting-edge projects. Join us and take the next step in your career!


Thanks for sharing Siili Solutions . AutoKeybo runs DeepSeek.

回复

要查看或添加评论,请登录

Siili Solutions的更多文章

社区洞察

其他会员也浏览了