Biggest data moments of the past two weeks: GPT-4.5 sparks debate, Apple’s M4 drops, and Google races toward AGI

What's inside this issue

This fortnight brings exciting developments across AI, hardware, and data infrastructure. From OpenAI's latest model and Apple's new products to innovative approaches for making AI more reliable and data systems more performant, we've packed this issue with insights for engineers, data scientists, and tech enthusiasts alike.


Recent highlights

OpenAI's GPT-4.5 launch

OpenAI released GPT-4.5, but reception has been mixed. The model prioritizes emotional intelligence over raw reasoning power, with users noting it feels like talking to a thoughtful person and can write beautifully. However, it comes with significant drawbacks:

  • Prohibitively expensive pricing ($75 per million input tokens and $150 per million output tokens)
  • Required 10x more compute for what Andrej Karpathy described as diffuse improvements – broad but subtle enhancements spread across various capabilities rather than concentrated, measurable gains in specific areas
  • Many see it as evidence that scaling alone may be hitting fundamental limits

Apple’s new MacBook Air with M4 chip

Apple has refreshed its MacBook Air lineup with the new M4 chip, bringing five key updates to both 13-inch and 15-inch models. The laptops now feature a 10-core CPU, a 12-megapixel Center Stage webcam, support for dual external monitors, and a new sky blue color option. Most importantly, Apple has lowered the starting prices to $999 for the 13-inch and $1,199 for the 15-inch model. The new MacBook Airs, along with updated Mac Studio models and M3 iPad Airs, are available for preorder now.

Google's AI race

Sergey Brin, Google cofounder, has issued a direct challenge to the company's DeepMind AI division in the race to develop artificial general intelligence (AGI). In a surprising internal memo, Brin urged AI researchers to turbocharge their efforts by working 60-hour weeks, coming to the office daily, and focusing on simpler solutions. Most notably, Brin criticized Google's AI products as "overrun with filters," suggesting they need to trust users rather than creating "nanny products." The memo reflects growing pressure on Google to balance AI safety with development speed as the company positions itself against rivals in what Brin calls the final race to AGI.

Company news & funding

  • Anthropic raised an enormous $3.5 billion at a $61.5 billion valuation
  • Ex-OpenAI CTO Mira Murati's startup Thinking Machines Lab is reportedly close to raising $1 billion at a $9 billion valuation
  • TSMC plans to invest over $100 billion in new AI chip facilities in the US over the next four years
  • OpenAI is reportedly planning tiered AI agents targeting different professional levels: $2K/month for knowledge workers, $10K/month for software development, and $20K/month for PhD-level research, together expected to generate 20-25% of the company's revenue

Impressive new technologies from the past two weeks

Sesame's conversational speech model:

Sesame (founded by Oculus co-founder Brendan Iribe) demonstrated revolutionary voice AI technology:

  • Their "Conversational Speech Model" creates remarkably human-like voice interactions
  • Features natural pauses, emotional expressions, and consistent personality
  • Blind listening tests show many evaluators couldn't distinguish it from human recordings
  • Planning to open-source key components under Apache 2.0 license

Alibaba's QwQ-32B:

  • Alibaba released QwQ-32B, an open-weight AI model that can be run locally (see the sketch after this list)
  • Designed for step-by-step reasoning through complex questions
  • Available through Qwen cloud and various demos
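For a sense of what "run locally" means in practice, here is a hedged sketch using Hugging Face transformers. It assumes the checkpoint is published as Qwen/QwQ-32B on the Hugging Face Hub, and that you have hardware suited to a 32B model (expect to need a large GPU or a quantized variant):

```python
# Hedged sketch: load QwQ-32B locally with transformers. Assumes the
# checkpoint is published as "Qwen/QwQ-32B" on the Hugging Face Hub and
# that accelerate is installed for device_map="auto".
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/QwQ-32B"  # assumed Hub ID; check the Qwen org page
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Reasoning models like this one produce long step-by-step answers,
# so leave plenty of room in max_new_tokens.
messages = [{"role": "user", "content": "How many primes are there below 100?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```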


Breaking: Claude 3.7 Sonnet is out

Claude 3.7 Sonnet, Anthropic's newest AI assistant, represents a fundamental shift in how artificial intelligence approaches complex problems. Unlike previous AI models that simply provide answers, Claude 3.7 Sonnet takes you behind the scenes of its thinking process, offering unprecedented transparency into how it reaches conclusions.

The thinking revolution

What makes this new model truly special is its Extended Thinking Mode. When activated, Claude doesn't just solve problems; it shows you its entire reasoning path. This is similar to how a good math teacher doesn't just give you the answer to an equation but walks you through each step of the solution, letting users see exactly how Claude arrives at its answers. The model can dedicate specific resources (measured in "tokens") to its thinking process before providing a final response. This means Claude can work through complex problems methodically, considering multiple approaches and refining its thinking along the way.
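Here's what that looks like in practice: a minimal sketch using Anthropic's Python SDK with the documented extended-thinking parameters (the model ID is the one current at the time of writing; verify it against the API reference):

```python
# Minimal sketch of Extended Thinking via Anthropic's Messages API.
# Note that max_tokens must exceed the thinking budget, since it covers
# both the reasoning and the final answer.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-7-sonnet-20250219",  # model ID current at time of writing
    max_tokens=20000,
    thinking={"type": "enabled", "budget_tokens": 16000},  # reasoning budget
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
)

# The response interleaves "thinking" blocks (the reasoning path)
# with "text" blocks (the final answer).
for block in response.content:
    if block.type == "thinking":
        print("[thinking]", block.thinking)
    elif block.type == "text":
        print("[answer]", block.text)
```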

Technical capabilities

Claude 3.7 Sonnet excels at tasks requiring deep analytical skills. It demonstrates remarkable improvements in mathematical reasoning, scientific problem-solving, and working with extensive documents. The model can generate responses up to 128,000 tokens long, equivalent to roughly 100,000 words or a small novel.

This creates new possibilities for developers and technical teams. Claude's ability to think deeply before responding benefits complex optimization problems, detailed data analysis, and nuanced content generation. Technical users can control exactly how much "thinking" they want Claude to perform based on the complexity of their task.

Claude’s API

Developers can easily access Claude 3.7 Sonnet through Anthropic's straightforward API. Alternative platforms like Replicate also offer Claude 3.7 Sonnet integration, providing flexibility for teams with different infrastructure needs.
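Through Replicate, the call is shorter still. Treat the sketch below as an assumption rather than gospel: both the model slug and the input fields follow Replicate's usual conventions, so confirm them on the model's page:

```python
# Hypothetical Replicate call. The slug "anthropic/claude-3.7-sonnet" and
# the "prompt"/"max_tokens" inputs are assumptions based on Replicate's
# conventions; verify both on the model page before relying on this.
import replicate

output = replicate.run(
    "anthropic/claude-3.7-sonnet",
    input={"prompt": "Summarize the trade-offs of extended thinking.",
           "max_tokens": 1024},
)
print("".join(output))  # language models on Replicate stream text chunks
```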

When should you use Claude 3.7 Sonnet?

Claude 3.7 Sonnet shows its greatest value when tackling challenging problems that require careful consideration. Some ideal use cases include:

  • Solving complex mathematical or scientific problems where the reasoning path matters
  • Analyzing lengthy documents or datasets where nuance is important
  • Creating detailed plans that need to account for multiple constraints
  • Generating comprehensive, well-structured content

For simpler tasks, you might allocate fewer thinking tokens (4,000-8,000), while complex problems benefit from deeper thinking (16,000+ tokens). Remember that more thinking means slightly longer response times, but often results in more accurate and thorough answers.
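One convenient pattern is to encode those tiers once and reuse them. The boundaries below come from the guidance above, while the helper itself is purely illustrative, not an official Anthropic API:

```python
# Illustrative helper: map task complexity to a thinking budget using the
# tiers suggested above. Not an official Anthropic recommendation.
THINKING_BUDGETS = {
    "simple": 4000,    # short rewrites, quick lookups
    "moderate": 8000,  # multi-step analysis
    "complex": 16000,  # proofs, long-document reasoning, planning
}

def thinking_config(complexity: str) -> dict:
    """Build the `thinking` parameter for client.messages.create()."""
    return {"type": "enabled", "budget_tokens": THINKING_BUDGETS[complexity]}
```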


Industry insights

Solving LLM reliability issues with agentic mesh

Eric Broda highlights a fundamental challenge with LLMs: as models tackle larger tasks, errors compound exponentially due to the Combinatorial Explosion of Choice problem. Rather than waiting for bigger models, he proposes a practical architecture where specialized LLMs handle smaller, independent subtasks orchestrated by agents. This approach transforms large, error-prone requests into discrete steps executed by domain-specific models, preventing cascading failures. By implementing this as microservices with deterministic orchestration, engineers can leverage familiar patterns for security, monitoring, and state management while dramatically improving reliability. For teams struggling with LLM hallucinations in complex applications, this composition-based strategy offers an immediately implementable solution using existing infrastructure practices rather than hoping the next model iteration magically solves the problem.
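Broda describes the pattern at the architecture level; the sketch below is our own minimal illustration of the composition idea, with call_model standing in for whatever LLM client each specialized service wraps:

```python
# Minimal illustration of composition with deterministic orchestration:
# small, independent subtasks routed to specialized models, each validated
# before the pipeline moves on, so one bad step can't cascade.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Subtask:
    name: str
    prompt: str
    model: str                       # domain-specific model for this step
    validate: Callable[[str], bool]  # deterministic check on the output

def call_model(model: str, prompt: str) -> str:
    raise NotImplementedError("wrap your LLM client of choice here")

def run_pipeline(subtasks: list[Subtask]) -> dict[str, str]:
    """Execute subtasks in order, failing fast on the first invalid output
    instead of letting the error compound through later steps."""
    results: dict[str, str] = {}
    for task in subtasks:
        output = call_model(task.model, task.prompt)
        if not task.validate(output):
            raise ValueError(f"subtask {task.name!r} failed validation")
        results[task.name] = output
    return results
```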

Apache Iceberg vs. Hadoop

Apache Iceberg solves important technical problems like flexible schemas and reliable transactions, but implementing it comes with challenges similar to what made Hadoop projects fail. While the table format itself is elegant, you still need to manage catalogs, compute engines, and maintenance processes that require significant expertise. Common issues like the "small file problem" persist, and the lack of standardized catalogs can lead to vendor lock-in. Before adopting Iceberg, engineering teams should honestly assess if they have the operational capabilities to handle these complexities or whether a managed solution might be better. The key lesson from Hadoop applies here too: powerful technology alone doesn't guarantee success if the surrounding ecosystem is too complex to manage effectively.

Delta Lake table compaction strategies compared

A recent benchmark tested five ways to solve the small file problem in Delta Lake tables, where performance drops as tiny files pile up. The winner? Combining two features: Auto Compaction (which automatically merges small files) and Optimized Write (which organizes data before writing it).

The study tested:

  • No compaction (files grew unchecked, making writes 5x slower)
  • Scheduled compaction (required manual maintenance jobs)
  • Auto compaction (automatic merging of small files)
  • Optimized write (pre-organizes data to create fewer files)
  • Auto compaction + Optimized write (the winning combination)

With this winning approach, tables maintained fast, consistent performance without requiring manual maintenance. One current limitation: due to a bug (fix coming soon), only use Auto Compaction on tables smaller than 1GB to avoid excessive merging on larger tables.
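Enabling the winning combination comes down to two table properties. The names below are the ones recognized by Databricks and recent open-source Delta releases, and the events table is a placeholder; verify both for your runtime:

```python
# Enable Optimized Write + Auto Compaction on an existing Delta table.
# Assumes a Spark session already configured for Delta Lake; the property
# names are the Databricks / recent OSS Delta spellings.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("""
    ALTER TABLE events SET TBLPROPERTIES (
        'delta.autoOptimize.optimizeWrite' = 'true',  -- organize data before writing
        'delta.autoOptimize.autoCompact'   = 'true'   -- merge small files after writes
    )
""")
# Per the benchmark's caveat, apply Auto Compaction only to tables under
# ~1GB until the mentioned bug fix lands.
```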


Tool of the fortnight

Check out Hard Fork's chat with Anthropic's CEO about the new Claude model – fascinating stuff if you're into AI and where it's headed.


Pro tip: Struggling to pick the right visualization library? Check out Deepnote's multi-library comparison template that puts Matplotlib, Seaborn, Plotly, and Altair side by side so you can see which one best fits your data storytelling needs.

Final thoughts

As AI development accelerates and data infrastructure evolves, finding the right balance between innovation and reliability becomes crucial. Whether you're building with LLMs, managing data lakes, or exploring visualization libraries, remember that the most elegant technical solution is the one that fits your specific needs while remaining maintainable for the long term. Looking forward to the next issue!
