Infinite Context Length?

Meta recently launched MEGALODON, a neural architecture designed for unlimited context length. With 7 billion parameters trained on 2 trillion tokens, it shows better efficiency than Llama 2, hinting that Llama 3, expected this summer, is likely to include infinite-context capabilities.

Soon, the debate over which model boasts the greatest context length will become irrelevant. Most recently, Microsoft, Google, and Meta have all taken strides in this direction, pushing context length towards infinity.

Besides Meta, Google also introduced Infini-Attention, a new method for efficiently scaling Transformers to handle infinitely long inputs, enhancing performance on extensive language tasks.

How is it different? MEGALODON emphasises architectural modifications for gradient flow and computational efficiency, whereas Infini-Attention focuses on a hybrid attention mechanism that integrates memory compression to manage lengthy inputs effectively.
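For readers who want a feel for the mechanism, below is a minimal NumPy sketch of the compressive-memory idea behind Infini-Attention: each segment still runs ordinary local softmax attention, while a fixed-size memory matrix accumulates keys and values from everything seen before, and a gate blends the two paths. The feature map, gate, and update rule here are simplified assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def elu_plus_one(x):
    # Non-negative feature map applied to queries/keys for the memory path
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0)))

def infini_attention_segment(q, k, v, memory, z, beta):
    """Process one segment with local attention plus a compressive memory.

    q, k, v : (seq, d) projections for the current segment
    memory  : (d, d) compressed key-value store carried across segments
    z       : (d,) running normalisation term carried across segments
    beta    : scalar gate mixing memory read-out with local attention
    """
    seq, d = q.shape

    # 1) Read from the fixed-size memory of all previous segments.
    sigma_q = elu_plus_one(q)
    mem_out = (sigma_q @ memory) / (sigma_q @ z + 1e-6)[:, None]

    # 2) Ordinary causal softmax attention within the current segment.
    scores = (q @ k.T) / np.sqrt(d) + np.triu(np.full((seq, seq), -np.inf), k=1)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    local_out = weights @ v

    # 3) Blend the two paths with a sigmoid gate.
    gate = 1.0 / (1.0 + np.exp(-beta))
    out = gate * mem_out + (1.0 - gate) * local_out

    # 4) Fold this segment into the memory; its size never grows with history.
    sigma_k = elu_plus_one(k)
    memory = memory + sigma_k.T @ v
    z = z + sigma_k.sum(axis=0)
    return out, memory, z

# Usage: stream segments through, carrying (memory, z) like an RNN state.
d = 64
memory, z = np.zeros((d, d)), np.zeros(d)
for seg in np.split(np.random.randn(4096, d), 8):
    out, memory, z = infini_attention_segment(seg, seg, seg, memory, z, beta=0.0)
```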

There’s more. Recently, Google researchers also introduced TransformerFAM, a novel Feedback Attention Memory (FAM) mechanism that enables Transformers to process indefinitely long sequences without additional weights, significantly enhancing performance on long-context tasks.
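The feedback idea can be sketched in a few lines: a small, fixed set of memory vectors is prepended to every block, and after the block is processed those vectors are refreshed by attending over that same block, so the existing attention weights double as working memory. The sketch below, which collapses the whole model into a single attention call per block, is an assumption-laden simplification of that loop, not the paper's implementation.

```python
import numpy as np

def attention(queries, keys, values):
    # Plain scaled dot-product attention (no masking, kept minimal)
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ values

def process_with_feedback_memory(x, block_size=512, num_fam=8):
    """Process an arbitrarily long sequence block by block.

    x : (seq, d) token representations
    A constant-size set of memory vectors is prepended to each block and
    refreshed after it, so state never grows with sequence length and no
    weights are introduced beyond the attention already in use.
    """
    d = x.shape[-1]
    fam = np.zeros((num_fam, d))      # feedback memory carried across blocks
    outputs = []

    for start in range(0, len(x), block_size):
        block = x[start:start + block_size]
        context = np.concatenate([fam, block], axis=0)

        # Block tokens attend to the memory plus the local block.
        outputs.append(attention(block, context, context))

        # The memory attends to the same context to update itself (feedback).
        fam = attention(fam, context, context)

    return np.concatenate(outputs, axis=0)
```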

Meanwhile, researchers from the Beijing Academy of Artificial Intelligence have introduced Activation Beacon, a method that extends LLMs’ context length by condensing raw activations into compact forms. This plug-in component enables LLMs to perceive long contexts while retaining performance within shorter contexts.
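Conceptually, the beacon trick swaps many raw activations for a handful of condensed ones. The toy sketch below uses mean-pooling as a stand-in for the learned condensing step, just to show how a long activation stream shrinks into a short one that later chunks can attend to; the real method learns this compression with attention.

```python
import numpy as np

def condense_with_beacons(activations, chunk_size=256, condense_ratio=16):
    """Shrink a long stream of activations into a short stream of 'beacons'.

    activations : (seq, d) hidden states from earlier context
    Returns roughly seq / condense_ratio beacon vectors; later chunks would
    attend to these beacons instead of the raw activations, so the window the
    model actually sees stays short even as the real context keeps growing.
    """
    beacons = []
    for start in range(0, len(activations), chunk_size):
        chunk = activations[start:start + chunk_size]
        num_beacons = max(1, len(chunk) // condense_ratio)
        # Mean-pooling stands in for the learned, attention-based condensing.
        for group in np.array_split(chunk, num_beacons):
            beacons.append(group.mean(axis=0))
    return np.stack(beacons)

# 8,192 token activations of width 64 collapse to ~512 beacon vectors.
compact = condense_with_beacons(np.random.randn(8192, 64))
```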

Pushing the limit: In February, Microsoft Research published the paper ‘LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens’.

The technique significantly increases the context length of LLMs to an unprecedented 2048k tokens while preserving their original performance within shorter context windows.
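The core trick is rescaling the rotary position embeddings so that positions far beyond the training range map back into numbers the model has already seen; LongRoPE searches for a different interpolation factor per frequency band rather than applying one uniform factor. The sketch below illustrates that idea with made-up factors (the real ones come from an evolutionary search).

```python
import numpy as np

def rope_angles(position, dim, scale_factors=None, base=10000.0):
    """Rotary-embedding angles with optional per-dimension rescaling.

    Without rescaling, angles at position 2,000,000+ fall far outside anything
    seen in training. Dividing each frequency band by its own factor squeezes
    those huge positions back into the trained range; LongRoPE finds the
    factors with a search rather than using one uniform value.
    """
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    if scale_factors is not None:
        inv_freq = inv_freq / scale_factors   # non-uniform interpolation
    return position * inv_freq

dim = 64
factors = np.linspace(1.0, 512.0, dim // 2)        # illustrative, not the searched values
in_range = rope_angles(4096, dim)                  # position the model trained on
extended = rope_angles(2_097_152, dim, factors)    # far beyond training, rescaled
```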

Furthermore, another team of Microsoft researchers has challenged the traditional approach of LLM pre-training, which uniformly applies a next-token prediction loss to all tokens in a training corpus. Instead, they propose a new language model called RHO-1, which utilises Selective Language Modeling (SLM) to train only on the tokens that matter most.
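In rough terms, SLM scores each token by how much the training model still lags a reference model on it, and only the highest-scoring tokens contribute to the loss. Here is a minimal PyTorch sketch of that selection step; the keep ratio and scoring details are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def selective_lm_loss(logits, ref_logits, targets, keep_ratio=0.6):
    """Selective Language Modeling: back-propagate only through 'useful' tokens.

    logits     : (seq, vocab) predictions from the model being trained
    ref_logits : (seq, vocab) predictions from a frozen reference model
    targets    : (seq,) next-token ids
    Tokens where the training model lags the reference the most (largest
    excess loss) are kept; the rest contribute nothing to the gradient.
    """
    token_loss = F.cross_entropy(logits, targets, reduction="none")
    with torch.no_grad():
        ref_loss = F.cross_entropy(ref_logits, targets, reduction="none")

    excess = token_loss.detach() - ref_loss        # how much is left to learn
    k = max(1, int(keep_ratio * excess.numel()))
    keep = torch.topk(excess, k).indices           # top-k tokens by excess loss

    return token_loss[keep].mean()
```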

Small Context Window: Better Results?

Not really. There has been a long-running conversation about how models with longer context windows tend to get lost in the middle: facts at the beginning and end of the input are retained far better than those in the middle. For accuracy, smaller context-length inputs are still recommended, even with the advent of long-context LLMs.

Jim Fan from NVIDIA AI explains that claims of a million or a billion tokens do little on their own to improve LLMs. “What truly matters is how well the model actually uses the context. It’s easy to make seemingly wild claims, but much harder to solve real problems better,” he added.

Recently, NVIDIA researchers developed RULER, a synthetic benchmark designed to evaluate long-context language models across various task categories, including retrieval, multi-hop tracing, aggregation, and question answering.
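Retrieval tasks of this kind are easy to picture: bury a random key-value pair somewhere in a long stretch of filler text and ask the model to dig it out. The snippet below builds such a needle-in-a-haystack test case (a hypothetical generator, not RULER's actual code); sweeping the needle position from start to end is exactly how the lost-in-the-middle effect shows up.

```python
import random
import string

def make_retrieval_example(filler_sentences=4000, needle_position=0.5):
    """Build a needle-in-a-haystack test case (hypothetical generator).

    A random key-value pair is buried at a chosen relative position inside
    filler text, and the prompt asks the model to return the value. Varying
    needle_position between 0.0 and 1.0 tests every part of the context.
    """
    key = "".join(random.choices(string.ascii_lowercase, k=8))
    value = "".join(random.choices(string.digits, k=6))
    needle = f"The secret code for {key} is {value}."

    filler = ["The sky was clear and the market was quiet."] * filler_sentences
    insert_at = int(needle_position * len(filler))
    haystack = filler[:insert_at] + [needle] + filler[insert_at:]

    prompt = " ".join(haystack) + f"\nWhat is the secret code for {key}?"
    return prompt, value

# Place the needle right in the middle, where long-context models struggle most.
prompt, expected = make_retrieval_example(needle_position=0.5)
```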

Enjoy the full story here.


Will Keyboards Become Obsolete?

American entrepreneur and investor Naval Ravikant recently launched Airchat, a unique voice-centric social media app, amidst a broader shift towards voice as a primary user interface in AI.

Besides this, several new-age apps and devices, including Hume AI, Humane Ai Pin, Rabbit R1, and Limitless, are all prioritising voice commands, alongside robotics company Figure AI, which also believes that voice will be an integral part of an AI future.

Read the full story here.


INDIA


Workshop Wonders

Join us for an exclusive webinar on ‘Automating Data Pipelines with Snowflake: Accelerating Data Integration and ETL Workflows’ on April 25, 2024. This insightful session is tailored for those keen on streamlining their data pipelines and accelerating time-to-insight.

What are you waiting for? Register now to secure your spot in this enriching journey towards advanced data handling. Click here >>


Enjoying Sector 6 (formerly AIM Daily XO)? Share it with colleagues or friends – they can sign up here.

We love hearing from our readers! Have thoughts on our new format? Questions, comments, or ideas are always welcome. If there’s a specific topic in AI or analytics that you're curious about, tell us! Reach out to us at [email protected].

Stay tuned for more insights in our next edition! Curated with ❤️
