Reka AI releases Reka Core: Understands images, videos, and audio

David Cronshaw

Sr. Product Manager @Disney Streaming | Co-Founder Chatmosa chatmosa.bsky.social | AI, Generative AI | Revenue Generation | Former Microsoft and T-Mobile | Co-Founder UltimateTV.com - Zap2it.com

Published: April 16, 2024

Reka AI, a San Francisco-based AI startup founded by researchers from DeepMind, Google and Meta, is introducing a new multimodal language model called Reka Core.

"Reka is a frontier-class multimodal language model on par with leading models in the industry today. Core was efficiently trained from scratch on thousands of GPUs over a period of a few months." - Reka AI

Available via API, on-premises, or on-device, Core is the third member of Reka’s family of language models. It understands multiple modalities, including image, audio and video, and offers a 128K context window, strong reasoning, and coding ability.

According to Reka, Core is one of only two commercially available comprehensive multimodal solutions.

You can test out Reka Core in the Reka Playground.

Even though Core was trained in less than a year, Reka reports that it matches or beats the performance of top models from leading players in the AI space, including OpenAI, Google and Anthropic.

"Core is comparable to GPT-4V on MMMU, outperforms Claude-3 Opus on our multimodal human evaluation conducted by an independent third party, and surpasses Gemini Ultra on video tasks. On language tasks, Core is competitive with other frontier models on well-established benchmarks." - Reka AI

The table below summarizes a comparison of Core with leading models in the market today.

Reka AI has three models: Reka Core, Flash, and Edge. All three are trained to handle and analyze multimodal inputs.

Reka Core Capabilities

Multimodal (image and video) understanding. Core is not just a frontier large language model. It has powerful contextualized understanding of images, videos, and audio and is one of only two commercially available comprehensive multimodal solutions.
128K context window. Core is capable of ingesting and precisely and accurately recalling much more information.
Reasoning. Core has superb reasoning abilities (including language and math), making it suitable for complex tasks that require sophisticated analysis.
Coding and agentic workflow. Core is a top-tier code generator. Its coding ability, when combined with other capabilities, can empower agentic workflows.
Multilingual. Core was pretrained on textual data from 32 languages. It is fluent in English as well as several Asian and European languages.
Deployment Flexibility. Core, like our other models, is available via API, on-premises, or on-device to satisfy the deployment constraints of our customers and partners.
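
To make the 128K figure concrete, here is a rough sketch of checking whether a document is likely to fit in that window. It assumes "128K" means 128,000 tokens and uses the common rule of thumb of about four characters per token for English text; neither comes from Reka, and Core's real tokenizer will give different counts. The names `estimate_tokens`, `fits_in_context`, and `reserved_for_output` are hypothetical, chosen only for illustration.

```python
# Assumption: "128K" is read as 128,000 tokens (Reka does not spell this out here).
CONTEXT_WINDOW_TOKENS = 128_000
# Assumption: ~4 characters per token, a rough heuristic for English text,
# not a property of Reka's actual tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Rough token estimate: character count divided by 4, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(text: str, reserved_for_output: int = 1_000) -> bool:
    """True if the estimated prompt size plus an output budget fits the window."""
    return estimate_tokens(text) + reserved_for_output <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    short_doc = "word " * 50_000    # 250,000 characters
    long_doc = "word " * 150_000    # 750,000 characters
    print(fits_in_context(short_doc))
    print(fits_in_context(long_doc))
```

For a real integration, the estimate should be replaced with counts from the provider's own tokenizer.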

Reka Model Showcase:

Reka shows image and data analysis results on its Model Showcase page:

Reka Core Video:

Reka Core has a lot of capabilities, and one of them is understanding video. Let’s see what Core thinks of the @3body trailer.

Reka tested its Reka Core multimodal language model on Netflix’s “3 Body Problem” and it was able to translate what’s happening onscreen into text. Credit: Reka

Reka Core Use-Cases

Some use cases of Reka’s models include:

Image captioning and tagging
Content moderation
Engineering
Engagement (sales, customer service, support)
Direct action / agentic workflows

Reka Core Tech Report:

State-of-the-Art Performance:

Reka models demonstrate state-of-the-art performance, especially Reka Core, which is competitive with the best models from other leading companies in both automatic and human evaluations across various benchmarks.

Model Details:

Reka Edge and Flash are smaller but powerful models with 7B and 21B parameters, respectively.
Reka Core is still in development but shows promising results comparable to leading models like GPT-4.
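
As a back-of-the-envelope illustration of what the 7B and 21B parameter counts mean for deployment, the sketch below estimates the memory needed just to hold the weights. The bytes-per-parameter values (2 for 16-bit, 0.5 for 4-bit storage) are assumptions for illustration; the summary above does not say which precision Reka uses, and real deployments also need memory for activations and caches. `weight_memory_gb` is a hypothetical helper name.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Parameter counts as stated in the article; Core's size is not given there.
MODELS = {"Reka Edge": 7e9, "Reka Flash": 21e9}

if __name__ == "__main__":
    for name, params in MODELS.items():
        fp16 = weight_memory_gb(params, 2.0)   # assumption: 16-bit weights
        int4 = weight_memory_gb(params, 0.5)   # assumption: 4-bit quantized weights
        print(f"{name}: ~{fp16:.0f} GB at 16-bit, ~{int4:.1f} GB at 4-bit")
```

Under these assumptions, Edge's weights come to roughly 14 GB at 16-bit and Flash's to roughly 42 GB.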

Training Data and Architecture:

Utilizes a diverse mix of public and proprietary data.
Incorporates advanced architectural features like a modular encoder-decoder structure and supports multimodal inputs.

Training and Infrastructure:

Extensive use of Nvidia’s latest GPUs, with training processes detailed to optimize performance.
Emphasis on overcoming the challenges of scaling up training infrastructure and managing computational resources efficiently.

Evaluation and Benchmarks:

Comprehensive evaluation across language understanding, multimodal tasks, and specialized domains like medical reasoning.
Demonstrates superior capabilities in handling complex queries over long context spans and multilingual content.

User and Developer Accessibility:

Models are accessible for use at chat.reka.ai and showcase.reka.ai.
Provides APIs and platforms for developers to interact with and integrate these models into various applications.

Ongoing Development and Future Prospects:

Continuous improvement is highlighted, with expectations for further advancements in model capabilities and applications.
Discussion of the balance between innovation and practical deployment in AI development.

You can try Reka Core at https://chat.reka.ai/

#ai #rekaai #rekacore #aivideo #aiaudio #multimodality
