DAIR.AI

Research Services

Democratizing Artificial Intelligence Research, Education, and Technologies

About us

Building and democratizing AI research, education, and technologies

Website
https://github.com/dair-ai
Industry
Research Services
Company size
2-10 employees
Headquarters
Belmopan
Type
Self-Employed
Founded
2023
Specialties
LLMs, Deep Learning, NLP, Generative AI, Technical Corporate Training, Consulting, and Education

Updates

  • DAIR.AI reposted

    Elvis S.

    Co-founder at DAIR.AI | Ph.D. | Prev: Meta AI, Galactica LLM, Elastic | Prompting Guide (6M+ learners) | I teach how to build with AI

    If you are looking to learn how to use or build with AI, I've built a dedicated learning path just for that:

    1) Introduction to Prompt Engineering: learn the basics of working with LLMs, from what LLMs are to effectively applying few-shot and chain-of-thought prompting.
    2) Advanced Prompt Engineering: learn more advanced prompting techniques like prompt chaining and ReAct, and how to build agentic chatbots with them.
    3) Introduction to AI Agents: learn agentic design patterns and how to build with multi-agent and hierarchical agentic systems.
    4) Introduction to RAG: learn the essentials of retrieval augmented generation and how to build complex RAG systems, including agentic RAG apps.
    5) Introduction to NotebookLM: learn how to use NotebookLM as a powerful research assistant for professional and personal projects.

    Whether you are technical or non-technical, there is something for everyone in our academy. Enroll now: https://lnkd.in/enbjTWm3

    And there is a lot more coming. Stay tuned!

    Use code BLACKFRIDAY to get 35% off. Offer expires 11/29.

    If you are a student, please email us at [email protected] for special discounts. If you want to onboard your team, please email us at [email protected] for special offers.

  • DAIR.AI reposted

    Elvis S.

    LLM-based Agents for Automated Bug Fixing

    Analyzes 7 leading LLM-based bug fixing systems on the SWE-bench Lite benchmark, finding that MarsCode Agent (developed by ByteDance) achieved the highest success rate at 39.33%. Reveals that, for error localization, line-level fault localization accuracy is more critical than file-level accuracy, and that bug reproduction capabilities significantly impact fixing success. Shows that 24/168 resolved issues could only be solved using reproduction techniques, though reproduction sometimes misled LLMs when issue descriptions were already clear. Concludes that improvements are needed in both LLM reasoning capabilities and agent workflow design to enhance automated bug fixing effectiveness.

    This paper highlights the challenging nature of some domains, like code, and the opportunities to innovate further in agentic workflow design.

    https://lnkd.in/dcypkdhf

    ↓
    Enjoy reading AI papers? Join 100K+ researchers and devs for our weekly summary of top AI papers: https://lnkd.in/e6ajg945

  • DAIR.AI reposted

    Elvis S.

    Generating Science From AI-Powered Automated Falsification

    This paper introduces BABY-AIGS, a multi-agent system for automated scientific discovery that emphasizes falsification through automated ablation studies. BABY-AIGS uses a Domain-Specific Language (DSL) to ensure executability and incorporates multi-sampling with reranking to enhance creativity in research proposals. The system was tested on three ML tasks (data engineering, self-instruct alignment, and language modeling), demonstrating the ability to produce meaningful scientific discoveries. However, its performance is below that of experienced human researchers. BABY-AIGS showed promise in falsification capabilities but revealed areas needing improvement, particularly in designing concrete experiment plans and verifying hypotheses.

    Scientific discovery is the next point of focus for AI. Very early research shows potential and different ways to approach automated research and scientific discovery. This multi-agent system is no different from others, but I think what they have done differently is focus on incorporating methods (falsification and executability) that have proven to work for human researchers. These are challenging to automate without human input, and there is a need for more advanced agentic systems/architectures that can more naturally unleash discovery and creativity.

    Paper: https://lnkd.in/dTV9tJsV

    ↓
    Enjoy reading AI papers? Join 100K+ researchers and devs for our weekly summary of top AI papers: https://lnkd.in/e6ajg945

  • DAIR.AI reposted

    Elvis S.

    AWS releases Multi-Agent Orchestrator!

    Multi-Agent Orchestrator is a flexible framework for managing multiple AI agents and handling complex conversations. Features include:

    - Dynamic query routing
    - Python and TypeScript support
    - Streaming support
    - Context management
    - Run locally or on any cloud platform
    - Pre-built agents and classifiers available
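
To make "dynamic query routing" concrete, here is a minimal Python sketch of the general idea: a classifier picks which agent should handle each query. This is not the Multi-Agent Orchestrator API. The `Agent` class, the `route` function, and the keyword-overlap classifier are all hypothetical and only illustrate the pattern. A real system would likely use an LLM or a trained classifier instead of keywords.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Agent:
    """A hypothetical agent: a name, routing keywords, and a handler."""
    name: str
    keywords: frozenset
    handle: Callable[[str], str]


def route(query: str, agents: list, default: Agent) -> Agent:
    """Pick the agent whose keywords overlap most with the query words."""
    words = set(query.lower().split())
    best = max(agents, key=lambda a: len(a.keywords & words))
    # Fall back to the default agent when no keyword matches at all.
    return best if best.keywords & words else default


# Illustrative agents (assumed names, not from any real framework).
billing = Agent("billing", frozenset({"refund", "invoice", "payment"}),
                lambda q: f"[billing] handling: {q}")
tech = Agent("tech", frozenset({"error", "crash", "install"}),
             lambda q: f"[tech] handling: {q}")
general = Agent("general", frozenset(), lambda q: f"[general] handling: {q}")

if __name__ == "__main__":
    for q in ["I need a refund for my payment",
              "The app will crash on install",
              "Hello there"]:
        print(route(q, [billing, tech], general).handle(q))
```

Keeping the router separate from the agents means new agents can be added without touching the dispatch logic, which is the main appeal of this design.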

  • DAIR.AI reposted

    Elvis S.

    The competition for the best reasoning LLM intensifies!

    A few days ago, we had the Forge Reasoning API; now we have DeepSeek-R1-Lite-Preview, which produces o1-preview-level performance on math benchmarks. Here are my observations after some initial tests on DeepSeek's new reasoning model.

    Math capabilities: It looks effective for math reasoning problems. The benchmark results do reflect the potential of this model on math reasoning (it even outperforms o1-preview on their benchmarks). Something to watch very closely.

    Coding tasks: It wasn't able to solve a simple code problem (generating a bash script for transposing a matrix), which the o1 models solve easily.

    Complex knowledge understanding: I also tried the model on a much harder crossword puzzle, but it failed miserably. To be fair, even the o1 models fail on this particular test, which requires knowledge of modern references.

    More thoughts and tests here: https://lnkd.in/d_diAjdz

    I believe the model is good at code and math, as DeepSeek has been explicitly optimizing their models for this. But there is more work to do on the "reasoning" steps. In some instances, the model looks like it is able to correct itself when generating the thinking steps, displaying what looks like native self-reflection. Hard to confirm this without details on training data, architecture, and a technical report/paper. Looking forward to the open models and APIs.

  • DAIR.AI reposted

    Elvis S.

    Excited to launch my new RAG course! https://lnkd.in/eEiYwhVx

    I've built this hands-on course to be the ultimate guide to building RAG systems. It covers topics ranging from RAG enhancements all the way to building Agentic RAG systems. Here's what you will get out of the course:

    - RAG Introduction: Learn the fundamentals of RAG and its core components. Understand why RAG is an important advancement in AI and discover common applications where RAG provides advantages over traditional approaches.
    - RAG Architecture: Explore the technical architecture of RAG systems, covering chunking, embedding models, vector databases, and semantic search fundamentals. Students will explore how retrievers and generators work together while learning key enhancements that optimize RAG performance.
    - Building Naive RAG Systems: Students will apply the fundamentals to build their first RAG application from scratch. You will build a personalized tutor using RAG.
    - Build a RAG Chat Assistant: A chat assistant is one of the most common enterprise use cases for RAG. Students will learn how to create a document store from scratch, build the chat assistant with RAG, and apply common techniques like query expansion to improve results. You will build a RAG-powered customer service chatbot for a website.
    - Advanced RAG: Students will implement an advanced RAG system and apply more advanced prompting techniques like tool calling, chain-of-thought (CoT) prompting, and prompt chaining to improve reliability and response quality. You will build a complex RAG solution that unifies core ideas used for building with LLMs.
    - Agentic RAG: Covers one of the most recent and advanced ways to build agent-based RAG systems. Students will learn about function calling and how agents can integrate with a RAG system to extend its capabilities and improve user experience. You will build an Agentic RAG application that interacts with external tools such as a calculator, a reasoning chain tool, and an LLM chain tool to complete customer orders.
    - Deploy RAG Apps: Students will take all the learnings from the course and build a shareable online RAG application to receive feedback. You will also learn more advanced tips for how to continue improving your RAG apps.

    ---

    We're also excited to offer a special 35% discount that will last until 11/29 (the end of next week). Use code BLACKFRIDAY at checkout. This is the cheapest our courses will ever be, so take advantage of it, as prices will go up in the coming months as we continue to add more technical courses.
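
The retrieval steps named in the course outline (chunking, embedding, semantic search, then handing context to a generator) can be sketched roughly as follows. This is a toy illustration, not material from the course. The bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for a vector database, and every function name and the sample document are assumptions made for the example.

```python
import math
import re
from collections import Counter

# Assumed sample document, used only for this illustration.
DOCUMENT = (
    "Orders ship within two business days. "
    "Refunds are issued within five days of a return. "
    "Support is available by email on weekdays."
)


def chunk(text: str) -> list:
    """Chunking: split the text into sentence-sized chunks."""
    return re.split(r"(?<=[.!?])\s+", text.strip())


def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words counts (a stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list, k: int = 1) -> list:
    """Semantic search (toy version): rank chunks by similarity to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query: str, context: list) -> str:
    """Assemble the prompt that would be sent to the generator (an LLM)."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


if __name__ == "__main__":
    question = "How long do refunds take?"
    print(build_prompt(question, retrieve(question, chunk(DOCUMENT))))
```

The generator call is left out on purpose: the point of the sketch is that retrieval narrows the context before any LLM is involved.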

  • DAIR.AI reposted

    Elvis S.

    Bi-Mamba: Towards Accurate 1-Bit State Space Models

    Presents Bi-Mamba, a scalable 1-bit Mamba architecture designed for more efficient LLMs, at multiple sizes: 780M, 1.3B, and 2.7B. Bi-Mamba achieves performance comparable to its full-precision counterparts (e.g., FP16 or BF16). It significantly reduces memory footprint, with better accuracy than post-training binarization Mamba baselines.

    Looks like efforts on low-bit representation continue to be explored. This will be an important trend to watch going into 2025.

    Paper: https://lnkd.in/ex87PVR3

    ↓
    Enjoy reading AI papers? Join 90k+ researchers and devs for our weekly summary of top AI papers: https://lnkd.in/e6ajg945
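
For readers unfamiliar with 1-bit weights, the sketch below shows a generic form of post-training binarization: each weight is replaced by its sign, and a single scale (here the mean absolute value) is kept to approximate the original magnitudes. This is an assumption-laden illustration of the general technique, not Bi-Mamba's method, which the post does not describe. The function names are hypothetical.

```python
def binarize(weights: list) -> tuple:
    """Return sign values (+1 or -1) and a scale equal to the mean absolute weight."""
    scale = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return signs, scale


def dequantize(signs: list, scale: float) -> list:
    """Approximate the original weights from the signs and the shared scale."""
    return [s * scale for s in signs]


if __name__ == "__main__":
    signs, scale = binarize([0.5, -1.5, 1.0])
    print(signs, scale)
    print(dequantize(signs, scale))
```

Storing one bit per weight instead of 16 bits is where the memory saving comes from, at the cost of the approximation error visible when comparing the dequantized values with the originals.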

  • DAIR.AI reposted

    Elvis S.

    Fast GraphRAG is a new framework for interpretable, high-precision, agent-based retrieval workflows. Looks very promising and it is open source! https://lnkd.in/ebxDJRAE

    ↓
    Enjoy reading AI papers? Join 90k+ researchers and devs for our weekly summary of top AI papers: https://lnkd.in/e6ajg945

  • DAIR.AI reposted

    Elvis S.

    The Dawn of GUI Agent

    Explores Claude 3.5 computer use capabilities across different domains and software. They also provide an out-of-the-box agent framework for deploying API-based GUI automation models.

    "Claude 3.5 Computer Use demonstrates unprecedented ability in end-to-end language to desktop actions."
