RAG to Riches

Pinecone launches a serverless vector database to curb hallucinations and cut RAG costs. Stability AI launches Stable Code 3B. Meta explores diffusion for transferring textures onto 3D shapes. Let’s dive in!

ML Engineering Highlights:

  • New vector database architecture a 'breakthrough' to curb AI hallucinations: Pinecone, a New York City-based startup, has announced a serverless vector database architecture aimed at reducing AI hallucinations and making AI applications more knowledgeable and cost-efficient. The new architecture is designed to deliver up to 50x cost reductions, eliminate infrastructure hassles, and make it easier and cheaper for companies to run RAG over their own content at scale (a minimal usage sketch follows below). Pinecone serverless is seen as a sign that the generative AI ecosystem and tech stack are maturing, with integrations with other top AI companies in the stack.

Credit: Pinecone
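
For readers who want to kick the tires, here is a minimal sketch of a serverless index backing a RAG lookup, assuming the v3+ Pinecone Python client. The index name, dimension, cloud region, and placeholder vectors are our own illustrative choices, not details from the announcement:

```python
# A minimal RAG-style lookup against a serverless Pinecone index (a sketch,
# not Pinecone's reference implementation). Names and values are placeholders.
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="YOUR_API_KEY")

# Serverless indexes are declared by cloud/region instead of provisioned pods.
pc.create_index(
    name="docs-rag",
    dimension=1536,  # must match your embedding model's output size
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)
index = pc.Index("docs-rag")

# Upsert pre-computed document embeddings, keeping the source text as metadata.
index.upsert(vectors=[
    ("doc-1", [0.1] * 1536, {"text": "Pinecone announced a serverless architecture."}),
])

# At query time, embed the user question (placeholder vector here), retrieve
# the nearest chunks, and feed them to the LLM prompt as grounding context.
results = index.query(vector=[0.1] * 1536, top_k=3, include_metadata=True)
context = "\n".join(m["metadata"]["text"] for m in results["matches"])
```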

  • Stability AI releases Stable Code 3B: Stability AI has announced the release of the commercially licensed Stable Code 3B, a 3-billion-parameter model focused on code completion for software development. The model was trained on 18 programming languages and optimized for an expanded context size using Rotary Position Embeddings (RoPE), demonstrating leading performance on benchmark tests; a brief RoPE sketch follows this list. Stable Code 3B is available through Stability AI’s new membership subscription service.
  • Microsoft Copilot Pro launches for power AI users: Microsoft has launched Copilot Pro, a premium tier of its Copilot AI companion, for individual power users such as developers, designers, and researchers at $20 per user per month. The tier promises priority access to the latest GPT models from OpenAI during peak times, along with enhanced features that serve individual users more efficiently. While this expands the experience and capabilities on offer, some worry that the price may be a barrier for the many users who have relied on the free version of Copilot.
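
As a quick refresher on the technique behind Stable Code 3B’s longer context, here is a small, self-contained sketch of Rotary Position Embeddings in PyTorch. The shapes and the base frequency of 10,000 are common defaults, not Stability AI’s published configuration:

```python
# A minimal sketch of Rotary Position Embeddings (RoPE): each pair of channels
# is rotated by an angle proportional to the token's position.
import torch

def rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply rotary embeddings to x of shape (seq_len, dim), dim even."""
    seq_len, dim = x.shape
    # One frequency per channel pair; earlier pairs rotate faster.
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    pos = torch.arange(seq_len).float()
    angles = torch.outer(pos, inv_freq)          # (seq_len, dim/2)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, 0::2], x[:, 1::2]
    # Standard 2D rotation applied to each (x1, x2) channel pair.
    rotated = torch.empty_like(x)
    rotated[:, 0::2] = x1 * cos - x2 * sin
    rotated[:, 1::2] = x1 * sin + x2 * cos
    return rotated

q = torch.randn(128, 64)   # (positions, head_dim)
q_rot = rope(q)
```

Because the rotation angle depends only on absolute position, dot products between rotated queries and keys depend only on their relative offset, the property that context-window extension tricks build on.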

Research Highlights:

  • TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion: This paper by researchers at Meta and the University of California, San Diego presents TextureDreamer, a method for transferring realistic textures from a small number of input images to arbitrary 3D shapes, potentially democratizing texture creation. The method draws on recent advances in diffusion models and combines personalized modeling for texture information extraction, variational score distillation for detailed appearance synthesis, and explicit geometry guidance with ControlNet; a simplified sketch of the score-distillation idea appears below. Experiments show that TextureDreamer successfully transfers highly realistic, semantically meaningful textures to arbitrary objects, surpassing the visual quality of previous state-of-the-art methods.

Credit: Paper authors
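
To give a feel for the score-distillation family of techniques the paper builds on, here is a heavily simplified, SDS-style optimization loop. The render and denoiser functions are stubs standing in for a differentiable renderer and a (personalized) pretrained diffusion model; TextureDreamer’s actual variational score distillation and ControlNet geometry guidance are more involved:

```python
# A toy score-distillation loop: optimize a texture so its renders look
# plausible to a diffusion model. Stubs and schedule are illustrative only.
import torch

texture = torch.randn(3, 256, 256, requires_grad=True)  # learnable texture map
opt = torch.optim.Adam([texture], lr=1e-2)

def render(tex):           # stub: a real pipeline rasterizes the 3D shape
    return tex             # textured by `tex` from a sampled camera view

def denoiser(x_t, t):      # stub for a pretrained diffusion model that would
    return torch.zeros_like(x_t)  # predict the noise added at timestep t

for step in range(1000):
    img = render(texture)
    t = torch.randint(1, 1000, ())           # random diffusion timestep
    noise = torch.randn_like(img)
    alpha = 1.0 - t.float() / 1000.0         # toy noise schedule
    x_t = alpha.sqrt() * img + (1 - alpha).sqrt() * noise
    with torch.no_grad():
        eps_pred = denoiser(x_t, t)
    # Score-distillation gradient: nudge the render toward images the
    # diffusion model finds likely, without backprop through the denoiser.
    grad = eps_pred - noise
    opt.zero_grad()
    img.backward(gradient=grad)
    opt.step()
```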

  • DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference: This paper by Microsoft introduces DeepSpeed-FastGen, a system that uses Dynamic SplitFuse to improve the throughput and latency of serving large language models (LLMs). It leverages DeepSpeed-MII and DeepSpeed-Inference to provide an efficient serving system supporting a range of models and deployment options (an illustrative scheduling sketch appears after this list). The paper presents a benchmarking methodology, performance analysis, and a future roadmap for enhancements, with the code available for community engagement and contribution.
  • Asynchronous Local-SGD Training for Language Modeling: This paper by Google DeepMind and the University of Texas at Austin presents an empirical study of asynchronous Local-SGD for training language models, examining how worker hardware heterogeneity, model size, number of workers, and choice of optimizer affect learning. The study finds that naive asynchronous Local-SGD takes more iterations to converge than its synchronous counterpart, and identifies momentum acceleration on the global parameters as a key challenge. The authors propose a method that uses a delayed Nesterov momentum update and adjusts each worker’s local training steps based on its computation speed (see the second sketch below); it matches synchronous Local-SGD in perplexity per update step and surpasses it in wall-clock time.
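
To illustrate the core idea behind Dynamic SplitFuse, fixed-size forward passes that mix prompt chunks with single decode tokens, here is a toy scheduler in plain Python. It is our own illustration of the scheduling pattern, not DeepSpeed’s actual implementation or API:

```python
# Toy SplitFuse-style batching: every forward pass gets a fixed token budget,
# long prompts are split across passes, decode tokens fill remaining slots.
from dataclasses import dataclass

@dataclass
class Request:
    rid: int
    prompt_remaining: int   # prompt tokens not yet prefilled
    decoding: bool = False  # True once the prompt is fully consumed

def build_batch(requests: list[Request], budget: int = 512) -> list[tuple[int, int]]:
    """Return (request id, token count) pairs for one forward pass."""
    batch, used = [], 0
    # Decode-phase requests contribute exactly one token each.
    for r in requests:
        if r.decoding and used < budget:
            batch.append((r.rid, 1))
            used += 1
    # Remaining budget is filled with prompt chunks; long prompts get split.
    for r in requests:
        if not r.decoding and used < budget:
            take = min(r.prompt_remaining, budget - used)
            batch.append((r.rid, take))
            used += take
            r.prompt_remaining -= take
            if r.prompt_remaining == 0:
                r.decoding = True
    return batch

reqs = [Request(0, 1400), Request(1, 100), Request(2, 0, decoding=True)]
print(build_batch(reqs))  # -> [(2, 1), (0, 511)]: the long prompt is split,
                          # and request 1 rides the next uniform-sized pass
```

Keeping every pass close to the same token count is what smooths out the latency spikes that whole-prompt prefills would otherwise cause.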
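
And for the asynchronous Local-SGD paper, here is a toy sketch of the delayed-momentum idea: worker deltas move the parameters immediately, while the Nesterov momentum term is refreshed only every few arrivals from a buffered average. The constants and exact update rule are our assumptions, and the sketch omits the paper’s speed-based adjustment of local steps:

```python
# Toy delayed-Nesterov server for asynchronous Local-SGD (an illustration of
# the idea, not the paper's exact algorithm).
import torch

def make_server(params: torch.Tensor, lr: float = 0.1,
                beta: float = 0.9, delay: int = 4):
    state = {"m": torch.zeros_like(params), "buf": torch.zeros_like(params), "n": 0}

    def on_worker_delta(delta: torch.Tensor):
        # delta: a worker's pseudo-gradient, i.e. its starting snapshot of the
        # global params minus its params after running local SGD steps.
        params.sub_(lr * delta)      # gradient part applied immediately
        state["buf"] += delta
        state["n"] += 1
        if state["n"] % delay == 0:
            # Momentum is refreshed only every `delay` arrivals, from the
            # buffered average, to blunt the effect of stale async updates.
            state["m"] = beta * state["m"] + state["buf"] / delay
            params.sub_(lr * beta * state["m"])
            state["buf"].zero_()

    return on_worker_delta

# Usage: simulate eight asynchronous worker arrivals.
w = torch.zeros(10)
push = make_server(w)
for _ in range(8):
    push(torch.randn(10))
```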

Don’t Miss the Submission Deadline

  • ICML 2024: International Conference on Machine Learning Submission Deadline: Thu Feb 01 2024 11:59:00 GMT-1200
  • CHIL 2024: Conference on Health, Inference, and Learning Submission Deadline: Mon Feb 05 2024 23:59:59 GMT-0500
  • ECCV 2024: European Conference on Computer Vision 2024 Submission Deadline: Fri Mar 08 2024 06:59:00 GMT-0500
  • MICCAI 2024: International Conference on Medical Image Computing and Computer Assisted Intervention Submission Deadline: Fri Mar 08 2024 02:59:59 GMT-0500
  • ECAI 2024: European Conference on Artificial Intelligence 2024 Submission Deadline: Fri Apr 26 2024 07:59:59 GMT-0400

Want to learn more from Lightning AI? “Subscribe” to make sure you don’t miss the latest flashes of inspiration, news, tutorials, educational courses, and other AI-driven resources from around the industry. Thanks for reading!
