?? Qwen2.5-Coder-32B now available in OpenLLM! It is the first open-source coding model to match GPT-4o's capabilities! It leads the pack with SOTA performance and 128K context support! Serve and deploy it with OpenLLM: openllm serve qwen2.5-coder:32b openllm deploy qwen2.5-coder:32b Check out our demo video to see how it performs in generating the SVG code for a bento box ?? #OpenLLM #AI #MachineLearning #OpenSource #BentoML
BentoML
软件开发
San Francisco,California 8,894 位关注者
Unified Inference Platform for building scalable AI systems, with any model, on any cloud.
关于我们
BentoML is an Inference Platform that let developer build scalable AI systems with unparalleled speed and flexibility. Own your AI models, iterate faster, and scale at a lower cost.
- 网站
-
https://www.bentoml.com
BentoML的外部链接
- 所属行业
- 软件开发
- 规模
- 11-50 人
- 总部
- San Francisco,California
- 类型
- 私人持股
- 创立
- 2019
- 领域
- Model Serving、Model Inference、Inference Platform、Compound AI Systems、Multimodality、AI Inference、LLM Inference、LLM Applications、MLOps和LLMOps
产品
地点
-
主要
650 California St
6 fl
US,California,San Francisco,94108
BentoML员工
动态
-
BentoML转发了
?? macOS users! Do you know you can now run LLMs right on your macOS with OpenLLM? ??Try out some of the popular models with a simple command using `openllm serve`! openllm serve phi3:3.8b-ggml-q4 openllm serve llama3.2:1b-instruct-ggml-fp16-darwin openllm serve qwen2.5:14b-ggml-q4 ???More exciting news: we're working on MLC model versions to deliver up to 3x faster inference on macOS — same models, lightning speed! ? Check out the demo and stay tuned for more OpenLLM updates! #LLM #AI #macOS #OpenLLM #OpenSource
-
??? Happening TODAY: Join us in 2 hours for our live AI development workshop! Learn how to build a phone calling AI agent 20x faster with open-source models using BentoML Codespaces! ? Today at 9:00 AM PT ?? Live demo ???Jump in https://lu.ma/numlvwan See you soon! ??
??? Webinar: Build an #AI phone calling agent 20x faster using open-source models Developing modern AI apps like #RAG or voice agents requires multiple GPUs and complex dependencies. This often results in dev delays, complicated setup processes, and inconsistent transitions from dev to prod, ultimately leading to unreliable AI services. ???Say hello to #BentoML #Codespaces, your solution to streamlined AI development! Join our Head of Engineering, Sean Sheng, for a live session where he will show you how to: ?? Develop and iterate on a phone calling agent 20x faster with your favorite IDE ?? Leverage cloud GPUs to build the AI app with open-source models ?? Eliminate dependency issues with auto-provisioned environments ?? Debug with real-time updates that mirror production and view live logs ?? Date: November 20, 2024 ?? Time: 9:00 AM - 10:00 AM PT ?? Register: https://lu.ma/numlvwan Can't make it? Register anyway to receive the recorded session. ?? #AIInference #LLM #MachineLearning
-
BentoML转发了
?? LIVE! #AGIBuildersMeetup tomorrow, 9 AM, PT ??? Build phone calling AI agent using OSS models? ? Iterate 20x faster with real-time code synchronization ?? Live Q&A with BentoML, Head of ENG, Sean Sheng? ?? Can't attend? Register to get the recording ?? Spread the love with likes, shares, and invites
?? LIVE! #AGIBuildersMeetup on 11/20, 9 AM, PT ??? Build phone calling AI agent using OSS models ? Iterate 20x faster with real-time code synchronization ?? Live Q&A with BentoML, Head of ENG, Sean Sheng ?? Can't attend? Register to get the recording ?? Spread the love with likes, shares, and invites #AI #GenAI #ML
此处无法显示此内容
在领英 APP 中访问此内容等
-
?? Excited to share our latest example project: Serving ColPali with BentoML! ColPali is a game changer for RAG. It enables powerful document retrieval that understands both text AND visual content with a single model. No more complex OCR pipelines needed! Key features of this example project: ? Single model for text and visual elements (charts, layouts, diagrams) ? Efficient multi-vector embeddings ? Adaptive batching for high-volume workloads ? Easy deployment with BentoML ?? This project is perfect for teams dealing with large document collections, technical documentation, or research papers. You can search through thousands of pages with natural language queries! Want to try it yourself? Check out our open-source implementation here: https://lnkd.in/gpyZGvaj Special thanks to Tony W., Pierre Lecerf and Victor Alibert from ILLUIN Technology for their support and contribution! #ColPali #BentoML #MachineLearning #RAG #AI #OpenSource
GitHub - bentoml/BentoColPali
github.com
-
?? 3 days left! Learn to build an #AI phone agent 20x faster with #BentoML Codespaces! ??? Nov 20, 9 AM PT. ?? Register now!
??? Webinar: Build an #AI phone calling agent 20x faster using open-source models Developing modern AI apps like #RAG or voice agents requires multiple GPUs and complex dependencies. This often results in dev delays, complicated setup processes, and inconsistent transitions from dev to prod, ultimately leading to unreliable AI services. ???Say hello to #BentoML #Codespaces, your solution to streamlined AI development! Join our Head of Engineering, Sean Sheng, for a live session where he will show you how to: ?? Develop and iterate on a phone calling agent 20x faster with your favorite IDE ?? Leverage cloud GPUs to build the AI app with open-source models ?? Eliminate dependency issues with auto-provisioned environments ?? Debug with real-time updates that mirror production and view live logs ?? Date: November 20, 2024 ?? Time: 9:00 AM - 10:00 AM PT ?? Register: https://lu.ma/numlvwan Can't make it? Register anyway to receive the recorded session. ?? #AIInference #LLM #MachineLearning
Building Phone Calling AI Agents with BentoML Codespaces - LIVE! AGI Builders Meetup · Zoom · Luma
lu.ma
-
?? Exciting update: Llama 3.1 Nemotron 70B is now available in the #OpenLLM model repo, thanks to Tal Kain! This NVIDIA-customized model has shown outstanding performance on the Chatbot Arena #LLM Leaderboard! You can now serve and deploy the model using: `openllm serve llama3.1-nemotron:70b` `openllm deploy llama3.1-nemotron:70b` ?? Hardware requirement: Minimum 4 40GB or 2 80GB NVIDIA GPUs ?? Double celebration: We've also hit 10K stars on OpenLLM! Again, special thanks to Tal for this contribution to the OpenLLM community! Check out our video demo of Llama 3.1 Nemotron 70B deployment and inference on #BentoCloud. #OpenSource #BentoML #AI
-
?? New blog post: Deploy AI21 Labs Jamba 1.5 Mini with BentoML! Jamba 1.5 Mini is a top performer on NVIDIA's long context RULER benchmark, perfect for modern apps like RAG and agentic systems! Read the blog to learn how to deploy this powerful SSM-Transformer model that can handle up to 256K tokens! ???? Key highlights: ??Deploy and scale Jamba 1.5 Mini on BentoCloud with high-performance infra ??Set up OpenAI-compatible endpoints ??Full support for custom VPC deployment for maximum control Check out the step-by-step guide: https://lnkd.in/gbBcr8rD #AI #MachineLearning #OpenSource #LLM #AI21 #Jamba #BentoML
Deploying AI21’s Jamba 1.5 Mini with BentoML
bentoml.com
-
??? Webinar: Build an #AI phone calling agent 20x faster using open-source models Developing modern AI apps like #RAG or voice agents requires multiple GPUs and complex dependencies. This often results in dev delays, complicated setup processes, and inconsistent transitions from dev to prod, ultimately leading to unreliable AI services. ???Say hello to #BentoML #Codespaces, your solution to streamlined AI development! Join our Head of Engineering, Sean Sheng, for a live session where he will show you how to: ?? Develop and iterate on a phone calling agent 20x faster with your favorite IDE ?? Leverage cloud GPUs to build the AI app with open-source models ?? Eliminate dependency issues with auto-provisioned environments ?? Debug with real-time updates that mirror production and view live logs ?? Date: November 20, 2024 ?? Time: 9:00 AM - 10:00 AM PT ?? Register: https://lu.ma/numlvwan Can't make it? Register anyway to receive the recorded session. ?? #AIInference #LLM #MachineLearning
Building Phone Calling AI Agents with BentoML Codespaces - LIVE! AGI Builders Meetup · Zoom · Luma
lu.ma
-
BentoML转发了
?? LIVE! #AGIBuildersMeetup on 11/20, 9 AM, PT ??? Build phone calling AI agent using OSS models ? Iterate 20x faster with real-time code synchronization ?? Live Q&A with BentoML, Head of ENG, Sean Sheng ?? Can't attend? Register to get the recording ?? Spread the love with likes, shares, and invites #AI #GenAI #ML
此处无法显示此内容
在领英 APP 中访问此内容等