BentoML

Software Development

San Francisco, California; 8,894 followers

Unified Inference Platform for building scalable AI systems, with any model, on any cloud.

About us

BentoML is an Inference Platform that lets developers build scalable AI systems with unparalleled speed and flexibility. Own your AI models, iterate faster, and scale at a lower cost.

Website
https://www.bentoml.com
Industry
Software Development
Company size
11-50 employees
Headquarters
San Francisco, California
Type
Privately held
Founded
2019
Specialties
Model Serving, Model Inference, Inference Platform, Compound AI Systems, Multimodality, AI Inference, LLM Inference, LLM Applications, MLOps, and LLMOps

Locations

  • Primary

    650 California St

    6th Floor

    San Francisco, California 94108, US

Updates

  • BentoML

    Qwen2.5-Coder-32B is now available in OpenLLM! It is the first open-source coding model to match GPT-4o's capabilities, and it leads the pack with SOTA performance and 128K context support.

    Serve and deploy it with OpenLLM:

      openllm serve qwen2.5-coder:32b
      openllm deploy qwen2.5-coder:32b

    Check out our demo video to see how it performs in generating the SVG code for a bento box.

    #OpenLLM #AI #MachineLearning #OpenSource #BentoML

  • BentoML reposted this

    macOS users! Did you know you can now run LLMs right on macOS with OpenLLM? Try out some of the popular models with a simple command using `openllm serve`:

      openllm serve phi3:3.8b-ggml-q4
      openllm serve llama3.2:1b-instruct-ggml-fp16-darwin
      openllm serve qwen2.5:14b-ggml-q4

    More exciting news: we're working on MLC model versions to deliver up to 3x faster inference on macOS, with the same models at lightning speed!

    Check out the demo and stay tuned for more OpenLLM updates!

    #LLM #AI #macOS #OpenLLM #OpenSource

  • BentoML

    Happening TODAY: Join us in 2 hours for our live AI development workshop! Learn how to build a phone calling AI agent 20x faster with open-source models using BentoML Codespaces!

      • Today at 9:00 AM PT
      • Live demo
      • Jump in: https://lu.ma/numlvwan

    See you soon!

    Webinar: Build an #AI phone calling agent 20x faster using open-source models

    Developing modern AI apps like #RAG or voice agents requires multiple GPUs and complex dependencies. This often results in dev delays, complicated setup processes, and inconsistent transitions from dev to prod, ultimately leading to unreliable AI services.

    Say hello to #BentoML #Codespaces, your solution to streamlined AI development! Join our Head of Engineering, Sean Sheng, for a live session where he will show you how to:

      • Develop and iterate on a phone calling agent 20x faster with your favorite IDE
      • Leverage cloud GPUs to build the AI app with open-source models
      • Eliminate dependency issues with auto-provisioned environments
      • Debug with real-time updates that mirror production and view live logs

    Date: November 20, 2024
    Time: 9:00 AM - 10:00 AM PT
    Register: https://lu.ma/numlvwan

    Can't make it? Register anyway to receive the recorded session.

    #AIInference #LLM #MachineLearning

    Building Phone Calling AI Agents with BentoML Codespaces - LIVE! AGI Builders Meetup · Zoom · Luma

    lu.ma

  • BentoML reposted this

    Eric Liu

    Industrializing Machine Learning

    LIVE! #AGIBuildersMeetup tomorrow, 9 AM PT

      • Build a phone calling AI agent using OSS models
      • Iterate 20x faster with real-time code synchronization
      • Live Q&A with BentoML's Head of Engineering, Sean Sheng
      • Can't attend? Register to get the recording
      • Spread the love with likes, shares, and invites

    Eric Liu

    Industrializing Machine Learning

    LIVE! #AGIBuildersMeetup on 11/20, 9 AM PT

      • Build a phone calling AI agent using OSS models
      • Iterate 20x faster with real-time code synchronization
      • Live Q&A with BentoML's Head of Engineering, Sean Sheng
      • Can't attend? Register to get the recording
      • Spread the love with likes, shares, and invites

    #AI #GenAI #ML

  • BentoML

    Excited to share our latest example project: Serving ColPali with BentoML!

    ColPali is a game changer for RAG. It enables powerful document retrieval that understands both text AND visual content with a single model. No more complex OCR pipelines needed!

    Key features of this example project:

      • Single model for text and visual elements (charts, layouts, diagrams)
      • Efficient multi-vector embeddings
      • Adaptive batching for high-volume workloads
      • Easy deployment with BentoML

    This project is perfect for teams dealing with large document collections, technical documentation, or research papers. You can search through thousands of pages with natural language queries!

    Want to try it yourself? Check out our open-source implementation here: https://lnkd.in/gpyZGvaj

    Special thanks to Tony W., Pierre Lecerf, and Victor Alibert from ILLUIN Technology for their support and contribution!

    #ColPali #BentoML #MachineLearning #RAG #AI #OpenSource

    GitHub - bentoml/BentoColPali

    github.com

  • BentoML

    3 days left! Learn to build an #AI phone agent 20x faster with #BentoML Codespaces! Nov 20, 9 AM PT. Register now!

  • BentoML

    Exciting update: Llama 3.1 Nemotron 70B is now available in the #OpenLLM model repo, thanks to Tal Kain! This NVIDIA-customized model has shown outstanding performance on the Chatbot Arena #LLM Leaderboard!

    You can now serve and deploy the model using:

      `openllm serve llama3.1-nemotron:70b`
      `openllm deploy llama3.1-nemotron:70b`

    Hardware requirement: a minimum of four 40GB or two 80GB NVIDIA GPUs.

    Double celebration: We've also hit 10K stars on OpenLLM! Again, special thanks to Tal for this contribution to the OpenLLM community!

    Check out our video demo of Llama 3.1 Nemotron 70B deployment and inference on #BentoCloud.

    #OpenSource #BentoML #AI

  • BentoML

    New blog post: Deploy AI21 Labs Jamba 1.5 Mini with BentoML!

    Jamba 1.5 Mini is a top performer on NVIDIA's long-context RULER benchmark, perfect for modern apps like RAG and agentic systems! Read the blog to learn how to deploy this powerful SSM-Transformer model that can handle up to 256K tokens!

    Key highlights:

      • Deploy and scale Jamba 1.5 Mini on BentoCloud with high-performance infra
      • Set up OpenAI-compatible endpoints
      • Full support for custom VPC deployment for maximum control

    Check out the step-by-step guide: https://lnkd.in/gbBcr8rD

    #AI #MachineLearning #OpenSource #LLM #AI21 #Jamba #BentoML

    Deploying AI21’s Jamba 1.5 Mini with BentoML

    bentoml.com
