What's new in TrueFoundry: Introducing Rate Limiting, Vision Models & GPU Metrics

TrueFoundry

Reduce time to value on Gen AI & ML initiatives

Published: November 13, 2024

Hello Folks!

We’ve rolled out a bunch of exciting new product updates this past month:

1) Rate limiting for the AI Gateway

2) Support for new models: vision, image generation, and speech-to-text (ASR)

3) GPU-related cluster metrics

Rate Limiting for AI Gateway

Rate limiting for an AI gateway prevents system overload and manages costs by controlling request volume. It ensures stable performance and equitable resource access for all users. It helps:

• Control traffic to maintain stable performance.

• Limit usage to avoid high costs from heavy request volumes.

• Prevent any single user from monopolizing resources, keeping access fair for all.

TrueFoundry now supports rate limiting by user, service account, model, or any metadata such as customer ID.
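
As a rough illustration of what limiting by user, service account, model, or metadata means, here is a minimal sketch in Python. It is not TrueFoundry's implementation or configuration syntax: the class `FixedWindowLimiter`, its parameters, and the dimension names are hypothetical, and a fixed-window counter is simply one easy-to-read algorithm chosen for brevity.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class FixedWindowLimiter:
    """Hypothetical in-memory limiter; illustrative only, not a gateway API."""

    limit: int                                  # max requests per window, per key
    window_seconds: float                       # window length in seconds
    clock: Callable[[], float] = time.monotonic  # injectable for testing
    # Maps (dimension, value) -> (window id, requests seen in that window).
    _windows: Dict[Tuple[str, str], Tuple[int, int]] = field(
        default_factory=dict, init=False
    )

    def allow(self, dimension: str, value: str) -> bool:
        """Return True if this request fits within the current window.

        `dimension` is an assumed label such as "user", "service_account",
        "model", or a metadata field like "customer_id".
        """
        key = (dimension, value)
        window_id = int(self.clock() // self.window_seconds)
        stored_id, count = self._windows.get(key, (window_id, 0))
        if stored_id != window_id:
            count = 0  # a new window has started, so reset the counter
        if count >= self.limit:
            return False
        self._windows[key] = (window_id, count + 1)
        return True


if __name__ == "__main__":
    limiter = FixedWindowLimiter(limit=2, window_seconds=60)
    for attempt in range(3):
        print("customer_id=acme", attempt, limiter.allow("customer_id", "acme"))
```

Each (dimension, value) pair gets its own counter, so a burst from one customer ID does not consume the budget of another user or model.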

Support for New Models - Vision, Image Generation & Speech to Text (ASR)

• We are constantly expanding the array of models we support on the platform.

• You can now deploy vision, image generation, and speech-to-text models on the platform, enabling diverse applications across multiple domains.

• These models come with pre-configured resources and best practices such as autoscaling, auto-shutdown, rollout strategy, and model caching built in, ensuring efficient, scalable, and cost-effective deployments.

GPU Related Cluster Metrics

TrueFoundry now provides comprehensive GPU-related cluster metrics, enabling users to monitor GPU resources at the cluster level. Key metrics include:

• Cluster GPU Nodes, which displays active GPU types and counts.

• GPU Count, which tracks provisioned and requested GPUs.

• GPU Memory Usage, which shows memory utilization for efficient resource management.

These insights help teams optimize GPU usage, enhance performance, and ensure balanced workloads across their AI infrastructure.
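
To make the three metrics concrete, the sketch below shows how such cluster-level figures could be derived from per-node data. It is an assumption-laden illustration, not TrueFoundry's metrics API: the `GpuNode` record, its field names, and the `summarize` function are all hypothetical, and "counts" is interpreted here as the number of nodes per GPU type.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class GpuNode:
    """Hypothetical per-node record; field names are assumptions."""

    gpu_type: str            # e.g. "A100" or "T4"
    gpus_provisioned: int    # GPUs physically available on the node
    gpus_requested: int      # GPUs requested by workloads on the node
    memory_used_gib: float
    memory_total_gib: float


def summarize(nodes: List[GpuNode]) -> Dict[str, object]:
    """Aggregate per-node records into cluster-level GPU figures."""
    nodes_by_type = Counter(node.gpu_type for node in nodes)
    used = sum(node.memory_used_gib for node in nodes)
    total = sum(node.memory_total_gib for node in nodes)
    return {
        "nodes_by_type": dict(nodes_by_type),
        "gpus_provisioned": sum(node.gpus_provisioned for node in nodes),
        "gpus_requested": sum(node.gpus_requested for node in nodes),
        # Fraction of GPU memory in use across the cluster (0.0 if no GPUs).
        "memory_utilization": used / total if total else 0.0,
    }


if __name__ == "__main__":
    cluster = [
        GpuNode("A100", 8, 6, 320.0, 640.0),
        GpuNode("T4", 1, 1, 8.0, 16.0),
        GpuNode("T4", 1, 0, 0.0, 16.0),
    ]
    print(summarize(cluster))
```

A large gap between provisioned and requested GPUs, or low memory utilization, is the kind of signal that would prompt resizing a node pool.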

Reach out to us at: https://www.truefoundry.com/book-demo
