What's New in TrueFoundry: Introducing Rate Limiting, Vision Models & GPU Metrics

Hello Folks!

We’ve rolled out a bunch of exciting new product updates this past month:

1) Rate limiting for the AI Gateway

2) Support for new models: vision, image generation, and speech-to-text (ASR)

3) GPU-related cluster metrics

Rate Limiting for AI Gateway

Rate limiting on an AI gateway caps the volume of requests it accepts, which prevents system overload and keeps costs under control. It also ensures stable performance and equitable resource access for all users. It helps you:

- Control traffic to maintain stable performance.

- Limit usage to avoid high costs from heavy request volumes.

- Prevent any single user from monopolizing resources, keeping access fair for all.

TrueFoundry now supports rate limiting by user, service account, model, or any metadata such as customer ID.
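
To make the idea concrete, here is a minimal sketch of a fixed-window rate limiter keyed by request attributes. It is not TrueFoundry's implementation or configuration format; the class `FixedWindowLimiter`, the helper `rate_limit_key`, and the request fields (`user`, `model`, `customer_id`) are hypothetical names chosen for illustration.

```python
import time
from dataclasses import dataclass, field


@dataclass
class FixedWindowLimiter:
    """Allow at most `limit` requests per `window_seconds` for each key.

    Illustrative only: a fixed-window counter kept in memory.
    """
    limit: int
    window_seconds: float
    # key -> (window start time, requests counted in that window)
    _windows: dict = field(default_factory=dict)

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self.window_seconds:
            # The previous window has expired, so start a fresh one.
            start, count = now, 0
        if count >= self.limit:
            self._windows[key] = (start, count)
            return False
        self._windows[key] = (start, count + 1)
        return True


def rate_limit_key(request, dimensions):
    """Build a limiter key from the chosen request attributes (hypothetical fields)."""
    return tuple(request.get(d) for d in dimensions)


# Limit each (customer_id, model) pair to 2 requests per 60 seconds.
limiter = FixedWindowLimiter(limit=2, window_seconds=60)
request = {"user": "alice", "model": "example-model", "customer_id": "acme"}
key = rate_limit_key(request, ("customer_id", "model"))

# Explicit timestamps (in seconds) keep the example deterministic.
results = [limiter.allow(key, now=t) for t in (0, 1, 2, 61)]
print(results)  # [True, True, False, True]
```

Changing the `dimensions` tuple is what switches the limit between per-user, per-model, or per-customer behavior in this sketch. A production gateway would typically keep these counters in a shared store rather than in process memory.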

Support for New Models: Vision, Image Generation & Speech-to-Text (ASR)

- We are constantly expanding the range of models we support on the platform.

- You can now deploy vision, image generation, and speech-to-text models on the platform, enabling diverse applications across multiple domains.

- These models come with pre-configured resources and built-in best practices such as autoscaling, auto-shutdown, rollout strategy, and model caching, so deployments stay efficient, scalable, and cost-effective.

GPU-Related Cluster Metrics

TrueFoundry now provides comprehensive GPU-related cluster metrics, enabling users to monitor GPU resources at the cluster level. Key metrics include:

- Cluster GPU Nodes, which displays active GPU types and counts.

- GPU Count, which tracks provisioned and requested GPUs.

- GPU Memory Usage, which shows memory utilization for efficient resource management.

These insights help teams optimize GPU usage, enhance performance, and ensure balanced workloads across their AI infrastructure.
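
As a rough illustration of what these cluster-level numbers represent, the sketch below rolls per-node records up into the three metrics above. The record fields, the sample values, and the `summarize` function are assumptions made for this example; they are not TrueFoundry's data model or API.

```python
from collections import Counter

# Hypothetical per-node records; field names and values are sample data only.
nodes = [
    {"gpu_type": "A100", "gpus": 8, "gpus_requested": 6,
     "mem_used_gib": 400.0, "mem_total_gib": 640.0},
    {"gpu_type": "A100", "gpus": 8, "gpus_requested": 8,
     "mem_used_gib": 560.0, "mem_total_gib": 640.0},
    {"gpu_type": "T4", "gpus": 4, "gpus_requested": 1,
     "mem_used_gib": 8.0, "mem_total_gib": 64.0},
]


def summarize(nodes):
    """Aggregate node-level GPU records into cluster-level metrics."""
    mem_used = sum(n["mem_used_gib"] for n in nodes)
    mem_total = sum(n["mem_total_gib"] for n in nodes)
    return {
        # Cluster GPU Nodes: number of active nodes per GPU type.
        "nodes_by_type": dict(Counter(n["gpu_type"] for n in nodes)),
        # GPU Count: provisioned versus requested GPUs.
        "gpus_provisioned": sum(n["gpus"] for n in nodes),
        "gpus_requested": sum(n["gpus_requested"] for n in nodes),
        # GPU Memory Usage: fraction of total GPU memory in use.
        "memory_utilization": mem_used / mem_total if mem_total else 0.0,
    }


summary = summarize(nodes)
print(summary["nodes_by_type"])     # {'A100': 2, 'T4': 1}
print(summary["gpus_provisioned"])  # 20
print(summary["gpus_requested"])    # 15
```

In this sample data, a wide gap between provisioned and requested GPUs on the T4 node is the kind of signal that would suggest underused capacity.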

Reach out to us at: https://www.truefoundry.com/book-demo
