Redis's posts

Redis reposted this

Manvinder Singh

VP of AI Product Management, Redis. | Ex-Google AI / Cloud | McKinsey, Kellogg & IIT alum

One major trend I see in the world of Enterprise GenAI is the rapid adoption of centralized "Gateway" strategies for AI Inference. While at first glance this may feel like the API Gateway projects from 5-10 years ago, building a gateway that can analyze and take action based on LLM prompts while being on the "hot-path" is a different ball game. It requires:

- Developing a semantic understanding of the prompts with very low-latency vector searches.
- Low-latency operations across a wide variety of services. You need a fast but versatile database for this that can support everything from rate-limiting to storing masked PII data.
- Unique capabilities like semantic caching and semantic routing to optimize AI inference.

Here is a quick overview of AI Gateways and how to enhance your gateway using Redis: https://lnkd.in/gGQW_Vnf
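To make the semantic caching idea above concrete, here is a minimal sketch, not the approach described in the linked overview. It assumes a toy bag-of-words `embed` function in place of a real embedding model and an in-memory list in place of a vector index such as the one Redis provides. The names `SemanticCache`, `answer`, and `call_llm` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Returns a stored response when a new prompt is close enough to an old one."""

    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold  # assumed cutoff; tune per workload
        self.entries = []           # list of (vector, response) pairs

    def get(self, prompt: str):
        query = embed(prompt)
        best_score, best_response = 0.0, None
        # Linear scan for clarity; a real gateway would use a vector index.
        for vector, response in self.entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def put(self, prompt: str, response: str) -> None:
        self.entries.append((embed(prompt), response))


def answer(prompt: str, cache: SemanticCache, call_llm) -> str:
    """Serve from the cache on a near-match, otherwise call the model and store."""
    cached = cache.get(prompt)
    if cached is not None:
        return cached
    response = call_llm(prompt)
    cache.put(prompt, response)
    return response
```

The point of the sketch is that a cache hit skips the model call entirely, which is where the latency and cost savings come from. The similarity threshold trades hit rate against the risk of returning an answer to a different question.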

Nagesh Nama

CEO at xLM | Transforming Life Sciences with AI & ML | Pioneer in GxP Continuous Validation |

4 months ago

I believe that the success of these gateways hinges on their ability to seamlessly integrate with existing infrastructure and services. This requires a deep understanding of the underlying systems and the ability to optimize performance without disrupting existing workflows. Additionally, I believe that AI gateways should be designed with scalability in mind, as the demands of AI inference will only continue to grow in the coming years. Ultimately, the success of AI gateways will depend on their ability to deliver fast, reliable, and cost-effective AI inference at scale.

Franck Benichou, M.A., M.Sc.

AI Engineer & Technical Owner | Senior Generative AI Specialist @ Deloitte | Chat | Search | Agentic AI | Gen AI at Scale

4 months ago
Sangam Pandey

Tech Leadership | Engineering Manager | Leading Next-Gen AI & Web Apps | Generative AI & RAG Expert

4 months ago

Excellent breakdown of the AI Gateway paradigm, Manvinder Singh! What resonates is your emphasis on semantic understanding and low-latency requirements. These gateways aren't just passing requests through; they are doing sophisticated prompt analysis, personally identifiable information (PII) detection, and semantic routing in real time. Handling these operations without introducing significant latency overhead is a critical engineering challenge. Really looking forward to seeing where this space goes.
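As a rough illustration of the hot-path steps mentioned in this comment, the sketch below masks PII and then picks a route. It is written under loud assumptions: the regexes cover only simple email and phone formats, and the keyword-overlap `route` function is a plain stand-in for embedding-based semantic routing. The route names and keyword sets are hypothetical.

```python
import re

# Deliberately simple patterns; real PII detection needs far broader coverage.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")

# Hypothetical routes, each described by a few keywords.
ROUTES = {
    "code-model": {"python", "function", "bug", "code", "compile"},
    "billing-model": {"invoice", "refund", "payment", "charge"},
}


def mask_pii(prompt: str) -> str:
    """Replace emails and phone numbers with placeholders before forwarding."""
    prompt = EMAIL.sub("<EMAIL>", prompt)
    return PHONE.sub("<PHONE>", prompt)


def route(prompt: str, default: str = "general-model") -> str:
    """Pick the route whose keywords overlap most with the prompt.

    Keyword overlap stands in for a vector similarity search here.
    """
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    best_route, best_overlap = default, 0
    for name, keywords in ROUTES.items():
        overlap = len(words & keywords)
        if overlap > best_overlap:
            best_route, best_overlap = name, overlap
    return best_route


def handle(prompt: str):
    """Mask first, then route, so downstream services never see raw PII."""
    masked = mask_pii(prompt)
    return route(masked), masked
```

Masking before routing is a design choice in this sketch: every later step, including any cache or log, then works only with the masked prompt.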

Salman Paracha

Building Arch | Intelligent Infrastructure for GenAI | ex-AWS/Oracle

4 months ago

Manvinder Singh, you may want to give Arch a look. It is a uniquely intelligent gateway built by the contributors of Envoy: https://github.com/katanemo/arch. There are some additional capabilities unified in Arch that might be helpful to customers, and we would love to partner with Redis.

Should be a key part of a model factory/garden particularly on the serving layer?

Nicos Kekchidis

AI Engineering | Team Builder | Mentor and Advisor

4 months ago

Good, comprehensive overview of making LLM-centric things work in the enterprise. Now, further elevating mixture-of-experts to upper layers for actionable decision making introduces:
- computational inefficiencies, and
- prohibitive cost leaks before acceptable outcomes are converged on.

Technicalities divorced from:
- what are you solving?
- what are the cost implications of deploying this non-trivial and unproven machinery with a short shelf life?

are a sure path to looming disaster, employee demoralization, and a revolving door in the enterprise.

Henry Tam

Head of Growth Marketing at SkySQL, GTM for Enterprise Software - AI, Database, Networking, Security| Technical Product Marketer ex-Redis, ex-F5 Networks, ex-HP, ex-Dell

4 months ago

One more AI Gateway to add to your list from my former company. I hope they used Redis for their semantic cache. https://www.dhirubhai.net/posts/shawnwormke_ai-llm-aiops-activity-7262890664512626688-ZyAz?utm_source=share&utm_medium=member_ios

Lindsey LeFloch

Unlocking Data Potential for Enhanced Competitive Edge

4 months ago

This is what Informatica does! :)
