#34: Year-end reflection: TrueFoundry

Year-end reflection on our thesis on MLOps

It’s time to reflect on TrueFoundry’s journey over the past year. This reflection isn’t just a celebration of our achievements but also an acknowledgment of the challenges we’ve navigated, an appreciation of the opportunities we’ve been presented with, and the learnings we’ve embraced.

This blog covers the chronological journey of learnings and realizations anchored in our thesis on MLOps, and how things played out in reality. It covers:

• GTM experiments to run based on our learnings from working with design partners.

• The hypotheses we validated through customer and prospecting calls.

• The uniformity of LLMOps, MLOps & DevOps.


Read the full blog


Enterprise GenAI and LLMOps with Labhesh Patel (ex-CTO, Jumio Corporation)

In this video, we host Labhesh Patel, ex-CTO at Jumio Corporation, to talk about his stint as CTO at Jumio and the following topics:

• Challenges related to data management, data quality, and the crucial role of data in machine learning pipelines.

• Generative AI and Visual Q&A, including the use of segmentation maps and attention mechanisms in image-related tasks.

• Labhesh's extensive portfolio of more than 250 research papers and patents.

• Overcoming roadblocks post-implementation with a specific cloud provider.

• Generative AI applications in the identity-verification industry.

• Hiring challenges and skill-set disparities in ML teams.

• Small Language Models (SLMs) vs. Large Language Models (LLMs).

• Transitioning from very large models to very small models, considering factors like simplicity, efficiency, and latency.


Read the full blog here


Handpicked Resources on MLOps & LLMs

Below are summaries of some informative conversations from the most popular MLOps communities, along with research papers:

Avoiding the meltdown of a vector DB

Summary: Postgres can act as a vector database using the open-source pgvector extension. A useful pattern when using Postgres as a vector DB is to create a partial index covering only recent records. For example, we can create a partial index for records less than 7 days old.
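A minimal sketch of that pattern (table and column names are illustrative, not from the conversation). One caveat: Postgres requires partial-index predicates to use immutable expressions, so instead of writing `now() - interval '7 days'` directly, the usual workaround is a fixed cutoff date with the index re-created periodically:

```sql
-- Enable pgvector (it ships as the "vector" extension).
CREATE EXTENSION IF NOT EXISTS vector;

-- Illustrative table: 384-dim embeddings plus a creation timestamp.
CREATE TABLE documents (
    id         bigserial PRIMARY KEY,
    embedding  vector(384),
    created_at timestamptz NOT NULL DEFAULT now()
);

-- Partial ANN index over recent rows only. The predicate must be
-- immutable, so we hard-code a cutoff and rebuild the index on a
-- schedule (e.g. a nightly job) to keep the window rolling.
CREATE INDEX documents_recent_embedding_idx
    ON documents USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100)
    WHERE created_at >= '2024-01-01';
```

Queries that filter on the same `created_at` range can then use this much smaller index instead of scanning an index over the full table.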

Read the full conversation

Prototyping with LLMs on AWS

Summary: To reduce inference times for Hugging Face models, you can use tools like vLLM or DeepSpeed. If you are handling concurrent requests, you can provision more GPU cards to support faster inference, but this eventually hits diminishing returns due to communication overheads. At that point, you can put your inference servers behind a load balancer and distribute requests using any load-balancing algorithm; throughput should then scale roughly linearly with the number of GPUs.
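To make the scaling idea concrete, here is a toy round-robin sketch in Python. The replica URLs are placeholders (not from the conversation), and a real deployment would use a proper load balancer such as an ALB or Envoy rather than application code:

```python
import itertools

# Hypothetical pool of identical inference-server replicas.
REPLICAS = [
    "http://gpu-node-0:8000",
    "http://gpu-node-1:8000",
    "http://gpu-node-2:8000",
]

_cycle = itertools.cycle(REPLICAS)

def pick_replica() -> str:
    """Round-robin selection: each call returns the next replica in the pool."""
    return next(_cycle)

# Six incoming requests cycle through the three replicas twice,
# so load (and therefore throughput) spreads evenly across GPUs.
targets = [pick_replica() for _ in range(6)]
```

Because each replica serves an equal share of requests, adding a fourth identical node increases aggregate throughput by roughly the same amount as the first three, which is the near-linear scaling the summary describes.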

Read the full conversation

Retrieval-Augmented Generation for LLMs: A Survey

Summary: This survey paper explores the state of RAG systems in detail. It discusses the modules that can be added to a RAG pipeline: advanced data processing, various indexing techniques, and multiple, iterative, or hierarchical retrieval. It also covers processing corpora to obtain the best semantic representation, matching the semantic representation of the query to that of the retrieved data, and post-processing techniques for retrieved documents.
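The core retrieval step that all of these modules build on can be sketched in a few lines. The corpus and its 3-dimensional "embeddings" below are toy placeholders standing in for a real encoder and vector index:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy corpus: document id -> made-up embedding vector.
corpus = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}

def retrieve(query_vec, k=2):
    """Rank documents by cosine similarity to the query; return the top k ids."""
    ranked = sorted(corpus, key=lambda d: cosine(query_vec, corpus[d]), reverse=True)
    return ranked[:k]
```

The survey's advanced modules slot in around this loop: data processing and indexing improve what goes into `corpus`, while iterative or hierarchical retrieval and post-processing refine what `retrieve` returns before it reaches the LLM.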

Read the full conversation


That's all for today!

A brief about TrueFoundry!

Just as a reminder for the new members of our community: TrueFoundry is a comprehensive ML/LLM deployment PaaS, empowering ML teams in enterprises to test and deploy ML models and LLMs with ease while ensuring the following benefits:

• Full security for the infra team.

• 40% lower costs through resource management.

• 90% faster deployments with SRE best practices.

For LLM/GPT-style model deployment, we allow users to select pre-configured models from our catalog and fine-tune them on their own datasets.


