TrueFoundry Newsletter #16: Deploying ChatGPT Plugins?
I hope you enjoyed your long weekend. Here is what we have for you in this week's newsletter:
- Next blog in the LLM series: All about ChatGPT plugins
- Latest in the True ML Talks video series
- Noteworthy MLOps Community Slack discussions
Understanding ChatGPT plugins
Plugins are add-ons that extend the capabilities of ChatGPT. They allow users to access up-to-date information, run computations, or interact with third-party services in response to a user's request.
In this blog, we will explain how ChatGPT Plugins work, explore some of the existing plugins and deploy a sample plugin that can search a vector DB like Pinecone and return relevant documents.
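Every ChatGPT plugin starts with a manifest file that tells ChatGPT what the plugin does and where to find its API spec. As a taste of what the blog walks through, here is a minimal sketch of an `ai-plugin.json` manifest for a document-search plugin; the names, URLs, and email are placeholders, not a real deployment:

```json
{
  "schema_version": "v1",
  "name_for_human": "Docs Search",
  "name_for_model": "docs_search",
  "description_for_human": "Search our document store for relevant passages.",
  "description_for_model": "Retrieve documents relevant to the user's query from a vector database.",
  "auth": { "type": "none" },
  "api": {
    "type": "openapi",
    "url": "https://example.com/openapi.yaml"
  },
  "logo_url": "https://example.com/logo.png",
  "contact_email": "support@example.com",
  "legal_info_url": "https://example.com/legal"
}
```

The `description_for_model` field is what ChatGPT reads to decide when to call the plugin, so it deserves as much care as the API itself.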
Read the previous blogs in our LLM series:
LLMs, LLMs Everywhere: Exploring their applications!
Fine Tuning: OpenAI Models + Your Confluence Data
True ML Talks: ML Workflow @ Salesforce
In this video, Arpeet Kale, who was part of the engineering team at Salesforce that built the entire ML platform, shares his insights on:
- Machine Learning use cases, team structure & infrastructure overview
- Prototyping ML models at Salesforce
- Managing costs for large-scale ML projects in the cloud
- Building a multi-tenant real-time prediction service
- Security and reliability measures in the Salesforce AI platform
- ML infrastructure platform vs software deployment platform
MLOps Community Slack discussions
Below is a summary of some informative conversations from the most popular MLOps community:
MLflow: Cache intermediate data between steps?
Summary: While tracking ML experiments, MLflow can log metadata for reproducibility, but it does not provide a way to snapshot the data itself. If you have a mechanism to version your data, you can record the specific data version used in an experiment as metadata with MLflow. For example, with Delta Lake's Time Travel capability, you can automatically version big data and log the version number with the experiment run. TrueFoundry is also working on expanding our mlfoundry library to log dataset versions as well.
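If you are not on Delta Lake, a lightweight way to get a reproducible data version is to content-hash the dataset file and log that hash as run metadata. A minimal sketch (the file path and the MLflow call in the comment are illustrative, not a prescribed API usage):

```python
import hashlib

def dataset_version(path: str) -> str:
    """Content-hash a dataset file to produce a reproducible version id."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Stream in chunks so large datasets don't need to fit in memory.
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()[:12]

# With MLflow, you could then attach this to the experiment run, e.g.:
#   mlflow.log_param("data_version", dataset_version("train.parquet"))
# With Delta Lake, you would instead log the table version from Time Travel.
```

Because the version is derived from the bytes of the data, re-running an experiment with the same logged hash guarantees you are looking at the same dataset.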
DynamoDB and Postgres as low-latency feature store options
Summary: Feature stores need to be performant, so AWS offerings like DynamoDB (NoSQL) and Aurora (relational) are reasonable options. DynamoDB is especially quick for lookups that query by the primary partition key (say, one user) but suffers when doing batch queries or scans. With DynamoDB, it is important to get the data modelling right, and it can get expensive. Aurora is performant enough for most real-time application use cases and also provides a way to offload data to S3 in Parquet format without affecting query times.
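The lookup-vs-scan gap comes down to data structures. A toy model of a DynamoDB-style table (the key scheme and attributes here are made up for illustration): fetching one partition key is a single hash lookup, while filtering on any non-key attribute must touch every item.

```python
# Toy model of a key-value table hashed by partition key.
table = {
    f"user#{i}": {"user_id": f"user#{i}", "country": "US" if i % 2 else "IN"}
    for i in range(100_000)
}

def get_item(partition_key: str) -> dict:
    """O(1): a direct hash lookup -- the fast path DynamoDB is built for."""
    return table[partition_key]

def scan(attr: str, value: str) -> list:
    """O(n): examines every item -- the pattern that gets slow and costly."""
    return [item for item in table.values() if item[attr] == value]
```

`get_item("user#42")` returns one record immediately regardless of table size; `scan("country", "IN")` walks all 100,000 items, which is why scan-heavy access patterns in DynamoDB are both slow and expensive, and why data modelling around the partition key matters so much.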
PyTorch models to TensorRT for deployment using Nvidia Triton
Summary: PyTorch models can be compiled to TensorRT for serving on NVIDIA Triton using the torch_tensorrt library, but this path does not support dynamic batching (the batch size is fixed). TorchInductor is a new compiler for PyTorch that is able to represent all of PyTorch and is built in a general way, so it will be able to support training and multiple backend targets.
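For context on what the fixed-batch-size limitation costs you: Triton's server-side dynamic batcher is enabled per model in its `config.pbtxt`, and a backend must accept variable batch sizes to benefit from it. A sketch of such a config (the model name and batch sizes are illustrative):

```protobuf
name: "my_model"
platform: "tensorrt_plan"
max_batch_size: 8
dynamic_batching {
  # Coalesce concurrent requests into preferred batch sizes,
  # waiting at most 100us for more requests to arrive.
  preferred_batch_size: [ 4, 8 ]
  max_queue_delay_microseconds: 100
}
```

An engine compiled for a single fixed batch size cannot take advantage of this stanza, since Triton has no freedom to vary the batch it submits.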
We are trying to make this newsletter worthy of the space in your inbox. Let us know your feedback.
Loved it | Good | Meh | Bad
Like our newsletter? Spread the love to friends and colleagues.
With ❤️ by TrueFoundry