How to Expand an LLM's Memory


In Today's Summary:

  • Repo Highlight: MemGPT
  • Trending Repos: litellm, 4DGaussians
  • PyTorch Tip: ONNX
  • Trending Models: MistralLite, SSD-1B
  • Python Tip: set()

Reading time: 3 min 29 sec


MemGPT: Transforming LLMs into Memory Managers


What’s New

MemGPT expands the memory capacity of language models. It uses a tiered memory system to help the model manage more text, improving performance in long chats and big document analysis.

Why Does It Matter

Current LLMs are limited by how much they can “remember” at once. This can hinder performance for tasks like document analysis and multi-session chats. MemGPT enables LLMs to efficiently handle extended conversations or analyze bigger documents without forgetting details.

How it Works

MemGPT borrows its design from computer operating systems. It gives the LLM a virtual memory space, similar to how a computer combines RAM and hard drives. The model keeps the most relevant data in its quick-access main context and moves everything else to external storage, from which it can be retrieved when needed.
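
To make the RAM-and-disk analogy concrete, here is a minimal sketch of a two-tier memory. It is not MemGPT's actual API: the `TieredMemory` class, its methods, and the whitespace-based token count are all hypothetical, and eviction follows a fixed oldest-first rule, whereas in MemGPT the LLM itself decides what to move.

```python
from collections import deque
from typing import List


class TieredMemory:
    """Toy two-tier memory: a bounded main context plus external storage.

    Hypothetical illustration only; this is not MemGPT's interface.
    """

    def __init__(self, max_context_tokens: int):
        self.max_context_tokens = max_context_tokens
        self.main_context = deque()  # fast tier, comparable to RAM
        self.external = []           # slow tier, comparable to disk

    @staticmethod
    def count_tokens(text: str) -> int:
        # Assumption: whitespace-separated words stand in for real tokens.
        return len(text.split())

    def context_size(self) -> int:
        return sum(self.count_tokens(m) for m in self.main_context)

    def add(self, message: str) -> None:
        self.main_context.append(message)
        # Move the oldest messages to external storage until the budget fits.
        while (self.context_size() > self.max_context_tokens
               and len(self.main_context) > 1):
            self.external.append(self.main_context.popleft())

    def recall(self, keyword: str) -> List[str]:
        # Naive keyword search over the messages that were moved out.
        return [m for m in self.external if keyword.lower() in m.lower()]


memory = TieredMemory(max_context_tokens=8)
memory.add("My dog is named Biscuit")
memory.add("I live in Lisbon")
memory.add("What should I cook tonight")

print(list(memory.main_context))  # ['What should I cook tonight']
print(memory.recall("dog"))       # ['My dog is named Biscuit']
```

The older messages no longer fit in the main context, but they are not lost: a lookup against external storage brings them back when they become relevant again.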

Features

  • Extended Memory: Mimics computer memory systems to give LLMs a larger "memory space".
  • Self-Regulating: The LLM can decide how to manage and transfer its data.
  • Broad Use Cases: Useful for longer conversations and larger documents, and compatible with a wide range of LLMs.



TRENDING REPOS

BerriAI / litellm (☆ 2k)

Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)


hustvl / 4DGaussians (☆ 800)

4D Gaussian Splatting (4D-GS) is a method for rendering dynamic scenes in real time. It offers low storage requirements, fast training, and high-quality, high-resolution output.


spdustin / ChatGPT-AutoExpert (☆ 4k)

AutoExpert offers an effective set of custom instructions designed to improve the performance of GPT-4 and GPT-3.5-Turbo, optimizing responses for depth and context.


thuml / Time-Series-Library (☆ 2k)

TSlib is an open-source library for creating and evaluating deep time series models. The library covers five key tasks: long-term forecasting, short-term forecasting, imputation, anomaly detection, and classification.


dennybritz / reinforcement-learning (☆ 19k)

Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, TensorFlow. Exercises and Solutions to accompany Sutton's Book and David Silver's course.


PYTORCH TIP

ONNX

Open Neural Network Exchange (ONNX) provides an open-source format for deep learning models, allowing interchangeability between various deep learning frameworks. PyTorch's integration with ONNX enables developers to move models between different platforms with ease, optimizing for inference and deployment.

When To Use

  • Interoperability: When you need to use or deploy a PyTorch model in a different framework or platform.
  • Optimized Inference: To leverage platform-specific optimizations for faster inference.

Benefits

  • Flexibility: Transfer models between various deep learning frameworks without being locked into one.
  • Ease of Deployment: Facilitate deployment on cloud platforms and edge devices that support ONNX.


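# PyTorch to ONNX
import torch
import torch.onnx
import torchvision.models as models

# `weights=` replaces the deprecated `pretrained=True` (torchvision >= 0.13)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()  # switch to inference mode before exporting
x = torch.randn(1, 3, 224, 224)  # dummy input used to trace the model
torch.onnx.export(model, x, "resnet18.onnx")

# ONNX Runtime for inference
import onnxruntime

# Naming the execution provider explicitly avoids errors on GPU-enabled builds
session = onnxruntime.InferenceSession(
    "resnet18.onnx", providers=["CPUExecutionProvider"]
)
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name

result = session.run([output_name], {input_name: x.numpy()})

# result is a list holding one NumPy array with the inference output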


TRENDING MODELS/SPACES

amazon/MistralLite

MistralLite is a fine-tuned version of Mistral-7B-v0.1 that handles extended contexts of up to 32K tokens. By using an adapted Rotary Embedding and a sliding window, it improves on the base model in long-context tasks like summarization and question answering.


LP-Music-Caps-demo

A project that generates descriptive captions for music using two approaches: transforming music tags into captions with OpenAI's GPT-3.5 Turbo API, and directly translating music audio to captions using a trained cross-modal encoder-decoder model.


SimianLuo/LCM_Dreamshaper_v7

Latent Consistency Models (LCMs) offer rapid, high-resolution image synthesis by predicting solutions in the latent space, reducing the need for extensive iterative sampling. LCMs deliver top-tier text-to-image generation performance in fewer steps and lower latency than other diffusion models.


PYTHON TIP

Set Collection

The `set` data type in Python is an unordered collection of unique elements, and it is well suited to membership checks. When you have a large dataset and frequently need to verify whether an item exists in it, a `set` can be much faster than a list.

When To Use

  • Frequent Membership Queries: When you need to repeatedly check for the existence of elements in the same collection.
  • Data Deduplication: When you need to eliminate duplicate entries from a collection.

Benefits

  • Speed: Membership tests on a set are very fast, with O(1) average-case time complexity, compared with O(n) for a list.
  • Simplicity: Easily convert a list to a set and vice versa.
  • Uniqueness: By design, sets don't allow duplicate entries.

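my_list = [1, 2, 2, 2, 2, 3, 5]

# Convert to set
my_set = set(my_list)

# Output (it removed duplicates)
print(my_set)  # {1, 2, 3, 5}

# Compare membership tests on a larger list and set
import timeit

big_list = list(range(100_000))
big_set = set(big_list)

print(timeit.timeit(lambda: 99_999 in big_list, number=100))
print(timeit.timeit(lambda: 99_999 in big_set, number=100))

# The set lookup should be far faster; exact timings depend on your machine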
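
Two points about deduplication are worth a short sketch. Converting to a set does not guarantee the original order, so when order matters, `dict.fromkeys()` is a common alternative (dictionaries preserve insertion order in Python 3.7+). Sets also support intersection and difference, which is handy for comparing collections. The sample data below is made up for illustration.

```python
events = ["login", "view", "login", "purchase", "view", "logout"]

# set() removes duplicates but does not guarantee the original order
unique_unordered = set(events)

# dict.fromkeys() keeps the first occurrence of each item, in order
unique_ordered = list(dict.fromkeys(events))
print(unique_ordered)  # ['login', 'view', 'purchase', 'logout']

# Sets also support fast comparisons between collections
seen_today = {"alice", "bob", "carol"}
seen_yesterday = {"bob", "dave"}

print(sorted(seen_today & seen_yesterday))  # ['bob']
print(sorted(seen_today - seen_yesterday))  # ['alice', 'carol']
```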

Thank You
