what is memGPT?
michael raspuzzi
building worldwide studios | no code ai agents sprint (march 24-april 4)
making large language models better with a layer of virtual memory management
computer operating systems have memory management that enables new functions beyond what physical memory alone can do. a virtual layer enables multitasking, caching, and protection against malicious applications. without the operating system managing memory, computer processing would be severely limited.
right now large language models (LLMs) have that limitation. they have limited context windows, where there's only so much input for so much output, as well as short term memory loss. talking to chatGPT is like talking to dory from finding nemo: every conversation, you have to remind it what its role is and what its goal is. to keep swimming, it needs to know where you are swimming towards.
LLMs can be programmed to maintain certain roles and functions across a series of prompts and tasks. this is what enables a new virtual layer for memory management, the first big step toward making LLMs work like an operating system that unlocks new functions.
memGPT gives LLMs memory management
a team of researchers at berkeley created memGPT, which acts as a memory manager for large language models. this enables long term memory retrieval and writing, and it bypasses the context window input limit.
memGPT augments LLMs with a hierarchical memory system and functions that let the model manage its own memory. the LLM processes main context (like RAM in an OS) as input, and its output text is parsed as either a yield or a function call.
these functions let memGPT move data between main context and external context (like disk storage in an OS). when the LLM processor generates a function call, it can chain together a series of functions, like searching a database and then sending a message.
this process enables long term memory storage as well as an ability to figure things out, like rewriting its own memory when it gets corrected by a user message or by data in a document.
how memGPT works (going left to right from the above diagram)
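the loop described above can be sketched in a few lines. this is a minimal illustration, not the paper's actual implementation: the function names (archival_insert, archival_search) and the FUNC: output format are made-up stand-ins for the idea of parsing LLM output as either a yield or a function call.

```python
# a toy sketch of the memGPT processing loop. function names and the
# "FUNC:" output convention are illustrative assumptions, not the real API.

main_context = []      # what the LLM processor reads each step (like RAM)
archival_storage = []  # external context (like disk in an OS)

def archival_insert(text):
    """move data out of main context into external storage."""
    archival_storage.append(text)

def archival_search(query):
    """pull matching records back toward main context."""
    return [t for t in archival_storage if query in t]

def process(llm_output):
    """parse LLM output as either a yield or a function call."""
    if llm_output.startswith("FUNC:"):
        name, _, arg = llm_output[5:].partition(" ")
        if name == "archival_insert":
            archival_insert(arg)
        elif name == "archival_search":
            main_context.extend(archival_search(arg))
        return None          # control returns to the LLM, so calls can chain
    return llm_output        # a yield: the message goes to the user

process("FUNC:archival_insert brad prefers short replies")
process("FUNC:archival_search brad")
reply = process("got it, brad!")
```

the key design idea is that a function call returns control to the model instead of the user, which is what lets memGPT chain a database search into a sent message.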
testing memGPT in analyzing documents and long form chat
the team tested memGPT on two main use cases: 1) document analysis and 2) long form chat conversations.
for analyzing documents, previous LLMs have token limits on how much can be processed in one call, which limits the kinds of documents that can be handled. for example, open ai's gpt-4 has an 8192 token limit. stephen king's best selling novel, the shining, has around 150,000 words, which is roughly 200,000 tokens. it would take 25 context windows (or prompts) to feed gpt-4 stephen king's novel.
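the arithmetic behind those numbers, assuming the common rough ratio of about 0.75 words per token:

```python
import math

words = 150_000                  # approximate length of the shining
tokens = int(words / 0.75)       # ~200,000 tokens at ~0.75 words per token
context_window = 8192            # gpt-4's original token limit
windows = math.ceil(tokens / context_window)
print(windows)  # 25 prompts to feed the whole novel through
```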
using the main context to feed a single document can limit performance when scaled up.
memGPT is consistent at analyzing documents regardless of size, while gpt-4's accuracy decreases
in the first use case, they showed that memGPT performed consistently well in accuracy regardless of context length, or how much text was used in the query.
and what is more interesting is memGPT's ability to do the nested key-value retrieval task. see below.
in this example, memGPT continuously searches archival memory until it finds the latest key. if archival memory reveals that the current value is itself another key, it starts the search again to find that key's pair. once it finds the final value, it returns the message to the user. alongside search, it can also self edit its memory.
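the nested key-value task boils down to following a chain of lookups until a value is no longer a key. here is a sketch with a made-up dataset; the real task queries archival memory rather than a local dict:

```python
# toy archival memory for the nested key-value task (values may be keys)
archival_memory = {
    "k1": "k2",    # value is itself a key -> search again
    "k2": "k3",
    "k3": "blue",  # not a key -> this is the final value
}

def nested_lookup(start_key):
    value = archival_memory[start_key]
    while value in archival_memory:   # current value is still a key
        value = archival_memory[value]
    return value                      # final value, returned to the user

print(nested_lookup("k1"))  # blue
```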
memGPT enables self correcting memory and long term retrieval
notice in the above conversation, the chat bot made the mistake of saying 'hi chad,' and the user corrected it to say their name is 'brad.' highlighted in red, memGPT is able to edit its memory of brad's first name to both reply back instantaneously and remember it long term.
this makes the conversation more natural as well as more useful.
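that self-edit can be sketched as a memory-rewrite function the model calls when corrected. the name core_memory_replace mirrors the idea in the paper, but the exact signature here is an assumption for illustration:

```python
# core memory is the persistent profile the model keeps about the user
core_memory = {"human": "first name: chad"}

def core_memory_replace(section, old, new):
    """rewrite part of core memory in place, so the fix persists long term."""
    core_memory[section] = core_memory[section].replace(old, new)

# user: "my name is brad, not chad" -> the model calls:
core_memory_replace("human", "chad", "brad")
print(core_memory["human"])  # first name: brad
```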
ideas to apply memGPT
memGPT enables LLMs to have the ability to read long documents, search different archived datasets, and remember a user over a long history of chat.
understanding what new functionality is unlocked with LLMs: these three capabilities enable a new generation of smart assistants and co-pilots that create a more robust and richer user experience.
alongside the obvious example of a better customer support chat bot that references a company's wiki and ticket log, there are plenty of other ways to use memGPT.
the use case i'm most interested in: how can a large generative model be built for specific use cases in educating and supporting better learning environments?
rather than being limited to one textbook or one tutor's knowledge graph of a subject like physics, what if you could have access to a knowledge graph trained on the top 100 textbooks throughout time, one that personalizes conversation because it knows the student's journey and can customize curriculum for them?
while LLMs by themselves are good general tutors, memGPT could super power that with longer term memory for better reference and a better chat experience.
while one point against this is that it takes away the human role of either writing text books or teaching, i actually think it has the potential to have the opposite effect:
ultimately, this becomes a new kind of textbook resource for school. it’s a compression of knowledge that students can interact with. the main difference: they can really interact with it. ideally this unlocks more learning, less schooling.
it's crazy that in 2023, while schooling is synonymous with 'learning,' learning based outcomes are not guaranteed with more schooling.
smart tutors with memGPT will help bridge this gap in accessibility and quality.
memGPT, while impressive in the moment, may be easily forgotten
this seems similar to the time right after the iphone was released. there was a community that would jailbreak the hardware to enable new functionality based on what they wanted to use it for v. what apple thought users might want.
for example, the iphone 4, released in june 2010, was the first to have a camera flash. by sept 2010 there was already a jailbroken app that used that flash as a standalone flashlight. it took apple until iOS 7 in 2013, three years later, to ship built in flashlight controls…
memGPT seems to be that jailbroken app, enabling longer term memory, which is a short term problem that may eventually be updated.
with open ai releasing chatGPT less than a year ago, we have not really seen many product features or releases yet. it's been more so integrated into other products like github copilot or notion's ai assistant.
overall, the first year seems to be one large human-in-the-loop reinforcement learning environment, both to see how humans will interface with ai and to fine tune the output.
it will be interesting to see how long before open ai ships core memory management features as it explores new physical interfaces (an iphone without the phone?), and what it shares at its first developer conference in november.
hey, i'm michael and i'm exploring the intersection of ai and alt education. the next generation of innovators deserves next generation tooling. connect with me on linkedin to follow my build journey.