Towards a responsible "Her": A holistic evaluation of personal AI companions with long-term memory (Part 1)

The remarkable progress of Large Language Models (LLMs) such as ChatGPT from OpenAI, Claude from Anthropic, and Gemini from Google has enabled human-like interactions through conversational interfaces. An active area of research is long-term memory (LTM), which allows these models to maintain context over extended periods and sessions, continuously learn about the user and their preferences, and effectively retrieve relevant information.

One application area of LTM capabilities with increasing traction is personal (or personalized) AI companions and assistants. With the ability to retain and contextualize past interactions and adapt to user preferences, personal AI companions and assistants promise a profound shift in how we interact with AI and are on track to become indispensable in personal and professional settings. However, this advancement introduces new challenges and vulnerabilities that require careful consideration regarding the deployment and widespread use of these systems.

The goal of this two-part series is to explore the broader implications of building and deploying personal AI applications with LTM capabilities using a holistic evaluation approach (Spector et al., 2022). Part 1 will review the technological underpinnings of LTM in LLMs and briefly survey currently available personal AI companions and assistants. Part 2 will explore critical considerations when designing, deploying, and using these applications as well as broader societal implications.

Long-term memory mechanisms in AI

The evolution of LTM mechanisms in artificial intelligence has progressed from early symbolic systems to the sophisticated capabilities of contemporary LLMs. Traditional AI relied on symbolic methods such as knowledge bases and rule-based systems, which stored and retrieved static information but lacked dynamic adaptability (Russell & Norvig, 2016). As AI research advanced, neural network models emerged, showing promise in learning from data and generalizing to new situations. However, early neural models relied primarily on short-term memory and struggled to maintain long-term context or adapt to user preferences over extended interactions. The introduction of Long Short-Term Memory (LSTM) networks and attention mechanisms partly addressed this issue. LSTMs were designed to tackle the vanishing gradient problem in earlier RNNs, allowing networks to retain information over longer periods (Hochreiter & Schmidhuber, 1997). Attention mechanisms, particularly through the Transformer architecture, further improved context handling by enabling selective focus on the most relevant parts of the input, enhancing performance on long sequences (Vaswani et al., 2017).

The advent of LLMs, like OpenAI's GPT series and Google's BERT, significantly advanced natural language processing. These models excel at translation, summarization, and text generation by leveraging large datasets and complex neural networks to produce coherent, context-aware outputs (Brown et al., 2020; Devlin et al., 2019). Despite these advances, LLMs still struggle to maintain context and adapt over extended interactions: attention mechanisms carry high computational costs, information retention across long inputs is uneven, and training data can introduce biases. LLM memory also lacks the depth and contextual recall of human long-term memory, which is crucial for applications like personalized recommendations or adaptive learning.
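
To make the computational-cost limitation concrete, below is a minimal NumPy sketch of vanilla scaled dot-product attention: the n × n score matrix is the reason compute and memory grow quadratically with sequence length, which is what the techniques listed below try to mitigate. The matrix sizes and random inputs are illustrative only.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Vanilla attention: every token attends to every other token."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # (n, n) score matrix
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # (n, d) context-aware output

# Illustrative sizes: doubling the sequence length quadruples the score matrix.
n, d = 1024, 64                                      # sequence length, head dimension
Q = K = V = np.random.rand(n, d)
out = scaled_dot_product_attention(Q, K, V)
print(out.shape, "score-matrix entries:", n * n)
```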

Key approaches to addressing these limitations to improve the LTM capabilities of LLMs include:

  1. Increased context length: Techniques such as sparse attention mechanisms and quantization aim to extend the context window without a proportional increase in computational requirements (Wang et al., 2024).
  2. External knowledge bases: Retrieval-augmented generation (RAG) outsources memory functions to an external database, broadening the model's information access beyond its training data (Lewis et al., 2020).
  3. Additional memory layers: Memory-augmented neural networks (MANNs), such as Neural Turing Machines (NTMs) and Differentiable Neural Computers (DNCs), incorporate an external memory component for dynamic read and write operations (Graves et al., 2014; Graves et al., 2016).
  4. Integrated memory: MemoryBank, a long-term memory module, enables LLMs to store, recall, and update memories, adapting to user personalities by synthesizing past interactions and applying a forgetting schedule inspired by the Ebbinghaus Forgetting Curve (Zhong et al., 2023). A simplified sketch of retrieval combined with forgetting-curve decay is shown after this list.
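
To make these ideas concrete, here is a minimal Python sketch of a toy memory store that combines RAG-style similarity retrieval (approach 2) with an Ebbinghaus-inspired exponential decay in the spirit of MemoryBank (approach 4). The class name, scoring function, half-life constant, and fake_embed helper are illustrative assumptions, not the actual implementations described in the cited papers.

```python
import math
import time
import numpy as np

class SimpleMemoryStore:
    """Toy long-term memory: embed, store, retrieve, and decay memories.

    Illustrative only -- the embedding, scoring, and decay constants are
    placeholders, not the MemoryBank or RAG implementations themselves.
    """

    def __init__(self, embed_fn, half_life_hours=72.0):
        self.embed_fn = embed_fn               # any text -> vector function
        self.half_life = half_life_hours * 3600
        self.memories = []                     # list of (text, vector, timestamp)

    def add(self, text):
        self.memories.append((text, self.embed_fn(text), time.time()))

    def retrieve(self, query, top_k=3):
        q = self.embed_fn(query)
        now = time.time()
        scored = []
        for text, vec, ts in self.memories:
            similarity = float(np.dot(q, vec) /
                               (np.linalg.norm(q) * np.linalg.norm(vec) + 1e-9))
            # Ebbinghaus-style exponential decay: older memories fade.
            retention = math.exp(-(now - ts) / self.half_life)
            scored.append((similarity * retention, text))
        return [text for _, text in sorted(scored, reverse=True)[:top_k]]

# Usage sketch with a stand-in embedding (a real system would use an embedding model).
def fake_embed(text):
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.random(8)

store = SimpleMemoryStore(fake_embed)
store.add("User prefers morning meetings.")
store.add("User is allergic to peanuts.")
print(store.retrieve("schedule a meeting"))
```

In a production system, fake_embed would be replaced by a proper embedding model, and heavily decayed memories might be summarized or discarded rather than merely down-weighted.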

Case study: Personal AI companions

Integrating long-term memory in personal AI systems can significantly enhance their functionality by enabling them to continuously learn from past interactions and adapt to user preferences over time, providing a deeply personalized experience. These models can be very powerful. For instance, AI companions can offer social companionship, solace to individuals in isolation or long-term care, and digital therapy that evolves to meet users' psychological needs (Chaturvedi et al., 2023). They can also be trained as digital twins that serve as interactive avatars for celebrities, engaging fans in a personalized manner. Personal AI assistants can learn user preferences and manage tasks effectively with less human oversight.

The addition of LTM capabilities in LLMs opens doors to future innovations in user experience design. Moreover, the introduction of new modalities such as voice-based or video-based models could transform how users interact with AI. Interfaces can become more intuitive, allowing users to interact with these systems in increasingly natural and responsive ways.

Below is a brief survey of recent personal AI applications, as of May 2024. Note that these examples have been selected to be representative of each category and do not reflect the larger landscape of personal AI applications.

Brief taxonomy of personal AI applications, as of May 2024 (created by author)


AI Companions

  • Purpose: Provide emotional support, companionship, and personalized interactions.
  • SiliconFriend: Zhong et al. (2023) created SiliconFriend to evaluate MemoryBank's performance within LLMs. It focuses on long-term interactions, adaptability based on users' personalities, and meaningful emotional support through psychological data.
  • Replika: Designed for deeply personal interactions, Replika helps users understand themselves better through empathetic, non-task-oriented interactions. While not (yet) equipped with LTM, the model continuously evolves to mirror the user's personality and enhance emotional engagement (Replika, n.d.).
  • Personal.ai: This platform creates personalized digital twins using unique models called Personal Language Models (PLMs). These models, equipped with long-term memory, allow for various personas, from professional assistants to companions. The platform is able to integrate new memories quickly and give users control over their data and the model's learning process (Personal AI, n.d.).
  • Character.ai: Enables users to create and interact with customized AI characters, providing personalized and engaging conversational experiences. It allows users to craft unique personalities and narratives, fostering creative and interactive storytelling. The platform continually adapts to user interactions, enhancing the realism and depth of character engagements (Character.ai, n.d.).

AI Assistants

  • Purpose: Assist with tasks, increase productivity, and provide information.
  • Charlie Mnemonic: Touted as the first LLM-based personal assistant with long-term memory, Charlie Mnemonic uses GPT-4 to simulate human-like memory processes, offering personalized and enduring user interactions. It combines long-term memory, short-term memory (STM), and episodic memory to dynamically update memories and enable continuous learning without retraining (GoodAI, 2024).
  • Google Gemini: Formerly known as Bard, Google Gemini is an advanced AI assistant leveraging LTM for highly personalized experiences. It integrates across Google's ecosystem, adapting to user preferences to enhance functionality and engagement over time. It also has multimodal capabilities.
  • ChatGPT: While not designed primarily as a personalized companion, ChatGPT retains context within a session to provide coherent, contextually relevant responses, and since February 2024 it can remember (or forget) information across past conversations (Morris, 2024). It also offers impressive multimodal features (GPT-4o).

While LTM is applicable for both AI companions and assistants, AI companions' focus on long-term personalized interaction strongly motivates the integration of LTM.

Part 2 will provide a holistic evaluation and practical recommendations on designing and deploying personal AI applications, with a larger emphasis on AI companions.




References

Baddeley, A. (1992). Working Memory. Science, 255(5044), 556–559. https://doi.org/10.1126/science.1736359

Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D. M., Wu, J., Winter, C., … Amodei, D. (2020). Language Models are Few-Shot Learners (arXiv:2005.14165). arXiv. https://arxiv.org/abs/2005.14165

Hainsdorf, C., Hickman, T., Lorenz, S., & Rennie, J. (2023, December 14). Dawn of the EU's AI Act: Political agreement reached on world's first comprehensive horizontal AI regulation. White & Case LLP. https://www.whitecase.com/insight-alert/dawn-eus-ai-act-political-agreement-reached-worlds-first-comprehensive-horizontal-ai

Character.ai. (n.d.). Character.ai. Retrieved June 3, 2024, from https://character.ai

Chaturvedi, R., Verma, S., Das, R., & Dwivedi, Y. K. (2023). Social companionship with artificial intelligence: Recent trends and future avenues. Technological Forecasting and Social Change, 193, 122634. https://doi.org/10.1016/j.techfore.2023.122634

Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (arXiv:1810.04805). arXiv. https://doi.org/10.48550/arXiv.1810.04805

Personal AI. (n.d.). Differences between Personal Language Models and Large Language Models. Retrieved May 12, 2024, from https://www.personal.ai/plm-personal-and-large-language-models

GoodAI. (2024, March 1). Introducing Charlie Mnemonic: The First Personal Assistant with Long-Term Memory. GoodAI. https://www.goodai.com/introducing-charlie-mnemonic/

Graves, A., Wayne, G., & Danihelka, I. (2014). Neural Turing Machines (arXiv:1410.5401). arXiv. https://doi.org/10.48550/arXiv.1410.5401

Graves, A., Wayne, G., Reynolds, M., Harley, T., Danihelka, I., Grabska-Barwińska, A., Colmenarejo, S. G., Grefenstette, E., Ramalho, T., Agapiou, J., Badia, A. P., Hermann, K. M., Zwols, Y., Ostrovski, G., Cain, A., King, H., Summerfield, C., Blunsom, P., Kavukcuoglu, K., & Hassabis, D. (2016). Hybrid computing using a neural network with dynamic external memory. Nature, 538(7626), 471–476. https://doi.org/10.1038/nature20101

Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation, 9(8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735

Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., Küttler, H., Lewis, M., Yih, W., Rocktäschel, T., Riedel, S., & Kiela, D. (2021). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (arXiv:2005.11401). arXiv. https://doi.org/10.48550/arXiv.2005.11401

Morris, C. (2024, February 14). ChatGPT and Google’s Gemini will now remember your past conversations. Fast Company. https://www.fastcompany.com/91029395/chatgpt-google-gemini-remember-past-conversations

Replika. (n.d.). Replika.com. Retrieved May 12, 2024, from https://replika.com

Russell, S., & Norvig, P. (2016). Artificial Intelligence: A Modern Approach. Pearson. https://books.google.com/books?id=XS9CjwEACAAJ

Spector, A. Z., Norvig, P., Wiggins, C., & Wing, J. M. (2022). Data Science in Context: Foundations, Challenges, Opportunities. Cambridge University Press. https://books.google.com/books?id=SaKIEAAAQBAJ

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). Attention Is All You Need. Advances in Neural Information Processing Systems, 30, 5998–6008.

Wang, X., Salmani, M., Omidi, P., Ren, X., Rezagholizadeh, M., & Eshaghi, A. (2024). Beyond the Limits: A Survey of Techniques to Extend the Context Length in Large Language Models (arXiv:2402.02244). arXiv. https://doi.org/10.48550/arXiv.2402.02244

Zhong, W., Guo, L., Gao, Q., Ye, H., & Wang, Y. (2023). MemoryBank: Enhancing Large Language Models with Long-Term Memory (arXiv:2305.10250). arXiv. https://arxiv.org/abs/2305.10250

