Time to Abandon the Chat-bot Interface

Chat-bot interfaces to LLMs, built on instruction fine-tuning, have been hugely popular since the release of ChatGPT. However, the chat-based interface, where you prompt the LLM to perform tasks or answer questions, is not the only way to interact with these models. Despite this, it has become the dominant method, giving rise to techniques like prompt engineering and auxiliary systems like Retrieval-Augmented Generation (RAG) to handle prompts and context management more effectively.

I’m suggesting a new interface to complement the standard instruction-based model: streaming with short-term memory.

Test-time compute and reasoning models have shown that using the LLM’s output logs as input for further processing can significantly enhance the intelligence and coherence of these systems. This concept aligns with what I call the “Aha moment” paradigm, where iterative reflection deepens understanding.

Feeding an LLM’s output back into its input reduces the need to keep all information in the context window. Instead, older information can be shifted out to make space for new inputs, assuming that reasoning has already synthesized the important details—albeit in an abbreviated or transformed form. This process resembles how human memory works.
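
As a rough sketch of that feedback loop (not a description of any particular system), here is one way a sliding context window could work. The token budget, the whitespace "tokenization," and the generate stub are all simplifying assumptions standing in for a real tokenizer and a real model call.

```python
from collections import deque

MAX_CONTEXT_TOKENS = 4096  # assumed budget; real models and tokenizers vary


def generate(context: str) -> str:
    """Placeholder for a real LLM call; here it just returns a summary-like stub."""
    return f"(model output derived from {len(context.split())} context tokens)"


def run_feedback_loop(initial_input: str, steps: int) -> str:
    # Keep the context as a bounded deque of words so old material falls off the left.
    # Splitting on whitespace stands in for real tokenization.
    window = deque(initial_input.split(), maxlen=MAX_CONTEXT_TOKENS)

    output = ""
    for _ in range(steps):
        # The model only ever sees what currently fits in the sliding window
        output = generate(" ".join(window))

        # Feed the model's own output back in; once the budget is reached the
        # deque silently evicts the oldest tokens, on the assumption that earlier
        # reasoning has already been condensed into the newer output
        window.extend(output.split())

    return output


final = run_feedback_loop("Initial prompt describing the task.", steps=3)
print(final)
```

The point of the bounded deque is that eviction is implicit: older material simply drops out as new output is appended, rather than being managed explicitly.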

This sliding context window approach opens the door to a streaming interface, where information continuously flows both into and out of the LLM. The output stream could include:

- Tool usage commands (e.g., “save this information to RAG,” “search for new data”)
- Self-assessment parameters
- Internal conversations or self-reflections
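
To make this concrete, here is a minimal sketch of how such an output stream might be routed to tools as it arrives. The ::save, ::search, and ::reflect prefixes and the print-based handlers are illustrative assumptions, not part of any existing protocol or framework.

```python
from typing import Callable, Iterable

# Illustrative command prefixes; a real system would define its own protocol
HANDLERS: dict[str, Callable[[str], None]] = {
    "::save": lambda arg: print(f"[RAG] storing: {arg}"),
    "::search": lambda arg: print(f"[search] querying: {arg}"),
    "::reflect": lambda arg: print(f"[self] noted: {arg}"),
}


def process_output_stream(lines: Iterable[str]) -> None:
    """Route each emitted line to a tool handler, or treat it as plain text."""
    for line in lines:
        prefix, _, rest = line.partition(" ")
        handler = HANDLERS.get(prefix)
        if handler:
            handler(rest)
        else:
            print(f"[text] {line}")


# Example: a few lines an LLM might emit under such a protocol
process_output_stream([
    "::save The user prefers concise answers",
    "::search latest results on sliding-window attention",
    "Thinking about how to phrase the reply...",
])
```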

Meanwhile, input could be divided into continuous streams of its own:

- Consuming external information (for instance, reading)
- Long-term memory recall (from a knowledge database, traditional databases, APIs, or semantic databases)
- Externally triggered internal thoughts, similar to system prompts, though not necessarily driven by human input
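
A minimal sketch of how those input streams could be merged into a single feed for the model follows. The three generators are hypothetical stand-ins for a reader, a memory store, and a scheduler of system-style prompts, interleaved round-robin here purely for illustration.

```python
import itertools
from typing import Iterator


def external_reading() -> Iterator[str]:
    """Stand-in for consuming external information (e.g., reading a document)."""
    yield from ["[read] paragraph 1 of the document", "[read] paragraph 2"]


def memory_recall() -> Iterator[str]:
    """Stand-in for long-term memory lookups (RAG, databases, APIs)."""
    yield from ["[recall] note saved last week about sliding windows"]


def triggered_thoughts() -> Iterator[str]:
    """Stand-in for externally triggered internal prompts, not driven by a human."""
    yield from ["[prompt] summarise what you have read so far"]


def merged_input_stream() -> Iterator[str]:
    # Interleave the streams round-robin; a real system might prioritise instead
    sources = [external_reading(), memory_recall(), triggered_thoughts()]
    for item in itertools.chain.from_iterable(itertools.zip_longest(*sources)):
        if item is not None:
            yield item


for chunk in merged_input_stream():
    print(chunk)
```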

This model would be a better foundation for building real agents than today’s agent frameworks, which rely on prompt engineering and auxiliary RAG systems.
