The Evolution and Future of Human-Computer Interaction
A seamless, intuitive, and almost invisible Human-Computer Interaction (HCI) is within reach: an interaction that is more accessible, more inclusive, easier to work with, and more human-like (well, maybe JARVIS-like). In this essay, we'll explore the remarkable evolution of HCI and discuss how the latest advancements, such as Large Language Models (LLMs), could revolutionize the way we interact with technology.
Since the introduction of the von Neumann architecture in the 1940s, the field of human-computer interaction (HCI) has evolved significantly. In the early days, using a computer meant encoding programs and data as 0/1 sequences on punched cards or paper tape, which was undoubtedly challenging. The command-line interface (CLI), which took hold in the 1960s and 1970s, allowed users to interact with computers using typed commands. While the CLI was an improvement over punched cards, it was not until the graphical user interface (GUI) arrived in the 1980s that computing became accessible to the masses. The GUI let users interact with computers through visual elements such as icons, buttons, and menus.
Xerox pioneered the GUI at its PARC lab in the 1970s with the Alto and released one of the first commercial GUI systems, the Xerox Star, in 1981, but it was arguably the Apple Macintosh in 1984 that made the GUI popular. The Macintosh used a graphical desktop metaphor, making it easy for users to interact with the computer by pointing and clicking. Microsoft Windows, first released in 1985, brought point-and-click interaction to mainstream computing. Using a mouse was more intuitive and user-friendly for most people than typing commands. Over its successive versions, Windows also popularized the desktop, the taskbar, and the file manager, making it easier for users to navigate and manage their files.
This was not the end-game though. The rise of the internet in the 1990s led to the development of web-based user interfaces, which allow users to interact with software applications through a web browser without installing software locally. Web-based user interfaces are now ubiquitous, powering everything from social media sites to online banking applications.
There were other disruptions along the way. The iPhone, introduced in 2007, brought the touchscreen user interface to a mass audience (the concept already existed), allowing users to interact with their devices using gestures such as swiping and pinching. The rise of virtual assistants such as Amazon's Alexa and Apple's Siri then led to the development of Voice User Interfaces (VUIs). Looking at this evolution, each new interface has abstracted away some of the details users previously had to handle themselves, so that the interaction is smoother.
The new disruption: We are now in the future! Large Language Models are, as Geoffrey Hinton put it, “the wheel” (some people have even compared them to the discovery of “fire”!).
One key distinction between humans and other animals lies in our exceptional ability to build new tools and use them. It didn't end with discovering fire and smelting ore to create metal; we took it a step further by using metal to forge even more tools, which we then employed to accomplish various tasks and create even more advanced tools. This unique human capacity has given us an edge in the evolutionary race. Instead of being bound by the slow pace of natural evolution, we've managed to innovate and adapt to our surroundings at a more rapid pace. In a way, our creative use of tools serves a purpose similar to evolution, allowing us to thrive and adapt to an ever-changing world.
What is fascinating about LLMs is their ability to do the same: They can use tools! Of course, prompting them to summarize a text, extract some information, or even chat with me about some topic is really great, but their ability to use tools is the distinguishing factor: They are not only a knowledge base that knows things, they can also act on that knowledge (at least when we tell them to). This can enable the next generation of HCI: You tell the system what you want to happen (your intention), it uses its knowledge to figure out how to do it, and it finds the right tools (or invents one) to make it happen. This is what systems like ChatGPT Plugins have already started to enable. You ask for a trip that is fun and within your budget, and that is it! The system breaks the request down, uses the right plugins (e.g., Expedia), finds the best destinations, and builds the whole itinerary for you to confirm, modify, or revise together.
This abstraction layer removes the need for the user to know where files are stored and where to find them again ("load that document I was writing yesterday about languages and tools"), or which tool to use for calling colleagues, planning a new project, writing code, planning a trip, and more. From the user's perspective there wouldn't be a need for any app anymore; everything would be a chat (or conversation, or brain wave, etc.). The apps would be tools managed by the LLM! Products such as LangChain, Semantic Kernel, and TaskMatrix.AI (the last two are from Microsoft) also aim to enable this capability, and even more. With the invention of Reflexion and ReAct, and new advances by Microsoft, this is just around the corner.
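To make the intent, plan, tool flow described above a bit more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the tool names (`search_trips`, `find_document`), the fixed budget, and the keyword-based `plan` function are all hypothetical stand-ins. In a real system the planning step would be done by an LLM that is shown the tool descriptions, and the tools would call real services.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Tool:
    """A capability the assistant can call on the user's behalf."""
    name: str
    description: str
    run: Callable[[dict], str]


def search_trips(args: dict) -> str:
    # Hypothetical stand-in for a travel plugin; returns canned text.
    return f"3 trips found under ${args['budget']}"


def find_document(args: dict) -> str:
    # Hypothetical stand-in for a file-search tool.
    return f"opened the latest document about '{args['topic']}'"


# Registry of available tools, keyed by name.
TOOLS: Dict[str, Tool] = {
    t.name: t
    for t in [
        Tool("search_trips", "Find trips within a budget", search_trips),
        Tool("find_document", "Locate a document by topic", find_document),
    ]
}


def plan(intent: str) -> List[dict]:
    """Stand-in for the LLM: map an intent to a list of tool calls.

    Assumption: a real system would send the intent plus the tool
    descriptions to a model and parse its reply. Keyword rules are
    used here only so the sketch runs offline.
    """
    text = intent.lower()
    steps: List[dict] = []
    if "trip" in text:
        # The budget value is a made-up example, not parsed from the intent.
        steps.append({"tool": "search_trips", "args": {"budget": 2000}})
    if "document" in text:
        steps.append(
            {"tool": "find_document", "args": {"topic": "languages and tools"}}
        )
    return steps


def handle(intent: str) -> List[str]:
    """Run every planned tool call and collect the results for the user."""
    results = []
    for step in plan(intent):
        tool = TOOLS[step["tool"]]
        results.append(tool.run(step["args"]))
    return results


if __name__ == "__main__":
    for line in handle("Book a fun trip within my budget"):
        print(line)
```

The point of the sketch is the shape, not the details: the user only ever states an intention, and which "app" ends up doing the work is decided behind the conversation.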
A conceptual framework
Here is a conceptual framework that encompasses different architectures for using LLMs for different tasks and interactions.
From smart homes to security to coding to controlling robots, this architecture is intended to handle the full range of interactions.
Conclusion: The evolution of human-computer interaction has been a fascinating and transformative journey. From the early days of punch cards to the promise of LLMs that can use tools, we've come a long way in making technology more accessible, user-friendly, and efficient. As we continue to push the boundaries of HCI, we can look forward to a future where our interactions with technology are not just effortless but also more engaging and empowering than ever before.