A better approach: Cognitive Architectures are the Future of AI
Arthur Wielgosz
Solutions Architect / Full Stack Developer / AI Consultant | Enterprise Platforms
Many people are talking about AI, but few recognise that the soul of an inference engine is probabilistic and does not care about the truth. In this article, we discuss the nuanced realms of Artificial Intelligence (AI), Machine Learning (ML), and Large Language Models (LLMs). We'll delve into how these technologies, particularly LLMs, are reshaping our understanding of and interaction with digital systems, focusing on their probabilistic nature and the innovative solutions addressing their inherent challenges. Let us start with some simple definitions to put this into context:
- AI (Artificial Intelligence) - The broad science of making intelligent machines or systems that can simulate human intelligence processes.
- ML (Machine Learning) - A subset of AI focused on algorithms and statistical models that enable computers to perform specific tasks without using explicit instructions, relying on patterns and inference instead.
- LLMs (Large Language Models) - Advanced ML models trained on vast amounts of text data to understand and generate human-like text, a pinnacle of current AI research in natural language processing.
Inference Engines and LLMs
Large Language Models (LLMs) like ChatGPT function as inference engines, using Machine Learning to analyse and generate text based on vast amounts of data. These models predict the most probable next word or phrase in a sequence, effectively 'inferring' human-like responses. This capability stems from their training on extensive collections of text, allowing them to apply accumulated knowledge to new queries.
Inference engines, despite their impressive capabilities, have inherent limitations due to their probabilistic nature. They generate responses based on statistical likelihoods, leading to "best guesses" rather than definitive answers. This approach can cause "hallucinations," where the engine produces incorrect or nonsensical information, especially when faced with queries outside its training data. This challenge underscores the need for advanced mechanisms to improve the reliability and accuracy of these AI systems.
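To make that "best guess" behaviour concrete, here is a minimal sketch of next-token prediction in plain Python/NumPy. The vocabulary and logit values are invented purely for illustration, and no particular model's API is being used:
```python
import numpy as np

# Toy vocabulary and raw scores (logits) a language model might assign to
# each candidate next token after the prompt "The sky is".
# These numbers are made up purely for illustration.
vocab = ["blue", "falling", "green", "the", "limit"]
logits = np.array([4.1, 1.2, 2.0, 0.3, 2.5])

# Softmax turns logits into a probability distribution over the vocabulary.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# Greedy decoding: take the single most probable token.
greedy_choice = vocab[int(np.argmax(probs))]

# Sampling: draw from the distribution, so a less likely (possibly wrong)
# token can still be produced -- the statistical root of "hallucinations".
sampled_choice = np.random.choice(vocab, p=probs)

for token, p in zip(vocab, probs):
    print(f"{token:>8}: {p:.3f}")
print("greedy:", greedy_choice, "| sampled:", sampled_choice)
```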
Addressing the Challenges
Mixtral, a Mixture of Experts (MoE) model, represents a significant evolution in tackling the challenges posed by inference engines. Unlike traditional models that rely on a single neural network to process all types of information, Mixtral utilises a diverse set of specialised sub-models, each an "expert" in a particular domain. This architecture mirrors the way human expertise is distributed across different fields, allowing for a more nuanced and precise approach to problem-solving.
In a Mixtral system, a "router" network plays a crucial role, dynamically selecting the most relevant experts based on the specific context of the input data. This means that for any given query, only the most applicable sub-models are activated, making the process both efficient and effective. The selected experts process the data independently, and their outputs are then aggregated to form a comprehensive response.
This methodology addresses several limitations of traditional inference engines. By leveraging multiple experts, the Mixtral model reduces the likelihood of hallucinations, as each expert's specialised knowledge contributes to a more accurate and reliable output. Moreover, this approach allows the system to handle a wider range of queries with higher confidence, as there is likely an expert well-suited to any given task.
The significance of Mixtral lies in its adaptability and scalability. As new domains of knowledge emerge, new experts can, at least in principle, be developed and integrated into the system, extending its capabilities without retraining the entire model from scratch. This makes Mixtral an agile solution in the fast-paced field of AI, ensuring that systems can stay at the cutting edge of knowledge and technology.
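To illustrate the router-plus-experts idea described above, here is a minimal sketch of a generic top-k mixture-of-experts layer in PyTorch. It is a toy under stated assumptions (the layer sizes, expert count, and top-k value are arbitrary), not Mixtral's actual implementation:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoELayer(nn.Module):
    """Generic sparse mixture-of-experts layer: a router scores every expert,
    only the top-k experts run, and their outputs are blended by router weight."""

    def __init__(self, d_model: int = 64, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # scores each expert per token
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:       # x: (tokens, d_model)
        gate_logits = self.router(x)                           # (tokens, n_experts)
        weights, chosen = torch.topk(gate_logits, self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)                   # normalise over the k picks
        out = torch.zeros_like(x)
        for slot in range(self.k):                             # aggregate selected experts
            for e, expert in enumerate(self.experts):
                mask = chosen[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

# Usage: 5 token embeddings of width 64 pass through the layer;
# each token is handled by only 2 of the 8 experts.
layer = TopKMoELayer()
tokens = torch.randn(5, 64)
print(layer(tokens).shape)  # torch.Size([5, 64])
```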
Retrieval-Augmented Generation (RAG)
RAG is an advanced technique that significantly enhances Large Language Models (LLMs) by incorporating external knowledge sources during the text generation process. This method combines the generative prowess of LLMs with the precision of information retrieval systems.
In essence, RAG works by first identifying relevant information from a vast database or knowledge base in response to a query or prompt. This retrieved information is then used by the LLM to generate responses that are not only contextually relevant but also more accurate and informative. This two-step process—retrieval followed by generation—allows the model to produce answers that are grounded in external evidence, reducing reliance solely on the information it learned during its initial training phase.
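A minimal sketch of that retrieve-then-generate loop is shown below. The bag-of-words similarity function stands in for a real embedding model, and generate is a placeholder for the actual LLM call; the names are illustrative rather than any specific library's API:
```python
from collections import Counter
import math

# A tiny stand-in knowledge base; in practice this would be a vector store
# holding thousands of embedded document chunks.
documents = [
    "Mixtral routes each token to a small subset of expert sub-networks.",
    "Retrieval-Augmented Generation grounds model answers in retrieved documents.",
    "Abstract State Machines describe systems as states and guarded transitions.",
]

def similarity(a: str, b: str) -> float:
    """Cosine similarity over word counts -- a crude proxy for embedding similarity."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Step 1: pull the k most relevant chunks from the knowledge base."""
    return sorted(documents, key=lambda d: similarity(query, d), reverse=True)[:k]

def generate(prompt: str) -> str:
    """Step 2: placeholder for the LLM call that would produce the grounded answer."""
    return f"[LLM answer conditioned on a prompt of {len(prompt)} characters]"

def rag_answer(query: str) -> str:
    context = "\n".join(retrieve(query))
    prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
    return generate(prompt)

print(rag_answer("How does RAG reduce hallucinations?"))
```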
The significance of RAG lies in its ability to bridge the gap between the vast knowledge encoded in LLMs and the constantly evolving pool of human knowledge. By tapping into external data sources, RAG-equipped models can provide up-to-date information, offer more nuanced answers, and even correct or supplement their pre-existing knowledge base. This makes RAG a powerful tool for enhancing the utility and applicability of LLMs across various domains, from customer service and education to research and content creation.
Commonalities between these two leading solutions
Mixtral (Mixture of Experts Model) and Retrieval-Augmented Generation (RAG) are both advanced techniques designed to enhance the capabilities of AI systems, particularly in handling complex tasks. Here's a quick comparison highlighting their similarities:
- Specialisation and Collaboration: Both approaches leverage the concept of specialisation. Mixtral uses a collection of expert models each specialised in a different domain, while RAG enhances LLMs by incorporating specialised external knowledge sources during the generation process.
- Dynamic Selection: Each method involves a dynamic selection process. Mixtral's router network selects the most relevant experts for a given task, while RAG identifies and retrieves the most pertinent external information to inform the response generation.
- Enhanced Accuracy and Context: By integrating specialised knowledge, whether through expert models or external data sources, both Mixtral and RAG aim to improve the accuracy and contextual relevance of AI-generated responses.
- Adaptability: Both techniques enhance the adaptability of AI systems, enabling them to address a broader range of queries more effectively by utilising the most relevant resources or expertise at any given moment.
- Complementary to LLMs: Mixtral and RAG are not standalone technologies but are designed to complement and enhance the capabilities of Large Language Models, making them more versatile and effective.
These similarities underscore a shared goal between Mixtral and RAG: to create more intelligent, responsive, and capable AI systems by leveraging specialised knowledge and dynamic, context-aware processes. We can use this and take it further...
A different direction: Cognitive Architectures are the Future of AI
As discussed above, the Mixtral and RAG techniques highlight that the key to advancing AI isn't necessarily building bigger LLMs, but being smarter about how we use them. This is where cognitive architectures, such as Abstract State Machines, come into play. These architectures aim to mimic the human brain's ability to integrate various sensory inputs and cognitive processes to understand and interact with the world.
Cognitive architectures enable the integration of various AI modalities—including LLMs, vision systems, and more—into a cohesive framework. By doing so, they allow for the creation of more sophisticated and versatile AI systems. For instance, an Abstract State Machine can orchestrate how an AI system switches between processing text, analysing images, and making decisions based on a combination of these inputs, much like how our brains process information from our senses to form a coherent understanding of our environment.
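As a loose illustration of how such an orchestrating layer might look, here is a hand-rolled state machine with placeholder text, vision, and decision handlers. It sketches the idea of routing control between modalities over a shared working memory; it is not the Abstract State Machine formalism or any existing framework, and all names here are invented for illustration:
```python
from dataclasses import dataclass, field

@dataclass
class WorkingMemory:
    """Shared context the orchestrating layer accumulates across modalities."""
    facts: list[str] = field(default_factory=list)

# Placeholder modality handlers; in a real system these would call an LLM,
# a vision model, and a planning/decision component respectively.
def process_text(memory: WorkingMemory, data: str) -> str:
    memory.facts.append(f"text says: {data}")
    return "see_image" if "photo" in data else "decide"

def process_image(memory: WorkingMemory, data: str) -> str:
    memory.facts.append(f"image shows: {data}")
    return "decide"

def decide(memory: WorkingMemory, data: str) -> str:
    print("Decision based on:", memory.facts)
    return "done"

# The state machine that routes control between modalities.
STATES = {"read_text": process_text, "see_image": process_image, "decide": decide}

def run(inputs: dict[str, str]) -> None:
    memory, state = WorkingMemory(), "read_text"
    while state != "done":
        handler = STATES[state]
        state = handler(memory, inputs.get(state, ""))

run({"read_text": "user sent a photo of a flooded street",
     "see_image": "water level above car wheels"})
```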
The potential of these architectures lies in their ability to overcome the current limitations of AI systems, particularly in terms of accuracy and context-awareness. By leveraging the strengths of different AI models and techniques, cognitive architectures facilitate more nuanced and adaptable AI responses. This not only enhances the performance of AI systems in complex tasks but also opens up new possibilities for AI applications, from more intuitive human-computer interactions to autonomous systems capable of navigating the real world with human-like understanding.
In the future of AI, the focus may shift towards developing and refining these cognitive architectures, ensuring that AI systems can effectively leverage the vast capabilities of LLMs and other specialised models. This approach promises to bring us closer to achieving AI systems that are not just powerful in processing vast amounts of data but are also truly intelligent and adaptable in their interactions with the world.
Conclusion
The journey through the landscapes of AI, ML, and particularly LLMs, reveals a world brimming with potential yet tempered by inherent limitations. Large Language Models have indeed transformed our interaction with technology, offering glimpses into a future where AI can converse, create, and even reason in ways that mirror human intellect. However, as we've explored, the path to realising the full spectrum of AI's capabilities is fraught with challenges—chief among them being the probabilistic nature of inference engines and the consequent risk of hallucinations.
The exploration of techniques like Mixtral and Retrieval-Augmented Generation (RAG) underscores a pivotal shift in our approach to AI development. Rather than striving for ever-larger models, the focus is turning towards smarter, more efficient use of the technology we've cultivated. Cognitive architectures, particularly those like Abstract State Machines, stand at the forefront of this shift, promising a framework where various AI modalities can converge and cooperate to achieve a synergy that transcends their individual limitations.
As we stand on the brink of this new era in AI, it's clear that the future is not about AI systems that can simply store more information or generate text more fluently. It's about crafting AI that can think, understand, and create with a level of nuance and adaptability that truly mirrors human cognition. This requires a commitment to advancing cognitive architectures that not only enhance the accuracy and reliability of AI responses but also imbue systems with the ability to navigate the delicate balance between accuracy and creativity, all while being tailored to the specific needs of diverse applications.
The potential for AI to revolutionise every aspect of our lives is immense, but realising this potential will require us to think beyond the limitations of current models and architectures. By embracing and advancing cognitive architectures, we can unlock the true promise of AI, paving the way for systems that are not only more intelligent and reliable but also more in tune with the complex tapestry of human thought and creativity.
Comments
Operations Manager in a Real Estate Organization (10 months ago): Great article. DLNs face critical challenges, including brittleness, machine hallucinations, and inconsistency. DLNs can be fragile, exhibiting a dramatic drop in accuracy with slight changes in the data. Similarly, adding a small amount of noise can fool DLNs into misclassifying well-known images with high confidence. Furthermore, GPTs, which constitute a category of DLNs, exhibit machine hallucinations. Similarly, DLNs like Falcon-40B can inconsistently answer the same question correctly one time and incorrectly the next. Efforts to address these issues are complicated by the unexplainable and uninterpretable nature of DLNs. Proposed solutions for mitigating machine hallucinations include ensemble approaches that use multiple independently configured DLNs and combine them with Internet search engines. Because GPTs exhibit machine hallucinations, imitation DLNs trained on such hallucinated output show poor accuracy, exacerbating the problem. Hence, the quest to enhance DLNs continues, acknowledging the need for methods to mitigate these fundamental challenges. More about this topic: https://lnkd.in/gPjFMgy7
Arthur Wielgosz, Solutions Architect / Full Stack Developer / AI Consultant | Enterprise Platforms (1 year ago): I should add a statement that summarises everything I discuss above in one sentence: "AI systems (LLM, vision, audio, etc.) need a prefrontal cortex". That is the purpose of the cognitive architectures we need to build to make AI intelligent (along with situational awareness and long-term memory).