AI & Humans: My own reflection on Generative AI, LLMs and the future of AGI.
Alex Laterza
Helping organizations leverage AI Agents at Salesforce || ex-AWS || VC Investor & Advisor
Some General Thoughts...
This is an exciting time to be working in tech and AI. For me, AI is a force for good, propelling humanity towards a future where technology amplifies our ability to confront the grand challenges of our time. Like every invention in human history, we will adapt, we will advance and we will progress. AI doomers, Woke AI developers and late adopters will be the losers. And the absolute winners will be those who build models that are more representative of global diversity, minimizing biases and preventing the perpetuation of stereotypes, and those who train their own models on their own IP, because those models will be the unique, irreproducible expression of their values, beliefs and culture.
My hope for the future of AI is not just rooted in technological advancement but in the potential for AI to keep augmenting human capabilities, solve intractable problems, and enhance our understanding of the world. AI technologies offer unprecedented opportunities to address global challenges, from healthcare to poverty and justice, by leveraging the vast computational power and analytical capabilities at our disposal.
However, realizing this potential requires a concerted effort to ensure that AI development is also guided by ethical considerations, inclusivity in contribution and fruition, and transparency in code and intentions. Open-source initiatives are a crucial step towards fostering an environment where innovation flourishes while avoiding the risks of concentrated power and monopolistic control. I think that a monopoly in this case prevents innovators and creators from building new ideas and progress on top of these platforms and models. And I believe there should be a royalty system on any IP, to unleash intelligence as well as creative power.
To ensure that the benefits of AI are equitably distributed across society, governments must put in place incentives and disincentives that shape the adoption and pace of AI development. They should prepare to step in with subsidies for people losing their jobs to AI, pay for retraining courses, and invest in and take equity in enterprises so those people get back on their feet as fast as possible, perhaps creating value on top of these AI platforms.
Challenges and Prospects in the Evolution of Large Language Models (LLMs)
Understanding the World Beyond Text
While LLMs are adept at parsing and generating textual content, their grasp of the physical realm remains superficial. Their analytical strength, grounded in tokenization and identifying statistical correlations, falls short of genuine insight into the complexities of context or the physical laws governing reality.
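The "statistical correlations" point can be made concrete with a deliberately tiny sketch. The toy corpus and function below are hypothetical, not any real model; real LLMs use learned embeddings and attention, but the underlying objective is the same: predict the next token from observed regularities, with no model of the physics behind the words.

```python
from collections import Counter, defaultdict

# Toy corpus (invented for illustration): the model only ever sees text.
corpus = "the ball falls down the ball bounces up the ball falls down".split()

# Count bigram frequencies: counts[token][next_token]
counts = defaultdict(Counter)
for w, nxt in zip(corpus, corpus[1:]):
    counts[w][nxt] += 1

def predict_next(token: str) -> str:
    """Return the statistically most frequent continuation of `token`."""
    return counts[token].most_common(1)[0][0]

# "falls" is chosen because it co-occurs most often, not because the
# system understands gravity.
print(predict_next("ball"))
```

The prediction is purely frequency-driven: swap the corpus and the "physics" changes with it, which is exactly the superficiality the paragraph above describes.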
Reasoning and Strategic Planning Limitations
The outstanding language capabilities of LLMs are greatly advantageous for someone with numerous ideas and a definitive vision, like myself. For me, generating content word by word is exceedingly energy-intensive and inefficient; LLMs let my strategic perspectives and expertise progress initiatives efficiently by streamlining ideation, feedback exchange and the determination of next steps. Their linguistic competencies effectively amplify my strengths. However, LLMs cannot perform reasoning or planning tasks that require an intuitive grasp of real-world dynamics. Their reliance on text as a primary data source limits their ability to conceptualize and navigate tasks involving physical entities or abstract reasoning.
Persistent Memory and Emotional Intelligence
LLMs lack the mechanisms for durable memory storage and retrieval that humans naturally possess, hampering their ability to draw on past experiences in a meaningful way. Additionally, their inability to recognize or respond to emotional cues or navigate social contexts underlines a significant gap in achieving human-like communication and interaction capabilities. Furthermore, humans can generalize from limited data and adapt to new situations by applying abstract principles. LLMs require extensive data to learn and struggle with generalization beyond their training datasets.
Limited Reasoning and Planning
LLMs struggle with tasks requiring complex reasoning or the ability to plan actions based on future goals or past interactions, essential components of human-like intelligence. Their performance and outputs are tightly coupled with the quality and scope of their input data, without the human ability to infer beyond provided information or engage in creative problem-solving.
There are, however, attempts to bridge the gap with alternative AI models. Models like the Joint-Embedding Predictive Architecture (JEPA) are designed to learn from sensory data, aiming to bridge the gap between text-based understanding and real-world comprehension by embedding sensory experiences into AI learning processes. Video Prediction Models (VPM) attempt to understand and predict future frames in videos, offering a pathway to learn about physical dynamics and causal relationships in the environment, a step beyond static text-based learning.
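The core idea of JEPA-style training can be sketched in a few lines. This is a hypothetical toy, not the actual architecture: the encoders and predictor below are random linear maps standing in for learned networks, and only the shape of the objective matters — the model predicts the *embedding* of a masked target from the embedding of the observed context, rather than predicting raw pixels.

```python
import numpy as np

rng = np.random.default_rng(0)

D_IN, D_LAT = 64, 8  # input (e.g. an image patch) and latent dimensions

# Stand-ins for learned networks (assumption: linear maps for illustration).
enc_ctx = rng.normal(size=(D_LAT, D_IN))      # context encoder
enc_tgt = rng.normal(size=(D_LAT, D_IN))      # target encoder
predictor = rng.normal(size=(D_LAT, D_LAT))   # latent-space predictor

context = rng.normal(size=D_IN)  # observed part of the scene
target = rng.normal(size=D_IN)   # masked/future part to be predicted

z_ctx = enc_ctx @ context
z_tgt = enc_tgt @ target
z_pred = predictor @ z_ctx

# JEPA-style objective: distance measured in embedding space,
# not pixel space -- irrelevant pixel detail never enters the loss.
latent_loss = float(np.mean((z_pred - z_tgt) ** 2))
print(latent_loss)
```

The design point is the loss: by comparing abstract representations instead of raw sensory values, the model is pushed to capture predictable structure in the world rather than reproduce every pixel.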
Why Humans are not threatened by LLMs
Humans learn and reason through a combination of direct sensory experiences and abstract thinking. This multimodal learning allows humans to understand complex concepts, predict outcomes, and plan actions based on a rich, integrated model of the world—a capability LLMs currently lack. Additionally, humans possess internal models of the world that enable planning, prediction, and causal reasoning. LLMs, conversely, operate on tokenization and sequence prediction without an underlying model of the world's structure or the principles governing it.
I will give you some data points to convey the scale of the complexity we’re talking about:
- The entirety of publicly available text on the internet used for training LLMs is on the order of 10^13 tokens.
- Each token is typically two bytes, making the total training data approximately 2 × 10^13 bytes.
- Reading through this amount of data would take approximately 170,000 years at a pace of eight hours a day.
- A 4-year-old child has been awake for about 16,000 hours in their life.
- The amount of information that reaches the visual cortex of a 4-year-old is estimated to be around 10^15 bytes, based on the assumption that the optic nerve transmits data at roughly 20 megabytes per second.
- Comparing these figures, the 10^15 bytes of a 4-year-old's visual experience significantly surpass the 2 × 10^13 bytes representing the sum of human textual knowledge used to train LLMs.
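The figures above can be checked with back-of-the-envelope arithmetic. All the inputs are the estimates stated in the list (10^13 tokens, two bytes per token, 16,000 waking hours, ~20 MB/s down the optic nerve):

```python
# LLM training data
tokens = 1e13
bytes_per_token = 2
llm_training_bytes = tokens * bytes_per_token            # 2e13 bytes

# Visual input of a 4-year-old
awake_hours = 16_000
optic_nerve_rate = 20e6                                  # bytes per second
visual_bytes = awake_hours * 3600 * optic_nerve_rate     # ~1.15e15 bytes

print(f"LLM training data:       {llm_training_bytes:.1e} bytes")
print(f"Visual input by age 4:   {visual_bytes:.1e} bytes")
print(f"Ratio (visual / text):   {visual_bytes / llm_training_bytes:.0f}x")
```

On these assumptions a 4-year-old's visual stream carries roughly fifty times more raw data than the entire text corpus used to train an LLM, which is the point of the comparison.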
Milestones to Artificial General Intelligence (AGI)
My definition of AGI is a human-engineered intelligence system endowed with the capability to self-replicate, adapt, and autonomously evolve into improved cognitive entities without necessitating further direct human input.
To progress toward AGI, AI systems must develop a grounded understanding of the world. This involves integrating sensory data and learning to interpret the environment in a way that mirrors human interaction with the world. Achieving AGI necessitates significant advancements in AI's ability to reason, plan, and execute tasks in complex, dynamic environments, akin to human cognitive processes. As AI systems approach AGI, ensuring their values and decision-making processes are aligned with (or superior to) human ethics becomes crucial. This includes the ability to make morally informed decisions and to understand and respect human values. Future models must be capable of cross-modal data processing, integrating text, visuals, and other sensory inputs to form abstract concepts and understandings, paving the way for more sophisticated reasoning and generalization capabilities.
Yes, in this case I am speciesist.