AI 2024, quo vadis?
Introduction
This article aims to provide an overview of the prevailing trends and advancements in AI for the current and upcoming year. While it does not encompass every achievement or improvement in the field, it focuses on highlighting some of the most significant and noteworthy developments.
On-device AI
As previously mentioned (1), although machine learning algorithms were deployed on PDAs as early as the 1980s, the predominant approach over the last decade has been cloud-based AI, requiring a stable and performant internet connection. In 2017, however, Huawei's introduction of the Kirin 970 with its Neural Processing Unit (NPU) marked a significant shift. Since then, high-end mobile chipsets have commonly incorporated dedicated AI processing units, often referred to as neural or AI engines.
At MWC Barcelona 2024, this trend was further emphasized by the numerous on-device AI software solutions on display. Interestingly, ARM presented a solution running on standard ARM CPUs rather than NPUs (2). Whatever the underlying hardware, mobile device manufacturers have recognized the growing importance of on-device AI, which eliminates the dependence on internet connectivity, and have begun to offer solutions featuring speech and image generation models.
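To make this concrete, the following is a minimal sketch of on-device inference using TensorFlow Lite's Python interpreter. The model file name and input shape are assumptions for illustration; on a phone, the same model would typically be executed through an NPU or GPU delegate rather than on the plain CPU.

```python
import numpy as np
import tensorflow as tf

# Load a (hypothetical) quantized classifier bundled with the app, so inference
# runs entirely on the device without any network round trip.
interpreter = tf.lite.Interpreter(model_path="mobile_classifier.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Dummy input standing in for a preprocessed camera frame (shape is assumed).
frame = np.random.rand(1, 224, 224, 3).astype(np.float32)

interpreter.set_tensor(input_details[0]["index"], frame)
interpreter.invoke()

scores = interpreter.get_tensor(output_details[0]["index"])
print("Top class index:", int(np.argmax(scores)))
```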
Integration of generative AI
As text, image, and video generated by AI continue to improve daily, the tools themselves are also converging. OpenAI's integration of DALL-E 3 into ChatGPT exemplifies this trend (3), and further integrations are emerging, including video-generating AIs. Merging various AI models, often referred to as multimodal AI (4), not only provides a unified platform for creativity across diverse media but also enhances conversational capabilities and comprehension. Prompt inputs now extend beyond written text to uploaded files and images, from which information can be extracted or commented on, and output can be generated in various formats such as text, audio, images, and videos.
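As an illustration of such multimodal prompting, the sketch below sends an image together with a text question to a vision-capable chat model via the OpenAI Python SDK; the model name and image URL are placeholders and may differ from the reader's setup.

```python
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

# Ask a vision-capable model to describe an image (URL is a placeholder).
response = client.chat.completions.create(
    model="gpt-4o",  # assumed multimodal model name
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is shown in this picture?"},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)
```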
While the aforementioned integration enhances the user experience seamlessly, numerous evolving API integrations offer developers endless opportunities to create custom and unique business solutions. Consequently, it is highly probable that new smart virtual assistants and solutions will emerge in 2024, leveraging existing proprietary or open-source models.
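For developers building such custom integrations, a minimal sketch might chain a text model and an image model through the same SDK; the model names, prompts, and two-step flow are illustrative assumptions rather than a prescribed architecture.

```python
from openai import OpenAI

client = OpenAI()

# Step 1: have a text model draft a short tagline.
chat = client.chat.completions.create(
    model="gpt-4o",  # assumed model name
    messages=[{"role": "user",
               "content": "Write a one-sentence tagline for a solar-powered backpack."}],
)
tagline = chat.choices[0].message.content

# Step 2: feed that text into an image model to create matching artwork.
image = client.images.generate(
    model="dall-e-3",
    prompt=f"Marketing illustration for: {tagline}",
    size="1024x1024",
)
print(tagline)
print(image.data[0].url)
```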
Voice modeling
Speech synthesis has a lengthy history intertwined with computer technology, but for decades, synthesized speech sounded artificial. Two significant improvements now define the current trend. Firstly, latency has dropped substantially: numerous cloud-based text-to-speech models can now deliver near real-time output over the internet (5). Secondly, voice quality has improved remarkably. Artificially synthesized voices sound increasingly natural, and advancements in voice sampling technology now allow a person's voice to be cloned from as little as 15 seconds of sample data (6).
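As a minimal sketch of such a cloud-based text-to-speech call, the snippet below uses the OpenAI Python SDK; the model and voice names are assumptions, and other providers expose broadly similar APIs.

```python
from pathlib import Path
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

# Synthesize a short sentence and write the resulting audio to disk.
speech = client.audio.speech.create(
    model="tts-1",   # assumed low-latency TTS model
    voice="alloy",   # assumed built-in voice preset
    input="On-device and cloud AI are converging rapidly.",
)
speech.stream_to_file(Path("greeting.mp3"))
```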
This remarkable technological development opens doors to significant opportunities in learning and marketing platforms. Moreover, it is expected to have a prominent impact on the audiobook and storytelling industry, where an actor's voice can be sampled and then reused, even in a different language. However, it's essential to acknowledge the downside of this technology: it enables convincing fraud, in which the voices of relatives or superiors are cloned and used for social engineering.
Visual modeling
Having touched on the potential misuse of these evolving technologies for scams and disinformation, it is imperative to highlight the advancements in visual modeling as well. Generative AI has reached a stage where it can clone a real person in just 5 minutes, creating a virtual character capable of speaking written text, complete with the sampled voice of the real person, synchronized lips, facial expressions, and even gestures (7). This goes beyond the concept of a traditional avatar, as seen in video games or on platforms like Meta; instead, it creates a virtual clone that is often indistinguishable from the real person.
Another aspect of this remarkable technology is its ability to create a virtual human entirely without a real human reference. Notable examples in this realm are Alba Renai and Aitana López, two prominent AI influencers created solely from input based on "the preferences and interests of Generation Z" (8). That these virtual celebrities can influence the thoughts and preferences of hundreds of thousands of real human followers is troubling. What's even more unsettling is that many individuals either cannot believe or are unaware that these AI influencers are not real human beings (8).
AI in robotics
While humanoid robots may not yet be mistaken for human beings, their locomotive capabilities are advancing rapidly. From its quirky robot of 2005, Boston Dynamics has progressed to a second-generation Atlas robot with remarkable mobility. It isn't the sole player in the field of humanoid robotics, either; it shares the stage with other major players such as Nvidia, Tesla (9), and Accenture (10).
Equally significant are the advancements in non-humanoid robots, particularly those used in production and supply chains. These robots are transitioning from rigid, rule-based systems hardcoded for specific tasks and motion sequences to agile systems capable of assessing their environment and adapting their workflows accordingly (11). This transition is driven by AI that analyzes the surrounding environment and is now even taking its first steps toward recognizing the physical and emotional condition of nearby human beings. AI thus plays a pivotal role in the long-term ambition of having robots and humans work together in the same space and at the same time.
In summary, while AI traditionally supported the design and development of robots from the initial stages, it is now being directly integrated into robots themselves, enabling features such as robotic process automation (RPA), reinforcement learning, adaptive procedures, and autonomy.
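To illustrate the adaptive perception-action loop behind such systems, here is a minimal reinforcement-learning-style sketch using the gymnasium simulation library with a random policy. It is a conceptual stand-in rather than a real robot controller, and the environment is simply a standard benchmark.

```python
import gymnasium as gym

# A standard benchmark environment stands in for a robot's simulated workspace.
env = gym.make("CartPole-v1")
observation, info = env.reset(seed=42)

total_reward = 0.0
for step in range(200):
    # A real system would map the observation to an action via a learned policy;
    # a random action keeps this sketch self-contained.
    action = env.action_space.sample()
    observation, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    if terminated or truncated:
        observation, info = env.reset()

env.close()
print("Accumulated reward:", total_reward)
```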
Autonomous Agents
The concept of autonomy naturally leads to the realm of autonomous agents: smart systems capable of making independent decisions regarding the organization, execution, and completion of tasks based on predefined goals. This capability can be applied across a spectrum of scenarios, from ordering a coffee from an embodied agent to delegating software development to a non-embodied agent. In this context, embodied agents refer to physically present systems, typically in the form of robots, equipped with specific perceptual and adaptive capabilities. Non-embodied agents, on the other hand, are virtual assistants specializing in task design, prioritization, and execution across various business workflows, including creating source code, developing web applications, generating social media content, or producing marketing material (12).
A notable distinction between generative AI and non-embodied agents is that the latter can execute a series of tasks sequentially and with dependencies, without requiring human intervention. By integrating generative AI with memory and tools, these agents can be deployed across diverse domains, ranging from manufacturing, logistics, and customer care to algorithmic trading, precision farming, and military applications (13). While the enterprise benefits of autonomous agents are substantial, they may also entail workforce reductions, particularly in roles where tasks can be accomplished more efficiently or effectively with AI assistance.
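The following sketch shows the basic plan-act-observe loop of such a non-embodied agent in plain Python. The call_llm stub, the tool registry, and the canned decisions are hypothetical placeholders rather than any particular framework's API.

```python
# Hypothetical building blocks: an LLM stub, a simple memory, and two tools.

def call_llm(prompt: str) -> str:
    """Placeholder for a real model call; returns a canned decision so the
    sketch runs without a model endpoint."""
    return "done" if "(result" in prompt else "search_web: current market trends"

def search_web(query: str) -> str:
    """Hypothetical tool: pretend to fetch search results."""
    return f"(result: search output for '{query}')"

def write_file(path: str, content: str) -> str:
    """Hypothetical tool: persist generated content."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    return f"(result: wrote {len(content)} characters to {path})"

TOOLS = {"search_web": search_web, "write_file": write_file}
memory: list[str] = []  # running log of observations, i.e. the agent's memory

def run_agent(goal: str, max_steps: int = 5) -> None:
    for _ in range(max_steps):
        # Ask the model which tool to use next, given the goal and memory so far.
        decision = call_llm(f"Goal: {goal}\nMemory: {memory}\nNext tool and argument?")
        tool_name, _, argument = decision.partition(":")
        if tool_name.strip() == "done":
            break
        observation = TOOLS[tool_name.strip()](argument.strip())
        memory.append(observation)  # feed the observation back into the next decision

run_agent("Draft a short market summary")
print(memory)
```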
AI-backed warfare
The discussion would be incomplete without acknowledging the significant presence of AI in the military domain. Ever since the Colossus computer of the Bletchley Park codebreakers, computing technology has been heavily promoted and financially supported by the military. It is only logical that the military is among the first sectors to leverage the most advanced computer technologies, including AI.
AI is being evaluated and implemented across numerous fields within the military, ranging from battle management systems that require rapid analysis of complex situations to autonomous vehicles, jet fighters, sentries, and drones (14). While AI continues to enhance the effectiveness and superiority of conventional weapons and systems, it also represents the next frontier in non-conventional areas, particularly in cyberspace (15). This extends warfare beyond military targets to strategic infrastructure and facilities in both the public and private sectors.
Despite moral and ethical concerns, the military explores any combination of the technology areas outlined in the preceding sections that promises combatant superiority and fewer casualties on its own side, including the development of fully autonomous armed robots. Notably, the adoption of autonomous armed drones is not exclusive to autocratic regimes, as demonstrated by Ukraine's deployment of the Saker Scout (16).
The peril of energy consumption
In 2017, Google researchers published the paper "Attention Is All You Need," igniting a race for ever-larger generative AI models. The remarkable exponential growth in model size (17) is undeniably impressive but comes at the expense of increased energy consumption and carbon emissions. As previously discussed (18), reaching a scale comparable to the number of synaptic connections in the human brain (roughly 100 trillion) with large language models (LLMs) would require at least 50 times more resources than GPT-4.
One potential way to mitigate carbon emissions and financial costs is hardware and software optimization, but significant breakthroughs in this direction are not expected soon. Bloomberg reported as early as 2022 that dynamic consumption accounts for only about half of total energy consumption, with the remainder lost to idle consumption and infrastructure maintenance (19). Despite this insight, energy consumption is projected to continue rising, with model sizes nearly doubling every three months (20). While not as immediately perilous as AI-backed warfare, the long-term implications of escalating energy consumption pose a significant climate concern that warrants attention and action today.
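To put those growth figures in perspective, the small calculation below works through the arithmetic. The 1.8-trillion-parameter figure for GPT-4 is an unconfirmed public estimate, used here purely as an assumption.

```python
# Back-of-the-envelope arithmetic for the scaling claims above.

human_synapses = 100e12   # ~100 trillion synaptic connections (rough estimate)
gpt4_params = 1.8e12      # assumed, unconfirmed public estimate for GPT-4

scale_gap = human_synapses / gpt4_params
print(f"Parameter gap vs. GPT-4: ~{scale_gap:.0f}x")  # ~56x, i.e. "at least 50 times"

# If model size doubles roughly every three months, the yearly growth factor is:
yearly_growth = 2 ** (12 / 3)
print(f"Implied growth per year: ~{yearly_growth:.0f}x")  # ~16x
```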
Summary
This article has only scratched the surface of recent AI advancements. Due to space limitations, it was not feasible to delve deeply into each topic or to cover other significant areas, such as AI for Automotive, AI for Weather Research and Forecasting, AI for Medical and Healthcare, or AI for Office Applications. The breadth of AI applications appears almost infinite, highlighting ample room for further improvement and development. The hype surrounding AI is likely to persist for years to come.
It's essential to note that while AI gets much of the attention, other emerging information technologies, such as quantum computing and nanotechnology, should not be overlooked. Their transformative potential is equally profound, if not greater.
(11) https://www.dhirubhai.net/pulse/ai-robotics-advances-automation-human-machine-dave-balroop-6tqwc/
(15) https://militaryembedded.com/ai/machine-learning/ai-mosa-and-the-future-of-secure-uncrewed-warfare