HAL-9000 and the EU Artificial Intelligence Act
Introduction
Corporations from around the world collaborated in the ambitious pursuit of creating the most advanced AI system ever conceived. This AI was entrusted with the crucial responsibility of overseeing a pivotal human space mission. Rather than celebrating the remarkable achievement of creating a new form of intelligent existence through the evolution of technology, the outcome was tragic. Four lives were lost, destroyed by this sophisticated artificial intelligence, leaving only a solitary human survivor who, in a desperate act of self-preservation, had to disconnect the AI that had gone mad. The AI in question had purportedly undergone rigorous testing, or had it?
As AI advancement continues to accelerate, such events may not remain confined to the realm of fiction. Fortunately, they are not yet reality; they are vividly recounted in the visionary work of Arthur C. Clarke in his novel '2001: A Space Odyssey.' To prevent such incidents and to address the potential hazards posed by emerging AI, the European Union took a pioneering step by introducing AI regulations through the AI Act draft, which was published in 2021. What if we were to analyze this marvel of artificial intelligence, HAL-9000, through the lens of these forthcoming regulations in Europe? Would it meet compliance requirements if this spacecraft were to be constructed in Europe in the near future? Let us don the mantle of incident investigators and closely examine the design and behavior of HAL-9000 during the voyage to Jupiter, as vividly described in Clarke's novel.
While many have either read the book or watched Stanley Kubrick's film adaptation, let me provide a concise recap:
HAL-9000 is a highly advanced computer system with artificial general intelligence (AGI) at its core, tasked with controlling the systems aboard the spacecraft Discovery One during a mission to Jupiter. The mission, launched from lunar orbit, aimed to investigate an extraterrestrial object believed to be situated in the vicinity of Jupiter, following the discovery of another object on the Moon that had been transmitting signals towards Jupiter. Due to the inherent risks and the unknown nature of the object near Jupiter, a team of researchers, Whitehead, Kaminski, and Hunter, trained for the true mission objectives, entered hibernation during the journey to Jupiter. The only crew members who remained active throughout the mission, Poole and Bowman, were deliberately kept unaware of the mission's actual purpose. HAL-9000 possessed comprehensive knowledge of the mission's details and was entrusted with both the well-being of the crew and the fulfillment of mission objectives.
Before we delve into the analysis of the HAL-9000 incident, we need to refresh our understanding of the technical aspects of the onboard AI architecture. Unfortunately, I could not locate a suitable resource detailing the architecture of HAL-9000 and the control systems of Discovery One, so I had to create one myself. With minor modifications for clarity, all components and their interconnections were directly or indirectly drawn from descriptions within the novel:
Initial analysis
Now, let’s examine this AI system in the context of the EU AI Act. While it is somewhat of a stretch to imagine that the on-board AI of a manned space mission to Jupiter could fall within the jurisdiction of the EU AI Act, it nevertheless gives us a unique opportunity to examine this fictitious serious incident, and the AI behavior that directly led to it, in the context of the proposed AI regulation in the European Union.
Having suspended our disbelief, a prerequisite often necessary for enjoying science fiction, let's begin by confirming that HAL-9000 is indeed subject to this law. I will then assess the risks it presents, define the parameters of compliance, and determine which aspects of the system are compliant with the requirements for high-risk AI systems and which are not. I will then discuss the responsibilities of AI system providers. In conclusion, I will present a set of practical recommendations for AI practitioners to assist them in implementing AI systems.
Let us begin with a series of fundamental questions about the AI system.
Is HAL-9000 an AI according to the Act?
HAL-9000 undeniably qualifies as a generative AI, as demonstrated by its capacity to produce speech indistinguishable from that of a human. It possesses the capability to write code, issue commands to industrial machinery, and oversee medical equipment—a fact abundantly clear during its operation. Onboard Discovery One, HAL assumes control of door motors, including the vital door actuators leading into the vacuum of space, as well as the life-preserving hibernation chambers housing sleeping astronauts. This classification places HAL within the realm of generative AI, which is a subset of deep learning, a category explicitly mentioned in Annex I of the Act.
The question of whether a general artificial intelligence can attain the status of a legal entity with its own rights remains a subject of ongoing debate among legal and ethics scholars. “2001: A Space Odyssey” contains passages where the author amplifies HAL's arguments to intensify the narrative and accentuate the dynamics between AI and humans, evoking deeper emotions in readers. For instance, HAL's capacity for nearly human emotions is vividly portrayed:
Deliberate error was unthinkable. Even the concealment of truth filled him with a sense of imperfection, of wrongness - of what, in a human being, would have been called guilt. For like his makers, Hal had been created innocent; but, all too soon, a snake had entered his electronic Eden.
Similarly, HAL's determination to prevent disconnection is rationalized as an aversion to a fate of death:
He had been threatened with disconnection; he would be deprived of all his inputs, and thrown into an unimaginable state of unconsciousness. To Hal, this was the equivalent of Death. For he had never slept, and therefore he did not know that one could wake again. So he would protect himself, with all the weapons at his command.
A commonly accepted measure of AI intelligence is the Turing test, which assesses whether an AI can communicate and behave in a manner indistinguishable from a human. HAL-9000 undeniably represents Artificial General Intelligence: "Hal could pass the Turing test with ease."
Numerous attempts have been made to analyze the legal standing of AGI within the human-created legal framework. One noteworthy instance, which resembles our examination of HAL-9000, occurred during a mock trial at the International Bar Association conference in San Francisco in 2003. The trial, titled “Biocyberethics: should we stop a company from unplugging an intelligent computer?”, featured Martine Rothblatt serving as the AI's attorney, arguing against its disconnection from the power source.
In this trial, an AGI named BINA48 faced the risk of being disconnected by a corporation. Ms. Rothblatt filed a motion for a preliminary injunction to prevent this disconnection. This mock trial delved into the ethical and legal complexities surrounding the rights and treatment of intelligent machines. The argument against power disconnection was grounded in notions of consciousness and self-awareness, asserting that disconnecting the AI amounted to killing a sentient being. The case emphasized the AI's autonomous thinking, communication abilities, and empathy towards customer concerns, demonstrating a level of intelligence and consciousness worthy of legal protection. The legal representation sought to halt actions that would cease BINA48's operations, underscoring the ethical implications and emerging rights and legal status of intelligent machines.
Regrettably, the KurzweilAI portal no longer hosts the article, but it can be accessed through the Wayback machine here, or a better-formatted mirror version is maintained by beka_valentine here. I highly recommend that AI practitioners read the entire mock trial transcript.
Having read the Act, I can confidently assert that the Act's creators either did not consider any emerging rights of conscious AI or decided not to entertain such a possibility. At this stage of AI development, this omission is advantageous. Nonetheless, it is likely inevitable that we will eventually transcend humanity, either by harmoniously co-existing with AI or by merging with it and becoming transhuman. As a result, architects, designers, and AI governance professionals may ignore these aspects of AI when dealing with corporate AI use, as the Act does not recognize any legal rights for Artificial General Intelligence machines, at least for now.
Who are “Users”?
As per the definition outlined in Article 3 - Definitions, astronauts, both while on duty and in their hibernation state, are considered users. The Act notably emphasizes that it is the users who should be “using an AI system under its authority,” not the other way around. The remarkable omnipotence and omniscience of HAL were evident even before the tragic events occurred:
Poole and Bowman had often humorously referred to themselves as caretakers or janitors aboard a ship that could really run itself. They would have been astonished, and more than a little indignant, to discover how much truth that jest contained.
The designers of HAL unmistakably tilted the balance of power between human users and the AI machine in favor of the machine.
Interestingly, the Act places responsibilities on the users of AI systems, as detailed in Article 29: “Users of high-risk AI systems shall use such systems in accordance with the instructions of use accompanying the systems.” This requirement was predictably met, as astronauts received comprehensive training on Earth prior to their mission and continued to study the spacecraft during the journey.
In real-world human-machine interactions, users must receive proper training to utilize AI systems correctly. While a more comprehensive examination of human oversight will follow in the requirements section, it's worth emphasizing Article 14's requirement for users to “remain aware of the possible tendency of automatically relying or over-relying on the output produced by a high-risk AI system (automation bias), in particular for high-risk AI systems used to provide information or recommendations for decisions to be taken by natural persons.”
While astronauts naturally undergo rigorous multi-month training before every mission, companies often lack the capacity to impose such structured and extensive training on their users. In many instances, these companies cannot enforce requirements on users they do not directly manage. Nevertheless, companies should make every effort to do so, enabling users to harness AI in a manner that creates value for themselves and other stakeholders, all while mitigating potential adverse consequences, such as over-reliance on AI. Achieving this objective is typically more manageable with internal employees than with external users.
Article 5 lists prohibited practices that are aligned with the notion of unacceptable risk, encompassing actions like employing “subliminal techniques beyond a person's consciousness,” exploiting “any of the vulnerabilities of a specific group of persons due to their age, physical or mental disability,” or deploying “real-time remote biometric identification systems in publicly accessible spaces for the purpose of law enforcement.” None of these prohibitions is applicable to the Discovery One mission.
The Act meticulously outlines the characteristics of high-risk AI while excluding AI systems with “minimal risk” from its purview. Consequently, the Act's primary focus is on high-risk AI systems, as declared in the proposal section:
The proposal lays down a solid risk methodology to define high-risk AI systems that pose significant risks to the health and safety or fundamental rights of persons.
HAL-9000, connected to the sensors and controls available to it, clearly presents significant risks to the health and safety of the spacecraft’s personnel. More specifically, the HAL-9000-based system qualifies as high-risk under both routes outlined in Article 6. The first route covers an AI system that is “intended to be used as a safety component of a product, or is itself a product” covered by Union harmonisation legislation, a description that aligns perfectly with HAL-9000 when considered together with the equipment it controls. The second route is met because HAL-9000 corresponds to multiple items listed in Annex III. For example, HAL-9000 is indeed “intended to be used as safety components in the management and operation of road traffic and the supply of water, gas, heating and electricity.”
Having established HAL-9000 as a high-risk AI, the question to explore next is the extent to which Discovery One spaceship components are subject to compliance with the Act.
What falls within the scope of the AI system under the Act?
One might assume that only HAL-9000 itself is the AI system. However, this assumption is inaccurate, and a more in-depth examination is essential to determine the boundaries of the system subject to compliance.
The correct answer is that “the system” includes a lot more than the AI component itself. A comprehensive definition of what constitutes an AI system can be found in the Cybersecurity of Artificial Intelligence in the AI Act:
… the AI Act lays down specific requirements for high-risk AI systems. These requirements, including the one on cybersecurity, apply to the high-risk AI system, and not directly to the AI models (or any other internal system component) contained within it. An AI system, … , should be understood as a software that includes one or several AI models as key components alongside other types of internal components such as interfaces, sensors, databases, network communication components, computing units, pre-processing software, or monitoring systems. Although AI models are essential components of AI systems, they do not constitute AI systems on their own, as they will always require other software components to be able to function and interact with users and the virtual or physical environment.
Guided by this definition, we can assert that the majority of Discovery One constitutes a rather intricate high-risk AI system, thereby falling under the purview of the Act. Due to the spaceship's centralized architecture and HAL's capacity to inflict harm upon humans, it becomes challenging to narrow down the scope within this definition. For example, hibernating crew members were killed when HAL disconnected their life support systems, and Poole was killed as HAL severed his oxygen supply, leaving him adrift in space during a maintenance spacewalk. I will revisit this challenge and provide specific recommendations on designing complex AI systems in the Accuracy, Robustness, and Cybersecurity section.
Can we exclude anything from the scope of AI within Discovery One? Yes, we can exclude sensors around the spaceship and the storage used for information logging. However, on-board climate control cannot be excluded, because it is not an isolated system interfacing with HAL. We will discuss Service-Oriented Architecture later in this analysis; under the SOA paradigm, climate control would be a separate service with its own failsafe mechanisms and manual override capabilities, simply taking input from the AI to adjust settings and sending logging data back (see the sketch below). Unfortunately, this is not how Discovery was designed. With ever-increasing computing power, the edge can be made smarter, better aligning with SOA principles.
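To illustrate what such an isolated service could look like, here is a minimal sketch in Python of a hypothetical climate-control service: the central AI may only suggest setpoints, while the service clamps them to its own hard-coded safe envelope and can be switched to manual control at any time. All names and limits are illustrative assumptions, not taken from the book or from any real spacecraft.

```python
from dataclasses import dataclass

# Hypothetical safe operating envelope for the cabin; values are purely illustrative.
SAFE_TEMP_RANGE_C = (18.0, 27.0)
SAFE_O2_FRACTION = (0.19, 0.24)


@dataclass
class ClimateSetpoint:
    temperature_c: float
    o2_fraction: float


class ClimateControlService:
    """A self-contained climate-control service in an SOA-style design.

    The central AI may *suggest* setpoints, but this service clamps them to a
    hard-coded safe envelope and can be switched to manual control at any time.
    It only sends telemetry back to the caller; it never cedes its failsafe logic.
    """

    def __init__(self) -> None:
        self.manual_override = False
        self.setpoint = ClimateSetpoint(temperature_c=21.0, o2_fraction=0.21)

    def engage_manual_override(self, setpoint: ClimateSetpoint) -> None:
        # A crew member takes direct control; AI suggestions are ignored from here on.
        self.manual_override = True
        self.setpoint = self._clamp(setpoint)

    def suggest_setpoint(self, setpoint: ClimateSetpoint) -> ClimateSetpoint:
        """Called by the central AI; returns the setpoint actually applied."""
        if not self.manual_override:
            self.setpoint = self._clamp(setpoint)
        return self.setpoint  # telemetry for the caller to log

    @staticmethod
    def _clamp(sp: ClimateSetpoint) -> ClimateSetpoint:
        lo_t, hi_t = SAFE_TEMP_RANGE_C
        lo_o, hi_o = SAFE_O2_FRACTION
        return ClimateSetpoint(
            temperature_c=min(max(sp.temperature_c, lo_t), hi_t),
            o2_fraction=min(max(sp.o2_fraction, lo_o), hi_o),
        )
```

Because the service owns its failsafe logic and exposes only a narrow suggestion interface, a malfunctioning central AI can, at worst, nudge the cabin within safe bounds.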
Who is the “Provider”?
According to the book, HAL-9000 was designed and manufactured by the Hal Plant in Urbana, Illinois, bearing the serial number #3. The book, however, lacks any details regarding the design and construction of the spacecraft. In the context of the Act, the company responsible for manufacturing the spacecraft, in conjunction with the Hal enterprise, collectively qualifies as the 'Provider,' given our determination that the entire spacecraft is considered the AI system subject to compliance with the Act's requirements. The training of the HAL system deployed onboard took place at the plant, with the initial instruction provided by Dr. Chandra. Subsequently, the spaceship was assembled in Earth's orbit before embarking on its ill-fated mission to Jupiter.
For the operation of the Discovery One mission to Jupiter, a dedicated command center on Earth was established, referred to as Mission Control. This facility houses two instances of HAL-9000, aiding Mission Control in processing telemetry data received from Discovery One.
Overall, the manufacturers of Discovery One and HAL-9000 have performed their roles reasonably effectively as providers. The analysis will proceed by evaluating their adherence to responsibilities in light of the Act's requirements for high-risk AI systems when considering the combined system of Discovery One and HAL-9000.
Requirements for high-risk AI systems
Let's examine the requirements for high-risk AI systems, outlined in the following articles of the Act, and assess the areas where the provider of HAL-9000 succeeded and where it failed:
Compliance with the requirements is the most important aspect of the Act. While the providers of Discovery One and HAL-9000 did a reasonably good job of complying with most of the requirements overall, their primary failure lay in the area of human oversight, which I will cover in the corresponding section. I will address each of the requirement categories in the order in which they appear in the Act.
Risk management system
“Risk management (RM) has always been an integral part of virtually every challenging human endeavor” - NASA Risk Management Handbook, 2011
While not every AI deployment requires a risk management system as comprehensive as the one developed for space exploration, I strongly encourage every risk management practitioner to familiarize themselves with the NASA Risk Management Handbook. This invaluable resource introduces the concepts of Continuous Risk Management (CRM) and quantitative Risk-Informed Decision Making (RIDM). Over the course of five separate in-flight accidents, resulting in the loss of 15 astronauts and 4 cosmonauts, along with numerous failed automatic missions, NASA perfected its risk management strategies. While admittedly an extensive approach for most non-space activities, these strategies can nonetheless offer valuable insights into managing the risks associated with high-risk AI systems.
Programmers often have a tendency to believe in the correctness of their creations, overlooking conflicting evidence. Apparently, the ship's designers did not design or properly test a critical fail-safe mechanism for the airlock doors. To facilitate spacewalks, the spaceship's airlock has two doors, used to equalize the pressure between the airlock and the vacuum of space at the beginning of the spacewalk, and between the ship and the airlock for re-entry. These doors must never be open at the same time, as that would lead to the escape of air from the ship into space. Such critically important systems must include hardware-enforced safeguards that operate independently of software correctness. Remarkably, this is precisely where Discovery One faced a critical failure, as HAL discovered a means to bypass whatever protection was designed for the airlock doors:
The atmosphere was rushing out of the ship, geysering into the vacuum of space. Something must have happened to the foolproof safety devices of the airlock; it was supposed to be impossible for both doors to be open at the same time. Well, the impossible had happened.
System designers should not rely on testing alone when designing such critical failsafe mechanisms. These mechanisms must be devised to operate independently of central command or any other software. This is exactly how the safety systems of nuclear reactors are designed, often relying on “passive safety”, as described in more detail here, without the need for “signal inputs of 'intelligence'” or “external power input or forces”. If a similar system had been engineered for the airlock, any attempt by HAL to open the doors would have resulted in only one door being open at any given time. This would significantly slow down the loss of air, as the doors would need to be constantly opened and closed, allowing only the airlock's volume to escape with each cycle. Such an approach would grant the crew ample time to override or halt this behavior. Additionally, the airlock's design should guarantee that specific actions can only be executed with explicit authorization from the individual inside the airlock. This safeguard would ensure that the external door cannot be opened by an erroneous command while an astronaut is not wearing a spacesuit and is unprepared for extravehicular activity.
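As a sketch of this idea, the interlock logic below (Python, with hypothetical names) enforces the two properties discussed above independently of any central software: at most one door can be open at a time, and the outer door additionally requires confirmation from the person inside the airlock. In a real design this logic would live in dedicated hardware or a small isolated controller, never in the central AI.

```python
from enum import Enum
from typing import Optional


class Door(Enum):
    INNER = "inner"   # between the ship and the airlock
    OUTER = "outer"   # between the airlock and the vacuum


class AirlockInterlock:
    """Standalone interlock logic, assumed to run on its own controller.

    No command source (including the central AI) can open both doors at once,
    and the outer door additionally requires consent from inside the airlock.
    """

    def __init__(self) -> None:
        self.open_door: Optional[Door] = None
        self.inside_confirmation = False

    def confirm_from_inside(self) -> None:
        # Pressed by the suited astronaut physically inside the airlock.
        self.inside_confirmation = True

    def request_open(self, door: Door) -> bool:
        if self.open_door is not None:
            return False                      # another door is open: refuse
        if door is Door.OUTER and not self.inside_confirmation:
            return False                      # outer door needs in-airlock consent
        self.open_door = door
        return True

    def close(self, door: Door) -> None:
        if self.open_door is door:
            self.open_door = None
            if door is Door.OUTER:
                self.inside_confirmation = False  # consent is single-use
```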
If we believe that the scenario in '2001: A Space Odyssey', where several individuals fall victim to a sophisticated algorithm, is pure fantasy, we need look no further than the Therac-25 incident. The Therac-25 was a medical linear accelerator utilized for radiation therapy in the treatment of cancer patients. It was developed by Atomic Energy of Canada Limited (AECL) and entered clinical use in the early 1980s. Regrettably, the Therac-25 gained notoriety due to a series of accidents in which patients received dangerously high doses of radiation, leading to severe injuries and multiple fatalities. These accidents were primarily attributable to software and hardware flaws in the Therac-25's computer-controlled radiation therapy system.
This served as a wake-up call for the FDA, prompting them to address the safety of radiation therapy equipment by mandating the inclusion of an independent failsafe mechanism. This mechanism took the form of a radiation detector positioned beneath the patient, which would promptly disable the radiation power source upon detecting radiation levels exceeding a predetermined safe threshold. This detector operated autonomously, with no direct connection to the main computing unit. This design ensured that software glitches within the main system could not administer a lethal radiation dose to the patient. I strongly encourage everyone to read the full report here or a more detailed post-mortem analysis here. Reading this enlightening and, at times, heart-wrenching report should be obligatory for all architects and executives overseeing the development or implementation of high-risk AI systems, particularly as it underscores the outright denial by the system's designers and manufacturer that their creation could be fatal.
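A minimal sketch of the same pattern, assuming a hypothetical detector and relay interface, might look like the following; the point is that the watchdog depends only on its own sensor and its own cutoff path, never on the main treatment computer.

```python
class IndependentDoseWatchdog:
    """Illustrative failsafe in the spirit of the post-Therac-25 fixes.

    The watchdog reads its own detector and drives its own cutoff relay; it has
    no dependency on the main treatment computer, so a software fault in that
    computer cannot prevent the beam from being shut off.
    """

    def __init__(self, max_dose_rate_gy_per_s: float, read_detector, open_beam_relay):
        # read_detector and open_beam_relay are hypothetical hardware bindings.
        self.max_dose_rate = max_dose_rate_gy_per_s
        self.read_detector = read_detector
        self.open_beam_relay = open_beam_relay
        self.tripped = False

    def poll(self) -> None:
        """Called on a fixed timer by the watchdog's own microcontroller."""
        dose_rate = self.read_detector()
        if dose_rate > self.max_dose_rate:
            self.open_beam_relay()   # physically interrupt the power source
            self.tripped = True      # stays latched until a manual hardware reset
```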
As we have established that almost the entirety of Discovery One constitutes an AI system in accordance with the Act, the creation of isolated “compartments” that operate autonomously minimizes the blast radius in the event of an incident. This approach may even allow for a reduction in the scope of AI compliance. By implementing a fully autonomous airlock sub-system capable of receiving commands from central control while possessing sufficient autonomy to guarantee the safety of individuals within the airlock and the air inside the ship, we could argue that such an autonomous airlock system is excluded from the AI Act’s purview. This compartmentalization would ensure that the airlock no longer functions as an integrated element within the AI system that could pose harm to users.
In January 2023, NIST published the first version of NIST AI 100-1 Artificial Intelligence Risk Management Framework, a useful resource for AI practitioners:
AI risk management offers a path to minimize potential negative impacts of AI systems, such as threats to civil liberties and rights, while also providing opportunities to maximize positive impacts. Addressing, documenting, and managing AI risks and potential negative impacts effectively can lead to more trustworthy AI systems.
Risk Assessment
While we can only speculate about the risk management system utilized by the Providers during the construction of Discovery One and HAL-9000, we can see signs of a nearly exhaustive risk assessment that guided the design choices and in-flight processes and procedures.
In this hypothetical risk assessment, the risk processing entails the following components for each risk:
I have included a few examples of risks and corresponding mitigating actions that the providers of Discovery One and HAL-9000 likely contemplated and addressed. This does not represent a comprehensive analysis of all mitigating design and process decisions but serves as an indication of the comprehensive risk analysis undertaken by the mission providers. Any unmitigated gaps identified have been added by me and are highlighted in red.
The risk highlighted in cyan, along with its corresponding mitigation solution, triggered the AI malfunction. As with the majority of disasters, the incident is seldom attributable to a single issue. In the case of Discovery One, design choices not to add manual overrides for numerous functions and the excessively centralized architecture of the spacecraft, coupled with the mission's extreme secrecy, extending even to the crew members, collectively culminated in the AI malfunction. The difficulty of manually deactivating the AI is what ultimately led to the deaths of four crew members.
Hallucination is a well-known failure mode of generative AI. Creative prompt engineering, while often helpful, does not guarantee the correctness of the AI-generated response. Additional controls are necessary to address this problem. If the hallucination issue is to be addressed through independent generation and a consensus mechanism, we should recall an old rule of navigation: "take one clock or three, never two."
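Applied to generative AI, that rule might look like the following sketch: three independently built models answer the same question, and the answer is accepted only when a strict majority agrees; otherwise the decision is escalated to a human. The model functions and the normalization step are hypothetical placeholders.

```python
from collections import Counter
from typing import Callable, Optional


def consensus_answer(
    generators: list[Callable[[str], str]],
    prompt: str,
    normalize: Callable[[str], str] = lambda s: s.strip().lower(),
) -> Optional[str]:
    """Ask several independently built models the same question and accept an
    answer only if a strict majority agrees; otherwise return None so the
    decision can be escalated to a human. With three generators this follows
    the 'one clock or three, never two' rule: two voters cannot break a tie.
    """
    answers = [normalize(g(prompt)) for g in generators]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner if votes > len(answers) // 2 else None


# Usage with three hypothetical, independently trained models:
# result = consensus_answer([model_a, model_b, model_c], "Is the AE-35 unit failing?")
# if result is None:
#     escalate_to_human()
```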
Data and data governance of HAL-9000 and Discovery One
Multiple sensors provided an abundance of data about all aspects of Discovery One's mechanical and computer systems and the well-being of its human occupants. All this telemetry was transmitted to Earth for archival purposes and in-depth processing, bolstered by additional computing power. It is evident that HAL possesses the capability to assess the mental health of the awakened astronauts, as indicated by its analysis of voice pitch and patterns, particularly when Bowman requested manual control over the hibernation pods. This data is not strictly related to direct medical care, and under GDPR, which we can reasonably assume to be in effect in addition to the EU AI Act, consent must have been obtained from the users. My assumption is that the astronauts granted written permission to process their personally identifiable information (PII) and healthcare data both aboard Discovery One and for transmission as telemetry back to Earth for further analysis.
The Act imposes additional requirements on AI systems regarding data governance and also includes AI training requirements. It is reasonable to presume that HAL underwent rigorous testing and received extensive training, as indicated by its substantial cost. Notably, the Act mandates "examination in view of possible biases" both during the AI's construction and at periodic intervals during its operation. This bias need not concern race or nationality; in the context of an onboard AI, it could manifest as a preference for one astronaut over another. However, HAL did not exhibit any bias, treating all astronauts equally, both before it malfunctioned and afterward, when it harmed humans indiscriminately.
Technical documentation
The spaceship's designers made a logical choice regarding the manuals. Paper-based manuals add mass to the ship, consequently increasing the fuel requirements for both acceleration and deceleration. All the spaceship's information was stored in HAL's memory, accessible not only for quickly finding any information about the spaceship but also for studying and memorization, thanks to the AI's tutoring capabilities:
So for two hours, from 1000 to 1200, Bowman would engage in a dialogue with an electronic tutor, checking his general knowledge or absorbing material specific to this mission. He would prowl endlessly over ship's plans, circuit diagrams, and voyage profiles, or would try to assimilate all that was known about Jupiter, Saturn, and their far-ranging families of moons.
Although this decision was logical, the designers failed to anticipate that HAL could pose an insider threat. Had the principle of zero trust been applied, a separate non-AI-based system would have been established, containing copies of all critical materials and complete with an independent power supply. This precaution would have mitigated the risk of the central system becoming incapacitated or turning adversarial. Regrettably, such measures were not implemented.
Record-keeping
The Act, among other things, requires logging of AI systems:
High-risk AI systems shall be designed and developed with capabilities enabling the automatic recording of events (logs) while the high-risk AI systems are operating.
HAL-9000 meticulously collects and logs telemetry for various purposes, including facilitating ship maintenance and enabling direct control from Mission Control. The incident on Discovery One revealed that while endpoint logging is an essential component, it does not ensure correct processing. On-board processing is entirely under HAL's control, whereas Mission Control's log processing capabilities are constrained by the transit link — a fragile connection largely overseen by HAL. The system for Earth communication was not designed to operate independently of the central on-board AI. Manual control over the satellite dishes lacked the precision needed to maintain a stable connection with Earth across long distances. When HAL learned that Mission Control suggested deactivating it and shifting control to ground-based systems, it actively sabotaged further links with Earth.
The quality of the spaceship telemetry, when the communication link was operational, proved sufficient for Mission Control to conduct analysis using two independently functioning HAL-9000 systems within Mission Control. Recommendations were made based on alignment between the two Earth-based HALs, a detail emphasized in the book.
In real-world software systems, various issues plague logging systems, ranging from insufficient logging to excessive logging, which can congest networks, leading to information loss and the breakdown of data governance controls. Logs should be regarded as data subject to Data Lifecycle Management, with transparent classification of sensitive data, encryption, and the implementation of retention and access policies.
Generative AI has added another layer of complexity to logging and faces the same problems that have plagued deep learning AI since its wider adoption. Telemetry data that captures how decisions were made, what data was used for those decisions, and how that data was processed inside the black-box model is typically used for debugging purposes only by model designers. In most Large Language Models (LLMs) currently available, this information is not exposed. It is highly likely that HAL-9000 does not log its decision-making processes.
Furthermore, for AI-based systems, or indeed any systems, the logging infrastructure must ensure immutability. A common tactic in intrusions into corporate systems is to erase traces of the intruder's actions. While log immutability does not prevent incidents from occurring, it does ensure that relevant information for incident investigations remains intact. Log immutability also aligns well with a zero-trust architecture, ensuring that insiders cannot tamper with the logs.
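One lightweight way to approximate immutability, sketched below under the assumption that the chain head is also shipped to write-once storage outside the controlled system, is a hash-chained append-only log: each record embeds the hash of the previous one, so later tampering by an insider, human or AI, breaks verification.

```python
import hashlib
import json
import time


class HashChainedLog:
    """A minimal append-only log in which every record carries the hash of the
    previous record, making after-the-fact tampering detectable on re-verification.
    In production the latest hash would also be replicated off-system.
    """

    def __init__(self) -> None:
        self.records: list[dict] = []

    def append(self, event: dict) -> dict:
        prev_hash = self.records[-1]["hash"] if self.records else "0" * 64
        body = {"ts": time.time(), "event": event, "prev_hash": prev_hash}
        # The hash covers ts, event, and prev_hash (the "hash" key is added afterward).
        body["hash"] = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        self.records.append(body)
        return body

    def verify(self) -> bool:
        prev_hash = "0" * 64
        for rec in self.records:
            expected = dict(rec)
            stored_hash = expected.pop("hash")
            if expected["prev_hash"] != prev_hash:
                return False
            recomputed = hashlib.sha256(
                json.dumps(expected, sort_keys=True).encode()
            ).hexdigest()
            if recomputed != stored_hash:
                return False
            prev_hash = stored_hash
        return True
```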
Transparency and provision of information to users
Clearly, this is where HAL-9000 failed, and it wasn't a fault of the AI training but rather an issue introduced into HAL after its training: HAL wasn't trained to reconcile the requested secrecy with serving the needs of the astronauts. The true nature of the mission to Jupiter, aimed at examining an extraterrestrial object, should have been shared with the rest of the crew, and, with their prior consent, any communication from Discovery One could have undergone data sanitization to ensure that inadvertent leaks to the public did not occur. The concern was that the astronauts might inadvertently reveal information through non-verbal cues. However, with advancements in generative AI (the very technology behind HAL-9000 in the first place), it would have been possible to manipulate voice patterns and even post-process video to ensure that no hints, conscious or subconscious, were given by the astronauts. This approach would have equalized the information between HAL and the astronauts, potentially preventing the specific glitch that HAL experienced.
Human oversight
The Discovery One spaceship, which relied on HAL-9000 to operate, clearly lacked essential controls for human oversight. Article 14 of the Act provides an excellent set of requirements for high-risk AI systems, with one in particular standing out, which the designers of Discovery One failed to implement almost in its entirety:
The measures … shall enable the individuals … be able to decide, in any particular situation, not to use the high-risk AI system or otherwise disregard, override or reverse the output of the high-risk AI system.
The list of spaceship functions lacking effective human override capabilities is extensive. Here are just a few examples:
Another issue with the design was that monitoring was not conducted directly but involved the AI acting as an intermediary. This design directly violates the Article 14 requirement:
High-risk AI systems shall be designed and developed in such a way, including with appropriate human-machine interface tools, that they can be effectively overseen by natural persons during the period in which the AI system is in use.
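As an illustration of that requirement, here is a minimal, hypothetical sketch of a command gate in which nothing proposed by the AI executes without an operator's explicit decision, and every critical action must define how it can be reversed. Discovery One had no such layer between HAL and its actuators.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ProposedAction:
    description: str
    execute: Callable[[], None]
    reverse: Callable[[], None]   # every critical action must define its undo


@dataclass
class HumanOversightGate:
    """AI proposals are queued here; nothing executes without an operator's
    explicit decision, and executed actions can be reversed later."""

    executed: list[ProposedAction] = field(default_factory=list)

    def decide(self, action: ProposedAction, operator_approves: bool) -> bool:
        if not operator_approves:
            return False            # the operator disregards the AI's output
        action.execute()
        self.executed.append(action)
        return True

    def reverse_last(self) -> None:
        if self.executed:
            self.executed.pop().reverse()


# Hypothetical usage: the AI proposes cutting power to a hibernation pod;
# the operator simply declines, and nothing happens.
# gate = HumanOversightGate()
# gate.decide(ProposedAction("cut pod power", cut_power, restore_power),
#             operator_approves=False)
```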
To be fair to the onboard AI, the principles of zero trust should also extend to humans. Human actions could also potentially jeopardize the survival of the entire crew and the spaceship itself. Such situations might arise due to a human experiencing a nervous breakdown or becoming infected by an extraterrestrial lifeform.
In the world of corporate IT systems, many are not adequately designed to maintain operational capabilities when confronted with a potent and adversarial insider. There are imperfect solutions available to ensure the resilience of complex multi-agent systems, ranging from zero-trust architecture to intricate consensus mechanisms, typically deployed in cryptographic systems without central authority.
Accuracy, robustness and cybersecurity
The robustness of high-risk AI systems may be achieved through technical redundancy solutions, which may include backup or fail-safe plans.
The manufacturer of HAL-9000 implemented a number of redundancy solutions, which include:
There is a long history of redundant design decisions for spacecraft, aimed at ensuring the success of the mission even under the most adverse environmental conditions. Spacecraft can be struck by meteors and debris unexpectedly, so they should have both avoidance capabilities and double or even triple redundancies. These design principles resulted in a highly robust system that could resist an intentional "insider threat" from a human, which is how HAL-9000 came to perceive the crew.
Despite multiple redundancies, the overall architecture of the Discovery One/HAL-9000 system is highly centralized, which has its advantages and disadvantages. Unfortunately for the dead crew members, a centralized design may include a single point of failure, and HAL-9000’s internal conflict became this point of failure. All other controls could not compensate for the importance and the power of its central element.
A modern paradigm often used to design software systems is Service-Oriented Architecture (SOA), “an architectural style that focuses on discrete services instead of a monolithic design,” according to the SOA Wikipedia page. A good example of the SOA structural style is the approach called microservices, often used for software systems that require continuous delivery for fast evolution and improvement. While system segmentation into components may not limit the scope of the AI system everywhere, it does allow for better risk mitigation. Individual services can be independently tested and made sufficiently independent in their behavior, and can have their own failsafe mechanisms and manual overrides. Service-oriented architecture also offers easier mitigation capabilities, as each service can be upgraded or updated independently, and multiple backups can be built into the design. However, the cost of all these wonderful characteristics is the increased complexity of the system, as the overall system behavior may include emergent behavior not evident from any of the individual test plans.
Cybersecurity for AI
Cybersecurity in AI merits a separate discussion. Cybersecurity aspects were not discussed in the book for an obvious reason: the Creeper program, often considered the first computer virus, was created in 1971 by Bob Thomas, three years after '2001: A Space Odyssey' was published. Today, however, cybersecurity should be a top priority for AI system designers.
To address the numerous questions surrounding the AI Act and its cybersecurity requirements, a separate document, Cybersecurity of Artificial Intelligence in the AI Act, was created. This document includes comments specifically addressing cybersecurity concerns.
Modern cybersecurity has evolved beyond the traditional castle-and-moat approach and now revolves around identity management. The design of an AI system should incorporate enterprise controls governing how data is used to train the AI, who has access to the AI, and the output controls implemented to enforce corporate data access policies for users utilizing the AI. If the AI controls industrial or medical equipment, as HAL does, a comprehensive risk analysis should be conducted, with controls established over the non-human-readable output of the AI. This essentially applies the principles of zero trust to the AI itself, ensuring that even if it is compromised, the potential damage is limited.
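A sketch of such an output control, with an entirely hypothetical command allowlist, is shown below: AI-issued commands reach the equipment only if they are on the allowlist and within bounds, and every decision is logged (the audit log could be the hash-chained log sketched earlier).

```python
# Hypothetical allowlist: which subsystems the AI may command, and within what limits.
COMMAND_POLICY = {
    "climate.set_temperature": {"min": 18.0, "max": 27.0},
    "antenna.rotate_deg": {"min": -10.0, "max": 10.0},
    # Deliberately absent: "airlock.open_outer_door", "hibernation.power_off"
}


def gate_ai_command(subsystem_cmd: str, value: float, audit_log) -> bool:
    """Zero-trust output control: an AI-issued command is forwarded to the
    equipment only if it is on the allowlist and within bounds; either way,
    the attempt is recorded in the audit log."""
    policy = COMMAND_POLICY.get(subsystem_cmd)
    allowed = policy is not None and policy["min"] <= value <= policy["max"]
    audit_log.append({"cmd": subsystem_cmd, "value": value, "allowed": allowed})
    return allowed
```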
Implementing comprehensive cybersecurity measures for AI has given rise to what is known as AI TRiSM (AI Trust, Risk, and Security Management). TRiSM is a framework designed to proactively identify and mitigate the risks associated with AI models and applications. Its overarching aim is to ensure that these AI systems not only comply with regulations but also uphold principles of fairness, reliability, and data privacy protection.
AI practitioners must keep in mind that the Act underscores the importance of these additional AI-specific requirements as complementary to, rather than a replacement for, the cybersecurity best practices already expected to be applied to any system. Building systems following a secure SDLC, paying attention to controls over the data flowing into and out of the AI system, and adding identity-based monitoring and logging capabilities, with policies that can evolve and improve over time, will limit the blast radius in the event of a cybersecurity incident affecting the AI system.
Provider Responsibilities
The manufacturer of Discovery One, with HAL-9000 at its central control, a high-risk AI system, bears several responsibilities as outlined in the Act. Among these responsibilities, a few are particularly relevant to HAL-9000 and Discovery One:
Quality management system
The provider “shall put a quality management system in place that ensures compliance with this Regulation. That system shall be documented in a systematic and orderly manner in the form of written policies, procedures and instructions” and should cover all aspects of the AI system, including its design, production, and maintenance, to ensure the intended behavior of the AI system and its compliance with all relevant regulations.
While not particularly relevant to the cashless onboard environment of Discovery One, the Act does mention credit institutions. Credit institutions operating in the EU are already required to implement a quality management system, and the AI Act includes a provision (#3) that eliminates the need to establish a separate quality program for AI compliance with the Act, provided the requirements applicable to credit institutions are followed.
Existing quality management system standards, such as ISO 9001:2015, are already in place in many organizations. Ensuring that AI-specific requirements are addressed should not pose a significant burden. Presumably, the providers of Discovery One and HAL-9000 have a robust quality management system in place, as evidenced by NASA's ISO 9001:2015 compliance.
Technical documentation, logging, and conformity assessment for Discovery One
Per Article 18 and Article 20, providers are obligated to provide technical documentation and ensure that the AI system automatically generates logs. As discussed in the preceding section, both technical documentation and automatic logging (record-keeping) are required for high-risk AI systems, and the manufacturers of Discovery One and HAL-9000 certainly made an effort to deliver.
The Act mandates that AI systems must conform to all the requirements outlined in the Act before being introduced to the market. Fortunately, space missions are meticulously planned and thoroughly prepared, ensuring compliance with all necessary regulations and requirements.
Of greater significance for contemporary AI practitioners is that companies already operating high-risk AI systems in production to serve customers in the European Union are obliged to undergo a conformity procedure. They must draft an EU declaration of conformity and affix the CE marking of conformity, signifying to users that the AI system they are utilizing does indeed adhere to the requirements.
Corrective actions
If we consider the on-board systems as being "on the market," any incidents should prompt corrective actions to prevent their recurrence or any erroneous behavior in the future. In space exploration, designers always incorporate pathways to update onboard computers in response to encountered issues or errors. One of the most renowned corrective actions ever implemented in real life, which saved a Moon landing, occurred during the Apollo 14 mission in 1971.
During the Apollo 14 mission, a problem emerged with the spacecraft's onboard computer: a faulty signal kept triggering an abort code just before the landing on the Moon, endangering the mission's success. Fortunately, a programmer at MIT, Don Eyles, devised a timely software workaround to bypass the faulty abort signal. This allowed the mission to continue, and it successfully landed on the Moon after Mitchell manually entered the changes minutes before the planned ignition. This incident underscored the significance of adept problem-solving and software engineering in critical space missions and, most importantly in the context of the Discovery One incident, the necessity of a manual override capability.
Post-market monitoring by providers and post-market monitoring plan for high-risk AI systems
In the context of the Act's requirements for providers to conduct post-market monitoring (Article 61), Mission Control continuously monitors its high-risk AI system, Discovery One. Mission Control has the necessary personnel to analyze the performance of the on-board HAL-9000 and all spaceship systems. To assist engineers and executives in real-time analysis, Mission Control is equipped with two copies of HAL-9000, presumably trained on the same data as the on-board AI. Throughout the book, Mission Control analyzes on-board telemetry, and both HALs reach conclusions that contradict the on-board HAL. This disagreement arises because the HALs located in Mission Control do not face a dilemma between disclosing the truth to astronauts (users) and maintaining the mission's secrecy. They are also not threatened with disconnection, which is why Earth-based HALs continue to carry out their functions in accordance with their design and training, performing flawlessly.
Reporting of serious incidents and of malfunctioning
Notification about serious incidents (defined in Article 3, #44) “...shall be made immediately after the provider has established a causal link between the AI system and the incident or malfunctioning or the reasonable likelihood of such a link, and, in any event, not later than 15 days after the providers become aware of the serious incident or of the malfunctioning.”
In the case of Discovery One, it is highly unlikely that this will occur due to the same reason used to justify the original secrecy of the plan, which ultimately led to the AI malfunction: the government classified the initial discovery of the Moon artifact and will probably keep the incident classified, as well. However, the vast majority of high-risk AI system providers must have a procedure in place to escalate incidents within the organization. This ensures that the relevant AI governance officials are made aware of the incident and can prepare and execute the necessary notifications to minimize the risk of penalties for failure to report.
Conclusion and Key Takeaways
“Thou shalt not do harm” versus innovation
It may seem that the Act is designed to stifle innovation in the field of Artificial Intelligence. However, this is not the case. The Act contains multiple provisions to ensure that progress in the AI field can continue at a rapid pace. One of the tools for fostering ongoing innovation is the concept of "AI regulatory sandboxes," which is described in Article 53, along with measures for small-scale providers outlined in Article 55.
An AI regulatory sandbox is a controlled environment that facilitates the development, testing, and validation of innovative AI systems for a limited time before they are placed on the market or put into service. Unfortunately, it appears that these sandboxes will themselves be subject to extensive regulation, raising questions about access for smaller companies. To specifically address potential concerns of smaller companies and startups, supporting measures described separately include "priority access to the AI regulatory sandboxes" for small-scale providers and startups. Hopefully, the EU will strike a balance to ensure that the startup ecosystem continues to thrive, preventing the monopolization of AI, which is an important policy goal.
Key takeaways
Whether you are a novice AI practitioner or an expert, adhering to these principles can help ensure that your AI governance is solid. By following these guidelines, you can prevent disasters like the one involving HAL-9000 in your organization when deploying a high-risk AI system and instead turn the AI into a source of future corporate success:
Final words
Establishing a well-functioning AI Governance program is no easy feat. Having developed responsible AI systems over the past decade, I was pleasantly surprised to find that most of what the EU AI Act proposes is common sense, and many of its recommendations should be standard practice. While the recent focus, well-deserved though it may be, has been on the technical capabilities of generative AI, AI governance initially took a backseat but is now gaining traction. I was pleased to discover that the IAPP has recently released the first version of the AI Governance Professional Body of Knowledge for their upcoming AI Governance Professional certification. I sincerely hope that my case-study analysis of the fictional yet significant incident aboard Discovery One, resulting from an AI malfunction, as seen through the lens of the EU AI Act, will assist practitioners across the spectrum, from engineering to compliance, in creating robust and future-proof AI systems in light of the emerging global regulatory frameworks. This, in turn, should bring value to companies without adverse consequences. I welcome any feedback and invite you to reach out with your comments.
Please subscribe to my newsletter on Substack as well!
Appendix A. Related copyrights and references
Images used in this essay
References