Exploring the Demise of Human Space Flight at NASA
Fifty years ago an iconic photo of Earth was taken while orbiting the Moon. Humans were farther from Earth than they had ever been in our history. The bravery of three men, the dreams and aspirations of a nation, and the ingenuity of an organization of scientists, explorers, and engineers made this picture possible.
In the shadow of this history, the NASA of today does not fly our astronauts into space. Commercial rockets and the Russian space agency Roscosmos fly our astronauts and materials to the ISS. Why? How did we neuter one of the most creative and forward-leaning organizations in human history?
This article looks at the cultural, environmental, and external factors that shaped NASA and the history of human space flight. These factors played an important part in the loss of two shuttles with all hands. Environmental and external factors formed and kept the pressure on a flawed leadership, communication, and ethical culture. This pressure ensured that, despite organizational changes, the same culture that enabled the shuttle Challenger disaster also enabled the tragic loss of shuttle Columbia.
Lawmakers and the administrations of seven presidents also had a hand in both accidents by demanding faster and cheaper without making allowance for risk or failure. Even a perfect organization would struggle to perform under these conditions, and NASA, like most organizations, was far from perfect.
History as a guide
When President Kennedy asked the American people and Congress to commit to putting a man on the moon inside a decade, NASA had about 15 minutes of human space flight experience. Going to the moon and the space race with the Soviet Union were seen by the men and women of NASA as essential to the success of America. NASA engineers, scientists, and mission specialists overcame seemingly impossible challenges during the Apollo era. The culture that put a man on the moon valued competence, character, willingness to accept risk and failure, brutal honesty, and, most of all, attention to detail.
The successful lunar landing in 1969 cemented NASA's belief in the strength of its culture. Sidney Dekker (2015), in his book Safety Differently, highlighted this effect as a major cultural flaw: "The institutionalized fiction was of NASA as an organization destined for success and incapable of failure - negating the budgetary and expertise gap that had opened wide since Apollo times" (p. 259). With their self-view anchored in the successes of the Apollo program, leaders did not consider the effect of external forces on the actual working conditions at NASA. Success makes us vulnerable to failure and can impede organizational learning. Throughout the 1970s and 80s NASA lost about 40 percent of its budget, lost many of its technically competent senior managers, and increasingly relied on the technical competence and engineering experience of the contractors that supported the nation's space program.
Enter the shuttle program: a complex and high-consequence technology operating in an unforgiving environment, on a tight timeline, with an ever-tighter budget. Throughout the 1990s, budget reductions to the shuttle program, pressure to serve the International Space Station (ISS), and cuts to the shuttle workforce made to preserve core bureaucratic NASA functions and reduce costs further borrowed from safety and put the ten space centers and their various pet projects at odds with each other over resources. External resource pressure and internal pressure to deliver faster, cheaper, and better manifested in a willingness to normalize deviance. Managers and engineers felt exceptional pressure to minimize the impact of deficiencies and problems on shuttle operational schedules. The machines themselves were aging, and shuttle schedules to support the ISS slipped.
In late 2001, President Bush appointed a new Administrator for NASA, Sean O'Keefe, a former Secretary of the Navy and, at the time, Deputy Director of the Office of Management and Budget (OMB). O'Keefe was a bureaucrat, not an engineer, and he went in to clean house; by 2001 the ISS was already 4 billion dollars over budget, which also put great pressure on the Space Shuttle Program budget. "O'Keefe's experience was in management and bureaucratic administration of large government programs – a skill base deemed important by the new Bush administration for the NASA top job" (Dekker, 2015, p. 253).
O’Keefe’s appointment sent a powerful message to the team at NASA: cost and schedule mattered. And to the engineers on the deckplate, the ones closest to the work, the unspoken message was that cost and schedule mattered MORE than quality and safety. NASA’s once-prized cultural deference to expertise was replaced by bureaucratic accountability and deference to hierarchy, procedure, and chain of command. President Bush and his staff put NASA on probation, with hard launch dates to support the ISS. If NASA could not keep up, the ISS and shuttle programs would be reconsidered.
Against the backdrop of these constraints and pressures, the shuttles kept flying. Two launches before the loss of shuttle Columbia, a large piece of foam came off at launch and hit a solid rocket booster. The solid rocket boosters were recovered, and engineers could see that the damage was significant. When a major piece of debris comes off and strikes part of the shuttle assembly, NASA's own rules demand that the deficiency be classified as the highest level of anomaly, requiring serious engineering work to explain it away. This had happened on only seven of 111 launches (CAIB, 2003, pp. 125-126). But people at NASA understood that if they classified this event as a serious violation of their flight rules, they were going to have to stop and fix it. So they classified it as a simpler mechanical problem, avoiding the extra precautionary measures they would have had to take by accurately calling it an in-flight anomaly (a threat to flight safety).
Culture (Leadership, Communications, and Ethics)
NASA managers were not "rocket scientists." They were very intelligent men and women, but most key leadership positions were occupied by accomplished bureaucrats, not engineers or scientists. Sidney Dekker (2011), in his book Drift into Failure, wrote, "People who were the most technically qualified to make decisions about continued operation of the system got to make those decisions anyway" (p. 96).
NASA leaders violated their own first principles of quality and safety before production. Managers ignored or minimized the concerns of experienced engineers. Dissenting opinions were routinely quieted by managers with their own agendas. Engineering decisions were made by committee using PowerPoint: "One of the critical features of the information environment in which NASA engineers made decisions about safety and risk was “bullets”…the little black circles in front of phrases that were supposed to summarize things" (Dekker, 2011, p. 112). These technical presentations were heavily vetted by lower-level managers: "bulletized presentations, collapse data and conclusions and are dealt with more quickly than technical papers" (Dekker, 2011, p. 113).
Slideshows, with their highly summarized conclusions, overrode all other technical communications and became a shortcut and, frankly, a bad organizational habit: "They dominated technical discourse and, to an extent, dictated decision making, determined what would be considered as sufficient information for the issue at hand" (Dekker, 2011, p. 113). Senior managers and decision-makers accepted these briefs without the benefit of the underlying math and physics presented as proof of doing the right thing for the right reasons.
NASA’s organizational culture, or in layman’s terms "how business is done around here," accepted and lived with deficiencies in the shuttle program. The organization had an entire process to categorize issues and make the important ones visible to managers and decision makers. However, this process was subverted to minimize the impact of seemingly low-level but high-frequency issues in order to support an ever-tightening shuttle operational schedule.
Overconfidence led to leadership blind spots concerning communication with and oversight of contracted work. Strong communication systems did not exist, and leaders relied increasingly on oversimplified technical briefs and decision-making shortcuts. A belief that, after just a few flights, the shuttle was a mature and reliable delivery system with proven components allowed the shuttle mission management teams to compromise on safety in order to meet schedule. Finally, the Challenger accident in 1986 revealed that NASA had lost its ability to be self-critical and to react appropriately to external criticism. Instead of being self-reflective after the loss of shuttle Challenger, leaders clamped down on dissent and enforced the party line that NASA was great and always would be. This in turn led to a habit of deceit in which decisions, instead of honoring the principles of quality, safety, and belief in the truth, were beholden to faster, cheaper, and better (goals that are not attainable together).
In fairness to the very dedicated and talented NASA team that built and flew space shuttles, we should consider that ethical compromise is rarely a steep decline or a sudden drop. Rather, it is a series of small trade-offs made to get day-to-day business done, blind to a wider understanding of how the system is being affected. It is very difficult to see this drift from inside an organization, much as it is difficult for a navigator to know they have strayed off course without the consistent marker of the sun or the guidance of a compass. Yet most, if not all, of the people who comprised the NASA system had every intention of doing the right thing. Sidney Dekker (2011) described this phenomenon in his book Drift into Failure:
Linear sequences of cause and effects, where big causes account for big effects, can never fairly capture the workings of NASA and its subcontractors where lots of little things eventually set something large in motion…thousands of little assessments and daily actions…in which everybody came to work to do a good job…dealing with their own daily innumerable kinds of production pressures, their PowerPoint presentations and seemingly innocuous decisions about maintenance and flight safety. (p. 64)
Results
Shuttle Columbia launched, and a piece of foam struck the wing. This had happened before; in two of the three previous instances, foam hit the shuttle without causing significant damage (CAIB, 2003, p. 128). Shuttle management and engineering teams talked themselves into classifying the foam loss as a minor material maintenance problem, not a threat to safety.
The space shuttle Columbia was damaged at launch by a fault that had repeated itself in previous launches over and over again. Seeing this fault happen repeatedly with no harmful effects convinced NASA that something which was happening in violation of its design specifications must be safe. Success can narrow the perspective of people inside an event; they lose curiosity about the world around them, satisfied with the status quo. Individuals, well-meaning individuals, were swept along by the institution’s overpowering desire to protect itself. The system effectively blocked honest efforts to raise legitimate concerns. The individuals who did raise concerns faced personal, emotional reactions from the people who were trying to defend the institution. The social pressure bred by a culture of compromise effectively silenced any whistle-blowers.
NASA’s actual culture had diverged from its aspirational view of itself. Organizational values such as quality, safety, and accuracy were replaced by better, faster, and cheaper. This divergence affected everyone in the organization and drove individuals to believe they had to trade safety for mission accomplishment and timeliness. The accumulation of these behaviors and decisions ate deeply into safety until, finally, declining standards of engineering, rising risk, and the aging of the shuttle systems intersected to produce an accident.
The impact of the investigations
The Columbia Accident Investigation Board came down hard on NASA’s culture and ethical practices, surmising that human space flight could be safely conducted if only NASA behaved more like other High Reliability Organizations (HROs) such as Naval Reactors (1) or the SUBSAFE (2) program for U.S. Navy submarines (CAIB, 2003, pp. 182-184). This may not be an accurate view of NASA’s mission. Historically, NASA was formed during the Cold War as a response to the Soviet Union’s Sputnik. NASA was expected to take risks, and it had an unlimited budget and all of the scientific know-how of a country behind its Apollo program. In contrast to programs such as SUBSAFE or Naval Reactors, NASA was not born to be a safety-focused or regulatory organization. Professors Arjen Boin of Louisiana State University and Paul Schulman of Mills College (2008) explored this idea: “But unlike HROs, which have a clearly focused safety mission that is built around a repetitive production process and relatively stable technology, NASA’s mission has always been one of cumulatively advancing spaceflight technology and capability… nothing about NASA’s human spaceflight program has been repetitive or routine” (p. 1054).
NASA managers and the country’s political leaders did not understand this distinction. NASA operated in a constrained and politically charged environment. Leaders wanted success and results immediately, but were unwilling to accept the risk of losing equipment or crew. Quality work, with its checking and double-checking, required significant time and money to do right. Congress, the Department of Defense, and other stakeholders wanted rapid turnaround, flexible schedules, low costs, and perfect safety from the same program: “NASA’s political and societal environment, in short, has placed the agency in a catch-22 situation. It will not support a rapid and risky shuttle flight schedule, but it does expect spectacular results. Stakeholders expect NASA to prioritize safety, but they do not accept the costs and delays that would guarantee it” (Boin & Schulman, 2008, p. 1055).
It is easy to look at mistakes from the outside and assume the individuals or teams that made them were deficient in competence or character in some way. Blame those closest to the problem. To a great extent, the Rogers Commission for Challenger and, to a lesser extent, the CAIB for Columbia did just that, assigning blame and comparing NASA to HROs designed and operated for an entirely different purpose: “Caught in the middle of an unstable environment in which there is little tolerance for either risk or production and scheduling delays, NASA has become a condemnable organization — it is being judged against standards it is in no position to pursue or achieve” (Boin & Schulman, 2008, p. 1059). This blind spot contributed directly to the loss of a second space shuttle. After the Rogers Commission report, political leaders watched changes inside NASA and were satisfied that improvements were permanent and that business as usual could continue. Congress and successive presidential administrations continued to put pressure on NASA to perform and to “do more with less.” In this way learning at NASA was blunted; with all the same external pressures still being exerted, people were going to keep reacting the same way. There was no real hope for change in this environment.
The NASA of today is not much better. The CAIB imposed structural changes on NASA with more bureaucracy, increased reliance on hierarchy, and a new standard of being and acting as a learning organization. What the CAIB report and NASA’s leaders did not do was address the significant environmental and political pressures from stakeholders. Nor did they reorganize NASA to address the inter-center rivalries and infighting that impair decision making and slow the sharing of lessons learned. Finally, they failed to address the fear of failure and the toxic perfectionism that handicapped important safety innovation and bullied personnel into silence. If anything, this mindset has been magnified by the credibility NASA lost in the wake of the accident.
It is this fear of failure, and with it the layers of corrective actions, checks, procedures, tests, additional analysis, and paperwork that place additional pressure on an already burdened program, that will drive the next accident and that today makes human space flight an insurmountable task for NASA. The CAIB sought to build additional barriers to failure. What its members may not have realized is that these same barriers become new obstacles to getting the daily work done; each is one more thing to be worked around. Additional layers only serve to reinforce corrosive behaviors, such as denying responsibility and deferring ownership. The system takes over instead of empowering credible individuals with the authority, autonomy, and resources to solve problems.
Ultimately, operating complex technology requires a level of human performance (knowledge, discipline, integrity, and formality, to name just a few attributes) that is beyond what comes naturally; what the technology demands and what we expect of people is not necessarily consistent with human nature. Hence it is a never-ending challenge to apply the energy and attention needed to keep performance at the level required to minimize the occurrence of untoward events.
If the deaths of more than a dozen astronauts are not to have been in vain, the lessons from these tragedies have to be learned and shared at all levels: strategic, organizational, and individual. Stakeholders and operators of complex and high-consequence technology have to recognize and own the effects of their behavior. No system is entirely free of risk or designed to be inherently safe. Process, environmental, and personnel safety are not free. All require a real investment of resources: time, money, and talent. All require a deep commitment to organizational integrity and learning. If we are going to put people in space again using our own rockets, we all have to make these commitments.
Notes:
(1) Naval Reactors is the regulatory organization formed by ADM Rickover to oversee the operation of nuclear power plants onboard U.S. Navy submarines and aircraft carriers.
(2) SUBSAFE is the program the U.S. Navy put in place to assure the safe design, maintenance, and operation of submarines after the loss of USS Thresher in 1963.
References:
Boin, A., & Schulman, P. (2008). Assessing NASA’s Safety Culture: The Limits and Possibilities of High-Reliability Theory. Public Administration Review, Nov/Dec 2008.
Columbia Accident Investigation Board (CAIB). (2003). Columbia Accident Investigation Report. National Aeronautics and Space Administration (NASA), Washington, DC.
Dekker, S. (2011). Drift into Failure: From Hunting Broken Components to Understanding Complex Systems. Ashgate Publishing Company, Burlington, VT.
Dekker, S. (2015). Safety Differently: Human Factors for a New Era. CRC Press, Boca Raton, FL.