The ugly child of fear.
Where did the NASA MER rovers or the ESA Proba go wrong?
Easy: they kept working for 20x longer than they were designed for.
Wait, I hear you say: getting your money's worth is a good thing. Well, only if you ignore that habitually exceeding the specification comes at a cost.
It is a failure if a device lives 10x its design life. At the very least, it is a failure in the estimation of design life.
And worse, since this overengineering is driven by fear, it is one of the root causes that make the space industry both slow to innovate and expensive. Let's have a look!
The devious (development) cycle
Ever heard the saying: we need to increase the reliability because it's such an expensive mission? Probably, but would you agree that it is true the other way around, too: it's expensive because it's such a reliable mission?
Much money is spent to make space reliable. As a result, the mission becomes expensive. Because it's now expensive we cannot allow failure, so even more is spent to make it more reliable...
The psychology behind it:
Only if you achieve a success will there be a chance for another mission. This is true in space on all levels of industry and agency. In addition, with most space agency missions having a 4-5 year design cycle, nobody involved can do more than 4-5 of them during their career (until retirement).
On the other hand, hardly anybody has been criticised for being cautious, for doing that extra test and going the extra mile. So is it any wonder that we get solutions that are slow and expensive and tend to overperform?
Why is it bad when you get more:
Any parameter of your mission will generally follow an S-curve, with a characteristic knee point as an optimum.
It is generally very simple to achieve something that is in line with your peers. As soon as you get to the cutting edge, everything becomes expensive. This is true for technological parameters as well as for reliability and lifetime.
The optimum is never in the extreme.
It is much better to build what is actually required than to overengineer a problem.
So achieving 2x of what is required because you fear the small chance that your mission could underperform could lead to a 10x increase in effort and cost.
From "Failure is (not) an option"
The truth is in the numbers - the KISS principle
Professor Udo Renner, one of my former mentors, always asked his students: "If a part has 90% reliability, how do you achieve 99%?" Usually the student's answer is quick: "You take two parts in a redundant configuration." Prof. Renner then usually continued, "OK, but what if you want to achieve 100%?" This usually leads to some confusion on the side of the student, because even 10 redundant parts will not be unfailing; there is always a residual risk. This becomes even more apparent once you consider that the decision maker that switches between multiple units is a potential failure source, too.
Prof. Renner's answer to this problem is to keep it simple (and stupid):
"A part that is not there cannot fail.
In reality a simple system will often be more reliable than a complex one."
Prof. Udo Renner
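The arithmetic behind Prof. Renner's question can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not in the original text: failures are independent, all parts are identical, and the switch is modeled as one extra element with a hypothetical reliability of 99%. The function names are made up for this example.

```python
def parallel_reliability(part_reliability: float, n_parts: int) -> float:
    """Chance that at least one of n independent, identical parts works."""
    return 1.0 - (1.0 - part_reliability) ** n_parts


def switched_reliability(part_reliability: float, n_parts: int,
                         switch_reliability: float) -> float:
    """Same as above, but the switch selecting the unit must work as well."""
    return switch_reliability * parallel_reliability(part_reliability, n_parts)


PART = 0.90    # reliability of a single part, as in the question
SWITCH = 0.99  # assumed reliability of the switch (hypothetical value)

two_parts = parallel_reliability(PART, 2)
ten_parts = parallel_reliability(PART, 10)
two_parts_with_switch = switched_reliability(PART, 2, SWITCH)

print(f"one part:               {PART:.4f}")
print(f"two redundant parts:    {two_parts:.4f}")
print(f"ten redundant parts:    {ten_parts:.10f}")
print(f"two parts plus switch:  {two_parts_with_switch:.4f}")
```

With these assumed numbers, two 90% parts reach 99% on paper and ten parts still fall short of 100%. Once a 99% switch is included, the redundant pair drops to roughly 98%, which is the residual risk the question is pointing at.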
Unfortunately, while in reality many simple (single string) satellites have achieved surprisingly long lifetimes, it has become the norm to do very complicated analysis in the form of FMEA & FMECA to swat any potential single point failures.
As a result, the number of parts in a mission, and consequently its complexity, has increased over the last decades, making missions slower and more expensive.
Mission resilience vs reliability
If you build your satellites for reliability (on paper) they will become expensive. If one of them fails there will be no easy way to replace the capacity that you have lost. Whether that loss is caused by an anomaly or enemy action is of secondary importance. For this reason even users that are traditionally looking towards very reliable systems (such as defence) have come to understand that there is strength in numbers.
Note that the graph above shows the calculated reliability. In reality, almost all satellites are significantly more reliable than the numbers would suggest.
The space industry's tools to assess reliability do not work
An often ignored fact is that the current tools used by space agencies and old space industry to assess reliability do not work. They are excessively conservative. On the one hand, if you calculate reliability, low cost missions done using the KISS principle should never work past a few months in space, but they often do.
According to standard tools, Cubesats and non-ECSS micro satellites should never be able to live past a few months.
On the other hand, classically built missions that are designed for a 1 year life overshoot these values drastically. The ESA Proba satellites built by Qinetiq are notorious overachievers. So much so that Proba 1 is actually the longest living ESA spacecraft in orbit.
Let's remember that Proba was originally slated as a fast and low cost platform for in orbit demonstration.
Ideally, such a micro satellite IOD mission is repeated regularly, with high cadence (at least 1x per year), short turnaround time (<2 years from kick-off to launch) and at low cost (<150 KEUR/kg of IOD payload including launch and operation) to allow many new technologies to be tested. Based on these requirements, and putting aside for a moment the technological marvel that the satellites undoubtedly are, this mission is a failure.
3 Missions in 20 years
More than 20 MEUR per satellite
4 years from each kick-off to launch
ESA's Proba mission fails to deliver
I argue that overengineering driven by fear on ALL levels is the root cause of this.
The market price for a similarly capable 100 kg satellite outside the ESA environment is 10-20% of that of one Proba. Given that three Proba satellites have been built and launched in the last 20 years, this means that with similar expenditure ESA could have built and launched at least one satellite of this type every year. This would arguably have been a better use of European taxpayer money.
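A back-of-the-envelope check of this comparison, using only the figures quoted in this section. The assumptions are loud ones: 20 MEUR is treated as the cost of one Proba although it is given only as a lower bound, and the market price is taken as a flat percentage of that cost. All names are hypothetical.

```python
PROBA_SATELLITES = 3           # missions flown, from the list above
PROBA_UNIT_COST_MEUR = 20.0    # assumed floor ("more than 20 MEUR")
PROGRAMME_YEARS = 20           # period over which they were flown
MARKET_PRICE_PERCENT = (10, 20)  # market price as a share of one Proba


def satellites_per_year(price_percent: int) -> float:
    """Satellites per year that the Proba budget buys at market price."""
    budget = PROBA_SATELLITES * PROBA_UNIT_COST_MEUR
    unit_cost = PROBA_UNIT_COST_MEUR * price_percent / 100
    return budget / unit_cost / PROGRAMME_YEARS


rates = {pct: satellites_per_year(pct) for pct in MARKET_PRICE_PERCENT}

for pct, rate in rates.items():
    total = rate * PROGRAMME_YEARS
    print(f"at {pct}% of Proba cost: {total:.0f} satellites, {rate:.2f} per year")
```

Taken at face value, the same budget buys 15 to 30 satellites, or 0.75 to 1.5 per year. Since the 20 MEUR figure is a floor, the real numbers would sit higher.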
To "Failure is a necessity for progress"
For all the wrongs in the industry, there is also hope for improvement. Several trends and successes seem to indicate that a change in the fear mindset is possible.
Strength by numbers is at the core of Operationally Responsive Space (ORS)
Sometimes a good enough capability that is reliably there and can be replaced on short notice if needed is better than a state of the art system. This is the idea behind operationally responsive space (ORS) concepts, which are being driven by the United States military but have recently also found adoption in Germany with the RSC3 of DLR.
That said, rapid innovation and quick replacement have not only found applications in the military. Two of the most successful commercial space ventures of the past decade, Planet and SpaceX, made use of them.
14th generation in less than a decade (Rapid Satellite Design by Planet)
Planet, a company whose meteoric rise has recently been recognized with a $2.8B valuation in a SPAC merger, is a big proponent of quick iterations of satellite designs. Rather than trying to build one design to rule them all, they have always looked to improve their satellites generation by generation and in quick succession.
"Fly as you test, test as you fly" has never been as aggressively implemented as for the Dove satellites of Planet.
Rapid Unscheduled Disassembly (RUD introduced by SpaceX)
RUD or Rapid Unscheduled Disassembly is a term Elon Musk has popularized for when things go boom at SpaceX.
Interestingly, this has not only given onlookers some pretty spectacular fireballs, it has also delivered to the industry what was considered impossible before: liquid fly-back boosters.
"Failure is an option here. If things are not failing you're not innovating enough."
Elon Musk
What can we improve
As an outlook I would like to indicate a few things that all of us can improve to make our industry a less wasteful environment.
Reviews that identify and curb overengineering
As much as there are reviews to figure out why a mission has not worked or underachieved, there need to be reviews to identify root causes of gross overengineering. If you have a mission that is "required" and consequently "designed" for 1 year of operation and then it repeatedly "achieves" a multiple of that, you need to adapt your requirements or your design.
A process that delivers a deviation of 20x the design value needs to be overhauled
As an illustration, I would like to recall that Henry Ford regularly visited junk yards to inspect worn out Model T cars. When he found that, for example, the gear box would outlast the rest of the car, then, unlike Rolls Royce, who would increase the quality of the entire car, Ford would reduce the effort spent on the gear box (to reduce cost).
Henry Ford famously inspected car junk yards to figure out which parts of his cars could be built in LOWER quality!
For the future, satellites need to be more like the Ford Model T than a Rolls Royce Silver Ghost if they are to serve the great multitude.
Build for mission resilience
Like ORS, accept a good enough capability and build for mission resilience rather than building one satellite to rule them all.
Lose the fear of failure
The space industry needs to escape the trap in which higher cost leads to fewer missions and to a demand for higher reliability. The approach of building more satellites faster, which reduces the cost and consequently the fear of failure, is a good way forward.
SpaceX and Planet are examples of how innovation can be achieved if we (as an industry) leave behind our fear. This only works if there is a culture of accepting risk both at the agencies and at the mission primes.
Live up to the public image of space
Last but not least, let's remember that in the public eye space is seen as a place of innovation. Let's live up to that image before the public realizes how backwards and conservative most of the space industry really is.
How can you help:
This text is part of a series of articles in which the author sets the framework to start a discussion about the wrongs of the space industry. If you have experienced similar things, leave a comment. Other views and opinions are very welcome, too, as they may present a way forward. Please be kind to each other.
Disclaimer
The author’s views are his own and do not represent the views of his company Berlin Space Technologies.
Well written, Tom. I always marvel at how many project managers, program managers, contract managers, various lawyers, negotiators, planning consultants, controllers and all sorts of other bureaucrats are paid fixed salaries, funded by taxpayers, just to disburse the meager funding for small missions that are eventually put together by a small bunch of enthusiastic engineers (or free labor students) working in a pseudo-garage. Even without all those reliability studies, these missions still have a very high chance of functioning in space for many years, once a few failures have been digested, mistakes corrected and the politics carefully dodged. However, if something works for a long time and does not break at the expected EOL, it does not mean it was always overengineered. It could simply have been designed, made and used well. The modern buzzword "sustainable" applies... you do not always have to throw away stuff that can still work. The best indicator that counts in this context, not only in space, is: (People Actually Making)/(Everybody Somehow Involved). My personal mission at Microspace Rapid Pte Ltd has always been to show how much we can achieve in space while we bring down BS (you surely know what that is...) as close to zero as possible.
My favorite #spacedonewrong (related to this article) was the Cluster mission. Four spacecraft that all had to work for the 4D science to be performed. So each satellite got more and more redundancy, complexity, and of course mass, and so ended up having to be flown on the first Ariane 5, and it didn't end well. The company I was working with at the time did a paper at the 4S Conference that basically showed what every car driver knows: if you have a spare tyre, you are virtually guaranteed to get from A to B. A 5 satellite system with little or no redundancy on each one would likely have been more reliable than 4 fully redundant satellites.
Guiding teams in satellite subsystem development for communication, radar, and navigation
3 years ago: Tom, here in Russia we also feel that highly reliable satellites with long lifetimes have slowed down our industry in two ways. First, we need to meet strict requirements, which takes more time. Second, by the end of the mission the satellite is in good condition but outdated, and because of its good condition we cannot launch or even develop a new one. I wanted to add my opinion on this topic. To fully implement the idea of building 10x cheaper sats we need 10x cheaper launches. As far as I know, 1 Proton-M launch costs about $50 million. So, for large sats, launching less reliable satellites to MEO every year is still more expensive than rare launches of more reliable space vehicles. I am sure that for cubesats and smallsats you are perfectly right.
Chief Technology Officer at KTsat
3 years ago: Thank you Tom Segert for this insightful article. This clearly demonstrates the new space spirit and what many of us are willing to advocate, at least starting from lower orbit missions... ;o) Hope you and your beloved ones are all safe and healthy.
Researcher, PhD. Norwegian University of Science and Technology (NTNU)
3 years ago: Very interesting, I liked this a lot. I would say this is the reason some of the projects I have been involved in before did not make it. But it's hard to find the right point on that curve!