|
|
The system appears to be a bit slow.
|
|
|
|
|
Nope!
Bad command or file name. Bad, bad command! Sit! Stay! Staaaay...
AntiTwitter: @DalekDave is now a follower!
|
|
|
|
|
I have not received any mail from CP since yesterday evening, but there were some replies, which were shown as notifications.
I did not get the "You must be logged in to post a reply" message. So it seems not related to that but a general problem.
Looks like the mail hamsters are taking a duvet day.
|
|
|
|
|
No, they are running DAYS late, or are broken entirely.
|
|
|
|
|
I noticed that I'm getting my daily newsletters later than usual, but I thought it might be due to DST?
|
|
|
|
|
Maybe you've heard about this before, but it is very interesting.
It's a good example of the need for "code organization / code management" that you get from higher-level languages and OOP.
Bjarne Stroustrup wrote: On 23 September 1999, NASA lost its US$654 million Mars Climate Orbiter due to a navigation error. “The root cause for the loss of the MCO spacecraft was the failure to use metric units in the coding of a ground software file, ‘Small Forces,’ used in trajectory models. Specifically, thruster performance data in English units instead of metric units was used.” [5] The amount of work lost was roughly equivalent to the lifetime’s work of 200 good engineers. In reality, the cost is even higher because we’re deprived of the mission’s scientific results until (and if) it can be repeated. The really galling aspect is that we were all taught how to avoid such errors in high school: “Always make sure the units are correct in your computations.”
Why didn’t the NASA engineers do that? They’re indisputably experts in their field, so there must be good reasons. No mainstream programming language supports units, but every general-purpose language allows a programmer to encode a value as a {quantity,unit} pair. We can encode enough of the ISO standard SI units (meters, kilograms, seconds, and so on) in an integer to deal with all of NASA’s needs, but we don’t because that would almost double the size of our data. Furthermore, checking the units in every computation would more than double the amount of computation needed.
Space probes tend to be both memory and compute limited, so the engineers, just as essentially everyone else in their situation has done, decided to keep track of the units themselves (in their heads, in the comments, and in the documentation).
In this case, they lost.
From http://www.stroustrup.com/Software-for-infrastructure.pdf[^]
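For anyone who wants to see what the {quantity,unit} pair from the quote might look like, here is a minimal sketch. Everything in it (the `Unit` and `Quantity` names, the one-byte-per-exponent encoding) is my own assumption for illustration, not something from the paper. It uses three small integers instead of a single packed one to keep it readable.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical encoding: one signed byte per base-unit exponent
// (metre, kilogram, second). A real system would need more base units.
struct Unit {
    std::int8_t m, kg, s;
};

constexpr bool operator==(Unit a, Unit b) {
    return a.m == b.m && a.kg == b.kg && a.s == b.s;
}

constexpr Unit metre{1, 0, 0};
constexpr Unit kilogram{0, 1, 0};
constexpr Unit second{0, 0, 1};

// The {quantity, unit} pair: every value carries its unit at run time.
struct Quantity {
    double value;
    Unit unit;
};

// The extra storage the quote complains about.
static_assert(sizeof(Quantity) > sizeof(double),
              "the unit tag makes each value bigger than a bare double");

// Addition is only meaningful for identical units, checked on every call.
inline Quantity operator+(Quantity a, Quantity b) {
    if (!(a.unit == b.unit)) {
        throw std::invalid_argument("unit mismatch in addition");
    }
    return Quantity{a.value + b.value, a.unit};
}

// Multiplication adds the exponents (e.g. m * m gives m^2).
inline Quantity operator*(Quantity a, Quantity b) {
    return Quantity{a.value * b.value,
                    Unit{static_cast<std::int8_t>(a.unit.m + b.unit.m),
                         static_cast<std::int8_t>(a.unit.kg + b.unit.kg),
                         static_cast<std::int8_t>(a.unit.s + b.unit.s)}};
}
```

Adding `Quantity{1.0, metre}` to `Quantity{1.0, second}` throws at run time, which is exactly the per-operation check (and per-value storage) that the quote describes as too expensive on a probe.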
|
|
|
|
|
Some form of quality control and testing regimen would probably have helped too.
The only good thing about this is that QA didn't exist in 1999 so it wasn't as a result of a "how do I navigate my spacecraft to Mars? SND CODZ URGNTZZZZ!!!!!" question and answer...
|
|
|
|
|
OriginalGriff wrote: QA didn't exist in 1999
Well, I guess we have to hold CP responsible for this disaster then.
|
|
|
|
|
QA didn't exist in 1999? I hope that was a joke, because that was my exact job description.
|
|
|
|
|
Damn Wolowitz screwed up again?
Don't let your mind wander too far.
It's too small to be let out alone.
|
|
|
|
|
I think Wolowitz was only about 10 at the time. Cut him some slack, eh?
|
|
|
|
|
Strong typing could be achieved with an interface, struct, or class, or with plain getter functions and documentation.
Time was also lost for mankind, so we ALL learn about Mars later. Maybe Elon Musk would build a spaceship if he knew more about the Mars climate.
Press F1 for help or google it.
Greetings from Germany
|
|
|
|
|
In theory you could specify that all internal units are SI (e.g. meter for distance), and then you just need to deal with unit conversions on the software interface to stored data or user interface.
But even if you handle everything in (value, unit) tuples, there could be an error in data entry that would go undetected by the software.
Alternative solution: have a debug mode where you use (value, unit), and have extra asserts to make sure your code is correct and in the release mode the unit checks would be removed so there's no overhead in production.
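A rough sketch of that debug/release idea, assuming the standard `NDEBUG` macro is what separates the two builds. The names `Value`, `make`, `add`, and `length_from_feet` are made up for this example.

```cpp
#include <cassert>
#include <cmath>

// Internal units are SI only; this tag exists purely for debug checks.
enum class Unit { Metre, Second };

struct Value {
    double v;
#ifndef NDEBUG
    Unit u;  // present in debug builds only, so release has no overhead
#endif
};

inline Value make(double v, Unit u) {
#ifndef NDEBUG
    return Value{v, u};
#else
    (void)u;  // unit is dropped in release builds
    return Value{v};
#endif
}

inline Value add(Value a, Value b) {
    // assert() expands to nothing under NDEBUG, so a.u is never touched
    // in a release build.
    assert(a.u == b.u && "unit mismatch in add()");
    Value r = a;
    r.v += b.v;
    return r;
}

// Conversion happens once, at the interface to stored data or user input.
inline Value length_from_feet(double feet) {
    return make(feet * 0.3048, Unit::Metre);
}
```

Note that this still cannot catch the data-entry case: if someone types a value in feet into a field that is read as metres, the tag will happily say `Unit::Metre`.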
Wout
|
|
|
|
|
Those NASA engineers are real engineers, sticking to the language they know: C/C++. Don't get me wrong, I do like C/C++ (especially C++11/14), but maybe it would have been better if they had created a DSL (Domain Specific Language) for the control units (at the base station and on the space probe).
Instead of using run-time type checking, they could have done compile-time type checking, avoiding the overhead of the metadata needed for run-time type checking. These practices are all well described in the book "Compilers: Principles, Techniques, and Tools" (1st edition dates from 1986). The DSL would use the ISO standard units as types, and the translator part of the compiler could add conversion code wherever non-ISO units are used.
It will not prevent human errors, but it makes it much clearer what's going on when you use a Meter type instead of a Feet type while writing down the calculation in the source code.
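To make the Meter-type versus Feet-type point concrete, here is a small sketch in plain C++ standing in for the DSL (I am not aware of such a DSL existing, so the types and the `total_altitude` function are purely hypothetical). The compiler does the checking, and the only conversion is the one you spell out.

```cpp
#include <cassert>
#include <cmath>

// Two distinct types that both wrap a double. They cannot be mixed by accident.
struct Meters { double value; };
struct Feet   { double value; };

// Addition is defined for Meters + Meters only.
constexpr Meters operator+(Meters a, Meters b) {
    return Meters{a.value + b.value};
}

// The only route from Feet to Meters is this explicit conversion
// (1 ft = 0.3048 m exactly).
constexpr Meters to_meters(Feet f) {
    return Meters{f.value * 0.3048};
}

// Hypothetical calculation: the signature documents which unit each input uses.
constexpr Meters total_altitude(Meters base, Feet climb) {
    return base + to_meters(climb);
}

// Meters bad = Meters{100.0} + Feet{10.0};
// The line above does not compile: there is no operator+(Meters, Feet).
```

The check costs nothing at run time, since both types are just a double once compiled.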
to err is human; to forgive, divine
|
|
|
|
|
ddt_tdd wrote: to err is human; to forgive, divine
FTFY: To err is human; to forgive requires a majority decision by the Change Control Board.
Software Zen: delete this;
|
|
|
|
|
Great stuff. Thanks for commenting.
It's an interesting story. Software crashes like this just get a lot of attention since they are so obviously catastrophic. Other problems occur and people do not hear about them but they happen everywhere in software.
|
|
|
|
|
Like comments on this post?
|
|
|
|
|
Hmm... the post was edited. There was an entire comment about what had happened. Interesting.
|
|
|
|
|
Proper strong static typing...
If they were using Ada with appropriate data typing (that domain is what Ada was designed for...), F# with 'units of measure' or even C++ with a suitable units library (in which all the checking and overhead is at compile time, not runtime), they'd have been fine...
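As a toy illustration of the compile-time approach (not the API of any real units library; the `Quantity` template and the aliases below are assumptions made up for this sketch), the dimension exponents can live in the type so that nothing is stored or checked at run time:

```cpp
#include <cassert>

// Exponents of length (L), mass (M), and time (T) are template parameters,
// so they exist only at compile time.
template <int L, int M, int T>
struct Quantity {
    double value;
};

// Addition requires both operands to have exactly the same dimensions.
template <int L, int M, int T>
constexpr Quantity<L, M, T> operator+(Quantity<L, M, T> a, Quantity<L, M, T> b) {
    return Quantity<L, M, T>{a.value + b.value};
}

// Multiplication adds the exponents in the result type.
template <int L1, int M1, int T1, int L2, int M2, int T2>
constexpr Quantity<L1 + L2, M1 + M2, T1 + T2>
operator*(Quantity<L1, M1, T1> a, Quantity<L2, M2, T2> b) {
    return Quantity<L1 + L2, M1 + M2, T1 + T2>{a.value * b.value};
}

using Seconds       = Quantity<0, 0, 1>;
using Newtons       = Quantity<1, 1, -2>;  // kg*m/s^2
using NewtonSeconds = Quantity<1, 1, -1>;  // impulse

// No per-value unit storage: a Quantity is exactly one double.
static_assert(sizeof(NewtonSeconds) == sizeof(double), "zero storage overhead");

// Compiles only because Newtons * Seconds has the dimensions of NewtonSeconds.
constexpr NewtonSeconds impulse(Newtons thrust, Seconds burn) {
    return thrust * burn;
}
```

Returning `thrust + burn` or `thrust * thrust` from `impulse` would be a compile error, and the generated code is the same as for bare doubles.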
...or a proper, robust design, which documented the units in use...
NASA are generally very good with the process and design parts of software development, especially when the target is safety critical - maybe this was 'just' an unmanned probe, so the same rigour didn't get applied...
Java, Basic, who cares - it's all a bunch of tree-hugging hippy cr*p
|
|
|
|
|
raddevus wrote: Space probes tend to be both memory and compute limited, so the engineers—just as essentially everyone else in their situation has done—decided to keep track of the units themselves (in their heads, in the comments, and
in the documentation).
While this is true, and one of the reasons for extensive and exhaustive testing of spacecraft software, in the case of Mars Climate Orbiter (as also noted in the original post):
raddevus wrote: ... root cause for the loss of the MCO spacecraft was the failure to use metric units in the coding of a ground software file...
(My re-bolding)
While not excusing the failure, which was a result of insufficient testing and other related issues, in this case it wasn't due to memory limitations of the on-board computers, or people "keeping track of units in their heads...". Anyway, source code isn't uploaded and compiled on a spacecraft, so type-safety and better languages shouldn't have any impact on the size of the executables.
While I'm sure Stroustrup means well, and intends the use of this failure as a lesson in the advantages of more strongly typed languages and methods, I think he's misrepresenting the actual situation. I'd recommend reading the actual reports on what happened for more details:
Mars Climate Orbiter Failure Board Releases Report which also contains a link to the full, PDF, report.
m (in spacecraft test systems engineer mode)
Days spent at sea are not deducted from one's allotted span - Phoenician proverb
|
|
|
|
|
Yeah, I'm sure there is much more to it.
I think Stroustrup was just attempting to show an analogy about how programming languages can help us or provide no extra help. Kind of like a hand saw versus a chainsaw.
At times a newbie believes that just because he has the newest chainsaw, he is a professional.
A person with vast experience can use hand tools and _process_ and do far more than some newbie with a chainsaw. However, when the experienced person with process then begins to use a power tool a leap of an order of magnitude may be made because the person can think at a different level.
Great discussion.
|
|
|
|
|
Reminds me of the Hubble telescope mirror problem. Perkin Elmer was contracted to create the large reflecting mirror for the Hubble telescope, and NASA was not allowed to review the process, visit the factory, nor check off the various milestones. NASA was only allowed to accept delivery when the mirror was complete. Then on delivery, NASA didn’t even do a thorough QA check of the mirror. Unfortunately, the mirror was so badly flawed, that after it was installed and used, everything the telescope saw was blurred. Useless data was collected. NASA had to order another manned mission to fix the problem at a good cost to the American taxpayer.
Thus, it pays to have constant checks and quality reviews by the customer.
|
|
|
|
|
Exactly right. Everything in that report says bad or insufficient systems engineering process (which was encouraged under the Faster, Better, Cheaper mantra). As a result, missions with approximately the same capability are now back to costing three or more times as much.
His quoted cost for the program is ridiculous as well.
(not a spacecraft test system engineer, but have read a great deal on it)
|
|
|
|
|
Nope, nope, nope.
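There's no need for a different language. While some languages make certain things easier, in the end language is irrelevant. This team needed 3 things:
1) Correctly written specification and design. Doesn't have to be the Encyclopaedia Britannica, but has to cover the major points, e.g., all units are metric.
2) Code review. Other people looking at the code to ensure it makes sense.
3) Real testing. "It compiles and it runs" is not testing. Real testing includes validation that the results match the expectation. If this had been done the error would not have made it to production.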
3) Real testing. "It compiles and it runs" is not testing. Real testing includes validation that the results match the expectation. If this had been done the error would not have made it to production.
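As a sketch of what point 3 means in practice, a validation test compares the output against an expected value worked out independently from the spec. The function `impulse_newton_seconds` and its inputs are hypothetical, not taken from the actual ground software; the conversion factor (1 lbf = 4.4482216152605 N) is the standard one.

```cpp
#include <cassert>
#include <cmath>

constexpr double kNewtonsPerPoundForce = 4.4482216152605;

// Hypothetical interface function: the spec says the result must be in
// newton-seconds, while the thrust input arrives in pound-force.
inline double impulse_newton_seconds(double thrust_lbf, double burn_s) {
    return thrust_lbf * kNewtonsPerPoundForce * burn_s;
}

// Validation helper: compare against a hand-computed expectation
// using a relative tolerance.
inline bool matches_expected(double actual, double expected,
                             double rel_tol = 1e-9) {
    return std::fabs(actual - expected) <= rel_tol * std::fabs(expected);
}
```

A check such as `matches_expected(impulse_newton_seconds(1.0, 10.0), 44.482216152605)` passes, while an implementation that forgot the conversion would return 10.0 and fail it, even though it compiles and runs.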
Don't get me wrong, I prefer strongly typed languages and IDEs that flag errors and all the goodies of "modern" programming. However, this debacle was caused by process, not programming.
The job of professional developers is not to design applications and write code. Our job is to solve someone else's business problem. Coding is a tiny part of that.
|
|
|
|
|