• breakingcups@lemmy.world
    English
    arrow-up
    171
    arrow-down
    1
    ·
    14 days ago

    Let’s be clear, this isn’t the single programmer’s fault. Everybody will eventually make a mistake. The fact that it wasn’t caught by mitigating measures such as reviews, tests, and audits is the real error we can learn from here.

    • rtxn@lemmy.world
      English
      arrow-up
      86
      ·
      14 days ago

      A Proton-M booster carrying a GLONASS satellite crashed shortly after takeoff at Baikonur in 2013. The failure was caused by a gyroscope package that had been installed upside down. The receptacle had a metal indexing pin that should’ve prevented the incorrect installation. The worker simply pushed so hard that it bent out of the way.

      When you make a foolproof design, God makes a better fool.

        • rtxn@lemmy.world
          English
          arrow-up
          18
          ·
          edit-2
          13 days ago

          Ah yes, it’s on the internet, so it must be American.

          • Kosmodrom Baikonur (located in Kazakhstan) is the primary launch site of Roskosmos (Russia)
          • The Proton is a Soviet-made heavy launch rocket, still used today (not related to Rocket Lab’s Electron and Neutron families, which are also not American)
          • GLONASS is the Soviet/Russian equivalent of the GPS

          I think it’s safe to say that the guy did not land a job at NASA.

          • gens
            English
            arrow-up
            1
            ·
            13 days ago

            Didn’t NASA make the same mistake? I remember that they put arrows on the slots because someone had installed a sensor upside down.

            • rtxn@lemmy.world
              English
              arrow-up
              2
              ·
              13 days ago

              I can’t recall anything like that. The only other crash I remember that was caused by a sensor was the Schiaparelli lander, and it was an ESA mission.

              • gens
                English
                arrow-up
                1
                ·
                edit-2
                13 days ago

                I remember it from a YouTube video from one of those engineering channels (might have been “Real Engineering”), probably a year ago. I only remember it because I thought “wow, they have to have so many safeties” and that it is good to draw on parts and such instead of just relying on technical drawings.

                I don’t remember, but it might not have crashed (multiple sensors), and it might not have had a latch/notch. But it was a long time ago.

                Edit: I still remember the big yellow arrow.

        • LifeInMultipleChoice@lemmy.dbzer0.com
          English
          arrow-up
          9
          ·
          edit-2
          13 days ago

          I know a story about a certain fighter jet we built in the United States. The programmers for the radar had everything set, and they ran the tests over and over, but the radar kept fucking up. I don’t want to put in too many details, but the end result was about $100m in research losses to find out that the mechanic who installed the antenna on the front of the fighter had turned it a quarter turn too far, which must have stripped the threads and bent the antenna slightly. It took over a month for them to catch it. They just kept assuming the programming was wrong, because the antenna looked right to the eye from as close as the standard person got.

        • Fuck spez@sh.itjust.works
          English
          arrow-up
          4
          ·
          13 days ago

          Probably by being qualified, and also by being a human being who sometimes makes mistakes and had a bad day.

    • DontRedditMyLemmy@lemmy.world
      English
      arrow-up
      36
      ·
      14 days ago

      I think it was a different era, to borrow an awful phrase. In 1962 they were still figuring out best practices for reviews, tests, and audits. Even today, lone hero outputs can get pretty far when processes aren’t followed.

    • towerful
      English
      arrow-up
      18
      ·
      14 days ago

      Which they did learn from!
      I guarantee every mistake like this at any good company leads to a leap forward in tooling for simulation, testing, code building, review, merging, local dev environments, etc.
      The good companies share their work (by open sourcing their solutions or blogging about what they learned) or contribute to existing solutions.
      NASA’s ROI cannot be measured. The number of industries their R&D has touched is massive.

    • partial_accumen@lemmy.world
      English
      arrow-up
      26
      arrow-down
      1
      ·
      14 days ago

      The Mars Climate Orbiter holds the record, I think, for a coding problem causing a spacecraft failure. That one cost $460m.

      A great runner-up would be the loss of the maiden flight of the new Ariane 5 rocket at $370m:

      "On June 4th, 1996, the very first Ariane 5 rocket ignited its engines and began speeding away from the coast of French Guiana. 37 seconds later, the rocket flipped 90 degrees in the wrong direction, and less than two seconds later, aerodynamic forces ripped the boosters apart from the main stage at a height of 4km. This caused the self-destruct mechanism to trigger, and the spacecraft was consumed in a gigantic fireball of liquid hydrogen.

      The disastrous launch cost approximately $370m, led to a public inquiry, and through the destruction of the rocket’s payload, delayed scientific research into workings of the Earth’s magnetosphere for almost 4 years. The Ariane 5 launch is widely acknowledged as one of the most expensive software failures in history. What went wrong?

      The fault was quickly identified as a software bug in the rocket’s Inertial Reference System. The rocket used this system to determine whether it was pointing up or down, which is formally known as the horizontal bias, or informally as a BH value. This value was represented by a 64-bit floating variable, which was perfectly adequate.

      However, problems began to occur when the software attempted to stuff this 64-bit variable, which can represent billions of potential values, into a 16-bit integer, which can only represent 65,535 potential values. For the first few seconds of flight, the rocket’s acceleration was low, so the conversion between these two values was successful. However, as the rocket’s velocity increased, the 64-bit variable exceeded 65k, and became too large to fit in a 16-bit variable. It was at this point that the processor encountered an operand error, and populated the BH variable with a diagnostic value."

      source

      The kicker on this one was that the bug was copied from the code of the previous, successful Ariane 4 rocket. The Ariane 4 never experienced it because its first stage was dropped in each flight before the bug would show itself, so it was never an issue there. Because the Ariane 5 had a slightly different flight profile, it was in the air for a longer period of time: long enough to experience the bug and cause the loss of the rocket in flight.
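
      The narrowing conversion described in the quote can be sketched in a few lines. This is only an illustration in Python, not the actual flight code (which was written in Ada). The function names are hypothetical, and it assumes a signed 16-bit target, whose largest value is 32,767 rather than the 65,535 of an unsigned one.

```python
# Hypothetical sketch: three ways to squeeze a 64-bit float into 16 bits.
# Assumes a signed 16-bit target; none of these names come from real flight code.

INT16_MIN, INT16_MAX = -32768, 32767


def unchecked_to_int16(value: float) -> int:
    """Truncate toward zero and keep only the low 16 bits, wrapping silently."""
    raw = int(value) & 0xFFFF
    # Reinterpret the 16 bits as a two's-complement signed number.
    return raw - 0x10000 if raw >= 0x8000 else raw


def checked_to_int16(value: float) -> int:
    """Raise instead of wrapping when the value is out of range."""
    truncated = int(value)
    if not INT16_MIN <= truncated <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a signed 16-bit integer")
    return truncated


def saturating_to_int16(value: float) -> int:
    """Clamp out-of-range values to the nearest representable limit."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))
```

      With a small input such as 1000.7, all three return 1000. With 40000.0, the unchecked version wraps to -25536, the checked version raises OverflowError, and the saturating version returns 32767. Which behaviour is right depends on what the caller does with the result; the point is that the choice has to be made deliberately.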

    • Bacano@lemmy.world
      English
      arrow-up
      16
      ·
      edit-2
      14 days ago

      I’ll keep it going:

      Don’t forget about the time Initech had its credit union hacked with a virus that was supposed to only take a negligible percentage of each transaction, but the programmer figured he must have “put the decimal in the wrong place or something.”

      The group got away under pretty mysterious circumstances…

    • Psythik@lemmy.world
      English
      arrow-up
      5
      arrow-down
      2
      ·
      14 days ago

      Why the fuck is/was NASA using the US customary system? Science is always done in metric, even in the US.

    • subtext@lemmy.world
      English
      arrow-up
      1
      ·
      edit-2
      14 days ago

      It was just a simple transposition right? 2.45 (wrong) vs 2.54 (right)

      E: never mind, I was wrong

  • IninewCrow@lemmy.ca
    English
    arrow-up
    17
    ·
    14 days ago

    Always loved the story of what they saw in the source code of software they used in historic NASA missions from decades past.

    https://interestingengineering.com/science/code-moon-landings-released-surprising-hilarious

    Turns out, the programmers back then were just as unsure about what they were doing as programmers are today … except the guys back then had computers less powerful than a modern smartwatch controlling a missile that was aimed at the moon.

  • Churbleyimyam@lemm.ee
    English
    arrow-up
    6
    ·
    14 days ago

    I also heard about a fuckup at the European Space Agency, which had hired an American to work on a particular bit of the project. He used an imperial measurement somewhere and it caused the whole thing to fail.