Ep 63 Boeing 737 Max

Engineering News – World’s First Electric Passenger Plane (1:45)

This week's engineering failure is the Boeing 737 Max (9:00). Poorly executed flight stabilization software called MCAS (15:20) caused two catastrophic crashes (25:30), putting Boeing in hot water (30:25).

Sources:

Engineering News

Boeing 737 Max


Episode Summary

Hi and welcome to Failurology, a podcast about engineering failures. I’m your host, Nicole.

And I’m Brian. And we’re both from Calgary, AB.

Thank you again to our Patreon subscribers! For less than the cost of a package of fresh socks, you can hear us talk about more interesting engineering failures!

That’s $5 Canadian per month. Our mini failures are released on the opposite Sundays from our regular episodes, so our patrons get an episode every week. And we have a link to set up the mini failures in your regular podcast app so you don’t have to flip back and forth.

This week is technically the podcast's 2nd birthday! But we're going to talk about that next week because I found a news article that fits in well with the Boeing failure.

This week in engineering news, the world’s first electric passenger plane took its first flight.

  • This article is from Popular Mechanics

  • Alice, the battery powered plane, took off at 7:10am from Grant County International Airport and flew for 8 minutes at 3,500ft (1,067m).

  • The airport is about 250km east of Seattle and about 50km east of the Gorge Amphitheatre. It’s a retired military facility with a long runway that is great for testing aircraft.

  • The plane is designed for up to 250 miles (400km) and up to 9 passengers.

  • One of the challenges they had to deal with is the weight of the batteries. In comparison, a gas car weighs about 15% less than a similar electric model. The heavier batteries are easier to deal with on the ground than in the air.

  • In addition to Alice offering zero carbon emissions, it also offers a quiet flight experience. And probably doesn’t smell like jet fuel.

  • I think this is just the start for electric airplanes. When, hopefully, the market shows there are consumers for this type of plane, that will drive engineers and experts alike to continue to improve the batteries and plane designs leading to longer ranges and more passenger capacity.

  • If you want to read more about Alice, check out the links on the web page for this episode at failurology.ca. We have a link to the Popular Mechanics article, as well as the website for Alice.

Now on to this week’s engineering failure: the Boeing 737 Max.

  • I was in Ireland at the beginning of September, speaking at a conference about failure and resilience. It was a great experience! One of the failures I spoke about was the 737 Max, so after 63 episodes, I finally feel really prepared for a plane one. Maybe still not as much as Brian, but I have come a long way since Air France flight 447 in episode 9.

  • The 737 Max was the 4th generation 737 from Boeing. It was announced on August 30, 2011, took its maiden flight on January 29, 2016 and service commenced on May 22, 2017 with Malindo Air, which is a hybrid-full service carrier, an associate carrier of Lion Air Group and cooperative between Malaysia and Indonesia (hence the name Mal-Indo).

  • The updated 737 model came with more efficient engines, aerodynamic changes and airframe modifications; which we’ll talk about more in a minute. Notice I said “updated”. This is a very important part of the story. Boeing marketed the 737 Max as an upgrade from previous 737 models. Assuming they could get regulators on board (spoiler alert, they did) this had the potential to save Boeing a ton of money.

  • I want to mention this at the top, because I have gotten questions about it before. There are several variations of the 737 Max: the Max 7, Max 8, Max 9, and Max 10 (2023). And while the two major crashes that we’re going to talk about today were Max 8s, all of the 737 Max variants had these issues. So for the purpose of this episode, we are going to call the planes 737 Max.

  • We’re going to get into the details of why in a bit, but all of the 737 Max were grounded from March 2019 until November 2020, almost 2 years. At the time of the grounding, the global fleet was 387 planes and they had operated 500,000 flights.

MCAS

  • The 737 Max had new control software on board called MCAS.

  • MCAS, which stands for Maneuvering Characteristics Augmentation System, was a flight stabilization program that compensated for excessive nose up or excessive pitch during takeoff. Too high of a pitch during takeoff could result in an aerodynamic stall, where the wings stop generating enough lift.

  • This nose up motion was caused by the engine placement and size. The new engines offered a 15% reduction in thrust specific fuel consumption, 20% lower carbon emissions, and 50% lower nitrogen oxide emissions, but each one was almost 400kg heavier, at 2780kg, and larger, growing from 1.5m to 1.8m in diameter. Due to the larger engine, they had to increase the length of the landing gear, but also place the engine higher and more forward on the plane so it met the required ground clearance.

  • The planes have a split tip wingtip device, which means that the end of the wing kind of looks like a T, although most of the tip is aimed up and only a small segment is aimed down. This improves fuel efficiency and maximizes lift, while also following a design similar to the previous 737s (important for later). The planes also had a re-contoured tail cone, a revised auxiliary power unit inlet and exhaust, the aft-body vortex generators removed, and other small aerodynamic changes.

  • As I mentioned the landing gear was longer to accommodate the engine size. About 20cm longer than the previous model. The landing gear and supporting structure were reinforced and the fuselage skins were thicker in some places, adding more weight.

  • In order to accommodate the engine size and placement, Boeing added software, MCAS, to correct the nose up movement and stabilize the plane.

  • The MCAS lowered the nose of the plane if the angle of attack sensors, located in the nose of the plane, detected it was flying at too high of an angle. This on its own, while maybe not the choice I would make, is not necessarily an issue.

  • Flight stabilization software is not a new concept. It has been added on planes for years. We’ve talked about it on other episodes; like FedEx Express Flight 80 in episode 59, or United 232 in episode 42. Using failures to explain why flight stabilization software is ok is not ideal, but there are lots of other planes with the software that are fine.

Issues

  • So, if everything about the plane is within reason, why all of the issues?

  • Boeing is one of the largest aircraft manufacturers in the world, representing almost 40% of the market. It’s no surprise that they would have quite a bit of pull with regulators.

  • The real problems started when the Federal Aviation Administration, the US agency that determines airworthiness, or simply which planes can fly and which can’t, allowed Boeing to remove the MCAS system from their aircraft manual. I don’t think it was quite that simple, but ultimately, after what we assume was much discussion, Boeing was allowed to omit the MCAS from the plane's manual.

  • On top of that, remember Boeing had packaged the 737 Max as an upgraded model from the previous generation. They argued that pilots who were already trained to fly a 737 did not need to take extensive training to fly the upgraded model. And so pilots were only given a 2 hour online course. While this saved Boeing and their customers valuable time and money on training, that online course made no mention of the MCAS software.

  • So when the 737 Max entered service in 2017, the pilots were completely unaware of the system.

Problems with the MCAS

  • The MCAS had a number of problems. First, it did not require redundancy from the sensor inputs, meaning that it took the readings from one angle of attack sensor, even if that sensor's readout was inaccurate. And like we saw with the DC-10 in United 232, a lack of redundancy in critical systems is a real problem!

  • Second, even if the pilots were able to deactivate the MCAS, it would automatically reactivate, repeatedly. The pilots were not able to override the system or prevent it from activating. And without a clear understanding of what it was doing and how, they could not react appropriately.

  • Catastrophic failure was imminent. And that’s exactly what happened.
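
To make those two problems a bit more concrete, here is a rough sketch in Python. To be clear, this is not Boeing's code and we have never seen it. The function names, the 15 degree threshold and the time steps are all made up for illustration. It just shows what can happen when logic trusts a single sensor and re-arms itself after the pilot pushes back.

```python
# Hypothetical sketch only: NOT Boeing's code. Names and numbers are invented.

AOA_LIMIT_DEG = 15.0  # assumed angle of attack threshold, for illustration


def single_sensor_command(aoa_deg: float) -> bool:
    """Return True if nose-down trim would be commanded from ONE sensor reading."""
    return aoa_deg > AOA_LIMIT_DEG


def simulate(readings, pilot_counters):
    """Count nose-down commands over a series of sensor readings.

    pilot_counters is a set of time steps where the pilot counters the trim.
    In this sketch the logic re-arms on the very next step, so the command
    simply comes back if the sensor is still reporting a high angle.
    """
    commands = 0
    for step, aoa in enumerate(readings):
        if step in pilot_counters:
            continue  # pilot wins this step only
        if single_sensor_command(aoa):
            commands += 1
    return commands


# A stuck sensor reporting 40 degrees while the plane is actually flying level.
faulty_readings = [40.0] * 6
print(simulate(faulty_readings, pilot_counters={1, 3}))  # 4
```

Even with the pilot fighting back twice, the one bad sensor keeps winning, because nothing in this logic ever asks whether the reading makes sense.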

Crashes

  • The first crash occurred on October 29th, 2018. Lion Air flight 610 crashed on its way from Jakarta to Pangkal Pinang (pun-call pin-ang), Indonesia, killing all 189 people on board.

    • The plane, having been delivered to Lion Air only two months earlier, crashed into the Java Sea 13 minutes after takeoff in Jakarta.

    • Indonesia’s National Transportation Safety Committee (KNKT) released its final report on October 25, 2019, blaming the MCAS system for pushing the plane into a dive after receiving data from a faulty angle of attack sensor.

    • Rumor has it that the day before the crash, a different crew piloting the plane had a similar issue, but the extra pilot sitting in the cockpit jumpseat was somehow able to diagnose the problem and disable the MCAS.

  • Boeing had a chance here to do the right thing, fess up to the MCAS system and prevent more tragedy. But instead, they doubled down that the 737 Max was fine, there were no issues with the plane or its controls.

    • And then they issued an operational manual guidance bulletin talking about how to address errors in cockpit readings. The bulletins also included a checklist for pilots to follow in the event of a malfunction.

    • The main problem I have with this is that a) it doesn't address the root cause of the problem, and b) it puts all of the responsibility back onto the pilots, who were still not properly trained, if they even knew about the MCAS.

  • Then a second crash happened on March 10, 2019, less than 5 months later. Ethiopian Airlines flight 302 crashed on its way from Addis Ababa (add-iss a-ba-ba), Ethiopia, to Nairobi, Kenya, tragically killing all 157 people on board.

    • This plane, only four months old, was also in a dive when it crashed, similar to Lion Air 610, based on evidence retrieved at the crash site.

    • The cause was initially unclear, although the vertical speed after takeoff was reported as unstable.

  • Grounding of the 737 Max fleet started that day, March 10th, 2019.

Aftermath

  • After the second crash, Boeing couldn’t hide any further and finally admitted that the MCAS played a role in both crashes. There was a full investigation that showed Boeing knew the MCAS was not only an issue, but that it was responsible for the first Lion Air flight 610 crash and did not take action to prevent future crashes. Their cover up was exposed.

  • The investigation also found lapses in the Federal Aviation Administration’s certification of the 737 Max in the first place. Specifically their approval for Boeing to remove the MCAS from the aircraft manual.

  • Fun fact - there was a reassessment of the 737 Max in February 2020 by the Federal Aviation Administration and the European Union Aviation Safety Agency that determined the stability and stall characteristics of the plane would have been acceptable with or without the MCAS. So not only was it poorly designed, it wasn't even required.

  • Boeing ultimately paid US$2.5 billion after being charged with fraud. The direct costs of the groundings are around $20 billion and the indirect costs (i.e. loss of sales and stock price drops) are around $60 billion.

  • When the 737 Max went back into service in late 2020, it included sensor redundancy, requiring input from two angle of attack sensors, as well as a pilot override function in case the MCAS wasn’t working properly. Both of these things should have been included in the original design; they are pretty straightforward things in my opinion.
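
For anyone who thinks in code, here is a minimal sketch of what those two fixes look like in principle. Again, this is our own illustration and not the real flight software. The class name, the 15 degree threshold and the 5 degree disagreement limit are assumptions we picked so the example runs.

```python
# Hypothetical sketch only: NOT the real flight software. Values are invented.
from dataclasses import dataclass

AOA_LIMIT_DEG = 15.0     # assumed angle of attack threshold
MAX_DISAGREE_DEG = 5.0   # assumed limit on how far the two sensors may differ


@dataclass
class StabilizerLogic:
    overridden: bool = False  # latched once the pilot overrides

    def pilot_override(self) -> None:
        """Pilot switches the system off; in this sketch it stays off."""
        self.overridden = True

    def should_trim_nose_down(self, left_aoa: float, right_aoa: float) -> bool:
        if self.overridden:
            return False  # the pilot has the final say
        if abs(left_aoa - right_aoa) > MAX_DISAGREE_DEG:
            return False  # sensors disagree, so trust neither
        # Both sensors must agree the angle is too high.
        return min(left_aoa, right_aoa) > AOA_LIMIT_DEG


logic = StabilizerLogic()
print(logic.should_trim_nose_down(40.0, 4.0))   # False: one sensor is clearly off
print(logic.should_trim_nose_down(17.0, 16.0))  # True: both agree
logic.pilot_override()
print(logic.should_trim_nose_down(17.0, 16.0))  # False: override is latched
```

The whole thing is a couple of extra checks, which is kind of the point: cross-checking two sensors and letting the pilot have the final say are not exotic ideas.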

Lessons learned

  • So as I mentioned, I covered this failure at a conference in Ireland in September. Which was an amazing experience. If any of you are listening, hello, thank you for coming to my session, and hopefully we got to chat that day. It was so nice to meet everyone and nerd out about failure.

  • The basis of my session was talking about why different failures occur. Based on the almost 100 failures we’ve covered on this show, I have definitely noticed a pattern. Failures seem to either occur because of unknowns, mistakes, or on purpose. And sometimes it’s a combination of these things, but these are the three main “why”s that I see. And this flight was my example of failures that happen on purpose.

  • As frustrating as this story is, because it was so preventable, it’s a really important story to tell. What happened to Boeing is the type of thing that happens when you think you’re safe from failure, when you become complacent to your success and you think failure won’t happen to you. But guess what, it happens to everyone. And I think there is a misconception amongst engineers that people who are more junior are more at risk of failure, but I actually think it's the senior people that are more at risk, because they aren't out there looking for all of the things that could go wrong.

  • With that said, here are my highlight lessons learned from the Boeing 737 Max.

  • Cutting corners is never worth it. When you get caught, and notice I said “when” not “if”, you almost always spend more than you save, in fines, penalties, compensation and your reputation.

  • Second, proper training on new technology is very important. Especially when dealing with the public in a life safety capacity such as flying an airplane.

  • And lastly, no one is safe from failure. The safer you think you are, the more at risk you could be, because you’re not looking for it.


So there you have it, the Boeing 737 Max. Poorly designed flight stabilization software, lack of training, and the unwillingness of Boeing to admit their failure were to blame for the preventable deaths of almost 350 people. I would say this failure rocked the airline industry in a way that we have never seen before, and hope to never see again.


For photos, sources and an episode summary from this week’s episode, head to Failurology.ca. If you’re enjoying what you’re hearing, please rate, review and subscribe to Failurology, so more people can find us. If you want to chat with us, our Twitter handle is @failurology, you can email us at thefailurologypodcast@gmail.com, you can connect with us on LinkedIn or you can message us on our Patreon page. Check out the show notes for links to all of these. Thanks, everyone, for listening. And tune in to the next episode where we’ll talk about the Station Nightclub fire in Rhode Island in 2003.

Bye everyone, talk soon!