Ep 25 Deepwater Horizon
Hi and welcome to Failurology, a podcast about engineering failures. I’m your host, Nicole, and I’m from Calgary, Alberta.
As I mentioned in the introduction of the last episode, I have a new podcast project centred around women in engineering. It’s called Glorious Ladies of Engineering, or GLOE for short. Yes, this is a bit of a play on the Gorgeous Ladies of Wrestling. Similar to the female wrestlers of the ’80s in that show, women in engineering are also navigating a male-dominated industry, amongst other things.
We have been working on this project since March, and it has been a lot of fun. My network of women in engineering is... small. But it has probably doubled, at least, in the last 3 months. We have talked to women across the country, from all different disciplines about their experiences, their successes and some of the challenges they’ve faced in their career. One common thread I’ve noticed is that even though we may feel like we don’t have a ton of representation in our own tiny bubbles of engineering, we actually have similar experiences across disciplines and industries. It’s like we’re part of a club. An awesome, and challenging, and wonderful and sometimes frustrating club.
The other thing I've noticed is that there is something from each interview that sticks with me. I am inspired by each woman’s story and I’ve learned so much about different disciplines. The possibilities in engineering really are endless. There are so many things we can do and so many ways we can make the world a better place. From designing mechanical and electrical systems for affordable housing, to keeping our telecommunications systems up and running, even how to handle the combustibility of dust in the forestry industry. As one woman told me in her interview, if you think safety is expensive, try an accident.
Check out Glorious Ladies of Engineering wherever you get your podcasts. You’re not going to want to miss it.
This week’s engineering news is about rechargeable cement-based batteries.
Researchers from the Department of Architecture and Civil Engineering at Chalmers University of Technology in Sweden have developed a prototype that uses concrete structures as rechargeable batteries. They used a cement-based mixture, added short carbon fibres to increase conductivity and flexural toughness, and then embedded a metal-coated carbon fibre mesh. The prototype has an energy density, or capacity, of 7 watt-hours per square metre; while this is low in comparison to commercial batteries, the scale of concrete used in infrastructure makes it more viable. There are a ton of applications for this product. The cement-based batteries can be paired with solar panels to provide electricity to a building, or act as an energy source for monitoring cracking or corrosion on bridges or highways. One important question that still has to be answered is how to develop the batteries to last as long as the concrete, which is often 50 or 100 years. The concept is still early on and, to my knowledge, hasn’t been tested in real-world applications yet. But one day, hopefully soon, that new skyscraper going up down the street could be a giant battery. How exciting to think about.
If you want to read more on the cement based battery prototype, check out the link in the show notes or head to failurology.ca.
Now on to this week’s engineering failure: Deepwater Horizon, a deepwater oil and gas drilling rig that sank off the coast of Louisiana in April 2010.
This was pretty big in the news when it happened. The environmental impact was astronomical and has been said to be one of the worst environmental disasters in history. And while I knew the oil and gas companies were to blame for the damage, I didn’t realize how many things went wrong or really how the whole event played out. But before I get into that, I want to talk a little about the rig itself.
Deepwater Horizon was an ultra-deepwater, offshore drilling rig. It was dynamically positioned, meaning it maintained its GPS location with 8 onboard propellers and thrusters. And it was semi-submersible: a floating rig with watertight pontoons that sit below the ocean surface and wave action, for stability. It was 112 m long, 68 m wide, 97 m tall, and it sat 34 m above the water. It had 6 diesel engines and 6 AC generators. It could travel at a speed of 4 knots, or 7.4 km/h.
Originally built by Hyundai Heavy Industries in South Korea at a cost of $560 million USD, it was owned by Transocean and leased to BP in 2001. The rig cost about $500,000 USD per day just for the bare rig, and then another $500,000 USD for crew, gear and support vessels. Deepwater Horizon spent its whole life in the Gulf of Mexico, with 93% of its working life being spent in operation. The Macondo well, the one where Deepwater Horizon met its tragic end, was actually started by a different drilling rig called the Marianas. The Marianas was damaged during Hurricane Ida in October 2009 and Deepwater Horizon was called in to complete drilling of the deepest underwater well in history. It moved to the Macondo well in February 2010. The well is over 10,000 m deep, and located 400 km SE of Houston in 1,250 m of water.
So what is deepwater drilling exactly? Organic deposits, trapped beneath an impermeable layer below the earth’s surface, “cook” into liquids and gases over thousands of years. Drilling is finding and removing those oil and gas deposits to create energy. And deepwater drilling occurs when the deposit is located under the ocean floor.
The immediate cause of failure was the inability to control pressures in the well, resulting in what is called a kick, or a surge of hydrocarbons from the well, which expanded quickly as they moved up the riser, and created an explosion on the rig. The fireball was visible 64 km away and was ultimately inextinguishable. The rig was evacuated, but unfortunately 11 crewmen died and some others were airlifted to medical facilities with major injuries. Deepwater Horizon sank 2 days later and left the well gushing oil at the seabed for 87 days before they were able to plug it.
In order to explain what caused the explosion, I need to back up a bit. As I said, there was a lot going on, and there are a few different moving parts to this story. First let’s talk about the rig setup. Deepwater Horizon sits at the water surface and the drill extends from the rig down to the oil and gas reservoir that is within a rock layer below the sea floor. The drill is inside a riser pipe that runs from the rig down to the oil and gas reservoir. And a casing is installed to protect the drill pipe below the sea floor and prevent any leaks when the oil and gas are being extracted. Cement is pumped down the riser, out the bottom of the casing and then back up between the casing and the rock layer to make sure the well is sealed off and no leaks can occur. This cement application is critical to the well itself but also to this story, and I will be circling back to that shortly.
At the wellhead on the sea floor, there is also a blowout preventer, which is meant to seal the well off and is the last line of defense to prevent a leak or blowout. The blowout preventer is made up of two large donut-shaped rubber elements called “annular preventers” that seal off the annular space around the drill pipe. There are also five sets of metal rams: three sets of pipe rams to close off space around the drill pipe, a casing shear ram which cuts through the casing, and then a “blind shear ram” which cuts through the drill pipe to seal off the well in an emergency. The blind shear ram can be activated manually by drillers on the rig, by a remotely operated vehicle or by an automated emergency “deadman system”.
Here’s the thing: these deposits are often under immense pressure. If you just drill a hole, you will likely make a geyser, and a mess. Drilling is a balancing act; they use drilling mud, which is a combination of synthetic fluids, polymers and weighting agents, that is mixed to a specific weight to control the oil and gas pressure in a well. The mud also cools and lubricates the well during drilling. If the mud weight is too high, it can fracture the surrounding rock and create what is called a lost circulation event, which essentially means the well leaks and you lose mud into the rock. If the mud weight is too low, the fluids from the well can rise up to the rig and create a whole host of problems. For the most part, the deeper the well, the higher the pressure. The mud weight is calculated by the well engineering team and then adjusted by the rig crew as they drill. The mud travels down the pipe to the well and back up between the drill and the riser, bringing cuttings up from the bottom of the well. The mud is then sieved and recirculated back down the riser in a closed loop.
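For anyone following along in the transcript, the balancing act can be sketched with the standard hydrostatic pressure relation: pressure equals density times gravity times depth. This is only a rough illustration. The depth, mud densities and pressure limits below are made-up numbers, not Macondo data, and the function names are hypothetical.

```python
G = 9.81  # gravitational acceleration, m/s^2


def hydrostatic_pressure_mpa(mud_density_kg_m3, depth_m):
    """Pressure exerted by a static column of mud, in megapascals."""
    return mud_density_kg_m3 * G * depth_m / 1e6


def check_mud_weight(mud_density_kg_m3, depth_m, pore_mpa, fracture_mpa):
    """Compare mud pressure against the reservoir (pore) pressure and the
    pressure at which the rock fractures. All inputs are illustrative."""
    pressure = hydrostatic_pressure_mpa(mud_density_kg_m3, depth_m)
    if pressure < pore_mpa:
        return "too light: well fluids can rise toward the rig"
    if pressure > fracture_mpa:
        return "too heavy: rock can fracture (lost circulation)"
    return "balanced"


# Hypothetical well: 5,000 m deep, reservoir at 80 MPa, rock fractures at 85 MPa.
for density in (1600, 1700, 1800):
    print(density, check_mud_weight(density, 5000, 80.0, 85.0))
```

With these invented numbers only the middle mud weight lands between the two limits. The narrower that gap, the less room there is for error.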
There are a lot of things that happened to result in the catastrophic failure of Deepwater Horizon. And I am going to break them down one by one. But honestly, the most interesting thing I learned researching this episode is the mud and how they use it to control pressure. It’s honestly fascinating. I also should take this opportunity to mention that while I practice mechanical engineering in Alberta, I have never worked in oil and gas, so this was a learning experience for me. I am definitely oversimplifying the process; but this episode isn't about engineering theory, it’s about the failures at Deepwater Horizon and I think a general understanding of how it works is more than adequate to follow the story. So, with that said, here goes.
As I mentioned earlier, Deepwater Horizon took over the well from the Marianas rig in February 2010 after Hurricane Ida. Deepwater Horizon hit the “pay zone”, or oil and gas reservoir, in early April, and they were preparing to temporarily abandon the well to allow the production rig to come in and extract the oil and gas. And Deepwater Horizon would move on to another well. This is another interesting thing I hadn’t realized. There are drilling rigs and production rigs. I had just assumed that one rig stayed at the well for drilling and production. But that does not appear to be the case.
On April 9th they had a little incident. The drilling mud was too heavy and it flowed into cracks in the rock layer, causing a lost circulation event or leak. The crew poured what is called a “lost circulation pill” down the drill to plug the cracks. This event is a critical turning point at the Macondo well. The crew and engineers were on edge from here on out. The margin of pressure they needed to exert on the well to keep the oil and gas in the well but not fracture the rock was very small. The April 9th event impacted the rest of the decisions the engineers and crew made. Had it not happened, things could have gone very differently.
Between April 11-15 the team concluded that the reservoir they reached was economically viable and it was ready to be temporarily abandoned for the production crew. I should pause for a moment to outline the main players here. BP engineered the well and drill method, Transocean drilled the well, and Halliburton would cement the final steel tube casing in place to seal the wellhead. This is the cement application that I mentioned earlier was critical to the well and also this story.
There are a few different methods for the casing, or the pipe that protects the drill through the rock layer between the sea floor and the reservoir. And due to the earlier lost circulation event, the engineers chose a less risky method as far as pressure, but a more challenging method as far as cementing and sealing the wellhead. To make sure the casing stayed centred in the hole around the rock, centralizers were installed to hold it in place; they helped ensure a more even application of cement around the casing with less risk of more cement going to one side and little or no cement going to the other side. For this particular casing, the crew needed 21 centralizers; they had 6. Did they wait for the 15 centralizers they needed before moving forward with cementing the casing? Nope, they did not. The casing was in place with 6 centralizers and ready for cement on April 19th. This was the day before the explosion.
Before cementing, it is recommended to run a full circulation of mud through the riser. This is referred to as “bottoms up”; it brings all of the mud at the bottom of the well back to the top, cleans the riser and gives a better picture of what’s going on in the well based on what the mud brings back up with it. But due to the lost circulation event that happened 10 days earlier, they decided a bottoms up procedure was too risky, so they didn’t do it. They were also worried about the large volume of cement exerting more pressure on the well and creating another fracture, so they reduced the cement volume to the bare minimum. Not only did this make cement placement more critical, it also reduced how far the cement would travel up between the rock and the casing. In fact, it reduced that travel to half of what the engineers’ own internal guidelines recommended.
This cement job was tricky, even in the best of times. Cement can be pumped too far down the well or not far enough, it might not fill the space between the casing and the rock evenly, allowing oil and gas to leak, or the oil-based drilling mud can contaminate the water-based cement, causing it to set slowly or not at all. The other problem with the cement application is they can’t see it. It’s below the sea bed. They have to rely on pressure and volume to gauge how well it’s going. They aren’t flying completely blind. They know how much cement and mud they’ve sent down the riser, they can see how hard the pumps are working and whether each barrel of cement displaces an equal volume of mud, and a steady increase in pump pressure can indicate the cement has turned the corner at the bottom of the casing and is filling the space between the casing and the rock. They just can’t see where the cement is exactly or its quality.
I know, after all this, you’re thinking, there can’t possibly be more. But there is. The cement mix that was chosen during design was a “nitrogen foam cement” which had tiny bubbles of nitrogen gas injected into the cement slurry before it goes down the well. Two lab tests were done on the specified cement mixture in February 2010 and both were found to be unstable. A third test was done in April and was also found to be unstable. A fourth test was done a day the cement was placed, but since the test takes 48 hrs, the results weren’t available until after the cement application was done. So they had three bad cement tests, and they still went forward with the cement application. I don’t know if no one received the test results, or perhaps they did and ignored them, but someone wasn’t doing their due diligence here.
They finished cementing the casing within the rock layer at 12:40 am on April 20th. They had crew on the rig, ready to run a cement evaluation test. But the powers that be determined that since the well wasn’t leaking, i.e. there wasn’t another lost circulation event, the cement was a success and the test wasn’t required. Boy were they wrong. And honestly, stupid. You already have the crew there, ready to go. You already spent the money to get them there, why not do the test? The arrogance here is really astonishing.
So with the cement in place, or so they thought, they proceeded with the temporary abandonment procedure to cap the well until the production rig could arrive and start pumping oil and gas out of the reservoir. This process also had problems and ultimately led to the explosion. Before they could install the cement plug, which temporarily caps the well until the production rig can arrive, and which in this case would be set deeper than usual and deeper than typically allowed, they had to run a series of tests. First, a positive pressure test. They exert pressure on the inside of the well and make sure all of the oil and gas stay inside. This passed. Then they ran a negative pressure test. They remove pressure applied by the rig and make sure the cement around the casing holds once the balancing pressure from the mud is gone. Even though they bled pressure from the riser, they weren’t able to maintain 0 psi; it kept climbing. They tested this three times and then decided to test the kill line, which is a separate pipe that runs parallel to the drill riser from the blowout preventer to the rig. In theory, the pressures in the riser and kill line should be equal. They were able to maintain 0 psi on the kill line and considered the test a success. Spoiler alert, this test was not a success, but the crew didn’t know this yet. Next they used seawater to displace the mud out of the well, above the blowout preventer. To do this they had to use a liquid spacer between the mud and seawater. The engineers specified a mixture of two different lost circulation pills to act as a spacer. They were allowed to dump fluids overboard that had been circulated down the well, so this allowed them to avoid disposing of the lost circulation pills on shore. The spacer volume was unusually large and this mixture had never been used for this purpose.
As they were displacing the mud and then the spacer, the crew was monitoring for kicks, or gas travelling up the riser at increasing speed, displacing mud. They have a couple of ways to watch for kicks. They can watch the volume of mud in the active pits: the volume going down should match the volume coming up. They can watch the flow rates: if the flow out of the well is greater than that going in, a kick is underway. And if they turn the pumps off, flow in both directions should stop.
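Those three checks can be written down as simple rules. The sketch below is a hypothetical illustration only: the `Reading` fields, the units and the tolerance are assumptions made for the example, not how the rig's monitoring systems actually worked.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    flow_in: float     # mud pumped down the well (assumed barrels per minute)
    flow_out: float    # mud returning to the rig (assumed barrels per minute)
    pit_volume: float  # mud in the active pits (assumed barrels)
    pumps_on: bool


def kick_warnings(previous, current, tolerance=1.0):
    """Return the warning signs that show up in the latest reading.
    The tolerance is an arbitrary illustrative threshold."""
    warnings = []
    if current.pit_volume - previous.pit_volume > tolerance:
        warnings.append("pit volume gain")
    if current.flow_out - current.flow_in > tolerance:
        warnings.append("flow out exceeds flow in")
    if not current.pumps_on and current.flow_out > tolerance:
        warnings.append("flow with pumps off")
    return warnings


steady = Reading(flow_in=100, flow_out=100, pit_volume=500, pumps_on=True)
suspect = Reading(flow_in=100, flow_out=120, pit_volume=510, pumps_on=True)
pumps_off = Reading(flow_in=0, flow_out=15, pit_volume=510, pumps_on=False)

print(kick_warnings(steady, steady))
print(kick_warnings(steady, suspect))
print(kick_warnings(suspect, pumps_off))
```

Any one of these signs on its own is a reason to stop and investigate. The rules are easy to state, but they only help if someone is watching the numbers while everything else is happening on deck.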
Shortly after 9 pm, the last of the mud and spacer were returning to the rig. There was quite a bit of commotion on deck at this time, so the details are a bit hazy. But after noticing a pressure difference between the riser and kill line, the pumps were shut off to investigate; but the pressure kept rising. Then all of a sudden the pressure shifted directions and decreased quickly. A kick was coming. Around 9:40 pm, drilling mud spewed onto the rig floor and the crew closed one of the annular preventers, the rubber donuts around the drill riser in the blowout preventer below. But the gas was already above the blowout preventer; it was too late.
At 9:46 pm they activated the bore ram, but the flow might have been too high to seal the well. The blind shear ram should have closed, severed the drill pipe and sealed the well. The crew pushed the button, the indicators lit up, but the blind shear ram never closed. The deadman switch failed too. Investigation would show that of the two deadman switch pods, one had low batteries and the other had defective solenoid valves; both caused by poor maintenance.
So, there were a lot of things going on there. Let’s recap.
The immediate cause of the blowout was the failure to contain oil and gas pressures in the well. Three things could have prevented it: a proper cement application between the casing and the rock, mud in the well and the riser, and the blowout preventer. The change to the casing method decreased the risk of another lost circulation event, but it increased the difficulty of obtaining a reliable cement application between the casing and the well. The cement failure was found to be a direct cause of the blowout. The number of centralizers installed could have played a part in the failure. I say could, because the investigation was not able to determine definitively if installing only six centralizers, instead of twenty-one, directly contributed to the failure. But the process the engineers used to arrive at the decision that six centralizers was acceptable showed flaws in management and design procedures as well as poor communication with other trades. The decision to cancel the cement evaluation, even though the crew was already on site and ready to go, was dumb. You already had a crew there to do the work, why not test the cement? Speaking of cement, no one seemed to have read the test reports that showed the mix was unstable. This was a huge oversight. The team failed to properly evaluate the risks of the Macondo cementing decisions and procedures. Those decisions and risk factors included, among other things: difficult drilling conditions, risk of another lost circulation event, no bottoms up circulation to essentially flush the well before cementing, using fewer than the recommended number of centralizers, and using a low cement volume.
The investigation identified a number of potential factors related to the negative pressure test. First, there was no standard procedure for running or interpreting the test in either drilling regulations or written industry protocols. They didn’t even have to do it, even though it was critical to ensure the well was properly sealed. Second, there were no internal procedures for running or interpreting negative pressure tests, and the personnel were not formally trained to do so. Third, the engineers did not have in place (or did not enforce) any policy that would have required personnel to call back to shore for a second opinion about confusing data. And finally, due to poor communication, it does not appear that the men performing and interpreting the tests had the full picture. Nor did they approach the testing with any expectation that the well could lack integrity.
The risks of installing the cement plug lower than normal or allowed outweighed the benefits, and this was not clearly addressed with the team. The plan to remove the mud before installing the cement plug was poor as there was no backup protection if the cement failed. Which is what happened. Without the cement in place to balance the pressures in the well, the oil and gas were free to travel up the riser pipe. The crew failed to detect the kick before it was too late. Mind you, so many other things had gone wrong by this point that the entire failure does not land on the missed kick. But had it been caught, this could have been a near miss. Although who knows what other end result would have occurred if that was the case. And lastly, the crew should have diverted the kick mud overboard and activated the blind shear ram right away. Possible reasons why they didn’t are that they may not have recognized the severity, although it was a lot of mud so they should have. And they also didn’t have much time. The explosion occurred 6 to 8 minutes after mud emerged. All that said, the rig crew weren’t adequately trained to respond to this type of emergency. Which is unfortunate, as they bore the brunt of the consequences of the explosion.
There are also a number of root causes, which are based on failures in industry and government that led to poor decision making and risk management at Macondo. There were systemic failures by industry management and by government to provide effective regulatory oversight. There were failures of industry management in decision making processes, communication amongst all parties and a lack of effective training of key engineering and rig personnel. The management process didn’t adequately identify or address risks created by late changes to well design and procedures. While engineering undergoes a serious review process, changes in well design are subject to a management of change process. But changes to drilling procedures weeks or days out are not subject to either. There is no formal risk analysis or internal expert review. This is pretty concerning. Why go through all those hoops to check design and then just allow last minute changes with no oversight? Management failed to ensure the cement was adequately tested, either at the lab or on site. The various groups working on the well failed to communicate with each other or even amongst themselves, leading to individuals making decisions without full context or the big picture. One of the major players on the Macondo well had an earlier near miss on December 23, 2009 in the North Sea. They were able to shut the well before the blowout, although 1 metric ton of oil-based mud went into the ocean. It cost them 11 days of added work and more than 5 million British pounds in expenses. They failed to relay this experience to the crew at Deepwater Horizon. Had they done so, the crew may have reacted differently. There were regulatory failures, such as no requirement for the negative pressure test, or no requirement for a cement test to confirm well stability, which contributed to the failure at Deepwater Horizon.
Endeavours to strengthen regulatory oversight have been attempted and either resisted or not supported by industry, members of Congress or several administrations. And lastly, the team accepted risks to save time and money without consideration for safety and blowout. The reservoir had an expected capacity of 50 million barrels. In April 2010, oil was $78 per barrel; now they obviously weren’t going to extract it all in one month, but let’s just use $78 per barrel for a rough figure. They stood to make 3.9 billion dollars from this well. Factoring in 1 million a day carrying costs, they could have taken 10 years to drill and extract this reservoir and would have broken even. Obviously there are many factors, like the price of oil and other costs I am not privy to. I just wanted to show that the team could have delayed their schedule a day or two to drill this well right and still have been profitable. It’s things like this that are why people think the oil and gas industry is greedy. I realize I am generalizing here. But this doesn’t look good on the industry as a whole.
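For anyone who wants to check that back-of-the-envelope math, here is a minimal sketch using only the rough figures quoted above. It makes the same simplifying assumptions: a flat $78 per barrel, a flat 1 million dollars a day in carrying costs, and no other expenses.

```python
RESERVOIR_BARRELS = 50_000_000  # expected capacity quoted in the episode
PRICE_PER_BARREL = 78           # USD, April 2010 price quoted in the episode
DAILY_COST = 1_000_000          # USD per day, rough carrying cost


def gross_revenue(barrels, price):
    """Revenue before any expenses, at a flat price per barrel."""
    return barrels * price


def break_even_years(revenue, daily_cost):
    """Years of carrying costs the revenue would cover (365-day years)."""
    return revenue / daily_cost / 365


revenue = gross_revenue(RESERVOIR_BARRELS, PRICE_PER_BARREL)
years = break_even_years(revenue, DAILY_COST)
two_day_delay = 2 * DAILY_COST

print(f"Gross revenue: ${revenue:,}")
print(f"Break-even horizon: {years:.1f} years")
print(f"Two-day delay: ${two_day_delay:,} ({two_day_delay / revenue:.3%} of revenue)")
```

On these numbers the break-even horizon comes out a little over ten years, and a two-day delay is a tiny fraction of the gross figure.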
The US courts took BP, Transocean and Halliburton to task for their role in the failure and environmental impact of an 87 day oil leak in the Gulf of Mexico. In January 2013, Transocean agreed to pay $1.4 billion USD for violating the US Clean Water Act. In September 2014, Halliburton agreed to settle legal claims by paying $1.1 billion USD into a trust in three installments over two years. On September 4th, 2014, a US District Judge apportioned 67% of the blame to BP, 30% to Transocean and 3% to Halliburton. BP appealed this and lost on December 8, 2014. While there was no cap on the settlement, BP estimated it would have to pay somewhere in the neighbourhood of $7.8 billion USD. Assuming these numbers are accurate, this failure cost the three companies $10.3 billion USD. Which is over two and a half times the potential revenue, before expenses, that they stood to make on the Macondo well, had it not failed.
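The totals above are simple enough to check. A quick sketch, using only the figures quoted in this episode, so it inherits whatever uncertainty those figures carry:

```python
# Amounts in millions of USD, as quoted in the episode.
PAYMENTS = {"Transocean": 1_400, "Halliburton": 1_100, "BP": 7_800}
POTENTIAL_REVENUE = 3_900  # 50 million barrels at $78 per barrel

total = sum(PAYMENTS.values())
ratio = total / POTENTIAL_REVENUE

print(f"Total: ${total / 1000:.1f} billion")
print(f"Ratio to potential revenue: {ratio:.2f}")
```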
So there you have it, so many things went wrong, leading to the explosion and sinking of Deepwater Horizon. I have to ask, is this how deepwater wells are normally done? It seems that no consideration was made for the fact that this was the deepest well in history and therefore was uncharted territory. Also, the problems with decision making, risk management and communication speak to a much larger systemic problem that extends much, much further than the confines of this one failure. And I don’t know where one would even begin to fix that. So many people have gotten rich off of this cutthroat attitude. And it’s likely so widespread. Just thinking about it is almost overwhelming. But that’s another problem for another day.
For photos, sources and transcripts from this week’s episode head to Failurology.ca. If you’re enjoying what you’re hearing, please rate, review and subscribe to Failurology, so more people can find it. If you want to chat with me, my Twitter handle is @failurology, you can email me at firstname.lastname@example.org, or you can connect with me on LinkedIn. Check out the show notes for links to all of these.
Thanks everyone for listening. And tune in to the next episode of Failurology where I will cover the Folsom dam spillway gate failure that released nearly 40% of Folsom Lake down the American River. But more on that next time. Bye everyone, talk soon!