Randomness and Engineering Failure

While it is not specifically a book about engineering disaster, or failure for that matter, I find some interesting connections to those topics in Leonard Mlodinow’s book, “The Drunkard’s Walk: How Randomness Rules Our Lives”. (Mlodinow is a co-author, with Stephen Hawking, of “A Briefer History of Time”, as well as the author of a number of other very good books — see his Caltech website http://www.its.caltech.edu/~len/).

Mlodinow talks about how we view events with what is referred to as “20-20 hindsight”. If one takes an extraordinary event, like an engineering disaster, it is often fairly straightforward (after a thorough investigation) to identify what we believe are particular, logical causes. However, it is necessary to guard against jumping to conclusions about the judgement of those who may have (in some way) caused the failure. The more epic and emotional the situation, the more likely this rush to judgement is to occur. Disasters are, by definition, both epic and emotional.

When I teach about engineering disaster, I ask students (following their analysis of a disaster) to try to shift their perspective to a working system, before the failure. This is very difficult to do, but in many cases may be the only way to understand factors that affect (or cloud) the judgement of engineers, operators and others who play a key role in the eventual failure of the system. Mlodinow talks about this in terms of a concept referred to as “availability bias” in reconstructing the past. This concept is described succinctly on wisegeek.com thus:
“Availability bias is a human cognitive bias that causes us to overestimate probabilities of events associated with memorable or vivid occurrences. Because memorable events are further magnified by coverage in the media, the bias is compounded on the society level. Two prominent examples would be estimations of the probability of plane accidents, and the abduction of children. Both events are quite rare, but the vast majority of the population wildly overestimates their probability, and behaves accordingly. They are falling prey to the availability bias…”
(http://www.wisegeek.com/what-is-availability-bias.htm).

Hence “availability bias” is a very important concept to consider when teaching (and learning) about engineering disaster.

Probability and uncertainty of events (the primary theme of Mlodinow’s book) are central components of the engineering methodology of “Design for Reliability”. Design for Reliability (or DfR) is really decision making under uncertainty. I will write more about this concept in a future post (and add some links to it in the list of links on this blog page), but I think it would be impossible to teach DfR without first teaching about the concepts and math of probability. We, as engineers, never know for sure how a system might fail. We can only work with likelihood, the relative importance of failure of individual components of a system, and the severity of the consequences of a particular component’s failure for the performance of the overall system. This is why DfR must use structured approaches (to reduce uncertainty) as well as the intuition of engineers. While intuition is extremely valuable, it must be applied within the context of structure (as I have stated in previous posts) to avoid being clouded by emotion, environment, and biases (including, of course, “availability bias”).
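To make that idea of combining likelihood with consequence a little more concrete, here is a minimal sketch in Python. The component names, failure probabilities, and severity scores are entirely hypothetical, invented only for illustration; real DfR work would draw such numbers from test data, field returns, and a formal FMEA. The point is simply to show how a structured ranking can direct design attention, rather than relying on intuition alone.

```python
# Minimal, hypothetical sketch of risk ranking in Design for Reliability (DfR).
# Component names, failure probabilities, and severity scores are invented for
# illustration only.

components = {
    # name: (probability of failure over service life, severity of consequence 1-10)
    "seal":       (0.020, 9),
    "controller": (0.005, 7),
    "fastener":   (0.050, 3),
}

# Rank components by expected consequence (likelihood x severity),
# so design effort goes where it reduces overall risk the most.
ranked = sorted(components.items(),
                key=lambda item: item[1][0] * item[1][1],
                reverse=True)

print("Risk ranking (highest first):")
for name, (p_fail, severity) in ranked:
    print(f"  {name:<11} p={p_fail:.3f}  severity={severity}  risk={p_fail * severity:.3f}")

# If the system fails when any one component fails (a series system),
# system-level reliability is the product of the component reliabilities.
system_reliability = 1.0
for p_fail, _ in components.values():
    system_reliability *= (1.0 - p_fail)
print(f"System reliability (series assumption): {system_reliability:.4f}")
```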

Steel gas mains draw regulators’ scrutiny – U.S. news – msnbc.com

Steel gas mains draw regulators’ scrutiny – U.S. news – msnbc.com.

Read the link for an informative article concerning what engineers can learn from natural gas pipeline failures — and why aging gas infrastructure made of steel is becoming more and more of a problem.  This is clearly illustrated by the recent disaster and tragic loss of life (and homes) in California.

Materials engineers especially have an important role to play in understanding the materials failures (including corrosion) often at the core of gas pipeline failures.  The National Association of Corrosion Engineers has published an on-line case study of their analysis of a gas pipeline failure in New Mexico in 2000.  You can read it at http://events.nace.org/library/corrosion/Pipeline/Carlsbad-explosion.asp.  Obviously corrosion is less of a problem with gas lines that are not made of metal, but chemical attack, environmental degradation and mechanical failures are still hazards engineers need to learn about and pay attention to, especially in light of the aging natural gas infrastructure in the United States.
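A rough back-of-the-envelope calculation can help students see why corrosion-driven wall loss matters so much. The sketch below uses Barlow’s formula for the pressure rating of a thin-walled pipe; all of the numbers (pipe grade, diameter, wall thicknesses, design factor) are hypothetical and chosen only for illustration. Real pipeline integrity assessments follow codes such as ASME B31.8 and account for measured defect geometry, so treat this as a classroom sketch, not an engineering method.

```python
# Rough illustration of why wall loss from corrosion matters for gas pipelines,
# using Barlow's formula P = 2*S*t/D (a standard thin-wall approximation).
# All numbers below are hypothetical and for illustration only.

def barlow_pressure(yield_strength_psi, wall_in, diameter_in, design_factor=0.72):
    """Approximate allowable internal pressure (psi) for a thin-walled pipe."""
    return 2.0 * yield_strength_psi * wall_in * design_factor / diameter_in

yield_strength = 52_000   # psi, e.g. X52 line pipe (illustrative)
diameter = 30.0           # inches, outside diameter (illustrative)
nominal_wall = 0.375      # inches, as built
corroded_wall = 0.250     # inches, after hypothetical metal loss

p_nominal = barlow_pressure(yield_strength, nominal_wall, diameter)
p_corroded = barlow_pressure(yield_strength, corroded_wall, diameter)

print(f"Allowable pressure, nominal wall:  {p_nominal:,.0f} psi")
print(f"Allowable pressure, corroded wall: {p_corroded:,.0f} psi")
print(f"Loss of pressure margin: {100 * (1 - p_corroded / p_nominal):.0f}%")
```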

A professional approach to disaster

One of the roles of professional engineering and scientific societies is to conduct reviews of engineering disasters and help to provide guidance on how to create more reliable designs, structures, etc. in the future. These organizations often convene groups of engineering experts, scientists, legal experts and others to create these studies.

One study I have used in my class was developed by the American Society of Civil Engineers in response to a request for review of the failure of the hurricane protection system in New Orleans during and following Hurricane Katrina in 2005.  This report, entitled “The New Orleans Hurricane Protection System: What Went Wrong and Why”, was created in response to a request to the Society to study and report on the work performed by the Interagency Performance Evaluation Task Force (IPET), a group convened by the U.S. Army Corps of Engineers (USACE) to review the performance of the New Orleans and southeast Louisiana hurricane protection systems.  The report, written by the ASCE External Review Panel convened for this purpose, is available for download at http://www.asce.org/uploadedFiles/Publications/ASCE_News/2009/04_April/ERPreport.pdf.  It is about 90 pages long and is a tremendous resource for learning and teaching about this infamous disaster.  Further, ASCE has published a report entitled “Guiding Principles for the Nation’s Critical Infrastructure”.  This document (available at http://www.asce.org/Content.aspx?id=2147485369) was the result of an industry summit convened in order to “identify content for a guidance document outlining key attributes required for successful, safe, resilient, and sustainable critical infrastructure systems. The document will assist in proactively preventing infrastructure catastrophes such as the levee failures in New Orleans during Hurricane Katrina or the collapse of the I-35W Bridge in Minneapolis.”  (from the website).

It is easy to see that this material is an extremely valuable resource for teaching students about what can be learned from engineering disaster.  I have had students download these documents (in whole or in part) and we have used them to guide discussions in class.  Of course, similar studies and reviews are being conducted by professional organizations in response to the Gulf oil spill.  A new website has been established by the National Academy of Engineering to collect information on what went wrong to cause the spill, and on what can be done to prevent such disasters in the future.  The site is entitled “Blowout Prevention: Analysis of the Deepwater Horizon Explosion, Fire and Oil Spill” (http://sites.nationalacademies.org/BlowoutPrevention/).

As in the case of the Katrina study by ASCE, the study by the National Academy of Engineering/National Research Council has been requested by the government (Department of the Interior), and the purpose of the study is, according to the website, to “address the performance of technologies and practices involved in the probable causes of the Macondo well blowout and explosion on the Deepwater Horizon. It will also identify and recommend available technology, industry best practices, best available standards, and other measures in use around the world in deepwater exploratory drilling and well completion to avoid future occurrence of such events.”

As the causes of the Gulf oil spill disaster continue to be explored (as noted in current news stories from the Associated Press and other sources), and sections of the failed blowout preventer (among other materials) are brought to the surface for analysis, the NAE/NRC website will be an excellent place to look for evolving resources resulting from careful study and review.  This type of material will provide excellent teaching and learning resources for courses involving engineering disaster.

Aging infrastructure and the Long Island Railroad

For those of you looking for interesting teaching opportunities in recent engineering failures (and with the new academic year starting, this may be on your mind), a good area to consider is reports of failure in aging infrastructure. Bridges, gas pipelines, electrical transmission systems — examples (unfortunately) abound, and can often be found in your local area. In fact, I occasionally conduct an “un-nature walk” with my students during which I point out not only how engineers design buildings, roads, street lights, etc., but also how infrastructure fails. From cracked sidewalks to rust from rebar coming through walls, examples are everywhere.

An interesting example with teaching potential, in my neck of the woods, is the recent failure (due to a fire in a century-old switch room) of the Long Island Railroad’s venerable Jamaica hub. The New York Times has covered this nicely (see http://www.nytimes.com/2010/08/24/nyregion/24lirr.html?_r=1&scp=1&sq=long%20island%20transit&st=cse for example), and there is plenty of additional information available. Of course, a major problem with replacing aging infrastructure is cost, which presents a difficulty for engineers and politicians alike. Where, and what, do you repair or replace first? How do you create an effective maintenance plan, taking into account constraints in time, labor, money, etc.? (One simple way to frame that question for students is sketched below.) Hopefully, we won’t always have to wait for a failure (or a disaster in terms of lives or costs) to provide guidance.
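Here is a toy sketch of one way to think about “what do you repair first?” when the budget cannot cover everything: rank candidate repairs by estimated risk reduction per dollar and fund them greedily. The assets, costs, and risk numbers are entirely hypothetical, and a real capital plan would involve far richer models (and plenty of politics); this is only meant to make the trade-off concrete for a class discussion.

```python
# Hypothetical repair candidates: (asset, repair cost in $, estimated annual
# risk reduced in $, e.g. expected loss avoided). All numbers are invented.
repairs = [
    ("switch room wiring", 2_000_000, 1_500_000),
    ("bridge deck joints",   800_000,   400_000),
    ("cast-iron gas main", 3_500_000, 3_000_000),
    ("signal relay hut",     500_000,   150_000),
]

budget = 4_000_000

# Greedy heuristic: fund the best risk reduction per dollar first,
# skipping anything that no longer fits in the remaining budget.
funded, spent = [], 0
for asset, cost, benefit in sorted(repairs, key=lambda r: r[2] / r[1], reverse=True):
    if spent + cost <= budget:
        funded.append(asset)
        spent += cost

print(f"Budget: ${budget:,}  Spent: ${spent:,}")
print("Funded this cycle:", ", ".join(funded))
```

Even a sketch this simple raises the right questions: how do we estimate the “risk reduced” numbers, what do we do about repairs that are indivisible or interdependent, and what happens to the assets that keep getting deferred?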

Good luck in the new academic year, and please write back with your thoughts on learning from disaster!

New York Times wins award for series on problems with medical radiation treatments.

A few posts ago, I wrote about a recent New York Times article on accidental overdoses of radiation received by some patients undergoing an intensive form of CAT scan used for the detection of strokes. This article is part of an investigative series in the Times, which has now been recognized with a Public Service Award from the Associated Press Managing Editors association. The full article about this award can be found at http://www.google.com/hostednews/ap/article/ALeqM5jTEYAaTLag_kgedHrdnxv0QHbSAgD9HQLMIO0.

I had read some of the previous articles in this series as well — apparently, it is a challenge to design medical treatment equipment that is easy to use and flexible enough to allow variability in treatment (a good thing) while preventing accidental (or even intentional) misuse. Engineers have to pay attention to these critical needs — part of an area of engineering design called ‘ergonomics’.

Ergonomics is sometimes referred to as ‘human design’ or, in other words, design for humans. Knowing how big to make a car seat, how far to put controls from a pilot, and so forth, is only part of it. Ergonomics also includes issues of aesthetics (the appearance of a product) and safety. Safety encompasses not just designing so that the product does not have hazardous components, sharp edges, etc., but also so that the product has clear instructions, and proper safety warnings and labeling. This is a critical part of design — without proper attention to ergonomic and safety needs, misuse and errors can lead to disaster.

A new blog on disasters …

James Chiles, the author of “Inviting Disaster: Lessons from the Edge of Technology” (listed on our bibliography page) has developed a blog with his take on some recent engineering disasters. Please have a look! It is at: http://disaster-wise.blogspot.com/

Design lessons from medical treatment disaster

A recent article in the New York Times (“The Mark of an Overdose”, by Walt Bogdanich, 8/1/10, http://www.nytimes.com/2010/08/01/health/01radiation.html?pagewanted=all) discusses dangers to patients from CT brain scans. In many cases (and there are shocking pictures of the damage in the article), patients who had CT scans to test for a stroke received an overdose (up to 13 times the usual amount for the test), which resulted in striking hair loss, as well as headaches and memory-related symptoms. As stated in the article, the review by the NYT found that “While in some cases technicians did not know how to properly administer the test, interviews with hospital officials and a review of public records raise new questions about the role of manufacturers, including how well they design their software and equipment and train those who use them.” The author found that during application of the test in a way which would provide better images of blood flow, technicians used an automatic feature of the equipment which they thought would lower radiation levels, but in fact raised them. While this excellent article provides a much more in-depth analysis of the problem (and you should read it), I wish to focus on a particular aspect related to engineering design and what can be learned about it.

What is the responsibility of an engineer and their company in designing a system so as to avoid possibly disastrous user error? We can cite many examples where this becomes a critical issue — other medical examples like the titular failure of the cancer treatment equipment in the book “Set Phasers on Stun” (listed on the bibliography page) and non-medical failures such as the one that played a contributing role in the Bhopal/Union Carbide chemical plant disaster. In that case, negligence due to inadequately trained operators played a critical role.

As we engineers and designers of powerful equipment and systems study these failures, the case studies emphasize the need for manufacturers to provide comprehensive and clear instructions, and to design “fail-safe” features and limiting functions into equipment to automatically prevent dangerous situations from occurring. Of course, this can never be accomplished completely. The light-hearted phrase “build an idiot-proof system, and someone will build a better idiot” speaks, even in this deadly serious context, to the truth that it is the responsibility of the conscientious engineer to do their best to anticipate all uses, and misuses, of the machines, systems and processes they design.
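To make the idea of a “limiting function” concrete, here is a minimal sketch in Python of a software interlock that refuses to apply a setting outside a configured safe envelope, and requires an explicit operator confirmation above a warning level rather than silently proceeding. The names and numeric limits are invented for illustration; real medical device software is developed under standards such as IEC 62304 and is vastly more involved than this sketch.

```python
# Minimal illustration of a software "limiting function": a requested value is
# checked against a configured safe envelope before it is applied, and anything
# outside that envelope is rejected rather than silently adjusted.
# Names and limits are hypothetical, for illustration only.

MAX_DOSE_MGY = 500.0   # hypothetical hard ceiling for this scan protocol
WARN_DOSE_MGY = 300.0  # hypothetical level requiring explicit confirmation

class DoseLimitError(Exception):
    """Raised when a requested dose falls outside the safe envelope."""

def validate_dose(requested_mgy, operator_confirmed=False):
    """Return the dose to apply, or raise if it is outside the safe envelope."""
    if requested_mgy > MAX_DOSE_MGY:
        # Fail safe: refuse outright rather than clamping or proceeding.
        raise DoseLimitError(
            f"Requested {requested_mgy} mGy exceeds hard limit {MAX_DOSE_MGY} mGy")
    if requested_mgy > WARN_DOSE_MGY and not operator_confirmed:
        raise DoseLimitError(
            f"Requested {requested_mgy} mGy exceeds {WARN_DOSE_MGY} mGy "
            "and was not explicitly confirmed by the operator")
    return requested_mgy

if __name__ == "__main__":
    print(validate_dose(200.0))                           # normal case
    print(validate_dose(350.0, operator_confirmed=True))  # confirmed high dose
    try:
        validate_dose(900.0)                              # rejected outright
    except DoseLimitError as err:
        print("Blocked:", err)
```

Note the design choice: the check rejects and reports rather than “helpfully” correcting the value, because silent adjustment is exactly the kind of automatic behavior that surprised the technicians in the NYT story.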

The Gulf oil spill as a learning tool — information around the web

I have noticed some great information lately about how to use the Gulf oil spill in classroom and problem-based learning activities. A blog page by Eric Brunsell at http://www.edutopia.org/blog/oil-spill-project-based-learning-resources (Project-Based Learning and the Gulf Oil Spill) collects many of these resources nicely. There is also a great learning activity for grades 6-12 from the New York Times at http://learning.blogs.nytimes.com/2010/05/05/the-drill-on-the-spill-learning-about-the-gulf-oil-leak-in-the-lab/. I strongly recommend both if you are thinking about ways to use the Gulf oil spill in the classroom or as part of group assignments.