Lessons from Japan

The terrible results of the earthquake and tsunami last week in Japan hold many lessons for engineers. It is of course obvious that the impact on human health and the search for and care of survivors should be the primary concern. However, much has occurred and is occurring which can provide insight into the design of flood control systems, earthquake-resistant building and infrastructure design, and the safety of nuclear reactor facilities.

While the most recent focus has been on the nuclear reactors and the damage to the spent fuel pool, a recent article in the New York Times (http://www.nytimes.com/2011/03/14/world/asia/14seawalls.html?_r=1&pagewanted=all) discusses the design of seawalls.  As in the case of the hurricane protection system built in New Orleans (which failed in a spectacular way after Hurricane Katrina, due primarily to poor design and outdated data, as well as the failure of backup pumps which should have removed some of the water that initially flowed over the levees), the seawalls designed to protect shoreline areas, including the nuclear power facilities, were overwhelmed.  The New York Times article (“Seawalls Offered Little Protection Against Tsunami’s Crushing Waves” by Norimitsu Onishi, 3/13/11) quotes one engineer, Peter Yanev, who points out the fatal miscalculation that “the diesel generators [used to pump cooling water] were situated in a low spot on the assumption that the walls were high enough to protect against any likely tsunami.”  While higher seawalls can be constructed, it is always possible that a wave will occur that is too large even for a seawall 40 feet high or more.  This is not to say that seawalls are useless (in fact they have protected communities and power facilities from typhoons and smaller tsunamis).  It does teach engineers that the best “defense” against nature may be to site critical equipment (and in some cases entire facilities) in stable, protected locations, and to apply the principles of “absolute worst case design” in such cases.

Absolute worst case design (or just “worst case design”) is an important technique (along with hazard analysis and redundancy) used to enhance the reliability of complex systems.  It is most often used in electronics design, but it also plays an important role in military and space systems.  As you might guess, it starts with the basic idea that you design your system to withstand the worst possible operating conditions.  We often note that electronics or mechanical devices designed for military use tend to be very expensive — in fact this is a common criticism of expenditures for items built for the U.S. Department of Defense.  Yet one contributing factor in this cost is the “worst case” design specifications used.  A computer used in your home has far fewer requirements (in terms of reliability) than one designed to go into a tank or into a spacecraft.  By developing design requirements which take into account extreme conditions coupled with the need for high reliability, engineers can create systems able to handle harsh conditions without failing.  This concept should certainly be applied to nuclear reactor components, including cooling systems.
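To make the idea concrete, here is a minimal sketch (in Python, with entirely hypothetical component values and specifications of my own invention) of the kind of arithmetic a worst case analysis involves for a simple electronic circuit: rather than checking how a voltage divider behaves at its nominal values, every parameter is pushed to its extreme at the same time and the result is compared against the specification.

```python
# Minimal worst-case analysis sketch -- hypothetical values throughout.
# Instead of evaluating the circuit at nominal conditions, every parameter
# is driven to its extreme simultaneously and the combined result is
# checked against the required specification.

from itertools import product

V_IN = (11.5, 12.5)                 # assumed supply voltage range, volts
R1_NOM, R2_NOM = 10_000.0, 5_000.0  # nominal divider resistances, ohms
TOL = 0.05                          # assumed +/-5% resistor tolerance
DRIFT = 200e-6 * 60.0               # assumed 200 ppm/degC over a 60 degC swing
SPEC = (3.6, 4.4)                   # assumed required output window, volts

def divider(v_in, r1, r2):
    """Output of an unloaded resistive voltage divider."""
    return v_in * r2 / (r1 + r2)

worst_lo, worst_hi = float("inf"), float("-inf")
for v_in, s1, s2, d1, d2 in product(V_IN, (-TOL, TOL), (-TOL, TOL),
                                    (-DRIFT, DRIFT), (-DRIFT, DRIFT)):
    v_out = divider(v_in, R1_NOM * (1 + s1) * (1 + d1),
                          R2_NOM * (1 + s2) * (1 + d2))
    worst_lo, worst_hi = min(worst_lo, v_out), max(worst_hi, v_out)

print(f"worst-case output: {worst_lo:.3f} V to {worst_hi:.3f} V")
print("meets spec" if SPEC[0] <= worst_lo and worst_hi <= SPEC[1]
      else "fails spec under worst-case conditions")
```

The point of the exercise is that a design which looks perfectly healthy at nominal conditions can still fail when every tolerance stacks up against it at once, which is exactly the situation a “worst case” specification is meant to expose.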

If you are interested in reading more about the U.S. Army’s “design for reliability” practices, there is a handbook available at http://www.amsaa.army.mil/ReliabilityTechnology/RelFiles/Design%20for%20Reliability%20Handbook.pdf.

One other concept which is very important in ensuring the reliability of critical systems is the use of engineering standards.  Standards for nuclear power facilities (both for design as well as for operations and maintenance — including handling fuel and waste) are some of the most complex and rigorous ever developed.  For example, the American Nuclear Society maintains a set of standards which consider everything from “Nuclear Criticality Safety Training” to “Containment System Leakage Testing Requirements” to “Nuclear Plant Response to an Earthquake”.  ANS, with the experience of many engineers and scientists to guide it, has developed standards for fuel handling, determining the impact of weather on facilities, alarm systems and reactor design (see http://www.new.ans.org/standards/).

In 2006, ANS published a position statement on Nuclear Facility Safety Standards (http://www.ans.org/pi/ps/docs/ps24.pdf).  In it, they state:

“The American Nuclear Society believes that consistent application of such standards provides a high level of safety. The ultimate responsibility for ensuring safety, however, rests with the operator of the nuclear facility in rigorously applying these standards. An effective and independent regulatory authority is also essential.”

As always, while use of standards is critical, engineering design is essentially a “human” enterprise, and it is up to those who design, operate and maintain nuclear facilities to make safety their highest priority — a lesson learned from Three Mile Island and Chernobyl as well.

A great website on Galloping Gertie…

There is a terrific website for learning about the Tacoma Narrows Bridge disaster at http://www.wsdot.wa.gov/tnbhistory/Machine/machine3.htm (from the Washington State Department of Transportation). I especially like their analysis because they discuss the psychological “blind spot” of the engineers who designed the bridge — how the distinguished, accomplished engineers of the early twentieth century somehow ‘forgot’ the aerodynamic lessons of 19th century bridge designers. Combined with a failure to fully understand the often complex torsional effects of wind on suspended structures, this oversight led to one of the most spectacular bridge disasters (and provides a terrific learning tool). Thank you, Washington State DOT!

Article on Learning from Disaster in Prism magazine

http://www.prism-magazine.org/dec10/feature_02.cfm

Charles Q. Choi has written an article entitled “Learning from Disaster” in Prism (published by the American Society for Engineering Education). It focuses primarily on lessons engineers have learned from the Gulf oil spill, including the role of redundancy (done correctly) in limiting the possibility of failure, and the need for engineers to be conscientious in their designs and persuasive and forthcoming in their critiques of engineered systems when they know something to be wrong.

NRC: Backgrounder on the Three Mile Island Accident

NRC: Backgrounder on the Three Mile Island Accident.

As I prepare for another semester teaching “Learning from Disaster”, I am collecting background materials for study.  The above link provides an excellent outline of the Three Mile Island nuclear power plant failure, which I will use in class this semester as a reading assignment prior to discussion of this disaster.

I will post more of these materials, as well as my class syllabus, over the next few weeks.

Historical bias in disasters

I have recently read a fascinating account of the yellow fever epidemic which killed a significant percentage of the population of Philadelphia (and the surrounding area) in 1793. The book is entitled “Bring Out Your Dead” by J.H. Powell (originally published in 1949 by the University of Pennsylvania Press, and reissued by Time Life Books in 1965). While this is an account of a natural disaster (with important human factors — especially bias-related errors made in determining the likely cause of the plague and the best treatment methods), many of the author’s observations can be applied to issues of bias in the human causes and study of any disaster. The following quote I found especially relevant:
“Facts do not make tradition; they are swept away as it forms. But tradition makes history in its own terms, and gives each disaster such place in knowledge as men can know and use. This process begins as soon as disaster is over, and as soon as those who survive begin to forget a part of their experience, and devote the unforgettable remnant to some use. Afterwards, disaster is tortured by reason, tragedy averted by the simple persistence of living.”
While the author referred to the yellow fever epidemic, and how the key actions of the doctors and others involved were viewed by history, we can see how this might apply to bias in the analysis of more recent engineering disasters.

“Teaching by Disaster”

I presented a paper entitled “Teaching by Disaster: The Ethical, Legal and Societal Implications of Engineering Disaster” at the American Society for Engineering Education Fall 2010 Middle Atlantic Section Conference (October 15-16, Villanova University), which discussed the results of my first course devoted exclusively to engineering disasters. I have included the text of the paper as a separate page with a link on the right — please have a look and let me know what you think!

SEC-CFTC Flash Crash Report Leaves Experts Unsatisfied

SEC-CFTC Flash Crash Report Leaves Experts Unsatisfied.

This news report, published today on the FinanceTech website and written by Ivy Schmerken, discusses lessons learned from studying the causes of the May 6th “flash crash” of the stock market.  Discussing what can be learned about problems with computerized trading, the article states: “In terms of lessons learned from, it points out the perils of algorithmic trading strategy run amok when it is executed without regard to time and price.”  It is another example of what engineers (or software designers) can learn from a system failure in order to make more reliable/robust designs.

Randomness and Engineering Failure

While it is not specifically a book about engineering disaster, or failure for that matter, I do find some interesting connections in the ideas presented in a book by Leonard Mlodinow, “The Drunkard’s Walk: How Randomness Rules Our Lives”. (Mlodinow is a co-author, with Stephen Hawking, of “A Briefer History of Time”, as well as a number of other very good books — see his Caltech website http://www.its.caltech.edu/~len/).

Mlodinow talks about how we view events with what is referred to as “20-20 hindsight”. If one takes an extraordinary event, like an engineering disaster, it is often fairly straightforward (after a thorough investigation) to identify what we believe are particular, logical causes. However, it is necessary to guard against jumping to conclusions about the judgement of those who may have (in some way) caused the failure. The more epic and emotional the situation, the more likely this rush to judgement is to occur. Disasters are, by definition, both epic and emotional.

When I teach about engineering disaster, I ask students (following their analysis of a disaster) to try to shift their perspective to a working system, before the failure. This is very difficult to do, but in many cases may be the only way to understand factors that affect (or cloud) the judgement of engineers, operators and others who play a key role in the eventual failure of the system. Mlodinow talks about this in terms of a concept referred to as “availability bias” in reconstructing the past. This concept is described succinctly on wisegeek.com thus:
“Availability bias is a human cognitive bias that causes us to overestimate probabilities of events associated with memorable or vivid occurrences. Because memorable events are further magnified by coverage in the media, the bias is compounded on the society level. Two prominent examples would be estimations of the probability of plane accidents, and the abduction of children. Both events are quite rare, but the vast majority of the population wildly overestimates their probability, and behaves accordingly. They are falling prey to the availability bias…”
(http://www.wisegeek.com/what-is-availability-bias.htm).

Hence “availability bias” is a very important concept to consider when teaching (and learning) about engineering disaster.

Probability and the uncertainty of events (the primary theme of Mlodinow’s book) are central components of the engineering methodology of “Design for Reliability”. Design for Reliability (or DfR) is really decision making under uncertainty. I will write more about this concept in a future post (and add some links to it in the list of links on this blog page), but I think it would be impossible to teach DfR without first teaching the concepts and math of probability. We, as engineers, never know for sure how a system might fail. We can only work with likelihood, the relative importance of the failure of individual components of a system, and the severity of the consequences of a particular component’s failure for the performance of the overall system. This is why DfR must use structured approaches (to reduce uncertainty) as well as the intuition of engineers. While intuition is extremely valuable, it must be applied within the context of structure (as I have stated in previous posts) to avoid being clouded by emotion, environment, and biases (including, of course, “availability bias”).
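As a small illustration of what such a structured approach can look like, here is a sketch (in Python) of a simplified failure modes and effects analysis, one common DfR tool. The components, failure modes and 1-to-10 ratings below are invented purely for illustration and are not taken from any real facility.

```python
# Simplified FMEA-style ranking (one common structured DfR tool).
# The components, failure modes and 1-to-10 ratings are invented for
# illustration only. Each failure mode is scored for severity of its
# consequences, likelihood of occurrence, and difficulty of detection;
# the product (the "risk priority number") suggests where to focus
# redundancy and mitigation effort.

from dataclasses import dataclass

@dataclass
class FailureMode:
    component: str
    mode: str
    severity: int    # 1 (negligible) .. 10 (catastrophic)
    occurrence: int  # 1 (rare) .. 10 (almost certain)
    detection: int   # 1 (easily detected) .. 10 (undetectable before failure)

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection

modes = [
    FailureMode("backup diesel generator", "fails to start when flooded", 10, 4, 3),
    FailureMode("coolant pump seal", "slow leak", 6, 5, 4),
    FailureMode("control room alarm", "spurious (false) alarm", 3, 7, 2),
]

# Rank the failure modes so mitigation effort goes to the highest risk first.
for fm in sorted(modes, key=lambda m: m.rpn, reverse=True):
    print(f"RPN {fm.rpn:4d}  {fm.component}: {fm.mode}")
```

Of course, such a ranking is only as good as the numbers that go into it, which is precisely where the structured process and the engineer’s intuition have to meet.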