
Everything Will Ultimately Fail

Posted: 2015-07-30 11:30:50

Tags: hardware, fail, michael, add


Michael Nygard

Hardware is fallible, so we add redundancy. This allows us to survive individual hardware failures, but increases the likelihood of having at least one failure present at any given time.
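The trade-off in the paragraph above can be checked with a little arithmetic. The sketch below assumes that each of n redundant components is, independently of the others, in a failed state with the same probability p at any given moment. The independence assumption, the function names, and the sample value of p are illustrative and do not come from the essay.

```python
def p_any_failed(p: float, n: int) -> float:
    """Probability that at least one of n components is failed right now."""
    return 1.0 - (1.0 - p) ** n


def p_all_failed(p: float, n: int) -> float:
    """Probability that all n components are failed at once (total outage)."""
    return p ** n


if __name__ == "__main__":
    p = 0.01  # assumed chance that a single component is down at a given moment
    for n in (1, 2, 4, 8):
        print(f"n={n}: any failed={p_any_failed(p, n):.4f}, "
              f"all failed={p_all_failed(p, n):.2e}")
```

Under these assumptions, adding components drives the chance of a total outage down sharply while the chance that something is broken at any moment rises, which is the point the paragraph makes.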
Software is fallible. Our applications are made of software, so they’re vulnerable to failures. We add monitoring to tell us when the applications fail, but that monitoring is made of more software, so it too is fallible.
Humans make mistakes; we are fallible also. So, we automate actions, diagnostics, and processes. Automation removes the chance for an error of commission, but increases the chance of an error of omission. No automated system can respond to the same range of situations that a human can.
Therefore, we add monitoring to the automation. More software, more opportunities for failures.
Networks are built out of hardware, software, and very long wires. Therefore, networks are fallible. Even when they work, they are unpredictable because the state space of a large network is, for all practical purposes, infinite. Individual components may act deterministically, but still produce essentially chaotic behavior.
Every safety mechanism we employ to mitigate one kind of failure adds new failure modes. We add clustering software to move applications from a failed server to a healthy one, but now we risk “split-brain syndrome” if the cluster’s network acts up.
It’s worth remembering that the Three Mile Island accident was largely caused by a pressure relief valve, a safety mechanism meant to prevent certain types of overpressure failures.
So, faced with the certainty of failure in our systems, what can we do about it?
Accept that, no matter what, your system will have a variety of failure modes. Deny that inevitability, and you lose your power to control and contain them. Once you accept that failures will happen, you have the ability to design your system’s reaction to specific failures. Just as auto engineers create crumple zones—areas designed to protect passengers by failing first—you can create safe failure modes that contain the damage and protect the rest of the system.
If you do not design your failure modes, then you will get whatever unpredictable (and usually dangerous) ones happen to emerge.
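One way to picture a designed failure mode is a circuit breaker around a call to an unreliable dependency. The essay does not name this pattern; it is used here only as an illustration. The following is a minimal sketch, and the class name `CircuitBreaker`, its parameters, and the fallback convention are all hypothetical choices made for the example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Fails fast with a fallback after repeated errors (hypothetical helper)."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures  # consecutive errors before opening
        self.reset_after = reset_after    # cool-down in seconds while open
        self.clock = clock                # injectable so tests can control time
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, operation: Callable[[], T], fallback: Callable[[], T]) -> T:
        # While open, skip the operation entirely so a sick dependency
        # cannot tie up its callers: this is the designed failure mode.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()
            # Cool-down elapsed: let one trial call through; a single
            # further failure re-opens the breaker.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0
        return result
```

A caller would wrap each remote request, for example `breaker.call(fetch_live_price, use_cached_price)`, where both function names are placeholders. The failure is then contained at one chosen boundary instead of surfacing wherever it happens to emerge.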
Michael Nygard wrote Release It! Design and Deploy Production-Ready Software (Pragmatic Bookshelf), which won a Jolt Productivity award in 2008. His other writings can be found at http://www.michaelnygard.com/blog. This essay appears in 97 Things Every Software Architect Should Know.


Original article: http://blog.csdn.net/wangzi11322/article/details/47144395
