As more companies move toward microservices and other distributed technologies, the complexity of these systems increases. You can't remove the complexity, but through Chaos Engineering you can discover vulnerabilities and prevent outages before they impact your customers. This practical guide shows engineers how to navigate complex systems while optimizing to meet business goals.
Two of the field's prominent figures, Casey Rosenthal and Nora Jones, pioneered the discipline while working together at Netflix. In this book, they expound on the what, how, and why of Chaos Engineering while facilitating a conversation from practitioners across industries. Many chapters are written by contributing authors to widen the perspective across verticals within (and beyond) the software industry.
Learn how Chaos Engineering enables your organization to navigate complexity
Explore a methodology to avoid failures within your application, network, and infrastructure
Move from theory to practice through real-world stories from industry experts at Google, Microsoft, Slack, and LinkedIn, among others
Establish a framework for thinking about complexity within software systems
Design a Chaos Engineering program around game days and move toward highly targeted, automated experiments
Learn how to design continuous collaborative chaos experiments
Honestly, down the road I think this will be a three-star book. For now, at this point where the literature stands for this specific topic, though, I think it's four stars. It's a good overview by practitioners at various points in their journey.
But, as an example, there are places where people misinterpret some ideas, not out of maliciousness, just error, not having quite thought their point through enough. For example, while Chaos Engineering is a mode for thinking about a "population" and its overtly visible surface, and as such root-cause analysis isn't always appropriate, root-cause analysis is misinterpreted by some of the authors -- perhaps because they've been in blame-game cultures. Or they've been in cultures where people insist one root cause is the root of dysfunction in a complex system. Root-cause analysis is just part of systems-level thinking: given a set of conditions, how does this generate these behaviors? Where do those conditions come from? It's all about understanding the interbeing of a particular development environment. In particular, in understanding that, we can redefine the game being played to produce better results. Really, Chaos Engineering itself is only another way of changing the rules of play. So it makes sense for those in chaos engineering mode to view things from a certain perspective, but some authors lost track of the fact that it's only one mode.
It's also both good and bad that this is a book where the authors are still articulating how this discipline should work. On the one hand, it's great to see it in its raw, developing state. Eventually, though, it'll become more codified -- that's why I expect it'll be more of a three-star book one day. There will eventually be a rendition of this topic that covers it in a more developed form, perhaps in five years. And perhaps written entirely by one person, as opposed to several, all with different levels of experience.
If you're interested in chaos engineering, though, I think this is something to read for sure.
The book was a bit hard to read, as it wasn't really a book but a set of loosely linked chapters written by different authors. The writing styles and approaches to the topics differed a lot, and I would like to have seen better editorial work on this one.
Despite this, I did enjoy the book and plenty of the chapters within it. The most interesting were the chapters on the application of Security via Chaos Engineering, the two chapters relating to the Human Systems (How do you apply the chaos engineering principles on the organizational level), and the one that was reflecting on the similarities between the material engineering safety practices and the chaos engineering ones.
Altogether, I believe it is worth reading if you already have some idea about Chaos Engineering and want to expand on it.
Chaos engineering is a fairly new and growing area. This book explores it through different companies and the methodologies they have implemented. Very promising and definitely worth studying if you are in software development.
An excellent book that is able to balance theory, real-world cases and plenty of practical takeaways to govern and guide the adoption of chaos engineering practices. This book gets the balance right between theory, practice and application and is enjoyable to read because you can pick it up and put it down due to the chapters being very well organised.
One of the things that I liked a lot was the inclusion of industry-recognised models and other strategies for measuring and assessing how you can adopt and utilise the concepts introduced in this book. There is also a comprehensive list of references, which will be valuable in articulating value up and down an organisation for establishing a chaos engineering practice. The book places a strong emphasis on scientific thinking for running chaos experiments, with the underlying point of chaos engineering being to learn and adapt, not just to selectively break stuff and fix it for the sake of it.
This book will become an excellent reference source, one that should be mandatory reading for anyone working in the software industry, especially those dealing with ever-increasing complexity in the systems they are building and maintaining who desire to increase the overall resiliency of their systems following a proactive and measured practice.
Your mileage may vary with this book. It is well written and structured taking you on a great tour of where Chaos Engineering is today. The problem is that Chaos Engineering as a discipline is new and unsettled, a problem that the authors address and tackle candidly. Because the field is so new, the book is sometimes self-contradictory, another fact that is not lost on the authors. For me, I’m just not at a point to tackle CE just yet. As a result, the competing views were enlightening but lost on me without experience to apply my own views to the conversation.
If you’re currently doing Chaos Engineering, I imagine this book will be invaluable. If you’re dipping your toes, but still a ways away from implementing, you might be able to pass on this book and revisit it when you’re closer. The field might have taken a more solid form by then. If it hasn’t, this will still be a useful guide.
I intend to read this again when I’m ready for CE and wouldn’t be surprised if my rating changed at that time.
Basically it looks like someone drew from long-known practices across the fields of systems engineering, process mining and design, and called it a new field of engineering for hype and marketing. Chaos engineering is only a new field as long as we can consider putting addition, multiplication and division in the same equation a completely new field of mathematics. Sans the hype, it looks like the authors are rediscovering what it means to deliver resiliency and stability across distributed and large-scale systems. The world has been developing and delivering large-scale distributed systems for decades now. Even NASA delivers and monitors SPACE stations using systems engineering, so we can understand how chaos engineering is largely a false hype.
This book is mostly about the managerial know-how around chaos engineering and provides ideas on how you can execute it and the relevant processes in your organization. Unfortunately this did not satisfy me, as the book described very few actual technical solutions and its tone was such that it did not really go deep into what was going on inside each of the described organizations. The book had chapters written by a lot of writers, so the progression between chapters was weird at times.
The book, composed of short stories about Chaos Engineering initiatives, sums up various approaches to making systems more robust and resilient. Some parts/chapters felt pretty basic, but some were quite fascinating and put the whole area in a different light (i.e., the link to systems thinking or human aspects). Definitely recommended.
There is much here of interest; modern practice, in systems which can be torn down, and brought up, and scaled --with a simple series of commands, or a script-- has been the norm for a long time. MTBF is perhaps a key issue for servers and cloud systems; for a neophyte like myself, in this area, this is a welcome addition. Or food for thought, at least: how will we defend such systems from attack?
I didn't read it from cover to cover. The first two chapters are a very interesting take on complexity. Since I don't plan to implement a chaos engineering process right now, the other chapters felt unnecessary.