How to Avoid Cascading Failures in Distributed Systems

Cascading failures are failures that involve some kind of feedback mechanism. In distributed software systems, they generally involve a feedback loop in which some event causes a reduction in capacity, an increase in latency, or a spike in errors, and the system's response to that event then makes the original problem worse. Laura Nolan explores them using public accounts of real production incidents.
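
To make the feedback loop concrete, here is a minimal toy simulation of one common pattern, retry amplification. It is only a sketch under stated assumptions and is not taken from the talk: the single-service model, the function name `simulate`, and every number in it (base load, capacity, the length of the capacity dip, the retries issued per failure) are hypothetical and chosen purely for illustration.

```python
def simulate(retries_per_failure, ticks=12, base_load=80, capacity=100,
             dip_capacity=50, dip_ticks=range(5, 8)):
    """Return the offered load per tick for a toy single-service model.

    Assumptions (all hypothetical): clients send `base_load` requests per
    tick, the service completes at most `capacity` requests per tick, and
    every failed request produces `retries_per_failure` extra requests on
    the next tick.
    """
    offered_history = []
    pending_retries = 0
    for tick in range(ticks):
        # The triggering event: capacity drops for a few ticks.
        cap = dip_capacity if tick in dip_ticks else capacity
        offered = base_load + pending_retries
        # Anything beyond capacity fails in this tick.
        failed = max(0, offered - cap)
        # The feedback: failures come back as additional load.
        pending_retries = failed * retries_per_failure
        offered_history.append(offered)
    return offered_history


no_retries = simulate(retries_per_failure=0)
with_retries = simulate(retries_per_failure=2)

print("offered load without retries:", no_retries)
print("offered load with retries:   ", with_retries)
```

In this model, the run without retries returns to its normal load as soon as capacity is restored, while the run with two retries per failure keeps growing after the dip ends, because each tick's failures create more load than the service can clear. Real systems are bounded by timeouts, queue limits, and retry budgets, so the unbounded growth here is an artifact of the simplification, not a prediction.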

Source: infoq.com