Testing Distributed Systems
"Testing software is hard enough when it’s constrained to a single machine, but testing distributed systems that consist of multiple components adds a whole new level of complexity. Usually, a new cluster of real, virtual, or emulated machines needs to be created, provisioned, and destroyed for every test run. That means release building, dependency management, cluster deployment, configuration management, and parallel provisioning must all be automated. Throw in responsiveness to continuous changes, and the test harness might end up being more complex than the system it’s testing! I’ll review several options for locally testing Mesos frameworks, Marathon applications, and DCOS services. Then I’ll go over some lessons I’ve learned from testing continuously, including how to containerize tests, how to structure test pipelines, and how to debug failures."