From the '80s to 2024 - how CI tests were invented and optimized
Abstract
The article discusses the evolution of software testing practices, from manual code reviews in the 1980s to the modern era of continuous integration (CI) testing. It explores how testing has become faster and more automated over time, and the various techniques used to maximize the speed of testing individual code changes and all changes in a codebase.
Q&A
[01] The slowest way to test code - by reading
1. What was the focus of software testing in the 1980s?
- In the 1980s, software testing focused above all else on finding possible errors in the code. Manual code reviews, known as "Fagan Inspections", were a common practice in which groups of engineers would pore over code printouts looking for mistakes.
2. What were the limitations of software testing in the 1990s?
- In the 1990s, software unit tests became more prevalent, but they were predominantly written by specialized software testers using custom tooling and practices. This meant that the original code authors might have blind spots in testing their own code, and the test suites could be slow to execute, taking hours or even a full day to complete.
[02] Testing it yourself, sometimes
1. How did the Extreme Programming (XP) movement change the approach to software testing?
- The XP movement encouraged engineers to write small, isolated tests for each new piece of code they contributed, with the idea that programmers could learn to be effective testers, at least at the unit level. This led to a faster feedback loop for developers, as they could test their code more frequently before merging; a minimal example of such a test follows this section.
2. What were the limitations of the self-testing approach?
- The self-testing approach was an opt-in process, relying on authors to diligently run local tests before merging. Test results also depended on the author's local computer rather than a source-of-truth server, leaving codebases at risk of breaking when a build was cut and the full test suite was executed.
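To make "small, isolated tests" concrete, here is a minimal sketch using Python's standard unittest module; the apply_discount function and its behavior are hypothetical, invented for the example rather than taken from the article.

```python
import unittest


def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function under test: apply a percentage discount."""
    return round(price * (1 - percent / 100), 2)


class ApplyDiscountTest(unittest.TestCase):
    # Small, isolated unit tests the author can run locally before merging.
    def test_ten_percent_off(self):
        self.assertEqual(apply_discount(100.0, 10), 90.0)

    def test_zero_discount_returns_original_price(self):
        self.assertEqual(apply_discount(42.5, 0), 42.5)


if __name__ == "__main__":
    unittest.main()
```

Tests like these take seconds to run, which is what made the tighter XP-style feedback loop possible in the first place.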
[03] Run an automated server to test the changes
1. What was the key development that ushered in the era of automated continuous integration testing?
- The creation of the "Hudson" (later renamed Jenkins) tool by Sun Microsystems engineer Kohsuke Kawaguchi in 2004 was a key development. This automated program acted as a long-lived test server that could automatically verify each code change as it was integrated into the codebase; a simplified sketch of such a test loop follows this section.
2. How did the popularity of Hudson/Jenkins impact the industry?
- The open-sourcing of Hudson and its subsequent explosion in popularity made the first generation of automated continuous integration testing common: it became standard to test every code change as it was written.
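To make the idea of a long-lived test server concrete, the sketch below is a deliberately simplified polling loop in Python; it is not how Hudson/Jenkins works internally, and the repository path, poll interval, and test command are assumptions for the example.

```python
import subprocess
import time

REPO_DIR = "/srv/ci/checkout"      # assumed local clone of the codebase
TEST_COMMAND = ["make", "test"]    # assumed project test entry point
POLL_INTERVAL_SECONDS = 60


def current_commit() -> str:
    """Return the HEAD commit hash of the checked-out repository."""
    return subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=REPO_DIR, capture_output=True, text=True, check=True,
    ).stdout.strip()


def poll_and_test() -> None:
    """Poll for new commits and run the test suite on each one."""
    last_tested = None
    while True:
        subprocess.run(["git", "pull", "--ff-only"], cwd=REPO_DIR, check=True)
        head = current_commit()
        if head != last_tested:
            result = subprocess.run(TEST_COMMAND, cwd=REPO_DIR)
            status = "PASSED" if result.returncode == 0 else "FAILED"
            print(f"{head[:8]}: tests {status}")
            last_tested = head
        time.sleep(POLL_INTERVAL_SECONDS)


if __name__ == "__main__":
    poll_and_test()
```

The value of the real tools was everything around this loop: dashboards, notifications, build history, and plugins, rather than the loop itself.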
[04] Pay someone else to test your changes automatically
1. How did the shift to cloud-based code hosting and continuous integration services impact software testing?
- The shift to hosting code on platforms like GitHub, along with the launch of cloud-based CI services like CircleCI and Travis CI in the late 2000s, allowed smaller companies to outsource the maintenance of the test runners themselves, making it practical for far more teams to test every code change.
2. What were the two key evolutions of cloud-based CI systems in the mid-2010s?
- Zero-maintenance CI systems merged with code hosting services, like GitLab and GitHub Actions, offering an all-in-one solution for triggering CI tests.
- Larger organizations shifted off Jenkins to more modern self-hosted options like Buildkite, allowing them to get the benefits of web dashboards and coordination while still hosting their code and test executions on their own compute.
[05] Maximizing how fast a single change can be tested
1. What are the three main ways to speed up software testing?
- Vertical scaling (using more powerful CPUs)
- Parallelization (defining graphs of testing steps to execute in parallel)
- Caching (minimizing repeated work by caching install and build steps); a sketch of the last two ideas follows this list
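As a rough illustration of parallelization and caching, here is a minimal Python sketch, assuming a requirements.txt lockfile, a pip-based install step, and two independent pytest shards (all examples, not details from the article): the install step is skipped when the lockfile hash is unchanged, and the shards run concurrently.

```python
import hashlib
import pathlib
import subprocess
from concurrent.futures import ThreadPoolExecutor

LOCKFILE = pathlib.Path("requirements.txt")       # assumed dependency lockfile
CACHE_MARKER = pathlib.Path(".ci_install_cache")  # records the last installed hash
TEST_SHARDS = [                                   # assumed independent test groups
    ["pytest", "tests/unit"],
    ["pytest", "tests/integration"],
]


def install_dependencies_with_cache() -> None:
    """Skip the install step when the lockfile has not changed (caching)."""
    lock_hash = hashlib.sha256(LOCKFILE.read_bytes()).hexdigest()
    if CACHE_MARKER.exists() and CACHE_MARKER.read_text() == lock_hash:
        print("Dependencies unchanged, reusing cached install.")
        return
    subprocess.run(["pip", "install", "-r", str(LOCKFILE)], check=True)
    CACHE_MARKER.write_text(lock_hash)


def run_shards_in_parallel() -> bool:
    """Run independent test shards concurrently (parallelization)."""
    with ThreadPoolExecutor(max_workers=len(TEST_SHARDS)) as pool:
        codes = list(pool.map(lambda cmd: subprocess.run(cmd).returncode, TEST_SHARDS))
    return all(code == 0 for code in codes)


if __name__ == "__main__":
    install_dependencies_with_cache()
    print("All shards passed." if run_shards_in_parallel() else "A shard failed.")
```

Real CI systems push the same ideas further, with remote caches shared across machines and step graphs spread over many runners.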
2. What are the theoretical limits of speeding up a single code change validation?
- Engineering teams are reaching the theoretical limit of how fast a single code change can be validated, assuming the validation requirements are to "run all tests and build every code change."
[06] Speeding up how fast all changes can be tested
1. What two approaches are some high-velocity organizations using to save computing resources and give engineers faster feedback?
- Company merge queues that batch and skip test execution on some changes, based on the order of the changes.
- Stacked code changes, where the dependency graph across multiple pull requests is used to batch and bisect testing; a batch-and-bisect sketch follows this list.
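A minimal sketch of the batch-and-bisect idea behind both techniques, assuming a hypothetical test_batch callback that runs the suite against a set of queued changes applied together; the ordering and rebase bookkeeping a real merge queue performs are deliberately omitted, and all names are illustrative.

```python
from typing import Callable, List

Change = str  # e.g. a pull request identifier; illustrative only


def merge_queue_pass(changes: List[Change],
                     test_batch: Callable[[List[Change]], bool]) -> List[Change]:
    """Return the changes that are safe to merge, testing whole batches instead
    of every change individually and bisecting only when a batch fails."""
    if not changes:
        return []
    if test_batch(changes):
        # One green run covers the whole batch: per-change test runs are skipped.
        return changes
    if len(changes) == 1:
        # The batch is a single failing change; reject it.
        return []
    mid = len(changes) // 2
    # The batch is red: bisect into halves to isolate the offending change,
    # re-testing progressively smaller batches.
    return (merge_queue_pass(changes[:mid], test_batch)
            + merge_queue_pass(changes[mid:], test_batch))


if __name__ == "__main__":
    # Toy run: pretend only "pr-3" breaks the build.
    failing = {"pr-3"}
    mergeable = merge_queue_pass(
        ["pr-1", "pr-2", "pr-3", "pr-4"],
        test_batch=lambda batch: not failing.intersection(batch),
    )
    print(mergeable)  # ['pr-1', 'pr-2', 'pr-4']
```

When every change in the queue is healthy, a single green batch run covers all of them, which is where the compute savings come from.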
2. How might AI-based code review potentially change the equation of software testing?
- AI code review could potentially scan diffs for common mistakes in 10 seconds or less, flagging issues like linter concerns, misaligned patterns, and spelling mistakes. This could allow AI review to run faster than traditional CI tests, potentially becoming a new form of "fast, cheap" testing.
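As a loose illustration of what a sub-10-second pre-merge scan could look like, the sketch below checks only the added lines of a git diff against a few arbitrary patterns; the keyword list is a toy stand-in for an AI reviewer, and the base branch name is an assumption.

```python
import re
import subprocess

# Toy stand-in for an AI reviewer: patterns a quick scan might flag.
SUSPICIOUS_PATTERNS = {
    r"\bTODO\b": "leftover TODO",
    r"\bprint\(": "stray debug print",
    r"\brecieve\b": "spelling: 'recieve' -> 'receive'",
}


def review_diff(base_ref: str = "origin/main") -> list[str]:
    """Scan lines added since base_ref and return human-readable findings."""
    diff = subprocess.run(
        ["git", "diff", base_ref, "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout
    findings = []
    for line in diff.splitlines():
        if line.startswith("+") and not line.startswith("+++"):
            for pattern, message in SUSPICIOUS_PATTERNS.items():
                if re.search(pattern, line):
                    findings.append(f"{message}: {line[1:].strip()}")
    return findings


if __name__ == "__main__":
    for finding in review_diff():
        print(finding)
```

Because such a scan looks only at the diff rather than building and testing the whole project, it can return feedback well before the full CI run finishes.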