Internet service provider Cloudflare has explained why part of the Internet was unreachable at the beginning of July. In its blog post, Cloudflare also acknowledges the damage done to its customers. “We are ashamed that this happened,” the provider writes. For a few minutes, many users received a 502 Bad Gateway error instead of a website. The reason: a badly written regular expression overloaded the CPU cores of the servers that handle HTTP and HTTPS traffic. It was unintentionally introduced along with a new rule for the company’s Web Application Firewall (WAF).
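How a single regular expression can saturate a CPU core is easiest to see with a small experiment. The following Python sketch is only an illustration under stated assumptions: the pattern `(a+)+$` is a textbook example of a backtracking-prone expression, not the rule Cloudflare deployed, and the names `RISKY`, `SAFE` and `time_match` are made up for this example. On input that almost matches, a backtracking engine such as Python's `re` tries every way of splitting the run of "a" characters, so the work roughly doubles with each additional character.

```python
import re
import time

# Textbook backtracking-prone pattern (an assumption for illustration,
# NOT Cloudflare's actual WAF rule): the nested quantifiers allow the
# engine to split a run of "a"s in exponentially many ways.
RISKY = re.compile(r"(a+)+$")

# Pattern accepting the same strings without nested quantifiers;
# it fails quickly on the same input.
SAFE = re.compile(r"a+$")


def time_match(pattern, text):
    """Return (match result, elapsed seconds) for pattern.match(text)."""
    start = time.perf_counter()
    result = pattern.match(text)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    for n in (10, 14, 18, 20):
        # The trailing "b" forces the match to fail, which triggers
        # the exhaustive backtracking in the risky pattern.
        text = "a" * n + "b"
        _, risky_seconds = time_match(RISKY, text)
        _, safe_seconds = time_match(SAFE, text)
        print(f"n={n:2d}  risky={risky_seconds:.4f}s  safe={safe_seconds:.6f}s")
```

Run as a script, the risky pattern's time should climb steeply as `n` grows while the safe pattern stays nearly constant. Exact timings depend on the machine and the Python version, so none are quoted here.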
“We are constantly improving our rule sets for our WAF to respond to new vulnerabilities and threats,” said developer and CTO John Graham-Cumming. According to him, however, the outage was probably not due to the poorly written expression alone. The fault was also noticed at Cloudflare itself, where traffic dropped by about 80 percent. The globally distributed test programs that check Cloudflare’s servers for correct operation reported many errors. As a result, the London development team gathered for a meeting. At first it was assumed to be “an attack we had never seen before,” writes Graham-Cumming. But it turned out to be an internal bug.
To get the system running again, the team had to shut down the WAF. This is done with a command that can be executed from anywhere, the “Global WAF Kill”. Because of the outage, however, Cloudflare could not access its own products, and the developers could not log in to the internal control panel. An unspecified bypass mechanism eventually allowed a team member to switch off the firewall. It could then be examined offline for the error.
The malfunction occurred on July 2, 2019 and took several websites offline. Even at the time, Cloudflare admitted that its own testing mechanisms were insufficient. The company does not want to repeat this mistake.