“Facebook can’t be down, can it?” we thought, for a second.
Today at 15:51 UTC, we opened an internal incident entitled “Facebook DNS lookup returning SERVFAIL” because we were worried that something was wrong with our DNS resolver 1.1.1.1. But as we were about to post on our public status page, we realized something more serious was going on.
Social media quickly burst into flames, reporting what our engineers rapidly confirmed too. Facebook and its affiliated services WhatsApp and Instagram were, in fact, all down. Their DNS names stopped resolving, and their infrastructure IPs were unreachable. It was as if someone had “pulled the cables” from their data centers all at once and disconnected them from the Internet.
This wasn’t a DNS issue itself, but failing DNS was the first symptom we’d seen of a larger Facebook outage.
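For readers who want to see what that first symptom looks like on the wire, here is a minimal sketch of a probe that sends one DNS query over UDP and reports the response code. It is not the tooling we used internally; the function names (`build_query`, `response_code`, `probe`) and the default resolver address are illustrative assumptions, and the RCODE table follows RFC 1035.

```python
import socket
import struct

# RCODE values from RFC 1035; 2 is SERVFAIL.
RCODES = {
    0: "NOERROR",
    1: "FORMERR",
    2: "SERVFAIL",
    3: "NXDOMAIN",
    4: "NOTIMP",
    5: "REFUSED",
}


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record with recursion desired."""
    # Header: ID, flags (RD=1), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=A (1), QCLASS=IN (1)
    return header + qname + struct.pack("!HH", 1, 1)


def response_code(message: bytes) -> str:
    """Return the RCODE name held in the low 4 bits of the header flags."""
    if len(message) < 12:
        raise ValueError("DNS message shorter than the 12-byte header")
    (flags,) = struct.unpack("!H", message[2:4])
    rcode = flags & 0x000F
    return RCODES.get(rcode, f"RCODE{rcode}")


def probe(name: str, resolver: str = "1.1.1.1", timeout: float = 3.0) -> str:
    """Send one UDP query to the resolver and report its response code.

    The resolver address is an assumed default; substitute the one under test.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (resolver, 53))
        data, _ = sock.recvfrom(512)
    return response_code(data)
```

Calling `probe("facebook.com")` returns the response code name for that lookup. A `SERVFAIL` means the resolver could not get an answer from the authoritative servers, which is different from `NXDOMAIN`, where the authoritative servers answer that the name does not exist.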
How’s that even possible?