The outages on Monday at Facebook, WhatsApp and Instagram occurred because of a problem in the company’s domain name system, a crucial component of the internet that is little known to the general public.
Commonly known as DNS, it’s like a phone book for the internet. It’s the tool that converts a web domain, like Facebook.com, into the actual internet protocol, or IP, address where the site resides. Think of Facebook.com as the person one might look up in the white pages, and the IP address as the physical address they’ll find.
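As a rough illustration of the phone-book analogy, the lookup can be sketched in a few lines of Python. This is a toy stand-in, not how a real DNS resolver works: the domain example.com and the address 192.0.2.10 are placeholders reserved for documentation, not Facebook’s actual records, and the function name resolve is hypothetical.

```python
# Toy "phone book": maps domain names to IP addresses.
# The entries are illustrative placeholders, not real DNS records.
PHONE_BOOK = {
    "example.com": "192.0.2.10",
}

def resolve(domain):
    """Return the IP address listed for a domain, or None if there is no entry."""
    # Domain names are case-insensitive, so normalize before looking up.
    return PHONE_BOOK.get(domain.lower())

print(resolve("Example.com"))      # 192.0.2.10
print(resolve("missing.example"))  # None
```

When the entry is missing or cannot be reached, the lookup returns nothing, and a browser has no address to connect to.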
On Monday, a technical problem related to Facebook’s DNS records caused outages. When a DNS error occurs, a browser can’t translate Facebook.com into the IP address it needs to load a user’s profile page. That’s apparently what happened inside Facebook, but at a scale that has temporarily crippled the entire Facebook ecosystem.
Not only are Facebook’s primary platforms down, but so too are some of the company’s internal applications, including its own email system. Users on Twitter and Reddit also indicate that employees at the company’s Menlo Park, California, campus were unable to access offices and conference rooms that required a security badge. That could happen if the system that grants access is also connected to the same domain, Facebook.com.
The problem at Facebook Inc. appears to have its origins in the Border Gateway Protocol, or BGP. If DNS is the internet’s phone book, BGP is its postal service. When a user sends data over the internet, BGP determines the best available paths that data can travel.
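The idea of choosing a best path can be sketched as follows. This is a deliberate simplification: real BGP weighs several attributes in order, and the sketch assumes only one of them, the number of networks a route crosses. The network numbers are placeholders from the range reserved for documentation, and the name best_path is hypothetical.

```python
# Each advertised route lists the networks (autonomous systems) that data
# would cross to reach the destination. The numbers are placeholders.
advertised_routes = [
    [64496, 64497, 64500],  # crosses three networks
    [64498, 64500],         # crosses two networks
]

def best_path(routes):
    """Pick the route that crosses the fewest networks, or None if no routes exist."""
    if not routes:
        return None
    return min(routes, key=len)

print(best_path(advertised_routes))  # [64498, 64500]
print(best_path([]))                 # None
```

With no routes advertised at all, there is no path to choose, which is the situation a destination is left in when its routes disappear.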
Public records show that a large number of changes were made to Facebook’s BGP routes minutes before its platforms stopped loading, according to a tweet from Cloudflare Inc.’s chief technology officer, John Graham-Cumming.
While the BGP snafu may explain why Facebook’s DNS has failed, the company hasn’t yet commented on why the BGP routes were withdrawn early on Oct. 4.
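In general terms, withdrawn routes can take DNS down with them, because a name server that can’t be reached can’t answer lookups. The sketch below illustrates that dependency under stated assumptions: the routing table, the peer name and the name server address 192.0.2.53 are all hypothetical placeholders, and nothing here reflects Facebook’s actual network.

```python
import ipaddress

# Toy routing table: network prefix -> next hop. All values are placeholders.
routes = {
    ipaddress.ip_network("192.0.2.0/24"): "peer-a",
}

def next_hop(table, address):
    """Return the next hop for the most specific matching prefix, or None."""
    ip = ipaddress.ip_address(address)
    matches = [net for net in table if ip in net]
    if not matches:
        return None  # no route: the address is unreachable
    return table[max(matches, key=lambda net: net.prefixlen)]

name_server = "192.0.2.53"  # hypothetical DNS server address
print(next_hop(routes, name_server))  # peer-a

# Withdrawing the route removes the only path to the name server.
del routes[ipaddress.ip_network("192.0.2.0/24")]
print(next_hop(routes, name_server))  # None
```

Once the route is gone, the name server still exists but no traffic can find it, so every lookup that depends on it fails.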