What Caused Gmail To Go Down?

Gmail went down yesterday, prompting a surge in searches from people trying to confirm the outage and find out what caused it.

Now Google engineer Ben Treynor has explained what happened on the official Gmail blog:

We took a small fraction of Gmail’s servers offline to perform routine upgrades. We had slightly underestimated the load which some recent changes (ironically, some designed to improve service availability) placed on the request routers — servers which direct web queries to the appropriate Gmail server for response.

At about 12:30 pm Pacific a few of the request routers became overloaded and in effect told the rest of the system “stop sending us traffic, we’re too slow!”. This transferred the load onto the remaining request routers, causing a few more of them to also become overloaded, and within minutes nearly all of the request routers were overloaded.

As a result, people couldn’t access Gmail via the web interface because their requests couldn’t be routed to a Gmail server. IMAP/POP access and mail processing continued to work normally because these requests don’t use the same routers.

Google restored service after about two hours by bringing more request routers online and spreading the traffic among them. The company says it is tweaking its architecture so that the problem doesn’t happen again.
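
To see how a small shortfall can snowball the way Treynor describes, here is a rough Python sketch. It is a toy model, not Google's actual system: the router capacities, the load figures, and the function name simulate_cascade are all made up for illustration, and it assumes traffic is split evenly across whichever routers are still accepting requests.

```python
def simulate_cascade(capacities, total_load):
    """Toy model of a cascading overload (all numbers are hypothetical).

    capacities: requests per second each router can handle.
    total_load: total requests per second, split evenly across the
                routers that are still accepting traffic.
    Returns the number of healthy routers after each rebalancing round.
    """
    healthy = list(capacities)
    history = [len(healthy)]
    while healthy:
        share = total_load / len(healthy)  # even split across healthy routers
        # A router that cannot keep up with its share stops taking traffic.
        survivors = [c for c in healthy if c >= share]
        if len(survivors) == len(healthy):
            break  # nobody dropped out, so the system is stable
        healthy = survivors
        history.append(len(healthy))
    return history


# Twenty routers with slightly different capacities (100 to 119 req/s).
routers = list(range(100, 120))

print(simulate_cascade(routers, 1990))              # [20]
print(simulate_cascade(routers, 2010))              # [20, 19, 14, 0]
print(simulate_cascade(routers + [110] * 10, 2010)) # [30]
```

In this model a load of 1990 is fine, but nudging it up to 2010 knocks out one router, then five more, then all the rest. Adding ten extra routers (the kind of fix Google describes) brings each router's share back under its capacity, and nothing fails.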