Tuesday, April 30, 2013

Why the Rest of Us Need Virtualization Even If Facebook Doesn’t


Friday, April 26, 2013 9:00 AM | Puppet Labs | Scott Johnston

A fascinating recent Ars Technica article explains why Facebook creates its own hardware and how it avoids virtualization on its servers. Facebook just unveiled its first data center that contains only its own custom hardware, designed per the Facebook-founded Open Compute Project. Facebook answers the "What is virtualization?" question by saying, "Something we here at Facebook don't need."

Facebook says that it doesn't use virtualization because it doesn't suffer from underutilized servers the way many companies do. ("Facebook doesn't virtualize its servers, because its software already consumes all the hardware resources, meaning virtualization would result in a performance penalty without a gain in efficiency.") In other words, if you have a full load on each of thousands of servers all the time, why bother virtualizing? Individual machines are each just one worker bee among thousands all serving up the same application. There are other facets to this as well: Facebook doesn't have the same needs around information partitioning that IT shops do.
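The utilization argument can be made concrete with a little arithmetic. The sketch below is a rough illustration, not anything from Facebook or the Ars Technica article: the function name `hosts_needed`, the 15% utilization figure, the 80% target, and the 5% hypervisor overhead are all hypothetical numbers chosen only to show the shape of the trade-off.

```python
import math


def hosts_needed(num_servers, avg_utilization, target_utilization=0.8,
                 hypervisor_overhead=0.05):
    """Estimate physical hosts required if every workload ran in a VM.

    All parameters are illustrative assumptions:
      avg_utilization      fraction of each existing server that is busy
      target_utilization   how full we are willing to pack a virtualized host
      hypervisor_overhead  extra load added by the virtualization layer
    """
    total_load = num_servers * avg_utilization * (1 + hypervisor_overhead)
    return max(1, math.ceil(total_load / target_utilization))


# A hypothetical IT shop: 100 servers, each mostly idle.
typical_shop = hosts_needed(100, avg_utilization=0.15)

# A hypothetical Facebook-style fleet: 100 servers, each already fully loaded.
saturated_fleet = hosts_needed(100, avg_utilization=1.0, target_utilization=1.0)

print("Typical shop needs", typical_shop, "hosts after consolidation")
print("Saturated fleet needs", saturated_fleet, "hosts after consolidation")
```

Under these assumed numbers the mostly idle shop shrinks to a fraction of its original server count, while the saturated fleet ends up needing at least as many machines as before, because the overhead buys nothing when there is no idle capacity to reclaim.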

I can understand why Facebook went to custom-designed hardware instead of buying off-the-shelf servers from OEMs. The biggest reason is efficiency at an enormous scale, where small design decisions are multiplied many thousand-fold. The author notes that even touches like a plastic bezel with a logo have an enormous impact on cooling costs, since the bezel impedes airflow substantially. Small design issues add up at the gargantuan scale at which Facebook buys, builds, and provisions its data centers. The article also points out that many design features added by the OEM for "management" wind up as unnecessary cruft, so the Open Compute specification gets rid of them.

Facebook's architecture includes several different types of machines, each tuned to a different part of its system load. Front-end servers used to serve web pages are quick but have limited memory and disk, and a single power supply serves multiple servers. Database servers delivering active data queries no longer have any spinning disks, relying on flash memory for all of their storage. All of those photos that you take once and store on Facebook, never to be seen again, have their own efficient but relatively slow "cold storage."

Very few organizations in the world operate global data infrastructures with heavy system loads and hardware custom-designed for each system's particular configuration. (I suspect we could count them on two hands.) For the rest of us, virtualization is a cornerstone of managing the modern data center.
