A few days ago, Facebook open-sourced its artificial intelligence (AI) computing hardware design. Most people don't realise that large companies such as Facebook, Google, and Amazon don't buy hardware from the usual large computer suppliers like Dell, HP, and IBM; instead, they design their own hardware based on commodity components. The Facebook website and all its myriad apps and subsystems run on a cloud infrastructure built from tens of thousands of computers designed from scratch by Facebook's own hardware engineers.
Open-sourcing Facebook's AI hardware means that deep learning has graduated from the Facebook Artificial Intelligence Research (FAIR) lab into Facebook's mainstream production systems, intended to run apps created by its product development teams. For Facebook's software developers to build deep-learning systems for users, the company had to design, competitively procure, and deploy a standard hardware module optimised for fast deep-learning execution that fits into and scales with Facebook's data centres. The module, called Big Sur, looks like any rack-mounted commodity hardware unit found in a large cloud data centre.
But Big Sur differs from the other data centre hardware modules that serve Facebook's browser and smartphone newsfeed in one significant way: it is built around the Nvidia Tesla M40 GPU. Up to eight Tesla M40 cards can be squeezed into a single Big Sur chassis, and each card has 3,072 CUDA cores and 12GB of memory.
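To make those numbers concrete, here is a minimal sketch using the standard CUDA runtime API to enumerate the GPUs in a node like Big Sur and report their specs. The figure of 128 cores per streaming multiprocessor (SM) is specific to Maxwell-class parts such as the Tesla M40 (24 SMs × 128 = 3,072 cores); other GPU generations have different ratios, so treat that constant as an assumption, not something the API reports.

```cpp
// Sketch: enumerate CUDA devices and print SM count, approximate core
// count, and memory. Host-only code; compile with: nvcc -o gpuquery gpuquery.cpp
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int deviceCount = 0;
    if (cudaGetDeviceCount(&deviceCount) != cudaSuccess || deviceCount == 0) {
        std::fprintf(stderr, "No CUDA-capable devices found\n");
        return 1;
    }
    for (int i = 0; i < deviceCount; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        // Assumption: Maxwell-class GPU (e.g. Tesla M40), which has
        // 128 CUDA cores per SM. The runtime API does not expose a
        // core count directly, only the number of SMs.
        const int coresPerSM = 128;
        std::printf("GPU %d: %s, %d SMs (~%d cores), %.0f GB memory\n",
                    i, prop.name, prop.multiProcessorCount,
                    prop.multiProcessorCount * coresPerSM,
                    prop.totalGlobalMem / 1073741824.0);
    }
    return 0;
}
```

Run on a fully populated Big Sur chassis, a query like this would be expected to list eight Tesla M40 entries, each with 24 SMs and 12GB of memory.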