Meta says its new AI Research SuperCluster (RSC) is already among the fastest machines of its kind and, when construction is complete in mid-2022, will be the fastest in the world.
“Meta has developed what we believe is the world’s fastest AI supercomputer,” Meta CEO Mark Zuckerberg said in a statement. “We call it RSC, for AI Research SuperCluster, and it will be completed later this year.”
The news underlines just how central AI research is to a company like Meta. Rivals like Microsoft and Nvidia have already announced their own “AI supercomputers,” which differ somewhat from what we usually think of as supercomputers.
RSC will be used to train a range of systems across Meta’s business: from the content moderation algorithms that detect hate speech on Facebook and Instagram to the augmented reality (AR) features that will one day be available on the company’s future AR hardware.
Meta also says RSC will be used to help design experiences for the metaverse – the company’s persistent branding for an interconnected series of virtual spaces, from offices to online arenas.
“RSC will help Meta’s AI researchers build new and better AI models that can learn from trillions of examples; work across hundreds of different languages; analyze text, images, and video together; develop new augmented reality tools; and more,” wrote Meta engineers Kevin Lee and Shubho Sengupta in a blog post announcing the news.
“We hope RSC will help us build entirely new AI systems that can, for example, power real-time voice translations for large groups of people, each speaking a different language, so they can seamlessly collaborate on a research project or play an AR game together,” they added.
Work on RSC began a year and a half ago, with Meta’s engineers designing the machine’s various systems – cooling, power, networking, and cabling – entirely from scratch. Phase one of RSC is already up and running and consists of 760 Nvidia DGX A100 systems containing 6,080 connected GPUs, a type of processor that is particularly good at tackling machine learning problems.
Meta claims this first phase already delivers up to a 20-fold increase in performance on its standard machine vision research tasks.
Phase two of RSC is due to be completed before the end of 2022. At that point, it will contain some 16,000 GPUs in total and will be able to train AI systems “with over a trillion parameters on an exabyte of data.” Raw GPU count is only a narrow proxy for overall system performance, but, for comparison, the AI supercomputer Microsoft built with research lab OpenAI comprises 10,000 GPUs.
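To get a rough sense of those figures, a quick back-of-envelope calculation helps. The numbers below are illustrative only (they assume 32-bit weights and are not Meta's own figures):

```python
# Illustrative back-of-envelope arithmetic for the scale Meta describes.
params = 10**12               # one trillion model parameters
bytes_per_param = 4           # assuming 32-bit floating-point weights
model_size_bytes = params * bytes_per_param

# A trillion 32-bit parameters take about 4 TB just to store the weights.
model_size_tb = model_size_bytes / 10**12
print(model_size_tb)          # 4.0

# An exabyte (10**18 bytes) of training data dwarfs even that model.
exabyte = 10**18
ratio = exabyte / model_size_bytes
print(ratio)                  # 250000.0
```

In other words, even a trillion-parameter model is hundreds of thousands of times smaller than the data set it would be trained on.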
These numbers are all very impressive, but they raise the question: what is an AI supercomputer, anyway? And how does it compare with what we usually think of as supercomputers – the vast machines deployed by universities and governments to crunch numbers in complex domains like space, nuclear physics, and climate change?
The two types of systems, both known as high-performance computers (HPCs), are certainly more alike than they are different. Both are closer in size and appearance to data centers than to individual computers, and both rely on large numbers of interconnected processors exchanging data at blistering speeds.
But there are key differences between the two, as HPC analyst Bob Sorensen of Hyperion Research explained to The Verge. “AI-based HPCs live in a somewhat different world than their traditional HPC counterparts,” says Sorensen, and the big difference comes down to numerical precision.
The short explanation is that machine learning can tolerate lower numerical precision than the tasks traditional supercomputers are built for. As a result, an “AI supercomputer” (a somewhat recent bit of branding) can carry out more calculations per second than its conventional sibling using the very same hardware.
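A tiny NumPy sketch (unrelated to Meta's actual software stack) illustrates the trade-off: storing values at lower precision quarters the memory per value, which lets more of them move through the hardware per second, at the cost of some accuracy:

```python
import numpy as np

# Machine learning workloads often run in reduced precision. The same
# values stored as float16 use a quarter of the memory of float64.
n = 1_000_000
weights64 = np.random.rand(n)             # float64: 8 bytes per value
weights16 = weights64.astype(np.float16)  # float16: 2 bytes per value

print(weights64.nbytes)  # 8000000
print(weights16.nbytes)  # 2000000

# The price is precision: float16 keeps only ~3 decimal digits.
print(float(np.float16(1 / 3)))  # 0.333251953125, not 0.3333333333333333
```

For training neural networks, that loss of precision is usually acceptable; for simulating a nuclear reaction or a climate model, it generally is not.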