In 2012, researchers from the Parallel Systems Architecture Laboratory (PARSA) and the Data Intensive Applications & Systems Lab (DIAS) published a hugely ambitious paper entitled Clearing the clouds: a study of emerging scale-out workloads on modern hardware. A decade on, it has received the Influential Paper Award at this year’s International Conference on Architectural Support for Programming Languages and Operating Systems.
Clearing the clouds had three main goals: to appraise the existing software and hardware used in large data centers; to demonstrate an open-source suite of benchmarks that effectively measures the performance and efficiency of data center systems; and to make recommendations for future data center solutions.
The paper marked the release of CloudSuite, the first widely successful open-source benchmark suite for data center workloads. In the years that followed, Intel researchers published papers on CloudSuite, and Google provided funding to integrate it into its PerfKit Benchmarker toolkit. According to Google Scholar, the paper has been cited 1200 times, including citations by Google’s own researchers.
A decade after the paper was published, Prof. Babak Falsafi of the Parallel Systems Architecture Laboratory has announced the release of CloudSuite 4.0. So how has the landscape changed since version 1.0?
“150-billion-dollar markets move slowly,” says Prof. Falsafi. “Intel makes processors for desktops and when we wrote that paper those same desktop processors were being used in data centers. That was true then, and it is still true today!”
Pointing at an iPhone, and at a MacBook with the recent Apple silicon technology, the professor explains: “Apple has developed native processors for its smartphones and laptops, with performance and efficiency improvements an order of magnitude better than the previous commodity versions. The company adapted the hardware for the specific needs of the device, and developed its own operating system, built for purpose. This is what needs to happen in data centers.”
When Moore’s Law still applied, that is to say when the number of transistors on a chip doubled roughly every two years, it was arguably feasible to keep using desktop processors to meet the constantly expanding demands of data center technology. But, as has been widely reported, Moore’s Law is no longer applicable.
“We need processors designed for the specific job of the data center server,” explains Falsafi, “with each component built for purpose to maximize silicon efficiency. Specifically, when accelerators or the network need access to data stored in memory on a server, the processor and the operating system need to get out of the way. The processor and the operating system should exist on one level – a control plane managing computation and memory resources, but accelerators and the network interface should work on another level – a data plane, unobstructed and unimpeded with protected access to memory.”
Manufacturers have developed products that meet some of these demands for workload optimization, such as Cavium’s 2014 ThunderX ARM microprocessor, which followed the paper’s findings, and, more recently, Amazon’s AWS Graviton. However, the vast majority of data centers still use standard desktop processors.
CloudSuite presents a unique solution to the challenges faced by data centers, explains Prof. Anastasia Ailamaki, co-author of the award-winning paper. “Distributed data processing is omnipresent in the compute stack of every conceivable application and its scalability is key to guaranteeing efficiency when running database workloads at scale. Cloud and SaaS (Software as a Service) providers fight an everyday battle with data management bottlenecks, which multiply in quantity and in nature as a result of the increased application heterogeneity.”
“Through an analysis of the microarchitectural characteristics of data management and analytic workloads on modern hardware, CloudSuite provides insights into these bottlenecks and limitations, and fuels the development of more efficient distributed database systems and cloud computing platforms.”
As Prof. Falsafi explains, CloudSuite is being used not only to appraise current performance, but also to design better solutions for the future. “CloudSuite incorporates support for the most recent libraries and software. The next release will provide support for the RISC-V architecture, which is not even being used in servers yet.”
With this kind of future planning, perhaps data centers around the globe will, one day, catch up with the available technology.
Clearing the clouds: a study of emerging scale-out workloads on modern hardware
Michael Ferdman, Almutaz Adileh, Onur Kocberber, Stavros Volos, Mohammad Alisafaee, Djordje Jevdjic, Cansu Kaynak, Adrian Daniel Popescu, Anastasia Ailamaki, Babak Falsafi
Proceedings of the Seventeenth International Conference on Architectural Support for Programming Languages and Operating Systems, London, UK, March 3-7, 2012
– Winner of the ASPLOS 2023 Influential Paper Award.
Where are the authors now?
Michael Ferdman – Associate Professor at Stony Brook University, New York
Almutaz Adileh – CPU Architect at Toga Networks (Huawei)
Onur Kocberber – Consulting Member at Oracle
Stavros Volos – Principal Researcher at Microsoft Research, UK (Confidential Computing)
Mohammad Alisafaee – System Specialist at the Swiss Data Science Center at EPFL (SDSC)
Djordje Jevdjic – Assistant Professor at the National University of Singapore
Cansu Kaynak – Consulting Member at Oracle
Adrian Daniel Popescu – Principal Member of Technical Staff at Oracle
Anastasia Ailamaki – Full Professor at EPFL (DIAS)
Babak Falsafi – Full Professor at EPFL (PARSA)