Research Partners

Huawei Cloud Business Unit (Huawei CloudBU)
Heating Bits
EPFL EcoCloud Center (EcoCloud)

Sources of Funding

RECIPE H2020
MANGO H2020
Compusapien
DeepHealth H2020
EPFL / Huawei Cloud


This research line focuses on multi-objective resource management of heterogeneous High Performance Computing (HPC) servers and datacenters through machine learning-based approaches.

Our research leverages system-level resource management techniques, such as Dynamic Voltage and Frequency Scaling (DVFS), task scheduling and allocation, and thread migration, to simultaneously satisfy multiple design-time and run-time objectives and constraints, including power and energy consumption, temperature, performance, and Quality of Service (QoS).

Darong Huang and David Atienza

News

  Predicting the future with CloudProphet



Related Publications

Is the powersave governor really saving power?
Huang, Darong; Costero Valero, Luis Maria; Atienza Alonso, David
2024-02-12. Conference Paper. Publication funded by EPFL / Huawei Cloud (Intelligent Cloud Technologies Initiative).
CloudProphet: A Machine Learning-Based Performance Prediction for Public Clouds
Huang, Darong; Costero Valero, Luis Maria; Pahlevan, Ali; Zapater Sancho, Marina; Atienza Alonso, David
2024-01-23. IEEE Transactions on Sustainable Computing. Publication funded by RECIPE H2020 (REliable power and time-ConstraInts-aware Predictive management of heterogeneous Exascale systems) and EPFL / Huawei Cloud (Intelligent Cloud Technologies Initiative).
Reinforcement Learning-Based Joint Reliability and Performance Optimization for Hybrid-Cache Computing Servers
Huang, Darong; Pahlevan, Ali; Costero, Luis; Zapater Sancho, Marina; Atienza Alonso, David
2022-03-07. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. Publication funded by RECIPE H2020 (REliable power and time-ConstraInts-aware Predictive management of heterogeneous Exascale systems), DeepHealth H2020 (Deep-Learning and HPC to Boost Biomedical Applications for Health), and Compusapien (Next-gen computing systems inspired by the human brain).
Resource Management for Power-Constrained HEVC Transcoding Using Reinforcement Learning
Costero, Luis; Iranfar, Arman; Zapater Sancho, Marina; D. Igual, Francisco; Olcoz, Katzalin; Atienza Alonso, David
2020. IEEE Transactions on Parallel and Distributed Systems. Publication funded by Compusapien (Next-gen computing systems inspired by the human brain) and DeepHealth H2020 (Deep-Learning and HPC to Boost Biomedical Applications for Health).
A Machine Learning-Based Framework for Throughput Estimation of Time-Varying Applications in Multi-Core Servers
Iranfar, Arman; Silva De Souza, Wellington; Zapater Sancho, Marina; Olcoz, Katzalin; Xavier de Souza, Samuel; Atienza Alonso, David
2019. Conference Paper. Publication funded by RECIPE H2020 (REliable power and time-ConstraInts-aware Predictive management of heterogeneous Exascale systems) and Compusapien (Next-gen computing systems inspired by the human brain).
A Machine Learning-Based Strategy for Efficient Resource Management of Video Encoding on Heterogeneous MPSoCs
Iranfar, Arman; Simon, William Andrew; Zapater Sancho, Marina; Atienza Alonso, David
2018. Conference Paper. Publication funded by MANGO H2020 (Exploring Manycore Architectures for Next-GeneratiOn HPC systems) and Compusapien (Next-gen computing systems inspired by the human brain).
Machine Learning-Based Quality-Aware Power and Thermal Management of Multistream HEVC Encoding on Multicore Servers
Iranfar, Arman; Zapater Sancho, Marina; Atienza Alonso, David
2018. IEEE Transactions on Parallel and Distributed Systems (TPDS). Publication funded by MANGO H2020 (Exploring Manycore Architectures for Next-GeneratiOn HPC systems) and Compusapien (Next-gen computing systems inspired by the human brain).