Resource Management for Power-Constrained HEVC Transcoding Using Reinforcement Learning

The advent of massively parallel and heterogeneous architectures has made it necessary to co-locate applications in order to exploit the performance the hardware can offer, usually under tight system-level or application-level constraints. Such systems require holistic and autonomous resource management schemes, and Artificial Intelligence (AI) techniques can be of great help here. Developing agents that learn and improve their behavior autonomously has traditionally been one of the most important goals of AI. Because they are responsive and adapt to their environment, AI systems can sense, interact with, and react to environmental changes without human intervention.

In our line of research, we integrate Reinforcement Learning (RL), a field of AI that learns by interaction, into a self-adaptive resource manager. The goal is the automatic and dynamic adaptation of application- and system-wide knobs in multi-user video transcoding scenarios on modern multi-core servers. Our study shows that RL is an effective and efficient technique for automatically extracting and applying policies that simultaneously satisfy performance, quality, and power constraints when managing resources for multiple application instances. In our proposal, each agent independently explores a particular subspace, so that sufficient knowledge about the environment is gathered faster. Each agent then exploits its own knowledge together with that of the other agents, in a cooperative manner, to behave optimally in the environment.
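The multi-agent idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: the knob names and values (`qp`, `threads`, `freq_ghz`), the throughput target and power cap, the analytical `toy_environment` model, and the reward shape. Cooperation is simplified here to independent tabular Q-learning agents, one per knob, that observe the same state and share one global reward; the cooperation scheme in the actual work may differ.

```python
import random
from collections import defaultdict

# Hypothetical knobs and settings (illustrative only, not from the study).
KNOBS = {
    "qp": [22, 27, 32, 37],        # encoder quantization parameter
    "threads": [1, 2, 4, 8],       # threads assigned to the encoder
    "freq_ghz": [1.2, 1.8, 2.4],   # core frequency
}
TARGET_FPS = 24.0    # assumed throughput requirement
POWER_CAP_W = 60.0   # assumed power budget


def toy_environment(config):
    """Made-up analytical model returning (fps, quality, power)."""
    speedup = 1.0 + (config["qp"] - 22) / 30.0
    fps = 3.0 * config["threads"] ** 0.8 * config["freq_ghz"] * speedup
    quality = 45.0 - 0.4 * (config["qp"] - 22)  # PSNR-like score
    power = 10.0 + 2.5 * config["threads"] * config["freq_ghz"] ** 2
    return fps, quality, power


def reward(fps, quality, power):
    """Reward quality; penalize a missed fps target or an exceeded power cap."""
    r = quality
    if fps < TARGET_FPS:
        r -= 10.0 * (TARGET_FPS - fps)
    if power > POWER_CAP_W:
        r -= 10.0 * (power - POWER_CAP_W)
    return r


class KnobAgent:
    """Tabular Q-learning agent that controls a single knob."""

    def __init__(self, actions, alpha=0.2, gamma=0.9, epsilon=0.2):
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(lambda: {a: 0.0 for a in self.actions})

    def act(self, state, rng, explore=True):
        # Epsilon-greedy: explore this agent's own subspace, else exploit.
        if explore and rng.random() < self.epsilon:
            return rng.choice(self.actions)
        row = self.q[state]
        return max(self.actions, key=row.get)

    def learn(self, state, action, r, next_state):
        # Standard Q-learning temporal-difference update.
        best_next = max(self.q[next_state].values())
        td_target = r + self.gamma * best_next
        self.q[state][action] += self.alpha * (td_target - self.q[state][action])


def train(steps=5000, seed=0):
    rng = random.Random(seed)
    agents = {name: KnobAgent(actions) for name, actions in KNOBS.items()}
    state = (True, True)  # (fps target met, power cap respected)
    for _ in range(steps):
        # Each agent picks its own knob value; together they form the joint action.
        config = {name: agent.act(state, rng) for name, agent in agents.items()}
        fps, quality, power = toy_environment(config)
        r = reward(fps, quality, power)
        next_state = (fps >= TARGET_FPS, power <= POWER_CAP_W)
        for name, agent in agents.items():
            agent.learn(state, config[name], r, next_state)  # shared reward
        state = next_state
    return agents, state


def greedy_config(agents, state):
    """Joint configuration obtained when every agent exploits its Q-table."""
    rng = random.Random(0)
    return {n: a.act(state, rng, explore=False) for n, a in agents.items()}


if __name__ == "__main__":
    agents, state = train()
    print(greedy_config(agents, state))
```

Splitting the joint knob space across agents keeps each Q-table small: the number of entries grows with the sum of the knob sizes rather than their product. This is one plausible reason a multi-agent design can learn faster than a single agent over the full joint space, although the toy model above is not meant to reproduce the reported results.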

Our use case is multi-user video transcoding with a highly tuned HEVC encoder modified to expose dynamic application-level knobs. The approach improves adaptability and quality (up to 4x better QoS than a static scheme) and learning time (6x faster than an equivalent mono-agent implementation). Our power-capping techniques also outperform hardware-based power capping with respect to quality.

The holistic management of dynamic application- and system-level knobs can be extended with further parameters or output metrics, and to other applications, both in the multimedia area and in other fields. In addition, the architecture-related techniques used to handle system knobs can be applied, in isolation or in combination, to other present and future architectures.

Suggested Reading

https://infoscience.epfl.ch/record/278279/files/2019_MAL_Luis_Journal%281%29.pdf

