While a core focus of your research has been embedded systems, you have branched out into a number of areas. Is there a common thread that runs through the varied projects you work on?
The common thread of my research, from low-power embedded devices to high-end computing servers, has always been the concept of co-design. This means that I investigate how to get the best out of architectures, methodologies, and techniques that push optimization beyond the traditional hardware and software boundaries. Indeed, my projects always explore how a system can be improved by iteratively optimizing its hardware and software aspects in a closed loop.
The paper you co-authored, 3D-ICE: Fast Compact Transient Thermal Modeling for 3D ICs With Inter-Tier Liquid Cooling, was recognized with a 10-Year Retrospective Most Influential Paper Award. Why is thermal management such an important aspect of computer system design? What has been an especially significant innovation in this area over the past 10 years?
Digital circuits and systems are increasingly power-constrained, and thermal constraints are tightening as Moore's law slows down. In fact, the latest computing systems (both high-performance and battery-operated) require power and thermal management to operate correctly. Power brings heat, and heat in turn increases power consumption. Because of this, accurate thermal modeling, with configurable granularity and simulation-time overhead, is a must for evaluating different cooling techniques and for system co-design.
Three-dimensional integrated circuits stack chips vertically one on top of another, a technology that brings great performance but also new problems, in particular how to keep them cool. In our 3D-ICE paper in 2010 we proposed a theoretically grounded way to perform transient thermal modeling of 2D and 3D computing systems manufactured with nanoscale technologies, allowing scientists to explore different power and thermal management techniques. Our open-source tool is greatly appreciated by the computer architecture and engineering community, with more than 2,000 registered users today (see 3D-ICE v3.1).
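3D-ICE itself is a C tool, but the core idea of a compact transient thermal model can be sketched in a few lines: each cell of the chip is treated as a node of a thermal RC network, and its temperature is integrated over time. The single-node model, parameter values, and time step below are illustrative assumptions, not values taken from 3D-ICE.

```python
# Minimal sketch of a compact transient thermal model: one RC thermal node
# obeying C * dT/dt = P - (T - T_amb) / R, integrated with forward Euler.
# All parameter values here are illustrative, not from 3D-ICE.

def simulate(power_trace, r_th=2.0, c_th=5.0, t_amb=25.0, dt=0.01):
    """Return the temperature trace for a sequence of power samples (W)."""
    temps = []
    t = t_amb
    for p in power_trace:
        # Heat in (p) minus heat out through the thermal resistance,
        # scaled by the thermal capacitance, gives the temperature slope.
        t += dt * (p - (t - t_amb) / r_th) / c_th
        temps.append(t)
    return temps

# A 10 W power step applied for 10 s: the node heats toward the
# steady state T_amb + P * R_th = 25 + 10 * 2 = 45 degrees C.
temps = simulate([10.0] * 1000)
```

A real compact model couples thousands of such nodes (including the liquid-cooling microchannels) into one large RC network, but the per-node update is this same energy balance.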
Since we published that paper, a very significant innovation has been the use of machine learning (e.g., multi-armed bandit or multi-agent techniques) to develop thermal management schemes. Researchers can apply these techniques to the thermal modeling data collected with 3D-ICE to find the best thermal operating point for each type of workload, in the context of a specific computer design or a particular manufacturing technology.
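To illustrate the multi-armed bandit idea in this setting: each candidate operating point (say, a frequency level) is an "arm", and the reward trades throughput against a thermal penalty. The arms, reward values, and epsilon-greedy policy below are illustrative assumptions for a sketch, not the scheme from any specific paper.

```python
import random

# Hedged sketch: epsilon-greedy bandit choosing among three hypothetical
# frequency levels. Higher frequency means more throughput but more heat;
# the numbers are made up for illustration.

def reward(arm):
    throughput = [1.0, 1.6, 2.0][arm]
    thermal_penalty = [0.1, 0.4, 1.5][arm]
    return throughput - thermal_penalty + random.gauss(0.0, 0.05)

def epsilon_greedy(n_steps=5000, epsilon=0.1, n_arms=3):
    counts = [0] * n_arms
    values = [0.0] * n_arms  # running mean reward per arm
    for _ in range(n_steps):
        if random.random() < epsilon:
            arm = random.randrange(n_arms)  # explore
        else:
            arm = max(range(n_arms), key=values.__getitem__)  # exploit
        counts[arm] += 1
        # Incremental update of the running mean for this arm.
        values[arm] += (reward(arm) - values[arm]) / counts[arm]
    return counts, values

random.seed(0)
counts, values = epsilon_greedy()
# The middle arm (best throughput/heat trade-off) should be pulled most.
```

The appeal of bandits here is that they need no prior thermal model: the manager learns the best operating point online, per workload, from observed temperatures.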
Computing technologies, including AI, are energy intensive. What innovations do you see on the horizon, to make computing more sustainable?
We need to understand that sustainability is not only about the energy consumed when executing a certain AI algorithm, but also about manufacturing the computing hardware itself (e.g., GPUs, memory, etc.). Therefore, we need to develop new system integration schemes that reduce manufacturing costs and exploit the latest heterogeneous 2.5D and 3D IC packaging solutions.
Secondly, considering the execution of artificial intelligence or other workloads, new computing innovations are required to minimize the communication energy and performance overhead by combining storage (memory) and computation (logic), broadly known as the computing-in-memory concept.
Thirdly, general-purpose computing systems carry a large energy and cooling overhead relative to the effective computing efficiency of specialized hardware. Therefore, we need to develop Electronic Design Automation (EDA) methodologies and more effective high-level synthesis tools to generate accelerator-based architectures from high-level AI/Machine Learning/Deep Learning descriptions. These newly synthesized architectures can selectively activate or deactivate the required computing blocks, according to what the target AI system needs at any moment in time. This holistic approach, with strong synergies across all the abstraction layers of the computing design process, will minimize energy for operation and cooling at the same time, to achieve truly sustainable computing systems.
As part of the ACM Distinguished Speakers series, you give a talk entitled Biologically-Inspired IoT Systems for Federated Learning-Based Healthcare. Will you tell us a little about how this new design approach can improve the next generation of Edge AI computing systems?
Some of the points I raised above were implemented long ago in biological systems, as a result of many millions of years of evolution. This talk is about creating a new generation of Edge AI computing systems, that is, the deployment of sets of interconnected, artificially intelligent sensor devices in the physical world. The idea is to combine the most groundbreaking of these concepts to co-design the next generation of Edge AI computing systems, taking inspiration from how biological computing systems operate.
The critical element of this new design approach is the combination of multiple types of ultra-low-power (but imprecise) specialized computing platforms that execute multiple types of ensembles of neural networks. This improves the robustness of the final outputs at the system level, while minimizing memory and computation resources.
Moreover, these specialized computing platforms can help each other with any computing task by exploiting federated learning: they share their network model information while preserving the privacy of the computation performed in each individual unit, without sending raw data between Edge AI devices. This is very similar to how the human nervous system detects an event from signals sent by different parts of the body. In both cases, it is much more efficient than transferring all the data to a single device for computation.
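The privacy-preserving sharing described above can be sketched with the simplest possible federated setup: each device fits a tiny local model on its own private data and sends only the model parameters to an aggregator, which averages them. The linear model, the data, and the function names are illustrative assumptions, not a real Edge AI deployment.

```python
# Hedged sketch of federated averaging: devices share parameters, never data.
# Model and data are illustrative (a one-parameter line through the origin).

def local_fit(xs, ys):
    """Least-squares slope through the origin: w = sum(x*y) / sum(x*x).
    Runs on-device; the raw (xs, ys) samples never leave the device."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def federated_average(device_weights):
    """Aggregator sees only the per-device parameters, not the samples."""
    return sum(device_weights) / len(device_weights)

# Three devices observe the same underlying trend y = 2x with local noise.
device_data = [
    ([1, 2, 3], [2.1, 3.9, 6.2]),
    ([1, 2, 3], [1.8, 4.1, 5.9]),
    ([2, 4],    [4.2, 7.8]),
]
local_weights = [local_fit(xs, ys) for xs, ys in device_data]
global_weight = federated_average(local_weights)  # close to the true slope 2.0
```

Real federated learning averages neural-network weight tensors over many training rounds, but the division of labor is the same: local fitting on private data, global aggregation of parameters only.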