A paradigm shift in virtual memory use: Midgard

Researchers at Ecocloud, the EPFL Center for Sustainable Cloud Computing, have pioneered an innovative approach to implementing virtual memory in data centers, which will greatly increase server efficiency.

Virtual memory has always been a pillar of memory isolation, protection and security in digital platforms. Its use is non-negotiable, even in widely used hardware accelerators such as GPUs, NICs, FPGAs and secure CPU architectures. It is therefore vital that the silicon supporting it be used as frugally as possible.

As services host more data in server memory for faster access, the traditional virtual memory technologies that look up data in server memory and check for protection have emerged as a bottleneck. Modern graph analytics workloads (e.g., on social media) spend over 20% of their time in virtual memory translation and protection checks. Server virtualization for cloud computing, to help increase utilization of infrastructure and return on investment in data centers, dramatically exacerbates this problem by requiring lookups and protection checks across multiple layers of guest (customer) and host (cloud provider) software.

The way in which virtual memory is assigned in these servers is critical because, with such huge quantities of data involved, changes in strategy can have a massive effect on server efficiency and data security.

“Virtual memory technology has been around since the 1960’s, and lays the foundation for memory protection and security in modern digital platforms,” write the authors of “Rebooting Virtual Memory with Midgard”, a paper they will present next month at ISCA’21, the flagship conference in computer architecture.

Memory has become the most precious silicon resource in data centers in recent years, as more services are brought online. Virtual memory traditionally divides physical storage into fixed-size units, called pages, for optimal capacity management. This division slows down lookups and protection checks as memory capacity increases, because large regions of memory in application software (e.g., GBs) are fragmented into millions of pages (e.g., KBs each). Modern chips (e.g., the recently announced Apple M1) employ thousands of table entries per processor to perform lookups and protection checks for each memory access.

Namespaces store unique references for data in structured hierarchies. Removing some of this hierarchy and reducing the number of translations represents a net gain in efficiency. The authors propose Midgard, which introduces a namespace for data lookup and memory protection checks in the memory system without any modification to application software or the programming interface of modern platforms (e.g., Linux, Android, macOS/iOS).

With Midgard, data lookups and protection checks are done directly in the Midgard namespace in on-chip memory; a translation to fixed-size pages is needed only for accesses that reach physical memory. Unlike traditional virtual memory, whose overhead grows with memory capacity, Midgard future-proofs virtual memory: as on-chip memory capacity grows in future products and filters more traffic to physical memory, the overhead of translation and protection checks shrinks.
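The scale of the problem can be sketched with a back-of-the-envelope comparison (a toy model, not the actual Midgard design): a page-based design needs one translation entry per fixed-size page, while a range-based namespace in the spirit of Midgard needs roughly one entry per contiguous application region.

```python
# Toy comparison of translation state (illustrative only; not the
# actual Midgard design): per-page entries vs. contiguous ranges.

PAGE = 4 * 1024                        # 4 KB pages

def page_table_entries(region_bytes):
    """Entries a page-granularity design needs to cover one region."""
    return region_bytes // PAGE

def range_entries(regions):
    """A range-based (VMA-like) namespace needs one entry per region."""
    return len(regions)

# One application with three large regions (hypothetical sizes:
# a heap, a mapped file, and a stack).
regions = [8 * 2**30, 2 * 2**30, 64 * 2**20]   # 8 GB, 2 GB, 64 MB

pages = sum(page_table_entries(r) for r in regions)
ranges = range_entries(regions)

print(f"page-granularity entries:  {pages:,}")   # millions of entries
print(f"range-granularity entries: {ranges}")    # a handful
```

The same GBs of application memory that fragment into millions of page-table entries collapse into a handful of range entries, which is what makes lookups and protection checks cheap in the on-chip namespace.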

Analytic and empirical results described in the paper show remarkable performance from Midgard compared with traditional technology, and even with rival new technologies (e.g., the larger fixed-size pages used in certain applications). At low loads the Midgard system was 5% behind standard performance, but with a 256 MB aggregate cache it matches and even outperforms traditional systems in terms of virtual memory overheads.

Figure 1: The Average Memory Access Time for address translations on low-memory, high-memory and Midgard systems.

The authors conclude: “This paper is the first of several steps needed to demonstrate a fully working system with Midgard. We focused on a proof-of-concept software-modelled prototype of key architectural components. Future work will address the wide spectrum of topics needed to realize Midgard in real systems.”

Rebooting Virtual Memory with Midgard
S. Gupta, A. Bhattacharyya, Y. Oh, A. Bhattacharjee, B. Falsafi and M. Payer
ISCA 2021 48th International Symposium on Computer Architecture, Online conference, June 14-19, 2021.

Midgard etymology: a middle realm between heaven (Asgard) and hell (Helheim)


Google Scholarship for EcoCloud researcher Simla Burcu Harma

Simla Burcu Harma has received the Generation Google Scholarship for Women in Computer Science, one of only two students in Switzerland to receive this award. The scholarship is awarded every year to a selection of 20 PhD students across Europe, Africa and the Middle East who have demonstrated a passion for technology and academic excellence, and have proven themselves exceptional leaders and role models.

Simla comes from Turkey and is pursuing a PhD at the Parallel Systems Architecture Lab (PARSA) under the supervision of Prof. Babak Falsafi. Her research interests lie in the area of systems for machine learning. She is working on the EcoCloud project ColTraIn, which aims to restore datacenter homogeneity and co-locate training and inference without compromising inference efficiency or quality of service (QoS) guarantees.

Her work has included contributions to the ColTraIn HBFPEmulator.


ColTraIn Releases Open-source HBFP Training Emulator

DNN training and inference share the same basic operators but have fundamentally different requirements. The former is throughput-bound and relies on high-precision floating-point arithmetic for convergence, while the latter is latency-bound and tolerant of low-precision arithmetic. Both workloads require high computational capability and benefit from hardware accelerators. The disparity in resource requirements forces datacenter operators to choose between separate custom accelerators for training and inference, or using training accelerators for inference.

However, neither option is optimal. While the former results in datacenter heterogeneity and higher management costs, the latter results in inefficient inference. Moreover, dedicated inference accelerators face load fluctuations, leading to overprovisioning and low average utilization.

The objective of EPFL’s ColTraIn (Co-located DNN Training and Inference) team, a collaboration between PARSA and MLO, is to restore datacenter homogeneity and co-locate training and inference without compromising inference efficiency or quality-of-service (QoS) guarantees. ColTraIn aims to overcome two key challenges: (1) the difference in arithmetic representation between the two workloads, and (2) the scheduling of training tasks on inference-bound accelerators. The recent release of HBFP (Hybrid Block Floating Point) meets the first challenge.

HBFP trains DNNs with dense, fixed-point-like arithmetic for most operations without sacrificing accuracy, thus facilitating effective co-location. More specifically, HBFP offers the accuracy of 32-bit floating-point with the numeric and silicon density of 8-bit fixed-point for many models (ResNet, WideResNet, DenseNet, AlexNet, LSTM, and BERT).
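The idea behind block floating point can be sketched in NumPy (an illustrative toy quantizer, not the ColTraIn implementation): values are grouped into small blocks, each block shares a single exponent, and individual values keep only a short fixed-point mantissa, giving fixed-point density with floating-point-like range.

```python
import numpy as np

def bfp_quantize(x, block=16, mant_bits=8):
    """Toy block-floating-point quantizer: each block of `block`
    values shares one exponent; values keep `mant_bits` mantissas."""
    x = np.asarray(x, dtype=np.float32)
    pad = (-len(x)) % block
    xp = np.pad(x, (0, pad)).reshape(-1, block)
    # Shared exponent per block, derived from the largest magnitude.
    max_abs = np.max(np.abs(xp), axis=1, keepdims=True)
    exp = np.ceil(np.log2(np.maximum(max_abs, 2.0**-126)))
    scale = 2.0 ** (exp - (mant_bits - 1))       # weight of one mantissa LSB
    lo, hi = -(2**(mant_bits - 1)), 2**(mant_bits - 1) - 1
    mant = np.clip(np.round(xp / scale), lo, hi)  # short fixed-point mantissa
    return (mant * scale).ravel()[:len(x)].astype(np.float32)

np.random.seed(0)
w = np.random.randn(1000).astype(np.float32)
wq = bfp_quantize(w)
rel_err = np.linalg.norm(w - wq) / np.linalg.norm(w)
print(f"relative quantization error: {rel_err:.4f}")
```

Storing one exponent per block instead of one per value is what yields the silicon density of fixed-point arithmetic while the shared exponent preserves dynamic range across layers.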

The open-source project repository is available for ongoing research on training DNNs with HBFP.

The ColTraIn team is working to address the second challenge of developing a co-locating accelerator. The design adds training capabilities to an inference accelerator and pairs it with a scheduler that takes both resource utilization and tasks’ QoS constraints into account to co-locate DNN training and inference.


New Transistor Design Reduces Energy Dissipation in High-power Applications

Energy dissipation is one of the biggest challenges in running high-power applications, such as an electric vehicle or a solar panel installation. Although state-of-the-art power converters are built to tolerate high voltages while converting energy to make our electronic devices work, they lose almost a fifth of their energy in the process. Researchers at the Power and Wide-band-gap Electronics Research Laboratory (POWERLAB) in EPFL’s Institute of Electrical Engineering (IEL) have been working to circumvent this problem. In a collaboration with other scholars in the field, the POWERLAB team led by Elison Matioli has just published the results of their research in Nature Electronics.

In their paper, the researchers introduce a nanowire-based device to create high-electron-mobility tri-gate transistors for power-conversion applications. Based on nanoscale structures, the novel transistor design significantly reduces heat loss during energy conversion. The nano-transistors work well in high-power applications by tackling electrical resistance, the main culprit in electric power loss. As Elison Matioli explains, “We see examples of electric power losses every day, such as when the charger of your laptop heats up.” The problem multiplies in high-power applications because resistance increases with the voltage of the semiconductor components, with a direct impact on the efficiency of the application. For example, power losses during charging could reduce the range of an electric vehicle.

Compared to traditional transistors, the new nano-transistor exhibits less than half the resistance, even at voltages above 1,000 volts. It delivers such high efficiency thanks to two innovative design elements. First, it incorporates multiple two-dimensional electron-gas channels, which let more electrons flow and improve conductivity. Second, it uses nanowires made of gallium nitride, a semiconducting material well suited to power applications. Thanks to their funnel-like arrangement, the nanowires can also withstand high voltages.

The new technology is still being tested, but the researchers are confident that they can quickly transition to large-scale production. That is very good news for a market ripe for transistors that can perform efficiently at high voltages. The POWERLAB team has already received several inquiries from manufacturers to work together to take the technology forward.


Can Influencers Sway Public Opinion?

This is the age of social media, where the opinions of celebrities make a big difference. Influential people are roped in to endorse a consumer product, a social movement, an election campaign, or even a vaccination drive. The assumption is that these influencers have enough clout to sway opinions. However, as two separate research papers from EPFL have discovered, that is not always the reality. Instead, the experiments suggest that opinions are too well entrenched to change despite the best efforts of celebrity spokespersons. Surprisingly, the celebrity’s likeability has little bearing on the outcome.

In the first study, EPFL’s Andreas Spitz and Robert West teamed up with Ahmad Abu-Akel (University of Lausanne) to conduct a survey-based randomized controlled trial, in which respondents’ opinions were recorded on an array of current topics, as well as their attitude towards a set of celebrity spokespersons. The respondents were then presented with the opinions of spokespersons who were categorized as:

  1. a liked celebrity with a disagreeing opinion;
  2. a disliked celebrity with a disagreeing opinion;
  3. an expert with a disagreeing opinion;
  4. a disliked celebrity with an agreeing opinion.

Contrary to expectations, the experiment revealed that the respondents held on to their views firmly, regardless of the celebrity inputs or their esteem in the eyes of the respondents. It was also clear that respondents liked to hear an opinion identical to their own even if it came from a disliked celebrity. Conversely, a dissenting opinion by a celebrity or expert reduced the respondent’s empathy for that person.

While the first study considered rather more polarized opinions on a range of topics, the second randomized controlled study by the same researchers centered on COVID-19, a common challenge facing humanity where most opinions might be expected to veer in the same direction. To test the response to social distancing, calls to practice that safety norm were issued on Facebook. The calls were randomly attributed to an expert (Dr. Anthony Fauci), a celebrity, or a government official. The impact of that attribution was measured by participants’ willingness to share the message with their contacts. The results showed that messages attributed to the expert were reshared most often, and were thus most effective. Messages attributed to the official came next, while those attributed to the celebrity were least effective. The results of this study could inform future campaigns during a health crisis.

Both studies demonstrate that it is very easy for well-intended messages to have a completely undesirable impact on public opinion. On the other hand, proper messaging on a crisis can have a positive impact and help build societal consensus.


EPFL’s PowerSGD Algorithm Features in Software Industry

Artificial neural networks, a subset of machine learning, are having a major impact on many facets of life and industry. Rapid advancements in the field have allowed us to intervene in, and improve, processes as varied as voice and image recognition, the development of new drugs, and fraud reduction. They have even helped those with hearing impairments discern and isolate particular sounds amidst a cacophony. But as the ambit of neural networks grows, so does their size: training an accurate model can require communicating several petabytes of data. To address that problem, two EPFL students have developed PowerSGD, an algorithm that compresses this communication without compromising the accuracy of training.

Thijs Vogels and Sai Praneeth Karimireddy are pursuing their Ph.D. under the guidance of Professor Martin Jaggi, head of the Machine Learning and Optimization Laboratory at the School of Computer and Communication Sciences (IC). In their research, Vogels and Karimireddy applied the power method to reduce communication among graphics processing units (GPUs) by up to 99% without affecting the model’s accuracy. Beyond compressing communication, PowerSGD also helps cut energy consumption, thus playing a role in the fight against climate change.
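The core idea can be sketched in a few lines of NumPy (a simplified, single-worker illustration that omits the error-feedback machinery of the full algorithm): one step of the power method yields a low-rank factorization of the gradient matrix, and workers exchange only the two thin factors instead of the full gradient.

```python
import numpy as np

rng = np.random.default_rng(0)

def powersgd_compress(grad, rank=2, q=None):
    """One PowerSGD-style compression step (simplified sketch)."""
    n, m = grad.shape
    if q is None:
        q = rng.standard_normal((m, rank))
    p = grad @ q                       # power iteration: P = G Q
    p, _ = np.linalg.qr(p)             # orthogonalize P
    q = grad.T @ p                     # Q = G^T P
    return p, q                        # workers exchange only P and Q

def decompress(p, q):
    return p @ q.T                     # low-rank approximation of G

# A gradient matrix that is approximately low-rank (common in deep nets).
G = rng.standard_normal((512, 4)) @ rng.standard_normal((4, 256))
P, Q = powersgd_compress(G, rank=4)
G_hat = decompress(P, Q)

sent, full = P.size + Q.size, G.size
err = np.linalg.norm(G - G_hat) / np.linalg.norm(G)
print(f"floats sent: {sent:,} of {full:,} ({100 * sent / full:.1f}%)")
print(f"relative error: {err:.2e}")
```

For a 512×256 gradient, the two rank-4 factors carry about 2% of the floats, which is where the up-to-99% communication reduction comes from when the gradient is well approximated at low rank.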

The algorithm has already been adopted by the software industry, including PyTorch, the world’s most popular deep learning framework. PyTorch is extremely versatile, employed by 80% of academic publications using deep learning, and also used by Tesla’s Autopilot AI application and Facebook’s translation software. The newest version (PyTorch 1.8) comes with PowerSGD built in, allowing users to activate communication compression with a simple software switch. Beyond PyTorch, PowerSGD also features in OpenAI’s DALL·E, which can generate creative images from text.

Carrying their work forward, the EPFL researchers are applying the same principle to decentralized learning. That is a major step forward because it could help mitigate the risk of data leakage and privacy concerns about sensitive information such as medical records or data stored on mobile phones.


New Protocol to Help Prevent Transmission of SARS-CoV-2 Strains

Just as the world began to heave a sigh of relief that the worst of the novel coronavirus was over, the virus has mutated into new strains that are even more contagious than the first. Two of the safety precautions advocated to prevent transmission of the virus—wearing masks and maintaining social distancing—are no longer considered sufficient to guarantee protection from the new strains. In that precarious situation, it becomes imperative to find new means of fighting the spread of the disease. Enter CrowdNotifier, a new protocol developed in part at EPFL that alerts people who attended an event where there was a risk of COVID-19 infection.

In many cases, it is unavoidable to hold meetings in enclosed spaces. Despite the protocols of wearing masks and safe distancing, such events carry the risk of transmission. With CrowdNotifier, it is possible to alert participants in case any of the attendees was a carrier of the virus and later tests positive for COVID-19. The alert is sent to their smartphones through an app. In Switzerland, the protocol works with the NotifyMe Check-in app developed by Ubique, the same company that developed SwissCovid. The app is available for download on both Apple’s App Store and Google’s Play Store.

NotifyMe has several distinctive features compared to other tracing apps. It works in a decentralized manner on each user’s smartphone, which protects privacy: it neither creates a centralized database nor stores any private information. When users enter a venue, they scan a QR code with the app, which lets the phone keep an encrypted record of the event without any private data such as a phone number. If one of the participants at an event later tests positive for COVID-19, contact tracing kicks in and the details of the event are communicated to the organizer. When the organizer uploads a decryption key to a secure server, the app retrieves it and alerts affected users to follow isolation and testing procedures.
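The flow can be illustrated with a deliberately simplified toy scheme (the real CrowdNotifier protocol relies on asymmetric cryptography and is considerably more involved; the venue secret and hours below are hypothetical): the phone stores only tokens derived from the QR code, and matching against published exposure tokens happens entirely on the device.

```python
import hashlib

def token(venue_secret, hour):
    """Token a phone stores locally after scanning the venue QR code.
    Toy scheme only: CrowdNotifier itself uses asymmetric encryption,
    not plain hashes."""
    return hashlib.sha256(f"{venue_secret}|{hour}".encode()).hexdigest()

# Check-in: the QR code carries a per-venue secret; the phone records
# only derived tokens, never the venue name or any personal data.
qr_secret = "cafeteria-A-9f3c"            # hypothetical QR payload
my_log = [token(qr_secret, hour) for hour in (12, 13)]

# A visitor later tests positive: the organizer publishes material
# covering the affected venue and time window to a server.
published = [token("cafeteria-A-9f3c", h) for h in (13, 14)]

# Each phone matches locally; nothing about the user leaves the device.
exposed = any(t in published for t in my_log)
print("exposure alert!" if exposed else "no risk recorded")
```

Because the match is computed on the phone against broadcast data, the server never learns who was present, which is the decentralization property the article describes.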

In its test phase, NotifyMe has been deployed at several meeting rooms, classrooms, and cafeterias, and will be subsequently rolled out across the EPFL campus. There is a likelihood of integrating the app with SwissCovid soon.

Matthias Gäumann, Vice President for Operations and president of the operational Covid commission at EPFL, gave further details about the potential impact of NotifyMe:

“NotifyMe will enable us to beef up our COVID-19 protection plan. It will let the students and staff engage in certain activities on-site if they have to while limiting the risk of creating a transmission chain.”



New Research Presents First Online FL System

Big tech companies make no secret of the fact that they use artificial intelligence to track your browsing behavior and offer products and services that seem to target your interests and needs. While that helps them offer personalized services, it carries the risk of sharing your data with third parties and raises the specter of compromised digital data. In contrast, Federated Learning (FL) systems are gaining traction because they compute all updates on mobile devices, keeping the data local. The drawback of standard FL systems, however, is their unsuitability for applications that require frequent online updates, such as news recommendations.

In a major step forward for FL systems, researchers from EPFL and INRIA have developed FLeet, the first online FL system, which makes it possible to carry out machine learning on mobile devices in real time without the learning tasks disrupting normal use of the device. More importantly, data is not shared with any tech company but remains local and secure. The research recently won the best conference paper award at the 2020 ACM/IFIP Middleware Conference.

FLeet manages to deliver the best of both worlds—privacy offered by standard FL systems and the precision of online learning. Helping it do so are two main components: I-Prof, a new lightweight profiler that predicts and controls the impact of learning tasks on mobile devices; and AdaSGD, a new adaptive learning algorithm that is resilient to delayed updates.

Test results show that FLeet can deliver a 2.3× quality boost over standard FL while consuming only 0.036% of the battery per day. I-Prof improves prediction accuracy by up to 3.6× for computation time and up to 19× for energy, while AdaSGD outperforms alternative FL approaches by 18.4% in convergence speed on heterogeneous data.
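The principle behind staleness-aware updates can be sketched as follows (an illustrative dampening rule; the actual AdaSGD algorithm differs in its details): a gradient that arrives several steps late is down-weighted, so slow phones still contribute without dragging the model in an outdated direction.

```python
import numpy as np

def staleness_aware_step(w, grad, staleness, lr=0.1):
    """Down-weight a gradient computed `staleness` steps ago
    (illustrative rule, not FLeet's exact AdaSGD dampening)."""
    damp = 1.0 / (1.0 + staleness)
    return w - lr * damp * grad

# Minimizing f(w) = ||w||^2 with updates arriving 0-4 steps late.
rng = np.random.default_rng(1)
w = np.ones(4)
for step in range(200):
    staleness = rng.integers(0, 5)      # mobile updates arrive late
    grad = 2 * w                        # gradient of f (simplified)
    w = staleness_aware_step(w, grad, staleness)

print(f"final ||w|| = {np.linalg.norm(w):.2e}")
```

Even with every update randomly delayed, the damped steps still drive the toy objective toward its minimum, which is the resilience property the article attributes to AdaSGD.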

Rachid Guerraoui and Anne-Marie Kermarrec, professors at EPFL’s School of Computer and Communication Science and authors in the study, emphasize that today’s smartphones have the power to enable distributed machine learning without having to share raw data with, or rely on, large centralized systems.

As Professor Guerraoui explains, “With FLeet it is possible, while you are using your mobile phone, to use some of its spare power towards machine learning tasks without having to worry that your call or internet search will be interrupted…. we don’t want the machine learning to only be happening when you are asleep and your phone is on charge…sometimes we want, and need, real-time information.” Professor Kermarrec elucidates how the findings of their study can foster “truly collaborative learning where local models are aggregated and contribute to the global model but you don’t share raw data and that protects your privacy….”

The researchers are currently exploring options to develop the FLeet prototype into a safe, secure, and usable end product.

“FLeet: Online Federated Learning via Staleness Awareness and Performance Prediction”, authored by Georgios Damaskinos (Facebook), Rachid Guerraoui, Anne-Marie Kermarrec (EPFL), Rhicheek Patra (Oracle Labs), François Taïani (INRIA/University of Rennes), and Vlad Nitu (CNRS)


Parental Control or Parent Trap?

Over the last decade, the tsunami of mobile services has practically inundated humans and their social behavior, and children are no exception. Even eight-year-olds are glued to their tablets or smartphones. Consequently, they are exposed to the risks of the all-pervasive Internet, from violent content to explicit material unsuitable for their tender age. It is therefore not surprising that millions of parents install apps that help them monitor and filter the content their children access, and even keep tabs on their location. However, as a recent study* has shown, they could be paying a heavy price for it by jeopardizing private data. While giving parents perceived control over their child’s browsing habits, those apps could be leading them into a ‘parent trap.’

Privacy concerns arising from parental control apps have so far been overlooked by organizations recommending their use, including some European and other national security centers. Dozens of popular apps of this kind, with combined downloads exceeding 20 million in the Google Play Store, amplify the privacy concerns manifold. Recognizing this lacuna in the literature, EPFL’s Carmela Troncoso and a group of researchers from IMDEA conducted the first in-depth study of the Android parental control app ecosystem from a privacy and regulatory point of view.

After a thorough static and dynamic analysis of 46 apps from 43 developers, the researchers concluded that these apps demand more permissions than the top 150 apps in the Google Play Store, and that the alarming or hazardous nature of the permissions increases with every new release. The study found that 34% of the apps collect and share personal information without user consent, and 72% share data with third parties.

In many ways, the threat to privacy from some parental control apps is similar to that from spyware. Troncoso, head of the Security and Privacy Engineering Lab (SPRING) at EPFL’s School of Computer and Communication Sciences (IC), expressed her surprise at the extensive presence of surveillance libraries in parental control apps and said, “With some of the apps you can’t look at anything on your phone without information being sent to the backend server.”

The study, which won the Emilio Aced Prize for research in personal data protection, given by the Spanish data protection agency (AEPD), raises a potent question: does the use of parental control apps justify the dangers arising from the collection and processing of private data? The researchers hope the study serves as a wake-up call for parents and regulators alike about the risks associated with these apps. They also call for security and privacy analyses to help parents choose an app that protects their data privacy to the greatest extent possible.

* Feal, Á., Calciati, P., Vallina-Rodriguez, N., Troncoso, C., & Gorla, A. (2020). Angel or Devil? A Privacy Study of Mobile Parental Control Apps. Proceedings on Privacy Enhancing Technologies, 2020, 314–335. doi:10.2478/popets-2020-0029.



Photonic Chips: Breakthrough in Data Processing

High-speed mobile networks, connected devices, and widescale adoption of AI technologies contribute to an exponential generation of data that needs to be processed rapidly and efficiently. With conventional chips often found wanting, an international group of scientists has collaborated to present a photonic hardware accelerator that is capable of operating at speeds of “trillions of multiply-accumulate operations per second,” far beyond the capabilities of existing computer processors. The breakthrough study, published in Nature, was carried out by scientists from EPFL, the Universities of Oxford, Münster, Exeter, Pittsburgh, and IBM Research – Zurich.

The scientists have adopted a new approach and architecture that integrates processing and data storage onto a single chip by using light-based or “photonic” processors that outperform conventional electronic chips by achieving “parallelized photonic in-memory computing using phase-change-material memory arrays and photonic chip-based optical frequency combs.” Professor Tobias Kippenberg, EcoCloud member and one of the authors of the study, elaborates on the use of a chip-based “frequency comb” (a technology developed at EPFL) as a light source: “Our study is the first to apply frequency combs in the field of artificial neural networks. The frequency comb provides a variety of optical wavelengths that are processed independently of one another in the same photonic chip.”
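Numerically, the operation being parallelized is the humble multiply-accumulate. The sketch below mimics the dataflow purely as an analogy (not a simulation of the actual photonic hardware): each comb wavelength carries one input value, a stored weight attenuates it, and a detector sums the products, so each output is one MAC chain and all chains run in parallel.

```python
import numpy as np

# Toy numerical analogy for wavelength-parallel in-memory MACs
# (illustrative only, not a model of the photonic device).
rng = np.random.default_rng(0)
inputs = rng.random(8)                 # light intensity per wavelength
weights = rng.random((4, 8))           # stored transmission levels

# Each output: sum over wavelengths of (input * stored weight),
# i.e., four multiply-accumulate chains evaluated in parallel.
outputs = weights @ inputs

expected = [sum(w * x for w, x in zip(row, inputs)) for row in weights]
assert np.allclose(outputs, expected)
print(outputs)
```

In the photonic chip, every such chain is evaluated at the speed of light within the memory array itself, which is why the throughput reaches trillions of MACs per second.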

Speaking about the advantage of using light-based processors, co-author Wolfram Pernice (Münster University) said that it is “much faster than conventional chips which rely on electronic data transfer, such as graphic cards or specialized hardware like TPU’s.”

The photonic chip developed by the researchers was tested on a neural network that recognizes hand-written numbers. According to the authors, the results of their study indicate the applicability of integrated photonics for parallel, fast, and efficient computational hardware in data-heavy AI applications such as autonomous driving, live video processing, and next-generation cloud computing services.

The study was funded by EPSRC, Deutsche Forschungsgemeinschaft (DFG), European Research Council, European Union’s Horizon 2020 Research and Innovation Programme (Fun-COMP), and Studienstiftung des deutschen Volkes.

