EcoCloud is getting ready for exorbitant amounts of data

The Square Kilometre Array Observatory (SKAO) is a radio-astronomy project that will consist of two radio telescope arrays in Australia and South Africa, on which construction began December 5, 2022. The data that come from these telescopes will revolutionize our understanding of the universe and its origins, as well as the laws of fundamental physics. It promises to have a major impact on society, in science and beyond.

SKACH is the organisation that will be delivering Swiss contributions to SKAO, and is led by Prof. Jean-Paul Kneib. During its operation, the SKAO will generate a flow of 707 petabytes of data per year. As part of this amazing project, participating scientists will need to handle these large data streams, preferably in the most energy-efficient way possible. This is where EcoCloud comes in.

Right now, EcoCloud scientists Dr. Denisa Constantinescu and Dr. Miguel Peón Quirós are looking at ways that the mountains of data that the SKAO will need to produce and move can be handled with as little energy expenditure as possible.

The collaboration, which began in September of this year, was born out of the EcoCloud Annual Event, organized in May 2022. Researchers from SCITAS, who work on high-performance computing at EPFL, presented their work for SKACH, and a new collaboration kicked off.

“We saw that the type of computation they are doing is similar to what we have been doing for the biomedical domain at a different scale,” says Peón Quirós. “And that they have exactly the same problem now, that they are less concerned about speed of calculation than they are about energy consumption.”

Because EcoCloud already has this expertise in the biomedical domain, Peón Quirós and Constantinescu believed that they might be able to help with SKACH’s big data problem, and are now in the middle of a six-month exploratory process to see if the collaboration could be fruitful.

“Right now, I’m just profiling the server to see exactly what kind of computation is in there, see if maybe we can optimize any of the kernels, and what exactly is the best kind of accelerator that could be used for each one of these kernels,” explains Constantinescu. “Then we’ll explore different kinds of algorithms, different ways of implementing the same thing, looking for the most efficient imaging process. We’ll test different hypotheses and see which is going to bring the most benefit in terms of energy consumption.

“It is likely that we will be using heterogeneous computing, specialized architectures, for different parts of the whole pipeline.” Heterogeneous computing brings together different types of processors with a particular task in mind. The huge quantities of data involved in this project mean that there will be a great benefit from processing hardware that is not only diverse, but also made-to-measure.

Then there is the time element: the SKAO is expected to run for many years, so making the data processing energy efficient has important long-term implications.

“If you want it running for 50 years, think of the prospect of providing enough energy for such a long time,” suggests Constantinescu. “If you could get that down to zero energy costs, that would be wonderful. This is pure fantasy, but our goal will be to get as close to that as possible.”

Constantinescu and Peón Quirós will continue this initial exploratory phase, which is funded by EcoCloud, through the beginning of 2023. In the meantime, they are applying for funding to continue this promising line of research.

Just like the SKAO itself, this will be a long-term project. Depending on what the researchers come up with, it may require not only making the new technology work, but also figuring out how to mass-produce it.

“Right now we are making a promise, we are defining research,” explains Constantinescu, “and we are hopeful that we will get something interesting. But we don’t have anything concrete as yet. We have just an informed intuition with which we are formulating hypotheses for how it should work.”

The Square Kilometre Array Observatory is gearing up to be fifty times more sensitive than any telescope currently in existence. But this will only be useful if the resulting data are efficiently accessible to scientists in the long term.

EcoCloud Center researchers relish a challenge, but this one involves exorbitant amounts of data, over decades, with the ever-present demands for low energy: it’s huge. In fact, it’s astronomical.

Text: Stephanie Parker and John Maxwell

Photo: John Maxwell