A new generic algorithmic building block to accelerate training of machine learning models on heterogeneous computing systems
Martin Jaggi
A team of scientists from IBM Research and EPFL has translated theory into practice by proposing a new generic algorithmic building block to accelerate the training of machine learning models on heterogeneous computing systems.
In a paper presented at the 2017 NIPS Conference, IBM researchers Celestine Dünner and Thomas Parnell, together with their colleague Martin Jaggi from EPFL, demonstrated how GPUs with limited memory capacity (currently up to 16 GB) can be used to train machine learning models ten times faster. Their technique identifies the subset of the data that matters most to the training algorithm at any given moment and processes those data points first, keeping only this working set in GPU memory. The proposed scheme is called Duality-gap based Heterogeneous Learning (DuHL).
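The paper treats a general class of convex training problems; as an illustration only, the following NumPy sketch shows the core idea for one concrete case, a hinge-loss SVM in its dual form. There, the duality gap decomposes into one nonnegative term per training example, and the examples with the largest gap contributions are the ones worth keeping in the GPU's limited memory. The function names and the choice of model are assumptions for this sketch, not the authors' code.

```python
import numpy as np

def duality_gaps(X, alpha, lam):
    """Per-example duality-gap contributions for a hinge-loss SVM dual.

    X     : (n, d) data matrix with labels folded in (row i is y_i * x_i)
    alpha : dual variables, each in [0, 1]
    lam   : L2 regularization strength
    """
    n = X.shape[0]
    w = X.T @ alpha / (lam * n)          # primal model induced by the dual point
    margins = X @ w                      # x_i . w for every example
    # Fenchel-Young slack per example: hinge(x_i.w) + alpha_i * (x_i.w) - alpha_i
    # Each term is >= 0 and they sum to the total duality gap.
    return np.maximum(0.0, 1.0 - margins) + alpha * margins - alpha

def select_working_set(X, alpha, lam, k):
    """Pick the k examples with the largest gap contributions (hypothetical helper)."""
    gaps = duality_gaps(X, alpha, lam)
    return np.argsort(gaps)[-k:]

# Toy usage: choose the 64 "most important" points to load onto the accelerator.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 20))
alpha = rng.uniform(0.0, 1.0, size=1000)
hot = select_working_set(X, alpha, lam=1.0, k=64)
```

The design point this illustrates is that the per-example gaps are cheap to evaluate on the CPU, so the expensive GPU iterations can be spent only on the current high-gap working set.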
The scientists demonstrated the efficacy of their algorithm by training on a 30 GB dataset in less than one minute using an NVIDIA Quadro M4000 GPU with 8 GB of memory. Training converged ten times faster than with a CPU-only implementation.
The researchers' next logical step is to offer DuHL as a service in the cloud. In that environment, resources are billed by the hour, so a 10x speed-up would translate directly into significant cost savings.