Algorithms that combine three key aspects: accelerated, non-convex, and distributed training

The interplay between optimization and machine learning is one of the most significant aspects of modern computational science. Extracting essential information from large volumes of data requires new algorithms built on optimization formulations and methods, all the more so given the growing complexity, size, and diversity of today's machine learning models. Existing theory therefore needs to be reconsidered, and efficient optimization algorithms are needed to bring modularity and scalability to the training process. Toward this objective, EPFL’s Martin Jaggi and co-principal investigators have developed algorithms that combine three key aspects: accelerated, non-convex, and distributed training.

In proposing convergence acceleration techniques for generic optimization problems, the researchers take a statistical view of optimization methods. A typical optimization algorithm discards its sequence of iterates and uses only the last point to estimate the optimum. Martin Jaggi and his co-investigators instead keep the iterates produced by the optimization algorithm and estimate the solution directly from this sequence. The key benefit of their scheme is significant speedups at minimal implementation cost and with negligible added complexity. It requires no change to existing neural network training code.
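The article does not name the extrapolation method, so the following is only a minimal sketch of the general idea, assuming a regularized nonlinear (Anderson-type) extrapolation: the stored iterates are combined with weights that sum to one and make the combined step differences as small as possible. The function name `extrapolate_from_iterates`, the regularization value `reg`, and the toy quadratic problem are illustrative assumptions, not the researchers' implementation.

```python
import numpy as np


def extrapolate_from_iterates(iterates, reg=1e-10):
    """Estimate the optimum from a whole sequence of iterates.

    Hypothetical helper: finds weights c (summing to 1) that minimise
    ||sum_i c_i (x_{i+1} - x_i)||^2 + reg * ||c||^2 and returns the
    weighted combination of the iterates x_0 .. x_{k-1}.
    Assumes the iterates are not all identical.
    """
    X = np.asarray(iterates, dtype=float)   # shape (k + 1, d)
    R = np.diff(X, axis=0)                  # step differences, shape (k, d)
    G = R @ R.T                             # Gram matrix of the differences
    G = G / np.linalg.norm(G, 2)            # normalise so reg is scale-free
    k = G.shape[0]
    z = np.linalg.solve(G + reg * np.eye(k), np.ones(k))
    c = z / z.sum()                         # weights sum to one
    return c @ X[:-1]


# Toy problem (assumption): gradient descent on a small convex quadratic
# f(x) = 0.5 * x^T A x - b^T x, whose minimiser x_star is known.
A = np.diag([1.0, 2.0, 4.0])
x_star = np.ones(3)
b = A @ x_star
step = 0.2

x = np.zeros(3)
iterates = [x.copy()]
for _ in range(4):
    x = x - step * (A @ x - b)              # plain gradient step
    iterates.append(x.copy())

x_acc = extrapolate_from_iterates(iterates)
err_last = np.linalg.norm(iterates[-1] - x_star)
err_acc = np.linalg.norm(x_acc - x_star)
print(f"last iterate error: {err_last:.2e}, extrapolated error: {err_acc:.2e}")
```

The extrapolation runs as a post-processing step on the stored iterates, so the optimizer loop itself is untouched, which matches the claim that existing training code needs no change.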

In the context of non-convex optimization, the investigators design algorithms that address local convergence issues for non-convex problems, with a focus on two settings: deep neural networks and matrix factorization.

The third focus area of the project is distributed optimization. To achieve the desired scalability, optimization algorithms need to transfer and manage information between distributed devices. However, existing distributed machine learning methods yield diminishing returns as the number of devices increases. In contrast, the current research brings modularity and scalability to the training process by targeting extensions that make efficient use of second-order information in the distributed setting. It demonstrates improved training speed for linear machine learning models by exploiting the memory hierarchy as well as different degrees of compute parallelism between two devices.

In recognition of their project proposal (“Large-Scale Optimization: Beyond Convexity Accelerated, non-convex & distributed optimization for machine learning”), Google awarded the researchers its Focused Research Award for 2018 in the area of Machine Learning.
