An algorithm termed stochastic spectral descent (SSD) for training deep neural networks
Team
Cevher Volkan
In 2011, neural networks were shown to outperform humans on a number of computational tasks, and they have served as the go-to standard ever since. The biggest challenge, however, has been the development of efficient training algorithms. Converging to a good solution calls for careful hyperparameter tuning, because a poorly chosen configuration can seriously degrade the performance of the neural network.
An efficient training algorithm must overcome two stumbling blocks related to loss functions: they are highly non-convex, and their curvature is inhomogeneous across directions. Although stochastic gradient descent (SGD) and its variants have served as the standard algorithms for training neural networks, they have proved less than adequate at accounting for this curvature and at dealing with saddle points. A Microsoft Research project, undertaken by Prof. Volkan Cevher (Laboratory for Information and Inference Systems, EPFL), therefore proposes to use an algorithm termed stochastic spectral descent (SSD) for training deep neural networks.
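The effect of inhomogeneous curvature can be seen on a toy problem. The sketch below is an illustration only: the quadratic loss, the curvature values 1 and 100, and the function names are assumptions chosen for the example, and it uses plain (non-stochastic) gradient descent for simplicity. A single learning rate has to stay small enough for the steep direction, which makes progress along the flat direction slow.

```python
# Toy loss with inhomogeneous curvature: f(x, y) = 0.5 * (x**2 + 100 * y**2).
# The curvature is 1 along x and 100 along y (values chosen for illustration).
CURVATURES = (1.0, 100.0)


def gradient(point):
    """Gradient of the toy quadratic: each coordinate scaled by its curvature."""
    return tuple(c * p for c, p in zip(CURVATURES, point))


def gradient_descent(start, learning_rate, steps):
    """Plain gradient descent with one learning rate for all directions."""
    point = start
    for _ in range(steps):
        grad = gradient(point)
        point = tuple(p - learning_rate * g for p, g in zip(point, grad))
    return point


# The iteration is stable only while learning_rate < 2 / 100,
# so the flat x direction shrinks slowly while y converges quickly.
slow_x, fast_y = gradient_descent((1.0, 1.0), learning_rate=0.019, steps=100)

# A slightly larger learning rate makes the steep y direction diverge.
_, diverged_y = gradient_descent((1.0, 1.0), learning_rate=0.021, steps=100)
```

Methods that account for curvature or geometry aim to remove this trade-off between stability in steep directions and speed in flat ones.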
Whereas geometry has mostly been emphasized in theory-based algorithms, the SSD algorithm is proposed as a novel application of it that delivers state-of-the-art learning performance. It was first applied to Restricted Boltzmann Machines (RBMs), and has subsequently been extended successfully to other models, including neural networks. The current research proposes to use the new algorithmic framework to achieve hyperparameter tuning that is both fast and robust.
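To make the idea of a spectral update concrete, here is a minimal sketch of one spectral-descent step for a weight matrix, assuming the general form of steepest descent with respect to the matrix spectral norm: the gradient G = U diag(s) V^T is replaced by U V^T scaled by the sum of its singular values. Whether this matches the exact SSD variants used in the project is an assumption; the function name, the toy loss, and the step size are hypothetical. In a stochastic setting the gradient would be a minibatch estimate.

```python
import numpy as np


def spectral_descent_step(weights, grad, step_size):
    """One spectral-descent update for a weight matrix.

    With grad = U diag(s) Vt, the step moves along -(U @ Vt) scaled by the
    nuclear norm of grad (the sum of its singular values), not along -grad.
    """
    u, s, vt = np.linalg.svd(grad, full_matrices=False)
    direction = s.sum() * (u @ vt)
    return weights - step_size * direction


# Hypothetical toy problem: fit `weights` to `target` under
# loss = 0.5 * ||weights - target||_F^2, whose gradient is weights - target.
target = np.array([[3.0, 0.0], [0.0, 1.0]])
weights = np.zeros((2, 2))


def loss(w):
    return 0.5 * np.sum((w - target) ** 2)


loss_before = loss(weights)
# step_size = 1 / 2 assumes a smoothness constant of 2 (in the spectral norm)
# for this 2x2 quadratic; a real model would need its own constant or tuning.
weights = spectral_descent_step(weights, weights - target, step_size=0.5)
loss_after = loss(weights)
```

Because every singular value of the update direction is equal, the step moves equally far along all singular directions of the gradient, which is one way to cope with curvature that differs between directions.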
Apart from SSD, the project will also develop a package for automatic training of deep neural networks, whose training algorithms include SSD variants. In the process, the study seeks to arrive at robust hyperparameter configurations for neural networks.
The project is among the selected proposals for 2017 under the Microsoft Research PhD Scholarship Programme in EMEA (Europe, Middle East, Africa). It has Prof. Volkan Cevher as Supervisor and Ryota Tomioka as the Microsoft Research Supervisor.
Suggested readings:
http://proceedings.mlr.press/v38/carlson15.pdf
https://wiki.epfl.ch/edicpublic/documents/Candidacy%20exam/PR15Collins.pdf