Abstract

Deep learning has achieved unprecedented success in tasks ranging from natural language processing and computer vision to playing strategic games. Nevertheless, deep learning research is largely guided by empirical observation, and the successful deployment of deep learning technology often requires various heuristics and extensive hyperparameter tuning. In this project, we aim to develop rigorous theories to understand (and potentially address) various aspects of deep learning, including the trainability, generalization, and robustness of neural networks.

Publications

Guiding Neural Collapse: Optimising Towards the Nearest Simplex Equiangular Tight Frame.
Evan Markou, Thalaiyasingam Ajanthan, and Stephen Gould.
Neural Information Processing Systems (NeurIPS), December 2024.
[to appear] [bib]

Bidirectional Self-Normalizing Neural Networks.
Yao Lu, Stephen Gould, and Thalaiyasingam Ajanthan.
Neural Networks, August 2023.
[pdf] [arxiv] [talk] [bib]