About me
I am a recent PhD graduate from the AIML Lab, University of St. Gallen, where I was advised by Damian Borth and co-advised by Michael Mahoney and Xavier Giro-i-Nieto. My research focused on representation learning on neural network weights, specifically investigating populations of neural networks and identifying latent structures in them by learning hyper-representations.
During my PhD, I was a visiting scholar with Michael Mahoney at ICSI Berkeley, collaborating on weight analysis and on scaling hyper-representations to large models and diverse architectures. This collaboration extended to working with Yaoqing Yang on identifying and utilizing phase transitions in neural networks. Additionally, I completed a PhD research internship with Minmin Chen at Google DeepMind, focusing on data-free methods to mitigate forgetting in large language models.
My original background is in mechanical engineering with a focus on simulation science and automatic control, which I studied at RWTH Aachen University. I visited Aalto University in Helsinki on an Erasmus scholarship and interned at Siemens in Shanghai, via Jiao Tong University, as a DAAD PROMOS scholar. Before my PhD, I interned at the Institute for Snow and Avalanche Research, working with Henning Loewe on coupled simulations of energy and mass transfer in snowpack.
Research Statement
As a postdoc at the AIML Lab, my research focuses on scaling hyper-representations towards foundation models of neural networks, which could allow model initializations to be tailored to a specific task or dataset and could change how models are tuned and evaluated. Further, I continue to investigate and operationalize phase transitions in neural networks, which may help explain when and why neural network methods perform well.
I am broadly interested in representation learning, particularly in challenging and impactful domains with sparse and multi-modal data, such as scientific and biomedical applications.
News
- 2024
- Paper accepted at ICML 2024 GRaM Workshop: “Dirac-Bianconi Graph Neural Networks – Enabling Non-Diffusive Long-Range Graph Predictions”
- Two papers accepted at ICML 2024:
- “Towards Scalable and Versatile Weight Space Learning” arxiv
- “MD tree: a model-diagnostic tree grown on loss landscape” openreview
- Completed my PhD at the University of St. Gallen with honors (summa cum laude).
- Reviewer for International Conference on Machine Learning (ICML) 2024.
- Invited Talk at Dartmouth College: Weight Space Learning: Learning from Populations of Neural Networks, 01/2024.
- 2023
- Recently finished a research internship at Google DeepMind, working on data-free methods to mitigate forgetting in LLMs.
- Paper accepted to Chaos: An Interdisciplinary Journal of Nonlinear Science: “Toward dynamic stability assessment of power grid topologies using graph neural networks”, paper
- Paper accepted to NeurIPS 2023 ATTRIB Workshop: “Why do landscape diagnostics matter? Pinpointing the failure mode of generalization” poster paper
- Paper accepted to ICML 2023 HiLD Workshop: “Hyperparameter Tuning using Loss Landscape”, poster
- Invited Talk at Google Algorithms Seminar in Mountain View: Hyper-Representations: Learning from Populations of Neural Networks.
- Received the HSG Impact Award 2023 with Damian Borth for our work on Hyper-Representations, announcement.
- Reviewer for Winter Conference on Applications of Computer Vision (WACV) 2023.
- Reviewer for Conference on Neural Information Processing Systems (NeurIPS) 2023.
- Visiting Scholar with Michael Mahoney at ICSI Berkeley: With Michael, I investigated the weight space and loss landscape of neural networks. We worked on scaling hyper-representations to large models and diverse architectures, and collaborated with Yaoqing Yang on identifying and utilizing phase transitions in neural networks.
- Paper accepted at ICLR 2023 Workshop on Sparsity in Neural Networks: Sparsified Model Zoo Twins: Investigating Populations of Sparsified Neural Network Models. arxiv
- 2022
- Google Research Scholar Award for “Hyper-Representations: Learning from Populations of Neural Networks” with PI Damian Borth. announcement, article
- Paper accepted at NeurIPS 2022: Hyper-Representations as Generative Models: Sampling Unseen Neural Network Weights. paper
- Paper accepted at NeurIPS 2022 Track on Datasets and Benchmarks: Model Zoos: A Dataset of Diverse Populations of Neural Networks. paper, modelzoos.cc
- Invited Talk at University of St. Gallen, Deep Learning Lecture Series: Hyper-Representations, 11/2022.
- Paper accepted at NeurIPS 2022 Climate Change AI Workshop: Towards dynamic stability analysis of sustainable power grids using graph neural networks. arxiv
- Paper accepted at ICML 2022 Workshop on Pre-training: Perspectives, Pitfalls, and Paths Forward: Hyper-Representation for Pre-Training and Transfer Learning. arxiv
- Reviewer for Conference on Neural Information Processing Systems (NeurIPS) 2022, Track on Datasets and Benchmarks.
- Paper accepted to New Journal of Physics: Predicting basin stability of power grids using graph neural networks. paper
- 2021
- Paper accepted at NeurIPS 2021: Self-Supervised Representation Learning on Neural Network Weights for Model Characteristic Prediction. proceedings, arxiv, blog, talk, code, data
- Paper accepted to The Cryosphere (EGU): Elements of future snowpack modeling – Part 1: A physical instability arising from the nonlinear coupling of transport and phase changes. paper
- 2019