About me

I am a recent PhD graduate from the AIML Lab at the University of St.Gallen, where I was advised by Damian Borth and co-advised by Michael Mahoney and Xavier Giro-i-Nieto. My research focused on representation learning on neural network weights, specifically investigating populations of neural networks and identifying latent structures by learning hyper-representations.

During my PhD, I was a visiting scholar with Michael Mahoney at ICSI Berkeley, collaborating on scaling hyper-representations to large models and diverse architectures, as well as on weight analysis. This collaboration extended to working with Yaoqing Yang on identifying and utilizing phase transitions in neural networks. Additionally, I completed a PhD research internship with Minmin Chen at Google DeepMind, focusing on data-free methods to mitigate forgetting in large language models.

My original background is in mechanical engineering with a focus on simulation science and automatic control, which I studied at RWTH Aachen University. I visited Aalto University in Helsinki on an Erasmus scholarship and interned at Siemens in Shanghai, via Shanghai Jiao Tong University, as a DAAD PROMOS scholar. Before my PhD, I interned at the Institute for Snow and Avalanche Research, working with Henning Loewe on coupled simulations of energy and mass transfer in snowpacks.

Research Statement

As a postdoc at the AIML Lab, my research focuses on scaling hyper-representations towards foundation models of neural networks, which could allow model initializations to be tailored to a specific task or dataset and inform how models are tuned or evaluated. I also continue to investigate and operationalize phase transitions in neural networks, which may build understanding of when and why neural network methods perform well.

I am broadly interested in representation learning, particularly in challenging and impactful domains with sparse and multi-modal data, such as scientific and biomedical applications.

News

  • 2024
    • Paper accepted at ICML 2024 GRaM Workshop: “Dirac-Bianconi Graph Neural Networks – Enabling Non-Diffusive Long-Range Graph Predictions”
    • Two papers accepted at ICML 2024:
      • “Towards Scalable and Versatile Weight Space Learning” arxiv
      • “MD tree: a model-diagnostic tree grown on loss landscape” openreview
    • Completed my PhD at the University of St.Gallen with honors (Summa Cum Laude).
    • Reviewer for International Conference on Machine Learning (ICML) 2024.
    • Invited Talk at Dartmouth College: Weight Space Learning: Learning from Populations of Neural Networks, 01/2024.
  • 2023
    • Completed a research internship at Google DeepMind, working on data-free methods to mitigate forgetting in LLMs.
    • Paper accepted to Chaos: An Interdisciplinary Journal of Nonlinear Science: “Toward dynamic stability assessment of power grid topologies using graph neural networks”, paper
    • Paper accepted at NeurIPS 2023 ATTRIB Workshop: “Why do landscape diagnostics matter? Pinpointing the failure mode of generalization”. poster, paper
    • Paper accepted at ICML 2023 HiLD Workshop: “Hyperparameter Tuning using Loss Landscape”, poster
    • Invited Talk at Google Algorithms Seminar in Mountain View: Hyper-Representations: Learning from Populations of Neural Networks.
    • Received the HSG Impact Award 2023 with Damian Borth for our work on Hyper-Representations, announcement.
    • Reviewer for Winter Conference on Applications of Computer Vision (WACV) 2023.
    • Reviewer for Conference on Neural Information Processing Systems (NeurIPS) 2023.
    • Visiting Scholar with Michael Mahoney at ICSI Berkeley: With Michael, I investigated the weight space and loss landscape of neural networks. We worked on scaling hyper-representations to large models and diverse architectures, and collaborated with Yaoqing Yang on identifying and utilizing phase transitions in neural networks.
    • Paper accepted at ICLR 2023 Workshop on Sparsity in Neural Networks: “Sparsified Model Zoo Twins: Investigating Populations of Sparsified Neural Network Models”. arxiv
  • 2022
    • Google Research Scholar Award for “Hyper-Representations: Learning from Populations of Neural Networks” with PI Damian Borth. announcement, article
    • Paper accepted at NeurIPS 2022: “Hyper-Representations as Generative Models: Sampling Unseen Neural Network Weights”. paper
    • Paper accepted at NeurIPS 2022 Track on Datasets and Benchmarks: “Model Zoos: A Dataset of Diverse Populations of Neural Networks”. paper, modelzoos.cc
    • Invited Talk at University of St.Gallen, Deep Learning Lecture Series: Hyper-Representations, 11/2022.
    • Paper accepted at NeurIPS 2022 Climate Change AI Workshop: “Towards dynamic stability analysis of sustainable power grids using graph neural networks”. arxiv
    • Paper accepted at ICML 2022 Workshop on Pre-training: Perspectives, Pitfalls, and Paths Forward: “Hyper-Representations for Pre-Training and Transfer Learning”. arxiv
    • Reviewer for Conference on Neural Information Processing Systems (NeurIPS) 2022, Track on Datasets and Benchmarks.
    • Paper accepted to New Journal of Physics: “Predicting basin stability of power grids using graph neural networks”. paper
  • 2021
    • Paper accepted at NeurIPS 2021: “Self-Supervised Representation Learning on Neural Network Weights for Model Characteristic Prediction”. proceedings, arxiv, blog, talk, code, data
    • Paper accepted to The Cryosphere (EGU): “Elements of future snowpack modeling – Part 1: A physical instability arising from the nonlinear coupling of transport and phase changes”. paper
  • 2019
    • New position: I’ve joined the AIML Lab at the University of St.Gallen as a researcher and PhD student!
    • Poster accepted at EGU General Assembly 2019: “On water vapor transport in snowpack models: Comparison of existing schemes, numerical requirements and the role of non-local advection”. abstract