Posts by Collection

publications

Decoder ensembling for learned latent geometries

Published in GRaM Workshop @ ICML 2024, 2024

Latent space geometry provides a rigorous and empirically valuable framework for interacting with the latent variables of deep generative models. This approach reinterprets Euclidean latent spaces as Riemannian through a pull-back metric, allowing for a standard differential geometric analysis of the latent space. Unfortunately, data manifolds are generally compact and easily disconnected or filled with holes, suggesting a topological mismatch to the Euclidean latent space. The most established solution to this mismatch is to let uncertainty be a proxy for topology, but in neural network models, this is often realized through crude heuristics that lack principle and generally do not scale to high-dimensional representations. We propose using ensembles of decoders to capture model uncertainty and show how to easily compute geodesics on the associated expected manifold. Empirically, we find this simple and reliable, thereby coming one step closer to easy-to-use latent geometries.

Recommended citation: Syrota, S., Moreno-Muñoz, P. & Hauberg, S. (2024, October). Decoder ensembling for learned latent geometries. In Geometry-grounded Representation Learning and Generative Modeling Workshop (GRaM) at ICML 2024 (pp. 277-285). PMLR.

Identifying metric structures of deep latent variable models

Preprint (under review for ICML 2025), 2025

Deep latent variable models learn condensed representations of data that, hopefully, reflect the inner workings of the studied phenomena. Unfortunately, these latent representations are not statistically identifiable, meaning they cannot be uniquely determined. Domain experts, therefore, need to tread carefully when interpreting these. Current solutions limit the lack of identifiability through additional constraints on the latent variable model, e.g. by requiring labeled training data, or by restricting the expressivity of the model. We change the goal: instead of identifying the latent variables, we identify relationships between them such as meaningful distances, angles, and volumes. We prove this is feasible under very mild model conditions and without additional labeled data. We empirically demonstrate that our theory results in more reliable latent distances, offering a principled path forward in extracting trustworthy conclusions from deep latent variable models.

Recommended citation: Syrota, S., Zainchkovskyy, Y., Xi, J., Bloem-Reddy, B., & Hauberg, S. (2025). Identifying metric structures of deep latent variable models. arXiv preprint arXiv:2502.13757.

teaching

Bayesian Machine Learning

Master's level course, Technical University of Denmark, Department of Mathematics and Computer Science, 2023

Description of the course

The course has two aims. The first is to equip students with a deeper theoretical understanding of probabilistic machine learning and to enable them to read and understand the newest research literature in the field. The second is to enable students to discuss probabilistic models for practical problems and to discuss and apply appropriate inference algorithms.

Introduction to Machine Learning and Data Mining

Master's level course, Technical University of Denmark, Department of Mathematics and Computer Science, 2023

Content of the course

Principal component analysis. Similarity measures and summary statistics. Visualization and interpretation of models. Overfitting and generalization. Classification (decision trees, nearest neighbor, naive Bayes, neural networks, and ensemble methods). Linear regression. Clustering (k-means, hierarchical clustering, and mixture models). Association rules. Density estimation and outlier detection. Applications in a broad range of engineering sciences.