
Focus Area: Clinical AI

Faster Machine Unlearning via Natural Gradient Descent

We address the challenge of efficiently and reliably deleting data from machine learning models trained using Empirical Risk Minimization (ERM), a process known as machine unlearning. To avoid retraining models from scratch, we propose a novel algorithm that leverages Natural Gradient Descent (NGD). Our theoretical framework ensures strong privacy guarantees for convex models, and we develop a practical Min/Max optimization algorithm for non-convex models. Comprehensive evaluations show significant improvements in privacy, computational efficiency, and generalization over state-of-the-art methods, advancing both the theory and practice of machine unlearning.
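The full algorithm and its privacy analysis are in the paper; purely as an illustration of the core primitive, the sketch below applies one natural-gradient unlearning step to a logistic model, ascending the loss on the deleted points preconditioned by a damped empirical Fisher matrix of the retained data. The function names, the damping, and the ascent-step formulation are our own simplifications, not the authors' method.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def empirical_fisher(X, y, theta, damping=1e-3):
    # Empirical Fisher: average outer product of per-sample NLL gradients,
    # damped for numerical stability (illustrative choice).
    p = sigmoid(X @ theta)
    G = (p - y)[:, None] * X
    return G.T @ G / len(y) + damping * np.eye(len(theta))

def ngd_unlearn_step(theta, X_ret, y_ret, X_del, y_del, lr=1.0):
    # Ascend the loss on the deleted points, preconditioned by the
    # Fisher information of the retained data (a natural-gradient step).
    g_del = X_del.T @ (sigmoid(X_del @ theta) - y_del) / len(y_del)
    F = empirical_fisher(X_ret, y_ret, theta)
    return theta + lr * np.linalg.solve(F, g_del)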

Contributor: Omri Lev

MicrobioRaman: an open-access web repository for microbiological Raman spectroscopy data

Here we present the establishment of an open-access, web-based repository for microbiological Raman spectroscopy data. The data collection, called ‘MicrobioRaman’ (https://www.ebi.ac.uk/biostudies/MicrobioRaman/studies), was inspired by the great success and usefulness of research databases such as GenBank and UniProt. The centralized repository resides within the BioStudies database, which is maintained by a public institution, the European Bioinformatics Institute. This minimizes the risk of data loss or eventual abandonment and offers a long-term common reference for analysis, with advantages in accessibility and transparency over commercial data analysis tools. We feel that MicrobioRaman will provide a foundation for this growing field by serving as an open-access repository for sharing microbiological Raman data and through the codification of a set of reporting standards.

Contributors: Kang Soo Lee, Zachary Landry, Awais Athar, Uria Alcolombri, Pratchaya Pramoj Na Ayutthaya, David Berry, Philippe de Bettignies, Ji-Xin Cheng, Gabor Csucs, Li Cui, Volker Deckert, Thomas Dieing, Jennifer Dionne, Ondrej Doskocil, Glen D’Souza, Cristina García-Timermans, Notburga Gierlinger, Keisuke Goda, Roland Hatzenpichler, Richard Henshaw, Wei Huang, Ievgeniia Iermak, Natalia Ivleva, Janina Kneipp, Patrick Kubryk, Kirsten Küsel, Tae Kwon Lee, Sung Sik Lee, Bo Ma, Clara Martínez-Pérez, Pavel Matousek, Rainer U. Meckenstock, Wei Min, Peter Mojzeš, Oliver Müller, Naresh Kumar, Per Halkjær Nielsen, Ioan Notingher, Márton Palatinszky, Fátima C. Pereira, Giuseppe Pezzotti, Zdenek Pilat, Filip Plesinger, Jürgen Popp, Alexander Probst, Alessandra Riva, Amr. Saleh, Ota Samek, Haley Sapers, Olga Schubert, Astrid Stubbusch, Gordon Taylor, Michael Wagner, Jing Wang, Huabing Yin, Yang Yue, Renato Zenobi, Jacopo Zini, Ugis Sarkans & Roman Stocker

Sharpness-Aware Minimization (SAM) Improves Classification Accuracy of Bacterial Raman Spectral Data Enabling Portable Diagnostics

Antimicrobial resistance is expected to claim 10 million lives per year by 2050, with resource-limited regions most affected. Raman spectroscopy is a novel pathogen diagnostic approach that promises rapid and portable antibiotic resistance testing within a few hours, compared to days for gold-standard methods. However, current algorithms for Raman spectral analysis 1) are unable to generalize well from limited datasets across diverse patient populations and 2) add complexity by requiring non-trivial pre-processing steps, such as feature extraction, to mitigate the low-quality nature of Raman spectral data. In this work, we address these limitations by using Sharpness-Aware Minimization (SAM) to enhance model generalization across a diverse array of hyperparameters in clinical bacterial isolate classification tasks. We demonstrate that SAM achieves accuracy improvements of up to 10.7% on a single split, and an increase in average accuracy of 2.5% across all splits, in spectral classification tasks over the traditional optimizer, Adam. These results show the capability of SAM to advance the clinical application of AI-powered Raman spectroscopy tools.
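SAM itself is a two-step wrapper around a base optimizer: perturb the weights toward the locally sharpest direction, then descend using the gradient taken at the perturbed point. Below is a minimal sketch with a plain gradient-descent base step (the paper pairs SAM with Adam on spectral classifiers; grad_fn is an assumed callable returning the loss gradient at a given weight vector).

import numpy as np

def sam_step(theta, grad_fn, lr=1e-2, rho=0.05):
    # 1) Perturb the weights toward the locally sharpest nearby point.
    g = grad_fn(theta)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # 2) Descend using the gradient evaluated at the perturbed weights,
    #    which biases training toward flatter minima.
    return theta - lr * grad_fn(theta + eps)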

Contributors: Kaitlin Zareno, Jarett Dewbury, Siamak Sorooshyari, Hossein Mobahi

Prediction-powered Generalization of Causal Inferences

Causal inferences from a randomized controlled trial (RCT) may not pertain to a target population where some effect modifiers have a different distribution. Prior work studies generalizing the results of a trial to a target population for which covariate data, but no outcome data, are available. We show how the limited size of trials makes generalization a statistically infeasible task, as it requires estimating complex nuisance functions. We develop generalization algorithms that supplement the trial data with a prediction model learned from an additional observational study (OS), without making any assumptions on the OS. We theoretically and empirically show that our methods facilitate better generalization when the OS is "high-quality" and remain robust when it is not, e.g., when it suffers from unmeasured confounding.
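As a rough sketch of the prediction-powered idea, the estimator below combines a plug-in term from an OS-trained effect model with a debiasing term built from trial pseudo-outcomes. Here f, pi (the known trial treatment probability), and w (density-ratio weights from trial to target covariates) are our illustrative interface; the authors' estimators and robustness guarantees are more refined than this.

import numpy as np

def pp_generalize(f, X_target, X_trial, A, Y, pi, w):
    # Unbiased per-subject effect pseudo-outcomes from the randomized trial
    # (valid because the treatment probability pi is known by design).
    psi = (A / pi - (1 - A) / (1 - pi)) * Y
    # Plug-in estimate from the OS-trained effect model on target covariates,
    # debiased by its trial residuals reweighted toward the target population.
    return f(X_target).mean() + np.average(psi - f(X_trial), weights=w)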

Contributors: Ilker Demirel, Ahmed Alaa, Anthony Philippakis

Position: Scarce Resource Allocations That Rely On Machine Learning Should Be Randomized

Contrary to traditional deterministic notions of algorithmic fairness, this paper argues that fairly allocating scarce resources using machine learning often requires randomness. We address why, when, and how to randomize by offering a set of stochastic procedures that more adequately account for all of the claims individuals have to allocations of social goods or opportunities, and that effectively balance their interests.
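To make the contrast with deterministic selection concrete, here is one simple stochastic procedure of the general kind argued for: a claim-weighted lottery that replaces a deterministic top-k rule. This instantiation is our own illustration, not a procedure taken from the paper.

import numpy as np

def weighted_lottery(claims, k, seed=None):
    # Select k recipients with probability proportional to each
    # individual's model-estimated claim, rather than picking the
    # top-k scores deterministically.
    rng = np.random.default_rng(seed)
    p = np.asarray(claims, dtype=float)
    return rng.choice(len(p), size=k, replace=False, p=p / p.sum())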

Contributors: Shomik Jain, Kathleen Creel

Mean-field Underdamped Langevin Dynamics and its Spacetime Discretization

We propose a new method called the N-particle underdamped Langevin algorithm for optimizing a special class of non-linear functionals defined over the space of probability measures. Examples of problems with this formulation include training mean-field neural networks, maximum mean discrepancy minimization and kernel Stein discrepancy minimization. Our algorithm is based on a novel spacetime discretization of the mean-field underdamped Langevin dynamics, for which we provide a new, fast mixing guarantee. In addition, we demonstrate that our algorithm converges globally in total variation distance, bridging the theoretical gap between the dynamics and its practical implementation.
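For intuition, the dynamics couple positions and velocities: velocities feel friction, the mean-field drift, and Gaussian noise, while positions integrate the velocities. The naive Euler-Maruyama step below only conveys this structure; the paper's contribution is a sharper spacetime discretization with a fast mixing guarantee, and drift(X) is an assumed callable returning the drift evaluated at the empirical measure of the N particles.

import numpy as np

def underdamped_step(X, V, drift, h=1e-2, gamma=1.0, rng=None):
    # One naive Euler-Maruyama step of the N-particle underdamped dynamics.
    # X, V: (N, d) arrays of particle positions and velocities.
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.standard_normal(V.shape)
    V = V - h * (gamma * V + drift(X)) + np.sqrt(2.0 * gamma * h) * noise
    X = X + h * V
    return X, V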

Contributor: Qiang Fu

Measuring Stochastic Data Complexity with Boltzmann Influence Functions

Estimating the uncertainty of a model’s prediction on a test point is a crucial part of ensuring reliability and calibration under distribution shifts. A minimum description length approach to this problem uses the predictive normalized maximum likelihood (pNML) distribution, which considers every possible label for a data point, and decreases confidence in a prediction if other labels are also consistent with the model and training data. In this work we propose IF-COMP, a scalable and efficient approximation of the pNML distribution that linearizes the model with a temperature-scaled Boltzmann influence function. IF-COMP can be used to produce well-calibrated predictions on test points as well as measure complexity in both labelled and unlabelled settings. We experimentally validate IF-COMP on uncertainty calibration, mislabel detection, and OOD detection tasks, where it consistently matches or beats strong baseline methods.
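For intuition, here is the exact pNML computation that IF-COMP is built to approximate without retraining; it is tractable only for tiny models, since it refits once per candidate label. fit is an assumed training routine returning a model with a scikit-learn-style predict_proba, and labels are assumed to be 0..num_labels-1.

import numpy as np

def pnml(fit, X_train, y_train, x, num_labels):
    # Exact pNML: refit the model once per candidate label for x, score
    # how well each refit model predicts that label, then normalize.
    scores = []
    for y in range(num_labels):
        model = fit(np.vstack([X_train, x[None, :]]), np.append(y_train, y))
        scores.append(model.predict_proba(x[None, :])[0, y])
    scores = np.asarray(scores)
    return scores / scores.sum()  # log of the normalizer = complexity/regret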

Contributors: Nathan Hoyen Ng, Roger Baker Grosse

Learning Optimal Projection for Forecast Reconciliation of Hierarchical Time Series

Hierarchical time series forecasting requires not only prediction accuracy but also coherency, i.e., forecasts that add up appropriately across the hierarchy. Recent literature has shown that reconciliation via projection outperforms prior methods such as top-down or bottom-up approaches. Unlike existing work that pre-specifies a projection matrix (e.g., orthogonal), we study the problem of learning the optimal oblique projection from data for coherent forecasting of hierarchical time series. In addition to preserving unbiasedness, oblique projection implicitly accounts for the hierarchy structure and assigns different weights to individual time series, providing significant adaptability over orthogonal projection, which treats base forecast errors equally. We examine two broad classes of projections, namely Euclidean projection and general oblique projections. We propose to model the reconciliation step as a learnable, structured projection layer in the neural forecaster architecture. The proposed approach allows for the efficient learning of the optimal projection in an end-to-end framework where both the neural forecaster and the projection layer are learned simultaneously. An empirical evaluation on real-world hierarchical time series datasets demonstrates the superior performance of the proposed method over existing state-of-the-art approaches.
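As a standalone sketch of the reconciliation step (in the paper it is a structured layer trained end-to-end with the neural forecaster), an oblique projection with a learnable diagonal weight matrix, one assumed parameterization, looks like this:

import numpy as np

def oblique_reconcile(y_hat, S, w):
    # Oblique projection onto the coherent subspace span(S):
    # P = S (S' W S)^{-1} S' W with W = diag(w); P S = S, so unbiased
    # base forecasts stay unbiased. w = ones recovers the orthogonal
    # (Euclidean) projection.
    W = np.diag(w)
    G = np.linalg.solve(S.T @ W @ S, S.T @ W)
    return S @ (G @ y_hat)

# Tiny hierarchy (total = A + B): base forecasts are incoherent (6 + 5 != 10).
S = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
print(oblique_reconcile(np.array([10.0, 6.0, 5.0]), S, np.ones(3)))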

Contributors: Asterios Tsiourvas, Wei Sun, Georgia Perakis, Pin-Yu Chen, Yada Zhu

Overcoming the Optimizer’s Curse: Obtaining Realistic Prescriptions from Neural Networks

We study the problem of obtaining optimal and realistic prescriptions when using ReLU networks for data-driven decision-making. In this setting, the network is used to predict a quantity of interest, and its inputs are then optimized to retrieve the decisions that maximize that quantity (e.g., finding the prices that maximize revenue). However, optimizing over-parameterized models often produces unrealistic prescriptions, far from the data manifold. This phenomenon is known as the Optimizer's Curse. To tackle this problem, we model the requirement that the resulting decisions align with the data manifold as a tractable optimization constraint. This is achieved by reformulating the highly nonlinear Local Outlier Factor (LOF) metric as a single linear or quadratic constraint. To solve the problem efficiently for large networks, we propose an adaptive sampling algorithm that reduces the initial hard-to-solve optimization problem to a small number of significantly easier-to-solve problems by restricting the decision space to realistic polytopes, i.e., polytopes of the decision space that contain at least one realistic data point. Experiments on publicly available networks demonstrate the efficacy and scalability of our approach.
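A minimal sketch of the piecewise-linear structure that such polytope-restricted optimization exploits: on the polytope of inputs sharing a given activation pattern, a ReLU network is an affine function, so each subproblem over a realistic polytope is a small linear or quadratic program once the reformulated LOF constraint is added. The polytope inequalities and the LOF constraint are omitted here; this is our illustration for a one-hidden-layer network, not the paper's full algorithm.

import numpy as np

def local_linear_model(x0, W1, b1, w2, b2):
    # For f(x) = w2 @ relu(W1 @ x + b1) + b2, return (c, d) such that
    # f(x) = c @ x + d on the polytope of inputs sharing x0's
    # activation pattern.
    a = (W1 @ x0 + b1 > 0).astype(float)  # activation pattern at x0
    c = (a * w2) @ W1                     # local linear coefficients
    d = (a * w2) @ b1 + b2                # local offset
    return c, d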

Contributor: Asterios Tsiourvas

Implicit Representations via Operator Learning

The idea of representing a signal as the weights of a neural network, called Implicit Neural Representations (INRs), has led to exciting implications for compression, view synthesis, and 3D volumetric data understanding. One problem in this setting pertains to the use of INRs for downstream processing tasks. Despite some conceptual results, this remains challenging because the INR for a given image/signal often exists in isolation: what does the neighborhood around a given INR correspond to? Based on this question, we offer an operator-theoretic reformulation of the INR model, which we call Operator INR (or O-INR). At a high level, instead of mapping positional encodings to a signal, O-INR maps one function space to another function space. A practical form of this general casting is obtained by appealing to integral transforms. The resultant model does not need the multi-layer perceptrons (MLPs) used in most existing INR models; we show that convolutions are sufficient and offer benefits including numerically stable behavior. We show that O-INR can easily handle most problem settings in the literature and offers a performance profile similar to baselines, with minimal, if any, compromise. Our code is available at https://github.com/vsingh-group/oinr.
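As a hedged illustration of the MLP-free, convolution-only design (our simplified rendering, not the released implementation at the repository above), a forward pass in the O-INR spirit maps a positional-encoding field, e.g., sinusoidal encodings of a coordinate grid, to a signal via stacked discrete convolutions:

import numpy as np
from scipy.signal import convolve2d

def o_inr_forward(pos_field, kernels):
    # pos_field: (H, W, C_in) positional-encoding field; each kernel
    # K: (C_out, C_in, k, k). Stacked convolutions act as a discretized
    # integral transform; no MLP appears anywhere.
    h = pos_field
    for layer, K in enumerate(kernels):
        out = np.zeros(h.shape[:2] + (K.shape[0],))
        for o in range(K.shape[0]):
            for i in range(h.shape[2]):
                out[..., o] += convolve2d(h[..., i], K[o, i], mode="same")
        # Simple nonlinearity between transforms, identity on the last layer.
        h = out if layer == len(kernels) - 1 else np.maximum(out, 0.0)
    return h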

Contributors: Sourav Pal, Harshavardhan Adepu, Clinton Wang, Vikas Singh