OPUS 4 | Search

Bernstein flows for flexible posteriors in variational Bayes (2024)

Dürr, Oliver ; Hörtling, Stefan ; Dold, Daniel ; Kovylov, Ivonne ; Sick, Beate

Black-box variational inference (BBVI) is a technique to approximate the posterior of Bayesian models by optimization. Similar to MCMC, the user only needs to specify the model; then, the inference procedure is done automatically. In contrast to MCMC, BBVI scales to many observations, is faster for some applications, and can take advantage of highly optimized deep learning frameworks since it can be formulated as a minimization task. In the case of complex posteriors, however, other state-of-the-art BBVI approaches often yield unsatisfactory posterior approximations. This paper presents Bernstein flow variational inference (BF-VI), a robust and easy-to-use method flexible enough to approximate complex multivariate posteriors. BF-VI combines ideas from normalizing flows and Bernstein polynomial-based transformation models. In benchmark experiments, we compare BF-VI solutions with exact posteriors, MCMC solutions, and state-of-the-art BBVI methods, including normalizing flow-based BBVI. We show for low-dimensional models that BF-VI accurately approximates the true posterior; in higher-dimensional models, BF-VI compares favorably against other BBVI methods. Further, using BF-VI, we develop a Bayesian model for the semi-structured melanoma challenge data, combining a CNN model part for image data with an interpretable model part for tabular data, and demonstrate, for the first time, the use of BBVI in semi-structured models.

Bayesian Calibration of MEMS Accelerometers (2023)

Dürr, Oliver ; Fan, Po-Yu ; Yin, Zong-Xian

This study aims to investigate the utilization of Bayesian techniques for the calibration of micro-electro-mechanical system (MEMS) accelerometers. These devices have garnered substantial interest in various practical applications and typically require calibration through error-correcting functions. The parameters of these error-correcting functions are determined during a calibration process. However, due to various sources of noise, these parameters cannot be determined with precision, making it desirable to incorporate uncertainty in the calibration models. Bayesian modeling offers a natural and complete way of reflecting uncertainty by treating the model parameters as variables rather than fixed values. In addition, Bayesian modeling enables the incorporation of prior knowledge, making it an ideal choice for calibration. Nevertheless, it is infrequently used in sensor calibration. This study introduces Bayesian methods for the calibration of MEMS accelerometer data in a straightforward manner using recent advances in probabilistic programming.

Deep transformation models for functional outcome prediction after acute ischemic stroke (2023)

Herzog, Lisa ; Kook, Lucas ; Götschi, Andrea ; Petermann, Katrin ; Hänsel, Martin ; Hamann, Janne ; Dürr, Oliver ; Wegener, Susanne ; Sick, Beate

Novel AI-Based Algorithm for the Automated Computation of Coronal Parameters in Adolescent Idiopathic Scoliosis Patients (2023)

Berlin, Clara ; Adomeit, Sonja ; Grover, Priyanka ; Dreischarf, Marcel ; Halm, Henry ; Dürr, Oliver ; Obid, Peter

Study design: Retrospective, mono-centric cohort research study. Objectives: The purpose of this study is to validate a novel artificial intelligence (AI)-based algorithm against human-generated ground truth for radiographic parameters of adolescent idiopathic scoliosis (AIS). Methods: An AI-algorithm was developed that is capable of detecting anatomical structures of interest (clavicles, cervical, thoracic, lumbar spine and sacrum) and calculate essential radiographic parameters in AP spine X-rays fully automatically. The evaluated parameters included T1-tilt, clavicle angle (CA), coronal balance (CB), lumbar modifier, and Cobb angles in the proximal thoracic (C-PT), thoracic, and thoracolumbar regions. Measurements from 2 experienced physicians on 100 preoperative AP full spine X-rays of AIS patients were used as ground truth and to evaluate inter-rater and intra-rater reliability. The agreement between human raters and AI was compared by means of single measure Intra-class Correlation Coefficients (ICC; absolute agreement; .75 rated as excellent), mean error and additional statistical metrics. Results: The comparison between human raters resulted in excellent ICC values for intra- (range: .97-1) and inter-rater (.85-.99) reliability. The algorithm was able to determine all parameters in 100% of images with excellent ICC values (.78-.98). Consistently with the human raters, ICC values were typically smallest for C-PT (eg, rater 1A vs AI: .78, mean error: 4.7°) and largest for CB (.96, -.5 mm) as well as CA (.98, .2°). Conclusions: The AI-algorithm shows excellent reliability and agreement with human raters for coronal parameters in preoperative full spine images. The reliability and speed offered by the AI-algorithm could contribute to the efficient analysis of large datasets (eg, registry studies) and measurements in clinical practice.

Bericht Freistellungssemester an der Southern Taiwan University (STUST) in Tainan im Wintersemester 2022/2023 (2023)

Dürr, Oliver

Estimating Conditional Distributions with Neural Networks using R package deeptrafo (2023)

Kook, Lucas ; Baumann, Philipp F. M. ; Dürr, Oliver ; Sick, Beate ; Rügamer, David

Contemporary empirical applications frequently require flexible regression models for complex response types and large tabular or non-tabular, including image or text, data. Classical regression models either break down under the computational load of processing such data or require additional manual feature extraction to make these problems tractable. Here, we present deeptrafo, a package for fitting flexible regression models for conditional distributions using a tensorflow backend with numerous additional processors, such as neural networks, penalties, and smoothing splines. Package deeptrafo implements deep conditional transformation models (DCTMs) for binary, ordinal, count, survival, continuous, and time series responses, potentially with uninformative censoring. Unlike other available methods, DCTMs do not assume a parametric family of distributions for the response. Further, the data analyst may trade off interpretability and flexibility by supplying custom neural network architectures and smoothers for each term in an intuitive formula interface. We demonstrate how to set up, fit, and work with DCTMs for several response types. We further showcase how to construct ensembles of these models, evaluate models using inbuilt cross-validation, and use other convenience functions for DCTMs in several applications. Lastly, we discuss DCTMs in light of other approaches to regression with non-tabular data.

Short-Term Density Forecasting of Low-Voltage Load using Bernstein-Polynomial Normalizing Flows (2023)

Arpogaus, Marcel ; Voss, Marcus ; Sick, Beate ; Nigge-Uricher, Mark ; Dürr, Oliver

The transition to a fully renewable energy grid requires better forecasting of demand at the low-voltage level to increase efficiency and ensure reliable control. However, high fluctuations and increasing electrification cause huge forecast variability, not reflected in traditional point estimates. Probabilistic load forecasts take uncertainties into account and thus allow more informed decision-making for the planning and operation of low-carbon energy systems. We propose an approach for flexible conditional density forecasting of short-term load based on Bernstein polynomial normalizing flows, where a neural network controls the parameters of the flow. In an empirical study with 3639 smart meter customers, our density predictions for 24h-ahead load forecasting compare favorably against Gaussian and Gaussian mixture densities. Furthermore, they outperform a non-parametric approach based on the pinball loss, especially in low-data scenarios.

Deep and interpretable regression models for ordinal outcomes (2022)

Kook, Lucas ; Herzog, Lisa ; Hothorn, Torsten ; Dürr, Oliver ; Sick, Beate

Outcomes with a natural order commonly occur in prediction problems and often the available input data are a mixture of complex data like images and tabular predictors. Deep Learning (DL) models are state-of-the-art for image classification tasks but frequently treat ordinal outcomes as unordered and lack interpretability. In contrast, classical ordinal regression models consider the outcome’s order and yield interpretable predictor effects but are limited to tabular data. We present ordinal neural network transformation models (ontrams), which unite DL with classical ordinal regression approaches. ontrams are a special case of transformation models and trade off flexibility and interpretability by additively decomposing the transformation function into terms for image and tabular data using jointly trained neural networks. The performance of the most flexible ontram is by definition equivalent to a standard multi-class DL model trained with cross-entropy while being faster in training when facing ordinal outcomes. Lastly, we discuss how to interpret model components for both tabular and image data on two publicly available datasets.

Accelerating Active Learning Image Labeling Through Bulk Shift Recommendations (2021)

Scharpf, Philipp ; Hong, Chi Lap ; Dürr, Oliver

Nowadays, the inexpensive memory space promotes an accelerating growth of stored image data. To exploit the data using supervised Machine or Deep Learning, it needs to be labeled. Manually labeling the vast amount of data is time-consuming and expensive, especially if human experts with specific domain knowledge are indispensable. Active learning addresses this shortcoming by querying the user the labels of the most informative images first. One way to obtain the ‘informativeness’ is by using uncertainty sampling as a query strategy, where the system queries those images it is most uncertain about how to classify. In this paper, we present a web-based active learning framework that helps to accelerate the labeling process. After manually labeling some images, the user gets recommendations of further candidates that could potentially be labeled equally (bulk image folder shift). We aim to explore the most efficient ‘uncertainty’ measure to improve the quality of the recommendations such that all images are sorted with a minimum number of user interactions (clicks). We conducted experiments using a manually labeled reference dataset to evaluate different combinations of classifiers and uncertainty measures. The results clearly show the effectiveness of an uncertainty sampling with bulk image shift recommendations (our novel method), which can reduce the number of required clicks to only around 20% compared to manual labeling.

Probabilistic Short-Term Low-Voltage Load Forecasting using Bernstein-Polynomial Normalizing Flows (2021)

Arpogaus, Marcel ; Voß, Marcus ; Sick, Beate ; Nigge-Uricher, Mark ; Dürr, Oliver

The transition to a fully renewable energy grid requires better forecasting of demand at the low-voltage level. However, high fluctuations and increasing electrification cause huge forecast errors with traditional point estimates. Probabilistic load forecasts take future uncertainties into account and thus enables various applications in low-carbon energy systems. We propose an approach for flexible conditional density forecasting of short-term load based on Bernstein-Polynomial Normalizing Flows where a neural network controls the parameters of the flow. In an empirical study with 363 smart meter customers, our density predictions compare favorably against Gaussian and Gaussian mixture densities and also outperform a non-parametric approach based on the pinball loss for 24h-ahead load forecasting for two different neural network architectures.

Author(s)
Title
Additional Person(s)
Referee(s)
Abstract
Fulltext

Open Access

Refine

Author

Year of publication

Document Type

Language

Has Fulltext

Keywords

Institute

22 search hits