Transformation Models for Flexible Posteriors in Variational Bayes (2021)

Hörtling, Stefan ; Dold, Daniel ; Dürr, Oliver ; Sick, Beate

The main challenge in Bayesian models is to determine the posterior for the model parameters. Even in models with only one or a few parameters, the analytical posterior can only be determined in special settings. In Bayesian neural networks, variational inference is widely used to approximate difficult-to-compute posteriors by variational distributions. Usually, Gaussians are used as variational distributions (Gaussian-VI), which limits the quality of the approximation due to their limited flexibility. Transformation models, on the other hand, are flexible enough to fit any distribution. Here we present transformation model-based variational inference (TM-VI) and demonstrate that it allows accurate approximation of complex posteriors in models with one parameter and also works in a mean-field fashion for multi-parameter models like neural networks.

Single Shot MC Dropout Approximation (2020)

Brach, Kai ; Sick, Beate ; Dürr, Oliver

Deep neural networks (DNNs) are known for their high prediction performance, especially in perceptual tasks such as object recognition or autonomous driving. Still, DNNs are prone to yield unreliable predictions when encountering completely new situations without indicating their uncertainty. Bayesian variants of DNNs (BDNNs), such as MC dropout BDNNs, do provide uncertainty measures. However, BDNNs are slow during test time because they rely on a sampling approach. Here we present a single shot MC dropout approximation that preserves the advantages of BDNNs without being slower than a DNN. Our approach is to analytically approximate for each layer in a fully connected network the expected value and the variance of the MC dropout signal. We evaluate our approach on different benchmark datasets and a simulated toy example. We demonstrate that our single shot MC dropout approximation resembles the point estimate and the uncertainty estimate of the predictive distribution that is achieved with an MC approach, while being fast enough for real-time deployments of BDNNs.
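The idea of propagating an expected value and a variance through a layer instead of sampling can be sketched for a single dropout-plus-linear layer. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `dropout_linear_moments` is hypothetical, the layer input is treated as fixed, dropout masks are assumed independent Bernoulli variables without inverse scaling, and covariances between outputs are ignored.

```python
def dropout_linear_moments(x, weights, bias, keep_prob):
    """Mean and variance of W @ (x * mask) + b, with mask_i ~ Bernoulli(keep_prob).

    Hypothetical helper: x is a fixed input vector, weights is a list of rows.
    """
    # Moments of each masked input x_i * mask_i.
    in_mean = [keep_prob * xi for xi in x]
    in_var = [keep_prob * (1.0 - keep_prob) * xi * xi for xi in x]

    out_mean, out_var = [], []
    for row, b in zip(weights, bias):
        # The mean passes through the linear map unchanged.
        out_mean.append(sum(w * m for w, m in zip(row, in_mean)) + b)
        # Independent masks: variances add, weighted by the squared weights.
        out_var.append(sum(w * w * v for w, v in zip(row, in_var)))
    return out_mean, out_var


mean, var = dropout_linear_moments([1.0, 2.0], [[1.0, 1.0]], [0.0], keep_prob=0.5)
print(mean, var)
```

One deterministic pass of this kind replaces many sampled forward passes for that layer. Handling nonlinear activations and deeper networks requires further approximations that this sketch does not cover.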

Short-Term Density Forecasting of Low-Voltage Load using Bernstein-Polynomial Normalizing Flows (2023)

Arpogaus, Marcel ; Voss, Marcus ; Sick, Beate ; Nigge-Uricher, Mark ; Dürr, Oliver

The transition to a fully renewable energy grid requires better forecasting of demand at the low-voltage level to increase efficiency and ensure reliable control. However, high fluctuations and increasing electrification cause huge forecast variability, not reflected in traditional point estimates. Probabilistic load forecasts take uncertainties into account and thus allow more informed decision-making for the planning and operation of low-carbon energy systems. We propose an approach for flexible conditional density forecasting of short-term load based on Bernstein polynomial normalizing flows, where a neural network controls the parameters of the flow. In an empirical study with 3639 smart meter customers, our density predictions for 24h-ahead load forecasting compare favorably against Gaussian and Gaussian mixture densities. Furthermore, they outperform a non-parametric approach based on the pinball loss, especially in low-data scenarios.

Probabilistic Deep Learning (2020)

Dürr, Oliver ; Sick, Beate

Probabilistic Deep Learning is a hands-on guide to the principles that support neural networks. Learn to improve network performance with the right distribution for different data types, and discover Bayesian variants that can state their own uncertainty to increase accuracy. This book provides easy-to-apply code and uses popular frameworks to keep you focused on practical applications.

Novel AI-Based Algorithm for the Automated Computation of Coronal Parameters in Adolescent Idiopathic Scoliosis Patients (2023)

Berlin, Clara ; Adomeit, Sonja ; Grover, Priyanka ; Dreischarf, Marcel ; Halm, Henry ; Dürr, Oliver ; Obid, Peter

Study design: Retrospective, mono-centric cohort research study. Objectives: The purpose of this study is to validate a novel artificial intelligence (AI)-based algorithm against human-generated ground truth for radiographic parameters of adolescent idiopathic scoliosis (AIS). Methods: An AI-algorithm was developed that is capable of detecting anatomical structures of interest (clavicles, cervical, thoracic, lumbar spine and sacrum) and calculating essential radiographic parameters in AP spine X-rays fully automatically. The evaluated parameters included T1-tilt, clavicle angle (CA), coronal balance (CB), lumbar modifier, and Cobb angles in the proximal thoracic (C-PT), thoracic, and thoracolumbar regions. Measurements from 2 experienced physicians on 100 preoperative AP full spine X-rays of AIS patients were used as ground truth and to evaluate inter-rater and intra-rater reliability. The agreement between human raters and AI was compared by means of single measure Intra-class Correlation Coefficients (ICC; absolute agreement; .75 rated as excellent), mean error and additional statistical metrics. Results: The comparison between human raters resulted in excellent ICC values for intra- (range: .97-1) and inter-rater (.85-.99) reliability. The algorithm was able to determine all parameters in 100% of images with excellent ICC values (.78-.98). Consistent with the human raters, ICC values were typically smallest for C-PT (eg, rater 1A vs AI: .78, mean error: 4.7°) and largest for CB (.96, -.5 mm) as well as CA (.98, .2°). Conclusions: The AI-algorithm shows excellent reliability and agreement with human raters for coronal parameters in preoperative full spine images. The reliability and speed offered by the AI-algorithm could contribute to the efficient analysis of large datasets (eg, registry studies) and measurements in clinical practice.

Know when you don't know (2018)

Dürr, Oliver ; Murina, Elvis ; Siegismund, Daniel ; Tolkachev, Vasily ; Steigele, Stephan ; Sick, Beate

Deep convolutional neural networks show outstanding performance in image-based phenotype classification given that all existing phenotypes are presented during the training of the network. However, in real-world high-content screening (HCS) experiments, it is often impossible to know all phenotypes in advance. Moreover, novel phenotype discovery itself can be an HCS outcome of interest. This aspect of HCS is not yet covered by classical deep learning approaches. When presenting an image with a novel phenotype to a trained network, it fails to indicate a novelty discovery but assigns the image to a wrong phenotype. To tackle this problem and address the need for novelty detection, we use a recently developed Bayesian approach for deep neural networks called Monte Carlo (MC) dropout to define different uncertainty measures for each phenotype prediction. With real HCS data, we show that these uncertainty measures allow us to identify novel or unclear phenotypes. In addition, we also found that the MC dropout method results in a significant improvement of classification accuracy. The proposed procedure used in our HCS case study can be easily transferred to any existing network architecture and will be beneficial in terms of accuracy and novelty detection.
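How an uncertainty measure can be derived from MC dropout predictions may be easier to see with a small example. The sketch below assumes the class probabilities of several stochastic forward passes are already available as plain lists; the function name `mc_dropout_summary` is hypothetical, and the predictive entropy shown here is only one common choice, not necessarily one of the measures used in the paper.

```python
import math


def mc_dropout_summary(prob_samples):
    """Summarize T stochastic forward passes for one image.

    prob_samples: list of T lists, each holding the class probabilities
    from one forward pass with dropout active (hypothetical input format).
    """
    n_passes = len(prob_samples)
    n_classes = len(prob_samples[0])
    # Average the class probabilities over all passes.
    mean_probs = [
        sum(sample[c] for sample in prob_samples) / n_passes
        for c in range(n_classes)
    ]
    # Predictive entropy of the averaged distribution (0 means fully certain).
    entropy = -sum(p * math.log(p) for p in mean_probs if p > 0.0)
    predicted = max(range(n_classes), key=lambda c: mean_probs[c])
    return predicted, mean_probs, entropy


# Passes that agree give low entropy; passes that disagree give high entropy.
print(mc_dropout_summary([[0.9, 0.1], [0.7, 0.3]]))
print(mc_dropout_summary([[1.0, 0.0], [0.0, 1.0]]))
```

An image whose entropy exceeds a chosen threshold could then be flagged as a candidate for a novel or unclear phenotype. The threshold itself would have to be tuned on held-out data.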

Integrating uncertainty in deep neural networks for MRI based stroke analysis (2020)

Herzog, Lisa ; Murina, Elvis ; Dürr, Oliver ; Wegener, Susanne ; Sick, Beate

At present, the majority of the proposed Deep Learning (DL) methods provide point predictions without quantifying the model's uncertainty. However, a quantification of the reliability of automated image analysis is essential, in particular in medicine when physicians rely on the results for making critical treatment decisions. In this work, we provide an entire framework to diagnose ischemic stroke patients incorporating Bayesian uncertainty into the analysis procedure. We present a Bayesian Convolutional Neural Network (CNN) yielding a probability for a stroke lesion on 2D Magnetic Resonance (MR) images with corresponding uncertainty information about the reliability of the prediction. For patient-level diagnoses, different aggregation methods are proposed and evaluated, which combine the individual image-level predictions. Those methods take advantage of the uncertainty in the image predictions and report model uncertainty at the patient-level. In a cohort of 511 patients, our Bayesian CNN achieved an accuracy of 95.33% at the image-level, representing a significant improvement of 2% over a non-Bayesian counterpart. The best patient aggregation method yielded an accuracy of 95.89%. Integrating uncertainty information about image predictions in aggregation models resulted in higher uncertainty measures for false patient classifications, which made it possible to filter out critical patient diagnoses that should be examined more closely by a medical doctor. We therefore recommend using Bayesian approaches not only for improved image-level prediction and uncertainty estimation but also for the detection of uncertain aggregations at the patient-level.

Developing deep learning applications for life science and pharma industry (2018)

Siegismund, Daniel ; Tolkachev, Vasily ; Heyse, Stephan ; Sick, Beate ; Dürr, Oliver ; Steigele, Stephan

Deep transformation models for functional outcome prediction after acute ischemic stroke (2023)

Herzog, Lisa ; Kook, Lucas ; Götschi, Andrea ; Petermann, Katrin ; Hänsel, Martin ; Hamann, Janne ; Dürr, Oliver ; Wegener, Susanne ; Sick, Beate

Deep transformation models (2021)

Sick, Beate ; Hathorn, Torsten ; Dürr, Oliver

We present a deep transformation model for probabilistic regression. Deep learning is known for outstandingly accurate predictions on complex data, but in regression tasks it is predominantly used to just predict a single number. This ignores the non-deterministic character of most tasks. Especially if crucial decisions are based on the predictions, as in medical applications, it is essential to quantify the prediction uncertainty. The presented deep learning transformation model estimates the whole conditional probability distribution, which is the most thorough way to capture uncertainty about the outcome. We combine ideas from a statistical transformation model (most likely transformation) with recent transformation models from deep learning (normalizing flows) to predict complex outcome distributions. The core of the method is a parameterized transformation function which can be trained with the usual maximum likelihood framework using gradient descent. The method can be combined with existing deep learning architectures. For small machine learning benchmark datasets, we report state-of-the-art performance for most datasets and in some cases even outperform it. Our method works for complex input data, which we demonstrate by employing a CNN architecture on image data.
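A parameterized transformation function trained by maximum likelihood can be illustrated with a Bernstein polynomial and the change-of-variables formula. The following is a minimal sketch under several assumptions, not the model from the paper: the outcome `y` is already scaled to [0, 1], the coefficients `theta` are fixed and increasing (which makes the transformation monotone), the base distribution is a standard normal, and all function names are hypothetical. In a deep transformation model the coefficients would be produced by a neural network from the input.

```python
import math


def bernstein_basis(k, degree, y):
    """Bernstein basis polynomial B_{k,degree}(y) for y in [0, 1]."""
    return math.comb(degree, k) * y**k * (1.0 - y) ** (degree - k)


def transform(theta, y):
    """h(y) = sum_k theta_k * B_{k,M}(y), with M = len(theta) - 1."""
    degree = len(theta) - 1
    return sum(t * bernstein_basis(k, degree, y) for k, t in enumerate(theta))


def transform_derivative(theta, y):
    """h'(y) = M * sum_k (theta_{k+1} - theta_k) * B_{k,M-1}(y)."""
    degree = len(theta) - 1
    return degree * sum(
        (theta[k + 1] - theta[k]) * bernstein_basis(k, degree - 1, y)
        for k in range(degree)
    )


def log_density(theta, y):
    """Change of variables: log p(y) = log phi(h(y)) + log h'(y)."""
    z = transform(theta, y)
    log_phi = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
    return log_phi + math.log(transform_derivative(theta, y))


theta = [-2.0, -1.0, 0.0, 1.0, 2.0]  # increasing, so h is monotone
print(transform(theta, 0.5), transform_derivative(theta, 0.5))
print(log_density(theta, 0.5))
```

Summing the negative of `log_density` over the training observations gives the negative log-likelihood that gradient descent would minimize. With equally spaced coefficients, as in this example, the transformation is linear, so the resulting density is Gaussian; unequal spacing yields more flexible shapes.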

16 search hits