Institut für Optische Systeme - IOS
Improving the perception capabilities of robotic systems, and therefore their level of autonomy, is a challenging task, particularly when production costs must be kept to a minimum. To enhance the autonomous capabilities of Autonomous Mobile Robots (AMRs) without increasing production costs, we propose a novel 2D-Lidar mirror combination, with a main focus on the calibration procedure and the resulting performance figures. This approach leverages precise calibration to enhance the robot's perception without the need to add costly sensors such as 3D-Lidars, thereby maintaining affordability while increasing the effectiveness of 2D-Lidar sensors.
Semi-structured regression models enable the joint modeling of interpretable structured and complex unstructured feature effects. The structured model part is inspired by statistical models and can be used to infer the input-output relationship for features of particular importance. The complex unstructured part defines an arbitrary deep neural network and thereby provides enough flexibility to achieve competitive prediction performance. While these models can also account for aleatoric uncertainty, there is still a lack of work on accounting for epistemic uncertainty. In this paper, we address this problem by presenting a Bayesian approximation for semi-structured regression models using subspace inference. To this end, we extend subspace inference for joint posterior sampling from a full parameter space for structured effects and a subspace for unstructured effects. Apart from this hybrid sampling scheme, our method allows for tunable complexity of the subspace and can capture multiple minima in the loss landscape. Numerical experiments validate our approach's efficacy in recovering structured effect parameter posteriors in semi-structured models and approaching the full-space posterior distribution of MCMC for increasing subspace dimension. Further, our approach exhibits competitive predictive performance across simulated and real-world datasets.
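The hybrid sampling scheme described above can be sketched in a few lines: unstructured network weights are only reached through a low-dimensional subspace around a trained solution, while structured effects keep their full parameter space. This is a minimal illustration; the function names and the plain-list linear algebra are ours, not the paper's API.

```python
def subspace_to_weights(w_hat, P, z):
    """Map a low-dimensional subspace sample z back to full weight space:
    w = w_hat + P z, where w_hat is a trained solution and the columns of P
    span the subspace (e.g. obtained from PCA of SGD iterates)."""
    return [wi + sum(P[i][j] * z[j] for j in range(len(z)))
            for i, wi in enumerate(w_hat)]

def hybrid_draw(theta_structured, w_hat, P, z):
    """One joint posterior draw in the hybrid scheme: structured effects are
    sampled in their full parameter space, unstructured weights only via z."""
    return theta_structured, subspace_to_weights(w_hat, P, z)
```

The subspace dimension (length of `z`) is the tunable complexity knob: a larger subspace can reach more of the loss landscape, including multiple minima.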
In extended object tracking, random matrices are commonly used to filter the mean and covariance matrix from measurement data. However, the relation from the mean and covariance matrix to the extension parameters can become challenging when a lidar sensor is used. To address this, we propose virtual measurement models that estimate those parameters iteratively by adapting them until the statistical moments of the measurements they would cause match the random-matrix result. While previous work has focused on 2D shapes, this paper extends the methodology to 3D shapes such as cones, ellipsoids, and rectangular cuboids. Additionally, we introduce a classification method based on Chamfer distances for identifying the best-fitting shape when the object's shape is unknown. Our approach is evaluated through simulation studies and with real lidar data from maritime scenarios. The results indicate that a cone is the best representation for sailing boats, while ellipsoids are optimal for motorboats.
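The Chamfer-distance classification step could look roughly like this (an illustrative sketch; `chamfer_distance` and `classify_shape` are hypothetical names, and the candidate shapes are assumed to be given as sampled surface points):

```python
import math

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two 3D point sets: for each point,
    the distance to its nearest neighbour in the other set, averaged both ways."""
    def nearest(p, pts):
        return min(math.dist(p, q) for q in pts)
    return (sum(nearest(p, b) for p in a) / len(a)
            + sum(nearest(q, a) for q in b) / len(b))

def classify_shape(points, candidates):
    """Return the name of the candidate shape (name -> sampled surface points)
    whose surface is closest, in Chamfer distance, to the measured point cloud."""
    return min(candidates, key=lambda name: chamfer_distance(points, candidates[name]))
```

In practice the candidate point sets would be sampled from the fitted cone, ellipsoid, and cuboid surfaces, and the measured lidar points compared against each.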
During a sabbatical in the 2024 summer semester, I was able to devote myself entirely to one research topic: compressed sensing. This is a modern data-acquisition methodology that falls within my professorship area of sensor technology and measurement engineering at HTWG Konstanz.
At the beginning of the semester, I familiarized myself with this field of research. To this end, I searched for textbooks, scientific research articles, and review papers, acquired some of them, and worked through them.
After this extensive literature review, I evaluated several software libraries for this methodology. I installed a number of these libraries and evaluated them with application examples. I studied a concrete industrial measurement procedure for lidar sensors in depth using the "finite rate of innovation" method and developed a concept for a research proposal from it.
At the end of the semester, I contacted several companies in the region and clarified whether there was interest in this research topic. I visited the interested companies on site, presented my research topic there, and discussed possible industrial applications and research collaborations.
Autonomous navigation on inland waters requires an accurate understanding of the environment in order to react to possible obstacles. Deep learning is a promising technique for detecting obstacles robustly. However, supervised deep learning models require large datasets to adjust their weights and to generalize to unseen data. Therefore, we equipped our research vessel with a laser scanner and a stereo camera to record a novel obstacle detection dataset for inland waters. We annotated 1974 stereo images and lidar point clouds with 3D bounding boxes. Furthermore, we provide an initial approach and a suitable metric to compare results on the test dataset. The dataset is publicly available and aims to contribute to increasing safety on inland waters.
Black-box variational inference (BBVI) is a technique to approximate the posterior of Bayesian models by optimization. Similar to MCMC, the user only needs to specify the model; then, the inference procedure is done automatically. In contrast to MCMC, BBVI scales to many observations, is faster for some applications, and can take advantage of highly optimized deep learning frameworks since it can be formulated as a minimization task. In the case of complex posteriors, however, other state-of-the-art BBVI approaches often yield unsatisfactory posterior approximations. This paper presents Bernstein flow variational inference (BF-VI), a robust and easy-to-use method flexible enough to approximate complex multivariate posteriors. BF-VI combines ideas from normalizing flows and Bernstein polynomial-based transformation models. In benchmark experiments, we compare BF-VI solutions with exact posteriors, MCMC solutions, and state-of-the-art BBVI methods, including normalizing flow-based BBVI. We show for low-dimensional models that BF-VI accurately approximates the true posterior; in higher-dimensional models, BF-VI compares favorably against other BBVI methods. Further, using BF-VI, we develop a Bayesian model for the semi-structured melanoma challenge data, combining a CNN model part for image data with an interpretable model part for tabular data, and demonstrate, for the first time, the use of BBVI in semi-structured models.
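The building block that distinguishes BF-VI from plain normalizing flows, a monotone Bernstein-polynomial transformation, can be sketched as follows. This is a minimal scalar version under the assumption that the coefficient vector `theta` is non-decreasing (which guarantees monotonicity, hence invertibility); it is not the authors' implementation.

```python
import math

def bernstein_transform(z, theta):
    """Monotone Bernstein-polynomial transformation
    h(z) = sum_k theta_k * C(M, k) * z^k * (1 - z)^(M - k),  z in [0, 1].
    With non-decreasing theta, h is monotone and maps [0, 1] to
    [theta[0], theta[-1]], making it usable as a flow-style bijection."""
    M = len(theta) - 1
    return sum(t * math.comb(M, k) * z**k * (1 - z)**(M - k)
               for k, t in enumerate(theta))
```

Increasing the number of coefficients makes the transformation, and hence the approximable posterior family, more flexible without losing monotonicity.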
Incremental one-class learning using regularized null-space training for industrial defect detection
(2024)
One-class incremental learning is a special case of class-incremental learning, where only a single novel class is incrementally added to an existing classifier instead of multiple classes. This case is relevant in industrial defect detection scenarios, where novel defects usually appear during operation. Existing rolled-out classifiers must be updated incrementally in this scenario with only a few novel examples. In addition, it is often required that the base classifier must not be altered due to approval and warranty restrictions. While simple fine-tuning often gives the best performance across old and new classes, it comes with the drawback of potentially losing performance on the base classes (catastrophic forgetting [1]). Simple prototype approaches [2] work without changing existing weights and perform very well when the classes are well separated, but fail dramatically when they are not. In theory, null-space training (NSCL) [3] should retain the base classifier entirely, as parameter updates are restricted to the null space of the network with respect to existing classes. However, as we show, this technique promotes overfitting in the case of one-class incremental learning. In our experiments, we found that unconstrained weight growth in the null space is the underlying issue, leading us to propose a regularization term (R-NSCL) that penalizes the magnitude of amplification. The regularization term is added to the standard classification loss and stabilizes null-space training in the one-class scenario by counteracting overfitting. We test the method's capabilities on two industrial datasets, namely AITEX and MVTec, and compare the performance to state-of-the-art algorithms for class-incremental learning.
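The regularized null-space update described above might be sketched as follows for the simplest case of a single base-class feature direction (hypothetical names and a toy one-vector setting; the actual method operates on network layers and adds the penalty to the classification loss):

```python
def null_space_projector(f):
    """Projector P = I - f f^T / ||f||^2 onto the null space of a single
    base-class feature direction f: any step P @ g leaves f . w unchanged,
    so the base classifier's response is preserved."""
    n = len(f)
    norm2 = sum(x * x for x in f)
    return [[(1.0 if i == j else 0.0) - f[i] * f[j] / norm2 for j in range(n)]
            for i in range(n)]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def regularized_null_space_step(w, grad, P, lr=0.1, reg=0.01):
    """One R-NSCL-style update (sketch): the gradient step is restricted to the
    null space, and the weight component inside the null space is additionally
    shrunk, counteracting the unconstrained growth that causes overfitting."""
    step = matvec(P, grad)      # null-space-restricted gradient
    shrink = matvec(P, w)       # penalized null-space weight magnitude
    return [wi - lr * s - reg * h for wi, s, h in zip(w, step, shrink)]
```

With `reg=0` this reduces to plain null-space training; the shrinkage term is what stabilizes the one-class case.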
With multi-camera matching techniques for 3D reconstruction, there is usually a trade-off between the quality of the computed depth map and the speed of the computation. Whereas high-quality matching methods take several seconds to several minutes to compute a depth map for one set of images, real-time methods achieve only low-quality results. In this paper we present a multi-camera matching method that runs in real time and yields high-resolution depth maps. Our method is based on a novel multi-level combination of normalized cross correlation, matching windows deformed according to the multi-level depth-map information, and sub-pixel-precise disparity maps. The whole process is implemented entirely on the GPU. With this approach we can process four 0.7-megapixel images into a full-resolution 3D depth map in 129 milliseconds. Our technique is tailored to the recognition of non-technical shapes, because our target application is face recognition.
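The normalized-cross-correlation core of such a matching scheme can be sketched as follows (a scalar reference version for illustration only; the paper's GPU implementation, deformed multi-level windows, and sub-pixel refinement are not reproduced here):

```python
import math

def ncc(patch_a, patch_b):
    """Normalized cross correlation between two equally sized gray-value
    patches (flat lists). Returns a score in [-1, 1]; 1 means a perfect
    linear match, which makes the score robust to brightness and contrast
    differences between cameras."""
    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    num = sum((a - mean_a) * (b - mean_b) for a, b in zip(patch_a, patch_b))
    den = math.sqrt(sum((a - mean_a) ** 2 for a in patch_a)
                    * sum((b - mean_b) ** 2 for b in patch_b))
    return num / den if den else 0.0

def best_disparity(left_patch, right_patches):
    """Winner-takes-all: pick the disparity whose right-image patch
    maximizes the NCC score against the left-image patch."""
    return max(range(len(right_patches)), key=lambda d: ncc(left_patch, right_patches[d]))
```

A real implementation evaluates this score for every pixel and disparity in parallel, which is why the method maps well onto the GPU.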
The eFlow project, in which HTWG Konstanz among others has been involved since 2012, uses a mathematical simulation to model how crowds behave when they have to leave a given site. The simulation is based on a finite element approach in which several coupled differential equations must be solved. These computations prove to be very expensive, especially for complex scenarios with large sites and many people. The goal of this bachelor thesis is to build a surrogate model that predicts simulation results using machine-learning approaches, specifically regression methods. To this end, datasets must be generated. They are produced by repeated runs of the simulation, in each of which the input parameters that are to enter the regression model are varied and linked to the corresponding simulation result. The regression setups become more complex with each pass, as additional input parameters are included in the data generation. The aim is to verify whether this simulation can be reproduced with machine-learning approaches. Based on these surrogate models, it should become possible to evaluate situations in real time without taking the path of the computationally expensive simulation. The results confirm that the mathematical simulation can be reproduced by regression. However, collecting enough data to feed a sufficient number of input parameters into the regression method proves to be very computationally expensive. This thesis therefore constitutes a preliminary study for the implementation of a mature surrogate model that can take all input parameters of the simulation into account.
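The simplest version of such a regression surrogate, one simulation input parameter mapped to one simulation output by least squares, might look like this (an illustrative sketch, not the thesis code; each `(x, y)` pair stands for one simulation run):

```python
def fit_linear_surrogate(xs, ys):
    """Least-squares linear surrogate y ~ a + b*x fitted to pairs of
    simulation input x and simulation result y. Returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def predict(a, b, x):
    """Real-time surrogate prediction in place of a full simulation run."""
    return a + b * x
```

The thesis's successive passes then correspond to refitting with additional input parameters, i.e. moving from this one-dimensional fit to multivariate regression.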
Random matrices are used to filter the center of gravity (CoG) and the covariance matrix of measurements. However, these quantities do not always correspond directly to the position and the extent of the object, e.g. when a lidar sensor is used. In this paper, we propose a Gaussian process regression model (GPRM) to predict the position and extension of the object from the filtered CoG and covariance matrix of the measurements. Training data for the GPRM are generated by a sampling method and a virtual measurement model (VMM). The VMM is a function that generates artificial measurements using ray tracing, allowing us to obtain the CoG and covariance matrix that any object would cause. This enables the GPRM to be trained without real data yet still be applied to real data, thanks to the precise modeling in the VMM. The results show accurate extension estimation as long as reality behaves like the model, e.g. lidar measurements occur only on the side facing the sensor.
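The two statistics the VMM must produce from its ray-traced artificial measurements, the CoG and the covariance matrix of the point set, can be computed as follows (a minimal sketch with hypothetical function names):

```python
def measurement_moments(points):
    """Center of gravity and (population) covariance matrix of a set of
    2D or 3D measurement points, i.e. the quantities a random-matrix
    filter tracks and the inputs the GPRM is trained on."""
    n = len(points)
    dim = len(points[0])
    cog = [sum(p[d] for p in points) / n for d in range(dim)]
    cov = [[sum((p[i] - cog[i]) * (p[j] - cog[j]) for p in points) / n
            for j in range(dim)] for i in range(dim)]
    return cog, cov
```

Running this on ray-traced hits for a known object yields one (CoG, covariance) → (position, extension) training pair, so the GPRM can be trained entirely on synthetic data.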