Institut für Optische Systeme - IOS
The eFlow project, on which HTWG Konstanz, among others, has been conducting research since 2012, uses a mathematical simulation to model how crowds behave when they have to leave a given site. The simulation is based on a finite element approach in which several coupled differential equations must be solved. These computations prove to be very expensive, particularly for complex scenarios with large sites and many people. The aim of this bachelor thesis is to build a surrogate model that predicts the results of the simulation using machine learning approaches, specifically regression methods. This requires generating datasets. They are produced by repeated simulation runs in which the input parameters intended for the regression model are varied and paired with the corresponding simulation result. The regression approaches become more complex with each iteration, as additional input parameters are included in the data generation. The thesis examines whether the simulation can be reproduced by machine learning approaches. Based on these surrogate models, it should become possible to assess situations in real time without resorting to the computationally expensive simulation. The results confirm that the mathematical simulation can be reproduced by regression. However, collecting enough data to include a sufficient number of input parameters in the regression method proves to be very computationally expensive. This thesis therefore serves as a preliminary study for a mature surrogate model that can take all input parameters of the simulation into account.
Random matrices are used to filter the center of gravity (CoG) and the covariance matrix of measurements. However, these quantities do not always correspond directly to the position and the extent of the object, e.g. when a lidar sensor is used. In this paper, we propose a Gaussian process regression model (GPRM) to predict the position and extent of the object from the filtered CoG and covariance matrix of the measurements. Training data for the GPRM are generated by a sampling method and a virtual measurement model (VMM). The VMM is a function that generates artificial measurements using ray tracing and allows us to obtain the CoG and covariance matrix that any object would cause. This enables the GPRM to be trained without real data and still be applied to real data, owing to the precise modeling in the VMM. The results show an accurate extent estimation as long as reality matches the model, e.g. as long as lidar measurements occur only on the side facing the sensor.
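The gap between the measurement statistics and the true object geometry can be illustrated with a minimal sketch. The scene below (a rectangular object centred at (10, 0) with only its near edge visible to a sensor at the origin) and the function name `cog_and_covariance` are assumptions made for illustration; they are not taken from the paper.

```python
def cog_and_covariance(points):
    """Return the centre of gravity and the 2x2 sample covariance of 2D points."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two measurements")
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))


# Hypothetical scene: a 2 m x 4 m object centred at (10, 0), sensor at the origin.
# Only the near edge (x = 9) returns lidar measurements.
near_edge = [(9.0, -2.0), (9.0, -1.0), (9.0, 0.0), (9.0, 1.0), (9.0, 2.0)]
cog, cov = cog_and_covariance(near_edge)
print(cog)  # CoG lies on the near edge, one metre in front of the true centre
print(cov)  # no spread along x, so the covariance alone hides the object's depth
```

In this toy case the CoG is (9, 0) rather than the object centre (10, 0), and the variance along x is zero, which is the kind of mismatch a learned mapping from CoG and covariance to position and extent would have to correct.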
Motion estimation is an essential element for autonomous vessels. It is used, for example, for lidar motion compensation as well as for mapping and detection tasks in a maritime environment. Because gyroscopes alone are not reliable and a high-performance inertial measurement unit is quite expensive, we present an approach for visual pitch and roll estimation that uses a convolutional neural network for water segmentation, a stereo system for reconstruction, and simple geometry to estimate pitch and roll. The algorithm is validated on a novel, publicly available dataset recorded at Lake Constance. Our experiments show that the pitch and roll estimator provides accurate results in comparison to an Xsens IMU sensor. The pitch and roll estimation can be further improved by sensor fusion with a gyroscope. An implementation of the algorithm is available as a ROS node.
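The geometric step can be sketched as follows, assuming the water surface has already been reconstructed and its upward unit normal is known in the vessel's body frame. The ZYX (yaw, pitch, roll) Euler convention and both function names are assumptions for this illustration; the paper's own conventions may differ.

```python
import math


def water_normal(roll, pitch):
    """Upward water-plane normal as seen in the body frame (ZYX Euler convention)."""
    return (
        -math.sin(pitch),
        math.sin(roll) * math.cos(pitch),
        math.cos(roll) * math.cos(pitch),
    )


def roll_pitch_from_normal(normal):
    """Recover roll and pitch (radians) from the water-plane normal; yaw is unobservable."""
    nx, ny, nz = normal
    roll = math.atan2(ny, nz)
    pitch = math.atan2(-nx, math.hypot(ny, nz))
    return roll, pitch


# Round trip with assumed angles of 5.7 degrees roll and -2.9 degrees pitch.
roll, pitch = roll_pitch_from_normal(water_normal(0.10, -0.05))
print(math.degrees(roll), math.degrees(pitch))
```

The normal of a horizontal plane does not change under rotation about the vertical axis, which is why only pitch and roll can be recovered this way.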
Targetless Lidar-camera registration is a recurring task in many computer vision and robotics applications and requires computing the extrinsic pose of a point cloud with respect to a camera or vice versa. Existing methods based on learning or optimization lack either generalization capabilities or accuracy. Here, we propose a combination of pre-training and optimization using a neural network-based mutual information estimation technique (MINE [1]). This construction allows back-propagating the gradient to the calibration parameters and enables stochastic gradient descent. To enforce the orthogonality constraint on the rotation matrix, we incorporate Lie-group techniques. Furthermore, instead of optimizing on entire images, we operate on local patches that are extracted from the temporally synchronized projected Lidar points and camera frames. Our experiments show that this technique not only improves over existing techniques in terms of accuracy, but also shows considerable generalization capabilities towards new Lidar-camera configurations.
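One common Lie-group device for keeping a rotation matrix orthogonal during gradient-based optimization is to optimize an unconstrained 3-vector and map it to SO(3) with the exponential map. The sketch below shows that map via Rodrigues' formula; it is a generic illustration, and whether the paper uses exactly this parameterization is an assumption.

```python
import math


def so3_exp(w):
    """Map an axis-angle vector w (radians) to a 3x3 rotation matrix (Rodrigues' formula)."""
    wx, wy, wz = w
    theta = math.sqrt(wx * wx + wy * wy + wz * wz)
    # Skew-symmetric matrix K such that K v = w x v.
    K = [[0.0, -wz, wy], [wz, 0.0, -wx], [-wy, wx, 0.0]]
    K2 = [[sum(K[i][k] * K[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
    if theta < 1e-8:
        a, b = 1.0, 0.5  # small-angle limits of the two coefficients
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / (theta * theta)
    return [
        [(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j] for j in range(3)]
        for i in range(3)
    ]


# Any real 3-vector yields a valid rotation, so no explicit constraint is needed.
R = so3_exp((0.3, -0.2, 0.5))
RRt = [[sum(R[i][k] * R[j][k] for k in range(3)) for j in range(3)] for i in range(3)]
print(RRt)  # numerically close to the identity matrix
```

Because every update to `w` stays in the unconstrained vector space, a plain gradient step never leaves the set of valid rotations.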
Image novelty detection is a recurring task in computer vision and describes the detection of anomalous images based on a training dataset consisting solely of normal reference data. Neural networks in particular have been found to be well suited to the task. Our approach first transforms the training and test images into ensembles of patches, which enables the assessment of mean-shifts between normal data and outliers. As mean-shifts are only detectable when the outlier ensemble and inlier distribution are spatially separate from each other, a rich feature space, such as that of a pre-trained neural network, needs to be chosen to represent the extracted patches. For mean-shift estimation, the Hotelling T² test is used. The size of the patches turned out to be a crucial hyperparameter that requires additional domain knowledge about the spatial size of the expected anomalies (local vs. global). This also affects model selection and the chosen feature space, as commonly used Convolutional Neural Networks and Vision Image Transformers have very different receptive field sizes. To showcase the state-of-the-art capabilities of our approach, we compare results with classical and deep learning methods on the popular dataset CIFAR-10, and demonstrate its real-world applicability in a large-scale industrial inspection scenario using the MVTec dataset. Because of its inexpensive design, our method can be implemented by a single additional 2D-convolution and pooling layer and allows particularly fast prediction times while being very data-efficient.
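As a rough illustration of the mean-shift test, the sketch below computes the textbook one-sample Hotelling T² statistic for an ensemble of two-dimensional patch features against a reference mean. The two-dimensional features, the toy values, and the function name `hotelling_t2` are assumptions; the paper works with high-dimensional network features and may use a different variant of the test.

```python
def hotelling_t2(samples, mu0):
    """One-sample Hotelling T^2 statistic for 2D samples against a reference mean mu0."""
    n = len(samples)
    if n < 3:
        raise ValueError("need at least three samples for a 2D covariance estimate")
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in samples) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det <= 0.0:
        raise ValueError("sample covariance is singular")
    dx, dy = mx - mu0[0], my - mu0[1]
    # d^T S^{-1} d, using the closed-form inverse of the 2x2 covariance matrix.
    quad = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return n * quad


# Toy patch features of one test image, compared with two assumed inlier means.
patch_features = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
print(hotelling_t2(patch_features, (0.0, 0.0)))  # no mean shift
print(hotelling_t2(patch_features, (1.0, 0.0)))  # clear mean shift, larger statistic
```

A larger statistic indicates that the patch ensemble's mean lies farther from the reference mean relative to its own spread, which is the quantity an anomaly score would be derived from.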
Lidar sensors are widely used for environmental perception on autonomous robot vehicles (ARV). The field of view (FOV) of Lidar sensors can be reshaped by positioning plane mirrors in their vicinity. Mirror setups can especially improve the FOV for ground detection of ARVs with 2D-Lidar sensors. This paper presents an overview of several geometric designs and their strengths for certain vehicle types. Additionally, a new and easy-to-implement calibration procedure for setups of 2D-Lidar sensors with mirrors is presented to determine precise mirror orientations and positions, using a single flat calibration object with a pre-aligned simple fiducial marker. Measurement data from a prototype vehicle with a 2D-Lidar with a 2 m range using this new calibration procedure are presented. We show that the calibrated mirror orientations are accurate to less than 0.6° in this short range, which is a significant improvement over the orientation angles taken directly from the CAD. The accuracy of the point cloud data improved, and no significant decrease in distance noise was introduced. We deduced general guidelines for successful calibration setups using our method. In conclusion, a 2D-Lidar sensor and two plane mirrors calibrated with this method are a cost-effective and accurate way for robot engineers to improve the environmental perception of ARVs.
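Why precise mirror orientations and positions matter can be seen from the basic geometry: a beam deflected by a plane mirror produces an apparent point behind the mirror, and the true point is obtained by reflecting it across the mirror plane. The sketch below shows only this standard reflection step; the mirror pose values and the function name are hypothetical, and this is not the calibration procedure presented in the paper.

```python
import math


def reflect_across_mirror(point, mirror_point, mirror_normal):
    """Reflect a 3D point across the plane through mirror_point with normal mirror_normal."""
    norm = math.sqrt(sum(c * c for c in mirror_normal))
    if norm == 0.0:
        raise ValueError("mirror normal must be non-zero")
    n = [c / norm for c in mirror_normal]
    # Signed distance of the point from the mirror plane.
    dist = sum((p - q) * c for p, q, c in zip(point, mirror_point, n))
    return tuple(p - 2.0 * dist * c for p, c in zip(point, n))


# Hypothetical mirror in the plane x = 1 and an apparent lidar point behind it.
apparent = (3.0, 0.5, 0.0)
true_point = reflect_across_mirror(apparent, (1.0, 0.0, 0.0), (1.0, 0.0, 0.0))
print(true_point)  # (-1.0, 0.5, 0.0)
```

An error in the assumed mirror normal tilts every reflected point, and the resulting position error grows with the measured distance, so even small orientation errors are visible in the point cloud.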
We analyse the results of a finite element simulation of a macroscopic model that describes the movement of a crowd considered as a continuum. A new formulation based on the macroscopic model from Hughes [2] is given. We present a stable numerical algorithm obtained by approximating with a viscosity solution. The fundamental setting is an arbitrary domain that can contain several obstacles and several entrances and must have at least one exit. All pedestrians aim to leave the room as quickly as possible. Nobody prefers a particular exit.
Anyone who has ever stood tightly packed in front of a concert stage can easily imagine how hopeless the situation becomes when the mood turns and panic breaks out. It is very important to design and plan spaces and events that are temporarily visited by very large numbers of people so that maximum safety is ensured. For a public event to run smoothly, thorough planning, that is, high-quality crowd management, is indispensable.
The question "What do we need this for?" from students, or statements such as "I never needed that again in my job." from former students, are very familiar to most mathematics lecturers. The BiLeSA project addresses the wish for more practical relevance in mathematics teaching with a smartphone app that makes selected mathematical topics visible by means of digital images. The selected topics are (affine) linear maps, derivatives in higher spatial dimensions, and powers of complex numbers. The learning object was designed using the design-based research (DBR) approach, which was conceived and developed in the base project of the IBH lab "Seamless Learning".
Interpretability and uncertainty modeling are key factors for medical applications. Moreover, data in medicine are often available as a combination of unstructured data such as images and structured predictors such as patient metadata. While deep learning (DL) models are state-of-the-art for image classification, they are often referred to as a "black box" because they lack interpretability. Moreover, DL models often yield point predictions and are overconfident in their parameter estimates and outcome predictions.
On the other hand, statistical regression models make it possible to obtain interpretable predictor effects and to capture parameter and model uncertainty based on the Bayesian approach. In this thesis, a publicly available melanoma dataset, consisting of images of skin lesions and the patients' age, is used to predict the melanoma types with a semi-structured model, while interpretable components and model uncertainty are quantified. For the Bayesian models, the transformation model-based variational inference (TM-VI) method is used to determine the posterior distribution of the parameters. Several model configurations using the patient's age and/or the skin lesion image were implemented and evaluated. Predictive performance was best for a combined model of image and patient's age, which also provides an interpretable posterior distribution of the regression coefficient. In addition, integrating uncertainty in the image and tabular parts results in larger variability of the outputs, corresponding to high uncertainty of the individual model components.