When multi-camera matching techniques are used for 3D reconstruction, there is usually a trade-off between the quality of the computed depth map and the speed of the computations. Whereas high-quality matching methods take several seconds to several minutes to compute a depth map for one set of images, real-time methods achieve only low-quality results. In this paper we present a multi-camera matching method that runs in real time and yields high-resolution depth maps. Our method is based on a novel multi-level combination of normalized cross correlation, deformed matching windows based on the multi-level depth map information, and sub-pixel precise disparity maps. The whole process is implemented completely on the GPU. With this approach we can process four 0.7 megapixel images in 129 milliseconds to a full-resolution 3D depth map. Our technique is tailored for the recognition of non-technical shapes, because our target application is face recognition.
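As a rough illustration of the matching score named in the abstract, the following sketch computes the normalized cross correlation of two matching windows. It is a plain CPU version under simple assumptions (windows given as flat lists of gray values, the hypothetical name `ncc`), not the multi-level GPU method described above.

```python
import math


def ncc(window_a, window_b):
    """Normalized cross correlation of two equally sized, flattened windows.

    Returns a value in [-1, 1]; 1 means the windows agree up to a positive
    gain and an offset. Constant windows are mapped to 0.0 by convention
    (an assumption of this sketch).
    """
    if len(window_a) != len(window_b) or not window_a:
        raise ValueError("windows must be non-empty and of equal size")
    mean_a = sum(window_a) / len(window_a)
    mean_b = sum(window_b) / len(window_b)
    dev_a = [a - mean_a for a in window_a]
    dev_b = [b - mean_b for b in window_b]
    numerator = sum(x * y for x, y in zip(dev_a, dev_b))
    denominator = math.sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    return numerator / denominator if denominator > 0 else 0.0
```

Because the score is invariant to gain and offset, it tolerates brightness differences between cameras, which is one common reason for preferring it over a plain sum of squared differences.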
Digital cameras are subject to physical, electronic and optical effects that result in errors and noise in the image. These effects include, for example, a temperature-dependent dark current, read noise, optical vignetting or different sensitivities of individual pixels. The task of a radiometric calibration is to reduce these errors in the image and thus improve the quality of the overall application. In this work we present an algorithm for radiometric calibration based on Gaussian processes. Gaussian processes are a regression method widely used in machine learning that is particularly useful in our context. Gaussian process regression is used to learn a temperature and exposure time dependent mapping from observed gray-scale values to true light intensities for each pixel. Regression models based on the characteristics of single pixels suffer from excessively high runtime and thus are unsuitable for many practical applications. In contrast, a single regression model for an entire image with high spatial resolution leads to a low-quality radiometric calibration, which also limits its practical use. The proposed algorithm is predicated on a partitioning of the pixels such that each pixel partition can be represented by one single regression model without quality loss. Partitioning is done by extracting features from the characteristic of each pixel and using them for lexicographic sorting. Splitting the sorted data into partitions of equal size yields the final partitions, each of which is represented by its partition center. An individual Gaussian process regression and model selection is done for each partition. Calibration is performed by interpolating the gray-scale value of each pixel with the regression model of the respective partition. The experimental comparison of the proposed approach to classical flat field calibration shows a consistently higher reconstruction quality for the same overall number of calibration frames.
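The partitioning step described in the abstract can be sketched as follows. The feature tuples, the function names and the choice of the feature mean as partition center are assumptions made for illustration; the abstract does not specify them.

```python
def partition_pixels(features, n_partitions):
    """Sort pixels lexicographically by feature tuple and split into chunks.

    features: one tuple of numbers per pixel (hypothetical per-pixel features).
    Returns a list of partitions, each a list of pixel indices. If the pixel
    count is not divisible by n_partitions, the last partition is smaller.
    """
    if n_partitions < 1:
        raise ValueError("n_partitions must be positive")
    # Python compares tuples lexicographically, which gives the sort order.
    order = sorted(range(len(features)), key=lambda i: features[i])
    size = -(-len(order) // n_partitions)  # ceiling division
    return [order[start:start + size] for start in range(0, len(order), size)]


def partition_center(features, members):
    """Mean feature vector of a partition (assumed definition of its center)."""
    dims = len(features[members[0]])
    return tuple(
        sum(features[i][d] for i in members) / len(members) for d in range(dims)
    )
```

One regression model would then be fitted per partition center instead of per pixel, which is where the runtime saving comes from.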
Classification of point clouds by different types of geometric primitives is an essential part of the reconstruction process of CAD geometry. We use support vector machines (SVM) to label patches in point clouds with the class labels tori, ellipsoids, spheres, cones, cylinders or planes. For the classification, features based on different geometric properties like point normals, angles, and principal curvatures are used. These geometric features are estimated in the local neighborhood of a point of the point cloud. Computing these geometric features for a random subset of the point cloud yields a feature distribution. Different features are combined to achieve the best classification results. To minimize the time-consuming training phase of SVMs, the geometric features are first evaluated using linear discriminant analysis (LDA).
LDA and SVM are machine learning approaches that require an initial training phase to allow for a subsequent automatic classification of a new data set. For the training phase, point clouds are generated using a simulation of a laser scanning device. Additional noise based on a laser scanner error model is added to the point clouds. The resulting LDA and SVM classifiers are then used to classify geometric primitives in simulated and real laser scanned point clouds.
Compared to other approaches, where all known features are used for classification, we explicitly compare novel against known geometric features to prove their effectiveness.
We present a 3d-laser-scan simulation in virtual reality for creating synthetic scans of CAD models. Consisting of the virtual reality head-mounted display Oculus Rift and the motion controller Razer Hydra our system can be used like common hand-held 3d laser scanners. It supports scanning of triangular meshes as well as b-spline tensor product surfaces based on high performance ray-casting algorithms. While point clouds of known scanning simulations are missing the man-made structure, our approach overcomes this problem by imitating real scanning scenarios. Calculation speed, interactivity and the resulting realistic point clouds are the benefits of this system.
Reconstruction of hand-held laser scanner data is used in industry primarily for reverse engineering. Traditionally, scanning and reconstruction are separate steps. The operator of the laser scanner has no feedback from the reconstruction results. On-line reconstruction of the CAD geometry allows for such an immediate feedback.
We propose a method for on-line segmentation and reconstruction of CAD geometry from a stream of point data based on means that are updated on-line. These means are combined to define complex local geometric properties, e.g., radii and center points of spherical regions. Using means of local scores, planar, cylindrical, and spherical segments are detected and extended robustly with region growing. For the on-line computation of the means we use so-called accumulated means. They allow for on-line insertion and removal of values and merging of means. Our results show that this approach can be performed on-line and is robust to noise. We demonstrate that our method reconstructs spherical, cylindrical, and planar segments on real scan data containing typical errors caused by hand-held laser scanners.
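One plausible way to realize a mean that supports insertion, removal and merging is to store a running count and sum. The class below is a minimal sketch under that assumption; the name `AccumulatedMean` and its interface are hypothetical and not taken from the paper.

```python
class AccumulatedMean:
    """Running mean that supports insert, remove and merge in O(1)."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def insert(self, value):
        self.count += 1
        self.total += value

    def remove(self, value):
        # The caller must only remove values that were inserted before.
        if self.count == 0:
            raise ValueError("cannot remove from an empty mean")
        self.count -= 1
        self.total -= value

    def merge(self, other):
        """Return a new mean covering the values of both operands."""
        merged = AccumulatedMean()
        merged.count = self.count + other.count
        merged.total = self.total + other.total
        return merged

    @property
    def mean(self):
        if self.count == 0:
            raise ValueError("mean of an empty set is undefined")
        return self.total / self.count
```

Merging is what makes such a structure convenient for region growing: when two segments are joined, their statistics can be combined without revisiting the points.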
IGA using subdivision-solids
(2015)
Creating cages that enclose a 3D-model of some sort is part of many preprocessing pipelines in computational geometry. Creating a cage of preferably lower resolution than the original model is of special interest when performing an operation on the original model might be too costly. The desired operation can be applied to the cage first and then transferred to the enclosed model. In this paper, the authors present a short survey of recent and well-known methods for cage computation.
The authors would like to give the reader an insight into common methods and their differences.
Deep 3D
(2017)
Talk
In the reverse engineering process one has to classify parts of point clouds with the correct type of geometric primitive. Features based on different geometric properties like point relations, normals, and curvature information can be used to train classifiers like Support Vector Machines (SVM). These geometric features are estimated in the local neighborhood of a point of the point cloud. The multitude of different features makes an in-depth comparison necessary. In this work we evaluate 23 features for the classification of geometric primitives in point clouds. Their performance is evaluated on SVMs when used to classify geometric primitives in simulated and real laser scanned point clouds. We also introduce a normalization of point cloud density to improve classification generalization.
In this paper we present a method using deep learning to compute parametrizations for B-spline curve approximation. Existing methods consider the computation of parametric values and a knot vector as separate problems. We propose to train interdependent deep neural networks to predict parametric values and knots. We show that it is possible to include B-spline curve approximation directly into the neural network architecture. The resulting parametrizations yield tight approximations and are able to outperform state-of-the-art methods.
Deep neural networks have been successfully applied to problems such as image segmentation, image super-resolution, colorization and image inpainting. In this work we propose the use of convolutional neural networks (CNN) for image inpainting of large regions in high-resolution textures. Due to limited computational resources, processing high-resolution images with neural networks is still an open problem. Existing methods separate inpainting of global structure and the transfer of details, which leads to blurry results and loss of global coherence in the detail transfer step. Based on advances in texture synthesis using CNNs we propose patch-based image inpainting by a single network topology that is able to optimize for global as well as detail texture statistics. Our method is capable of filling large inpainting regions, often exceeding the quality of comparable methods for high-resolution images (2048x2048 px). For reference patch look-up we propose to use the same summary statistics that are used in the inpainting process.
Knot placement for curve approximation is a well known and yet open problem in geometric modeling. Selecting knot values that yield good approximations is a challenging task, based largely on heuristics and user experience. More advanced approaches range from parametric averaging to genetic algorithms.
In this paper, we propose to use Support Vector Machines (SVMs) to determine suitable knot vectors for B-spline curve approximation. The SVMs are trained to identify locations in a sequential point cloud where knot placement will improve the approximation error. After the training phase, the SVM can assign a so-called score to each point set location. This score is based on geometric and differential geometric features of points. It measures the suitability of each location for use as a knot in the subsequent approximation. From these scores, the final knot vector can be constructed by exploring the topography of the score vector without the need for iteration or optimization in the approximation process. Knot vectors computed with our approach outperform state-of-the-art methods and yield tighter approximations.
In my research sabbatical I was working on three different topics, namely orthogonal polynomials in geometric modeling, re-parametrized univariate subdivision curves, and reconstruction of 3d-fish-models and other zoological artifacts. In the subsequent sections, I will describe my particular activity in these different fields. The sections are meant to present an overview of my research activities, leaving out the technical details.
Section 1 is on orthogonal polynomials and other related generating systems for function systems of smooth functions.
In Section 2, I will discuss the application of various re-parametrization schemes for interpolatory subdivision algorithms for the generation of space curves.
Section 3 is concerned with my research at the University of Queensland, Brisbane, in collaboration with Dr. Ulrike Siebeck from the School of Biomedical Sciences on fish behavior and reconstruction of 3d-fish models in particular.
In the last section, Section 4, I will describe what effects this research will have on my subsequent teaching at the University of Applied Science Konstanz (HTWG).
The ageing infrastructure in ports requires regular inspection. This inspection is currently carried out manually by divers who examine the entire underwater infrastructure by hand. This process is cost-intensive as it involves a lot of time and human resources. To overcome these difficulties, we propose to scan the above-water and underwater port structure with a multi-sensor system and, by a fully automated process, to classify the obtained point cloud into damaged and undamaged zones. We make use of simulated training data to test our approach since not enough training data with corresponding class labels are available yet. To this end, we build a rasterised heightfield of a point cloud of a sheet pile wall by cutting it into vertical slices. The distance from each slice to the corresponding line generates the heightfield. The heightfield is propagated through a convolutional neural network which detects anomalies. We use the VGG19 deep neural network model pretrained on natural images. This neural network has 19 layers and is often used for image recognition tasks. We showed that our approach can achieve a fully automated, reproducible, quality-controlled damage detection which is able to analyse the whole structure instead of the sample-wise manual method with divers. The mean true positive rate is 0.98, which means that we detected 98% of the damages in the simulated environment.
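For readers unfamiliar with the reported metric, the following sketch shows how a true positive rate and its mean over several evaluation runs are computed. The function names and any counts passed to them are hypothetical; the abstract reports only the final value of 0.98.

```python
def true_positive_rate(true_positives, false_negatives):
    """Fraction of actually damaged samples that were detected as damaged."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no damaged samples to evaluate")
    return true_positives / total


def mean_true_positive_rate(runs):
    """Average the rate over runs given as (true_positives, false_negatives)."""
    rates = [true_positive_rate(tp, fn) for tp, fn in runs]
    return sum(rates) / len(rates)
```

Note that the true positive rate says nothing about false alarms on undamaged zones; that would require a separate measure such as the false positive rate.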
Three-dimensional ship localization with only one camera is a challenging task due to the loss of depth information caused by perspective projection. In this paper, we propose a method to measure distances based on the assumption that ships lie on a flat surface. This assumption makes it possible to recover depth from a single image using the principle of inverse perspective. For the 3D ship detection task, we use a hybrid approach that combines image detection with a convolutional neural network, camera geometry and inverse perspective. Furthermore, a novel calculation of object height is introduced. Experiments show that the monocular distance computation works well in comparison to a Velodyne lidar. Due to its robustness, this could be an easy-to-use baseline method for detection tasks in navigation systems.
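The flat-surface assumption can be illustrated with textbook pinhole geometry. The sketch below is not the paper's formulation; it assumes a calibrated camera at a known height above the water plane, image rows that increase downward, and the hypothetical function and parameter names shown.

```python
import math


def ground_distance(pixel_row, principal_row, focal_px, camera_height_m,
                    pitch_down_rad=0.0):
    """Horizontal distance to a point on a flat plane seen at a given image row.

    pixel_row: image row of the point where the object touches the plane.
    principal_row: row of the principal point (assumed known from calibration).
    focal_px: focal length in pixels.
    camera_height_m: camera height above the plane in metres.
    pitch_down_rad: downward tilt of the optical axis in radians.
    """
    # Angle of the viewing ray below the horizontal.
    ray_angle = pitch_down_rad + math.atan2(pixel_row - principal_row, focal_px)
    if ray_angle <= 0.0:
        raise ValueError("pixel lies on or above the horizon")
    return camera_height_m / math.tan(ray_angle)
```

The distance grows rapidly as the pixel row approaches the horizon, so small detection errors near the horizon translate into large distance errors.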
Image novelty detection is a recurring task in computer vision and describes the detection of anomalous images based on a training dataset consisting solely of normal reference data. It has been found that, in particular, neural networks are well-suited for the task. Our approach first transforms the training and test images into ensembles of patches, which enables the assessment of mean-shifts between normal data and outliers. As mean-shifts are only detectable when the outlier ensemble and inlier distribution are spatially separate from each other, a rich feature space, such as a pre-trained neural network, needs to be chosen to represent the extracted patches. For mean-shift estimation, the Hotelling T² test is used. The size of the patches turned out to be a crucial hyperparameter that needs additional domain knowledge about the spatial size of the expected anomalies (local vs. global). This also affects model selection and the chosen feature space, as commonly used Convolutional Neural Networks or Vision Image Transformers have very different receptive field sizes. To showcase the state-of-the-art capabilities of our approach, we compare results with classical and deep learning methods on the popular dataset CIFAR-10, and demonstrate its real-world applicability in a large-scale industrial inspection scenario using the MVTec dataset. Because of the inexpensive design, our method can be implemented by a single additional 2D-convolution and pooling layer and allows particularly fast prediction times while being very data-efficient.
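To make the mean-shift statistic concrete, the sketch below computes the classical one-sample Hotelling T² statistic for an ensemble of two-dimensional feature vectors against a reference mean. Restricting to two dimensions, using the ensemble's own sample covariance, and the function name are assumptions for illustration; the abstract does not state which variant of the test is used or how it is applied to network features.

```python
def hotelling_t2_2d(samples, reference_mean):
    """One-sample Hotelling T^2 statistic for 2D samples.

    Computes T^2 = n * d^T S^{-1} d, where d is the difference between the
    sample mean and reference_mean and S is the unbiased sample covariance.
    """
    n = len(samples)
    if n < 3:
        raise ValueError("need at least three 2D samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det <= 0.0:
        raise ValueError("sample covariance is singular")
    dx = mean_x - reference_mean[0]
    dy = mean_y - reference_mean[1]
    # Quadratic form with the closed-form inverse of the 2x2 covariance.
    quad = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return n * quad
```

A large value indicates that the ensemble mean lies far from the reference mean relative to the spread of the ensemble, which is the mean-shift signal the abstract refers to.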
We are interested in developing a mini-batch-capable end-to-end algorithm to identify statistically independent components (ICA) in large-scale and high-dimensional datasets. Current algorithms typically rely on pre-whitened data and do not integrate the two procedures of whitening and ICA estimation. Our online approach estimates a whitening and a rotation matrix with stochastic gradient descent on centered or uncentered data. We show that this can be done efficiently by combining Batch Karhunen-Löwe-Transformation [1] with Lie group techniques. Our algorithm is recursion-free and can be organized as a feed-forward neural network, which makes the use of GPU acceleration straightforward. Because of the very fast convergence of Batch KLT, the gradient descent in the Lie group of orthogonal matrices stabilizes quickly. The optimization is further enhanced by integrating ADAM [2], an improved stochastic gradient descent (SGD) technique from the field of deep learning. We test the scaling capabilities by computing the independent components of the well-known ImageNet challenge (144 GB). Due to its robustness with respect to batch and step size, our approach can be used as a drop-in replacement for standard ICA algorithms where memory is a limiting factor.
Targetless Lidar-camera registration is a recurring task in many computer vision and robotics applications and requires computing the extrinsic pose of a point cloud with respect to a camera or vice versa. Existing methods based on learning or optimization lack either generalization capabilities or accuracy. Here, we propose a combination of pre-training and optimization using a neural network-based mutual information estimation technique (MINE [1]). This construction allows back-propagating the gradient to the calibration parameters and enables stochastic gradient descent. To ensure orthogonality constraints with respect to the rotation matrix we incorporate Lie-group techniques. Furthermore, instead of optimizing on entire images, we operate on local patches that are extracted from the temporally synchronized projected Lidar points and camera frames. Our experiments show that this technique not only improves over existing techniques in terms of accuracy, but also shows considerable generalization capabilities towards new Lidar-camera configurations.
Motion estimation is an essential element for autonomous vessels. It is used, e.g., for lidar motion compensation as well as mapping and detection tasks in a maritime environment. Because the use of gyroscopes is not reliable and a high-performance inertial measurement unit is quite expensive, we present an approach for visual pitch and roll estimation that utilizes a convolutional neural network for water segmentation, a stereo system for reconstruction and simple geometry to estimate pitch and roll. The algorithm is validated on a novel, publicly available dataset recorded at Lake Constance. Our experiments show that the pitch and roll estimator provides accurate results in comparison to an Xsens IMU sensor. We can further improve the pitch and roll estimation by sensor fusion with a gyroscope. An implementation of the algorithm is available as a ROS node.
Incremental one-class learning using regularized null-space training for industrial defect detection
(2024)
One-class incremental learning is a special case of class-incremental learning, where only a single novel class is incrementally added to an existing classifier instead of multiple classes. This case is relevant in industrial defect detection scenarios, where novel defects usually appear during operation. Existing rolled-out classifiers must be updated incrementally in this scenario with only a few novel examples. In addition, it is often required that the base classifier not be altered due to approval and warranty restrictions. While simple finetuning often gives the best performance across old and new classes, it comes with the drawback of potentially losing performance on the base classes (catastrophic forgetting [1]). Simple prototype approaches [2] work without changing existing weights and perform very well when the classes are well separated but fail dramatically when not. In theory, null-space training (NSCL) [3] should retain the base classifier entirely, as parameter updates are restricted to the null space of the network with respect to existing classes. However, as we show, this technique promotes overfitting in the case of one-class incremental learning. In our experiments, we found that unconstrained weight growth in null space is the underlying issue, leading us to propose a regularization term (R-NSCL) that penalizes the magnitude of amplification. The regularization term is added to the standard classification loss and stabilizes null-space training in the one-class scenario by counteracting overfitting. We test the method’s capabilities on two industrial datasets, namely AITEX and MVTec, and compare the performance to state-of-the-art algorithms for class-incremental learning.