Refine
Document Type
- Conference Proceeding (19) (remove)
Language
- English (19)
Keywords
- 3D ship detection (1)
- Computer vision (1)
- Extended object tracking (1)
- Extension estimation (1)
- Finite-element (1)
- GPU (1)
- Gaussian processes (1)
- Inverse perspective (1)
- Learning to cluster (1)
- Lidar (1)
Institute
- Institut für Optische Systeme - IOS (19) (remove)
Targetless Lidar-camera registration is a repeating task in many computer vision and robotics applications and requires computing the extrinsic pose of a point cloud with respect to a camera or vice-versa. Existing methods based on learning or optimization lack either generalization capabilities or accuracy. Here, we propose a combination of pre-training and optimization using a neural network-based mutual information estimation technique (MINE [1]). This construction allows back-propagating the gradient to the calibration parameters and enables stochastic gradient descent. To ensure orthogonality constraints with respect to the rotation matrix we incorporate Lie-group techniques. Furthermore, instead of optimizing on entire images, we operate on local patches that are extracted from the temporally synchronized projected Lidar points and camera frames. Our experiments show that this technique not only improves over existing techniques in terms of accuracy, but also shows considerable generalization capabilities towards new Lidar-camera configurations.
Optical surface inspection: A novelty detection approach based on CNN-encoded texture features
(2018)
In inspection systems for textured surfaces, a reference texture is typically known before novel examples are inspected. Mostly, the reference is only available in a digital format. As a consequence, there is no dataset of defective examples available that could be used to train a classifier. We propose a texture model approach to novelty detection. The texture model uses features encoded by a convolutional neural network (CNN) trained on natural image data. The CNN activations represent the specific characteristics of the digital reference texture which are learned by a one-class classifier. We evaluate our novelty detector in a digital print inspection scenario. The inspection unit is based on a camera array and a flashing light illumination which allows for inline capturing of multichannel images at a high rate. In order to compare our results to manual inspection, we integrated our inspection unit into an industrial single-pass printing system.
Motion estimation is an essential element for autonomous vessels. It is used e.g. for lidar motion compensation as well as mapping and detection tasks in a maritime environment. Because the use of gyroscopes is not reliable and a high performance inertial measurement unit is quite expensive, we present an approach for visual pitch and roll estimation that utilizes a convolutional neural network for water segmentation, a stereo system for reconstruction and simple geometry to estimate pitch and roll. The algorithm is validated on a novel, publicly available dataset recorded at Lake Constance. Our experiments show that the pitch and roll estimator provides accurate results in comparison to an Xsens IMU sensor. We can further improve the pitch and roll estimation by sensor fusion with a gyroscope. The algorithm is available in its implementation as a ROS node.
Three-dimensional ship localization with only one camera is a challenging task due to the loss of depth information caused by perspective projection. In this paper, we propose a method to measure distances based on the assumption that ships lie on a flat surface. This assumption allows to recover depth from a single image using the principle of inverse perspective. For the 3D ship detection task, we use a hybrid approach that combines image detection with a convolutional neural network, camera geometry and inverse perspective. Furthermore, a novel calculation of object height is introduced. Experiments show that the monocular distance computation works well in comparison to a Velodyne lidar. Due to its robustness, this could be an easy-to-use baseline method for detection tasks in navigation systems.
Using multi-camera matching techniques for 3d reconstruction there is usually the trade-off between the quality of the computed depth map and the speed of the computations. Whereas high quality matching methods take several seconds to several minutes to compute a depth map for one set of images, real-time methods achieve only low quality results. In this paper we present a multi-camera matching method that runs in real-time and yields high resolution depth maps. Our method is based on a novel multi-level combination of normalized cross correlation, deformed matching windows based on the multi-level depth map information, and sub-pixel precise disparity maps. The whole process is implemented completely on the GPU. With this approach we can process four 0.7 megapixel images in 129 milliseconds to a full resolution 3d depth map. Our technique is tailored for the recognition of non-technical shapes, because our target application is face recognition.
Deep neural networks (DNNs) are known for their high prediction performance, especially in perceptual tasks such as object recognition or autonomous driving. Still, DNNs are prone to yield unreliable predictions when encountering completely new situations without indicating their uncertainty. Bayesian variants of DNNs (BDNNs), such as MC dropout BDNNs, do provide uncertainty measures. However, BDNNs are slow during test time because they rely on a sampling approach. Here we present a single shot MC dropout approximation that preserves the advantages of BDNNs without being slower than a DNN. Our approach is to analytically approximate for each layer in a fully connected network the expected value and the variance of the MC dropout signal. We evaluate our approach on different benchmark datasets and a simulated toy example. We demonstrate that our single shot MC dropout approximation resembles the point estimate and the uncertainty estimate of the predictive distribution that is achieved with an MC approach, while being fast enough for real-time deployments of BDNNs.
We analyse the results of a finite element simulation of a macroscopic model, which describes the movement of a crowd, that is considered as a continuum. A new formulation based on the macroscopic model from Hughes [2] is given. We present a stable numerical algorithm by approximating with a viscosity solution. The fundamental setting is given by an arbitrary domain that can contain several obstacles, several entries and must have at least one exit. All pedestrians have the goal to leave the room as quickly as possible. Nobody prefers a particular exit.
Fast and reliable acquisition of truth data for document analysis using cyclic suggest algorithms
(2019)
In document analysis the availability of ground truth data plays a crucial role for the success of a project. This is even more true at the rise of new deep learning methods which heavily rely on the availability of training data. But even for traditional, hand crafted algorithms that are not trained on data, reliable test data is important for the improvement and evaluation of the methods. Because ground truth acquisition is expensive and time consuming, semi-automatic methods are introduced which make use of suggestions coming from document analysis systems. The interaction between the human operator and the automatic analysis algorithms is the key to speed up the process while improving the quality of the data. The final confirmation of data may always be done by the human operator. This paper demonstrates a use case for acquisition of truth data in a mail processing system. It shows why a new, extended view on truth data is necessary in development and engineering of such systems. An overview over the tool and the data handling is given, the advantages in the workflow are shown, and consequences for the construction of analysis algorithms are discussed. It can be shown that the interplay between suggest algorithms and human operator leads to very fast truth data capturing. The surprising finding is the fact that if multiple suggest algorithms circularly depend on data, they are especially effective in terms of speed and accuracy.