Refine
Year of publication
Document Type
- Conference Proceeding (20)
- Other Publications (3)
- Article (2)
- Report (1)
Language
- English (26)
Keywords
- 3D ship detection (1)
- ADAM (1)
- B-spline activation function (1)
- Camera characteristic (1)
- Computer vision (1)
- Damage Detection (1)
- Dark current (1)
- Deep learning (1)
- Defect detection (1)
- Freistellungssemesterbericht (1)
Institute
Classification of point clouds by different types of geometric primitives is an essential part in the reconstruction process of CAD geometry. We use support vector machines (SVM) to label patches in point clouds with the class labels tori, ellipsoids, spheres, cones, cylinders or planes. For the classification features based on different geometric properties like point normals, angles, and principal curvatures are used. These geometric features are estimated in the local neighborhood of a point of the point cloud. Computing these geometric features for a random subset of the point cloud yields a feature distribution. Different features are combined for achieving best classification results. To minimize the time consuming training phase of SVMs, the geometric features are first evaluated using linear discriminant analysis (LDA).
LDA and SVM are machine learning approaches that require an initial training phase to allow for a subsequent automatic classification of a new data set. For the training phase point clouds are generated using a simulation of a laser scanning device. Additional noise based on an laser scanner error model is added to the point clouds. The resulting LDA and SVM classifiers are then used to classify geometric primitives in simulated and real laser scanned point clouds.
Compared to other approaches, where all known features are used for classification, we explicitly compare novel against known geometric features to prove their effectiveness.
We present a 3d-laser-scan simulation in virtual
reality for creating synthetic scans of CAD models. Consisting of
the virtual reality head-mounted display Oculus Rift and the
motion controller Razer Hydra our system can be used like
common hand-held 3d laser scanners. It supports scanning of
triangular meshes as well as b-spline tensor product surfaces
based on high performance ray-casting algorithms. While point
clouds of known scanning simulations are missing the man-made
structure, our approach overcomes this problem by imitating
real scanning scenarios. Calculation speed, interactivity and the
resulting realistic point clouds are the benefits of this system.
Reconstruction of hand-held laser scanner data is used in industry primarily for reverse engineering. Traditionally, scanning and reconstruction are separate steps. The operator of the laser scanner has no feedback from the reconstruction results. On-line reconstruction of the CAD geometry allows for such an immediate feedback.
We propose a method for on-line segmentation and reconstruction of CAD geometry from a stream of point data based on means that are updated on-line. These means are combined to define complex local geometric properties, e.g., to radii and center points of spherical regions. Using means of local scores, planar, cylindrical, and spherical segments are detected and extended robustly with region growing. For the on-line computation of the means we use so-called accumulated means. They allow for on-line insertion and removal of values and merging of means. Our results show that this approach can be performed on-line and is robust to noise. We demonstrate that our method reconstructs spherical, cylindrical, and planar segments on real scan data containing typical errors caused by hand-held laser scanners.
Using multi-camera matching techniques for 3d reconstruction there is usually the trade-off between the quality of the computed depth map and the speed of the computations. Whereas high quality matching methods take several seconds to several minutes to compute a depth map for one set of images, real-time methods achieve only low quality results. In this paper we present a multi-camera matching method that runs in real-time and yields high resolution depth maps. Our method is based on a novel multi-level combination of normalized cross correlation, deformed matching windows based on the multi-level depth map information, and sub-pixel precise disparity maps. The whole process is implemented completely on the GPU. With this approach we can process four 0.7 megapixel images in 129 milliseconds to a full resolution 3d depth map. Our technique is tailored for the recognition of non-technical shapes, because our target application is face recognition.
Three-dimensional ship localization with only one camera is a challenging task due to the loss of depth information caused by perspective projection. In this paper, we propose a method to measure distances based on the assumption that ships lie on a flat surface. This assumption allows to recover depth from a single image using the principle of inverse perspective. For the 3D ship detection task, we use a hybrid approach that combines image detection with a convolutional neural network, camera geometry and inverse perspective. Furthermore, a novel calculation of object height is introduced. Experiments show that the monocular distance computation works well in comparison to a Velodyne lidar. Due to its robustness, this could be an easy-to-use baseline method for detection tasks in navigation systems.
Motion estimation is an essential element for autonomous vessels. It is used e.g. for lidar motion compensation as well as mapping and detection tasks in a maritime environment. Because the use of gyroscopes is not reliable and a high performance inertial measurement unit is quite expensive, we present an approach for visual pitch and roll estimation that utilizes a convolutional neural network for water segmentation, a stereo system for reconstruction and simple geometry to estimate pitch and roll. The algorithm is validated on a novel, publicly available dataset recorded at Lake Constance. Our experiments show that the pitch and roll estimator provides accurate results in comparison to an Xsens IMU sensor. We can further improve the pitch and roll estimation by sensor fusion with a gyroscope. The algorithm is available in its implementation as a ROS node.
Targetless Lidar-camera registration is a repeating task in many computer vision and robotics applications and requires computing the extrinsic pose of a point cloud with respect to a camera or vice-versa. Existing methods based on learning or optimization lack either generalization capabilities or accuracy. Here, we propose a combination of pre-training and optimization using a neural network-based mutual information estimation technique (MINE [1]). This construction allows back-propagating the gradient to the calibration parameters and enables stochastic gradient descent. To ensure orthogonality constraints with respect to the rotation matrix we incorporate Lie-group techniques. Furthermore, instead of optimizing on entire images, we operate on local patches that are extracted from the temporally synchronized projected Lidar points and camera frames. Our experiments show that this technique not only improves over existing techniques in terms of accuracy, but also shows considerable generalization capabilities towards new Lidar-camera configurations.
The ageing infrastructure in ports requires regular inspection. This inspection is currently carried out manually by divers who sense by hand the entire underwater infrastructure. This process is cost-intensive as it involves a lot of time and human resources. To overcome these difficulties, we propose to scan the above and underwater port structure with a Multi-SensorSystem, and -by a fully automated processto classify the obtained point cloud into damaged and undamaged zones. We make use of simulated training data to test our approach since not enough training data with corresponding class labels are available yet. To that aim, we build a rasterised heightfield of a point cloud of a sheet pile wall by cutting it into verticall slices. The distance from each slice to the corresponding line generates the heightfield. This latter is propagated through a convolutional neural network which detects anomalies. We use the VGG19 Deep Neural Network model pretrained on natural images. This neural network has 19 layers and it is often used for image recognition tasks. We showed that our approach can achieve a fully automated, reproducible, quality-controlled damage detection which is able to analyse the whole structure instead of the sample wise manual method with divers. The mean true positive rate is 0.98 which means that we detected 98 % of the damages in the simulated environment.
We are interested in computing a mini-batch-capable end-to-end algorithm to identify statistically independent components (ICA) in large scale and high-dimensional datasets. Current algorithms typically rely on pre-whitened data and do not integrate the two procedures of whitening and ICA estimation. Our online approach estimates a whitening and a rotation matrix with stochastic gradient descent on centered or uncentered data. We show that this can be done efficiently by combining Batch Karhunen-Löwe-Transformation [1] with Lie group techniques. Our algorithm is recursion-free and can be organized as feed-forward neural network which makes the use of GPU acceleration straight-forward. Because of the very fast convergence of Batch KLT, the gradient descent in the Lie group of orthogonal matrices stabilizes quickly. The optimization is further enhanced by integrating ADAM [2], an improved stochastic gradient descent (SGD) technique from the field of deep learning. We test the scaling capabilities by computing the independent components of the well-known ImageNet challenge (144 GB). Due to its robustness with respect to batch and step size, our approach can be used as a drop-in replacement for standard ICA algorithms where memory is a limiting factor.