Three-dimensional ship localization with only one camera is a challenging task due to the loss of depth information caused by perspective projection. In this paper, we propose a method to measure distances based on the assumption that ships lie on a flat surface. This assumption allows depth to be recovered from a single image using the principle of inverse perspective. For the 3D ship detection task, we use a hybrid approach that combines image detection with a convolutional neural network, camera geometry and inverse perspective. Furthermore, a novel calculation of object height is introduced. Experiments show that the monocular distance computation works well in comparison to a Velodyne lidar. Due to its robustness, this could be an easy-to-use baseline method for detection tasks in navigation systems.
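The flat-surface assumption can be illustrated with a minimal sketch of inverse perspective for a pinhole camera. This is not the paper's implementation: the function name `ground_distance` and the parameters `cam_height`, `pitch`, `focal_px` and `cy` are assumptions introduced here, and lens distortion and camera roll are ignored.

```python
import math

def ground_distance(v, cam_height, pitch, focal_px, cy):
    """Distance along a flat surface to the point imaged at pixel row v.

    Hypothetical parameters (not from the paper):
    cam_height: camera height above the surface, in metres
    pitch:      downward tilt of the optical axis, in radians
    focal_px:   focal length in pixels
    cy:         pixel row of the principal point
    """
    # Angle of the viewing ray below the optical axis (image rows grow downward).
    alpha = math.atan2(v - cy, focal_px)
    depression = pitch + alpha
    if depression <= 0:
        raise ValueError("pixel lies on or above the horizon")
    # The ray meets the surface where its vertical drop equals the camera height.
    return cam_height / math.tan(depression)
```

With a level camera 10 m above the water, a focal length of 1000 px and a waterline contact point 100 rows below the principal point, the sketch yields a distance of about 100 m.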
In this paper we present a method using deep learning to compute parametrizations for B-spline curve approximation. Existing methods consider the computation of parametric values and a knot vector as separate problems. We propose to train interdependent deep neural networks to predict parametric values and knots. We show that it is possible to include B-spline curve approximation directly into the neural network architecture. The resulting parametrizations yield tight approximations and are able to outperform state-of-the-art methods.
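For context, the two sub-problems named above can be sketched with classical techniques: chord-length parametrization for the parametric values and parameter averaging for a clamped knot vector. These are textbook choices assumed here for illustration, not the learned networks the paper proposes, and the function names are hypothetical. The sketch assumes consecutive points are distinct and that the curve interpolates the points.

```python
import math

def chord_length_params(points):
    """Parametric values in [0, 1] proportional to cumulative chord length."""
    dists = [math.dist(p, q) for p, q in zip(points, points[1:])]
    total = sum(dists)
    params = [0.0]
    for d in dists:
        params.append(params[-1] + d / total)
    params[-1] = 1.0  # guard against rounding drift
    return params

def averaged_knots(params, degree):
    """Clamped knot vector whose interior knots average `degree` parameters."""
    n = len(params) - 1
    inner = [sum(params[j:j + degree]) / degree
             for j in range(1, n - degree + 1)]
    return [0.0] * (degree + 1) + inner + [1.0] * (degree + 1)
```

Here the knots are derived from parameters that were fixed beforehand, which is the kind of separation between the two computations that the abstract describes.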
Multi-Dimensional Connectionist Classification is a method for weakly supervised training of Deep Neural Networks for segmentation-free multi-line offline handwriting recognition. MDCC applies Conditional Random Fields as an alignment function for this task. We discuss the structure and patterns of handwritten text that can be used for building a CRF. Since CRFs are cyclic graphical models, we have to resort to approximate inference when calculating the alignment of multi-line text during training, here in the form of Loopy Belief Propagation. This work concludes with experimental results for transcribing small multi-line samples from the IAM Offline Handwriting DB which show that MDCC is a competitive methodology.
Random matrices are used to filter the center of gravity (CoG) and the covariance matrix of measurements. However, these quantities do not always correspond directly to the position and the extent of the object, e.g. when a lidar sensor is used. In this paper, we propose a Gaussian process regression model (GPRM) to predict the position and extension of the object from the filtered CoG and covariance matrix of the measurements. Training data for the GPRM are generated by a sampling method and a virtual measurement model (VMM). The VMM is a function that generates artificial measurements using ray tracing and allows us to obtain the CoG and covariance matrix that any object would cause. This enables the GPRM to be trained without real data but still be applied to real data due to the precise modeling in the VMM. The results show an accurate extension estimation as long as reality matches the model, e.g. as long as lidar measurements only occur on the side facing the sensor.
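The two quantities that serve as input to the regression can be sketched for a set of 2D measurements as follows. This is only an illustrative computation of a sample CoG and sample covariance; the function name is hypothetical, and the filtering, the VMM and the GPRM themselves are not shown.

```python
def cog_and_covariance(points):
    """Center of gravity and 2x2 sample covariance of 2D measurements."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two measurements")
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Unbiased sample covariance (divides by n - 1).
    sxx = sum((x - cx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - cy) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - cx) * (y - cy) for x, y in points) / (n - 1)
    return (cx, cy), [[sxx, sxy], [sxy, syy]]
```

If all measurements fall on the edge of an object that faces the sensor, the CoG lies on that edge and the variance perpendicular to it is zero, so neither quantity matches the true center or extent. This is the mismatch the regression model is meant to correct.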
Fast and reliable acquisition of truth data for document analysis using cyclic suggest algorithms
(2019)
In document analysis, the availability of ground truth data plays a crucial role in the success of a project. This is even more true with the rise of new deep learning methods, which rely heavily on the availability of training data. But even for traditional, hand-crafted algorithms that are not trained on data, reliable test data is important for the improvement and evaluation of the methods. Because ground truth acquisition is expensive and time-consuming, semi-automatic methods are introduced which make use of suggestions coming from document analysis systems. The interaction between the human operator and the automatic analysis algorithms is the key to speeding up the process while improving the quality of the data. The final confirmation of data may always be done by the human operator. This paper demonstrates a use case for the acquisition of truth data in a mail processing system. It shows why a new, extended view on truth data is necessary in the development and engineering of such systems. An overview of the tool and the data handling is given, the advantages in the workflow are shown, and consequences for the construction of analysis algorithms are discussed. It can be shown that the interplay between suggest algorithms and the human operator leads to very fast truth data capturing. The surprising finding is that multiple suggest algorithms that circularly depend on data are especially effective in terms of speed and accuracy.
We analyse the results of a finite element simulation of a macroscopic model that describes the movement of a crowd considered as a continuum. A new formulation based on the macroscopic model from Hughes [2] is given. We present a stable numerical algorithm by approximating with a viscosity solution. The fundamental setting is given by an arbitrary domain that can contain several obstacles and several entries and must have at least one exit. All pedestrians have the goal of leaving the room as quickly as possible. Nobody prefers a particular exit.
Deep neural networks have been successfully applied to problems such as image segmentation, image super-resolution, coloration and image inpainting. In this work we propose the use of convolutional neural networks (CNN) for image inpainting of large regions in high-resolution textures. Due to limited computational resources, processing high-resolution images with neural networks is still an open problem. Existing methods separate inpainting of global structure and the transfer of details, which leads to blurry results and loss of global coherence in the detail transfer step. Based on advances in texture synthesis using CNNs, we propose patch-based image inpainting by a single network topology that is able to optimize for global as well as detail texture statistics. Our method is capable of filling large inpainting regions, oftentimes exceeding the quality of comparable methods for high-resolution images (2048x2048 px). For reference patch look-up we propose to use the same summary statistics that are used in the inpainting process.
Incremental one-class learning using regularized null-space training for industrial defect detection
(2024)
One-class incremental learning is a special case of class-incremental learning, where only a single novel class is incrementally added to an existing classifier instead of multiple classes. This case is relevant in industrial defect detection scenarios, where novel defects usually appear during operation. Existing rolled-out classifiers must be updated incrementally in this scenario with only a few novel examples. In addition, it is often required that the base classifier not be altered due to approval and warranty restrictions. While simple finetuning often gives the best performance across old and new classes, it comes with the drawback of potentially losing performance on the base classes (catastrophic forgetting [1]). Simple prototype approaches [2] work without changing existing weights and perform very well when the classes are well separated but fail dramatically when not. In theory, null-space training (NSCL) [3] should retain the base classifier entirely, as parameter updates are restricted to the null space of the network with respect to existing classes. However, as we show, this technique promotes overfitting in the case of one-class incremental learning. In our experiments, we found that unconstrained weight growth in null space is the underlying issue, leading us to propose a regularization term (R-NSCL) that penalizes the magnitude of amplification. The regularization term is added to the standard classification loss and stabilizes null-space training in the one-class scenario by counteracting overfitting. We test the method’s capabilities on two industrial datasets, namely AITEX and MVTec, and compare the performance to state-of-the-art algorithms for class-incremental learning.
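The idea of restricting updates to a null space can be sketched for a single linear unit: if a weight update is orthogonal to every input seen for the existing classes, the unit's output on those inputs does not change. The sketch below is a simplified illustration using Gram-Schmidt orthogonalization, with hypothetical function names. It is not the NSCL or R-NSCL procedure from the paper, and the proposed regularization term is not shown because the abstract does not specify its form.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt basis for the span of the given vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # skip vectors already in the span
            basis.append([wi / norm for wi in w])
    return basis

def project_to_null_space(update, old_features):
    """Remove every component of `update` lying in the span of old_features."""
    w = list(update)
    for b in orthonormal_basis(old_features):
        c = dot(w, b)
        w = [wi - c * bi for wi, bi in zip(w, b)]
    return w
```

After projection, the dot product of the update with each old feature vector is zero up to rounding, so old outputs are preserved. Nothing bounds the size of the remaining component, which is the kind of unconstrained growth in null space the abstract identifies as the issue.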
Optical surface inspection: A novelty detection approach based on CNN-encoded texture features
(2018)
In inspection systems for textured surfaces, a reference texture is typically known before novel examples are inspected. Mostly, the reference is only available in a digital format. As a consequence, there is no dataset of defective examples available that could be used to train a classifier. We propose a texture model approach to novelty detection. The texture model uses features encoded by a convolutional neural network (CNN) trained on natural image data. The CNN activations represent the specific characteristics of the digital reference texture which are learned by a one-class classifier. We evaluate our novelty detector in a digital print inspection scenario. The inspection unit is based on a camera array and a flashing light illumination which allows for inline capturing of multichannel images at a high rate. In order to compare our results to manual inspection, we integrated our inspection unit into an industrial single-pass printing system.