In this paper we present a method using deep learning to compute parametrizations for B-spline curve approximation. Existing methods consider the computation of parametric values and a knot vector as separate problems. We propose to train interdependent deep neural networks to predict parametric values and knots. We show that it is possible to include B-spline curve approximation directly into the neural network architecture. The resulting parametrizations yield tight approximations and are able to outperform state-of-the-art methods.
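For context, the two quantities named above (parametric values and a knot vector) are classically computed by separate heuristics. The following is a minimal sketch of one such baseline, chord-length parametrization followed by knot averaging; it is not the network-based method of the abstract, and the function names are hypothetical.

```python
import math


def chord_length_parameters(points):
    """Assign each point a parameter in [0, 1] proportional to cumulative chord length."""
    dists = [math.dist(p, q) for p, q in zip(points, points[1:])]
    total = sum(dists)
    params = [0.0]
    for d in dists:
        params.append(params[-1] + d / total)
    params[-1] = 1.0  # guard against floating-point drift
    return params


def averaged_knots(params, degree):
    """Clamped knot vector obtained by averaging `degree` consecutive parameters.

    Assumes the interpolation setting (as many control points as data points).
    """
    n = len(params) - 1
    interior = [sum(params[j:j + degree]) / degree for j in range(1, n - degree + 1)]
    return [0.0] * (degree + 1) + interior + [1.0] * (degree + 1)
```

In this baseline the knots depend on the parameters but not the other way round, which is the kind of separation the abstract refers to.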
Deep neural networks have been successfully applied to problems such as image segmentation, image super-resolution, colorization and image inpainting. In this work we propose the use of convolutional neural networks (CNNs) for image inpainting of large regions in high-resolution textures. Due to limited computational resources, processing high-resolution images with neural networks is still an open problem. Existing methods separate the inpainting of global structure from the transfer of details, which leads to blurry results and a loss of global coherence in the detail transfer step. Based on advances in texture synthesis using CNNs, we propose patch-based image inpainting by a single network topology that is able to optimize for global as well as detail texture statistics. Our method is capable of filling large inpainting regions, oftentimes exceeding the quality of comparable methods for high-resolution images (2048x2048 px). For reference patch look-up we propose to use the same summary statistics that are used in the inpainting process.
Deep 3D
(2017)
In the reverse engineering process one has to classify parts of point clouds with the correct type of geometric primitive. Features based on different geometric properties such as point relations, normals, and curvature information can be used to train classifiers like Support Vector Machines (SVMs). These geometric features are estimated in the local neighborhood of a point of the point cloud. The multitude of different features makes an in-depth comparison necessary. In this work we evaluate 23 features for the classification of geometric primitives in point clouds. Their performance is evaluated with SVMs that classify geometric primitives in simulated and real laser-scanned point clouds. We also introduce a normalization of point cloud density to improve classification generalization.
The detection of anomalous or novel images given a training dataset of only clean reference data (inliers) is an important task in computer vision. We propose a new shallow approach that represents both inlier and outlier images as ensembles of patches, which allows us to effectively detect novelties as mean shifts between reference data and outliers with the Hotelling T² test. Since a mean shift can only be detected when the outlier ensemble is sufficiently separate from the typical set of the inlier distribution, this typical set acts as a blind spot for novelty detection. We therefore minimize its estimated size as our selection rule for critical hyperparameters, such as the size of the patches. To showcase the capabilities of our approach, we compare results with classical and deep learning methods on the popular datasets MNIST and CIFAR-10, and demonstrate its real-world applicability in a large-scale industrial inspection scenario.
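The mean-shift test mentioned above can be illustrated with the standard two-sample Hotelling T² statistic. This is a minimal sketch assuming patches have already been flattened into feature vectors; the function name and the pooled-covariance form are assumptions, since the abstract does not state which variant is used.

```python
import numpy as np


def hotelling_t2(reference, candidate):
    """Two-sample Hotelling T^2 statistic with pooled covariance.

    reference, candidate: arrays of shape (n_samples, n_features).
    Larger values indicate a stronger mean shift between the two ensembles.
    """
    n1, n2 = len(reference), len(candidate)
    diff = reference.mean(axis=0) - candidate.mean(axis=0)
    cov1 = np.atleast_2d(np.cov(reference, rowvar=False))
    cov2 = np.atleast_2d(np.cov(candidate, rowvar=False))
    pooled = ((n1 - 1) * cov1 + (n2 - 1) * cov2) / (n1 + n2 - 2)
    # Solve instead of inverting the covariance explicitly.
    return (n1 * n2 / (n1 + n2)) * float(diff @ np.linalg.solve(pooled, diff))
```

An ensemble drawn from the inlier distribution should yield a small statistic, while an ensemble whose mean lies outside the typical set yields a large one.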
We are interested in a mini-batch-capable end-to-end algorithm for identifying statistically independent components (independent component analysis, ICA) in large-scale and high-dimensional datasets. Current algorithms typically rely on pre-whitened data and do not integrate the two procedures of whitening and ICA estimation. Our online approach estimates a whitening and a rotation matrix with stochastic gradient descent on centered or uncentered data. We show that this can be done efficiently by combining the Batch Karhunen-Loève Transformation (Batch KLT) [1] with Lie group techniques. Our algorithm is recursion-free and can be organized as a feed-forward neural network, which makes the use of GPU acceleration straightforward. Because of the very fast convergence of Batch KLT, the gradient descent in the Lie group of orthogonal matrices stabilizes quickly. The optimization is further enhanced by integrating ADAM [2], an improved stochastic gradient descent (SGD) technique from the field of deep learning. We test the scaling capabilities by computing the independent components of the well-known ImageNet challenge dataset (144 GB). Due to its robustness with respect to batch and step size, our approach can be used as a drop-in replacement for standard ICA algorithms where memory is a limiting factor.
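The two ingredients named above, a whitening matrix and a rotation constrained to the orthogonal group, can be sketched as follows. This is an offline illustration under stated assumptions: the whitening uses a full eigendecomposition rather than Batch KLT, and the rotation update uses a Cayley transform as one common way to stay on the orthogonal group. The abstract does not specify the actual update rule, and all function names are hypothetical.

```python
import numpy as np


def whitening_matrix(x, eps=1e-12):
    """Return W such that the centered data times W.T has identity covariance."""
    xc = x - x.mean(axis=0)
    cov = xc.T @ xc / (len(x) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Scale each eigenvector (column) by 1/sqrt(eigenvalue), then transpose.
    return (eigvecs / np.sqrt(eigvals + eps)).T


def orthogonal_step(rotation, gradient, step):
    """One descent step that keeps `rotation` orthogonal (Cayley transform)."""
    skew = gradient @ rotation.T - rotation @ gradient.T  # skew-symmetric direction
    a = -0.5 * step * skew
    eye = np.eye(len(rotation))
    return np.linalg.solve(eye - a, eye + a) @ rotation
```

Because the Cayley transform of a skew-symmetric matrix is orthogonal, repeated calls to `orthogonal_step` never leave the group, so no re-orthogonalization is needed between mini-batches.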
Digital cameras are subject to physical, electronic and optical effects that result in errors and noise in the image. These effects include, for example, a temperature-dependent dark current, read noise, optical vignetting or different sensitivities of individual pixels. The task of a radiometric calibration is to reduce these errors in the image and thus improve the quality of the overall application. In this work we present an algorithm for radiometric calibration based on Gaussian processes. Gaussian processes are a regression method widely used in machine learning that is particularly useful in our context. Gaussian process regression is used to learn a temperature- and exposure-time-dependent mapping from observed gray-scale values to true light intensities for each pixel. Regression models based on the characteristics of single pixels suffer from excessively high runtime and thus are unsuitable for many practical applications. In contrast, a single regression model for an entire image with high spatial resolution leads to a low-quality radiometric calibration, which also limits its practical use. The proposed algorithm is predicated on a partitioning of the pixels such that each pixel partition can be represented by one single regression model without quality loss. Partitioning is done by extracting features from the characteristics of each pixel and using them for lexicographic sorting. Splitting the sorted data into partitions of equal size yields the final partitions, each of which is represented by its partition center. An individual Gaussian process regression and model selection is done for each partition. Calibration is performed by interpolating the gray-scale value of each pixel with the regression model of the respective partition. The experimental comparison of the proposed approach to classical flat field calibration shows a consistently higher reconstruction quality for the same overall number of calibration frames.
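The partitioning step described above (lexicographic sorting of per-pixel features, then splitting into equally sized groups) can be sketched as follows. The feature tuples are hypothetical placeholders, for example a dark offset and a gain per pixel; the abstract does not name the features, and taking the feature mean as the partition center is an assumption.

```python
def partition_pixels(features, n_partitions):
    """Sort pixel indices lexicographically by feature tuple and split them.

    features: list of per-pixel feature tuples (hypothetical features).
    Returns a list of index lists; the last one may be shorter if the
    pixel count is not divisible by n_partitions.
    """
    order = sorted(range(len(features)), key=lambda i: features[i])
    size = -(-len(order) // n_partitions)  # ceiling division
    return [order[k:k + size] for k in range(0, len(order), size)]


def partition_center(features, indices):
    """Mean feature vector of one partition, used as its representative."""
    dims = len(features[indices[0]])
    return tuple(sum(features[i][d] for i in indices) / len(indices) for d in range(dims))
```

Python compares tuples element by element, so sorting by the feature tuple is already a lexicographic sort. One regression model would then be fitted per partition instead of per pixel.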
The detection of differences between images of a printed reference and a reprinted wood decor often requires an initial image registration step. Depending on the digitization method, the reprint will be displaced and rotated with respect to the reference. The aim of registration is to match the images as precisely as possible. In our approach, images are first matched globally by extracting feature points from both images and finding corresponding point pairs using the RANSAC algorithm. From these correspondences, we compute a global projective transformation between both images. In order to get a pixel-wise registration, we train a learning machine on the point correspondences found by RANSAC. The learning algorithm (in our case Gaussian process regression) is used to nonlinearly interpolate between the feature points, which results in a high-precision image registration method on wood decors.
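The nonlinear interpolation between feature points can be sketched with a plain Gaussian process posterior mean. This is a minimal illustration assuming a squared-exponential kernel with a fixed length scale and displacements measured after the global transformation; the kernel, the hyperparameter values, and the function names are assumptions and are not taken from the abstract.

```python
import numpy as np


def rbf_kernel(a, b, length_scale):
    """Squared-exponential kernel between two sets of 2D points."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale ** 2)


def gp_predict(train_xy, train_disp, query_xy, length_scale=50.0, noise=1e-6):
    """Posterior mean of the displacement field at the query pixels.

    train_xy:   (n, 2) feature point positions in the reference image.
    train_disp: (n, 2) residual displacements at those points.
    query_xy:   (m, 2) pixel positions to interpolate.
    """
    k = rbf_kernel(train_xy, train_xy, length_scale) + noise * np.eye(len(train_xy))
    k_star = rbf_kernel(query_xy, train_xy, length_scale)
    return k_star @ np.linalg.solve(k, train_disp)
```

Evaluating `gp_predict` on every pixel position gives a dense displacement field that reproduces the matched feature points and varies smoothly in between.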
Increasing robustness of handwriting recognition using character N-Gram decoding on large lexica
(2016)
Offline handwriting recognition systems often include a decoding step, that is, retrieving the most likely character sequence from the underlying machine learning algorithm. Decoding is sensitive to ranges of weakly predicted characters, caused, for example, by obstructions in the scanned document. We present a new algorithm for robust decoding of handwriting recognizer outputs using character n-grams. Multidimensional hierarchical subsampling artificial neural networks with Long Short-Term Memory cells have been successfully applied to offline handwriting recognition. Output activations from such networks, trained with Connectionist Temporal Classification, can be decoded with several different algorithms in order to retrieve the most likely literal string that they represent. We present a new algorithm for decoding the network output while restricting the possible strings to a large lexicon. The index used for this work is an n-gram index, with tri-grams used for experimental comparisons. N-grams are extracted from the network output using a backtracking algorithm, and each n-gram is assigned a mean probability. The decoding result is obtained by intersecting the n-gram hit lists while calculating the total probability for each matched lexicon entry. We conclude with an experimental comparison of different decoding algorithms on a large lexicon.
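The index look-up and hit-list intersection described above can be sketched as follows. This is a simplified illustration: the n-grams and their mean probabilities are given as input rather than extracted from network activations by backtracking, and the padding markers, the scoring rule, and the function names are assumptions.

```python
from collections import defaultdict


def ngrams(word, n=3):
    """Set of character n-grams of a word, padded with start/end markers (assumed)."""
    padded = f"^{word}$"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def build_index(lexicon, n=3):
    """Map each n-gram to the set of lexicon entries that contain it."""
    index = defaultdict(set)
    for entry in lexicon:
        for gram in ngrams(entry, n):
            index[gram].add(entry)
    return index


def decode(observed, index, n=3):
    """Pick the lexicon entry supported by all observed n-grams.

    observed: dict mapping an n-gram to its mean probability.
    N-grams unknown to the index are ignored; ties between the remaining
    candidates are broken by the average probability over the entry's n-grams.
    """
    hit_lists = [index[g] for g in observed if g in index]
    if not hit_lists:
        return None
    candidates = set.intersection(*hit_lists)

    def score(entry):
        grams = ngrams(entry, n)
        return sum(observed.get(g, 0.0) for g in grams) / len(grams)

    return max(candidates, key=score, default=None)
```

Ignoring n-grams that are missing from the index is one way to tolerate weakly predicted character ranges, since a single spurious n-gram then does not empty the intersection.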
Digitally printed surfaces must meet strict functional and aesthetic requirements. These properties are checked during quality inspection. Surface defects often matter only when they are also perceived by humans. Because of the high production speed, such an assessment of defect visibility has so far only been possible outside the production flow, through manual and subjectively influenced inspection. The aim of the project is (1) to model textures in a form that is adapted to the human visual system and (2) to automatically assess the perception of texture defects. Within the project, a prototype system for the inline acquisition of textured surfaces was developed. A representative texture database was created from real images of industrially produced wood decors. We show first results in defect detection based on statistical features. These results serve as the basis for the subsequent perception-oriented assessment. Ultimately, the results obtained in the project are to be incorporated into a prototype setup for the inspection of digitally printed decors.