OPUS 4 | Institut für Optische Systeme

CNN-Based Monocular 3D Ship Detection Using Inverse Perspective (2020)

Griesser, Dennis ; Dold, Daniel ; Umlauf, Georg ; Franz, Matthias O.

Three-dimensional ship localization with only one camera is a challenging task due to the loss of depth information caused by perspective projection. In this paper, we propose a method to measure distances based on the assumption that ships lie on a flat surface. This assumption allows depth to be recovered from a single image using the principle of inverse perspective. For the 3D ship detection task, we use a hybrid approach that combines image detection with a convolutional neural network, camera geometry and inverse perspective. Furthermore, a novel calculation of object height is introduced. Experiments show that the monocular distance computation works well in comparison to a Velodyne lidar. Due to its robustness, this could be an easy-to-use baseline method for detection tasks in navigation systems.
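The flat-surface idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an ideal pinhole camera without lens distortion, mounted at a known height above the plane and tilted downward by a known pitch angle. The function name `ground_distance` and all parameter names are hypothetical.

```python
import math


def ground_distance(v, cam_height, pitch, focal_px, cy):
    """Horizontal distance to the ground-plane point seen at image row v.

    Assumptions (illustrative, not from the paper):
      v          -- pixel row of the point, measured downward from the image top
      cam_height -- camera height above the flat surface, in metres
      pitch      -- downward tilt of the optical axis, in radians
      focal_px   -- focal length in pixels
      cy         -- pixel row of the principal point
    """
    # Angle of the viewing ray below the horizontal.
    angle = pitch + math.atan2(v - cy, focal_px)
    if angle <= 0:
        # Rays at or above the horizon never intersect the plane.
        raise ValueError("pixel row is at or above the horizon")
    return cam_height / math.tan(angle)
```

For example, a hypothetical camera 10 m above the water with a level optical axis, a focal length of 1000 px and a principal point at row 500 would place a waterline pixel at row 600 roughly 100 m away. The estimate becomes very sensitive to pixel error as the row approaches the horizon.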

Fast and reliable acquisition of truth data for document analysis using cyclic suggest algorithms (2019)

In document analysis, the availability of ground truth data plays a crucial role in the success of a project. This is even more true with the rise of new deep learning methods, which rely heavily on the availability of training data. But even for traditional, hand-crafted algorithms that are not trained on data, reliable test data is important for the improvement and evaluation of the methods. Because ground truth acquisition is expensive and time-consuming, semi-automatic methods are introduced which make use of suggestions coming from document analysis systems. The interaction between the human operator and the automatic analysis algorithms is the key to speeding up the process while improving the quality of the data. The final confirmation of data may always be done by the human operator. This paper demonstrates a use case for the acquisition of truth data in a mail processing system. It shows why a new, extended view of truth data is necessary in the development and engineering of such systems. An overview of the tool and the data handling is given, the advantages in the workflow are shown, and consequences for the construction of analysis algorithms are discussed. It can be shown that the interplay between suggest algorithms and the human operator leads to very fast truth data capturing. The surprising finding is that multiple suggest algorithms that circularly depend on data are especially effective in terms of speed and accuracy.

Machine learning methods for reverse engineering of defective structured surfaces (2020)

Laube, Pascal

Pascal Laube presents machine learning approaches for three key problems of reverse engineering of defective structured surfaces: parametrization of curves and surfaces, geometric primitive classification and inpainting of high-resolution textures. The proposed methods aim to improve the reconstruction quality while further automating the process. The contributions demonstrate that machine learning can be a viable part of the CAD reverse engineering pipeline.

Developing deep learning applications for life science and pharma industry (2018)

Siegismund, Daniel ; Tolkachev, Vasily ; Heyse, Stephan ; Sick, Beate ; Dürr, Oliver ; Steigele, Stephan

Automatic classification of non-small cell lung cancer histologic sub-types by deep learning (2018)

Casanova, R. ; Murina, Elvis ; Haberecker, M. ; Honcharova-Biletska, H. ; Vrugt, B. ; Dürr, Oliver ; Sick, Beate ; Soltermann, A.

Capturing suprasegmental features of a voice with RNNs for improved speaker clustering (2018)

Stadelmann, Thilo ; Glinski-Haefeli, Sebastian ; Gerber, Patrick ; Dürr, Oliver

Deep neural networks have become a veritable alternative to classic speaker recognition and clustering methods in recent years. However, while the speech signal clearly is a time series, and despite the body of literature on the benefits of prosodic (suprasegmental) features, identifying voices has usually not been approached with sequence learning methods. Only recently has a recurrent neural network (RNN) been successfully applied to this task, while the use of convolutional neural networks (CNNs), which unlike RNNs cannot capture arbitrary time dependencies, still prevails. In this paper, we show the effectiveness of RNNs for speaker recognition by improving state-of-the-art speaker clustering performance and robustness on the classic TIMIT benchmark. We provide arguments why RNNs are superior by experimentally showing a “sweet spot” of the segment length for successfully capturing prosodic information that has been theoretically predicted in previous work.

Learning neural models for end-to-end clustering (2018)

Meier, Benjamin Bruno ; Elezi, Ismail ; Amirian, Mohammadreza ; Dürr, Oliver ; Stadelmann, Thilo

We propose a novel end-to-end neural network architecture that, once trained, directly outputs a probabilistic clustering of a batch of input examples in one pass. It estimates a distribution over the number of clusters k, and for each 1 ≤ k ≤ k_max, a distribution over the individual cluster assignment for each data point. The network is trained in advance in a supervised fashion on separate data to learn grouping by any perceptual similarity criterion based on pairwise labels (same/different group). It can then be applied to different data containing different groups. We demonstrate promising performance on high-dimensional data like images (COIL-100) and speech (TIMIT). We call this “learning to cluster” and show its conceptual difference to deep metric learning, semi-supervised clustering and other related approaches while having the advantage of performing learnable clustering fully end-to-end.
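The shape of the output described in this abstract can be made concrete with a small sketch. The abstract does not state how a hard clustering is read off the two distributions, so the rule below (take the most probable k, then the most probable cluster per point) is an assumption, and `decode_clustering` and its argument names are hypothetical.

```python
def decode_clustering(p_k, assign):
    """Turn a probabilistic clustering into hard labels.

    Assumed layout (illustrative only):
      p_k[i]       -- probability that the batch contains i + 1 clusters
      assign[i][n] -- distribution of data point n over i + 1 clusters
    Returns the chosen number of clusters and one label per data point.
    """
    # Index of the most probable cluster count.
    best = max(range(len(p_k)), key=lambda i: p_k[i])
    # For that count, pick the most probable cluster for every point.
    labels = [
        max(range(best + 1), key=lambda c: dist[c])
        for dist in assign[best]
    ]
    return best + 1, labels
```

With k_max = 3 and three data points, `p_k = [0.1, 0.7, 0.2]` would select k = 2, and the labels would then be taken from the two-cluster assignment distributions only.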

Beyond ImageNet: deep learning in industrial practice (2019)

Stadelmann, Thilo ; Tolkachev, Vasily ; Sick, Beate ; Dürr, Oliver ; Stampfli, Jan

Bone erosion scoring for rheumatoid arthritis with deep convolutional neural networks (2019)

Rohrbach, Janick ; Reinhard, Tobias ; Sick, Beate ; Dürr, Oliver

Rheumatoid arthritis is an autoimmune disease that causes chronic inflammation of synovial joints, often resulting in irreversible structural damage. The activity of the disease is evaluated by clinical examinations, laboratory tests, and patient self-assessment. The long-term course of the disease is assessed with radiographs of hands and feet. The evaluation of the X-ray images performed by trained medical staff requires several minutes per patient. We demonstrate that deep convolutional neural networks can be leveraged for a fully automated, fast, and reproducible scoring of X-ray images of patients with rheumatoid arthritis. A comparison of the predictions of different human experts and our deep learning system shows that there is no significant difference in the performance of human experts and our deep learning model.

Dissecting multi-line handwriting for multi-dimensional connectionist classification (2020)

Schall, Martin ; Schambach, Marc-Peter ; Franz, Matthias O.

Multi-Dimensional Connectionist Classification (MDCC) is a method for weakly supervised training of Deep Neural Networks for segmentation-free multi-line offline handwriting recognition. MDCC applies Conditional Random Fields (CRFs) as an alignment function for this task. We discuss the structure and patterns of handwritten text that can be used for building a CRF. Since CRFs are cyclic graphical models, we have to resort to approximate inference when calculating the alignment of multi-line text during training, here in the form of Loopy Belief Propagation. This work concludes with experimental results for transcribing small multi-line samples from the IAM Offline Handwriting DB which show that MDCC is a competitive methodology.
