Improving gradient-based LSTM training for offline handwriting recognition by careful selection of the optimization method
Recent years have seen the proposal of several gradient-based optimization methods for training artificial neural networks. Traditional methods include steepest descent with momentum; newer methods are based on per-parameter learning rates, and some approximate Newton-step updates. This work contains the results of several experiments comparing different optimization methods. The experiments targeted offline handwriting recognition using hierarchical subsampling networks with recurrent LSTM layers. We present an overview of the optimization methods used, the results achieved, and a discussion of why the methods lead to different results.
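The abstract names three optimizer families but not the specific methods evaluated. As a minimal sketch, assuming NumPy and illustrative representatives of each family (steepest descent with momentum, RMSProp as a per-parameter learning-rate method, and Adadelta as an approximate diagonal Newton step; the paper's actual choices may differ), the update rules look like this:

```python
import numpy as np

def sgd_momentum(w, g, state, lr=0.01, mu=0.9):
    """Steepest descent with momentum: a velocity term accumulates past gradients."""
    v = state.get("v", np.zeros_like(w))
    v = mu * v - lr * g
    state["v"] = v
    return w + v

def rmsprop(w, g, state, lr=0.01, rho=0.9, eps=1e-8):
    """Per-parameter learning rates: each step is scaled by a running RMS of gradients."""
    s = state.get("s", np.zeros_like(w))
    s = rho * s + (1 - rho) * g * g
    state["s"] = s
    return w - lr * g / (np.sqrt(s) + eps)

def adadelta(w, g, state, rho=0.95, eps=1e-6):
    """Adadelta: the ratio of two running RMS terms approximates a diagonal
    Newton step and removes the global learning rate."""
    s = state.get("s", np.zeros_like(w))  # running mean of squared gradients
    d = state.get("d", np.zeros_like(w))  # running mean of squared updates
    s = rho * s + (1 - rho) * g * g
    step = -np.sqrt(d + eps) / np.sqrt(s + eps) * g
    d = rho * d + (1 - rho) * step * step
    state.update(s=s, d=d)
    return w + step

# Toy usage: minimize the quadratic f(w) = ||w||^2, whose gradient is 2w.
# All three reduce the loss, but at markedly different speeds.
for opt in (sgd_momentum, rmsprop, adadelta):
    w, state = np.array([5.0, -3.0]), {}
    for _ in range(500):
        w = opt(w, 2.0 * w, state)
    print(opt.__name__, w)
```

Adadelta's ratio of RMS terms gives each parameter a step whose units match the parameter itself, which is why it is commonly described as an approximate Newton-step method; the visible speed differences on even this toy problem hint at why optimizer choice matters for the LSTM training compared in the paper.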
| Author: | Martin Schall, Marc-Peter Schambach, Matthias O. Franz |
|---|---|
| URN: | urn:nbn:de:bsz:ofb1-opus4-17866 |
| URL: | https://www.researchgate.net/publication/311439914_Improving_gradient-based_LSTM_training_for_offline_handwriting_recognition_by_careful_selection_of_the_optimization_method |
| ISBN: | 978-3-943301-21-2 |
| Parent Title (English): | 3rd Baden-Württemberg Center of Applied Research Symposium on Information and Communication Systems - SInCom 2016 - Karlsruhe, December 2nd, 2016 |
| Document Type: | Conference Proceeding |
| Language: | English |
| Year of Publication: | 2016 |
| Release Date: | 2018/11/20 |
| First Page: | 11 |
| Last Page: | 16 |
| Open Access?: | Yes |
| Relevance: | Not a peer-reviewed publication (scientific article or essay, proceeding, article in conference proceedings) |