Open Access
Automation of the process of selecting hyperparameters for artificial neural networks for processing retrospective text information
Author(s) -
Aleksey F. Rogachev,
Elena Melikhova
Publication year - 2020
Publication title -
IOP Conference Series: Earth and Environmental Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.179
H-Index - 26
eISSN - 1755-1307
pISSN - 1755-1315
DOI - 10.1088/1755-1315/577/1/012012
Subject(s) - hyperparameter, computer science, convolutional neural network, bayesian optimization, artificial intelligence, machine learning, artificial neural network, dropout (neural networks), process (computing), random search, speedup, automation, hyperparameter optimization, bayesian network, algorithm, support vector machine, programming language, mechanical engineering, engineering, operating system
Neural network technologies are successfully applied to problems in many sectors of the economy, including industry, agriculture, and medicine. The problem of substantiating the choice of architecture and hyperparameters of artificial neural networks (ANNs) aimed at various classes of applied problems stems from the need to improve both the quality and the speed of deep ANN training. Various methods of optimizing ANN hyperparameters are known, for example genetic algorithms, but these require writing additional software. To streamline hyperparameter selection, Google developed the KerasTuner toolkit, a user-friendly platform for automated search over combinations of hyperparameters. KerasTuner offers random search, Bayesian optimization, and Hyperband methods. In the numerical experiments, 14 hyperparameters were varied: the number of convolutional-layer blocks and the number of filters forming them, the type of activation functions, the parameters of the "dropout" regularization layers, and others. The studied tools demonstrated high optimization efficiency while simultaneously varying more than a dozen parameters of a convolutional network; the computation time on the Colaboratory platform for the ANN architectures studied was several hours, even with GPU accelerators. For ANNs aimed at processing and recognizing natural-language text (NLP), recognition quality was improved to 83-92%.
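The random-search strategy the abstract mentions can be sketched in plain Python. This is an illustrative sketch only: the search space, objective, and function names below are assumptions for demonstration, not taken from the paper; in KerasTuner itself, the same idea is driven by its `RandomSearch` tuner and an objective that trains the model and returns validation accuracy.

```python
import random

# Hypothetical hyperparameter space, loosely modeled on the kinds of
# parameters varied in the paper (conv blocks, filters, activation, dropout).
SEARCH_SPACE = {
    "conv_blocks": [1, 2, 3, 4],
    "filters": [32, 64, 128, 256],
    "activation": ["relu", "elu", "tanh"],
    "dropout": [0.1, 0.2, 0.3, 0.5],
}

def sample_config(rng):
    """Draw one random combination of hyperparameters."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}

def random_search(objective, n_trials=20, seed=0):
    """Evaluate n_trials random configs and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = sample_config(rng)
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

# Stand-in objective: in a real run this would build, train, and validate
# the CNN; here it just rewards larger filter counts and moderate dropout.
def dummy_objective(config):
    return config["filters"] / 256 - abs(config["dropout"] - 0.3)

best, score = random_search(dummy_objective, n_trials=50)
print(best, score)
```

Each trial is independent, which is why random search parallelizes trivially; Bayesian optimization and Hyperband improve on it by, respectively, modeling the objective and aggressively pruning weak trials early.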
