Open Access
Automation of the process of selecting hyperparameters for artificial neural networks for processing retrospective text information
Author(s) -
Aleksey F. Rogachev,
Elena Melikhova
Publication year - 2020
Publication title -
IOP Conference Series: Earth and Environmental Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.179
H-Index - 26
eISSN - 1755-1307
pISSN - 1755-1315
DOI - 10.1088/1755-1315/577/1/012012
Subject(s) - hyperparameter, computer science, convolutional neural network, bayesian optimization, artificial intelligence, machine learning, artificial neural network, dropout (neural networks), process (computing), random search, speedup, automation, hyperparameter optimization, bayesian network, algorithm, support vector machine, programming language, mechanical engineering, engineering, operating system
Neural network technologies are successfully applied to problems in many sectors of the economy, including industry, agriculture, and medicine. The problem of substantiating the choice of architecture and hyperparameters of artificial neural networks (ANNs) aimed at various classes of applied problems stems from the need to improve both the quality and the speed of deep ANN training. Various methods of optimizing ANN hyperparameters are known, for example genetic algorithms, but these require writing additional software. To streamline hyperparameter selection, Google developed the KerasTuner toolkit, a user-friendly platform for automated search over combinations of hyperparameters. KerasTuner offers random search, Bayesian optimization, and Hyperband methods. In the numerical experiments, 14 hyperparameters were varied: the number of convolutional-layer blocks and the number of filters forming them, the type of activation functions, the parameters of the "dropout" regularization layers, and others. The studied tools demonstrated high optimization efficiency while simultaneously varying more than a dozen parameters of a convolutional network; the computation time on the Colaboratory platform for the ANN architectures studied was several hours, even with GPU accelerators. For ANNs aimed at processing and recognizing natural-language text (NLP), recognition quality was improved to 83-92%.
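The random-search strategy the abstract mentions can be sketched in plain Python. This is an illustrative sketch only: the search space, objective, and function names below are assumptions for demonstration, not taken from the paper; in KerasTuner itself, the same idea is driven by its `RandomSearch` tuner and an objective that trains the model and returns validation accuracy.

```python
import random

# Hypothetical hyperparameter space, loosely modeled on the kinds of
# parameters varied in the paper (conv blocks, filters, activation, dropout).
SEARCH_SPACE = {
    "conv_blocks": [1, 2, 3, 4],
    "filters": [32, 64, 128, 256],
    "activation": ["relu", "elu", "tanh"],
    "dropout": [0.1, 0.2, 0.3, 0.5],
}

def sample_config(rng):
    """Draw one random combination of hyperparameters."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}

def random_search(objective, n_trials=20, seed=0):
    """Evaluate n_trials random configs and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = sample_config(rng)
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

# Stand-in objective: in a real run this would build, train, and validate
# the CNN; here it just rewards larger filter counts and moderate dropout.
def dummy_objective(config):
    return config["filters"] / 256 - abs(config["dropout"] - 0.3)

best, score = random_search(dummy_objective, n_trials=50)
print(best, score)
```

Each trial is independent, which is why random search parallelizes trivially; Bayesian optimization and Hyperband improve on it by, respectively, modeling the objective and aggressively pruning weak trials early.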
