A hybrid neural network for input that is both categorical and quantitative
Author(s) - Brouwer Roelof K.
Publication year - 2004
Publication title - International Journal of Intelligent Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.291
H-Index - 87
eISSN - 1098-111X
pISSN - 0884-8173
DOI - 10.1002/int.20032
Subject(s) - categorical variable, continuous variable, variable (mathematics), computer science, function (biology), artificial neural network, metric (unit), sigmoid function, mathematics, subnetwork, pattern recognition (psychology), artificial intelligence, algorithm, mathematical optimization, machine learning, mathematical analysis, operations management, computer security, evolutionary biology, economics, biology
Abstract The data on which an MLP (multilayer perceptron) is normally trained to approximate a continuous function may include inputs that are categorical in addition to the numeric or quantitative inputs. Examples of categorical variables are gender, race, and so on. An approach examined in this article is to train a hybrid network consisting of an MLP and an encoder with multiple output units; that is, a separate output unit for each of the various combinations of values of the categorical variables. Input to the feedforward subnetwork of the hybrid network is then restricted to truly numerical quantities. An MLP with connection matrices that multiply input values and sigmoid functions that further transform values represents a continuous mapping in all input variables. An MLP therefore requires that all inputs correspond to numeric, continuously valued variables. A categorical variable, on the other hand, produces a discontinuous relationship between an input variable and the output. This problem is often dealt with by replacing the categorical values with numeric ones and treating them as if they were continuously valued. However, there is no meaningful correspondence between the continuous quantities generated this way and the original categorical values. The basic difficulty with using these numeric substitutes is that they define a metric for the categories that may not be reasonable. This suggests that the categorical inputs should be segregated from the continuous inputs as explained above. Results show that the method utilizing a hybrid network and separating categorical from quantitative input, as discussed here, is quite effective. © 2004 Wiley Periodicals, Inc. Int J Int Syst 19: 979–1001, 2004.
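
The metric problem described in the abstract can be illustrated with a minimal Python sketch. It is not the article's implementation: the variables `colour` and `size`, their values, and the helper names `encode` and `integer_code` are hypothetical and chosen only for illustration. The sketch compares a naive integer replacement of a categorical value with an encoder that has one output unit per combination of categorical values.

```python
import itertools
import numpy as np

# Hypothetical categorical variables, used only for illustration.
CATEGORIES = {
    "colour": ["red", "green", "blue"],
    "size": ["small", "large"],
}

# One encoder output unit per combination of categorical values (3 * 2 = 6).
COMBINATIONS = list(itertools.product(*CATEGORIES.values()))
UNIT_INDEX = {combo: i for i, combo in enumerate(COMBINATIONS)}


def encode(colour, size):
    """Return a vector with a single active unit for this combination."""
    units = np.zeros(len(COMBINATIONS))
    units[UNIT_INDEX[(colour, size)]] = 1.0
    return units


def integer_code(colour):
    """Naive numeric replacement: position of the value in its list."""
    return float(CATEGORIES["colour"].index(colour))


# Integer coding imposes an arbitrary metric: red ends up twice as far
# from blue as from green, although the categories have no such ordering.
d_red_green = abs(integer_code("red") - integer_code("green"))
d_red_blue = abs(integer_code("red") - integer_code("blue"))

# With one unit per combination, all distinct combinations are equidistant.
a = encode("red", "small")
b = encode("green", "small")
c = encode("blue", "large")
d_ab = np.linalg.norm(a - b)
d_ac = np.linalg.norm(a - c)
```

How the encoder outputs are combined with the feedforward subnetwork is not specified in the abstract, so the sketch stops at the encoding step; the numeric inputs would be passed to the MLP separately.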