Combining homogeneous groups of preclassified observations with application to international trade
Author(s) - Cerasa Andrea
Publication year - 2016
Publication title - Statistica Neerlandica
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.52
H-Index - 39
eISSN - 1467-9574
pISSN - 0039-0402
DOI - 10.1111/stan.12086
Subject(s) - representativeness heuristic , homogeneous , dimension (graph theory) , computer science , monte carlo method , database transaction , regression , transaction cost , data mining , cluster analysis , econometrics , mathematical optimization , mathematics , statistics , economics , machine learning , database , microeconomics , combinatorics , pure mathematics
This article proposes three methods for merging homogeneous clusters of observations that are grouped according to a pre‐existing (known) classification. This clusterwise regression problem is particularly relevant to the analysis of international trade data, where transaction prices can be grouped according to the corresponding origin–destination combination. A proper merging of these prices could simplify the analysis of the market without affecting the representativeness of the data and highlight commercial anomalies that may hide fraud. The three proposed algorithms are based on an iterative application of the F‐test and have the advantage of being extremely flexible: they do not require the number of final clusters to be predetermined, and their output depends only on a tuning parameter. Monte Carlo results show very good performance for all the procedures, whereas the application to two empirical data sets demonstrates the practical utility of the proposed methods for reducing the dimension of the market and isolating suspicious commercial behaviors.
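The general idea of merging preclassified groups by repeated F‐tests can be illustrated with a minimal sketch. It is not a reproduction of any of the article's three algorithms; it is one plausible greedy variant under stated assumptions. Each group is assumed to follow a regression through the origin (traded value = price × quantity), two groups are compared with a Chow‐type F statistic for equality of slopes, and the closest pair is merged until the smallest statistic exceeds a threshold. All names (`fit_origin`, `f_stat`, `merge_groups`, `f_crit`), the sample data, and the use of a single fixed critical value as the tuning parameter are hypothetical choices made for illustration.

```python
from itertools import combinations


def fit_origin(points):
    """Least-squares slope through the origin and residual sum of squares."""
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    slope = sxy / sxx
    rss = sum((y - slope * x) ** 2 for x, y in points)
    return slope, rss


def f_stat(a, b):
    """F statistic for H0: groups a and b share one slope (1 restriction)."""
    rss_sep = fit_origin(a)[1] + fit_origin(b)[1]   # separate slopes
    rss_pooled = fit_origin(a + b)[1]               # common slope
    df = len(a) + len(b) - 2
    if rss_sep == 0.0:                              # guard: perfect fits
        return 0.0 if rss_pooled == 0.0 else float("inf")
    return (rss_pooled - rss_sep) / (rss_sep / df)


def merge_groups(groups, f_crit):
    """Greedily merge the most similar pair until its F exceeds f_crit.

    f_crit is the tuning parameter of this sketch (assumed fixed here; a
    significance level would imply a critical value that varies with df).
    """
    clusters = {(name,): list(pts) for name, pts in groups.items()}
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda p: f_stat(clusters[p[0]], clusters[p[1]]))
        if f_stat(clusters[a], clusters[b]) > f_crit:
            break                                   # no homogeneous pair left
        merged = clusters.pop(a) + clusters.pop(b)
        clusters[tuple(sorted(a + b))] = merged
    return clusters


# Hypothetical (quantity, value) transactions per origin-destination pair.
groups = {
    "CN->DE": [(1, 2.1), (2, 3.9), (3, 6.05), (4, 7.95), (5, 10.1)],
    "CN->FR": [(1, 1.9), (2, 4.1), (3, 5.95), (4, 8.1), (5, 9.9)],
    "VN->DE": [(1, 3.6), (2, 6.9), (3, 10.55), (4, 13.95), (5, 17.6)],
}
result = merge_groups(groups, f_crit=5.0)
for names, pts in sorted(result.items()):
    print(names, "slope = %.3f" % fit_origin(pts)[0])
```

In this toy data the first two flows have nearly identical unit prices and are merged, while the third stays separate. Note that the number of final clusters is not fixed in advance: it follows from the threshold alone, which mirrors the flexibility described in the abstract.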