Open Access
Identification of Distribution Laws Using the Correlation Coefficient Using Python
Author(s) - D. Losikhin, Olga Oliynyk, Olena Chorna, Olena Gnatko
Publication year - 2018
Publication title - metrologìâ ta priladi
Language(s) - English
Resource type - Journals
eISSN - 2663-9564
pISSN - 2307-2180
DOI - 10.33955/2307-2180(6)2018.36-38
Subject(s) - mathematics, law, probability density function, gaussian, probability distribution, histogram, normal distribution, statistics, computer science, artificial intelligence, physics, quantum mechanics, political science, image (mathematics)
The article develops a new method for identifying distribution laws when evaluating the results of multiple measurements. Identifying the distribution law is an urgent metrological task, since the adopted restrictions on the number of measurements and the assumptions about the distribution law of the random error may introduce additional uncertainty into the assessment of the measurement result.
Well-known classical approaches to identifying distribution laws face several difficulties: they require a complete set of candidate models and the correct application of the corresponding statistical methods. Their main limitation is that they are designed for data processing systems based on the Gaussian (normal) distribution and are therefore not universal. Imperfect mathematical models for processing measurement information can lead to erroneous identification of the distribution law.
The paper proposes a method for identifying distribution laws for data that do not follow the Gaussian distribution. The model is based on calculating correlation coefficients for data with different distribution laws. The correlation coefficient estimates the proximity of probability density functions; it is calculated for pairs of different probability densities represented by histograms, treated as vectors in a multidimensional space whose orthonormal basis is formed by the unit sampling intervals. From the resulting matrix of correlation coefficients, unknown distribution laws are classified using experimental data from simulated samples. A listing of the software implementation of the model in Python is given.
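The idea of comparing histograms through a correlation coefficient can be illustrated with a minimal Python sketch. It is not the listing from the article, which is not reproduced here. The candidate set (normal, uniform, Laplace), the standardization step, the bin count, the histogram range, and all function names are assumptions made for illustration. The Pearson coefficient from `np.corrcoef` is also an assumption; the article may instead define the coefficient as a normalized inner product of the histogram vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical candidate set: each entry draws n values from one law.
CANDIDATES = {
    "normal": lambda n: rng.normal(size=n),
    "uniform": lambda n: rng.uniform(-1.0, 1.0, size=n),
    "laplace": lambda n: rng.laplace(size=n),
}

def histogram_vector(sample, bins=30, value_range=(-4.0, 4.0)):
    """Standardize the sample and return its histogram as a density vector."""
    sample = np.asarray(sample, dtype=float)
    z = (sample - sample.mean()) / sample.std()
    density, _ = np.histogram(z, bins=bins, range=value_range, density=True)
    return density

def correlation_matrix(vectors):
    """Pairwise Pearson correlation coefficients between histogram vectors."""
    return np.corrcoef(np.vstack(vectors))

def identify_law(sample, references):
    """Return the best-matching law name and the score for every candidate."""
    h = histogram_vector(sample)
    scores = {name: float(np.corrcoef(h, ref)[0, 1])
              for name, ref in references.items()}
    return max(scores, key=scores.get), scores

# Reference histograms built from large simulated samples (assumed size).
references = {name: histogram_vector(draw(20000))
              for name, draw in CANDIDATES.items()}
```

A call such as `identify_law(measurements, references)` returns the candidate whose reference histogram correlates most strongly with the histogram of `measurements`, together with the score for each candidate. Because the samples are standardized first, the comparison depends on the shape of the distribution rather than on its location or scale.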