When is a curve an outlier? An account of a tricky problem
Author(s) -
Manchester Lise,
Blanchard Wade
Publication year - 1996
Publication title -
Canadian Journal of Statistics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.804
H-Index - 51
eISSN - 1708-945X
pISSN - 0319-5724
DOI - 10.2307/3315327
Subject(s) - outlier, multivariate statistics, covariate, multivariate analysis, identification (biology), logistic regression, series (stratigraphy), statistics, computer science, set (abstract data type), data set, anomaly detection, data mining, mathematics, geology, paleontology, botany, biology, programming language
ABSTRACT The analysis of a set of data consisting of N short (≤20 observations each) multivariate time series, where the observations are irregularly spaced and where observations for the different components of each multivariate series are observed at different times, is discussed. With the increased use of automatic recording devices in many fields, data such as these, which are of course samples from smooth response curves, are becoming more common. In this application, which was a clinical trial comparing two cements for use in hip replacement surgery, the key to the analysis was in recognizing that the interest lay in the degree to which the five curves representing a patient's vital signs deviated from baseline (i.e., normal for that patient) during surgery. This enabled the statisticians to define appropriate response variables. The analysis included Rousseeuw's (1984) technique for the identification of multivariate outliers and logistic regressions to identify any effects on the process producing the outliers due to treatment or covariates.