
Reliability of PET activation across statistical methods, subject groups, and sample sizes
Author(s) - Grabowski T. J., Frank R. J., Brown C. K., Damasio H., Ponto L. L. Boles, Watkins G. L., Hichwa R. D.
Publication year - 1996
Publication title - Human Brain Mapping
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.005
H-Index - 191
eISSN - 1097-0193
pISSN - 1065-9471
DOI - 10.1002/(sici)1097-0193(1996)4:1<23::aid-hbm2>3.0.co;2-r
Subject(s) - smoothing, computer science, noise (video), variance (accounting), pattern recognition (psychology), sample size determination, false positive paradox, nonparametric statistics, statistics, artificial intelligence, reliability (semiconductor), sample (material), task (project management), type i and type ii errors, outlier, mathematics, power (physics), image (mathematics), physics, chemistry, accounting, management, chromatography, quantum mechanics, business, economics
Four pixel‐based methods for estimating regional activation in positron emission tomography (PET) images were implemented so as to allow the comparison of their performances in the same dataset. Change distribution analysis, Worsley's method, a pixelwise general linear model, a nonparametric method, and several methods derived from them were investigated. Important technical factors, including the degree of smoothing, stereotactic transform, coregistration algorithm, search volume, and the volumetric alpha level, were held constant. The dataset, which was obtained with a verb generation paradigm, was large enough to permit assessment of concordance between independent samples of conventional size, as well as assessment of within‐cohort replicability. (Eighteen normal subjects performed four GENERATE‐READ pairs each.) Same‐task (noise) images were also analyzed. In noise datasets, type I errors (false positives) occurred at the nominal rate (in 5% of datasets). Detected regions of activation were highly likely to be internally replicated (93%). The detected activations were a superset of activations previously reported using the same paradigm. The methods were chiefly distinguished by type II error rates and by the stability of the location of activation clusters. Those methods dependent on local variance estimates were less powerful with small sample sizes and less stable with respect to the attributed location of task‐induced changes. The use of pooled variance (Worsley's method) reduced these problems, but variance was not stationary. Overall, the power of all analyses was modest with samples of conventional size (nine subjects × one or two task‐pairs). Modeling of the sources of variance, particularly improvement of anatomical standardization, is likely to improve the power of pixel‐based analyses. © 1996 Wiley‐Liss, Inc.
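
The abstract's contrast between local and pooled variance estimates can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the flat `(subjects, pixels)` array layout, and the synthetic data are all assumptions made for illustration, and the pooled version is a simplified reading of the pooled-variance idea (one variance averaged over all pixels in the search region).

```python
import numpy as np


def local_variance_t(diffs):
    """Paired t statistic per pixel, using each pixel's own variance estimate.

    diffs: array of shape (n_subjects, n_pixels) holding task-minus-control
    difference images (hypothetical layout, one row per subject).
    """
    n = diffs.shape[0]
    mean = diffs.mean(axis=0)
    sd = diffs.std(axis=0, ddof=1)  # separate estimate at every pixel
    return mean / (sd / np.sqrt(n))


def pooled_variance_t(diffs):
    """Same numerator, but one variance estimate averaged over all pixels.

    Simplified illustration of pooling; it assumes the variance is the same
    everywhere in the search region.
    """
    n = diffs.shape[0]
    mean = diffs.mean(axis=0)
    pooled_sd = np.sqrt(diffs.var(axis=0, ddof=1).mean())
    return mean / (pooled_sd / np.sqrt(n))


# Synthetic example: nine subjects, noise only, plus one "activated" pixel.
rng = np.random.default_rng(0)
diffs = rng.normal(size=(9, 1000))
diffs[:, 0] += 2.0  # hypothetical task-induced change at pixel 0

t_local = local_variance_t(diffs)
t_pooled = pooled_variance_t(diffs)
print("max |t|, noise pixels, local variance: ", np.abs(t_local[1:]).max())
print("max |t|, noise pixels, pooled variance:", np.abs(t_pooled[1:]).max())
```

With only nine subjects, each per-pixel standard deviation is itself a noisy estimate, so the local statistic has heavier tails than the pooled one; this is one way to read the abstract's remark that local-variance methods were less powerful and less stable at small sample sizes. The pooled version depends on the variance being uniform across pixels, which the abstract reports was not the case in this dataset.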