Testing Hypotheses on Simulated Data: Why Traditional Hypotheses-Testing Statistics Are Not Always Adequate for Simulated Data, and How to Modify Them
Author(s) -
Richard Aló,
Vladik Kreinovich,
Scott A. Starks
Publication year - 2006
Publication title -
Journal of Advanced Computational Intelligence and Intelligent Informatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.172
H-Index - 20
eISSN - 1883-8014
pISSN - 1343-0130
DOI - 10.20965/jaciii.2006.p0260
Subject(s) - statistic, statistics, computer science, statistical hypothesis testing, algorithm, mathematics
To check whether a new algorithm is better, researchers use traditional statistical techniques for hypotheses testing. In particular, when the results are inconclusive, they run more and more simulations ($n_2 > n_1$, $n_3 > n_2$, $\ldots$, $n_m > n_{m-1}$) until the results become conclusive. In this paper, we point out that these results may be misleading. Indeed, in the traditional approach, we select a statistic and then choose a threshold for which the probability of this statistic "accidentally" exceeding this threshold is smaller than, say, 1%. It is very easy to run additional simulations with ever-larger $n$. The probability of error is still 1% for each $n_i$, but the probability that we reach an erroneous conclusion for at least one of the values $n_i$ increases as $m$ increases. In this paper, we design new statistical techniques oriented towards experiments on simulated data, techniques that would guarantee that the error stays under, say, 1% no matter how many experiments we run.

I. HYPOTHESES TESTING: AN IMPORTANT APPLIED PROBLEM

One of the main uses of statistics is to compare two (or more) hypotheses. For example, we would like to check whether a new medical treatment is better than the previously known one.

Let us describe this problem in more precise terms. Usually, the efficiency of a method can be described by an appropriate numerical quantity $x$. For example, the efficiency of an anti-cholesterol medicine can be described by the average amount by which its use lowers the patient's originally high cholesterol level during a certain period of time. So, we arrive at the following problem:
• we know the average amount $\mu$ corresponding to the original hypothesis (e.g., the original treatment);
• we have the results $x_1, \ldots, x_n$ of the experiments with the new method.
Based on these results, we would like to check whether the new method is indeed better, i.e., whether for the new method, the mean value $\mu_x$ is larger than $\mu$.

II. HYPOTHESES TESTING: HOW IT IS CURRENTLY DONE

There are many known statistical methods for hypotheses testing; see, e.g., [5], [6]. One of these methods is as follows: we compute the population average $\bar{x} = \frac{x_1 + \ldots + x_n}{n}$ and the population standard deviation $s = \sqrt{\frac{1}{n-1} \cdot \sum_{i=1}^{n} (x_i - \bar{x})^2}$
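The concern raised in the abstract, that repeating a 1% test at ever-larger sample sizes inflates the overall chance of an erroneous conclusion, can be illustrated with a small simulation. The sketch below is not the paper's method. It assumes a statistic of the form $(\bar{x} - \mu)\sqrt{n}/s$ built from the average and standard deviation defined above, a threshold taken from the normal approximation, and normally distributed data for which the new method is in fact no better ($\mu_x = \mu = 0$). The function names and the particular sample sizes are hypothetical choices made for illustration.

```python
import math
import random
from statistics import NormalDist


def standardized_statistic(total, total_sq, n, mu):
    """Return (x_bar - mu) * sqrt(n) / s, computed from running sums.

    s uses the 1/(n - 1) normalization, matching the formula in the text.
    The choice of this particular statistic is an assumption of the sketch.
    """
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)
    return (mean - mu) * math.sqrt(n) / math.sqrt(variance)


def false_positive_rates(looks, trials, alpha=0.01, seed=0):
    """Simulate data with no real improvement and test at each n in `looks`.

    Returns a pair of observed frequencies:
      - how often the statistic exceeds the threshold at the first n only;
      - how often it exceeds the threshold at one or more of the n values.
    """
    rng = random.Random(seed)
    # One-sided threshold from the normal approximation (an assumption).
    threshold = NormalDist().inv_cdf(1 - alpha)
    first_hits = 0
    any_hits = 0
    for _ in range(trials):
        total = 0.0
        total_sq = 0.0
        n = 0
        exceeded = []
        for target in looks:
            # Extend the same simulation run until it has `target` results.
            while n < target:
                x = rng.gauss(0.0, 1.0)
                total += x
                total_sq += x * x
                n += 1
            stat = standardized_statistic(total, total_sq, n, mu=0.0)
            exceeded.append(stat > threshold)
        first_hits += exceeded[0]
        any_hits += any(exceeded)
    return first_hits / trials, any_hits / trials


if __name__ == "__main__":
    first_only, at_least_once = false_positive_rates(
        looks=[50, 100, 200, 400], trials=2000
    )
    print("exceeds threshold at first n only:", first_only)
    print("exceeds threshold at some n:      ", at_least_once)
```

By construction the second frequency can never be smaller than the first, since every run that exceeds the threshold at the first sample size is also counted in the second. Under these assumptions one would expect it to be noticeably larger, and to grow as more sample sizes are added to `looks`.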