SMEP: Goodness of fit tests

Josef Perktold edited this page Jun 19, 2013 · 6 revisions

Status: partially implemented, largely planning

  • binning,
  • sup and square integral based tests
  • special cases, for example many tests for normality
  • smooth tests (Neyman, series expansion)
  • empirical characteristic function - nothing yet
  • graphical, visual "tests", qqplot

scipy, statsmodels

Kuiper test: an implementation by Anne exists, but it has no license.

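Based on Stephens' papers, which provide tables of critical values for the EDF statistics D+, D-, D (Kolmogorov-Smirnov), V (Kuiper), W2 (Cramér-von Mises), U2 (Watson), and A2 (Anderson-Darling).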
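The statistics listed above can all be computed from the sorted probability integral transforms u_i = F0(x_i). The following is a minimal sketch using the standard computing formulas; the function name `edf_statistics` is hypothetical and not part of scipy or statsmodels, and the hypothesized distribution is assumed to be fully specified.

```python
import numpy as np
from scipy import stats


def edf_statistics(u):
    """EDF statistics from u = F0(x), the data transformed by the null cdf.

    Hypothetical helper for illustration only.
    """
    u = np.sort(np.asarray(u, dtype=float))
    n = u.size
    i = np.arange(1, n + 1)

    d_plus = np.max(i / n - u)           # largest positive deviation
    d_minus = np.max(u - (i - 1) / n)    # largest negative deviation
    # Cramer-von Mises statistic
    w2 = np.sum((u - (2 * i - 1) / (2 * n)) ** 2) + 1 / (12 * n)
    # Watson statistic: W2 corrected for the mean of u
    u2 = w2 - n * (u.mean() - 0.5) ** 2
    # Anderson-Darling statistic; u[::-1] supplies u_(n+1-i)
    a2 = -n - np.mean((2 * i - 1) * (np.log(u) + np.log1p(-u[::-1])))

    return {
        "D+": d_plus,
        "D-": d_minus,
        "D": max(d_plus, d_minus),   # Kolmogorov-Smirnov
        "V": d_plus + d_minus,       # Kuiper
        "W2": w2,
        "U2": u2,
        "A2": a2,
    }


rng = np.random.default_rng(0)
x = rng.normal(size=200)
edf = edf_statistics(stats.norm.cdf(x))
for name, value in edf.items():
    print(f"{name}: {value:.4f}")
```

The value of D can be cross-checked against `scipy.stats.kstest(x, "norm").statistic`.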

If location and scale are estimated, but there are no shape parameters to be estimated, then the distribution of the test statistic does not depend on the true parameters. In this case, we can use tables for the distributions or exact or approximate formulas.
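The invariance can be checked numerically. This sketch assumes the normal case with mean and standard deviation estimated from the data; the helper name `ks_stat_estimated_normal` is hypothetical. The same underlying draws, shifted and rescaled, give the same Kolmogorov-Smirnov statistic once the parameters are re-estimated.

```python
import numpy as np
from scipy import stats


def ks_stat_estimated_normal(x):
    """KS statistic against a normal with estimated mean and std (illustrative)."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std(ddof=1)   # standardize with the estimates
    return stats.kstest(z, "norm").statistic


rng = np.random.default_rng(12345)
z = rng.standard_normal(50)

d_standard = ks_stat_estimated_normal(z)
d_shifted = ks_stat_estimated_normal(10.0 + 3.0 * z)   # different true loc and scale

print(d_standard, d_shifted)
```

Because the statistic is the same for any true location and scale, critical values can be tabulated or simulated once under the standard member of the family.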

tests with estimated location and scale, i.e. mean and standard deviation.

various tests in scipy and statsmodels: shapiro, omnibus, skew, kurtosis, ...
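As an illustration of what is already available on the scipy side, the sketch below calls four normality tests from `scipy.stats`. Mapping "omnibus" to `scipy.stats.normaltest` (the D'Agostino-Pearson test) is an assumption about which function is meant; the simulated data are for illustration only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=2.0, size=300)

# Each function returns a tuple-like result: (statistic, p-value).
results = {
    "shapiro": stats.shapiro(x),
    "omnibus (normaltest)": stats.normaltest(x),
    "skewtest": stats.skewtest(x),
    "kurtosistest": stats.kurtosistest(x),
}

for name, res in results.items():
    stat, pvalue = res[0], res[1]
    print(f"{name}: statistic={stat:.4f}, p-value={pvalue:.4f}")
```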

  • Anderson-Darling: in statsmodels.stats
  • Kolmogorov-Smirnov (Lilliefors): in statsmodels.stats

part of GOF class - Stephens,

Stephens 1986: chapter 4 in D'Agostino and Stephens; see also the SAS manual (?)

distributions:

  • normal distribution
  • exponential distribution
  • extreme value distribution
  • Weibull distribution
  • Gamma distribution (approximate)
  • Logistic distribution
  • Cauchy distribution
  • von Mises distribution

Stephens 1986 and other Stephens papers have tables of critical values for the cases where zero, one, or two parameters are estimated.

When shape parameters are also estimated, the asymptotic distribution is not nuisance parameter free.

P-values or critical values can be obtained through Monte Carlo simulation, and maybe parametric bootstrap.
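A minimal sketch of a parametric bootstrap p-value, assuming a gamma distribution with estimated shape and scale (location fixed at zero) and the Kolmogorov-Smirnov statistic. The function names, the number of replications, and the simulated data are illustrative assumptions, not an existing statsmodels API.

```python
import numpy as np
from scipy import stats


def ks_gamma_estimated(x):
    """KS statistic against a gamma distribution fitted by MLE (loc fixed at 0)."""
    a, loc, scale = stats.gamma.fit(x, floc=0)
    d = stats.kstest(x, stats.gamma(a, loc=loc, scale=scale).cdf).statistic
    return d, a, scale


def parametric_bootstrap_pvalue(x, n_rep=200, seed=0):
    """Simulate the null distribution of the statistic at the estimated parameters."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    d_obs, a_hat, scale_hat = ks_gamma_estimated(x)

    d_sim = np.empty(n_rep)
    for r in range(n_rep):
        # Draw from the fitted model and re-estimate on each replication.
        x_sim = rng.gamma(shape=a_hat, scale=scale_hat, size=x.size)
        d_sim[r] = ks_gamma_estimated(x_sim)[0]

    # Add-one correction keeps the p-value strictly positive.
    pvalue = (1 + np.sum(d_sim >= d_obs)) / (n_rep + 1)
    return d_obs, pvalue


rng = np.random.default_rng(1)
data = rng.gamma(shape=2.5, scale=1.5, size=100)
d_obs, pvalue = parametric_bootstrap_pvalue(data)
print(f"D = {d_obs:.4f}, bootstrap p-value = {pvalue:.3f}")
```

Re-estimating the parameters inside the loop is the essential step: comparing against the distribution of the statistic with known parameters would give p-values that are too large.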

chisquare test:

However, the parameters must be estimated by MLE from the binned data for the chi-square distribution to be the correct distribution of the test statistic.

see also special cases or applications below, at end

Standard for discrete distributions.

The main question is binning. For the chi-square distribution to be a good approximation to the distribution of the test statistic, the expected count in each bin should be at least 5 observations. There is also a rule that a given percentage of bins should have enough observations.
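One way to handle the binning for continuous data is to use bins of equal probability under the fitted distribution, so that every expected count is the same and can be kept at 5 or above. The sketch below assumes a normal distribution with both parameters estimated from the raw (unbinned) data; this is a simplification, since in that case the chi-square reference distribution with `ddof` equal to the number of estimated parameters is only approximate.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=200)

# Assumption: parameters estimated from the raw data, not from the binned data.
n_params = 2
mean_hat, std_hat = x.mean(), x.std(ddof=0)

# Choose the number of bins so that the expected count per bin is at least 5.
n_bins = min(x.size // 5, 20)
probs = np.linspace(0, 1, n_bins + 1)
inner_edges = stats.norm.ppf(probs[1:-1], loc=mean_hat, scale=std_hat)

# Bin index of each observation, then observed counts per bin.
observed = np.bincount(np.searchsorted(inner_edges, x), minlength=n_bins)
expected = np.full(n_bins, x.size / n_bins)

# ddof removes one degree of freedom per estimated parameter.
stat, pvalue = stats.chisquare(observed, expected, ddof=n_params)
print(f"chi2 = {stat:.3f}, df = {n_bins - 1 - n_params}, p-value = {pvalue:.3f}")
```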

Special case: multinomial distribution in contingency table, implemented in scipy.stats (Warren)

Status: implemented

A general class of discrete tests with asymptotic chi-square distributions that include chisquare test and likelihood ratio tests as special cases.
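The class described here is presumably the Cressie-Read power-divergence family; that identification is an assumption, since the text does not name it. As a sketch, `scipy.stats.power_divergence` exposes the family through its `lambda_` argument, with the Pearson chi-square test and the likelihood ratio (G) test as special cases. The counts below are made up for illustration.

```python
import numpy as np
from scipy import stats

# Illustrative counts only; expected counts are uniform over the five cells.
observed = np.array([18, 25, 32, 15, 10])
expected = np.full(observed.size, observed.sum() / observed.size)

results = {}
for lambda_ in ["pearson", "log-likelihood", "cressie-read"]:
    results[lambda_] = stats.power_divergence(observed, expected, lambda_=lambda_)
    stat, pvalue = results[lambda_]
    print(f"{lambda_}: statistic={stat:.4f}, p-value={pvalue:.4f}")

# The likelihood ratio statistic written out directly, for comparison.
g_stat = 2 * np.sum(observed * np.log(observed / expected))
```

All members of the family share the same asymptotic chi-square distribution under the null, so they differ mainly in small-sample behavior.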

Status: test implemented, but assumes binned data and a given expected number of observations

for multivariate case.

Since parameters are estimated, this should be taken into account in the distribution. Escanciano 2006 uses bootstrap.

I don't remember what I looked at; I have not read these two papers:

Escanciano, J. Carlos. 2006. “A Consistent Diagnostic Test for Regression Models Using Projections.” Econometric Theory 22 (06): 1030–1051. doi:10.1017/S0266466606060506.

Bierens, Herman J., and Li Wang. 2012. “Integrated Conditional Moment Tests for Parametric Conditional Distributions.” Econometric Theory 28 (02): 328–362. doi:10.1017/S0266466611000168.

GOF for generalized linear model, discrete models, single index models

STUTE, W. and ZHU, L.-X. (2002), Model Checks for Generalized Linear Models. Scandinavian Journal of Statistics, 29: 535–545. doi: 10.1111/1467-9469.00304 http://onlinelibrary.wiley.com/doi/10.1111/1467-9469.00304/abstract

and cited by http://scholar.google.ca/scholar?cites=15004922531789515297&as_sdt=2005&sciodt=0,5&hl=en

another example

Konstantinos Fokianos and Michael H. Neumann (2013), A goodness-of-fit test for Poisson count processes. Electron. J. Statist. Volume 7 (2013), 793-819. http://projecteuclid.org/DPubS?service=UI&version=1.0&verb=Display&handle=euclid.ejs/1364220671

Most or all of the tests above are omnibus tests that are consistent against all alternatives. In many applications, however, we are interested in a specific distribution as the alternative, for example Weibull against exponential (?) in lifetime estimation, or Poisson versus negative binomial for count data models.
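For the lifetime example, one possible directed test is a likelihood ratio test of the exponential distribution against the Weibull distribution, which nests it at shape parameter 1. This is a sketch of that idea only, not a method prescribed by this proposal; the function name is hypothetical, the location is assumed fixed at zero, and the chi-square reference distribution with one degree of freedom is the usual asymptotic approximation.

```python
import numpy as np
from scipy import stats


def lr_test_exponential_vs_weibull(x):
    """Likelihood ratio test of H0: exponential against H1: Weibull (loc = 0)."""
    x = np.asarray(x, dtype=float)

    # Null model: the MLE of the exponential scale is the sample mean.
    llf_null = stats.expon.logpdf(x, scale=x.mean()).sum()

    # Alternative model: Weibull with shape and scale estimated by MLE.
    c, loc, scale = stats.weibull_min.fit(x, floc=0)
    llf_alt = stats.weibull_min.logpdf(x, c, loc=loc, scale=scale).sum()

    # Guard against tiny negative values from optimizer inaccuracy.
    lr = max(2 * (llf_alt - llf_null), 0.0)
    pvalue = stats.chi2.sf(lr, df=1)
    return lr, pvalue, c


rng = np.random.default_rng(0)
lifetimes = 3.0 * rng.weibull(2.0, size=300)   # increasing hazard rate

lr, pvalue, shape_hat = lr_test_exponential_vs_weibull(lifetimes)
print(f"LR = {lr:.2f}, p-value = {pvalue:.4g}, estimated shape = {shape_hat:.2f}")
```

A test of this kind concentrates its power on one direction of departure (here, a non-constant hazard rate), at the cost of having little power against alternatives outside the Weibull family.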

examples: tests for symmetry, test for increasing or decreasing hazard rate, ...

specific functions

TableDist

GOF Class

Monte Carlo p-values

Bootstrap

Smoothed non-parametric distribution for the distribution of the test statistic and p-values

serial dependence of data:

Almost all tests above assume i.i.d. sampling. This might not be a problem when we look at residuals or innovations, but it will be in other cases with dependent data.

Paper for conditional Kolmogorov-Smirnov test with bootstrap p-values (which ?)

estimation of distributions:

Still a bit shaky in scipy; I will get some special cases from Stephens' papers.

D’Agostino, Ralph B., and Michael A. Stephens. 1986. Goodness-Of-Fit Techniques. M. Dekker.

Stephens, M. A. 1970. “Use of the Kolmogorov-Smirnov, Cramer-Von Mises and Related Statistics Without Extensive Tables.” Journal of the Royal Statistical Society. Series B (Methodological) 32 (1) (January 1): 115–122.

Stephens, M. A. 1974. “EDF Statistics for Goodness of Fit and Some Comparisons.” Journal of the American Statistical Association 69 (347): 730–737. doi:10.2307/2286009.

Stephens, M. A. 1986. “Tests Based on EDF Statistics.” In Goodness-Of-Fit Techniques, edited by Ralph B. D’Agostino and Michael A. Stephens, 97–193. New York: Marcel Dekker.
