Asymptotic theory (statistics)

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.
Study of convergence properties of statistical estimators

In statistics, asymptotic theory, or large sample theory, is a framework for assessing properties of estimators and statistical tests. Within this framework, it is often assumed that the sample size n may grow indefinitely; the properties of estimators and tests are then evaluated under the limit of n → ∞. In practice, a limit evaluation is considered to be approximately valid for large finite sample sizes too.

Overview

Most statistical problems begin with a dataset of size n. Asymptotic theory proceeds by assuming that it is possible (in principle) to keep collecting additional data, so that the sample size grows without bound, i.e. n → ∞. Under this assumption, many results can be obtained that are unavailable for samples of finite size. An example is the weak law of large numbers. The law states that for a sequence of independent and identically distributed (IID) random variables X1, X2, ..., if one value is drawn from each random variable and the average of the first n values is computed as X̄n, then the X̄n converge in probability to the population mean E[Xi] as n → ∞.
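The weak law of large numbers can be illustrated with a small simulation. The sketch below is only an illustration under assumed settings that are not from the article: Uniform(0, 1) draws (population mean 0.5), a tolerance of 0.05, and a hypothetical helper name `sample_mean_deviation_rate`. It estimates the probability that the sample mean misses the population mean by more than the tolerance, which should shrink as n grows.

```python
import random


def sample_mean_deviation_rate(n, mu=0.5, eps=0.05, reps=2000, seed=0):
    """Estimate P(|mean of n draws - mu| > eps) for IID Uniform(0, 1) draws."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    exceed = 0
    for _ in range(reps):
        mean_n = sum(rng.random() for _ in range(n)) / n
        if abs(mean_n - mu) > eps:
            exceed += 1
    return exceed / reps


# Assumed sample sizes, chosen only for illustration.
rates = {n: sample_mean_deviation_rate(n) for n in (10, 100, 1000)}
for n, rate in rates.items():
    print(f"n={n:5d}  estimated P(|mean - 0.5| > 0.05) = {rate:.3f}")
```

The estimated probabilities fall toward zero as n increases, which is what convergence in probability asserts for any fixed tolerance.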

In asymptotic theory, the standard approach is n → ∞. For some statistical models, slightly different approaches of asymptotics may be used. For example, with panel data, it is commonly assumed that one dimension in the data remains fixed, whereas the other dimension grows: T = constant and N → ∞, or vice versa.

Besides the standard approach to asymptotics, several alternative approaches exist:

  • Within the local asymptotic normality framework, it is assumed that the value of the "true parameter" in the model varies slightly with n, such that the n-th model corresponds to θn = θ + h/√n. This approach lets us study the regularity of estimators.
  • When statistical tests are studied for their power to distinguish against the alternatives that are close to the null hypothesis, it is done within the so-called "local alternatives" framework: the null hypothesis is H0: θ = θ0 and the alternative is H1: θ = θ0 + h/√n. This approach is especially popular for the unit root tests.
  • There are models where the dimension of the parameter space Θn slowly expands with n, reflecting the fact that the more observations there are, the more structural effects can be feasibly incorporated in the model.
  • In kernel density estimation and kernel regression, an additional parameter is assumed: the bandwidth h. In those models, it is typically taken that h → 0 as n → ∞. The rate at which h shrinks must be chosen carefully, though; usually h ∝ n^(−1/5).
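The "local alternatives" idea in the list above can be made concrete with a case where power has a closed form. The sketch below assumes a one-sided z-test of H0: θ = θ0 for a normal mean with known standard deviation; this test, the function name `z_test_power`, and the values h = 2 and δ = 0.2 are illustrative assumptions, not taken from the article. Against a fixed alternative θ0 + δ the power tends to 1, which says little about the test, while against the local alternative θ0 + h/√n the power stays constant in n.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for quantiles and CDF values


def z_test_power(n, shift, sigma=1.0, alpha=0.05):
    """Exact power of the one-sided z-test of H0: theta = theta0
    when the true mean is theta0 + shift and sigma is known."""
    critical = STD_NORMAL.inv_cdf(1 - alpha)
    # Under the alternative, the test statistic is N(sqrt(n) * shift / sigma, 1).
    return 1 - STD_NORMAL.cdf(critical - sqrt(n) * shift / sigma)


h, delta = 2.0, 0.2  # assumed local and fixed departures from the null
for n in (25, 100, 400, 1600):
    local = z_test_power(n, h / sqrt(n))  # alternative theta0 + h / sqrt(n)
    fixed = z_test_power(n, delta)        # alternative theta0 + delta
    print(f"n={n:5d}  local power={local:.4f}  fixed power={fixed:.4f}")
```

Because the local power does not degenerate to 0 or 1, it gives a meaningful way to compare tests in the limit.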

In many cases, highly accurate results for finite samples can be obtained via numerical methods (i.e. computers); even in such cases, though, asymptotic analysis can be useful. This point was made by Small (2010, §1.4), as follows.

A primary goal of asymptotic analysis is to obtain a deeper qualitative understanding of quantitative tools. The conclusions of an asymptotic analysis often supplement the conclusions which can be obtained by numerical methods.

Modes of convergence of random variables

Further information: Convergence of random variables

Asymptotic properties

Estimators

Consistency

A sequence of estimates is said to be consistent if it converges in probability to the true value of the parameter being estimated:

$$\hat{\theta}_n \ \xrightarrow{p}\ \theta_0.$$

That is, roughly speaking, as the amount of data grows without bound, the estimator (the formula for generating the estimates) gives, with probability approaching one, a result arbitrarily close to the true value of the parameter being estimated.
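Consistency can be checked directly in a case where the miss probability has a closed form. The sketch below uses an assumed example that is not from the article: IID Uniform(0, θ0) data with the sample maximum as the estimator of θ0, for which P(|θ̂n − θ0| > ε) = (1 − ε/θ0)^n when 0 < ε < θ0. The function names and the values θ0 = 1 and ε = 0.05 are hypothetical choices for illustration.

```python
import random


def miss_probability_exact(n, theta0=1.0, eps=0.05):
    """Exact P(|max of n draws - theta0| > eps) for IID Uniform(0, theta0)."""
    return (1 - eps / theta0) ** n


def miss_probability_simulated(n, theta0=1.0, eps=0.05, reps=5000, seed=1):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    misses = 0
    for _ in range(reps):
        estimate = max(rng.uniform(0, theta0) for _ in range(n))
        if abs(estimate - theta0) > eps:
            misses += 1
    return misses / reps


for n in (10, 50, 200):
    exact = miss_probability_exact(n)
    simulated = miss_probability_simulated(n)
    print(f"n={n:4d}  exact={exact:.4f}  simulated={simulated:.4f}")
```

Both columns shrink toward zero as n grows, which is the defining property of a consistent sequence of estimates.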

Asymptotic distribution

If it is possible to find sequences of non-random constants {an}, {bn} (possibly depending on the value of θ0), and a non-degenerate distribution G such that

$$b_n(\hat{\theta}_n - a_n) \ \xrightarrow{d}\ G,$$

then the sequence of estimators $\hat{\theta}_n$ is said to have the asymptotic distribution G.
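The limit G need not be normal. As a worked illustration that is not part of the original article, assume X1, ..., Xn are IID Uniform(0, θ0) and take the sample maximum as the estimator. The definition is then satisfied with an = θ0, bn = n, and G equal to the distribution of −W, where W is exponential with mean θ0 (the symbol W is introduced here only for this example):

```latex
\begin{align*}
\hat{\theta}_n &= \max(X_1, \dots, X_n),
  \qquad X_i \overset{\text{IID}}{\sim} \mathrm{Uniform}(0, \theta_0), \\
P\bigl(n(\hat{\theta}_n - \theta_0) \le -t\bigr)
  &= P\Bigl(\hat{\theta}_n \le \theta_0 - \tfrac{t}{n}\Bigr)
   = \Bigl(1 - \frac{t}{n\theta_0}\Bigr)^{n}
   \;\longrightarrow\; e^{-t/\theta_0},
  \qquad 0 \le t \le n\theta_0, \\
n(\hat{\theta}_n - \theta_0) &\ \xrightarrow{d}\ -W,
  \qquad W \sim \mathrm{Exponential}(\text{mean } \theta_0).
\end{align*}
```

Here the scaling is n rather than √n, so the estimation error shrinks faster than in the asymptotically normal case.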

Most often, the estimators encountered in practice are asymptotically normal, meaning their asymptotic distribution is the normal distribution, with an = θ0, bn = √n, and G = N(0, V):

$$\sqrt{n}(\hat{\theta}_n - \theta_0) \ \xrightarrow{d}\ \mathcal{N}(0, V).$$
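Asymptotic normality can be checked by simulation. The sketch below rests on assumptions that are not from the article: the estimator is the sample mean of IID Exponential(1) draws, so θ0 = 1 and V = 1, and the helper name `scaled_errors` and the choices n = 200 and 4000 replications are illustrative. It computes the scaled errors √n(θ̂n − θ0) and compares their mean, variance, and central 95% coverage with those of N(0, 1).

```python
import random
from math import sqrt
from statistics import fmean, pvariance


def scaled_errors(n, reps=4000, seed=2):
    """Return reps draws of sqrt(n) * (sample mean - theta0)
    for IID Exponential(rate=1) data, where theta0 = 1 and V = 1."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    theta0 = 1.0
    return [
        sqrt(n) * (fmean(rng.expovariate(1.0) for _ in range(n)) - theta0)
        for _ in range(reps)
    ]


errors = scaled_errors(200)
mean_err = fmean(errors)
var_err = pvariance(errors)
# Share of scaled errors inside the central 95% interval of N(0, 1).
coverage = sum(abs(e) <= 1.96 for e in errors) / len(errors)
print(f"mean={mean_err:.3f}  variance={var_err:.3f}  coverage={coverage:.3f}")
```

A mean near 0, a variance near 1, and coverage near 0.95 are consistent with the normal limit, although a simulation at one sample size cannot by itself establish convergence.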

Asymptotic confidence regions

Asymptotic theorems

See also

References

  1. Höpfner, R. (2014), Asymptotic Statistics, Walter de Gruyter. 286 pp. ISBN 3110250241, ISBN 978-3110250244
  2. DasGupta, A. (2008), Asymptotic Theory of Statistics and Probability, Springer. ISBN 0387759700, ISBN 978-0387759708

Bibliography
