
Control variates

Technique for increasing the precision of estimates in Monte Carlo experiments

The control variates method is a variance reduction technique used in Monte Carlo methods. It exploits information about the errors in estimates of known quantities to reduce the error of an estimate of an unknown quantity.

Underlying principle

Let the unknown parameter of interest be μ, and assume we have a statistic m such that its expected value is μ, that is, E[m] = μ; i.e. m is an unbiased estimator for μ. Suppose we calculate another statistic t such that E[t] = τ is a known value. Then

m^{\star} = m + c\,(t - \tau)

is also an unbiased estimator for μ for any choice of the coefficient c, since E[m⋆] = E[m] + c(E[t] − τ) = μ. The variance of the resulting estimator m⋆ is

\operatorname{Var}(m^{\star}) = \operatorname{Var}(m) + c^{2}\,\operatorname{Var}(t) + 2c\,\operatorname{Cov}(m, t).

By differentiating the above expression with respect to c and setting the derivative, 2c Var(t) + 2 Cov(m, t), to zero, it can be shown that choosing the optimal coefficient

c^{\star} = -\frac{\operatorname{Cov}(m, t)}{\operatorname{Var}(t)}

minimizes the variance of m⋆. (Note that this coefficient is the negative of the slope obtained from a least-squares linear regression of m on t.) With this choice,

\operatorname{Var}(m^{\star}) = \operatorname{Var}(m) - \frac{\left[\operatorname{Cov}(m, t)\right]^{2}}{\operatorname{Var}(t)} = \left(1 - \rho_{m,t}^{2}\right)\operatorname{Var}(m)

where

\rho_{m,t} = \operatorname{Corr}(m, t)

is the correlation coefficient of m and t. The greater the value of |ρ_{m,t}|, the greater the variance reduction achieved.

In the case that Cov(m, t), Var(t), and/or ρ_{m,t} are unknown, they can be estimated across the Monte Carlo replicates. This is equivalent to fitting a least-squares regression of m on t across those replicates (see the sketch below); for this reason the technique is also known as regression sampling.
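As a sketch of this regression-sampling step (in Python with NumPy; the function name and interface are illustrative, not part of the article), the coefficient can be estimated from paired per-replicate samples of m and t together with the known mean τ:

    import numpy as np

    def control_variate_estimate(m_samples, t_samples, tau):
        """Control-variate estimate of mu from paired Monte Carlo replicates.

        m_samples: per-replicate values whose mean estimates the unknown mu
        t_samples: per-replicate values of the control variate t
        tau:       known expectation E[t]
        """
        m_samples = np.asarray(m_samples, dtype=float)
        t_samples = np.asarray(t_samples, dtype=float)
        # Sample estimates of Cov(m, t) and Var(t) across the replicates.
        cov_mt = np.cov(m_samples, t_samples, ddof=1)[0, 1]
        var_t = np.var(t_samples, ddof=1)
        # Estimated optimal coefficient c* = -Cov(m, t) / Var(t),
        # i.e. minus the least-squares slope of m regressed on t.
        c_star = -cov_mt / var_t
        # Adjusted estimate: mean(m) + c* (mean(t) - tau).
        return m_samples.mean() + c_star * (t_samples.mean() - tau)

Because the coefficient is estimated from the same samples it is then applied to, the resulting estimator carries a small bias, which becomes negligible as the number of replicates grows.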

When the expectation of the control variable, E[t] = τ, is not known analytically, it is still possible to increase the precision in estimating μ (for a given fixed simulation budget), provided that two conditions are met: 1) evaluating t is significantly cheaper than computing m; 2) the magnitude of the correlation coefficient |ρ_{m,t}| is close to unity.

Example

We would like to estimate

I = \int_{0}^{1} \frac{1}{1+x}\,\mathrm{d}x

using Monte Carlo integration. This integral is the expected value of f(U), where

f(U) = \frac{1}{1+U}

and U follows a uniform distribution on [0, 1]. Using a sample of size n, denote the points in the sample as u_1, …, u_n. Then the estimate is given by

I \approx \frac{1}{n}\sum_{i} f(u_{i}).

Now we introduce g(U) = 1 + U as a control variate with a known expected value E[g(U)] = ∫₀¹ (1 + x) dx = 3/2 and combine the two into a new estimate

I \approx \frac{1}{n}\sum_{i} f(u_{i}) + c\left(\frac{1}{n}\sum_{i} g(u_{i}) - \frac{3}{2}\right).

Using n = 1500 realizations and an estimated optimal coefficient c⋆ ≈ 0.4773, we obtain the following results:

Method               Estimate   Variance
Classical estimate   0.69475    0.01947
Control variates     0.69295    0.00060

The variance was significantly reduced after using the control variates technique. (The exact result is I = ln 2 ≈ 0.69314718.)
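For concreteness, the example can be reproduced with a short simulation. A minimal Python/NumPy sketch is shown below; the random seed is arbitrary, so the printed values will not match the table digit for digit, and the printed variances are the per-draw sample variances of the plain and adjusted integrands (which is what the Variance column above appears to report):

    import numpy as np

    rng = np.random.default_rng(0)   # seed chosen only for reproducibility
    n = 1500
    u = rng.uniform(0.0, 1.0, size=n)

    f = 1.0 / (1.0 + u)   # integrand; E[f(U)] = ln 2
    g = 1.0 + u           # control variate; E[g(U)] = 3/2 exactly

    # Estimated optimal coefficient c* = -Cov(f, g) / Var(g); it comes out
    # positive (around 0.48) because f and g are negatively correlated.
    c_star = -np.cov(f, g, ddof=1)[0, 1] / np.var(g, ddof=1)

    classical = f.mean()
    adjusted = f + c_star * (g - 1.5)   # per-draw control-variate values
    controlled = adjusted.mean()

    print("classical estimate       :", classical, " variance:", np.var(f, ddof=1))
    print("control-variate estimate :", controlled, " variance:", np.var(adjusted, ddof=1))
    print("exact value ln 2         :", np.log(2.0))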


Notes

  1. Lemieux, C. (2017). "Control Variates". Wiley StatsRef: Statistics Reference Online: 1–8. doi:10.1002/9781118445112.stat07947. ISBN 9781118445112.
  2. Glasserman, P. (2004). Monte Carlo Methods in Financial Engineering. New York: Springer. ISBN 0-387-00451-3, p. 185.
  3. Botev, Z.; Ridder, A. (2017). "Variance Reduction". Wiley StatsRef: Statistics Reference Online: 1–6. doi:10.1002/9781118445112.stat07975. ISBN 9781118445112.
