BRS-inequality


BRS-inequality is the short name for the Bruss-Robertson-Steele inequality. This inequality gives a convenient upper bound for the expected maximum number of non-negative random variables one can sum up without exceeding a given upper bound $s > 0$.

For example, suppose 100 random variables $X_1, X_2, \dots, X_{100}$ are all uniformly distributed on $[0,1]$, not necessarily independent, and let $s = 10$, say. Let $N[100,10]$ be the maximum number of the $X_j$ one can select from $\{X_1, X_2, \dots, X_{100}\}$ such that their sum does not exceed $s = 10$. $N[100,10]$ is a random variable, so what can one say about bounds for its expectation? How would an upper bound for $E(N[n,s])$ behave if one changes the size $n$ of the sample and keeps $s$ fixed, or alternatively, if one keeps $n$ fixed but varies $s$? What can one say about $E(N[n,s])$ if the uniform distribution is replaced by another continuous distribution? In all generality, what can one say if each $X_k$ may have its own continuous distribution function $F_k$?

General problem

Let $X_1, X_2, \dots$ be a sequence of non-negative random variables (possibly dependent) that are jointly continuously distributed. For $n \in \{1, 2, \dots\}$ and $s \in \mathbb{R}^+$, let $N[n,s]$ be the maximum number of observations among $X_1, X_2, \dots, X_n$ that one can sum up without exceeding $s$.

Now, to obtain $N[n,s]$, one may think of looking at the list of all observations, first selecting the smallest one, then adding the second smallest, then the third, and so on, as long as the accumulated sum does not exceed $s$. Hence $N[n,s]$ can be defined in terms of the increasing order statistics of $X_1, X_2, \dots, X_n$, denoted by $X_{1,n} \le X_{2,n} \le \cdots \le X_{n,n}$, namely by

$$N[n,s] = \begin{cases} 0, & \text{if } X_{1,n} > s,\\ \max\{\, k \in \mathbb{N} : X_{1,n} + X_{2,n} + \cdots + X_{k,n} \le s \,\}, & \text{otherwise.} \end{cases} \qquad (1)$$
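The greedy construction described above translates directly into code. The following sketch (an illustrative helper, not taken from the cited papers) computes $N[n,s]$ from a sample by sorting and accumulating:

```python
def brs_count(xs, s):
    """Return N[n, s]: the maximum number of values from xs that can be
    summed without exceeding s, built by adding order statistics from below."""
    total = 0.0
    count = 0
    for x in sorted(xs):  # increasing order statistics X_{1,n} <= ... <= X_{n,n}
        if total + x > s:
            break  # adding the next-smallest observation would exceed s
        total += x
        count += 1
    return count
```

For instance, `brs_count([0.5, 0.4, 0.3], 1.0)` returns 2, since $0.3 + 0.4 \le 1$ but $0.3 + 0.4 + 0.5 > 1$.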

What is the best possible general upper bound for $E(N[n,s])$ if one requires only the continuity of the joint distribution of all variables? And then, how can this bound be computed?

Identically distributed random variables

Theorem 1. Let $X_1, X_2, \dots, X_n$ be identically distributed non-negative random variables with absolutely continuous distribution function $F$. Then

$$E(N[n,s]) \le n\,F(t), \qquad (2)$$

where $t := t(n,s)$ solves the so-called BRS-equation

$$n \int_0^t x \, dF(x) = s. \qquad (3)$$

As an example, here are the answers to the questions posed at the beginning. Thus let all $X_1, X_2, \dots, X_n$ be uniformly distributed on $[0,1]$. Then $F(t) = t$ on $[0,1]$, and hence $dF(x)/dx = 1$ on $[0,1]$. The BRS-equation becomes

$$n \int_0^t x \, dx = \frac{n t^2}{2} = s.$$

The solution is $t = \sqrt{2s/n}$, and thus from inequality (2)

$$E(N[n,s]) \le n\,F(t) = n\sqrt{2s/n} = \sqrt{2sn}. \qquad (4)$$

Since one always has $N[n,s] \le n$, this bound becomes trivial for $s \ge nE(X_1) = n/2$.

For the example questions with $n = 100$, $s = 10$, this yields $E(N[100,10]) \le \sqrt{2000} \approx 44.7$. As one sees from (4), doubling the sample size $n$ while keeping $s$ fixed, or vice versa, yields the same upper bound for the uniform distribution in the non-trivial case.
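A small Monte Carlo experiment makes bound (4) concrete for the opening example ($n = 100$, $s = 10$). The sampler below is an illustrative sketch; the trial count and seed are arbitrary choices:

```python
import math
import random

def brs_count(xs, s):
    # greedy: accumulate order statistics from below while the sum stays <= s
    total, count = 0.0, 0
    for x in sorted(xs):
        if total + x > s:
            break
        total += x
        count += 1
    return count

random.seed(1)
n, s, trials = 100, 10.0, 2000
est = sum(brs_count([random.random() for _ in range(n)], s)
          for _ in range(trials)) / trials
bound = math.sqrt(2 * s * n)  # = sqrt(2000), i.e. bound (4)
```

The empirical mean stays below the bound, consistent with (4); as noted in the Source section, the bound is asymptotically tight for independent variables.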

Generalised BRS-inequality

Theorem 2. Let $X_1, X_2, \dots, X_n$ be positive random variables that are jointly distributed such that $X_k$ has an absolutely continuous distribution function $F_k$, $k = 1, 2, \dots, n$. If $N[n,s]$ is defined as before, then

$$E(N[n,s]) \le \sum_{k=1}^{n} F_k(t), \qquad (5)$$

where $t := t(n,s)$ is the unique solution of the BRS-equation

$$\sum_{k=1}^{n} \int_0^t x \, dF_k(x) = s. \qquad (6)$$

Clearly, if all random variables $X_i$, $i = 1, 2, \dots, n$, have the same marginal distribution $F$, then (6) recaptures (3), and (5) recaptures (2). Again it should be pointed out that no independence hypothesis whatsoever is needed.
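For non-identical marginals the BRS-equation (6) usually has no closed-form solution, but its left side is continuous and increasing in $t$, so it is easy to solve numerically whenever $s$ is below the total mean. The sketch below is a hypothetical example (rates chosen purely for illustration): it uses the closed form $\int_0^t x\,dF_k(x) = (1 - (1 + \lambda_k t)e^{-\lambda_k t})/\lambda_k$ for exponential marginals with rates $\lambda_k$, and bisection for $t$:

```python
import math

def exp_partial_mean(lam, t):
    # integral_0^t x dF(x) for F = Exp(lam): (1 - (1 + lam*t) e^{-lam*t}) / lam
    return (1.0 - (1.0 + lam * t) * math.exp(-lam * t)) / lam

def solve_brs_equation(lams, s):
    # bisection for t in the BRS-equation (6): sum_k integral_0^t x dF_k(x) = s;
    # the left side increases in t, so the root is unique (assumes s < sum of means)
    def g(t):
        return sum(exp_partial_mean(l, t) for l in lams)
    lo, hi = 0.0, 1.0
    while g(hi) < s:  # enlarge the bracket until it contains the root
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if g(mid) < s:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# hypothetical mix: 50 Exp(1) and 50 Exp(2) variables, s = 10
lams = [1.0] * 50 + [2.0] * 50
t = solve_brs_equation(lams, 10.0)
bound = sum(1.0 - math.exp(-l * t) for l in lams)  # right-hand side of (5)
```

Once $t$ is found, the right-hand side of (5) is simply $\sum_k F_k(t)$, computed in the last line.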

Refinements of the BRS-inequality

Depending on the type of the distributions $F_k$, refinements of Theorem 2 can be of real interest. We mention just one of them.

Let $A[n,s]$ be the random set of those variables one can sum up to yield the maximum random number $N[n,s]$, that is,

$$\#A[n,s] = N[n,s],$$

and let $S_{A[n,s]}$ denote the sum of these variables. The so-called residual $s - S_{A[n,s]}$ is by definition always non-negative, and one has:

Theorem 3. Let $X_1, X_2, \dots, X_n$ be jointly continuously distributed with marginal distribution functions $F_k$, $k = 1, 2, \dots, n$, and let $t := t(n,s)$ be the solution of (6). Then

$$E(N[n,s]) \le \left( \sum_{k=1}^{n} F_k(t(n,s)) \right) - \frac{s - E(S_{A[n,s]})}{t(n,s)}. \qquad (7)$$

The improvement in (7) compared with (5) therefore consists of the term

$$\frac{s - E(S_{A[n,s]})}{t(n,s)}.$$

The expected residual in the numerator is typically difficult to compute or estimate, but there exist nice exceptions. For example, if all $X_k$ are independent exponential random variables, then the memoryless property implies (if $s$ is exceeded) the distributional symmetry of the residual and the overshoot over $s$. For fixed $s$ one can then show that

$$\frac{s - E(S_{A[n,s]})}{t(n,s)} \to \frac{1}{2} \text{ as } n \to \infty.$$

The improvement fluctuates around $1/2$, and the convergence to $1/2$ seems quick in simulations.
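This limit for the i.i.d. exponential case can itself be checked by simulation. The sketch below (sample size, trial count, and seed are arbitrary choices) solves the BRS-equation for Exp(1) variables by bisection and estimates the expected residual:

```python
import math
import random

def brs_residual_ratio(n, s, trials, seed=0):
    """Estimate (s - E(S_A[n,s])) / t(n,s) for n i.i.d. Exp(1) variables."""
    # t solves n * (1 - (1 + t) * exp(-t)) = s, the BRS-equation (3) for Exp(1)
    lo, hi = 0.0, 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if n * (1.0 - (1.0 + mid) * math.exp(-mid)) < s:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    rng = random.Random(seed)
    total_resid = 0.0
    for _ in range(trials):
        xs = sorted(rng.expovariate(1.0) for _ in range(n))
        acc = 0.0
        for x in xs:  # greedy sum of order statistics, as in (1)
            if acc + x > s:
                break
            acc += x
        total_resid += s - acc  # residual s - S_A[n,s]
    return (total_resid / trials) / t

ratio = brs_residual_ratio(n=100, s=10.0, trials=2000)
```

In line with the statement above, the estimated ratio should already hover near $1/2$ for moderate $n$.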

Source

The first version of the BRS-inequality (Theorem 1) was proved in Lemma 4.1 of F. Thomas Bruss and James B. Robertson (1991). This paper proves moreover that the upper bound is asymptotically tight if the random variables are independent of each other. The generalisation to arbitrary continuous distributions (Theorem 2) is due to J. Michael Steele (2016). Theorem 3 and other refinements of the BRS-inequality are more recent and proved in Bruss (2021).

Applications

The BRS-inequality is a versatile tool since it applies to many types of problems, and since solving the BRS-equation is often not very involved. Also, and in particular, one notes that the maximum number $N[n,s]$ always dominates the maximum number of selections under any additional constraint, such as, e.g., for online selections without recall. Examples studied in Steele (2016) and Bruss (2021) touch many applications, including comparisons between i.i.d. and non-i.i.d. sequences, problems of condensing point processes, "awkward" processes, selection algorithms, knapsack problems, Borel-Cantelli-type problems, the Bruss-Duerinckx theorem, and online tiling strategies.

References

Bruss F. T. and Robertson J. B. (1991) ’Wald's Lemma’ for Sums of Order Statistics of i.i.d. Random Variables, Adv. Appl. Probab., Vol. 23, 612-623.

Bruss F. T. and Duerinckx M. (2015), Resource dependent branching processes and the envelope of societies, Ann. of Appl. Probab., Vol. 25 (1), 324-372.

Steele J.M. (2016), The Bruss-Robertson Inequality: Elaborations, Extensions, and Applications, Math. Applicanda, Vol. 44, No 1, 3-16.

Bruss F. T. (2021), The BRS-inequality and its applications, Probab. Surveys, Vol. 18, 44-76.
