Log sum inequality

This is an old revision of this page, as edited by JoshuaZ (talk | contribs) at 23:44, 12 January 2023 (→Generalizations: Generalization due to Dannan, Neff, Thiel). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

The log sum inequality is used for proving theorems in information theory.

Statement

Let $a_{1},\ldots ,a_{n}$ and $b_{1},\ldots ,b_{n}$ be nonnegative numbers. Denote the sum of all $a_{i}$ by $a$ and the sum of all $b_{i}$ by $b$. The log sum inequality states that

$$\sum _{i=1}^{n}a_{i}\log {\frac {a_{i}}{b_{i}}}\geq a\log {\frac {a}{b}},$$

with equality if and only if the ratios ${\frac {a_{i}}{b_{i}}}$ are equal for all $i$, in other words $a_{i}=cb_{i}$ for all $i$ and some constant $c$.

(Take $a_{i}\log {\frac {a_{i}}{b_{i}}}$ to be $0$ if $a_{i}=0$ and $\infty$ if $a_{i}>0$ and $b_{i}=0$. These are the limiting values obtained as the relevant number tends to $0$.)
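
The statement, including the conventions for zero entries, can be checked numerically. The following is a minimal sketch rather than a reference implementation; the helper names `log_sum_term` and `log_sum_sides` and the sample values are illustrative assumptions, and the natural logarithm is used.

```python
import math

def log_sum_term(a_i, b_i):
    """One term a_i * log(a_i / b_i), using the limiting conventions from the statement."""
    if a_i == 0:
        return 0.0          # 0 * log(0 / b_i) is taken to be 0
    if b_i == 0:
        return math.inf     # a_i > 0 with b_i = 0 gives +infinity
    return a_i * math.log(a_i / b_i)

def log_sum_sides(a, b):
    """Return (left-hand side, right-hand side) of the log sum inequality."""
    lhs = sum(log_sum_term(x, y) for x, y in zip(a, b))
    rhs = log_sum_term(sum(a), sum(b))
    return lhs, rhs

# Hypothetical sample values, chosen only for illustration.
lhs, rhs = log_sum_sides([1.0, 2.0, 3.0], [2.0, 1.0, 1.0])
print(lhs >= rhs)

# Proportional sequences (a_i = 2 * b_i) give equality up to rounding.
eq_lhs, eq_rhs = log_sum_sides([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
print(math.isclose(eq_lhs, eq_rhs))
```

The right-hand side reuses `log_sum_term`, so the same conventions apply when $a=0$ or $b=0$.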

Proof

Notice that after setting $f(x)=x\log x$ we have

$${\begin{aligned}\sum _{i=1}^{n}a_{i}\log {\frac {a_{i}}{b_{i}}}&{}=\sum _{i=1}^{n}b_{i}f\left({\frac {a_{i}}{b_{i}}}\right)=b\sum _{i=1}^{n}{\frac {b_{i}}{b}}f\left({\frac {a_{i}}{b_{i}}}\right)\\&{}\geq bf\left(\sum _{i=1}^{n}{\frac {b_{i}}{b}}{\frac {a_{i}}{b_{i}}}\right)=bf\left({\frac {1}{b}}\sum _{i=1}^{n}a_{i}\right)=bf\left({\frac {a}{b}}\right)\\&{}=a\log {\frac {a}{b}},\end{aligned}}$$

where the inequality follows from Jensen's inequality since ${\frac {b_{i}}{b}}\geq 0$, $\sum _{i=1}^{n}{\frac {b_{i}}{b}}=1$, and $f$ is convex.
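
The convexity of $f$ used in the Jensen step can be verified directly. A short derivation, assuming the natural logarithm and $x>0$ (another base only rescales $f$ by a positive constant):

```latex
f(x) = x\log x, \qquad
f'(x) = \log x + 1, \qquad
f''(x) = \frac{1}{x} > 0 \quad \text{for } x > 0 .
```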

Generalizations

The inequality remains valid for $n=\infty$ provided that $a<\infty$ and $b<\infty$. The proof above holds for any function $g$ such that $f(x)=xg(x)$ is convex. Monotonicity alone does not guarantee this: $g(x)=\min(x,1)$ is continuous and non-decreasing, yet $xg(x)$ is not convex. Generalizations to non-decreasing functions other than the logarithm are given in Csiszár, 2004.
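
The role of $g$ can be illustrated numerically. The sketch below evaluates both sides of $\sum _{i}a_{i}g(a_{i}/b_{i})\geq a\,g(a/b)$ for two choices of $g$ for which $xg(x)$ is convex. The helper name `generalized_sides` and the sample values are assumptions made for this example, and strictly positive entries are assumed.

```python
import math

def generalized_sides(a, b, g):
    """Return (sum_i a_i * g(a_i / b_i), a * g(a / b)) for positive a_i, b_i."""
    lhs = sum(x * g(x / y) for x, y in zip(a, b))
    rhs = sum(a) * g(sum(a) / sum(b))
    return lhs, rhs

# Hypothetical sample values, chosen only for illustration.
a = [1.0, 2.0, 3.0]
b = [2.0, 1.0, 1.0]

# g(x) = x gives f(x) = x**2, which is convex.
sq_lhs, sq_rhs = generalized_sides(a, b, lambda x: x)
# g(x) = sqrt(x) gives f(x) = x**1.5, which is convex on x >= 0.
rt_lhs, rt_rhs = generalized_sides(a, b, math.sqrt)
print(sq_lhs >= sq_rhs, rt_lhs >= rt_rhs)
```

With $g(x)=x$ the inequality reads $\sum _{i}a_{i}^{2}/b_{i}\geq a^{2}/b$, a familiar consequence of the Cauchy-Schwarz inequality.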

Another generalization is due to Dannan, Neff and Thiel, who showed that if $a_{1},a_{2},\ldots ,a_{n}$ and $b_{1},b_{2},\ldots ,b_{n}$ are positive real numbers with $a_{1}+a_{2}+\cdots +a_{n}=a$ and $b_{1}+b_{2}+\cdots +b_{n}=b$, and $k\geq 0$, then $\sum _{i=1}^{n}a_{i}\log \left({\frac {a_{i}}{b_{i}}}+k\right)\geq a\log \left({\frac {a}{b}}+k\right)$.
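
A numerical spot check of this shifted form is sketched below. It reads the inequality with the shift $k$ added inside the logarithm on both sides, that is $\sum _{i}a_{i}\log(a_{i}/b_{i}+k)\geq a\log(a/b+k)$, which is the form obtained by applying Jensen's inequality to the convex function $x\log(x+k)$. That reading, the helper name `shifted_sides` and the sample values are assumptions of this example.

```python
import math

def shifted_sides(a, b, k):
    """Return both sides with the shift k >= 0 added inside each logarithm (assumed form)."""
    lhs = sum(x * math.log(x / y + k) for x, y in zip(a, b))
    rhs = sum(a) * math.log(sum(a) / sum(b) + k)
    return lhs, rhs

# Hypothetical sample values; k = 0 recovers the ordinary log sum inequality.
results = {}
for k in (0.0, 0.5, 2.0):
    lhs_k, rhs_k = shifted_sides([1.0, 2.0, 3.0], [2.0, 1.0, 1.0], k)
    results[k] = lhs_k >= rhs_k
    print(k, results[k])
```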

Applications

The log sum inequality can be used to prove inequalities in information theory. Gibbs' inequality states that the Kullback-Leibler divergence is non-negative, and equal to zero precisely if its arguments are equal. One proof uses the log sum inequality.

Proof
Let $P=(p_{i})_{i\in \mathbb {N} }$ and $Q=(q_{i})_{i\in \mathbb {N} }$ be probability mass functions (pmfs). In the log sum inequality, substitute $n=\infty$, $a_{i}=p_{i}$ and $b_{i}=q_{i}$ to get $\mathbb {D} _{\mathrm {KL} }(P\|Q)\equiv \sum _{i}p_{i}\log _{2}{\frac {p_{i}}{q_{i}}}\geq 1\log _{2}{\frac {1}{1}}=0$, with equality if and only if $p_{i}=q_{i}$ for all $i$ (as both $P$ and $Q$ sum to 1).
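
For finite pmfs, the non-negativity can be observed directly. The following sketch computes the divergence in bits; the function name `kl_divergence` and the two example distributions are assumptions chosen so that the arithmetic is exact.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(P || Q) in bits for finite pmfs p and q."""
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0:
            continue              # 0 * log(0 / q_i) is taken to be 0
        if q_i == 0:
            return math.inf       # p_i > 0 with q_i = 0 gives +infinity
        total += p_i * math.log2(p_i / q_i)
    return total

# Hypothetical example distributions.
p = [0.5, 0.25, 0.25]
q = [0.25, 0.25, 0.5]
print(kl_divergence(p, q))  # 0.25
print(kl_divergence(p, p))  # 0.0
```

The second call shows the equality case: the divergence is zero when the two arguments coincide.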

The log sum inequality can also be used to prove the convexity of the Kullback-Leibler divergence.
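
Convexity here means that $\mathbb {D} _{\mathrm {KL} }(\lambda P_{1}+(1-\lambda )P_{2}\|\lambda Q_{1}+(1-\lambda )Q_{2})\leq \lambda \mathbb {D} _{\mathrm {KL} }(P_{1}\|Q_{1})+(1-\lambda )\mathbb {D} _{\mathrm {KL} }(P_{2}\|Q_{2})$ for $0\leq \lambda \leq 1$. It follows by applying the log sum inequality with $n=2$ to each component. The sketch below checks one instance numerically; the helper names `kl` and `mix`, the distributions and the value of `lam` are illustrative assumptions.

```python
import math

def kl(p, q):
    """D(P || Q) in nats for finite pmfs with strictly positive entries."""
    return sum(x * math.log(x / y) for x, y in zip(p, q))

def mix(u, v, lam):
    """Componentwise mixture lam * u + (1 - lam) * v."""
    return [lam * x + (1 - lam) * y for x, y in zip(u, v)]

# Hypothetical pairs of distributions and mixing weight.
p1, q1 = [0.7, 0.3], [0.5, 0.5]
p2, q2 = [0.2, 0.8], [0.6, 0.4]
lam = 0.3

mixed = kl(mix(p1, p2, lam), mix(q1, q2, lam))
bound = lam * kl(p1, q1) + (1 - lam) * kl(p2, q2)
print(mixed <= bound)
```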

Notes

Cover & Thomas (1991), p. 29.
Dannan, F. M.; Neff, P.; Thiel, C. (2016). "On the sum of squared logarithms inequality and related inequalities" (PDF). Journal of Mathematical Inequalities. 10 (1). doi:10.7153/jmi-10-01. Retrieved 12 January 2023.
MacKay (2003), p. 34.
Cover & Thomas (1991), p. 30.

References

Cover, Thomas M.; Thomas, Joy A. (1991). Elements of Information Theory. Hoboken, New Jersey: Wiley. ISBN 978-0-471-24195-9.
Csiszár, I.; Shields, P. (2004). "Information Theory and Statistics: A Tutorial" (PDF). Foundations and Trends in Communications and Information Theory. 1 (4): 417–528. doi:10.1561/0100000004. Retrieved 2009-06-14.
Han, T. S.; Kobayashi, K. (2001). Mathematics of Information and Coding. American Mathematical Society. ISBN 0-8218-0534-7.
Information Theory course materials, Utah State University. Retrieved 2009-06-14.
MacKay, David J.C. (2003). Information Theory, Inference, and Learning Algorithms. Cambridge University Press. ISBN 0-521-64298-1.
