Misplaced Pages

Yao's principle: Difference between revisions

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.
Browse history interactively← Previous editNext edit →Content deleted Content addedVisualWikitext
Revision as of 21:47, 14 November 2024 editDavid Eppstein (talk | contribs)Autopatrolled, Administrators226,410 edits Formulation: ce← Previous edit Revision as of 21:50, 14 November 2024 edit undoDavid Eppstein (talk | contribs)Autopatrolled, Administrators226,410 edits rm no-longer-used subscriptsNext edit →
Line 20: Line 20:
<math display=block> <math display=block>
\min_{A\in\mathcal{A}} \mathbb{E}_{x\sim D} \min_{A\in\mathcal{A}} \mathbb{E}_{x\sim D}
\le \max_{x\in\mathcal{X}_n} \mathbb{E}. \le \max_{x\in\mathcal{X}} \mathbb{E}.
</math> </math>
That is, the best possible deterministic performance against distribution <math>D</math> is a ] for the performance of any randomized algorithm <math>R</math> against its worst-case input. One may also observe this weaker version of Yao's principle directly, as That is, the best possible deterministic performance against distribution <math>D</math> is a ] for the performance of any randomized algorithm <math>R</math> against its worst-case input. One may also observe this weaker version of Yao's principle directly, as
Line 26: Line 26:
\min_{A\in\mathcal{A}} \mathbb{E}_{x\sim D} \min_{A\in\mathcal{A}} \mathbb{E}_{x\sim D}
\le \mathbb{E}_{x\sim D} \le \mathbb{E}_{x\sim D}
\le \max_{x\in\mathcal{X}_n} \mathbb{E}, \le \max_{x\in\mathcal{X}} \mathbb{E},
</math> </math>
by ] and the principle that <math>\min\le\mathbb{E}\le\max</math> for any distribution. By avoiding maximization and minimization over <math>\mathcal{D}</math> and <math>\mathcal{R}</math>, this version of Yao's principle can apply in some cases where <math>\mathcal{X}</math> or <math>\mathcal{A}</math> are not finite.{{r|borel}} by ] and the principle that <math>\min\le\mathbb{E}\le\max</math> for any distribution. By avoiding maximization and minimization over <math>\mathcal{D}</math> and <math>\mathcal{R}</math>, this version of Yao's principle can apply in some cases where <math>\mathcal{X}</math> or <math>\mathcal{A}</math> are not finite.{{r|borel}}

Revision as of 21:50, 14 November 2024

Bound on performance of random algorithms

In computational complexity theory, Yao's principle (also called Yao's minimax principle or Yao's lemma) relates the performance of randomized algorithms to deterministic (non-random) algorithms. It states that, for certain classes of algorithms, and certain measures of the performance of the algorithms, the following two quantities are equal:

  • The optimal performance that can be obtained by a deterministic algorithm on a random input, for a probability distribution on inputs chosen to be as hard as possible and for an algorithm chosen to work as well as possible against that distribution
  • The optimal performance that can be obtained by a random algorithm on a deterministic input, for an algorithm chosen to have the best performance on its worst case inputs, and the worst case input to the algorithm

Yao's principle is often used to prove limitations on the performance of randomized algorithms, by finding a probability distribution on inputs that is difficult for deterministic algorithms, and inferring that randomized algorithms have the same limitation on their worst case performance.

This principle is named after Andrew Yao, who first proposed it in a 1977 paper. It is closely related to the minimax theorem in the theory of zero-sum games, and to the duality theory of linear programs.

Formulation

Consider any cost measure c ( A , x ) {\displaystyle c(A,x)} of an algorithm A {\displaystyle A} on an input x {\displaystyle x} , such as its running time, for which we want to study the expected value over randomized algorithms and random inputs. Consider, also, any finite set A {\displaystyle {\mathcal {A}}} of deterministic algorithms (made finite, for instance, by limiting the algorithms to a specific input size), and a finite set X {\displaystyle {\mathcal {X}}} of inputs to these algorithms. Let R {\displaystyle {\mathcal {R}}} denote the class of of randomized algorithms obtained from probability distributions over the deterministic behaviors in A {\displaystyle {\mathcal {A}}} , and let D {\displaystyle {\mathcal {D}}} denote the class of probability distributions on inputs in X {\displaystyle {\mathcal {X}}} . Then, Yao's principle states that:

max D D min A A E x D [ c ( A , x ) ] = min R R max x X E [ c ( R , x ) ] . {\displaystyle \max _{D\in {\mathcal {D}}}\min _{A\in {\mathcal {A}}}\mathbb {E} _{x\sim D}=\min _{R\in {\mathcal {R}}}\max _{x\in {\mathcal {X}}}\mathbb {E} .}

Here, finiteness of A {\displaystyle {\mathcal {A}}} and X {\displaystyle {\mathcal {X}}} allows D {\displaystyle {\mathcal {D}}} and R {\displaystyle {\mathcal {R}}} to be interpreted as compact spaces, simplices of probability vectors, implying that the minima and maxima in these formulas exist.

A weaker form of the same principle, as an inequality rather than an equality, converts any hard input distribution for deterministic algorithms into a lower bound on the cost of all randomized algorithms. If D D {\displaystyle D\in {\mathcal {D}}} is any specific choice of a hard input distribution, and R {\displaystyle R} is any specific randomized algorithm in R {\displaystyle {\mathcal {R}}} , then min A A E x D [ c ( A , x ) ] max x X E [ c ( R , x ) ] . {\displaystyle \min _{A\in {\mathcal {A}}}\mathbb {E} _{x\sim D}\leq \max _{x\in {\mathcal {X}}}\mathbb {E} .} That is, the best possible deterministic performance against distribution D {\displaystyle D} is a lower bound for the performance of any randomized algorithm R {\displaystyle R} against its worst-case input. One may also observe this weaker version of Yao's principle directly, as min A A E x D [ c ( A , x ) ] E x D [ c ( R , x ) ] max x X E [ c ( R , x ) ] , {\displaystyle \min _{A\in {\mathcal {A}}}\mathbb {E} _{x\sim D}\leq \mathbb {E} _{x\sim D}\leq \max _{x\in {\mathcal {X}}}\mathbb {E} ,} by linearity of expectation and the principle that min E max {\displaystyle \min \leq \mathbb {E} \leq \max } for any distribution. By avoiding maximization and minimization over D {\displaystyle {\mathcal {D}}} and R {\displaystyle {\mathcal {R}}} , this version of Yao's principle can apply in some cases where X {\displaystyle {\mathcal {X}}} or A {\displaystyle {\mathcal {A}}} are not finite.

Applications and examples

Time complexity

When the cost c {\displaystyle c} denotes the running time of an algorithm, Yao's principle states that the best possible running time of a deterministic algorithm, on a hard input distribution, gives a lower bound for the expected time of any Las Vegas algorithm on its worst-case input. Here, a Las Vegas algorithm is a randomized algorithm whose runtime may vary, but for which the result is always correct. For example, this form of Yao's principle has been used to prove the optimality of certain Monte Carlo tree search algorithms for the exact evaluation of game trees.

Query complexity

One of the original applications by Yao of his principle was to the evasiveness of graph properties, the number of tests of the adjacency of pairs of vertices needed to determine whether a graph has a given property, when the only access to the graph is through such tests. As Yao showed, for graph properties that are true of the empty graph but false for some other graph on n {\displaystyle n} vertices with only a bounded number of edges, such as planarity, a randomized algorithm must probe a quadratic number of pairs of vertices. The proof uses Yao's principle, with a hard distribution of inputs consisting of permutations of the graph that does not have the property.

The complexity of comparison-based sorting and selection algorithms is often measured in the number of comparisons made between pairs of data elements. For these problems, an averaging argument identifies the hardest input distributions: they are the distributions on n {\displaystyle n} distinct elements for which all permutations are equally likely. Yao's principle extends lower bounds for the average case number of comparisons made by deterministic algorithms, for this input distribution, to the worst case analysis of randomized comparison algorithms.

In black-box optimization, the problem is to determine the minimum or maximum value of a function, from a given class of functions, accessible only through calls to the function on arguments from some finite domain. In this case, the cost to be optimized is the number of calls. Yao's principle has been described as "the only method available for proving lower bounds for all randomized search heuristics for selected classes of problems".

Communication complexity

In communication complexity, an algorithm describes a communication protocol between two or more parties, and its cost may be the number of bits or messages transmitted between the parties. In this case, Yao's principle describes an equality between the average-case complexity of deterministic communication protocols, on an input distribution that is the worst case for the problem, and the expected communication complexity of randomized protocols on their worst-case inputs.

Online algorithms

Yao's principle has also been applied to the competitive ratio of online algorithms. An online algorithm must respond to a sequence of requests, without knowledge of future requests, incurring some cost or profit per request depending on its choices. The competitive ratio is the ratio of its cost or profit to the value that could be achieved achieved by an offline algorithm with access to knowledge of all future requests, for a worst-case request sequence that causes this ratio to be as far from one as possible. Here, one must be careful to formulate the ratio with the algorithm's performance in the numerator and the optimal performance of an offline algorithm in the denominator, so that the cost measure can be formulated as an expected value rather than as the reciprocal of an expected value.

An example given by Borodin & El-Yaniv (2005) concerns page replacement algorithms, which respond to requests for pages of computer memory by using a cache of k {\displaystyle k} pages, for a given parameter k {\displaystyle k} . If a request matches a cached page, it costs nothing; otherwise one of the cached pages must be replaced by the requested page, at a cost of one page fault. A difficult distribution of request sequences for this model can be generated by choosing each request uniformly at random from a pool of k + 1 {\displaystyle k+1} pages. Any deterministic online algorith has n k + 1 {\displaystyle {\tfrac {n}{k+1}}} expected page faults, over n {\displaystyle n} requests. Instead, an offline algorithm can divide the request sequence into phases within which only k {\displaystyle k} pages are used, incurring only one fault at the start of a phase to replace the one page that is unused within the phase. As an instance of the coupon collector's problem, the expected requests per phase is ( k + 1 ) H k {\displaystyle (k+1)H_{k}} , where H k = 1 + 1 2 + + 1 k {\displaystyle H_{k}=1+{\tfrac {1}{2}}+\cdots +{\tfrac {1}{k}}} is the k {\displaystyle k} th harmonic number. By renewal theory, the offline algorithm incurs n ( k + 1 ) H k + o ( n ) {\displaystyle {\tfrac {n}{(k+1)H_{k}}}+o(n)} page faults with high probability, so the competitive ratio of any deterministic algorithm against this input distribution is at least H k {\displaystyle H_{k}} . By Yao's principle, H k {\displaystyle H_{k}} also lower bounds the competitive ratio of any randomized page replacement algorithm against a request sequence chosen by an oblivious adversary to be a worst case for the algorithm but without knowledge of the algorithm's random choices.

Relation to game theory and linear programming

Yao's principle may be interpreted in game theoretic terms, via a two-player zero-sum game in which one player, Alice, selects a deterministic algorithm, the other player, Bob, selects an input, and the payoff is the cost of the selected algorithm on the selected input. Any randomized algorithm R {\displaystyle R} may be interpreted as a randomized choice among deterministic algorithms, and thus as a mixed strategy for Alice. Similarly, a non-random algorithm may be thought of as a pure strategy for Alice. In any two-player zero-sum game, if one player fixes a fixed mixed strategy, then there is an optimal pure strategy that the other player can choose against it. By the minimax theorem of John von Neumann, there exists a game value c {\displaystyle c} , and mixed strategies for each player, such that the players can guarantee expected value c {\displaystyle c} or better by playing those strategies, and such that the optimal pure strategy against either mixed strategy produces expected value exactly c {\displaystyle c} . Thus, the minimax mixed strategy for Alice, set against the best opposing pure strategy for Bob, produces the same expected game value c {\displaystyle c} as the minimax mixed strategy for Bob, set against the best opposing pure strategy for Alice. This equality of expected game values, for the game described above, is Yao's principle in its form as an equality. Yao's 1977 paper, originally formulating Yao's principle, proved it in this way.

The optimal mixed strategy for Alice (a randomized algorithm) and the optimal mixed strategy for Bob (a hard input distribution) may each be computed using a linear program that has one player's probabilities as its variables, with a constraint on the game value for each choice of the other player. The two linear programs obtained in this way for each player are dual linear programs, whose equality is an instance of linear programming duality. However, although linear programs may be solved in polynomial time, the numbers of variables and constraints in these linear programs (numbers of possible algorithms and inputs) are typically too large to list explicitly. Therefore, formulating and solving these programs to find these optimal strategies is often impractical.

References

  1. ^ Arora, Sanjeev; Barak, Boaz (2009), "Note 12.8: Yao's Min-Max Lemma", Computational Complexity: A Modern Approach, Cambridge University Press, p. 265, ISBN 9780511530753
  2. ^ Yao, Andrew (1977), "Probabilistic computations: Toward a unified measure of complexity", Proceedings of the 18th IEEE Symposium on Foundations of Computer Science (FOCS), pp. 222–227, doi:10.1109/SFCS.1977.24
  3. ^ Borodin, Allan; El-Yaniv, Ran (2005), "8.3 Yao's principle: A technique for obtaining lower bounds", Online Computation and Competitive Analysis, Cambridge University Press, pp. 115–120, ISBN 9780521619462
  4. Moore, Cristopher; Mertens, Stephan (2011), "Theorem 10.1 (Yao's principle)", The Nature of Computation, Oxford University Press, p. 471, ISBN 9780199233212
  5. ^ Motwani, Rajeev; Raghavan, Prabhakar (2010), "Chapter 12: Randomized Algorithms", in Atallah, Mikhail J.; Blanton, Marina (eds.), Algorithms and Theory of Computation Handbook: General Concepts and Techniques (2nd ed.), CRC Press, pp. 12-1 – 12-24; see in particular Section 12.5: The minimax principle and lower bounds, pp. 12-8 – 12-10
  6. ^ Wegener, Ingo (2005), "9.2 Yao's minimax principle", Complexity Theory: Exploring the Limits of Efficient Algorithms, Springer-Verlag, pp. 118–120, doi:10.1007/3-540-27477-4, ISBN 978-3-540-21045-0, MR 2146155
  7. ^ Fortnow, Lance (October 16, 2006), "Favorite theorems: Yao principle", Computational Complexity
  8. Wigderson, Avi (2019), Mathematics and Computation: A Theory Revolutionizing Technology and Science, Princeton University Press, p. 210, ISBN 9780691189130
  9. Borodin & El-Yaniv (2005), pp. 120–122, 8.4 Paging revisited.

Further reading

  • Ben-David, Shalev; Blais, Eric (2023), "A new minimax theorem for randomized algorithms", Journal of the ACM, 70 (6) 38, arXiv:2002.10802, doi:10.1145/3626514, MR 4679504
  • de Graaf, Mart; de Wolf, Ronald (2002), "On quantum versions of the Yao principle", in Alt, Helmut; Ferreira, Afonso (eds.), STACS 2002, 19th Annual Symposium on Theoretical Aspects of Computer Science, Antibes – Juan les Pins, France, March 14–16, 2002, Proceedings, Lecture Notes in Computer Science, vol. 2285, Springer, pp. 347–358, arXiv:quant-ph/0109070, doi:10.1007/3-540-45841-7_28
Categories: