Suppose there is a coin that shows heads with an unknown probability,
Specifically, the only functions that can be simulated this way are continuous and polynomially bounded on their domain, and map
This page contains several questions about the Bernoulli factory problem. Answers to them will greatly improve my pages on this site about Bernoulli factories. If you can answer any of them, post an issue in the GitHub issues page.
Note: The Bernoulli factory problem is a special case of a more general mathematical problem that I call "The Sampling Problem".
- Background
- Contents
- Polynomials that approach a factory function "fast"
- Other Questions
- End Notes
- Notes
This question involves solving the Bernoulli factory problem with polynomials.2
In this question, a polynomial
where
The degree-$n$ Bernstein polynomial of an arbitrary function
Suppose
- (Exact Bernoulli factory): Compute the Bernstein coefficients of a sequence of polynomials (
$g_n$ ) of degree 2, 4, 8, ...,$2^i$ , ... that converge to$f$ from below and satisfy:$(g_{2n}-g_{n})$ is a polynomial with nonnegative Bernstein coefficients once it's rewritten to a polynomial in Bernstein form of degree exactly$2n$ . (See Note 3 in "End Notes".) Assume$0\lt f(\lambda)\lt 1$ or$f$ is polynomially bounded. - (Approximate Bernoulli factory): Given
$\epsilon > 0$ , compute the Bernstein coefficients of a polynomial or rational function (of some degree$n$ ) that is within$\epsilon$ of$f$ . - (Series expansion of simple functions): Find a nonnegative random variable
$X$ and a series$f(\lambda)=\sum_{a\ge 0}\gamma_a(\lambda)$ such that$\gamma_a(\lambda)/\mathbb{P}(X=a)$ (letting 0/0 equal 0) is a polynomial or rational function with rational Bernstein coefficients lying in$[0, 1]$ . (See note 1 in "End Notes".)
The convergence rate must be
An algorithm (Łatuszyński et al. 2009/2011)3 simulates a factory function
- Generate U, a uniform random variate in
$[0, 1]$ . - Flip the input coin (with a probability of heads of
$\lambda$ ), then build an upper and lower bound for$f(\lambda)$ , based on the outcomes of the flips so far. In this case, these bounds come from two degree-$n$ polynomials that approach$f$ as$n$ gets large, where$n$ is the number of coin flips so far in the algorithm. - If U is less than or equal to the lower bound, return 1. If U is greater than the upper bound, return 0. Otherwise, go to step 2.
The result of the algorithm is 1 with probability exactly equal to
However, the algorithm requires the polynomial sequences to meet certain requirements; among them, the sequences must be of Bernstein-form polynomials that converge from above and below to a factory function. Specifically:
For $f(\lambda)$ there must be a sequence of polynomials (
However, ordinary Bernstein polynomials converge to a function at the rate
But Lorentz (1966)7 showed that if the function is positive and has a continuous
Thus, researchers have developed alternatives, including linear combinations and iterated Boolean sums of Bernstein polynomials, to improve the convergence rate. These include Micchelli (1973)8, Guan (2009)9, Güntürk and Li (2021a)10, (2021b)11, the "Lorentz operator" in Holtz et al. (2011)5, Draganov (2014), and Tachev (2022)12.
These alternative polynomials usually include results where the error bound is the desired
The following is a conjecture that could help reduce this problem to the problem of finding explicit error bounds when approximating a function by polynomials.
Let
For each integer
whenever
Equivalently (see also Nacu and Peres (2005)4), there is
My goal is to see not just whether this conjecture is true, but also which value of
The following are some strategies for answering these questions:
-
For iterated Boolean sums (linear combinations of iterates) of Bernstein polynomials (
$U_{n,k}$ in Micchelli 1973; see also Güntürk and Li), verify my proofs of these bounds in Propositions B10C and B10D. -
For linear combinations of Bernstein polynomials (Butzer (1953)13, Tachev 2022), verify my proof of those error bounds in my Proposition B10.
-
For the "Lorentz operator" (Holtz et al. 2011)5, find explicit bounds, with no hidden constants, on the approximation error for the operator
$Q_{n,r}(f)$ and for the polynomials$(f_n)$ and$(g_n)$ formed with it, and find the hidden constants$\theta_\alpha$ ,$s$ , and$D$ as well as those in Lemmas 15, 17 to 22, 24, and 25 in the paper. Or verify my proof of the order-2 operator's error bounds in my Proposition B10A. The bounds should have the form$C\cdot\max((\lambda(1-\lambda)/n)^{1/2}, 1/n)^r$ , where$C$ is an explicitly given constant depending only on$f$ and$r$ . -
Let
$f:[-1,1]\to [0,1]$ be continuous. Find explicit bounds, with no hidden constants, on the error in approximating$f$ with the following polynomials: The polynomials are similar to Chebyshev interpolants, but evaluate$f$ at rational values of$\lambda$ that converge to Chebyshev or Legendre points (for example, converging to$\cos(j\pi/n)$ with increasing$n$ ). The error bounds must be close to those of Chebyshev interpolants (see, for example, chapters 7, 8, and 12 of Trefethen, Approximation Theory and Approximation Practice, 2013). -
Find other polynomial operators meeting the requirements of the main question (see "Main Question", earlier) and having explicit error bounds, with no hidden constants, especially operators that preserve polynomials of a higher degree than linear functions.
-
Find a sequence of functions
$(W_n(f))$ and an explicit and tight upper bound on$C_1>0$ such that, for each integer$n\ge 1$ that's a power of 2—$$\text{abs}\left(\left(\sum_{i=0}^k W_n\left(\frac{i}{n}\right)\sigma_{n,k,i}\right)-W_{2n}\left(\frac{k}{2n}\right)\right)=\text{abs}(\mathbb{E}[W_n(X_k/n)] - W_{2n}(\mathbb{E}[X_k/n]))\le \frac{C_1 M}{n^{r/2}},\tag{PB}$$ whenever
$0\le k\le 2n$ , where$M = \max(L, \max\text{abs}(f^{(0)}), ...,\max\text{abs}(f^{(r-1)}))$ ,$L$ is$\max\text{abs}(f^{(r)})$ or the Lipschitz constant of$f^{(r-1)}$ ,$X_k$ is a hypergeometric($2n$ ,$k$ ,$n$ ) random variable, and$\sigma_{n,k,i} = {n\choose i}{n\choose {k-i}}/{2n \choose k}=\mathbb{P}(X_k=i)$ is the probability that$X_k$ equals$i$ . (See notes 3 and 4 in "End Notes" as well as "Proofs for Polynomial-Building Schemes".)
-
Let
$f(\lambda):[0,1]\to [0,1]$ be writable as$f(\lambda)=\sum_{n\ge 0} a_n \lambda^n,$ where$a_n\ge 0$ is rational,$a_n$ is nonzero infinitely often, and$f(1)$ is irrational. Then what are simple criteria to determine whether there is$0\lt p\lt 1$ such that$0\le a_n\le p(1-p)^n$ and, if so, to find such$p$ ? Obviously, if$(a_n)$ is nowhere increasing then$1\gt p\ge a_0$ . -
For each
$r>0$ , characterize the functions$f(\lambda)$ that admit a Bernoulli factory where the expected number of coin flips, raised to the power of$r$ , is finite. -
Multiple-output Bernoulli factories: Let
$f(\lambda):[a, b] \to (0, 1)$ be continuous, where$0\lt a$ ,$a\lt b$ ,$b\lt 1$ . Define the entropy bound as$h(f(\lambda))/h(\lambda),$ where$h(x)=-x \ln(x)-(1-x) \ln(1-x)$ is related to the Shannon entropy function. Then there is an algorithm that tosses heads with probability$f(\lambda)$ given a coin that shows heads with probability$\lambda$ and no other source of randomness (Keane and O'Brien 1994)1.But, is there an algorithm for
$f$ that produces multiple outputs rather than one and has an expected number of coin flips per output that is arbitrarily close to the entropy bound, uniformly for every$\lambda$ in$f$ 's domain? Call such an algorithm an optimal factory. (See Nacu and Peres (2005, Question 1)4.) And, does the answer change if the algorithm has access to a fair coin in addition to the biased coin?So far, constants as well as
$\lambda$ and$1-\lambda$ do admit an optimal factory (see same work), and, as Yuval Peres (Jun. 24, 2021) told me, there is an efficient multiple-output algorithm for$f(\lambda) = \lambda/2$ . But are there others? See an appendix in one of my articles for more information on my progress on the problem. -
Pushdown automata and algebraic functions: A pushdown automaton is a finite state machine with an unbounded stack, driven by a biased coin with an unknown probability of heads,
$\lambda$ . Its stack starts with a single symbol. On each step, the machine flips the coin, then, based on the coin flip, the current state, and the top stack symbol, it moves to a new state (or keeps it unchanged) and replaces the top stack symbol with zero or more symbols. When the stack is empty, the machine stops and returns either 0 or 1 depending on the state it ends up at.Let
$f(\lambda)$ be continuous and map the open interval (0, 1) to itself. Mossel and Peres (2005)14 showed that a pushdown automaton can output 1 with probability$f(\lambda)$ only if$f$ is algebraic over the rational numbers (there is a nonzero polynomial$P(x, y)$ in two variables and whose coefficients are rational numbers, such that$P(x, f(x)) = 0$ for every$x$ in the domain of$f$ ). See an appendix in one of my articles for more information on my progress on the problem.Prove or disprove:
- If
$f$ is algebraic over rational numbers it can be simulated by a pushdown automaton. - min(
$\lambda$ ,$1-\lambda$ ) and$\lambda^{1/p}$ , for every prime$p\ge 3$ , can be simulated by a pushdown automaton. - Given that
$f$ is algebraic over rational numbers, it can be simulated by a pushdown automaton if and only if its "critical exponent" is a dyadic number greater than$-1$ or has the form$-1-1/2^k$ for some integer$k\ge 1$ . (See note 2 in "End Notes".)
- If
-
Coin-flipping degree: Let
$p(\lambda)$ be a polynomial that maps the closed unit interval to itself and satisfies$0\lt p(\lambda)\lt 1$ whenever$0\lt\lambda\lt 1$ . Then its coin-flipping degree (Wästlund 1999)15 is the smallest value of$n$ such that$p$ 's Bernstein coefficients of degree$n$ lie in the closed unit interval. Given that a polynomial's degree is$m$ and its "standard" coefficients are integers, what are upper bounds (or even exact maximums) on its coin flipping degree? -
Simple simulation algorithms: References are sought to papers and books that describe irrational constants or Bernoulli factory functions (continuous functions mapping (0,1) to itself) in any of the following ways. Ideally they should involve only rational numbers and should not compute p-adic digit expansions.
- Simulation experiments that succeed with an irrational probability.
- Simple continued fraction expansions of irrational constants.
- Functions written as infinite power series with rational coefficients (see "Certain Power Series").
- Irrational numbers written as series expansions with rational coefficients (see "Certain Converging Series").
- Functions whose integral is an irrational number.
- Closed shapes inside the unit square whose area is an irrational number. (Includes algorithms that tell whether a box lies inside, outside, or partly inside or outside the shape.) Example.
- Generate a uniform (x, y) point inside a closed shape, then return 1 with probability x. For what shapes is the expected value of x an irrational number? Example..
-
Given integer m≥0, rational number 0<k≤exp(1), and unknown heads probability 0≤λ≤1, find a Bernoulli factory for—
$$f(\lambda)=\exp(-(\exp(m+\lambda)-(k(m+\lambda)))) = \frac{\exp(-\exp(m+\lambda))}{\exp(-(k(m+\lambda)))},\tag{PD}$$ that, as much as possible, avoids calculating
$h(\lambda) = \exp(m+\lambda)-k(m+\lambda)$ ; in this sense, the more implicitly the Bernoulli factory works with irrational or transcendental functions, the better. A solution is sought especially when k is 1 or 2. Note that the right-hand side of (PD) can be implemented by ExpMinus and division Bernoulli factories, but is inefficient and heavyweight due to the need to calculate$\epsilon$ for the division factory. In addition there is a Bernoulli factory that first calculates$h(\lambda)$ and$floor(h(\lambda))$ using constructive reals and then runs ExpMinus, but this is likewise far from lightweight. (Calculating exp(.) with floating-point operations is not acceptable for this question.)
Prove or disprove:
-
Given that
$f:[0,1]\to (0,1]$ is convex, the polynomials$(g_n) = (B_n(f) - \max_{0\le\lambda\le 1}\text{abs}(B_n(f)(\lambda)-f(\lambda)))$ (where$n\ge 1$ is an integer power of 2) are in Bernstein form of degree$n$ , converge to$f$ from below, and satisfy:$(g_{2n}-g_{n})$ is a polynomial with nonnegative Bernstein coefficients once it's rewritten to a polynomial in Bernstein form of degree exactly$2n$ . The same is true for the polynomials$(g_n) = (B_n(f) - \text{abs}(B_n(f)(1/2)-f(1/2)))$ , if$f$ is also symmetric about 1/2. -
Let
$f:(D\subseteq [0, 1])\to [0,1]$ . Given a coin that shows heads with probability$\lambda$ (which can be 0 or 1), it is possible to toss heads with probability$f(\lambda)$ using the coin and no other sources of randomness (and, thus,$f$ is strongly simulable) if and only if—-
$f$ is constant on its domain, or is continuous and polynomially bounded on its domain (polynomially bounded means, both$f$ and$1-f$ are bounded below by min($x^n$ ,$(1-x)^n$ ) for some integer$n$ (Keane and O'Brien 1994)1), and -
$f(0)$ is 0 or 1 if 0 is in$f$ 's domain and$f(1)$ is 0 or 1 whenever 1 is in$f$ 's domain, and - if
$f(0) = 0$ or$f(1) = 0$ or both, then there is a polynomial$g(x):[0,1]\to [0,1]$ with computable coefficients, such that$g(0) = f(0)$ and$g(1) = f(1)$ whenever 0 or 1, respectively, is in the domain of f, and such that$g(x)\gt f(x)$ for every$x$ in the domain of$f$ , except at 0 and 1, and - if
$f(0) = 1$ or$f(1) = 1$ or both, then there is a polynomial$h(x):[0,1]\to [0,1]$ with computable coefficients, such that$h(0) = f(0)$ and$h(1) = f(1)$ whenever 0 or 1, respectively, is in the domain of$f$ , and such that$g(x)\lt f(x)$ for every$x$ in the domain of f, except at 0 and 1.
A condition such as "0 is not in the domain of
$f$ , or$f$ can be extended to a Lipschitz continuous function on$[0, \epsilon)$ for some$\epsilon>0$ " does not work. A counterexample is$f(x)=(\sin(1/x)/4+1/2)\cdot(1-(1-x)^n)$ for$n\ge 1$ ($f(0)=0$ ), which is strongly simulable at 0 despite not being Lipschitz at 0. ($(1-x)^n$ is the probability of the biased coin showing zero$n$ times in a row.) Keane and O'Brien already showed strong simulability when$D$ contains neither 0 nor 1. -
Note 1: An example of
Note 2: On pushdown automata: Etessami and Yannakakis (2009)16 showed that pushdown automata with rational probabilities are equivalent to recursive Markov chains (with rational transition probabilities), and that for every recursive Markov chain, the system of polynomial equations has nonnegative coefficients. But this paper doesn't deal with the case of recursive Markov chains where the transition probabilities cannot just be rational, but can also be
Note 3: This condition is also known as a "consistency requirement"; it ensures that not only the polynomials "increase" to
Note 4: If
This question is a problem of finding the Jensen gap of
Special cases for this question are if
Particularly for the case
Footnotes
-
Keane, M. S., and O'Brien, G. L., "A Bernoulli factory", ACM Transactions on Modeling and Computer Simulation 4(2), 1994. ↩ ↩2 ↩3 ↩4
-
See also the following questions on Mathematics Stack Exchange and MathOverflow: Converging polynomials, Error bounds, A conjecture, Hypergeometric random variable, Lorentz operators, Derivatives of moments, Series representations. ↩
-
Łatuszyński, K., Kosmidis, I., Papaspiliopoulos, O., Roberts, G.O., "Simulating events of unknown probabilities via reverse time martingales", arXiv:0907.4018v2 [stat.CO], 2009/2011. ↩
-
Nacu, Şerban, and Yuval Peres. "Fast simulation of new coins from old", The Annals of Applied Probability 15, no. 1A (2005): 93-115. ↩ ↩2 ↩3 ↩4
-
Holtz, O., Nazarov, F., Peres, Y., "New Coins from Old, Smoothly", Constructive Approximation 33 (2011). ↩ ↩2 ↩3
-
E. Voronovskaya, "Détermination de la forme asymptotique d'approximation des fonctions par les polynômes de M. Bernstein", 1932. ↩
-
G.G. Lorentz, "The degree of approximation by polynomials with positive coefficients", 1966. ↩
-
Micchelli, Charles. "The saturation class and iterates of the Bernstein polynomials", Journal of Approximation Theory 8, no. 1 (1973): 1-18. ↩
-
Guan, Zhong. "Iterated Bernstein polynomial approximations." arXiv preprint arXiv:0909.0684 (2009). ↩
-
Güntürk, C. Sinan, and Weilin Li. "Approximation with one-bit polynomials in Bernstein form", arXiv:2112.09183 (2021); Constructive Approximation, pp.1-30 (2022). ↩
-
Güntürk, C. Sinan, and Weilin Li. "Approximation of functions with one-bit neural networks", arXiv:2112.09181 (2021). ↩
-
Tachev, Gancho. "Linear combinations of two Bernstein polynomials", Mathematical Foundations of Computing, 2022. ↩
-
Butzer, P.L., "Linear combinations of Bernstein polynomials", Canadian Journal of Mathematics 15 (1953). ↩
-
Mossel, Elchanan, and Yuval Peres. New coins from old: computing with unknown bias. Combinatorica, 25(6), pp.707-724, 2005. ↩
-
Wästlund, J., "Functions arising by coin flipping", 1999. ↩
-
Etessami, K. And Yannakakis, M., "Recursive Markov chains, stochastic grammars, and monotone systems of nonlinear equations", Journal of the ACM 56(1), pp.1-66, 2009. ↩
-
Banderier, C. And Drmota, M., 2015. Formulae and asymptotics for coefficients of algebraic functions. Combinatorics, Probability and Computing, 24(1), pp.1-53. ↩
-
Lee, Sang Kyu, Jae Ho Chang, and Hyoung-Moon Kim. "Further sharpening of Jensen's inequality." Statistics 55, no. 5 (2021): 1154-1168. ↩