
Optimal Bounds for Noisy Sorting

Yuzhou Gu
MIT
[email protected]
   Yinzhan Xu
MIT
[email protected]
Abstract

Sorting is a fundamental problem in computer science. In the classical setting, it is well-known that $(1\pm o(1))n\log_{2}n$ comparisons are both necessary and sufficient to sort a list of $n$ elements. In this paper, we study the Noisy Sorting problem, where each comparison result is flipped independently with probability $p$ for some fixed $p\in(0,\frac{1}{2})$. As our main result, we show that

$$(1\pm o(1))\left(\frac{1}{I(p)}+\frac{1}{(1-2p)\log_{2}\frac{1-p}{p}}\right)n\log_{2}n$$

noisy comparisons are both necessary and sufficient to sort $n$ elements with error probability $o(1)$, where $I(p)=1+p\log_{2}p+(1-p)\log_{2}(1-p)$ is the capacity of the binary symmetric channel (BSC) with crossover probability $p$. This simultaneously improves the previous best lower and upper bounds (Wang, Ghaddar and Wang, ISIT 2022) for this problem.

For the related Noisy Binary Search problem, we show that

$$(1\pm o(1))\left((1-\delta)\frac{\log_{2}n}{I(p)}+\frac{2\log_{2}\frac{1}{\delta}}{(1-2p)\log_{2}\frac{1-p}{p}}\right)$$

noisy comparisons are both necessary and sufficient to find the predecessor of an element among $n$ sorted elements with error probability $\delta$. This extends the previous bounds of Burnashev and Zigangirov (1974), which are only tight for $\delta=1/n^{o(1)}$.

1 Introduction

Sorting is one of the most fundamental problems in computer science. Among the wide variety of sorting algorithms, an important class is comparison-based sorting. It is well-known that sorting a list of $n$ numbers requires $(1-o(1))n\log n$ comparisons (see e.g., [CLRS22]), even if the input list is a random permutation (all logarithms in this paper are of base $2$). A good number of algorithms require only $(1+o(1))n\log n$ comparisons, such as binary insertion sort, merge sort (see e.g., [Knu98]), and merge insertion sort [FJ59].

Sorting is used in many real-world applications with large data sets, so it is important to account for faults that arise unavoidably in large systems. For comparison-based sorting, this means that the result of a comparison could be incorrect. We consider one well-studied model for such faults, studied in, e.g., [Hor63, Gal78, Pel89, BM86, FRPU94, BMW16]. In this model, the result of each query (or noisy comparison in the context of sorting) is flipped independently with probability $p$ for some fixed $p\in(0,\frac{1}{2})$, and repeated queries are allowed. We call this the noisy model. In this model, we define $\mathsf{NoisySorting}(n)$ as the task of sorting $n$ elements using noisy comparisons. As this model inherently has errors and no algorithm can always produce correct outputs, we consider algorithms with $o(1)$ error probability.

If we repeat each noisy comparison $\Theta(\log n)$ times, the majority of the returned results is the correct comparison result with high probability. Therefore, we can modify any classical comparison-based sorting algorithm with $O(n\log n)$ comparisons, by replacing each comparison with $\Theta(\log n)$ noisy comparisons, to get an $O(n\log^{2}n)$-comparison algorithm for Noisy Sorting that succeeds with probability $1-o(1)$.
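To make this reduction concrete, here is a minimal Python sketch; the helper names (noisy_compare, majority_less_than) and the repetition constant are our own illustrative choices, not from the paper.

import functools, math, random

def noisy_compare(x, y, p):
    # Simulated noisy comparison: returns 1{x < y}, flipped with probability p.
    return (1 if x < y else 0) ^ (random.random() < p)

def majority_less_than(x, y, p, n):
    # Repeat the noisy comparison Theta(log n) times and take a majority vote.
    k = 30 * max(1, int(math.log2(n)))  # the constant 30 is an arbitrary illustrative choice
    votes = sum(noisy_compare(x, y, p) for _ in range(k))
    return 2 * votes > k

def sort_with_majority(a, p):
    # Plug the boosted comparator into any classical O(n log n) comparison sort:
    # O(n log n) boosted comparisons, i.e. O(n log^2 n) noisy comparisons in total.
    n = len(a)
    cmp = lambda x, y: -1 if majority_less_than(x, y, p, n) else 1
    return sorted(a, key=functools.cmp_to_key(cmp))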

In fact, better algorithms are known. Feige, Raghavan, Peleg, and Upfal [FRPU94] gave an algorithm for $\mathsf{NoisySorting}(n)$ with only $O(n\log n)$ noisy comparisons. Recently, Wang, Ghaddar, and Wang [WGW22] analyzed the constant factor hidden in [FRPU94]'s algorithm (the actual constant is quite complicated; see [WGW22] for details), and gave an improved upper bound of $(1+o(1))\frac{2}{\log\left(\frac{1}{\frac{1}{2}+\sqrt{p(1-p)}}\right)}n\log n$ noisy comparisons. They also showed that $(1-o(1))\frac{1}{I(p)}n\log n$ comparisons are necessary for any algorithm with $o(1)$ error probability, where $I(p)=1-h(p)$ is the capacity of the BSC with crossover probability $p$ (which is also the amount of information each noisy comparison gives) and $h(p)=-p\log p-(1-p)\log(1-p)$ is the binary entropy function.

Despite this progress, there is still a big gap between the upper and lower bounds. For instance, when $p=0.1$, the constant in front of $n\log n$ for the lower bound is roughly $1.88322$, whereas the constant for the upper bound is roughly $6.21257$, more than three times as large. It is a fascinating question to narrow this gap, or even close it as in the case of classical sorting.

As the main result of our paper, we show that it is indeed possible to close this gap by giving both an improved upper bound and an improved lower bound:

Theorem 1 (Noisy Sorting Upper Bound).

There exists a deterministic algorithm for $\mathsf{NoisySorting}(n)$ with error probability $o(1)$ which always uses at most

$$(1+o(1))\left(\frac{1}{I(p)}+\frac{1}{(1-2p)\log\frac{1-p}{p}}\right)n\log n$$

noisy comparisons.

Theorem 2 (Noisy Sorting Lower Bound).

Any (deterministic or randomized) algorithm for $\mathsf{NoisySorting}(n)$ with error probability $o(1)$ must make at least

$$(1-o(1))\left(\frac{1}{I(p)}+\frac{1}{(1-2p)\log\frac{1-p}{p}}\right)n\log n$$

noisy comparisons in expectation, even if the input is a uniformly random permutation.

To compare with previous works more quantitatively, our constant is roughly $2.27755$ for $p=0.1$. When $p$ approaches $0$, i.e., when the noisy comparisons are almost errorless, our bounds approach $(1\pm o(1))n\log n$, matching the bounds of classical sorting.
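These constants are easy to check numerically; a small Python sketch (logarithms base $2$, as everywhere in the paper):

import math

def capacity(p):
    # I(p) = 1 - h(p), the capacity of BSC_p.
    return 1 + p * math.log2(p) + (1 - p) * math.log2(1 - p)

def sorting_constant(p):
    # The constant in Theorems 1 and 2.
    return 1 / capacity(p) + 1 / ((1 - 2 * p) * math.log2((1 - p) / p))

p = 0.1
print(round(1 / capacity(p), 5))      # 1.88322, the lower-bound constant of [WGW22]
print(round(sorting_constant(p), 5))  # 2.27755, the constant in this paper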

Noisy Binary Search.

We also study the closely related Noisy Binary Search problem. In $\mathsf{NoisyBinarySearch}(n)$, we are given a sorted list of $n$ elements and an element $x$, and the goal is to find the predecessor of $x$ in the sorted list. Noisy Binary Search is well-studied [Hor63, BZ74, FRPU94, BH08, Pel89, DŁU21] in the noisy model we consider, as well as in some other models [AD91, BK93, DGW92, Mut94, RMK+80].

Both previous algorithms [FRPU94, WGW22] for $\mathsf{NoisySorting}(n)$ use binary insertion sort at the high level. More specifically, the algorithms keep a sorted array of the first $i$ elements for $i$ from $1$ to $n$. When the $i$-th element needs to be inserted for $i>1$, the algorithms use some implementation of $\mathsf{NoisyBinarySearch}(i-1)$ to find the correct position to insert it, with error probability $o(1/n)$ (it also suffices to have average error probability $o(1/n)$). A simple union bound then implies that the error probability of the sorting algorithm is $o(1)$. The difference between the two algorithms is that [FRPU94] uses an algorithm for Noisy Binary Search based on a random walk, while [WGW22] builds on Burnashev and Zigangirov's algorithm [BZ74].

In fact, a variable-length version of Burnashev and Zigangirov's algorithm [BZ74] achieves optimal query complexity for the Noisy Binary Search problem (the original paper is in Russian; see [WGZW23, Theorem 5] for an English version). They proved that $\mathsf{NoisyBinarySearch}(n)$ with error probability $\delta$ can be solved using at most $\frac{\log n+\log(1/\delta)+\log((1-p)/p)}{I(p)}$ noisy comparisons in expectation (we remark that their algorithm is randomized and assumes a uniform prior). An information-theoretic lower bound (e.g., [BH08, Theorem B.1]) shows that $(1-\delta-o(1))\frac{\log n}{I(p)}$ noisy comparisons are necessary. These bounds essentially match when $1/n^{o(1)}\leq\delta\leq o(1)$. However, for the application to Noisy Sorting, we must have $\delta=o(1/n)$ on average. In this case, the lower and upper bounds do not match.

Dereniowski, Łukasiewicz, and Uznański [DŁU21] designed alternative algorithms for Noisy Binary Search in various settings. They also considered a more general problem called Graph Search.

We remark that Ben-Or and Hassidim [BH08], likely unaware of [BZ74], claim an algorithm for solving $\mathsf{NoisyBinarySearch}(n)$ with error probability $\delta$ using at most $(1-\delta-o(1))\frac{\log n+O(\log(1/\delta))}{I(p)}$ noisy comparisons in expectation. However, as pointed out by [DŁU21], there are potential issues in [BH08]'s proof.

Using a technique similar to our lower bound for Noisy Sorting, we are also able to improve the lower bound for Noisy Binary Search for $\delta=1/n^{\Omega(1)}$.

Theorem 3 (Noisy Binary Search Lower Bound).

Any (deterministic or randomized) algorithm for $\mathsf{NoisyBinarySearch}(n)$ with error probability $\leq\delta$ must make at least

$$(1-o(1))\left((1-\delta)\frac{\log n}{I(p)}+\frac{2\log\frac{1}{\delta}}{(1-2p)\log\frac{1-p}{p}}\right)$$

noisy comparisons in expectation, even if the position of the element to search for is uniformly random.

This lower bound indicates that any algorithm for $\mathsf{NoisySorting}(n)$ purely based on binary insertion sort, which requires $n-1$ executions of Noisy Binary Search with average error probability $o(1/n)$, needs at least $(1-o(1))\left(\frac{1}{I(p)}+\frac{2}{(1-2p)\log\frac{1-p}{p}}\right)n\log n$ noisy comparisons in expectation to achieve error probability $o(1)$. In light of Theorem 1, we see that any algorithm for Noisy Sorting purely based on binary insertion sort cannot be optimal.

We also show an algorithm for $\mathsf{NoisyBinarySearch}(n)$ that matches the lower bound in Theorem 3:

Theorem 4 (Noisy Binary Search Upper Bound).

There exists a randomized algorithm for $\mathsf{NoisyBinarySearch}(n)$ with error probability $\leq\delta$ that makes at most

$$(1+o(1))\left((1-\delta)\frac{\log n}{I(p)}+\frac{2\log\frac{1}{\delta}}{(1-2p)\log\frac{1-p}{p}}\right)$$

noisy comparisons in expectation.

Concurrent Work.

Concurrent work [WGZW23] also improves over [WGW22] in many aspects. Our results are stronger than their main results. Our algorithm in Theorem 1 uses fewer noisy comparisons in expectation than their algorithms [WGZW23, Theorems 1 and 2]. Our Theorem 2 is strictly stronger than [WGZW23, Theorem 3]. Our lower bound for insertion-based sorting algorithms (see the discussion after Theorem 3) is strictly stronger than [WGZW23, Theorem 4].

Other Related Works.

Noisy Sorting has also been considered in a slightly different noisy model, where every pair of elements may be compared at most once [BM08, KPSW11, GLLP17, GLLP20, GLLP19]. Sorting is harder in this model than in the model we consider, as it is information-theoretically impossible to correctly sort all the elements with probability $1-o(1)$ for $p>0$ [GLLP17]. For $p<1/16$, Geissmann, Leucci, Liu, and Penna [GLLP19] achieved an $O(n\log n)$ time algorithm that guarantees $O(\log n)$ maximum dislocation and $O(n)$ total dislocation with high probability, matching the lower bound given in [GLLP17].

Acknowledgements.

We thank Przemysław Uznański for telling us about the issue of [BH08].

1.1 Technical Overview

Before we give the overview of our techniques, let us first give some intuition about the constant $\frac{1}{I(p)}+\frac{1}{(1-2p)\log\frac{1-p}{p}}$ in our bound. $I(p)$ is essentially the amount of information in bits each noisy query can give, and the ordering of $n$ elements requires $\log(n!)\approx n\log n$ bits to represent. Therefore, $\frac{n\log n}{I(p)}$ queries are intuitively necessary. On the other hand, $\frac{\log n}{(1-2p)\log\frac{1-p}{p}}$ is roughly the number of noisy comparisons required to compare two elements with error probability $\leq\frac{1}{n}$. Therefore, even for the simpler task of checking whether a list of $n$ elements is sorted, roughly $\frac{n\log n}{(1-2p)\log\frac{1-p}{p}}$ queries seem necessary to compare all the adjacent elements with overall error probability $o(1)$. Hence the constants $\frac{1}{I(p)}$ and $\frac{1}{(1-2p)\log\frac{1-p}{p}}$ are natural.

However, the above discussion only intuitively suggests a $\max\left\{\frac{1}{I(p)},\frac{1}{(1-2p)\log\frac{1-p}{p}}\right\}$ lower bound on the constant, and a priori it is unclear how to strengthen it to their sum. To resolve this issue, we design an easier version of Noisy Sorting as an intermediate problem. In this version, suppose the elements are split into contiguous groups of size $\log n$ in the sorted order (and the groups are unknown to the algorithm). Any time the algorithm compares two elements from the same group, the query result tells the algorithm that these two elements are in the same group, without noise; otherwise, the query result flips the result of the comparison with probability $p$, as usual. The algorithm is also not required to sort elements inside the same group correctly. We can show that in this version, each query still only gives roughly $I(p)$ bits of information, and the total number of bits the output space can represent is $\log\frac{n!}{((\log n)!)^{n/\log n}}\approx n\log n$. Therefore, this simpler problem still requires approximately $\frac{n\log n}{I(p)}$ queries. By designing a reduction from this problem to Noisy Sorting, we can show that any algorithm for Noisy Sorting requires roughly $\frac{n\log n}{I(p)}$ queries comparing elements not from the same group. At the same time, inside each group, the algorithm needs to perform $\log n-1$ comparisons with error probability $\leq\frac{1}{n}$ each, even just to check that all the groups are sorted. This part requires roughly $\frac{n\log n}{(1-2p)\log\frac{1-p}{p}}$ queries.
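For instance, for groups of size $\log n$, the entropy count above follows from a standard Stirling estimate:

$$\log\frac{n!}{((\log n)!)^{n/\log n}}=\log n!-\frac{n}{\log n}\log\bigl((\log n)!\bigr)\geq n\log n-O(n)-\frac{n}{\log n}\cdot\log n\cdot\log\log n=(1-o(1))\,n\log n,$$

so the output space of the intermediate problem still carries $(1-o(1))n\log n$ bits of entropy.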

Our algorithm that matches this lower bound can be viewed as a remote variant of quicksort, with a large number of pivots and one level of recursion. First, we sample a random subset of elements $S$ of size $\frac{n}{\log n}$, which can be viewed as pivots. Sorting $S$ only contributes to lower-order terms in the query complexity. These pivots separate the list into multiple sublists. Then, we use Noisy Binary Search to find the correct sublist for each remaining element, with error probability $\frac{1}{\log n}$. This step contributes the $\frac{n\log n}{I(p)}$ part of the query complexity. Then we design an algorithm that correctly sorts $m$ elements with probability $\geq 1-o(\frac{1}{n})$ using only $O(m\log m)+(1+o(1))\frac{m\log n}{(1-2p)\log\frac{1-p}{p}}$ queries. This algorithm is then used to sort all the sublists. Summing over all the sublists, the first $O(m\log m)$ term becomes $O(n\log\log n)$, as we expect each sublist to have size $\operatorname{polylog}(n)$. The second term sums up to $(1+o(1))\frac{n\log n}{(1-2p)\log\frac{1-p}{p}}$. The total error probability is $o(\frac{1}{n})\cdot n=o(1)$. Finally, we have to correct the elements that are sent to the wrong sublists due to the error probability of the Noisy Binary Search algorithm, but we only expect to see at most $\frac{n}{\log n}$ of them, so the number of queries for handling them ends up being a lower-order term.

2 Preliminaries

In this section we introduce several basic notations, definitions, and lemmas used in the proof.

2.1 Basic Notions

Notation 5.

We use the following notations throughout this paper.

  • For $k\in\mathbb{Z}_{\geq 0}$, let $[k]:=\{1,\ldots,k\}$.

  • For two random variables $X,Y$, let $I(X;Y)$ denote the mutual information between $X$ and $Y$.

  • For a distribution $P=(p_{1},\ldots,p_{k})$, define its entropy function as $h(P):=-\sum_{i\in[k]}p_{i}\log p_{i}$. In particular, for $p\in[0,1]$, define $h(p):=h(p,1-p)$. Define $I(p):=1-h(p)$.

  • For $p\in[0,1]$, let $\operatorname{Bern}(p)$ denote the Bernoulli distribution, such that for $X\sim\operatorname{Bern}(p)$, we have $\mathbb{P}[X=0]=1-p$ and $\mathbb{P}[X=1]=p$.

  • For $p\in[0,1]$, let $\operatorname{BSC}_{p}$ be the binary symmetric channel with crossover probability $p$, i.e., on input $x\in\{0,1\}$, the output $\operatorname{BSC}_{p}(x)$ follows the distribution $\operatorname{Bern}(p+(1-2p)x)$.

  • For two random variables $X,Y$, define $X\land Y:=\min\{X,Y\}$.

  • Let $x$ be a variable and $f(x)$ and $g(x)$ be two functions which are defined for all large enough values of $x$. We say $f(x)=o_{x}(g(x))$ if $\lim_{x\to\infty}f(x)/g(x)=0$. In particular, if $x=n$, we omit the subscript and write $f(n)=o(g(n))$.

We also assume some basic knowledge of information theory, such as the chain rule for entropy and mutual information and Fano's inequality. See e.g. [PW22] for an overview.

In Noisy Sorting and Noisy Binary Search, the algorithm is allowed to make noisy comparison queries.

Definition 6 (Noisy Comparison Query).

Fix $p\in(0,\frac{1}{2})$. Let $x,y$ be two comparable elements (i.e., either $x\leq y$ or $y\leq x$). We define the noisy comparison query $\textsc{NoisyComparison}(x,y)$ as returning $\operatorname{BSC}_{p}(\mathbbm{1}\{x<y\})$, where the randomness is independent for every query.

We will omit the crossover probability $p$ throughout the paper when it is clear from the context.

With these definitions, we are ready to define the problems of study.

Definition 7 (Noisy Sorting Problem).

Let $\mathsf{NoisySorting}(n)$ be the following problem: Given $n$ comparable elements $(a_{i})_{i\in[n]}$, an algorithm is allowed to make the query $\textsc{NoisyComparison}(a_{i},a_{j})$ for $i,j\in[n]$. The goal is to output a permutation $(b_{i})_{i\in[n]}$ of $(a_{i})_{i\in[n]}$ such that $b_{i}\leq b_{i+1}$ for all $i\in[n-1]$.

Definition 8 (Noisy Binary Search Problem).

Let $\mathsf{NoisyBinarySearch}(n)$ be the following problem: Given $n$ elements $(a_{i})_{i\in[n]}$ satisfying $a_{i}\leq a_{i+1}$ for all $i\in[n-1]$ and an element $x$, an algorithm is allowed to make the query $\textsc{NoisyComparison}(s,t)$, where $\{s,t\}=\{a_{i},x\}$ for some $i\in[n]$. The goal is to output $\max(\{0\}\cup\{i\in[n]:a_{i}<x\})$ (i.e., the index of the largest element in $(a_{i})_{i\in[n]}$ less than $x$).

Remark 9.

By our definition of the Noisy Sorting Problem, we can WLOG assume that all elements $(a_{i})_{i\in[n]}$ are distinct. In fact, suppose we have an algorithm $\mathcal{A}$ which works for the distinct-elements case. Then we can modify it such that if it is about to call $\textsc{NoisyComparison}(a_{i},a_{j})$, it instead calls $\textsc{NoisyComparison}(a_{\max\{i,j\}},a_{\min\{i,j\}})$ (but flips the returned bit if $i<j$). Call the modified algorithm $\mathcal{A}^{\prime}$. This way, from the view of the algorithm, the elements are ordered by $(a_{i},i)$. It is easy to see that $\mathcal{A}^{\prime}$ has the same distribution of the number of queries, and its error probability is no larger than the error probability of $\mathcal{A}$. Thus, we get an algorithm which works for the case where the elements are not necessarily distinct.

Similarly, by our definition of the Noisy Binary Search Problem, we can WLOG assume that all elements $(a_{i})_{i\in[n]}$ and $x$ are distinct.

2.2 Known and Folklore Algorithms

Theorem 10 ([DŁU21, Corollary 1.6]).

There exists a randomized algorithm for $\mathsf{NoisyBinarySearch}(n)$ with error probability at most $\delta$ using

$$(1+o(1))\left(\frac{\log n+O(\log(1/\delta))}{I(p)}\right)$$

noisy comparisons in expectation, and using $O(\log n)$ random bits always.

Remark 11.

[DŁU21] stated their problem as finding an element in a sorted array, but their algorithm can also find the predecessor. The only place their algorithm uses randomness is to shift the initial array randomly, which requires $O(\log n)$ random bits (as we can WLOG assume $n$ is a power of two in $\mathsf{NoisyBinarySearch}(n)$, it always takes $O(\log n)$ random bits to sample a random shift, instead of $O(\log n)$ random bits in expectation).

We could also use the noisy binary search algorithm in [BZ74] (see also [WGZW23, Theorem 5]), which achieves a slightly better bound on the number of noisy comparisons. However, the number of random bits used in Burnashev–Zigangirov's algorithm is $O(\log n)$ in expectation, not always. This would make our control of the number of random bits used slightly more complicated.

Corollary 12.

There exists a randomized algorithm for $\mathsf{NoisySorting}(n)$ with failure probability $\delta$ using $O(n\log(n/\delta))$ noisy comparisons in expectation, and using $O(n\log n)$ random bits always.

Proof.

The algorithm keeps a sorted array of the first $i$ elements for $i$ from $1$ to $n$. Each time a new element needs to be inserted, the algorithm uses Theorem 10 to find its correct position, with error probability $\delta/n$. By the union bound, the overall error probability is bounded by $(\delta/n)\cdot n=\delta$, the expected number of queries is $n\cdot(1+o(1))\left(\frac{\log n+\log(n/\delta)}{I(p)}\right)=O\left(\frac{n\log(n/\delta)}{I(p)}\right)$, and the number of random bits used is $O(n\log n)$. ∎

We show how to use repeated queries to boost the correctness probability of pairwise comparisons. This is a folklore result, but we present a full proof here for completeness.

Lemma 13.

There exists a deterministic algorithm which compares two (unequal) elements using noisy comparisons with error probability $\leq\delta$ using

$$\frac{1}{1-2p}\left\lceil\frac{\log\frac{1-\delta}{\delta}}{\log\frac{1-p}{p}}\right\rceil=(1+o_{1/\delta}(1))\frac{\log(1/\delta)}{(1-2p)\log((1-p)/p)}$$

noisy queries in expectation.

Proof.

Consider Algorithm 1, which maintains the posterior probability that $x<y$.

Algorithm 1
1: procedure LessThan($x,y,\delta$)
2:     $a\leftarrow\frac{1}{2}$
3:     while true do
4:         if $\textsc{NoisyComparison}(x,y)=1$ then
5:             $a\leftarrow\frac{(1-p)a}{(1-p)a+p(1-a)}$
6:         else
7:             $a\leftarrow\frac{pa}{pa+(1-p)(1-a)}$
8:         if $a\geq 1-\delta$ then return true $\triangleright$ $x<y$
9:         if $a\leq\delta$ then return false $\triangleright$ $x>y$

Because of the return condition, if the prior distribution of $\mathbbm{1}\{x<y\}$ is $\operatorname{Bern}(\frac{1}{2})$, then the probability of error is $\leq\delta$. However, by symmetry, the probability of error of the algorithm does not depend on the ground truth, so the probability of error is $\leq\delta$ regardless of the ground truth.

In the following, let us analyze the number of queries used.

Let $a_{t}$ be the value of $a$ after the $t$-th query. If the ground truth is $x<y$, then

$$\log\frac{a_{t+1}}{1-a_{t+1}}=\log\frac{a_{t}}{1-a_{t}}+\begin{cases}+\log\frac{1-p}{p}&\text{w.p. }1-p,\\ -\log\frac{1-p}{p}&\text{w.p. }p.\end{cases}$$

Thus,

$$\mathbb{E}\left[\log\frac{a_{t+1}}{1-a_{t+1}}\right]=\log\frac{a_{t}}{1-a_{t}}+(1-2p)\log\frac{1-p}{p},$$

where the expectation is over the randomness of the $(t+1)$-th query.

Define two sequences $(X_{t})_{t\geq 0}$, $(Y_{t})_{t\geq 0}$ by $X_{t}=\log\frac{a_{t}}{1-a_{t}}$ and $Y_{t}=X_{t}-t(1-2p)\log\frac{1-p}{p}$. Let $\tau$ be the stopping time $\tau:=\min\{t:a_{t}\leq\delta\text{ or }a_{t}\geq 1-\delta\}$.

By the above discussion, $(Y_{t})_{t\geq 0}$ is a martingale. Using the Optional Stopping Theorem, we have

$$\mathbb{E}[Y_{\tau\land n}]=\mathbb{E}[Y_{0}]=0$$

for every $n\geq 0$, so

$$\mathbb{E}[X_{\tau\land n}]=\mathbb{E}[\tau\land n]\,(1-2p)\log\frac{1-p}{p}.$$

By the bounded convergence theorem, $\mathbb{E}[X_{\tau\land n}]$ goes to $\mathbb{E}[X_{\tau}]$ as $n\to\infty$. By the monotone convergence theorem, $\mathbb{E}[\tau\land n]$ goes to $\mathbb{E}[\tau]$ as $n\to\infty$. Therefore,

$$\mathbb{E}[X_{\tau}]=\mathbb{E}[\tau]\,(1-2p)\log\frac{1-p}{p},$$

which leads to

$$\mathbb{E}[\tau]=\frac{\mathbb{E}[X_{\tau}]}{(1-2p)\log\frac{1-p}{p}}\leq\frac{1}{1-2p}\left\lceil\frac{\log\frac{1-\delta}{\delta}}{\log\frac{1-p}{p}}\right\rceil,$$

where we used the fact that $X_{t}$ is always an integer multiple of $\log\frac{1-p}{p}$ and hence $X_{\tau}\leq\left\lceil\frac{\log\frac{1-\delta}{\delta}}{\log\frac{1-p}{p}}\right\rceil\cdot\log\frac{1-p}{p}$.

Thus, the algorithm stops within $\frac{1}{1-2p}\left\lceil\frac{\log\frac{1-\delta}{\delta}}{\log\frac{1-p}{p}}\right\rceil$ queries in expectation.

Similarly, if the ground truth is $x>y$, the algorithm stops within the same expected number of queries. This finishes the proof. ∎

For simplicity, we use $f_{p}(\delta)$ to denote $\frac{1}{1-2p}\left\lceil\frac{\log\frac{1-\delta}{\delta}}{\log\frac{1-p}{p}}\right\rceil$ for $0<p<\frac{1}{2}$ and $\delta>0$, which, by Lemma 13, upper bounds the expected number of comparisons needed by Algorithm 1 to compare two elements with error probability $\leq\delta$. When clear from the context, we drop the subscript $p$.
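A minimal Python sketch of Algorithm 1; the noisy comparison oracle of Definition 6 is simulated here for illustration.

import random

def noisy_comparison(x, y, p):
    # Simulated oracle from Definition 6: BSC_p applied to 1{x < y}.
    return (1 if x < y else 0) ^ (random.random() < p)

def less_than(x, y, p, delta):
    # Algorithm 1: maintain the posterior probability a that x < y (uniform prior 1/2)
    # and stop once a leaves the interval (delta, 1 - delta).
    a = 0.5
    while True:
        if noisy_comparison(x, y, p) == 1:
            a = (1 - p) * a / ((1 - p) * a + p * (1 - a))
        else:
            a = p * a / (p * a + (1 - p) * (1 - a))
        if a >= 1 - delta:
            return True   # conclude x < y
        if a <= delta:
            return False  # conclude x > y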

2.3 Pairwise Independent Hashing

During our algorithm, we sometimes run into the following situation: we need to run a certain algorithm $n$ times, where each instance uses $m=n^{c}$ random bits for some $0<c<1$. We can only afford $o(n\log n)$ fresh random bits in total (for derandomization purposes), so the instances need to share randomness in some way. On the other hand, we want the randomness of any two instances to be independent of each other, in order for concentration bounds to hold. Therefore we need the following standard result on pairwise independent hashing.

Lemma 14 ([CW77]).

There exists a pairwise independent hash family of size $2^{k}-1$ using $k$ fresh random bits.

By using $m$ fully independent copies of pairwise independent hash families, we can achieve pairwise independence between instances using $O(m\log n)$ fresh random bits.
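As an illustration of the parameters in Lemma 14, here is a minimal Python sketch of one standard pairwise independent construction (inner products with a random seed over GF(2)); we use it only as an example of how $k$ fresh bits can drive $2^{k}-1$ pairwise independent output bits, not necessarily as the exact construction of [CW77].

import random

def fresh_seed(k):
    # k fresh random bits.
    return [random.randrange(2) for _ in range(k)]

def hash_bit(seed, x):
    # Output bit for index x in {1, ..., 2^k - 1}: the inner product <seed, x> over GF(2).
    # For distinct nonzero indices x, y the pair of output bits is uniform on {0,1}^2,
    # so the 2^k - 1 output bits are pairwise independent.
    b = 0
    for i, s in enumerate(seed):
        b ^= s & ((x >> i) & 1)
    return b

# To give each of up to 2^k - 1 algorithm instances m pairwise independent random bits,
# draw m independent seeds; instance x reads hash_bit(seed_j, x) for j = 1, ..., m.
# This uses k * m = O(m log n) fresh random bits in total when 2^k - 1 >= n.
seeds = [fresh_seed(20) for _ in range(8)]
bits_for_instance_5 = [hash_bit(s, 5) for s in seeds]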

3 Noisy Sorting Algorithm

In this section, we will present our algorithm for noisy sorting and prove Theorem 1. We will first present the following randomized version of our algorithm, and then convert it to a deterministic one in Section 3.1.

Theorem 15.

There exists a randomized algorithm for $\mathsf{NoisySorting}(n)$ with error probability $o(1)$ which always uses at most

$$(1+o(1))\left(\frac{1}{I(p)}+\frac{1}{(1-2p)\log\frac{1-p}{p}}\right)n\log n$$

noisy comparisons, and uses $O(n)$ random bits in expectation.

We start with a subroutine that sorts a list with a small number of inversion pairs.

Lemma 16.

Fix any parameter $\sigma\in(0,1)$, which may depend on $n$. Given a list of $n$ elements with $t$ inversion pairs, there exists a deterministic algorithm using noisy comparisons that sorts these $n$ elements with error probability $\leq(n-1+t)\sigma$ using $(n-1+t)f(\sigma)+(n-1+t)\sigma n^{2}f(\sigma)$ noisy queries in expectation. The algorithm does not have to know $t$ in advance.

Proof.

Consider Algorithm 2, which is essentially insertion sort.

Algorithm 2
1: procedure SortInversion($n,(a_{i})_{i\in[n]},p,\sigma$)
2:     for $i=1\to n$ do
3:         for $j=i\to 2$ do
4:             if LessThan($a_{j},a_{j-1},\sigma$) then
5:                 Swap $a_{j}$ and $a_{j-1}$
6:             else
7:                 break
8:     return $(a_{i})_{i\in[n]}$

By the union bound, with probability $\geq 1-(n-1+t)\sigma$, the first $(n-1+t)$ calls to LessThan all return correctly. Conditioned on this happening, the algorithm acts exactly like a normal insertion sort, which halts after the first $(n-1+t)$ calls to LessThan; this takes $(n-1+t)f(\sigma)$ noisy comparisons in expectation.

With probability $\leq(n-1+t)\sigma$, the algorithm does not run correctly, but the algorithm still halts within at most $\binom{n}{2}f(\sigma)\leq n^{2}f(\sigma)$ queries in expectation. ∎
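A minimal Python sketch of Algorithm 2; less_than is assumed to be a boosted comparator with error probability sigma, for example the routine sketched after Lemma 13.

def sort_inversion(a, less_than):
    # Algorithm 2: insertion sort driven by a (noisy) boosted comparator less_than(x, y).
    # With n - 1 + t correct comparator answers it behaves exactly like insertion sort
    # on a list with t inversion pairs.
    a = list(a)
    for i in range(1, len(a)):
        j = i
        while j >= 1 and less_than(a[j], a[j - 1]):
            a[j], a[j - 1] = a[j - 1], a[j]
            j -= 1
    return a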

Next, we use Lemma 16 to construct the following subroutine. It is ultimately used to sort some small sets of elements in the main algorithm.

Lemma 17.

Fix any parameter $\delta\in(0,1)$, which may depend on $n$. There is a randomized algorithm for $\mathsf{NoisySorting}(n)$ with error probability at most $\delta$ using $O(n\log n)+nf(\delta/n)+n^{2}\delta f(\delta/n)$ noisy comparisons in expectation, and using $O(n\log n)$ random bits always.

Proof.

We first use Corollary 12 with error probability $P_{e}\leq n^{-2}$, and then use Lemma 16 with error probability $\sigma=\delta/n$.

After the first step, the expected number of inversion pairs is

$$\mathbb{E}[t]\leq P_{e}n^{2}\leq 1,$$

so the second step uses at most $nf(\delta/n)+n^{2}\delta f(\delta/n)$ queries in expectation.

The overall error probability is at most $\mathbb{E}[(n-1+t)\sigma]\leq n\sigma=\delta$. ∎

The following lemma allows us to construct algorithms with guaranteed tail behavior for the number of queries. This will be helpful for concentration bounds.

Lemma 18.

Suppose we have an algorithm $\mathcal{A}$ (solving a certain task) that has error probability $\leq\delta$ and number of queries $\tau$ with $\mathbb{E}[\tau]\leq m$, and always uses $\leq r$ random bits. Then we can construct an algorithm $\mathcal{B}$ solving the same task satisfying the following properties:

  • $\mathcal{B}$ has error probability $(1+o_{m}(1))\delta$;

  • Let $\rho$ be the number of queries algorithm $\mathcal{B}$ uses. Then

    $$\mathbb{E}[\rho]\leq(1+o_{m}(1))m,\qquad\mathbb{E}[\rho^{2}]=O(m^{3}).$$

  • Let $r^{\prime}$ be the number of random bits algorithm $\mathcal{B}$ uses. Then

    $$\mathbb{E}[r^{\prime}]\leq(1+o_{m}(1))r,\qquad\mathbb{E}[(r^{\prime})^{2}]=O(r^{2}).$$
Proof.

Let $k$ be a parameter to be chosen later. Consider the following algorithm $\mathcal{B}$:

  1. Run algorithm $\mathcal{A}$ until it halts, or it is about to use the $(k+1)$-th query.

  2. If $\mathcal{A}$ halts, then we return the return value of $\mathcal{A}$; otherwise, we restart the whole algorithm.

Let us compute the probability of a restart. By Markov's inequality,

$$\mathbb{P}[\text{restart}]=\mathbb{P}[\tau>k]\leq\frac{m}{k}.$$

Algorithm $\mathcal{B}$'s error probability $P_{e}(\mathcal{B})$ is

$$P_{e}(\mathcal{B})=\mathbb{P}[\text{$\mathcal{A}$ errs}\mid\tau\leq k]\leq\frac{\mathbb{P}[\text{$\mathcal{A}$ errs}]}{\mathbb{P}[\tau\leq k]}\leq\frac{\delta}{1-\frac{m}{k}}.$$

The expected number of queries $\rho$ used by Algorithm $\mathcal{B}$ satisfies

$$\mathbb{E}[\rho]=\mathbb{E}[\tau\land k]+\mathbb{P}[\text{restart}]\,\mathbb{E}[\rho]\leq m+\frac{m}{k}\,\mathbb{E}[\rho].$$

Solving this, we get

$$\mathbb{E}[\rho]\leq\frac{m}{1-\frac{m}{k}}.$$

The second moment satisfies

$$\begin{aligned}\mathbb{E}[\rho^{2}]&\leq\mathbb{E}[(\tau\land k)^{2}]+\mathbb{P}[\text{restart}]\,\mathbb{E}[(k+\rho)^{2}]\\&\leq k^{2}+\frac{m}{k}\left(\mathbb{E}[\rho^{2}]+2k\,\mathbb{E}[\rho]+k^{2}\right)\\&\leq k^{2}+mk+\frac{2m^{2}}{1-\frac{m}{k}}+\frac{m}{k}\,\mathbb{E}[\rho^{2}].\end{aligned}$$

Solving this, we get

$$\mathbb{E}[\rho^{2}]\leq\left(1-\frac{m}{k}\right)^{-1}\left(k^{2}+mk+\frac{2m^{2}}{1-\frac{m}{k}}\right).$$

Similar to $\rho$, we have

$$\mathbb{E}[r^{\prime}]\leq r+\mathbb{P}[\text{restart}]\,\mathbb{E}[r^{\prime}]\leq r+\frac{m}{k}\,\mathbb{E}[r^{\prime}].$$

Solving this, we get

$$\mathbb{E}[r^{\prime}]\leq\frac{r}{1-\frac{m}{k}}.$$

Also,

$$\begin{aligned}\mathbb{E}[(r^{\prime})^{2}]&\leq r^{2}+\mathbb{P}[\text{restart}]\,\mathbb{E}[(r+r^{\prime})^{2}]\\&\leq r^{2}+\frac{m}{k}\left(r^{2}+2r\,\mathbb{E}[r^{\prime}]+\mathbb{E}[(r^{\prime})^{2}]\right)\\&\leq r^{2}+\frac{m}{k}\,r^{2}+\frac{m}{k}\cdot\frac{2r^{2}}{1-\frac{m}{k}}+\frac{m}{k}\,\mathbb{E}[(r^{\prime})^{2}].\end{aligned}$$

Solving this, we get

$$\mathbb{E}[(r^{\prime})^{2}]\leq\left(1-\frac{m}{k}\right)^{-1}\left(r^{2}+\frac{m}{k}\,r^{2}+\frac{m}{k}\cdot\frac{2r^{2}}{1-\frac{m}{k}}\right).$$

Choosing $k=m\log m$, we have $1/(1-m/k)=1+o_{m}(1)$, so we get

$$P_{e}(\mathcal{B})\leq(1+o_{m}(1))\delta,\quad\mathbb{E}[\rho]\leq(1+o_{m}(1))m,\quad\mathbb{E}[\rho^{2}]\leq O(m^{2}\log^{2}m)=O(m^{3}),\quad\mathbb{E}[r^{\prime}]\leq(1+o_{m}(1))r,\quad\mathbb{E}[(r^{\prime})^{2}]=O(r^{2}).$$

∎
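A minimal Python sketch of the construction in this proof; the interface run_once(budget), which runs one attempt of $\mathcal{A}$ with a query budget and returns its output or None on overflow, is our own assumption for illustration.

import math

def make_safe(run_once, m):
    # Lemma 18 wrapper: cap every attempt at k = m log m queries; restart on overflow.
    k = max(2, int(m * math.log2(max(m, 2))))

    def safe_run():
        while True:
            result = run_once(k)     # one attempt of A with at most k queries
            if result is not None:
                return result        # A halted within the budget
            # otherwise restart the whole algorithm with fresh randomness

    return safe_run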

Using Lemma 18, we are able to construct “safe” (i.e., with guaranteed tail behavior) versions of algorithms introduced before.

Corollary 19 (Safe Algorithms).

  1. Given $\delta\in(0,1)$, there exists a randomized algorithm SafeBinarySearch for $\mathsf{NoisyBinarySearch}(n)$, with error probability $\leq\delta$, with the number of queries $\tau$ satisfying

    $$\mathbb{E}[\tau]=(1+o(1))\left(\frac{\log n+O(\log\frac{1}{\delta})}{I(p)}\right),\qquad\operatorname{Var}[\tau]=O\left(\log^{3}n+\log^{3}\frac{1}{\delta}\right),$$

    and with the number of used random bits $r$ satisfying

    $$\mathbb{E}[r]=O(\log n),\qquad\operatorname{Var}[r]=O(\log^{2}n).$$

  2. Given $\delta\in(0,1)$, there exists a deterministic algorithm SafeLessThan which compares two elements with error probability $\leq\delta$, with the number of queries $\tau$ satisfying

    $$\mathbb{E}[\tau]=(1+o_{1/\delta}(1))\frac{\log\frac{1}{\delta}}{(1-2p)\log\frac{1-p}{p}},\qquad\operatorname{Var}[\tau]=O\left(\log^{3}\frac{1}{\delta}\right).$$

  3. Given $\delta=O(\frac{1}{n^{3}})$, there exists a randomized algorithm SafeWeakSort for $\mathsf{NoisySorting}(n)$, with error probability $\leq\delta$, with the number of queries $\tau$ satisfying

    $$\mathbb{E}[\tau]=O(n\log n)+(1+o(1))\frac{n\log\frac{1}{\delta}}{(1-2p)\log\frac{1-p}{p}},\qquad\operatorname{Var}[\tau]=O\left(n^{3}\log^{3}\frac{1}{\delta}\right),$$

    and with the number of used random bits $r$ satisfying

    $$\mathbb{E}[r]=O(n\log n),\qquad\operatorname{Var}[r]=O(n^{2}\log^{2}n).$$
Proof.

Apply Lemma 18 to Theorem 10, Lemma 13, and Lemma 17 respectively. ∎

We introduce our last subroutine below, which is used to sort a subset of $\Theta(\frac{n}{\log n})$ elements in the main algorithm.

Lemma 20.

There exists a randomized algorithm SafeSimpleSort for $\mathsf{NoisySorting}(n)$ with error probability $o(1)$ that always uses $O(n\log n)$ queries and $O(n\log n)$ random bits.

Proof.

Consider the following algorithm $\mathcal{A}$:

  • Keep an array of the first $i$ elements for $i$ from $1$ to $n$. Each time a new element needs to be inserted, the algorithm uses SafeBinarySearch to find its correct position, with error probability $\frac{1}{n\log n}$.

  • Output the resulting array.

We have $n-1$ calls to SafeBinarySearch. Let $E_{i}$ be the event that the $i$-th call to SafeBinarySearch returns the correct value, and let $E$ be the event that all $E_{i}$ happen. By the union bound, the probability of error $\mathbb{P}[\neg E]$ is at most $(n-1)\cdot\frac{1}{n\log n}=o(1)$.

Let $\tau_{i}$ be the number of queries used in the $i$-th call to SafeBinarySearch. By Corollary 19 Part 1, we have

$$\mathbb{E}[\tau_{i}\mid E]=\mathbb{E}\left[\tau_{i}\,\middle|\,\bigwedge_{1\leq j\leq i}E_{j}\right]\leq\frac{\mathbb{E}\left[\tau_{i}\mid\bigwedge_{1\leq j<i}E_{j}\right]}{\mathbb{P}\left[E_{i}\mid\bigwedge_{1\leq j<i}E_{j}\right]}=O(\log n),$$

and

$$\operatorname{Var}[\tau_{i}\mid E]=\operatorname{Var}\left[\tau_{i}\,\middle|\,\bigwedge_{1\leq j\leq i}E_{j}\right]\leq\mathbb{E}\left[\tau_{i}^{2}\,\middle|\,\bigwedge_{1\leq j\leq i}E_{j}\right]\leq\frac{\mathbb{E}\left[\tau_{i}^{2}\mid\bigwedge_{1\leq j<i}E_{j}\right]}{\mathbb{P}\left[E_{i}\mid\bigwedge_{1\leq j<i}E_{j}\right]}\leq O\left(\operatorname{Var}\left[\tau_{i}\,\middle|\,\bigwedge_{1\leq j<i}E_{j}\right]+\mathbb{E}\left[\tau_{i}\,\middle|\,\bigwedge_{1\leq j<i}E_{j}\right]^{2}\right)=O(\log^{3}n).$$

Conditioned on $E$, the $\tau_{i}$'s are independent because they use disjoint sources of randomness. Thus, by Chebyshev's inequality,

$$\mathbb{P}\left[\sum_{i\in[n-1]}\tau_{i}\geq\sum_{i\in[n-1]}\mathbb{E}[\tau_{i}]+n^{2/3}\,\middle|\,E\right]\leq O\left(\frac{n\log^{3}n}{(n^{2/3})^{2}}\right)=o(1).$$

Let us define $m=\sum_{i\in[n-1]}\mathbb{E}[\tau_{i}\mid E]+n^{2/3}=O(n\log n)$.

If the number of random bits used in the $i$-th call to SafeBinarySearch is $r_{i}$, then we can similarly show

$$\mathbb{P}\left[\sum_{i\in[n-1]}r_{i}\geq\sum_{i\in[n-1]}\mathbb{E}[r_{i}]+n^{2/3}\,\middle|\,E\right]=o(1).$$

Let $R=\sum_{i\in[n-1]}\mathbb{E}[r_{i}\mid E]+n^{2/3}=O(n\log n)$.

Consider the following algorithm (which is our SafeSimpleSort):

  • Run $\mathcal{A}$ until it finishes, or it is about to make the $(m+1)$-th query, or it is about to use the $(R+1)$-th random bit.

  • If $\mathcal{A}$ finishes, then return the output of $\mathcal{A}$; otherwise, output an arbitrary permutation.

Then SafeSimpleSort always makes at most $m=O(n\log n)$ queries, always uses at most $R=O(n\log n)$ random bits, and has error probability $\mathbb{P}[\mathcal{A}\text{ errs}]+o(1)=o(1)$. ∎

Finally, we give our main algorithm for noisy sorting. See Algorithm 3 for its description.

We first analyze its error probability.

Algorithm 3
1: procedure NoisySort($A=(x_{1},\ldots,x_{n}),p$)
2:     $S\leftarrow\{-\infty,\infty\}$.
3:     for $a\in A$ do
4:         Add $a$ to $S$ with probability $\frac{1}{\log n}$. $\triangleright$ Fully independent random bits.
5:     Sort $S$ using SafeSimpleSort. $\triangleright$ Fully independent random bits.
6:     for $a\in A\setminus S$ do
7:         Use SafeBinarySearch with $\delta=\frac{1}{\log n}$ to search for the predecessor of $a$ in $S$. $\triangleright$ We use $n^{2/3}$ pairwise independent hash families to feed the first $n^{2/3}$ random bits of each SafeBinarySearch call, and the algorithm errs if some call needs more than $n^{2/3}$ random bits.
8:         Denote the returned answer by $\hat{l}_{a}$.
9:     $X\leftarrow\emptyset$.
10:     $\mathcal{A}\leftarrow S\setminus\{-\infty,\infty\}$.
11:     for $l\in S\setminus\{\infty\}$ do
12:         $r\leftarrow$ next element in $S$.
13:         $B_{l}\leftarrow\{a\in A\setminus S:\hat{l}_{a}=l\}$.
14:         if $|B_{l}|>6\log^{2}n$ then
15:             $X\leftarrow X\cup B_{l}$
16:         else
17:             Sort $B_{l}$ using SafeWeakSort with error probability $\frac{1}{n\log n}$. $\triangleright$ We use $n^{2/3}$ pairwise independent hash families to feed the first $n^{2/3}$ random bits of each SafeWeakSort call, and the algorithm errs if some call needs more than $n^{2/3}$ random bits.
18:             while $|B_{l}|>0$ do
19:                 $x\leftarrow$ first element in $B_{l}$.
20:                 if SafeLessThan($x$, $l$, $\frac{1}{n\log n}$) then
21:                     $B_{l}\leftarrow B_{l}\setminus\{x\}$.
22:                     $X\leftarrow X\cup\{x\}$.
23:                 else
24:                     break
25:             while $|B_{l}|>0$ do
26:                 $x\leftarrow$ last element in $B_{l}$.
27:                 if SafeLessThan($r$, $x$, $\frac{1}{n\log n}$) then
28:                     $B_{l}\leftarrow B_{l}\setminus\{x\}$.
29:                     $X\leftarrow X\cup\{x\}$.
30:                 else
31:                     break
32:             Add $B_{l}$ to $\mathcal{A}$ between $l$ and $r$.
33:     for $x\in X$ do
34:         Insert $x$ into $\mathcal{A}$ using SafeBinarySearch with $\delta=\frac{1}{n\log n}$. $\triangleright$ We use $n^{2/3}$ pairwise independent hash families to feed the first $n^{2/3}$ random bits of each SafeBinarySearch call, and the algorithm errs if some call needs more than $n^{2/3}$ random bits.
35:     return $\mathcal{A}$.
Lemma 21.

Algorithm 3 has error probability $o(1)$.

Proof.

Let $E_{0}$ be the event that the algorithm does not err because of a lack of random bits at Line 7, Line 17, or Line 34. Let $E_{1}$ be the event that SafeSimpleSort successfully sorts $S$ at Line 5; $E_{2}$ be the event that all calls of SafeWeakSort at Line 17 are correct; $E_{3}$ be the event that all calls of SafeLessThan at Lines 20 and 27 are correct; $E_{4}$ be the event that all calls of SafeBinarySearch at Line 34 are correct.

We show that if $E_{0},E_{1},E_{2},E_{3},E_{4}$ all happen, then Algorithm 3 is correct. First of all, under $E_{1}$, $S$ is correctly sorted at Line 5. Secondly, under $E_{2}$ and $E_{0}$, for each $l$, $B_{l}$ is correctly sorted at Line 17. Under $E_{3}$ and $E_{0}$, at Line 32, all elements remaining in $B_{l}$ are indeed greater than $l$ and smaller than $r$, so adding $B_{l}$ to $\mathcal{A}$ between $l$ and $r$ keeps all elements in $\mathcal{A}$ sorted. Therefore, before the for loop at Line 33, all elements in $\mathcal{A}$ are correctly sorted, and clearly these elements are exactly $A\setminus X$. Finally, under $E_{4}$ and $E_{0}$, the insertions made at Line 34 are all correct, so the final result is correct.

For each call of SafeBinarySearch at Line 7, the probability that it requires more than $n^{2/3}$ random bits is $O(\frac{\operatorname{polylog}n}{n^{4/3}})$ by Corollary 19 Part 1 and Chebyshev's inequality. By the union bound, with probability $1-o(1)$, none of the calls of SafeBinarySearch at Line 7 causes the algorithm to err. We can similarly bound the probability that the algorithm errs at Line 17 or Line 34. Overall, $\Pr[\neg E_{0}]\leq o(1)$. Clearly, $\Pr[\neg E_{1}]\leq o(1)$, $\Pr[\neg E_{2}]\leq\frac{1}{n\log n}\cdot(|S|-1)\leq\frac{1}{\log n}$, $\Pr[\neg E_{3}]\leq\frac{1}{n\log n}\cdot(2n)=\frac{2}{\log n}$, and $\Pr[\neg E_{4}]\leq\frac{1}{n\log n}\cdot n=\frac{1}{\log n}$. Thus, by the union bound, the overall success probability of Algorithm 3 is at least $1-\frac{4}{\log n}-o(1)=1-o(1)$. ∎

Let a bucket be the set of elements lying between two adjacent elements of $S$ in the sorted order. Then we have the following simple lemma.

Lemma 22.

With probability $\geq 1-\frac{1}{n}$, all buckets have size at most $3\log^{2}n$.

Proof.

For any contiguous segment of length $L$ in the sorted order of $A$, the probability that none of these elements is in $S$ is $(1-1/\log n)^{L}\leq\exp(-L/\log n)$. By the union bound, the probability that there exist $L$ contiguous elements not in $S$ is at most $n\exp(-L/\log n)$. Taking $L=2\log_{e}n\cdot\log n\leq 3\log^{2}n$, we see that with probability $\geq 1-\frac{1}{n}$, all contiguous segments of length $L$ have at least one element in $S$, which implies that all buckets have size $\leq 3\log^{2}n$. ∎

Lemma 23.

With probability $1-o(1)$, Algorithm 3 uses at most $(1+o(1))\left(\frac{1}{I(p)}+\frac{1}{(1-2p)\log\frac{1-p}{p}}\right)n\log n$ queries.

Proof.

Because erring only makes the algorithm exit earlier, in the analysis we ignore the effect of erring caused by insufficient random bits. We define the following events:

  • Let $E_{1}$ be the event that $|S|\leq\frac{n}{\log n}+n^{2/3}=(1+o(1))\frac{n}{\log n}$. By the Chernoff bound, $\mathbb{P}[E_{1}]=1-o(1)$.

  • Let $E_{2}$ be the event that all buckets have size at most $3\log^{2}n$. By Lemma 22, $\mathbb{P}[E_{2}]=1-o(1)$.

  • Let $E_{3}$ be the event that at most $\frac{n}{\log n}+n^{2/3}=(1+o(1))\frac{n}{\log n}$ elements $a$ have the wrong predecessor $\hat{l}_{a}$ at Line 7. By Chebyshev's inequality, $\mathbb{P}[E_{3}]=1-o(1)$.

  • Let $E_{4}$ be the event that all calls to SafeLessThan on Lines 20 and 27 return the correct values. Because there are at most $2n$ calls, by the union bound, $\mathbb{P}[E_{4}]=1-o(1)$.

  • Let $E_{5}$ be the event that $|X|\leq O(\frac{n}{\log n})$. If $E_{2}$, $E_{3}$, and $E_{4}$ all happen, then $E_{5}$ happens, for the following reasons. First, at Line 15, $B_{l}$ has size greater than $6\log^{2}n$, while the sizes of all buckets are at most $3\log^{2}n$ conditioned on $E_{2}$. Thus, at least half of the elements in $B_{l}$ have the wrong predecessor $\hat{l}_{a}$ at Line 7. Thus, the number of elements added to $X$ at Line 15 is at most twice the number of elements $a$ with the wrong predecessor $\hat{l}_{a}$, which is bounded by $O(\frac{n}{\log n})$ conditioned on $E_{3}$. Also, if an element $x$ is added to $X$ at Line 22 or 29 and $E_{4}$ happens, $x$ must also have the wrong predecessor. Thus, overall, $|X|\leq O(\frac{n}{\log n})$ if $E_{2}$, $E_{3}$, and $E_{4}$ all happen.

We make queries on the following lines: Lines 5, 7, 17, 20, 27, and 34. Let us consider them separately.

Line 5.

By Lemma 20, conditioned on $E_{1}$, with probability $1-o(1)$, Line 5 uses $O(n)$ queries.

Line 7.

We have at most $n$ calls to SafeBinarySearch, whose numbers of queries are independent. By Corollary 19 Part 1,

$$\mathbb{E}[\text{number of queries}]=(1+o(1))\frac{n\log n}{I(p)}.$$

By Corollary 19 Part 1 and Chebyshev's inequality,

$$\mathbb{P}\left[\text{number of queries}>\mathbb{E}[\text{number of queries}]+n^{2/3}\right]=o(1).$$

Line 17.

We have a number of calls to SafeWeakSort, where every input length is at most $6\log^{2}n$ and the total input length is at most $n$. By Corollary 19 Part 3,

$$\mathbb{E}[\text{number of queries}]=(1+o(1))\frac{n\log n}{(1-2p)\log\frac{1-p}{p}}.$$

By Corollary 19 Part 3 and Chebyshev's inequality,

$$\mathbb{P}\left[\text{number of queries}>\mathbb{E}[\text{number of queries}]+n^{2/3}\right]=o(1).$$

Lines 20, 27.

Conditioned on $E_{1}$, $E_{3}$, $E_{4}$, the number of calls to SafeLessThan is

$$O(|S|)+O(\text{number of elements in the wrong bucket})=O\left(\frac{n}{\log n}\right).$$

By Corollary 19 Part 2,

$$\mathbb{E}[\text{number of queries}]=O(n).$$

By Corollary 19 Part 2 and Chebyshev's inequality,

$$\mathbb{P}\left[\text{number of queries}>\mathbb{E}[\text{number of queries}]+n^{2/3}\right]=o(1).$$

Line 34.

Conditioned on $E_{5}$, the number of calls to SafeBinarySearch is $O(\frac{n}{\log n})$. By Corollary 19 Part 1,

$$\mathbb{E}[\text{number of queries}]=O(n).$$

By Corollary 19 Part 1 and Chebyshev's inequality,

$$\mathbb{P}\left[\text{number of queries}>\mathbb{E}[\text{number of queries}]+n^{2/3}\right]=o(1).$$

Summarizing the above, excluding events with $o(1)$ probability in total, the total number of queries made is

$$O(n)+(1+o(1))\frac{n\log n}{I(p)}+(1+o(1))\frac{n\log n}{(1-2p)\log\frac{1-p}{p}}+O(n)+O(n)+O(n^{2/3})=(1+o(1))\left(\frac{n\log n}{I(p)}+\frac{n\log n}{(1-2p)\log\frac{1-p}{p}}\right).$$

∎

Finally, we analyze the expected number of random bits used by the algorithm.

Lemma 24.

Algorithm 3 uses $O(n)$ random bits in expectation.

Proof.

We consider all the places Algorithm 3 uses randomness.

Line 4.

Any biased random bit can be simulated using $O(1)$ unbiased random bits in expectation [KY76], so Line 4 uses $O(n)$ random bits in expectation.

Line 5.

By the Chernoff bound, $|S|\leq 100n/\log n$ with probability at least $1-n^{-2}$. In this case, SafeSimpleSort uses $O(|S|\log|S|)=O(n)$ random bits. If $|S|>100n/\log n$, SafeSimpleSort uses $O(n\log n)$ random bits, which contributes at most $O(n\log n\cdot n^{-2})$ to the overall expectation. Thus, Line 5 uses $O(n)$ random bits in expectation.

Line 7, Line 17, Line 34.

For each of the first $n^{2/3}$ output random bits, we use Lemma 14 to construct a pairwise independent hash family of size $n$ using $O(\log n)$ fresh random bits. The total number of fresh random bits needed is $O(n^{2/3}\log n)$. ∎

Now we are ready to prove Theorem 15.

Proof of Theorem 15.

Consider the following algorithm, which we call SafeNoisySort:

  • Run NoisySort. Stop immediately if we are about to use more than $m$ queries, for some $m$ to be chosen later.

  • If NoisySort completed successfully, then output the sorted array; otherwise, output an arbitrary permutation.

By Lemma 23, we can choose $m=(1+o(1))\left(\frac{1}{I(p)}+\frac{1}{(1-2p)\log\frac{1-p}{p}}\right)n\log n$ so that with probability $1-o(1)$, the NoisySort call completes successfully. By Lemma 21, with probability $1-o(1)$, the NoisySort call is correct. By the union bound, SafeNoisySort has error probability $o(1)$ and always takes at most $m$ queries. By Lemma 24, the number of random bits used is $O(n)$ in expectation. ∎

3.1 Trading Queries for Random Bits

In this section, we show how to use the randomized algorithm to construct a deterministic one.

Recall Theorem 1, which we now prove.

Proof.

In our model, it is possible to generate unbiased random bits using noisy comparisons. Take two arbitrary elements $x,y$, and call $\textsc{NoisyComparison}(x,y)$ twice. If the two returned values are different, we use the first returned value as our random bit; otherwise, we repeat. Clearly, this procedure generates an unbiased random bit. The probability that this procedure successfully returns a random bit after each pair of calls to NoisyComparison is $2p(1-p)$, so the expected number of noisy comparisons needed to generate each random bit is $O(\frac{1}{2p(1-p)})=O(1)$. Note that this procedure is deterministic.

Algorithm 3 uses $O(n)$ random bits in expectation, so we can use the above procedure to generate the random bits, and the expected number of queries needed for this is $O(n)$. We can halt the algorithm if the number of queries used this way exceeds $O(n\log\log n)$, which only incurs an additional $o(1)$ error probability. ∎
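A minimal Python sketch of this extraction procedure (the noisy comparison oracle is simulated as in Definition 6):

import random

def noisy_comparison(x, y, p):
    # Simulated oracle from Definition 6: BSC_p applied to 1{x < y}.
    return (1 if x < y else 0) ^ (random.random() < p)

def random_bit_from_noise(x, y, p):
    # Call NoisyComparison(x, y) twice; if the answers differ, output the first one,
    # otherwise repeat. Each round succeeds with probability 2p(1-p), so the expected
    # number of noisy comparisons per extracted bit is O(1); the output bit is unbiased.
    while True:
        b1 = noisy_comparison(x, y, p)
        b2 = noisy_comparison(x, y, p)
        if b1 != b2:
            return b1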

4 Noisy Sorting Lower Bound

In this section, we show the lower bound for noisy sorting, Theorem 2.

The following problem serves as an intermediate step towards the lower bound.

Definition 25 (Group Sorting Problem).

For integers $k\mid n$, we define the problem $\mathsf{GroupSorting}(n,k)$ as follows: Given a list $L$ of $n$ elements, divided into $n/k$ groups $A_{1},\ldots,A_{n/k}$ of $k$ elements each, satisfying the property that for all $1\leq i<j\leq n/k$, $x\in A_{i}$, $y\in A_{j}$, we have $x<y$. An algorithm needs to output the ordered list of sets $A_{1},\ldots,A_{n/k}$ by asking queries of the following type, $\textsc{GroupQuery}(x,y)$:

  • If the two elements are in the same group, then the query result is $*$.

  • Otherwise, suppose $x\in A_{i}$, $y\in A_{j}$ with $i\neq j$. Then the query result is $\textsc{NoisyComparison}(i,j)$.

The following lemma relates GroupSorting and NoisySorting.

Lemma 26.

Fix any algorithm $\mathcal{A}$ for $\mathsf{NoisySorting}(n)$ with error probability $o(1)$. Let $k\mid n$. Let $L$ be the input list of $\mathsf{NoisySorting}(n)$. Suppose we partition $L$ into $n/k$ groups $A_{1},\ldots,A_{n/k}$ where for all $1\leq i<j\leq n/k$, for all $x\in A_{i}$, $y\in A_{j}$, we have $x<y$. Let $U^{\neq}$ denote the number of queries $\mathcal{A}$ makes to elements in different groups.

Then there exists an algorithm for $\mathsf{GroupSorting}(n,k)$ with error probability $o(1)$ which makes at most $\mathbb{E}[U^{\neq}]+(n-n/k)$ queries in expectation.

Proof.

Given $\mathcal{A}$, we design an algorithm $\mathcal{A}^{\prime}$ for $\mathsf{GroupSorting}(n,k)$ as follows.

Given input $L$, Algorithm $\mathcal{A}^{\prime}$ picks an arbitrary strict total order $<_{T}$ on all the elements. $\mathcal{A}^{\prime}$ also maintains which elements are known to be in the same group, as implied by queries that return $*$. Then, it simulates Algorithm $\mathcal{A}$ on the same input $L$ as follows:

  • When $\mathcal{A}$ attempts to make a comparison between $x$ and $y$:

    • If $x,y$ are not known to be in the same group, call $\textsc{GroupQuery}(x,y)$. If the query returns $*$, which means $x$ and $y$ are in the same group, return $\operatorname{BSC}_{p}(\mathbbm{1}\{x<_{T}y\})$ to $\mathcal{A}$. Otherwise, return the result of $\textsc{GroupQuery}(x,y)$ to $\mathcal{A}$.

    • If $x,y$ are known to be in the same group, return $\operatorname{BSC}_{p}(\mathbbm{1}\{x<_{T}y\})$ to $\mathcal{A}$.

  • When $\mathcal{A}$ outputs a sequence $x_{1},\ldots,x_{n}$: Let $A_{i}=\{x_{(i-1)k+1},\ldots,x_{ik}\}$ for $i\in[n/k]$ and output $A_{1},\ldots,A_{n/k}$.

Let us analyze the error probability and number of queries made by algorithm $\mathcal{A}^{\prime}$.

Error probability.

Algorithm $\mathcal{A}^{\prime}$ simulates a $\mathsf{NoisySorting}(n)$ instance on $L$ with the following total order:

  • If $x,y$ are in the same group, then $x<y$ iff $x<_{T}y$;

  • If $x\in A_{i}$, $y\in A_{j}$ are in different groups, then $x<y$ iff $i<j$.

Therefore, with error probability $o(1)$, the sequence $x_{1},\ldots,x_{n}$ that $\mathcal{A}$ outputs is sorted with respect to the above total order, which gives the valid groups.

Number of queries.

Algorithm $\mathcal{A}^{\prime}$ makes one query whenever $\mathcal{A}$ makes a query on elements in different groups. Algorithm $\mathcal{A}^{\prime}$ makes queries between elements in the same group only when they are not yet known to be in the same group. Imagine an initially empty graph on the elements of $L$ where, for each $\textsc{GroupQuery}(x,y)$ that returns $*$, we add an edge between $x$ and $y$. For every such query, the number of connected components in this graph decreases by $1$, and at the end the graph has at least $n/k$ connected components. Thus, the total number of queries made by $\mathcal{A}^{\prime}$ between elements in the same group is at most $n-n/k$.

Overall, the total number of queries made by Algorithm $\mathcal{A}^{\prime}$ is at most $\mathbb{E}[U^{\neq}]+(n-n/k)$ in expectation. ∎

Next we prove a lower bound for the Group Sorting problem. By Lemma 26, this implies a lower bound on the number of queries made between elements in different groups by any Noisy Sorting algorithm.

Lemma 27.

Let knk\mid n with k=no(1)k=n^{o(1)}. Any (deterministic or randomized) algorithm for 𝖦𝗋𝗈𝗎𝗉𝖲𝗈𝗋𝗍𝗂𝗇𝗀(n,k)\mathsf{GroupSorting}(n,k) with error probability o(1)o(1) makes at least (1o(1))nlognI(p)(1-o(1))\frac{n\log n}{I(p)} queries in expectation, even if the input is a uniformly random permutation.

Proof.

Fix an algorithm 𝒜\mathcal{A} for 𝖦𝗋𝗈𝗎𝗉𝖲𝗈𝗋𝗍𝗂𝗇𝗀(n,k)\mathsf{GroupSorting}(n,k). WLOG assume that 𝒜\mathcal{A} makes queries between elements in the same group only when they are not known to be in the same group. Therefore, the total number of queries between elements in the same group is at most nn/kn-n/k.

Let X=(A1,,An/k)X=(A_{1},\ldots,A_{n/k}) be the true partition. Let mm be the (random) number of queries made. Let QmQ^{m} be the queries made. Let YmY^{m} be the returned values of the queries. Let X^\hat{X} be the most probable XX given all query results. Note that X^\hat{X} is a function of (Qm,Ym)(Q^{m},Y^{m}).

When k=no(1)k=n^{o(1)}, we have H(X)=logn!nklogk!=(1o(1))nlognH(X)=\log n!-\frac{n}{k}\log k!=(1-o(1))n\log n.

By Fano’s inequality, H(XX^)1+o(nlogn)H(X\mid\hat{X})\leq 1+o(n\log n), so

I(X;Qm,Ym)I(X;X^)=H(X)H(XX^)(1o(1))nlogn.\displaystyle I(X;Q^{m},Y^{m})\geq I(X;\hat{X})=H(X)-H(X\mid\hat{X})\geq(1-o(1))n\log n. (1)

On the other hand,

I(X;Qm,Ym)\displaystyle I(X;Q^{m},Y^{m}) =i1[mi]I(X;Qi,YiQi1,Yi1,mi)\displaystyle=\sum_{i\geq 1}\mathbb{P}[m\geq i]I(X;Q_{i},Y_{i}\mid Q^{i-1},Y^{i-1},m\geq i) (chain rule of mutual information)
=i1[mi]I(X;YiQi,Qi1,Yi1,mi)\displaystyle=\sum_{i\geq 1}\mathbb{P}[m\geq i]I(X;Y_{i}\mid Q_{i},Q^{i-1},Y^{i-1},m\geq i) (QiQ_{i} and XX are independent conditioned on (Qi1,Yi1)(Q^{i-1},Y^{i-1}))
i1[mi]I(X,Qi,Qi1,Yi1;Yimi)\displaystyle\leq\sum_{i\geq 1}\mathbb{P}[m\geq i]I(X,Q_{i},Q^{i-1},Y^{i-1};Y_{i}\mid m\geq i) (chain rule)
=i1[mi]I(X,Qi;Yimi)\displaystyle=\sum_{i\geq 1}\mathbb{P}[m\geq i]I(X,Q_{i};Y_{i}\mid m\geq i) ((Qi1,Yi1)(Q^{i-1},Y^{i-1}) and YiY_{i} are independent conditioned on (X,Qi)(X,Q_{i}))
=i1[mi](H(Yimi)H(YiX,Qi,mi))\displaystyle=\sum_{i\geq 1}\mathbb{P}[m\geq i]\left(H(Y_{i}\mid m\geq i)-H(Y_{i}\mid X,Q_{i},m\geq i)\right) (2)

Let qi=[Yi=mi]q_{i}=\mathbb{P}[Y_{i}=*\mid m\geq i]. Because the algorithm makes at most nn/kn-n/k queries between elements in the same group, we have

i1[mi]qinn/k.\displaystyle\sum_{i\geq 1}\mathbb{P}[m\geq i]q_{i}\leq n-n/k.

We have

H(Yimi)h(qi,1qi2,1qi2)\displaystyle H(Y_{i}\mid m\geq i)\leq h\left(q_{i},\frac{1-q_{i}}{2},\frac{1-q_{i}}{2}\right)

and

H(YiX,Qi,mi)=0qi+h(p)(1qi).\displaystyle H(Y_{i}\mid X,Q_{i},m\geq i)=0\cdot q_{i}+h(p)(1-q_{i}).

Note that H(Yimi)log3H(Y_{i}\mid m\geq i)\leq\log 3 trivially, so by Equation (2), I(X;Qm,Ym)log3i1[mi]=log3𝔼[m]I(X;Q^{m},Y^{m})\leq\log 3\cdot\sum_{i\geq 1}\mathbb{P}[m\geq i]=\log 3\cdot\mathbb{E}[m]. Combining with Equation (1), we have 𝔼[m]=Ω(nlogn)\mathbb{E}[m]=\Omega(n\log n).

Note that g(x):=h(x,1x2,1x2)g(x):=h(x,\frac{1-x}{2},\frac{1-x}{2}) is concave in xx, so

i1[mi]H(Yimi)\displaystyle\sum_{i\geq 1}\mathbb{P}[m\geq i]H(Y_{i}\mid m\geq i) (i1[mi])g(i1[mi]qii1[mi])\displaystyle\leq\left(\sum_{i\geq 1}\mathbb{P}[m\geq i]\right)g\left(\frac{\sum_{i\geq 1}\mathbb{P}[m\geq i]q_{i}}{\sum_{i\geq 1}\mathbb{P}[m\geq i]}\right) (Jensen’s inequality)
𝔼[m]g(min{13,nn/k𝔼[m]})\displaystyle\leq\mathbb{E}[m]\cdot g\left(\min\left\{\frac{1}{3},\frac{n-n/k}{\mathbb{E}[m]}\right\}\right) (gg is increasing on [0,13][0,\frac{1}{3}] and is maximized at 13\frac{1}{3})
=(1+o(1))𝔼[m].\displaystyle=(1+o(1))\mathbb{E}[m]. (n=o(𝔼[m])n=o(\mathbb{E}[m]))
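To make the last step explicit, expanding the ternary entropy (logarithms in base 2) gives

g(x)=h(x,\frac{1-x}{2},\frac{1-x}{2})=-x\log x-(1-x)\log\frac{1-x}{2}=h(x)+(1-x),

where h()h(\cdot) with a single argument denotes the binary entropy function. In particular, gg is increasing on [0,\frac{1}{3}] with maximum value g(\frac{1}{3})=\log 3, and g(x)=1+o(1)g(x)=1+o(1) whenever x=o(1)x=o(1), which is what the last two steps use.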

On the other hand,

i1[mi]H(YiX,Qi,mi)\displaystyle\sum_{i\geq 1}\mathbb{P}[m\geq i]H(Y_{i}\mid X,Q_{i},m\geq i) =i1[mi]h(p)(1qi)\displaystyle=\sum_{i\geq 1}\mathbb{P}[m\geq i]h(p)(1-q_{i})
=h(p)(i1[mi]i1[mi]qi)\displaystyle=h(p)\left(\sum_{i\geq 1}\mathbb{P}[m\geq i]-\sum_{i\geq 1}\mathbb{P}[m\geq i]q_{i}\right)
h(p)(𝔼[m]n)\displaystyle\geq h(p)(\mathbb{E}[m]-n)
=(1o(1))h(p)𝔼[m].\displaystyle=(1-o(1))h(p)\mathbb{E}[m].

Plugging the above two inequalities in Equation (2), we get

I(X;Qm,Ym)\displaystyle I(X;Q^{m},Y^{m}) (1+o(1))𝔼[m](1o(1))h(p)𝔼[m]\displaystyle\leq(1+o(1))\mathbb{E}[m]-(1-o(1))h(p)\mathbb{E}[m]
(1+o(1))I(p)𝔼[m].\displaystyle\leq(1+o(1))I(p)\mathbb{E}[m].

Combining with Equation (1), we get 𝔼[m](1o(1))nlognI(p)\mathbb{E}[m]\geq(1-o(1))\frac{n\log n}{I(p)}. ∎

Before we give a lower bound on the number of queries made between elements in the same group, we first show the following technical lemma, which will be useful later.

Lemma 28.

Let (Xn)n0(X_{n})_{n\geq 0} be a Markov process defined as

X0=0,Xn+1=Xn+{+1w.p.1p,1w.p.p.\displaystyle X_{0}=0,\quad X_{n+1}=X_{n}+\left\{\begin{array}[]{ll}+1&\text{w.p.}~{}1-p,\\ -1&\text{w.p.}~{}p.\end{array}\right.

Let m>0m>0 be a real number and let τ\tau be a random variable supported on 0\mathbb{Z}_{\geq 0}. Suppose there exists δ>0\delta>0 such that

𝔼[τ](1δ)m/(12p).\displaystyle\mathbb{E}[\tau]\leq(1-\delta)m/(1-2p).

Then

[Xτm]δ/2om(1).\displaystyle\mathbb{P}[X_{\tau}\leq m]\geq\delta/2-o_{m}(1).
Proof.

Let E1E_{1} be the event that τ(1δ/2)m/(12p)\tau\leq(1-\delta/2)m/(1-2p). Let E2E_{2} be the event that XtmX_{t}\leq m for all t(1δ/2)m/(12p)t\leq(1-\delta/2)m/(1-2p). By Markov’s inequality, [E1]δ/2\mathbb{P}[E_{1}]\geq\delta/2. By concentration inequalities (for every such tt we have 𝔼[Xt]=(12p)t(1δ/2)m\mathbb{E}[X_{t}]=(1-2p)t\leq(1-\delta/2)m, so Hoeffding’s inequality together with a union bound over tt applies), [E2]1om(1)\mathbb{P}[E_{2}]\geq 1-o_{m}(1). Since E1E2E_{1}\land E_{2} implies XτmX_{\tau}\leq m, a union bound gives

[Xτm][E1E2]δ/2om(1).\displaystyle\mathbb{P}[X_{\tau}\leq m]\geq\mathbb{P}[E_{1}\land E_{2}]\geq\delta/2-o_{m}(1). ∎
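As a quick empirical illustration of Lemma 28 (not used anywhere in the proofs), the following Python snippet samples the biased walk at a random time τ whose mean is below the threshold, for illustrative parameter values of p, m, and δ, and estimates ℙ[X_τ ≤ m]:

import random

def walk_position(steps, p):
    # Position of the walk after `steps` steps: +1 w.p. 1-p, -1 w.p. p.
    return sum(1 if random.random() >= p else -1 for _ in range(steps))

def estimate_lemma28(p=0.1, m=200, delta=0.3, trials=2000):
    # Draw tau uniformly so that E[tau] is roughly (1 - delta) * m / (1 - 2p).
    mean_tau = (1 - delta) * m / (1 - 2 * p)
    hits = sum(walk_position(random.randint(0, int(2 * mean_tau)), p) <= m
               for _ in range(trials))
    return hits / trials   # Lemma 28 guarantees a value of at least about delta / 2

print(estimate_lemma28())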

Lemma 29.

Fix any (deterministic or randomized) algorithm for 𝖭𝗈𝗂𝗌𝗒𝖲𝗈𝗋𝗍𝗂𝗇𝗀(n)\mathsf{NoisySorting}(n) with error probability o(1)o(1). Let knk\mid n. Let LL be the input list of 𝖭𝗈𝗂𝗌𝗒𝖲𝗈𝗋𝗍𝗂𝗇𝗀(n)\mathsf{NoisySorting}(n). Suppose we partition LL into n/kn/k groups A1,,An/kA_{1},\ldots,A_{n/k} where for all 1i<jn/k1\leq i<j\leq n/k, for all xAix\in A_{i}, yAjy\in A_{j}, we have x<yx<y. Let U=U^{=} denote the number of queries made to elements in the same group.

Then 𝔼[U=](1o(1))(nn/k)log(nn/k)(12p)log1pp\mathbb{E}[U^{=}]\geq(1-o(1))\frac{(n-n/k)\log(n-n/k)}{(1-2p)\log\frac{1-p}{p}}, even if the input is a uniformly random permutation.

Proof.

Suppose the true order is σ=(x1,,xn)\sigma=(x_{1},\ldots,x_{n}). Let S={i∈[n−1]:k∤i}S=\{i\in[n-1]:k\nmid i\}, the set of indices ii such that xix_{i} and xi+1x_{i+1} lie in the same group. Note that |S|=nn/k|S|=n-n/k. For iSi\in S, define WiW_{i} to be the number of queries returning xi<xi+1x_{i}<x_{i+1}, minus the number of queries returning xi>xi+1x_{i}>x_{i+1}. Note that WiW_{i} is a random variable depending on the transcript. Define W=iSWiW=\sum_{i\in S}W_{i}. Then 𝔼[W]≤(12p)𝔼[U=]\mathbb{E}[W]\leq(1-2p)\mathbb{E}[U^{=}] (queries between non-adjacent elements of the same group do not contribute to WW).

Let us consider the posterior probabilities qq of the permutations given a fixed transcript. That is, define

qτ:=[σ=τQm,Ym].q_{\tau}:=\mathbb{P}[\sigma=\tau\mid Q^{m},Y^{m}].

By Bayes’ rule, qτq_{\tau} equals [Qm,Ymσ=τ][σ=τ][Qm,Ym]\frac{\mathbb{P}[Q^{m},Y^{m}\mid\sigma=\tau]\mathbb{P}[\sigma=\tau]}{\mathbb{P}[Q^{m},Y^{m}]}. By symmetry, [σ=τ]\mathbb{P}[\sigma=\tau] is the same for all τ\tau and [Qm,Ym]\mathbb{P}[Q^{m},Y^{m}] is constant as we fixed the transcript, so qτq_{\tau} is proportional to [Qm,Ymσ=τ]\mathbb{P}[Q^{m},Y^{m}\mid\sigma=\tau]. If τ=(y1,,yn)\tau=(y_{1},\ldots,y_{n}), then [Qm,Ymσ=τ]\mathbb{P}[Q^{m},Y^{m}\mid\sigma=\tau] equals

1i<jn(1p)(#queries returning yi<yj)1i<jnp(#queries returning yi>yj).\prod_{1\leq i<j\leq n}(1-p)^{(\text{\#queries returning }y_{i}<y_{j})}\cdot\prod_{1\leq i<j\leq n}p^{(\text{\#queries returning }y_{i}>y_{j})}. (3)

For iSi\in S, consider the permutation τi:=(x1,,xi1,xi+1,xi,xi+2,,xn)\tau_{i}:=(x_{1},\ldots,x_{i-1},x_{i+1},x_{i},x_{i+2},\ldots,x_{n}). Then by comparing qτiq_{\tau_{i}} and qσq_{\sigma} using (3), we have

qσqτi\displaystyle\frac{q_{\sigma}}{q_{\tau_{i}}} =(1p)(#queries returning xi<xi+1)p(#queries returning xi>xi+1)(1p)(#queries returning xi>xi+1)p(#queries returning xi<xi+1)\displaystyle=\frac{(1-p)^{(\text{\#queries returning }x_{i}<x_{i+1})}p^{(\text{\#queries returning }x_{i}>x_{i+1})}}{(1-p)^{(\text{\#queries returning }x_{i}>x_{i+1})}p^{(\text{\#queries returning }x_{i}<x_{i+1})}}
=(1pp)(#queries returning xi<xi+1)(1pp)(#queries returning xi>xi+1)\displaystyle=\left(\frac{1-p}{p}\right)^{(\text{\#queries returning }x_{i}<x_{i+1})}\left(\frac{1-p}{p}\right)^{-(\text{\#queries returning }x_{i}>x_{i+1})}
=(1pp)Wi.\displaystyle=\left(\frac{1-p}{p}\right)^{W_{i}}.

Thus,

qτi=qσ(1pp)Wi.\displaystyle q_{\tau_{i}}=q_{\sigma}\cdot\left(\frac{1-p}{p}\right)^{-W_{i}}.

Summing over SS, we have

iSqτi\displaystyle\sum_{i\in S}q_{\tau_{i}} =qσiS(1pp)Wi\displaystyle=q_{\sigma}\sum_{i\in S}\left(\frac{1-p}{p}\right)^{-W_{i}}
qσ|S|(1pp)W/|S|.\displaystyle\geq q_{\sigma}|S|\left(\frac{1-p}{p}\right)^{-W/|S|}.

Because the error probability is o(1)o(1), for any ϵ>0\epsilon>0, with probability 1o(1)1-o(1), we have qσ1ϵq_{\sigma}\geq 1-\epsilon, and hence iSqτi1qσϵ\sum_{i\in S}q_{\tau_{i}}\leq 1-q_{\sigma}\leq\epsilon. Combining with the previous display, we get

|S|(1pp)W/|S|ϵ/(1ϵ)\displaystyle|S|\left(\frac{1-p}{p}\right)^{-W/|S|}\leq\epsilon/(1-\epsilon)

with probability 1o(1)1-o(1).

On the other hand, for any δ>0\delta>0, if

𝔼[U=](1δ)|S|log|S|(12p)log1pp,\displaystyle\mathbb{E}[U^{=}]\leq(1-\delta)\frac{|S|\log|S|}{(1-2p)\log\frac{1-p}{p}},

then by regarding U=U^{=} as τ\tau and WW as XτX_{\tau} in Lemma 28, with probability Ω(δ)o|S|(1)\Omega(\delta)-o_{|S|}(1), we have

W|S|log|S|log1pp\displaystyle W\leq\frac{|S|\log|S|}{\log\frac{1-p}{p}}

and then

|S|(1pp)W/|S|1,\displaystyle|S|\left(\frac{1-p}{p}\right)^{-W/|S|}\geq 1,

which is a contradiction.

Thus,

𝔼[U=](1o(1))|S|log|S|(12p)log1pp,\displaystyle\mathbb{E}[U^{=}]\geq(1-o(1))\frac{|S|\log|S|}{(1-2p)\log\frac{1-p}{p}},

as desired. ∎

Now we are ready to prove Theorem 2.

Proof of Theorem 2.

Fix an algorithm for 𝖭𝗈𝗂𝗌𝗒𝖲𝗈𝗋𝗍𝗂𝗇𝗀(n)\mathsf{NoisySorting}(n) with error probability o(1)o(1).

Take k=lognk=\log n. Let n=kn/kn^{\prime}=k\lfloor n/k\rfloor. Clearly, an algorithm for 𝖭𝗈𝗂𝗌𝗒𝖲𝗈𝗋𝗍𝗂𝗇𝗀(n)\mathsf{NoisySorting}(n) can be used to solve 𝖭𝗈𝗂𝗌𝗒𝖲𝗈𝗋𝗍𝗂𝗇𝗀(n)\mathsf{NoisySorting}(n^{\prime}) with at most the same expected number of queries. Let UU^{\neq}, U=U^{=} be as defined in Lemma 26 and Lemma 29 for nn^{\prime} and kk.

By Lemma 26 and Lemma 27, we have

𝔼[U](1o(1))nlognI(p)(nn/k)=(1o(1))nlognI(p).\mathbb{E}[U^{\neq}]\geq(1-o(1))\frac{n^{\prime}\log n^{\prime}}{I(p)}-(n^{\prime}-n^{\prime}/k)=(1-o(1))\frac{n^{\prime}\log n^{\prime}}{I(p)}.

By Lemma 29, we have

𝔼[U=](1o(1))(nn/k)log(nn/k)(12p)log1pp=(1o(1))nlogn(12p)log1pp.\mathbb{E}[U^{=}]\geq(1-o(1))\frac{(n^{\prime}-n^{\prime}/k)\log(n^{\prime}-n^{\prime}/k)}{(1-2p)\log\frac{1-p}{p}}=(1-o(1))\frac{n^{\prime}\log n^{\prime}}{(1-2p)\log\frac{1-p}{p}}.

Therefore, the expected number of queries made by the algorithm is at least

𝔼[U]+𝔼[U=]\displaystyle\mathbb{E}[U^{\neq}]+\mathbb{E}[U^{=}] (1o(1))(1I(p)+1(12p)log1pp)nlogn\displaystyle\geq(1-o(1))\left(\frac{1}{I(p)}+\frac{1}{(1-2p)\log\frac{1-p}{p}}\right)n^{\prime}\log n^{\prime}
=(1o(1))(1I(p)+1(12p)log1pp)nlogn.\displaystyle=(1-o(1))\left(\frac{1}{I(p)}+\frac{1}{(1-2p)\log\frac{1-p}{p}}\right)n\log n. ∎
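For a concrete sense of the constant (an illustrative numerical evaluation, not part of the proof): at p=0.1 we have I(p) ≈ 0.531 and (1−2p)log((1−p)/p) = 0.8 log 9 ≈ 2.54, so Theorem 2 gives a lower bound of roughly (1/0.531 + 1/2.54) n log n ≈ 2.28 n log n noisy comparisons, of which about 1.88 n log n comes from the information-theoretic term alone.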

5 Noisy Binary Search

In this section, we prove our lower and upper bounds for Noisy Binary Search.

5.1 Lower Bound

In this section, we prove our lower bound for Noisy Binary Search (Theorem 3), using techniques similar to those in our lower bound for Noisy Sorting.

We define the following intermediate problem:

Definition 30 (𝖮𝗋𝖺𝖼𝗅𝖾𝖭𝗈𝗂𝗌𝗒𝖡𝗂𝗇𝖺𝗋𝗒𝖲𝖾𝖺𝗋𝖼𝗁(n)\mathsf{OracleNoisyBinarySearch}(n)).

Given a sorted list LL of nn elements and an element xx that is unequal to any element in the list, a solver needs to output the predecessor of xx in LL by asking the following type of query OracleQuery(x,y)\textsc{OracleQuery}(x,y) for yLy\in L:

  • If yy is the predecessor of xx, output symbol PP;

  • If yy is the successor of xx, output symbol SS;

  • Otherwise, output NoisyComparison(x,y)\textsc{NoisyComparison}(x,y).
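For concreteness, a minimal Python sketch of the OracleQuery oracle is given below; the list representation, the 1-based indexing of the predecessor, and the crossover probability p are assumptions made for illustration.

import random

def make_oracle_query(L, x, p):
    # L is the sorted input list and x the element to search for; pred is the
    # (1-based) index of the predecessor of x, with pred = 0 if x is below all of L.
    pred = sum(1 for y in L if y < x)
    def oracle_query(y):
        j = L.index(y) + 1                 # 1-based index of y in L
        if j == pred:
            return "P"                     # y is the predecessor of x
        if j == pred + 1:
            return "S"                     # y is the successor of x
        truth = x < y                      # true result of NoisyComparison(x, y)
        return truth if random.random() >= p else not truth
    return oracle_query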

Lemma 31.

Fix an algorithm 𝒜\mathcal{A} for 𝖭𝗈𝗂𝗌𝗒𝖡𝗂𝗇𝖺𝗋𝗒𝖲𝖾𝖺𝗋𝖼𝗁(n)\mathsf{NoisyBinarySearch}(n) with error probability δ\delta. Let LL be the input list and xx be the input element of 𝖭𝗈𝗂𝗌𝗒𝖡𝗂𝗇𝖺𝗋𝗒𝖲𝖾𝖺𝗋𝖼𝗁(n)\mathsf{NoisyBinarySearch}(n). Let UU^{\not\approx} denote the number of noisy comparisons 𝒜\mathcal{A} makes between xx and elements that are not the predecessor or successor of xx.

Then there exists an algorithm for 𝖮𝗋𝖺𝖼𝗅𝖾𝖭𝗈𝗂𝗌𝗒𝖡𝗂𝗇𝖺𝗋𝗒𝖲𝖾𝖺𝗋𝖼𝗁(n)\mathsf{OracleNoisyBinarySearch}(n) with error probability at most δ\delta which makes at most 𝔼[U]+1\mathbb{E}[U^{\not\approx}]+1 queries in expectation.

Proof.

The algorithm for 𝖮𝗋𝖺𝖼𝗅𝖾𝖭𝗈𝗂𝗌𝗒𝖡𝗂𝗇𝖺𝗋𝗒𝖲𝖾𝖺𝗋𝖼𝗁(n)\mathsf{OracleNoisyBinarySearch}(n) simulates 𝒜\mathcal{A} as follows:

  • Every time 𝒜\mathcal{A} attempts to make a NoisyComparison(x,y)\textsc{NoisyComparison}(x,y) between some xx and some element yy in LL, the algorithm makes OracleQuery(x,y)\textsc{OracleQuery}(x,y) instead. If OracleQuery(x,y)\textsc{OracleQuery}(x,y) returns PP or SS, we have found the predecessor of xx, so we can return the predecessor and finish the execution; otherwise, we pass the result of OracleQuery(x,y)\textsc{OracleQuery}(x,y) (which will be NoisyComparison(x,y)\textsc{NoisyComparison}(x,y)) to 𝒜\mathcal{A}.

  • If 𝒜\mathcal{A} returns a predecessor of xx, then we also return the same predecessor, finishing the execution.

The error probability bound and number of queries easily follow. ∎

Lemma 32.

Any (deterministic or randomized) algorithm for 𝖮𝗋𝖺𝖼𝗅𝖾𝖭𝗈𝗂𝗌𝗒𝖡𝗂𝗇𝖺𝗋𝗒𝖲𝖾𝖺𝗋𝖼𝗁(n)\mathsf{OracleNoisyBinarySearch}(n) with error probability δ\leq\delta makes at least (1δo(1))lognI(p)(1-\delta-o(1))\frac{\log n}{I(p)} queries in expectation, even if the position of the element to search for is uniformly random.

Proof.

Fix an algorithm 𝒜\mathcal{A} for 𝖮𝗋𝖺𝖼𝗅𝖾𝖭𝗈𝗂𝗌𝗒𝖡𝗂𝗇𝖺𝗋𝗒𝖲𝖾𝖺𝗋𝖼𝗁(n)\mathsf{OracleNoisyBinarySearch}(n). WLOG, we can assume that 𝒜\mathcal{A} exits as soon as it sees a query that returns PP or SS, i.e., it sees at most one such query result. Also, we assume δ1Ω(1)\delta\leq 1-\Omega(1), as otherwise the lower bound is trivially 0.

Let X{0,,n}X\in\{0,\ldots,n\} be the index of the true predecessor of xx. Let mm be the (random) number of queries made. Let QmQ^{m} be the queries made. Let YmY^{m} be the returned values of the queries. Let X^\hat{X} be the most probable XX given all query results. Note that X^\hat{X} is a function of (Qm,Ym)(Q^{m},Y^{m}).

Clearly, H(X)=log(n+1)H(X)=\log(n+1). By Fano’s inequality, H(XX^)1+δlog(n)H(X\mid\hat{X})\leq 1+\delta\log(n). Thus,

I(X;Qm,Ym)I(X;X^)=H(X)H(XX^)(1δo(1))logn.\displaystyle I(X;Q^{m},Y^{m})\geq I(X;\hat{X})=H(X)-H(X\mid\hat{X})\geq(1-\delta-o(1))\log n.

On the other hand, we also have

I(X;Qm,Ym)i1[mi](H(Yimi)H(YiX,Qi,mi))I(X;Q^{m},Y^{m})\leq\sum_{i\geq 1}\mathbb{P}[m\geq i]\left(H(Y_{i}\mid m\geq i)-H(Y_{i}\mid X,Q_{i},m\geq i)\right)

using the same chain of inequalities as in the proof of Lemma 27.

Let qi=[Yi=Pmi]q_{i}=\mathbb{P}[Y_{i}=P\mid m\geq i] and ri=[Yi=Smi]r_{i}=\mathbb{P}[Y_{i}=S\mid m\geq i]. Because 𝒜\mathcal{A} exits as soon as it sees PP or SS we have

i1[mi](qi+ri)1.\displaystyle\sum_{i\geq 1}\mathbb{P}[m\geq i](q_{i}+r_{i})\leq 1.

Then

H(Yimi)h(qi,ri,1qiri2,1qiri2)h(qi+ri2,qi+ri2,1qiri2,1qiri2)\displaystyle H(Y_{i}\mid m\geq i)\leq h\left(q_{i},r_{i},\frac{1-q_{i}-r_{i}}{2},\frac{1-q_{i}-r_{i}}{2}\right)\leq h\left(\frac{q_{i}+r_{i}}{2},\frac{q_{i}+r_{i}}{2},\frac{1-q_{i}-r_{i}}{2},\frac{1-q_{i}-r_{i}}{2}\right)

and

H(YiX,Qi,mi)=0(qi+ri)+h(p)(1qiri).\displaystyle H(Y_{i}\mid X,Q_{i},m\geq i)=0\cdot(q_{i}+r_{i})+h(p)(1-q_{i}-r_{i}).

Note that H(Yimi)2H(Y_{i}\mid m\geq i)\leq 2 trivially, so I(X;Qm,Ym)2i1[mi]=2𝔼[m]I(X;Q^{m},Y^{m})\leq 2\sum_{i\geq 1}\mathbb{P}[m\geq i]=2\mathbb{E}[m]. Combining with I(X;Qm,Ym)(1δo(1))lognI(X;Q^{m},Y^{m})\geq(1-\delta-o(1))\log n, we have 𝔼[m]=Ω(logn)\mathbb{E}[m]=\Omega(\log n) (as we can assume δ1Ω(1)\delta\leq 1-\Omega(1)).

Note that g(x):=h(x2,x2,1x2,1x2)g(x):=h(\frac{x}{2},\frac{x}{2},\frac{1-x}{2},\frac{1-x}{2}) is concave in xx, so

i1[mi]H(Yimi)\displaystyle\sum_{i\geq 1}\mathbb{P}[m\geq i]H(Y_{i}\mid m\geq i) (i1[mi])g(i1[mi](qi+ri)/2i1[mi])\displaystyle\leq\left(\sum_{i\geq 1}\mathbb{P}[m\geq i]\right)g\left(\frac{\sum_{i\geq 1}\mathbb{P}[m\geq i](q_{i}+r_{i})/2}{\sum_{i\geq 1}\mathbb{P}[m\geq i]}\right) (Jensen’s inequality)
𝔼[m]g(min{12,1/2𝔼[m]})\displaystyle\leq\mathbb{E}[m]\cdot g\left(\min\left\{\frac{1}{2},\frac{1/2}{\mathbb{E}[m]}\right\}\right) (gg is increasing on [0,12][0,\frac{1}{2}] and is maximized at 12\frac{1}{2})
=(1+o(1))𝔼[m].\displaystyle=(1+o(1))\cdot\mathbb{E}[m]. (1/2𝔼[m]=o(1)\frac{1/2}{\mathbb{E}[m]}=o(1))

On the other hand,

i1[mi]H(YiX,Qi,mi)\displaystyle\sum_{i\geq 1}\mathbb{P}[m\geq i]H(Y_{i}\mid X,Q_{i},m\geq i) =i1[mi]h(p)(1qiri)\displaystyle=\sum_{i\geq 1}\mathbb{P}[m\geq i]h(p)(1-q_{i}-r_{i})
=h(p)(i1[mi]i1[mi](qi+ri))\displaystyle=h(p)\left(\sum_{i\geq 1}\mathbb{P}[m\geq i]-\sum_{i\geq 1}\mathbb{P}[m\geq i](q_{i}+r_{i})\right)
h(p)(𝔼[m]1)\displaystyle\geq h(p)(\mathbb{E}[m]-1)
=(1o(1))h(p)𝔼[m].\displaystyle=(1-o(1))h(p)\mathbb{E}[m].

Therefore,

I(X;Qm,Ym)\displaystyle I(X;Q^{m},Y^{m}) (1+o(1))𝔼[m](1o(1))h(p)𝔼[m]\displaystyle\leq(1+o(1))\mathbb{E}[m]-(1-o(1))h(p)\mathbb{E}[m]
(1+o(1))I(p)𝔼[m].\displaystyle\leq(1+o(1))I(p)\mathbb{E}[m].

Combining with I(X;Qm,Ym)(1δo(1))lognI(X;Q^{m},Y^{m})\geq(1-\delta-o(1))\log n, we get 𝔼[m](1δo(1))lognI(p)\mathbb{E}[m]\geq(1-\delta-o(1))\frac{\log n}{I(p)}. ∎

Lemma 33.

Fix any (deterministic or randomized) algorithm for 𝖭𝗈𝗂𝗌𝗒𝖡𝗂𝗇𝖺𝗋𝗒𝖲𝖾𝖺𝗋𝖼𝗁(n)\mathsf{NoisyBinarySearch}(n) with error probability δ1/logn\delta\leq 1/\log n. Let LL be the input list and xx be the input element of 𝖭𝗈𝗂𝗌𝗒𝖡𝗂𝗇𝖺𝗋𝗒𝖲𝖾𝖺𝗋𝖼𝗁(n)\mathsf{NoisyBinarySearch}(n). Let UU^{\approx} denote the number of noisy comparisons made between xx and its predecessor and successor. Then 𝔼[U](2o(1))log1δ(12p)log1pp\mathbb{E}[U^{\approx}]\geq(2-o(1))\frac{\log\frac{1}{\delta}}{(1-2p)\log\frac{1-p}{p}}, even if the position of the element to search for is uniformly random.

Proof.

Let the input list contain elements y1,,yny_{1},\ldots,y_{n} in this order. Let kk be the index of the predecessor of xx. For i∈{k,k+1}i\in\{k,k+1\}, define WiW_{i} to be the number of queries between xx and yiy_{i} that return the true comparison result, minus the number that return the flipped result (so queries returning x>ykx>y_{k} and queries returning x<yk+1x<y_{k+1} count as +1+1). Note that WiW_{i} is a random variable depending on the transcript. We will first show that 𝔼[Uk{0,n}](2o(1))log1δ(12p)log1pp\mathbb{E}[U^{\approx}\mid k\not\in\{0,n\}]\geq(2-o(1))\frac{\log\frac{1}{\delta}}{(1-2p)\log\frac{1-p}{p}}. Notice that the error probability of the algorithm conditioned on k{0,n}k\not\in\{0,n\} is at most δPr[k{0,n}]1.1δ\frac{\delta}{\Pr[k\not\in\{0,n\}]}\leq 1.1\delta for sufficiently large nn.

Define W=Wk+Wk+1W=W_{k}+W_{k+1}. Then 𝔼[W]=(12p)𝔼[U]\mathbb{E}[W]=(1-2p)\mathbb{E}[U^{\approx}]. Let us consider the posterior distribution qq of kk given a fixed transcript. That is,

qi:=[k=iQm,Ym].q_{i}:=\mathbb{P}[k=i\mid Q^{m},Y^{m}].

Then if k{0,n}k\not\in\{0,n\}, similarly to the proof of Lemma 29, we have

qk1=qk(1pp)Wk and qk+1=qk(1pp)Wk+1q_{k-1}=q_{k}\cdot\left(\frac{1-p}{p}\right)^{-W_{k}}\text{\quad and\quad}q_{k+1}=q_{k}\cdot\left(\frac{1-p}{p}\right)^{-W_{k+1}}

Thus, by Jensen’s inequality,

qk1+qk+1qk((1pp)Wk+(1pp)Wk+1)2qk(1pp)W/2.\displaystyle q_{k-1}+q_{k+1}\geq q_{k}\left(\left(\frac{1-p}{p}\right)^{-W_{k}}+\left(\frac{1-p}{p}\right)^{-W_{k+1}}\right)\geq 2q_{k}\left(\frac{1-p}{p}\right)^{-W/2}.

Suppose for the sake of contradiction that

𝔼[Uk{0,n}](2σ)log1δ(12p)log1pp\displaystyle\mathbb{E}[U^{\approx}\mid k\not\in\{0,n\}]\leq(2-\sigma)\frac{\log\frac{1}{\delta}}{(1-2p)\log\frac{1-p}{p}}

for some constant σ>0\sigma>0. Then by Lemma 28 (taking UU^{\approx} as τ\tau, WW as XτX_{\tau}, mm as (2σ/4)log1δlog1pp\frac{(2-\sigma/4)\log\frac{1}{\delta}}{\log\frac{1-p}{p}}, δ\delta as 12σ2σ/4σ/41-\frac{2-\sigma}{2-\sigma/4}\geq\sigma/4), with probability σ/8o1/δ(1)\geq\sigma/8-o_{1/\delta}(1), we have

W(2σ/4)log1δlog1pp,\displaystyle W\leq\frac{(2-\sigma/4)\log\frac{1}{\delta}}{\log\frac{1-p}{p}},

and then

2(1pp)W/22δ1σ/8.\displaystyle 2\left(\frac{1-p}{p}\right)^{-W/2}\geq 2\delta^{1-\sigma/8}.

Because the error probability is 1.1δ\leq 1.1\delta, with probability at least 11.1δ10δ/σ1σ91-\frac{1.1\delta}{10\delta/\sigma}\geq 1-\frac{\sigma}{9}, we have 1qk10δ/σ1-q_{k}\leq 10\delta/\sigma, which leads to qk1+qk+110δ/σq_{k-1}+q_{k+1}\leq 10\delta/\sigma. Therefore,

2(1pp)W/2qk1+qk+1qk10δ/σ110δ/σ<20δ/σ.2\left(\frac{1-p}{p}\right)^{-W/2}\leq\frac{q_{k-1}+q_{k+1}}{q_{k}}\leq\frac{10\delta/\sigma}{1-10\delta/\sigma}<20\delta/\sigma.

(In the last step we use δ1/logn\delta\leq 1/\log n and thus 110δ/σ>121-10\delta/\sigma>\frac{1}{2} for sufficiently large nn.) This is a contradiction because 2δ1σ/820δ/σ2\delta^{1-\sigma/8}\geq 20\delta/\sigma and σ8+(1σ9)o1/δ(1)>1\frac{\sigma}{8}+\left(1-\frac{\sigma}{9}\right)-o_{1/\delta}(1)>1 when δ1/logn\delta\leq 1/\log n and nn is large enough.

Thus,

𝔼[Uk{0,n}](2o(1))log1δ(12p)log1pp.\displaystyle\mathbb{E}[U^{\approx}\mid k\not\in\{0,n\}]\geq(2-o(1))\frac{\log\frac{1}{\delta}}{(1-2p)\log\frac{1-p}{p}}.

Consequently,

𝔼[U]𝔼[Uk{0,n}]Pr[k{0,n}](2o(1))log1δ(12p)log1pp.\displaystyle\mathbb{E}[U^{\approx}]\geq\mathbb{E}[U^{\approx}\mid k\not\in\{0,n\}]\cdot\Pr[k\not\in\{0,n\}]\geq(2-o(1))\frac{\log\frac{1}{\delta}}{(1-2p)\log\frac{1-p}{p}}. ∎

Now we are ready to prove Theorem 3.

Proof of Theorem 3.

If δ1/logn\delta\geq 1/\log n, then by Lemmas 31 and 32, the number of queries UU made by any algorithm satisfies

𝔼[U]\displaystyle\mathbb{E}[U] 𝔼[U]\displaystyle\geq\mathbb{E}[U^{\not\approx}]
(1δo(1))lognI(p)\displaystyle\geq(1-\delta-o(1))\frac{\log n}{I(p)}
(1o(1))((1δ)lognI(p)+2log1δ(12p)log1pp).\displaystyle\geq(1-o(1))\left((1-\delta)\frac{\log n}{I(p)}+\frac{2\log\frac{1}{\delta}}{(1-2p)\log\frac{1-p}{p}}\right).

If δ1/logn\delta\leq 1/\log n, then by combining Lemmas 31, 32, and 33, for any algorithm, the number of queries UU satisfies

𝔼[U]\displaystyle\mathbb{E}[U] 𝔼[U]+𝔼[U]\displaystyle\geq\mathbb{E}[U^{\approx}]+\mathbb{E}[U^{\not\approx}]
(1δo(1))lognI(p)+(2o(1))log1δ(12p)log1pp\displaystyle\geq(1-\delta-o(1))\frac{\log n}{I(p)}+(2-o(1))\frac{\log\frac{1}{\delta}}{(1-2p)\log\frac{1-p}{p}}
(1o(1))((1δ)lognI(p)+2log1δ(12p)log1pp).\displaystyle\geq(1-o(1))\left((1-\delta)\frac{\log n}{I(p)}+\frac{2\log\frac{1}{\delta}}{(1-2p)\log\frac{1-p}{p}}\right). ∎

5.2 Upper Bound

In this section, we prove our upper bound for Noisy Binary Search (Theorem 4).

Proof.

First, if δ>1logn\delta>\frac{1}{\log n}, we output an arbitrary answer and halt with probability δ1logn\delta-\frac{1}{\log n}; with the remaining probability, we run the algorithm of Theorem 10 with error probability 1logn\frac{1}{\log n}. By the union bound, the overall error probability is at most δ\delta. The expected number of noisy comparisons is

(1δ+1logn)(1+o(1))(logn+O(loglogn)I(p))=(1+o(1))(1δ)lognI(p)\left(1-\delta+\frac{1}{\log n}\right)\cdot(1+o(1))\left(\frac{\log n+O(\log\log n)}{I(p)}\right)=(1+o(1))(1-\delta)\frac{\log n}{I(p)}

as required. In the following, we assume δ1logn\delta\leq\frac{1}{\log n}. We also assume logn4\log n\geq 4 for simplicity, as we could pad dummy elements.

Let xx be the element for which we need to find the predecessor. First, we use Theorem 10 with error probability 1logn\frac{1}{\log n} to find a candidate predecessor, which takes (1+o(1))(logn+O(loglogn)I(p))=(1+o(1))lognI(p)(1+o(1))\left(\frac{\log n+O(\log\log n)}{I(p)}\right)=(1+o(1))\cdot\frac{\log n}{I(p)} noisy comparisons in expectation. Then we use Lemma 13 to compare xx with this candidate predecessor ll and with the next element rr after this candidate predecessor, with error probability δ/4\delta/4. This takes (2+o4/δ(1))log(4/δ)(12p)log((1p)/p)=(2+o(1))log(1/δ)(12p)log((1p)/p)(2+o_{4/\delta}(1))\cdot\frac{\log(4/\delta)}{(1-2p)\log((1-p)/p)}=(2+o(1))\cdot\frac{\log(1/\delta)}{(1-2p)\log((1-p)/p)} noisy comparisons in expectation. If the comparison results are l<xl<x and x<rx<r, we return ll as the predecessor of xx; otherwise, we restart. See Algorithm 4.

Algorithm 4
1:procedure NoisyBinarySearch(n,A,x,δn,A,x,\delta)
2:     while true do
3:         Use Theorem 10 with error probability 1logn\frac{1}{\log n} to search the predecessor ll of xx in A{}A\cup\{-\infty\}.
4:         rr\leftarrow next element of ll in A{}A\cup\{\infty\}.
5:         if LessThan(l,x,δ/4l,x,\delta/4) and LessThan(x,r,δ/4x,r,\delta/4) then return ll.
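A Python rendering of Algorithm 4 is sketched below. The helpers noisy_search (the search procedure of Theorem 10) and less_than (the repeated-comparison test LessThan of Lemma 13) are assumed interfaces named only for illustration; sentinels play the role of −∞ and ∞.

import math

def noisy_binary_search(A, x, delta, noisy_search, less_than):
    # noisy_search(padded, x, err) is assumed to return the index of a candidate
    # predecessor of x in `padded` with error probability err; less_than(a, b, err)
    # is assumed to decide a < b with error probability err.
    err = 1.0 / math.log2(len(A))
    padded = [-math.inf] + list(A) + [math.inf]    # A with -infinity and +infinity appended
    while True:
        i = noisy_search(padded, x, err)           # candidate predecessor l = padded[i]
        l, r = padded[i], padded[i + 1]            # r = next element after l
        if less_than(l, x, delta / 4) and less_than(x, r, delta / 4):
            return l                               # verified l < x < r: l is the predecessor
        # otherwise restart the loop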

Let [restart]\mathbb{P}[\text{restart}] be the probability that we restart a new iteration of the while loop. Clearly, if the call to Theorem 10 and both calls to LessThan are correct, then we will not restart. Thus, [restart]1logn+δ22logn\mathbb{P}[\text{restart}]\leq\frac{1}{\log n}+\frac{\delta}{2}\leq\frac{2}{\log n}. Note that as logn4\log n\geq 4, [restart]12\mathbb{P}[\text{restart}]\leq\frac{1}{2}.

Let PeP_{e} be the probability that this algorithm returns the wrong predecessor. In each iteration of the while loop, the algorithm returns the wrong answer only if at least one of the two calls to LessThan is incorrect. Therefore, Peδ2+[restart]PeP_{e}\leq\frac{\delta}{2}+\mathbb{P}[\text{restart}]P_{e}, so Peδ/21[restart]δP_{e}\leq\frac{\delta/2}{1-\mathbb{P}[\text{restart}]}\leq\delta.

Let 𝔼[Q]\mathbb{E}[Q] be the expected number of queries that this algorithm makes. Since the call to Theorem 10 uses (1+o(1))(logn+O(loglogn)I(p))(1+o(1))lognI(p)(1+o(1))\left(\frac{\log n+O(\log\log n)}{I(p)}\right)\leq(1+o(1))\frac{\log n}{I(p)} noisy comparisons in expectation, and each call to LessThan uses (1+o1/δ(1))log(4/δ)(12p)log1pp(1+o(1))log(1/δ)(12p)log1pp(1+o_{1/\delta}(1))\frac{\log(4/\delta)}{(1-2p)\log\frac{1-p}{p}}\leq(1+o(1))\frac{\log(1/\delta)}{(1-2p)\log\frac{1-p}{p}} noisy comparisons in expectation (as δ1logn\delta\leq\frac{1}{\log n}), we get that

𝔼[Q](1+o(1))(lognI(p)+2log(1/δ)(12p)log1pp)+𝔼[Q][restart].\mathbb{E}[Q]\leq(1+o(1))\left(\frac{\log n}{I(p)}+\frac{2\log(1/\delta)}{(1-2p)\log\frac{1-p}{p}}\right)+\mathbb{E}[Q]\cdot\mathbb{P}[\text{restart}].

Therefore,

𝔼[Q]\displaystyle\mathbb{E}[Q] (1+o(1))112logn(lognI(p)+2log(1/δ)(12p)log1pp)\displaystyle\leq(1+o(1))\cdot\frac{1}{1-\frac{2}{\log n}}\cdot\left(\frac{\log n}{I(p)}+\frac{2\log(1/\delta)}{(1-2p)\log\frac{1-p}{p}}\right)
(1+o(1))(lognI(p)+2log(1/δ)(12p)log1pp),\displaystyle\leq(1+o(1))\left(\frac{\log n}{I(p)}+\frac{2\log(1/\delta)}{(1-2p)\log\frac{1-p}{p}}\right),

as desired. ∎

6 Discussions

In this section we discuss a few possible extensions of our results.

Varying pp.

In our main results for Noisy Sorting, we have assumed that pp remains constant as nn\to\infty. This assumption is needed in several places in our proofs, as discussed below, and we leave it as an open problem to drop it.

For p12p\to\frac{1}{2} as nn\to\infty, Theorem 1 still holds. For the proof, one needs to suitably modify Corollary 12, Lemma 17, Lemma 18, Corollary 19, Lemma 20 to handle p12p\to\frac{1}{2} correctly. For example, in our current statement of Corollary 19 we ignore constants related to pp in the expressions for Var[τ]\operatorname{\mathrm{Var}}[\tau]. By choosing mm in Lemma 18 carefully, we can let Var[τ]=(𝔼[τ])2polylog(n)\operatorname{\mathrm{Var}}[\tau]=(\mathbb{E}[\tau])^{2}\operatorname{polylog}(n) even for growing pp, which suffices for the concentration results. For the lower bound (Theorem 2), Lemma 27 may become problematic. It seems possible to modify the proof of Lemma 27 to handle the case I(p)=no(1)I(p)=n^{-o(1)}. However, for I(p)=nΩ(1)I(p)=n^{-\Omega(1)} a stronger method is needed.

For p0p\to 0 as nn\to\infty, the lower bounds hold without any change. For the algorithms, our derandomization method would result in too many queries when p=O(1/logn)p=O(1/\log n).

We similarly leave the task of dropping the assumption that pp remains a constant in our Noisy Binary Search results as an open problem.

Non-uniform pp and semi-random adversary.

In our main results we have assumed that every noisy comparison is flipped with the same probability pp. One possible extension is that in each noisy comparison, the result is flipped with some probability at most pp, where this probability can be adversarially chosen for each query. (The algorithm knows pp but does not know the flip probability for individual comparisons.) This model is sometimes called the semi-random adversary model in the literature (e.g., [MWR18]). It is an interesting open problem to generalize our results to this model.

Dependency on δ\delta.

In our bounds for Noisy Sorting, we simply used o(1)o(1) as our error probability. We leave it as an open problem to generalize our bounds to the case where the error probability δ\delta can also be specified.

BMS observation.

One natural extension is to replace BSC\operatorname{\mathrm{BSC}} observation noise with a Binary Memoryless Symmetric channel (BMS). That is, for every noisy comparison, the probability of error is chosen (independently) from a distribution WW on [0,12][0,\frac{1}{2}], and this probability is returned together with the comparison result. In this case our guess is that

(1±o(1))(1𝔼pW[I(p)]+1𝔼pW[(12p)log1pp])nlogn\displaystyle(1\pm o(1))\left(\frac{1}{\mathbb{E}_{p\sim W}[I(p)]}+\frac{1}{\mathbb{E}_{p\sim W}[(1-2p)\log\frac{1-p}{p}]}\right)n\log n

noisy comparisons should be necessary and sufficient. When WW is supported on [ϵ,1/2ϵ][\epsilon,1/2-\epsilon] for some constant ϵ>0\epsilon>0 independent of nn, we expect it to be easier to adapt our proofs to BMS, whereas in the general case, issues described in the Varying pp discussion above can occur.

References

  • [AD91] Javed A. Aslam and Aditi Dhagat. Searching in the presence of linearly bounded errors. In Proceedings of the 23rd Annual ACM Symposium on Theory of Computing (STOC), pages 486–493, 1991.
  • [BH08] Michael Ben-Or and Avinatan Hassidim. The bayesian learner is optimal for noisy binary search (and pretty good for quantum as well). In Proceedings of the 2008 49th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 221–230, 2008.
  • [BK93] Ryan S. Borgstrom and S. Rao Kosaraju. Comparison-based search in the presence of errors. In Proceedings of the 25th Annual ACM Symposium on Theory of Computing (STOC), pages 130–136, 1993.
  • [BM86] Donald A. Berry and Roy F. Mensch. Discrete search with directional information. Oper. Res., 34(3):470–477, 1986.
  • [BM08] Mark Braverman and Elchanan Mossel. Noisy sorting without resampling. In Proceedings of the 19th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 268–276, 2008.
  • [BMW16] Mark Braverman, Jieming Mao, and S. Matthew Weinberg. Parallel algorithms for select and partition with noisy comparisons. In Proceedings of the 48th Annual ACM Symposium on Theory of Computing (STOC), pages 851–862, 2016.
  • [BZ74] M. V. Burnashev and K. Sh. Zigangirov. An interval estimation problem for controlled observations. Probl. Peredachi Inf., 10(3):51–61, 1974.
  • [CLRS22] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to algorithms. MIT press, 2022.
  • [CW77] J. Lawrence Carter and Mark N. Wegman. Universal classes of hash functions. In Proceedings of the 9th Annual ACM Symposium on Theory of Computing (STOC), pages 106–112, 1977.
  • [DGW92] Aditi Dhagat, Peter Gács, and Peter Winkler. On playing “twenty questions” with a liar. In Proceedings of the 3rd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 16–22, 1992.
  • [DŁU21] Dariusz Dereniowski, Aleksander Łukasiewicz, and Przemysław Uznański. Noisy searching: simple, fast and correct. arXiv preprint arXiv:2107.05753, 2021.
  • [FJ59] Lester R. Ford Jr. and Selmer M. Johnson. A tournament problem. Am. Math. Mon., 66(5):387–389, 1959.
  • [FRPU94] Uriel Feige, Prabhakar Raghavan, David Peleg, and Eli Upfal. Computing with noisy information. SIAM J. Comput., 23(5):1001–1018, 1994.
  • [Gal78] Shmuel Gal. A stochastic search game. SIAM. J. Appl. Math., 34(1):205–210, 1978.
  • [GLLP17] Barbara Geissmann, Stefano Leucci, Chih-Hung Liu, and Paolo Penna. Sorting with recurrent comparison errors. In Proceedings of the 28th International Symposium on Algorithms and Computation (ISAAC), pages 38:1–38:12, 2017.
  • [GLLP19] Barbara Geissmann, Stefano Leucci, Chih-Hung Liu, and Paolo Penna. Optimal sorting with persistent comparison errors. In Proceedings of the 27th Annual European Symposium on Algorithms (ESA), pages 1–14, 2019.
  • [GLLP20] Barbara Geissmann, Stefano Leucci, Chih-Hung Liu, and Paolo Penna. Optimal dislocation with persistent errors in subquadratic time. Theory Comput. Syst., 64(3):508–521, 2020.
  • [Hor63] Michael Horstein. Sequential transmission using noiseless feedback. IEEE Trans. Inf. Theory, 9(3):136–143, 1963.
  • [Knu98] Donald E. Knuth. The art of computer programming, volume 3. Addison-Wesley Professional, 1998.
  • [KPSW11] Rolf Klein, Rainer Penninger, Christian Sohler, and David P. Woodruff. Tolerant algorithms. In Proceedings of the 19th Annual European Symposium on Algorithms (ESA), pages 736–747, 2011.
  • [KY76] Donald E. Knuth and Andrew C. Yao. The complexity of nonuniform random number generation. In Algorithms and Complexity: New Directions and Recent Results, 1976.
  • [Mut94] S. Muthukrishnan. On optimal strategies for searching in presence of errors. In Proceedings of the 5th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 680–689, 1994.
  • [MWR18] Cheng Mao, Jonathan Weed, and Philippe Rigollet. Minimax rates and efficient algorithms for noisy sorting. In Algorithmic Learning Theory, pages 821–847. PMLR, 2018.
  • [Pel89] Andrzej Pelc. Searching with known error probability. Theor. Comput. Sci., 63(2):185–202, 1989.
  • [PW22] Yury Polyanskiy and Yihong Wu. Information Theory: From Coding to Learning. Cambridge University Press, 2022. Draft version available at https://people.lids.mit.edu/yp/homepage/data/itbook-export.pdf.
  • [RMK+80] Ronald L. Rivest, Albert R. Meyer, Daniel J. Kleitman, Karl Winklmann, and Joel Spencer. Coping with errors in binary search procedures. J. Comput. Syst. Sci., 20(3):396–404, 1980.
  • [WGW22] Ziao Wang, Nadim Ghaddar, and Lele Wang. Noisy sorting capacity. In Proceedings of the 2022 IEEE International Symposium on Information Theory (ISIT), pages 2541–2546, 2022.
  • [WGZW23] Ziao Wang, Nadim Ghaddar, Banghua Zhu, and Lele Wang. Noisy sorting capacity. arXiv preprint arXiv:2202.01446v2, 2023.