Optimal Bounds for Noisy Sorting
Abstract
Sorting is a fundamental problem in computer science. In the classical setting, it is well-known that $\Theta(n \log n)$ comparisons are both necessary and sufficient to sort a list of $n$ elements. In this paper, we study the Noisy Sorting problem, where each comparison result is flipped independently with probability $p$ for some fixed $p \in (0, 1/2)$. As our main result, we show that
$(1 \pm o(1)) \left( \frac{n \log n}{I(p)} + \frac{n \log n}{(1-2p)\log\frac{1-p}{p}} \right)$
noisy comparisons are both necessary and sufficient to sort $n$ elements with error probability $o(1)$, where $I(p) = 1 - H(p)$ is the capacity of the BSC channel with crossover probability $p$. This simultaneously improves the previous best lower and upper bounds (Wang, Ghaddar and Wang, ISIT 2022) for this problem.
For the related Noisy Binary Search problem, we show that
$(1 \pm o(1)) \left( \frac{\log n}{I(p)} + \frac{2 \log \delta^{-1}}{(1-2p)\log\frac{1-p}{p}} \right)$
noisy comparisons are both necessary and sufficient to find the predecessor of an element among $n$ sorted elements with error probability $\delta$. This extends the previous bounds of (Burnashev and Zigangirov, 1974), which are only tight when $\log \delta^{-1} = o(\log n)$.
1 Introduction
Sorting is one of the most fundamental problems in computer science. Among the wide variety of sorting algorithms, an important class is comparison-based sorting. It is well-known that sorting a list of $n$ numbers requires $\log(n!) \approx n \log n$ comparisons (see e.g., [CLRS22]), even if the input list is a random permutation. (All logarithms in the paper are of base 2.) A good number of algorithms require only $n \log n + O(n)$ comparisons, such as binary insertion sort, merge sort (see e.g., [Knu98]), and merge insertion sort [FJ59].
Sorting is used in many real-world applications with large data sets. Thus, it is important to consider faults that arise unavoidably in large systems. For comparison-based sorting, this means that the result of a comparison could be incorrect. We consider one well-studied model for simulating such faults, studied in, e.g., [Hor63, Gal78, Pel89, BM86, FRPU94, BMW16]. In this model, the result of each query (or noisy comparison in the context of sorting) is flipped independently with probability $p$ for some fixed $p \in (0, 1/2)$, and repeated queries are allowed. Let us call it the noisy model. In this model, we define Noisy Sorting as the task of sorting $n$ elements using noisy comparisons. As this model inherently has errors and no algorithm can always produce correct outputs, we consider algorithms with $o(1)$ error probability.
If we repeat each noisy comparison $\Theta(\log n)$ times, the majority of the returned results is the correct comparison result with high probability. Therefore, we can modify any classical comparison-based sorting algorithm with $O(n \log n)$ comparisons, by replacing each comparison with $\Theta(\log n)$ noisy comparisons, to get an $O(n \log^2 n)$-comparison algorithm for Noisy Sorting that succeeds with probability $1 - o(1)$.
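To make this repetition-based reduction concrete, here is a minimal Python sketch; the function names and the constant in the repetition count are illustrative and not taken from any algorithm discussed in this paper.

```python
import random
from math import ceil, log2

def noisy_compare(x, y, p):
    """Return the truth of x < y, flipped independently with probability p."""
    truth = x < y
    return truth if random.random() >= p else not truth

def reliable_compare(x, y, p, n, c=10):
    """Repeat the noisy comparison about c*log2(n) times and take a majority vote.

    By a Chernoff bound the majority is wrong with probability 1/poly(n), so a
    union bound over the O(n log n) comparisons of a classical sorting algorithm
    leaves o(1) overall error probability, at a total cost of O(n log^2 n) queries.
    """
    reps = max(1, ceil(c * log2(max(n, 2))))
    votes = sum(noisy_compare(x, y, p) for _ in range(reps))
    return 2 * votes > reps
```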
In fact, better algorithms were known. Feige, Raghavan, Peleg, and Upfal [FRPU94] gave an algorithm for Noisy Sorting with only $O(n \log n)$ noisy comparisons. Recently, Wang, Ghaddar, and Wang [WGW22] analyzed the constant factor hidden in [FRPU94]'s algorithm (the actual constant is quite complicated, see [WGW22] for details), and gave an improved upper bound on the number of noisy comparisons. They also showed that $(1 - o(1)) \frac{n \log n}{I(p)}$ comparisons are necessary for any algorithm with $o(1)$ error probability, where $I(p) = 1 - H(p)$ is the capacity of the BSC channel with crossover probability $p$ (which is also the amount of information each noisy comparison gives) and $H$ is the binary entropy function.
Despite this progress, there is still a big gap between the upper and lower bounds. For instance, for a representative constant value of $p$, the constant in front of $n \log n$ in the upper bound is more than three times as large as the constant in the lower bound. It is a fascinating question to narrow down this gap, or even close it as in the case of classical sorting.
As the main result of our paper, we show that it is indeed possible to close this gap by giving both an improved upper bound and an improved lower bound:
Theorem 1 (Noisy Sorting Upper Bound).
There exists a deterministic algorithm for Noisy Sorting with error probability $o(1)$ which always uses at most
$(1 + o(1)) \left( \frac{n \log n}{I(p)} + \frac{n \log n}{(1-2p)\log\frac{1-p}{p}} \right)$
noisy comparisons.
Theorem 2 (Noisy Sorting Lower Bound).
Any (deterministic or randomized) algorithm for Noisy Sorting with error probability $o(1)$ must make at least
$(1 - o(1)) \left( \frac{n \log n}{I(p)} + \frac{n \log n}{(1-2p)\log\frac{1-p}{p}} \right)$
noisy comparisons in expectation, even if the input is a uniformly random permutation.
To compare with previous works more quantitatively, for every fixed $p$ our matching constant lies between the constants of the previous lower and upper bounds. When $p$ approaches $0$, i.e., when the noisy comparisons are almost errorless, our bounds approach $n \log n$, matching the bounds of classical sorting.
Noisy Binary Search.
We also study the closely related Noisy Binary Search problem. In Noisy Binary Search, we are given a sorted list of $n$ elements and an element $x$, and the goal is to find the predecessor of $x$ in the sorted list. Noisy Binary Search is well-studied [Hor63, BZ74, FRPU94, BH08, Pel89, DŁU21] in the noisy model we consider, as well as in some other models [AD91, BK93, DGW92, Mut94, RMK+80].
Both previous algorithms [FRPU94, WGW22] for Noisy Sorting use binary insertion sort at the high level. More specifically, the algorithms keep a sorted array of the first $i$ elements for $i$ from $1$ to $n$. When the $i$-th element needs to be inserted for $i \geq 2$, the algorithms use some implementation of Noisy Binary Search to find the correct position to insert it, with error probability $o(1/n)$ (it also suffices to have average error probability $o(1/n)$). A simple union bound then implies that the error probability of the sorting algorithm is $o(1)$. The difference between the two algorithms is that [FRPU94] uses an algorithm for Noisy Binary Search based on a random walk, while [WGW22] builds on Burnashev and Zigangirov's algorithm [BZ74].
In fact, a variable-length version of Burnashev and Zigangirov's algorithm [BZ74] achieves optimal query complexity for the Noisy Binary Search problem in certain parameter regimes. (The original paper is in Russian; see [WGZW23, Theorem 5] for an English version.) They proved an upper bound on the expected number of noisy comparisons needed to solve Noisy Binary Search with error probability $\delta$ (we remark that their algorithm is randomized and assumes a uniform prior). An information-theoretic lower bound (e.g., [BH08, Theorem B.1]) shows that roughly $\frac{\log n}{I(p)}$ noisy comparisons are necessary. These bounds essentially match when $\log \delta^{-1} = o(\log n)$. However, for the application to Noisy Sorting, we must have error probability $o(1/n)$ on average. In this case, the lower and upper bounds do not match.
Dereniowski, Łukasiewicz, and Uznański [DŁU21] designed alternative algorithms for Noisy Binary Search in various settings. They also considered a more general problem called Graph Search.
We remark that Ben-Or and Hassidim [BH08], likely unaware of [BZ74], claim an algorithm for solving Noisy Binary Search with error probability $\delta$ whose expected number of noisy comparisons essentially matches the information-theoretic lower bound. However, as pointed out by [DŁU21], there are potential issues in [BH08]'s proof.
By using a technique similar to our lower bound for Noisy Sorting, we are also able to improve the lower bound for Noisy Binary Search for small error probability $\delta$.
Theorem 3 (Noisy Binary Search Lower Bound).
Any (deterministic or randomized) algorithm for Noisy Binary Search with error probability $\delta$ must make at least
$(1 - o(1)) \left( \frac{\log n}{I(p)} + \frac{2 \log \delta^{-1}}{(1-2p)\log\frac{1-p}{p}} \right)$
noisy comparisons in expectation, even if the position of the element to search for is uniformly random.
This lower bound indicates that any algorithm for Noisy Sorting purely based on binary insertion sort, which requires $n$ executions of Noisy Binary Search with average error probability $o(1/n)$, needs at least roughly $\frac{n \log n}{I(p)} + \frac{2 n \log n}{(1-2p)\log\frac{1-p}{p}}$ noisy comparisons in expectation to achieve $o(1)$ error probability. In light of Theorem 1, we see that any algorithm for Noisy Sorting purely based on binary insertion sort cannot be optimal.
We also show an algorithm for Noisy Binary Search that matches the lower bound in Theorem 3:
Theorem 4 (Noisy Binary Search Upper Bound).
There exists a randomized algorithm for Noisy Binary Search with error probability $\delta$ that makes at most
$(1 + o(1)) \left( \frac{\log n}{I(p)} + \frac{2 \log \delta^{-1}}{(1-2p)\log\frac{1-p}{p}} \right)$
noisy comparisons in expectation.
Concurrent Work.
Concurrent work [WGZW23] also improves over [WGW22] in many aspects. Our results are stronger than their main results. Our algorithm in Theorem 1 uses fewer noisy comparisons than their algorithms in expectation [WGZW23, Theorem 1 and 2]. Our Theorem 2 is strictly stronger than [WGZW23, Theorem 3]. Our lower bound for insertion-based sorting algorithms (see discussion after Theorem 3) is strictly stronger than [WGZW23, Theorem 4].
Other Related Works.
People have also considered Noisy Sorting in a slightly different noisy model, where every pair of elements is allowed to be compared at most once [BM08, KPSW11, GLLP17, GLLP20, GLLP19]. Sorting is harder in this model than in the model we consider, as it is information-theoretically impossible to correctly sort all the elements with probability $1 - o(1)$ when $p$ is a constant [GLLP17]. For constant $p$, Geissmann, Leucci, Liu, and Penna [GLLP19] achieved an $O(n \log n)$-time algorithm that guarantees $O(\log n)$ maximum dislocation and $O(n)$ total dislocation with high probability, matching the lower bound given in [GLLP17].
Acknowledgements.
We thank Przemysław Uznański for telling us about the issue of [BH08].
1.1 Technical Overview
Before we give the overview of our techniques, let us first give some intuition about the constant, $\frac{1}{I(p)} + \frac{1}{(1-2p)\log\frac{1-p}{p}}$, in our bound. $I(p)$ is essentially the amount of information in bits each noisy query can give, and the ordering of $n$ elements requires $\log(n!) \approx n \log n$ bits to represent. Therefore, $\frac{n \log n}{I(p)}$ queries are intuitively necessary. On the other hand, $\frac{\log n}{(1-2p)\log\frac{1-p}{p}}$ is roughly the number of noisy comparisons required to compare two elements with error probability $\frac{1}{n}$. Therefore, even for the simpler task of checking whether a list of $n$ elements is sorted, roughly $\frac{n \log n}{(1-2p)\log\frac{1-p}{p}}$ queries seem necessary to compare all the adjacent elements with overall error probability $o(1)$. Hence, the constants $\frac{1}{I(p)}$ and $\frac{1}{(1-2p)\log\frac{1-p}{p}}$ are natural.
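As a quick numerical illustration of these two quantities, the following Python snippet (ours, purely illustrative) evaluates $I(p) = 1 - H(p)$, the rate $(1-2p)\log\frac{1-p}{p}$ governing reliable pairwise comparison, and the resulting coefficient of $n \log n$ for a few values of $p$.

```python
from math import log2

def binary_entropy(p):
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(p):
    """I(p) = 1 - H(p): bits of information per noisy comparison."""
    return 1 - binary_entropy(p)

def pairwise_rate(p):
    """(1 - 2p) * log2((1 - p) / p): governs the cost of one reliable comparison."""
    return (1 - 2 * p) * log2((1 - p) / p)

for p in (0.05, 0.1, 0.25, 0.4):
    coeff = 1 / bsc_capacity(p) + 1 / pairwise_rate(p)
    print(f"p={p:.2f}  I(p)={bsc_capacity(p):.3f}  "
          f"rate={pairwise_rate(p):.3f}  coefficient of n*log(n) ~ {coeff:.2f}")
```

As $p$ approaches $0$ the coefficient tends to $1$, the classical sorting bound, and it blows up as $p$ approaches $1/2$.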
However, the above discussion only suggests that each of the two terms individually is a lower bound, and a priori it is unclear how to strengthen this to their sum. To resolve this issue, we design an easier version of Noisy Sorting as an intermediate problem. In this version, suppose the elements are split into continuous groups in the sorted order (and the groups are unknown to the algorithm). Any time the algorithm tries to compare two elements from the same group, the query result tells the algorithm that these two elements are in the same group, without noise; otherwise, the query result flips the result of the comparison with probability $p$, as usual. The algorithm is also not required to sort elements inside the same group correctly. We can show that in this version, each query still only gives roughly $I(p)$ bits of information, and the total number of bits the output space represents is still roughly $n \log n$. Therefore, this simpler problem still requires approximately $\frac{n \log n}{I(p)}$ queries. By designing a reduction from this problem to Noisy Sorting, we can show that any algorithm for Noisy Sorting requires roughly $\frac{n \log n}{I(p)}$ queries comparing elements not from the same group. At the same time, inside each group, the algorithm needs to perform comparisons with small error probability, even just for checking if all the groups are sorted. This part requires roughly $\frac{n \log n}{(1-2p)\log\frac{1-p}{p}}$ queries.
Our algorithm that matches this lower bound can be viewed as a remote variant of quicksort, with a large number of pivots and one level of recursion. First, we sample a random subset of the elements, which can be viewed as pivots. Sorting the pivots only contributes lower-order terms to the query complexity. These pivots separate the list into multiple sublists. Then, we use Noisy Binary Search to find the correct sublist for each of the remaining elements, with small error probability. This step contributes the $\frac{n \log n}{I(p)}$ part of the query complexity. Then we design an algorithm that correctly sorts $m$ elements with probability $1 - \delta$ using roughly $\frac{m \log \delta^{-1}}{(1-2p)\log\frac{1-p}{p}}$ queries plus lower-order terms. This algorithm is then used to sort all the sublists. Summing over all the sublists, and taking $\delta$ polynomially small in $n$, this step contributes the $\frac{n \log n}{(1-2p)\log\frac{1-p}{p}}$ part of the query complexity, as each sublist is small in expectation, and the total error probability remains $o(1)$. Finally, we have to correct the elements that were sent to the wrong sublists due to the errors of the Noisy Binary Search step, but we only expect to see a small number of them, so the number of queries for handling them ends up being a lower-order term.
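The following Python sketch mirrors this high-level structure (sample pivots, sort them, route elements to buckets by noisy binary search, sort each bucket); the subroutines are passed in as parameters, and the pivot-set size and all other constants here are placeholders chosen for readability rather than the choices made in the analysis.

```python
import bisect
import random

def noisy_quicksort_skeleton(items, sort_pivots, find_bucket, sort_bucket):
    """One-level, many-pivot quicksort skeleton for noisy sorting.

    sort_pivots(pivots)     -> sorted pivots (cost is lower order)
    find_bucket(x, pivots)  -> index of x's bucket among the sorted pivots
                               (the noisy binary search step)
    sort_bucket(bucket)     -> sorted bucket (small error probability)
    """
    if len(items) <= 1:
        return list(items)
    m = max(1, len(items) // 16)              # placeholder pivot-set size
    pivots = sort_pivots(random.sample(items, m))
    buckets = [[] for _ in range(m + 1)]      # bucket i holds elements between
    for x in items:                           # pivots i-1 and i
        if x not in pivots:
            buckets[find_bucket(x, pivots)].append(x)
    out = []
    for i, bucket in enumerate(buckets):
        out.extend(sort_bucket(bucket))       # misrouted elements would be fixed
        if i < m:                             # by a clean-up pass (omitted here)
            out.append(pivots[i])
    return out

# Exercising the control flow with noiseless stand-ins:
data = random.sample(range(1000), 200)
assert noisy_quicksort_skeleton(
    data,
    sort_pivots=sorted,
    find_bucket=lambda x, piv: bisect.bisect_left(piv, x),
    sort_bucket=sorted,
) == sorted(data)
```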
2 Preliminaries
In this section we introduce several basic notations, definitions, and lemmas used in the proof.
2.1 Basic Notions
Notation 5.
We use the following notations throughout this paper.
-
•
For a positive integer $n$, let $[n] = \{1, 2, \ldots, n\}$.
-
•
For two random variables $X, Y$, let $I(X; Y)$ denote the mutual information between $X$ and $Y$.
-
•
For a distribution $P$, define its entropy function as $H(P) = -\sum_x P(x) \log P(x)$. Specially, for $p \in [0, 1]$, define $H(p) = -p \log p - (1 - p) \log(1 - p)$. Define $I(p) = 1 - H(p)$.
-
•
For $p \in [0, 1]$, let $\mathrm{Bern}(p)$ denote the Bernoulli random variable with parameter $p$, such that for $X \sim \mathrm{Bern}(p)$, we have $\Pr[X = 1] = p$, $\Pr[X = 0] = 1 - p$.
-
•
For $p \in [0, 1]$, let $\mathrm{BSC}_p$ be the binary symmetric channel with crossover probability $p$, i.e., on input $x \in \{0, 1\}$, the output follows the distribution of $x \oplus \mathrm{Bern}(p)$.
-
•
For two random variables , define .
-
•
Let be a variable and and be two functions which are defined for all large enough values of . We say if . Specially, if , we omit the subscript and write .
We also assume some basic knowledge about information theory, such as chain rule of entropy and mutual information and Fano’s inequality. See e.g. [PW22] for an overview.
In Noisy Sorting and Noisy Binary Search, the algorithm is allowed to make noisy comparison queries.
Definition 6 (Noisy Comparison Query).
Fix $p \in (0, 1/2)$. Let $x, y$ be two comparable elements (i.e., either $x < y$ or $x > y$). We define the noisy comparison query $\mathsf{NoisyComparison}(x, y)$ as returning the value of $\mathbb{1}[x < y] \oplus \mathrm{Bern}(p)$, where the randomness is independent for every query.
We will omit the crossover probability when it is clear from the context throughout the paper.
With these definitions, we are ready to define the problems of study.
Definition 7 (Noisy Sorting Problem).
Let Noisy Sorting with parameters $n$ and $p$ be the following problem: Given $n$ comparable elements $a_1, \ldots, a_n$, an algorithm is allowed to make queries $\mathsf{NoisyComparison}(a_i, a_j)$ for $i \neq j$. The goal is to output a permutation $\pi$ of $[n]$ such that $a_{\pi(i)} < a_{\pi(i+1)}$ for all $i \in [n-1]$.
Definition 8 (Noisy Binary Search Problem).
Let Noisy Binary Search with parameters $n$, $p$, and error probability $\delta$ be the following problem: Given $n$ elements $a_1, \ldots, a_n$ satisfying $a_i < a_{i+1}$ for all $i \in [n-1]$, and an element $x$, an algorithm is allowed to make queries $\mathsf{NoisyComparison}(x, a_i)$ for $i \in [n]$. The goal is to output the index of the largest element among $a_1, \ldots, a_n$ that is less than $x$ (i.e., the predecessor of $x$).
Remark 9.
By our definition of the Noisy Sorting Problem, we can WLOG assume that all elements are distinct. In fact, suppose we have an algorithm which works for the distinct elements case. Then we can modify it such that if it is about to call , it instead calls (but flip the returning bit if ). Call the modified algorithm . This way, to the view of the algorithm, the elements are ordered by . It is easy to see that has the same distribution of number of queries, and error probability is no larger than error probability of . Thus, we get an algorithm which works for the case where elements are not necessarily distinct.
Similarly, by our definition of the Noisy Binary Search Problem, we can WLOG assume that all elements and are distinct.
2.2 Known and Folklore Algorithms
Theorem 10 ([DŁU21, Corollary 1.6]).
There exists a randomized algorithm for with error probability at most using
noisy comparisons in expectation, and using random bits always.
Remark 11.
[DŁU21] stated their problem as finding an element in a sorted array, but their algorithm can also find the predecessor. The only place their algorithm uses randomness is to shift the initial array randomly, which requires random bits (as we can WLOG assume is a power of two in , it always takes random bits to sample a random shift, instead of random bits in expectation).
We could also use the noisy binary search algorithm in [BZ74] (see also [WGZW23, Theorem 5]), which achieves a slightly better bound on the number of noisy comparisons. However, the number of random bits used in Burnashev-Zigangirov’s algorithm is in expectation, not always. This would make our control of number of random bits used slightly more complicated.
Corollary 12.
There exists a randomized algorithm for with failure probability using noisy comparisons in expectation, and using random bits always.
Proof.
The algorithm keeps a sorted array of the first elements for from to . Each time a new element needs to be inserted, the algorithm uses Theorem 10 to find its correct position, with error probability . By union bound, the overall error probability is bounded by , the expected number of queries is , and the number of random bits used is . ∎
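In code, the reduction used in this corollary is just binary insertion sort with a noisy search subroutine plugged in; the sketch below uses a placeholder `noisy_binary_search` argument standing in for the algorithm of Theorem 10 and makes no claim about its exact error/query trade-off.

```python
def noisy_insertion_sort(items, noisy_binary_search):
    """Binary insertion sort driven by a noisy search subroutine.

    noisy_binary_search(x, prefix) returns the position at which x should be
    inserted into the already-sorted prefix; it is assumed to err with
    probability o(1/n), so that a union bound over the n insertions keeps the
    overall error probability o(1), as in the proof above.
    """
    prefix = []
    for x in items:
        prefix.insert(noisy_binary_search(x, prefix), x)
    return prefix
```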
We show how to use repeated queries to boost the correctness probability of pairwise comparisons. This is a folklore result, but we present a full proof here for completeness.
Lemma 13.
There exists a deterministic algorithm which compares two (unequal) elements using noisy comparisons, with error probability $\delta$, using at most
$\frac{\log \delta^{-1}}{(1-2p)\log\frac{1-p}{p}} + O(1)$
noisy queries in expectation.
Proof.
Consider Algorithm 1 which maintains the posterior probability that .
Because of the return condition, if prior distribution of is , then probability of error is . However, by symmetry, probability of error of the algorithm does not depend on the ground truth, so the probability of error is regardless of the ground truth.
In the following, let us analyze the number of queries used.
Let be the value of after the -th query. If the ground truth is , then
Thus,
where the expectation is over the randomness of the -th query.
Define two sequences , as and . Let be the stopping time .
By the above discussion, is a martingale. Using Optional Stopping Theorem, we have
for every , so
By the bounded convergence theorem, goes to as . By the monotone convergence theorem, goes to as . Therefore,
which leads to
where we used the fact that is always an integer multiple of and hence .
Thus, the algorithm stops within queries in expectation.
Similarly, if the ground truth is , the algorithm stops within the same expected number of queries. This finishes the proof. ∎
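A minimal Python rendering of the posterior-tracking procedure described in this proof (our sketch; the stopping rule and constants are a plausible reading of the text rather than a verbatim transcription of Algorithm 1):

```python
import random

def less_than(noisy_compare, p, delta):
    """Decide whether x < y with error probability about delta.

    noisy_compare() returns the truth of "x < y" flipped independently with
    probability p.  We maintain q, the posterior probability of the event
    {x < y} given the answers so far (uniform prior), and stop once
    q >= 1 - delta or q <= delta.
    """
    q = 0.5
    queries = 0
    while delta < q < 1 - delta:
        ans = noisy_compare()
        queries += 1
        if ans:   # evidence for x < y
            q = q * (1 - p) / (q * (1 - p) + (1 - q) * p)
        else:     # evidence for x > y
            q = q * p / (q * p + (1 - q) * (1 - p))
    return q >= 1 - delta, queries

# Example: compare 3 and 7 through a channel with crossover probability 0.3.
answer, used = less_than(lambda: (3 < 7) ^ (random.random() < 0.3), p=0.3, delta=1e-3)
```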
2.3 Pairwise Independent Hashing
During our algorithm, we sometimes run into the following situation: we need to run a certain algorithm times, where each instance uses random bits for some . We can only afford fresh random bits in total (for derandomization purpose), so the instances need to share randomness in some way. On the other hand, we want randomness between any two instances to be independent of each other, in order for concentration bounds to hold. Therefore we need the following standard result on pairwise independent hashing.
Lemma 14 ([CW77]).
There exists a pairwise independent hash family of size using fresh random bits.
By using fully independent copies of pairwise independent hash families, we can achieve pairwise independence between instances using fresh random bits.
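For concreteness, one standard pairwise independent family (not necessarily the exact construction of [CW77]) is the affine family $h_{a,b}(x) = (ax + b) \bmod q$ over a prime field, which costs two field elements, i.e., $O(\log q)$ fresh random bits, per hash function:

```python
import random

Q = (1 << 61) - 1  # a Mersenne prime, larger than the universe being hashed

def fresh_pairwise_hash():
    """Sample h_{a,b}(x) = (a*x + b) mod Q.

    For any fixed x != x', the pair (h(x), h(x')) is uniformly distributed over
    Z_Q^2 when a, b are uniform, so the family is pairwise independent.
    Sampling (a, b) costs 2*ceil(log2(Q)) fresh random bits.
    """
    a = random.randrange(Q)
    b = random.randrange(Q)
    return lambda x: (a * x + b) % Q

# To give n algorithm instances pairwise independent seeds while spending only
# O(log Q) fresh random bits, draw one hash function and feed h(i) to instance i.
h = fresh_pairwise_hash()
seeds = [h(i) for i in range(10)]
```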
3 Noisy Sorting Algorithm
In this section, we will present our algorithm for noisy sorting and prove Theorem 1. We will first present the following randomized version of our algorithm, and then convert it to a deterministic one in Section 3.1.
Theorem 15.
There exists a randomized algorithm for with error probability which always uses at most
noisy comparisons, and uses random bits in expectation.
We start with a subroutine that sorts a list with a small number of inversion pairs.
Lemma 16.
Fix any parameter which can depend on . Given a list of elements with inversion pairs, there exists a deterministic algorithm using noisy comparisons that sorts these elements with error probability using noisy queries in expectation. The algorithm does not have to know in advance.
Proof.
Consider Algorithm 2, which is essentially insertion sort.
By union bound, with probability , the first calls to LessThan all return correctly. Conditioned on this happening, the algorithm acts exactly like a normal insertion sort, which halts after the first calls to LessThan, which takes noisy comparisons in expectation.
With probability , the algorithm does not run correctly, but the algorithm still halts in at most queries in expectation. ∎
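Algorithm 2 is standard insertion sort run with a boosted comparator; a Python sketch (the comparator is passed in, e.g. the algorithm of Lemma 13 with a suitable error parameter):

```python
def insertion_sort_few_inversions(items, less_than):
    """Sort a list that is promised to have few inversion pairs.

    less_than(x, y) is a comparator with small error probability (e.g. the
    algorithm of Lemma 13).  If all of its answers are correct, the number of
    comparisons is at most len(items) plus the number of inversion pairs, so
    the cost degrades gracefully with the initial disorder.
    """
    out = []
    for x in items:
        i = len(out)
        while i > 0 and less_than(x, out[i - 1]):  # walk left past larger elements
            i -= 1
        out.insert(i, x)
    return out
```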
Next, we use Lemma 16 to construct the following subroutine. It is ultimately used to sort some small sets of elements in the main algorithm.
Lemma 17.
Fix any parameter which can depend on . There is a randomized algorithm for with error probability at most using noisy comparisons in expectation, and using random bits always.
Proof.
After the first step, the expected number of inversion pairs is
so the second step uses at most queries in expectation.
The overall error probability is at most . ∎
The following lemma allows us to construct algorithms with guaranteed tail behavior for the number of queries. This will be helpful for concentration bounds.
Lemma 18.
Suppose we have an algorithm (solving a certain task) that has error probability and number of queries with , and uses random bits always. Then we can construct an algorithm solving the same task satisfying the following properties:
-
•
has error probability ;
-
•
Let be the number of queries algorithm uses. Then
-
•
Let be the number of random bits algorithm uses. Then
Proof.
Let be a parameter to be chosen later. Consider the following algorithm :
-
1.
Run algorithm until it halts, or it is about to use the -th query.
-
2.
If halts, then we return the return value of ; otherwise, we restart the whole algorithm.
Let us compute the probability of a restart. By Markov’s inequality,
Algorithm ’s error probability is
The expected number of queries used by Algorithm satisfies
Solving this, we get
The second moment satisfies
Solving this, we get
Similar to , we have
Solving this we get
Also,
Solving this we get
Choosing , we have , so we get
∎
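Operationally, the transformation of Lemma 18 is a query-budgeted retry loop; in the sketch below, `run_with_budget` is a hypothetical interface that runs the base algorithm but aborts once a given number of queries has been spent.

```python
def safe_version(run_with_budget, budget):
    """Query-budgeted retry wrapper in the spirit of Lemma 18.

    run_with_budget(budget) runs the base algorithm but gives up once `budget`
    queries have been made; it returns (finished, answer).  Restarting whenever
    the budget is exhausted caps the tail of the query distribution, at the cost
    of a slightly larger error probability and expected query count.
    """
    while True:
        finished, answer = run_with_budget(budget)
        if finished:
            return answer
```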
Using Lemma 18, we are able to construct “safe” (i.e., with guaranteed tail behavior) versions of algorithms introduced before.
Corollary 19 (Safe Algorithms).
-
1.
Given , there exists a randomized algorithm SafeBinarySearch for , with error probability , with the number of queries satisfying
and with the number of used random bits satisfying
-
2.
Given , there exists a deterministic algorithm SafeLessThan which compares two elements with error probability , with the number of queries satisfying
-
3.
Given , there exists a randomized algorithm SafeWeakSort for , with error probability , with the number of queries satisfying
and with the number of used random bits satisfying
We introduce our last subroutine below, which is used to sort a subset of elements in the main algorithm.
Lemma 20.
There exists a randomized algorithm SafeSimpleSort for with error probability which always uses queries and random bits always.
Proof.
Consider the following algorithm :
-
•
Keep an array of the first elements for from to . Each time a new element needs to be inserted, the algorithm uses SafeBinarySearch to find the correct position, with error probability .
-
•
Output the resulting array.
We have calls to SafeBinarySearch. Let be the event that the -th call to SafeBinarySearch returns the correct value, and let be the event that all happens. By union bound, probability of error is at most .
Let be the number of queries used in the -th call to SafeBinarySearch. By Corollary 19 Part 1, we have
and
Conditioned on , ’s are independent because they use disjoint source of randomness. Thus, by Chebyshev’s inequality,
Let us define .
If the number of random bits used in the -th call to SafeBinarySearch is , then we can similarly show
Let .
Consider the following algorithm (which is our SafeSimpleSort):
-
•
Run until it finishes, or it is about to make the -th query, or it is about to use the -th random bit.
-
•
If finishes, then return the output of ; otherwise, output an arbitrary permutation.
Then SafeSimpleSort always makes at most queries, uses at most random bits, and has error probability . ∎
Finally, we give our main algorithm for noisy sorting. See Algorithm 3 for its description.
We first analyze its error probability.
Lemma 21.
Algorithm 3 has error probability .
Proof.
Let be the event that the algorithm does not err because of lack of random bits at Line 7, Line 17, or Line 34. Let be the event that SafeSimpleSort successfully sorts at Line 5; be the event that all calls of SafeWeakSort at Line 17 are correct; be the event that all calls of SafeLessThan at Lines 20 and Lines 27 are correct; be the event that all calls of SafeBinarySearch at Line 34 are correct.
We show that if all happen, then Algorithm 3 is correct. First of all, under , is correctly sorted at Line 5. Secondly, under and , for each , is correctly sorted at Line 17. Under and , at Line 32, all elements remaining in are indeed greater than and smaller than , so adding to between and keeps all elements in sorted. Therefore, before the for loop at Line 33, all elements in are correctly sorted, and clearly these elements are exactly . Finally, under and , the insertions made at Line 34 are all correct, so the final result is correct.
For each call of SafeBinarySearch at Line 7, the probability that it requires more than random bits is by Corollary 19 Part 1 and Chebyshev’s inequality. By union bound, with probability , none of the calls of SafeBinarySearch at Line 7 causes the algorithm to err. We can similarly bound the probability the algorithm errs at Line 17 or Line 34. Overall, . Clearly, , , and . Thus, by union bound, the overall success probability of Algorithm 3 is at least . ∎
Let a bucket be the set of elements that are between two adjacent elements in the sorted order of . Then we have the following simple lemma.
Lemma 22.
With probability , all buckets have size at most .
Proof.
For any continuous segment of length in the sorted order of , the probability that none of these elements is in is . By union bound, the probability that there exists continuous elements not in is at most . Taking , we see that with probability , all continuous segments of length have at least one element in , which implies all buckets have size . ∎
Lemma 23.
With probability , Algorithm 3 uses at most queries.
Proof.
Because erring only makes the algorithm exit earlier, in the analysis we ignore the effect of erring caused by insufficient random bits. We define the following events:
-
•
Let be the event that . By Chernoff bound, .
-
•
Let be the event that all buckets have size at most . By Lemma 22, .
-
•
Let be the event that at most elements have the wrong predecessor at Line 7. By Chebyshev's inequality, .
-
•
Let be the event that . If , , and all happen, then happens, for the following reasons. First, at Line 15, has size greater than , while the sizes of all buckets are at most conditioned on . Thus, at least half of the elements in have the wrong predecessor at Line 7. Thus, the number of elements added to at Line 15 is at most twice the number of elements with the wrong predecessor , which is bounded by conditioned on . Also, if an element is added to at Line 22 or 29 and happens, must also have the wrong predecessor. Thus, overall, if , , and all happen.
Line 5.
Line 7.
Line 17.
Line 20, 27.
Line 34.
Conditioned on , the number of calls to SafeBinarySearch is . By Corollary 19 Part 1,
By Corollary 19 Part 1 and Chebyshev’s inequality,
Summarizing the above, excluding events with probability in total, the total number of queries made is
∎
Finally, we analyze the expected number of random bits used by the algorithm.
Lemma 24.
Algorithm 3 uses random bits in expectation.
Proof.
We consider all the places Algorithm 3 uses randomness.
Line 4.
Line 5.
By Chernoff bound, with probability at least . In this case, SafeSimpleSort uses random bits. If , SafeSimpleSort uses random bits, which contribute at most to the overall expectation. Thus, Line 5 uses random bits in expectation.
Line 7, Line 17, Line 34.
For each of the first outputted random bits, we use Lemma 14 to construct a pairwise independent hash family of size using fresh random bits. The total number of fresh random bits needed is . ∎
Now we are ready to prove Theorem 15.
Proof of Theorem 15.
Consider the following algorithm, which we call SafeNoisySort:
-
•
Run NoisySort. Stop immediately if we are about to use queries, for some to be chosen later.
-
•
If we successfully completed NoisySort, then output the sorted array; otherwise, output any permutation.
By Lemma 23, we can choose so that with probability , the NoisySort call completes successfully. By Lemma 21, with probability , the NoisySort call is correct. By union bound, SafeNoisySort has error probability and it always takes at most queries. By Lemma 24, number of random bits used is in expectation. ∎
3.1 Trading Queries for Random Bits
In this section, we show how to use the randomized algorithm to construct a deterministic one.
See 1
Proof.
In our model, it is possible to generate unbiased random bits using noisy comparisons. Take two arbitrary (unequal) elements $x, y$, and call $\mathsf{NoisyComparison}(x, y)$ twice. If the two returned values are different, we use the first returned value as our random bit; otherwise, we repeat. Clearly, this procedure generates an unbiased random bit. The probability that this procedure successfully returns a random bit after each two calls of NoisyComparison is $2p(1-p)$, so the expected number of noisy comparisons needed to generate each random bit is $\frac{1}{p(1-p)}$. Note that this procedure is deterministic.
Algorithm 3 uses random bits in expectation, so we can use the above procedure to generate random bits, and the expected number of queries needed is . We can halt the algorithm if the number of queries used this way exceeds , which only incurs an additional error probability. ∎
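This extraction step is the classical von Neumann trick applied to the noisy comparison oracle; a minimal sketch (deterministic apart from the oracle's own noise):

```python
def random_bit_from_noisy_comparisons(noisy_compare):
    """Generate one unbiased random bit from the noisy comparison oracle.

    noisy_compare() returns a fixed truth value flipped independently with
    probability p on each call.  Two calls disagree with probability 2p(1-p),
    and conditioned on disagreeing, the first answer is unbiased, so the
    expected number of comparisons per output bit is 1/(p(1-p)).
    """
    while True:
        first, second = noisy_compare(), noisy_compare()
        if first != second:
            return first
```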
4 Noisy Sorting Lower Bound
In this section, we show the lower bound of noisy sorting. See 2
The following problem serves as an intermediate step towards the lower bound.
Definition 25 (Group Sorting Problem).
For some integers , we define problem as follows: Given a list of elements, divided into groups of elements each, satisfying the property that for all , , we have . An algorithm needs to output an ordered list of sets by asking the following type of queries :
-
•
If the two elements are in the same group, then the query result is .
-
•
Otherwise, suppose , with . Then the query result is .
The following lemma relates GroupSorting and NoisySorting.
Lemma 26.
Fix any algorithm for with error probability . Let . Let be the input list of . Suppose we partition into groups where for all , for all , , we have . Let denote the number of queries makes to elements in different groups.
Then there exists an algorithm for with error probability which makes at most queries in expectation.
Proof.
Given , we design an algorithm for as follows.
Given input , Algorithm picks an arbitrary strict total order on all the elements. also maintains which elements are known to be in the same group, implied by queries that return . Then, it simulates Algorithm on the same input as follows:
-
•
When attempts to make a comparison between and :
-
–
If are not known to be in the same group, call . If the query returns , which means and are in the same group, return to . Otherwise, return the result of to .
-
–
If are known to be in the same group, return to .
-
–
-
•
When outputs a sequence : Let for and output .
Let us analyze the error probability and number of queries made by algorithm .
Error probability.
Algorithm simulates a instance on with the following total order:
-
•
If are in the same group, then iff ;
-
•
If , are in different groups, then iff .
Therefore, with error probability , the sequence that outputs is sorted with respect to the above total order, which gives the valid groups.
Number of queries.
Algorithm makes one query when makes a query for elements in different groups. Algorithm makes queries between elements in the same group only when they are not known to be in the same group. Imagine an initially empty graph on vertex set where for each that returns , we add an edge between and . For every such query, the number of connected components in this graph decreases by , and at the end the graph has at least connected components. Thus, the total number of queries made by to elements in the same group is at most .
Overall, the total number of queries made by Algorithm is at most in expectation. ∎
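The bookkeeping in this reduction, namely remembering which elements are already known to lie in the same group, is naturally implemented with a union–find structure; the sketch below is our illustration, and the oracle interface (`group_query`, `tie_break`) is a hypothetical rendering of Definition 25 rather than notation from the paper.

```python
class SameGroupTracker:
    """Union-find over element indices: two indices share a root iff they are
    already known, via earlier 'same group' answers, to lie in the same group."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, i):
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]  # path halving
            i = self.parent[i]
        return i

    def union(self, i, j):
        self.parent[self.find(i)] = self.find(j)

def wrapped_compare(i, j, tracker, group_query, tie_break):
    """Simulate a noisy comparison for algorithm A on top of the group oracle.

    group_query(i, j) returns "same" or a (noisy) comparison bit, and
    tie_break(i, j) is the arbitrary fixed total order used inside a group.
    Every "same" answer merges two components, so at most n - 1 such answers
    are ever paid for, matching the counting argument in the proof.
    """
    if tracker.find(i) == tracker.find(j):
        return tie_break(i, j)          # already known: answer without a query
    answer = group_query(i, j)
    if answer == "same":
        tracker.union(i, j)
        return tie_break(i, j)
    return answer
```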
Next we prove a lower bound for the Group Sorting Problem. By Lemma 26, this implies a lower bound of the number of queries made between elements in different groups of any Noisy Sorting algorithm.
Lemma 27.
Let with . Any (deterministic or randomized) algorithm for with error probability makes at least queries in expectation, even if the input is a uniformly random permutation.
Proof.
Fix an algorithm for . WLOG assume that makes queries between elements in the same group only when they are not known to be in the same group. Therefore, the total number of queries between elements in the same group is at most .
Let be the true partition. Let be the (random) number of queries made. Let be the queries made. Let be the returned values of the queries. Let be the most probable given all query results. Note that is a function of .
When , we have .
By Fano’s inequality, , so
(1)
On the other hand,
(chain rule of mutual information)
( and are independent conditioned on )
(chain rule)
( and are independent conditioned on )
(2)
Let . Because the algorithm makes at most queries between elements in the same group, we have
We have
and
Note that is concave in , so
(Jensen’s inequality) | ||||
( is increasing on and is maximized at ) | ||||
() |
On the other hand,
Plugging the above two inequalities in Equation (2), we get
Combining with Equation (1), we get . ∎
Before we give lower bound for the number of queries made to elements in the same group, we first show the following technical lemma that will be useful later.
Lemma 28.
Let be a Markov process defined as
Let be a random variable supported on . Suppose there exists such that
Then
Proof.
Let be the event that . Let be the event that for all . By Markov’s inequality, . By concentration inequalities, . Using union bound, we have
∎
Lemma 29.
Fix any (deterministic or randomized) algorithm for with error probability . Let . Let be the input list of . Suppose we partition into groups where for all , for all , , we have . Let denote the number of queries made to elements in the same group.
Then , even if the input is a uniformly random permutation.
Proof.
Suppose the true order is . Let . Note that . For , define to be the number of queries returning , minus the number of queries returning . Note that is a random variable depending on the transcript. Define . Then .
Let us consider the posterior probabilities of the permutations given a fixed transcript. That is, define
By Bayes’ rule, equals . By symmetry, are equal for different and is constant as we fixed the transcript, so is proportional to . If , then equals
(3)
For , consider the permutation . Then by comparing and using (3), we have
Thus,
Summing over , we have
Because the error probability is , for any , with probability , . So
with probability .
On the other hand, for any , if
then by regarding as and as in Lemma 28, with probability , we have
and then
which is a contradiction.
Thus,
as desired. ∎
Now we are ready to prove Theorem 2.
Proof of Theorem 2.
Fix an algorithm for with error probability .
Take . Let . Clearly, an algorithm for can be used to solve with less or equal number of queries in expectation. Let , be as defined in Lemma 26, Lemma 29 for and .
Therefore expected number of queries made by the algorithm is at least
∎
5 Noisy Binary Search
In this section, we prove our lower and upper bounds for Noisy Binary Search.
5.1 Lower Bound
In this section, we prove our lower bound for Noisy Binary Search using techniques similar to our lower bound for Noisy Sorting. See 3
We define the following intermediate problem:
Definition 30 ().
Given a sorted list of elements and an element that is unequal to any element in the list, a solver needs to output the predecessor of in by asking the following type of query for :
-
•
If is the predecessor of , output symbol ;
-
•
If is the successor of , output symbol ;
-
•
Otherwise, output .
Lemma 31.
Fix an algorithm for with error probability . Let be the input list and be the input element of . Let denote the number of noisy comparisons makes between and elements that are not the predecessor or successor of .
Then there exists an algorithm for with error probability at most which makes at most queries in expectation.
Proof.
The algorithm for simulates as follows:
-
•
Every time attempts to make a between some and some element in , the algorithm makes instead. If returns or , we have found the predecessor of , so we can return the predecessor and finish the execution; otherwise, we pass the result of (which will be ) to .
-
•
If returns a predecessor or , then we also return the same predecessor, finishing the execution.
The error probability bound and number of queries easily follow. ∎
Lemma 32.
Any (deterministic or randomized) algorithm for with error probability makes at least queries in expectation, even if the position of the element to search for is uniformly random.
Proof.
Fix an algorithm for . WLOG, we can assume that exits as soon as it sees a query that returns or , i.e., it sees at most one such query result. Also, we assume , as otherwise the lower bound is trivially .
Let be the index of the true predecessor of . Let be the (random) number of queries made. Let be the queries made. Let be the returned values of the queries. Let be the most probable given all query results. Note that is a function of .
Clearly, . By Fano’s inequality, . Thus,
On the other hand, we also have
using the same inequalities in the proof of Lemma 27.
Let and . Because exits as soon as it sees or we have
Then
and
Note that trivially, so . Combining with , we have (as we can assume ).
Note that is concave in , so
(Jensen’s inequality) | ||||
( is increasing on and is maximum at ) | ||||
() |
On the other hand,
Therefore,
Combining with , we get . ∎
Lemma 33.
Fix any (deterministic or randomized) algorithm for with error probability . Let be the input list and be the input element of . Let denote the number of noisy comparisons made between the input element and its predecessor and successor. Then , even if the position of the element to search for is uniformly random.
Proof.
Let the input list contain elements in this order. Let be the index of the predecessor of . Define to be the number of queries returning , minus the number of queries returning . Note that is a random variable depending on the transcript. We will first show that . Notice that the error probability of the algorithm conditioned on is at most for sufficiently large .
Define . Then . Let us consider the posterior probabilities of the distributions of given a fixed transcript. That is,
Then if , similarly to the proof of Lemma 29, we have
Thus, by Jensen’s inequality,
Suppose for the sake of contradiction that
for some constant . Then by Lemma 28 (taking as , as , as , as ), with probability , we have
and then
Because the error probability is , with probability at least , we have , which leads to . Therefore,
(In the last step we use and thus for sufficiently large .) This is a contradiction because and when and is large enough.
Thus,
Consequently,
∎
Now we are ready to prove Theorem 3.
5.2 Upper Bound
In this section we prove our upper bound for Noisy Binary Search. See 4
Proof.
First, if , we output an arbitrary answer and halt with probability . Otherwise, we call Theorem 10 with error probability . By union bound, the overall error probability is at most . The expected number of noisy comparisons is
as required. In the following, we assume . We also assume for simplicity, as we could pad dummy elements.
Let be the element for which we need to find the predecessor. First, we use Theorem 10 with error probability to find a candidate predecessor, which takes time. Then we use Lemma 13 to compare with this candidate predecessor and with the next element of this candidate predecessor , with error probability . This takes time. If the comparison results are and , we return as the predecessor of , otherwise, we restart. See Algorithm 4.
Let be the probability that we restart a new iteration of the while loop. Clearly, if the call to Theorem 10 and both calls to LessThan are correct, then we will not restart. Thus, . Note that as , .
Let be the probability that this algorithm returns the wrong predecessor. In each iteration of the while loop, the algorithm returns the wrong answer only if at least one of the two calls to LessThan is incorrect. Therefore, , so .
Let be the expected number of queries that this algorithm makes. Since the call to Theorem 10 uses noisy comparisons in expectation, and each call to LessThan uses noisy comparisons in expectation (as ), we get that
Therefore,
as desired. ∎
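The structure of Algorithm 4 is a find-then-verify-then-restart loop; in the Python sketch below, `coarse_search` stands in for the algorithm of Theorem 10 and `less_than` for the boosted comparison of Lemma 13, and the parameter choices are placeholders rather than the ones used in the analysis.

```python
def noisy_predecessor(x, sorted_list, coarse_search, less_than, delta):
    """Return the index of the predecessor of x in sorted_list (or -1), with
    error probability roughly delta.

    coarse_search(x, sorted_list) -> candidate index, correct with probability 1 - o(1)
    less_than(a, b, err)          -> boosted noisy comparison with error probability err

    Each round verifies the candidate against both neighbours and restarts on
    failure, so a wrong output requires a boosted comparison to fail, while the
    expected number of rounds is only 1 + o(1).
    """
    n = len(sorted_list)
    while True:
        i = coarse_search(x, sorted_list)
        left_ok = i < 0 or less_than(sorted_list[i], x, delta)
        right_ok = i + 1 >= n or less_than(x, sorted_list[i + 1], delta)
        if left_ok and right_ok:
            return i
```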
6 Discussions
In this section we discuss a few possible extensions of our results.
Varying .
In our main results for Noisy Sorting, we have assumed that $p$ remains constant as $n \to \infty$. This assumption is necessary in several places of our proofs as discussed below, and we leave it as an open problem to drop this assumption.
For as , Theorem 1 still holds. For the proof, one needs to suitably modify Corollary 12, Lemma 17, Lemma 18, Corollary 19, Lemma 20 to handle correctly. For example, in our current statement of Corollary 19 we ignore constants related to in the expressions for . By choosing in Lemma 18 carefully, we can let even for growing , which suffices for the concentration results. For the lower bound (Theorem 2), Lemma 27 may become problematic. It seems possible to modify the proof of Lemma 27 to handle the case . However, for a stronger method is needed.
For as , the lower bounds hold without any change. For the algorithms, our derandomization method would result in too many queries when .
We similarly leave the task of dropping the assumption that $p$ remains a constant in our Noisy Binary Search results as an open problem.
Non-uniform and semi-random adversary.
In our main results we have assumed that the probability of error in the comparisons remains constant. One possible extension is that in each noisy comparison, the result is flipped with some probability at most $p$, where this probability can be adversarially chosen for each query. (The algorithm knows $p$ but does not know the flip probability for individual comparisons.) This model is sometimes called the semi-random adversary model in the literature (e.g., [MWR18]). It is an interesting open problem to generalize our results to this model.
Dependency on the error probability.
In our bounds for Noisy Sorting, we simply used $o(1)$ as our error probability. We leave it as an open problem to generalize our bounds to the case where the error probability $\delta$ can also be specified.
BMS observation.
One natural extension is to replace observation noise with a Binary Memoryless Symmetric channel (BMS). That is, for every noisy comparison, the probability of error is chosen (independently) from a distribution on , and this probability is returned together with the comparison result. In this case our guess is that
noisy comparisons should be necessary and sufficient. When is supported on for some constant independent of , we expect it to be easier to adapt our proofs to BMS, whereas in the general case, issues described in the Varying discussion above can occur.
References
- [AD91] Javed A. Aslam and Aditi Dhagat. Searching in the presence of linearly bounded errors. In Proceedings of the 23rd Annual ACM Symposium on Theory of Computing (STOC), pages 486–493, 1991.
- [BH08] Michael Ben-Or and Avinatan Hassidim. The bayesian learner is optimal for noisy binary search (and pretty good for quantum as well). In Proceedings of the 2008 49th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 221–230, 2008.
- [BK93] Ryan S. Borgstrom and S. Rao Kosaraju. Comparison-based search in the presence of errors. In Proceedings of the 25th Annual ACM Symposium on Theory of Computing (STOC), pages 130–136, 1993.
- [BM86] Donald A. Berry and Roy F. Mensch. Discrete search with directional information. Oper. Res., 34(3):470–477, 1986.
- [BM08] Mark Braverman and Elchanan Mossel. Noisy sorting without resampling. In Proceedings of the 19th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 268–276, 2008.
- [BMW16] Mark Braverman, Jieming Mao, and S. Matthew Weinberg. Parallel algorithms for select and partition with noisy comparisons. In Proceedings of the 48th Annual ACM Symposium on Theory of Computing (STOC), pages 851–862, 2016.
- [BZ74] M. V. Burnashev and K. Sh. Zigangirov. An interval estimation problem for controlled observations. Probl. Peredachi Inf., 10(3):51–61, 1974.
- [CLRS22] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to algorithms. MIT press, 2022.
- [CW77] J. Lawrence Carter and Mark N. Wegman. Universal classes of hash functions. In Proceedings of the 9th Annual ACM Symposium on Theory of Computing (STOC), pages 106–112, 1977.
- [DGW92] Aditi Dhagat, Peter Gács, and Peter Winkler. On playing “twenty questions” with a liar. In Proceedings of the 3rd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 16–22, 1992.
- [DŁU21] Dariusz Dereniowski, Aleksander Łukasiewicz, and Przemysław Uznański. Noisy searching: simple, fast and correct. arXiv preprint arXiv:2107.05753, 2021.
- [FJ59] Lester R. Ford Jr. and Selmer M. Johnson. A tournament problem. Am. Math. Mon., 66(5):387–389, 1959.
- [FRPU94] Uriel Feige, Prabhakar Raghavan, David Peleg, and Eli Upfal. Computing with noisy information. SIAM J. Comput., 23(5):1001–1018, 1994.
- [Gal78] Shmuel Gal. A stochastic search game. SIAM. J. Appl. Math., 34(1):205–210, 1978.
- [GLLP17] Barbara Geissmann, Stefano Leucci, Chih-Hung Liu, and Paolo Penna. Sorting with recurrent comparison errors. In Proceedings of the 28th International Symposium on Algorithms and Computation (ISAAC), pages 38:1–38:12, 2017.
- [GLLP19] Barbara Geissmann, Stefano Leucci, Chih-Hung Liu, and Paolo Penna. Optimal sorting with persistent comparison errors. In Proceedings of the 27th Annual European Symposium on Algorithms (ESA), pages 1–14, 2019.
- [GLLP20] Barbara Geissmann, Stefano Leucci, Chih-Hung Liu, and Paolo Penna. Optimal dislocation with persistent errors in subquadratic time. Theory Comput. Syst., 64(3):508–521, 2020.
- [Hor63] Michael Horstein. Sequential transmission using noiseless feedback. IEEE Trans. Inf. Theory, 9(3):136–143, 1963.
- [Knu98] Donald E. Knuth. The art of computer programming, volume 3. Addison-Wesley Professional, 1998.
- [KPSW11] Rolf Klein, Rainer Penninger, Christian Sohler, and David P. Woodruff. Tolerant algorithms. In Proceedings of the 19th Annual European Symposium on Algorithms (ESA), pages 736–747, 2011.
- [KY76] D. Knuth and A. Yao. The complexity of nonuniform random number generation. In Algorithms and Complexity: New Directions and Recent Results, 1976.
- [Mut94] S. Muthukrishnan. On optimal strategies for searching in presence of errors. In Proceedings of the 5th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 680–689, 1994.
- [MWR18] Cheng Mao, Jonathan Weed, and Philippe Rigollet. Minimax rates and efficient algorithms for noisy sorting. In Algorithmic Learning Theory, pages 821–847. PMLR, 2018.
- [Pel89] Andrzej Pelc. Searching with known error probability. Theor. Comput. Sci., 63(2):185–202, 1989.
- [PW22] Yury Polyanskiy and Yihong Wu. Information Theory: From Coding to Learning. Cambridge University Press, 2022. Draft version available at https://people.lids.mit.edu/yp/homepage/data/itbook-export.pdf.
- [RMK+80] Ronald L. Rivest, Albert R. Meyer, Daniel J. Kleitman, Karl Winklmann, and Joel Spencer. Coping with errors in binary search procedures. J. Comput. Syst. Sci., 20(3):396–404, 1980.
- [WGW22] Ziao Wang, Nadim Ghaddar, and Lele Wang. Noisy sorting capacity. In Proceedings of the 2022 IEEE International Symposium on Information Theory (ISIT), pages 2541–2546, 2022.
- [WGZW23] Ziao Wang, Nadim Ghaddar, Banghua Zhu, and Lele Wang. Noisy sorting capacity. arXiv preprint arXiv:2202.01446v2, 2023.