Nearly-Linear Time Seeded Extractors with Short Seeds
Abstract
Seeded extractors are fundamental objects in pseudorandomness and cryptography, and a deep line of work has designed polynomial-time seeded extractors with nearly-optimal parameters. However, existing constructions of seeded extractors with short seed length and large output length run in time $\Omega(n\log(1/\varepsilon))$ and often slower, where $n$ is the input source length and $\varepsilon$ is the error of the extractor. Since cryptographic applications of extractors require $\varepsilon$ to be small, the resulting runtime makes these extractors unusable in practice.
Motivated by this, we explore constructions of strong seeded extractors with short seeds computable in nearly-linear time $\widetilde{O}(n)$, for any error $\varepsilon > 0$. We show that an appropriate combination of modern condensers and classical approaches for constructing seeded extractors for high min-entropy sources yields strong extractors for $n$-bit sources with any min-entropy $k$ and any target error $\varepsilon$ with seed length $d = O(\log(n/\varepsilon))$ and output length $m = (1-\eta)k$ for an arbitrarily small constant $\eta > 0$, running in nearly-linear time, after a reasonable one-time preprocessing step (finding a primitive element of $\mathbb{F}_q$ with $q = \mathrm{poly}(n/\varepsilon)$ a power of $2$) that is only required when $\varepsilon < 2^{-k/2^{C\log^* n}}$, for a constant $C > 0$ and $\log^*$ the iterated logarithm, and which can be implemented in time $\mathrm{poly}(n/\varepsilon)$ under mild conditions on $q$. As a second contribution, we give an instantiation of Trevisan’s extractor that can be evaluated in truly linear time in the RAM model, as long as the number of output bits is at most $n/\mathrm{polylog}(n)$. Previous fast implementations of Trevisan’s extractor ran in time $\widetilde{O}(n)$ in this setting. In particular, these extractors directly yield privacy amplification protocols with the same time complexity and output length, and communication complexity equal to their seed length.
1 Introduction
Seeded randomness extractors are central objects in the theory of pseudorandomness. A strong $(k,\varepsilon)$-seeded extractor is a deterministic function $\mathrm{Ext}\colon \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ that receives as input an $n$-bit source of randomness $X$ with $k$ bits of min-entropy111A random variable $X$ has $k$ bits of min-entropy if $\Pr[X = x] \leq 2^{-k}$ for all $x$. Min-entropy has been the most common measure for the quality of a weak source of randomness since the work of Chor and Goldreich [CG88]. and a $d$-bit independent and uniformly random seed $Y$, and outputs an $m$-bit string that is $\varepsilon$-close in statistical distance to the uniform distribution over $\{0,1\}^m$, where $\varepsilon$ is an error term, even when the seed $Y$ is revealed. Besides their most direct application to the generation of nearly-perfect randomness from imperfect physical sources of randomness (and their inaugural applications to derandomizing space-bounded computation [NZ96] and privacy amplification [BBCM95]), seeded extractors have also found many other surprising applications throughout computer science, particularly in cryptography.
For most applications, it is important to minimize the seed length of the extractor. A standard application of the probabilistic method shows the existence of strong $(k,\varepsilon)$-seeded extractors with seed length $d = \log(n-k) + 2\log(1/\varepsilon) + O(1)$ and output length $m = k - 2\log(1/\varepsilon) - O(1)$, and we also know that these parameters are optimal up to the $O(1)$ terms [RT00]. This motivated a deep line of research devising explicit constructions of seeded extractors with seed length as small as possible spanning more than a decade (e.g., [NZ96, SZ99, NT99, Tre01, TZS06, SU05]) and culminating in extractors with essentially optimal seed length [LRVW03, GUV09]. In particular, the beautiful work of Guruswami, Umans, and Vadhan [GUV09] gives explicit strong extractors with order-optimal seed length $d = O(\log(n/\varepsilon))$ and output length $m = (1-\gamma)k$ for any constant $\gamma > 0$, and follow-up work [DKSS13, TU12] further improved the entropy loss $k - m$. The extractors constructed in these works are explicit, in the sense that there is an algorithm that given $x$ and $y$ computes the corresponding output $\mathrm{Ext}(x,y)$ in time polynomial in the input length.
A closer look shows that the short-seed constructions presented in the literature all run in time $\Omega(n\log(1/\varepsilon))$, and often significantly slower. In cryptographic applications of extractors we want the error guarantee $\varepsilon$ to be small, which means that implementations running in time $\Omega(n\log(1/\varepsilon))$ are often impractical. If we insist on nearly-linear runtime for arbitrary error $\varepsilon$, we can use strong seeded extractors based on universal hash functions that can be implemented in time $O(n\log n)$ (e.g., see [HT16]), have essentially optimal output length, but have the severe drawback of requiring a very large seed length $d = \Omega(m)$, at least as long as the output.
These limitations have been noted in a series of works studying concrete implementations of seeded extractors, with practical applications in quantum cryptography in mind [MPS12, FWE+23, FYEC24]. For example, Foreman, Yeung, Edgington, and Curchod [FYEC24] implement a version of Trevisan’s extractor [Tre01, RRV02] with its standard instantiation of Reed–Solomon codes concatenated with the Hadamard code, and emphasize its excessive running time as a major reason for its non-adoption.222The reason why these works focus on Trevisan’s extractor is that this is the best seeded extractor (in terms of asymptotic seed length) that is known to be secure against quantum adversaries [DPVR12]. Instead, practitioners have to rely on extractors based on universal hash functions, which, as mentioned above, are fast but require very large seeds.
This state of affairs motivates the following question, which is the main focus of this work:
Can we construct strong $(k,\varepsilon)$-seeded extractors with seed length $O(\log(n/\varepsilon))$ and output length $\Omega(k)$ computable in nearly-linear time, for arbitrary error $\varepsilon$?
Progress on this problem would immediately lead to faster implementations of many cryptographic protocols that use seeded extractors.
1.1 Our Contributions
We make progress on the construction of nearly-linear time extractors.
Seeded extractors with order-optimal seed length and large output length.
We construct nearly-linear time strong seeded extractors with order-optimal seed length $O(\log(n/\varepsilon))$ and large output length $(1-\gamma)k$ for any min-entropy $k$ and error $\varepsilon$, with the caveat that they require a one-time preprocessing step whenever $\varepsilon$ is very small compared to $k$. This preprocessing step corresponds to finding primitive elements of finite fields $\mathbb{F}_q$ with $q = \mathrm{poly}(n/\varepsilon)$, which, as we discuss below, is reasonable in practical applications. More precisely, we have the following result.
Theorem 1.
For any constant $\gamma > 0$ there exists a constant $C \geq 1$ such that the following holds. For any positive integers $n$ and $k \leq n$ and any $\varepsilon > 0$ satisfying $k \geq C\log(n/\varepsilon)$ there exists a strong $(k,\varepsilon)$-seeded extractor
$$\mathrm{Ext}\colon \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$$
with seed length $d = O(\log(n/\varepsilon))$ and output length $m \geq (1-\gamma)k$. Furthermore,
• if $\varepsilon \geq 2^{-k/2^{C\log^* n}}$, then $\mathrm{Ext}$ is computable in time $\widetilde{O}(n)$, where $\widetilde{O}(\cdot)$ hides polylogarithmic factors in its argument and $\log^*$ denotes the iterated logarithm;
• if $\varepsilon < 2^{-k/2^{C\log^* n}}$, then $\mathrm{Ext}$ is computable in time $\widetilde{O}(n)$ after a preprocessing step, corresponding to finding a primitive element of $\mathbb{F}_q$ with $q = \mathrm{poly}(n/\varepsilon)$ a power of $2$.333In full rigor, the preprocessing step corresponds to finding primitive elements of a few fields $\mathbb{F}_{q_i}$ with orders $q_i = \mathrm{poly}(n/\varepsilon)$, each a power of $2$. This has negligible influence on the complexity of this preprocessing step. Note that we can find such a primitive element in time $\mathrm{poly}(\log q)$ if $q$ is a power of $2$ and we know the factorization of $q-1$, but we do not know how to do that in time $\mathrm{poly}(\log q)$ in general. More precisely, given the factorization of $q-1$ we can test whether a given $g \in \mathbb{F}_q^{\times}$ is primitive in time $\mathrm{poly}(\log q)$ by checking whether $g^{(q-1)/p} \neq 1$ for all prime factors $p$ of $q-1$. We can exploit this in various ways. If we are fine with using randomness in the one-time preprocessing stage, then we can sample an element of $\mathbb{F}_q^{\times}$ uniformly at random, test whether it is primitive, and repeat if not. If we insist on a deterministic algorithm, then we can combine the testing procedure with algorithms of Shoup [Sho90] or Shparlinski [Shp92] which identify in time $\mathrm{poly}(\log q)$ a subset of size $\mathrm{poly}(\log q)$ in $\mathbb{F}_q$ that is guaranteed to contain a primitive element. For an alternative faster randomized algorithm, see [DD06].
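To make this concrete, the sketch below implements the randomized variant of the preprocessing step described in footnote 3: given the prime factorization of $q-1$, test candidates until a generator is found. It is a toy over a prime field, so that arithmetic stays plain modular arithmetic (the construction itself works over fields of characteristic $2$); all names are ours.

```python
import random

def is_primitive(g: int, q: int, prime_factors: list[int]) -> bool:
    """g generates the multiplicative group iff g^((q-1)/p) != 1
    for every prime p dividing q - 1."""
    return all(pow(g, (q - 1) // p, q) != 1 for p in prime_factors)

def find_primitive_element(q: int, prime_factors: list[int]) -> int:
    """Randomized search: a uniformly random element is primitive with
    noticeable probability, so few trials are needed in expectation."""
    while True:
        g = random.randrange(2, q)
        if is_primitive(g, q, prime_factors):
            return g

# Toy example: q = 257 is prime and q - 1 = 2^8, so the only prime factor is 2.
print(find_primitive_element(257, [2]))
```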
1 follows from combining modern condensers with short seeds (namely, the lossless condenser of Kalev and Ta-Shma [KT22] and the lossy Reed-Solomon-based condenser of Guruswami, Umans, and Vadhan [GUV09]) with a careful combination and instantiation of classical recursive approaches developed by Srinivasan and Zuckerman [SZ99] and in [GUV09]. It readily implies, among other things, an -time privacy amplification protocol where only bits need to be communicated over the one-way authenticated public channel and almost all the min-entropy can be extracted (after a reasonable one-time preprocessing step if the min-entropy bound is very small).
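The sketch shows the shape of the generic one-way protocol, with a toy stand-in for the extractor; any strong seeded extractor can be plugged into `toy_strong_ext`, and strongness is exactly what makes it safe to send the seed in the clear. The helper names and the toy hash are ours, not the paper's construction.

```python
import hashlib
import secrets

def toy_strong_ext(w: bytes, seed: bytes, out_len: int) -> bytes:
    # Stand-in for a strong seeded extractor (NOT one of the paper's
    # constructions): hash the weak secret together with the seed.
    return hashlib.shake_256(seed + w).digest(out_len)

def alice(w: bytes, seed_len: int, out_len: int):
    seed = secrets.token_bytes(seed_len)       # fresh uniform seed
    key = toy_strong_ext(w, seed, out_len)
    return seed, key                           # only `seed` is transmitted

def bob(w: bytes, seed: bytes, out_len: int) -> bytes:
    return toy_strong_ext(w, seed, out_len)    # no message back is needed

w = b"shared weak secret with enough min-entropy"
seed, key_alice = alice(w, seed_len=16, out_len=32)
assert bob(w, seed, out_len=32) == key_alice   # both ends derive the same key
```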
A new non-recursive construction.
As a conceptual contribution which may be of independent interest, we present a new “non-recursive” construction of extractors with seed length $O(\log(n/\varepsilon))$ and output length $\Omega(k)$ that is computable in nearly-linear time when the error $\varepsilon$ is not too small, and avoids the complicated recursive procedures from [SZ99, GUV09]. We believe this to be a conceptually better approach towards constructing seeded extractors, and we discuss it in more detail in the technical overview.
Faster instantiations of Trevisan’s extractor.
One of the most widely-used explicit seeded extractors is Trevisan’s extractor [Tre01, RRV02]. While by now we have extractors with better parameters, one of its main advantages is that it is one of the few examples of extractors, and in a sense the best one, which are known to be quantum proof.444An extractor $\mathrm{Ext}$ is quantum proof if its output is close to uniform even in the presence of a quantum adversary that has some (bounded) correlation with the source $X$. A bit more formally, $\mathrm{Ext}$ is quantum-proof if for all classical-quantum states $\rho_{XE}$ (where $E$ is a quantum state correlated with $X$) with $H_\infty(X \mid E) \geq k$, and a uniform seed $Y$, it holds that $\rho_{\mathrm{Ext}(X,Y)YE} \approx_\varepsilon \rho_{U_m} \otimes \rho_{YE}$. See [DPVR12] for more details.
Trevisan’s extractor uses two basic primitives: combinatorial designs (when more than one output bit is desired), and binary list-decodable codes. A standard instantiation of such suitable codes goes by concatenating a Reed–Solomon code with a Hadamard code, and this is also what is considered in [FWE+23, FYEC24]. As they also observe, this gives a nearly-linear time construction when the output length $m$ is small. In fact, by leveraging fast multipoint evaluation, one can also get a nearly-linear time construction for any output length $m$, although this was not noted in previous works.555For a rigorous statement on fast multipoint evaluation, see Lemma 2.1.
Our main contribution in this direction is an alternative instantiation of Trevisan’s extractor that can be computed in truly linear time on a RAM in the logarithmic cost model, for any output length $m \leq n/\mathrm{polylog}(n)$.
Theorem 2.
There exists an instantiation of Trevisan’s extractor, set to extract $m$ bits with any error $\varepsilon > 0$, that is computable in:
1. Time $O(n)$ after a preprocessing step running in time $\widetilde{O}(m)$, on a RAM in the logarithmic cost model. In particular, there exists a universal constant $c$, such that whenever $m \leq n/\log^c n$, the instantiation runs in time $O(n)$, without the need for a preprocessing step.
2. Time $\widetilde{O}(n)$ in the Turing model.
We note that one interesting instantiation of the above theorem is when Trevisan’s extractor is set to output $m = k^{1-\delta}$ bits for a constant $\delta \in (0,1)$. In this setting, Trevisan’s extractor requires a seed of length $O(\log^2(n/\varepsilon)/\log n)$ when $k = n^{\Omega(1)}$, and, as long as $\varepsilon$ is not too tiny, we get truly-linear runtime.
1.2 Other Related Work
Besides the long line of work focusing on improved constructions of explicit seeded extractors and mentioned in the introduction above, other works have studied randomness extraction in a variety of restricted computational models. These include extractors computable by streaming algorithms [BRST02], local algorithms [Lu02, Vad04, BG13, CL18], AC0 circuits [GVW15, CL18, CW24], AC0 circuits with a layer of parity gates [HIV22], NC1 circuits [CW24], and low-degree polynomials [ACG+22, AGMR24, GGH+24]. Moreover, implementations in various restricted computational models of other fundamental pseudorandomness primitives such as $k$-wise independent and $\varepsilon$-biased generators, which often play a key role in constructions of various types of extractors, have also been independently studied (see [HV06, Hea08, CRSW13, MRRR14] for a very partial list).
As mentioned briefly above, some works have also focused on constructing seeded extractors computable in time $O(n\log n)$ motivated by applications in privacy amplification for quantum key distribution. Such constructions are based on hash functions, and are thus far restricted to seed length at least as long as the output. The work of Hayashi and Tsurumaru [HT16] presents an extensive discussion of such efforts. We also mention that nearly-linear time extractors with very short seed, in the regime of high min-entropy and constant error, were given in [DMOZ22], with applications in derandomization.
1.3 Technical Overview
In a nutshell, we obtain Theorem 1 by following two standard high-level steps:
1. We apply a randomness condenser with small seed length to the original $n$-bit weak source $X$ to obtain a shorter output $X'$ that is close to a source with high min-entropy rate.
2. We apply a seeded extractor tailored to high min-entropy sources with small seed length to $X'$ to obtain a long output that is close to uniform.
To realize this approach, we need to implement each of these steps in nearly-linear time (possibly after a reasonable one-time preprocessing step). We briefly discuss how we achieve this, and some pitfalls we encounter along the way.
Observations about nearly-linear time condensers.
In order to implement Item 1, we need to use fast condensers with short seeds. Luckily for us, some existing state-of-the-art constructions of condensers already satisfy this property, although, to the best of our knowledge, this has not been observed before. We argue this carefully in Section 3.3.
For example, the “lossy Reed–Solomon condenser” from [GUV09] interprets the source as a polynomial $f \in \mathbb{F}_q[X]$ of bounded degree and the seed as an element $y$ of $\mathbb{F}_q$, and outputs $(f(y), f(\zeta y), \ldots, f(\zeta^{m'-1} y))$, for an appropriate output length $m'$ and field size $q$, with $\zeta$ a primitive element of $\mathbb{F}_q$. Evaluating the condenser corresponds to evaluating the same polynomial on multiple points in $\mathbb{F}_q$. This is an instance of the classical problem of multipoint evaluation in computational algebra, for which we know fast and practical algorithms (e.g., see [vzGG13, Chapter 10] or Lemma 2.1) running in time $\widetilde{O}(n)$, since the degree of $f$, the number of evaluation points, and the bit length of field elements are all small enough.
A downside of this condenser is that it requires knowing a primitive element of $\mathbb{F}_q$ with $q = \mathrm{poly}(n/\varepsilon)$. As discussed above, if we know the factorization of $q-1$ and $q$ is a power of $2$, then we can find such a primitive element in time $\mathrm{poly}(\log q)$. Beyond that, having access to such primitive elements, which only need to be computed once independently of the source and seed, is reasonable in practice. Therefore, we may leave this as a one-time preprocessing step.
The lossless “KT condenser” from [KT22] has a similar flavor. It interprets the source as a polynomial $f$ and the seed as an evaluation point $y$, and outputs $(f(y), f'(y), \ldots, f^{(m'-1)}(y))$, for some appropriate $m'$. The problem of evaluating several derivatives of the same polynomial at the same point (sometimes referred to as Hermite evaluation) is closely related to the multipoint evaluation problem above, and can also be solved in time $\widetilde{O}(n)$.666Interestingly, recent works used other useful computational properties of the KT condenser. Cheng and Wu [CW24] crucially use the fact that the KT condenser can be computed in NC1. Doron and Tell [DT23] use the fact that the KT condenser is logspace computable for applications in space-bounded derandomization. Evaluating the KT condenser does not require preprocessing. On the other hand, it only works when the min-entropy satisfies $k \geq C\log^2(n/\varepsilon)$ for a large constant $C$, where $n$ is the source length and $\varepsilon$ the target error of the condenser.
The “ideal” approach to seeded extraction from high min-entropy sources.
We have seen that there are fast condensers with short seeds. It remains to realize Item 2. Because of the initial condensing step, we may essentially assume that our $n$-bit weak source $X$ has min-entropy $k \geq (1-\beta)n$, for an arbitrarily small constant $\beta > 0$. In this case, we would like to realize in time $\widetilde{O}(n)$ and with overall seed length $O(\log(n/\varepsilon))$ what we see as the most natural approach to seeded extraction from high min-entropy sources:
1. Use a fresh short seed to transform $X$ into a block source $X_1 \circ X_2 \circ \cdots \circ X_t$ with geometrically decreasing blocks, where $\circ$ denotes string concatenation. A block source has the property that each block $X_i$ has good min-entropy even conditioned on the values of blocks $X_1, \ldots, X_{i-1}$.
2. Perform block source extraction on $X_1 \circ \cdots \circ X_t$ using another fresh short seed. Due to its special structure, we can extract a long random string from $X$ using only the (small) seed length associated with extracting randomness from the smallest block $X_t$, which has length $O(\log(n/\varepsilon))$.
The classical approach to Item 2, where we iteratively apply extractors based on universal hash functions with increasing output lengths to the blocks of $X_1 \circ \cdots \circ X_t$ from right to left, is easily seen to run in time $\widetilde{O}(n)$ and requires a seed of length $O(\log(n/\varepsilon))$ if, e.g., we use the practical extractors of [TSSR11, HT16]; a short sketch of this right-to-left iteration follows. Therefore, we only need to worry about realizing Item 1.
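The sketch below shows that right-to-left iteration. Each round extracts from the current block using the seed produced by the previous round, so only the smallest block's (short) seed is ever drawn fresh. `toy_ext` is a placeholder for the hash-based extractors of [TSSR11, HT16], and all names are ours.

```python
import hashlib

def toy_ext(block: bytes, seed: bytes, out_len: int) -> bytes:
    # Placeholder for a strong hash-based extractor with output length out_len.
    return hashlib.shake_256(seed + block).digest(out_len)

def block_source_extract(blocks: list[bytes], seed: bytes) -> bytes:
    """blocks[0] is the largest block, blocks[-1] the smallest; the fresh
    seed only needs to match what the smallest block requires."""
    out = seed
    for block in reversed(blocks):
        # The previous output doubles as a (longer) seed for the next block.
        out = toy_ext(block, out, 2 * len(out))
    return out

blocks = [b"x" * 64, b"y" * 16, b"z" * 4]      # geometrically decreasing
print(block_source_extract(blocks, b"s0").hex())
```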
A standard approach to Item 1 would be to use an averaging sampler to iteratively sample subsequences of $X$ as the successive blocks of the block source, following a classical strategy of Nisan and Zuckerman [NZ96] (improved by [RSW06, Vad04]). We do know averaging samplers running in time $\widetilde{O}(n)$ (such as those based on random walks on a carefully chosen expander graph). However, this approach requires a fresh seed of length $\Omega(\log(n/\varepsilon))$ per block of the block source. Since the block source will have roughly $\log n$ blocks, this leads to an overall seed of length $\Omega(\log n \cdot \log(n/\varepsilon))$, which is too much for us.
Instead, we provide a new analysis of a sampler based on bounded independence, which runs in time $\widetilde{O}(n)$ and only requires a seed of length $O(\log(n/\varepsilon))$ to create the entire desired block source. We give the construction, which may be of independent interest, in Section 3.2. The caveat of this “block source creator” is that it only works as desired when the target error satisfies $\varepsilon \geq 2^{-k^c}$ for some small constant $c > 0$. Combining these realizations of Items 1 and 2 yields the desired $\widetilde{O}(n)$-time extractor with order-optimal seed length and output length $(1-\gamma)k$ for arbitrary constant $\gamma > 0$, provided that $\varepsilon$ is not too small. See Theorem 5.1 for the formal statement.
Getting around the limitation of the ideal approach.
We saw above that combining the ideal approach to seeded extraction from high min-entropy sources with the new analysis of the bounded-independence sampler yields a conceptually simple construction with the desired properties when the error $\varepsilon$ is not too small (or alternatively, whenever the entropy guarantee $k$ is large enough). However, we would like to have $\widetilde{O}(n)$-time seeded extraction with seed length $O(\log(n/\varepsilon))$ and large output length for all ranges of parameters.
To get around this limitation of our first construction, it is natural to turn to other classical approaches for constructing nearly-optimal extractors for high min-entropy sources, such as those of Srinivasan and Zuckerman [SZ99] or Guruswami, Umans, and Vadhan [GUV09]. These approaches consist of intricate recursive procedures combining a variety of combinatorial objects, and require a careful analysis.777In our view, these approaches are much less conceptually appealing than the “ideal” approach above. We believe that obtaining conceptually simpler constructions of fast nearly-optimal extractors that work for all errors is a worthwhile research direction, even if one does not improve on the best existing parameters. However, we could not find such an approach that works as is, even when instantiated with $\widetilde{O}(n)$-time condensers and $\widetilde{O}(n)$-time hash-based extractors. In particular:
• The GUV approach [GUV09] gives explicit seeded extractors with large output length and order-optimal seed length for any min-entropy requirement $k$ and error $\varepsilon$. However, its overall runtime is significantly larger than $\widetilde{O}(n)$ whenever $\varepsilon$ is not extremely small (for example, $\varepsilon = 2^{-\mathrm{polylog}(n)}$ is not small enough).
• The SZ approach [SZ99] can be made to run in time $\widetilde{O}(n)$ and have large output length when instantiated with fast condensers, samplers, and hash-based extractors, but it is constrained to error $\varepsilon \geq 2^{-k/2^{C\log^* n}}$, where $\log^*$ is the iterated logarithm.
Fortunately, the pros and cons of the GUV and SZ approaches complement each other. Therefore, we can obtain our desired result by applying appropriately instantiated versions of the GUV and SZ approaches depending on the regime of $\varepsilon$ we are targeting.
1.4 Future Work
We list here some directions for future work:
• Remove the preprocessing step that our constructions behind Theorem 1 require when $\varepsilon$ is very small.
• On the practical side, develop concrete implementations of seeded extractors with near-optimal seed length and large output length. In particular, we think that our non-recursive construction in Section 5.1 holds promise in this direction.
1.5 Acknowledgements
Part of this research was done while the authors were visiting the Simons Institute for the Theory of Computing, supported by DOE grant # DE-SC0024124. D. Doron’s research was also supported by Instituto de Telecomunicações (ref. UIDB/50008/2020) with the financial support of FCT - Fundação para a Ciência e a Tecnologia and by NSF-BSF grant #2022644. J. Ribeiro’s research was also supported by Instituto de Telecomunicações (ref. UIDB/50008/2020) and NOVA LINCS (ref. UIDB/04516/2020) with the financial support of FCT - Fundação para a Ciência e a Tecnologia.
2 Preliminaries
2.1 Notation
We often use uppercase roman letters to denote sets and random variables – the distinction will be clear from context. We denote the support of a random variable $X$ by $\mathrm{supp}(X)$, and, for a random variable $X$ and a set $S$, we write $X \in S$ to mean that $X$ is supported on $S$. For a random variable $X$, we write $x \sim X$ to mean that $x$ is sampled according to the distribution of $X$. We use $U_m$ to denote a random variable that is uniformly distributed over $\{0,1\}^m$. For two strings $x$ and $y$, we denote their concatenation by $x \circ y$. Given two random variables $X$ and $Y$, we denote their product distribution by $X \times Y$ (i.e., $(X \times Y)(x, y) = \Pr[X = x] \cdot \Pr[Y = y]$). Given a positive integer $n$, we write $[n] = \{1, 2, \ldots, n\}$. For a prime power $q$, we denote the finite field of order $q$ by $\mathbb{F}_q$. We denote the base-$2$ logarithm by $\log$.
2.2 Model of Computation
We work in the standard, multi-tape, Turing machine model with some fixed number of work tapes. In particular, there exists a constant $c$ such that all our claimed time bounds hold whenever we work with at most $c$ work tapes. This also implies that our results hold in the RAM model, wherein each machine word can store integers up to some fixed length, and standard word operations take constant time. In Section 4 we will give, in addition to the standard Turing machine model bounds, an improved runtime bound dedicated to the logarithmic-cost RAM model.
2.3 Fast Finite Fields Operations
For a prime power $q$, we let $\mathrm{M}_q(n)$ be the number of field operations required to multiply two univariate polynomials over $\mathbb{F}_q$ of degree less than $n$, and $\mathrm{M}^{\mathrm{bit}}_q(n)$ be the bit complexity of such a multiplication, so $\mathrm{M}^{\mathrm{bit}}_q(n) = O(\mathrm{M}_q(n) \cdot T_q)$, where we denote by $T_q$ an upper bound on the bit complexity of arithmetic operations in $\mathbb{F}_q$. When $q$ is prime, Harvey and van der Hoeven [HvdH19, HvdH21] showed that
$$\mathrm{M}^{\mathrm{bit}}_q(n) = O\big(n \log q \cdot \log(n \log q)\big),$$
and in general, multiplication with bit complexity $n \log n \cdot 2^{O(\log^* n)}$ is known [Für09].888If $\mathbb{F}_q$ contains a suitable root of unity, one can get $\mathrm{M}_q(n) = O(n \log n)$ from the classic FFT algorithm [CT65]. For a simpler algorithm attaining the bound $\mathrm{M}_q(n) = O(n \log n \log\log n)$, see [vzGG13, Sections 8,10]. See also [HvdH22] for a widely-believed conjecture under which $\mathrm{M}_q(n) = O(n \log n)$ always holds. When $q$ is a power of $2$, we can use Schönhage’s algorithm [Sch77] to get $\mathrm{M}_q(n) = O(n \log n \log\log n)$, where we relied on the fact that addition and multiplication in $\mathbb{F}_{2^t}$ can be done in time $\widetilde{O}(t)$. Overall, when $\log q = \mathrm{polylog}(n)$, and $q$ is either a prime or a power of two, $\mathrm{M}^{\mathrm{bit}}_q(n) = n \cdot \mathrm{polylog}(n)$. We will use fast multipoint evaluation and fast computation of derivatives (together with the preceding bounds on $\mathrm{M}^{\mathrm{bit}}_q$).
Lemma 2.1 ([BM74], see also [vzGG13, Chapter 10]).
Let $n \leq q$, and let $q$ be a prime, or a power of $2$. Then, given a polynomial $f \in \mathbb{F}_q[X]$ of degree at most $n$, the following holds.
1. Given a set $A = \{\alpha_1, \ldots, \alpha_t\} \subseteq \mathbb{F}_q$, where $t \leq n$, one can compute $f(\alpha_1), \ldots, f(\alpha_t)$ in time $O(\mathrm{M}^{\mathrm{bit}}_q(n) \log n)$.
2. For $m \leq n$ and $\alpha \in \mathbb{F}_q$, one can compute the derivatives $f(\alpha), f'(\alpha), \ldots, f^{(m-1)}(\alpha)$ in time $O(\mathrm{M}^{\mathrm{bit}}_q(n) \log n)$.
Note that when $q = \mathrm{poly}(n)$, we can bound the above runtimes by $n \cdot \mathrm{polylog}(n)$.999The looser bound of $O(\mathrm{M}^{\mathrm{bit}}_q(n) \log^2 n)$ also holds for an arbitrary prime power $q$, and can be achieved with simpler algorithms than the ones cited.
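To illustrate the divide-and-conquer structure behind Item 1 of Lemma 2.1, here is a minimal sketch over a prime field. For brevity it uses schoolbook polynomial multiplication and division, so it costs $O(n^2)$ field operations; replacing those subroutines with FFT-based ones yields the quasi-linear bounds cited above. The prime below is only an example.

```python
P = 998244353  # an arbitrary example prime

def poly_mul(a, b):
    # Schoolbook product of coefficient lists (index = degree).
    res = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            res[i + j] = (res[i + j] + x * y) % P
    return res

def poly_mod(a, b):
    # Remainder of a(X) modulo b(X), schoolbook long division.
    a = a[:]
    db, lead_inv = len(b) - 1, pow(b[-1], P - 2, P)
    for i in range(len(a) - 1, db - 1, -1):
        if a[i]:
            c = a[i] * lead_inv % P
            for j in range(db + 1):
                a[i - db + j] = (a[i - db + j] - c * b[j]) % P
    return a[:db] or [0]

def multipoint_eval(f, points):
    # Reduce f modulo subproduct trees of (X - p) factors, then recurse.
    if len(points) == 1:
        return [poly_mod(f, [-points[0] % P, 1])[0]]
    mid = len(points) // 2
    left, right = points[:mid], points[mid:]
    tree_l, tree_r = [1], [1]
    for p in left:
        tree_l = poly_mul(tree_l, [-p % P, 1])
    for p in right:
        tree_r = poly_mul(tree_r, [-p % P, 1])
    return (multipoint_eval(poly_mod(f, tree_l), left)
            + multipoint_eval(poly_mod(f, tree_r), right))

# f(X) = 1 + 2X + 3X^2 evaluated at 0, 1, 2 -> [1, 6, 17]
print(multipoint_eval([1, 2, 3], [0, 1, 2]))
```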
2.4 Statistical Distance, Entropy
We present some relevant definitions and lemmas about the statistical distance and min-entropy.
Definition 2.2 (statistical distance).
The statistical distance between two random variables $X$ and $Y$ supported on a set $\mathcal{X}$, denoted by $\Delta(X, Y)$, is defined as
$$\Delta(X, Y) = \max_{S \subseteq \mathcal{X}} \big|\Pr[X \in S] - \Pr[Y \in S]\big| = \frac{1}{2}\sum_{x \in \mathcal{X}} \big|\Pr[X = x] - \Pr[Y = x]\big|.$$
We say that $X$ and $Y$ are $\varepsilon$-close, and write $X \approx_\varepsilon Y$, if $\Delta(X, Y) \leq \varepsilon$.
Definition 2.3 (min-entropy).
The min-entropy of a random variable $X$ supported on a set $\mathcal{X}$, denoted by $H_\infty(X)$, is defined as
$$H_\infty(X) = \min_{x \in \mathcal{X}} \log\frac{1}{\Pr[X = x]}.$$
Above, and throughout the paper, we use base-$2$ logarithms.
Definition 2.4 (average conditional min-entropy).
Let $X$ and $Y$ be two random variables supported on sets $\mathcal{X}$ and $\mathcal{Y}$, respectively. The average conditional min-entropy of $X$ given $Y$, denoted by $\widetilde{H}_\infty(X \mid Y)$, is defined as
$$\widetilde{H}_\infty(X \mid Y) = -\log\left(\mathbb{E}_{y \sim Y}\left[\max_{x \in \mathcal{X}} \Pr[X = x \mid Y = y]\right]\right).$$
The following standard lemma gives a chain rule for min-entropy.
Lemma 2.5 (see, e.g., [DORS08]).
Let $X$, $Y$, and $Z$ be arbitrary random variables such that $|\mathrm{supp}(Y)| \leq 2^{\ell}$. Then,
$$\widetilde{H}_\infty(X \mid Y, Z) \geq \widetilde{H}_\infty(X \mid Z) - \ell.$$
We can turn the chain rule above into a high probability statement.
Lemma 2.6 (see, e.g., [MW97]).
Let $X$, $Y$, and $Z$ be random variables such that $|\mathrm{supp}(Y)| \leq 2^{\ell}$. Then,
$$\Pr_{(y,z) \sim (Y,Z)}\left[H_\infty(X \mid Y = y, Z = z) \geq H_\infty(X \mid Z = z) - \ell - \log(1/\delta)\right] \geq 1 - \delta$$
for any $\delta > 0$.
Definition 2.7 (smooth min-entropy).
We say that a random variable $X$ has $\delta$-smooth min-entropy at least $k$, denoted by $H^{\delta}_\infty(X) \geq k$, if there exists a random variable $X'$ such that $\Delta(X, X') \leq \delta$ and $H_\infty(X') \geq k$.
2.5 Extractors and Condensers
Definition 2.8 (-source).
We say that a random variable $X$ is an $(n, k)$-source if $X \in \{0,1\}^n$ and $H_\infty(X) \geq k$.
Definition 2.9 (block source).
A random variable $X$ is a $(k_1, \ldots, k_t)$-block source if we can write $X = X_1 \circ X_2 \circ \cdots \circ X_t$, each $X_i \in \{0,1\}^{n_i}$, where $\widetilde{H}_\infty(X_i \mid X_1, \ldots, X_{i-1}) \geq k_i$ for all $i \in [t]$. In the special case where $k_i = k$ for all $i \in [t]$, we say that $X$ is a $k$-block source.
We say that $X$ is an exact block source if $H_\infty(X_i \mid X_1 = x_1, \ldots, X_{i-1} = x_{i-1}) \geq k_i$ for any prefix $x_1, \ldots, x_{i-1}$. Lemma 2.6 tells us that any $(k_1, \ldots, k_t)$-block source is $t\delta$-close to an exact $(k_1 - \log(1/\delta), \ldots, k_t - \log(1/\delta))$-block source, for any $\delta > 0$.
Definition 2.10 (seeded extractor).
A function $\mathrm{Ext}\colon \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ is a $(k, \varepsilon)$-seeded extractor if the following holds. For every $(n, k)$-source $X$,
$$\mathrm{Ext}(X, Y) \approx_\varepsilon U_m,$$
where $Y$ is uniformly distributed over $\{0,1\}^d$ and is independent of $X$, and $U_m$ is uniformly distributed over $\{0,1\}^m$. We say that $\mathrm{Ext}$ is strong if $(\mathrm{Ext}(X, Y), Y) \approx_\varepsilon U_m \times Y$.
Furthermore, $\mathrm{Ext}$ is said to be an average-case (strong seeded) extractor if for all correlated random variables $X$ and $W$ such that $X$ is supported on $\{0,1\}^n$ and $\widetilde{H}_\infty(X \mid W) \geq k$ we have
$$(\mathrm{Ext}(X, Y), Y, W) \approx_\varepsilon (U_m, Y, W),$$
where $Y$ is uniformly distributed over $\{0,1\}^d$ and is independent of $(X, W)$, and $U_m$ is uniformly distributed over $\{0,1\}^m$ and independent of $(Y, W)$.
Remark 2.11.
By Lemma 2.6, every strong $(k, \varepsilon)$-seeded extractor is also an average-case strong $(k + \log(1/\varepsilon), 2\varepsilon)$-seeded extractor.
Definition 2.12 (condenser).
A function $\mathrm{Cond}\colon \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^{n'}$ is a $k \to_\varepsilon k'$ (seeded) condenser if the following holds. For every $(n, k)$-source $X$, $H^{\varepsilon}_\infty(\mathrm{Cond}(X, Y)) \geq k'$, where $Y$ is uniformly distributed over $\{0,1\}^d$ and is independent of $X$.
We say that $\mathrm{Cond}$ is strong if $Y \circ \mathrm{Cond}(X, Y)$ is $\varepsilon$-close to some distribution with min-entropy $d + k'$ (and note that here, necessarily, $d$ bits of entropy come from the seed). Finally, we say that $\mathrm{Cond}$ is lossless if $k' = k + d$.
We also define extractors tailored to block sources.
Definition 2.13 (block source extractor).
A function $\mathrm{BExt}\colon \{0,1\}^{n_1 + \cdots + n_t} \times \{0,1\}^d \to \{0,1\}^m$ is a strong $((k_1, \ldots, k_t), \varepsilon)$-block-source extractor if for any $(k_1, \ldots, k_t)$-block source $X = X_1 \circ \cdots \circ X_t$,
$$(\mathrm{BExt}(X, Y), Y) \approx_\varepsilon U_m \times Y,$$
where $Y$ is uniformly distributed over $\{0,1\}^d$ and is independent of $X$, and $U_m$ is uniformly distributed over $\{0,1\}^m$.
We will also require the following extractors based on the leftover hash lemma and fast hash functions. We state a result from [TSSR11] which requires seed length $O(m + \log(1/\varepsilon))$, where $m$ is the output length. It is possible to improve the seed length further, but this requires the input length to be structured [HT16].
Lemma 2.14 (fast hash-based extractors [TSSR11, Theorem 10], adapted. See also [HT16, Table I]).
For any positive integers $n$, $m$, and $k \leq n$ and any $\varepsilon \in (0,1)$ such that $m \leq k - 2\log(1/\varepsilon)$ there exists a strong $(k, \varepsilon)$-seeded extractor $\mathrm{Ext}\colon \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ with seed length $d = O(m + \log(n/\varepsilon))$. Moreover, $\mathrm{Ext}$ can be computed in time $O(n \log n)$.
Note that by appending the seed to the output of the extractor, we can get the following: There exists a constant $c$ such that for any constant $\alpha \in (0,1)$ and any $k \geq c\log(n/\varepsilon)$, there exists a strong $(k, \varepsilon)$-seeded extractor $\mathrm{Ext}\colon \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ with $m \geq (1+\alpha)d$.
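As intuition for such hash-based extractors, here is a minimal sketch of the classic leftover-hash instantiation via binary Toeplitz matrices (2-universal hashing). This is not the [TSSR11] construction: the seed here has length $n + m - 1$, and the naive loop runs in quadratic time rather than $O(n \log n)$.

```python
def toeplitz_extract(x_bits: list[int], seed_bits: list[int], m: int) -> list[int]:
    """Output the m-bit product T @ x over F_2, where T is the m x n
    Toeplitz matrix whose diagonals are given by the seed."""
    n = len(x_bits)
    assert len(seed_bits) == n + m - 1
    out = []
    for i in range(m):
        # Entry (i, j) of a Toeplitz matrix depends only on i - j.
        bit = 0
        for j in range(n):
            bit ^= seed_bits[i - j + n - 1] & x_bits[j]
        out.append(bit)
    return out

# 4-bit source, 2 output bits, 4 + 2 - 1 = 5 seed bits.
print(toeplitz_extract([1, 0, 1, 1], [1, 0, 0, 1, 1], m=2))
```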
2.6 Averaging Samplers
In this section we define averaging samplers and state some useful related results and constructions.
Definition 2.15 (averaging sampler).
We say that $\mathrm{Samp}\colon \{0,1\}^r \to [n]^t$ is a $(\gamma, \theta)$-averaging sampler if
$$\Pr_{x \sim U_r}\left[\left|\frac{1}{t}\sum_{i=1}^{t} f\big(\mathrm{Samp}(x)_i\big) - \frac{1}{n}\sum_{j=1}^{n} f(j)\right| > \theta\right] \leq \gamma$$
for every function $f\colon [n] \to [0,1]$. We say that $\mathrm{Samp}$ has distinct samples if $\mathrm{Samp}(x)$ outputs $t$ distinct elements of $[n]$ for every input $x$. The parameter $\theta$ is often referred to as the accuracy of the sampler, and $\gamma$ is its confidence parameter. Moreover, we sometimes refer to $\mathrm{Samp}(U_r)$ as a sampling distribution.
The following lemma gives guarantees on sub-sampling from a weak source using an averaging sampler.
Lemma 2.16 ([Vad04, Lemma 6.2]).
Let $\delta$ and $\theta$ be such that $0 < 3\theta \leq \delta \leq 1$ and let $\mathrm{Samp}\colon \{0,1\}^r \to [n]^t$ be a $(\gamma, \theta)$-averaging sampler with distinct samples. Then, for any $(n, \delta n)$-source $X$ and $Y$ uniformly distributed over $\{0,1\}^r$ we have that
$$(Y, X_{\mathrm{Samp}(Y)}) \approx_{\gamma + 2^{-\Omega(\theta n)}} (Y, W),$$
where $W|_{Y = y}$ is a $(t, (\delta - 3\theta)t)$-source for every $y \in \{0,1\}^r$.
The “expander random walk” sampler.
We will need the following averaging sampler based on random walks on expanders. Let $G$ be a $D$-regular graph with vertex set $[n]$. We assume that the neighborhood of each vertex is ordered in some predetermined way. Then, the associated averaging sampler $\mathrm{Samp}_G$ parses its input $x$ as $(v_0, i_1, \ldots, i_{t-1})$, where $v_0 \in [n]$ and each $i_j \in [D]$, and outputs $(v_0, v_1, \ldots, v_{t-1})$, where $v_j$ is the $i_j$-th neighbor of $v_{j-1}$ when $j \geq 1$. To ensure distinct samples, we skip repeated vertices.
The performance of $\mathrm{Samp}_G$ as an averaging sampler is determined by the spectral expansion of $G$. In fact, if $G$ has spectral expansion $\lambda$ then a direct application of the expander Chernoff bound [Gil98] gives that $\mathrm{Samp}_G$ is a $(\gamma, \theta)$-averaging sampler with $t = O\left(\frac{\log(1/\gamma)}{(1-\lambda)\theta^2}\right)$ and $r = \log n + O(t \log D)$ [Vad04, Section 8.2]. We instantiate $\mathrm{Samp}_G$ with the regular expander graphs from the following result of Alon [Alo21].
Lemma 2.17 ([Alo21, Theorem 1.2], adapted).
Fix any prime $p$ such that $p \equiv 1 \pmod{4}$. Then, there is a constant $m_0$ such that for every integer $m \geq m_0$ there exists a $(p+1)$-regular graph $G_m$ on $m$ vertices with spectral expansion $\frac{2\sqrt{p}}{p+1}(1 + o(1))$, where the $o(1)$ tends to $0$ as $m \to \infty$. Furthermore, the family $\{G_m\}_{m \geq m_0}$ is strongly explicit.
In particular, for any constant $\lambda \in (0,1)$ there exist constants $D$ and $m_0$ and a strongly explicit family of $D$-regular graphs $\{G_m\}_{m \geq m_0}$ with spectral expansion at most $\lambda$ for any $m \geq m_0$.
Taking the $\ell$-th power of a $\lambda$-spectral expander improves its expansion to $\lambda^{\ell}$. This readily gives us the following corollary.
Corollary 2.18.
For every large enough $m$, and any $\lambda \in (0,1)$, there exists a $D$-regular graph $G_m$ on $m$ vertices with spectral expansion $\lambda$, where $D = \mathrm{poly}(1/\lambda)$, and given $v \in [m]$ and $i \in [D]$, the $i$-th neighbor of $v$ can be computed in time $\mathrm{poly}(\log m, \log(1/\lambda))$.
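The corresponding random-walk sampler is straightforward given a neighbor oracle for such a graph; the sketch below also skips repeated vertices, as in the construction above. The toy oracle at the end is a cycle, which is of course not an expander, and merely illustrates the interface a strongly explicit family provides.

```python
def walk_sampler(nbr, n_vertices: int, degree: int,
                 start: int, labels: list[int]) -> list[int]:
    """Seed = start vertex + edge labels; output = distinct walk vertices."""
    v, seen, samples = start % n_vertices, set(), []
    for step in [None] + labels:
        if step is not None:
            v = nbr(v, step % degree)
        if v not in seen:          # skip repeated vertices
            seen.add(v)
            samples.append(v)
    return samples

# Toy oracle: the degree-2 cycle on 7 vertices (NOT an expander).
cycle_nbr = lambda v, j: (v + (1 if j == 0 else -1)) % 7
print(walk_sampler(cycle_nbr, 7, 2, start=3, labels=[0, 0, 1, 0]))
```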
Combining the discussion above with Lemma 2.17 (or Corollary 2.18) immediately yields the following.
Lemma 2.19 ([Vad04, Lemma 8.2], appropriately instantiated).
For every large enough integer $n$ and every $\gamma, \theta \in (0,1)$, there exists a $(\gamma, \theta)$-averaging sampler $\mathrm{Samp}\colon \{0,1\}^r \to [n]^t$ with distinct samples, with $t = O\left(\frac{\log(1/\gamma)}{\theta^2}\right)$ and $r = \log n + O(t \log(1/\theta))$. Furthermore, $\mathrm{Samp}$ is computable in time $t \cdot \mathrm{polylog}(n)$.
We can extend Lemma 2.19 to output more distinct samples while not increasing the seed length via the following simple lemma.
Lemma 2.20 ([Vad04, Lemma 8.3]).
Suppose that $\mathrm{Samp}\colon \{0,1\}^r \to [n]^t$ is a $(\gamma, \theta)$-averaging sampler with distinct samples. Then, for every integer $c \geq 1$ there exists a $(\gamma, \theta)$-averaging sampler $\mathrm{Samp}'\colon \{0,1\}^r \to [cn]^{ct}$ with distinct samples. Furthermore, if $\mathrm{Samp}$ is computable in time $T$, then $\mathrm{Samp}'$ is computable in time $O(T + ct)$.
Lemma 2.20 follows easily by parsing $[cn]$ as $[c] \times [n]$ and considering the sampler $\mathrm{Samp}'(x) = ((j, i))_{j \in [c],\, i \in \mathrm{Samp}(x)}$: for any $f\colon [cn] \to [0,1]$, the empirical average of $f$ over $\mathrm{Samp}'(x)$ equals the empirical average over $\mathrm{Samp}(x)$ of $g(i) = \frac{1}{c}\sum_{j \in [c]} f(j, i)$, whose true mean equals that of $f$. If we can compute $\mathrm{Samp}$ in time $T$, then we can compute $\mathrm{Samp}'$ in time $O(T + ct)$, as desired. The following is an easy consequence of Lemmas 2.19 and 2.20.
Lemma 2.21 ([Vad04, Lemma 8.4], with additional complexity claim).
There exists a constant $c$ such that the following holds. For every large enough $n$ and every $\gamma, \theta \in (0,1)$, there exists a $(\gamma, \theta)$-averaging sampler $\mathrm{Samp}\colon \{0,1\}^r \to [n]^t$ with distinct samples for any $t \geq c \cdot \frac{\log(1/\gamma)}{\theta^2}$, with $r = \log n + O\left(\frac{\log(1/\gamma)}{\theta^2}\log(1/\theta)\right)$. Furthermore, $\mathrm{Samp}$ is computable in time $t \cdot \mathrm{polylog}(n)$.
In particular, if $\theta$ is constant then $r = \log n + O(\log(1/\gamma))$, $t$ can be any value at least $c\log(1/\gamma)$, and $\mathrm{Samp}$ is computable in time $t \cdot \mathrm{polylog}(n)$.
2.7 Standard Composition Techniques for Extractors
We collect some useful classical techniques for composing seeded extractors.
Lemma 2.22 (boosting the output length [WZ99, RRV02]).
Suppose that for $i \in \{1, 2\}$ there exist strong $(k_i, \varepsilon_i)$-seeded extractors $\mathrm{Ext}_i\colon \{0,1\}^n \times \{0,1\}^{d_i} \to \{0,1\}^{m_i}$ running in time $T_i$, with $k_2 \leq k_1 - m_1 - \log(1/\varepsilon_3)$. Then, $\mathrm{Ext}$ given by $\mathrm{Ext}(x, (y_1, y_2)) = \mathrm{Ext}_1(x, y_1) \circ \mathrm{Ext}_2(x, y_2)$ is a strong $(k_1, \varepsilon_1 + \varepsilon_2 + \varepsilon_3)$-seeded extractor running in time $T_1 + T_2$.
Lemma 2.23 (block source extraction).
Let $X = X_1 \circ \cdots \circ X_t$ be a $(k_1, \ldots, k_t)$-block source, and let $\mathrm{Ext}_i\colon \{0,1\}^{n_i} \times \{0,1\}^{d_i} \to \{0,1\}^{m_i}$ be average-case strong $(k_i, \varepsilon_i)$-seeded extractors running in time $T_i$ with output length $m_i \geq d_{i-1}$ for all $i \in [t]$. Then, there exists a strong $((k_1, \ldots, k_t), \sum_{i=1}^t \varepsilon_i)$-block-source extractor $\mathrm{BExt}$ with seed length $d_t$ and output length $m_1$ that runs in time $\sum_{i=1}^t T_i$. If $X$ is an exact block source, then the $\mathrm{Ext}_i$’s do not need to be average-case.
We discuss how the fast hash-based extractor from Lemma 2.14 can be used to construct a fast extractor with seed length any constant factor smaller than its output length for high min-entropy sources. We need the following lemma, which is an easy consequence of the chain rule for min-entropy.
Lemma 2.24 ([GUV09, Corollary 4.16]).
If $X$ is an $(n, n - g)$-source and we write $X = X_1 \circ \cdots \circ X_t$ with $X_i \in \{0,1\}^{n_i}$ for all $i \in [t]$, then $X$ is $t\varepsilon$-close to an $(n_1 - g - \log(1/\varepsilon), \ldots, n_t - g - \log(1/\varepsilon))$-block source.
The following appears in [GUV09] without the time complexity bound. We appropriately instantiate their approach and analyze the time complexity below.
Lemma 2.25 (fast extractors with seed shorter than output [GUV09, Lemma 4.11]).
For every integer $c \geq 1$ there exists a constant $\alpha > 0$ such that for any positive integer $n$ and any $\varepsilon \geq 2^{-\alpha n}$ there exists a strong $(k, \varepsilon)$-seeded extractor $\mathrm{Ext}\colon \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ with $k = (1 - \alpha)n$, $m \geq c \cdot d$, and $d = O(\log(n/\varepsilon))$, computable in time $O(n \log n)$.
Proof.
Let $X$ be an $(n, k)$-source with $k = (1 - \alpha)n$, and let $g = n - k$. Write $X$ as $X = X_1 \circ \cdots \circ X_t$ with $X_i \in \{0,1\}^{n/t}$ for all $i \in [t]$, where $t$ is a suitable constant depending only on $c$. Then, Lemma 2.24 guarantees that $X$ is $t\varepsilon$-close to a $(k', \ldots, k')$-block source with $k' = n/t - g - \log(1/\varepsilon)$. Note that
$$k' \geq \frac{n}{2t},$$
since we have assumed that $g = \alpha n$ for a small enough constant $\alpha$ and $\varepsilon \geq 2^{-\alpha n}$. Now, let $\mathrm{Ext}_i$ be the strong $(k', \varepsilon)$-seeded extractors from Lemma 2.14 with output lengths $m_i$ and corresponding seed lengths $d_i = O(m_i + \log(n/\varepsilon))$, for a large enough constant depending only on $c$. Then, we apply block source extraction (Lemma 2.23) to $X_1 \circ \cdots \circ X_t$ with the $\mathrm{Ext}_i$’s to get the desired strong $(k, O(t\varepsilon))$-extractor $\mathrm{Ext}$ with output length $m \geq c \cdot d$ and seed length $d = O(\log(n/\varepsilon))$. Since each $\mathrm{Ext}_i$ is computable in time $O(n \log n)$, $\mathrm{Ext}$ is computable in time $O(n \log n)$. ∎
In addition to Lemma 2.22, one can potentially boost the output length of a high min-entropy extractor by first treating the source as a block source and then performing a simple block source extraction. The following lemma follows easily from Lemmas 2.23 and 2.24 (and can also be found in [Vad04, Section 6]).
Lemma 2.26.
Let $\mathrm{Ext}_1\colon \{0,1\}^{n_1} \times \{0,1\}^{d_1} \to \{0,1\}^{m_1}$ and $\mathrm{Ext}_2\colon \{0,1\}^{n_2} \times \{0,1\}^{d_2} \to \{0,1\}^{d_1}$ be $(k_1, \varepsilon_1)$- and $(k_2, \varepsilon_2)$-extractors. Then, for any $(n_1 + n_2, k)$-source $X = X_1 \circ X_2$, each $X_i \in \{0,1\}^{n_i}$, and an independent uniform $Y \in \{0,1\}^{d_2}$, we have that
$$\mathrm{Ext}_1\big(X_1, \mathrm{Ext}_2(X_2, Y)\big)$$
is $(\varepsilon_1 + \varepsilon_2 + \varepsilon')$-close to uniform, assuming that $k_1 \leq n_1 - g - \log(1/\varepsilon')$ and $k_2 \leq n_2 - g - \log(1/\varepsilon')$, where $g = n_1 + n_2 - k$. In other words, $\mathrm{Ext}(X, Y) = \mathrm{Ext}_1(X_1, \mathrm{Ext}_2(X_2, Y))$ is a $(k, \varepsilon_1 + \varepsilon_2 + \varepsilon')$-extractor. Moreover, if $\mathrm{Ext}_1$ is strong then $\mathrm{Ext}$ is also strong, and if $\mathrm{Ext}_1$ and $\mathrm{Ext}_2$ run in time $T_1$ and $T_2$, respectively, then $\mathrm{Ext}$ runs in time $T_1 + T_2$.
3 Additional Building Blocks
3.1 Fast Generation of Small-Bias Sets
A set $S \subseteq \{0,1\}^n$ is $\varepsilon$-biased if the uniform distribution over its elements is $\varepsilon$-indistinguishable from uniform by every linear test. Namely, if for every nonempty $T \subseteq [n]$ it holds that $\left|\mathbb{E}_{s \sim S}\left[(-1)^{\sum_{i \in T} s_i}\right]\right| \leq \varepsilon$. We say that a linear code $C \subseteq \{0,1\}^n$ is $\varepsilon$-balanced if the relative Hamming weight of each nonzero codeword lies in $\left[\frac{1-\varepsilon}{2}, \frac{1+\varepsilon}{2}\right]$. It is known that these two objects are essentially the same: $S$ is $\varepsilon$-biased if and only if the matrix whose rows are the elements of $S$ is a generator matrix of an $\varepsilon$-balanced code.
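As a sanity check on the definition, the following brute-force routine computes the bias of a small set by iterating over all nonempty linear tests; it is exponential in the dimension and only meant for toy examples.

```python
from itertools import product

def bias(vectors: list[tuple[int, ...]]) -> float:
    """max over nonempty tests T of |E_v[(-1)^{<T, v>}]|."""
    n = len(vectors[0])
    worst = 0.0
    for test in product([0, 1], repeat=n):
        if not any(test):
            continue
        signed_sum = sum((-1) ** sum(t & v for t, v in zip(test, vec))
                         for vec in vectors)
        worst = max(worst, abs(signed_sum) / len(vectors))
    return worst

print(bias([(0, 0), (0, 1), (1, 0), (1, 1)]))  # 0.0: all of F_2^2 is 0-biased
print(bias([(0, 0), (1, 1)]))                  # 1.0: the test {1,2} is constant
```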
One prominent way of constructing $\varepsilon$-balanced codes is via distance amplification: namely, starting with a code of some constant bias and, using a parity sampler, amplifying its bias. We will use a specific, simple instantiation of a parity sampler – the random walk sampler.
Lemma 3.1 (RWs amplify bias [Ta-17]101010A similar argument was suggested already by Rozenman and Wigderson (see [Bog12]).).
Let $C \subseteq \{0,1\}^n$ be an $\varepsilon_0$-balanced code, let $G$ be a $D$-regular $\lambda$-spectral expander on vertex set $[n]$, and for an even $\ell$, let $W_\ell$ be the set of walks of length $\ell$ on $G$, noting that $|W_\ell| = n \cdot D^{\ell}$. Define $C' \subseteq \{0,1\}^{|W_\ell|}$ such that
$$C'(c)_w = \bigoplus_{i=0}^{\ell} c_{w_i},$$
where the codeword $C'(c)$ at location $w = (w_0, \ldots, w_\ell)$ is given by the XOR of the coordinates of $c \in C$ along the walk $w$.
Then, $C'$ is $\varepsilon$-balanced, for
$$\varepsilon = (\varepsilon_0 + 2\lambda)^{\lfloor \ell/2 \rfloor}.$$
For the base code $C$, we will use the Justesen code.
Lemma 3.2 ([Jus72]).
There exist constants $\varepsilon_0 < 1$ and $\rho > 0$ such that there exists an explicit family of codes $\{C_n\}$, each of which has block length $n$, rate $\rho$, and is $\varepsilon_0$-balanced. Moreover, given $x$, $C_n(x)$ can be computed in time $O(n \log n)$.
Proof.
The parameters of the codes follow from the original construction, so we will just show that the code is efficiently computable. Given a message $x$, we first encode it with a full Reed–Solomon code of constant relative distance over a field of characteristic $2$. By Lemma 2.1, this can be done in time $O(n \log n)$. Then, we encode each Reed–Solomon symbol with the corresponding inner code from the Wozencraft ensemble, which amounts to a single field multiplication per symbol. This takes time $O(n \log n)$ as well. ∎
Corollary 3.3.
There exist a constant $c \geq 1$ and an explicit family of balanced codes $\{C_{n,\varepsilon}\}$, such that for every $n$ and any $\varepsilon > 0$, $C_{n,\varepsilon}\colon \{0,1\}^n \to \{0,1\}^{\bar{n}}$ is $\varepsilon$-balanced of rate $(\varepsilon/c)^{c}$, and given $x \in \{0,1\}^n$, any $t$ bits of $C_{n,\varepsilon}(x)$ can be computed in time $O(n \log n) + t \cdot \mathrm{poly}\!\left(\frac{\log n}{\varepsilon}\right)$.
Moreover, for every $n$ and any $\varepsilon > 0$ there exists an explicit $\varepsilon$-biased set over $\{0,1\}^n$ generated by a function computable in time $O(n \log n)$.
Proof.
Let be the -balanced code guaranteed to us by Lemma 3.2, and let be the -regular -spectral expander of Corollary 2.18, instantiated with (so ). Letting be the amplified code of Lemma 3.1 set with
the lemma tells us that it is balanced. Given and , computing amounts to XORing coordinates of determined by , which indexes a random walk over . Computing takes time, and computing a length- walk over takes time. The corollary then follows, observing that , and that
For the “Moreover” part, recall that we can take the rows of the generator matrix of as our -biased set . Thus, for any , we can compute as follows: Compute the corresponding random walk on , and then, for any , is obtained by XORing the bits of indexed by the -th random walk. Observing that each bit of can be computed in time ,111111Indeed, each coordinate of is a bit in the encoding of for some , where . the runtime of is
Remark 3.4.
Instead of using Justesen’s code from Lemma 3.2 as our inner code, we can instead use the linear-time encodable code of Spielman [Spi96]. While it is not stated as a balanced code, but rather as a code with constant relative distance, one can verify that its bias can also be bounded as above. The construction is more involved than Justesen’s. However, in the logarithmic-cost RAM model, in which basic register operations over $O(\log n)$-bit registers count as a single time step, Spielman’s code can be implemented in time $O(n)$.
3.2 A Sampler from Bounded Independence
Recall that a distribution $Z$ over $\Sigma^t$ is a $(\delta, k)$-wise independent distribution if for every $k$ distinct indices $i_1, \ldots, i_k \in [t]$ it holds that $(Z_{i_1}, \ldots, Z_{i_k})$ is $\delta$-close to the uniform distribution over $\Sigma^k$. Given our efficiently generated small-bias spaces, we can efficiently generate almost $k$-wise independent sample spaces as well.
Lemma 3.5.
For any positive integers $t$, $q$, and $k \leq t$, and any $\delta > 0$, there exists an explicit $(\delta, k)$-wise independent generator $G\colon \{0,1\}^{\ell} \to [q]^t$ with $\ell = O(k \log(tq) + \log(1/\delta))$. That is, the distribution formed by picking $x \sim U_\ell$ and outputting $G(x)$ is $(\delta, k)$-wise independent over $[q]^t$. Moreover,
1. Given $x$, the output $G(x)$ is computable in time $t \cdot \mathrm{poly}\log(tq/\delta)$.
2. Assume that is such that . Then, with probability at least over , has at least distinct elements.
Proof.
Let , and set and . Let denote the -biased set that is guaranteed to us by Corollary 3.3 and let be the set that corresponds to treating each consecutive bits as an element of .121212Since we won’t care about optimizing the dependence on , we do not pre-encode using a bounded-independence generator (as in, say, [NN90, AGHP92]). By the XOR lemma, is -biased over . Moreover, , and each element of can be generated in time
where the last equality follows since we can always assume that . Notice that ignoring the last symbols of each element of still preserves the above properties, which indeed gives rise to an efficiently samplable -wise independent sample space over .
Next, we argue that most samples contain mostly distinct elements. Towards this end, let $Z = (Z_1, \ldots, Z_t)$ be our $(\delta, k)$-wise independent distribution, and let $D_i$ denote the indicator random variable that is $1$ if and only if $Z_i$ is a duplicate element (namely, there exists $j \neq i$ such that $Z_j = Z_i$). We are looking to bound $\sum_{i \in [t]} D_i$ with high probability.
Claim 3.6.
Assume that and for some . Then, for any distinct , it holds that .
Proof.
Fix indices , where each . The probability that for all is at most , since this event depends on at most random variables. Union-bounding over all choices of -s incurs a multiplicative factor of so overall, . ∎
Now, Claim 3.6 is sufficient to give us good tail bounds (see, e.g., [HH15, Section 3]). In particular, denoting $D = \sum_{i \in [t]} D_i$, there exists a universal constant such that
which implies Item 2. Finally, we need to argue that we can also handle the case where $q$ is not a power of $2$. In this case, we can take our $\varepsilon$-biased set to be over slightly more bits, and each consecutive block of bits is mapped to $[q]$ by simply taking the corresponding integer modulo $q$. The correctness can be found, e.g., in [Rao07]. ∎
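For contrast with the small-bias-based generator above, recall the textbook way to get (exactly) $k$-wise independence: evaluate a uniformly random polynomial of degree less than $k$ over $\mathbb{F}_q$ at fixed distinct points. The sketch below implements that classical generator (not Lemma 3.5 itself), with the seed being the $k$ random coefficients.

```python
import random

def kwise_sample(k: int, q: int, n_points: int) -> list[int]:
    """Evaluations of a random degree-(k-1) polynomial over F_q (q prime)
    at the points 0, 1, ..., n_points - 1: any k outputs are jointly uniform."""
    assert n_points <= q
    coeffs = [random.randrange(q) for _ in range(k)]  # the seed: k log q bits
    def f(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation
            acc = (acc * x + c) % q
        return acc
    return [f(i) for i in range(n_points)]

print(kwise_sample(k=3, q=11, n_points=7))  # any 3 of these are jointly uniform
```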
Towards introducing our sampler, we will need the following tail bound for $k$-wise independent random variables.
Lemma 3.7 ([XZ24]).
Let $Z$ be a $(\delta, k)$-wise independent distribution over $[n]^t$, and fix some accuracy $\theta > 0$. Then, $Z$ is also a $(\gamma, \theta)$-sampling distribution, where
$$\gamma = O\!\left(\left(\frac{k}{\theta^2 t}\right)^{k/2}\right) + \delta \cdot n^{O(k)}.$$
While the error in Item 2 above is small, it is not small enough for us to simply combine Lemmas 3.7 and 3.5, and we will need to do a mild error reduction. We do this via random walks on expanders and discarding repeating symbols, as was also done in [Vad04]. This gives us the following bounded-independence based sampler.
Lemma 3.8.
For any positive integers $n$ and $t$, any $\gamma > 0$, and any constant accuracy $\theta > 0$ satisfying a mild compatibility condition, there exists an explicit $(\gamma, \theta)$-sampler $\mathrm{Samp}\colon \{0,1\}^r \to [n]^t$ with $r = O(\log(n/\gamma))$, that satisfies the following additional properties.
1. Every output of $\mathrm{Samp}$ contains distinct symbols of $[n]$, and,
2. Given $x$, $\mathrm{Samp}(x)$ is computable in time $t \cdot \mathrm{polylog}(n)$.
Proof.
Set to be the smallest integer such that , set , , and . Notice that and . Let
be the -wise independent generator guaranteed to us by Lemma 3.5, with . By Lemma 3.7, is a sampling distribution, where
Also, we know from Lemma 3.5 that with probability at least , each sample from has at least distinct symbols, using the fact that . Conditioned on seeing at least distinct symbols, as a sampling distribution, when we remove the non-distinct elements, has confidence and accuracy (where the second comes from the fact that symbols were removed).
Next, in order to improve the probability of sampling a good sequence, let be the -regular -spectral expander of Corollary 2.18, instantiated with , so for some universal constant . Write for , where for some constant soon to be determined. Given , let denote the corresponding random walk over . Our sampler , on input , computes and outputs the first sequence with at least distinct symbols. If no such sequence was found, simply outputs (in which case we say it failed). By the expander hitting property (see, e.g., [Vad12, Section 4]), fails with probability at most
over , upon choosing the appropriate constant . We then have that is indeed a sampling distribution, that can be generated using a seed of length . In terms of runtime, computing can be done in time
and computing the sequences themselves takes time. Observing that , the proof is concluded. ∎
We will need to somewhat extend Lemma 3.8 and use the simple, yet crucial, property of our bounded-independence sampling: a subset of the coordinates of a $k$-wise independent distribution with distinct samples is itself a $k$-wise independent distribution with distinct samples.131313We note that the distinct-samples sampler given in [Vad04] does not seem to enjoy a similar property. Thus, if we wish to sample multiple times, with different sample counts, we can use one sample from a sampler that outputs the maximal number of coordinates, and truncate accordingly to create the other samples. We only need to note that: (1) the sampling parameters are determined by the different sample counts (and in particular, the smallest should be large enough), and (2) the total number of samples needs to be small enough compared to $n$, so that we can get enough distinct symbols. We summarize this observation in the next lemma.
Lemma 3.9.
For any positive integers $n$ and $t_1 \leq t_2 \leq \cdots \leq t_b$, any $\gamma > 0$, and any constant accuracy $\theta > 0$ such that the conditions of Lemma 3.8 hold, there exists an explicit function $\mathrm{Samp}\colon \{0,1\}^r \to [n]^{t_1} \times \cdots \times [n]^{t_b}$ with $r = O(\log(n/\gamma))$ that satisfies the following.
1. For any $i \in [b]$, the function $\mathrm{Samp}_i$, that on input $x$ outputs the $i$-th component of $\mathrm{Samp}(x)$, is a $(\gamma, \theta)$-sampler, and each sample contains distinct symbols.
2. On input $x$, $\mathrm{Samp}(x)$ can be computed in time $t_b \cdot \mathrm{polylog}(n)$.
3.3 Nearly-Linear Time Condensers
We first give the condenser based on multiplicity codes, due to Kalev and Ta-Shma [KT22].
Theorem 3.10 (the lossless KT condenser, [KT22]).
For any constant $\alpha > 0$ the following holds, for every $n$, and any $k$ and $\varepsilon > 0$ satisfying $k \geq C\log^2(n/\varepsilon)$ for a large enough constant $C$. There exists an explicit strong $k \to_{\varepsilon} k + d$ condenser
$$\mathrm{KT}\colon \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^{n'}$$
where $d = O(\log(n/\varepsilon))$ and $n' \leq (1 + \alpha)k$. Note that the output entropy rate satisfies $(k + d)/n' \geq \frac{1}{1+\alpha}$. Moreover, given $x$ and $y$, the output $\mathrm{KT}(x, y)$ can be computed in $\widetilde{O}(n)$ time.
In particular, for all $(n, k)$-sources $X$ and a $(1 - \sqrt{\varepsilon})$-fraction of seeds $y$ it holds that $\mathrm{KT}(X, y) \approx_{\sqrt{\varepsilon}} W_y$, where $W_y$ is an $(n', k)$-source. Note that the seed length is $d = O(\log(n/\varepsilon))$.
Proof.
For the first part of the theorem statement we only need to establish the construction’s runtime. Given $x$ and $y$, set a prime $p$,141414More precisely, [KT22] take $p$ to be a prime in a suitable range of size $\mathrm{poly}(n/\varepsilon)$. and interpret $x$ as a polynomial $f \in \mathbb{F}_p[X]$ of degree at most $n/\log p$, and $y$ as an element of $\mathbb{F}_p$. We can safely ignore rounding issues, which can easily be addressed. The output is the sequence of derivatives
$$\mathrm{KT}(x, y) = \big(f(y), f'(y), \ldots, f^{(m'-1)}(y)\big),$$
where $m' = n'/\log p$.
To see the “In particular” part of the theorem statement, fix an -source and note that for some such that . Let . Then, an averaging argument gives that for a -fraction of seeds we have . Since is uniformly random over , we get that , as desired. ∎
The downside of Theorem 3.10 is that it requires the entropy of the source to be $\Omega(\log^2(n/\varepsilon))$, instead of the optimal $\Omega(\log(n/\varepsilon))$. Instead, we can use a lossy condenser151515Our extractor will lose a small constant fraction of the entropy, so losing a small constant fraction of the entropy in the condensing step will not make much difference. based on Reed–Solomon codes. Unfortunately, this comes at the expense of computing a generator of a field of size $\mathrm{poly}(n/\varepsilon)$, which we do not know how to do in nearly-linear time for arbitrary field sizes. We consider it a one-time preprocessing step, as it does not depend on the inputs to the condenser.
Theorem 3.11 (the lossy RS condenser, [GUV09]).
For any constant $\alpha > 0$ the following holds, for every $n$, and any $k$ and $\varepsilon > 0$. There exists an explicit strong $k \to_{\varepsilon} k'$ condenser
$$\mathrm{RS}\colon \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^{n'}$$
where $d = O(\log(n/\varepsilon))$, $k' \geq (1 - \alpha)k$, and $n' \leq (1 + \alpha)k$. Note that the output entropy rate satisfies $k'/n' \geq \frac{1-\alpha}{1+\alpha}$. Moreover, given $x$, $y$, and a primitive element of $\mathbb{F}_q$ with $q = \mathrm{poly}(n/\varepsilon)$, the output can be computed in time $\widetilde{O}(n)$.
In particular, for all $(n, k)$-sources $X$ and a $(1 - \sqrt{\varepsilon})$-fraction of seeds $y$ it holds that $\mathrm{RS}(X, y) \approx_{\sqrt{\varepsilon}} W_y$, where $W_y$ is an $(n', (1-\alpha)k)$-source. Note that the seed length is $d = O(\log(n/\varepsilon))$.
Proof.
We set $q = \mathrm{poly}(n/\varepsilon)$ to be a power of $2$, and let $\zeta$ be the generator of the multiplicative group $\mathbb{F}_q^{\times}$ given to us as input.161616Working with fields of characteristic $2$ is not necessary, but may help in efficiently computing $\zeta$. For example, Shoup [Sho90] showed that given an irreducible polynomial of suitable degree, there exists a primitive element of $\mathbb{F}_q$ whose representation is a monic polynomial of low degree. Given $x$ and $y$, similarly to Theorem 3.10, interpret $x$ as a univariate polynomial $f \in \mathbb{F}_q[X]$ of degree at most $n/\log q$, and $y$ as an element of $\mathbb{F}_q$. The output is the sequence of evaluations
$$\mathrm{RS}(x, y) = \big(f(y), f(\zeta y), f(\zeta^2 y), \ldots, f(\zeta^{m'-1} y)\big),$$
where $m' = n'/\log q$.
The correctness proof, as well as the exact choice of parameters, are given in [GUV09, Section 6], so we proceed to bounding the runtime. Towards that end, since we rely on a specific primitive element $\zeta$, we assume that the irreducible polynomial used to construct $\mathbb{F}_q$ is known (and there are several standard choices). Computing the evaluation points $y, \zeta y, \ldots, \zeta^{m'-1} y$ can then be done naively in time $\widetilde{O}(n)$. Then, using Lemma 2.1, the evaluation can be done in time $\widetilde{O}(n)$ as well.
The “In particular” part of the theorem statement follows analogously to that of Theorem 3.10. ∎
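Analogously, the evaluation map of the RS condenser is a one-liner once the primitive element is known. The toy below works over a small prime field (where $3$ generates the multiplicative group) and evaluates naively; the real construction uses characteristic $2$ and fast multipoint evaluation (Lemma 2.1).

```python
P = 257      # toy prime; 3 generates the multiplicative group of F_257
ZETA = 3

def horner(f: list[int], x: int) -> int:
    acc = 0
    for c in reversed(f):
        acc = (acc * x + c) % P
    return acc

def rs_condense(f: list[int], y: int, m: int) -> list[int]:
    """Source = coefficient vector f, seed = y; output = evaluations of f
    at the points y, zeta*y, zeta^2*y, ..., zeta^(m-1)*y."""
    points, pt = [], y % P
    for _ in range(m):
        points.append(pt)
        pt = (pt * ZETA) % P
    return [horner(f, p) for p in points]

print(rs_condense([1, 2, 3], y=7, m=4))
```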
4 A Faster Instantiation of Trevisan’s Extractor
We first recall Trevisan’s extractor [Tre01, RRV02], $\mathrm{Tre}\colon \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$, set to some designated error $\varepsilon$. We will need the notion of weak designs, due to Raz, Reingold, and Vadhan [RRV02].
Definition 4.1 (weak design).
A collection of sets $S_1, \ldots, S_m \subseteq [d]$ is an $(\ell, \rho)$-weak design if for all $i \in [m]$ we have $|S_i| = \ell$ and
$$\sum_{j < i} 2^{|S_i \cap S_j|} \leq \rho \cdot (m - 1).$$
We will also need a $\delta$-balanced code $C\colon \{0,1\}^n \to \{0,1\}^{\bar{n}}$. The parameters of the weak design affect the extractor’s parameters and can be set in a couple of different ways. The parameter $\ell$ is set to be $\log \bar{n}$, typically $\delta$ is chosen according to $\varepsilon$, $m$, and the desired entropy $k$, and then $d$ is chosen as a function of $\ell$, $m$, and $\rho$ according to the weak design (see [RRV02]). Given $x \in \{0,1\}^n$ and $y \in \{0,1\}^d$, Trevisan’s extractor outputs
$$\mathrm{Tre}(x, y) = \bar{x}(y|_{S_1}) \circ \bar{x}(y|_{S_2}) \circ \cdots \circ \bar{x}(y|_{S_m}), \qquad (1)$$
where we denote $\bar{x} = C(x)$ and interpret each length-$\ell$ bit-string $y|_{S_i}$ as a location in $[\bar{n}]$. For the runtime analysis, it will be important to recall that $\delta$ is set to be $\mathrm{poly}(\varepsilon/m)$, with a universal constant in the exponent.
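The following sketch instantiates Equation (1) with toy ingredients: the Hadamard code in place of the balanced codes of Corollary 3.3, and disjoint seed blocks as a trivial (and wasteful) weak design. It only illustrates how the design indexes the seed and how each output bit reads one position of the encoding.

```python
def hadamard_bit(x_bits: list[int], pos_bits: list[int]) -> int:
    # The Hadamard encoding of x at position r is the inner product <x, r> mod 2.
    return sum(a & b for a, b in zip(x_bits, pos_bits)) % 2

def trevisan_toy(x_bits: list[int], seed: list[int], m: int) -> list[int]:
    ell = len(x_bits)            # Hadamard positions are ell-bit strings
    assert len(seed) >= m * ell  # disjoint sets force a long seed (toy only)
    design = [list(range(i * ell, (i + 1) * ell)) for i in range(m)]
    return [hadamard_bit(x_bits, [seed[j] for j in S]) for S in design]

# 4-bit source, 8-bit seed, 2 output bits.
print(trevisan_toy([1, 0, 1, 1], [0, 1, 1, 0, 1, 0, 0, 1], m=2))
```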
Theorem 4.2.
Trevisan’s extractor of Equation (1), set to extract $m$ bits with any error $\varepsilon$, is computable in time $\widetilde{O}(n)$.
On a RAM in the logarithmic cost model, Trevisan’s extractor is computable in time $O(n)$ with a preprocessing time of $\widetilde{O}(m)$. In particular, there exists a universal constant $c$, such that whenever $m \leq n/\log^c n$, it runs in time $O(n)$, without the need for a separate preprocessing step.
Proof.
Looking at Equation (1), note that we only need to compute $m$ coordinates of $\bar{x} = C(x)$. To compute those coordinates, we first need to compute the weak design itself. Note that this can be seen as a preprocessing step, since it only depends on the parameters of the extractor, and not on $x$ or $y$. We will use the following result.
Claim 4.3 ([FYEC24], Section A.5).
For every $\ell$ and $m$, there exists an $(\ell, \rho)$-weak design with $\rho = O(1)$ and $d = O(\ell^2)$, computable in time $m \cdot \mathrm{poly}(\ell)$.
Once we have our preprocessing step, we are left with computing the code. By Corollary 3.3, we can choose the code $C$ so that $\delta = \mathrm{poly}(\varepsilon/m)$ for some universal constant, and so $\bar{n} = n \cdot \mathrm{poly}(m/\varepsilon)$ and $\ell = \log \bar{n}$. Generating the design can then be done in time $m \cdot \mathrm{poly}(\ell) = \widetilde{O}(m)$. Now, Corollary 3.3 tells us that any $m$ bits of $C(x)$ can be computed in time
$$O(n \log n) + m \cdot \mathrm{poly}\!\left(\frac{\log n}{\delta}\right).$$
On a RAM in the logarithmic cost model, we can use the variant of $C$ that uses Spielman’s code as a base code (see Remark 3.4) and get a runtime of $O(n) + m \cdot \mathrm{polylog}(n)$ after the preprocessing. This gives a truly linear time construction whenever $m$ is at most $n/\mathrm{polylog}(n)$. ∎
We conclude by noting that there is a natural setting of parameters under which Trevisan’s extractor gives logarithmic seed and linear (or near-linear) time. When $m = k^{1-\delta}$ for a constant $\delta$ and $\varepsilon$ is polynomially small, the parameters can be set so that $d = O(\log n)$. We thus have the following corollary.
Corollary 4.4.
For every $n$, any constant $\gamma \in (0,1)$, and any constants $\delta, c > 0$, Trevisan’s extractor can be instantiated as a $(k = n^{\gamma}, \varepsilon = n^{-c})$ extractor with $d = O(\log n)$ and $m = k^{1-\delta}$, and given $x$ and $y$, $\mathrm{Tre}(x, y)$ is computable in time $\widetilde{O}(n)$ (or $O(n)$ time, depending on the model).
5 Nearly-Linear Time Extractors with Order-Optimal Seed Length
5.1 A Non-Recursive Construction
In this section, we use the various previously introduced building blocks to construct a seeded extractor with order-optimal seed length computable in time $\widetilde{O}(n)$. In a nutshell, our extractor proceeds as follows on input an $(n, k)$-source $X$:
1. Using a fresh seed, apply the lossless KT condenser from Theorem 3.10 to $X$. This yields (up to a small error) a source $X'$ of length $O(k)$ and constant entropy rate which can be arbitrarily close to $1$.
2. Using the fact that $X'$ has high min-entropy rate, use the bounded-independence sampler from Lemma 3.9 to sample subsources from $X'$ using a fresh seed. Specific properties of the bounded-independence sampler allow us to obtain a block source with a seed of length only $O(\log(n/\varepsilon))$. The number of blocks is $O(\log n)$ and the blocks have geometrically increasing lengths, up to a length threshold.
3. Now, to prepare for the hash-based iterative extraction, we need to make our blocks decreasing. Again, using a short seed of length $O(\log(n/\varepsilon))$, we transform our block source into one whose blocks are geometrically decreasing. The block lengths will vary between two thresholds governed by suitable constants.
4. Using a fresh seed, apply the fast hash-based extractor from Lemma 2.14 to perform block source extraction from the resulting block source. Noting that the smallest block has logarithmic length, the block source extraction only outputs relatively few bits. We are able to use only $O(\log(n/\varepsilon))$ random bits here, since we do not output many bits already at the beginning of the iterative extraction process, but instead output logarithmically many bits, gradually increasing the output length. (A schematic sketch of the whole pipeline follows this list.)
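Here is the promised schematic of the four steps, with every component replaced by a trivially simplified stub; it records only how the seed is split and how the stages chain together, and makes no claim about the actual guarantees or parameters.

```python
import hashlib

def split_seed(seed: bytes):
    q = len(seed) // 4
    return seed[:q], seed[q:2*q], seed[2*q:3*q], seed[3*q:]

def kt_condense(x: bytes, s: bytes) -> bytes:          # stub for Theorem 3.10
    return hashlib.shake_256(s + x).digest(len(x) // 2 or 1)

def sample_blocks(x: bytes, s: bytes) -> list[bytes]:  # stub for Lemma 5.2
    return [x[: 2 ** i] for i in range(1, len(x).bit_length())]

def subsample(blocks: list[bytes], s: bytes) -> list[bytes]:  # stub, Lemma 5.3
    # Reverse the order and keep a shrinking prefix of each block.
    return [b[: max(1, len(b) >> i)] for i, b in enumerate(reversed(blocks))]

def block_extract(blocks: list[bytes], s: bytes) -> bytes:    # stub, Lemma 5.4
    out = s
    for b in reversed(blocks):     # smallest block first, output grows
        out = hashlib.shake_256(out + b).digest(len(out) + 1)
    return out

def extract(x: bytes, seed: bytes) -> bytes:
    s1, s2, s3, s4 = split_seed(seed)          # four fresh short seeds
    x_cond = kt_condense(x, s1)                # Item 1: condense to high rate
    blocks_up = sample_blocks(x_cond, s2)      # Item 2: increasing blocks
    blocks_down = subsample(blocks_up, s3)     # Item 3: decreasing blocks
    return block_extract(blocks_down, s4)      # Item 4: iterative extraction

print(extract(b"some weak randomness from a physical source", b"seed-bytes!!").hex())
```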
These steps will culminate in the following theorem.
Theorem 5.1 (non-recursive construction).
There exists a constant $c > 0$ such that for every positive integers $n$ and $k \leq n$, any $\varepsilon \geq 2^{-k^{c}}$, and any constant $\gamma > 0$, there exists a strong $(k, \varepsilon)$ extractor
$$\mathrm{Ext}\colon \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m,$$
where $d = O(\log(n/\varepsilon))$ and $m \geq (1 - \gamma)k$. Moreover, given inputs $x$ and $y$, we can compute $\mathrm{Ext}(x, y)$ in time $\widetilde{O}(n)$.
5.1.1 Item 2: Generating the block source
Because of the initial condensing step, we will assume from here onwards that our input source $X$ is an $(n, \delta n)$-source with $\delta$ a constant arbitrarily close to $1$. In order to generate the desired block source, we first use a fresh seed as input to an appropriate instantiation of the bounded-independence sampler from Lemma 3.9. This yields a tuple of coordinates from $[n]$, such that each of its components is an appropriate averaging sampler. Then, we use these coordinates to sample subsources from $X$, and get a block source with increasing blocks.
Lemma 5.2 (sampling a block source).
There exists a deterministic procedure that, given an $(n, \delta n)$-source $X$ with $\delta$ being a constant, and:
• a constant loss parameter,
• a closeness parameter (required to satisfy a mild lower bound involving a fixed constant),
• a number of desired blocks, and
• a final, maximal, block length,
takes an independent and uniform random seed and outputs a random variable that is close to a block source whose blocks have geometrically increasing lengths. Moreover, the seed length is $O(\log(n/\varepsilon))$, and the procedure runs in time $\widetilde{O}(n)$.
Note that for any constants , and any , we can have and for some , with seed length and runtime .
Proof.
Given our , we let for . Note that for , each , so in particular
by choosing the constant appropriately. Let be the sampler of Lemma 3.9, set with and . Note that then,
and indeed can be met by, again, setting the constant appropriately. Moreover, we have that for any ,
is a sampler, where has distinct symbols. Set .
Now, Lemma 2.16, instantiated with (notice that indeed ), tells us that for every , denoting , there exists a set of bad -s of density at most , such that for any ,
has entropy rate for every . Letting , union-bounding over the bad -s tells us that is close to some such that for any , has entropy rate .
Next, we apply the chain rule for min-entropy to argue that (and hence ) is close to a block source. To do that, we apply the chain rule for min-entropy times. For simplicity, abbreviate (so note that is the longer block, , whereas is its length- suffix), so
We will argue that is a block source. Applying Lemma 2.5, we know that for any ,
Now, , so , and notice that
where we used the fact that .
The bound on the runtime follows easily, recalling that runs in time . ∎
5.1.2 Item 3: Subsampling from the block source
To apply iterative extraction, we will need our block source to have decreasing blocks. Here, we will use a sampler to subsample from each block, using the same seed across the blocks.
Lemma 5.3 (subsampling from a block source).
There exists a deterministic procedure that, given a block source with geometrically increasing blocks, and:
• a constant shrinkage parameter,
• a constant loss parameter,
• a closeness parameter, and
• an initial, maximal, block length,
takes an independent and uniform random seed and outputs a random variable that is close to a block source with geometrically decreasing blocks (assuming a mild lower bound on the block lengths). Moreover, the seed length is $O(\log(n/\varepsilon))$, and the procedure runs in time $\widetilde{O}(n)$.
Note that when and for some constants , , the procedure runs in time , and we can take any for some constant that depends on and .
Proof.
For , let , recalling that . For each , let be the distinct-samples sampler of Lemma 2.21, where and . We need to make sure that each for some universal constant , and indeed that is the case, by our constraint on . Also, and we set to be the maximal over the -s, so
We denote the corresponding samples by , and let . Setting and observing that , we get that is close to some , an exact -source. From here onwards, assume that is the exact block source, and aggregate the error.
Next, we invoke Lemma 2.16 with (notice that indeed ), and get that for every , and ,
is -close to having min-entropy . Thus, in particular, it holds if we condition on any sample from , and so we have that for every ,
where has entropy rate. This means171717In what follows, we use the fact that we can couple any two with , for any joint distribution . See, e.g., [Li15, Lemma 3.20]. that has distance
from an block source, where we used the fact that the and terms are at most , which follows from the fact that for a suitable choice of .
To establish the runtime, note that we simply apply for each , which takes
time. This concludes our lemma. ∎
5.1.3 Item 4: Applying a block source extractor
We now wish to extract from our decreasing-blocks block source, and for that we combine Lemmas 5.2 and 5.3 with the block source extraction of Lemma 2.23, which will give us a nearly linear-time logarithmic-seed extractor that outputs a relatively small number of bits. For the extractors in Lemma 2.23, we will use the fast hash-based extractors from Lemma 2.14.
Lemma 5.4.
There exists a small constant $c > 0$ such that the following holds. For every large enough $n$, any constant entropy rate $\delta$ sufficiently close to $1$, any $k \geq \delta n$, and any $\varepsilon \geq 2^{-n^{c}}$, there exists a $(k, \varepsilon)$ extractor
$$\mathrm{Ext}\colon \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m,$$
with $d = O(\log(n/\varepsilon))$. Moreover, given inputs $x$ and $y$, we can compute $\mathrm{Ext}(x, y)$ in time $\widetilde{O}(n)$.
Proof.
Let be an -source. Set , , , and . We first apply Lemma 5.2 with , , and error , where is as guaranteed from the lemma’s statement. This requires a seed of length , and in time we output a random variable which is -close to a block source. Assume that is exactly a block source, and aggregate the error.
Set , and . Set to be the constant such that . We then apply Lemma 5.3 on with that , the same , closeness and an initial block length . This gives us a random variable that is -close to a
block source, requires a seed of length , and runs in time . Again, assume that is exactly a block source, and aggregate the error.
For our next and final step, set where is the constant guaranteed by Lemma 2.14. Also, let , and will be a constant whose value will be later determined. We will use the following extractors:
• Let be the extractor guaranteed to us by Lemma 2.14. Notice that we need to satisfy . Looking forward, we will also need that . Those constraints can be satisfied by making sure that is at most , where the hidden constant depends on .
• For each , let be the extractor guaranteed to us by Lemma 2.14, where . We need to make sure that and that . To see that the latter holds, note that and that if we choose to be a small enough constant (with respect to the constant ) and to be, again, at most . Also, here too, record that , which follows easily, since the -s increase.
Everything is in place to apply our block source extraction, Lemma 2.23, on and an independent and uniform seed of length .181818Note that here, we use Lemma 2.23 with and . The slack in entropy is needed since we work with the notion of block sources that also allows average conditional min-entropy. We can thus use the fact that under such setting of parameters, every extractor is an average case extractor with only a slight loss in parameters (see, e.g., [DORS08]). We omit the easy proof. We get that outputs of length , which is close to uniform, and runs in time
To conclude, note that the overall error of our extractor is at most , and the seed length is . ∎
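For intuition, here is a minimal sketch of the generic block source extraction pattern underlying Lemma 2.23 (which we take as given): the fresh seed is used to extract from the last block, and each extractor's output is fed forward as the seed for the previous block. The callables `exts[i]` are hypothetical stand-ins for the extractors chosen above.

```python
def block_source_extract(blocks, exts, seed):
    """Generic block source extraction in the style of Lemma 2.23.
    blocks[0] is the first (highest min-entropy) block; exts[i] is a
    callable (block, seed) -> output bits. We extract from the last
    block with the fresh seed and feed each output forward as the
    seed for the previous block."""
    y = seed
    for block, ext in zip(reversed(blocks), reversed(exts)):
        y = ext(block, y)
    # y is now the extraction from the first block; variants also
    # concatenate the intermediate outputs to gain more bits.
    return y
```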
5.1.4 Improving the output length
The extractor from Lemma 5.4 only outputs bits. Here, we will use an extractor that outputs a linear fraction of the entropy but requires a (relatively) long seed, and use Lemma 2.26 to boost the output length. For , we will again use a sample-then-extract extractor; however, this time we can use independent samples to create a block source with exponentially decreasing block lengths. This setting is easier, and we can simply use the original [NZ96] construction. Since a similar construction will be analyzed later in the paper (including a time complexity analysis), we choose to employ it instead of revisiting [NZ96].
Corollary 5.5.
There exist constants and , such that for every positive integer , and any , there exists a strong extractor
where , and . Moreover, given inputs and , we can compute in time .
The correctness follows from Lemma 5.9 (without the need for a preliminary condensing step), employed with the hash functions of Lemma 2.14.
Plugging in and into Lemma 2.26 readily gives the following result.
Lemma 5.6.
There exist constants such that for every positive integer , and any , there exists an extractor where , and . Moreover, given inputs and , we can compute in time .
To boost the output length from to for any constant , we apply Lemma 2.22 a constant number of times depending only on (that is, we simply apply with independent seeds and concatenate the outputs). To go from any min-entropy to entropy rate , we first apply a condenser, either the one from Theorem 3.10 or the one from Theorem 3.11. Specifically, when , we can use Theorem 3.10, which takes time. When is smaller, we can use Theorem 3.11, but this requires an extra preprocessing step that takes time. Note that the bound on from Lemma 5.6 translates to , so we can (if needed) modify so that . This finally gives us our main theorem for this section, Theorem 5.1, apart from the strongness property, which we now discuss.
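Operationally, the output-boosting step of Lemma 2.22 referenced above is just invoking the same strong extractor with independent seeds and concatenating the outputs; a minimal sketch, with `ext` and the seed list as hypothetical stand-ins:

```python
def boost_output(ext, x, seeds):
    """Output boosting as in Lemma 2.22: run a strong extractor with c
    independent seeds and concatenate. Strongness ensures the source
    retains entropy conditioned on earlier outputs; seed length and
    error each grow by the constant factor c = len(seeds)."""
    return "".join(ext(x, s) for s in seeds)
```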
The non-recursive construction is strong.
In what follows, we refer to the itemized list at the beginning of the section. The condensing step, Item 1, is strong, since we use strong condensers. Next, inspecting the proofs of Lemmas 5.2 and 5.3, we see that both sampling procedures yield a good sample with high probability over the fixing of the seed, so Items 2 and 3 hold in a strong manner as well. Item 4 follows by applying a block source extractor, which is strong since the extraction steps output the seed. Thus, the extractor from Lemma 5.4 is in fact strong. For the output-extending phase, Lemma 2.26 readily tells us that the extractor from Lemma 5.6 is strong. Finally, we apply that extractor several times with independent seeds, and the strongness of that procedure is guaranteed by Lemma 2.22.
5.2 A Recursive Construction
In this section, we prove the following.
Theorem 5.7 (recursive construction).
For any constant there exists a constant such that the following holds. For any positive integers , , and any such that there exists a strong -seeded extractor
with seed length and output length . Furthermore,
• if , then is computable in time ;
• if , then is computable in time after a preprocessing step that corresponds to finding primitive elements of fields with orders powers of .
In a nutshell, our construction behind Theorem 5.7 works by considering two cases. If , then we instantiate the recursive approach of Srinivasan and Zuckerman [SZ99] appropriately. Otherwise, we apply the recursive approach of Guruswami, Umans, and Vadhan [GUV09].
5.2.1 The (extremely) low-error case
In this section, we consider the low-error case of Theorem 5.7, where . We instantiate the recursive approach from [GUV09, Section 4.3.3] appropriately, and analyze its time complexity. Crucially, because of our upper bound on , we will only need to run levels of their recursive approach.
In order to obtain the statement of Theorem 5.7 for output length with an arbitrarily small constant, it suffices to achieve output length and then apply Lemma 2.22 a constant number of times depending only on . Therefore, we focus on achieving output length .
Theorem 5.8.
There exist constants such that the following holds. For any positive integers and and any further satisfying , there exists a strong -seeded extractor with seed length and output length .
Furthermore, is computable in time after a preprocessing step that corresponds to finding primitive elements of fields with orders , each a power of .
Proof.
We discuss our instantiation of the recursive approach from [GUV09] in detail, as it will be relevant to the time complexity analysis. Let and for a large enough constant to be determined later. For an integer , let , which determines the number of levels in our recursion. It will be important for bounding the time complexity of this construction to observe that
(2)
because . For each , we define a family of strong -seeded extractors with when by induction on . Solving this recursion yields , provided that for a sufficiently large constant .
Base case.
For the base case , which holds when , we choose to be the -seeded extractor defined as follows. On input an -source ,
1. Apply the lossy RS strong condenser (Theorem 3.11) on , instantiated with and error . When is large enough we have , and we require a seed of length , for some constant . The corresponding output satisfies , for some -source with .
2. Let be the average-case strong -seeded extractor from Lemma 2.25 instantiated with , which requires a seed of length for some constant and has output length . The conditions for the invocation of Lemma 2.25 with are satisfied since and
where the second inequality uses the theorem’s hypothesis that with a sufficiently large constant.
We set and define . From the discussion above, we have
Therefore, the triangle inequality implies that is an average-case strong -seeded extractor. It remains to argue about the seed length, output length, and time complexity of . The seed length of is
provided that with a sufficiently large constant. The output length of is , since . Finally, both steps above take time , and so can be computed in time after a preprocessing step.
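Schematically, the base case is a two-stage composition: split the seed, condense, then extract. A minimal sketch follows, with `cond` and `ext` as hypothetical stand-ins for the lossy RS condenser of Theorem 3.11 and the hash-based extractor of Lemma 2.25, and `d1` denoting the condenser's share of the seed.

```python
def condense_then_extract(cond, ext, x, seed, d1):
    """Base-case composition: the first d1 seed bits drive the
    condenser; the remaining bits drive the extractor applied to the
    condensed, entropy-dense intermediate string."""
    y1, y2 = seed[:d1], seed[d1:]
    w = cond(x, y1)    # condense the weak source
    return ext(w, y2)  # extract from the dense intermediate source
```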
Induction step.
When , we assume the existence of the desired average-case strong extractors for all as the induction hypothesis. More precisely, we assume that for all such that there exists a family of average-case strong -seeded extractors parameterized by computable in time after a one-time preprocessing step. We proceed as follows on input an -source :
1. Apply the lossy RS strong -condenser (Theorem 3.11) on with and a seed of length . Since if is a large enough constant, by the second part of Theorem 3.11 we know that with probability at least over the choice of it holds that the corresponding condenser output is -close to some -source with . For the sake of exposition, from here onwards we work under such a good choice of the seed , and we will add the slack term to the final error.
2. Split with . By Lemma 2.24 instantiated with and and the fact that is -close to an -source, we get that is -close to an -block-source with
(3)
since for a sufficiently large constant .
3. Apply the lossy RS strong -condenser (Theorem 3.11) to with and a seed of length at most . From Item 2 and the data-processing inequality, we know that
(4)
Since is a -source for any in the support of , we conclude from Theorem 3.11 and Equation 4 that
where and for all in the support of , with . This is a valid invocation since by Equation 3. Therefore, by the second part of Theorem 3.11, with probability at least over the choice of we get that
(5)
where satisfies . Fix such a good from now onwards. As before, we will account for the probability of fixing a bad seed in the final extractor error. Then, by combining Equations 4 and 5 we get that is -close to an -block source.
4. We will now apply block source extraction to , which we recall is -close to an -block source. We instantiate Lemma 2.23 with being the strong extractor from Lemma 2.25 with source input length , min-entropy requirement , error , output length , and . This requires a seed of length . This instantiation of Lemma 2.25 is valid since and
where we used the fact that , and so . For we choose the average-case strong extractor (recall that and note that ) with input length , entropy requirement , error , output length at least , and seed length guaranteed by the induction hypothesis above.
Items 1, 2, 3 and 4 above yield a strong seeded extractor with min-entropy requirement , error (where the term comes from the two fixings of the seeds in the two condensing steps in Items 1 and 3), seed length
for some constant , and output length .
To conclude the definition of , we need to increase the output length of from to . To that end, we use Lemma 2.22. Applying Lemma 2.22 once with and with and yields a strong -seeded extractor with output length and seed length . Applying Lemma 2.22 again with for and for and yields a strong -seeded extractor with output length and seed length , which we set as . This second invocation of Lemma 2.22 is also valid, since . Note that the error , as desired.
Time complexity and final error.
It remains to analyze the time complexity and the overall error of the recursive procedure above. Evaluating requires at most eight evaluations of the condenser from Theorem 3.11, four evaluations of the fast hash-based extractor from Lemma 2.25, four evaluations of for some , and simple operations that can be done in time . This means that the overall time complexity is after a one-time preprocessing step independent of the source and seed, since by Equation 2. This preprocessing step corresponds to finding primitive elements for fields with orders powers of . Furthermore, for all , and so provided that for a large enough constant . ∎
5.2.2 The (relatively) high-error case
In this section, we consider the higher-error case, where . We instantiate the recursive approach of Srinivasan and Zuckerman [SZ99, Section 5.5] appropriately with our building blocks and analyze its complexity.
Lemma 5.9 (analogous to [SZ99, Corollary 5.10], with different instantiation and additional complexity claim).
There exist constants such that the following holds. Suppose that for any positive integers , , and some and there exists a strong -seeded extractor with seed length . Then, for any positive integers and there exists a family of strong -seeded extractors with error , seed length , and output length . Furthermore,
1. If is computable in time and , then is computable in time ;
2. If is computable in time after a preprocessing step corresponding to finding primitive elements of fields of orders , then is computable in time after a preprocessing step corresponding to finding primitive elements of fields of orders .
Proof.
We begin by setting up relevant notation:
• Let be a constant to be determined. Set and . For and , we define . Then, we define for all . The ’s will be block lengths for a block source . In particular, when performing block source extraction from we will instantiate with input length .
• Define and for all . The ’s will be output lengths for block source extraction from .
• Set . This will be the number of blocks of . We have . Furthermore, since , we also have that for all .
Let be an arbitrary -source. The extractor proceeds as follows on input :
1. Using a fresh seed of length , apply a strong -condenser to . If for an appropriately large constant , then we instantiate with the lossless KT -condenser (Theorem 3.10). Otherwise, we instantiate with the lossy RS -condenser (Theorem 3.11) instantiated with . By the second part of either Theorem 3.10 or Theorem 3.11, we get that with probability at least over the choice of it holds that is -close to an -source with . From here onwards we work under such a good fixing , and will account for the error term in the final extractor error later on.
2. We use and Lemma 2.16 to generate a block source with the geometrically decreasing block lengths defined above.
For each , let be the -averaging sampler from Lemma 2.21 with input length for some constant . We choose the constant above to be large enough so that for all , where is the constant from Lemma 2.21. To see that for (and so indeed Lemma 2.21 can be applied to obtain samples), note that
(6)
The second-to-last inequality uses the fact that
where the first inequality holds since is a hypothesis in the lemma statement. We also assume that for a constant small enough so that
where we recall that , meaning that the conditions of Lemma 2.21 are satisfied for all .
For each , let be a fresh seed of length . We set the -th block as . By Lemma 2.16 instantiated with and , we conclude that
with an absolute constant guaranteed by Lemma 2.16, where is an -source for every . We now argue how this guarantee extends to more blocks. Consider an arbitrary and fixings . Then, Lemma 2.6 with and (from the upper bound in Equation 6) implies that
except with probability at most over the choice of , which we can absorb into the statistical distance, since . Consequently, from Lemma 2.16 we get that
(7)
where is an -source for any choice of . Combining Equation 7 with the triangle inequality over all , we conclude that is -close to an exact -block-source , where .
3. We apply block source extraction (Lemma 2.23) to . More precisely, let be the strong -block-source extractor with obtained via Lemma 2.23 as follows. We instantiate with the strong extractor promised by the lemma statement with seed length . For , we instantiate as the strong -seeded extractor from Lemma 2.14 with seed length . We choose the constant to be large enough so that
as required by Lemma 2.14. This is possible since by choosing large enough we have
for all , and so for all . Furthermore, for any the output length of satisfies
where we recall that for . Finally, the output length of satisfies , where we recall that is the seed length of .
Let be a fresh seed of length . With the desired upper bound on the seed length from the lemma’s statement in mind, we note that
(8)
since . By Lemma 2.23, we get that
Applying the triangle inequality, we conclude that
We now define our final strong extractor (recall that we abbreviate ). Choose our overall seed to be and set . By the discussion above, is a strong -extractor with error (recall that we abbreviate )
for a sufficiently large constant since (where one of the terms comes from fixing the seed in the condensing step of Item 1), and seed length
provided that is large enough (again since ), as desired. We used Equation 8 to bound and obtain the last inequality.
Time complexity.
It remains to analyze the time complexity of . If with a sufficiently large constant, then Item 1 takes time . Item 2 takes time , since and each averaging sampler is computable in time . Item 3 takes time , since and each from Lemma 2.14 is computable in time . Therefore, is computable in overall time in this case.
Otherwise, if , then Item 1 takes time after a preprocessing step corresponding to finding a primitive element of with . As discussed above, Item 2 takes time . Item 3 takes time after a preprocessing step, and so is computable in overall time after a preprocessing step. Moreover, if the preprocessing step for consists of finding primitive elements of fields with orders , then by the above the preprocessing step for consists of finding primitive elements of fields with orders . ∎
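For intuition about this preprocessing step, the following sketch finds a primitive element of a prime field F_p by factoring p-1 and ruling out candidates of smaller order. It is illustrative only: the fields appearing above have orders that are powers of 2, where the analogous (and more involved) search runs over a polynomial basis for F_{2^k}.

```python
def prime_factors(m: int) -> set[int]:
    """Trial-division factorization (fine for illustration)."""
    fs, d = set(), 2
    while d * d <= m:
        while m % d == 0:
            fs.add(d)
            m //= d
        d += 1
    if m > 1:
        fs.add(m)
    return fs

def primitive_element_mod_p(p: int) -> int:
    """Find a primitive root of F_p: g is primitive iff
    g^((p-1)/q) != 1 (mod p) for every prime q dividing p-1."""
    fs = prime_factors(p - 1)
    for g in range(2, p):
        if all(pow(g, (p - 1) // q, p) != 1 for q in fs):
            return g
    raise ValueError("no primitive root found (is p prime?)")
```

For example, `primitive_element_mod_p(8191)` returns a generator of the multiplicative group of F_8191; the one-time cost is dominated by the factorization and the order checks.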
Denote by the function that iteratively applies a total of times (so , , and so on). Denote by the iterated logarithm. Then, we have the following corollary.
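As a quick illustration of these definitions, here is a sketch of iterated function application and the iterated logarithm (base 2 is assumed here for concreteness):

```python
import math

def iterate(f, i, x):
    """f applied i times: iterate(f, 0, x) = x, iterate(f, 1, x) = f(x),
    and so on."""
    for _ in range(i):
        x = f(x)
    return x

def log_star(x: float) -> int:
    """Iterated logarithm: the number of times log2 must be applied
    before the value drops to at most 1."""
    count = 0
    while x > 1:
        x = math.log2(x)
        count += 1
    return count
```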
Corollary 5.10.
There exists a constant such that the following holds. Let be any positive integer and any positive integer such that . Then, for any and any there exists a strong -seeded extractor with seed length and output length . Furthermore,
1. if , then is computable in time ;
2. if , then is computable in time after a preprocessing step which corresponds to finding primitive elements of fields of orders powers of .
Consequently, if we choose to be the largest integer such that (which satisfies ) we get a strong -seeded extractor with seed length and output length for any error . If , then is computable in time . Otherwise, is computable in time after a preprocessing step which corresponds to finding primitive elements of fields of orders .
Proof.
We iteratively apply Lemma 5.9 times. Let be the constants guaranteed by Lemma 5.9. For the first application of the lemma, we take to be the strong extractor from Lemma 2.14 with and to be defined later. The corresponding seed length is , which satisfies , and so the initial value of is . Denote by the resulting strong seeded extractor. In the second application of Lemma 5.9, we instantiate with instead to obtain a new strong seeded extractor , and so on. For each , we obtain a family of strong -seeded extractors parameterized by with output length , error
and seed length
where
The last inequality uses the fact that for all .
Recall from the corollary statement that is such that . We show by induction that for all . This is immediate for the base case , since . For the induction step, note that
as desired. This implies that
and
for all . We may assume that is large enough that for all , in which case since for all by hypothesis. Therefore, we obtain final output length
final error satisfying
where the last inequality uses that , and final seed length
We now instantiate . Note that as required for the choice of above so long as , which holds by the corollary’s hypothesis if is a large enough constant. With this choice of we get final error . In fact, we can make larger so that , in which case the final seed length satisfies
as desired.
Time complexity.
Finally, we discuss the time complexity of . Note that the initial choice for is computable in time . Therefore, if for a sufficiently large constant , then the conditions of Item 1 of Lemma 5.9 are satisfied for all applications of this lemma, and so will be computable in time . Otherwise, the condition in Item 2 holds and so is computable in time after a preprocessing step, since we always have in each application of the lemma. By Lemma 5.9, the preprocessing amounts to finding primitive elements of fields with orders . ∎
To obtain our final theorem, we use block source extraction to increase the output length of the extractor from Corollary 5.10, following a strategy of Zuckerman [Zuc97].
Theorem 5.11.
There exist constants such that the following holds. For any integers and and any there exists a strong -seeded extractor with seed length and output length . Furthermore,
1. if , then is computable in time ;
2. if , then is computable in time after a preprocessing step which corresponds to finding primitive elements of fields of orders powers of .
Proof.
Define and let be an arbitrary -source. The extractor behaves as follows on input :
1. Apply a strong -condenser to , with output min-entropy rate and seed length . If , we instantiate with the lossless KT strong -condenser (Theorem 3.10). Otherwise, we instantiate with the lossy RS strong -condenser (Theorem 3.11). By the second part of either Theorem 3.10 or Theorem 3.11, we get that with probability at least over the choice of the seed we obtain an output that is -close to an -source with . As in previous arguments, we work under such a good fixing of from here onwards and account for the probability of selecting a bad seed in the final extractor error later on.
2. Write with . Choose the constant in the theorem statement small enough so that , which means that . Then, combining Item 1 with Lemma 2.24 (instantiated with , , and ) via the triangle inequality, is -close to an -block source.
3. Apply block source extraction to . More precisely, let be the strong -seeded extractor from Corollary 5.10 instantiated with and , which yields , , and , for constants guaranteed by Corollary 5.10. Furthermore, let be the strong -seeded extractor from the “Consequently” part of Corollary 5.10 and , which yields , , and , for a constant guaranteed by Corollary 5.10. This choice of parameters ensures that . Indeed, since , to see that it suffices to check that
Since and , it is enough that
for a sufficiently large constant , which holds whenever is larger than some appropriate absolute constant. Instantiating Lemma 2.23 with and above yields a strong -block-source extractor .
Since is -close to an -block source, we conclude that
(9)
We define the output of our final strong extractor to be . Since , Equation 9 implies that
This means that is a strong -seeded extractor with seed length and output length for an absolute constant , where one of the terms in the error comes from fixing the seed in the condensing step of Item 1.
Time complexity.
Finally, we analyze the time complexity of . If , then Item 1 runs in time . In Item 3, and are both computable in time under this lower bound on , and thus so is . We conclude that runs in time . Otherwise, if , then Item 1 runs in time after a preprocessing step, and and in Item 3 run in time after a preprocessing step. Therefore, overall, runs in time after a preprocessing step. ∎
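In summary, and purely schematically, the extractor of Theorem 5.11 has the following shape; all callables, the seed split `d_cond`, and the block split point are hypothetical stand-ins for the objects fixed in the proof.

```python
def theorem_5_11_shape(cond, ext1, ext2, x, seed, d_cond, split):
    """Condense, split into two blocks, then two-block block source
    extraction (Lemma 2.23): ext2 processes the second block with the
    fresh seed bits, and its output seeds ext1 on the first block,
    which produces the long final output."""
    y_c, y_e = seed[:d_cond], seed[d_cond:]
    w = cond(x, y_c)               # Item 1: condensing
    w1, w2 = w[:split], w[split:]  # Item 2: block decomposition
    s = ext2(w2, y_e)              # Item 3: seed the outer extractor
    return ext1(w1, s)
```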
References
- [ACG+22] Omar Alrabiah, Eshan Chattopadhyay, Jesse Goodman, Xin Li, and João Ribeiro. Low-degree polynomials extract from local sources. In International Colloquium on Automata, Languages, and Programming (ICALP), pages 10:1–10:20, 2022.
- [AGHP92] Noga Alon, Oded Goldreich, Johan Håstad, and René Peralta. Simple constructions of almost -wise independent random variables. Random Structures & Algorithms, 3(3):289–304, 1992.
- [AGMR24] Omar Alrabiah, Jesse Goodman, Jonathan Mosheiff, and João Ribeiro. Low-degree polynomials are good extractors. ECCC, 2024. https://eccc.weizmann.ac.il/report/2024/093/ (manuscript).
- [Alo21] Noga Alon. Explicit expanders of every degree and size. Combinatorica, pages 1–17, 2021.
- [BBCM95] Charles H. Bennett, Gilles Brassard, Claude Crépeau, and Ueli M. Maurer. Generalized privacy amplification. IEEE Transactions on Information Theory, 41(6):1915–1923, 1995.
- [BG13] Andrej Bogdanov and Siyao Guo. Sparse extractor families for all the entropy. In Innovations in Theoretical Computer Science (ITCS), pages 553–560. ACM, 2013.
- [BM74] Allan Borodin and Robert Moenck. Fast modular transforms. Journal of Computer and System Sciences, 8(3):366–386, 1974.
- [Bog12] Andrej Bogdanov. Topics in (and out) the theory of computing: Lecture notes. https://andrejb.net/csc5060/notes/12L12.pdf, 2012. [Online; accessed October 2024].
- [BRST02] Ziv Bar-Yossef, Omer Reingold, Ronen Shaltiel, and Luca Trevisan. Streaming computation of combinatorial objects. In Annual Conference on Computational Complexity (CCC), pages 165–174. IEEE, 2002.
- [CG88] Benny Chor and Oded Goldreich. Unbiased bits from sources of weak randomness and probabilistic communication complexity. SIAM Journal on Computing, 17(2):230–261, 1988.
- [CL18] Kuan Cheng and Xin Li. Randomness extraction in AC0 and with small locality. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM), pages 37:1–37:20. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018.
- [CRSW13] L. Elisa Celis, Omer Reingold, Gil Segev, and Udi Wieder. Balls and bins: Smaller hash families and faster evaluation. SIAM Journal on Computing, 42(3):1030–1050, 2013.
- [CT65] James W. Cooley and John W. Tukey. An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation, 19(90):297–301, 1965.
- [CW24] Kuan Cheng and Ruiyang Wu. Randomness extractors in AC0 and NC1: Optimal up to constant factors. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM), pages 69:1–69:22. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2024.
- [DD06] Jacques Dubrois and Jean-Guillaume Dumas. Efficient polynomial time algorithms computing industrial-strength primitive roots. Information Processing Letters, 97(2):41–45, 2006.
- [DKSS13] Zeev Dvir, Swastik Kopparty, Shubhangi Saraf, and Madhu Sudan. Extensions to the method of multiplicities, with applications to Kakeya sets and mergers. SIAM Journal on Computing, 42(6):2305–2328, 2013.
- [DMOZ22] Dean Doron, Dana Moshkovitz, Justin Oh, and David Zuckerman. Nearly optimal pseudorandomness from hardness. J. ACM, 69(6), November 2022.
- [DORS08] Yevgeniy Dodis, Rafail Ostrovsky, Leonid Reyzin, and Adam Smith. Fuzzy extractors: How to generate strong keys from biometrics and other noisy data. SIAM Journal on Computing, 38(1):97–139, 2008.
- [DPVR12] Anindya De, Christopher Portmann, Thomas Vidick, and Renato Renner. Trevisan’s extractor in the presence of quantum side information. SIAM Journal on Computing, 41(4):915–940, 2012.
- [DT23] Dean Doron and Roei Tell. Derandomization with minimal memory footprint. In Computational Complexity Conference (CCC), pages 11:1–11:15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2023.
- [Für09] Martin Fürer. Faster integer multiplication. SIAM Journal on Computing, 39(3):979–1005, 2009.
- [FWE+23] Cameron Foreman, Sherilyn Wright, Alec Edgington, Mario Berta, and Florian J. Curchod. Practical randomness amplification and privatisation with implementations on quantum computers. Quantum, 7:969, March 2023.
- [FYEC24] Cameron Foreman, Richie Yeung, Alec Edgington, and Florian J. Curchod. Cryptomite: A versatile and user-friendly library of randomness extractors. arXiv e-prints, February 2024. https://arxiv.org/abs/2402.09481.
- [GGH+24] Alexander Golovnev, Zeyu Guo, Pooya Hatami, Satyajeet Nagargoje, and Chao Yan. Hilbert functions and low-degree randomness extractors. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM), pages 41:1–41:24. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2024.
- [Gil98] David Gillman. A Chernoff bound for random walks on expander graphs. SIAM Journal on Computing, 27(4):1203–1220, 1998.
- [GUV09] Venkatesan Guruswami, Christopher Umans, and Salil Vadhan. Unbalanced expanders and randomness extractors from Parvaresh–Vardy codes. J. ACM, 56(4), July 2009.
- [GVW15] Oded Goldreich, Emanuele Viola, and Avi Wigderson. On randomness extraction in AC0. In Conference on Computational Complexity (CCC), page 601–668. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2015.
- [Hea08] Alexander D. Healy. Randomness-efficient sampling within . Computational Complexity, 17:3–37, 2008.
- [HH15] Jan Hązła and Thomas Holenstein. Upper tail estimates with combinatorial proofs. In Symposium on Theoretical Aspects of Computer Science (STACS). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2015.
- [HIV22] Xuangui Huang, Peter Ivanov, and Emanuele Viola. Affine extractors and AC0-parity. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM), pages 9:1–9:14. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2022.
- [HT16] Masahito Hayashi and Toyohiro Tsurumaru. More efficient privacy amplification with less random seeds via dual universal hash function. IEEE Transactions on Information Theory, 62(4):2213–2232, 2016.
- [HV06] Alexander Healy and Emanuele Viola. Constant-depth circuits for arithmetic in finite fields of characteristic two. In Annual Symposium on Theoretical Aspects of Computer Science (STACS), pages 672–683. Springer, 2006.
- [HvdH19] David Harvey and Joris van der Hoeven. Faster polynomial multiplication over finite fields using cyclotomic coefficient rings. Journal of Complexity, 54:101404, 2019.
- [HvdH21] David Harvey and Joris van der Hoeven. Integer multiplication in time . Annals of Mathematics, 193(2):563–617, 2021.
- [HvdH22] David Harvey and Joris van der Hoeven. Polynomial multiplication over finite fields in time . J. ACM, 69(2):1–40, 2022.
- [Jus72] Jørn Justesen. Class of constructive asymptotically good algebraic codes. IEEE Transactions on Information Theory, 18(5):652–656, 1972.
- [KT22] Itay Kalev and Amnon Ta-Shma. Unbalanced expanders from multiplicity codes. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM), pages 12:1–12:14. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2022.
- [Li15] Xin Li. Non-malleable condensers for arbitrary min-entropy, and almost optimal protocols for privacy amplification. In Theory of Cryptography Conference (TCC), pages 502–531. Springer, 2015.
- [LRVW03] Chi-Jen Lu, Omer Reingold, Salil Vadhan, and Avi Wigderson. Extractors: Optimal up to constant factors. In Symposium on Theory of Computing (STOC), pages 602–611. ACM, 2003.
- [Lu02] Chi-Jen Lu. Hyper-encryption against space-bounded adversaries from on-line strong extractors. In Advances in Cryptology — CRYPTO, pages 257–271. Springer, 2002.
- [MPS12] Wolfgang Mauerer, Christopher Portmann, and Volkher B. Scholz. A modular framework for randomness extraction based on Trevisan’s construction. arXiv e-prints, December 2012. https://arxiv.org/abs/1212.0520.
- [MRRR14] Raghu Meka, Omer Reingold, Guy N. Rothblum, and Ron D. Rothblum. Fast pseudorandomness for independence and load balancing. In International Colloquium on Automata, Languages, and Programming (ICALP), pages 859–870. Springer, 2014.
- [MW97] Ueli Maurer and Stefan Wolf. Privacy amplification secure against active adversaries. In Advances in Cryptology — CRYPTO, pages 307–321. Springer, 1997.
- [NN90] Joseph Naor and Moni Naor. Small-bias probability spaces: Efficient constructions and applications. In Symposium on Theory of Computing (STOC), pages 213–223. ACM, 1990.
- [NT99] Noam Nisan and Amnon Ta-Shma. Extracting randomness: A survey and new constructions. Journal of Computer and System Sciences, 58(1):148–173, 1999.
- [NZ96] Noam Nisan and David Zuckerman. Randomness is linear in space. Journal of Computer and System Sciences, 52(1):43–52, 1996.
- [Rao07] Anup Rao. An exposition of Bourgain’s 2-source extractor. ECCC, 2007. https://eccc.weizmann.ac.il//report/2007/034/ (manuscript).
- [RRV02] Ran Raz, Omer Reingold, and Salil Vadhan. Extracting all the randomness and reducing the error in Trevisan’s extractors. Journal of Computer and System Sciences, 65(1):97–128, 2002.
- [RSW06] Omer Reingold, Ronen Shaltiel, and Avi Wigderson. Extracting randomness via repeated condensing. SIAM Journal on Computing, 35(5):1185–1209, 2006.
- [RT00] Jaikumar Radhakrishnan and Amnon Ta-Shma. Bounds for dispersers, extractors, and depth-two superconcentrators. SIAM Journal on Discrete Mathematics, 13(1):2–24, 2000.
- [Sch77] Arnold Schönhage. Schnelle Multiplikation von Polynomen über Körpern der Charakteristik 2. Acta Informatica, 7(4):395–398, 1977.
- [Sho90] Victor Shoup. Searching for primitive roots in finite fields. In Symposium on Theory of Computing (STOC), pages 546–554. ACM, 1990.
- [Shp92] Igor E. Shparlinski. On primitive elements in finite fields and on elliptic curves. Mathematics of the USSR-Sbornik, 71(1):41, February 1992.
- [Spi96] Daniel A. Spielman. Linear-time encodable and decodable error-correcting codes. IEEE Transactions on Information Theory, 42(6):1723–1731, 1996.
- [SU05] Ronen Shaltiel and Christopher Umans. Simple extractors for all min-entropies and a new pseudorandom generator. J. ACM, 52(2):172–216, 2005.
- [SZ99] Aravind Srinivasan and David Zuckerman. Computing with very weak random sources. SIAM Journal on Computing, 28(4):1433–1459, 1999.
- [Ta-17] Amnon Ta-Shma. Explicit, almost optimal, epsilon-balanced codes. In Symposium on Theory of Computing (STOC), page 238–251. ACM, 2017.
- [Tre01] Luca Trevisan. Extractors and pseudorandom generators. J. ACM, 48(4):860–879, July 2001.
- [TSSR11] Marco Tomamichel, Christian Schaffner, Adam Smith, and Renato Renner. Leftover hashing against quantum side information. IEEE Transactions on Information Theory, 57(8):5524–5535, 2011.
- [TU12] Amnon Ta-Shma and Christopher Umans. Better condensers and new extractors from Parvaresh-Vardy codes. In Conference on Computational Complexity (CCC), pages 309–315. IEEE, 2012.
- [TZS06] Amnon Ta-Shma, David Zuckerman, and Shmuel Safra. Extractors from Reed–Muller codes. Journal of Computer and System Sciences, 72(5):786–812, 2006.
- [Vad04] Salil Vadhan. Constructing locally computable extractors and cryptosystems in the bounded-storage model. Journal of Cryptology, 17:43–77, 2004.
- [Vad12] Salil Vadhan. Pseudorandomness. Foundations and Trends in Theoretical Computer Science, 7(1–3):1–336, 2012.
- [vzGG13] Joachim von zur Gathen and Jürgen Gerhard. Modern Computer Algebra. Cambridge University Press, 2013.
- [WZ99] Avi Wigderson and David Zuckerman. Expanders that beat the eigenvalue bound: Explicit construction and applications. Combinatorica, 19(1):125–138, 1999.
- [XZ24] Zhiyang Xun and David Zuckerman. Near-optimal averaging samplers. ECCC, 2024. https://eccc.weizmann.ac.il/report/2024/097/ (manuscript).
- [Zuc97] David Zuckerman. Randomness-optimal oblivious sampling. Random Structures & Algorithms, 11(4):345–367, 1997.