
On $\mathcal{NP}$ versus ${\rm co}\mathcal{NP}$ and Frege Systems

By Tianrong Lin

Abstract

We prove in this paper that there is a language $L_{d}$ accepted by some nondeterministic Turing machine but not by any ${\rm co}\mathcal{NP}$-machine (defined later). We then show that $L_{d}$ is in $\mathcal{NP}$, thus proving that $\mathcal{NP}\neq{\rm co}\mathcal{NP}$. The main techniques used in this paper are lazy diagonalization and the novel technique developed in the author's recent work [Lin21]. Further, since there exists an oracle $A$ such that $\mathcal{NP}^{A}={\rm co}\mathcal{NP}^{A}$, we explore what lies behind this relativized result and show that if $\mathcal{NP}^{A}={\rm co}\mathcal{NP}^{A}$ and some rational assumptions hold, then the set of all polynomial-time co-nondeterministic oracle Turing machines with oracle $A$ is not enumerable; hence the technique of lazy diagonalization is not applicable to the first half of the steps needed to separate $\mathcal{NP}^{A}$ from ${\rm co}\mathcal{NP}^{A}$. As a by-product, we obtain once again the important result that $\mathcal{P}\neq\mathcal{NP}$ [Lin21], which follows from the above outcome and the well-known fact that $\mathcal{P}={\rm co}\mathcal{P}$. Next, we show that the complexity class ${\rm co}\mathcal{NP}$ has intermediate languages, i.e., there exists a language $L_{inter}\in{\rm co}\mathcal{NP}$ that is neither in $\mathcal{P}$ nor ${\rm co}\mathcal{NP}$-complete. We also summarize other direct consequences of our main result, such as $\mathcal{NEXP}\neq{\rm co}\mathcal{NEXP}$, $\mathcal{BPP}\neq\mathcal{NEXP}$, and the nonexistence of super proof systems. Lastly, we show a lower bound for Frege proof systems, i.e., no Frege proof system can be polynomially bounded.

Table of Contents

1. Introduction

2. Preliminaries

3. All ${\rm co}\mathcal{NP}$-machines are enumerable

4. Lazy-diagonalization against All ${\rm co}\mathcal{NP}$-machines

5. Proving That $L_{d}\in\mathcal{NP}$

6. Breaking the Relativization Barrier

7. Rich Structure of ${\rm co}\mathcal{NP}$

8. Frege Systems

9. Conclusions

References

1 Introduction

It is well known that computational complexity theory is a central subfield of theoretical computer science and mathematics, which mainly concerns the efficiency of Turing machines (i.e., algorithms), or the intrinsic complexity of computational tasks; that is, it focuses on classifying computational problems according to their resource usage and on relating these classes to each other (see e.g. [A1]). In other words, computational complexity theory deals with fundamental questions such as what feasible computation is, and what can and cannot be computed with a reasonable amount of computational resources in terms of time or space (memory). There are many remarkable open problems in this area, such as the famous $\mathcal{P}$ versus $\mathcal{NP}$ problem and the unknown relation between the complexity class $\mathcal{NP}$ and the complexity class ${\rm co}\mathcal{NP}$, which have attracted many researchers. However, despite decades of effort, very little progress has been made on these important problems. Furthermore, as pointed out by Wigderson [Wig07], the desire to understand the power and limits of efficient computation led to the development of computational complexity theory, and this discipline in general, and the aforementioned famous open problems in particular, have gained prominence within the mathematics community in the past decades.

As introduced in the Wikipedia [A1], in the field of computational complexity theory, a problem is regarded as inherently difficult if its solution requires significant resources such as time and space, whatever the algorithm used. The theory formalizes this intuition, by introducing mathematical models of computing to study these problems and quantifying their computational complexity, i.e., the amount of resources needed to solve them. However, it is worth noting that other measures of complexity are also used, such as the amount of communication, which appeared in communication complexity, the number of gates in a circuit, which appeared in circuit complexity, and so forth (see e.g. [A1]). Going further, one of the roles of computational complexity theory is to determine the practical limits on what computers (the computing models) can and cannot do.

Recently, in the author's work [Lin21], we left untouched an important open problem in computational complexity, namely one of the aforementioned open problems: the $\mathcal{NP}$ versus ${\rm co}\mathcal{NP}$ problem (the unknown relation between the complexity classes $\mathcal{NP}$ and ${\rm co}\mathcal{NP}$). Naturally, one may ask: are these two complexity classes the same? Note also that there is a subfield of computational complexity theory, namely propositional proof complexity, which was initiated by Cook and Reckhow [CR79] and is devoted to the goal of proving the conjecture

\mathcal{NP}\neq{\rm co}\mathcal{NP}.

In history, the fundamental measure of time opened the door to the study of the extremely expressive time complexity class 𝒩𝒫\mathcal{NP}, one of the most important classical complexity classes, i.e., nondeterministic polynomial-time. Specifically, 𝒩𝒫\mathcal{NP} is the set of decision problems for which the problem instances, where the answer is “yes”, have proofs verifiable in polynomial time by some deterministic Turing machine, or alternatively the set of problems that can be solved in polynomial time by a nondeterministic Turing machine (see e.g. [A2]). The famous Cook-Levin theorem [Coo71, Lev73] shows that this class has complete problems, which states that the Satisfiability is 𝒩𝒫\mathcal{NP}-complete, i.e., Satisfiability is in 𝒩𝒫\mathcal{NP} and any other language in 𝒩𝒫\mathcal{NP} can be reduced to it in polynomial-time. At the same time, it is also worth noting that this famous and seminal result opened the door to research into the renowned rich theory of 𝒩𝒫\mathcal{NP}-completeness [Kar72].

On the other hand, the complexity class co𝒩𝒫{\rm co}\mathcal{NP} is formally defined as follows: For a complexity class 𝒞\mathcal{C}, its complement is denoted by co𝒞{\rm co}\mathcal{C} (see e.g. [Pap94]), i.e.,

{\rm co}\mathcal{C}=\{\overline{L}:L\in\mathcal{C}\},

where $L$ is a decision problem and $\overline{L}$ is the complement of $L$ (i.e., $\overline{L}=\Sigma^{*}-L$, assuming that the language $L$ is over the alphabet $\Sigma$). That is to say, ${\rm co}\mathcal{NP}$ is the complement of $\mathcal{NP}$. Note that the complement of a decision problem $L$ is defined as the decision problem whose answer is "yes" whenever the input is a "no" input of $L$, and vice versa. In other words, in the language of Wikipedia [A3], ${\rm co}\mathcal{NP}$ is the set of decision problems for which there exist a polynomial $p(n)$ and a polynomial-time bounded Turing machine $M$ such that for every instance $x$, $x$ is a no-instance if and only if for some possible certificate $c$ of length bounded by $p(n)$, the Turing machine $M$ accepts the pair $(x,c)$. To the best of our knowledge, the complexity class ${\rm co}\mathcal{NP}$ was introduced for the first time by Meyer and Stockmeyer [MS72] under the name "$\prod_{1}^{P}$", and Stockmeyer wrote a full paper on the polynomial hierarchy which also uses the notation ${\rm co}\mathcal{NP}$; see e.g. [Sto77].
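To make the quantifier structure of this verifier-style definition concrete, here is a minimal brute-force sketch in Python (our own illustration, not part of the cited definitions); the toy language, the verifier, and the polynomial bound are assumptions chosen only for the example.

```python
from itertools import product

def is_no_instance(x, verifier, p):
    """x is a "no"-instance iff some certificate c with |c| <= p(|x|)
    makes the verifier accept (x, c).  Exponential search, for illustration only."""
    for length in range(p(len(x)) + 1):
        for bits in product("01", repeat=length):
            if verifier(x, "".join(bits)):
                return True
    return False

# Toy coNP-style language: binary strings containing no "11" substring.
# A certificate for a no-instance is a position (given in unary) where "11" starts.
def verifier(x, c):
    i = len(c)
    return i + 1 < len(x) and x[i] == x[i + 1] == "1"

print(is_no_instance("0101", verifier, lambda n: n))  # False: "0101" is a yes-instance
print(is_no_instance("0110", verifier, lambda n: n))  # True:  "0110" is a no-instance
```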

The $\mathcal{NP}$ versus ${\rm co}\mathcal{NP}$ problem is deeply connected to the field of proof complexity. Here are some historical notes: in 1979, Cook and Reckhow [CR79] introduced a general definition of propositional proof systems and related it to mainstream complexity theory by pointing out that a system in which all tautologies have polynomial-length proofs exists if and only if the two complexity classes $\mathcal{NP}$ and ${\rm co}\mathcal{NP}$ coincide; see e.g. [CN10]. We refer the reader to [Coo00] for the importance of this research field and to [Kra95] for the motivation behind the development of this rich theory. Apart from those, Chapter 10 of [Pap94] also discusses the importance of the problem

\mathcal{NP}\overset{?}{=}{\rm co}\mathcal{NP}.

In this paper, our main goal is to settle the above important open conjecture.

In the next few paragraphs, let us review some of the interesting background, the current status, the history, and the main goals of the field of proof complexity. On the one hand, as is well known, proof theory is a major branch of mathematical logic and theoretical computer science in which proofs are treated as formal mathematical objects, facilitating their analysis by mathematical techniques (see e.g. [A4]). Some of the major areas of proof theory include structural proof theory, ordinal analysis, automated theorem proving, proof complexity, and so forth. It should be pointed out that much research also focuses on applications in computer science, linguistics, and philosophy, from which we see that the impact of proof theory is very profound. Moreover, as pointed out by Razborov [Raz04], proofs and computations are among the most fundamental concepts relevant to virtually any human intellectual activity. Both have been central to the development of mathematics for thousands of years. The effort to study these concepts themselves in a rigorous, metamathematical way, initiated in the 20th century, led to the flourishing of mathematical logic and derived disciplines.

On the other hand, according to the points of view given in [Rec76], logicians have proposed a great number of systems for proving theorems, which give certain rules for constructing proofs and for associating a theorem (formula) with each proof. More importantly, these rules are much simpler to understand than the theorems. Thus, a proof gives a constructive way of understanding that a theorem is true. Specifically, a proof system is sound if every theorem is true, and it is complete if every true statement (from a certain class) is a theorem (i.e., has a proof); see e.g. [Rec76].

Indeed, the relationship between proof theory and proof complexity is inclusion. Within the area of proof theory, proof complexity is the subfield aiming to understand and analyse the computational resources that are required to prove or refute statements (see e.g. [A5]). Research in proof complexity is predominantly concerned with proving proof-length lower and upper bounds in various propositional proof systems. For example, among the major challenges of proof complexity is showing that the Frege system, the usual propositional calculus, does not admit polynomial-size proofs of all tautologies; see e.g. [A5]. Here the size of the proof is simply the number of symbols in it, and a proof is said to be of polynomial size if it is polynomial in the size of the tautology it proves. Moreover, contemporary proof complexity research draws ideas and methods from many areas in computational complexity, algorithms and mathematics. Since many important algorithms and algorithmic techniques can be cast as proof search algorithms for certain proof systems, proving lower bounds on proof sizes in these systems implies run-time lower bounds on the corresponding algorithms; see e.g. [A5].

At the same time, propositional proof complexity is an area of study that has seen a rapid development over the last two decades, which plays as important a role in the theory of feasible proofs as the role played by the complexity of Boolean circuits in the theory of efficient computations, see e.g. the celebrated work [Raz15] by Razborov. In most cases, according to [Raz15], the basic question of propositional proof complexity can be reduced to that given a mathematical statement encoded as a propositional tautology ψ\psi and a class of admissible mathematical proofs formalized as a propositional proof system PP, what is the minimal possible complexity of a PP-proof of ψ\psi? In other words, propositional proof complexity aims to understand and analyze the computational resources required to prove propositional tautologies, in the same way that circuit complexity studies the resources required to compute Boolean functions. In the light of the point of view given in [Raz15], the task of proving lower bounds for strong proof systems like Frege or Extended Frege modulo any hardness assumption in the purely computational world may be almost as interesting; see e.g. [Raz15]. It is worth noting that, it has been observed by Razborov [Raz15] that 𝒩𝒫co𝒩𝒫\mathcal{NP}\neq{\rm co}\mathcal{NP} implies lower bounds for any propositional proof system. In addition, see e.g. [Raz03] for interesting directions in propositional proof complexity.

Historically, systematic study of proof complexity began with the seminal work of Cook and Reckhow [CR79] who provided the basic definition of a propositional proof system from the perspective of computational complexity. In particular, as pointed out by the authors of [FSTW21], the work [CR79] relates the goal of propositional proof complexity to fundamental hardness questions in computational complexity such as the 𝒩𝒫\mathcal{NP} versus co𝒩𝒫{\rm co}\mathcal{NP} problem: establishing super-polynomial lower bounds for every propositional proof system would separate 𝒩𝒫\mathcal{NP} from co𝒩𝒫{\rm co}\mathcal{NP}.

Let us return to the $\mathcal{NP}$ versus ${\rm co}\mathcal{NP}$ problem for the moment. As a matter of fact, the so-called $\mathcal{NP}$ versus ${\rm co}\mathcal{NP}$ problem is the central problem in proof complexity; see e.g. the excellent survey [Kra19]. It formalizes the question of whether or not there are efficient ways to prove the negative cases in this and in many other similar examples. Moreover, the ultimate goal of proof complexity is to show that there is no universal propositional proof system allowing efficient proofs of all tautologies, which is equivalent to showing that the computational complexity class $\mathcal{NP}$ is not closed under complement; see e.g. [Kra19].

In particular, in 1974, Cook and Reckhow showed in [CR74] that there exists a super proof system if and only if $\mathcal{NP}$ is closed under complement, that is, if and only if $\mathcal{NP}={\rm co}\mathcal{NP}$. It is, of course, open whether any super proof system exists. The above result has led to what is sometimes called "Cook's program" for proving $\mathcal{NP}\neq{\rm co}\mathcal{NP}$: prove superpolynomial lower bounds on proof lengths in stronger and stronger propositional proof systems, until they are established for all abstract proof systems. Cook's program, which can also be seen as a program to prove $\mathcal{P}\neq\mathcal{NP}$, is an attractive and plausible approach; unfortunately, it has turned out to be quite hard to establish superpolynomial lower bounds for common proof systems, see e.g. the excellent survey on proof complexity [Bus12].

In this paper, the proof systems we consider are those familiar from textbook presentations of logic, such as the Frege systems mentioned previously. It can be said that the Frege proof system is a "textbook-style" propositional proof system with Modus Ponens as its only rule of inference. In fact, prior to our proof that $\mathcal{NP}\neq{\rm co}\mathcal{NP}$, although $\mathcal{NP}\neq{\rm co}\mathcal{NP}$ was considered very likely to be true, researchers were not able to prove that some very basic proof systems are not polynomially bounded; see e.g. [Pud08]. In particular, the following open problem is listed as the first open problem in [Pud08]:

Prove a superpolynomial lower bound on the length of proofs for a Frege system (or prove that it is polynomially bounded).

However, despite decades of effort, very little progress has been made on the above mentioned open problem.

1.1 Main Results

In this paper, we explore and settle the aforementioned open problems. Our first main goal in this paper is to prove the following most important theorem:

Theorem 1

There is a language $L_{d}$ accepted by a nondeterministic Turing machine but not by any ${\rm co}\mathcal{NP}$-machine, i.e., $L_{d}\notin{\rm co}\mathcal{NP}$. Further, it can be proved that $L_{d}\in\mathcal{NP}$. That is,

\mathcal{NP}\neq{\rm co}\mathcal{NP}.

Further, there exists an oracle $A$ for which $\mathcal{NP}^{A}={\rm co}\mathcal{NP}^{A}$ by the result of [BGS75]. Just as the result that $\mathcal{P}^{A}=\mathcal{NP}^{A}$ for some oracle $A$ [BGS75] led to a strong belief that problems with contradictory relativizations are very hard to solve and are not amenable to current proof techniques, i.e., that the solutions of such problems are beyond the current state of mathematics (see e.g. [HCCRR93, Hop84]), the conclusion $\mathcal{NP}^{A}={\rm co}\mathcal{NP}^{A}$ also suggests that the problem of separating $\mathcal{NP}$ from ${\rm co}\mathcal{NP}$ is beyond the current state of mathematics. So the next result, i.e., the following theorem, breaks the so-called Relativization Barrier:

Theorem 2

Under some rational assumptions (see Section 6 below), if $\mathcal{NP}^{A}={\rm co}\mathcal{NP}^{A}$, then the set ${\rm co}NP^{A}$ of all polynomial-time co-nondeterministic oracle Turing machines with oracle $A$ is not enumerable. Thereby, ordinary diagonalization techniques (lazy diagonalization) will generally not apply to the relativized versions of the $\mathcal{NP}$ versus ${\rm co}\mathcal{NP}$ problem.

By the well-known fact that $\mathcal{P}={\rm co}\mathcal{P}$ and Theorem 1, it immediately follows that

Corollary 3

$\mathcal{P}\neq\mathcal{NP}$.

Thus, we reach the important result that 𝒫𝒩𝒫\mathcal{P}\neq\mathcal{NP} [Lin21] once again, which can be seen as a by-product of Theorem 1 and the well-known fact that 𝒫=co𝒫\mathcal{P}={\rm co}\mathcal{P}.

Let 𝒩𝒳𝒫\mathcal{NEXP} and co𝒩𝒳𝒫{\rm co}\mathcal{NEXP} be the complexity classes defined by

\mathcal{NEXP}\overset{\rm def}{=}\bigcup_{k\in\mathbb{N}_{1}}{\rm NTIME}[2^{n^{k}}]

and

{\rm co}\mathcal{NEXP}\overset{\rm def}{=}\bigcup_{k\in\mathbb{N}_{1}}{\rm coNTIME}[2^{n^{k}}],

respectively (the complexity classes NTIME{\rm NTIME} and coNTIME{\rm coNTIME} are defined below, see Section 2). Then, besides the above obvious corollary, we also have the following consequence of Theorem 1, whose proof can be similar to that of Theorem 1 (e.g., a point is setting the counter of tape 33 to count up to 2nk+12^{n^{k+1}} in the proof of Theorem 4.1):

Corollary 4

$\mathcal{NEXP}\neq{\rm co}\mathcal{NEXP}$.

Since $\mathcal{EXP}={\rm co}\mathcal{EXP}$ (where $\mathcal{EXP}$ is the class of languages accepted by deterministic Turing machines in time $2^{n^{O(1)}}$; see Section 2 for the big $O$ notation), this, together with Corollary 4, implies that $\mathcal{EXP}\neq\mathcal{NEXP}$. Furthermore, the complexity class $\mathcal{BPP}$ (bounded-error probabilistic polynomial time) aims to capture the set of decision problems efficiently solvable by polynomial-time probabilistic Turing machines (see e.g. [AB09]), and a central open problem in probabilistic complexity theory is the relation between $\mathcal{BPP}$ and $\mathcal{NEXP}$. However, researchers currently only know that $\mathcal{BPP}$ is sandwiched between $\mathcal{P}$ and $\mathcal{EXP}$ (i.e., $\mathcal{P}\subseteq\mathcal{BPP}\subseteq\mathcal{EXP}$) but are unable even to show that $\mathcal{BPP}$ is a proper subset of $\mathcal{NEXP}$; see e.g. page 126 of [AB09]. With the consequence of Corollary 4 (i.e., $\mathcal{EXP}\neq\mathcal{NEXP}$) at hand, it immediately follows that

Corollary 5

$\mathcal{BPP}\neq\mathcal{NEXP}$.

It is interesting that the complexity class $\mathcal{NP}$ has a rich structure if $\mathcal{P}$ and $\mathcal{NP}$ differ. Specifically, in [Lad75], Ladner constructed a language that is $\mathcal{NP}$-intermediate under the assumption that $\mathcal{P}\neq\mathcal{NP}$. In fact, since by Theorem 1 we know that $\mathcal{P}\neq{\rm co}\mathcal{NP}$, we symmetrically obtain the following interesting outcome, saying that the complexity class ${\rm co}\mathcal{NP}$ has intermediate languages. To see this, let $L\in\mathcal{NP}$ be the $\mathcal{NP}$-intermediate language constructed in [Lad75]; then $\overline{L}=\{0,1\}^{*}-L$ is a ${\rm co}\mathcal{NP}$-intermediate language in ${\rm co}\mathcal{NP}$. In other words, we will prove in detail the following important result:

Theorem 6

There are ${\rm co}\mathcal{NP}$-intermediate languages, i.e., there exists a language $L\in{\rm co}\mathcal{NP}$ which is neither in $\mathcal{P}$ nor ${\rm co}\mathcal{NP}$-complete.

By the result of [CR74], there exists a super proof system if and only if 𝒩𝒫\mathcal{NP} is closed under complement. Then, by Theorem 1, we clearly have the following:

Corollary 7

There exists no super proof system.

Moreover, we will settle the open problem listed as the first in [Pud08]; See the introduction in Section 1. Namely, we show the following theorem with respect to the aforementioned problem:

Theorem 8

There exists no polynomial $p(n)$ such that for all $\psi\in{\rm TAUT}$ there is a Frege proof of $\psi$ of length at most $p(|\psi|)$. In other words, no Frege proof system is polynomially bounded.

1.2 Overview

The rest of the paper is organized as follows. For the convenience of the reader, in the next section we review some notions closely associated with our discussions, fix the notation used in what follows, and present some useful technical lemmas. In Section 3, we prove that all ${\rm co}\mathcal{NP}$-machines are enumerable, thus showing that there exists an enumeration of all ${\rm co}\mathcal{NP}$-machines. Section 4 contains the definition of our nondeterministic Turing machine, which accepts a language $L_{d}$ not in ${\rm co}\mathcal{NP}$, establishing the desired lower bound. Section 5 is devoted to showing that the language $L_{d}$ is in fact in $\mathcal{NP}$, proving the required upper bound and thereby finishing the proof of Theorem 1. In Section 6 we prove Theorem 2 to break the so-called Relativization Barrier. In Section 7, we show that there are ${\rm co}\mathcal{NP}$-intermediate languages in the complexity class ${\rm co}\mathcal{NP}$. In Section 8, we prove Theorem 8, which says that no Frege proof system is polynomially bounded, thus answering an important open question in the area of proof complexity. Finally, we draw some conclusions in the last section.

2 Preliminaries

In this section, we introduce some notions and notation that will be used in what follows.

Let $\mathbb{N}_{1}=\{1,2,3,\cdots\}$ be the set of all positive integers (note that $+\infty\notin\mathbb{N}_{1}$). We also denote by $\mathbb{N}$ the set of natural numbers, i.e.,

\mathbb{N}\overset{\rm def}{=}\mathbb{N}_{1}\cup\{0\}.

Let $\Sigma$ be an alphabet. For finite words $w,v\in\Sigma^{*}$, the concatenation of $w$ and $v$, denoted by $w\circ v$, is $wv$. For example, suppose that $\Sigma=\{0,1\}$, $w=100001$, and $v=0111110$; then

w\circ v\overset{\rm def}{=}wv=1000010111110.

The length of a finite word $w$, denoted $|w|$, is defined to be the number of symbols in it. It is clear that for finite words $w$ and $v$,

|wv|=|w|+|v|.

The big $O$ notation indicates the order of growth of some quantity as a function of $n$, or the limiting behavior of a function. For example, that $S(n)$ is big $O$ of $f(n)$, i.e.,

S(n)=O(f(n)),

means that there exist a positive integer $N_{0}$ and a positive constant $M$ such that

S(n)\leq M\times f(n)

for all $n>N_{0}$.

The little $o$ notation also indicates the order of growth of some quantity as a function of $n$ or the limiting behavior of a function, but with a different meaning. Specifically, that $T(n)$ is little $o$ of $t(n)$, i.e.,

T(n)=o(t(n)),

means that for any constant $c>0$, there exists a positive integer $N_{0}>0$ such that

T(n)<c\times t(n)

for all $n>N_{0}$.
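The following small Python check (ours, purely numerical and not a proof) illustrates the two definitions on concrete functions; the particular functions, constants, and ranges are assumptions chosen only for the example.

```python
import math

# Big O: S(n) = 3n^2 + 5 satisfies S(n) <= M * n^2 with M = 4 for all n > N0 = 2.
S = lambda n: 3 * n * n + 5
assert all(S(n) <= 4 * n * n for n in range(3, 10_000))

# Little o: T(n) = n*log2(n) is o(n^2); for each c we pick an N0 and spot-check beyond it.
T = lambda n: n * math.log2(n)
for c, N0 in [(1.0, 4), (0.1, 2**20), (0.01, 2**200)]:
    assert all(T(n) < c * n * n for n in (N0 + 1, 2 * N0, 10 * N0))
print("big-O and little-o spot checks passed")
```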

Throughout this paper, the computational models used are nondeterministic Turing machines (or their variants, such as nondeterministic Turing machines with oracles). We follow the standard definition of a nondeterministic Turing machine given in the textbook [AHU74]. Let us first introduce the precise definition of a nondeterministic Turing machine:

Definition 2.1 ($k$-tape nondeterministic Turing machine, [AHU74])

A $k$-tape nondeterministic Turing machine (shortly, NTM) $M$ is a seven-tuple $(Q,T,I,\delta,\mathbbm{b},q_{0},q_{f})$ where:

  1. $Q$ is the set of states.

  2. $T$ is the set of tape symbols.

  3. $I$ is the set of input symbols; $I\subseteq T$.

  4. $\mathbbm{b}\in T-I$ is the blank.

  5. $q_{0}$ is the initial state.

  6. $q_{f}$ is the final (or accepting) state.

  7. $\delta$ is the next-move function, a mapping from $Q\times T^{k}$ to subsets of

     Q\times(T\times\{L,R,S\})^{k}.

     Suppose

     \delta(q,a_{1},a_{2},\cdots,a_{k})=\{(q_{1},(a^{1}_{1},d^{1}_{1}),(a^{1}_{2},d^{1}_{2}),\cdots,(a^{1}_{k},d^{1}_{k})),\cdots,(q_{n},(a^{n}_{1},d^{n}_{1}),(a^{n}_{2},d^{n}_{2}),\cdots,(a^{n}_{k},d^{n}_{k}))\}

     and the nondeterministic Turing machine is in state $q$ with the $i$th tape head scanning tape symbol $a_{i}$ for $1\leq i\leq k$. Then in one move the nondeterministic Turing machine enters state $q_{j}$, changes symbol $a_{i}$ to $a^{j}_{i}$, and moves the $i$th tape head in the direction $d^{j}_{i}$, for $1\leq i\leq k$ and some chosen $1\leq j\leq n$.

Let $M$ be a nondeterministic Turing machine and $w$ an input. Then $M(w)$ denotes $M$ running on input $w$.

A nondeterministic Turing machine $M$ works in time $t(n)$ (or is of time complexity $t(n)$) if, for any input $w\in I^{*}$ where $I$ is the input alphabet of $M$, $M(w)$ halts within $t(|w|)$ steps.

Now, it is time for us to introduce the concept of a polynomial-time nondeterministic Turing machine as follows:

Definition 2.2 (cf. polynomial-time deterministic Turing machines in [Coo00])

Formally, a polynomial-time nondeterministic Turing machine is a nondeterministic Turing machine $M$ for which there exists $k\in\mathbb{N}_{1}$ such that, for every input $w$ of length $|w|$ ($|w|\in\mathbb{N}$), $M(w)$ halts within $t(|w|)=|w|^{k}+k$ steps.

By default, a word ww is accepted by a nondeterministic t(n)t(n) time-bounded Turing machine MM if there exists a computation path γ\gamma such that M(w)=1M(w)=1 (i.e., stop in the “accepting” state) on that computation path described by γ\gamma. This is the “exists” accepting criterion for nondeterministic Turing machines in general, based on which the complexity class NTIME[t(n)]{\rm NTIME}[t(n)] is defined:

Definition 2.3

The set of languages decided by nondeterministic Turing machines within time $t(n)$ is denoted by ${\rm NTIME}[t(n)]$. Thus,

\mathcal{NP}=\bigcup_{k\in\mathbb{N}_{1}}{\rm NTIME}[n^{k}].

Apart from the "exists" accepting criterion used to define the complexity class ${\rm NTIME}[t(n)]$, there is another, "for all", accepting criterion for nondeterministic Turing machines, defined as follows. (Originally, for a polynomial-time nondeterministic Turing machine, this should be called the "for all" rejecting criterion, i.e., all computation paths of $M(w)$ reject; but we can exchange the rejecting state and the accepting state, i.e., treat the rejecting state as the accepting state. In this sense, we can also call it the "for all" accepting criterion.)

Definition 2.4 (“for all” accepting criterion)

Let $M$ be a nondeterministic Turing machine. $M$ accepts the input $w$ if and only if all computation paths of $M(w)$ lead to the accepting state of $M$, i.e.,

w\in L(M)\Leftrightarrow\text{all computation paths of $M(w)$ accept}.
Remark 2.1

Obviously, for a polynomial-time nondeterministic Turing machine $M$, $M$ accepts a language $L\in\mathcal{NP}$ when using the "exists" accepting criterion. However, if using the "for all" accepting criterion (whose synonym is the "for all" rejecting criterion), then $M$ accepts the language $\overline{L}\in{\rm co}\mathcal{NP}$, which is the complement of $L$.
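The contrast between the two criteria can be made concrete with a small Python sketch (ours, purely illustrative): truth assignments play the role of the computation paths of a toy machine that guesses an assignment and accepts a path iff the assignment satisfies a given formula, so the "exists" criterion yields satisfiability while the "for all" criterion yields tautologyhood.

```python
from itertools import product

def paths(num_vars):
    """All computation paths of the toy machine = all assignments it may guess."""
    return product([False, True], repeat=num_vars)

def path_accepts(formula, assignment):
    env = {f"x{i}": b for i, b in enumerate(assignment)}
    return bool(eval(formula, {}, env))   # formula is a Python boolean expression

def accepts_exists(formula, n):   # "exists" criterion: some path accepts (SAT-style)
    return any(path_accepts(formula, p) for p in paths(n))

def accepts_forall(formula, n):   # "for all" criterion: every path accepts (TAUT-style)
    return all(path_accepts(formula, p) for p in paths(n))

phi = "x0 or not x1"
print(accepts_exists(phi, 2), accepts_forall(phi, 2))   # True False
```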

With the Definition 2.4 above, we can similarly define the complexity class coNTIME[t(n)]{\rm coNTIME}[t(n)] of languages decided by nondeterministic Turing machines within time t(n)t(n) in terms of the “for all” accepting criterion:

Definition 2.5

The set of languages decided by nondeterministic Turing machines within time $t(n)$ in terms of the "for all" accepting criterion is denoted by ${\rm coNTIME}[t(n)]$. Namely,

L\in{\rm coNTIME}[t(n)]

if and only if there is a nondeterministic Turing machine $M$ of time complexity $t(n)$ such that

w\in L\Leftrightarrow\text{all computation paths of $M(w)$ lead to the accepting state}.

Thus,

{\rm co}\mathcal{NP}=\bigcup_{k\in\mathbb{N}_{1}}{\rm coNTIME}[n^{k}].

We can define the complexity class co𝒩𝒫{\rm co}\mathcal{NP} in a equivalent way using polynomial-time deterministic Turing machines as verifiers and “for all” witnesses, given as follows:

Definition 2.6 ([AB09], Definition 2.20)

For every $L\subseteq\{0,1\}^{*}$, we say that $L\in{\rm co}\mathcal{NP}$ if there exist a polynomial $p(n)$ and a deterministic polynomial-time Turing machine $M$ such that for every $x\in\{0,1\}^{*}$,

x\in L\Leftrightarrow\forall u\in\{0,1\}^{p(|x|)},\quad M(x,u)=1.

With the complexity class co𝒩𝒫{\rm co}\mathcal{NP} above, we next define the set of all co𝒩𝒫{\rm co}\mathcal{NP}-machines as follows.

Definition 2.7

For any language $L\in{\rm co}\mathcal{NP}$, a nondeterministic Turing machine $M$ is called a $p(n)$ time-bounded ${\rm co}\mathcal{NP}$-machine for the language $L$ if there exists a polynomial $p(n)=n^{k}+k$ ($k\in\mathbb{N}_{1}$) such that for every $x\in\{0,1\}^{*}$,

x\in L\Leftrightarrow\text{all computation paths of $M(x)$ accept}.

For convenience, we denote by ${\rm co}NP$ the set of all ${\rm co}\mathcal{NP}$-machines, i.e.,

{\rm co}NP\overset{\rm def}{=}\{M\,:\,M\text{ is a ${\rm co}\mathcal{NP}$-machine}\}.
Remark 2.2

By Definition 2.2, Definition 2.5 and Definition 2.7 we know that, both 𝒩𝒫\mathcal{NP}-machines and co𝒩𝒫{\rm co}\mathcal{NP}-machines are polynomial-time nondeterministic Turing machines. But they only differ by the definition of accepting criterion for the input.

In regard to the relation between the time complexity of kk-tape nondeterministic Turing machines and that of single-tape nondeterministic Turing machines, we quote the following useful lemma, extracted from [AHU74] (see Lemma 10.1 in [AHU74]), which plays important roles in the following context:

Lemma 2.1 (Lemma 10.1 in [AHU74])

If LL is accepted by a kk-tape nondeterministic T(n)T(n) time-bounded Turing machine, then LL is accepted by a single-tape nondeterministic O(T2(n))O(T^{2}(n)) time-bounded Turing machine.  

Since co𝒩𝒫{\rm co}\mathcal{NP}-machines are also polynomial-time nondeterministic Turing machines but with the “for all” accepting criterion for the input, we can similarly prove the following technical lemma:

Lemma 2.2

If $L$ is accepted by a $k$-tape co-nondeterministic $T(n)$ time-bounded Turing machine, then $L$ is accepted by a single-tape co-nondeterministic $O(T^{2}(n))$ time-bounded Turing machine, i.e., by a single-tape $O(T^{2}(n))$ time-bounded ${\rm co}\mathcal{NP}$-machine.

Proof. The proof is similar to that of Lemma 2.1 and see [AHU74], so the detail is omitted.   

The following lemma about efficient simulation by a universal nondeterministic Turing machine is useful in proving the main result in Section 4.

Lemma 2.3 ([AB09])

There exists a Turing machine $U$ such that for every $x,\alpha\in\{0,1\}^{*}$, $U(x,\alpha)=M_{\alpha}(x)$, where $M_{\alpha}$ denotes the Turing machine represented by $\alpha$. Moreover, if $M_{\alpha}$ halts on input $x$ within $T(|x|)$ steps, then $U(x,\alpha)$ halts within $cT(|x|)\log T(|x|)$ steps (in this paper, $\log n$ stands for $\log_{2}n$), where $c$ is a constant independent of $|x|$ and depending only on $M_{\alpha}$'s alphabet size, number of tapes, and number of states.

Other background information and notions will be given along the way in proving our main results stated in Section 1.

3 All ${\rm co}\mathcal{NP}$-machines Are Enumerable

In this section, our main goal is to establish an important theorem that all co𝒩𝒫{\rm co}\mathcal{NP}-machines are in fact enumerable, which is the prerequisite for the next steps.

Following Definition 2.2, a polynomial-time nondeterministic Turing machine can be represented by a tuple of (M,k)(M,k), where MM is the nondeterministic Turing machine itself and kk is the unique minimal degree of some polynomial |x|k+k|x|^{k}+k such that M(x)M(x) will halt within |x|k+k|x|^{k}+k steps for any input xx of length |x||x|. We call such a positive integer kk the order of (M,k)(M,k).

By Lemma 2.1 stated above, we only need to discuss the single-tape nondeterministic Turing machines. Thus, in the following context, by a nondeterministic Turing machine we mean a single-tape nondeterministic Turing machine. Similarly, by Lemma 2.2, we only need to discuss the single-tape co𝒩𝒫{\rm co}\mathcal{NP}-machines. Thus, in the following context, by a co𝒩𝒫{\rm co}\mathcal{NP}-machine we mean a single-tape co𝒩𝒫{\rm co}\mathcal{NP}-machine.

To obtain our main result, we must enumerate all co𝒩𝒫{\rm co}\mathcal{NP}-machines and suppose that the set of all co𝒩𝒫{\rm co}\mathcal{NP}-machines is enumerable (which needs to be proved) so that we can refer to the jj-th co𝒩𝒫{\rm co}\mathcal{NP}-machine in the enumeration.

To show that the set of all co𝒩𝒫{\rm co}\mathcal{NP}-machines is enumerable, we need to study whether the whole set of polynomial-time nondeterministic Turing machines is enumerable, because by Definition 2.4 or by Remark 2.1, co𝒩𝒫{\rm co}\mathcal{NP}-machines are nondeterministic Turing machines running within polynomial time but the accepting criterion for the input is the “for all” criterion, so the set coNP{\rm co}NP can be seen as a subset of all polynomial-time nondeterministic Turing machines but with the “for all” accepting criterion.

In what follows, we first use the method presented in [AHU74], p. 407, to encode a single-tape nondeterministic Turing machine into an integer.

Without loss of generality, we can make the following assumptions about the representation of a single-tape nondeterministic Turing machine with input alphabet {0,1}\{0,1\} because that will be all we need:

  1. The states are named $q_{1},q_{2},\cdots,q_{s}$ for some $s$, with $q_{1}$ the initial state and $q_{s}$ the accepting state.

  2. The input alphabet is $\{0,1\}$.

  3. The tape alphabet is $\{X_{1},X_{2},\cdots,X_{t}\}$ for some $t$, where $X_{1}=\mathbbm{b}$, $X_{2}=0$, and $X_{3}=1$.

  4. The next-move function $\delta$ is a list of quintuples of the form

     \{(q_{i},X_{j},q_{k},X_{l},D_{m}),\cdots,(q_{i},X_{j},q_{f},X_{p},D_{z})\}

     meaning that

     \delta(q_{i},X_{j})=\{(q_{k},X_{l},D_{m}),\cdots,(q_{f},X_{p},D_{z})\},

     where $D_{m}$ is the direction $L$, $R$, or $S$ according as $m=1$, $2$, or $3$. We assume this quintuple is encoded by the string

     10^{i}10^{j}10^{k}10^{l}10^{m}1\cdots 10^{i}10^{j}10^{f}10^{p}10^{z}1.

  5. The Turing machine itself is encoded by concatenating, in any order, the codes for each of the quintuples in its next-move function. Additional 1's may be prefixed to the string if desired. The result will be some string of 0's and 1's, beginning with 1, which we can interpret as an integer.

Next, we encode the order of $(M,k)$ as

10^{k}1

so that the tuple (M,k)(M,k) should be the concatenation of the binary string representing MM itself followed by the order 10k110^{k}1. Now the tuple (M,k)(M,k) is encoded as a binary string, which can be explained as an integer. In the following context, we often use the notation M\langle M\rangle to denote the shortest binary string that represents the same nk+kn^{k}+k time-bounded nondeterministic Turing machine (M,k)(M,k). We should further remark that the shortest binary string M\langle M\rangle that represents the same nk+kn^{k}+k time-bounded nondeterministic Turing machine (M,k)(M,k) is by concatenating in any order the codes for each of the quintuples in its next-move function followed by the order 10k110^{k}1 without additional 11 prefixed to it, from which it is clear that M\langle M\rangle may not be unique (since the order of codes of the quintuples may be different), but the lengths of all different M\langle M\rangle are the same.

By this encoding, any integer that cannot be decoded is assumed to represent the trivial Turing machine with an empty next-move function. Every single-tape polynomial-time nondeterministic Turing machine will appear infinitely often in the enumeration since, given a polynomial-time nondeterministic Turing machine, we may prefix 11’s at will to find larger and larger integers representing the same set of (M,k)(M,k). We denote such a polynomial-time nondeterministic Turing machine by M^j\widehat{M}_{j}, where jj is the integer representing the tuple (M,k)(M,k), i.e., jj is the integer value of the binary string representing the tuple (M,k)(M,k).
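A small Python sketch of the encoding just described may be helpful (our own transcription under the stated conventions; the example quintuples and the helper names are hypothetical, chosen only for illustration):

```python
def encode_quintuple(i, j, k, l, m):
    """One move (q_i, X_j) -> (q_k, X_l, D_m), encoded as 1 0^i 1 0^j 1 0^k 1 0^l 1 0^m 1."""
    return "1" + "0"*i + "1" + "0"*j + "1" + "0"*k + "1" + "0"*l + "1" + "0"*m + "1"

def encode_machine(quintuples, order):
    """Concatenate the quintuple codes (in any order) and append the order code 1 0^order 1;
    the resulting binary string begins with 1 and is read as an integer."""
    code = "".join(encode_quintuple(*q) for q in quintuples) + "1" + "0"*order + "1"
    return code, int(code, 2)

# A toy two-move machine of order k = 2 (indices are illustrative only):
quintuples = [(1, 2, 1, 3, 2),   # in q1 reading X2: go to q1, write X3, move R
              (1, 3, 2, 3, 3)]   # in q1 reading X3: go to q2, write X3, stay
code, as_integer = encode_machine(quintuples, order=2)
print(code, as_integer)
```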

For convenience, we introduce the notation $NP$ for the set of all polynomial-time nondeterministic Turing machines; in short, $NP=\{(M,k)\}$. (The reader should not confuse $NP$ with $\mathcal{NP}$: $NP$ is the set of all polynomial-time nondeterministic Turing machines, whereas $\mathcal{NP}$ is the set of all languages $L$ for which there is some polynomial-time nondeterministic Turing machine $(M,k)\in NP$ accepting $L$.)

From the above arguments, we have reached the following important result:

Lemma 3.1

There exists a one-one correspondence between $\mathbb{N}_{1}$ and the set $NP$ of all polynomial-time nondeterministic Turing machines. In other words, all polynomial-time nondeterministic Turing machines are enumerable, and all languages in $\mathcal{NP}$ are enumerable (with languages appearing multiple times).

As mentioned earlier, by Definition 2.4 or further by Remark 2.1, the set coNP{\rm co}NP can be seen as a subset of all polynomial-time nondeterministic Turing machines with the “for all” accepting criterion. Thus, Lemma 3.1 implies that we also establish the following important theorem:

Theorem 3.2

The set coNP{\rm co}NP is enumerable (All co𝒩𝒫{\rm co}\mathcal{NP}-machines are enumerable). In other words, all languages in co𝒩𝒫{\rm co}\mathcal{NP} are enumerable (with languages appearing multiple times).  

Remark 3.1

In fact, we can establish a one-one correspondence between the set NPNP of all polynomial-time nondeterministic Turing machines and the set coNP{\rm co}NP of all polynomial-time co-nondeterministic Turing machines. Since the set NPNP of all polynomial-time nondeterministic Turing machines is enumerable, we then deduce that the set coNP{\rm co}NP of all polynomial-time co-nondeterministic Turing machines is enumerable as well. To see so, by the “exists” accepting criterion we know that the elements in NPNP are polynomial-time nondeterministic Turing machines with the “exists” accepting criterion, and by Definition 2.7 or by Remark 2.1 we know that the elements in coNP{\rm co}NP are polynomial-time nondeterministic Turing machines with the “for all” accepting criterion. We then let the function

h:NP\rightarrow{\rm co}NP

be defined as follows: For (M,k)NP(M,k)\in NP, where (M,k)NP(M,k)\in NP with the “exists” accepting criterion:

h((M,k))=(M,k)\text{ with the ``for all'' accepting criterion};

Obviously,

h^{-1}:{\rm co}NP\rightarrow NP

is the inverse of hh, i.e., for any (M,k)coNP(M,k)\in{\rm co}NP, where (M,k)(M,k) with the “for all” accepting criterion,

h^{-1}((M,k))=(M,k)\text{ with the ``exists'' accepting criterion.}

It is not hard to see that such a function hh is a one-one correspondence between NPNP and coNP{\rm co}NP. Thus, by Lemma 3.1, Theorem 3.2 follows. In fact, these arguments also follow from Remark 2.1.

There is also a one-one correspondence between $\mathcal{NP}$ and ${\rm co}\mathcal{NP}$. For any language $L\in\mathcal{NP}$, let the function

g:\mathcal{NP}\rightarrow{\rm co}\mathcal{NP}

be defined as:

g(L)=\overline{L}

where L¯={0,1}L\overline{L}=\{0,1\}^{*}-L is the complement of LL, and the inverse of gg is defined as: For any L¯co𝒩𝒫\overline{L}\in{\rm co}\mathcal{NP},

g^{-1}(\overline{L})=L.

It is not difficult to check that the function gg is indeed a one-one correspondence between 𝒩𝒫\mathcal{NP} and co𝒩𝒫{\rm co}\mathcal{NP}. From the fact that the set NPNP of all polynomial-time nondeterministic Turing machines is enumerable, we know that the set 𝒩𝒫\mathcal{NP} is enumerable. This, together with the one-one correspondence gg, follows the fact that the set co𝒩𝒫{\rm co}\mathcal{NP} is enumerable too.

4 Lazy Diagonalization against All ${\rm co}\mathcal{NP}$-machines

In computational complexity, the most basic approach to proving hierarchy theorems uses simulation and diagonalization, because the simulation and diagonalization technique [Tur37, HS65] is a standard method for proving lower bounds on uniform computing models (i.e., the Turing machine model); see for example [AHU74]. These techniques work well for deterministic time and space measures; see e.g. [For00, FS07]. However, they do not work for nondeterministic time, which is not known to be closed under complement at present (one of the main issues discussed in this paper); hence it is unclear how to define a nondeterministic machine that "does the opposite"; see e.g. [FS07].

For nondeterministic time, we can still do a simulation but can no longer negate the answer directly. In this case, we apply lazy diagonalization to show the nondeterministic time hierarchy theorem; see e.g. [AB09]. It is worth noting that Fortnow [For11] developed a much more elegant and simple style of diagonalization to show the nondeterministic time hierarchy. Generally, lazy diagonalization [AB09, FS07] is a clever application of the standard diagonalization technique [Tur37, HS65, AHU74]. The basic strategy of lazy diagonalization, illustrated by the sketch below, is the following: for any $n\leq j<m=n^{t(n)}$, $M$ on input $1^{j}$ simulates $M_{i}$ on input $1^{j+1}$, accepting if $M_{i}$ accepts and rejecting if $M_{i}$ rejects. When $j=m$, $M$ on input $1^{j}$ simulates $M_{i}$ on input $1^{n}$ deterministically, accepting if $M_{i}$ rejects and rejecting if $M_{i}$ accepts. Since $m=n^{t(n)}$, $M$ has enough time to perform the trivial exponential-time deterministic simulation of $M_{i}$ on input $1^{n}$. What we are doing is deferring the diagonalization step by linking $M$ and $M_{i}$ together on larger and larger inputs until $M$ has an input large enough that it can actually "do the opposite" deterministically; see e.g. [FS07] for details.
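The following toy Python transcription of this copy-then-flip strategy (ours; `target`, `n`, and `m` are illustrative stand-ins rather than the actual machines and bounds of the proofs below) shows how chaining the agreements over one interval forces a contradiction at its right end.

```python
def lazy_diagonal_answer(target, j, n, m):
    """On 1^j with n <= j < m, copy target's answer on 1^(j+1); on 1^m, do the
    opposite of target's answer on 1^n (there a deterministic simulation is affordable)."""
    if n <= j < m:
        return target(j + 1)
    if j == m:
        return not target(n)
    raise ValueError("j outside the interval [n, m]")

target = lambda length: length % 3 == 0          # a toy machine's answers on 1^length
n, m = 4, 12
answers = {j: lazy_diagonal_answer(target, j, n, m) for j in range(n, m + 1)}
# If target agreed with these answers on every 1^j in [n, m], chaining the equalities
# would force target(n) == answers[m] == not target(n), a contradiction.
print(answers)
```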

For our goal in this section to establish the lower bounds result, we first make the following remarks, which will be referred to in what follows:

Remark 4.1

Given $n$, for any $k\in\mathbb{N}_{1}$, it is not hard for a nondeterministic Turing machine to find the number $j$ such that $n$ is sandwiched between $f(j)$ and $f(j+1)$ within $O(n^{k+\delta})$ time for some $0<\delta<1$, which is less than $n^{k+1}$ time (see e.g. the standard textbook [AB09], page 70, or Remark 4.2 below), where the function

f:\mathbb{N}\rightarrow\mathbb{N}

is defined as follows:

f(j+1)=\left\{\begin{array}[]{ll}2,&\hbox{$j=0$;}\\ 2^{f(j)^{k}},&\hbox{$j\geq 1$.}\end{array}\right.

Our diagonalizing machine DD (defined below) will try to flip the answer of a co𝒩𝒫{\rm co}\mathcal{NP}-machine MiM_{i} on some input in the set

\{1^{n-m}\langle M_{i}\rangle\,:\,f(j)<n\leq f(j+1)\},

with nm0n-m\geq 0 and n=|1nmMi|n=|1^{n-m}\langle M_{i}\rangle|, where m=|Mi|m=|\langle M_{i}\rangle| is the shortest length of the binary string representing MiM_{i}, and 1nmMi1^{n-m}\langle M_{i}\rangle is the concatenation of finite words 1nm1^{n-m} and Mi\langle M_{i}\rangle.

Remark 4.2

Let $|n|$ denote the length of the number $n$, i.e.,

\lfloor\log n\rfloor+1\quad\text{(for $n>0$).}

Computing $2^{n}$ requires time $O(n)$ in the RAM model, and so computing $f(i+1)$ from $f(i)$ takes time

O(f(i)^{k})=O(|f(i+1)|).

We need to compute f(1)f(1), f(2)f(2), \cdots, f(i+1)f(i+1) until f(i+1)nf(i+1)\geq n, which means that f(i)<nf(i)<n. Further note that f(i+1)f(i+1) is growing faster than geometrically. Thus, the total running time is

O(|f(i+1)|)=O(f(i)^{k})=O(n^{k}).

However, in the Turing machine model with a single tape, calculating 2n2^{n} takes time O(nlogn)O(n\log n), and so computing f(i+1)f(i+1) from f(i)f(i) takes time

O(f(i)klogf(i)k).O(f(i)^{k}\log f(i)^{k}).

So the total running time is

O(f(i)^{k}\log f(i)^{k})=O(kf(i)^{k}\log f(i))<O(n^{k}\times k\log n)\quad\text{(by $n>f(i)$)}\quad<n^{k+1}\quad\text{(by $k\log n<n$).}
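As a concrete (and purely illustrative) rendering of this procedure, the following Python sketch computes $f(1),f(2),\cdots$ until the input length is reached; it ignores the machine-model bookkeeping and only returns the index $j$ with $f(j)<n\leq f(j+1)$.

```python
def find_j(n, k):
    """Compute f(1)=2, f(j+1)=2^(f(j)^k) until f(j+1) >= n, and return j with
    f(j) < n <= f(j+1), together with the values computed along the way."""
    values = [2]                                  # values[j-1] == f(j)
    while values[-1] < n:
        values.append(2 ** (values[-1] ** k))
    return len(values) - 1, values

j, values = find_j(1000, 2)
print(j)          # 2, since f(2) = 16 < 1000 <= f(3) = 2^256
```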

Our main goal of this section is to establish the following lower bounds theorem with the technique of lazy diagonalization from [AB09]:

Theorem 4.1

There exists a language $L_{d}$ accepted by a universal nondeterministic Turing machine $D$, but not by any ${\rm co}\mathcal{NP}$-machine. Namely, there is a language $L_{d}$ such that

L_{d}\notin{\rm co}\mathcal{NP}.

Proof. By Theorem 3.2, it is convenient to let

M_{i},M_{j},\cdots

be an effective enumeration (denoted by $e$) of all ${\rm co}\mathcal{NP}$-machines.

The key idea will be lazy diagonalization, so named (in accordance with the point of view presented in [AB09]) because the new machine DD is in no hurry to diagonalize and only ensures that it flips the answer of each co𝒩𝒫{\rm co}\mathcal{NP}-machine MiM_{i} in only one string out of a sufficiently large (exponentially large) set of strings.

Let $D$ be a five-tape nondeterministic Turing machine which operates as follows on an input string $x$ of length $|x|$.

  1. On input $x$, if $x\notin 1^{*}\langle M_{i}\rangle$, reject $x$. If $x=1^{l}\langle M_{i}\rangle$ for some $l\in\mathbb{N}$, then $D$ decodes the machine $M_{i}$ encoded by $x$, determining $t$, the number of tape symbols used by $M_{i}$; $s$, its number of states; $k$, the order of its polynomial time bound

     T(n)=n^{k}+k;

     and $m$, the shortest length of $\langle M_{i}\rangle$, i.e.,

     m=\min\{|\langle M_{i}\rangle|\,:\,\text{different binary strings $\langle M_{i}\rangle$ represent the same $M_{i}$}\}.

     The fifth tape of $D$ can be used as "scratch" memory to calculate $t$, $s$, $k$, and $m$.

  2. Then $D$ lays off on its second tape $|x|$ blocks of $\lceil\log t\rceil$ cells each, the blocks being separated by single cells holding a marker $\#$, i.e., there are

     (1+\lceil\log t\rceil)|x|

     cells in all. Each tape symbol occurring in a cell of $M_{i}$'s tape will be encoded as a binary number in the corresponding block of the second tape of $D$. Initially, $D$ places its input, in binary-coded form, in the blocks of tape 2, filling the unused blocks with the code for the blank.

  3. On tape 3, $D$ sets up a block of $\lceil(k+1)\log n\rceil$ cells, initialized to all 0's. Tape 3 is used as a counter to count up to

     n^{k+1},

     where $n=|x|$.

  4. $D$ simulates $M_{i}$, using tape 1, its input tape, to determine the moves of $M_{i}$ and using tape 2 to simulate the tape of $M_{i}$ (i.e., $D$ places $M_{i}$'s input on tape 2). The moves of $M_{i}$ are counted in binary in the block of tape 3, and tape 4 is used to hold the states of $M_{i}$. If the counter on tape 3 overflows, $D$ halts and rejects. The simulation is specified as follows:

     On input $1^{n-m}\langle M_{i}\rangle$ (note that $m=|\langle M_{i}\rangle|$), $D$ uses the fifth tape as "scratch" memory to compute $j$ such that

     f(j)<n\leq f(j+1)

     where

     f(j+1)=\left\{\begin{array}[]{ll}2,&\hbox{$j=0$;}\\ 2^{f(j)^{k}},&\hbox{$j\geq 1$.}\end{array}\right.

     (By Remark 4.1, computing $j$ can be done within time $O(n^{k+\delta})$ with $0<\delta<1$, which is less than $n^{k+1}$, so $D$ has sufficient time.) Then:

     • (i). If $f(j)<n<f(j+1)$, then $D$ simulates $M_{i}$ on input $1^{n+1-m}\langle M_{i}\rangle$ using nondeterminism in $(n+1)^{k}+k$ time and outputs its answer, i.e., $D$ rejects the input $1^{n-m}\langle M_{i}\rangle$ if and only if $M_{i}$ rejects the input $1^{n+1-m}\langle M_{i}\rangle$. (To do so, $D$ first erases the old content of its second tape and then lays off $|1^{n+1-m}\langle M_{i}\rangle|$, i.e., $n+1$, blocks of $\lceil\log t\rceil$ cells each, separated by single cells holding the marker $\#$, so there are $(1+\lceil\log t\rceil)(n+1)$ cells in all, each tape symbol of $M_{i}$ being encoded as a binary number in the corresponding block. Recall that, as a ${\rm co}\mathcal{NP}$-machine with the "for all" accepting criterion, $M_{i}$ rejects an input $w$ if and only if there exists $u\in\{0,1\}^{p(|w|)}$, viewed as a sequence of choices made by $M_{i}$, such that $M_{i}(w)=0$ on the computation path described by $u$, where $p(|w|)$ is the time bound of $M_{i}$.)

     • (ii). If $n=f(j+1)$, then $D$ simulates $M_{i}$ on input $1^{f(j)+1-m}\langle M_{i}\rangle$, preparing its second tape as in case (i) but now with $f(j)+1$ blocks. $D$ rejects $1^{n-m}\langle M_{i}\rangle$ if $M_{i}$ accepts $1^{f(j)+1-m}\langle M_{i}\rangle$ in $(f(j)+1)^{k}$ time, and $D$ accepts $1^{n-m}\langle M_{i}\rangle$ if $M_{i}$ rejects $1^{f(j)+1-m}\langle M_{i}\rangle$ in $(f(j)+1)^{k}$ time. (Recall again that, as a ${\rm co}\mathcal{NP}$-machine with the "for all" accepting criterion, $M_{i}$ accepts an input $w$ if and only if for all $u\in\{0,1\}^{p(|w|)}$, viewed as sequences of choices made by $M_{i}$, $M_{i}(w)=1$ on the computation path described by $u$, where $p(|w|)$ is the time bound of $M_{i}$.)

Part (ii) requires going through all possible $2^{(f(j)+1)^{k}}$ branches of $M_{i}$ on input $1^{f(j)+1-m}\langle M_{i}\rangle$, but that is fine since the input size $f(j+1)$ is $2^{f(j)^{k}}$ (see below).

The nondeterministic Turing machine DD constructed above is of time complexity, say SS, which is currently unknown. By Lemma 2.1, DD is equivalent to a single-tape nondeterministic O(S2)O(S^{2}) time-bounded Turing machine, and it of course accepts some language LdL_{d}.

We claim that $L_{d}\notin{\rm co}\mathcal{NP}$. Indeed, suppose for the sake of contradiction that $L_{d}$ is decided by some ${\rm co}\mathcal{NP}$-machine $M_{i}$ running in $T(n)=n^{k}+k$ steps. Then by Lemma 2.2 and Remark 2.2, we may assume that $M_{i}$ is a single-tape $T(n)=n^{k}+k$ time-bounded ${\rm co}\mathcal{NP}$-machine. Let $M_{i}$ have $s$ states and $t$ tape symbols, and let the shortest length of $\langle M_{i}\rangle$ be $m$, i.e.,

m=\min\{|\langle M_{i}\rangle|\,:\,\text{different binary strings $\langle M_{i}\rangle$ represent the same $M_{i}$}\}.

Since $M_{i}$ is represented by infinitely many strings and appears infinitely often in the enumeration $e$, and since $D$ sets the counter of tape 3 to count up to $n^{k+1}$, consider what $D$ needs in order to simulate $T(n+1)$ steps of $M_{i}$ (note that at this point the input of $M_{i}$ is $1^{n+1-m}\langle M_{i}\rangle$ where $m=|\langle M_{i}\rangle|$, so the length of $1^{n+1-m}\langle M_{i}\rangle$ is $n+1$). By Lemma 2.3, $D$ needs to perform

T(n+1)\log T(n+1)

steps, and since

\lim_{n\rightarrow\infty}\frac{T(n+1)\log T(n+1)}{n^{k+1}}=\lim_{n\rightarrow\infty}\frac{((n+1)^{k}+k)\log((n+1)^{k}+k)}{n^{k+1}}=\lim_{n\rightarrow\infty}\left(\frac{(n+1)^{k}\log((n+1)^{k}+k)}{n^{k+1}}+\frac{k\log((n+1)^{k}+k)}{n^{k+1}}\right)=0<1.

So, there exists an $N_{0}>0$ such that for any $N>N_{0}$,

T(N+1)\log T(N+1)<N^{k+1},

which implies that for a sufficiently long $w$, say $|w|\geq N_{0}$, such that the machine $M_{w}$ represented by $w$ is $M_{i}$, we have

T(|w|+1)\log T(|w|+1)<|w|^{k+1}.

Thus, on input $w$, $D$ has sufficient time to simulate $M_{w}$ according to requirement (i) within item 4 above.
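A quick numeric spot check (ours; not a proof, and only for a few illustrative values of $k$ and $N$) of the inequality $T(N+1)\log T(N+1)<N^{k+1}$ used above:

```python
import math

def T(n, k):                       # the co-NP machine's assumed time bound n^k + k
    return n ** k + k

for k in (2, 3):
    for N in (10**3, 10**4, 10**5):
        lhs = T(N + 1, k) * math.log2(T(N + 1, k))
        assert lhs < N ** (k + 1), (k, N)
print("T(N+1) log T(N+1) < N^(k+1) holds at these sample points")
```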

Furthermore, to simulate $2^{(f(j)+1)^{k}}$ steps of $M_{i}$, by Lemma 2.3, $D$ needs to perform

2^{(f(j)+1)^{k}}\log 2^{(f(j)+1)^{k}}

steps, and because $f(j)\rightarrow\infty$ as $j\rightarrow\infty$, and for $1\leq l\leq k-1$ we have

\lim_{f(j)\rightarrow\infty}\frac{2^{\binom{k}{l}f(j)^{l}}}{2^{f(j)^{k}}}<1

and

\lim_{f(j)\rightarrow\infty}\frac{2(f(j)+1)^{k}}{2^{f(j)^{k}}}<1,

where (kl){k\choose l} is the binomial coefficient, these yield

\lim_{f(j)\rightarrow\infty}\frac{2^{(f(j)+1)^{k}}\log 2^{(f(j)+1)^{k}}}{(2^{f(j)^{k}})^{k+1}}
=\lim_{f(j)\rightarrow\infty}\frac{(f(j)+1)^{k}\times 2^{\binom{k}{0}+\binom{k}{1}f(j)+\cdots+\binom{k}{k}f(j)^{k}}}{\underbrace{2^{f(j)^{k}}\times 2^{f(j)^{k}}\times\cdots\times 2^{f(j)^{k}}}_{k+1}}
=\lim_{f(j)\rightarrow\infty}\frac{(f(j)+1)^{k}\times 2^{\binom{k}{0}}\times 2^{\binom{k}{1}f(j)}\times\cdots\times 2^{\binom{k}{k}f(j)^{k}}}{\underbrace{2^{f(j)^{k}}\times 2^{f(j)^{k}}\times\cdots\times 2^{f(j)^{k}}}_{k+1}}
=\lim_{f(j)\rightarrow\infty}\underbrace{\frac{2(f(j)+1)^{k}}{2^{f(j)^{k}}}\times\frac{2^{\binom{k}{1}f(j)}}{2^{f(j)^{k}}}\times\frac{2^{\binom{k}{2}f(j)^{2}}}{2^{f(j)^{k}}}\times\cdots\times\frac{2^{\binom{k}{k-1}f(j)^{k-1}}}{2^{f(j)^{k}}}\times\frac{2^{\binom{k}{k}f(j)^{k}}}{2^{f(j)^{k}}}}_{k+1}
=\lim_{f(j)\rightarrow\infty}\underbrace{\frac{2(f(j)+1)^{k}}{2^{f(j)^{k}}}\times\frac{2^{\binom{k}{1}f(j)}}{2^{f(j)^{k}}}\times\frac{2^{\binom{k}{2}f(j)^{2}}}{2^{f(j)^{k}}}\times\cdots\times\frac{2^{\binom{k}{k-1}f(j)^{k-1}}}{2^{f(j)^{k}}}}_{k}\times 1
<1,

which implies that there exists N0>0N^{\prime}_{0}>0 such that for all f(j)>N0f(j)>N^{\prime}_{0},

\begin{align*}
2^{(f(j)+1)^{k}}\log 2^{(f(j)+1)^{k}}&<(2^{f(j)^{k}})^{k+1}\\
&=(f(j+1))^{k+1}.
\end{align*}

Thus, for a sufficiently long ww such that |w|=f(j+1)>f(j)>N0|w|=f(j+1)>f(j)>N^{\prime}_{0} and such that the machine MwM_{w} encoded by ww is MiM_{i}, we have

2(f(j)+1)klog2(f(j)+1)k<|w|k+1.2^{(f(j)+1)^{k}}\log 2^{(f(j)+1)^{k}}<|w|^{k+1}.

Thereby, on input ww, DD also has sufficient time to simulate MiM_{i} according to the requirement (ii) within item 44 above.
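The second budget check can be pictured the same way. The following sketch (again illustrative only, with the constant of Lemma 2.3 dropped and the logarithm taken to base 2) uses exact integer arithmetic and the relation f(j+1) = 2^{f(j)^k} read off from the displayed equality above.

```python
def budget_ok(f, k):
    # LHS: simulating 2**((f+1)**k) steps of M_i costs about that many steps
    # times the base-2 logarithm of that number, i.e. 2**((f+1)**k) * (f+1)**k.
    lhs = 2 ** ((f + 1) ** k) * (f + 1) ** k
    # RHS: the budget (2**(f**k))**(k+1), which equals f(j+1)**(k+1)
    # because f(j+1) = 2**(f(j)**k).
    rhs = (2 ** (f ** k)) ** (k + 1)
    return lhs < rhs

for k in (1, 2, 3):
    threshold = next(f for f in range(1, 40) if all(budget_ok(g, k) for g in range(f, 40)))
    print(f"k = {k}: the inequality holds for every f(j) >= {threshold} (checked up to 40)")
```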

Overall, we can find ww large enough such that MwM_{w} represents the same MiM_{i} and on inputs of length

f(j+1)|w|>f(j)>max{N0,N0},f(j+1)\geq|w|>f(j)>\max\{N_{0},N^{\prime}_{0}\},

MiM_{i} can be simulated in less than |w|k+1|w|^{k+1} steps to ensure the conditions (i) and (ii) within item 4 above. This means that the two steps in the description of DD within item 44 ensure, respectively, that

(4.1) If f(j)<n<f(j+1),then D(1nmMi)=Mi(1n+1mMi)\text{If }f(j)<n<f(j+1),\quad\text{then }D(1^{n-m}\langle M_{i}\rangle)=M_{i}(1^{n+1-m}\langle M_{i}\rangle)
(4.2) whereasD(1f(j+1)mMi)Mi(1f(j)+1mMi)\qquad\qquad\qquad\quad\text{whereas}\quad D(1^{f(j+1)-m}\langle M_{i}\rangle)\neq M_{i}(1^{f(j)+1-m}\langle M_{i}\rangle)

By our assumption, MiM_{i} and DD agree on all inputs 1nmMi1^{n-m}\langle M_{i}\rangle for nn in the semi-open interval

(f(j),f(j+1)].(f(j),f(j+1)].

Together with (4.1), this implies that

\begin{align*}
D(1^{f(j)+1-m}\langle M_{i}\rangle)&=M_{i}(1^{f(j)+2-m}\langle M_{i}\rangle)\\
D(1^{f(j)+2-m}\langle M_{i}\rangle)&=M_{i}(1^{f(j)+3-m}\langle M_{i}\rangle)\\
&\ \ \vdots\\
D(1^{n-m}\langle M_{i}\rangle)&=M_{i}(1^{n+1-m}\langle M_{i}\rangle)\\
&\ \ \vdots\\
D(1^{f(j+1)-1-m}\langle M_{i}\rangle)&=M_{i}(1^{f(j+1)-m}\langle M_{i}\rangle)\\
D(1^{f(j+1)-m}\langle M_{i}\rangle)&=M_{i}(1^{f(j)+1-m}\langle M_{i}\rangle).
\end{align*}

But the last of the above, i.e.,

D(1^{f(j+1)-m}\langle M_{i}\rangle)=M_{i}(1^{f(j)+1-m}\langle M_{i}\rangle),

contradicts (4.2), which yields that there exists no co𝒩𝒫{\rm co}\mathcal{NP}-machine MiM_{i} in the enumeration ee accepting the language LdL_{d}. Thereby,

Ldco𝒩𝒫.L_{d}\notin{\rm co}\mathcal{NP}.

This finishes the proof.   

Remark 4.3

To make it clearer how the last part of the above proof of Theorem 4.1 works, we further make the following remarks. Suppose that

Mi(1f(j)+1mMi)=0,M_{i}(1^{f(j)+1-m}\langle M_{i}\rangle)=0,

then by the assumption that MiM_{i} and DD agree on all inputs 1nmMi1^{n-m}\langle M_{i}\rangle for nn in the semi-open interval

(f(j),f(j+1)],(f(j),f(j+1)],

it must be the case that D(1f(j)+1mMi)=0D(1^{f(j)+1-m}\langle M_{i}\rangle)=0. By the simulation requirement (i) within item 44,

D(1f(j)+1mMi)=Mi(1f(j)+2mMi),D(1^{f(j)+1-m}\langle M_{i}\rangle)=M_{i}(1^{f(j)+2-m}\langle M_{i}\rangle),

it must be the case that Mi(1f(j)+2mMi)=0M_{i}(1^{f(j)+2-m}\langle M_{i}\rangle)=0. Similarly, by the assumption again that MiM_{i} and DD agree on all inputs 1nmMi1^{n-m}\langle M_{i}\rangle for nn in the semi-open interval

(f(j),f(j+1)],(f(j),f(j+1)],

we have that

D(1f(j)+2mMi)=0.D(1^{f(j)+2-m}\langle M_{i}\rangle)=0.

Repeating this reasoning, we finally arrive at

D(1f(j+1)mMi)=0.D(1^{f(j+1)-m}\langle M_{i}\rangle)=0.

But by our diagonalization requirement (ii) within item 44,

D(1f(j+1)mMi)Mi(1f(j)+1mMi)D(1^{f(j+1)-m}\langle M_{i}\rangle)\neq M_{i}(1^{f(j)+1-m}\langle M_{i}\rangle)

and our initial condition

Mi(1f(j)+1mMi)=0,M_{i}(1^{f(j)+1-m}\langle M_{i}\rangle)=0,

we must have D(1f(j+1)mMi)=1D(1^{f(j+1)-m}\langle M_{i}\rangle)=1. Thus, we get a contradiction. See Fig. 1 below.

Figure 1: the case Mi(1f(j)+1mMi)=0M_{i}(1^{f(j)+1-m}\langle M_{i}\rangle)=0

Similarly, we can show the second possibility. Assume that

Mi(1f(j)+1mMi)=1,M_{i}(1^{f(j)+1-m}\langle M_{i}\rangle)=1,

then by the assumption that MiM_{i} and DD agree on all inputs 1nmMi1^{n-m}\langle M_{i}\rangle for nn in the semi-open interval

(f(j),f(j+1)],(f(j),f(j+1)],

it must be the case that

D(1f(j)+1mMi)=1.D(1^{f(j)+1-m}\langle M_{i}\rangle)=1.

By the simulation requirement (i) within item 44,

D(1f(j)+1mMi)=Mi(1f(j)+2mMi),D(1^{f(j)+1-m}\langle M_{i}\rangle)=M_{i}(1^{f(j)+2-m}\langle M_{i}\rangle),

which means that there exists no γ{0,1}lk+k\gamma\in\{0,1\}^{l^{k}+k} such that Mi(1f(j)+2mMi)=0M_{i}(1^{f(j)+2-m}\langle M_{i}\rangle)=0 on the computation path described by γ\gamma, where l=|1f(j)+2mMi|l=|1^{f(j)+2-m}\langle M_{i}\rangle|. So, it must be the case that

Mi(1f(j)+2mMi)=1.M_{i}(1^{f(j)+2-m}\langle M_{i}\rangle)=1.

Similarly, by the assumption that MiM_{i} and DD agree on all inputs 1nmMi1^{n-m}\langle M_{i}\rangle for nn in the semi-open interval

(f(j),f(j+1)](f(j),f(j+1)]

again, we have that D(1f(j)+2mMi)=1D(1^{f(j)+2-m}\langle M_{i}\rangle)=1. Repeating this deduction, we finally arrive at D(1f(j+1)mMi)=1D(1^{f(j+1)-m}\langle M_{i}\rangle)=1. But by our diagonalization requirement (ii) within item 44,

D(1f(j+1)mMi)Mi(1f(j)+1mMi)D(1^{f(j+1)-m}\langle M_{i}\rangle)\neq M_{i}(1^{f(j)+1-m}\langle M_{i}\rangle)

and our initial condition

Mi(1f(j)+1mMi)=1,M_{i}(1^{f(j)+1-m}\langle M_{i}\rangle)=1,

we must have D(1f(j+1)mMi)=0D(1^{f(j+1)-m}\langle M_{i}\rangle)=0. Thus, we also get a contradiction. See Fig. 2 below.

Figure 2: the case Mi(1f(j)+1mMi)=1M_{i}(1^{f(j)+1-m}\langle M_{i}\rangle)=1

To summarize, in both cases we have shown that the language LdL_{d} cannot be accepted by MiM_{i}. Thus, the conclusion follows immediately.
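The case analysis above can also be checked mechanically on a small example. The following Python sketch is a purely combinatorial toy that abstracts the machines away (the interval endpoints are arbitrary): it enumerates every possible answer pattern of M_i on the interval and confirms that requirements (i) and (ii) leave no pattern on which D and M_i could agree throughout (f(j), f(j+1)].

```python
from itertools import product

def contradiction_forced(f_j, f_j1):
    """Exhaustively check, for a small interval, that no assignment of M_i's answers
    on input lengths f_j+1, ..., f_j1 lets M_i agree with D on the whole semi-open
    interval (f_j, f_j1], given requirements (i) and (ii) of item 4."""
    lengths = range(f_j + 1, f_j1 + 1)
    for answers in product([0, 1], repeat=len(lengths)):
        M = dict(zip(lengths, answers))                  # hypothetical answers of M_i
        D = {n: M[n + 1] for n in lengths if n < f_j1}   # requirement (i): copy forward
        D[f_j1] = 1 - M[f_j + 1]                         # requirement (ii): flip at the end
        if all(D[n] == M[n] for n in lengths):
            return False   # a pattern on which D and M_i agree everywhere would exist
    return True

print(contradiction_forced(3, 8))   # True: agreement on the whole interval (3, 8] is impossible
```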

We close this section by presenting the following corollary:

Corollary 4.2

coNTIME[T(n)]NTIME[t(n)]{\rm coNTIME}[T(n)]\neq{\rm NTIME}[t(n)], if T(n)T(n) and t(n)t(n) are time-constructible functions and T(n)logT(n)=o(t(n))T(n)\log T(n)=o(t(n)).

Proof. Similar to the proof of Theorem 4.1.   

5 Proving That Ld𝒩𝒫L_{d}\in\mathcal{NP}

The previous section aimed to establish a lower bounds result, i.e., to show that there is a language LdL_{d} not in co𝒩𝒫{\rm co}\mathcal{NP}. We now turn our attention to the upper bounds, i.e., to showing that the language LdL_{d} accepted by the universal nondeterministic Turing machine DD is in fact in 𝒩𝒫\mathcal{NP}.

As a matter of fact, the technique used to establish the desired upper bounds theorem is essentially the same as the one developed in the author’s recent work [Lin21], because the issue addressed there is essentially the same. Establishing such upper bounds for the first time, however, was very difficult; see e.g. [Lin21] to see why.

Now, for our goals, we first show that the universal nondeterministic Turing machine DD runs within time O(nk)O(n^{k}) for any k1k\in\mathbb{N}_{1}:

Theorem 5.1

The universal nondeterministic Turing machine DD constructed in the proof of Theorem 4.1 runs within time O(nk)O(n^{k}) for any k1k\in\mathbb{N}_{1}.

Proof. The simplest way to show the theorem is to prove that for any input ww to DD, there is a corresponding positive integer iw1i_{w}\in\mathbb{N}_{1} such that DD runs at most

|w|iw+1|w|^{i_{w}+1}

steps, which can be done as follows.

On the one hand, if the input xx encodes an nk+kn^{k}+k time-bounded co𝒩𝒫{\rm co}\mathcal{NP}-machine, then DD turns itself off mandatorily within

|x|k+1|x|^{k+1}

steps by the construction, so the corresponding integer ixi_{x} is kk in this case (i.e., ix=ki_{x}=k). This holds true for every polynomial-time co𝒩𝒫{\rm co}\mathcal{NP}-machine given as input, with kk being the degree of the polynomial time bound of the corresponding co𝒩𝒫{\rm co}\mathcal{NP}-machine.

On the other hand, if the input xx does not encode some polynomial-time co𝒩𝒫{\rm co}\mathcal{NP}-machine, then the input is rejected, which can be done within time O(|x|)O(|x|). In both cases we have shown that for any input ww to DD, there is a corresponding positive integer iw1i_{w}\in\mathbb{N}_{1} such that DD runs at most |w|iw+1|w|^{i_{w}+1} steps. So DD is a nondeterministic

S(n)=max{nk,n}S(n)=\max\{n^{k},n\}

time-bounded Turing machine for any k1k\in\mathbb{N}_{1}. Thus, DD is a nondeterministic

O(nk)O(n^{k})

time-bounded Turing machine for any k1k\in\mathbb{N}_{1}. By Lemma 2.1, there is a single-tape nondeterministic Turing machine DD^{\prime} equivalent to DD and DD^{\prime} runs within time

O(S(n)2)=O(n2k)O(S(n)^{2})=O(n^{2k})

for any k1k\in\mathbb{N}_{1}.   

The following theorem is our main result of this section describing the upper bounds result:

Theorem 5.2

The language LdL_{d} is in 𝒩𝒫\mathcal{NP} where LdL_{d} is accepted by the universal nondeterministic Turing machine DD which runs within time O(nk)O(n^{k}) for any k1k\in\mathbb{N}_{1}.

Proof. Let us first define the family of languages

{Ldi}i1\{L_{d}^{i}\}_{i\in\mathbb{N}_{1}}

as follows:

Ldi=defL_{d}^{i}\overset{{\rm def}}{=} the language accepted by DD running within time O(ni)O(n^{i}) for fixed i1i\in\mathbb{N}_{1}.
That is, DD turns itself off mandatorily when the number of moves it has made
during the computation exceeds ni+1n^{i+1} steps,

which technically can be done by adding a new tape to DD as a counter to count up to

ni+1n^{i+1}

for a fixed i1i\in\mathbb{N}_{1}, meaning that DD turns itself off when the counter of tape 33 exceeds

nk+1n^{k+1}

or the counter of the newly added tape exceeds

ni+1.n^{i+1}.

Obviously, for each i1i\in\mathbb{N}_{1}, LdiL_{d}^{i} is a truncation of LdL_{d}.
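The role of the extra counter can be pictured with the following Python sketch, a toy abstraction of our own in which one computation path is modelled as a generator that yields once per move and finally returns its accept/reject verdict: the run is cut off as soon as the number of moves exceeds n^{i+1}.

```python
def run_truncated(path, n, i):
    """Follow one computation path, but shut the machine off (reject) once the
    number of moves exceeds n**(i+1); this is the role of the newly added
    counter tape in the definition of L_d^i."""
    cap = n ** (i + 1)
    steps = 0
    it = path()
    while True:
        try:
            next(it)                    # one more move of the simulated machine
        except StopIteration as halt:
            return bool(halt.value)     # the path halted within the budget
        steps += 1
        if steps > cap:
            return False                # mandatory shut-off

def toy_path():
    for _ in range(100):                # a path that makes 100 moves, then accepts
        yield
    return True

print(run_truncated(toy_path, n=3, i=1))   # 3**2 = 9 < 100 moves: cut off
print(run_truncated(toy_path, n=3, i=4))   # 3**5 = 243 >= 100 moves: accepted
```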

Then by the construction of DD, namely, for an arbitrary input ww to DD, there is a corresponding integer iw1i_{w}\in\mathbb{N}_{1} such that DD runs at most

|w|iw+1|w|^{i_{w}+1}

steps (in other words, DD runs at most ni+1n^{i+1} steps for any i1i\in\mathbb{N}_{1} where nn is the length of input, see Theorem 5.1 above), we have

(5.1) Ld=i1Ldi.L_{d}=\bigcup_{i\in\mathbb{N}_{1}}L_{d}^{i}.

Furthermore,

LdiLdi+1,for each fixed i1L_{d}^{i}\subseteq L_{d}^{i+1},\quad\text{for each fixed $i\in\mathbb{N}_{1}$}

since for any word wLdiw\in L_{d}^{i} accepted by DD within O(ni)O(n^{i}) steps, it surely can be accepted by DD within O(ni+1)O(n^{i+1}) steps, i.e.,

wLdi+1.w\in L_{d}^{i+1}.

This gives that for any fixed i1i\in\mathbb{N}_{1},

(5.2) Ld1Ld2LdiLdi+1L_{d}^{1}\subseteq L_{d}^{2}\subseteq\cdots\subseteq L_{d}^{i}\subseteq L_{d}^{i+1}\subseteq\cdots

Note further that for any fixed i1i\in\mathbb{N}_{1}, LdiL_{d}^{i} is the language accepted by the nondeterministic Turing machine DD within time O(ni)O(n^{i}), i.e., in at most ni+1n^{i+1} steps; we thus obtain

(5.3) LdiNTIME[ni]𝒩𝒫,for any fixed i1.L_{d}^{i}\in{\rm NTIME}[n^{i}]\subseteq\mathcal{NP},\quad\text{for any fixed $i\in\mathbb{N}_{1}$}.

Now, (5.1), together with (5.2) and (5.3) easily yields

Ld𝒩𝒫,L_{d}\in\mathcal{NP},

as desired.   

Proof of Theorem 1. It is clear that Theorem 1 follows immediately from Theorem 4.1 and Theorem 5.2. This completes the proof.  

By the proof of Theorem 5.2, we also have the following corollary:

Corollary 5.3

For each fixed k1k\in\mathbb{N}_{1}, it holds that

Ldk=ikLdi𝒩𝒫L_{d}^{k}=\bigcup_{i\leq k}L_{d}^{i}\in\mathcal{NP}

where i1i\in\mathbb{N}_{1}.

Proof. It follows clearly from the relations (5.2) and (5.3).   

We close this section by making the following remark:

Remark 5.1

In fact, if the notion of a universal co-nondeterministic Turing machine exists, then we can apply lazy diagonalization against polynomial-time nondeterministic Turing machines via a universal co-nondeterministic Turing machine to obtain a language Ld𝒩𝒫L_{d}^{\prime}\notin\mathcal{NP} but Ldco𝒩𝒫L^{\prime}_{d}\in{\rm co}\mathcal{NP} in a similar way to the above proofs (i.e., the proof of Theorem 4.1 and the proof of Theorem 5.2). But we do not know whether such a notion of a universal co-nondeterministic Turing machine exists or not. At the same time, we also suspect that a universal co-nondeterministic Turing machine is just a universal nondeterministic Turing machine with the “for all” accepting criterion.

6 Breaking the Relativization Barrier

In 1975, Baker, Gill, and Solovay [BGS75] proved the following:

There is an oracle A for which 𝒫A=𝒩𝒫A.\text{There is an oracle $A$ for which }\mathcal{P}^{A}=\mathcal{NP}^{A}.

Baker, Gill, and Solovay [BGS75] suggested that their results imply that ordinary diagonalization techniques are not capable of proving 𝒫𝒩𝒫\mathcal{P}\neq\mathcal{NP}. Note that the above result also implies that for the same oracle AA,

𝒩𝒫A=co𝒩𝒫A,\mathcal{NP}^{A}={\rm co}\mathcal{NP}^{A},

because if 𝒫A=𝒩𝒫A\mathcal{P}^{A}=\mathcal{NP}^{A} then co𝒫A=co𝒩𝒫A{\rm co}\mathcal{P}^{A}={\rm co}\mathcal{NP}^{A}. Further by

𝒫A=co𝒫A,\mathcal{P}^{A}={\rm co}\mathcal{P}^{A},

we have the conclusion that 𝒩𝒫A=co𝒩𝒫A\mathcal{NP}^{A}={\rm co}\mathcal{NP}^{A}. Thereby, Baker, Gill, and Solovay’s above result also indirectly suggests that ordinary diagonalization techniques (lazy-diagonalization) are not capable of proving 𝒩𝒫co𝒩𝒫\mathcal{NP}\neq{\rm co}\mathcal{NP}. Now, let us explore what is behind this kind of mystery.

The computation model we use in this section is the query machine, also known as the oracle Turing machine, which is an extension of the multi-tape Turing machine, i.e., a Turing machine that is given access to a black box or “oracle” that can magically solve the decision problem for some language

X{0,1}.X\subseteq\{0,1\}^{*}.

The machine has a special oracle tape on which it can write a string

w{0,1}w\in\{0,1\}^{*}

and in one step gets an answer to a query of the form

“Is w in X?”,\text{``Is $w$ in $X$?"},

which can be repeated arbitrarily often with different queries. If XX is a difficult language (say, one that cannot be decided in polynomial time, or is even undecidable), then this oracle gives the Turing machine additional power. We first quote its formal definition as follows:

Definition 6.1 (cf. the notion of deterministic oracle Turing machines in [AB09])

A nondeterministic oracle Turing machine is a nondeterministic Turing machine MM that has a special read-write tape we call MM’s oracle tape and three special states qqueryq_{query}, qyesq_{yes}, qnoq_{no}. To execute MM, we specify in addition to the input a language X{0,1}X\subseteq\{0,1\}^{*} that is used as the oracle for MM. Whenever during the execution MM enters the state qqueryq_{query}, the machine moves into the state qyesq_{yes} if wXw\in X and qnoq_{no} if wXw\not\in X, where ww denotes the contents of the special oracle tape. Note that, regardless of the choice of XX, a membership query to XX counts only as single computation step. If MM is an oracle machine, X{0,1}X\subseteq\{0,1\}^{*} a language, and x{0,1}x\in\{0,1\}^{*}, then we denote the output of MM on input xx and with oracle XX by MX(x)M^{X}(x). An input xx is said to be accepted by a nondeterministic oracle Turing machine MXM^{X} if there is a computation path of MXM^{X} on input xx leading to the accepting state of MXM^{X}.

In a similar way to define the notion of co𝒩𝒫{\rm co}\mathcal{NP}-machines (Definition 2.7) in Section 2, we can define the notion of co𝒩𝒫A{\rm co}\mathcal{NP}^{A}-machines as follows.

Definition 6.2

A co-nondeterministic oracle Turing machine is a nondeterministic Turing machine MM that has a special read-write tape we call MM’s oracle tape and three special states qqueryq_{query}, qyesq_{yes}, qnoq_{no}. To execute MM, we specify in addition to the input a language X{0,1}X\subseteq\{0,1\}^{*} that is used as the oracle for MM. Whenever during the execution MM enters the state qqueryq_{query}, the machine moves into the state qyesq_{yes} if wXw\in X and qnoq_{no} if wXw\not\in X, where ww denotes the contents of the special oracle tape. Note that, regardless of the choice of XX, a membership query to XX counts only as single computation step. If MM is an oracle machine, X{0,1}X\subseteq\{0,1\}^{*} a language, and x{0,1}x\in\{0,1\}^{*}, then we denote the output of MM on input xx and with oracle XX by MX(x)M^{X}(x). An input xx is said to be accepted by a co-nondeterministic oracle Turing machine MXM^{X} if all computation paths of MXM^{X} on input xx lead to the accepting state of MXM^{X}.
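As a small illustration of Definitions 6.1 and 6.2 (a toy sketch of our own, not a faithful Turing-machine model), the essential point is only the accounting: a query to the oracle is charged as a single step, no matter how hard the oracle language is to decide.

```python
class OracleMachine:
    """Toy skeleton of an oracle machine: `oracle` is an arbitrary membership test
    playing the role of the language X, and each query costs exactly one step."""

    def __init__(self, oracle):
        self.oracle = oracle   # a function from strings to bool, standing in for X
        self.steps = 0

    def query(self, w):
        self.steps += 1        # entering q_query: one computation step, regardless of X
        return self.oracle(w)  # the machine then moves to q_yes or q_no

# Example oracle: an (easy, but it could be arbitrarily hard) language of even-length strings.
M = OracleMachine(lambda w: len(w) % 2 == 0)
print(M.query("0110"), M.query("101"), M.steps)   # True False 2
```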

If for every input xx of length |x||x|, all computations of MXM^{X} on input xx end in less than or equal to t(|x|)t(|x|) steps, then MXM^{X} is said to be a t(n)t(n) time-bounded nondeterministic (co-nondeterministic) oracle Turing machine with oracle XX, or said to be of time complexity t(n)t(n). The family of languages of nondeterministic time complexity t(n)t(n) with oracle XX is denoted by

NTIMEX[t(n)];{\rm NTIME}^{X}[t(n)];

the family of languages of co-nondeterministic time complexity t(n)t(n) with oracle XX is denoted by

coNTIMEX[t(n)].{\rm coNTIME}^{X}[t(n)].

The notation 𝒩𝒫X\mathcal{NP}^{X} and co𝒩𝒫X{\rm co}\mathcal{NP}^{X} are defined, respectively, to be the class of languages:

𝒩𝒫X=k1NTIMEX[nk]\mathcal{NP}^{X}=\bigcup_{k\in\mathbb{N}_{1}}{\rm NTIME}^{X}[n^{k}]

and

co𝒩𝒫X=k1coNTIMEX[nk].{\rm co}\mathcal{NP}^{X}=\bigcup_{k\in\mathbb{N}_{1}}{\rm coNTIME}^{X}[n^{k}].

Let us denote the set of all polynomial-time co-nondeterministic oracle Turing machines with oracle XX by the notation coNPXcoNP^{X}.

To break the Relativization Barrier or to explore the mystery behind the implications of Baker, Gill, and Solovay’s result [BGS75], let us further make the following rational assumptions:

  1. 1.

    the polynomial-time co-nondeterministic oracle Turing machine can be encoded to a string over {0,1}\{0,1\};

  2. 2.

    there are universal nondeterministic oracle Turing machines that can simulate any other co-nondeterministic oracle Turing machine, which is similar to the behavior of universal nondeterministic Turing machine DD in Section 4;

  3. 3.

    the simulation can be done within time

    O(T(n)logT(n)),O(T(n)\log T(n)),

    where T(n)T(n) is the time complexity of the simulated co-nondeterministic oracle Turing machine.

Then, we will prove the following interesting theorems:

Theorem 6.1

Assume that the set coNPAcoNP^{A} of polynomial-time co-nondeterministic oracle Turing machines with oracle AA is enumerable, and further that the above rational assumptions hold. Then, there exists a language LdAL_{d}^{A} accepted by a universal nondeterministic Turing machine DAD^{A} with oracle AA, but not by any co𝒩𝒫{\rm co}\mathcal{NP}-machine with oracle AA. Namely, there is a language LdAL_{d}^{A} such that

LdAco𝒩𝒫A.L_{d}^{A}\notin{\rm co}\mathcal{NP}^{A}.

Proof. Since the set coNPAcoNP^{A} of polynomial-time co-nondeterministic oracle Turing machines with oracle AA is enumerable and, in addition, we have the above rational assumptions, i.e., the following: (1) the polynomial-time co-nondeterministic oracle Turing machine can be encoded as a string over {0,1}\{0,1\}; (2) there are universal nondeterministic oracle Turing machines that can simulate any other co-nondeterministic oracle Turing machine, which is similar to the behavior of the universal nondeterministic Turing machine DD in Section 4; (3) the simulation can be done within time

O(T(n)logT(n)),O(T(n)\log T(n)),

where T(n)T(n) is the time complexity of the simulated co-nondeterministic oracle Turing machine. Then, what remains is to show that there exists a universal nondeterministic Turing machine DAD^{A} with oracle AA accepting the language

LdAco𝒩𝒫AL_{d}^{A}\notin{\rm co}\mathcal{NP}^{A}

in a similar way to that of Theorem 4.1.   

Theorem 6.2

The universal nondeterministic Turing machine DAD^{A} with oracle AA constructed in the proof of Theorem 6.1 runs within time O(nk)O(n^{k}) for any k1k\in\mathbb{N}_{1}. Further, the language LdAL_{d}^{A} accepted by the nondeterministic Turing machine DAD^{A} with oracle AA is in fact in 𝒩𝒫A\mathcal{NP}^{A}.

Proof. The proofs of this theorem are similar to those of Theorem 5.1 and Theorem 5.2.   

Combining Theorem 6.1 and Theorem 6.2 actually yields the following result:

Theorem 6.3

Assume that the set coNPAcoNP^{A} of polynomial-time co-nondeterministic oracle Turing machines with oracle AA is enumerable, and that the above rational assumptions hold. Then

𝒩𝒫Aco𝒩𝒫A.\mathcal{NP}^{A}\neq{\rm co}\mathcal{NP}^{A}.

But now we have the fact that 𝒩𝒫A=co𝒩𝒫A\mathcal{NP}^{A}={\rm co}\mathcal{NP}^{A}, so our assumption in Theorem 6.3 is not true, i.e., the set coNPAcoNP^{A} of all polynomial-time co-nondeterministic oracle Turing machines with oracle AA is not enumerable. So the truth behind this kind of mystery is the following theorem:

Theorem 6.4

Under some rational assumptions (i.e., the above rational assumptions), and if 𝒩𝒫A=co𝒩𝒫A\mathcal{NP}^{A}={\rm co}\mathcal{NP}^{A}, then the set coNPAcoNP^{A} of all polynomial-time co-nondeterministic oracle Turing machines with oracle AA is not enumerable. Thereby, the ordinary diagonalization techniques (lazy-diagonalization) will generally not apply to the relativized versions of the 𝒩𝒫\mathcal{NP} versus co𝒩𝒫{\rm co}\mathcal{NP} problem.

Now, we are naturally at a right point to present the proof of Theorem 2 as follows: Proof of Theorem 2. Note that the statement of Theorem 2 is precisely the same as Theorem 6.4, thereby we have done the proof.  

7 Rich Structure of co𝒩𝒫{\rm co}\mathcal{NP}

In complexity theory, or computational complexity, problems that are in the complexity class 𝒩𝒫\mathcal{NP} but are neither in the class 𝒫\mathcal{P} nor 𝒩𝒫\mathcal{NP}-complete are called 𝒩𝒫\mathcal{NP}-intermediate, and the class of such problems is called 𝒩𝒫\mathcal{NPI}. The well-known Ladner’s theorem, proved in 1975 by Ladner [Lad75], is a result asserting that if the complexity classes 𝒫\mathcal{P} and 𝒩𝒫\mathcal{NP} are different, then the class 𝒩𝒫\mathcal{NPI} is not empty. Namely, the complexity class 𝒩𝒫\mathcal{NP} contains problems that are neither in 𝒫\mathcal{P} nor 𝒩𝒫\mathcal{NP}-complete. We call such a nice theorem the rich structure of the complexity class 𝒩𝒫\mathcal{NP}.

Our main goal in this section is to establish a result saying that similar to the complexity class 𝒩𝒫\mathcal{NP} having a rich structure, the complexity class co𝒩𝒫{\rm co}\mathcal{NP} also has intermediate languages that are not in 𝒫\mathcal{P} nor co𝒩𝒫{\rm co}\mathcal{NP}-complete. To do so in a simple way, we need to quote the following useful result whose proof can be found in [Lad75]:

Lemma 7.1 ([Lad75])

(Suppose that 𝒫𝒩𝒫\mathcal{P}\neq\mathcal{NP}). There is a language Linter𝒩𝒫L_{\rm inter}\in\mathcal{NP} such that LinterL_{\rm inter} is not in 𝒫\mathcal{P} and LinterL_{\rm inter} is not 𝒩𝒫\mathcal{NP}-complete.  

Next, we need to show that the language of all tautologies is co𝒩𝒫{\rm co}\mathcal{NP}-complete:

Lemma 7.2

Let TAUT={φ:φ is a tautology}{\rm TAUT}=\{\varphi\,:\,\varphi\text{ is a tautology}\}. Then the language TAUT{\rm TAUT} is co𝒩𝒫{\rm co}\mathcal{NP}-complete.

Proof. It is sufficient to show that for every language Lco𝒩𝒫L\in{\rm co}\mathcal{NP}, LpTAUTL\leq_{p}{\rm TAUT} where p\leq_{p} denotes polynomial-time many-one reducibility [Kar72]. We can modify the Cook-Levin reduction [Coo71, Lev73] from L¯={0,1}L\overline{L}=\{0,1\}^{*}-L to SAT{\rm SAT} (recall that all nondeterministic Turing machines have input alphabet {0,1}\{0,1\} because that will be all we need, see Section 3). So for any input w{0,1}w\in\{0,1\}^{*} the Cook-Levin reduction produces a formula φw\varphi_{w} that is satisfiable if and only if wL¯w\in\overline{L}. In other words, the formula ¬φw\neg\varphi_{w} is in TAUT{\rm TAUT} if and only if wLw\in L, which completes the proof.
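To make membership in TAUT concrete, here is a minimal brute-force tautology test in Python (exponential time, of course; it only illustrates what the target language of the reduction is, and plays no role in the Cook-Levin construction itself). Formulas are represented as Boolean predicates over an assignment tuple, which is our own toy encoding.

```python
from itertools import product

def is_tautology(phi, num_vars):
    # Brute force over all 2**num_vars truth assignments.
    return all(phi(a) for a in product([False, True], repeat=num_vars))

# (p -> q) or (q -> p) is a tautology; p and q is not.
print(is_tautology(lambda a: (not a[0] or a[1]) or (not a[1] or a[0]), 2))  # True
print(is_tautology(lambda a: a[0] and a[1], 2))                             # False
```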

Now, we are naturally at a right point to give the proof of Theorem 6 as follows:

Proof of Theorem 6. Let L={0,1}LinterL=\{0,1\}^{*}-L_{\rm inter}, where the language LinterL_{\rm inter} is the same as in Lemma 7.1. Such a language indeed exists by Corollary 3, which states that 𝒫𝒩𝒫\mathcal{P}\neq\mathcal{NP}.

Then Lco𝒩𝒫L\in{\rm co}\mathcal{NP}, and we will show that LL is neither in 𝒫\mathcal{P} nor co𝒩𝒫{\rm co}\mathcal{NP}-complete. To see this, suppose that L𝒫L\in\mathcal{P}; then since 𝒫=co𝒫\mathcal{P}={\rm co}\mathcal{P}, we have L¯=Linter𝒫\overline{L}=L_{\rm inter}\in\mathcal{P}, a contradiction to the fact that Linter𝒫L_{\rm inter}\notin\mathcal{P} by Lemma 7.1. Further, assume that LL is co𝒩𝒫{\rm co}\mathcal{NP}-complete; then TAUTpL{\rm TAUT}\leq_{p}L. By Lemma 7.2, L¯=Linter\overline{L}=L_{\rm inter} is then 𝒩𝒫\mathcal{NP}-complete, which is also a contradiction to the fact that LinterL_{\rm inter} is not 𝒩𝒫\mathcal{NP}-complete. Thereby, to summarize the above two cases, we have shown that LL is co𝒩𝒫{\rm co}\mathcal{NP}-intermediate. This finishes the proof.

8 Frege Systems

The focal point of this section is Frege systems, which are very strong proof systems in the propositional setting [CR79], based on axiom schemes and rules such as modus ponens. While Frege systems operate with Boolean formulas as lines, the extended Frege system EF works with Boolean circuits; see e.g. [Jer05]. Furthermore, showing lower bounds on Frege or even extended Frege systems constitutes a major open problem in proof complexity; see e.g. [BP01].

Informally, a proof system for a language \mathcal{L} is a definition of what is considered to be a proof that Φ\Phi\in\mathcal{L}; see e.g. [CR79]. The key features of a proof system are that it is sound, i.e., only formulas in \mathcal{L} have proofs, and complete, i.e., all formulas in \mathcal{L} have proofs, and that there is an algorithm with running time polynomial in |π||\pi| to check whether π\pi is a proof that Φ\Phi\in\mathcal{L} [BH19].

Let 𝔉\mathfrak{F} be the set of functions f:Σ1Σ2f:\Sigma_{1}^{*}\rightarrow\Sigma_{2}^{*}, where Σ1\Sigma_{1} and Σ2\Sigma_{2} are any finite alphabets, such that ff can be computed by a deterministic Turing machine in time bounded by a polynomial in the length of the input. Then, we have the following definition:

Definition 8.1 ([CR79])

If LΣL\subseteq\Sigma^{*}, a proof system for LL is a function f:Σ1Lf:\Sigma_{1}^{*}\rightarrow L for some alphabet Σ1\Sigma_{1} and f𝔉f\in\mathfrak{F} such that ff is onto. We say that the proof system is polynomially bounded iff there is a polynomial p(n)p(n) such that for all yLy\in L there is a xΣ1x\in\Sigma_{1}^{*} such that y=f(x)y=f(x) and |x|p(|y|)|x|\leq p(|y|), where |z||z| denotes the length of a string zz.

If y=f(x)y=f(x), then xx is a proof of yy, and xx is a short proof of yy, if in addition, |x|p(|y|)|x|\leq p(|y|). Thus a proof system ff is polynomially bounded iff there is a bounding polynomial p(n)p(n) with respect to which every yLy\in L has a short proof.
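As a toy illustration of Definition 8.1 (our own example, not taken from [CR79]), consider a "truth-table" proof system for TAUT: a proof of a tautology is the formula itself padded with at least 2^n extra symbols, so that re-checking every assignment is polynomial in the length of the proof. The Python sketch below represents formulas as Boolean predicates; the system is sound, complete and onto TAUT, but it is clearly not polynomially bounded, since every proof has length at least 2^n.

```python
from itertools import product

TRIVIAL = (lambda a: True, 0)   # a fixed tautology, used as the default output

def f(proof):
    """proof = ((phi, num_vars), padding). f is computable in time polynomial in the
    length of the proof (the padding pays for the 2**num_vars evaluations) and its
    range is exactly the set of tautologies, as Definition 8.1 requires."""
    formula, padding = proof
    phi, num_vars = formula
    if len(padding) < 2 ** num_vars:
        return TRIVIAL
    if all(phi(a) for a in product([False, True], repeat=num_vars)):
        return formula              # the proof really does prove this tautology
    return TRIVIAL                  # otherwise fall back to the trivial tautology

# A proof of the 1-variable tautology p v ~p needs padding of length 2**1 = 2.
taut = (lambda a: a[0] or not a[0], 1)
print(f((taut, "xx")) is taut)      # True: the padded string is a proof of taut
```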

In particular, Frege proof systems are proof systems for propositional logic. As a matter of fact, Frege systems are the usual “textbook” proof systems for propositional logic based on axioms and rules; see e.g. [CR79]. A Frege system is composed of a finite set of axiom schemes and a finite number of rules. Furthermore, a Frege proof is a sequence of formulas where each formula is either a substitution instance of an axiom or can be inferred from previous formulas by a valid inference rule. At the same time, Frege systems are required to be sound and complete. However, the exact choice of the axiom schemes and rules does not matter, as any two Frege systems are polynomially equivalent; see e.g. [CR79] or Theorem 8.1 below. Thus, we can assume without loss of generality that modus ponens (see e.g. Remark 8.1 below) is the only rule of inference.

Definition 8.2 ([BG98])

A Frege proof system FF is an inference system for propositional logic based on

  1. (1)

    a language of well-formed formulas obtained from a numerable set of propositional variables and any finite propositionally complete set of connectives;

  2. (2)

    a finite set of axiom schemes;

  3. (3)

    and the rule of Modus Ponens

    AABB.\frac{A\quad A\rightarrow B}{B}.

A proof PP of the formula AA in a Frege system is a sequence A1,,AnA_{1},\cdots,A_{n} of formulas such that AnA_{n} is AA and every AiA_{i} is either an instance of an axiom scheme or it is obtained by the application of the Modus Ponens from premises AjA_{j} and AkA_{k} with j,k<ij,k<i.
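The following Python sketch (our own schematic; the axiom-scheme test is left abstract as an assumed predicate) checks exactly the structure required by Definition 8.2: every line of the proof is either declared an instance of an axiom scheme or is obtained from two earlier lines by Modus Ponens, where an implication A → B is represented as the nested tuple ('->', A, B).

```python
def check_frege_proof(lines, is_axiom_instance):
    """lines: a list of steps, each either ('axiom', formula) or ('mp', j, k, formula),
    where j, k index earlier lines; `is_axiom_instance` is an assumed predicate for
    membership in the (finitely many) axiom schemes of the system."""
    proved = []
    for step in lines:
        if step[0] == 'axiom':
            _, formula = step
            if not is_axiom_instance(formula):
                return False
        else:
            _, j, k, formula = step
            if not (0 <= j < len(proved) and 0 <= k < len(proved)):
                return False
            # Modus Ponens: from A (line j) and A -> B (line k), infer B.
            if proved[k] != ('->', proved[j], formula):
                return False
        proved.append(formula)
    return True

# Demo only: a trivially permissive axiom test, just to exercise the MP bookkeeping.
demo = [('axiom', 'p'), ('axiom', ('->', 'p', 'q')), ('mp', 0, 1, 'q')]
print(check_frege_proof(demo, lambda phi: True))   # True
```

A real Frege system would of course supply a genuine axiom-instance test, namely a substitution-matching check against the schemes listed below.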

Remark 8.1

The well-known inference rule of Modus Ponens is the system's only rule of inference:

AABB\frac{A\quad A\rightarrow B}{B}

We give a more intuitive explanation of the inference rule of Modus Ponens. For example, in the proof of Theorem 5.2 in Section 5, where we prove that Ld𝒩𝒫L_{d}\in\mathcal{NP}, AA is the following:

A=def(5.1)(5.2)(5.3),A\overset{\rm def}{=}(5.1)\bigwedge(5.2)\bigwedge(5.3),

BB represents the proposition Ld𝒩𝒫L_{d}\in\mathcal{NP}. Namely,

B=defLd𝒩𝒫,B\overset{\rm def}{=}L_{d}\in\mathcal{NP},

and one of the main deductions of Theorem 5.2, besides showing that AA is true, is to show that AA implies BB.

The axiom schemes of a Frege proof system are the following (see e.g. [Bus99]):

(PQ)P\displaystyle(P\wedge Q)\rightarrow P
(PQ)Q\displaystyle(P\wedge Q)\rightarrow Q
P(PQ)\displaystyle P\rightarrow(P\vee Q)
Q(PQ)\displaystyle Q\rightarrow(P\vee Q)
(PQ)((P¬Q)¬P)\displaystyle(P\rightarrow Q)\rightarrow((P\rightarrow\neg Q)\rightarrow\neg P)
(¬¬P)P\displaystyle(\neg\neg P)\rightarrow P
P(QPQ)\displaystyle P\rightarrow(Q\rightarrow P\wedge Q)
(PR)((QR)(PQR))\displaystyle(P\rightarrow R)\rightarrow((Q\rightarrow R)\rightarrow(P\vee Q\rightarrow R))
P(QP)\displaystyle P\rightarrow(Q\rightarrow P)
(PQ)(P(QR))(PR).\displaystyle(P\rightarrow Q)\rightarrow(P\rightarrow(Q\rightarrow R))\rightarrow(P\rightarrow R).

More generally, a Frege system is specified by any finite complete set of Boolean connectives and finite set of axiom schemes and rule schemes, provided it is implicationally sound and implicationally complete.

Definition 8.3

The length of a Frege proof is the number of symbols in the proof. The length |φ||\varphi| of a formula φ\varphi is the number of symbols in φ\varphi.

Now, we introduce the notion of pp-simulation between two proof systems.

Definition 8.4 ([CR79])

If f1:Σ1Lf_{1}:\Sigma_{1}^{*}\rightarrow L and f2:Σ2Lf_{2}:\Sigma_{2}^{*}\rightarrow L are proof systems for LL, then f2f_{2} pp-simulates f1f_{1} provided there is a function g:Σ1Σ2g:\Sigma_{1}^{*}\rightarrow\Sigma_{2}^{*} such that g𝔉g\in\mathfrak{F}, and f2(g(x))=f1(x)f_{2}(g(x))=f_{1}(x) for all xx.

The notion of an abstract propositional proof system is given as follows:

Definition 8.5 ([Bus99])

An abstract propositional proof system is a polynomial-time computable function gg such that Range(g)=TAUTRange(g)=TAUT; i.e., the range of gg is the set of all Boolean tautologies. A gg-proof of a formula ϕ\phi is a string ww such that g(w)=ϕg(w)=\phi.

8.1 Lower Bounds for Frege Systems

Let 1\mathcal{F}_{1} and 2\mathcal{F}_{2} be two arbitrary Frege systems. The following theorem indicates that these standard proof systems for the propositional calculus are about equally powerful:

Theorem 8.1 ([Rec76, CR79])

Any two Frege systems pp-simulate each other. Hence one Frege system is polynomially bounded if and only if all Frege systems are.  

The following theorem gives a necessary and sufficient condition for the complexity class 𝒩𝒫\mathcal{NP} to be closed under complementation. For completeness, the proof is also quoted from [CR79]:

Theorem 8.2 ([CR79])

𝒩𝒫\mathcal{NP} is closed under complementation if and only if TAUTTAUT is in 𝒩𝒫\mathcal{NP}.

Proof. The “if” part. Suppose that the set of tautologies is in 𝒩𝒫\mathcal{NP}. Then every set LL in 𝒩𝒫\mathcal{NP} is reducible to the complement of the tautologies [Coo71], i.e., there is a function ff that can be computed by a deterministic Turing machine in time bounded by a polynomial in the length of the input such that for all strings xx, xLx\in L iff f(x)f(x) is not a tautology. Hence a nondeterministic procedure for accepting the complement of LL is: on input xx, compute f(x)f(x), and accept xx if f(x)f(x) is a tautology, using the nondeterministic algorithm for tautologies assumed above. Hence the complement of LL is in 𝒩𝒫\mathcal{NP}.

The “only if” part. Suppose that 𝒩𝒫\mathcal{NP} is closed under complementation. Then the complement of the set of tautologies is in 𝒩𝒫\mathcal{NP}, since to verify that a formula is not a tautology one can guess at a truth assignment and verify that it falsifies the formula; hence, by the closure assumption, the set of tautologies itself is in 𝒩𝒫\mathcal{NP}.

Since nondeterministic Turing machines can simulate Frege proofs (see e.g. [Bus99]) and, by Theorem 8.1, any two Frege systems pp-simulate each other, we can present the proof of Theorem 8 as follows:

Proof of Theorem 8. We show the theorem by contradiction. Suppose that there is a polynomial p(n)p(n) such that for all ψTAUT\psi\in TAUT, there exists a proof \oint of ψ\psi of length at most p(|ψ|)p(|\psi|) (see Definition 8.3). Then a nondeterministic Turing machine MM on input ψ\psi can guess a proof \oint of length at most p(|ψ|)p(|\psi|) and check whether \oint is correct in deterministic polynomial time, so the set of tautologies is in 𝒩𝒫\mathcal{NP}; since the set of tautologies is also co𝒩𝒫{\rm co}\mathcal{NP}-complete, this, together with Theorem 8.2, implies

𝒩𝒫=co𝒩𝒫,\mathcal{NP}={\rm co}\mathcal{NP},

a contradiction to Theorem 1. This finishes the proof.  
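The guess-and-check step can be pictured as follows: a deterministic, exponential-time stand-in for the nondeterministic machine MM simply tries every candidate proof up to the polynomial length bound. The checker `check` is an assumed polynomial-time verifier (e.g. a Frege proof checker); the demo checker below is only a placeholder.

```python
from itertools import product

def has_short_proof(psi, check, p_bound, alphabet="01"):
    """Deterministic stand-in for M's guess: search every string of length <= p_bound
    over `alphabet` and accept if some candidate passes the polynomial-time `check`.
    The nondeterministic machine instead guesses one such candidate directly."""
    for length in range(p_bound + 1):
        for cand in product(alphabet, repeat=length):
            if check(psi, "".join(cand)):
                return True
    return False

# Placeholder demo: pretend a "proof" of psi is any string that ends with psi itself.
print(has_short_proof("101", lambda psi, c: c.endswith(psi), p_bound=4))   # True
```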

Remark 8.2

Initially, propositional proof complexity has been primarily concerned with proving lower bounds (even conditional) for the length of proofs in propositional proof systems, which is extremely interesting and well justified in its own right, with the ultimate goal of settling whether 𝒩𝒫=co𝒩𝒫\mathcal{NP}={\rm co}\mathcal{NP} (see e.g. [CR79]). In fact, as mentioned in Section 1, such research directions are called Cook’s Program for separating 𝒩𝒫\mathcal{NP} and co𝒩𝒫{\rm co}\mathcal{NP} (see e.g. [Bus12]): Prove superpolynomial lower bounds for proof lengths in stronger and stronger propositional proof systems until they are established for all abstract proof systems. However, our approach is “to do the opposite”, that is, we first prove that 𝒩𝒫co𝒩𝒫\mathcal{NP}\neq{\rm co}\mathcal{NP}, and then apply it to prove the lower bounds on the length of proofs in Frege proof systems.

9 Conclusions

In conclusion, we have shown that

𝒩𝒫co𝒩𝒫,\mathcal{NP}\neq{\rm co}\mathcal{NP},

thus resolving a very important and long-standing conjecture in computational complexity theory. Specifically, we first showed that the set of all co𝒩𝒫{\rm co}\mathcal{NP}-machines is enumerable. Then, we constructed a universal nondeterministic Turing machine DD to simulate all co𝒩𝒫{\rm co}\mathcal{NP}-machines and to do the lazy-diagonalization operation to create a language LdL_{d} which is not in co𝒩𝒫{\rm co}\mathcal{NP}. Next, we applied the technique developed in the author’s recent work [Lin21] to show that the language LdL_{d} accepted by the universal nondeterministic Turing machine DD is in fact in 𝒩𝒫\mathcal{NP}, thus establishing the desired main conclusion of this paper.

We stress again that one of the important techniques used to prove lower bounds (i.e., Theorem 4.1) is the so-called lazy-diagonalization [FS07, AB09], and the essential technique used to show the upper bounds result, i.e., the result that the language Ld𝒩𝒫L_{d}\in\mathcal{NP}, is essentially the same as the one developed in the author’s recent work [Lin21].

We have broken the so-called Relativization Barrier. By the result of [BGS75], it is not hard to show that there exists some oracle AA such that

𝒩𝒫A=co𝒩𝒫A,\mathcal{NP}^{A}={\rm co}\mathcal{NP}^{A},

which suggests that (in the light of the point of view given in [BGS75]) ordinary diagonalization techniques (lazy-diagonalization) are not capable of proving 𝒩𝒫co𝒩𝒫\mathcal{NP}\neq{\rm co}\mathcal{NP}; see [BGS75]. We explored the mystery behind this phenomenon by proving that if 𝒩𝒫A=co𝒩𝒫A\mathcal{NP}^{A}={\rm co}\mathcal{NP}^{A} and under some rational assumptions, then the set of all co𝒩𝒫{\rm co}\mathcal{NP}-machines with oracle AA is in fact not enumerable, thus showing that the ordinary diagonalization techniques (i.e., lazy-diagonalization) will generally not apply to the relativized versions of the 𝒩𝒫\mathcal{NP} versus co𝒩𝒫{\rm co}\mathcal{NP} problem.

As a by-product, the important result that 𝒫𝒩𝒫\mathcal{P}\neq\mathcal{NP} which is shown in [Lin21], also follows. The above outcome can be seen as, in a sense, the direct or indirect consequence of the proof procedures of

𝒩𝒫co𝒩𝒫.\mathcal{NP}\neq{\rm co}\mathcal{NP}.

Note again that by Corollary 3 we in fact have shown 𝒫co𝒩𝒫\mathcal{P}\neq{\rm co}\mathcal{NP} because if 𝒫=co𝒩𝒫\mathcal{P}={\rm co}\mathcal{NP}, then 𝒫=co𝒫=𝒩𝒫\mathcal{P}={\rm co}\mathcal{P}=\mathcal{NP}, a contradiction to Corollary 3. As a matter of fact and interestingly, we have also shown by Theorem 6 that there exists a co𝒩𝒫{\rm co}\mathcal{NP}-intermediate language in co𝒩𝒫{\rm co}\mathcal{NP}, which is not in 𝒫\mathcal{P} nor co𝒩𝒫{\rm co}\mathcal{NP}-complete, i.e., the complexity class co𝒩𝒫{\rm co}\mathcal{NP} also has a rich structure. We have also obtained other interesting consequences, such as Corollary 4 (i.e., 𝒩𝒳𝒫co𝒩𝒳𝒫\mathcal{NEXP}\neq{\rm co}\mathcal{NEXP}), Corollary 5 (i.e., 𝒫𝒫𝒩𝒳𝒫\mathcal{BPP}\neq\mathcal{NEXP}) and Corollary 7, which states that there is no super proof system.

As mentioned in the introduction section, the problem of 𝒩𝒫\mathcal{NP} versus co𝒩𝒫{\rm co}\mathcal{NP} closely connects with the field of proof complexity. In particular, we have shown that no Frege proof system is polynomially bounded (Theorem 8), thus answering an open problem in [Pud08]. As pointed out in Remark 8.2, we go in the opposite direction of Cook’s program. That is, we first prove that 𝒩𝒫co𝒩𝒫\mathcal{NP}\neq{\rm co}\mathcal{NP}, and then apply it to prove the lower bounds on the length of proofs in Frege proof systems. Note that one of the key points in showing Theorem 8 is that nondeterministic Turing machines can simulate Frege proofs (see [Bus99]). But we do not know whether nondeterministic Turing machines can simulate Extended Frege proofs. For more details about Extended Frege Systems, please consult the references [CR79, CN10, Kra95].

Lastly, there are many other interesting open problems in the field of proof complexity that we did not touch on in this paper, and the interested reader is referred to these references [Bus12, CR74, CR79, CN10, Kra95, Kra19, Pud08, Raz03, Raz15] for those excellent open problems.

References

  • [A1] Anonymous authors. Computational complexity theory. Wikipedia, the free encyclopedia (August, 2024). Available at /wiki/Computational complexity theory.
  • [A2] Anonymous authors. NP (complexity). Wikipedia, the free encyclopedia (August, 2024). Available at /wiki/NP (complexity).
  • [A3] Anonymous authors. Co-NP. Wikipedia, the free encyclopedia (August, 2024). Available at /wiki/Co-NP.
  • [A4] Anonymous authors. Proof theory. Wikipedia, the free encyclopedia (August, 2024). Available at /wiki/Proof theory.
  • [A5] Anonymous authors. Proof complexity. Wikipedia, the free encyclopedia (August, 2024). Available at /wiki/Proof complexity.
  • [AB09] Sanjeev Arora and Boaz Barak. Computational Complexity: A Modern Approach. Cambridge University Press, 2009.
  • [AHU74] Alfred V. Aho, John E. Hopcroft and Jeffrey D. Ullman. The Design and Analysis of Computer Algorithms. Addison–Wesley Publishing Company, Reading, California, 1974.
  • [BGS75] Theodore Baker, John Gill, and Robert Solovay. Relativizations of The P=?NPP=?NP Question. SIAM Journal on Computing, Vol. 4, No. 4, December 1975, pp. 431–442. https://doi.org/10.1137/0204037.
  • [BG98] M. L. Bonet and N. Galesi. Linear Lower Bounds and Simulations in Frege Systems with Substitutions. In: M. Nielsen, W. Thomas (eds) Computer Science Logic. CSL 1997. Lecture Notes in Computer Science, vol. 1414, Springer, Berlin, Heidelberg. pp. 115–128, 1998.
  • [BH19] O. Beyersdorff and L. Hinde. Characterising tree-like Frege proofs for QBF. Information and Computation 268 (2019) 104429. https://doi.org/10.1016/j.ic.2019.05.002.
  • [BP01] P. Beame and T. Pitassi. Propositional proof complexity: past, present, and future. In: G. Paun, G. Rozenberg, A. Salomaa (Eds.), Current Trends in Theoretical Computer Science: Entering the 2121st Century, World Scientific Publishing, 2001, pp. 42–70.
  • [Bus99] Samuel R. Buss. Propositional Proof Complexity: An Introduction. In: Computational Logic, edited by U. Berger and H. Schwichtenberg, Springer–Verlag, Berlin, 1999, pp. 127–178.
  • [Bus12] Samuel R. Buss. Towards NPPNP-P via proof complexity and search. Annals of Pure and Applied Logic 163 (2012) 906–917. https://doi.org/10.1016/j.apal.2011.09.009.
  • [Coo71] Stephen A. Cook. The complexity of theorem-proving procedures. In: Proceedings of the Third Annual ACM Symposium on Theory of Computing, May 1971, Pages 151–158. https://doi.org/10.1145/800157.805047.
  • [CR74] Stephen A. Cook and Robert A. Reckhow. On the lengths of proof in propositional calculus, preliminary version. In: Proceedings of the 66th Annual ACM Symposium on the Theory of Computing, 1974, pp. 135–148. https://doi.org/10.1145/800119.803893.
  • [CR79] Stephen A. Cook and Robert A. Reckhow. The Relative Efficiency of Propositional Proof Systems. The Journal of Symbolic Logic, Volume 44, Number 1, March 1979, pp. 36–50. https://doi.org/10.2307/2273702.
  • [Coo00] Stephen A. Cook. The P versus NP problem. April, 2000. Available at PvsNP.ps.
  • [CN10] Stephen A. Cook and Phuong Nguyen. Logical Foundations of Proof Complexity. Association for Symbolic Logic, Cambridge University Press, 2010.
  • [For00] Lance Fortnow. Diagonalization. Bulletin of the European Association for Theoretical Computer Science 71: 102–113 (2000).
  • [For11] Lance Fortnow. A New Proof of the Nondeterministic Time Hierarchy. Computational Complexity Blog, 2011. Available at this https URL.
  • [FS07] Lance Fortnow and Rahul Santhanam. Time Hierarchies: A Survey. Electronic Colloquium on Computational Complexity, Report No. 4 (2007).
  • [FSTW21] Michael A. Forbes, Amir Shpilka, Iddo Tzameret and Avi Wigderson. Proof Complexity Lower Bounds from Algebraic Circuit Complexity. Theory of Computing, Volume 17 (10), 2021, pp. 1–88. https://doi.org/10.4086/toc.2021.v017a010.
  • [Jer05] E. Jerabek. Weak Pigeonhole Principle, and Randomized Computation. Ph.D. thesis, Faculty of Mathematics and Physics, Charles University, Prague, 2005.
  • [HS65] J. Hartmanis and R. Stearns. On the computational complexity of algorithms. Transactions of the American Mathematical Society, 117: 285–306, 1965.
  • [HCCRR93] Juris Hartmanis, Richard Chang, Suresh Chari, Desh Ranjan, and Pankaj Rohatgi. Relativization: a Revisionistic Retrospective. Current Trends in Theoretical Computer Science, 1993, pp. 537–547. https://doi.org/10.1142/9789812794499_0040.
  • [Hop84] John. E. Hopcroft. Turing machines. Scientific American, May 1984, pp. 86–98.
  • [Kar72] Richard M. Karp. Reducibility among combinatorial problems. In: Miller R. E., Thatcher J. W., Bohlinger J. D. (eds) Complexity of Computer Computations., Plenum Press, New York, 1972, 85–103. https://doi.org/10.1007/978-1-4684-2001-2_9.
  • [Kra95] J. Krajicek. Bounded Arithmetic, Propositional Logic, and Complexity Theory. Cambridge, 1995.
  • [Kra19] J. Krajicek. Proof complexity. Encyclopedia of Mathematics and Its Applications, Vol. 170, Cambridge University Press, 2019.
  • [Lad75] Richard E. Ladner. On the Structure of Polynomial Time Reducibility. Journal of the ACM, Vol. 22, No. 1, January 1975, pp. 155-171. https://doi.org/10.1145/321864.321877.
  • [Lev73] Leonid A. Levin. Universal search problems (in Russian). Problemy Peredachi Informatsii 9 (1973), 265–266. English translation in B. A. Trakhtenbrot, A survey of Russian approaches to Perebor (brute-force search) algorithms, Annals of the History of Computing 6 (1984), 384–400.
  • [Lin21] Tianrong Lin. Diagonalization of Polynomial–Time Deterministic Turing Machines via Nondeterministic Turing Machines. CoRR abs/2110.06211 (2021). Available at /abs/2110.06211
  • [MS72] Albert R. Meyer and Larry J. Stockmeyer. The equivalence problem for regular expressions with squaring requires exponential space. In: IEEE 13th Annual Symposium on Switching and Automata Theory (swat 1972), pp. 125–129. https://doi.org/10.1109/SWAT.1972.29.
  • [Pap94] Christos H. Papadimitriou. Computational Complexity. Addison–Wesley, 1994.
  • [Pud08] Pavel Pudlák. Twelve Problems in Proof Complexity. In: E.A. Hirsch et al. (Eds.): CSR 2008. Lecture Notes in Computer Science, vol. 5010, Springer, Berlin, Heidelberg, pp. 13–27, 2008.
  • [Raz03] Alexander A. Razborov. Propositional Proof Complexity. Journal of the ACM, Vol. 50, No. 1, January 2003, pp. 80–82. https://doi.org/10.1145/602382.602406.
  • [Raz04] Alexander A. Razborov. Feasible proofs and computations: partnership and fusion. In: J. Diaz, J. Karhumaki, A. Lepisto, D. Sannella (eds) Automata, Langugages and Programming. ICALP 2004. Lecture Notes in Computer Science, vol. 3142, Springer, Berlin, Heidelberg, 2004, pp. 8–14.
  • [Raz15] Alexander A. Razborov. Pseudorandom generators hard for kk-DNF resolution and polynomial calculus resolution. Annals of Mathematics 181 (2015), 415–472. https://doi.org/10.4007/annals.2015.181.2.1
  • [Rec76] Robert A. Reckhow. On the lengths of proofs in the propositional calculus. Ph.D. Thesis, Department of Computer Science, University of Toronto, 1976.
  • [Sto77] Larry J. Stockmeyer. The polynomial-time hierarchy. Theoretical Computer Science 3 (1977) 1–22. https://doi.org/10.1016/0304-3975(76)90061-X.
  • [Tur37] Alan M. Turing. On computable numbers with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, Volume s2-42, Issue 1, 1937, Pages 230–265. https://doi.org/10.1016/0066-4138(60)90045-8
  • [Wig07] Avi Wigderson. PP, NPNP and Mathematics–a computational complexity perspective. Proceedings of the ICM 06, vol. 1, EMS Publishing House, Zurich, pp. 665–712, 2007.
Tianrong Lin
National Hakka University, China