Automata Equipped with Auxiliary Data Structures and Regular Realizability Problems
Abstract
We consider general computational models: one-way and two-way finite automata, and logarithmic space Turing machines, all equipped with an auxiliary data structure (ADS). The definition of an ADS is based on the language of protocols of work with the ADS. We describe the connection of automata-based models with ‘‘Balloon automata’’ that are another general formalization of automata equipped with an ADS presented by Hopcroft and Ullman in 1967. This definition establishes the connection between the non-emptiness problem for one-way automata with ADS, languages recognizable by nondeterministic log-space Turing machines equipped with the same ADS, and a regular realizability problem (NRR) for the language of ADS’ protocols. The NRR problem is to verify whether the regular language on the input has a non-empty intersection with the language of protocols. The computational complexity of these problems (and languages) is the same up to log-space reductions.
Keywords: Finite automata; Balloon automata; Auxiliary data structures
1 Introduction
Many computational models are derived from (one-way) finite automata (FAs) via equipping them with an auxiliary data structure (ADS). The best-known model of this kind is pushdown automata (PDAs), the deterministic version of which is widely used in compilers. Other examples are -counter automata, -reversal-bounded counter automata (equipped with counters each of which can switch between increasing and decreasing modes at most times), stack automata, nested stack automata, bag automata [5], set automata (SAs) [8] and their another variant [9]; more examples can be found in [7].
During the investigation of balloon automata (BAs) [7], Hopcroft and Ullman connected the decidability of the membership and the emptiness problems for one-way and two-way models; we denote them as and respectively, where denotes one-way and denotes two-way models, and stands for determinism or nondeterminism respectively. Eq. (1) summarizes results on decidability questions from [7], where is a Turing-reduction and means that and .
(1) |
We remark that the relation was proved for the case of at least a two-letter input alphabet.
While a lot of models can be described as BA, it is hard to invent such a model with good computational properties. One of the reasons is that the equipment of finite automata with a complex data structure (or with several simple data structures) often leads to a universal computational model. For example, FAs equipped with two pushdown stores are equivalent to Turing machines (TMs), as well as FAs equipped with two non-restricted counters.
In this paper, we investigate the computational power of FAs equipped with an ADS. We describe the model using the language of correct protocols of work with the ADS. We provide a general approach to analyze the complexity of the emptiness problem and prove the following non-trivial result. If FAs are equipped with an ADS and nondeterministic logarithmic space TMs (s, see the definition in [16]) are equipped with the same ADS, then the FAs’ non-emptines problem and the TMs-recognizable languages are of the same complexity (up to log-space reductions). Our key tool is the regular realizability problem (see Definition 1 below).
1.1 Our Contribution
BAs were initially defined as automata with access to additional storage of unspecified structure—the balloon. A rather general axioms were imposed for the balloon and the interaction of the balloon and the automaton (see Definition 4 below). In this paper, we propose another definition based on a language of the ADS’ protocols that we denote as , so we refer to the ADS as . We prove that languages recognizable by form not just a rational cone as in the case of [7], but a principal rational cone generated by (we provide the definition in Section 2.2).
This reformulation guarantees good structural properties, some of them follow from the connection with BA (Section 4), and provides the relation between and the nondeterministic regular realizability problem.
Definition 1.
Fix a formal language called a filter, the parameter of regular realizability problems and that are the problems of verifying non-emptiness of the intersection of the filter with a regular language described via the DFA or NFA respectively. Formally,
RR problems have independently been studied under the name regular intersection emptiness problems [21, 22]. A restricted version of RR problem (for context-free filters only) is a well-known CFL-reachability problem, which is related to problems in interprocedural program analysis [4, 6, 23, 10, 11, 3].
In this paper we focus on the computational complexity, so we use the weakest reduction suitable for our needs, the deterministic log-space reduction that we denote as . If and we write and say that and are log-space equivalent. Note that in our constructions, emptiness and membership problems are the sets of instances’ descriptions with positive answers, i.e., , , where is a and is the description of . So, . We prove that . Based on this result, we establish computational universality of (see Theorem 34 below). Note that in the universality result we need Turing reductions in polynomial time instead of log-space reductions.
We equip with ADS not only FAs but also s. We denote deterministic and nondeterministic s equipped with an ADS as and respectively. We prove that
(2) |
hereinafter is the class of languages recognizable by the model. If is a problem (formal language) and is a set of problems (class of formal languages) the reductions mean as follows. means that and means that ; means .
It is easy to verify that in the original proofs in [7], Turing reductions in (1) can be replaced by the log-space reductions provided we replace the emptiness problems with non-emptiness ones. So, we obtain
(3) |
We also prove the reduction
(4) |
Results (3) combined with known facts imply assertions (5-8), where S is the set data structure as in SA, S1 is the set data structure that supports the insertion of at most one word, that cannot be removed further but can be tested if a query-word in the set. In S1,|Γ|=1 the word in the set is over an unary alphabet, and are subclasses of complete languages.
(5) | ||||
(6) | ||||
(7) | ||||
(8) |
Assertion (5) is a well-known fact. Our technique here just shows a new connection: (5) directly follows from the fact that the emptiness problem for PDA is -complete. Assertions (6-8) are new results to the best of our knowledge, we prove them in Section 6. Assertions (7-8) lead to (3) for the corresponding classes of automata. For (6), we have already obtained the result in [14] in the same way and present in this paper the generalized technique.
2 Definitions
2.1 Notation on binary relations
We associate with a binary relation the corresponding mappings and that are denoted by the same letter , so and . A relation is the composition of the relations and if ; we denote the composition as . In the case of a set we treat as a binary relation in the composition that returns the set . We denote the reflexive and transitive closure of by ; the symbol can also be placed above the relation, e.g., . We denote by the inverse relation, i.e., .
2.2 Rational Transductions
Our technique is based on the connection of NRR problems with rational cones. We recall the definitions borrowing them from the book [2]. A finite state transducer (FST) is a nondeterministic finite automaton with an output tape, and DFST is the deterministic version of FST. For the deterministic version, it is important that a transducer can write a word (but not only a single symbol) on the output tape on processing a letter from the input tape. Let be an FST; we also denote by the corresponding relation, i.e., if there exists a run of on the input from the initial state to a final state such that at the end of the run the word is written on the output tape. The rational dominance relation holds if there exists an FST such that , here and are languages. The relations computable by FSTs are known as rational relations. The following lemmata are algorithmic versions of well-known facts (see [2], Chapter III), the first one is the algorithmic version of the Elgot-Mezei theorem. The log-space algorithms follow from straight-forward constructions.
Lemma 2.
For FSTs and such that , , and FA such that , there exists an FST such that , and NFA recognizing the language . So, the relation is transitive. Moreover, and are constructible in logarithmic space. We denote FST and NFA as and respectively.
Lemma 3.
For each FST there exists an FST that computes the inverse relation of the relation . FST is log-space constructible by FST .
A rational cone is a family of languages that is closed under the rational dominance relation: and imply . If there exists a language such that for any , then is a principal rational cone generated by ; we denote it as .
Rational transductions for context-free languages were thoroughly investigated in the 1970s, particularly by the French school. The main results of this research were published in Berstel’s book [2]. As described in [2], it follows from the Chomsky-Schützenberger theorem that is a principal rational cone: , where is the Dyck language on two types of brackets.
2.3 Computational Models
Firstly, we define BA. We provide the definition that is equivalent to the original definition from [7] but has technical differences, for the sake of convenience. Then we provide the definitions of other models: the refined definition of Balloon automata in terms of protocols and computational models based on that are connected with -problem as well as with .
As it said, the balloon is a storage medium of unspecified structure. Thus its states are represented by (a subset of) positive integers. A BA can get limited information about the state of the balloon (the balloon information function in the definition below) and can modify the states of the balloon (the balloon control function). Here we need 1BAs only. So we give the definition for them. The definitions for 2BAs are similar, they are provided in [7].
Definition 4.
A -way balloon automaton (1BA) is defined by a tuple
-
•
is the finite set of automaton states.
-
•
, where is the finite input alphabet and are the endmarkers. The input has the form .
-
•
is the set of the balloon states.
-
•
is the finite set of the balloon information states.
-
•
is a total computable function (balloon information function).
-
•
is a partially computable function from to (balloon control function).
-
•
is the set of the final states.
-
•
is the initial state.
-
•
is the transition relation (a partial function for deterministic automata) defined as ; hereinafter for any alphabet .
Definition 5.
A configuration of a is a triple , where is the unprocessed part of the input so is either or a suffix of . The initial configuration of is , a move of is defined by the relation on configurations as follows: , where if . A accepts the input if there exists a sequence of moves (computational path) such that after processing of the final state is reached, i.e., , where .
It is not easy to define classes of balloon automata (like PDAs or SAs) since one needs to define valid families of functions and . One can see an example of PDAs definition in terms of BA in [7]. We suggest another approach for the definition of BA classes in Section 4. The approach simplifies the definitions since it is only needed to define a language of correct protocols to define an ADS.
We define a protocol as a sequence of triples of the query-word , the query and the response on the query. Numerous extra conditions are listed in the following formal definition.
Definition 6.
Let be finite disjoint alphabets such that . Let be a relation that provides the correspondence between queries and possible responses. A protocol is a word such that , where , , , , , and . We call a word a query block. We say that a language is a language of correct protocols if the axioms (i-v) hold:
-
(i)
;
-
(ii)
is a protocol;
-
(iii)
if and is a protocol, then ;
-
(iv)
;
-
(v)
if and , then ;
-
(vi)
.
Axiom (vi) does not hold in the general case, e.g., for SAs and counter automata without zero tests. It is needed to describe the connection of automata with an ADS with BAs in Section 4.
A language of correct protocols generates the corresponding class of languages, the principal rational cone . All examples of BAs languages classes in [7] can be presented as . We provide here only two examples.
Example 7.
It is well-known [2] that , where is the Dyck language with two types of parentheses. It is also well-known that a Dyck word is a protocol of the stack. We transform the language into a language of protocols as follows.
We define the alphabets , , , . To define correct protocols we use an FST that erases all symbols from of the input. So,
By the definition , so we have that . It is also easy to show that , so .
Note that we set here for the sake of simplicity. One can use another variant: , , . ∎
The following example is a starting point for the generalization presented in this paper.
Example 8.
The data structure Set consists of the set which is initially empty. Set supports the following operations: , , . We define the protocol language SA-PROT consistently with [13, 14], so the elements of alphabets below are individual symbols while they are words in [13, 14]. , .
It was proved in [13] that . ∎
Definition 9.
Fix a language of correct protocols . An automaton equipped with auxiliary data structure (defined by ) is defined by a tuple
-
•
, , , are the same as in Definition 4, so as .
-
•
, .
-
•
.
-
•
is the transition relation defined as
The automaton has a one-way write-only query tape. During the processing of the input, it writes query-words on the query tape, performs queries , and receives responses such that . After each query, the query tape is erased.
A configuration of an ADS-automaton is a tuple
where is the unprocessed part of the input , i.e., is the suffix of , is the content of the work tape, and is the protocol of the automaton operating with the data structure. A move of an automaton is defined via the relation on configurations which is defined as follows:
(9) | |||||
(10) |
A configuration is initial if it has the form , a configuration is accepting if it has the form , where . A word is accepted by an automaton with ADS if . An automaton is deterministic if for all configurations from and follows .
For the next two models, we provide the definitions on the implementation level only.
Definition 10.
A () is a deterministic (nondeterministic) equipped with an ADS defined by the language of correct protocols . I.e., is equipped with an additional write-only one-way query tape that is used to write down a query word and perform a query. After a query is performed, the tape is erased and the finite state control of receives the result of the query . The query results are consistent with , i.e., , .
A configuration of is a triple where is the configuration of -part, is the word written on the query tape, and is the protocol that is the result of all the performed queries. A accepts a word if , where is the initial configuration of the -part of , is the accepting configuration of -part of , and , the relation corresponds to the ’s moves.
Definition 11.
Let be an arbitrary formal language (filter). A () is a deterministic (non-deterministic) log-space TM equipped with a read-only one-way infinite tape called advice tape. At the beginning of the computation, the advice tape contains a word , where and is a symbol that indicates empty cells.
A configuration of an is a pair where is the configuration of the -part of , is the unprocessed part of . accepts a word if there exists such that , where is the initial configuration of the -part of , is the accepting configuration of the -part of .
An equivalent model to s appeared in [17] and its journal version [19] under the name ‘‘models of generalized nondeterminism (GNA)’’ and lead to the appearance of the problem. In this paper we repeat the steps of [17, 19] to establish the connection between and problem in Section 5 to prove one of the main results of the paper Eq. (2). We also show the equivalence between s and GNA. The difference is that in GNA it is allowed not to process the advice till the end of the word, so it was demanded for to be a prefix-closed language in [17, 19].
3 Principal Rational Cones and
the NRR-Problem
In this section, we provide the core of our technique. We prove that is a principal rational cone generated by the language of correct protocols , i.e., ; it is the first main result of the section. This fact yields structural results about the family , as well as the results on the complexity of the emptiness problem. We focus in this section on the connection between the non-emptiness problem and the problem. We prove that these problems are equivalent under log-space reductions, it is the second main result of the section. It leads us to the main results of the paper in Section 5. We provide in this section structural results that naturally arise in the proofs. Other structural results are discussed in Section 4 since their relation to [7].
Most of the results of this section directly generalize the results of [13, Section 3] (see the full journal version [15]). In most cases, to get a generalized result, one can substitute SA protocols (see Example 8) with general protocols as defined in Definition 6. So, our general approach comes from the generalization of the technique that was developed for SAs. One can also find in [15] more technically detailed proofs.
Lemma 12.
There exists a recognizing .
Proof.
Let us assume that has on the input the word of the form , where (since it is a regular condition). writes a word on the query tape, performs the query and tests that the responce is . If all tests are correct than is accepted; otherwise, it is rejected. ∎
Lemma 13.
For each language of correct protocols there exists a language of correct protocols , provided , such that the following properties hold
-
•
,
-
•
,
-
•
There exists a DFST such that and is a DFST,
-
•
For each there exists an equivalent such that is log-space constructible by and vice versa ().
Proof.
Enumerate all letters from and encode the -th letter as . Such encoding is computable by a DFST and the inverse encoding is computed by that is a DFST as well. So, (we assume that preserves letters from ). Now we show that for each there exists an equivalent .
By our construction, simulates , i. e., has states of the form where is a state of and is an auxiliary information needed for simulation; so for each configuration of there is a corresponding configuration of , where and is a prefix of . computes by simulation of via finite control and information of this simulation is stored in ; the other part of finite control simulates ’s transitions.
Each can be simulated by a in the same way, one shall use instead of . Note that described simulations preserve determinism and the transformations between and are log-space computable. ∎
Lemma 14.
Let be an FST with the input alphabet and the output alphabet and be a over the alphabet . There exists a recognizing the language . If is a DFST and is a then is a as well.
Proof.
The simulation is performed in a straight-forward way. guesses an image of the input word such that if , computes by simulation of and simulates on the input . has configurations of the form that correspond to configurations of . As in the proof of Lemma 13, the information is used to simulate via finite state control. The construction preserves determinism of if is a DFST. ∎
Lemma 15.
Let be a . There exists an FST such that iff . Moreover, iff has a run on such that .
We denote by the move of from the state to the state on which it reads from the input tape and writes on the output tape.
Proof.
Definition 16.
An FST from Lemma 15 called extractor (of protocols).
Theorem 17.
.
Proof.
Theorem 18.
.
Proof.
Theorem 19.
.
Proof.
We construct a DFA on the input of by on the input of via a log-space transducer. The idea is that simulates ’s run on the input and checks the correctness of the protocol by reading the input word, that is a protocol . The protocol is accepted iff is the protocol of on processing of and accepts .
A state of is a tuple where , is the index of the letter over the ’s head and is the auxiliary information needed for the simulation. To simulate a transition of , the following actions are performed by . If writes a word (a subword of the future query word) to the query tape, stores in the finite memory (a part of component of its states) and checks whether the unprocessed part of its input begins with (if not, the input word is rejected). If performs a query , verifies that the unprocessed part of its input begins with and performs the transition that does after receiving as a response. accepts the input if it was not rejected during the simulation, (i.e., ’s head is over ) and is in accepting state.
It follows from the construction that accepts iff is the protocol of processing the input . Note that this protocol is unique since is a . Also since is , is log-space constructible. Finally, , so . ∎
4 Connection with Balloon Automata
We provide a high-level description of classes of BAs. The definition in a more formal style could be found in [7].
Definition 20.
A subset of BAs is a class of BAs if the following conditions hold.
-
(I)
contains all automata with such that, for each state , is either for all or for all and some constant .
-
(II)
If , , are the corresponding functions of and , then includes each automaton such that and are the functions that are obtained from the functions of and via finite control, i.e., for each state equals to either or for all , for each if then either or .
Property (I) implies that contains automata that can reset any state of the balloon to the initial state (or to some fixed state as well). Together with Property (II) it implies that the balloon has a reset operation that sets the balloon’s state to the initial state. This property does not hold for SAs, so there is no direct correspondence between classes of languages of BAs and automata with an ADS in the general case.
Theorem 21.
Proof.
We begin with the construction of the balloon and the bijection from to such that form the set satisfying Property (II). A state of the balloon is an integer that is the encoding of pairs of words , where is the current protocol, i.e., the protocol of all previous operations before the upcoming move, and is the word on the query-tape. We enumerate all and all and use the standard enumeration of pairs of integers.
Firstly, we define functions and for the BA recognizing . Recall that for any language of correct protocols there exists recognizing by Lemma 12. The function simulates write operations and queries: it just updates the ballon’s state according to the encoding. The function returns responses or if there were no query. For an arbitrary the is constructed as follows. The function is the same as for for any . The function is a modification of the function for according to the finite state control of . We define the class more formally below to show that Property (II) holds.
As the result, the states of the ballon just encode the part of that describes the data structures, and and simulate the work with the data structure defined by . So we provided the bijection between and .
We move to a formal definition of the class . At first, we define satisfying Property (II). Assume that writes at most one letter to the query tape per move and it also has a state in which it neither writes nor performs query. So, for all (hereinafter in ). We mark a state as if writes on the output tape and mark a state as if performs query . From definition follows that for all states and marked by the same mark (, or ). So we define a function so that . So for any the function defined as follows. The states of are marked by symbols from and . It is easy to see that from our definition of follows the bijection between and and Property (II) holds as well.
If has the reset operation (vi), then we shall modify the interpretation of since does not satisfy our definition anymore. Firstly we describe the interpretation of functions for from Property (I). If , and encode pairs and respectively and is not obtained from by a single move of , then we interpretate the state change as follows. The corresponding to performs the reset operation and then performs sequence of queries that move the configuration from to during moves. Note that by the definition of Property (I) has finitely many ’s in the range of so is well-defined. Denote functions for automata from Property (I) as . So is a closure of and in terms of Property (II). From Property (II) and our construction of ’s for Property (I) follows the construction of for any of from the closure in terms of Property (II). So, we have proved the second part of the theorem. ∎
So all the results from [7] that do not rely on (I) hold for -automata. We are most interested in (1) and its complexity analogue (3). Many structural results from [7] follow from the fact that is a principal cone (Theorem 17), namely, closure of over union and rational transductions111Intersection and quotient with regular languages, gsm forward mapping are the partial cases of rational transductions.. We shall also mention the closure over gsm inverse mappings proved in [7] for all that implies the same closure for all .
Lemma 22.
If contains the reset operation then is closed over concatenation and iteration.
Proof.
We construct ’s and by recognizing and respectively in a straight-forward way. simulates and nondeterministically guess the split of the input such that at the end of the . If after processing of , is in an accepting state, performs the reset operation and simulates on . guesses the split of the input into and acts in the similar way. ∎
The standard technique from [2] implies the following lemma.
Lemma 23.
If , , then is closed over concatenation. If , , then is closed over iteration.
Proof idea.
The construction is similar to the one from the proof of Lemma 22. FSTs and uses marks to split the input and simulate an FST corresponding to the language , i.e., . ∎
Remark 24.
We leave open the question of the reduction in the opposite direction. I.e., does for each class of BAs exist a language of correct protocols such that BAs recognize the same class of languages as automata? The essence of the problem is as follows. If axioms (I-II) for the class of BAs are satisfied, does it imply that there exists a ‘‘universal’’ BA such that and for each there exists an FST such that ?
5 RR Problems and Models
5.1 models
In [17, 19] it was shown that is a complete problem in the class of languages recognizable by GNA with advices from (this model corresponds to , is the set of all prefixes of ). We prove that and are complete problems in the class . We begin with the proof of similar result for and . We introduce the following auxiliary lemma for the sake of the proof.
Lemma 25 ([12]).
.
Lemma 26.
.
Proof.
An takes on the input a word and also takes on the advice tape. , where if halts in an accepting state, if halts in a rejecting state.
A surface-configuration of a is a tuple of the state , log-space memory configuration and the positions of the head on the input tape and of the head on the advice (one-way) tape. The tuples are the states of finite automata on the input of problem constructed by and . The initial state is , where is the initial state of the , accepting states are states of the form where is an accepting state of the . The transitions between the states are determined by letters of , i.e. , , if . The list of the ’s transitions is log-space computable so as the set of the ’s states as well.
Without loss of generality, we assume that always processes till the end, i.e. till mits on the advice tape. So, by the construction of , we obtain
where is constructed by due to the reduction in Lemma 25. Since is fixed, we get that
and we obtain that by the transitivity of the log-space reductions. ∎
Remark 27.
Note that so the problems for these filters are equivalent (up to reductions). The equivalence holds since nondeterministic FST’s can have several images for the same word, particularly write many ’s at the end of the word. It does not hold for deterministic FSTs, so . So to obtain the corresponding lemma for we need to modify the proof of Lemma 26.
Lemma 28.
.
Proof.
We repeat the steps of the proof of Lemma 26. Note that is a DFA, since is a deterministic machine. To construct we construct an auxiliary DFA by as follows. has the states and for each state of . The auxiliary bit of a state indicates whether met . So for each transition of there are two corresponding transitions , . For states has only transitions by . For the transitions , , has transitions . A state is an accepting if is an accepting state of .
Now we construct . It is obtained from by removing all -transitions. Each ’s accepting state is an accepting for and also has accepting states determined as follows. If has -path to an accepting state in , then is an accepting state in . It is easy to see that and is log-space computable from and is log-space computable from and . ∎
Lemma 29.
and . Moreover, there exist an that recognizes the problem and that recognizes as well.
Proof.
The proof is straightforward. gets on the input an NFA and verifies whether accepts the word written on the advice tape. If , nondeterministically guesses the ’s run on . So, by Definition 11, iff .
The construction for is the same. ∎
Theorem 30.
Proof.
By the definition of , iff there exists a transducer that maps the input of the problem to the input of the problem . We construct an recognizing via the composition of and from Lemma 29.
5.2 models
In Section 3 we exploited the following idea. In the case of a nondeterministic model, performing queries one by one and proceeding the computation depending on the queries’ results computationally equivalent to guessing all the queries results and verifying whether all the results were correct in the end (by testing whether obtained protocol was correct). In fact, this idea works even in the case of a deterministic model in Theorem 19. Now we exploit it again for -based models.
Lemma 31.
We provide only the proof idea since the proof follows our general technique that we have applied above a lot.
Proof idea.
Let be a , be a , and be an input word. Since both kinds of s are nondeterministic, can guess and verify ’s successful run on provided that ’s protocol is written on the advice tape; can guess and a successful run of on , and verify it: the transitions on configurations are simulated on log space and the fact is verified by performing subsequently the queries from the sequence .
The case of deterministic models is similar to Theorem 19. just simulates and tests whether the query words on the advice tape, queries an the results are the same as has during processing of the input. ∎
Combining all together, we obtain the main theorem of the section.
Theorem 32.
(11) | |||
(12) | |||
(13) |
6 Applications
Proof.
SA-PROT was defined in Example 8. It was proved in [14] that the problems are -complete. So we obtain (6) by applying Theorem 32. We prove (5) in the same way by combining the facts (Example 7) and is P-complete [12], and apply Lemma 25 and Theorem 32.
To prove (7-8) we use facts about the filters , where is a -letter alphabet. The problem is -complete and , , is -complete [1, 18]. We construct set-protocols based on these languages as follows. Let , , . The response to the -query is positive only for the first query, -queries are the same as in Example 8. We denote the language of correct protocols with as . It is easy to see that : an FST maps words of the form to by replacing queries and responses by ; the sequence of queries with responses is verifiable via a finite state control (the inputs with invalid sequence are rejected by the FST), so by Lemma 25.
Now we prove that . The FST takes on the input a word and acts as follows. While translating a block to the output, it has the following options: (i) change at least one letter, (ii) erase at least one letter and maybe change others, (iii) add at least one letter and maybe change others, (iv) do not change . Until has not write , it replaces by in the cases (i-iii), and either by or by in the case (iv). After wrote , it replaces by in the case (iv) and either by or by in the cases (i-iii). It is easy to see that consists of all correct protocols with either first -query or without -queries at all, and exactly queries. So and assertions (7-8) follows from Lemma 25 and Theorem 32. ∎
7 On computational complexity
of correct protocol languages
Theorem 18 essentially says that the computational complexity of the non-emptiness problem for ADS-automata is the same as the computational complexity of the NRR problem for the corresponding correct protocols languages. It can be used to answer the question about the range of complexities of the non-emptiness problems for ADS-automata. It extends the known results about the complexity of RR problems [20]. It appears that these complexities are almost universal. It means that for any nonempty language there exists a language of correct protocols such that is reducible to . The reductions in the two directions differ. In one direction it is a log-space -reduction. In the other, we present the proof only for Turing reductions in polynomial time.
Theorem 34.
For any nonempty language there exists a language of correct protocols such that
In the proof of Theorem 34 we use the language of protocols defined as follows. Set , , . The relation is defined as follows: , . The language of correct protocols consists of protocols such that, for every query block , either and , or , and , or , and . Here is a language depending on in the statement of the theorem.
The exact choice of is complicated. So we start with the presentation of basic ideas behind the proof of Theorem 34. We encode binary words using a log-space computable injective map such that and . It suffices for the first reduction in the theorem, , since the protocol is correct iff .
For the second reduction, i.e., , we need much more requirements. Let be an input automaton for and be its state set. We are going to decide in polynomial time using oracle calls of the oracle . For this purpose we reduce the question to the question for some regular language . By definition, if there exists an accepting run of that processes a correct protocol such that is obtained from by substitutions of letters and with words of and respectively (different words may be used for different occurrences of the letters). To check the correctness of the run processing the protocol, we need to compute, for all pairs , all possible transitions from to by processing words from only as well as all possible transitions from to by processing words from only.
Thus, the main part of the reduction consists of solving NRR problems and for all pairs . Here are auxiliary automata. The states and the transitions of coincide with the states and the transitions of . The initial state of is and the only accepting state is .
Note that may be infinite and it causes the first difficulty: one need to consider arbitrary long words in the protocol language. To avoid this difficulty we require that any infinite regular language intersects both and . Therefore and if is infinite.
If is finite, it means that the transition graph is DAG (after removing states that are not reachable and coreachable in ). The second difficulty: it might be exponentially many words in . Again, to overcome it, we pose specific requirements on to guarantee that that verifying and requires polynomially many oracle calls of the oracle .
Now we provide formal arguments for the above plan of proof. We encode binary words by the injective map
(22) |
where is the morphism defined on the symbols as , .
For an NFA with the state set we define a relation
that holds if can reach on processing starting from the state .
Now we list the requirements on the language .
-
1.
As it mentioned before, and .
-
2.
There exists a language such that both and are recognized in polynomial time, and, for any NFA over the alphabet and any pair of its states , , either is finite, or there exist , such that and .
-
3.
The language is sparse: . Moreover, the lists of words in and, respectively, in can be generated in polynomial time.
-
4.
If , , and then iff , where is the lexicographical order.
-
5.
If is odd and then . If and then .
The sets of -transitions and -transitions are defined as follows:
The main part of the proof of Theorem 34 is the following lemma.
Lemma 35.
Before presenting the algorithm from the lemma, we analyze the most difficult case separately.
Proposition 36.
Proof.
By solving the reachability problem, one can detect the set of reachable and coreachable states of . All other states can be deleted without affecting . From now on, we assume that all the states are reachable and coreachable.
Since is finite, the transition graph of is a DAG as well as all its subgraphs. For each pair of states , let be the set
Using topological sorting, one can construct all the sets in polynomial time by the backward induction based on the relation
where is the set of states that are reachable from in one move and .
For a positive integer and a state of , we define
We order the sets and in the lexicographical order. Let be the minimal word in , and be the maximal word in , and be the minimal word in , and be the maximal word in .
There exists an inductive procedure that computes , , , and in polynomial time. The procedure also verifies the conditions , . We describe computation of , the other words are computed similarly.
Suppose that is the prefix of of the length (if , then the procedure returns and stops). Let . This set can be computed in polynomial time. If there exists and such that and , then is a prefix of of the length . Otherwise, if there exists and such that and , then is a prefix of . If both conditions are not satisfied, then .
According to Requirement 4 on , if there exist a state and an integer such that , then . Otherwise, for all , . Since , in this case if and only if there exist a state and an integer such that and either and or due to Requirements 1, 5. The condition can be verified by an oracle call, the rest of conditions can be verified in polynomial time.
A similar check can be done for the condition . It is equivalent to the following: there exist a state and an integer such that either , or and . ∎
Proof of Lemma 35.
The algorithm maintains the sets , . Initially, . We will prove that at the end , . The algorithm analyzes states one by one and adds to the sets , according to the following rules.
In the first step, the algorithm decides whether is infinite. It can be done in polynomial time. If the answer is ‘yes’, then the algorithm adds to both sets , and continues with the next state. The correctness of this step is guaranteed by Requirement 2.
If the answer at the first step is ‘no’, the lengths of words in do not exceed (otherwise, there exists a run of from to containing a cycle, which implies that is infinite). In the second step, the algorithm checks whether and, respectively, whether . It can be done in polynomial time due to Requirement 3. If the first condition holds, then the algorithm adds to . If the second condition holds, then the algorithm adds to .
In the third step, the algorithm constructs an NFA recognizing . Let be the set of prefixes of all words in . Due to Requirement 3, and can be constructed in polynomial time. The states of are the pairs , , . The set of transitions consists of pairs such that and , and pairs such that and . The set of transitions consists of pairs such that . The initial state is . Accepting states are pairs and , where . This definition implies that can be constructed in polynomial time. To prove the correctness of the construction, note that processing a word from finishes at the state which is not accepting. For a word there exists an accepting run of . The corresponding run of finishes at a state of the form or . Thus, is accepted by .
In the fourth step, the algorithm checks whether contains a word of odd length. It can be done in polynomial time since words of odd length form a regular language and the intersection of this language with is recognized by an NFA with states, where is the state set of . If the answer is ‘yes’, then the algorithm adds to . The correctness of this step is guaranteed by Requirement 5.
In the fifth step, the algorithm constructs an NFA that accepts exactly the words of even length from , apply to it the algorithm of Proposition 36, updates , if necessary, and continues with the next state.
It is clear from the above remarks that at the end , .
Suppose that . If is infinite then is added to at the first step. If , then is added at the second step. Otherwise, should be non-empty. If there are words of odd length in , then is added at the fourth step. And, finally, if consists of words of even length only, is added at the fifth step due to Proposition 36. Therefore, at the end of the algorithm.
Suppose that . If is infinite then is added to at the first step. If , then is added at the second step. Otherwise, and is added at the fifth step due to Proposition 36. Therefore, at the end of the algorithm. ∎
Proof.
We define at first. For each triple of non-empty binary words there are two words in in the form and and for each there exists a unique triple such that either or . The definition of is inductive. Order all triples of non-empty binary words with respect to the length of and order the triples with the same length of with respect to the lexicographical order on the triples of binary words (binary words are also ordered lexicographically).
Assume that for all less than we have defined and properly. Thus the set has been already defined. The total number of less than does not exceed . Thus, there are at most words from having lengths in the range . So, there exist at least
consecutive integers in the range such that no word in has the length . At least of them are even and at least of them are odd. It guarantees that the sets
are non-empty. Set be the minimal in and be the minimal in .
To define , we require that the words from in the form are in , while the words from in the form are in . Note that it implies Requirement 2, since any infinite regular language contains all words in the form , , for some , , .
By construction, for each the length of defining triple is logarithmic in the length of . Thus . To construct the list of words in and the list of words in one need to perform only polynomial number of steps of the defining procedure and each step can be performed in polynomial time. Therefore, Requirement 3 is satisfied.
Proof of Theorem 34.
Choose as in the proof of Lemma 37. The reduction is given by a map . It is clear that the map is computed in logarithmic space. The correctness of reduction follows from Requirement 1.
Now we describe the second reduction, . Let be an input NFA for . The reducing algorithm computes all sets , using Lemma 35.
Let be an NFA with the same state set as . The alphabet of is . Transitions coincide with transitions for . For the rest of transitions, and . The initial state and the accepting states of and of coincide.
Let . Then iff . The latter condition is verified in polynomial time since is regular. ∎
Acknowledgments
This work is supported by the Russian Science Foundation grant 20–11–20203.
References
- [1] T. Anderson, J. Loftus, N. Rampersad, N. Santean, and J. Shallit. Special Issue: LATA 2008 Detecting palindromes, patterns and borders in regular languages. Information and Computation, 207(11):1096–1118, 2009.
- [2] J. Berstel. Transductions and context-free languages. Ed. Teubner, 1979.
- [3] A. Bouajjani, J. Esparza, and O. Maler. Reachability analysis of pushdown automata: Application to model-checking. In International Conference on Concurrency Theory, pages 135–150. Springer, 1997.
- [4] D. Chistikov, R. Majumdar, and P. Schepper. Subcubic certificates for cfl reachability. Proc. ACM Program. Lang., 6(POPL), jan 2022.
- [5] M. Daley, M. Eramian, and I. Mcquillan. The bag automaton: A model of nondeterministic storage. J. Autom. Lang. Comb., 13(3):185–206, June 2008.
- [6] D. Dolev, S. Even, and R. Karp. On the security of ping-pong protocols. Information and Control, 55(1):57–68, 1982.
- [7] J. E. Hopcroft and J. D. Ullman. An approach to a unified theory of automata. In SWAT 1967, pages 140–147, 1967.
- [8] M. Kutrib, A. Malcher, and M. Wendlandt. Set automata. International Journal of Foundations of Computer Science, 27(02):187–214, 2016.
- [9] K.-J. Lange and K. Reinhardt. Set automata. In Combinatorics, Complexity and Logic; Proceedings of the DMTCS’96, pages 321–329. Springer, 1996.
- [10] D. Melski and T. Reps. Interconvertibility of a class of set constraints and context-free-language reachability. Theoretical Computer Science, 248(1-2):29–98, 2000.
- [11] T. Reps, S. Horwitz, and M. Sagiv. Precise interprocedural dataflow analysis via graph reachability. In Proceedings of the 22nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’95, page 49–61, New York, NY, USA, 1995. Association for Computing Machinery.
- [12] A. Rubtsov and M. Vyalyi. Regular realizability problems and context-free languages. In DCFS 2015, volume 9118 of LNCS, pages 256–267. Springer, 2015.
- [13] A. Rubtsov and M. Vyalyi. On computational complexity of set automata. In DLT 2017, pages 332–344, Cham, 2017. Springer International Publishing.
- [14] A. Rubtsov and M. Vyalyi. On emptiness and membership problems for set automata. In CSR 2018, pages 295–307, Cham, 2018. Springer International Publishing.
- [15] A. Rubtsov and M. Vyalyi. On computational complexity of set automata. Information and Computation, to appear.
- [16] M. Sipser. Introduction to the theory of computation. Cengage Learning, 2013.
- [17] M. Vyalyi. On models of a nondeterministic computation. In CSR 2009, pages 334–345, Berlin, Heidelberg, 2009. Springer Berlin Heidelberg.
- [18] M. Vyalyi. On the models of nondeterminizm for two-way automata (in Russian). Proceedings of VIII international conference <<Discrete models in the theory of control systems>>., pages 54–60, 2009.
- [19] M. N. Vyalyi. On regular realizability problems. Probl. Inf. Transm., 47(4):342–352, 2011.
- [20] M. N. Vyalyi. On expressive power of regular realizability problems. Probl. Inf. Transm., 49(3):276–291, 2013.
- [21] P. Wolf. On the decidability of finding a positive ilp-instance in a regular set of ilp-instances. In DCFS 2019, volume 11612 of LNCS, pages 272–284. Springer, 2019.
- [22] P. Wolf and H. Fernau. Regular intersection emptiness of graph problems: Finding a needle in a haystack of graphs with the help of automata. CoRR, abs/2003.05826, 2020.
- [23] M. Yannakakis. Graph-theoretic methods in database theory. In Proceedings of the Ninth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, PODS ’90, page 230–242, New York, NY, USA, 1990. Association for Computing Machinery.