ALGORITHMS AND LOWER BOUNDS IN FINITE AUTOMATA SIZE COMPLEXITY

by

Christos Kapoutsis

Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of

Doctor of Philosophy at the Massachusetts Institute of Technology.

June 2006

Abstract

In this thesis we investigate the relative succinctness of several types of finite automata, focusing mainly on the following four basic models: one-way deterministic (1dfas), one-way nondeterministic (1nfas), two-way deterministic (2dfas), and two-way nondeterministic (2nfas). We first establish the exact values of the trade-offs for all conversions from two-way to one-way automata. Specifically, we show that the functions

$$n\bigl(n^n - (n-1)^n\bigr), \qquad \sum_{i=0}^{n-1}\sum_{j=0}^{n-1}\binom{n}{i}\binom{n}{j}\bigl(2^i - 1\bigr)^j, \qquad \binom{2n}{n+1}$$

return the exact values of the trade-offs from 2dfas to 1dfas, from 2nfas to 1dfas, and from 2dfas or 2nfas to 1nfas, respectively.

Second, we examine the question whether the trade-offs from 1nfas or 2nfas to 2dfas are polynomial or not. We prove two theorems for liveness, the complete problem for the conversion from 1nfas to 2dfas. We first focus on moles, a restricted class of 2nfas that includes the polynomially large 1nfas which solve liveness. We prove that, in contrast, 2dfa moles cannot solve liveness, irrespective of size. We then focus on sweeping 2nfas, which can change the direction of their input head only on the end-markers. We prove that all sweeping 2nfas solving the complement of liveness are of exponential size. A simple modification of this argument also proves that the trade-off from 2dfas to sweeping 2nfas is exponential.

Finally, we examine conversions between two-way automata with more than one head-like device (e.g., heads, linearly-bounded counters, pebbles). We prove that, if the automata of some type a have enough resources to (i) solve problems that no automaton of some other type b can solve, and (ii) simulate any unary 2dfa that has additional access to a linearly-bounded counter, then the trade-off from automata of type a to automata of type b admits no recursive upper bound.

Contents

Introduction
  1. The 2D vs. 2N Problem
  2. Motivation
  3. Progress
  4. Other Problems in This Thesis

Chapter 1. Exact Trade-Offs
  1. History of the Conversions
  2. Preliminaries
  3. From 2DFAs to 1DFAs
  4. From 2NFAs to 1DFAs
  5. From 2NFAs to 1NFAs
  6. Conclusion

Chapter 2. 2D versus 2N
  1. History of the Problem
  2. Restricted Information: Moles
  3. Restricted Bidirectionality: Sweeping Automata
  4. Conclusion

Chapter 3. Non-Recursive Trade-Offs
  1. Two-Way Multi-Pointer Machines
  2. Preliminaries
  3. The Main Theorem
  4. Programming Counters
  5. Proof of the Main Lemma
  6. Conclusion

End Note

Bibliography

Introduction

The main subject of this thesis is the 2d vs. 2n problem, a question on the power of nondeterminism in two-way finite automata. We start by defining it, explaining the motivation for its study, and describing our progress against it.

1. The 2D vs. 2N Problem

A two-way deterministic finite automaton (2dfa) is the machine that we get from the common one-way deterministic finite automaton (1dfa) when we allow its input head to move in both directions; equivalently, this is the machine that we get from the common single-tape deterministic Turing machine (dtm) when we do not allow its input head to write on the input tape. The nondeterministic version of a 2dfa is simply called two-way nondeterministic finite automaton (2nfa) and, as usual, is the machine that we get by allowing more than one option at each step and acceptance by any computation branch.

The 2d vs. 2n question asks whether 2nfas can be strictly more efficient than 2dfas, in the sense that there exists a problem for which the best 2dfa algorithm is significantly worse than the best 2nfa one. Of course, to complete the description of the question, we need to explain how we measure the efficiency of an algorithm on a two-way finite automaton. It is easy to check that, with respect to the length of its input, every algorithm of this kind uses zero space and at most linear time. Therefore, the time and space measures (our typical criteria for algorithmic efficiency on the full-fledged Turing machine) are of little help in this context. Instead, we focus on the size of the program that encodes this algorithm, namely the size of the transition function of the corresponding two-way finite automaton. In turn, a good measure for this size is simply the automaton’s number of states. So, the 2d vs. 2n question asks whether there is a problem that can be solved both by 2dfas and by 2nfas, but for which the smallest possible 2dfa (i.e., the 2dfa that solves it with the fewest possible states) is still significantly larger than the smallest possible 2nfa. To fully understand the question, two additional clarifications are needed.

First, it is well-known that the problems that are solvable by 2dfas are exactly those described by the regular languages, and that the same is true for 2nfas [53]. Hence, efficiency considerations aside, the two types of automata have the same power, in the same way that 1dfas or dtms have the same power as their nondeterministic counterparts. Therefore, the above reference to problems that “can be solved both by 2dfas and by 2nfas” is a reference to exactly the regular problems.

Second, although we explained that efficiency is measured with respect to the number of states, we still have not defined what it means for the number of states in one automaton to be “significantly larger” than the number of states in another one. The best way to clarify this aspect of the question is to give a specific example.

Consider the problem in which we are given a list of subsets of the set $\{0, 1, \ldots, n-1\}$ and we are asked whether this list can be broken into sublists so that the first set of every sublist contains the number of sets after it [51]. More precisely, the problem is defined on the alphabet $\Delta_n := \mathcal{P}(\{0, 1, \ldots, n-1\})$ of all sets of numbers smaller than $n$. A string $a_0 a_1 \cdots a_l$ over this alphabet is a block if its first symbol is a set that contains the number of symbols after it, that is, if $a_0 \ni l$. The problem consists in determining whether a given string over $\Delta_n$ can be written as a concatenation of blocks. For example, for $n = 8$ and for the string

{1,2,4}∅{4}{0,4}{2,4,6}{4}{4,6}∅{3,6}∅{2,4}{5,7}{0,3}{4,7}∅{4}∅{4}{0,1}{2,5,6}{1}

the answer should be “yes”, since this is the concatenation of the substrings

{1,2,4}∅{4}   {0,4}{2,4,6}{4}{4,6}∅   {3,6}∅{2,4}{5,7}   {0,3}   {4,7}∅{4}∅{4}{0,1}{2,5,6}{1}

where the first set in each substring indeed contains the number of sets after it in the same substring (namely 2, 4, 3, 0, and 7, respectively). In contrast, for the string

{1,2,7}{4}{5,6}∅{3,6}{2,4,6}

the answer should be “no”, as there is no way to break it into blocks.

Is there a 2dfa algorithm for this problem? Is there a 2nfa algorithm? Indeed, the problem is regular. The best 2nfa algorithm for it is the following, rather obvious one:

    We scan the list of sets from left to right. At the beginning of each block, we read the first set. If it is empty, we just hang (in this computation branch). Otherwise, we nondeterministically select from it the correct number of remaining sets in the block, and consume as many sets, counting from that number down. When the count reaches 0, we know the block is over and a new one is about to start. In the end, we accept if the list and our last count-down finish simultaneously.
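The count-down procedure just described can be sketched in Python by exploring every computation branch explicitly. This is only an illustration under stated assumptions: the names `step`, `some_branch_accepts`, `YES_EXAMPLE`, and `NO_EXAMPLE` are hypothetical, each input symbol is modelled as a Python set of integers, state 0 is taken to mean "a new block may start here", and the empty list is treated as accepted (an empty concatenation of blocks), which the text does not specify.

```python
def step(state, symbol):
    """Possible next counter values after reading one set.

    State 0 means a new block starts at this set: guess any count it offers.
    A positive state c means c more sets remain in the current block.
    An empty set read in state 0 yields no successor (the branch hangs).
    """
    return set(symbol) if state == 0 else {state - 1}


def some_branch_accepts(sets, state=0, pos=0):
    """Return True if at least one nondeterministic branch accepts."""
    if pos == len(sets):
        # Accept only if the list and the last count-down end together.
        return state == 0
    return any(some_branch_accepts(sets, nxt, pos + 1)
               for nxt in step(state, sets[pos]))


# The two example strings from the text (n = 8).
YES_EXAMPLE = [{1, 2, 4}, set(), {4}, {0, 4}, {2, 4, 6}, {4}, {4, 6}, set(),
               {3, 6}, set(), {2, 4}, {5, 7}, {0, 3}, {4, 7}, set(), {4},
               set(), {4}, {0, 1}, {2, 5, 6}, {1}]
NO_EXAMPLE = [{1, 2, 7}, {4}, {5, 6}, set(), {3, 6}, {2, 4, 6}]

print(some_branch_accepts(YES_EXAMPLE))  # True
print(some_branch_accepts(NO_EXAMPLE))   # False
```

The only information carried from one set to the next is the counter value, which is why one state per value of the counter suffices. The explicit branch exploration is a simulation device and can take exponential time; it says nothing about the size of the automaton.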

It is easy to see that this algorithm can be implemented on a 2nfa (which does not actually use its bidirectionality) with exactly 1 state per possible value of the counter, for a total number of $n$ states. As for a 2dfa algorithm, here is the best known one:

    We scan the list of sets from left to right. At each step, we remember all possibilities about how many more sets there are in the most recent block. (E.g., right after the first set, every number in it is a possibility.) After reading each set, we decrease each possibility by 1, to account for the set just consumed; if a possibility becomes 0, we replace it by all numbers of the next set. If at any point the set of possibilities gets empty, we just hang. Otherwise, we eventually reach the end of the list. There, we check whether 0 is among the possibilities. If so, we accept.
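The deterministic scan above can be sketched as follows. This is a minimal illustration, not the thesis's own construction: the names `deterministic_scan`, `YES_STRING`, and `NO_STRING` are hypothetical, each symbol is modelled as a Python set of integers, and the scan starts from the set of possibilities {0} (meaning "a block may start now"), an assumed convention that makes the first set behave like any other block opener and makes the empty list accepted.

```python
def deterministic_scan(sets):
    """Track every possible number of sets still owed to the current block."""
    possible = {0}                      # assumed start: a block may open now
    for symbol in sets:
        nxt = set()
        for count in possible:
            if count == 0:
                # The previous block is over, so this set opens a new block
                # and each of its numbers becomes a possibility.
                nxt |= set(symbol)
            else:
                # This set is consumed by the current block.
                nxt.add(count - 1)
        if not nxt:
            return False                # no possibility left: hang (reject)
        possible = nxt
    return 0 in possible                # some count-down ended with the list


# The two example strings from the text (n = 8).
YES_STRING = [{1, 2, 4}, set(), {4}, {0, 4}, {2, 4, 6}, {4}, {4, 6}, set(),
              {3, 6}, set(), {2, 4}, {5, 7}, {0, 3}, {4, 7}, set(), {4},
              set(), {4}, {0, 1}, {2, 5, 6}, {1}]
NO_STRING = [{1, 2, 7}, {4}, {5, 6}, set(), {3, 6}, {2, 4, 6}]

print(deterministic_scan(YES_STRING))  # True
print(deterministic_scan(NO_STRING))   # False
```

At any moment the only thing remembered is `possible`, a non-empty subset of $\{0, 1, \ldots, n-1\}$, so a finite control needs one state per such subset.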

Easily, this algorithm can be implemented on a 2dfa (which does not actually use its bidirectionality) that has exactly 1 state for each possible non-empty¹ set of possibilities, for a total number of $2^n - 1$ states. Overall, we see that the difference in size between the automata implementing the two algorithms is exponential. Such a difference we surely want to call “significant”.

At this point, after the above clarifications, we may want to restate the 2d vs. 2n question as the question whether there exists a regular problem for which the smallest possible 2dfa is still super-polynomially larger than the smallest possible 2nfa. However, a further clarification is due.

Given any fixed regular problem, the sizes of the smallest possible 2dfa and the smallest possible 2nfa for it are nothing more than just two numbers. So, asking whether their difference is polynomial or not makes little sense. What the previous example really describes is a family of regular problems $\Pi = (\Pi_n)_{n \ge 0}$, one for each natural value of $n$; then, a family of 2nfas $N = (N_n)_{n \ge 0}$ which solve these problems and whose sizes grow linearly in $n$; and, finally, a family of 2dfas $D = (D_n)_{n \ge 0}$ which also solve these problems and whose sizes grow exponentially in $n$. Therefore, our reference to “the difference in size between the automata” was really a reference to the difference in the rate of growth of the sizes of the automata in the two families. It is this difference that can be characterized as polynomial or not. Naturally, we decide to call it significant if one of the two rates can be bounded by a polynomial but the other one cannot.

So, the 2d vs. 2n question asks whether there exists a family of regular problems such that the family of the smallest 2nfas that solve them have sizes that grow polynomially in $n$, whereas the family of the smallest 2dfas that solve them have sizes that grow super-polynomially in $n$.
Equivalently, the question is whether there exists a family of regular problems that can be solved by a polynomial-size family of 2nfas but no polynomial-size family of 2dfas.

¹ Note that a 2dfa is allowed to reject by just hanging anywhere along its input. Without this freedom, the number of states required to implement our algorithm would actually be $2^n$.

With this clarification, we are ready to explain the name “2d vs. 2n”. We define 2d as the class of families of regular problems that can be solved by polynomial-size families of 2dfas, and 2n as the corresponding class for 2nfas [48]. Under these definitions, we obviously have 2d ⊆ 2n, and the question is whether 2d and 2n are actually different. Observe how the nature of the question resembles both circuit and Turing machine complexity: like circuits, we are concerned with the rate of growth of size in families of programs; unlike circuits and like Turing machines, each program in a family can work on inputs of any length.

Concluding this description of the problem, let us also remark that its formulation in terms of families is not really part of our every-day vocabulary in working on the problem. Instead, we think and speak as if $n$ were a built-in parameter of our world, so that only the $n$-th members of the three families (of problems, of 2nfas, and of 2dfas) were visible. Under this pretense, the 2d vs. 2n question asks whether there is a regular problem that can be solved by a polynomially large 2nfa but no polynomially large 2dfa, and it is redundant to mention what parameter “polynomially large” refers to. In addition, we also use “small” as a more intuitive substitute for “polynomially large”, and drop the obviously redundant characterization “regular”. So, the every-day formulation of the question is whether there exists a problem that can be solved by a small 2nfa but no small 2dfa. Throughout this thesis, we will occasionally be using this kind of talk, with the understanding that it is a substitute for its formal interpretation in terms of families.

2. Motivation

Our motivation for studying the 2d vs. 2n question comes from two distinct sources: the theory of computational complexity and the theory of descriptional complexity.
We discuss these two different contexts separately.

2.1. Computational complexity. From the perspective of computational complexity, the 2d vs. 2n question falls within the general major goal of understanding the power of nondeterminism. For certain computational models and resources, this quest has been going on for more than four decades now. Of course, the most important (and correspondingly famous) problem of this kind is p vs. np, defined on the Turing machine and for time which is bounded by some polynomial of the length of the input. The next most important question is probably l vs. nl, also defined on the Turing machine and for space which is bounded by the logarithm of some polynomial of the length of the input.

It is perhaps fair to say that our progress against the core of these problems has been slow. Although our theory is constantly being enriched
with new concepts and new connections between the already existing ones, major advances of our understanding are rather sparse. At the same time, the common conceptual origin as well as certain combinatorial similarities between these problems have led some to suspect that essentially the same elusive idea lies at the core of all problems of this kind. In particular, the suspicion goes, this idea may be independent of the specifics of the underlying computational model and resource.

In this context, a possibly advantageous approach is to focus on weak models of computation. The simple setting provided by these models allows us to work closer to the set-theoretic objects that are produced by their computations. This serves us in two ways. First, it obviously helps our arguments become cleaner and more robust. Second, it helps our reasoning become more objective, by neutralizing our often misleading algorithmic intuitions about what a machine may or may not do.

The only problem with this approach is that it may very well lead us to models of computation that are too weak to be relevant. In other words, some of these models are obviously not rich enough to involve ideas of the kind that we are looking for. We should therefore be careful to look for evidence that indeed such ideas are present. We believe that the 2d vs. 2n question passes this test, and we explain why.

2.1-I. Robustness. One of the main reasons why problems like p vs. np and l vs. nl are so attractive is their robustness. The essence of each question remains the same under many different variations of the mathematical definitions of the model and/or the resource.
It is only on such stable grounds that the theoretical framework around each question could have been erected, with the definition of the classes of problems that can be solved in each case (p, np, l, nl) and the identification of complete problems for the nondeterministic classes, which allowed for a more tangible reformulation of the original question (is satisfiability in p? is connectivity in l?). The coherence and richness of these theories further enhance our confidence that they indeed describe important high-level questions about the nature of computation, as opposed to technical low-level inquiries about the peculiarities of particular models.

The 2d vs. 2n problem is also robust. Its resolution is independent of all reasonable variations of the definition of the two-way finite automaton and/or the size measure, including changes in the conventions for accepting and rejecting, for end-marking the input, for moving the head; changes in the size of the alphabet, which may be fixed to binary; changes in the size measure, which may include the number of transitions or equal the length of some fixed binary encoding. It is on this stable ground that the classes 2d and 2n have been defined. In addition, 2n-complete (families of) problems have been identified [48], allowing more concrete forms of the
question. Overall, there is no doubt that in this case, too, our investigations are beyond technical peculiarities and into the high-level properties of computation.

2.1-II. Hardness. A second important characteristic of problems like p vs. np and l vs. nl that contributes to their popularity is their hardness. To this day, a long list of people who cared about these problems has tried to attack them from several different perspectives and with several different techniques. Their limited success constitutes significant evidence that, indeed, answering these questions will require a deeper understanding of the combinatorial structure of computation. In this sense, the answer is likely to be highly rewarding.

In contrast, the 2d vs. 2n problem can boast no similar attention on the part of the community, as it has never attracted the efforts of a large group of researchers. However, it does belong to the culture of this same community and several of its members have tried to attack it or problems around it, with the same limited success. In this sense, it is again fair to predict that the answer to this question will indeed involve ideas which are deep enough to be significantly rewarding.

2.1-III. Surprising conjecture. If we open a computational complexity textbook, we will probably find p defined as the class of problems that can be solved by polynomial-time Turing machines, and np as the class of problems that can be solved by the same machines when they are enhanced with the ability to make nondeterministic choices. Then, we will probably also find a discussion of the standard conjecture that p ≠ np, justified by the well-known compelling list of pragmatic and philosophical reasons.

In this context, the conjecture does not sound surprising at all. There is certainly nothing strange with the enhanced machines being strictly more powerful than the original, non-enhanced ones.
They start already as powerful, and then the magical extra feature of nondeterminism comes along: how could this result in nothing new? But the conjecture does describe a fairly intriguing situation.

First, if we interpret the conjecture in the context of the struggle of fast deterministic algorithms to subdue fast nondeterministic ones, it says that a particular nondeterministic Turing machine using just one tape and only linear time [37] can actually solve a problem that defies all polynomial-time multi-tape deterministic Turing machines, irrespective of the degree of the associated polynomial and the number of tapes. In other words, the claim is that nondeterministic algorithms can beat deterministic ones even with minimal use of the rest of their abilities. This is an aspect of the p ≠ np conjecture that our definitions do not highlight.

Second, if we interpret the conjecture in the context of the struggle of fast deterministic algorithms to solve a specific np-complete problem, it
says that the fastest way to check whether a propositional formula is satisfiable is essentially to try all possible assignments. Although this matches our experience in one sense (in the simple sense that we have no idea how to do anything faster in general), it also seriously clashes with our experience in another sense, equally important: it claims that the obvious, highly inefficient solution is also the optimal one. This is in contrast with what one would first imagine, given that optimality is almost always associated with high sophistication.

Similar comments are valid for l vs. nl. The standard conjecture that l ≠ nl asserts that a particular nondeterministic finite automaton with a small bunch of states and just two heads that only move one-way [57] can actually solve a problem that defies all deterministic multi-head two-way finite automata, irrespective of their number of states and heads. At the same time, it claims that the most space-economical method of checking connectivity is essentially the one by Savitch, a fairly non-trivial algorithm which nevertheless still involves several natural choices (e.g., recursion) and lacks the high sophistication that we usually expect in optimality.

Similarly to p vs. np and l vs. nl, the conjecture for the 2d vs. 2n problem is again that 2d ≠ 2n. Moreover, two stronger variants of it have also been proposed.

First, it is conjectured that 2nfas can be exponentially smaller than 2dfas even when they never move their input head to the left. In other words, it is suggested that even one-way nondeterministic finite automata (1nfas) can be exponentially smaller than 2dfas. (Note the similarity with the version of l ≠ nl mentioned above, where the nondeterministic automaton need only move its heads one-way to beat all two-way multi-head deterministic ones.) So, once more we have an instance of the claim that nondeterministic algorithms can beat deterministic ones with minimal use of the rest of their abilities.
Second, it is even conjectured that this alleged exponential difference in size between 1nfas and 2dfas covers the entire gap from $n$ to $2^n - 1$ which is known to exist between 1nfas and 1dfas [35].² In other words, according to this conjecture, a 2dfa trying to simulate a 1nfa may as well drop its bidirectionality, since it is going to be totally useless: its optimal strategy is going to be the well-known brute-force one-way deterministic simulation [47]. So, once more we have an instance of the claim that the obvious, highly inefficient solution is also the optimal one.³

² As with 2dfas (cf. Footnote 1), a 1dfa is allowed to reject by just hanging anywhere along its input. Without this freedom, the gap would actually be from $n$ to $2^n$.

³ Note that, contrary to the strong versions of p ≠ np and l ≠ nl mentioned above [37, 57], the two conjectures mentioned in these two last paragraphs may be strictly stronger than 2d ≠ 2n. It may very well be that 1nfas cannot be exponentially smaller than 2dfas, but 2nfas can (it is known that 2nfas can be exponentially smaller than 1nfas). Moreover, even if 1nfas can be exponentially smaller than 2dfas, it may very well be that this exponential gap is smaller than the gap from $n$ to $2^n - 1$.

For a concrete example of what all this means, recall the problem that we described early in this introduction (Section 1). Remember that we presented it as a problem that is solvable by small 2nfas but is conjectured to require large 2dfas. Notice that the best 2nfa algorithm presented there is actually one-way, namely a 1nfa. So, if the conjecture about this problem is true, then 2nfas can indeed beat 2dfas, and they can do so without using their bidirectionality. Also notice that the best known 2dfa for that problem is one-way, too. In fact, it is simply the brute-force 1dfa simulation of the 1nfa solver. So, if this is really the smallest 2dfa for the problem, then indeed the optimal way of simulating the hardest 2nfa is the obvious, highly inefficient one.

In total, interpreting the questions on the power of nondeterminism (p vs. np, l vs. nl) as a contest between deterministic and nondeterministic algorithms, our conjectures claim that nondeterministic algorithms can win with one hand behind their back; and then, the best that deterministic algorithms can do in their defeat to minimize their losses is essentially not to think. This is a counter-intuitive claim, and our conjectures for 2d vs. 2n make this same claim, too.

2.1-IV. A mathematical connection. Of the similarities that we described above between 2d vs. 2n and the more important questions on the power of nondeterminism, none is mathematical. However, a mathematical connection is known, too. As explained in [3], if we can establish that 2d ≠ 2n using only “short” strings, then we would also have a proof that l ≠ nl.

To describe this implication more carefully, we need to first discuss how a proof of 2d ≠ 2n may actually proceed. To prove the conjecture, we need to start with a 2n-complete family of regular problems $\Pi = (\Pi_n)_{n \ge 0}$, and prove that it is not in 2d. That is, we must prove that for any polynomial-size family of 2dfas $D = (D_n)_{n \ge 0}$ there exists an $n$ which is bad, in the sense that $D_n$ does not solve $\Pi_n$. Now, two observations are due:

• To prove that some $n$ is bad, we need to find a string $w_n$ that “fools” $D_n$, in the sense that $w_n \in \Pi_n$ but $D_n$ rejects $w_n$, or $w_n \notin \Pi_n$ but $D_n$ accepts $w_n$.
• Every $D$ has a bad $n$ iff every $D$ has infinitely many bad $n$. This is true because, if a polynomial-size family $D$ has only finitely many bad $n$, then replacing the corresponding $D_n$ with correct automata of any size would result in a new family which is still polynomial-size and has no bad $n$.

Hence, proving the conjecture amounts to proving that, for any polynomial-size family $D$ of 2dfas for $\Pi$, there is a family of strings $w = (w_n)_{n \ge 0}$ such that, for infinitely many $n$, the input $w_n$ fools $D_n$.


Now, the connection with l vs. nl says the following: if we can indeed find such a proof and in addition manage to guarantee that the lengths of the strings in $w$ are bounded by some polynomial of $n$, then l ≠ nl.

There is no doubt that this connection increases our confidence in the relevance of the 2d vs. 2n problem to the more important questions on the power of nondeterminism. However, its significance should not be overestimated.

First, the two problems may very well be resolved independently. On one hand, if 2d = 2n then the connection is irrelevant, obviously. On the other hand, if 2d ≠ 2n then our tools for short strings are so much weaker than our tools for long strings, that it is hard to imagine us arriving at a proof that uses only short strings before actually having a proof that uses long ones.

Second, and perhaps most importantly, ideas do not need mathematical connections to transcend domains. In other words, an idea that works for one type of machines may very well be applicable to other types of machines, too, even if no high-level theorem encodes this transfer. Examples of this situation include the Immerman-Szelepcsényi idea [24, 58], Savitch’s idea [49], and Sipser’s “rewind” idea [54], each of which has been applied to machines of significantly different power [14, 54].

In conclusion, from the computational complexity perspective, the 2d vs. 2n problem is a question on the power of nondeterminism which seems both simple enough to be tractable and, at the same time, robust, hard, and intriguing enough to be relevant to our efforts against other, more important questions of its kind.

2.2. Descriptional complexity. From the perspective of descriptional complexity, the 2d vs. 2n question falls within the general major goal of understanding the relative succinctness of language descriptors.
Here, by “language descriptor” we mean any formal model for recognizing or generating strings: finite automata, regular expressions, pushdown automata, grammars, Turing machines, etc.

Perhaps the most famous question in this domain is the one about the relative succinctness of 1dfas and 1nfas. Since both types of automata recognize exactly the regular languages [47], every such language can be described both by the deterministic and by the nondeterministic version. Which type of description is shorter? Or, measuring the size of these descriptions by the number of states in the corresponding automata, which type of automaton needs the fewest states? Clearly, since determinism is a special case of nondeterminism, a smallest 1dfa cannot be smaller than a smallest 1nfa. So, the question really is: How much larger than a smallest 1nfa need a smallest 1dfa be? By the well-known simulation of [47], we know that every $n$-state 1nfa has an equivalent 1dfa with at most $2^n - 1$ states (cf. Footnote 2). Moreover, this simulation is optimal [35], in the sense that certain $n$-state 1nfas have no equivalent 1dfa with fewer
than $2^n - 1$ states. Hence, this question of descriptional complexity is fully resolved: if the minimal 1nfa description of a regular language is of size $n$, then the corresponding minimal 1dfa description is of size at most $2^n - 1$, and sometimes is exactly that big.

There is really no end to the list of questions of this kind that can be asked. For the example of finite automata alone, we can change how the size of the descriptions is measured (e.g., use the number of transitions) and/or the resource that differentiates the machines (e.g., use any combination of nondeterminism, bidirectionality, ambiguity, alternation, randomness, pebbles, heads, etc.). Moreover, the models being compared can even be of completely different kind (e.g., 1nfas versus regular expressions) and/or have different power (e.g., 1nfas versus deterministic pushdown automata, or context-free grammars), in which case each model may have its own measure for the size of descriptions.

Typically, every question of this kind is viewed in the context of the corresponding conversion. For example, the question about 1dfas and 1nfas is viewed as follows: Given an arbitrary 1nfa, we would like to convert it into a smallest equivalent 1dfa. What is the increase in the number of states in the worst case? In other words, starting with a 1nfa, we want to trade size for determinism and we would like to know in advance the worst possible loss in size. We encode this information into a function $f$, called the trade-off of the conversion: for every $n$, $f(n)$ is the least upper bound for the new number of states when an arbitrary $n$-state 1nfa is converted into a smallest equivalent 1dfa. In this terminology, our previous discussion can be summarized into the following concise statement: the trade-off from 1nfas to 1dfas is $f(n) = 2^n - 1$. Note that this encodes both the simulation of [47], by saying that $f(n) \le 2^n - 1$, and the “hard” 1nfas of [35], by saying that $f(n) \ge 2^n - 1$.

The 2d vs.
2n problem can also be concisely expressed in these terms. It concerns the conversion from 2nfas to 2dfas, where again we trade size for determinism, and precisely asks whether the associated trade-off can be upper-bounded by some polynomial:

    2d = 2n ⇔ the trade-off from 2nfas to 2dfas is polynomially bounded.

Indeed, if the trade-off is polynomially bounded, then every family of regular problems that is solvable by a polynomial-size family of 2nfas $N = (N_n)_{n \ge 0}$ is also solvable by a polynomial-size family of 2dfas: just convert $N_n$ into a smallest equivalent 2dfa $D_n$, and form the resulting family $D := (D_n)_{n \ge 0}$. Since the size $s_n$ of $N_n$ is bounded by a polynomial in $n$ and the size of $D_n$ is bounded by a polynomial in $s_n$ (the trade-off bound), the size of $D_n$ is also bounded by a polynomial in $n$. Overall, 2d
= 2n. Conversely, suppose the trade-off is not polynomially bounded. For every $n$, let $N_n$ be any of the $n$-state 2nfas that attain the value of the trade-off for $n$, and let $D_n$ be a smallest equivalent 2dfa. Then the sizes of the automata in the family $D := (D_n)_{n \ge 0}$ are exactly the values of the trade-off, and therefore $D$ is not of polynomial size. Moreover, for $\Pi_n$ the language recognized by $N_n$ and $D_n$, the family $\Pi := (\Pi_n)_{n \ge 0}$ is clearly in 2n (because of the linear-size family $N := (N_n)_{n \ge 0}$) but not in 2d (since $D$ is not of polynomial size). Overall, 2d ≠ 2n.

Note the sharp difference in our understanding of the two conversions mentioned so far. On the one hand, our understanding of the conversion from 1nfas to 1dfas is perfect: we know the exact value of the associated trade-off. On the other hand, our understanding of the conversion from 2nfas to 2dfas is minimal: not only do we not know the exact value of the associated trade-off, but we cannot even tell whether it is polynomial or not. The best known upper bound for it is exponential, while the best known lower bound is quadratic.

In fact, the details of this gap reveal a much more embarrassing ignorance. The exponential upper bound is the trade-off from 2nfas to 1dfas, while the quadratic lower bound is the trade-off from unary 1nfas to 2dfas. In other words, put in the shoes of a 2dfa that tries to simulate a 2nfa, we have no idea how to benefit from our bidirectionality; at the same time, put in the shoes of a 2nfa that tries to resist being simulated by a 2dfa, we have no idea how to use our bidirectionality or our ability to distinguish between different tape symbols.

A bigger picture is even more peculiar. The 12 arrows in Figure 1 show all possible conversions that can be performed between the four most fundamental types of finite automata: 1dfas, 1nfas, 2dfas, and 2nfas.
For 10 of these conversions, the problem of finding the exact value of the associated trade-off has been completely resolved (as shown in the figure), and therefore our understanding of them is perfect. The only two that remain unresolved are the ones from 2nfas and 1nfas to 2dfas (as shown by the dashed arrows), that is, the ones associated with 2d vs. 2n.

In conclusion, from the descriptional complexity perspective, the 2d vs. 2n problem represents the last two open questions about the relative succinctness of the basic types of automata defined by nondeterminism and bidirectionality. Moreover, the contrast in our understanding between these two questions and the remaining ten is the sharp contrast between minimal and perfect understanding.

3. Progress

Our progress against the 2d vs. 2n question has been in two distinct directions: we have proved lower bounds for automata of restricted information and for automata of restricted bidirectionality. In both cases, our theorems involve a particular computational problem called liveness. We start by describing this problem.

[Figure 1: diagram of the four automata types (1dfa, 1nfa, 2dfa, 2nfa) with the 12 conversions between them drawn as arrows labeled a to e; the two unresolved conversions into 2dfas are dashed. Arrow labels:
a = 2^n − 1
b = n(n^n − (n−1)^n)
c = \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \binom{n}{i} \binom{n}{j} (2^i − 1)^j
d = \binom{2n}{n+1}
e = n]

Figure 1. The 12 conversions defined by nondeterminism and bidirectionality in finite automata, and the known exact trade-offs.

3.1. Liveness. As mentioned in Section 2.1-III, we currently believe that 2nfas can be exponentially smaller than 2dfas even without using their bidirectionality. That is, we believe that even 1nfas can be exponentially smaller than 2dfas. In computational complexity terms, this is the same as saying that the reason why 2d ⊉ 2n is because already 2d ⊉ 1n, where 1n is the class of families of regular problems that can be solved by polynomial-size families of 1nfas. In descriptional complexity terms, this is the same as saying that the reason why the trade-off from 2nfas to 2dfas is not polynomially bounded is because already the trade-off from 1nfas to 2dfas is not. In this thesis, we focus on this stronger conjecture.

As in any attempt to show non-containment of one complexity class into another (p ⊉ np, l ⊉ nl), it is important to know specific complete problems, namely, problems which witness the non-containment iff the non-containment indeed holds. In our case, we need a family of regular problems that can be solved by a polynomial-size family of 1nfas and, in addition, are such that no polynomial-size family of 2dfas can solve them iff 2d ⊉ 1n.

Such families are known. In fact, we have already presented one: the family of problems defined on page 10 over the alphabets Δ_n. So, it is safe to invest all our efforts in trying to understand that particular family, and prove or disprove the 2d ⊉ 1n conjecture by showing that the family does not or does belong to 2d. However, it is easier (and as safe) to work with another complete family, which is defined over an even larger alphabet and thus brings us closer to the combinatorial core of the conjecture.
This family is called liveness, denoted by B = (B_n)_{n≥0}, and defined as follows [48].

For each n, we consider the alphabet Σ_n := P({1, 2, . . . , n}^2) of all directed 2-column graphs with n nodes per column and only rightward arrows. For example, for n = 5 this alphabet includes the symbols:

[Figure: sample symbols of Σ_5, each drawn as a 2-column graph with vertices 1 to 5 per column.]

where, e.g., indexing the vertices from top to bottom, the rightmost symbol is {(1, 2), (2, 1), (4, 4), (5, 5)}. Given an m-long string over Σ_n, we naturally interpret it as the representation of a directed (m+1)-column graph, the one that we get by identifying the adjacent columns of neighboring symbols. For example, for m = 8 the string of the above symbols represents the graph:

[Figure: the graph represented by this string, with columns 0 to 8 and vertices 1 to 5 per column.]

where columns are indexed from left to right starting from 0. In this graph, a live path is any path that connects the leftmost column to the rightmost one (i.e., the 0th to the mth column), and a live vertex is any vertex that has a path from the leftmost column to it. The string is live if live paths exist; equivalently, if the rightmost column contains live vertices. Otherwise, the string is dead. For example, in the above string, the 5th node of the 2nd column is live because of the path 3 → 3 → 5, and the string is live because of two live paths, one of which is 3 → 3 → 2 → 5 → 5 → 3 → 3 → 2 → 1. Note that no information is lost if we drop the direction of the arrows, and we do. So, the above string is simply:

[Figure: the same graph, drawn without arrow directions.]

The problem B_n consists in determining whether a given string of Σ_n^* is live or not. In formal dialect, B_n is the language {w ∈ Σ_n^* | w is live} of all live strings over Σ_n, for all n.

As already claimed, B ∈ 1n. That is, there exist small 1nfa algorithms for B_n. The smallest possible one is rather obvious:

We scan the list of graphs from left to right, trying to nondeterministically follow one live path. Initially, we guess the starting vertex among those of the leftmost column. Then, on reading each graph, we find which vertices in the next column are accessible from the most recent vertex. If none is, we hang (in this branch of the nondeterminism). Otherwise, we guess one of them and move on remembering only it. If we ever arrive at the end of the input, we accept.
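The nondeterministic algorithm just described can be sketched in Python as follows. This is only an illustration under stated assumptions: a symbol of Σ_n is encoded as a frozenset of pairs (a, b), the function name `accepts_on_some_branch` is hypothetical, and the nondeterminism is simulated by trying every branch in turn (which may take exponential time, even though each branch remembers a single vertex).

```python
from typing import FrozenSet, List, Tuple

# Assumed encoding: an arrow (a, b) goes from vertex a of the left column
# to vertex b of the right column, with vertices numbered 1..n.
Symbol = FrozenSet[Tuple[int, int]]


def accepts_on_some_branch(n: int, string: List[Symbol]) -> bool:
    """Return True iff some branch of the n-state 1nfa reaches the end of the input."""

    def run(vertex: int, position: int) -> bool:
        if position == len(string):
            return True  # arrived at the end of the input: accept
        # Vertices of the next column accessible from the remembered vertex.
        successors = [b for (a, b) in string[position] if a == vertex]
        # No successor: this branch hangs. Otherwise guess one and move on.
        return any(run(b, position + 1) for b in successors)

    # Initially, guess the starting vertex among those of the leftmost column.
    return any(run(start, 0) for start in range(1, n + 1))


if __name__ == "__main__":
    s = frozenset({(1, 2), (2, 1), (4, 4), (5, 5)})  # the symbol quoted in the text
    print(accepts_on_some_branch(5, [s, s]))
```

Each call to `run` carries only the current vertex, which mirrors the one-state-per-vertex bookkeeping of the automaton.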

It is easy to verify that this algorithm can be implemented on a 1nfa with exactly one state per possible vertex in a column. Hence, B_n is solvable by an n-state 1nfa.

In contrast, nobody knows how to solve B_n on a 2dfa with fewer than 2^n − 1 states. The best known 2dfa algorithm is the following:

We scan the list of graphs from left to right, remembering only the set of live vertices in the most recent column. Initially, all vertices of the leftmost column are live. Then, on reading each graph, we use its arrows to compute the set of live vertices in the next column. If it is empty, we simply hang. Otherwise, we move on, remembering only this set. If we ever arrive at the end of the input, we accept.
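A minimal Python sketch of this deterministic algorithm is given below, assuming n ≥ 1 and the same illustrative encoding of symbols as frozensets of pairs (a, b); the name `is_live` is hypothetical. The variable `live` plays the role of the automaton's state, so the reachable states are the non-empty subsets of {1, . . . , n}.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# Assumed encoding: an arrow (a, b) goes from vertex a of the left column
# to vertex b of the right column, with vertices numbered 1..n.
Symbol = FrozenSet[Tuple[int, int]]


def is_live(n: int, string: Iterable[Symbol]) -> bool:
    """One left-to-right scan that remembers only the current set of live vertices."""
    live: Set[int] = set(range(1, n + 1))  # all vertices of the leftmost column are live
    for symbol in string:
        # Use the arrows of this graph to compute the live vertices of the next column.
        live = {b for (a, b) in symbol if a in live}
        if not live:
            return False  # the automaton hangs: the string is dead
    return True  # arrived at the end of the input: accept


if __name__ == "__main__":
    s = frozenset({(1, 2), (2, 1), (4, 4), (5, 5)})  # the symbol quoted in the text
    print(is_live(5, [s, s]))
```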

Easily, this algorithm needs exactly one state per possible non-empty set of live vertices in a column, for a total of 2^n − 1 states, as promised.

By the completeness of B, our questions about the relation between 2d and 1n can be encoded into questions about the size of a 2dfa solving B_n. In other words, the following three statements are equivalent:
• 2d ⊇ 1n,
• the trade-off from 1nfas to 2dfas is polynomially bounded,
• B_n can be solved by a 2dfa of size polynomial in n.
Hence, to prove the conjecture that 2d ⊉ 1n, we just need to prove that the number of states in every 2dfa solving B_n is super-polynomial in n.

In fact, as explained in Section 2.1-III, a stronger conjecture says that the above 2dfa algorithm is optimal! That is, the conjecture goes, in every 2dfa solving B_n the number of states is not only super-polynomial but already 2^n − 1 or bigger. To better understand what this means, observe that the above algorithm is one-way: it is, in fact, the smallest 1dfa for liveness (as we can easily prove). Therefore, the claim is that in solving liveness, a 2dfa has no way of using its bidirectionality to save even 1 of the 2^n − 1 states that are necessary without it.

3.2. Restricted information: Moles. The first direction in our investigation of the efficiency of 2dfas against liveness is motivated by the particular way of operation of the 1nfa algorithm that we described above. Specifically, consider any branch of the nondeterministic computation of that 1nfa. Along that branch, the automaton moves through the input from left to right, reading one graph after the other. However, although at every step the entire next graph is read, only part of its information is used. In particular, the automaton ‘focuses’ only on one of the vertices in the left column of the graph and ‘sees’ only the arrows which depart from that vertex. The rest of the graph is ignored. In this sense, the automaton operates in a mode of ‘restricted information’.
A more intuitive way to describe this mode of operation is to view the input string as a ‘network of tunnels’ and the 1nfa as an n-state one-way nondeterministic robot that explores this network. Then, at each step, the
robot reads only the index of the vertex that it is currently on and the tunnels that depart from that vertex, and has the option to either follow one of these tunnels or abort, if none exists. In yet more intuitive terms, the automaton behaves like an n-state one-way nondeterministic mole. Given this observation, a natural question to ask is the following: Suppose we apply to this mole the same conversion that defines the question whether 2d ⊇ 1n. Namely, suppose that this mole loses its nondeterminism in exchange for bidirectionality. How much larger does it need to get to still be solving Bn ? That is, can liveness be solved by a small two-way deterministic mole? Equivalently, is there a 2dfa algorithm that can tell whether a string is live or not by simply exploring the graph defined by it? Note that, at first glance, there is nothing to exclude the possibility of some clever graph exploration technique that correctly detects the existence of live paths and can indeed be implemented on a small 2dfa. In Chapter 2 we answer this question in a strongly negative manner: no two-way deterministic mole can solve liveness. To understand the value of this answer, it is necessary to understand both the “good news” and the “bad news” that it contains. The good news is that we have crossed an entire, very natural class of 2dfa algorithms off the list of candidates against liveness. We have thus come to know that every correct 2dfa must be using the information of every symbol in a more complex way than moles. However, note that our answer talks of all two-way deterministic moles, as opposed to only small ones. This might sound like “even better news”, but it is actually bad. Remember that our primary interest is not moles themselves, but rather the behavior of small 2dfas against liveness. So, our hope was that we would get an answer that involves small moles, and this hope did not materialize. 
Put another way, we asked a complexity-theoretic question and we received a computability-theoretic answer. Overall, our understanding has indeed advanced, but not for the class of machines that we were mostly interested in.

Nevertheless, some of the tools developed for the proof of this theorem may still be useful for the more general goal. Specifically, if indeed small 2dfas cannot solve liveness, then it is hard to imagine a proof that will not involve very long inputs. Such a proof will probably need tools similar to the dilemmas and generic strings for 2dfas that were used in our argument.

3.3. Restricted bidirectionality: Sweeping automata. The second direction that we explore is motivated by the known fact that 2d is closed under complement [54, 14], whereas the corresponding question for 2n is open. So, one way to prove that 2d ≠ 2n is to show that 2n is not closed under complement. In terms of classes, we can write this goal as 2n ≠ co2n, where co2n is the class of families of regular problems whose
complements can be solved by polynomial-size families of 2nfas. Of course, it is conceivable that 2n = co2n, in which case a proof of this closure would constitute evidence that 2d = 2n.

As a matter of fact, 2n = co2n is already known to hold in some special cases. First, the analogue of this question for logarithmic-space Turing machines is known to have been resolved this way: nl = conl [24, 58]. By the argument of [3], this implies that every small 2nfa can be converted into a small 2nfa that makes exactly the opposite decisions on all “short” inputs (in the sense of Section 2.1-IV). In addition, the proof idea of nl = conl has been used to prove that indeed 2n = co2n for the case of unary regular problems [14]. So, 2n and co2n are already known to coincide on short and on unary inputs.

However, there is little doubt that the above special cases avoid the core of the hardness of the 2n vs. co2n question. In this sense, our confidence in the conjecture that 2n ≠ co2n is not seriously harmed. As a matter of fact, in Chapter 2 we prove a theorem that constitutes evidence for it. We consider a restriction on the bidirectionality of the 2nfas and prove that, under this restriction, 2n ≠ co2n.

The restricted automata that we consider are the “sweeping” 2nfas. A two-way automaton is sweeping if its head can change direction only on the input end-markers. In other words, each computation of a sweeping automaton is simply a sequence of one-way passes over the input, with alternating direction. We use the notation snfa for sweeping 2nfas, and sn for the class of families of regular problems that can be solved by polynomial-size families of snfas. With these names, our theorem says that:

sn ≠ cosn.

Specifically, our proof uses liveness, which is obviously in sn: B ∈ sn. We show that, in contrast, every snfa for the complement of B_n needs 2^{Ω(n)} states, so that B ∉ cosn. Hence, B ∈ sn \ cosn and the two classes differ.
Another way to interpret this theorem is to view it as a generalization of two other, previously known facts about the complement of liveness: that it is not solvable by small 1nfas [48] and that it is not solvable by small sweeping 2dfas [55, 14], either. So, proving the same for small snfas amounts to generalizing both these facts to sweeping bidirectionality and to nondeterminism, respectively.

For another interesting interpretation, note that the smallest known snfa solving the complement of B_n is still the obvious 2^n-state 1dfa from page 22. Hence, our theorem says that, even after allowing sweeping bidirectionality and nondeterminism together, a 1dfa can still not achieve significant savings in size against the complement of liveness; whether it can save even 1 state is still open.

Finally, our proof can be modified so that all strings used in it are drawn from a special subclass of Σ_n^* on which the complement of liveness can actually be determined by a small 2dfa. This immediately implies that:

the trade-off from 2dfas to snfas is exponential,

generalizing a known similar relation between 2dfas and sdfas [55, 2, 36].

4. Other Problems in This Thesis

Apart from the progress against the 2d vs. 2n question explained above, this thesis also contains several other theorems in descriptional complexity.

4.1. Exact trade-offs for regular conversions. As explained already in Section 2.2 (Figure 1 on page 20), the 2d vs. 2n question concerns only 2 of the 12 possible conversions between the four most basic types of finite automata (1dfas, 1nfas, 2dfas, and 2nfas). For each of the remaining conversions our understanding is perfect, in the sense that we know the exact value of the associated trade-off. For the conversion from 1nfas to 1dfas (Figure 1a), the upper bound is due to [47] and the lower bound due to [35]. For any of the conversions from weaker to stronger automata (Figure 1e), the upper bound is obvious by the definitions and the lower bound is due to [6]. For the remaining four conversions (from 2nfas or 2dfas to 1nfas or 1dfas), both the upper and lower bounds are due to this thesis, although the fact that the trade-offs were exponential was known before. We establish these exact values in Chapter 1. For a quick look, see Figure 1b–d.

We stress, however, that the exact values alone do not reveal the depth of the understanding behind the associated proofs. In order to explain what we mean by this, let us revisit the conversion from 1nfas to 1dfas. As already mentioned, we can encode our understanding of this conversion into the concise statement that:

the trade-off from 1nfas to 1dfas is exactly 2^n − 1.

A less succinct but more informative description is that, for all n:
• every n-state 1nfa has an equivalent 1dfa with ≤ 2^n − 1 states, and
• some n-state 1nfa has no equivalent 1dfa with < 2^n − 1 states.
But even these more verbose statements fail to describe the kind of understanding that led to them.
What we really know is that every 1nfa N can be simulated by a 1dfa that has 1 distinct state for each non-empty subset of states of N which (as an instantaneous description of N) is both realizable and non-redundant. This is exactly the idea where everything else comes from: the value 2^n − 1 (by a standard counting argument), the simulation for the upper bound (just construct a 1dfa with these states and with the then obvious transitions), and the hard instances for the lower bound (just find 1nfas that manage to keep all of their instantaneous descriptions realizable and non-redundant). In this sense, we know more than just the value of the trade-off; we know the precise, single reason behind it: the non-empty subsets of states of the 1nfa
that is being converted. To be able to pin down the exact source of the difficulty of a conversion in terms of such a simple and well-understood class of set-theoretic objects is a rather elegant achievement.

Our analyses in Chapter 1 are supported by this same kind of understanding: in each one of the four trade-offs that we discuss, we first identify the correct set-theoretic object at the core of the conversion and then move on to extract from it the exact value, the simulation, and the hard instances that we need. As a foretaste, here are the objects at the core of the conversion from 2nfas to 1nfas: the pairs of subsets of states of the 2nfa, where the second subset has exactly 1 more state than the first subset. So, every 2nfa can be simulated by a 1nfa that has 1 distinct state for every such pair, and for some 2nfas all these states are necessary. Moreover, the value of the trade-off is exactly the number of such pairs that we can construct out of an n-state 2nfa; a standard counting argument shows that this number is \binom{2n}{n+1}.

4.2. Non-recursive trade-offs for non-regular conversions. In contrast to Chapters 1 and 2, the last chapter studies conversions between machines other than the automata of Figure 1, including machines that can also recognize non-regular problems. As we shall see, the trade-offs for such conversions may, in general, behave in a quite different manner.

To understand the difference, note that already since [53] we knew how to effectively convert any 2nfa (the strongest type of automata in Figure 1) into a 1dfa (the weakest type). This immediately guaranteed a recursive upper bound for each one of the 12 trade-offs of Figure 1. In contrast, for other conversions, such a recursive upper bound cannot be taken for granted.
As first shown in [35], there are cases where the trade-off of a conversion grows faster than any recursive function: e.g., the conversion from one-way nondeterministic pushdown automata that recognize regular languages to 1dfas. Moreover, this phenomenon cannot be attributed simply to the difference in power between the types of the machines involved. As shown in [56], if the pushdown automata in the previous conversion are deterministic, then the trade-off does admit a recursive upper bound.

Such trade-offs, that cannot be recursively bounded, are simply called non-recursive. Note that this name is slightly misleading, as it allows the possibility of a non-recursive trade-off that still admits recursive upper bounds. However, no such cases will appear in this thesis.

In Chapter 3 we refine a well-known technique [16] to prove a general theorem that implies the non-recursiveness of the trade-off in a list of conversions involving two-way machines. Roughly speaking, our theorem concerns any two types of machines, a and b, such that the following two conditions are satisfied:

• there exists a problem that can be solved by a machine of type a but cannot be solved by any machine of type b, and
• any two-way deterministic finite automaton that works on a unary alphabet and has access to a linearly-bounded counter can be simulated by some machine of type a.
For any such pair of types of machines, our theorem says that the trade-off from a to b is non-recursive. For example, we can have a be the multi-head finite automata with k + 1 heads and b be the multi-head finite automata with k heads. No matter what k is, the two conditions above are known to be true, and therefore replacing a multi-head finite automaton with an equivalent one that has 1 fewer head results in a non-recursive increase in the size of the automaton’s description, in general.

At the core of the argument of this theorem lies a lemma of independent interest: we prove that the emptiness problem remains unrecognizable (non-semidecidable) even for a unary two-way deterministic finite automaton that has access to a linearly-bounded counter and obeys a threshold, in the sense that it either rejects all its inputs or accepts exactly those that are longer than some fixed length.

CHAPTER 1

Exact Trade-Offs

In this chapter we prove the exact values of the trade-offs for the conversions from two-way to one-way finite automata, as pictured in Figure 1 (page 20). In Section 3 we cover the conversion from 2dfas to 1dfas (Figure 1b), whereas the conversion from 2nfas to 1dfas (Figure 1c) is the subject of Section 4. The conversions from 2nfas and 2dfas to 1nfas (Figure 1d) are covered together in Section 5. We begin with a short note on the history of the subject and a summary of our conclusions.

1. History of the Conversions

The conversion of 1nfas into 1dfas is the archetypal problem of descriptional complexity. As already mentioned (Figure 1a), it is fully resolved, in the sense that we know the exact value of the associated trade-off (footnote 1):

the trade-off from 1nfas to 1dfas is 2^n − 1.

Footnote 1. Recall our conventions, as explained in Footnote 2 on page 15.

The history of the problem began in the late 50’s, when Rabin and Scott [46, 47] introduced 1nfas as a generalization of 1dfas and showed how 1dfas can simulate them. This proved the upper bound for the trade-off. The matching lower bound was established much later, via several examples of “hard” 1nfas [44, 35, 43, 51, 48, 33] (footnote 2). Both bounds are based on the crucial idea that the non-empty subsets of states of the 1nfa capture everything that a simulating 1dfa needs to describe with its states.

Footnote 2. The earliest ones, both over a binary alphabet, appeared in [35] (an example that was described as a simplification of one contained in an even earlier unpublished report [44]) and in [43] (where [44] is again mentioned as containing a different example with similar properties). Other examples have also appeared, over both large [51, 48] and binary alphabets [33]. A more natural but not optimal example was also mentioned in [35] and attributed to Paterson.

As part of the same seminal work [45, 47], Rabin and Scott also introduced two-way automata and proved “to their surprise” that 1dfas were again able to simulate their generalization. This time, though, the proof was complicated enough to be superseded by a simpler proof by Shepherdson [53] at around the same time. All authors were actually talking about
what we would now call single-pass two-way deterministic finite automata (zdfas), as their definitions did not involve end-markers. However, the automata quickly grew into full-fledged two-way deterministic finite automata (2dfas) and also into nondeterministic counterparts (znfas and 2nfas), while all theorems remained valid or easily adjustable. Naturally, the descriptional complexity questions arose again.

Shepherdson mentioned that, according to his proof, every n-state 2dfa had an equivalent 1dfa with at most (n + 1)^{n+1} states. Had he cared for his bound to be tight, he would surely have noted that his proof had actually established an upper bound of only n(n + 1)^n (e.g., see [20, Section 3.7]). Many years later, Birget [6, Theorem A3.4] claimed that this upper bound is really just n^n. On the other hand, towards a lower bound, several authors showed that the trade-off is at least exponential [1, 35, 55] and indeed very close to the upper bound given by Shepherdson [35, 43] (footnote 3). Here we will prove that both the upper and lower bounds meet at the value n(n^n − (n−1)^n). We will thus have arrived at the conclusion that

the trade-off from 2dfas to 1dfas is n(n^n − (n−1)^n).

Note that this value is larger than the upper bound n^n claimed by Birget [6]. Indeed, his argument contained an oversight. But it did contain the correct idea, and it is exactly that idea which we apply here. We also note that our lower bound is valid even when the 2dfa being converted is single-pass. More importantly, both the upper and the lower bound are derived in a straightforward manner after we carefully identify the correct set-theoretic objects that ‘live’ in the relationship between the computations of 2dfas and 1dfas. These are the tables of the 2dfa as defined in Section 2.1-II. No big surprise is to be anticipated: we simply follow Birget’s idea from [6] in properly restricting the functions used by Shepherdson [53, proof of Theorem 2].
Footnote 3. That the lower bound is close to the upper bound given by Shepherdson was shown by (a slight modification of) the language of [35, Proposition 2], which requires ≥ n^n states on every 1dfa, but only ≤ 5n + 5 states on a 2dfa (a zdfa, even). Similarly, [43] gave a language that requires ≥ n^n + o(n^n) states on a 1dfa, but only ≤ 2n + 5 on a 2dfa. For just an exponential separation, one can look at [1] for ≥ 2^n + 2 and ≤ 2n + 2 states (even on a zdfa); at the Paterson example of [35] for ≥ 2^n and ≤ n + 2 states (even on a sdfa); or at [55] for ≥ 2^n and ≤ O(n) (even on a sdfa).

The relation between the most and least powerful of all automata mentioned so far, namely between 2nfas and 1dfas, has also been examined. Via a straightforward adjustment, Shepherdson’s argument could show very early that every n-state 2nfa can be converted into a 1dfa with at most 2 2n (2n −1) states. Much later, Birget [6, Theorem A3.4] claimed it to be no (n−1)2 n more than n n/2 2 . Towards a lower bound, we just mention the one provided by the systematic framework of [48], which was 2^{(n/2−2)^2}. Here we will show that

the trade-off from 2nfas to 1dfas is \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \binom{n}{i} \binom{n}{j} (2^i − 1)^j

where the lower bound is valid even when the 2nfa being converted is single-pass. As before, we will first identify the correct set-theoretic objects that relate the computations of 2nfas to those of 1dfas. These are the tables of the 2nfa as defined in Section 2.2. Again, we arrive at them by appropriately restricting the functions in the Shepherdson argument.

The most interesting of the descriptional complexity questions that we consider emerge in the conversions from 2nfas to 1nfas and 2dfas:
(Q1) from 2nfas to 1nfas: is bidirectionality essential to 2nfas? Or, is there a problem that a 1nfa would be able to solve with exponentially fewer states if allowed to move its head to the left?
(Q2) from 2nfas to 2dfas: is nondeterminism essential to 2nfas? Or, is there a problem that a 2dfa would be able to solve with exponentially fewer states if allowed to make nondeterministic choices?
The second question is of course the 2d vs. 2n problem, as explained in the Introduction and covered in Chapter 2. For the first question, the answer is known to be positive, but here we will find the exact value of the trade-off.

For the upper bound, it is straightforward to use Shepherdson’s idea to show that every n-state 2nfa has an equivalent 1nfa with at most n·2^{n^2} states. A more economical simulation, with fewer than (n!)^2 states, can be achieved by crossing sequences [21, Section 2.6]. However, it is not hard to observe that the order in which the pairs of successive states (after the first state) appear inside a crossing sequence is not important; equivalently, in applying Shepherdson’s idea we can use the nondeterminism of the simulating 1nfa not only for ‘guessing forwards’ but also for ‘guessing backwards’. Based on this observation, we can actually construct a 1nfa with at most n(n + 1)^n states. But this would still be wasting exponentially many states, as Birget [6] showed that 8^n + 2 states are always enough.
On the other hand, towards a lower bound, exponential separations between 2dfas and 1nfas have long been known [48, 6], even when the 2dfas are single-pass [51, 8], the best being 2^{(n−1)/2} − 1 (footnote 4).

Footnote 4. In [48] a language was given that requires ≤ 2n + 1 states on a 2dfa, but ≥ 2^n − 1 states on a 1nfa. Through a different method, [6] found the same 2^{(n−1)/2} − 1 lower bound. Seiferas [51] gave a language that needs ≤ 4n + 2 states on a zdfa, but ≥ 2^n states on any 1nfa, while Damanik [8] independently arrived at the same argument. Copying that idea, one can easily see that the restriction of the language B_n of [48] to strings of length 2 has similar properties (≤ 2n and ≥ 2^n states).

Here, we will again show that the upper and lower bounds meet exactly at the value \binom{2n}{n+1}. We will thus have concluded that
the trade-off from 2nfas to 1nfas is \binom{2n}{n+1}.

This will again be possible after we identify the correct set-theoretic objects relating the computations of 2nfas to those of 1nfas. These are the frontiers of the 2nfa as defined in Section 5.1. Essentially, these objects are what remains of the crossing sequences of [21] after we ignore not only the order of the pairs of successive states (as we did for the n(n + 1)^n bound above) but also the correspondence between first and second components in this set of pairs.

As a matter of fact, the lower bound is valid even when the 2nfa being converted is deterministic (and single-pass, actually). This immediately implies that

the trade-off from 2dfas to 1nfas is \binom{2n}{n+1},

as well. Hence, the ability of a 2nfa to move its head in both directions strictly inside the input can alone cause all the hardness that a simulating 1nfa must overcome. In other words, the answer to question (Q1) from above is positive, exactly because even the answer to the following question is positive (Figure 1d):
(Q3) from 2dfas to 1nfas: can bidirectionality beat nondeterminism? Is there a problem that a 1nfa would be able to solve with exponentially fewer states if it were allowed to replace nondeterminism with bidirectionality?
Note the similarity with the conjectured resolution of question (Q2). As explained in the Introduction, we believe that the answer to (Q2) is also positive exactly because the answer to the following question is positive:
(Q4) from 1nfas to 2dfas: can nondeterminism beat bidirectionality? Is there a problem that a 2dfa would be able to solve with exponentially fewer states if it were allowed to replace bidirectionality with nondeterminism?
So, it appears that in both cases, the hardness of the simulation may stem entirely from the feature of the simulated machine that is absent in the simulating machine.

Finally, let us also briefly consider the conversions from weaker to stronger automata (Figure 1e).
By the definitions, the trade-off for each one of them is trivially upper-bounded by n. Moreover, it is also lower-bounded by n. To see why, notice that the n-th singleton unary language {0^{n−1}} can be solved by an n-state 1dfa but no 2nfa with fewer than n states [4]. This proves that the trade-off from 1dfas to 2nfas is n and thus immediately implies the same for all other conversions of this kind.

In the next section, we carefully define the notions that we will be working with in the rest of this chapter.
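For a feel of how the exact values summarized in this section compare, the following Python sketch tabulates them for small n. The formulas are the ones stated in the text (Figure 1a–e); the function names are hypothetical. The sketch also counts directly the pairs of subsets in which the second subset has exactly 1 more state than the first, and compares that count with \binom{2n}{n+1}.

```python
from math import comb


def from_1nfa_to_1dfa(n: int) -> int:
    return 2**n - 1                        # Figure 1a


def from_2dfa_to_1dfa(n: int) -> int:
    return n * (n**n - (n - 1)**n)         # Figure 1b


def from_2nfa_to_1dfa(n: int) -> int:
    # Figure 1c; Python evaluates 0**0 as 1, which is the convention needed for i = j = 0.
    return sum(comb(n, i) * comb(n, j) * (2**i - 1)**j
               for i in range(n) for j in range(n))


def from_two_way_to_1nfa(n: int) -> int:
    return comb(2 * n, n + 1)              # Figure 1d


def count_subset_pairs(n: int) -> int:
    # Pairs (A, B) of subsets of an n-element state set with |B| = |A| + 1.
    return sum(comb(n, i) * comb(n, i + 1) for i in range(n))


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, from_1nfa_to_1dfa(n), from_2dfa_to_1dfa(n),
              from_2nfa_to_1dfa(n), from_two_way_to_1nfa(n), n)   # last column: Figure 1e
```

The agreement between `count_subset_pairs` and `from_two_way_to_1nfa` is an instance of Vandermonde's identity, checked here only numerically.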

2. Preliminaries

We write [n] for the set {1, . . . , n}. The special objects l, r, ⊥ are used for building the disjoint union of two sets and the augmentation of a set:

A ⊎ B = (A × {l}) ∪ (B × {r})   and   A_⊥ = A ∪ {⊥}.

When A, B are disjoint, their union A ∪ B is also written as A + B (so that + can replace ∪ in both equations above). The size of A, the set of subsets of A, and the set of non-empty subsets of A are denoted respectively by |A|, P(A), and P ′ (A). For Σ an alphabet, we use Σ ∗ for the set of all finite strings over Σ and Σe for Σ + {⊢, ⊣}, where ⊢ and ⊣ are two special end-marking symbols. If u ∈ Σe∗ is a string, |u| is its length and ui is its i-th symbol, for all i = 1, 2, . . . , |u|. By ‘the i-th boundary of u’ we mean the boundary between ui and ui+1 , if 0 < i < |u|; or the leftmost boundary of u, if i = 0; or the rightmost boundary of u, if i = |u|. (Figure 2a.) We also write ue for the end-marked extension ⊢u⊣ of u and ue,i for the i-th symbol (ue )i of this extension. The empty string is denoted by ǫ. Of the automata that we consider, the two-way deterministic ones constitute the most natural variety and are described in the next section. Section 2.2 defines the one-way and nondeterministic cases. Section 2.3 discusses some of the problems that we will be solving with all these machines. 2.1. Two-way deterministic finite automata. A two-way deterministic finite automaton (2dfa) over the states of a set Q and the symbols of an alphabet Σ consists of a finite control that can represent all states in Q, a tape that can represent all symbols in Σe , and a read-only head. An input w ∈ Σ ∗ is presented on the tape surrounded by the end-markers, as ⊢w⊣. The automaton starts at a designated start state, its head reading the left end-marker ⊢. At every step, the symbol under the head is read; based on this symbol and the current state, the automaton selects a next state and whether to move its head left or right; it then simultaneously changes its state and moves its head. 
The input is accepted if the machine ever moves past the right end-marker $\dashv$ into a designated final state, this being the only case in which violating an end-marker is allowed.⁵

Formally, a 2dfa over $Q$ and $\Sigma$ is defined as a triple $M = (s, \delta, f)$, where $s, f \in Q$ are the start and the final states, respectively, and $\delta$ is the transition function, partially mapping $Q \times \Sigma_e$ to $Q \times \{\mathrm{l}, \mathrm{r}\}$. In addition, $\delta$ obeys the aforementioned restrictions about end-marker violation: on $\vdash$, it either moves the head to the right or hangs; on $\dashv$, it moves the head to the left, or hangs, or moves the head to the right and enters $f$.

⁵ Note the unusual conventions about end-marker violations and the position of the head at acceptance, borrowed from [6]. They make our definitions and theorems significantly nicer.


1. EXACT TRADE-OFFS

Figure 2. (a) Symbols and boundaries on a 6-long string $u$, and a computation that hits left. (b) A computation that hangs. (c) A computation $c$ that hits right, and its $i$-th frontier: $R_i^c$ in circles and $L_i^c$ in boxes.

2.1-I. Computations. Although $M$ is typically started at $s$ and on the tape cell containing the left end-marker $\vdash$, many other possibilities exist: for any string $u$, position $i$, and state $q$, the computation of $M$ when started at $q$ on the $i$-th symbol of $u$ is the unique sequence

$$\mathrm{comp}_{M,q,i}(u) = \bigl((q_t, i_t)\bigr)_{0 \le t \le m}$$

with $(q_0, i_0) = (q, i)$ and $0 \le m \le \infty$, that meets the following restrictions:
• the head is always inside $u$, except possibly at the very end: $0 \le t < m \implies 1 \le i_t \le |u|$, and $m \ne \infty \implies 0 \le i_m \le |u| + 1$.
• every two successive pairs respect the transition function: $0 \le t < m \implies \delta(q_t, u_{i_t}) = (q_{t+1}, d)$, where either $d = \mathrm{l}$ and $i_{t+1} = i_t - 1$, or $d = \mathrm{r}$ and $i_{t+1} = i_t + 1$.
• a last pair inside $u$ exists only if the transition function allows it: $m \ne \infty$ and $1 \le i_m \le |u|$ together imply that $\delta(q_m, u_{i_m})$ is undefined.

We say $(q_t, i_t)$ is the $t$-th point and $m$ is the length of this computation. If $m = \infty$, we say the computation loops; otherwise, it hits left into $q_m$, if $i_m = 0$; or it hangs, if $1 \le i_m \le |u|$; or it hits right into $q_m$, if $i_m = |u| + 1$. (Figure 2.)

When $i = 1$ or $i = |u|$ we get the left computation of $M$ from $q$ on $u$ or the right computation of $M$ from $q$ on $u$, respectively:

$$\mathrm{lcomp}_{M,q}(u) := \mathrm{comp}_{M,q,1}(u) \qquad\text{or}\qquad \mathrm{rcomp}_{M,q}(u) := \mathrm{comp}_{M,q,|u|}(u).$$

Finally, for $w \in \Sigma^*$, the computation of $M$ on $w$ refers to the typical usage $\mathrm{comp}_M(w) := \mathrm{lcomp}_{M,s}(w_e)$, so that $M$ accepts $w$ iff the computation $\mathrm{comp}_M(w)$ hits right into $f$.

1. Remark. Note that, when $u$ is the empty string, the left computation of $M$ from $q$ on $u$ is just $\mathrm{lcomp}_{M,q}(\epsilon) = (q, 1)$ and therefore hits right into $q$, whereas the corresponding right computation is just $\mathrm{rcomp}_{M,q}(\epsilon) = (q, 0)$ and therefore hits left into $q$.
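The four possible outcomes of a computation can be made concrete with a small simulator. The following is only a sketch under assumed conventions that are not part of the text: the transition function is a Python dictionary `delta` from (state, symbol) pairs to (state, direction) pairs, the string `u` is a Python string of one-character symbols, and the names `comp`, `lcomp`, `rcomp` are chosen to mirror the notation above.

```python
from typing import Dict, Optional, Tuple

# Assumed encoding: delta maps (state, symbol) -> (next_state, 'l' or 'r');
# a missing key means that delta is undefined there.
Delta = Dict[Tuple[str, str], Tuple[str, str]]


def comp(delta: Delta, q: str, i: int, u: str) -> Tuple[str, Optional[str]]:
    """Classify the computation started at state q on the i-th symbol of u.

    Positions are 1-based, as in the text. Returns (outcome, last_state),
    where outcome is 'hits left', 'hits right', 'hangs', or 'loops'.
    """
    seen = set()
    while 1 <= i <= len(u):
        if (q, i) in seen:           # a repeated point forces m = infinity
            return ("loops", None)
        seen.add((q, i))
        move = delta.get((q, u[i - 1]))
        if move is None:             # delta undefined: the last point is inside u
            return ("hangs", q)
        q, d = move
        i += 1 if d == "r" else -1
    return ("hits left" if i == 0 else "hits right", q)


def lcomp(delta: Delta, q: str, u: str) -> Tuple[str, Optional[str]]:
    """Left computation: start on the first symbol of u."""
    return comp(delta, q, 1, u)


def rcomp(delta: Delta, q: str, u: str) -> Tuple[str, Optional[str]]:
    """Right computation: start on the last symbol of u."""
    return comp(delta, q, len(u), u)
```

On the empty string the loop body is never entered, so `lcomp` starts at position 1 = |u| + 1 and reports a right hit, while `rcomp` starts at position 0 and reports a left hit, in agreement with Remark 1.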


2. Remark. Also note that, since $M$ can violate an end-marker only by moving past $\dashv$ into $f$, a computation of $M$ on any end-marked $u$ (e.g., $\mathrm{comp}_M(w)$ is such) can only loop, or hang, or hit right into $f$.

2.1-II. Tables. Let $u \in \Sigma_e^*$ and assume $\mathrm{lcomp}_{M,s}(u)$ hits right into a state $p_u$. Motivated by [53], we define the table of $M$ on $u$ as the function $\mathrm{table}_M(u) := \tau : Q_\bot \to Q$ that satisfies $\tau(\bot) := p_u$ and, for all $q \in Q$,

$$\tau(q) := \begin{cases} p & \text{if } \mathrm{rcomp}_{M,q}(u) \text{ hits right into } p, \\ p_u & \text{if } \mathrm{rcomp}_{M,q}(u) \text{ hits left, loops, or hangs.} \end{cases}$$

We stress that the table is defined only when $\mathrm{lcomp}_{M,s}(u)$ hits right; in all other cases, no meaning is associated with the notation $\mathrm{table}_M(u)$.

Note that, whenever the table of $M$ on $u$ is defined, it almost fully describes the behavior of the $1 + |Q|$ computations

$$\mathrm{lcomp}_{M,s}(u) \qquad\text{and}\qquad \mathrm{rcomp}_{M,q}(u), \text{ for all } q \in Q,$$

on the rightmost boundary of $u$, in the sense that, whenever the boundary is indeed hit, $\tau$ returns the resulting last state. The only ambiguity arises when $\tau(q) = p_u$, for some $q \in Q$: then we do not know if this is because the corresponding computation $\mathrm{rcomp}_{M,q}(u)$ misses the rightmost boundary, or because it hits it but it does so into $p_u$. If we allowed $\tau$ to take values in $Q_\bot$ (as opposed to just $Q$), we could easily remove this ambiguity, at the same time making our representation identical to that of [53]. But we will not do so. Our ultimate goal is the construction of a 1dfa that simulates $M$ and, as we shall prove, this slightly ambiguous representation contains exactly the amount of information required for this purpose.

3. Remark. Note that, according to our conventions (Remark 1), the table on the empty string $\mathrm{table}_M(\epsilon)$ is defined, and it equals the constant function $s$ (i.e., the function that maps every element of $Q_\bot$ to the start state of $M$). Similarly, according to our conventions for end-marker violation (Remark 2), whenever the table $\mathrm{table}_M(u)$ on an end-marked $u$ is defined, it necessarily equals the constant function $f$ (i.e., the function that maps every element of $Q_\bot$ to the final state of $M$).

2.1-III. Frontiers. Fix some computation $c = ((q_t, i_t))_{0 \le t \le m}$ of $M$ and consider the $i$-th boundary of the string being read (Figure 2c). The computation crosses this boundary 0 or more times, each crossing being either in the left-to-right or in the right-to-left direction. Collect into set $R_i^c$ all states that result from a rightward crossing and do the same for the leftward crossings to get set $L_i^c$:

$$R_i^c := \{q_{t+1} \mid 0 \le t < m \ \&\ i_t = i \ \&\ i_{t+1} = i + 1\},$$
$$L_i^c := \{q_{t+1} \mid 0 \le t < m \ \&\ i_t = i + 1 \ \&\ i_{t+1} = i\},$$


also making the special provision that $R^c_{i_0 - 1}$ necessarily contains $q_0$.⁶ The pair $(L_i^c, R_i^c)$ partially describes the behavior of $c$ over the $i$-th boundary and we call it the $i$-th frontier of $c$. Note that the description is indeed partial, as the pair contains no information about the order in which $c$ exhibits the states around the $i$-th boundary, and says nothing about the number of times each individual state is exhibited. For a full description we would need instead the $i$-th crossing sequence of $c$ (cf. [21]). However, in certain interesting cases, the extra information of the complete description is redundant. In particular, if we only care to decide reachability between two points via cycle-free computations, then the computations' frontiers contain exactly the amount of information that we need. We will prove and use this in Section 5.

2.2. Nondeterministic, one-way, and single-pass variations. If in the definition of a 2dfa $M = (s, \delta, f)$ more than one next move is allowed at each step, we say the automaton is nondeterministic (2nfa). This formally means that $\delta$ totally maps $Q \times \Sigma_e$ to the powerset of $Q \times \{\mathrm{l}, \mathrm{r}\}$ and implies that $C := \mathrm{comp}_{M,q,i}(u)$ is now a set of computations. If then $P := \{p \mid \text{some } c \in C \text{ hits right into } p\}$, we say $C$ hits right into $P$. Note that, if $u$ is end-marked, then $P$ is either $\emptyset$ or $\{f\}$ (cf. Remark 2). An input $w \in \Sigma^*$ is considered to be accepted iff the set of computations $\mathrm{comp}_M(w) = \mathrm{lcomp}_{M,s}(w_e)$ hits right into $\{f\}$.

If the head of $M$ never moves to the left, we say $M$ is one-way (a 1nfa; or a 1dfa, if $M$ is deterministic).⁷ If no computation of $M$ 'continues after reaching an end-marker', we say $M$ is single-pass (a znfa; or a zdfa).

2.2-I. Tables of a 2nfa. If $M$ is a 2nfa and $u \in \Sigma_e^*$ is any string, we can define the table of $M$ on $u$ similarly to what we did for 2dfas in Section 2.1-II. In particular, the table is defined only if the set of computations $\mathrm{lcomp}_{M,s}(u)$ hits right into some $P_u \ne \emptyset$, and is then the function $\mathrm{table}_M(u) := T : Q_\bot \to \mathcal{P}'(Q)$ that satisfies $T(\bot) := P_u$ and, for all $q \in Q$,

$$T(q) := \begin{cases} P \setminus P_u & \text{if } \mathrm{rcomp}_{M,q}(u) \text{ hits right into some } P \not\subseteq P_u, \\ P_u & \text{if } \mathrm{rcomp}_{M,q}(u) \text{ hits right into some } P \subseteq P_u. \end{cases}$$

⁶ This reflects the convention that the starting state of any computation is considered to be the result of an 'invisible' left-to-right step.

⁷ Note that, under this definition, a one-way finite automaton works on an end-marked input, a deviation from the standard definition. However, it is easy to verify that an automaton that follows either definition can be converted into an equivalent automaton that follows the other definition and has the same set of states. So, all our conclusions concerning the numbers of states in different automata will be valid irrespective of which definition we have in mind.


Note that the definition is consistent with the one for the deterministic case.⁸ Moreover, it suffers the same ambiguities: whenever $T(q) = P_u$ we do not know if this is because all computations in $\mathrm{rcomp}_{M,q}(u)$ miss the right boundary or because some of them hit it but do so only into states that are already in $P_u$.

4. Remark. Also note an analogue of Remark 3: If $u$ is the empty string, then $T$ is defined and equal to the constant function $\{s\}$. If $u$ is end-marked, then $T$ is either undefined or equal to the constant function $\{f\}$.

2.3. Problems. Given any alphabet $\Sigma$, a (promise) problem over $\Sigma$ is any pair $\Pi = (\Pi_{\mathrm{yes}}, \Pi_{\mathrm{no}})$ of disjoint subsets of $\Sigma^*$. An automaton solves $\Pi$ iff it accepts every $w \in \Pi_{\mathrm{yes}}$ but no $w \in \Pi_{\mathrm{no}}$. Note that the behavior of an automaton on strings outside $\Pi_{\mathrm{yes}} + \Pi_{\mathrm{no}}$ does not affect whether it solves $\Pi$ or not. When there are no such strings, namely when $\Pi_{\mathrm{yes}} + \Pi_{\mathrm{no}} = \Sigma^*$, the problem is also called a language and is adequately described by $\Pi_{\mathrm{yes}}$ alone (since then $\Pi_{\mathrm{no}} = \overline{\Pi_{\mathrm{yes}}}$).

We will be interested in problems over the alphabet that contains pairs of the form $(x, d)$ or $(G, d)$, where $x$ is a number in $[n]$, $G$ is a binary relation on $[n]$, and $d$ is a direction tag from $\{\mathrm{l}, \mathrm{r}\}$. In other words, our alphabet is

$$\Gamma := \bigl([n] + \mathcal{P}([n] \times [n])\bigr) \times \{\mathrm{l}, \mathrm{r}\}.$$

However, among all strings over $\Gamma$, we will only care about those that have length 4 and happen to follow the specific pattern

$$(x, \mathrm{l})(G, \mathrm{l})(h, \mathrm{r})(y, \mathrm{r}) \tag{1}$$

where $x$ and $y$ are two numbers in $[n]$, $G$ is a binary relation on $[n]$, and $h$ is a partial function from $[n]$ to $[n]$ which is not defined on $y$. (Note that a partial function is a special case of a binary relation. So, these symbols do exist in $\Gamma$.) We call these strings nice inputs.

Intuitively, given a nice input as above, we think of the two-column graph of Figure 3a, where the columns are two copies of $[n]$, the arrows between the columns are determined by $G$ (left-to-right) and $h$ (right-to-left), and the two special nodes are determined by $x$ (entry point) and $y$ (exit point). On this graph, a path from the entry point to the exit point may or may not exist; if it does, we just say that 'the (graph of the) input has a path'. For example, the nice input of Figure 3a has a path, but the nice input of Figure 3b does not have a path.

Figure 3. (a) A nice input that has a path. (b) A nice input with no path. (c) A deterministic nice input that has a path.

What makes nice inputs interesting is that a 2nfa can decide whether such an input has a path or not using only $n$ states and a single pass over it. More precisely, consider the promise problem $\Phi = (\Phi_{\mathrm{yes}}, \Phi_{\mathrm{no}})$ with

$$\Phi_{\mathrm{yes}} := \{w \in \Gamma^* \mid w \text{ is a nice input that has a path}\},$$
$$\Phi_{\mathrm{no}} := \{w \in \Gamma^* \mid w \text{ is a nice input that has no path}\}.$$

Then $\Phi$ can be solved by a znfa $N_0$ that has $[n]$ as its set of states and implements the following natural algorithm:

On a nice input like (1) between end-markers, we use the first 2 steps to reach $(G, \mathrm{l})$ at state $x$. Then we repeatedly and alternately read the two middle symbols, each time selecting nondeterministically and following one of the arrows defined by $G$ (if any) or following the (at most one) arrow defined by $h$. If we ever reach $(h, \mathrm{r})$ at a state $z$ from which no $h$-arrow departs, we stay at $z$ and move right to check whether $z = y$. If so, we move 2 more steps to the right and accept.

⁸ If the 2nfa $M$ is actually deterministic and $T$, $\tau$ are its tables on $u$ as described by the definitions for 2nfas and 2dfas respectively, then $T$ is defined iff $\tau$ is. Moreover, when both tables are defined, we have $T(r) = \{\tau(r)\}$ for all $r \in Q_\bot$.

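Formally, $N_0 := (1, \delta, 1)$, where $\delta$ is any total function from $[n] \times \Gamma_e$ to the powerset of $[n] \times \{\mathrm{l}, \mathrm{r}\}$ that satisfies the following equations:

$$\begin{aligned}
\delta(1, \vdash) &= \{(1, \mathrm{r})\}, \\
\delta\bigl(1, (x, \mathrm{l})\bigr) &= \{(x, \mathrm{r})\}, \\
\delta(1, \dashv) &= \{(1, \mathrm{r})\}, \\
\delta\bigl(z, (G, \mathrm{l})\bigr) &= \{(z', \mathrm{r}) \mid (z, z') \in G\}, \\
\delta\bigl(z, (h, \mathrm{r})\bigr) &= \text{if } h(z) \text{ is defined then } \{(h(z), \mathrm{l})\} \text{ else } \{(z, \mathrm{r})\}, \\
\delta\bigl(z, (y, \mathrm{r})\bigr) &= \text{if } (z = y) \text{ then } \{(1, \mathrm{r})\} \text{ else } \emptyset.
\end{aligned}$$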

Recall that the behavior of $N_0$ on inputs that are not nice is irrelevant.

We will also be interested in the special case of inputs of the form (1) where, like $h$, the relation $G$ is also a partial function (Figure 3c). It is easy to verify that on such inputs $N_0$ does not use its nondeterminism, so we refer to strings of this form as deterministic nice inputs and we use $g$ in place of $G$ to represent them:

$$(x, \mathrm{l})(g, \mathrm{l})(h, \mathrm{r})(y, \mathrm{r}). \tag{2}$$


Not surprisingly, the promise problem Ψ = (Ψyes , Ψno ) with Ψyes := {w ∈ Γ ∗ | w is a deterministic nice input that has a path}, Ψno := {w ∈ Γ ∗ | w is a deterministic nice input that has no path}, can be solved by a zdfa M0 with state set [n], executing the following straight-forward modification of the previous algorithm: On a deterministic nice input like (2) surrounded by endmarkers, we use the first 2 steps to reach (g, l) at state x. We then repeatedly and alternately read the two middle symbols, each time following the arrow (if any) defined by g or h. If we ever reach (h, r) at a state z from which no h-arrow departs, we stay at z and move right to check whether z = y. If so, we move 2 more steps to the right and accept.

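Formally, $M_0 := (1, \delta, 1)$, where $\delta$ is any partial function from $[n] \times \Gamma_e$ to $[n] \times \{\mathrm{l}, \mathrm{r}\}$ that satisfies the following equations:

$$\begin{aligned}
\delta(1, \vdash) &= (1, \mathrm{r}), \\
\delta\bigl(1, (x, \mathrm{l})\bigr) &= (x, \mathrm{r}), \\
\delta(1, \dashv) &= (1, \mathrm{r}), \\
\delta\bigl(z, (g, \mathrm{l})\bigr) &= \text{if } g(z) \text{ is defined then } (g(z), \mathrm{r}) \text{ else 'undefined'}, \\
\delta\bigl(z, (h, \mathrm{r})\bigr) &= \text{if } h(z) \text{ is defined then } (h(z), \mathrm{l}) \text{ else } (z, \mathrm{r}), \\
\delta\bigl(z, (y, \mathrm{r})\bigr) &= \text{if } (z = y) \text{ then } (1, \mathrm{r}) \text{ else 'undefined'}.
\end{aligned}$$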
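To see the equations for $M_0$ at work, one can simulate it and compare the outcome with a direct search for a path. The sketch below makes several assumptions that are not in the text: partial functions are Python dictionaries, tape symbols are tagged pairs such as `("g", g)`, the end-markers are tagged `"lend"` and `"rend"`, and every transition that the equations leave unspecified is treated as undefined. The names `m0_delta`, `m0_accepts` and `has_path` are illustrative only.

```python
def m0_delta(z, sym):
    """One transition of M0: returns (state, direction), or None if undefined."""
    kind, val = sym
    if kind in ("lend", "rend"):      # end-markers: only state 1 moves on
        return (1, "r") if z == 1 else None
    if kind == "x":                   # (x, l): load the entry point
        return (val, "r") if z == 1 else None
    if kind == "g":                   # (g, l): follow the g-arrow, if any
        return (val[z], "r") if z in val else None
    if kind == "h":                   # (h, r): follow the h-arrow, or step right
        return (val[z], "l") if z in val else (z, "r")
    return (1, "r") if z == val else None   # (y, r): check the exit point


def m0_accepts(x, g, h, y):
    """Run M0 on the end-marked deterministic nice input (x,l)(g,l)(h,r)(y,r)."""
    tape = [("lend", None), ("x", x), ("g", g), ("h", h), ("y", y), ("rend", None)]
    state, pos, seen = 1, 0, set()
    while 0 <= pos < len(tape):
        if (state, pos) in seen:      # repeated point: the computation loops
            return False
        seen.add((state, pos))
        move = m0_delta(state, tape[pos])
        if move is None:              # the computation hangs
            return False
        state, d = move
        pos += 1 if d == "r" else -1
    return pos == len(tape) and state == 1   # hits right into the final state 1


def has_path(x, g, h, y):
    """Directly follow the arrows of the two-column graph from x towards y."""
    z, seen = x, set()
    while z not in seen:
        seen.add(z)
        if z not in g:
            return False
        r = g[z]                      # node reached in the right column
        if r == y:
            return True
        if r not in h:
            return False
        z = h[r]                      # back to the left column
    return False


# Entry 1, g-arrows 1->2 and 3->3, h-arrow 2->3, exit 3: the path is 1, 2, 3, 3.
print(m0_accepts(1, {1: 2, 3: 3}, {2: 3}, 3))   # True
```

The simulation never needs more than the $n$ states in $[n]$, which is the point of the construction: the current state is always the node of the graph that the path has reached.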

Again, only the behavior of M0 on deterministic nice inputs is important. The lower bounds that we will be proving in the following sections will be based on variants of problems Φ and Ψ . Perhaps the reader has already recognized in them two restrictions of Cn , the 2n-complete language of [48]. At the same time, Φ is a large-alphabet variant of a problem used in [3]. 3. From 2DFAs to 1DFAs Fix an n-state 2dfa M = (s, δ, f ) over some set of states Q and an alphabet Σ. In this section we will build a 1dfa that is equivalent to M . We begin with some observations and facts. 3.1. Tables. Consider some non-empty string u and suppose that the table of M on it τ := tableM (u) is defined. We then know that the computation c := lcompM,s (u) hits right into τ (⊥). This implies that c visits the rightmost symbol of u at least once. If q is the state of M during the latest such visit, we easily see that the computation c′ := rcompM,q (u) is a suffix of c. Hence, it also hits right, like c. Moreover, it certainly hits right into the same state as c, meaning τ (q) = τ (⊥). We thus conclude that τ assigns to ⊥ one of the values that it uses for the states in Q. That is, τ (⊥) ∈ τ [Q]. This motivates the following. 5. Definition. A table of M is any τ : Q⊥ → Q such that τ (⊥) ∈ τ [Q].
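Definition 5 is concrete enough to be checked by brute force for small $n$. The sketch below assumes, purely for illustration, that $Q = [n]$ and that a table is stored as a tuple whose first entry is $\tau(\bot)$; the names `tables` and `closed_form` are not from the text. It enumerates all functions satisfying Definition 5 and compares their number with $n(n^n - (n-1)^n)$, the value that the abstract gives for the trade-off from 2dfas to 1dfas.

```python
from itertools import product


def tables(n):
    """Yield every table over Q = [n] as a tuple (tau(bot), tau(1), ..., tau(n))."""
    for t in product(range(1, n + 1), repeat=n + 1):
        if t[0] in t[1:]:             # Definition 5: tau(bot) belongs to tau[Q]
            yield t


def closed_form(n):
    """The expression n(n^n - (n-1)^n)."""
    return n * (n**n - (n - 1) ** n)


for n in range(1, 5):
    count = sum(1 for _ in tables(n))
    print(n, count, closed_form(n))
```

For $n = 2$, for instance, 2 of the 8 unrestricted triples have a first component that differs from both others, which leaves 6 tables.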


Figure 4. Trying to compute $\tau'(\bot)$: (a) $c$ hangs right away, (b) $c$ hits right in the first step, and (c) $c$ moves left in the first step.

Note that this defines what a "table of $M$" is, whereas Section 2.1-II defined what the "table of $M$ on $u$" is, for any string $u$. The next lemma shows the relation between these two notions. The lemma after it carries out an easy counting argument.

6. Lemma. If the table of $M$ on $u$ is defined, then it is a table of $M$.

Proof. If $u \ne \epsilon$, the proof is the argument before Definition 5. If $u = \epsilon$, the table of $M$ on $u$ is the constant function $s$ (cf. Remark 3), and therefore it obviously qualifies as a table of $M$.

7. Lemma. The number of distinct tables of $M$ is exactly $n(n^n - (n-1)^n)$.

Proof. Easily, the number of distinct tables of $M$ is exactly equal to the number of $(n+1)$-tuples of elements of $[n]$ where the first component equals some other component. Since there is a total of $n^{n+1}$ unrestricted tuples and exactly $n(n-1)^n$ of them violate the restriction about the first component, our number is the difference $n^{n+1} - n(n-1)^n$, as claimed.

3.2. Compatibilities among tables. Consider any string $u$, any symbol $a$, and suppose that the table $\tau := \mathrm{table}_M(u)$ is defined. We would like to know whether the table $\tau' := \mathrm{table}_M(ua)$ is also defined and, if so, to compute it. In this section we will show how this can be done using only $\tau$ and $a$, but not $u$. Our algorithm will be based on the algorithm implied in [53], but it will also need some modifications to account for the ambiguity of our representation.

Recall that $\tau'$ is defined iff the computation $\mathrm{lcomp}_{M,s}(ua)$ hits right. (Figure 4.) Clearly, this computation ends in $c := \mathrm{rcomp}_{M,\tau(\bot)}(ua)$. So, in order to figure out whether $\tau'$ is defined, we can just check whether $c$ hits right. If it does, our check will also reveal the last state of $c$, which we know is the value of $\tau'$ on $\bot$.

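Left, computing $e_{\tau,a}(\bot)$:

  $p \leftarrow \tau(\bot)$
  repeat:
    if $\delta(p, a)$ undefined: fail
    $(r, d) \leftarrow \delta(p, a)$
    if $d = \mathrm{r}$: return $r$
    if $\tau(r) = \tau(\bot)$: fail
    if $\tau(r)$ seen before: fail
    $p \leftarrow \tau(r)$

Right, computing $e_{\tau,a}(q)$:

  $p \leftarrow q$
  repeat:
    if $\delta(p, a)$ undefined: return $\tau'(\bot)$
    $(r, d) \leftarrow \delta(p, a)$
    if $d = \mathrm{r}$: return $r$
    if $\tau(r) = \tau(\bot)$: return $\tau'(\bot)$
    if $\tau(r)$ seen before: return $\tau'(\bot)$
    $p \leftarrow \tau(r)$

Figure 5. Computing $e_{\tau,a}(\bot)$ (left) and $e_{\tau,a}(q)$ (right).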

So, let us set $p := \tau(\bot)$ and consider the first step of $c$. If $\delta(p, a)$ is undefined (Figure 4a), then $c$ hangs inside $ua$ and $\tau'$ is undefined. If $\delta(p, a)$ is defined and equal to some $(r, \mathrm{r})$ (Figure 4b), then $c$ immediately hits right into $r$, so $\tau'$ is defined and $\tau'(\bot) = r$. The last case (Figure 4c) is that $\delta(p, a)$ equals some $(r, \mathrm{l})$, so then $c$ starts behaving as $d := \mathrm{rcomp}_{M,r}(u)$ and we know this behavior is already encoded in $\tau$ as the value $p^* := \tau(r)$. We distinguish two cases.
• If $p^* = \tau(\bot)$, then we can conclude that $\tau'$ is undefined, as we are in one of the following two cases: either $d$ hits left, loops, or hangs inside $u$, therefore $c$ does the same inside $ua$, and hence $\tau'$ is not defined; or $d$ hits right into $\tau(\bot)$, therefore $c$ is back on $a$ and at state $p$ again, so $c$ loops inside $ua$ and hence $\tau'$ is again undefined.
• If $p^* \ne \tau(\bot)$, we know $d$ hits right into $p^*$, so that $c$ is back on $a$. To find out what happens next, we simply ask $\delta$ exactly as before, and continue. However, we should be careful not to ask $\delta$ a question we have already asked. If this is about to happen, then $c$ repeats $p^*$ under $a$, so $c$ actually loops inside $ua$, in which case $\tau'$ is undefined.

This concludes our description of how to check whether $\tau'$ is defined and, if so, also compute $\tau'(\bot)$. Equivalently, we have shown that the algorithm of Figure 5(left) always terminates and either fails, if $\tau'$ is undefined, or correctly returns the value $\tau'(\bot)$, if $\tau'$ is defined.

In the case that $\tau'$ is defined, we also want to compute the rest of its values, namely $\tau'(q)$ for all $q \in Q$. Given the discussion so far, this is easy: we simply run the algorithm of Figure 5(left) again, but starting with $p := q$ (as opposed to $p := \tau(\bot)$). It is easy to verify that, if the algorithm does not fail, then $\mathrm{rcomp}_{M,q}(ua)$ hits right exactly into the state that the algorithm returns. So, if the algorithm does return a value, this value is the correct $\tau'(q)$.
On the other hand, if the algorithm fails, this is due to one of its fail statements. We distinguish cases. • If this is due to the 1st or the 3rd fail statement: Then we know that rcompM,q (ua) hangs on a or loops inside ua. Therefore, by


definition, $\tau'(q)$ equals $\tau'(\bot)$. So, instead of failing, the algorithm should have returned $\tau'(\bot)$.
• If this is due to the 2nd fail statement: Then we know that one of the following is true about the computation $\mathrm{rcomp}_{M,q}(ua)$:
  ◦ at some point, the computation locks itself inside $u$ and eventually hits left, hangs, or loops: Then we again know that, by definition, $\tau'(q) = \tau'(\bot)$. So, instead of failing, the algorithm should have returned $\tau'(\bot)$.
  ◦ at some point, the computation really enters state $\tau(\bot)$ while on $a$: Then we know that, from that point on, the computation will behave identically to $\mathrm{rcomp}_{M,\tau(\bot)}(ua)$ and hence it will eventually hit right into $\tau'(\bot)$. So, once again, the algorithm should have returned $\tau'(\bot)$.

In conclusion, we see that in all cases of failure the algorithm should have returned $\tau'(\bot)$. Hence, if in Figure 5(left) we just replace every fail statement with the statement "return $\tau'(\bot)$", we have a correct algorithm for computing $\tau'(q)$, provided, of course, that $\tau'(\bot)$ is defined and available. Figure 5(right) shows the algorithm after these modifications.

Overall, we have described an algorithm $e_{\tau,a}$ that can be used for checking if $\tau'$ is defined and, if so, for computing its values. The algorithm can be run on any element of $Q_\bot$. Figure 5(left) shows the computation $e_{\tau,a}(\bot)$. Figure 5(right) shows the computation $e_{\tau,a}(q)$, for $q \in Q$, where every reference to $\tau'(\bot)$ can be understood as a call to $e_{\tau,a}(\bot)$. With $e_{\tau,a}$ in our vocabulary, we can summarize the discussion of this section into the next definition and lemma.

8. Definition. If $\tau$ and $\tau'$ are two tables of $M$ and $a$ some symbol in $\Sigma_e$, we say that $\tau$ is $a$-compatible to $\tau'$ if and only if

$$\tau'(\bot) = e_{\tau,a}(\bot) \qquad\text{and}\qquad \text{for all } q \in Q:\ \tau'(q) = e_{\tau,a}(q),$$

where $e_{\tau,a}$ is the algorithm from Figure 5.

9. Lemma. Suppose $u \in \Sigma_e^*$ and the table $\tau := \mathrm{table}_M(u)$ is defined. Then, for any $a \in \Sigma_e$ and any table $\tau'$ of $M$, the following holds:

$$\tau \text{ is } a\text{-compatible to } \tau' \iff \mathrm{table}_M(ua) \text{ is defined and equals } \tau'.$$

Proof. Suppose $\tau$ is $a$-compatible to $\tau'$. Then $\tau'(\bot) = e_{\tau,a}(\bot)$. Hence, on input $\bot$, the algorithm $e_{\tau,a}$ does not fail. This implies that the table $\mathrm{table}_M(ua)$ is defined. Moreover, its values are exactly those returned by $e_{\tau,a}$. But the values of $\tau'$ are also the same as those returned by $e_{\tau,a}$ (because $\tau$ is $a$-compatible to it). Overall, $\mathrm{table}_M(ua) = \tau'$.

Conversely, assume that the table $\mathrm{table}_M(ua)$ is defined and equals $\tau'$. The argument before Definition 8 proves that $e_{\tau,a}$ returns the same values as $\mathrm{table}_M(ua)$. Hence, $e_{\tau,a}$ returns the same values as $\tau'$. So, $\tau$ is $a$-compatible to $\tau'$, as required.
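The two procedures of Figure 5 differ only in what they do at a fail statement, so they can share one loop. The following sketch assumes an encoding that is not in the text: tables are dictionaries keyed by the states plus a marker `BOT` for $\bot$, the transition function is a dictionary from (state, symbol) to (state, direction), and `FAIL` is a sentinel for "the new table is undefined". The names `e_bot`, `e_q` and `next_table` are illustrative.

```python
BOT = "bot"        # assumed key standing for the special object ⊥
FAIL = object()    # assumed sentinel: the table on ua is undefined


def _e(tau, delta, a, p, on_fail):
    """Shared loop of Figure 5; on_fail is what a 'fail' line evaluates to."""
    seen = set()
    while True:
        if (p, a) not in delta:                     # 1st fail: hang on a
            return on_fail
        r, d = delta[(p, a)]
        if d == "r":                                # hit right into r
            return r
        if tau[r] == tau[BOT] or tau[r] in seen:    # 2nd and 3rd fail
            return on_fail
        seen.add(tau[r])
        p = tau[r]                                  # back on a, at state tau(r)


def e_bot(tau, delta, a):
    """Figure 5 (left): the value of the new table on ⊥, or FAIL."""
    return _e(tau, delta, a, tau[BOT], FAIL)


def e_q(tau, delta, a, q, new_bot):
    """Figure 5 (right): the value of the new table on q, given its value on ⊥."""
    return _e(tau, delta, a, q, new_bot)


def next_table(tau, delta, a, states):
    """Return the unique table that tau is a-compatible to, or None if none exists."""
    new_bot = e_bot(tau, delta, a)
    if new_bot is FAIL:
        return None
    new = {q: e_q(tau, delta, a, q, new_bot) for q in states}
    new[BOT] = new_bot
    return new
```

Each iteration either returns or adds a new value to `seen`, so the loop runs at most $|Q| + 1$ times; this is the termination claim made for Figure 5.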


3.3. The upper bound. We are now ready to build a 1dfa $M'$ that simulates $M$ with exactly $n(n^n - (n-1)^n)$ states. First, we characterize the acceptance of an input by $M$ in terms of the tables of $M$ and the compatibilities between them.

10. Definition. Suppose $w \in \Sigma^*$ is of length $l$ and $\tau_0, \tau_1, \dots, \tau_{l+2}$ is a sequence of tables of $M$. We say the sequence fits $w$ iff:
1. $\tau_0$ is the constant function $s$,
2. for all $i = 0, 1, \dots, l + 1$: $\tau_i$ is $w_{e,i+1}$-compatible to $\tau_{i+1}$,⁹
3. $\tau_{l+2}$ is the constant function $f$.

11. Theorem. $M$ accepts $w$ iff some sequence of tables of $M$ fits $w$.

Proof. Fix $w$ and consider the sequence

$$\mathrm{table}_M(\epsilon),\ \mathrm{table}_M(\vdash),\ \mathrm{table}_M(\vdash w_1),\ \dots,\ \mathrm{table}_M(\vdash w \dashv). \tag{3}$$

Clearly, there is no guarantee that all members in this sequence are defined. But if they all are, then the sequence trivially satisfies Conditions 1 and 3 of Definition 10 (cf. Remark 3) and also Condition 2 (by Lemma 9). Hence, if all tables in the sequence are defined, the sequence fits $w$.

Now assume $M$ accepts $w$. Then the computation $\mathrm{lcomp}_{M,s}(w_e)$ hits right into $f$. This immediately implies that each one of the tables in (3) is defined. So, their sequence fits $w$.

Conversely, suppose some sequence of tables of $M$ fits $w$. By Lemma 9 and an easy induction, this sequence must be identical to the one in (3). This implies that the table $\mathrm{table}_M(w_e)$ is defined and equal to the constant function $f$. Hence, the computation $\mathrm{lcomp}_{M,s}(w_e)$ hits right, into $f$. This means that $M$ accepts $w$.

Based on this theorem, the construction of $M'$ is straightforward. To test if its input is accepted by $M$, the automaton checks if there is a sequence of tables of $M$ that fits it. At every step, it 'remembers' the last table of the subsequence found so far. More carefully, the algorithm is as follows:

We start with the constant table $s$ in our memory. On reading a symbol $a$, we check if the table in our memory is $a$-compatible to any table of $M$. If so, there is exactly one such table, so we change our memory to it and move right. If not, there is no sequence that fits $w$ and we hang. We accept if we ever reach the constant table $f$.

Formally, $M' := (s', \delta', f')$ where $Q' := \{\tau \mid \tau \text{ is a table of } M\}$, $s' :=$ the table that always returns $s$, and $f' :=$ the table that always returns $f$. For each $\tau$ and $a$, the value $\delta'(\tau, a)$ is either the unique $\tau'$ to which $\tau$ is $a$-compatible, if such a $\tau'$ exists; or undefined, if $\tau$ is $a$-compatible to none of the tables of $M$. Clearly, $M'$ is correct and as large as claimed.

⁹ Recall that by $w_{e,i+1}$ we mean the $(i+1)$-st symbol of the end-marked string $\vdash w \dashv$. This is either $\vdash$ (when $i = 0$), or $w_i$ (when $i \ne 0, l + 1$), or $\dashv$ (when $i = l + 1$).

Figure 6. The input $w_\tau^i$ when $n = 6$, table $\tau$ maps $\bot, 1, 2, 3, 4, 5, 6$ to the values $3, 1, 3, 6, 3, 5, 6$ respectively, and (a) $i = 6$, or (b) $i = 4$.

3.4. The lower bound. We will now prove that the construction of the previous section is optimal. In other words, we will prove that some $n$-state 2dfas have no equivalent 1dfas with fewer than $n(n^n - (n-1)^n)$ states. To this end, we will actually exhibit such a 2dfa.

Our witness will be the automaton $M_0$ from Section 2.3, solving problem $\Psi$. So, for the remainder of this section, we assume that the $n$-state 2dfa $M$ that we kept fixed from the beginning of Section 3 is actually $M_0$. We will prove that the automaton $M'$ constructed in the previous section is smallest among the 1dfas that are equivalent to $M$. In fact, we will show that $M'$ needs all its states not only for staying equivalent to $M$, but even for solving $\Psi$ (on a stricter promise, even).

More precisely, consider any table $\tau : [n]_\bot \to [n]$ of $M$ and any 'query' $i \in [n]$. The pair $(\tau, i)$ gives rise to the deterministic nice input

$$w_\tau^i := (x_\tau, \mathrm{l})(g_\tau, \mathrm{l})(h_\tau^i, \mathrm{r})(y_\tau^i, \mathrm{r}),$$

where $x_\tau$ is the smallest $x$ for which $\tau(x) = \tau(\bot)$; $g_\tau$ is the restriction of $\tau$ to $[n]$; $h_\tau^i$ is the single arrow from $\tau(\bot)$ to $i$, if $\tau(i) \ne \tau(\bot)$, or else the empty function; and $y_\tau^i$ is just $\tau(i)$ (see Figure 6 for examples):

$$\begin{aligned}
x_\tau &:= \min\{x \mid \tau(x) = \tau(\bot)\}, & y_\tau^i &:= \tau(i), \\
g_\tau &:= \{(x, \tau(x)) \mid x \in [n]\}, & h_\tau^i &:= \begin{cases} \emptyset & \text{if } \tau(i) = \tau(\bot), \\ \{(\tau(\bot), i)\} & \text{otherwise.} \end{cases}
\end{aligned} \tag{4}$$

It is easy to verify the following.

12. Lemma. For any table $\tau$ and any state $i$ of $M$: The table of $M$ on the prefix $\vdash (x_\tau, \mathrm{l})(g_\tau, \mathrm{l})$ of $w_\tau^i$ is exactly $\tau$ and the computation $c$ of $M$ on $w_\tau^i$ is accepting. Moreover, if $\tau(i) \ne \tau(\bot)$, then $c$ contains a left-to-right crossing of the middle boundary from $i$ to $\tau(i)$.

Hence, the inputs in the set $\{w_\tau^i \mid \tau \text{ is a table and } i \text{ a state of } M\}$ push $M'$ to its limits, in the sense that they collectively force every one of its


states to be used in every interesting way in some accepting computation. Hence, no state of $M'$ is redundant in some straightforward manner, which intuitively suggests $M'$ is minimal. We will turn this intuition into a proof.

We start with the observation that, for every two distinct tables of $M$, there is a 'query' that can distinguish them.

13. Lemma. Two tables $\tau$ and $\tau'$ of $M$ are distinct iff there exist a partial function $h : [n] \to [n]$ and a $y \in [n]$ such that exactly one of the following inputs has a path:

$$(x_\tau, \mathrm{l})(g_\tau, \mathrm{l})(h, \mathrm{r})(y, \mathrm{r}) \qquad\text{and}\qquad (x_{\tau'}, \mathrm{l})(g_{\tau'}, \mathrm{l})(h, \mathrm{r})(y, \mathrm{r}).$$

Proof. If the two tables are identical, then the two inputs are also identical and either none or both of them have a path. For the interesting direction, suppose $\tau$, $\tau'$ are distinct. We examine two cases.

If $\tau(\bot) \ne \tau'(\bot)$, then choose $h$ to be the empty function and $y$ the smaller of $\tau(\bot)$ and $\tau'(\bot)$. Then the input corresponding to the table from which $y$ takes its value has a path but the other input does not.

If $\tau(\bot) = \tau'(\bot) =: y^*$, then there exists an $x \in [n]$ such that $\tau(x) \ne \tau'(x)$ (or else $\tau$, $\tau'$ would be identical, a contradiction). Pick the smallest such $x$, choose $h$ to contain only the arrow from $y^*$ to $x$, and set $y$ to the smallest of $\tau(x)$, $\tau'(x)$ that is different from $y^*$. Clearly, there is a path in the input corresponding to the table from which $y$ takes its value, while the other input has no path.

Now, for every pair of tables $\tau$ and $\tau'$ of $M$, let us define the input

$$w_{\tau,\tau'} := (x_\tau, \mathrm{l})(g_\tau, \mathrm{l})(h_{\tau,\tau'}, \mathrm{r})(y_{\tau,\tau'}, \mathrm{r}),$$

where $x_\tau$ and $g_\tau$ are given by (4), while $h_{\tau,\tau'}$ and $y_{\tau,\tau'}$ are either the values given by the proof of Lemma 13, if $\tau \ne \tau'$; or the values $h_\tau^i$, $y_\tau^i$ as defined in (4) for $i = x_\tau$, if $\tau = \tau'$.¹⁰ Strengthening the promise of $\Psi$ to allow only deterministic nice inputs of this form, we get the problem $\Psi'$ with

$$\Psi'_{\mathrm{yes}} := \{w_{\tau,\tau'} \mid \tau, \tau' \text{ are tables of } M \text{ and } w_{\tau,\tau'} \text{ has a path}\},$$
$$\Psi'_{\mathrm{no}} := \{w_{\tau,\tau'} \mid \tau, \tau' \text{ are tables of } M \text{ and } w_{\tau,\tau'} \text{ has no path}\}.$$
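The query chosen in the proof of Lemma 13 can be computed mechanically. In the sketch below, a table over $Q = [n]$ is assumed to be stored as a tuple whose entry 0 is $\tau(\bot)$, and partial functions are dictionaries; the names `x_of`, `g_of`, `query`, `separated` and `has_path` are illustrative and do not appear in the text.

```python
def has_path(x, g, h, y):
    """Follow the arrows of a deterministic nice input from entry x towards exit y."""
    z, seen = x, set()
    while z not in seen:
        seen.add(z)
        if z not in g:
            return False
        r = g[z]
        if r == y:
            return True
        if r not in h:
            return False
        z = h[r]
    return False


def x_of(tau):
    """x_tau from (4): the smallest x with tau(x) = tau(bot)."""
    return min(x for x in range(1, len(tau)) if tau[x] == tau[0])


def g_of(tau):
    """g_tau from (4): the restriction of tau to [n]."""
    return {x: tau[x] for x in range(1, len(tau))}


def query(t1, t2):
    """The pair (h, y) picked in the proof of Lemma 13 for distinct tables."""
    if t1[0] != t2[0]:
        return {}, min(t1[0], t2[0])
    ystar = t1[0]
    x = min(x for x in range(1, len(t1)) if t1[x] != t2[x])
    y = min(v for v in (t1[x], t2[x]) if v != ystar)
    return {ystar: x}, y


def separated(t1, t2):
    """True iff exactly one of the two inputs built from (h, y) has a path."""
    h, y = query(t1, t2)
    return has_path(x_of(t1), g_of(t1), h, y) != has_path(x_of(t2), g_of(t2), h, y)
```

In both cases of the proof, $y$ differs from the only point where $h$ may be defined, so the constructed strings are indeed nice inputs.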

Clearly, $M$ solves $\Psi'$, so that a single-pass 2dfa can solve this problem with only $n$ states. However, for 1dfas the problem is maximally hard.

14. Lemma. Every 1dfa for $\Psi'$ has at least $n(n^n - (n-1)^n)$ states.

Proof. Towards a contradiction, suppose that $A$ is a 1dfa that solves $\Psi'$ with fewer than $n(n^n - (n-1)^n)$ states. For every table $\tau$ of $M$, the automaton accepts $w_{\tau,\tau}$ (Lemma 12), so the computation $c_\tau := \mathrm{comp}_A(w_{\tau,\tau})$ hits right. In particular, it crosses the middle boundary from left to right. Let $q_\tau$ be the state that results from this crossing. Since there are fewer states in $A$ than tables of $M$, two tables $\tau \ne \tau'$ must map to the same state $q := q_\tau = q_{\tau'}$. As a consequence, the computations of $A$ on $w_{\tau,\tau'}$ and $w_{\tau',\tau}$ both cross the middle boundary into $q$ (since the two strings are identical to $w_{\tau,\tau}$ and $w_{\tau',\tau'}$ before this boundary, respectively) and therefore have the same suffix (since the two strings are identical to each other after that boundary). In particular, they are either both accepting or both rejecting, a contradiction to Lemma 13 and the definition of $w_{\tau,\tau'}$ and $w_{\tau',\tau}$.

¹⁰ In the first case, note that the order of the pair $\tau$, $\tau'$ is not important: $h_{\tau,\tau'} = h_{\tau',\tau}$ and $y_{\tau,\tau'} = y_{\tau',\tau}$. In the second case, note that $h_{\tau,\tau} = \emptyset$ and $y_{\tau,\tau} = \tau(\bot)$.

4. From 2NFAs to 1DFAs

Fix an $n$-state 2nfa $N = (s, \delta, f)$ over some set of states $Q$ and some alphabet $\Sigma$. In this section we will generalize the discussion of Section 3, in order to build a 1dfa equivalent to $N$.

4.1. Tables. Consider any non-empty string $u$ and suppose that the table $T := \mathrm{table}_N(u)$ is defined. This means that the set of computations $C := \mathrm{lcomp}_{N,s}(u)$ hits right into the set of states $T(\bot) \ne \emptyset$. Hence, $C$ contains right-hitting computations. Let $c$ be one of them. Clearly, $c$ visits the rightmost symbol of $u$ at least once. If $p$ is the state of $N$ at one of these visits, then combining the prefix of $c$ up to that visit with any of the (possibly 0) right-hitting computations in $\mathrm{rcomp}_{N,p}(u)$ produces a right-hitting computation which is also in $C$. Therefore, the computations of $\mathrm{rcomp}_{N,p}(u)$ can hit right only into states that are already in $T(\bot)$. By the definition of $T$, this implies that $T(p) = T(\bot)$.

We thus conclude that $T$ assigns the value $T(\bot)$ to at least one state. Furthermore, a straightforward inspection of the definition of $T$ reveals that every state that is not assigned the set $T(\bot)$ is assigned a set disjoint from $T(\bot)$. This motivates the following definition.

15. Definition. A table of $N$ is any $T : Q_\bot \to \mathcal{P}'(Q)$ such that
1. for every $p \in Q$: $T(p) = T(\bot)$ or $T(p) \cap T(\bot) = \emptyset$,
2. for some $p \in Q$: $T(p) = T(\bot)$.
As in the deterministic case, note that this definition explains what a “table of N” is, whereas Section 2.2-I defines what the “table of N on u” is, for any string u. The relation between the two notions is shown in the next lemma. The lemma after it carries out some counting.

16. Lemma. If the table of N on u is defined, then it is a table of N.

Proof. If u ≠ ε, then the argument before Definition 15 proves the claim. If u = ε, then the table of N on u is the constant function {s} (cf. Remark 4), and thus it is obviously a table of N.


17. Lemma. The number of distinct tables of N is exactly¹¹

$$\sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \binom{n}{i} \binom{n}{j} \left(2^{i} - 1\right)^{j}.$$

Proof. The number of distinct tables for N is equal to the number of distinct (n + 1)-tuples of non-empty subsets of [n] where the set of the first component appears in other components, too, but intersects no other set in the tuple. For each i, j = 1, 2, . . . , n, there are $\binom{n}{i}$ choices for the set S in the first component and $\binom{n}{j}$ choices for the set of the components after it that host the same set S. Given i and j, each one of the remaining (n + 1) − (j + 1) components can admit any of the 2^{n−i} − 1 non-empty sets that avoid intersection with S. Overall, we have

$$\sum_{i=1}^{n} \sum_{j=1}^{n} \binom{n}{i} \binom{n}{j} \left(2^{n-i} - 1\right)^{n-j} = \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \binom{n}{i} \binom{n}{j} \left(2^{i} - 1\right)^{j}$$

choices for completing this (n + 1)-tuple, where the right-hand side of the equation is obtained via a straightforward variable substitution.

4.2. Compatibilities among tables. Consider any string u, any symbol a, and suppose the table T := table_N(u) is defined. We will describe an algorithm for deciding whether the table T′ := table_N(ua) is defined and, if so, for computing it based on T and a but not u. As in Section 3.2, our algorithm works on any element of Q_⊥. On input ⊥, it either returns T′(⊥) or fails, depending on whether T′ is defined or not. On input q ∈ Q and with T′(⊥) available, it returns T′(q). We call this algorithm E_{T,a} and we derive it from e_{τ,a} of Section 3.2 by a straightforward generalization that takes into account nondeterminism; see Figure 7 for the two distinct computations, E_{T,a}(⊥) and E_{T,a}(q). Based on E_{T,a}, we can again define a-compatibility between tables. The proof of the next lemma is similar to that of Lemma 9 and is omitted.

18. Definition. If T and T′ are two tables of N and a some symbol in Σ_e, we say that T is a-compatible to T′ if and only if

T′(⊥) = E_{T,a}(⊥)   and   for all q ∈ Q: T′(q) = E_{T,a}(q),

where E_{T,a} is the algorithm from Figure 7.

19. Lemma. Suppose u ∈ Σ_e^∗ and the table T := table_N(u) is defined. Then, for any a ∈ Σ_e and any table T′ of N, the following holds:

T is a-compatible to T′ ⟺ table_N(ua) is defined and equals T′.

¹¹ For i = j = 0, this expression uses the quantity 0^0. In this context, 0^0 = 1.
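The count of Lemma 17 can be sanity-checked by brute force for small n. The following Python sketch is an illustration only and not part of the original argument: it enumerates all (n + 1)-tuples of non-empty subsets of [n], keeps those that satisfy the two conditions of Definition 15, and compares the total with the closed formula. All function names are hypothetical, and Python's convention 0**0 == 1 happens to agree with footnote 11.

```python
from itertools import combinations, product
from math import comb

def nonempty_subsets(n):
    """All non-empty subsets of [n] = {1, ..., n}."""
    items = range(1, n + 1)
    return [frozenset(c) for k in range(1, n + 1) for c in combinations(items, k)]

def is_table(t):
    """t = (T(bot), T(1), ..., T(n)); checks the two conditions of Definition 15."""
    bot, rest = t[0], t[1:]
    cond1 = all(s == bot or not (s & bot) for s in rest)  # equal or disjoint
    cond2 = any(s == bot for s in rest)                   # some state gets T(bot)
    return cond1 and cond2

def count_tables_brute_force(n):
    subsets = nonempty_subsets(n)
    return sum(1 for t in product(subsets, repeat=n + 1) if is_table(t))

def count_tables_formula(n):
    # The double sum of Lemma 17; (2**0 - 1)**0 evaluates to 1 in Python.
    return sum(comb(n, i) * comb(n, j) * (2**i - 1) ** j
               for i in range(n) for j in range(n))
```

For n = 1, 2, 3 both functions return 1, 7, and 133 respectively; larger n quickly becomes too slow for the brute-force version.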


Computing E_{T,a}(⊥):

    P ← T(⊥), S′ ← ∅
    repeat:
        R ← ⋃ {δ(p, a) | p ∈ P}
        S′ ← S′ ∪ {r | (r, r) ∈ R}
        P* ← ⋃ {T(r) | (r, l) ∈ R}
        P ← {p ∈ P* | p not seen before}
        if P = ∅ then:
            if S′ = ∅: fail
            if S′ ≠ ∅: return S′

Computing E_{T,a}(q):

    P ← {q}, S′ ← ∅
    repeat:
        R ← ⋃ {δ(p, a) | p ∈ P}
        S′ ← S′ ∪ {r | (r, r) ∈ R}
        P* ← ⋃ {T(r) | (r, l) ∈ R} \ T(⊥)
        P ← {p ∈ P* | p not seen before}
        if P = ∅ then:
            if S′ ⊆ T′(⊥): return T′(⊥)
            if S′ ⊄ T′(⊥): return S′ \ T′(⊥)

Figure 7. Computing E_{T,a}(⊥) (first listing) and E_{T,a}(q) (second listing).

4.3. The upper bound. We now construct a 1dfa M that is equivalent to N. As in Section 3.3, we base our construction on a characterization of acceptance by N in terms of tables and compatibilities.

20. Definition. Suppose w ∈ Σ^∗ is of length l and T_0, T_1, . . . , T_{l+2} is a sequence of tables of N. We say the sequence fits w iff:
1. T_0 is the constant function {s},
2. for all i = 0, 1, . . . , l + 1: T_i is w_{e,i+1}-compatible to T_{i+1},
3. T_{l+2} is the constant function {f}.

21. Theorem. N accepts w iff some sequence of tables of N fits w.

The theorem is proved similarly to Theorem 11 and suggests that M should simply try to find a sequence of tables of N that fits its input. Therefore, M implements the following algorithm:

We start with the constant table {s}. On reading a symbol a, we check if the current table is a-compatible to any table of N. If so, there is exactly one such table, so we change our memory to it and move right. Otherwise, there is no sequence that fits w, and we hang. We accept if we ever reach the constant table {f}.
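The two computations of Figure 7 can be sketched in Python as follows. This is a hedged reading of the figure rather than a definitive implementation. We assume that δ is given as a dictionary `delta` mapping (state, symbol) to a set of (state, direction) pairs with direction "l" or "r", that a table is a dictionary from Q ∪ {⊥} to sets of states, and that ⊥ is represented by the hypothetical marker `BOT`. The helper names are ours.

```python
BOT = "bot"  # hypothetical stand-in for the symbol ⊥

def explore(start, T, delta, a, drop_bottom):
    """Shared loop of Figure 7; returns S', the states entered by right moves."""
    P = set(start)
    seen = set(start)          # every state that has ever been placed in P
    S = set()                  # plays the role of S' in the figure
    while P:
        R = set()
        for p in P:
            R |= delta.get((p, a), set())
        S |= {r for (r, d) in R if d == "r"}
        P_star = set()
        for (r, d) in R:
            if d == "l":
                P_star |= T[r]
        if drop_bottom:        # only the computation on input q removes T(⊥)
            P_star -= T[BOT]
        P = P_star - seen
        seen |= P
    return S

def E_bottom(T, delta, a):
    """E_{T,a}(⊥): returns T'(⊥), or None where the figure says 'fail'."""
    S = explore(T[BOT], T, delta, a, drop_bottom=False)
    return frozenset(S) if S else None

def E_state(q, T, delta, a, new_bottom):
    """E_{T,a}(q), given the already computed value new_bottom = T'(⊥)."""
    S = explore({q}, T, delta, a, drop_bottom=True)
    return frozenset(new_bottom) if S <= new_bottom else frozenset(S - new_bottom)

def next_table(T, delta, a, states):
    """Candidate for δ'(T, a): the values E_{T,a} assigns, or None on failure."""
    new_bottom = E_bottom(T, delta, a)
    if new_bottom is None:
        return None
    new_T = {BOT: new_bottom}
    for q in states:
        new_T[q] = E_state(q, T, delta, a, new_bottom)
    return new_T
```

For instance, with three states 0, 1, 2, transitions δ(0, a) = {(1, r)}, δ(1, a) = {(0, l)}, δ(2, a) = {(2, r)}, and the table T mapping ⊥, 0, 1, 2 to {0}, {0}, {0}, {2}, the call `next_table(T, delta, "a", [0, 1, 2])` yields the table mapping ⊥, 0, 1, 2 to {1}, {1}, {1}, {2}.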

Formally, M := (s′, δ′, f′) where Q′ := {T | T is a table of N}, s′ := the table that always returns {s}, and f′ := the table that always returns {f}. For any T and a, the value δ′(T, a) is either the unique table to which T is a-compatible, if such a table exists; or undefined, otherwise. It should be clear that M is correct and as large as claimed.

4.4. The lower bound. We will now exhibit an n-state 2nfa for which every equivalent 1dfa needs at least one state per table. Our witness will be the automaton N_0 from Section 2.3, solving problem Φ. So, for the rest of this section, we assume that the n-state 2nfa N that we fixed at the beginning of Section 4 is the automaton N_0, and we will show that the 1dfa M constructed in the previous section is minimal. We start with some intuition why this must be the case.

Figure 8. The input w_T^{i,j} when n = 6; table T maps ⊥, 1, 2, 3, 4, 5, and 6 to the values {3, 5}, {2, 6}, {1, 2, 4}, {3, 5}, {4}, {6}, {3, 5} respectively; and (a) i = 1, j = 6, or (b) i = 6, j = 5.

For each table T : [n]_⊥ → P′([n]) of N, each i ∈ [n], and each j ∈ T(i), we consider the nice input

w_T^{i,j} := (x_T, l)(G_T, l)(h_T^i, r)(j, r),

where x_T is the smallest x for which T(x) = T(⊥); G_T is the binary relation induced by T on [n]; and h_T^i contains either exactly the arrows from T(⊥) to i, if T(i) ≠ T(⊥), or else no arrow at all (see Figure 8 for examples):

(5)    x_T := min{x | T(x) = T(⊥)},    G_T := {(x, y) | y ∈ T(x)},
       h_T^i := ∅ if T(i) = T(⊥), and h_T^i := {(y, i) | y ∈ T(⊥)} otherwise.

It is easy to verify the following fact.

22. Lemma. For any table T and any two states i, j of N such that j ∈ T(i): The table of N on the prefix ⊢(x_T, l)(G_T, l) of w_T^{i,j} is exactly T and some computations in comp_N(w_T^{i,j}) are accepting. Moreover, if T(i) ≠ T(⊥), then each accepting computation contains a step that crosses the middle boundary from i to j.

Hence, as T, i, and j vary as above, the inputs w_T^{i,j} collectively force every interesting use of every state of M in some accepting computation, so that intuitively no state of M is dispensable. To turn this intuition into a proof, we first establish the simple fact that, for every two distinct tables, there exists a ‘query’ that can distinguish between them.

23. Lemma. Two tables T and T′ of N are distinct iff there exist a partial function h : [n] → [n] and a y ∈ [n] such that exactly one of the following two inputs has a path:

(x_T, l)(G_T, l)(h, r)(y, r)   and   (x_{T′}, l)(G_{T′}, l)(h, r)(y, r).


Proof. If T = T′ then, for all h and y, the two inputs are identical and therefore Φ does not distinguish between them. For the interesting direction, we suppose T ≠ T′ and examine two cases.

If T(⊥) ≠ T′(⊥), we pick h to be the empty function and y the smallest state in the symmetric difference of the two ⊥-values. Then the input that corresponds to the ⊥-value that contains y is the only one with a path.

If T(⊥) = T′(⊥) =: Y*, consider the smallest x ∈ [n] with T(x) ≠ T′(x) (such an x exists, since T ≠ T′). It is not hard to see that either both of the two x-values avoid intersection with Y*, or exactly one of them does while the other one equals Y*. (Indeed: If an x-value intersects Y*, then it is actually equal to Y*, since T and T′ are tables. Hence, if both x-values intersected Y*, we would have T(x) = T′(x), a contradiction. So, at most one of them intersects Y*. And, if one does, this one is equal to Y*.) In both cases, the symmetric difference of the two x-values contains an element which does not belong to Y*. If y is the smallest such element and h contains exactly the arrows from Y* to x, namely h := {(y*, x) | y* ∈ Y*}, then the input that corresponds to the x-value containing y clearly has a path, while the other one does not.

Now, for every pair of tables T and T′ of N we define the nice input

w_{T,T′} := (x_T, l)(G_T, l)(h_{T,T′}, r)(y_{T,T′}, r),

where x_T, G_T are as in (5), while h_{T,T′} and y_{T,T′} are either the ones given by the proof of Lemma 23, if T ≠ T′; or the values h_T^i and min T(i) as defined in (5) for i = x_T, if T = T′.¹² Strengthening the promise for Φ to allow only inputs of this particular form, we get a new problem Φ′ with

Φ′_yes := {w_{T,T′} | T, T′ are tables for N and w_{T,T′} has a path},
Φ′_no := {w_{T,T′} | T, T′ are tables for N and w_{T,T′} has no path}.

This is clearly still solvable by N, so that n states are enough on a single-pass 2nfa against Φ′.
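The ingredients of (5) and the distinguishing query from the proof of Lemma 23 are plain set computations. The Python sketch below illustrates them under assumed conventions: a table is a dictionary from [n] ∪ {⊥} to sets, ⊥ is the hypothetical marker `BOT`, and arrows are pairs. It only builds the objects; it does not decide whether an input has a path, since that depends on the definition of Φ in Section 2.3. All function names are ours.

```python
BOT = "bot"  # hypothetical stand-in for the symbol ⊥

def is_table(T, n):
    """Definition 15 for a table over the states 1, ..., n."""
    values = [T[p] for p in range(1, n + 1)]
    nonempty = bool(T[BOT]) and all(values)
    cond1 = all(v == T[BOT] or not (v & T[BOT]) for v in values)
    cond2 = any(v == T[BOT] for v in values)
    return nonempty and cond1 and cond2

def x_of(T, n):
    """x_T: the smallest x with T(x) = T(⊥)."""
    return min(x for x in range(1, n + 1) if T[x] == T[BOT])

def G_of(T, n):
    """G_T: the binary relation induced by T on [n]."""
    return {(x, y) for x in range(1, n + 1) for y in T[x]}

def h_of(T, i):
    """h_T^i: arrows from T(⊥) to i, unless T(i) = T(⊥)."""
    return set() if T[i] == T[BOT] else {(y, i) for y in T[BOT]}

def query(T1, T2, n):
    """The pair (h, y) chosen in the proof of Lemma 23 for distinct tables."""
    if T1[BOT] != T2[BOT]:
        return set(), min(T1[BOT] ^ T2[BOT])
    Y = T1[BOT]
    x = min(x for x in range(1, n + 1) if T1[x] != T2[x])
    y = min((T1[x] ^ T2[x]) - Y)   # non-empty because both are tables
    return {(y_star, x) for y_star in Y}, y
```

On the table of Figure 8, `x_of` returns 3, `h_of(T, 1)` returns {(3, 1), (5, 1)}, and `h_of(T, 6)` returns the empty set, matching panels (a) and (b).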
However, 1dfas need many more states.

24. Lemma. The size of every 1dfa solving Φ′ is at least

$$\sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \binom{n}{i} \binom{n}{j} \left(2^{i} - 1\right)^{j}.$$

Proof. Assume A is a 1dfa solving Φ′. For every table T of N, the automaton accepts w_{T,T} (Lemma 22). Hence, the computation c_T := comp_A(w_{T,T}) hits right, and therefore crosses the middle boundary into some state, call it q_T. If the states of A were fewer than the tables of N, two tables T ≠ T′ would map to the same state q_T = q_{T′} and (by the standard cut-and-paste argument, as in Lemma 14) the automaton would be deciding identically on w_{T,T′} and w_{T′,T}, contradicting Lemma 23 and the definition of the two strings. Hence, A must have at least as many states as there are tables of N. The rest of the proof is by Lemma 17.

¹² Again, when T ≠ T′, the order of the two tables in the definition of these values is not important: h_{T,T′} = h_{T′,T} and y_{T,T′} = y_{T′,T}. Also, h_{T,T} = ∅.

5. From 2NFAs to 1NFAs

Fix an n-state 2nfa N = (s, δ, f) over the states of some set Q and the symbols of some alphabet Σ. In this section we will build an equivalent $\binom{2n}{n+1}$-state 1nfa via an optimal construction.

5.1. Frontiers. Let us momentarily assume that N is actually deterministic and that c := comp_N(w) is accepting, for some l-long input w. Consider the i-th frontier (L_i^c, R_i^c) of c, for some i ≠ 0, l + 2. The number of states in R_i^c equals the number of times that c left-to-right crosses the i-th boundary: each crossing contributes a state into R_i^c and no two crossings contribute the same state, or else c would be looping. Similarly, L_i^c contains as many states as the number of times that c right-to-left crosses the i-th boundary. Now, since c accepts, it goes from the leftmost symbol ⊢ all the way past the rightmost one ⊣, forcing the left-to-right crossings on every boundary to be exactly 1 more than the right-to-left crossings. Hence, |L_i^c| + 1 = |R_i^c|, which remains true even on the leftmost boundary (i = 0, under our convention from Footnote 6 on page 36) and also on the rightmost one (i = l + 2, obviously). Therefore, the equality holds over every boundary, and motivates the following definition.

25. Definition. A frontier of N is any (L, R) ∈ P(Q) × P(Q), that is, any pair of subsets of Q, such that |L| + 1 = |R|.

Note that this defines what a “frontier of N” is, whereas Section 2.1-III defined what a “frontier of a computation” is. The relation between the two notions is partially explained by the motivating argument above, which shows that if the computation c of a 2dfa on an input is accepting, then all frontiers of c are frontiers of the 2dfa. However, this argument is not valid for our nondeterministic N, because a state repetition under a cell does not necessarily imply looping. It does, however, imply a cycle.

26. Definition. A computation is minimal if its points are all distinct.

In other words, a computation is minimal iff it is cycle-free. Obviously, a minimal computation is not looping, and for 2dfas the converse is also true. However, for 2nfas the converse is not always true: accepting computations may not be minimal. So, in order to extend our previous observation to 2nfas, we need to take this detail into account. The following lemma makes the appropriate corrections. The lemma after it is an easy counting.

Figure 9. (a) An accepting minimal c ∈ comp_N(w), for a 6-long w; we assume 0, 1, . . . , 5 are states of N, and s = 0, f = 5. (b) The same c arranged in frontiers; only the even-indexed frontiers are drawn.

27. Lemma. If a computation of N on a string w is accepting and minimal, then all frontiers of that computation are frontiers of N.

Proof. We just need to modify the argument before Definition 25, as follows: No two left-to-right crossings of the i-th boundary contribute the same state into R_i^c, or else c would not be minimal. Similarly for L_i^c.

28. Lemma. The number of distinct frontiers of N is exactly $\binom{2n}{n+1}$.

Proof. Easily, the number of different frontiers of N is equal to the number of distinct pairs of subsets of [n] where the second subset is 1-larger than the first one. In turn, this is equal to the number of different ways of choosing n + 1 elements of the set [2n]: the unselected elements of [n] determine the first subset, while the selected elements of [2n] − [n] determine the second subset. So, our number is exactly $\binom{2n}{n+1}$,¹³ as claimed.

5.2. Compatibilities among frontiers. Suppose c is an accepting minimal computation of N on an l-long w and let F_i^c = (L_i^c, R_i^c) be its i-th frontier, for each i = 0, 1, . . . , l + 2 (Figure 9). Note that the first and last frontiers in the sequence F_0^c, F_1^c, . . . , F_{l+2}^c are always

F_0^c = (∅, {s})   and   F_{l+2}^c = (∅, {f}),

as c starts at s, ends in f, and never right-to-left crosses an outer boundary. Also note that, for (L, R) := (L_i^c, R_i^c) and (L′, R′) := (L_{i+1}^c, R_{i+1}^c) two successive frontiers in the sequence (Figure 10a), it should always be that R ∩ L′ = ∅: otherwise, c would be using the same state twice under the (i + 1)st cell of the tape and would not be minimal. Hence, R + L′ contains as many states as there are (occurrences of) states in L and R′ together:

|R + L′| = |R| + |L′| = |L| + 1 + |R′| − 1 = |L| + |R′| = |L ⊎ R′|.

Figure 10. (a) Two successive frontiers. (b) The associated bijection.

¹³ Alternatively, this number can be written as n·C_n, where C_n is the n-th Catalan number [12]. So, Catalan strikes again! [10]

Hence, bijections can be found from R + L′ to L ⊎ R′. Among them is a very natural one (Figure 10b): for each q ∈ R + L′ find the unique step in c that produces q under the (i + 1)st cell (this is either a left-to-right crossing of boundary i or a right-to-left crossing of boundary i + 1; the minimality of c guarantees uniqueness); the next step left-to-right crosses boundary i + 1 into some state p ∈ R′ or right-to-left crosses boundary i into some p ∈ L; depending on the case, map q to (p, r) or (p, l) respectively. If ρ : R + L′ → L ⊎ R′ is this mapping, it is easy to verify that it is injective (because c is minimal) and therefore bijective, as promised. In addition, it is clear that ρ respects the transition function, in the sense that ρ(q) ∈ δ(q, w_{e,i+1}), for all q ∈ R + L′.

Overall, this discussion shows that every accepting minimal computation in comp_N(w) exhibits a sequence of frontiers which obeys certain restrictions. The following definitions and lemma summarize these findings.

29. Definition. If (L, R) and (L′, R′) are two frontiers of N and a some symbol in Σ_e, we say that (L, R) is a-compatible to (L′, R′) iff
1. R ∩ L′ = ∅, and
2. there exists a bijection ρ : R + L′ → L ⊎ R′ that respects the transition function on a: ρ(q) ∈ δ(q, a), for all q ∈ R + L′.¹⁴

¹⁴ Note that, if N is deterministic, then (L, R) is a-compatible to (L′, R′) exactly if the following, much simpler, condition holds: (∀ξ ∈ L ⊎ R′)(∃q ∈ L′ ∪ R)(δ(q, a) = ξ).
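Definition 29 asks for a bijection that respects the transition function, which for small frontiers can be tested by simply trying all candidate bijections. The sketch below is illustrative only. It assumes that δ is a dictionary `delta` from (state, symbol) to a set of (state, direction) pairs, with the directions "l" and "r" playing the role of the tags in L ⊎ R′; the function name is hypothetical.

```python
from itertools import permutations

def is_compatible(F, F2, a, delta):
    """Checks whether frontier F = (L, R) is a-compatible to F2 = (L', R')."""
    (L, R), (L2, R2) = F, F2
    if R & L2:                                   # condition 1: R ∩ L' = ∅
        return False
    sources = sorted(R | L2)                     # the states under the cell
    targets = [(p, "l") for p in sorted(L)] + [(p, "r") for p in sorted(R2)]
    if len(sources) != len(targets):             # cannot happen for frontiers
        return False
    # condition 2: some bijection rho with rho(q) in delta(q, a) for every q
    return any(
        all(t in delta.get((q, a), set()) for q, t in zip(sources, perm))
        for perm in permutations(targets)
    )
```

Enumerating permutations is exponential in the size of the frontiers; a bipartite matching algorithm would decide the same condition in polynomial time.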


30. Definition. Suppose w ∈ Σ^∗ is of length l and F_0, F_1, . . . , F_{l+2} is a sequence of frontiers of N. We say the sequence fits w iff
1. F_0 = (∅, {s}),
2. for all i = 0, 1, . . . , l + 1: F_i is w_{e,i+1}-compatible to F_{i+1},
3. F_{l+2} = (∅, {f}).

31. Lemma. For all w ∈ Σ^∗: if comp_N(w) contains an accepting computation, then some sequence of frontiers of N fits w.

Proof. Suppose comp_N(w) contains an accepting computation d. Removing from d all cycles, we get a computation c which is also in comp_N(w) and is accepting and minimal. Then, the argument before Definition 29 proves that the sequence of the frontiers of c fits w.

The crucial observation, which we prove in the next section, is that the converse of this lemma is also true, and therefore an analogue of Theorems 11 and 21 holds.

32. Lemma. For all w ∈ Σ^∗: if some sequence of frontiers of N fits w, then comp_N(w) contains an accepting computation.

33. Theorem. N accepts w iff some sequence of frontiers of N fits w.

5.3. Proof of the main observation. In this section we establish Lemma 32. So, assume that the sequence of frontiers F_0 = (L_0, R_0), F_1 = (L_1, R_1), . . . , F_{l+2} = (L_{l+2}, R_{l+2}) fits w. We will prove a stronger statement: for every i = 0, 1, . . . , l + 2, the states of R_i can be produced by |R_i| right-hitting computations on ⊢w_1 · · · w_{i−1}, one of them starting at s and on ⊢ and each one of the remaining |L_i| starting at a distinct q ∈ L_i and on w_{i−1}. More formally, we will prove the following claim.

Claim. For each i = 0, 1, . . . , l + 2, there is a bijection π_i : (L_i)_⊥ → R_i which satisfies the following two conditions:
1. some c ∈ lcomp_{N,s}(⊢w_1 · · · w_{i−1}) hits right into π_i(⊥), and
2. for all q ∈ L_i, some c ∈ rcomp_{N,q}(⊢w_1 · · · w_{i−1}) hits right into π_i(q).

Note that Lemma 32 indeed follows from this claim, when i = l + 2: The only bijection from (L_{l+2})_⊥ = ∅_⊥ = {⊥} to R_{l+2} = {f} is π_{l+2} := {(⊥, f)}. Hence, condition 1 says that some computation in lcomp_{N,s}(⊢w_1 · · · w_l ⊣) hits right into π_{l+2}(⊥) = f, which is simply another way of saying that comp_N(w) contains an accepting computation. To prove the claim, we use induction on i.

Base Case. The base case i = 0 is satisfied by the definitions. The only bijection from (L_0)_⊥ = ∅_⊥ = {⊥} to R_0 = {s} is π_0 = {(⊥, s)}. Condition 1 is true because lcomp_{N,s}(ε) contains exactly the 0-length computation (s, 1) which does hit right into s (cf. Remark 1). Condition 2 is true vacuously, as L_0 = ∅.

Figure 11. An example for the inductive step in the proof of Lemma 32. Note, for instance, that σ maps the 3rd and 5th (from the top) states of R to ⊥, while the 4th state is mapped to the 1st state of R′.

Inductive Step. For the inductive step (Figure 11), assume i < l + 2, let (L, R) := (L_i, R_i), (L′, R′) := (L_{i+1}, R_{i+1}), a := w_{e,i+1}, and consider the bijections

π := π_i : L_⊥ → R   and   ρ : R + L′ → L ⊎ R′

guaranteed respectively by the inductive hypothesis and by the assumption that (L, R) is a-compatible to (L′, R′). We need to build a bijection π′ := π_{i+1} : (L′)_⊥ → R′ that satisfies Conditions 1, 2 of the claim. We will do so based on π, ρ, and one more function σ that will emerge from the following discussion.

Definition of σ. Consider any state q ∈ R and let us take a trip around under ⊢w_1 w_2 · · · w_{i−1} a by alternately following bijections ρ and π

(6)    r_0 = q,   r_1 = ρ(q),   r_2 = πρ(q),   r_3 = ρπρ(q),   r_4 = πρπρ(q),   . . .

until the first time that ‘ρ fails to return a state in L’,¹⁵ and let r_0, r_1, r_2, . . . be the states that we visit. There are two cases about what might happen.

¹⁵ Note that we abuse notation here. Bijection ρ can only return a pair of the form (p, l) or (p, r). So, in the description (6) above, ρ(·) really means ‘the first component of ρ(·), if the second component is l’. Similarly, ‘ρ fails to return a state in L’ means ‘ρ returns a pair of the form (p, r)’. Hopefully, the abuse does not confuse. We will stick to it throughout the definition of σ.


Case 1 is that ρ does eventually fail to return a state in L and the trip pays only a finite number of visits r_0, r_1, . . . , r_k, for some even k ≥ 0 and with r_k ∈ R. Then r_k is ρ-mapped to some q′ ∈ R′.

Case 2 is that ρ always returns a state in L and the trip is infinite. Since all even-indexed and all odd-indexed visits in the trip are inside the finite sets R and L respectively, there have to be repetitions of states both on the even and on the odd indices. Let k be the first index for which some earlier index j < k of the same parity points to the same state: r_j = r_k. If k is odd, then j is also odd and hence j ≥ 1; then r_j = r_k ⟹ ρ^{−1}(r_j) = ρ^{−1}(r_k) ⟹ r_{j−1} = r_{k−1} and k − 1 also has the property that k is the earliest one to have, a contradiction. So, k must be even, and so must j. In fact, j must be 0, as otherwise we can again reach a contradiction (as before, with π^{−1} instead of ρ^{−1}). Hence, the first state to be revisited is the state q we started from and our trip consists of infinitely many copies of a cycle r_0, r_1, . . . , r_k, for some even k ≥ 2, for r_k = r_0 = q ∈ R, and with no two states in the list r_0, r_1, . . . , r_{k−1} being both equal and at positions of the same parity.

Overall, in the trip that we take, we either reach a state r_k ∈ R (possibly k = 0) that is ρ-mapped to a state q′ ∈ R′ (Case 1), or we return to the starting state q ∈ R having previously repeated no state in L and no state in R (Case 2). We define the function σ : R → (R′)_⊥ to encode exactly this information. Specifically: in Case 1, we set σ(q) := q′; in Case 2, we set σ(q) := ⊥. In either case, our trip respects π and ρ, which in turn respect the behavior of N on ⊢w_1 w_2 · · · w_{i−1} a.
It should therefore be clear that, according to our construction,
• σ(q) = q′ implies that some c ∈ rcomp_{N,q}(⊢w_1 · · · w_{i−1} a) respects π, ρ and hits right into q′, while
• σ(q) = ⊥ implies that some looping c ∈ rcomp_{N,q}(⊢w_1 · · · w_{i−1} a) respects π, ρ and repeats a cycle that can only visit states from R when under a.
This concludes the definition of σ and our discussion of its properties.

We are now ready to return to the construction of bijection π′. Recall that this must inject (L′)_⊥ into R′ so that Conditions 1 and 2 of the claim above are satisfied. We examine three separate cases about the argument of π′.

Case (a). The easiest argument is a state p ∈ L′ that is ρ-mapped rightward to a state r ∈ R′. Then we just let π′ also return that state: π′(p) := r. Since ρ respects the transition function on a, the corresponding 1-step computation from p rightward to r is indeed a computation in rcomp_{N,p}(⊢w_1 · · · w_{i−1} a) that hits right into π′(p).


Case (b). If the argument is some p ∈ L′ that is ρ-mapped leftward to a state r ∈ L, then we consider where in R bijection π takes us from there: q := π(r). We know some computation of N can start at p under a and eventually reach q under a, so the question is what can happen after that if we keep following ρ and π. To answer this question, we examine σ(q).

If σ(q) = ⊥, then we will eventually return to q after a cycle of length at least 2 and having visited only states of R when under a. But can this happen? If it does, then the next-to-last and last steps in this cycle will follow ρ and π respectively, ending up in q. Since ρ and π are bijections, the last two states (before q) in this cycle must respectively be p and r. In particular, p must be in the cycle. But, since the cycle visits only states from R whenever under a, we should have p ∈ R. This means R and L′ must intersect, and hence (L, R) is not a-compatible to (L′, R′), a contradiction.

It is thus necessary that σ(q) = q′ ∈ R′, which implies that some computation c ∈ rcomp_{N,q}(⊢w_1 · · · w_{i−1} a) hits right into q′. Prefixed by the computation that takes p to q, this c becomes a computation in rcomp_{N,p}(⊢w_1 · · · w_{i−1} a) that hits right into q′. So, we can set π′(p) := q′.

Case (c). It remains to define π′(⊥). The reasoning resembles Case (b). We consider the state q ∈ R where π(⊥) takes us, and examine σ(q). Again, σ(q) = ⊥ is impossible, as this would imply that ⊥ ∈ L, a contradiction. Hence, σ(q) = q′ for some q′ ∈ R′. Combining the computation guaranteed by π(⊥) = q with the one guaranteed by σ(q) = q′, we get a computation in lcomp_{N,s}(⊢w_1 · · · w_{i−1} a) that hits right into q′. So, we can set π′(⊥) := q′.

This concludes the definition of π′. The construction must have made clear that π′ satisfies the two conditions of the claim; that it is also a bijection should be an easy consequence of the way bijections π and ρ are used.
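The construction of π′ from π and ρ is entirely mechanical, which the following Python sketch tries to make explicit. It is only an illustration under assumed encodings: π is a dictionary from L ∪ {⊥} to R, ρ is a dictionary from R + L′ to tagged pairs (p, "l") or (p, "r"), and ⊥ is the hypothetical marker `BOT`. The loop in `sigma` terminates only because π and ρ are assumed to be bijections, as argued above; the function names are ours.

```python
BOT = "bot"  # hypothetical stand-in for the symbol ⊥

def sigma(q, pi, rho):
    """The function σ : R → (R')_⊥ obtained by alternating ρ and π from q."""
    r = q
    while True:
        p, side = rho[r]
        if side == "r":
            return p           # Case 1: ρ leaves rightward into R'
        r = pi[p]              # ρ went left into L; π brings us back into R
        if r == q:
            return BOT         # Case 2: the trip has cycled back to q

def next_pi(pi, rho, L2):
    """Builds π' : (L')_⊥ → R' from π, ρ and the set L2 = L'."""
    def lift(q):
        image = sigma(q, pi, rho)
        if image == BOT:
            raise ValueError("impossible when the frontiers are a-compatible")
        return image

    new_pi = {BOT: lift(pi[BOT])}                       # Case (c)
    for p in L2:
        r, side = rho[p]
        new_pi[p] = r if side == "r" else lift(pi[r])   # Cases (a) and (b)
    return new_pi
```

For example, with L = {1}, R = {2, 3}, L′ = {0}, R′ = {5, 6}, π = {⊥ ↦ 2, 1 ↦ 3} and ρ = {2 ↦ (5, r), 3 ↦ (6, r), 0 ↦ (1, l)}, the sketch returns π′ = {⊥ ↦ 5, 0 ↦ 6}, where π′(0) is obtained through Case (b).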
Hence, the inductive step is complete, and with it the proof of the claim.

5.4. The upper bound. We are now ready to build a $\binom{2n}{n+1}$-state 1nfa N′ that simulates N. Our construction is based on Theorem 33. In other words, the strategy of N′ is to scan the input ‘guessing’ the members of a sequence of frontiers, one after the other, and verifying that this sequence fits the input. Precisely, N′ implements the following algorithm:

We start with the frontier (∅, {s}) in our memory. On reading a symbol a, we check if the frontier in our memory is a-compatible to any other frontiers. If not, we just hang. If it is, we find all such frontiers, select one of them nondeterministically, and move right with it as our new memory. If we ever reach the frontier (∅, {f}), we accept.

Formally, N′ := (s′, δ′, f′) where Q′ := {F | F is a frontier for N}, s′ := (∅, {s}), f′ := (∅, {f}), and the transition function is such that δ′(F, a) := {F′ | F is a-compatible to F′}, for all F ∈ Q′ and a ∈ Σ_e. It should be clear that N′ is correct and as large as promised.
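Theorem 33 translates directly into an executable check. The Python sketch below is an illustration under assumed conventions rather than a definitive implementation: it simulates N′ on the fly by tracking the set of frontiers reachable after each symbol, and it also offers a direct search over the configurations of N for comparison. The assumptions are that `delta` maps (state, symbol) to a set of (state, direction) pairs, that the end-markers are encoded by the hypothetical strings "|-" and "-|", and that N starts at s on ⊢, accepts exactly when it falls off ⊣ into f, and never moves left from ⊢. Compatibility is decided with a standard augmenting-path bipartite matching, a technique we swap in for the search over bijections.

```python
from itertools import combinations

LEFT_END, RIGHT_END = "|-", "-|"   # hypothetical encodings of the end-markers

def frontiers(states):
    """All pairs (L, R) of subsets of Q with |L| + 1 = |R| (Definition 25)."""
    Q = sorted(states)
    subsets = [frozenset(c) for k in range(len(Q) + 1) for c in combinations(Q, k)]
    return [(L, R) for L in subsets for R in subsets if len(L) + 1 == len(R)]

def compatible(F, F2, a, delta):
    """Definition 29, with the bijection found by bipartite matching."""
    (L, R), (L2, R2) = F, F2
    if R & L2:
        return False
    targets = [(p, "l") for p in L] + [(p, "r") for p in R2]
    owner = {}                       # target -> source currently matched to it

    def augment(q, visited):
        for t in targets:
            if t in delta.get((q, a), ()) and t not in visited:
                visited.add(t)
                if t not in owner or augment(owner[t], visited):
                    owner[t] = q
                    return True
        return False

    # Sources and targets are equally many, so matching every source
    # amounts to a bijection.
    return all(augment(q, set()) for q in R | L2)

def accepts_via_frontiers(w, states, s, f, delta):
    """Runs N' on w by keeping the set of frontiers reachable so far."""
    all_frontiers = frontiers(states)
    current = {(frozenset(), frozenset({s}))}
    for a in [LEFT_END, *w, RIGHT_END]:
        current = {F2 for F in current for F2 in all_frontiers
                   if compatible(F, F2, a, delta)}
    return (frozenset(), frozenset({f})) in current

def accepts_directly(w, s, f, delta):
    """Depth-first search over the configurations (state, position) of N."""
    tape = [LEFT_END, *w, RIGHT_END]
    seen, stack = {(s, 0)}, [(s, 0)]
    while stack:
        q, pos = stack.pop()
        for (p, d) in delta.get((q, tape[pos]), ()):
            nxt = pos + 1 if d == "r" else pos - 1
            if nxt == len(tape):
                if p == f:
                    return True
            elif nxt >= 0 and (p, nxt) not in seen:
                seen.add((p, nxt))
                stack.append((p, nxt))
    return False
```

The on-the-fly simulation never materializes δ′; it only tests compatibility between the frontiers currently in memory and all candidates, which keeps the sketch short at the cost of repeated work.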


Figure 12. (a) The deterministic nice input w_F when n = 6 and F = ({1, 5, 6}, {2, 4, 5, 6}). (b) How to derive it from the corresponding list 2, 2, 1, 4, 5, 5, 6, 6.

5.5. The lower bound. To prove that the construction of the previous section is optimal, we will exhibit an n-state 2nfa that has no equivalent 1nfa with fewer than $\binom{2n}{n+1}$ states. In fact, our witness will simply be the automaton M_0 from Section 2.3, which is deterministic and single-pass. Moreover, 1nfas will be shown to need $\binom{2n}{n+1}$ states not only for staying equivalent to M_0, but even for solving Ψ (on a stricter promise, actually). So, assume that the n-state 2nfa N that we kept fixed since the beginning of Section 5 is actually M_0. We will show that the 1nfa N′ constructed in the previous section is minimal. We start with some intuition.

Consider an arbitrary frontier F = (L, R) of N and let us list the elements of the sets L, R ⊆ [n] in increasing order,

L = {x_1, x_2, . . . , x_m}   and   R = {y_1, y_2, . . . , y_{m+1}},

for the appropriate 0 ≤ m < n. Since m < n, we know that L is a strict subset of [n]. So, we can name an element that does not belong to L, say x_0 := min([n] \ L), the smallest element outside L. Then the combined list

(7)    x_0 y_1 x_1 y_2 x_2 · · · y_m x_m y_{m+1}

gives rise to the following deterministic nice input (see Figure 12):

w_F := (x_F, l)(g_F, l)(h_F, r)(y_F, r),

where x_F := x_0, the function g_F maps each x of (7) to its following y, the function h_F maps each y ≠ y_{m+1} to its following x, and y_F := y_{m+1}; i.e.:

(8)    x_F := min([n] \ L),    g_F := {(x_i, y_{i+1}) | 0 ≤ i ≤ m},
       y_F := max R,           h_F := {(y_i, x_i) | 1 ≤ i ≤ m}.

It is easy to verify that this input has a path and that the following holds.

34. Lemma. For any frontier F of N, the computation of N on w_F is accepting and its frontier under the middle boundary is exactly F.

Hence, every state of N′ is used in some accepting computation and is therefore not redundant in any obvious way. So, N′ appears to be minimal.

Figure 13. (a) Input w_F from Figure 12. (b) A new input w_{F′}, for F′ = ({1, 4, 5}, {2, 3, 4, 5}). (c, d) Inputs w_{F,F′} and w_{F′,F}. Note that only w_{F,F′} has a path.

To prove this intuition, we start by noting that every two frontiers F and F′ of N give rise to the deterministic nice input (cf. Figure 13)

w_{F,F′} := (x_F, l)(g_F, l)(h_{F′}, r)(y_{F′}, r),

where x_F, g_F, h_{F′}, y_{F′} are as in (8). Strengthening the promise of problem Ψ to allow only inputs of this form, we get problem Ψ′′ = (Ψ′′_yes, Ψ′′_no), with

Ψ′′_yes := {w_{F,F′} | F, F′ are frontiers for N and w_{F,F′} has a path},
Ψ′′_no := {w_{F,F′} | F, F′ are frontiers for N and w_{F,F′} has no path}.

Clearly, N solves this problem, so that n states on a (single-pass deterministic) 2nfa are enough to solve Ψ′′. For 1nfas the problem is harder.

35. Lemma. Every 1nfa for Ψ′′ has at least $\binom{2n}{n+1}$ states.

At the heart of the argument for this lemma lies the fact that, in the $\binom{2n}{n+1} \times \binom{2n}{n+1}$ matrix W = [w_{F,F′}]_{F,F′} containing all inputs of the form w_{F,F′}, two distinct inputs sitting in cells that are symmetric with respect to the main diagonal cannot both have a path. In other words:

Claim. For all frontiers F, F′: w_{F,F′}, w_{F′,F} ∈ Ψ′′_yes ⟺ F = F′.

Proof. Let F = (L, R) and F′ = (L′, R′) be any two frontiers of N. If F = F′ then w_{F,F′} = w_{F′,F} = w_F and we have already observed that this input has a path. For the interesting direction, we assume that F ≠ F′ and we will prove that at least one of w_{F,F′}, w_{F′,F} lacks a path. We start by letting m = |L|, m′ = |L′| and considering the combined lists defined by the two frontiers, as in (7):

x_0 y_1 x_1 y_2 x_2 · · · y_m x_m y_{m+1}   and   x′_0 y′_1 x′_1 y′_2 x′_2 · · · y′_{m′} x′_{m′} y′_{m′+1}.

If the two lists were identical after their first elements, they would agree in their lengths, in their x’s (except possibly at x_0, x′_0), and in their y’s, forcing F = F′, a contradiction. Hence, there have to be positions of disagreement after 0. Consider the earliest one among them.


If this position is occupied by y’s, say yi and yi′ , then we have either that yi < yi′ (Case 1) or that yi > yi′ (Case 2). If it is occupied by x’s, say xi and x′i , then we have either that xi < x′i or x′i is not present at all16 (Case 3) or that xi > x′i or xi is not present at all (Case 4). The four cases are treated with similar arguments. We will present only the argument for the first one in detail, and sketch the rest. So, suppose that the first disagreement is between yi and yi′ , and that in fact yi < yi′ . This implies that all previous positions after 0 contain identical elements: xF

gF x0 x′0

y1 y1′

x1 x′1

y2 y2′

x2 x′2

··· ···

yi−1 xi−1 yi ′ yi−1 x′i−1 yi′

hF ′ It also implies that yi is not in R′ . Indeed, if it were, it would be in the sub′ list y1′ , y2′ , . . . , yi−1 (since yi < yi′ ), and hence in the sublist y1 , y2 , . . . , yi−1 (since the two sublists coincide), contradicting the fact that yi is greater than all these elements of R. So yi ∈ / R′ , and therefore yi is not yF ′ (which ′ ′ is in R ) and has no value under hF (since the domain of hF ′ is also in R′ ). But then searching for a path in wF,F ′ we travel deterministically gF

h

gF

h

h

gF

F F F (x′1 = x1 ) → (y2 = y2′ ) → ··· → (x′i−1 = xi−1 ) → yi x0 → (y1 = y1′ ) → ′

′

′

reaching a node which is neither the exit yF ′ nor the start of an hF ′ -arrow. This means that wF,F ′ has no path. In Case 2, we similarly conclude that yi′ ∈ / R and that gF ′ , hF combine to reach yi′ ; but this is neither the exit yF nor the start of an hF -arrow, implying wF ′ ,F has no path. In Case 3, we deduce that xi ∈ / L′ and yet gF ′ , hF combine to reach it, so wF ′ ,F has no path. Finally, in Case 4, x′i is outside L while gF and hF ′ reach it, so that wF,F ′ has no path. Proof of Lemma 35. Towards a contradiction, assume that A is a 2n that solves Ψ ′′ with fewer than n+1 states. For each frontier F for ′′ N , we know the input wF = wF,F is in Ψyes and therefore A accepts it. Choose any accepting computation cF ∈ compA (wF ) and let qF be the state immediately after the middle boundary is crossed. Since the states of A are fewer than the frontiers for N , we know qF = qF ′ for two frontiers F 6= F ′ . But then, the usual cut-and-paste argument on the computations cF and cF ′ shows that A must also accept the inputs wF,F ′ and wF ′ ,F . ′′ Since A solves Ψ ′′ , we conclude that wF,F ′ , wF ′ ,F ∈ Ψyes despite F 6= F ′ , a contradiction to the last claim. 1nfa

16 This happens if the list for $F'$ stops at $y'_i$.


6. Conclusion

In this chapter we showed the exact trade-offs in the conversions from two-way (deterministic or nondeterministic) to one-way (deterministic or nondeterministic) finite automata. Our arguments recast those of Birget [6] into a more standard set-theoretic vocabulary and then complement them by carefully removing the redundancies in the associated constructions.$^{17}$ Introducing frontiers, we provided a set-theoretic characterization of 2nfa acceptance (already present in [6], essentially) that complements the also set-theoretic characterization of 2nfa rejection given in [60]. Moreover, by applying the concept of promise problems even to the domain of regular languages, we nicely confirmed its reputation for always leading us straight to the combinatorial core of the hardness of a computational task.

Crucially, the tight simulations performed by one-way automata in our proofs are as ‘meaningful’ as the tight simulation of [47] for the determinization of 1nfas: each state in these automata corresponds to a realizable and non-redundant set-theoretic object (a table, a frontier) that naturally emerges from the computational behavior of the simulated machine. It would be nice to identify similar objects and derive exact trade-offs for the conversions from and towards other types of automata (e.g., alternating, probabilistic, or pebble automata, or even Hennie machines [4]) and more powerful machines (e.g., pushdown automata). It would also be interesting to know if the large size of the alphabet over which problems $\Phi$ and $\Psi$ are defined is necessary for the exactness of the associated trade-offs.

A preliminary version of the contents of Section 5 can be found in [29].

17 First, the reasoning for the improvement on Shepherdson’s idea in the proof of [6, Theorem A3.4] was refined. Second, the universal 1nfa constructed in the proof of [6, Theorem 4.2(1)] was observed to not be minimal: it could be implemented with only 4n + 4 states, as opposed to 8n + 3. Then, a careful application of the reachable-set construction in the proof of [6, Theorem 4.5] (on the minimal universal 1nfa obtained previously) revealed the frontier structure.

CHAPTER 2

2D versus 2N

After Chapter 1, our understanding of almost all conversions shown in Figure 1 (page 20) is perfect. The only exceptions are the two conversions associated with the 2d vs. 2n problem, and our understanding of them is so limited that we cannot even tell whether the associated trade-offs are polynomially bounded. In this chapter we will advance our knowledge about these conversions in two quite different directions.

In Section 2, we will focus on the conversion from 1nfas to 2dfas and the associated complete problem of liveness. We will prove that a certain class of 2dfas of restricted information fail to solve this problem, no matter how large they are. In Section 3 we will focus on the conversion from 2nfas to 2dfas and a certain class of 2nfas of restricted bidirectionality, the sweeping 2nfas. We will prove that small automata of this kind are not closed under complement. See Section 3 of the Introduction for the motivation behind these two different approaches. We begin with a brief note on the history of the 2d vs. 2n question.

1. History of the Problem

The 2d vs. 2n question was first studied in the manuscript [51]. In it, Seiferas worked on the conversion from 1nfas to 2dfas. He suggested the strong conjecture (cf. page 15) that the trade-off is at least $2^n - 1$, and presented several examples of problems that could serve as witnesses. Soon after that, Sakoda and Sipser [48] invested the question with a robust theoretical framework (cf. Introduction, Section 3.1). Among other things, they defined the classes 1n, 2d, and 2n, along with the appropriate reduction relation that allowed the identification of complete problems. A 2n-complete and a 1n-complete problem were also defined, the latter being liveness. At the same time, the problems from [51] proved to be 1n-complete, too.

In one class of attempts towards 2d $\neq$ 2n, people have focused on proving exponential lower bounds for the trade-off from 1nfas to 2dfas of limited bidirectionality.
Already in [51], Seiferas showed that the trade-off is at least $2^n - 1$ if the 2dfas are single-pass. Later, Sipser [55] did the same for the case of 2dfas that are sweeping; much later, Leung [34] showed the lower bound remains as large even on a binary alphabet, as


opposed to the exponentially large one of [55]. Recently, Hromkovic and Schnitger [22] did the same for the case when the 2dfas are oblivious, in the sense that they move identically on all inputs of the same length; they also showed the lower bound remains exponential if we relax the restriction to allow a sub-linear (in the input length) number of distinct trajectories. Unfortunately, we know that none of these theorems resolves the conjecture in its generality, since full 2dfas can be exponentially more succinct than each of these restricted variants [51, 55, 2].

A second class of attempts has focused on unary automata. Under this restriction, Chrobak [7] proved that the trade-off from 1nfas to 2dfas is at most $O(n^2)$ and at least $\Omega(n^2)$. Note that, on one hand, this upper bound shows that 2d $\supseteq$ 1n for unary automata, so that the situation on unary inputs is sharply different from what it is conjectured to be in the general case. On the other hand, the lower bound is the best known one even for the trade-off from general 2nfas to 2dfas. In two more recent developments, Geffert, Mereghetti and Pighizzini have established the subexponential upper bound $2^{\Theta(\lg^2 n)}$ for the trade-off from unary 2nfas to 2dfas [13], as well as a polynomial upper bound for the trade-off in the complementation of unary 2nfas [14].

Finally, there have also been some variations of the general problem of converting a 2nfa to a 2dfa. If we demand that the 2dfa can decide identically to the simulated 2nfa no matter what state and input position the latter is started at (a requirement conceptually stronger than ordinary simulation, but always satisfiable [5]), then the trade-off is at least $2^{\lg^k n}$, for any $k$ [25].
If we demand that the 2dfa decides identically to the simulated 2nfa only on all polynomially long inputs (a requirement conceptually weaker than ordinary simulation), then an exponential lower bound would confirm the old belief that nondeterminism is essential in logarithmic-space Turing machines (l $\neq$ nl) [3]. Last, if we allow the starting 2nfa to be a Hennie machine (a more powerful device, but still not powerful enough to solve non-regular problems), then converting to a 2dfa indeed costs exponentially, but only because converting to a 2nfa already does [4].

2. Restricted Information: Moles

In this section we explore the approach that we described in Section 3.2 of the Introduction. After we formally define what it means for a 2nfa to be a mole, we will move on to prove that two-way deterministic moles cannot solve liveness, irrespective of how large they are.

2.1. Preliminaries. Our notation and definitions are as explained in Section 2 of the previous chapter, plus the following few additional concepts. If $A$ and $B$ are two sets, then $A \ominus B$ denotes their symmetric difference. If $f$ and $g$ are two functions, then $f \circ g$ and $fg$ denote their composition,

returning $g(f(x))$ for every $x$, while $f^k$ denotes the $k$-fold composition of $f$ with itself. In contrast, for $u$ a string of symbols, $u^k$ denotes the concatenation of $k$ copies of $u$.

Figure 14. (a) Three symbols in $\Sigma_5$. (b) The string they define, simplified and indexed. (c) A 5-long 2-{1, 2, 4}-1 path, which is 2-disjoint on itself.

2.1-I. Behavior of a 2dfa. Given any 2dfa $M$ over set of states $Q$ and alphabet $\Sigma$ and any string $u \in \Sigma^*$, the behavior of $M$ on $u$ is the partial mapping $\gamma_u$ from $Q \times \{l, r\}$ to $Q \times \{l, r\}$ that encodes all possible ‘entry-exit pairs’ as $M$ computes on $u$: for every $q \in Q$,

\[
\gamma_u(q, l) :=
\begin{cases}
(p, l) & \text{if } \mathrm{lcomp}_{M,q}(u) \text{ hits left into } p, \\
(p, r) & \text{if } \mathrm{lcomp}_{M,q}(u) \text{ hits right into } p, \\
\text{undefined} & \text{if } \mathrm{lcomp}_{M,q}(u) \text{ loops or hangs,}
\end{cases}
\]

while $\gamma_u(q, r)$ is defined analogously, with rcomp instead of lcomp.

2.1-II. Strings over $\Sigma_n$. Recall the alphabet $\Sigma_n$ over which we defined liveness (cf. Introduction, Section 3.1; see also Figure 14a). A concise way to refer to a symbol of $\Sigma_n$ is to simply list its arrows in brackets: e.g., the rightmost symbol in Figure 14a is [12,14,25,44]. The symbol [ ] containing no arrows is called the empty symbol. Given any string $x \in \Sigma_n^*$, we define the set of its nodes in the following, quite natural way (Figure 14b):

\[
V_x := \{ (i, j) \mid i \in [n] \ \&\ 0 \le j \le |x| \}.
\]

The left-degree of a node $(i, j) \in V_x$ is the number of its neighbors on the column to its left (column $j - 1$), or 0 if $j = 0$. Similarly, the right-degree of $(i, j)$ is the number of its neighbors on the column to its right, or 0 if $j = |x|$. If $x$ has exactly $|x|$ edges that form one live path, we say $x$ is a path (Figure 14c). For $i_l, i_r \in I \subseteq [n]$, we say $x$ is an $i_l$-$I$-$i_r$ path if this one live path connects the $i_l$th leftmost node to the $i_r$th rightmost node and visits only nodes with indices in $I$. If $y \in \Sigma_n^*$, then $x \cup y$ is the unique string of length $\max(|x|, |y|)$ that has all edges of $x$, all edges of $y$, and no other edges. For $k \ge 0$, we say $y$ is $k$-disjoint on $x$ if in $x \cup ([\,]^k y)$ the edges from $x$ and from $y$ meet at no node (Figure 14c; see also Figure 19a on page 78).
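As a minimal sketch of the definitions of Section 2.1-II, the following Python fragment models a symbol of $\Sigma_n$ as a set of arrows and checks liveness and the path property. The representation, the function names, and the sample path are assumptions made only for illustration; in particular, ‘live’ is taken here to mean that some left-to-right chain of arrows joins the two outer columns.

```python
# Assumption: a symbol of Sigma_n is a set of arrows (a, b), each leading from
# node a of the left column to node b of the right column; a string is a list.

def nodes(n, x):
    """V_x: all pairs (i, j) with i in [n] and 0 <= j <= |x|."""
    return {(i, j) for i in range(1, n + 1) for j in range(len(x) + 1)}

def is_live(n, x):
    """True iff some chain of arrows joins the leftmost column to the rightmost."""
    reachable = set(range(1, n + 1))            # every node of column 0
    for symbol in x:
        reachable = {b for (a, b) in symbol if a in reachable}
    return bool(reachable)

def is_path(x, il, I, ir):
    """True iff x is an il-I-ir path: one arrow per symbol, chained from il to ir."""
    current = il
    if current not in I:
        return False
    for symbol in x:
        if len(symbol) != 1:                    # a path has exactly |x| edges
            return False
        (a, b), = symbol
        if a != current or b not in I:
            return False
        current = b
    return current == ir

# A 5-long 2-{1,2,4}-1 path in the spirit of Figure 14c (the arrows are made up).
path = [{(2, 4)}, {(4, 1)}, {(1, 1)}, {(1, 2)}, {(2, 1)}]
# The symbol written [12,14,25,44] in the text.
rightmost_symbol = {(1, 2), (1, 4), (2, 5), (4, 4)}
```

For instance, `is_path(path, 2, {1, 2, 4}, 1)` holds, while restricting the allowed indices to `{1, 2}` makes the same call fail, because the path visits node 4.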

Figure 15. State $p$ of focus $(5, l)$ is reading the middle symbol: if it moves right, the next focus will be $(1, l)$ or $(3, l)$; if it moves left, the next focus will be $(1, r)$ or $(5, r)$.

2.2. Moles. To define when a 2nfa over $\Sigma_n$ is a mole, we need a way of describing the notion of a state ‘focusing on’ some particular node of the current symbol. We define a focus to be any pair $(i, s) \in [n] \times \{l, r\}$ of index and side. We write $\bar{s}$ for the side opposite $s$. The $(i, s)$th node of a string $x$ is the $i$th node of its leftmost (resp., rightmost) column, if $s = l$ (resp., if $s = r$). The connected component of that node in the graph implied by $x$ is called the $(i, s)$th component of $x$. By $x \restriction (i, s)$ we denote the unique string that has the same length as $x$, all edges of the $(i, s)$th component of $x$, and no other edges.

A 2nfa is a mole if each state $p$ of it can be assigned a focus $(i_p, s_p)$ so that, whenever at $p$, the automaton behaves like a mole located on the $(i_p, s_p)$th node of the current symbol and facing $s_p$: (i) it can ‘see’ only the component of that node, and (ii) it can ‘move’ only to nodes in that same component. More carefully:

1. Definition. Let $M = (\cdot, \delta, \cdot)$ be a 2nfa over a set of states $Q$ and the alphabet $\Sigma_n$. An assignment of foci for $M$ is any mapping $\phi : Q \to [n] \times \{l, r\}$ such that, for any states $p, q \in Q$, symbol $a \in \Sigma_n$, and side $s \in \{l, r\}$: whenever $M$ is at $p$ reading $a$,

(i) its next move depends only on the component containing the node which $p$ is focused on:
\[
\delta(p, a) = \delta(p, a \restriction \phi(p)),
\]

(ii) its next state and position can only be such that the new focused node belongs to the same connected component as the node which $p$ is focused on:
\[
\text{if } \delta(p, a) \ni (q, s), \text{ then } (\exists i \in [n])\ \bigl[\phi(q) = (i, \bar{s}) \ \&\ a \restriction (i, s) = a \restriction \phi(p)\bigr].
\]

If such φ exists, we say M is a mole; we also say φ(p) is the focus of p.

To understand Condition (ii), consider as an example the case $s = r$ (see also Figure 15): If $p$ on $a$ moves right into $q$, then in the new position $q$ must focus on the left column ($\phi(q) = (\cdot, \bar{s}) = (\cdot, l)$), the one shared with the previous position. Moreover, if in this column $q$ focuses on the $i$th node ($\phi(q) = (i, l)$), then in the previous position this node (now the $i$th node of the right column) must belong to the same connected component as the node which $p$ focused on ($a \restriction (i, r) = a \restriction \phi(p)$).
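The restriction $a \restriction (i, s)$ used in both conditions can be sketched in a few lines of Python. This is only an illustration under assumed conventions: a symbol is a set of arrows $(a, b)$ from left node $a$ to right node $b$, a focus is a pair such as `(1, 'l')`, and the function name `restrict` is hypothetical.

```python
def restrict(symbol, focus):
    """Arrows of `symbol` lying in the connected component of the focused node."""
    i, s = focus
    component = {(s, i)}                       # nodes are tagged (side, index)
    changed = True
    while changed:                             # grow the component to a fixpoint
        changed = False
        for (a, b) in symbol:
            left, right = ('l', a), ('r', b)
            if (left in component) != (right in component):
                component |= {left, right}
                changed = True
    # An arrow belongs to the component iff its left endpoint does.
    return {(a, b) for (a, b) in symbol if ('l', a) in component}

# The symbol written [12,14,25,44] in Section 2.1-II.
sym = {(1, 2), (1, 4), (2, 5), (4, 4)}
same_component = restrict(sym, (4, 'r')) == restrict(sym, (1, 'l'))
```

Here `same_component` is true: the 4th right node and the 1st left node share a component, so under Condition (ii) a state focused on $(1, l)$ could move right into a state focused on $(4, l)$ of the next symbol.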


Note that the 1nfa from page 21 clearly satisfies Definition 1. Hence, small one-way nondeterministic moles can indeed solve liveness—with just n states, actually. In contrast, we will prove the following. 2. Theorem. Two-way deterministic moles cannot solve liveness. Remark that the theorem applies to all two-way deterministic moles, as opposed to only small ones. We also stress that the main purpose of Definition 1 is to disambiguate the intuitive notion of a mole—in contrast, the arguments in our proofs will heavily rely on intuition. 2.3. Mazes. What makes moles so weak is the fact that, as they move through the input, they can only observe the part of the graph directly connected to their current location. The rest of the graph is not observable, even if it occupies the same symbols as the observable part, and therefore does not affect the computation. Lemma 3 below turns this intuition into a clean fact that can be used in proofs. Before stating it, we need to talk about mazes and how moles compute on them and their compositions. Intuitively, a maze is any string on which some nodes have been designated as ‘entry-exit gates’ for moles (Figure 19b on page 78). More carefully, for x ∈ Σ ∗ , let Vx0 ⊆ Vx consist of every node that has exactly one of its two degrees equal to 0 (and can thus serve as a gate). A maze on x is any pair (x, X) where X ⊆ Vx0 . The computation of a mole on a maze is the same object as the computation of any 2nfa on any string, with the extra condition that it ‘starts by entering a gate’ and ‘if it exits a gate, it ends immediately’. Formally, let χ = (x, X) be a maze, u = (i, j) ∈ X a gate with 0-degree side s, and p a state of a mole M with focus φ(p) = (i, s). Then, the computation compM,p,u (χ) of M on χ from p and u (note the overloading of operator comp) is a prefix of either compM,p,j+1 (x) (if s = l) or compM,p,j (x) (if s = r). 
The prefix ends the first time (if ever) it reaches a point (qt , jt ) where the focus φ(qt ) = (it , st ) is on a gate with 0-degree side st . Note that x may contain nodes that have degree 0 on one of their two sides but are not gates; the computation may very well visit the 0-degree side of these nodes without having to terminate. To compose two mazes means to draw their strings on top of each other and then discard all coinciding gates (Figure 19c). More carefully, mazes χ = (x, X) and ψ = (y, Y ) are composable iff |x| = |y| (so that Vx = Vy = V ) and their graphs intersect only at gates and only appropriately: every v ∈ V , either has both its degrees equal to 0 in at least one of x, y; or is a gate in both mazes, with a different 0-degree side in each of them. If χ, ψ are composable, then their composition is the pair χ ◦ ψ := (x ∪ y, X ⊖ Y ). Clearly, the composition is a maze, too. Note that, by the conditions of composability, in each one of the symbols of x ∪ y every non-empty connected component comes entirely from


exactly one of $x$ or $y$. Hence, when a mole reads a symbol, its next step depends on exactly one of $x$ or $y$. Generalizing, we can prove the following.

3. Lemma. Let $\chi$ and $\psi$ be as above, and $\omega := \chi \circ \psi$ be their composition. Consider any computation $c := \mathrm{comp}_{M,p,u}(\chi \circ \psi)$ of a mole $M$ from a gate $u \in X \ominus Y$ that comes from $X$. Then there exists a unique list of computations $c_1, c_2, \ldots$ such that:
• each $c_t$ is a computation of $M$ on $\chi$ (resp., on $\psi$) iff $t$ is odd (even);
• $c_1$ starts from $p$ and $u$, while every $c_{t+1}$ starts from the state and gate where $c_t$ ends;
• if we remove the first point of every $c_t$ after $c_1$ and then concatenate all computations, the resulting computation is $c$.

Put another way, if we can decompose a maze $\omega$ into two mazes $\chi$ and $\psi$, then any computation $c$ of a mole on $\omega$ can be uniquely decomposed into ‘subcomputations’ $c_1, c_2, \ldots$ that alternate between $\chi$ and $\psi$. We say these computations are the fragments of $c$ with respect to the decomposition $\omega = \chi \circ \psi$. Clearly, either all fragments are finite, and then their list is infinite iff $c$ is; or not all fragments are finite, in which case their list is finite and the only infinite fragment is the last one. Note that different decompositions of $\omega$ may lead to different decompositions of $c$.

2.4. Hard inputs. In Section 2.5 we will fix an arbitrary deterministic mole and prove that it fails against liveness. To this end, we will construct inputs on which the automaton decides incorrectly. Those fatally hard strings will be extremely long. However, we will build them out of other, much shorter (but still very long) strings, which already strain the ability of the automaton to process the information on its tape. In this section we describe those shorter strings. We start with inputs which can be built for any 2dfa and later (Section 2.4-V) focus on inputs that can be built particularly for deterministic moles. So, fix $M$ to be an arbitrary 2dfa over state set $Q$ and alphabet $\Sigma$.

2.4-I. Dilemmas.
Consider any property $P \subseteq \Sigma^*$ of the strings over $\Sigma$, and assume that it is infinitely extensible to the right, in the sense that every string that has the property can be right-extended into a strictly longer one that also has it: $(\forall y \in P)(\exists z \neq \epsilon)(yz \in P)$. For example, the property of being of even length is of this kind.

Given any $y \in P$, we can perform the following experiment. For each $p \in Q$, we examine the computation $\mathrm{lcomp}_{M,p}(y)$ and check if it hits right: if it does, we set a bit $a_{y,p}$ to 1; otherwise, the computation hangs, loops, or hits left, and $a_{y,p}$ is set to 0. In the end, we build the bit-vector $a_y := (a_{y,p})_{p \in Q}$. This is our outcome.

How does the outcome change if we right-extend $y$ into some $yz \in P$? How do $a_y$ and $a_{yz}$ compare? For every $p$, clearly $\mathrm{lcomp}_{M,p}(y)$ is a prefix of $\mathrm{lcomp}_{M,p}(yz)$. So, if the first computation hits left, loops, or hangs, so


does the second one; but if the first one hits right, there is no guarantee what the second computation does. Hence, all bits in $a_y$ that are 0 keep the same value in $a_{yz}$; but a bit which is 1 may turn into a 0. Overall, if “≥” is the natural component-wise order, we have the following.

4. Lemma. For all $y, yz \in P$: $a_y \ge a_{yz}$.

What happens to the outcome of the experiment if we further right-extend $y$ into $yzz' \in P$? And then into $yzz'z'' \in P$? While $y$ is infinitely right-extensible inside $P$, the outcome may decrease only finitely many times. Obviously then, from some point on it must stop changing. When this happens, the extension of $y$ that we have arrived at is a very useful tool. The following definition and lemma talk about it formally.

5. Definition. Let $P \subseteq \Sigma^*$. An l-dilemma over $P$ is any string $y \in P$ such that:$^{1}$ for all extensions $yz \in P$ and all states $p \in Q$,
\[
\mathrm{lcomp}_{M,p}(y) \text{ hits right} \iff \mathrm{lcomp}_{M,p}(yz) \text{ hits right}.
\]
An r-dilemma over $P$ is defined similarly, on left-extensions and rcomp.

6. Lemma. Let $P \subseteq \Sigma^*$. If $P$ is non-empty and infinitely extensible to the right (resp., left), then l-dilemmas over $P$ (r-dilemmas over $P$) exist.

Proof. Pick any $y \in P$ and keep extending it in the direction of infinite extensibility until $a_y$ stops changing. When it does, the extension is a dilemma (as are, of course, all further extensions inside $P$).

In [51], dilemmas are called “blocking strings”. Both names serve as reminders of the way these strings are used, as we now explain.

7. Lemma. Suppose $x \in \Sigma^*$, $y$ is an l-dilemma over $P$, $yz \in P$, and some computation $c := \mathrm{lcomp}_{M,p}(xyz)$ crosses the $xy$-$z$ boundary. After the first such crossing, $c$ never visits $x$ again and it eventually hits right.

Proof. Consider the first time $c$ crosses the $xy$-$z$ boundary (Figure 16a). Let $r$ be the state resulting from this crossing, and $q$ the state resulting from the last crossing of the $x$-$yz$ boundary before that.
Then, the computation between these two crossings is $\mathrm{lcomp}_{M,q}(y)$ and hits right (into $r$). Since $y$ is an l-dilemma over $P$ and $z$ does not spoil the property ($yz \in P$), we know that $\mathrm{lcomp}_{M,q}(yz)$ also hits right. But this computation is a suffix of $c$. So, $c$ also hits right. Moreover, after crossing the $xy$-$z$ boundary, it never visits $x$ again.

1 Note that the given condition is the same as $(\forall yz \in P)(a_y = a_{yz})$, but rather more informative. Also note that the “⇐=” part of the displayed equivalence is trivially true of all $y$, by Lemma 4. What is important is the “=⇒” part: on every extension of $y$ in $P$, the computation will keep hitting right.
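The experiment behind Lemma 4 is easy to replay on a small machine. The sketch below is not taken from the text: the three-state 2dfa `DELTA`, the helper names, and the choice of $P$ as all non-empty strings over $\{a, b\}$ are assumptions made purely for illustration.

```python
from itertools import product

# Hypothetical toy 2dfa over {a, b} with states 0, 1, 2 (not from the text).
# DELTA[(state, symbol)] = (next state, direction); a missing entry means "hang".
DELTA = {
    (0, 'a'): (0, 'r'), (0, 'b'): (1, 'l'),
    (1, 'a'): (1, 'l'), (1, 'b'): (2, 'r'),
    (2, 'a'): (0, 'r'),
}
STATES = (0, 1, 2)

def run(u, q, pos):
    """Run from state q on the pos-th symbol of u (1-based).
    Returns ('l', p) or ('r', p) on hitting left/right into p, else None."""
    seen = set()
    while 1 <= pos <= len(u):
        if (q, pos) in seen:                   # repeated configuration: a loop
            return None
        seen.add((q, pos))
        if (q, u[pos - 1]) not in DELTA:       # no move available: a hang
            return None
        q, direction = DELTA[(q, u[pos - 1])]
        pos += 1 if direction == 'r' else -1
    return ('l' if pos == 0 else 'r', q)

def outcome(y):
    """The bit-vector a_y: bit p is 1 iff lcomp_{M,p}(y) hits right."""
    bits = []
    for p in STATES:
        result = run(y, p, 1)
        bits.append(1 if result is not None and result[0] == 'r' else 0)
    return tuple(bits)

def strings(max_len):
    """All non-empty strings over {a, b} of length at most max_len."""
    for k in range(1, max_len + 1):
        for letters in product('ab', repeat=k):
            yield ''.join(letters)

# Lemma 4 on this machine: right-extending y never turns a 0 bit into a 1.
lemma4_holds = all(
    all(b1 >= b2 for b1, b2 in zip(outcome(y), outcome(y + z)))
    for y in strings(4) for z in strings(2)
)
```

On this toy machine, `outcome("aa")` is `(1, 0, 1)` while `outcome("aab")` is `(0, 0, 0)`. Since the all-zero vector cannot decrease further, `"aab"` is trivially an l-dilemma over the chosen $P$.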


Figure 16. Two-way computations on dilemmas and on generic strings (see text).

In total, once the computation crosses the $xy$-$z$ boundary, it is restricted inside $yz$ and forced to eventually hit right. Put another way, when $M$ enters $y$, it faces a ‘dilemma’: either it will stay forever inside $xy$, never crossing the $xy$-$z$ boundary; or it will cross it, but then also hit right without visiting $x$ again. In effect, $y$ ‘blocks’ $M$ from returning to $x$ after having seen $z$, and ‘locks’ it into hitting right. In yet other words, $y$ makes sure that every left computation of $M$ on $xyz$ that hits left, hangs, or loops does so inside $xy$, before making it to $z$.

2.4-II. Generic strings. Consider again some $P \subseteq \Sigma^*$ which is infinitely extensible to the right. For each $y \in P$, we can define the set of states that can be produced on the rightmost boundary of $y$ by left computations:

\[
\mathrm{lstates}(y) := \{ q \in Q \mid (\exists p \in Q)\ \mathrm{lcomp}_{M,p}(y) \text{ hits right into } q \}.
\]

How does this set change if we extend $y$ into a string $yz \in P$? How does it compare to the set $\mathrm{lstates}(yz)$? Consider the function $\mathrm{lmap}(y, z)(\cdot)$, defined as follows (Figure 16b): for each $q \in \mathrm{lstates}(y)$, the computation $\mathrm{comp}_{M,q,|y|+1}(yz)$ is examined; if it hits right into some state $r$, then $\mathrm{lmap}(y, z)(q) := r$; otherwise, it hits left, loops, or hangs, and $\mathrm{lmap}(y, z)(q)$ is left undefined.

Note that the values of $\mathrm{lmap}(y, z)$ are all in $\mathrm{lstates}(yz)$. Indeed, if $r$ is such a value, then $r = \mathrm{lmap}(y, z)(q)$ for some $q \in \mathrm{lstates}(y)$. Hence, the computation $\mathrm{comp}_{M,q,|y|+1}(yz)$ hits right into $r$ and some computation $\mathrm{lcomp}_{M,p}(y)$ hits right into $q$. Combining the two, we get the computation $\mathrm{lcomp}_{M,p}(yz)$, that hits right into $r$. We thus conclude that $r \in \mathrm{lstates}(yz)$, as claimed.

Moreover, the values of $\mathrm{lmap}(y, z)$ cover $\mathrm{lstates}(yz)$. Indeed, if some state $r \in \mathrm{lstates}(yz)$, then some computation $c := \mathrm{lcomp}_{M,p}(yz)$ hits right into $r$. We know $c$ crosses the $y$-$z$ boundary, so let $q$ be the state produced by the first such crossing.
The computation before this crossing is lcompM,p (y) and hits right into q, so q ∈ lstates(y). The computation after the crossing is compM,q,|y|+1 (yz) and, as a suffix of c, hits right into r. We thus conclude that lmap(y, z)(q) = r, namely that lmap(y, z) covers r.
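The two claims just argued (the values of lmap(y, z) lie in lstates(yz) and cover it) can be checked by brute force on a small machine. The following sketch uses a hypothetical three-state 2dfa over $\{a, b\}$; the machine and all helper names are assumptions for illustration, not objects from the text.

```python
from itertools import product

# Hypothetical toy 2dfa (not from the text); missing entries mean "hang".
DELTA = {
    (0, 'a'): (0, 'r'), (0, 'b'): (1, 'l'),
    (1, 'a'): (1, 'l'), (1, 'b'): (2, 'r'),
    (2, 'a'): (0, 'r'),
}
STATES = (0, 1, 2)

def run(u, q, pos):
    """Returns ('l', p) / ('r', p) if the run hits left/right into p, else None."""
    seen = set()
    while 1 <= pos <= len(u):
        if (q, pos) in seen or (q, u[pos - 1]) not in DELTA:
            return None                        # the computation loops or hangs
        seen.add((q, pos))
        q, direction = DELTA[(q, u[pos - 1])]
        pos += 1 if direction == 'r' else -1
    return ('l' if pos == 0 else 'r', q)

def lstates(y):
    """States produced on the rightmost boundary of y by left computations."""
    results = (run(y, p, 1) for p in STATES)
    return {r[1] for r in results if r is not None and r[0] == 'r'}

def lmap(y, z):
    """The partial map q -> r: from q on the first symbol of z, hit right into r."""
    mapping = {}
    for q in lstates(y):
        result = run(y + z, q, len(y) + 1)
        if result is not None and result[0] == 'r':
            mapping[q] = result[1]
    return mapping

def strings(max_len):
    for k in range(1, max_len + 1):
        for letters in product('ab', repeat=k):
            yield ''.join(letters)

# The image of lmap(y, z) is exactly lstates(yz), so lstates cannot grow.
surjection_holds = all(
    set(lmap(y, z).values()) == lstates(y + z)
    and len(lstates(y)) >= len(lstates(y + z))
    for y in strings(3) for z in strings(3)
)
```

For example, on this machine `lstates("b")` is `{2}`, `lmap("b", "a")` is `{2: 0}`, and `lstates("ba")` is `{0}`.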


Overall, lmap(y, z) is a partial surjection from the set lstates(y) to the set lstates(yz). This immediately implies its domain has enough elements to cover the range, so we know |lstates(y)| ≥ |lstates(yz)|. The next lemma summarizes our findings. Analogously to lstates(y), the set rstates(z) consists of all states that can be produced on the leftmost boundary of z by right computations. Clearly, the symmetric arguments apply. Note that these involve a partial surjection rmap(y, z) from rstates(z) to rstates(yz), defined analogously to lmap(y, z). 8. Lemma. For all y, yz ∈ P , the function lmap(y, z) partially surjects lstates(y) to lstates(yz); hence |lstates(y)| ≥ |lstates(yz)|. In the same manner, for all yz, z ∈ P , the function rmap(y, z) partially surjects rstates(z) to rstates(yz); hence |rstates(yz)| ≤ |rstates(z)|. As in Section 2.4-I, we now ask what happens to the size of the set lstates(y) as we keep right-extending y inside P . Although y is infinitely right-extensible, the size of the set can decrease only finitely many times. Hence, from some point on it must stop changing. When this happens, we have arrived at another useful tool. 9. Definition. Let P ⊆ Σ ∗ . A string y is l-generic over P if y ∈ P and:2 for all extensions yz ∈ P , |lstates(y)| = |lstates(yz)|. An r-generic string over P is defined symmetrically, on left-extensions and rstates. A string that is simultaneously l-generic and r-generic over P is called generic. 10. Lemma. Let P ⊆ Σ ∗ . If P is non-empty and infinitely extensible to the right (resp., left), then l-generic strings over P (r-generic strings over P ) exist. In addition, if yl is l-generic and yr is r-generic, then every string yl zyr ∈ P is generic. Proof. For the last claim, we simply note that every right-extension of an l-generic string inside P is also an l-generic string. Similarly in the other direction. Generic strings were first introduced by Sipser [55], for sdfas and over the property of liveness. 
As we will show in the next section, they strengthen dilemmas. Before presenting that argument, let us prove a last fact about the operators lstates and rstates.

11. Lemma. For any two strings $y$ and $z$, $\mathrm{lstates}(yz) \subseteq \mathrm{lstates}(z)$. Similarly, in the other direction, $\mathrm{rstates}(y) \supseteq \mathrm{rstates}(yz)$.

2 Note that the “≥” part of the displayed equality $|\mathrm{lstates}(y)| = |\mathrm{lstates}(yz)|$ is trivial for all $y$, by Lemma 8. What is important is the “≤” part: on every extension of $y$ in $P$, the set will manage to stay as large.


Proof. We prove the first containment. Pick any r ∈ lstates(yz) and any computation d := lcompM,p (yz) that hits right into r. (Figure 16b.) We know d crosses the y-z boundary. Let q ′ be the state produced by the last such crossing. Then lcompM,q′ (z) is a suffix of d, and therefore also hits right into r. So, r ∈ lstates(z). 2.4-III. Dilemmas versus generic strings. To examine the relation between dilemmas and generic strings, it is helpful to have the following alternative characterizations of the two classes of strings, in terms of the functions lmap(y, z) and rmap(y, z). 12. Lemma. Suppose y ∈ P ⊆ Σ ∗ . Then y is an l-dilemma over P iff for all yz ∈ P the function lmap(y, z) is total. Similarly for any z ∈ P and for r-dilemmas and rmap(y, z). Proof. For the forward direction, assume y is an l-dilemma over P . Consider any yz ∈ P and any q ∈ lstates(y). (Figure 16b.) Let c := lcompM,p (y) be a computation that hits right into q. We know c is a prefix of d := lcompM,p (yz). So, d crosses the y-z boundary (the first such crossing is into q), and thus hits right (Lemma 7). Hence, its suffix compM,q,|y|+1 (yz) hits right, too, which implies lmap(y, z)(q) is defined. For the reverse direction, fix y ∈ P and suppose lmap(y, z) is total for all yz ∈ P . Consider any such yz, any p ∈ Q, and assume c := lcompM,p (y) hits right into some state q. (Figure 16b.) Then q ∈ lstates(y). Therefore, lmap(y, z)(q) is defined. This implies c′ := compM,q,|y|+1 (yz) hits right. Combining c and c′ , we get the computation d := lcompM,p (yz). As c′ is a suffix of d, we know d hits right as well. 13. Lemma. Suppose y ∈ P ⊆ Σ ∗ . Then y is l-generic over P iff for all yz ∈ P the function lmap(y, z) is total and bijective. Similarly for z ∈ P being r-generic and for rmap(y, z). Proof. For the forward direction, suppose y is l-generic and pick any yz ∈ P . We know lmap(y, z) is a partial surjection from lstates(y) to lstates(yz). 
Since $y$ is l-generic, we also know the two sets have the same size. So, $\mathrm{lmap}(y, z)$ must be total and injective, hence bijective. Conversely, fix $y \in P$ and suppose $\mathrm{lmap}(y, z)$ is total and bijective for all $yz \in P$. Then clearly, for every such $yz$, the sets $\mathrm{lstates}(y)$ and $\mathrm{lstates}(yz)$ must have the same size.

Intuitively, a dilemma guarantees that the computations that manage to survive through it will also survive through every extension that preserves the property. A generic string guarantees that, in addition, these computations will keep exiting each extension into different states.

14. Lemma. Let $P \subseteq \Sigma^*$. Over $P$, every l-generic string is an l-dilemma and every l-dilemma is right-extensible into an l-generic string. Similarly for r-generic strings, r-dilemmas, and left-extensions.


Proof. Lemmata 12 and 13 prove the first claim. For the second claim, we simply note that every string in $P$ can be right-extended into l-generic strings.

2.4-IV. Traps. Consider a property $P \subseteq \Sigma^*$ which is infinitely extensible in either direction and closed under concatenation. For this section, fix $\vartheta$ as a generic string over $P$, and let

\[
L := \mathrm{rstates}(\vartheta), \qquad R := \mathrm{lstates}(\vartheta),
\]

denote the sets of states producible on the leftmost and rightmost boundary of $\vartheta$. Note that $\vartheta$ is both an l-dilemma and an r-dilemma (Lemma 14).

A trap (on $\vartheta$) is any string of the form $\vartheta x \vartheta$, where $x \in P$ is the infix. By Lemma 10 and the closure of $P$ under concatenation, traps are still generic strings. However, they further restrict $M$’s freedom: By Lemma 13, the function $\mathrm{lmap}(\vartheta, x\vartheta)$ is a total bijection from $\mathrm{lstates}(\vartheta) = R$ to $\mathrm{lstates}(\vartheta x \vartheta)$. Since $\mathrm{lstates}(\vartheta x \vartheta) \subseteq R$ (by Lemma 11), $\mathrm{lmap}(\vartheta, x\vartheta)$ is a total bijection from $R$ to a subset of $R$. Clearly, this is possible only if this subset is $R$ itself. So, $\mathrm{lmap}(\vartheta, x\vartheta)$ simply permutes $R$. To simplify notation, we denote this permutation by $\alpha_x$. Namely, $\alpha_x := \mathrm{lmap}(\vartheta, x\vartheta)$. Similarly, $\mathrm{rmap}(\vartheta x, \vartheta)$ permutes $L$, and we denote this permutation as $\beta_x$. Overall, we have proved the following.

15. Lemma. For all $x \in P$: $\alpha_x$ permutes $R$ and $\beta_x$ permutes $L$.

Intuitively, in each direction, the computations that manage to cross the first copy of $\vartheta$ eventually cross the entire trap; but, after this first copy, they collectively do nothing more than simply permute the set of states that they have already produced. As we now show, the two permutations fully describe the behavior of $M$ on the trap.

16. Lemma. For all $x, y \in P$: $(\alpha_x, \beta_x) = (\alpha_y, \beta_y) \implies \gamma_{\vartheta x \vartheta} = \gamma_{\vartheta y \vartheta}$.

Proof. Suppose $(\alpha_x, \beta_x) = (\alpha_y, \beta_y)$ and consider any $p \in Q$. We show $\gamma_{\vartheta x \vartheta}$ and $\gamma_{\vartheta y \vartheta}$ agree on $(p, l)$; the proof for $(p, r)$ is similar. We examine the computations $c_x := \mathrm{lcomp}_{M,p}(\vartheta x \vartheta)$ and $c_y := \mathrm{lcomp}_{M,p}(\vartheta y \vartheta)$. Clearly, these behave identically up to the first crossing of the ‘critical’ boundary between $\vartheta$ and $x\vartheta$ or $y\vartheta$. If one of them hits left, loops, or hangs, it does so inside $\vartheta$ (since $\vartheta$ is an l-dilemma) without crossing the critical boundary; so, the other computation behaves identically, thus $\gamma_{\vartheta x \vartheta}(p, l) = \gamma_{\vartheta y \vartheta}(p, l)$.
If one of them hits right, then it crosses the critical boundary into some state q, as does the other one; but then they both hit right, into the same state r := αx (q) = αy (q), so γϑxϑ (p, l) = γϑyϑ (p, l) = (r, r). We call (αx , βx ) the inner-behavior of M on the trap ϑxϑ. Note the distinction from the ‘behavior’ γϑxϑ .
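To make the distinction concrete, the full behavior $\gamma_u$ of Section 2.1-I can be tabulated directly by simulation. The sketch below does so for a hypothetical three-state 2dfa over $\{a, b\}$; the machine and the helper names are illustrative assumptions and do not come from the text.

```python
# Hypothetical toy 2dfa (not from the text); missing entries mean "hang".
DELTA = {
    (0, 'a'): (0, 'r'), (0, 'b'): (1, 'l'),
    (1, 'a'): (1, 'l'), (1, 'b'): (2, 'r'),
    (2, 'a'): (0, 'r'),
}
STATES = (0, 1, 2)

def run(u, q, pos):
    """Returns ('l', p) / ('r', p) if the run hits left/right into p, else None."""
    seen = set()
    while 1 <= pos <= len(u):
        if (q, pos) in seen or (q, u[pos - 1]) not in DELTA:
            return None                        # the computation loops or hangs
        seen.add((q, pos))
        q, direction = DELTA[(q, u[pos - 1])]
        pos += 1 if direction == 'r' else -1
    return ('l' if pos == 0 else 'r', q)

def behavior(u):
    """gamma_u as a dict: (q, entry side) -> (p, exit side); undefined pairs omitted."""
    gamma = {}
    for q in STATES:
        # lcomp starts on the leftmost symbol, rcomp on the rightmost one.
        for side, start in (('l', 1), ('r', len(u))):
            result = run(u, q, start)
            if result is not None:
                exit_side, p = result
                gamma[(q, side)] = (p, exit_side)
    return gamma

# Two different strings on which this machine has the same behavior.
indistinguishable = behavior("a") == behavior("aa")
```

On this machine `indistinguishable` is true, so `"a"` and `"aa"` present the same entry-exit pairs. Lemma 16 says that, for traps, comparing the much smaller pair $(\alpha_x, \beta_x)$ already suffices to reach such a conclusion.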


Figure 17. A two-way computation on a trap (see text).

An interesting case arises when $\vartheta$ is an infix of the infix itself. Then the inner-behavior of $M$ on the trap can be deduced from its inner-behavior on the traps that are induced by the other two pieces of the infix.

17. Lemma. Suppose $x, y \in P$ and $z := x\vartheta y$. Then $(\alpha_z, \beta_z) = (\alpha_x \circ \alpha_y, \beta_y \circ \beta_x)$.

Proof. To show that $\alpha_z = \alpha_x \circ \alpha_y$ (the argument for $\beta_z = \beta_y \circ \beta_x$ is similar), we pick an arbitrary $q \in R$ and show that $\alpha_z(q) = \alpha_y(\alpha_x(q))$ (Figure 17). We know $q$ is produced by some right-hitting left computation on $\vartheta$, say $c_1 := \mathrm{lcomp}_{M,p}(\vartheta)$ for some state $p$. Since $\vartheta$ is an l-dilemma over $P$ and $\vartheta z \vartheta \in P$, we know $c := \mathrm{lcomp}_{M,p}(\vartheta z \vartheta)$ also hits right, into some state $s$. Therefore, $\alpha_z(q) = s$. Before hitting right, $c$ surely crosses the $\vartheta x \vartheta$-$y\vartheta$ boundary; let $r$ be the state produced by the first such crossing. Clearly, the computation $c_2 := \mathrm{comp}_{M,q,|\vartheta|+1}(\vartheta x \vartheta)$ hits right into $r$, and hence $\alpha_x(q) = r$. Moreover, the suffix of $c$ after the first crossing of the $\vartheta x \vartheta$-$y\vartheta$ boundary is $c_3 := \mathrm{comp}_{M,r,|\vartheta x \vartheta|+1}(\vartheta x \vartheta y \vartheta)$ and obviously hits right into $s$. However, since $\vartheta$ is an l-dilemma over $P$ and $\vartheta y \vartheta \in P$, we know $c_3$ never visits the prefix $\vartheta x$. Hence, it can also be written as $c_3 = \mathrm{comp}_{M,r,|\vartheta|+1}(\vartheta y \vartheta)$. Since it hits right into $s$, we conclude that $\alpha_y(r) = s$. Overall, $\alpha_z(q) = s = \alpha_y(r) = \alpha_y(\alpha_x(q))$.

An obvious generalization holds when the infix contains multiple copies of $\vartheta$. In a particular case of interest, the infix consists of several $\vartheta$-separated copies of some $x \in P$. Specifically, for any $k \ge 1$, we define $x^{(k)} := x(\vartheta x)^{k-1}$ and prove the following.

18. Lemma. For any $x \in P$ and for any $k \ge 1$: $(\alpha_{x^{(k)}}, \beta_{x^{(k)}}) = \bigl((\alpha_x)^k, (\beta_x)^k\bigr)$.
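The bookkeeping behind Lemma 18 can be sketched as follows, assuming that plain Python strings stand in for paths and a dictionary stands in for the permutation $\alpha_x$ of $R$. Both stand-ins, and all function names, are hypothetical. The composition follows the convention of Section 2.1, where $f \circ g$ applies $f$ first.

```python
from math import factorial

def iterate(x, theta, k):
    """x^(k) := x (theta x)^(k-1), for k >= 1."""
    return x + (theta + x) * (k - 1)

def compose(f, g):
    """f o g in the convention of Section 2.1: apply f first, then g."""
    return {q: g[f[q]] for q in f}

def power(f, k):
    """The k-fold composition of the permutation f with itself."""
    result = {q: q for q in f}                 # start from the identity
    for _ in range(k):
        result = compose(result, f)
    return result

# Hypothetical stand-ins: a 2-symbol infix, a 1-symbol generic string, and a
# permutation of a 3-element set R playing the role of alpha_x.
x, theta = "xy", "T"
alpha_x = {0: 1, 1: 2, 2: 0}

# x^(k+1) = x^(k) theta x, which is the shape z = x' theta y' of Lemma 17.
inductive_shape = iterate(x, theta, 4) == iterate(x, theta, 3) + theta + x

# Raising a permutation of R to a multiple of |R|! gives the identity, so one
# extra factor gives the permutation back.
m = 2 * factorial(len(alpha_x))
returns_to_alpha = power(alpha_x, m + 1) == alpha_x
```

The first check is the decomposition that lets Lemma 17 be applied inductively. The second is the elementary group-theoretic fact that makes powers of $\alpha_x$ and $\beta_x$ periodic.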

2.4-V. Hard inputs to deterministic moles. We now assume that the $M$ of the previous sections is defined over $\Sigma_n$ and that it is actually a mole. We will design inputs on which $M$ misses a significant amount of information. All these inputs are going to be paths (cf. Section 2.1-II).

We fix some $I \subseteq [n]$ and $i \in I$, and consider the set $\Pi \subseteq \Sigma_n^*$ of all $i$-$I$-$i$ paths. Clearly, $\Pi$ is non-empty, infinitely extensible in both directions, and closed under concatenation. Hence, by Lemma 10, generic strings over $\Pi$ exist. We fix $\vartheta$ to be one, and let $\kappa := |\vartheta|$. We also set $L := \mathrm{rstates}(\vartheta)$, $R := \mathrm{lstates}(\vartheta)$, and let $\mu := \mathrm{lcm}(|L|!, |R|!)$ be the least common multiple of the sizes of the corresponding permutation groups.

For every $l \ge 1$, we consider all traps (on $\vartheta$) with infixes of length $l$ and collect into a set $\Omega_l$ all inner-behaviors that $M$ exhibits on them:

$\Omega_l := \{(\alpha_x, \beta_x) \mid x \text{ is an } i\text{-}I\text{-}i \text{ path of length } l\}$.

As shown in the next fact, every inner-behavior that can be induced by an $l$-long infix can also be induced by an infix of length $l + 2\mu(l + \kappa)$. The subsequent fact explains that sometimes the converse is also true.

19. Lemma. For every $l \ge 1$: $\Omega_l \subseteq \Omega_{l+2\mu(l+\kappa)}$.

Proof. Pick any behavior $(\alpha, \beta) \in \Omega_l$. We know that some $l$-long infix $x \in \Pi$ induces this behavior, namely $(\alpha, \beta) = (\alpha_x, \beta_x)$. Consider the path $x^{(2\mu+1)} = x^{(\mu)}\vartheta x\vartheta x^{(\mu)}$. This is also in $\Pi$ and of length $(2\mu + 1)l + 2\mu\kappa = l + 2\mu(l + \kappa)$. Moreover, by Lemma 18 and the selection of $\mu$ (every permutation of $L$ or of $R$ raised to the power $\mu$ is the identity), we know that this path induces the behavior

$((\alpha_x)^{2\mu+1}, (\beta_x)^{2\mu+1}) = ((\alpha_x)^{2\mu}\alpha_x,\ \beta_x(\beta_x)^{2\mu}) = (\alpha_x, \beta_x) = (\alpha, \beta)$.

Hence, $(\alpha, \beta) \in \Omega_{l+2\mu(l+\kappa)}$.

20. Lemma. There exists$^{3}$ $l \ge 1$ such that $\Omega_l = \Omega_{l+2\mu(l+\kappa)}$.

Proof. The constant $(|L|!) \times (|R|!)$ upper bounds the sizes of all sets $\Omega_1, \Omega_2, \ldots$, so at least one of them is of maximum size. Pick $l$ so that $\Omega_l$ is such. Then both $\Omega_l \subseteq \Omega_{l+2\mu(l+\kappa)}$ (by Lemma 19) and $|\Omega_l| \ge |\Omega_{l+2\mu(l+\kappa)}|$ (by the selection of $l$). Necessarily then, the two sets must be equal.

Intuitively, for the lengths $l$ and $l + 2\mu(l + \kappa)$, this last fact says that between two copies of $\vartheta$, every $i$-$I$-$i$ path of either length can be replaced by some path of the other length without $M$ noticing the trick (cf. Lemma 16).

2.5. The proof. We now fix an arbitrary deterministic mole $M = (s, \delta, f)$ over $\Sigma_5$ and prove that it fails to solve liveness. To this end, in Sections 2.5-II and 2.5-III we construct a maze that ‘confuses’ $M$.
Our most important building blocks are the paths of the next section.

2.5-I. Three special paths. In this section we fix $n := 5$, $i := 2$, and $I := \{1, 2\}$. For these $n$, $i$, and $I$, we fix $\Pi$, $\vartheta$, $\kappa$, and $\mu$ as in Section 2.4-V, we let $\lambda$ be a length as in Lemma 20, and we set $\Lambda := 2\mu(\lambda + \kappa)$.

21. Lemma. There exist paths $\pi, \rho, \sigma \in \Pi$ such that
• $M$ cannot distinguish among them: $\gamma_\pi = \gamma_\rho = \gamma_\sigma$.
• $\rho$ is $\Lambda$-disjoint on itself, and $\pi$ is $\Lambda$-disjoint on $\sigma$.
• $\pi$ is $\Lambda$-shorter than $\rho$, and $\rho$ is $\Lambda$-shorter than $\sigma$: $|\rho| - |\pi| = |\sigma| - |\rho| = \Lambda$.
• $\pi$ is non-empty but short: $0 < |\pi| \le \Lambda$.

$^{3}$ Note that the argument essentially shows the existence of infinitely many such $l$.

Figure 18. (a) Selecting the paths $\eta$ and $\iota$; then the ‘mirrors’ $\vartheta'$ (of $\vartheta$) and $\eta'$ (of $\eta$). (b) The entire path $\rho$ and how it is $\Lambda$-disjoint on itself.

Proof. Each one of $\pi$, $\rho$, and $\sigma$ is going to be a trap on $\vartheta$. So, the proof consists in properly selecting the corresponding infixes $x, y, z \in \Pi$.

We set $\rho := \vartheta y\vartheta$, where $y$ has length $\lambda + \Lambda$ and guarantees $\rho$ is $\Lambda$-disjoint on itself. Constructing $y$ is not hard (Figure 18a): We pick paths
$\eta$ := any 2-$I$-1 path of length $\lambda$,
$\vartheta'$ := the 1-$I$-1 path of length $\kappa$ that is 0-disjoint on $\vartheta$,
$\iota$ := any 1-$I$-1 path of length $\Lambda - (2\kappa + \lambda)$, and
$\eta'$ := the 1-$I$-2 path of length $\lambda$ that is 0-disjoint on $\eta$.
Then, setting $y := \eta\vartheta'\iota\vartheta'\eta'$ we see that this is indeed a 2-$I$-2 path of length $\lambda + \Lambda$; and shifting $\rho = \vartheta y\vartheta = \vartheta\eta\vartheta'\iota\vartheta'\eta'\vartheta$ on a copy of itself by $\Lambda = |\vartheta\eta\vartheta'\iota|$ causes only its prefix $\vartheta\eta\vartheta'$ to overlap with the ‘mirroring’ suffix $\vartheta'\eta'\vartheta$, so that no vertex is shared (Figure 18b).

We set $\pi := \vartheta x\vartheta$, where $x$ has length $\lambda$ and guarantees $\pi$ is indistinguishable from $\rho$. Selecting $x$ is easy: Since $y$ is of length $\lambda + \Lambda$, the inner-behavior $(\alpha_y, \beta_y)$ of $M$ on $\rho$ is in $\Omega_{\lambda+\Lambda}$, and therefore in $\Omega_\lambda$ (by the selection of $\lambda$). Hence, there exist $\lambda$-long paths that induce this inner-behavior. Picking $x$ to be such a path, we know that $(\alpha_x, \beta_x) = (\alpha_y, \beta_y)$ and hence $\gamma_\pi = \gamma_\rho$.

We set $\sigma := \vartheta z\vartheta$, where $z$ has length $\lambda + 2\Lambda$ and guarantees that $\pi$ is $\Lambda$-disjoint on $\sigma$ and that $\sigma$ is indistinguishable from $\pi$. Note that, given the lengths of $x$ and $z$, the disjointness condition amounts to saying that $\pi$ and $\sigma$ should not intersect when ‘centered’ on top of each other. The construction of $z$ is trickier. We start by selecting a path $y'$ that is as long as $y$ (i.e., of length $\lambda + \Lambda$) and does not intersect $\pi$ when the two are ‘centered’ on top of each other (i.e., $\vartheta x\vartheta$ is $(\frac{\Lambda}{2} - \kappa)$-disjoint on $y'$). This selection is trivial: we just take the unique 1-$I$-1 path that is as long as $\pi$ (i.e., of length $\lambda + 2\kappa$) and 0-disjoint on it, and extend it by $\frac{\Lambda}{2} - \kappa$ in both directions into any 2-$I$-2 path. Now, the inner-behavior $(\alpha_{y'}, \beta_{y'})$ of $M$ on $\vartheta y'\vartheta$ is in $\Omega_{\lambda+\Lambda}$, and hence in $\Omega_\lambda$. Therefore, we can find a $\lambda$-long $x' \in \Pi$ that induces the same behavior, $(\alpha_{x'}, \beta_{x'}) = (\alpha_{y'}, \beta_{y'})$. We set

$z := (x')^{(\mu)}\,\vartheta y'\vartheta\,(x')^{(\mu-1)}\,\vartheta x$,

the path containing $2\mu + 1$ $\vartheta$-separated paths, all copies of $x'$ except the middle and rightmost ones, which copy $y'$ and $x$. The length of $z$ is indeed $\lambda + 2\Lambda$. Moreover, $\sigma = \vartheta z\vartheta$ symmetrically extends $y'$ by $|\vartheta(x')^{(\mu)}\vartheta| = |\vartheta(x')^{(\mu-1)}\vartheta x\vartheta| = \frac{\Lambda}{2} + \kappa$, which in turn symmetrically out-lengths $\pi$ by $\frac{\Lambda}{2} - \kappa$. Overall, $\sigma$ symmetrically out-lengths $\pi$ by $\Lambda$ without intersecting it. That is, $\pi$ is $\Lambda$-disjoint on $\sigma$. Finally, the inner-behavior $(\alpha_z, \beta_z)$ of $M$ on $\sigma$ is

$\big((\alpha_{x'})^{\mu}\alpha_{y'}(\alpha_{x'})^{\mu-1}\alpha_x,\ \beta_x(\beta_{x'})^{\mu-1}\beta_{y'}(\beta_{x'})^{\mu}\big) = \big((\alpha_{x'})^{2\mu}\alpha_x,\ \beta_x(\beta_{x'})^{2\mu}\big) = (\alpha_x, \beta_x)$,

by Lemmata 17 and 18, and by the selection of $\mu$. Hence, $\gamma_\sigma = \gamma_\pi$.
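Two computations in this proof are easy to check mechanically: the length of $z$, and the collapse of the powers $2\mu$ “by the selection of $\mu$”. The sketch below does both for small parameters. It assumes permutations are given as tuples over `range(n)`; the function names and the sample values of $\lambda$, $\kappa$, $\mu$ are hypothetical.

```python
from itertools import permutations
from math import factorial, gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def perm_power(p, k):
    """k-th power of a permutation p, given as a tuple over range(len(p))."""
    result = tuple(range(len(p)))
    for _ in range(k):
        result = tuple(p[i] for i in result)
    return result


def collapses(size_l, size_r):
    """Check that p^(2*mu + 1) == p for every permutation p of size_l or size_r points,
    where mu = lcm(size_l!, size_r!)."""
    mu = lcm(factorial(size_l), factorial(size_r))
    for size in (size_l, size_r):
        for p in permutations(range(size)):
            if perm_power(p, 2 * mu + 1) != p:
                return False
    return True


def big_lambda(lam, kappa, mu):
    """The quantity Λ := 2µ(λ + κ)."""
    return 2 * mu * (lam + kappa)


def length_of_z(lam, kappa, mu):
    """Length of z = (x')^(µ) ϑ y' ϑ (x')^(µ-1) ϑ x, counted piece by piece."""
    copies_of_x_prime = (2 * mu - 1) * lam       # µ + (µ - 1) copies of x'
    y_prime = lam + big_lambda(lam, kappa, mu)   # |y'| = λ + Λ
    x = lam                                      # |x| = λ
    separators = 2 * mu * kappa                  # 2µ copies of ϑ between 2µ + 1 pieces
    return copies_of_x_prime + y_prime + x + separators


def half_extension(lam, kappa, mu):
    """Length of ϑ (x')^(µ) ϑ, which should equal Λ/2 + κ."""
    return 2 * kappa + mu * lam + (mu - 1) * kappa
```

For instance, with the sample values `lam=5, kappa=3, mu=6`, `length_of_z` agrees with `lam + 2 * big_lambda(...)`, matching the claim that $|z| = \lambda + 2\Lambda$.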

Figure 19. (a) in each of 1, 2, 3: a 29-long string, 6-disjoint on itself; see 5. (b) in each of 1, 2, 3: a maze; gates marked with circles. (c) in 3: the composition of the mazes of 1, 2. (d) in 1, 2, 3: examples of $\tau_2$, $\tau_1$, $\tau$, respectively, for a schematic case $\Lambda = 6$, $|\pi| = 4$, and a schematic $\rho$. (e) in 4: a schematic of $\tau^4$; in 5: a snippet of the union of a $\tau^i$ with a $\Lambda$-shifted copy of itself.

2.5-II. A maze of questions. We start with two strings (Figure 19d)

$\tau_1 := [\,]^{3\Lambda}\,\rho\,[\,]$ and
$\tau_2 := [33]^{\Lambda-1}[32][22]^{\Lambda-1}[23][33]^{\Lambda-1}[32,34][45][55]^{\Lambda-1}[54][44]^{|\pi|-1}[23,43]$,

which are equally long and each is $\Lambda$-disjoint on itself (recall the selection of $\rho$). Moreover, in $\tau := \tau_1 \cup \tau_2$ their graphs intersect only at the endpoints of $\rho$, so that $\tau$ is also $\Lambda$-disjoint on itself. This implies that $\tau^i$ is $\Lambda$-disjoint on itself, too, for all $i \ge 1$ (Figure 19e).

Let $P := \{\tau^i \mid i \ge 1\}$ be the set of all powers of $\tau$. Select $\tau_l$ and $\tau_r$ as l- and r-dilemmas over $P$. Fix $m := 2^{|Q|} + 1$. The live string $z := \tau_l\tau^m\tau_r$ is also a power of $\tau$ and in it we think of the $m$ ‘middle’ copies of $\tau$ as distinguished. On this string, we consider the natural maze $\omega = (z, Z) := (z, \{u, v\})$, where $u := (3, 0)$ and $v := (3, |z|)$.

Consider the $|Q|$ computations of the form $\mathrm{comp}_{M,p,\varepsilon}(\omega)$ that we get as we vary $p \in Q$ and pick $\varepsilon := u$ when $p$ focuses on the left ($\varphi(p) = (\cdot, \mathrm{l})$), and $\varepsilon := v$ otherwise. Some of them are infinite (i.e., they loop) or finite but non-crossing (i.e., they hang; or they start and end on the same gate). We disregard them and keep only those that are crossing (i.e., they start and end in different gates). Let $k$ be their number. Clearly, $k \le |Q|$.

Fix $d$ to be any of these $k$ computations and fix $1 \le i \le m$. We know $d$ ‘visits’ the $i$th distinguished copy of $\tau$, and we want to discuss its behavior there. In particular, we want to consider the parity $b_{i,d} \in \{0, 1\}$ of the number of times that $d$ ‘fully crosses’ the copy of $\rho$ in the $i$th distinguished copy of $\tau$. A careful definition of $b_{i,d}$ follows.

If we ‘rip off’ $\rho$ from the $i$th distinguished copy of $\tau$ and then add the two endpoints $u_i, v_i$ of the path as new gates, we construct a new maze,

$\chi_i := \big((\tau_l\tau^{i-1})\,\tau_2\,(\tau^{m-i}\tau_r),\ \{u, v, u_i, v_i\}\big)$.

By the ‘complementary’ operation, where we rip off everything except the particular copy of $\rho$, we can construct the ‘complementary’ maze,

$\psi_i := \big(([\,]^{|\tau_l\tau^{i-1}|})\,[\,]^{3\Lambda}\rho[\,]\,([\,]^{|\tau^{m-i}\tau_r|}),\ \{u_i, v_i\}\big)$.

Clearly, $\omega = \chi_i \circ \psi_i$, and $d$ is a finite computation on this composition. By Lemma 3, we can break $d$ into its finitely many, finite fragments $d_1, d_2, \ldots, d_\nu$. We know every even(-indexed) fragment is a computation on $\psi_i$; we call it crossing if its starting and ending gates differ. The bit $b_{i,d}$ records the parity of the number of such fragments. In other words:

$b_{i,d} = 0 \iff d$ exhibits an even number of crossing even fragments.

Intuitively, as the mole develops a crossing computation on $\omega$, each distinguished copy of $\tau$ asks: “odd or even?” The mole answers this question with the parity of the number of times that it fully crosses $\rho$ in that copy. The bits $b_{i,d}$ record exactly these answers. Organizing these $m \times k$ bits into $m$ $k$-long vectors $b_i := (b_{i,d})_d$, for $i = 1, \ldots, m$, we see that there are more vectors than values for them:

Figure 20. (a) in 3: a schematic of $\chi'$, focusing on the snippets around the leftmost, $i_1$th distinguished, $i_2$th distinguished, and rightmost pairs of copies of $\tau$. (b) in 4: a schematic of $\psi'$, for the same snippets. (c) in 5: a schematic of $\omega' = \chi' \circ \psi'$, for the same snippets; in 1, 2: a better view of how $\sigma$, $\pi$ connect the two disjoint graphs of $x'$ when they replace two copies of $\rho$.

$2^k \le 2^{|Q|} < 2^{|Q|} + 1 = m$.

Hence, $b_{i_1} = b_{i_2}$ for some $1 \le i_1 < i_2 \le m$. This means that, in each crossing finite computation, the answer to the $i_1$th question equals the answer to the $i_2$th one.

2.5-III. A more complex maze. We now return to $\omega = (z, \{u, v\})$. We remove $\rho$ from the $i_1$th and $i_2$th distinguished copies of $\tau$, and name the four natural new gates $u_l, v_l$ (endpoints of $\rho$ in the $i_1$th copy) and $u_r, v_r$ (endpoints of $\rho$ in the $i_2$th copy) to get the new maze

$\chi = (x, X) := \big((\tau_l\tau^{i_1-1})\,\tau_2\,(\tau^{i_2-i_1-1})\,\tau_2\,(\tau^{m-i_2}\tau_r),\ \{u, v, u_l, v_l, u_r, v_r\}\big)$.

As before, the ‘complementary’ maze (rip everything except the two $\rho$’s) is

$\psi = (y, Y) := \big((\cdots)\,[\,]^{3\Lambda}\rho[\,]\,(\cdots)\,[\,]^{3\Lambda}\rho[\,]\,(\cdots),\ \{u_l, v_l, u_r, v_r\}\big)$,

where ellipses stand for appropriately many $[\,]$s. Obviously, $\omega = \chi \circ \psi$. In this section, we will construct a maze $\omega' = \chi' \circ \psi'$, where the mazes $\chi'$ and $\psi'$ are complex versions of $\chi$ and $\psi$.

We start by noting that $x$ is $\Lambda$-disjoint on itself (because $z$ is). So, in the union $x' := x \cup ([\,]^{\Lambda}x)$ of $x$ with a $\Lambda$-shifted copy of itself, the two graphs do not intersect. (Figure 20a.) So, letting $\chi' := (x', X')$, where $X' := X \cup \{u', v', u'_l, v'_l, u'_r, v'_r\}$ contains all gates of $\chi$ plus their counterparts in the shifted copy, we know every computation on $\chi'$ visits and depends on exactly one of the two disjoint graphs.

Similarly, $y$ is $\Lambda$-disjoint on itself (because $\rho$ is), the union $y \cup ([\,]^{\Lambda}y)$ contains two pairs of disjoint copies of $\rho$, and $Y' := Y \cup \{u'_l, v'_l, u'_r, v'_r\}$ contains their endpoints. Viewing each pair of copies of $\rho$ as a copy of the string $\rho \cup ([\,]^{\Lambda}\rho)$, we can replace it with a copy of the string $\rho' := \sigma \cup ([\,]^{\Lambda}\pi)$. If $y'$ is the new string, we set $\psi' := (y', Y')$. (Figure 20b.) Crucially, this substitution preserves
(i) the lengths of strings: $|y'| = |y \cup ([\,]^{\Lambda}y)|$, because $|\rho'| = |\sigma \cup ([\,]^{\Lambda}\pi)| = |\sigma| = 2\kappa + \lambda + 2\Lambda = |\rho| + \Lambda = |\rho \cup ([\,]^{\Lambda}\rho)|$;
(ii) the number and disjointness of paths: since $\pi$ is $\Lambda$-disjoint on $\sigma$, we know $\rho'$ also contains two disjoint paths; and
(iii) the set of endpoints of paths: for example, on the copy of $\rho'$ on the left, $\sigma$ and $\pi$ have endpoints $u_l, v'_l$ and $u'_l, v_l$.
Note that every computation on $\psi'$ visits and depends on exactly one of the paths.

Clearly, the graphs of $x'$ and $y'$ intersect only at the gates in $Y'$. So, $\chi'$ and $\psi'$ are composable, into

$\omega' = (z', Z') := \chi' \circ \psi' = (x' \cup y', \{u, v, u', v'\})$.

(Figure 20c.) Note that $u$ and $u'$ are on the far left; $v$ and $v'$ are on the far right; and the four paths of $y'$ connect the two graphs of $x'$: the mole can switch graphs only if it fully crosses one of the paths.

2.5-IV. The hidden gate. Consider the dead input $z'[\,]$ and the computation $c' := \mathrm{lcomp}_{M,s}(\vdash z'[\,]\dashv)$ on it.
From now on, our goal is to prove that $c'$ never visits $[\,]$. Equivalently, we want to show that $M$ never visits the 0-degree side of the rightmost node $v'$ of $z'$. Intuitively, this is the same as saying that the maze implied by $z'$ hides $v'$ from the mole. Note that this will immediately imply the failure of $M$: on the live input $z'[33]$ the mole will compute exactly as on the dead input $z'[\,]$, as it will never visit the 0-degree side of $v'$ to note the difference.

We start by remarking that, since the first symbol of $z'$ is $[33]$, any attempt of the mole to depart from $\vdash$ into a state of focus other than $(3, \mathrm{l})$ is followed by a step back to $\vdash$. Ignoring these attempts and also noting that the mole can never move past $[\,]$, we see that $c'$ consists essentially of zero or more computations of the form $\mathrm{comp}_{M,p,1}(z'[\,])$ with $\varphi(p) = (3, \mathrm{l})$. For our purposes, it is enough to study the case where $c'$ consists of exactly one such computation.

So, suppose $c' := \mathrm{comp}_{M,p,1}(z'[\,])$, where $\varphi(p) = (3, \mathrm{l})$. As a mole, every time $M$ visits the 0-degree side of the nodes $u'$, $v$, $v'$, it changes direction to ‘return into the graph’ of $z'$. Call every such move a turn and break $c'$ into segments $c'_1, c'_2, \ldots$ so that successive segments are joined at a turn: the later segment starts at the state and position following the last state and position of the earlier segment. Clearly: each segment is a computation on $\omega'$; the first segment is $c'_1 = \mathrm{comp}_{M,p,u}(\omega')$, but later segments start at a gate in $\{u', v, v'\}$; and either all segments are finite, in which case their list is finite iff $c'$ is, or not, in which case the list is finite and only the last segment is infinite.

To prove that $c'$ never visits $[\,]$, it is enough to show that no segment ends in $v'$. This, in turn, is a corollary of the following:
• the first segment starts at gate $u$,
• every finite segment that starts at gate $u$ and does not hang necessarily ends either at gate $u$ or at gate $v$, and
• every finite segment that starts at gate $v$ and does not hang necessarily ends either at gate $u$ or at gate $v$.
We only prove the second statement, in the next section. The third statement can be proved similarly, whereas the first statement is already known.

2.5-V. The final argument.
Let $d'$ be a non-hanging finite segment of $c'$ that starts at $u$. As a finite computation on $\omega' = \chi' \circ \psi'$, it can be broken into finitely many, finite fragments $d'_1, d'_2, \ldots, d'_\nu$; odd(-indexed) fragments compute on $\chi'$ and even(-indexed) fragments compute on $\psi'$ (Lemma 3). By previous remarks, every odd fragment visits and depends on exactly one of the two graphs (non-shifted and shifted) inside $x'$; and every even fragment visits and depends on exactly one of the four paths in $y'$. Calling an even fragment crossing if its start and final gates differ, we clearly see that two successive odd fragments visit different graphs in $x'$ iff the even fragment between them is crossing. Generalizing, and since $d'$ starts on $u$, each odd fragment visits the shifted graph in $x'$ iff the number of crossing even fragments that precede it is odd.

Towards a contradiction, assume $d'$ does not end in $u$ or $v$. Then it ends in either $u'$ or $v'$. Hence, $d'_\nu$ is an odd fragment that visits the shifted graph in $x'$. This immediately implies that the total number of crossing even fragments (before $d'_\nu$, and so throughout $d'$) is odd. In particular, even fragments exist and $d'_1$ necessarily ends at a gate in $Y$.

To reach a contradiction, we will prove that, by replacing every fragment $d'_i$ of $d'$ with an appropriate computation $d_i$ on the original maze $\omega$, we can create a computation $d$ on $\omega$ that cannot possibly exist. Before we start, let $h : X' \to X$ be the function that maps every gate in $X'$ to its ‘unprimed’ version in $X$: for example, $h(u_l) = h(u'_l) = u_l$. Based on this mapping, we can find the appropriate $d_i$ as follows:
• If $d'_i$ is an odd fragment (a computation on exactly one of the two graphs in $\chi'$) from state $q$ and gate $\varepsilon$ to state $r$ and gate $\zeta$, we let $d_i$ be the computation on (the one graph of) $\chi$ from $q$ and $h(\varepsilon)$. Clearly, $d_i$ ends at $r$ and $h(\zeta)$. In particular, $d_1$ starts at $h(u) = u$ and ends at a gate in $h(Y) = Y$.
• If $d'_i$ is an even fragment (a computation on exactly one of the four paths in $\psi'$) from state $q$ and gate $\varepsilon$ to state $r$ and gate $\zeta$, we let $d_i$ be the computation on (one of the two copies of $\rho$ in) $\psi$ from $q$ and $h(\varepsilon)$. Since $\rho$ is indistinguishable from each of $\pi$ and $\sigma$, we know $d_i$ ends at $r$ and $h(\zeta)$. Note here the critical use of the inability of the mole to detect the big difference in the lengths of $\pi$, $\rho$, and $\sigma$.

Reviewing the list $d_1, d_2, \ldots, d_\nu$, we see that: $d_1$ starts at $h(u) = u$; for every $1 \le i < \nu$, fragment $d_i$ ends at the state and gate where $d_{i+1}$ starts; fragment $d_\nu$ ends on $h(u') = u$ or $h(v') = v$; and every even fragment $d_i$ is crossing (on the path of $\psi$ that it visits) iff $d'_i$ is crossing (on the path of $\psi'$ that it visits). Therefore, by concatenating all these new fragments, we can build a computation $d$ on $\chi \circ \psi = \omega$ that starts at $u$, ends at $u$ or $v$, and contains an odd number of crossing even fragments. But is this possible?

If $d$ ends at $u$, then it never moves beyond $\tau_l$ (if it did, it would traverse the l-dilemma and get ‘blocked’ away from $u$). In particular, $d_1$ never reaches a gate in $Y$. But (by a previous remark) this is where it is supposed to end. Contradiction.

If $d$ ends in $v$, then it is a crossing computation on $\omega$. As $\omega$ equals each of the compositions $\chi \circ \psi$, $\chi_{i_1} \circ \psi_{i_1}$, and $\chi_{i_2} \circ \psi_{i_2}$, we know $d$ can be fragmented in three different ways. Clearly, every even fragment with respect to either $\chi_{i_1} \circ \psi_{i_1}$ or $\chi_{i_2} \circ \psi_{i_2}$ is also an even fragment with respect to $\chi \circ \psi$, and vice versa; and is crossing or not (on the copy of $\rho$ that it visits) irrespective of which composition we look at it through. So, letting $\xi, \xi_1, \xi_2$ be the numbers of crossing even fragments with respect to the three compositions, we know $\xi = \xi_1 + \xi_2$ and (as established above) $\xi$ is odd. Yet, by the selection of $i_1$ and $i_2$, the parities of $\xi_1, \xi_2$ are respectively $b_{i_1,d}, b_{i_2,d}$ and hence equal (as $b_{i_1} = b_{i_2}$), so that $\xi$ should be even. Contradiction.

So, in both cases we reach a contradiction, as desired.
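The counting that drives this argument is a pigeonhole step followed by a parity check, and both can be sketched in a few lines. The code below is only an illustration under assumed data: the parity vectors are arbitrary tuples of bits, not the vectors $b_i$ of an actual mole, and the function names are hypothetical.

```python
from itertools import product


def find_equal_pair(vectors):
    """Return indices (i1, i2) with i1 < i2 and vectors[i1] == vectors[i2], or None."""
    first_seen = {}
    for index, vector in enumerate(vectors):
        key = tuple(vector)
        if key in first_seen:
            return first_seen[key], index
        first_seen[key] = index
    return None


def always_has_equal_pair(k):
    """Exhaustively check the pigeonhole step: any list of m = 2**k + 1
    k-bit vectors contains two equal ones."""
    m = 2 ** k + 1
    all_vectors = list(product((0, 1), repeat=k))
    return all(find_equal_pair(choice) is not None
               for choice in product(all_vectors, repeat=m))


def total_crossings_even(vectors, d):
    """For the pair (i1, i2) found by find_equal_pair, the parities of the two
    crossing counts along computation d agree, so their sum is even."""
    i1, i2 = find_equal_pair(vectors)
    return (vectors[i1][d] + vectors[i2][d]) % 2 == 0


# Assumed example: k = 2 crossing computations, m = 5 distinguished copies.
sample = [(0, 1), (1, 1), (0, 0), (1, 0), (1, 1)]
pair = find_equal_pair(sample)
```

Here `pair` identifies two positions whose answers coincide on every computation, which is why a total of $\xi = \xi_1 + \xi_2$ crossings over those two copies can never be odd.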


3. Restricted Bidirectionality: Sweeping Automata

In this section we explore the approach that we described in Section 3.3 of the Introduction. After we formally define what it means for a 2nfa to be sweeping, we will prove that every sweeping 2nfa solving $B_n$ needs $2^{\Omega(n)}$ states (here, $B_n$ is the complement of liveness).

3.1. Preliminaries. Our basic notation and definitions are as presented in Section 2 of Chapter 1. Some extra notions and facts, of special interest to this section, are presented below.

3.1-I. Sets, functions, and relations. As usual, for any set $U$, we write $\overline{U}$, $|U|$, $\mathcal{P}(U)$, and $U^2$ for the complement, the size, the powerset, and the set of pairs of $U$. The next simple lemma plays a central role in our proof.

22. Lemma. Let $I$ be a set of indices totally ordered by [...] $|u| + |v|$, and thus $(u', v') > (u, v)$. Otherwise, $u' = u$ and $v' = v$, and thus $(u', v') = (u, v)$. In both cases, $(u', v') \not< (u, v)$, as needed.

3.1-II. Strings over $\Sigma_n$. Recall the alphabet $\Sigma_n$ over which liveness is defined. In this section, it will be convenient to have a concise way of describing how the edges of a string over $\Sigma_n$ connect the vertices of its outer columns. So, given any $z \in \Sigma_n^*$, we say that $z$ has connectivity $\xi \subseteq [n]^2$ if the following holds: $(a, b) \in \xi$ iff $z$ contains a path from the $a$-th node of its leftmost column to the $b$-th node of its rightmost column. For example, the connectivity of the string on page 21 is $\{(3, 1), (3, 4)\}$; the connectivity of the empty string $\epsilon$ is the identity relation $\{(a, a) \mid a \in [n]\}$; and the connectivity of any single symbol is the symbol itself. The set of all strings of connectivity $\xi$ is written as $B_{n,\xi}$. In this terminology, the dead strings are exactly those with connectivity $\emptyset$; in other words, $B_n = B_{n,\emptyset}$. Every other connectivity implies a string which is live.

3.2. Sweeping automata. One way to define sweeping 2nfas is to start with our standard definition for 2nfas (cf. Chapter 1, Section 2) and simply impose the restriction that the transition function is such that the direction of the input head never changes strictly inside the input, for all inputs and all branches of the corresponding nondeterministic computations. Note that, with a definition of this kind, it becomes meaningful to ask whether a given 2nfa is sweeping or not.

Our approach will be different. We will give an entirely new definition, with the restriction about the direction of motion built-in. The best way to explain what this means is to give the definition right away. As usual, we start with the deterministic version and leave the straightforward generalization to nondeterminism for later.

3.2-I. The deterministic case. By a sweeping deterministic finite automaton (sdfa) over the states of a set $Q$ and the symbols of an alphabet $\Sigma$ we mean any triple $M = (s, \delta, f)$, where $\delta$ is the transition function, partially mapping $Q \times \Sigma_e$ to $Q$, and $s, f \in Q$ are the start and the final states. An input $w \in \Sigma^*$ is presented to $M$ surrounded by the end-markers, as $\vdash w \dashv$. The computation starts at $s$ and on the symbol to the right of $\vdash$, heading rightward. The next state is always derived from $\delta$ and the current state and symbol. The next position is always the adjacent one in the direction of motion, except when the current symbol is $\vdash$ or when the current symbol is $\dashv$ and the next state is not $f$, in which two cases the next position is the adjacent one in the opposite direction. Note that the computation can either loop, or hang, or move past $\dashv$ into $f$. In this last case we say that $M$ accepts $w$.
We stress that the values of the transition function do not contain any direction information. In contrast, this information is derived implicitly from the assumption that the automaton is sweeping. This greatly simplifies the setting and helps us stay closer to the combinatorial essence of the sweeping automata, avoiding the distraction caused by irrelevant inherited features. Moreover, this definitional shift does not invalidate our conclusions in this section. Specifically, it is not hard to verify that the size of a sweeping 2nfa under the new definition is linearly related to the size of a smallest equivalent sweeping 2nfa under the standard definition, and vice versa. Hence, any exponential lower bound under either definition implies an exponential lower bound under the other one—of course, if we cared about the exact trade-off, the choice of definition would matter.
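The definition of an sdfa translates almost line by line into a simulator, sketched below. It assumes the transition function is a Python dictionary keyed by (state, symbol) pairs; the function name `run_sdfa`, the result strings, and the example automaton (which accepts the strings over {a, b} with an even number of a's and an even number of b's, using three sweeps) are all hypothetical.

```python
LEFT_END, RIGHT_END = "⊢", "⊣"


def run_sdfa(start, delta, final, w):
    """Simulate a sweeping DFA M = (start, delta, final) on input w.

    delta is a partial map {(state, symbol): next_state} with no direction
    information; the direction only flips on the end-markers.
    Returns "accept", "hang", or "loop".
    """
    tape = [LEFT_END] + list(w) + [RIGHT_END]
    state, position, direction = start, 1, +1   # start right of ⊢, heading rightward
    seen = set()
    while True:
        configuration = (state, position, direction)
        if configuration in seen:
            return "loop"                       # a repeated configuration repeats forever
        seen.add(configuration)
        symbol = tape[position]
        next_state = delta.get((state, symbol))
        if next_state is None:
            return "hang"                       # delta is undefined here
        if symbol == RIGHT_END:
            if next_state == final:
                return "accept"                 # moves past ⊣ into the final state
            direction = -1                      # otherwise turn around on ⊣
        elif symbol == LEFT_END:
            direction = +1                      # always turn around on ⊢
        state = next_state
        position += direction


# Example automaton (assumed): sweep 1 checks the parity of a's, sweep 2 (leftward)
# checks the parity of b's, sweep 3 walks back to ⊣ and exits into "f".
DELTA = {
    ("a0", "a"): "a1", ("a1", "a"): "a0", ("a0", "b"): "a0", ("a1", "b"): "a1",
    ("a0", RIGHT_END): "b0",
    ("b0", "b"): "b1", ("b1", "b"): "b0", ("b0", "a"): "b0", ("b1", "a"): "b1",
    ("b0", LEFT_END): "g",
    ("g", "a"): "g", ("g", "b"): "g", ("g", RIGHT_END): "f",
}
```

For example, `run_sdfa("a0", DELTA, "f", "abab")` makes three sweeps and accepts, while on `"ab"` the automaton hangs on the right end-marker at the end of the first sweep.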

3. SWEEPING AUTOMATA


The simplified definition allows for a simplified notion of computation, as well. In particular, for any $z \in \Sigma^*$ and $p \in Q$, the left computation of $M$ from $p$ on $z$ is the unique sequence $\mathrm{lcomp}_{M,p}(z) = (q_t)_{1 \le t \le m}$ where $q_1 = p$; every next state is $q_{t+1} = \delta(q_t, z_t)$, provided that $t \le |z|$ and the value of $\delta$ is defined; and $m$ is the first $t$ for which this last provision fails. If $m = |z| + 1$, then the computation exits into $q_m$; otherwise, $1 \le m \le |z|$ and the computation hangs at $q_m$. The right computation of $M$ from $p$ on $z$ is defined symmetrically, as the sequence $\mathrm{rcomp}_{M,p}(z) = (q_t)_{1 \le t \le m}$ with $q_{t+1} = \delta(q_t, z_{|z|+1-t})$.

3.2-II. The nondeterministic case. If $M$ is allowed more than one next move at each step, we say that it is nondeterministic (snfa). Formally, this means that $\delta$ totally maps $Q \times \Sigma_e$ to the powerset of $Q$ and implies that, on input any $w \in \Sigma^*$, $M$ exhibits a set of computations on $\vdash w \dashv$. If at least one of them moves past $\dashv$ into $f$, then $M$ accepts $w$. Similarly, $\mathrm{lcomp}_{M,p}(z)$ and $\mathrm{rcomp}_{M,p}(z)$ are now sets of computations.

We also introduce a notion to describe how the states of $M$ connect via left and right computations on some string. For left computations on some $z \in \Sigma^*$, we encode these connections into a binary relation $\mathrm{lview}_M(z) \subseteq Q^2$, which we refer to as the left behavior of $M$ on $z$, defined as:

$(p, q) \in \mathrm{lview}_M(z) \iff (\exists c \in \mathrm{lcomp}_{M,p}(z))(c \text{ exits into } q)$.

Then, for any $u \subseteq Q$, the set $\mathrm{lview}_M(z)(u)$ of states reachable from within $u$ via left computations on $z$ is the left view of $u$ on $z$. The right behavior $\mathrm{rview}_M(z)$ of $M$ on $z$ and the right view $\mathrm{rview}_M(z)(u)$ of $u$ on $z$ are defined symmetrically.

We note that, on strings of length 1, the automaton has the same behavior in both directions:

25. Lemma. If $|z| = 1$, then $\mathrm{lview}_M(z)$ and $\mathrm{rview}_M(z)$ coincide: $\mathrm{lview}_M(z) = \mathrm{rview}_M(z) = \{(p, q) \mid \delta(p, z) \ni q\}$.

We also note that, if extending a string $z$ does not cause a view to include new states, then this remains true on all identical further extensions:

26. Lemma.
The following implications are true, for all $t \ge 1$:
a. $\mathrm{lview}_M(z)(u) \supseteq \mathrm{lview}_M(z\tilde z)(u) \implies \mathrm{lview}_M(z)(u) \supseteq \mathrm{lview}_M(z\tilde z^t)(u)$,
b. $\mathrm{rview}_M(z)(u) \supseteq \mathrm{rview}_M(\tilde z z)(u) \implies \mathrm{rview}_M(z)(u) \supseteq \mathrm{rview}_M(\tilde z^t z)(u)$.

Proof. For part (a), suppose $\mathrm{lview}_M(z)(u)$ contains $\mathrm{lview}_M(z\tilde z)(u)$. To show that it also contains $\mathrm{lview}_M(z\tilde z^t)(u)$, we use induction on $t$. The case $t = 1$ is the assumption itself. For $t \ge 1$, we calculate:

$\mathrm{lview}_M(z\tilde z^{t+1})(u) = \mathrm{lview}_M(\tilde z)\big(\mathrm{lview}_M(z\tilde z^t)(u)\big) \subseteq \mathrm{lview}_M(\tilde z)\big(\mathrm{lview}_M(z)(u)\big) = \mathrm{lview}_M(z\tilde z)(u) \subseteq \mathrm{lview}_M(z)(u)$.

The 1st step holds because $\mathrm{lview}_M(z\tilde z^{t+1}) = \mathrm{lview}_M(z\tilde z^t) \circ \mathrm{lview}_M(\tilde z)$. In the 2nd step, we use the inductive hypothesis and the monotonicity of $\mathrm{lview}_M(\tilde z)(\cdot)$. The 3rd step holds because $\mathrm{lview}_M(z\tilde z) = \mathrm{lview}_M(z) \circ \mathrm{lview}_M(\tilde z)$. Finally, the last step uses the original assumption. For the implication of part (b), a symmetric argument applies.

3.3. Proof outline. We are now ready to present an outline of our proof. As explained in the introduction, to show that small snfas are not closed under complement ($\mathsf{sn} \ne \mathsf{cosn}$), it is enough to prove the following.

27. Theorem. Every snfa that recognizes $B_{n,\emptyset}$ has $2^{\Omega(n)}$ states.

The rest of Section 3 is a proof of this fact. We fix $n$ and an snfa $M = (s, \delta, f)$ over a set of $k$ states $Q$ that solves $B_{n,\emptyset}$, and we show that $k = 2^{\Omega(n)}$.

The proof is based on Lemma 22. We build two sequences $(X_\iota)_{\iota \in I}$ and $(Y_\iota)_{\iota \in I}$ that are related as in the lemma. The indices are all pairs of non-empty subsets of $[n]$, the universe is all sets of 1 or 2 steps of $M$:$^{4}$

$I := \{(\alpha, \beta) \mid \emptyset \ne \alpha, \beta \subseteq [n]\}$, $\quad E := \{\{e', e\} \mid e', e \in Q^2\}$,

and the total order $<$ is the restriction on $I$ of some nice order on $\mathcal{P}([n])^2$. If we indeed construct these sequences, then the lemma says $|I| \le |E|$, or

$(2^n - 1)^2 \le k^2 + \binom{k^2}{2}$,

which implies $k = 2^{\Omega(n)}$ (the right-hand side is at most $k^4$). For the remainder, we fix $I$ and $E$ as here. Note that, from now on, some subscripts in our notation are redundant, and will be dropped: e.g., $B_{n,\emptyset}$ and $\mathrm{lview}_M(z)(u)$ will be referred to simply as $B_\emptyset$ and $\mathrm{lview}(z)(u)$.

Before moving on, let us also quickly prove a fact that will be useful later: In order to accept a dead string but reject a live one, $M$ must produce on the dead string a single-state view that “escapes” the corresponding view on the live string.

28. Lemma. Suppose $z'$ is live and $z$ is dead. Then at least one of the following two claims is true:
• $\mathrm{lview}(z')(p) \nsupseteq \mathrm{lview}(z)(p)$ for some $p \in Q$.
• $\mathrm{rview}(z')(p) \nsupseteq \mathrm{rview}(z)(p)$ for some $p \in Q$.

Proof. Towards a contradiction, suppose that neither claim is true, namely $\mathrm{lview}(z')(p) \supseteq \mathrm{lview}(z)(p)$ and $\mathrm{rview}(z')(p) \supseteq \mathrm{rview}(z)(p)$, for every state $p$. Pick any accepting computation $c$ of $M$ on $z$ and break it into its traversals $c_1, \ldots, c_m$, in the natural way:
• for all $j < m$, $c_j$ starts at some $p_j$ next to $\vdash$ and ends at some $q_j$ on $\dashv$, if $j$ is odd; and $c_j$ starts at some $p_j$ next to $\dashv$ and ends at some $q_j$ on $\vdash$, if $j$ is even;
• $p_1 = s$ and $p_{j+1}$ is in $\delta(q_j, \dashv)$ or $\delta(q_j, \vdash)$, depending on whether $j$ is odd or even, respectively;
• $m$ is even and the last fragment is not really a traversal, but simply $c_m = (f)$.
Then, for each $j < m$, we know that
• $q_j$ is in $\mathrm{lview}(z)(p_j)$ and thus also in $\mathrm{lview}(z')(p_j)$, if $j$ is odd,
• $q_j$ is in $\mathrm{rview}(z)(p_j)$ and thus also in $\mathrm{rview}(z')(p_j)$, if $j$ is even.
Hence, in both cases, some computation $c'_j$ of $M$ on $z'$ starts and ends identically to $c_j$. If we also set $c'_m := (f)$ and concatenate the computations $c'_1, \ldots, c'_m$, we end up with a computation $c'$ of $M$ on $z'$ which is also accepting. So, $M$ accepts the live string $z'$, a contradiction.

$^{4}$ A step of $M$ is any $e \in Q^2$. Also, note that $\{e', e\}$ is a singleton when $e' = e$.

3.4. Hard inputs and the two sequences. In this section, we will construct a set of inputs that collectively force $M$ to use exponentially many states. Similarly to what we did for moles, here we will again need to start with strings that are long and rich enough to strain the ability of $M$ to process their information, and then use those strings as building blocks for constructing the hard inputs. Once again, we call these strings generic, and we base their construction on the same general idea of [55].

3.4-I. Generic strings. Consider any $y \in \Sigma^*$ and the set of views produced via left computations on it (i.e., the range of $\mathrm{lview}(y)(\cdot)$):

$\mathrm{lviews}(y) := \{\mathrm{lview}(y)(u) \mid u \subseteq Q\}$.

How does this set change when we extend $y$ into a longer string $yz$? To answer this question, it is useful to consider the function $\mathrm{lmap}(y, z)$ that for every left view produced on $y$ returns its left view on $z$; namely, $\mathrm{lmap}(y, z)$ is the restriction of $\mathrm{lview}(z)(\cdot)$ to $\mathrm{lviews}(y)$.

It is easy to verify that all values of $\mathrm{lmap}(y, z)$ are inside $\mathrm{lviews}(yz)$. Indeed, consider any $u$ in the range of $\mathrm{lmap}(y, z)$. Then some $u'$ in the domain of $\mathrm{lmap}(y, z)$ is such that $\mathrm{lmap}(y, z)(u') = u$. Since this domain is $\mathrm{lviews}(y)$, some $u'' \subseteq Q$ is such that $\mathrm{lview}(y)(u'') = u'$.
Then,

$u = \mathrm{lmap}(y, z)(u') = \mathrm{lmap}(y, z)\big(\mathrm{lview}(y)(u'')\big) = \mathrm{lview}(z)\big(\mathrm{lview}(y)(u'')\big) = \mathrm{lview}(yz)(u'')$,

so that $u$ is indeed in $\mathrm{lviews}(yz)$. (Note that in the last equality we used the fact that $\mathrm{lview}(yz) = \mathrm{lview}(y) \circ \mathrm{lview}(z)$.)

Moreover, the values of $\mathrm{lmap}(y, z)$ cover $\mathrm{lviews}(yz)$. Indeed, consider any $u \in \mathrm{lviews}(yz)$. Then there exists $u'' \subseteq Q$ such that $\mathrm{lview}(yz)(u'') = u$. Letting $u' := \mathrm{lview}(y)(u'')$, we see that $u' \in \mathrm{lviews}(y)$. Therefore, $u'$ is in the domain of $\mathrm{lmap}(y, z)$. Moreover,

$\mathrm{lmap}(y, z)(u') = \mathrm{lview}(z)(u') = \mathrm{lview}(z)\big(\mathrm{lview}(y)(u'')\big) = \mathrm{lview}(yz)(u'') = u$,

so that $u$ is indeed in the range of $\mathrm{lmap}(y, z)$. (Again, in the last equality we used the fact that $\mathrm{lview}(yz) = \mathrm{lview}(y) \circ \mathrm{lview}(z)$.)
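The two inclusions just verified can also be observed on a small example. The sketch below assumes an snfa whose transition function is a dictionary from (state, symbol) pairs to sets of states, and it ignores end-markers since only left computations inside a string are involved; the toy automaton and all function names are hypothetical.

```python
from itertools import combinations


def subsets(states):
    """All subsets of a finite set of states, as frozensets."""
    items = sorted(states)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def lview(delta, z, u):
    """Left view of the state set u on string z: the states reachable from u
    by reading z left to right."""
    current = frozenset(u)
    for symbol in z:
        current = frozenset(q for p in current for q in delta.get((p, symbol), ()))
    return current


def lviews(delta, states, y):
    """The set of all left views produced on y, i.e. the range of lview(y)(.)."""
    return {lview(delta, y, u) for u in subsets(states)}


def lmap_image(delta, states, y, z):
    """Range of lmap(y, z): the left views on z of all left views produced on y."""
    return {lview(delta, z, v) for v in lviews(delta, states, y)}


# Toy snfa (assumed): three states, alphabet {a, b}.
STATES = {0, 1, 2}
DELTA = {
    (0, "a"): {0, 1}, (1, "a"): {2},
    (0, "b"): {0}, (1, "b"): {1}, (2, "b"): {1, 2},
}
```

On this automaton, `lmap_image(DELTA, STATES, y, z)` coincides with `lviews(DELTA, STATES, y + z)` for any strings `y` and `z`, so the number of views cannot grow when a string is extended to the right.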


Overall, $\mathrm{lmap}(y, z)$ is a surjection from $\mathrm{lviews}(y)$ to $\mathrm{lviews}(yz)$, which immediately implies that $|\mathrm{lviews}(y)| \ge |\mathrm{lviews}(yz)|$. The next fact encodes this conclusion, along with the obvious remark that $\mathrm{lmap}(y, z)$ is monotone. It also shows the symmetric facts, for left extensions and right views. The set $\mathrm{rviews}(y)$ consists of all views produced on $y$ via right computations, and $\mathrm{rmap}(z, y)$ is the restriction of $\mathrm{rview}(z)(\cdot)$ on $\mathrm{rviews}(y)$.

29. Lemma. For any two strings $y$ and $z$: $\mathrm{lmap}(y, z)$ is a monotone surjection of $\mathrm{lviews}(y)$ onto $\mathrm{lviews}(yz)$, so $|\mathrm{lviews}(y)| \ge |\mathrm{lviews}(yz)|$; similarly, in the other direction, $\mathrm{rmap}(z, y)$ is a monotone surjection of $\mathrm{rviews}(y)$ onto $\mathrm{rviews}(zy)$, so $|\mathrm{rviews}(y)| \ge |\mathrm{rviews}(zy)|$.

Now suppose $y$ belongs to an infinitely right-extensible property $P \subseteq \Sigma^*$. What happens to the size of $\mathrm{lviews}(y)$ if we keep extending $y$ into $yz, yzz', \ldots$ inside $P$? Although there are infinitely many extensions, the size of the set can decrease only finitely many times. So, at some point it must stop changing. When this happens, we have arrived at a very useful tool. We define it as follows (compare with Definition 9 and Lemma 10).

30. Definition. Let $P \subseteq \Sigma^*$. A string $y$ is l-generic over $P$ if $y \in P$ and: for all extensions $yz \in P$, $|\mathrm{lviews}(y)| = |\mathrm{lviews}(yz)|$. An r-generic string over $P$ is defined similarly, on left-extensions and $\mathrm{rviews}(\cdot)$. A string that is simultaneously l-generic and r-generic over $P$ is called generic.

31. Lemma. Let $P \subseteq \Sigma^*$. If $P$ is non-empty and infinitely extensible to the right (resp., left), then l-generic strings over $P$ (resp., r-generic strings over $P$) exist. In addition, if $y_l$ is l-generic and $y_r$ is r-generic, then every string $y_l x y_r \in P$ is generic.

Intuitively, from the perspective of $M$, a generic string is among the richest inputs that have property $P$, in the sense that it exhibits a greatest subset of the “features” that $M$ is “prepared to pay attention to”.
This makes generic strings useful in building hard inputs, as described in the Lemma 33 below and in Section 3.4-II. 32. Lemma. For any two strings y and z, lviews(yz) ⊆ lviews(z). Similarly, in the other direction, rviews(zy) ⊆ rviews(z). Proof. By Lemma 29, lviews(yz) is the range of lmap(y, z), which is a restriction of lview(z)(·). So, the first containment follows. The argument in the other direction is similar. 33. Lemma. Let y be generic over P ⊆ Σ ∗ . If yxy ∈ P , then • lmap(y, xy) is an automorphism on lviews(y), and • rmap(yx, y) is an automorphism on rviews(y).


Proof. Suppose yxy ∈ P. Then |lviews(y)| = |lviews(yxy)| (since y is generic) and lviews(yxy) ⊆ lviews(y) (by Lemma 32). Therefore, lviews(y) = lviews(yxy). By this and Lemma 29, we conclude that lmap(y, xy) surjects lviews(y) onto itself, which is possible only if it is injective. Since lmap(y, xy) is also monotone, Lemma 23 implies it is an automorphism. A similar argument applies to rmap(yx, y).

3.4-II. Constructing the hard inputs. Fix ι = (α, β) ∈ I and let Pι := B_{α×β} be the property of connecting exactly every leftmost node in α to every rightmost node in β. Easily, Pι is non-empty and infinitely extensible in both directions. Therefore, an l-generic string yl and an r-generic string yr exist (Lemma 31). Then, for η := [n]² the complete symbol, we easily see that yl η yr ∈ Pι, too. Hence, this string is generic over Pι (Lemma 31). We define yι := yl η yr. We also define the symbol xι := [n]² \ (β × α), the complement of β × α.

34. Lemma. For the sequences (yι)ι∈I, (xι)ι∈I and for all ι′, ι ∈ I:

    ι′ < ι ⟹ yι xι′ yι ∈ Pι    and    ι′ = ι ⟹ yι xι′ yι ∈ B∅.

Proof. Fix ι′ = (α′, β′) and ι = (α, β) and let z := yι xι′ yι. Note that the connectivities of yι and xι′ are respectively ξ := α × β and ξ′ := [n]² \ (β′ × α′).

[Figure: the string z = yι xι′ yι in the case ι′ < ι (left), with a path through b∗ and a∗, and in the case ι′ = ι (right).]

If ι′ < ι (on the left), then α′ ⊉ α or β′ ⊉ β (since < is nice). Suppose β′ ⊉ β (if α′ ⊉ α, use a similar argument) and fix any b∗ ∈ β \ β′ and any a∗ ∈ α. For any a, b ∈ [n], consider the a-th leftmost and b-th rightmost nodes of z. If a ∉ α or b ∉ β, then the two nodes do not connect in z, since neither can “see through” yι. If a ∈ α and b ∈ β, then (a, b∗) ∈ ξ and (b∗, a∗) ∈ ξ′ and (a∗, b) ∈ ξ, so the two nodes connect via a path of the form a ⇝ b∗ → a∗ ⇝ b. Overall, z ∈ Pι.

If ι′ = ι (on the right), then ξ′ = [n]² \ (β × α). Suppose z ∉ B∅. Then some path in z connects the leftmost to the rightmost column. Suppose it is of the form a ⇝ b∗ → a∗ ⇝ b. Then b∗ ∈ β and (b∗, a∗) ∈ ξ′ and a∗ ∈ α. That is, (b∗, a∗) is both inside and outside β × α, a contradiction.
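The two cases of Lemma 34 depend only on connectivities, so they can be checked mechanically for small n. The following is a minimal sketch under stated assumptions, not part of the proof: it assumes that I consists of pairs of non-empty subsets of [n], that the connectivity of a concatenation is the relational composition of the connectivities of its parts, and that xι′ is the complement of β′ × α′. Instead of a particular nice order, it tests the condition "α′ ⊉ α or β′ ⊉ β" directly. All function names are hypothetical.

```python
from itertools import combinations


def nonempty_subsets(n):
    """All non-empty subsets of {1, ..., n}, as frozensets."""
    items = range(1, n + 1)
    return [frozenset(c) for r in range(1, n + 1) for c in combinations(items, r)]


def compose(r, s):
    """Relational composition: pairs (a, c) with (a, b) in r and (b, c) in s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


def product(left, right):
    """The relation left x right, as a set of pairs."""
    return {(a, b) for a in left for b in right}


def connectivity_of_yxy(n, iota_prime, iota):
    """Connectivity of y_iota x_iota' y_iota, computed from connectivities alone."""
    (alpha_p, beta_p), (alpha, beta) = iota_prime, iota
    full = product(range(1, n + 1), range(1, n + 1))
    xi = product(alpha, beta)                    # connectivity of y_iota
    xi_prime = full - product(beta_p, alpha_p)   # the symbol x_iota' (assumed complement)
    return compose(compose(xi, xi_prime), xi)


def check_lemma_34(n):
    """Check both implications of Lemma 34 over all pairs of indices."""
    subsets = nonempty_subsets(n)
    indices = [(a, b) for a in subsets for b in subsets]
    for iota_prime in indices:
        for iota in indices:
            (alpha_p, beta_p), (alpha, beta) = iota_prime, iota
            conn = connectivity_of_yxy(n, iota_prime, iota)
            if iota_prime == iota:
                if conn:                         # the string must be dead
                    return False
            elif not alpha_p >= alpha or not beta_p >= beta:
                if conn != product(alpha, beta):  # the string must be in P_iota
                    return False
    return True
```

For instance, `check_lemma_34(3)` enumerates all 49 × 49 pairs of indices over [3]. Pairs ι′ ≠ ι with α′ ⊇ α and β′ ⊇ β are skipped, since the lemma makes no claim about them.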

3.4-III. Constructing the two sequences. Suppose ι′ < ι. Since the extension yι xι′ yι of yι preserves Pι (Lemma 34), each of lmap(yι, xι′ yι) and rmap(yι xι′, yι) is an automorphism (Lemma 33). Put another way, the interaction between the steps of M on xι′ and its two behaviors on yι is such that these two mappings are automorphisms. Put formally, both
• the restriction of (Eι′ ◦ lview(yι))(·) on lviews(yι) and
• the restriction of (Eι′ ◦ rview(yι))(·) on rviews(yι)
are automorphisms, where Eι′ := {(p, q) | δ(p, xι′) ∋ q} = lview(xι′) = rview(xι′) (cf. Lemma 25).

What if ι′ = ι? What is then the status of the mappings lmap(yι, xι yι) and rmap(yι xι, yι)? We can show that, since yι xι yι is dead (Lemma 34), they cannot both be automorphisms.⁵ However, something stronger is also true: we can even convince ourselves that one of the functions is not an automorphism by pointing at only 1 or 2 of the steps of M on xι. The next figure shows three examples of this. In each, we sketch the left behavior of M on yι and all single-state views.

[Figure: Examples (i), (ii), and (iii). Each panel shows yι xι yι, with the steps of M on xι (e, e′) and the views on yι (u, u′, v, v′) that the example refers to.]

Example i shows only 1 of the steps of M on xι, say e = (p, q); many more may be included in Eι. Is lmap(yι, xι yι) an automorphism? Normally, we would need to know the entire Eι to answer this question. Yet, in this case e is enough to answer no. To see why, note that the view v of q on yι has height 2, while one of the views that contain p is u, of height 1. Irrespective of the rest of Eι, lmap(yι, xι yι) will map u to a view that contains v and thus has height 2 or more. So, it does not respect heights, which implies it is not an automorphism.

Example ii shows 2 of the steps in Eι, say e′ = (p′, q′) and e = (p, q). Is lmap(yι, xι yι) an automorphism? Observe that neither step alone can force a negative answer: the view v′ of q′ on yι has height 1, as does the lowest view u′ containing p′; similarly for e, u, v, and height 2. So, individually each of e′ and e may very well participate in sets of steps that induce automorphisms. Yet, they cannot belong to the same such set. To see why, suppose they do. Since u′ ⊆ u, the image of u would be v′ ∪ v or a superset. Since v′ ⊈ v, the height of that image would be greater than the height of v, and thus greater than the height of u, so heights would not be respected.

Example iii also shows 2 of the steps in Eι, say e′ = (p′, q′) and e = (p, q), neither of which can alone disqualify lmap(yι, xι yι) from being an automorphism. Yet, together they can. To see why, suppose both steps participate in the same automorphism. Then the image of u′ must be exactly v′: otherwise, it would be some strict superset of v′, of height 2 or more, disrespecting the height of u′. On the other hand, u must map to a set that contains v, and thus also v′ ⊆ v. Hence, v′ must be the exact image of some u∗ ⊆ u. But then both u∗ and u′ map to v′, when u∗ ≠ u′ (since u′ ⊈ u), a contradiction to the map being injective.

⁵ If they were, they would be bijections (because each of lviews(yι) and rviews(yι) has a maximum). Hence, M would not be able to distinguish between the live yι and the dead yι(xιyι)^t, for t any exponent that turns both bijections into identities. (Note that this is true even for the n-state snfa that solves liveness. Therefore, this observation alone can give rise to no interesting lower bound for k.)

In short, each step in Eι severely restricts the form of lmap(yι, xι yι) and rmap(yι xι, yι). And, either individually or in pairs, some steps can be so restrictive that they cannot be part of any set of steps that induces an automorphism in both directions. This motivates the following.

35. Definition. A set of steps E ⊆ Q² is compatible with yι if there exists a set Ê with E ⊆ Ê ⊆ Q² such that the following are both automorphisms:
• the restriction of (Ê ◦ lview(yι))(·) on lviews(yι), and
• the restriction of (Ê ◦ rview(yι))(·) on rviews(yι).

E.g., {e} in Example i and {e′, e} in Examples ii and iii are incompatible with yι. We are now ready to define the sequences promised in Section 3.3. For each ι ∈ I, we let Xι consist of all sets of 1 or 2 steps of M on xι, and Yι consist of all sets of 1 or 2 steps of M that are incompatible with yι:

    Xι := {E ∈ ℰ | E ⊆ Eι},    Yι := {E ∈ ℰ | E is incompatible with yι}.

We need, of course, to show that the sequences relate as in Lemma 22. The case ι′ < ι is easy. Every E ∈ Xι′ can be extended to the set of all steps of M on xι′ (namely to Ê := Eι′), which does induce automorphisms (by Lemmata 33 and 34), so Xι′ ∩ Yι = ∅. The case ι′ = ι is harder. We analyze it in the next section.

3.5. The main argument. Suppose ι′ = ι. We will exhibit a singleton or two-set E ⊆ Eι that is incompatible with yι. First, some preparation.

3.5-I. The witness. Consider the strings yι(xιyι)^t = (yιxι)^t yι, for all t ≥ 1. Since yι xι yι is dead, the same is true of all these strings. Since yι is live, Lemma 28 says that for all t ≥ 1:
• lview(yι)(p) ⊉ lview(yι(xιyι)^t)(p) for some p ∈ Q, or
• rview(yι)(p) ⊉ rview((yιxι)^t yι)(p) for some p ∈ Q.
Namely, in order to accept the extensions yι(xιyι)^t = (yιxι)^t yι but reject the original yι, M must exhibit on each of them a single-state view that ‘escapes’ its counterpart on the original. In a sense, among all 2k single-state views on each extension, the escaping one is a ‘witness’ for the fact that the extension is accepted, and Lemma 28 says that every extension has a witness. Of course, this allows for the possibility that different extensions may have different witnesses. However, we can actually find the same witness for all extensions:

36. Lemma. At least one of the following is true:
• lview(yι)(p) ⊉ lview(yι(xιyι)^t)(p) for some p ∈ Q and all t ≥ 1.
• rview(yι)(p) ⊉ rview((yιxι)^t yι)(p) for some p ∈ Q and all t ≥ 1.


Proof. Suppose neither is true. Then each of the 2k single-state views has an extension on which it fails to escape from its counterpart on yι. That is, for every p there exists a t_{p,l} ≥ 1 such that

    lview(yι)(p) ⊇ lview(yι(xιyι)^{t_{p,l}})(p)

as well as a t_{p,r} ≥ 1 such that

    rview(yι)(p) ⊇ rview((yιxι)^{t_{p,r}} yι)(p).

Consider then the extension z := yι(xιyι)^{t∗} = (yιxι)^{t∗} yι, where

    t∗ := ∏_{p∈Q} t_{p,l} · ∏_{p∈Q} t_{p,r}.

Clearly, every p has a t ≥ 1 such that z = yι((xιyι)^{t_{p,l}})^t, and thus Lemma 26 implies that lview(yι)(p) ⊇ lview(z)(p). By the same argument, we also conclude that rview(yι)(p) ⊇ rview(z)(p). Overall, all single-state views on z fall within their counterparts on yι, contradicting Lemma 28.

We fix p to be a witness as in Lemma 36. We assume p is of the first type, involving left views (otherwise, a symmetric argument applies). Moreover, among all witnesses of this type, we select p so as to minimize the height of lview(yι)(p) in lviews(yι). For convenience, in the rest of the argument we let

    v0 := lview(yι)(p)    and    V := lviews(yι),  h := h_V.

Note that, by the selection of p, no p̃ with lview(yι)(p̃) ⊊ v0 can be a witness of the first type. Hence, every such p̃ has a t̃ ≥ 1 satisfying

    lview(yι)(p̃) ⊇ lview(yι(xιyι)^{t̃})(p̃).

We fix t∗ to be the product of all such t̃. Then the following lemma holds.

37. Lemma. If p̃ is such that lview(yι)(p̃) ⊊ v0, then for all λ ≥ 1:

    lview(yι)(p̃) ⊇ lview(yι(xιyι)^{λt∗})(p̃).

Proof. Fix any p̃ as in the statement and consider the t̃ for which lview(yι)(p̃) ⊇ lview(yι(xιyι)^{t̃})(p̃). For any λ ≥ 1, we know λt∗ is a multiple of t̃, and therefore Lemma 26 applies.

3.5-II. Escape computations. For all t ≥ 1, collect into a set Ct all computations c ∈ lcomp_p(yι(xιyι)^t) that exit into some state q ∉ v0. We will be calling these computations ‘the escape computations for p on the t-th extension’. We also define C := ⋃_{t≥1} Ct. Let us see what an escape computation looks like. (See Figure 21a on page 96.) Pick any c ∈ C, say on the t-th extension, exiting into q. Let e1, e2, ..., et be the steps of c on xι, where ej = (pj, qj) ∈ Eι. These are the critical steps along c. Let vj := lview(yι)(qj) be the view of the right end-point


of ej. Along with v0, these views form the list v0, v1, ..., vt of the major views along c. Clearly, each of them contains the left end-point of the next critical step: vj−1 ∋ pj (similarly, vt ∋ q). So, for each ej there exist views u ∈ V that contain its left end-point and are contained in the preceding major view: vj−1 ⊇ u ∋ pj (similarly, vt ⊇ u ∋ q). Among them, let uj−1 be of minimum height in V (select ut similarly). Then u0, ..., ut−1, ut are the minor views along c. This concludes our description.

3.5-III. The incompatible set. We are now ready to find the incompatible set E that we are looking for. We will find its one or two steps among the critical steps of escape computations. We distinguish two cases.

Case 1: Some c ∈ C contains some critical step e such that the singleton {e} is incompatible with yι. Then we can select E := {e}, and we are done.

Case 2: For all c ∈ C and all critical steps e in c, the singleton {e} is compatible with yι. In this case, we will find an incompatible two-set.

Steepness. First of all, every c ∈ C (say with t, ej, vj, uj as above) has every major view at least as high as the next minor one (h(vj) ≥ h(uj), since vj ⊇ uj) and every minor view at least as high as the next major one (h(uj) ≥ h(vj+1), or {ej+1} would be incompatible, as in Example i). Hence, every c ∈ C has views of monotonically decreasing height: h(v0) ≥ h(u0) ≥ h(v1) ≥ h(u1) ≥ ··· ≥ h(vt) ≥ h(ut). To capture the “rate” of this decrease, we record the list of minor view heights Hc := (h(uj))_{0≤j≤t}, and order each Ct lexicographically:

    c′ ≤ c  iff  Hc′ ≤lex Hc.

With respect to this total order, “smaller” computation means “steeper”.

Long and steepest computation. We now fix t to be a multiple of t∗ which is at least |V|, and we select c to be one of the steepest computations in Ct. We let q, ej, vj, uj be as usual. Since t ≥ |V|, the list u0, ..., ut contains repetitions. Let j′ < j be the indices for the earliest one. Then uj′ = uj, so h(uj′) = h(uj), and thus all views in between have the same height: h(uj′) = h(vj′+1) = ··· = h(vj) = h(uj). As a result, each major view equals the next minor one:

    vj′+1 = uj′+1,  ...,  vj = uj.

The rest of the argument depends on whether the earliest repetition occurs at the very beginning or not. So, we distinguish two cases.
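The two selections made above, a steepest computation and the earliest repetition among its minor views, are simple enough to sketch in code. This is only an illustration under assumptions that are not in the text: a computation is represented as a hypothetical dictionary carrying its list of minor-view heights, views are frozensets of states, and "earliest repetition" is read as the repetition whose second index j is smallest.

```python
def steepest(computations):
    """Pick a computation whose list of minor-view heights is lexicographically least.

    Each computation is a dict with a "minor_heights" list (hypothetical layout).
    Python compares tuples lexicographically, which matches the order on H_c.
    """
    return min(computations, key=lambda c: tuple(c["minor_heights"]))


def earliest_repetition(minor_views):
    """Return (j_prime, j) with j_prime < j and minor_views[j_prime] == minor_views[j],
    where j is as small as possible; return None if all views are distinct."""
    first_seen = {}
    for j, view in enumerate(minor_views):
        if view in first_seen:
            return first_seen[view], j
        first_seen[view] = j
    return None
```

By the pigeonhole principle, `earliest_repetition` cannot return `None` on a list of t + 1 views drawn from V once t ≥ |V|. Under this reading of "earliest", the views at positions j′ − 1 and j − 1 must differ whenever j′ ≠ 0, since otherwise they would form a repetition with a smaller second index.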

Figure 21. (a) An escape computation c ∈ C5, exiting into q. (b) An example of Case 2A, for j = 3 and l = 2; in dashes, the new computation c′ ∈ Cj. (c) An example of Case 2B, for j′ = 2 and j = 4; in dashes, the hypothetical case uj′−1 ⊇ uj−1 and c′.


Case 2A: j′ = 0. Then h(u0) = h(v1) = ··· = h(vj) = h(uj), and therefore v1 = u1, ..., vj = uj. In fact, we also have h(v0) = h(u0), and thus v0 = u0. To see why, suppose h(v0) ≠ h(u0). Then v0 ⊋ u0. Since u0 ∈ V, some state p̃ has lview(yι)(p̃) = u0 (Figure 21a), and therefore Lemma 37 applies to it (since u0 ⊊ v0). In particular,

    lview(yι)(p̃) ⊇ lview(yι(xιyι)^t)(p̃)

(since t is a multiple of t∗). On the other hand, u0 contains the left end-point of e1, so the part of c after e1 shows that q ∈ lview(yι(xιyι)^t)(p̃), and thus q ∈ lview(yι)(p̃) = u0. Since u0 ⊆ v0, this means that c is not an escape computation, a contradiction. So, if the earliest repetition occurs at the very beginning, we know

    h(v0) = h(u0) = ··· = h(vj) = h(uj)    and    v0 = u0, ..., vj = uj

(see Figure 21b). Now, by the selection of p, its view on the j-th extension escapes v0. Pick any c′ ∈ Cj, with exit state q′ ∉ v0, critical steps e′1, ..., e′j, and major views v′0, ..., v′j. Then v′0 = v0 (since both c′ and c start at p) and q′ ∈ v′j \ vj (since vj = uj = u0 = v0 and q′ ∉ v0). So, the respective major views start with inclusion v′0 ⊆ v0 but end with non-inclusion v′j ⊈ vj. So there is 1 ≤ l ≤ j such that v′l−1 ⊆ vl−1 but v′l ⊈ vl.

We are now ready to prove that {e′l, el} is incompatible with yι. The argument is as in Example ii. Suppose the two steps participate in a set inducing an automorphism ζ. Since v′l−1 ⊆ vl−1, both e′l and el have their left end-points in vl−1. Hence, ζ(vl−1) ⊇ v′l ∪ vl. Since v′l ⊈ vl, the height of ζ(vl−1) is greater than that of vl. But h(vl−1) = h(vl). Therefore h(ζ(vl−1)) > h(vl−1), a contradiction.
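The existence of the index l in Case 2A is a discrete "first flip" argument: a sequence of comparisons that starts with inclusion and ends with non-inclusion must switch at some position. A minimal sketch, assuming the two lists of major views are given as Python sets of states (the function name and the representation are illustrative only):

```python
def first_flip(primed, unprimed):
    """Given equally long lists of sets with primed[0] <= unprimed[0] (inclusion)
    and primed[-1] not included in unprimed[-1], return the least l >= 1 such that
    primed[l-1] <= unprimed[l-1] but primed[l] is not included in unprimed[l]."""
    for l in range(1, len(primed)):
        if primed[l - 1] <= unprimed[l - 1] and not primed[l] <= unprimed[l]:
            return l
    raise ValueError("the lists do not start with inclusion and end with non-inclusion")
```

The loop always succeeds under the stated precondition: if no such l existed, inclusion at position 0 would propagate to every later position, including the last one.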

Case 2B: j′ ≠ 0. Then we can talk of the minor views uj′−1 and uj−1 that precede the first repetition. Of course, uj′−1 ≠ uj−1. In fact, uj′−1 ⊉ uj−1. To see why, suppose uj′−1 ⊇ uj−1 (Figure 21c). Then uj′−1 ⊋ uj−1 (since uj′−1 ≠ uj−1) and thus h(uj′−1) > h(uj−1). Moreover, ej has its left end-point in vj′−1 (since vj′−1 ⊇ uj′−1 ⊇ uj−1) while its right end-point has view uj′ (since vj = uj = uj′). Hence, by replacing ej′ with ej, we get a new computation c′ that is also in Ct. In addition, Hc′ differs from Hc only in that h(uj′−1) is replaced by h(uj−1). But then c′ is strictly steeper than c, a contradiction.

We are now ready to prove that {ej′, ej} is incompatible with yι. The argument is as in Example iii. Suppose the two steps participate in a set inducing an automorphism ζ. Because of ej, ζ(uj−1) ⊇ uj; but h(uj−1) = h(uj) and ζ respects heights, so in fact ζ(uj−1) = uj. Because of ej′, ζ(uj′−1) ⊇ uj′ = uj; so there exists u∗ ⊆ uj′−1 such that ζ(u∗) = uj. Overall, u∗ ≠ uj−1 (since exactly one of them is contained in uj′−1) and ζ(u∗) = ζ(uj−1). Hence ζ is not injective, a contradiction.

This concludes the analysis of the case ι′ = ι and thus the overall proof.


3.6. 2DFAs versus SNFAs. As already mentioned, an easy modification of the proof of Theorem 27 allows us to also establish the following.

38. Theorem. The trade-off from 2dfas to snfas is exponential.

In other words, there exists a problem that can be solved by small 2dfas but cannot be solved by small snfas. This problem is simply an appropriate restriction of liveness and the small 2dfa solving it is actually single-pass. To describe this restriction, let us use Σ′n to denote the subset of Σn containing only the 2^n ‘parallel’ symbols of the form {(a, a) | a ∈ α} for α ⊆ [n]. For example, the leftmost symbol in Figure 14a (page 65) is in Σ′5, for α = {2, 3, 4, 5} ⊆ [5]. Let us also recall the complete symbol η = [n]² from Section 3.4-II. The restriction of liveness that we have in mind is the problem B′n where all inputs are promised to follow the pattern:

    Σ′n (η Σ′n Σn Σ′n)∗ η Σ′n.

In other words, according to this promise, every input z starts and ends with a parallel symbol and the rest of it consists of one or more copies of the complete symbol separated by 3-symbol snippets of the form Σ′n Σn Σ′n, where the outer symbols have to be parallel and the middle one can be anything. For example, here is an input that obeys this promise:

[Figure: an example input obeying the promise.]

Notice that every such string is live iff its first and last symbols are non-empty and its snippets are all live. Intuitively, the copies of the complete symbol ‘reset’ liveness every four symbols. This last observation immediately suggests a small 2dfa algorithm for solving liveness under this promise: just check that the first and last symbols are non-empty and that every snippet is live. More carefully:

    We read the first symbol. If it is empty, we hang. Otherwise, we start scanning the input from left to right. Every time that we read a copy of η, we use a depth-first search to check whether the string of the next 3 symbols is live. If it is not, we hang. If it is, we move to the next copy of η. If there is only 1 next symbol, we check whether it is empty or not. If it is, we hang. Otherwise, we accept.
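The snippet-by-snippet check can be compared against a direct liveness test on small inputs. The sketch below is a plain program, not a 2dfa, and makes several assumptions: symbols are sets of pairs (a, b) of node indices, liveness is computed by composing connectivities, "hang" is modelled by returning False, and inputs are trusted to obey the promise. The random input generator is a hypothetical helper added only for experimentation.

```python
import random


def compose(r, s):
    """Relational composition: pairs (a, c) with (a, b) in r and (b, c) in s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


def is_live(symbols):
    """Direct check: some path crosses the whole string from left to right."""
    conn = set(symbols[0])
    for sym in symbols[1:]:
        conn = compose(conn, sym)
    return bool(conn)


def is_live_under_promise(z):
    """Check the first symbol, every 3-symbol snippet, and the last symbol.

    Assumes z follows the promised pattern; it is not validated here.
    """
    if not z[0]:
        return False              # empty first symbol: "hang"
    i = 1                         # z[i] is a copy of the complete symbol
    while True:
        rest = z[i + 1:]
        if len(rest) == 1:        # only the last symbol remains
            return bool(rest[0])
        if not is_live(rest[:3]): # dead snippet: "hang"
            return False
        i += 4                    # move to the next copy of the complete symbol


def parallel(alpha):
    """The parallel symbol {(a, a) | a in alpha}."""
    return frozenset((a, a) for a in alpha)


def random_promise_input(n, snippets, rng):
    """A random input over [n] with the given number of snippets (hypothetical helper)."""
    nodes = range(1, n + 1)
    eta = frozenset((a, b) for a in nodes for b in nodes)

    def subset():
        return [a for a in nodes if rng.random() < 0.6]

    def any_symbol():
        return frozenset(p for p in eta if rng.random() < 0.3)

    z = [parallel(subset())]
    for _ in range(snippets):
        z += [eta, parallel(subset()), any_symbol(), parallel(subset())]
    z += [eta, parallel(subset())]
    return z
```

On inputs produced by `random_promise_input`, the two checks should always agree, because each copy of the complete symbol connects every node on its left to every node on its right.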

Easily, this can be implemented on a zdfa with only O(n²) states.⁶

On the other hand, even under this promise, every snfa solving liveness still needs 2^Ω(n) states. This is true simply because the promise does not invalidate our argument for the general case, as the hard inputs constructed in Section 3.4-II can all be drawn so as to obey the promise. More specifically, we can replace property Pι with the property P′ι ⊆ Pι which contains only the strings of Pι that obey the promise. Easily, P′ι is still non-empty and infinitely extensible in both directions. So, we can again find an l-generic string y′l and an r-generic string y′r over P′ι and construct y′ι := y′l η y′r, which is clearly in P′ι and is thus generic over P′ι. Then, it is trivial to verify that the sequence (y′ι)ι∈I is related to (xι)ι∈I exactly as (yι)ι∈I is in Lemma 34. The rest of the proof remains the same.

⁶ In fact, replacing the depth-first search on the 3-symbol snippets with a cleverer search, we can reduce the size of this zdfa to only O(n² / log n) states, which, by the way, is asymptotically optimal for this restriction of liveness. However, the algorithm is too complicated to describe here and is not necessary for Theorem 38, anyway.

4. Conclusion

In the first part of this chapter, we focused on a natural class of restricted but still fully bidirectional 2nfa algorithms for liveness, which includes the small 1nfa solvers. We asked whether small 2dfas of that kind can succeed and proved that they cannot, no matter how large they are. It is certainly good to know that graph exploration alone can never be a sufficient strategy. However, as already mentioned in the Introduction, in the context of the full conjecture the emphasis above stresses an alarming mismatch: a complexity question received a computability answer. This suggests that the reasons why deterministic moles fail against liveness are only loosely related to the reasons why small 2dfas fail, if they really do. In order for this approach to ultimately be of any use against the full conjecture, we need restricted versions of fully bidirectional 2dfas that are both weak enough to succumb to our arguments and strong enough to keep us in complexity: large 2dfas of this kind should be able to solve liveness.

In the second part of the chapter, we proved that snfas must be exponentially large to solve the complement of liveness, and hence small snfas are not closed under complement. With an easy modification, our proof also showed that zdfas can be exponentially more succinct than snfas. An interesting next question concerns the exact value of our lower bound. The smallest known snfa for Bn,∅ is the obvious 2^n-state 1dfa. Is this really the best snfa algorithm? If so, then nondeterminism and sweeping bidirectionality together are completely useless in this context.

A preliminary version of the contents of Section 2 can be found in [27], whereas Section 3 contains the material of [30].

CHAPTER 3

Non-Recursive Trade-Offs

In Chapters 1 and 2 we compared the relative succinctness of several pairs of types of machines. In each case, the two types had the same computational power, in the sense that they could solve the same class of problems: the regular languages. Moreover, the associated trade-off was easily seen to be bounded from above by some recursive function. In contrast, the comparisons that we will consider in this chapter are more general. We will discuss conversions between types of machines of different computational power and, most often, the associated trade-offs will be growing faster than any computable function.

One of the earliest studies of the relative succinctness of types of machines of different power was conducted by Stearns [56], as part of a proof that we can algorithmically check whether the language of a deterministic pushdown automaton (1dpa) is regular or not. Stearns showed that, although not every 1dpa can be converted into an equivalent 1dfa, whenever such equivalent automata exist, the smallest among them are at most triple-exponentially larger than the 1dpa itself. This naturally led to the corresponding question for one-way nondeterministic pushdown automata (1npas): whenever a 1npa has equivalent 1dfas, what is an upper bound for the size of the smallest among them? The answer, by Meyer and Fischer [35], was qualitatively new: they showed that every such bound grows (as a function of the size of the 1npa) faster than any computable function. Hence, among the cases where it is possible to turn a 1npa into a 1dfa, the trade-off in the description size is in general non-recursive.¹ Several refinements of this result followed [59, 50].

¹ As already noted in the introduction, this name can be misleading. Our intention here is to characterize the trade-offs that admit no recursive upper bounds. Clearly, every such trade-off is non-recursive. However, it is conceivable that there exist non-recursive trade-offs that admit recursive upper bounds. (It is easy to present natural functions of this kind. However, we do not know of one that is also the trade-off of some natural conversion.) So, strictly speaking, the class of non-recursive trade-offs is a superclass of the class of trade-offs that do not admit recursive upper bounds and, if the two classes are in fact equal, then an argument is needed to support this. With this clarification, we will move on using the popular choice. Note that, under this choice, a “recursive trade-off” is one that admits recursive upper bounds, and the “(non-)recursiveness of a trade-off” refers to the (non-)existence of a recursive upper bound for it.


In an important development, Hartmanis [16] later explained that the recursiveness of the trade-off from a type of machines a to a not-as-powerful type of machines b typically implies the recognizability (semi-decidability) of the corresponding inadequacy problem: “given a machine of type a, check that it has no equivalent machines of type b (i.e., that the machines of type b are inadequate to describe the language of the given machine)”. This greatly simplified the proofs of [35, 59, 50], while it nicely revealed the connections of the entire discussion to Gödel’s theorem that the addition of an extra axiom to a formal system typically results in non-recursively shorter proofs for some of its theorems [17].

Today several refinements of the above results are known and non-recursive trade-offs have emerged in many other comparisons between different types of machines. Comprehensive surveys can be found in [15, 32].

1. Two-Way Multi-Pointer Machines

In a remark in [19], Hartmanis and Baker showed that a non-recursive trade-off can occur even when an optimal algorithm replaces a near-optimal one.² For example, converting an n^{2+ε}-space deterministic Turing machine (dtm) into one that uses only n²-space involves a non-recursive blowup in the size of description. In the pattern of [16], they derived this observation from the unrecognizability of the inadequacy problem from near-optimal to optimal machines (from n^{2+ε}-space to n²-space dtms), which in turn was shown to be a consequence of the fact that the near-optimal complexity class is strictly larger than the optimal one (some n^{2+ε}-space dtms have no n²-space equivalent).

In this chapter we will refine that argument. We will prove a general theorem that directly shows the non-recursiveness of the trade-off in many conversions between machines of different power. In loose terms, our theorem states the following:

    If two types of machines a and b are such that
    1. some machine of type a has no equivalent machine of type b, and
    2. every unary two-way deterministic finite automaton with access to a linearly-bounded counter can be simulated by some machine of type a,
    then the trade-off from type a to type b is non-recursive.

For example, in order to arrive at the previous remark on space, we can argue that, since n² = o(n^{2+ε}), we know there exist n^{2+ε}-space dtms with no n²-space equivalent and therefore condition 1 is indeed true; that condition 2 is also true follows from the easy observation that any Ω(lg n) amount of space suffices for the simulation of a linearly-bounded counter.

² The reader is referred to [19, 17, 18] for a quite interesting discussion of the implications that this might have to our search for optimal algorithms.


The most characteristic applications of our theorem concern the successive levels of hierarchies of two-way multi-pointer automata, where by ‘pointer’ we mean any of the following accessories (in order of nondecreasing power): a linearly-bounded counter; a blind read-only head, namely a head that cannot distinguish between different input symbols (but can distinguish between input symbols and end-markers); an ordinary read-only head; a sensing read-only head, namely one that can sense which of the other heads are at the same cell as itself; or a pebble. For example, we can establish the non-recursiveness of the following trade-offs (in each case, the reference indicates where condition 1 of the theorem has been established; for condition 2, it is always easy to see that it is also satisfied):
• from k + 1 to k counters, on linearly-bounded two-way deterministic counter automata (unary or not) and for all k [42],
• from k + 1 to k heads, on two-way multi-head finite automata (deterministic or not, unary or not) and for all k [40, 41, 42],
• from k + 1 to k heads, on two-way multi-head pushdown automata (deterministic or nondeterministic) and for all k [23].

Sometimes, we can only be as refined as the hierarchy is known to be:
• from k + 2 to k registers, on linearly-bounded register machines (deterministic or nondeterministic) and for all k [42],
• from k + 2 to k counters, on linearly-bounded two-way nondeterministic counter automata (unary or not) and for all k [42].

Similarly, the trade-off is non-recursive:
• from 3 to 2 heads, on a simple two-way deterministic finite automaton [9] (a multi-head automaton is simple if every input head after the first one is blind). It remains non-recursive even when we start from a 2-head two-way deterministic finite automaton, or from a 1-head two-way deterministic pushdown automaton [9].

Finally, we can also conclude the non-recursiveness of the trade-off:
• from k + 1 to k work-tape symbols, on Turing machines (deterministic or not) that use at most lg n work-tape cells on every input of length n, and for all k ≥ 2 [52]. It remains non-recursive even when we start from a Turing machine with a unary input alphabet, but then only for all sufficiently large k [52].

Several other conversions between machines of distinct computational power can be treated in a similar way.

Returning to the statement of the theorem above, we warn that it is, in fact, incomplete. Additional conditions have to be met, concerning a and b, their descriptions, and how ‘size’ is measured. Still, in most interesting cases these conditions are trivially satisfied (in the above examples they are), so that listing them in this introduction would be a distraction. The complete list is contained in the formal statement of the theorem in Section 3.


The next section describes the formal framework of this study in more detail. Section 3 states and proves the theorem, except for an important lemma, which is proved in Section 5, after some preparation in Section 4. We warn that the discussion in this chapter is going to be much more abstract than in Chapters 1 and 2, so as to ensure that the conclusions cover as many conversions as possible—including cases where a or b denote types of language descriptors other than machines (e.g., regular expressions, grammars). A more concrete discussion can be found in [26], where the theorem is proved specifically for the conversion from k + 1 to k heads on two-way multi-head finite automata. 2. Preliminaries We write N for the set of positive integers, and lga n for ⌊loga n⌋. As usual, given any problem Π = (Πyes , Πno ) over some alphabet Σ and any dtm M , we say that M recognizes Π if M accepts all w ∈ Πyes and rejects (possibly by looping) all w ∈ Πno . If some dtm recognizes Π, we say Π is (Turing-) recognizable. If Π ′ is also a problem over Σ, we write Π ≤ Π ′ and say that Π reduces to Π ′ iff there is a dtm that, on input w ∈ Πyes ∪ Πno , eventually halts with an output w′ such that ′ w ∈ Πyes =⇒ w′ ∈ Πyes

and

′ w ∈ Πno =⇒ w′ ∈ Πno .

Clearly, if some unrecognizable problem $\Pi$ reduces to a problem $\Pi'$, then $\Pi'$ must also be unrecognizable. In the case where $\Pi$ is a language (namely, $\Pi_{\mathrm{yes}} + \Pi_{\mathrm{no}} = \Sigma^*$) and $\Pi_{\mathrm{yes}}$ contains exactly all sufficiently long strings, for some interpretation $0 \le l \le \infty$ of ‘sufficiently long’,
\[ \Pi_{\mathrm{yes}} = \{ w \in \Sigma^* \mid |w| \ge l \}, \]
we say $\Pi$ obeys a threshold; clearly then, $\Pi_{\mathrm{yes}}$ is empty iff this threshold is infinite. A machine that solves $\Pi$ is also said to obey the same threshold.

2.1. Descriptional systems. A descriptional system over the alphabets $\Gamma$ and $\Sigma$ is any set $D \subseteq \Gamma^*$ of names (or descriptors), along with two total functions $(\cdot)_D$ and $|\cdot|_D$, mapping every name $d \in D$ to its language $(d)_D \subseteq \Sigma^*$ and its size $|d|_D \in \mathbb{N}$, respectively. For example, suppose that we fix a binary encoding of all 1dfas over, say, the input alphabet $\{a, b, c\}$. This induces the descriptional system over $\{0, 1\}$ and $\{a, b, c\}$ that contains all encoding strings as names and maps each of them to the language accepted by the corresponding 1dfa (as its language) and to the number of states in that 1dfa (as its size). Alternatively, the size of every name could just be its length (i.e., the number of binary symbols in it).

A system $D$ is decidable if the membership problem for its names is decidable. That is, if there is a dtm $U_D$ that halts always and is such that
\[ \text{for all } d \in D \text{ and } w \in \Sigma^*: \quad U_D(d, w) \text{ accepts} \iff w \in (d)_D. \]
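As a small illustration of these definitions, the following Python sketch models a decidable descriptional system of 1dfas. It is only a sketch under stated assumptions: instead of binary strings, a name is taken to be a Python tuple, and the identifiers `Name`, `size`, `universal` and `even_a` are hypothetical, chosen for this example only.

```python
from typing import Dict, FrozenSet, Tuple

# Assumption: a "name" is a tuple (number of states, transition table, accepting states),
# with state 0 as the start state. It stands in for the encoding string d.
Name = Tuple[int, Dict[Tuple[int, str], int], FrozenSet[int]]


def size(d: Name) -> int:
    """|d|_D: here, the number of states of the encoded 1dfa."""
    return d[0]


def universal(d: Name, w: str) -> bool:
    """U_D(d, w): always halts, and accepts iff w is in the language (d)_D."""
    _, delta, accepting = d
    state = 0
    for symbol in w:
        state = delta[(state, symbol)]
    return state in accepting


# Example name: a 2-state 1dfa over {a, b, c} accepting the strings
# that contain an even number of a's.
even_a: Name = (
    2,
    {(q, s): (1 - q if s == "a" else q) for q in (0, 1) for s in "abc"},
    frozenset({0}),
)
```

Because `universal` reads each input symbol exactly once, it halts on every pair (d, w), which is what decidability of the system asks for.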


Thus, the system of the previous example is clearly decidable, whereas a system containing binary encodings of dtms would be undecidable.

In order to be able to compare two descriptional systems $D$ and $E$ in terms of their relative succinctness, we require that they are comparable, in the sense that [i] they are defined over the same alphabets, and that [ii] their $(\cdot)$ and $|\cdot|$ mappings agree on all common names,³
\[ \text{for all } z \in D \cap E: \quad (z)_D = (z)_E \quad\text{and}\quad |z|_D = |z|_E, \]
so that subscripts can be dropped: for all $z \in D \cup E$, $(z)$ and $|z|$ are unambiguous. For such systems, the comparison of $E$ against $D$ involves two natural notions:

i. For a name $e \in E$, there may or may not exist a name in $D$ that maps to the same language. In the latter case, we say that $D$ is inadequate for describing the language of $e$ and, accordingly, we call the associated computational problem, “given an $e \in E$, check that no $d \in D$ maps to $(e)$”, the inadequacy problem from $E$ to $D$. Formally, this is the promise problem $I = (I_{\mathrm{yes}}, I_{\mathrm{no}})$, with:
\[ I_{\mathrm{yes}} := \{ e \in E \mid (d) \ne (e) \text{ for all } d \in D \}, \qquad I_{\mathrm{no}} := \{ e \in E \mid (d) = (e) \text{ for some } d \in D \}. \]
Notice that $e$ is promised to be in $E$, so that solving $I$ does not require checking membership in $E$ (which might be hard, even impossible).

ii. When a name $e \in E$ does have equivalent names in $D$ (i.e., names mapping to $(e)$), we naturally ask how much larger than $e$ the smallest of these $D$-equivalents is. As usual, we answer this question with a function $f : \mathbb{N} \to \mathbb{N}$ that upper bounds this increase in size in terms of the size of $e$. Namely, $f$ is such that for all $s \in \mathbb{N}$ and all $e \in E$ of size $s$: if $D$ contains names that are equivalent to $e$, then at least one of them is of size at most $f(s)$. We say that $f$ is an upper bound for the trade-off (for the conversion) from $E$ to $D$.⁴ When a computable such upper bound exists, we say that the trade-off from $E$ to $D$ is recursive.⁵

As first noted in [16], discussions i and ii above are not unrelated: the unrecognizability of the inadequacy problem typically implies the non-recursiveness of the trade-off, as explained in the following lemma.

³ The agreement for $|\cdot|$ is not necessary but we do require it, for simplicity.
⁴ Note that this defines directly the notion of an upper bound for the trade-off from $E$ to $D$. A more natural approach would be to first define the notion of the trade-off from $E$ to $D$ (in the sense of Chapters 1 and 2), and only then say what an upper bound for it is. However, that would be redundant, as our goal in this chapter is to show the non-recursiveness of the upper bounds.
⁵ See the discussion in Footnote 1 (page 101) on what “recursive trade-off” means.


1. Lemma (Hartmanis). Let $D$ and $E$ be two comparable descriptional systems over alphabets $\Gamma$ and $\Sigma$, satisfying the following conditions:
H1. both $D$ and $E$ are decidable,
H2. for every $e \in E$, we can effectively compute its size $|e|$, and
H3. there exists a halting dtm that, on input any $s \in \mathbb{N}$, outputs a list of names $Z \subseteq \Gamma^*$ such that
  i. the non-$D$ names can be recognized in $Z$: the problem $(Z \cap \overline{D}, Z \cap D)$ is recognizable,
  ii. the languages of the $D$-names in $Z$ cover all and only those languages over $\Sigma$ that are supported by names in $D$ of size $\le s$: $\{ (z) \mid z \in Z \cap D \} = \{ (d) \mid d \in D \ \&\ |d| \le s \}$.
If the trade-off from $E$ to $D$ is recursive, then the inadequacy problem from $E$ to $D$ is recognizable.

Before the proof, let us remark how mild conditions H1–H3 are. For most interesting cases, the first two of them are trivially true and H3 is satisfied via the dtm that simply lists all names in $D$ that have size $\le s$ (so that the problem of H3 i is trivially decidable and the sets of H3 ii trivially identical). Stating H3 in this more complicated form simply covers some special cases (e.g., comparing general to unambiguous context-free grammars [17, Example 2]).

Proof. Suppose $D$, $E$ are as in the statement and $f$ is a computable upper bound for the trade-off from $E$ to $D$. To check that a given $e \in E$ has no $D$-equivalents, we first compute $s := f(|e|)$ (by H2 and since $f$ is computable). We then run the dtm guaranteed by H3 on $s$, to produce a (finite, since the dtm is halting) list of names $Z := \{z_1, z_2, \ldots, z_k\}$. At that moment, we know (by the selection of $f$ and H3 ii) that we should accept iff every $D$-name in $Z$ maps to a language different from $(e)$. Equivalently, we should accept iff:

for every $z \in Z$, either $z$ is not a $D$-name or $z$ is a $D$-name and $(z)$, $(e)$ differ at one or more $w \in \Sigma^*$.

In order to check this, we start simulating, in two parallel threads:
i. the recognizer given by H3 i on each of $z_1, z_2, \ldots, z_k$ in parallel, and
ii. for all $w \in \Sigma^*$: the machines $U_E$ and $U_D$ (guaranteed by H1) respectively on $(e, w)$ and on each of $(z_1, w), (z_2, w), \ldots, (z_k, w)$.
Whenever a $z \in Z$ is accepted in thread i, we cross it off the list. Whenever a $z \in Z$ is found to disagree with $e$ on some $w$ in thread ii, it is crossed off the list, as well. Finally, if the list ever gets empty, we accept.

Clearly, every string in $Z$ that is not a $D$-name will eventually be crossed off, in thread i. Similarly, each $D$-name that is inequivalent to $e$ will also be eventually removed, in thread ii. Moreover, neither thread can delete a $D$-name that is equivalent to $e$. Hence, the list will eventually get empty iff $e$ had no $D$-equivalent in the original list $Z$, which is true iff $e$ has no $D$-equivalent at all.
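The two-thread procedure of the proof can be sketched in Python as follows. This is a toy rendering under several assumptions that are not in the text: names are Python tuples, the machines of H1 and the recognizer of H3 i are passed in as callables, the recognizer is modelled as a generator that yields once per simulated step, and a `max_rounds` budget replaces the unbounded search (so the function returns `None` where the real recognizer would simply keep running). All identifiers are hypothetical.

```python
from itertools import count, product
from typing import Callable, Hashable, Iterator, List, Optional


def all_strings(alphabet: str) -> Iterator[str]:
    """Enumerate Sigma* in order of increasing length."""
    for length in count(0):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)


def recognize_inadequacy(
    e: Hashable,
    candidates: List[Hashable],                         # the list Z, computed for s = f(|e|)
    non_d_steps: Callable[[Hashable], Iterator[bool]],  # step-wise recognizer of H3 i
    u_e: Callable[[Hashable, str], bool],               # U_E of H1
    u_d: Callable[[Hashable, str], bool],               # U_D of H1
    alphabet: str,
    max_rounds: int,
) -> Optional[bool]:
    """Return True once every candidate is crossed off, None if the budget runs out."""
    remaining = list(candidates)
    thread_i = {z: non_d_steps(z) for z in remaining}
    words = all_strings(alphabet)
    for _ in range(max_rounds):
        if not remaining:
            return True
        # Thread i: one more step of the non-D recognizer on every surviving name.
        remaining = [z for z in remaining if not next(thread_i[z], False)]
        # Thread ii: compare e with every surviving name on the next string w.
        w = next(words)
        verdict = u_e(e, w)
        remaining = [z for z in remaining if u_d(z, w) == verdict]
    return True if not remaining else None


# Toy systems over the unary alphabet {a}: the D-name ("ge", k) describes
# {w : |w| >= k}; E also contains ("even",) for the even-length strings.
def u_d(z, w):
    return z[0] == "ge" and len(w) >= z[1]


def u_e(e, w):
    return len(w) % 2 == 0 if e[0] == "even" else u_d(e, w)


def non_d_steps(z):
    while True:
        yield z[0] != "ge"  # accepts at once on non-D names, never on D-names


Z = [("ge", 0), ("ge", 1), ("ge", 2), ("junk",)]
no_equivalent = recognize_inadequacy(("even",), Z, non_d_steps, u_e, u_d, "a", 50)
has_equivalent = recognize_inadequacy(("ge", 2), Z, non_d_steps, u_e, u_d, "a", 50)
```

In the first call every candidate is eventually crossed off, so the sketch accepts; in the second call the name `("ge", 2)` never disagrees with `e`, so the list never empties.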


2.2. Multi-counter automata. Our main theorem will need to make use of the natural notion of a unary two-way deterministic finite automaton that has additional access to a number of counters. Such models are of course known and well studied, but mainly for non-unary alphabets; see, for example, the two-way multi-counter machines of [11, 42]. Since we will only be interested in the unary case, it is possible to simplify the model in helpful ways. Most notably, we can avoid the notion of input tape, and assume instead that the input is the upper bound for one of the counters.⁶ The simplified definition follows.

A deterministic automaton with $k$ counters (dca$_k$) consists of a finite state control and $k$ counters, each of which can store a nonnegative integer. One of the counters is distinguished as primary, the rest being referred to as secondary. The input to the automaton is a nonnegative upper bound $n$ for the primary counter. The machine starts at a designated start state with all its counters set to 0. At each step, based on its current state, the automaton decides which counter it should act upon and whether it should decrease it or increase it. Then the action is attempted. An attempt to decrease fails iff the counter already contains 0; an attempt to increase fails iff the counter is the primary one and it already contains $n$; an attempt to increase a secondary counter never fails. A failed attempt leaves the counter contents intact; a successful attempt updates the counter contents accordingly. Based on its current state and on whether the attempt succeeded or not, the automaton selects a new state and moves to it. The input is accepted if the machine ever enters a designated final state. The language of the machine is exactly the set of inputs that it accepts. If, for all $n$, the behavior of the automaton is such that no secondary counter ever grows larger than $n$, we say that the automaton is (linearly) bounded.
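The definition above translates almost line by line into a small interpreter. The Python sketch below is one possible rendering, with assumptions of its own: the program is a dictionary from states to instructions, counter 0 is taken to be the primary one, a state without an instruction is treated as halting without accepting, and a step budget stands in for possible non-termination. The names `Program`, `run_dca` and `at_least_two` are hypothetical.

```python
from typing import Dict, Hashable, Optional, Set, Tuple

# state -> (counter index, +1 or -1, next state on success, next state on failure)
Program = Dict[Hashable, Tuple[int, int, Hashable, Hashable]]


def run_dca(program: Program, k: int, n: int, start: Hashable,
            final: Set[Hashable], max_steps: int = 10_000) -> Optional[bool]:
    """Run a k-counter automaton on input n; counter 0 is the primary one."""
    counters = [0] * k
    state = start
    for _ in range(max_steps):
        if state in final:
            return True
        if state not in program:  # assumption: no instruction means halt and reject
            return False
        index, delta, on_success, on_failure = program[state]
        if delta < 0:
            succeeded = counters[index] > 0                   # decrease fails only at 0
        else:
            succeeded = index != 0 or counters[index] < n     # only the primary is capped
        if succeeded:
            counters[index] += delta
        state = on_success if succeeded else on_failure
    return None  # no verdict within the step budget


# A dca_1 that accepts exactly the inputs n >= 2, so it obeys the threshold 2.
at_least_two: Program = {
    "q0": (0, +1, "q1", "dead"),
    "q1": (0, +1, "acc", "dead"),
}
```

For instance, `run_dca(at_least_two, 1, 1, "q0", {"acc"})` fails its second increment and ends in the instruction-less state `"dead"`, whereas any input of 2 or more reaches `"acc"`.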
We will be interested in a special version of the emptiness problem for multi-counter automata. One way to introduce this problem is to start with the emptiness problem for dtms (“given a description of a dtm, check that the language of the machine is empty”), which is well known to be unrecognizable [21], and to consider certain ways of ‘simplifying’ it:
• What happens if, instead of a full-fledged dtm, the machine we are given is ‘simpler’? Say, a multi-counter automaton? Or just a dca$_2$? Clearly, checking emptiness becomes ‘simpler’, too. Does it also become recognizable?
• What if the given dca$_2$ is also promised to be bounded? And terminating, too? And to also obey a threshold? As the promise gets stronger, checking emptiness again becomes ‘simpler’. But does it become recognizable?

⁶ Note a difference from [26], where the upper bound is applied to all counters.

So, the problem that we want to define is the following: “given a description of a dca$_2$ that is promised to be bounded and terminating and to obey a threshold, check that its language is empty.” Formally, $E = (E_{\mathrm{yes}}, E_{\mathrm{no}})$ with
\[ E_{\mathrm{yes}} := \{ z \in \langle \mathrm{dca}_2^* \rangle \mid (z) = \emptyset \}, \qquad E_{\mathrm{no}} := \{ z \in \langle \mathrm{dca}_2^* \rangle \mid (z) \ne \emptyset \}. \]
Here, we use $\langle \mathrm{dca}_2^* \rangle$ to denote the set of descriptions (under a fixed encoding) of all terminating, bounded dca$_2$s that obey a threshold, whereas $(z)$ stands for the language of the machine described by $z$. Interestingly, although not surprisingly, even for such a weak automaton and under such a strong promise, emptiness remains unrecognizable:⁷

2. Lemma. $E$ is unrecognizable.

We use this fact in the next section, but defer proving it until Section 5. In between, Section 4 discusses the capabilities of multi-counter automata.

3. The Main Theorem

We are now ready to state and prove the main theorem.

3. Theorem. Let $D$, $E$ be two comparable descriptional systems that satisfy conditions H1–H3 of Lemma 1. If they also satisfy the following:
C1. there exists a name $e_0 \in E$ that has no equivalent in $D$,
C2. given a description $z$ of a terminating, bounded dca$_2$ that obeys a threshold, we can effectively construct a name $e_z \in E$ such that
\[ (e_z) = (e_0) \cup \{ w \in \Sigma^* \mid |w| \in (z) \}, \tag{9} \]
C3. every co-finite language has a name in $D$ that maps to it,
then the trade-off from $E$ to $D$ is non-recursive.

Before proving the theorem, note how mild conditions C1–C3 really are. Since every co-finite language is regular, C3 is trivially satisfied whenever the names in $D$ describe machines that have some kind of finite control. The second condition essentially says that the machines described by $E$ have enough resources to simulate a bounded dca$_2$. Because then, given $z$, we can always construct the description $e_z$ of the following $E$-machine:

on input $w \in \Sigma^*$: first simulate on $|w|$ the dca$_2$ described by $z$; if this accepts, then halt and accept; otherwise, simulate on $w$ the machine described by $e_0$ and accept, reject, or loop accordingly.

which obviously satisfies (9) (note the importance of the promise that $z$ describes a dca$_2$ that never loops). Given how weak bounded dca$_2$s are, most two-way machines with non-regular capabilities will easily meet C2.

⁷ Clearly $E \in \Pi_1$, and the proof of Lemma 2 will show that $E$ is $\Pi_1$-complete. Recall that, under no promise and after non-trivially modifying the definition of dca$_2$s, $E$ is the emptiness problem for 2-register machines, which is well-known to be $\Pi_1$-complete [38].
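The E-machine described in the remark after Theorem 3 (first simulate the dca2 on |w|, then fall back to the machine of e0) can be sketched as a higher-order function. This is only an illustration: modelling machines as Python predicates is an assumption of the example, the helper `make_ez` and the two toy predicates are hypothetical, and the toy `e0` is not claimed to lack equivalents in any particular system.

```python
from typing import Callable


def make_ez(accepts_z: Callable[[int], bool],
            accepts_e0: Callable[[str], bool]) -> Callable[[str], bool]:
    """Build a predicate for (e_z) = (e_0) united with {w : |w| in (z)}, as in (9)."""
    def ez(w: str) -> bool:
        # First phase: run the counter automaton on |w|. By the promise on z
        # this simulation terminates, so the second phase is always reached.
        if accepts_z(len(w)):
            return True
        # Second phase: behave exactly like the machine described by e0.
        return accepts_e0(w)
    return ez


# Toy stand-ins: z obeys the threshold 3, and e0 accepts the even-length strings.
threshold_three = lambda n: n >= 3
even_length = lambda w: len(w) % 2 == 0

ez = make_ez(threshold_three, even_length)
ez_empty = make_ez(lambda n: False, even_length)  # (z) empty, hence (e_z) = (e_0)
```

With a non-empty `(z)` the resulting language contains every string of length at least 3, so it is co-finite; with an empty `(z)` it coincides with the language of `e0`.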


The important condition is C1, which requires that the machines described by $D$ are not as powerful as those described by $E$; in other words, a separation is needed between the complexity classes that correspond to the two systems.

Proof. We essentially repeat Hartmanis’ argument from [17, Example 4] (see also [31, Theorem 7]). Suppose $D$, $E$ are as in the statement of the theorem. Since H1–H3 are satisfied, Lemma 1 implies that we only need to prove that the inadequacy problem $I$ from $E$ to $D$ is unrecognizable. By Lemma 2, we just need to reduce the emptiness problem $E$ of Section 2.2 to it: $E \le I$.

Given a $z \in \langle \mathrm{dca}_2^* \rangle$, we simply construct the name $e_z \in E$ guaranteed by conditions C1 and C2, so that $(e_z) = (e_0) \cup \{ w \in \Sigma^* \mid |w| \in (z) \}$. If $z \in E_{\mathrm{yes}}$, then the language of $z$ is empty, so that $(e_z) = (e_0)$ and $e_z$ has no $D$-equivalent (because $e_0$ does not); hence $e_z \in I_{\mathrm{yes}}$. On the other hand, if $z \in E_{\mathrm{no}}$, then the language of $z$ contains all sufficiently large $w \in \Sigma^*$, so that $(e_z)$ is co-finite and has $D$-equivalents (by C3); hence $e_z \in I_{\mathrm{no}}$. This concludes the proof.

As a side remark, we note that the proof has shown a slightly stronger fact: problem $I$ remains unrecognizable even under the promise that the given $e \in E$ either has no $D$-equivalent or its language is co-finite. In addition, the promise that the given dca$_2$ obeys a threshold can be slightly relaxed: we only need to know that its language is either empty or co-finite.

4. Programming Counters

In order to present the capabilities of multi-counter automata, we introduce some ‘program’ notation. First, the two atomic operations, the attempt to decrease a counter $X$ and the attempt to increase it, are denoted respectively by
\[ X \xleftarrow{f} X - 1 \quad\text{and}\quad X \xleftarrow{f} X + 1, \]
where, in each case, flag $f$ is set to true iff the attempt succeeds. Then, the compound operation of setting $X$ to 0, denoted by $X \leftarrow 0$, can be described by
\[ \text{repeat } X \xleftarrow{f} X - 1 \text{ until } \neg f. \tag{10} \]

If a second counter $Y$ is present, we can transfer the contents of $Y$ into $X$: we set $X$ to 0, then repeatedly decrease $Y$ and increase $X$ until $Y$ is 0. We denote this by
\[ (X, Y) \xleftarrow{f} (Y, 0), \]
and describe it by a line similar to (10). Note that, if $X$ is the primary counter and $Y > n$, then one of the attempts to increase $X$ will fail; in that case, we restore the original value of $Y$ returning $X$ to 0, and set flag $f$ to false. So, $X$’s original contents are always lost, but this never happens to the original contents of $Y$.

Changing how fast $X$ increases as $Y$ decreases, we can multiply/divide $Y$ into $X$ by any constant $a \in \mathbb{N}$. We denote these operations by
\[ (X, Y) \xleftarrow{f} (aY, 0) \quad\text{and}\quad (X, Y) \xleftarrow{f,r} (\lfloor Y/a \rfloor, 0), \]
where, in the second operation, we also find the remainder and return it in $r$. As before, if $X$ is the primary counter and $aY > n$ (respectively, $\lfloor Y/a \rfloor > n$) then one of the attempts to increase $X$ will fail; we then restore the original value of $Y$ returning $X$ to 0, and set flag $f$ to false.

At a higher level, we can try to multiply $Y$ by a constant $a$ (into $Y$) using $X$ as an auxiliary counter and making sure $Y$ changes only if the operation succeeds:
\[ (X, Y) \xleftarrow{f} (aY, 0); \quad \text{if } f \text{ then } (Y, X) \xleftarrow{t} (X, 0). \]
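The operations introduced so far can be sketched in Python on top of the two atomic attempts. The sketch makes some assumptions beyond the text: counters are addressed by name in a small `Counters` class, and the restoration after a failed increment is spelled out in one particular way (undo the partial batch, then give the units back to Y one batch at a time). The only loop variables used are bounded by the constant a, so they fit in a finite control. All identifiers are hypothetical.

```python
class Counters:
    """Counters with the two atomic attempts; only the primary one is capped at n."""

    def __init__(self, n: int, names: str, primary: str) -> None:
        self.n = n
        self.primary = primary
        self.value = {name: 0 for name in names}

    def dec(self, x: str) -> bool:
        if self.value[x] == 0:
            return False
        self.value[x] -= 1
        return True

    def inc(self, x: str) -> bool:
        if x == self.primary and self.value[x] == self.n:
            return False
        self.value[x] += 1
        return True


def zero(c: Counters, x: str) -> None:
    """X <- 0, as in (10)."""
    while c.dec(x):
        pass


def multiply_into(c: Counters, x: str, y: str, a: int) -> bool:
    """(X, Y) <- (aY, 0); with a = 1 this is the plain transfer (X, Y) <- (Y, 0)."""
    zero(c, x)
    while c.dec(y):
        for done in range(a):       # 'done' never exceeds the constant a
            if not c.inc(x):
                for _ in range(done):
                    c.dec(x)        # undo the partial batch
                c.inc(y)            # give back the unit just taken from Y
                while c.dec(x):     # return the complete batches to Y
                    for _ in range(a - 1):
                        c.dec(x)
                    c.inc(y)
                return False
    return True


def multiply_in_place(c: Counters, y: str, x: str, a: int) -> bool:
    """Y <- aY using X as auxiliary; Y changes only if the operation succeeds."""
    if not multiply_into(c, x, y, a):
        return False
    multiply_into(c, y, x, 1)
    return True


c = Counters(n=10, names="XY", primary="X")
for _ in range(3):
    c.inc("Y")
first = multiply_in_place(c, "Y", "X", 3)    # 3 * 3 = 9 fits under n = 10
second = multiply_in_place(c, "Y", "X", 3)   # 3 * 9 = 27 does not
```

After the second, failed attempt the counters are back to X = 0 and Y = 9, matching the remark that the original contents of Y are never lost.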

Note the use of $t$ in the place of a flag, indicating that the action is guaranteed to be successful. Division (with remainder) can be performed in a similar manner. We denote the two operations by
\[ Y \xleftarrow{f,X} aY \quad\text{and}\quad Y \xleftarrow{f,r,X} \lfloor Y/a \rfloor. \tag{11} \]
Now, if $X$ is primary, we can set $Y$ to the largest power of $a$ that fits in $n$:
\[ X \leftarrow 0;\ X \xleftarrow{f} X + 1;\ \text{if } f \text{ then } \{ Y \leftarrow 0;\ Y \xleftarrow{t} Y + 1;\ \text{repeat } Y \xleftarrow{g,X} aY \text{ until } \neg g \}, \tag{12} \]
an operation that fails iff $n = 0$. We denote it by
\[ Y \xleftarrow{f,X} a^{\lg_a n} \]
where, as already mentioned, $\lg_a n := \lfloor \log_a n \rfloor$. If a third counter $Z$ is present, we can modify (12) to also count (in $Z$) the number of iterations performed. This gives us a way to calculate $\lg_a n$:
\[ Z \xleftarrow{f,X,Y} \lg_a n, \]
an operation that fails iff $n = 0$.

In another variation, we can modify the multiplication in (11) so that the success of the operation depends on the contents of $X$ (as opposed to its upper bound $n$):
\[ Y \xleftarrow{f,X,Z} aY, \]
meaning that, using $Z$ as auxiliary and without affecting $X$: if $aY \le X$, then $Y$ is set to $aY$; otherwise, $Y$ is unaffected. Specifically, to implement this, we first set $Z$ to 0. Then, we repeatedly decrease $Y$, increase $Z$, and decrease $X$ by $a$. If $X$ becomes 0 before $Y$, then $aY > X$ and the operation should fail: we restore the original values of $Y$ and $X$ by repeatedly decreasing $Z$, increasing $Y$, and increasing $X$ by $a$, until $Z$ becomes 0. Otherwise, $aY \le X$ and the operation will succeed: we copy the correct value to $Y$ and restore the value of $X$ by repeatedly decreasing $Z$ and increasing each of $Y$, $X$ by $a$, until $Z$ becomes 0. Note that if originally $Y, Z \le X$, then at no point during the operation does any of the counters assume a value greater than the original value of $X$.

Using this last operation, we can program the following variant of (12):
\[ X \xleftarrow{f} X - 1;\ \text{if } f \text{ then } \{ X \xleftarrow{t} X + 1;\ Y \leftarrow 0;\ Y \xleftarrow{t} Y + 1;\ \text{repeat } Y \xleftarrow{g,X,Z} aY \text{ until } \neg g \} \]
which implements the attempt to set $Y$ to the largest power of $a$ that is at most $X$, using $Z$ as auxiliary and leaving $X$ unaffected (and failing iff $X$ is 0). We denote this operation by
\[ Y \xleftarrow{f,Z} a^{\lg_a X}. \]
It is important to note that, by the remark at the end of the previous paragraph, if originally $Y, Z \le X$, then during this operation no counter ever assumes a value greater than the original value of $X$.

Hopefully, the reader is convinced of the quite significant capabilities of dca$_k$s that have 2 or more counters. We will be using these capabilities in the next section.

5. Proof of the Main Lemma

We now prove that the emptiness problem $E$ of Section 2.2 is unrecognizable. We do this by presenting a reduction from the complement of the halting problem: $\overline{\mathrm{HALTING}} \le E$, where
\[ \overline{\mathrm{HALTING}} := \{ z \in \{0, 1\}^* \mid z \text{ encodes a dtm that loops on } z \} \]
is well-known to be unrecognizable [21]. That is, we present an algorithm that, given a description $z$ of a dtm $M$, returns a description $z'$ of a terminating, bounded, threshold-obeying dca$_2$ $M'$ such that
\[ M \text{ loops on } z \implies (z') = \emptyset \quad\text{and}\quad M \text{ halts on } z \implies (z') \ne \emptyset. \tag{13} \]

In describing this algorithm, we will be calling a machine (dtm or dca$_k$) good, if it is terminating, bounded (for dca$_k$s), obeys a threshold, and its language satisfies (13) when it replaces $(z')$. Thus, e.g., $M'$ will be good.

On its way to $z'$, the algorithm will construct descriptions of two other machines: a description $z_A$ of a dtm $A$, and a description $z_B$ of a dca$_3$ $B$. In the sequence $M$, $A$, $B$, $M'$ each machine after $M$ will be defined in terms of the previous one and will be good. Our constructions use the ideas of [61] and [38] (also found in [39, 21]).

5.1. The first machine. $A$ is a dtm with one tape, infinite in both directions; the tape alphabet is $\{\sqcup, 0, 1, \dot{0}, \dot{1}\}$, while the input alphabet is $\{0\}$. On input $0^n$, $A$ starts with tape contents
\[ \cdots \sqcup \sqcup \sqcup \underbrace{0\,0\,0 \cdots 0\,0}_{n \text{ times}} \sqcup \sqcup \sqcup \cdots \]
and its input head on the $\sqcup$ next to the leftmost 0 (or on any $\sqcup$, if $n = 0$). It then computes as follows:
1. For all $w \in \{0, 1\}^n$, from $0^n$ up to $1^n$: if $w$ encodes a halting computation history of $M$ on $z$, accept.
2. Reject.
The check inside the loop presupposes some fixed reasonable encoding of sequences of configurations of $M$ into binary strings, with the additional property that if $w$ encodes a computation history, then every string of the form $w0^*$ encodes the same computation history.

Note that, using the extra dotted symbols, $A$ can easily perform this check without ever writing a non-blank symbol on a $\sqcup$, or a $\sqcup$ on a non-blank symbol; and without ever visiting any $\sqcup$ that lies beyond the two that originally delimit the input. As a consequence, throughout its computation on $0^n$, $A$ keeps exactly $n$ non-blank symbols on its tape, occupying the same $n$ cells as the symbols of the input. Also note that, by the selection of the encoding scheme for $M$’s computation histories, if $A$ accepts an input $0^n$, it necessarily accepts all longer inputs, as well.

5.2. The second machine. $B$ is a dca$_3$ that, on input $n \ge 30$, simulates the behavior of $A$ on input $0^{\lg_5 \lg_{30} n}$; on input $n < 30$, $B$ just rejects. Note the strange length $\lg_5 \lg_{30} n$. This is chosen as a function of $n$ that is (i) computable by a dca$_3$ and (ii) increasing, but also (iii) small enough. Goodness of $B$ is based on (ii), whereas (iii) facilitates the simulation performed by $M'$ in the next section.

To explain $B$’s behavior, let $J$, $L$, $R$ be its three counters. $J$ is primary and helps performing operations on $L$ and $R$, while $L$ and $R$ together encode tape configurations of $A$. To see the encoding, consider the following example of a configuration:
\[
\begin{array}{cccccccccccccc}
\cdots & l_4 & l_3 & l_2 & l_1 & l_0 & h & r_0 & r_1 & r_2 & r_3 & r_4 & r_5 & \cdots \\
\cdots & \sqcup & \sqcup & \times & \times & \times & \times & \times & \times & \times & \times & \sqcup & \sqcup & \cdots \\
 & & & & & & \uparrow & & & & & & &
\end{array}
\]

where $\times$ stands for any non-blank symbol, and $\uparrow$ shows the head position. Mapping the symbols $\sqcup$, $\dot{1}$, $\dot{0}$, $1$ and $0$ to the numbers 0, 1, 2, 3 and 4, respectively (in fact, any mapping that maps symbol $\sqcup$ to code 0 and symbol 0 to code 4 will do), we get each tape cell map to a digit of the 5-ary numbering system. Then, the head position splits the tape into three portions, which define the integers
\[ l = \sum_{i=0}^{\infty} l_i \cdot 5^i \quad\text{and}\quad h \quad\text{and}\quad r = \sum_{i=0}^{\infty} r_i \cdot 5^i, \]
where the two sums are finite, exactly because $\sqcup$ maps to 0. Of these three values, $l$ and $r$ are kept respectively in $L$ and $R$, while $h$ is kept in a register $H$ in $B$’s finite memory.

More specifically, on input $n$, $B$ starts with a two-part initialization. First, it computes $\lg_{30} n$ into $J$, leaving 0s in $L$ and $R$ (this is only if $n \ge 1$; if $n = 0$, then $B$ rejects):
\[ R \xleftarrow{f,J,L} \lg_{30} n;\ \text{if } \neg f \text{ then reject else } \{ (J, R) \xleftarrow{t} (R, 0);\ L \leftarrow 0 \}. \]
Then, it computes into $R$ the value $5^m - 1$, where $m := \lg_5 \lg_{30} n$, leaving 0s in $L$ and $H$ (this is only if $J \ge 1$, that is if $n \ge 30$; otherwise, $n < 30$ and $B$ rejects):
\[ R \xleftarrow{f,L} 5^{\lg_5 J};\ \text{if } \neg f \text{ then reject else } \{ R \xleftarrow{t} R - 1;\ L \leftarrow 0;\ H \leftarrow 0 \}. \]
This completes the initialization, with $L = H = 0$ and $R = 5^m - 1$. Hence, the 5-ary representations of the three counters are:
\[ L = 0 \quad\text{and}\quad H = 0 \quad\text{and}\quad R = \underbrace{4\,4\,4 \cdots 4}_{m \text{ times}}, \]
which means that they correctly represent $A$’s starting tape configuration on input $0^m$ (recall that symbol 0 maps to code 4):
\[
\begin{array}{ccccccccccc}
\cdots & l_1 & l_0 & h & r_0 & r_1 & r_2 & \cdots & r_{m-1} & r_m & r_{m+1} \ \cdots \\
\cdots & \sqcup & \sqcup & \sqcup & 0 & 0 & 0 & \cdots & 0 & \sqcup & \sqcup \ \cdots \\
 & & & \uparrow & & & & & & &
\end{array}
\]

At this point, $B$ is ready to start a faithful step-by-step simulation of $A$. The automaton remembers in its finite memory the current state of $A$ as well as the code of the currently read symbol (in $H$). If $s$ is the code of the new symbol to be written on the tape, $B$ computes
\[ L \xleftarrow{t,J} 5L;\ \text{repeat } s \text{ times: } L \xleftarrow{t} L + 1;\ R \xleftarrow{t,r_0,J} \lfloor R/5 \rfloor;\ H \leftarrow r_0 \]
to simulate writing $s$ and moving to the right; similarly, it computes
\[ R \xleftarrow{t,J} 5R;\ \text{repeat } s \text{ times: } R \xleftarrow{t} R + 1;\ L \xleftarrow{t,l_0,J} \lfloor L/5 \rfloor;\ H \leftarrow l_0 \]
to simulate writing $s$ and moving to the left.

It is important to note the range of the values assumed by the counters. By the design of its main operation, the second part of the initialization phase never assigns to a counter a value greater than the original value of $J$, which is $\lg_{30} n$. Then, in the simulation phase, the behavior of $A$ (the tape starts with $m$ 0s and always contains exactly $m$ non-blank symbols) and the selection of the symbol codes (0 gets the largest code) are such that the initial value $5^m - 1$ of $R$ upper bounds all possible values that may appear in $B$’s counters. One consequence of this is that all operations in the previous paragraph are guaranteed to be successful (hence the $t$ reminder). Another consequence is that, since $5^m - 1 < \lg_{30} n$, the entire computation of $B$ after the first part of its initialization phase keeps all values of all counters at or below $\lg_{30} n$. This will prove crucial in the next section.

5.3. The final machine. $M'$ is a dca$_2$ that simulates the behavior of $B$. If $U$, $V$ are its two counters, then $U$ is primary and helps performing operations on $V$, while $V$ encodes the contents of the counters of $B$: whenever $J$, $L$, $R$ contain $j$, $l$, $r$ respectively, $V$ contains $2^j 3^l 5^r$.

The automaton starts by computing into $V$ the product $30^t = 2^t 3^t 5^t$, where $t := \lg_{30} n$ (this is only if $n \ge 1$; if $n = 0$, then $M'$ rejects, exactly as $B$ would do):
\[ V \xleftarrow{f,U} 30^{\lg_{30} n};\ \text{if } \neg f \text{ then reject}. \]
It then removes all 3s and 5s of this product, so that $V$ becomes $2^{\lg_{30} n} 3^0 5^0$. Specifically, in order to remove all 3s, $M'$ divides $V$ by 3 repeatedly:
\[ V \xleftarrow{t,r,U} \lfloor V/3 \rfloor, \]
until a non-zero remainder $r$ is returned, which implies there were no 3s in $V$ before the last division. Then the correction
\[ V \xleftarrow{t,U} 3V;\ \text{repeat } r \text{ times: } V \xleftarrow{t} V + 1 \]

J ←− J − 1, M ′ divides V by 2. If this division returns no remainder, then it has simulated a successful decrement; otherwise, the simulated attempt has failed, and M ′ restores the initial value of V : t,r,U

V ←−

V 2 ; if r = 0 then f ←− true else t,U

t

{f ←− false; V ←− 2V ; repeat r times: V ←− V + 1}.

The attempts of B to decrease L or R are implemented similarly, but with 3 or 5 in the place of 2.
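The packing of B's three counters into the single value V, and the way a decrement is simulated by a division with remainder, can be sketched at the level of integer arithmetic. The Python below is such a sketch and not the counter-level program: it uses Python's `divmod` where M′ would run the operations of Section 4 with U as auxiliary, and all function names are hypothetical.

```python
from typing import Tuple


def encode(j: int, l: int, r: int) -> int:
    """The value kept in V when J, L, R contain j, l, r."""
    return 2 ** j * 3 ** l * 5 ** r


def largest_power(a: int, n: int) -> int:
    """a ** lg_a(n): the largest power of a that is at most n (needs n >= 1)."""
    p = 1
    while p * a <= n:
        p *= a
    return p


def strip_factor(v: int, p: int) -> int:
    """Divide v by p until a non-zero remainder appears, then undo that last division."""
    while True:
        q, rem = divmod(v, p)
        if rem != 0:
            return p * q + rem  # the correction: multiply back and re-add the remainder
        v = q


def try_decrement(v: int, p: int) -> Tuple[int, bool]:
    """Simulate an attempt to decrease the counter stored under the prime p."""
    q, rem = divmod(v, p)
    if rem == 0:
        return q, True           # the simulated decrement succeeded
    return p * q + rem, False    # the simulated decrement failed; v is restored


# Initialization of M' on input n = 1000: t = lg_30 n = 2, so V starts at 30**2.
v0 = largest_power(30, 1000)
v1 = strip_factor(strip_factor(v0, 3), 5)   # remove all 3s, then all 5s
v2, ok_j = try_decrement(v1, 2)             # B decreases J from 2 to 1
v3, ok_l = try_decrement(v2, 3)             # B tries to decrease L, which is 0
```

Here `v1` equals `encode(2, 0, 0)`, the decrement of J succeeds, and the attempt on L fails while leaving the packed value unchanged.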


Attempts of $B$ to increase its counters are of course simulated by appropriate multiplications of $V$. The only subtlety involves failure during increment attempts. To be faithful, the simulation must ensure that

an attempt of $B$ to increase a counter fails iff the corresponding attempt of $M'$ to multiply $V$ fails.

How is this condition satisfied? If it is, this does not happen in some obvious way. In $B$ the upper bound for $J$ is always the same (the input of $B$), whereas in $M'$ the upper bound for its representation is the base 2 logarithm of a value that depends (on the input of $M'$ and) on the values of the other two counters. Similarly, in $B$ counter $L$ is unrestricted, whereas in $M'$ its representation is bounded by a value that depends on the other two counters, and the same is true for $R$.

The crucial observation (from the previous section) is that, since we are after the first part of $B$’s initialization phase, no counter of $B$ ever assumes a value greater than $t = \lg_{30} n$. This immediately implies that, after the initialization phase of $M'$, no counter of $M'$ ever assumes a value greater than $2^t 3^t 5^t$. Now, since $t < n$ and $2^t 3^t 5^t \le n$, we conclude that both
• all increment attempts of $B$ are successful, and
• all corresponding multiplication attempts of $M'$ are successful, as well.
Hence, the equivalence above is satisfied vacuously. Put another way, when $M'$ multiplies $V$ to simulate a counter increment in $B$, it knows in advance that this increment does not fail and therefore that the multiplication will not fail, either. Overall, $B$’s atomic operation
\[ J \xleftarrow{t} J + 1 \quad\text{is simulated by}\quad V \xleftarrow{t,U} 2V, \]
and similarly for $L$ and $R$.

As a final remark, we note the following immediate by-product of our last argument: Since $V$ clearly never exceeds $n$ during the initialization phase of $M'$ and it also never exceeds $n$ during the simulation of $B$, it follows that $M'$ is bounded.

This concludes the definitions of all three machines in our reduction. It should be clear that $M'$ is good and that a description $z'$ of it can be effectively computed out of $z$.

6. Conclusion

Using old ideas [61, 38], we showed the unrecognizability of the emptiness problem for dca$_2$s that are promised to be bounded, always terminate, and obey a threshold. We then combined this with the idea of [19] to show that, if machines a have the resources to simulate dca$_2$s of the particular kind and can also solve problems that machines b cannot, then typically the trade-off from a to b is non-recursive. Applying the theorem, we derived such trade-offs in many conversions.


We do not know if the emptiness problem of Section 2.2 remains unrecognizable even when the underlying machine is a 2-register automaton [38] (that is, a dca2 that starts with n in its primary counter and where increments of that counter never fail). If it is, then our main theorem can be made slightly stronger.

A preliminary and more concrete version of the contents of this chapter can be found in [26]. An improved but more abstract version appeared in [28].

End Note

I would like to thank my research advisor, Michael Sipser, for suggesting to me the 2d vs. 2n problem and for being a constant source of encouragement and ideas as I was working on it during the past five years. I learned a lot from him about computation and (more generally, and perhaps more importantly) about how to think and how to explain. I enjoyed the kindness and humanity of his personality and I particularly admire his ability to just take a few silent seconds and then return with the most valuable advice for whatever problem needs to be solved.

I would also like to thank Albert Meyer, from whom I also learned a lot over the past years as a teaching assistant for his class on discrete mathematics. I really enjoyed the honesty of his character and I cannot but admire his eagerness and ability to talk and think efficiently through almost any kind of problem.

This is probably a good place to also thank the people of MPLA in Greece, where my graduate studies actually began. When I moved to Athens in 1997, it was not certain that the university was indeed going to be the place where I would be burning my energy. If it turned out this way, it is mainly due to the quality of the people and the academic program at MPLA. I am particularly grateful to Yiannis Moschovakis for encouraging me to continue my studies abroad.

On my way out of Greece, I also had the honor to meet General Leftheris and Mrs. Roula Kanellakis. Like so many other students, I am proud to have been a Paris Kanellakis Fellow and I have often drawn inspiration from Paris’s academic conduct and achievements.

Finally, many thanks are due to my uncle Demos Fokas. When I first came to Cambridge, he had already been around for a while and he helped me a lot with the transition. Most importantly, having been through graduate school himself, Demos knew very well what this process is all about and on many occasions used his experience to provide me with valuable advice and inspiration. He is now in the ‘southern provinces’ already, and this is exactly where I am also heading.


Bibliography [1] Bruce H. Barnes. A two-way automaton with fewer states than any equivalent oneway automaton. IEEE Transactions on Computers, C-20(4):474–475, 1971. [2] Piotr Berman. A note on sweeping automata. In Proceedings of the International Colloquium on Automata, Languages, and Programming, pages 91–97, 1980. [3] Piotr Berman and Andrzej Lingas. On complexity of regular languages in terms of finite automata. Report 304, Institute of Computer Science, Polish Academy of Sciences, Warsaw, 1977. [4] Jean-Camille Birget. Two-way automata and length-preserving homomorphisms. Report 109, Department of Computer Science, University of Nebraska, 1990. [5] Jean-Camille Birget. Positional simulation of two-way automata: proof of a conjecture of R. Kannan and generalizations. Journal of Computer and System Sciences, 45:154–179, 1992. [6] Jean-Camille Birget. State-complexity of finite-state devices, state compressibility and incompressibility. Mathematical Systems Theory, 26:237–269, 1993. [7] Marek Chrobak. Finite automata and unary languages. Theoretical Computer Science, 47:149–158, 1986. [8] David Damanik. Finite automata with restricted two-way motion. Master’s thesis, J. W. Goethe-Universit¨ at Frankfurt, 1996. In german. ˇ s and Zvi Galil. Fooling a two-way automaton or one pushdown store [9] Pavol Duriˇ is better than one counter for two-way machines. Theoretical Computer Science, 21:39–53, 1982. [10] Roger B. Eggleton and Richard K. Guy. Catalan strikes again! How likely is a function to be convex? Mathematics Magazine, 61(4):211–219, 1988. [11] Michael J. Fischer, Albert R. Meyer, and Arnold L. Rosenberg. Counter machines and counter languages. Mathematical Systems Theory, 3:265–283, 1968. [12] Martin Gardner. Catalan numbers: an integer sequence that materializes in unexpected places. Scientific American, 234(6):120–125, June 1976. [13] Viliam Geffert, Carlo Mereghetti, and Giovanni Pighizzini. 
Converting two-way nondeterministic unary automata into simpler automata. Theoretical Computer Science, 295:189–203, 2003. [14] Viliam Geffert, Carlo Mereghetti, and Giovanni Pighizzini. Complementing two-way finite automata. In Proceedings of the International Conference on Developments in Language Theory, pages 260–271, 2005. [15] Jonathan Goldstine, Martin Kappes, Chandra M. R. Kintala, Hing Leung, Andreas Malcher, and Detlef Wotschke. Descriptional complexity of machines with limited resources. Journal of Universal Computer Science, 8(2):193–234, 2002. [16] Juris Hartmanis. On the succinctness of different representations of languages. SIAM Journal of Computing, 9(1):114–120, 1980. [17] Juris Hartmanis. On G¨ odel speed-up and succinctness of language representations. Theoretical Computer Science, 26:335–342, 1983. 119

[18] Juris Hartmanis. On the importance of being Π₂-hard. Bulletin of the EATCS, 37:117–127, 1989.
[19] Juris Hartmanis and Theodore P. Baker. Relative succinctness of representations of languages and separation of complexity classes. In Proceedings of the International Symposium on Mathematical Foundations of Computer Science, pages 70–88, 1979.
[20] John E. Hopcroft and Jeffrey D. Ullman. Formal languages and their relation to automata. Addison-Wesley, Reading, MA, 1969.
[21] John E. Hopcroft and Jeffrey D. Ullman. Introduction to automata theory, languages, and computation. Addison-Wesley, Reading, MA, 1979.
[22] Juraj Hromkovič and Georg Schnitger. Nondeterminism versus determinism for two-way finite automata: generalizations of Sipser’s separation. In Proceedings of the International Colloquium on Automata, Languages, and Programming, pages 439–451, 2003.
[23] Oscar H. Ibarra. On two-way multihead automata. Journal of Computer and System Sciences, 7:28–36, 1973.
[24] Neil Immerman. Nondeterministic space is closed under complementation. SIAM Journal of Computing, 17(5):935–938, 1988.
[25] Ravi Kannan. Alternation and the power of nondeterminism. In Proceedings of the Symposium on the Theory of Computing, pages 344–346, 1983.
[26] Christos Kapoutsis. From k + 1 to k heads the descriptive trade-off is non-recursive. In Proceedings of the Workshop on Descriptional Complexity of Formal Systems, pages 213–224, 2004.
[27] Christos Kapoutsis. Deterministic moles cannot solve liveness. In Proceedings of the Workshop on Descriptional Complexity of Formal Systems, pages 194–205, 2005.
[28] Christos Kapoutsis. Non-recursive trade-offs for two-way machines. International Journal of Foundations of Computer Science, 16:943–956, 2005.
[29] Christos Kapoutsis. Removing bidirectionality from nondeterministic finite automata. In Proceedings of the International Symposium on Mathematical Foundations of Computer Science, pages 544–555, 2005.
[30] Christos Kapoutsis. Small sweeping 2NFAs are not closed under complement. In Proceedings of the International Colloquium on Automata, Languages, and Programming, pages 144–156, 2006.
[31] Martin Kutrib. On the descriptional power of heads, counters, and pebbles. In Proceedings of the Workshop on Descriptional Complexity of Formal Systems, pages 138–149, 2003.
[32] Martin Kutrib. The phenomenon of non-recursive trade-offs. In Proceedings of the Workshop on Descriptional Complexity of Formal Systems, pages 83–97, 2004.
[33] Hing Leung. Separating exponentially ambiguous finite automata from polynomially ambiguous finite automata. SIAM Journal of Computing, 27(4):1073–1082, 1998.
[34] Hing Leung. Tight lower bounds on the size of sweeping automata. Journal of Computer and System Sciences, 63(3):384–393, 2001.
[35] Albert R. Meyer and Michael J. Fischer. Economy of description by automata, grammars, and formal systems. In Proceedings of the Symposium on Switching and Automata Theory, pages 188–191, 1971.
[36] Silvio Micali. Two-way deterministic finite automata are exponentially more succinct than sweeping automata. Information Processing Letters, 12(2):103–105, 1981.
[37] Pascal Michel. An NP-complete language accepted in linear time by a one-tape Turing machine. Theoretical Computer Science, 85(1):205–212, 1991.
[38] Marvin L. Minsky. Recursive unsolvability of Post’s problem of “tag” and other topics in theory of Turing machines. Annals of Mathematics, 74(3):437–455, 1961.
[39] Marvin L. Minsky. Computation: finite and infinite machines. Prentice-Hall, Englewood Cliffs, NJ, 1967.

[40] Burkhard Monien. Transformational methods and their application to complexity problems. Acta Informatica, 6:95–108, 1976.
[41] Burkhard Monien. Corrigenda: Transformational methods and their application to complexity problems. Acta Informatica, 8:383–384, 1977.
[42] Burkhard Monien. Two-way multihead automata over a one-letter alphabet. RAIRO Informatique Théorique/Theoretical Informatics, 14(1):67–82, 1980.
[43] Frank R. Moore. On the bounds for state-set size in the proofs of equivalence between deterministic, nondeterministic, and two-way finite automata. IEEE Transactions on Computers, 20(10):1211–1214, 1971.
[44] G. Ott. On multipath automata I. Research report 69, SRRC, 1964.
[45] Michael O. Rabin. Two-way finite automata. In Proceedings of the Summer Institute of Symbolic Logic, pages 366–369, Cornell, 1957.
[46] Michael O. Rabin and Dana Scott. Remarks on finite automata. In Proceedings of the Summer Institute of Symbolic Logic, pages 106–112, Cornell, 1957.
[47] Michael O. Rabin and Dana Scott. Finite automata and their decision problems. IBM Journal of Research and Development, 3:114–125, 1959.
[48] William J. Sakoda and Michael Sipser. Nondeterminism and the size of two-way finite automata. In Proceedings of the Symposium on the Theory of Computing, pages 275–286, 1978.
[49] Walter J. Savitch. Relationships between nondeterministic and deterministic tape complexities. Journal of Computer and System Sciences, 4:177–192, 1970.
[50] Erik M. Schmidt and Thomas G. Szymanski. Succinctness of descriptions of unambiguous context-free languages. SIAM Journal of Computing, 6(3):547–553, 1977.
[51] Joel I. Seiferas. Manuscript communicated to Michael Sipser. October 1973.
[52] Joel I. Seiferas. Relating refined space complexity classes. Journal of Computer and System Sciences, 14(1):100–129, 1977.
[53] John C. Shepherdson. The reduction of two-way automata to one-way automata. IBM Journal of Research and Development, 3:198–200, 1959.
[54] Michael Sipser. Halting space-bounded computations. Theoretical Computer Science, 10:335–338, 1980.
[55] Michael Sipser. Lower bounds on the size of sweeping automata. Journal of Computer and System Sciences, 21(2):195–202, 1980.
[56] Richard E. Stearns. A regularity test for pushdown machines. Information and Control, 11:323–340, 1967.
[57] Ivan H. Sudborough. On tape-bounded complexity classes and multihead finite automata. Journal of Computer and System Sciences, 10(1):62–76, 1975.
[58] Róbert Szelepcsényi. The method of forced enumeration for nondeterministic automata. Acta Informatica, 26(3):279–284, 1988.
[59] Leslie G. Valiant. A note on the succinctness of descriptions of deterministic languages. Information and Control, 32:139–145, 1976.
[60] Moshe Y. Vardi. A note on the reduction of two-way automata to one-way automata. Information Processing Letters, 30:261–264, 1989.
[61] Hao Wang. A variant of Turing’s theory of computing machines. Journal of the ACM, 4(1):63–92, 1957.


Contents

Introduction
  1. The 2D vs. 2N Problem
  2. Motivation
  3. Progress
  4. Other Problems in This Thesis

Chapter 1. Exact Trade-Offs
  1. History of the Conversions
  2. Preliminaries
  3. From 2DFAs to 1DFAs
  4. From 2NFAs to 1DFAs
  5. From 2NFAs to 1NFAs
  6. Conclusion

Chapter 2. 2D versus 2N
  1. History of the Problem
  2. Restricted Information: Moles
  3. Restricted Bidirectionality: Sweeping Automata
  4. Conclusion

Chapter 3. Non-Recursive Trade-Offs
  1. Two-Way Multi-Pointer Machines
  2. Preliminaries
  3. The Main Theorem
  4. Programming Counters
  5. Proof of the Main Lemma
  6. Conclusion

End Note

Bibliography

Introduction

The main subject of this thesis is the 2d vs. 2n problem, a question on the power of nondeterminism in two-way finite automata. We start by defining it, explaining the motivation for its study, and describing our progress against it.

1. The 2D vs. 2N Problem

A two-way deterministic finite automaton (2dfa) is the machine that we get from the common one-way deterministic finite automaton (1dfa) when we allow its input head to move in both directions; equivalently, this is the machine that we get from the common single-tape deterministic Turing machine (dtm) when we do not allow its input head to write on the input tape. The nondeterministic version of a 2dfa is simply called two-way nondeterministic finite automaton (2nfa) and, as usual, is the machine that we get by allowing more than one option at each step and acceptance by any computation branch.

The 2d vs. 2n question asks whether 2nfas can be strictly more efficient than 2dfas, in the sense that there exists a problem for which the best 2dfa algorithm is significantly worse than the best 2nfa one. Of course, to complete the description of the question, we need to explain how we measure the efficiency of an algorithm on a two-way finite automaton. It is easy to check that, with respect to the length of its input, every algorithm of this kind uses zero space and at most linear time. Therefore, the time and space measures (our typical criteria for algorithmic efficiency on the full-fledged Turing machine) are of little help in this context. Instead, we focus on the size of the program that encodes this algorithm, namely the size of the transition function of the corresponding two-way finite automaton. In turn, a good measure for this size is simply the automaton’s number of states.

So, the 2d vs. 2n question asks whether there is a problem that can be solved both by 2dfas and by 2nfas, but for which the smallest possible 2dfa (i.e., the 2dfa that solves it with the fewest possible states) is still significantly larger than the smallest possible 2nfa. To fully understand the question, two additional clarifications are needed.

First, it is well-known that the problems that are solvable by 2dfas are exactly those described by the regular languages, and that the same is true for 2nfas [53]. Hence, efficiency considerations aside, the two types of automata have the same power, in the same way that 1dfas or dtms have the same power as their nondeterministic counterparts. Therefore, the above reference to problems that “can be solved both by 2dfas and by 2nfas” is a reference to exactly the regular problems.

Second, although we explained that efficiency is measured with respect to the number of states, we still have not defined what it means for the number of states in one automaton to be “significantly larger” than the number of states in another one. The best way to clarify this aspect of the question is to give a specific example. Consider the problem in which we are given a list of subsets of the set {0, 1, ..., n − 1} and we are asked whether this list can be broken into sublists so that the first set of every sublist contains the number of sets after it [51]. More precisely, the problem is defined on the alphabet Δ_n := P({0, 1, ..., n − 1}) of all sets of numbers smaller than n. A string a_0 a_1 ··· a_l over this alphabet is a block if its first symbol is a set that contains the number of symbols after it, that is, if a_0 ∋ l. The problem consists in determining whether a given string over Δ_n can be written as a concatenation of blocks. For example, for n = 8 and for the string

{1,2,4}∅{4}{0,4}{2,4,6}{4}{4,6}∅{3,6}∅{2,4}{5,7}{0,3}{4,7}∅{4}∅{4}{0,1}{2,5,6}{1}

the answer should be “yes”, since this is the concatenation of the substrings {1,2,4}∅{4} {0,4}{2,4,6}{4}{4,6}∅ {3,6}∅{2,4}{5,7} {0,3} {4,7}∅{4}∅{4}{0,1}{2,5,6}{1}

where the first set in each substring indeed contains the number of sets after it in the same substring (namely 2, 4, 3, 0, and 7, respectively). In contrast, for the string {1,2,7}{4}{5,6}∅{3,6}{2,4,6}

the answer should be “no”, as there is no way to break it into blocks. Is there a 2dfa algorithm for this problem? Is there a 2nfa algorithm? Indeed, the problem is regular. The best 2nfa algorithm for it is the following, rather obvious one: We scan the list of sets from left to right. At the beginning of each block, we read the first set. If it is empty, we just hang (in this computation branch). Otherwise, we nondeterministically select from it the correct number of remaining sets in the block, and consume as many sets counting from that number down. When the count reaches 0, we know the block is over and a new one is about to start. In the end, we accept if the list and our last count-down finish simultaneously.
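The nondeterministic strategy above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the automaton itself: the function name `is_concatenation_of_blocks` and the helper `accepts_from` are hypothetical, every nondeterministic guess is explored by a memoized search, and the empty list is treated as an empty concatenation of blocks.

```python
from functools import lru_cache


def is_concatenation_of_blocks(sets):
    """Hypothetical helper: can the list of sets be split into blocks?"""
    symbols = tuple(frozenset(s) for s in sets)

    @lru_cache(maxsize=None)
    def accepts_from(i):
        # Position i is where a new block is about to start.
        if i == len(symbols):
            return True  # the list and the last count-down end together
        # Guess a count k from the first set of the block. An empty set
        # offers no guess, so this branch hangs (any() of nothing is False).
        return any(
            i + 1 + k <= len(symbols) and accepts_from(i + 1 + k)
            for k in symbols[i]
        )

    return accepts_from(0)


# The two example strings from the text (n = 8).
YES = [{1, 2, 4}, set(), {4}, {0, 4}, {2, 4, 6}, {4}, {4, 6}, set(),
       {3, 6}, set(), {2, 4}, {5, 7}, {0, 3}, {4, 7}, set(), {4}, set(),
       {4}, {0, 1}, {2, 5, 6}, {1}]
NO = [{1, 2, 7}, {4}, {5, 6}, set(), {3, 6}, {2, 4, 6}]

print(is_concatenation_of_blocks(YES))  # True
print(is_concatenation_of_blocks(NO))   # False
```

The only information carried between steps is the position reached by the current count-down, which mirrors the single counter that the nondeterministic algorithm keeps.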

It is easy to see that this algorithm can be implemented on a 2nfa (which does not actually use its bidirectionality) with exactly 1 state per possible
value of the counter, for a total number of n states. As for a 2dfa algorithm, here is the best known one: We scan the list of sets from left to right. At each step, we remember all possibilities about how many more sets there are in the most recent block. (E.g., right after the first set, every number in it is a possibility.) After reading each set, we decrease each possibility by 1, to account for the set just consumed; if a possibility becomes 0, we replace it by all numbers of the next set. If at any point the set of possibilities gets empty, we just hang. Otherwise, we eventually reach the end of the list. There, we check whether 0 is among the possibilities. If so, we accept.
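The deterministic strategy just described can be sketched as follows. The names `dfa_step`, `dfa_accepts`, and `reachable_states` are hypothetical, and the encoding is an assumption made for the sketch: a state is the set of possible remaining counts, with 0 meaning "a new block is about to start", so the start state is {0}. The last function merely enumerates the non-empty sets of possibilities that this particular construction can reach; it says nothing about whether a smaller 2dfa exists.

```python
from itertools import combinations


def dfa_step(possibilities, symbol):
    """One deterministic step: update the set of possible remaining counts."""
    # Every pending count-down consumes the set just read.
    nxt = {p - 1 for p in possibilities if p > 0}
    # A count-down that had reached 0 starts a new block with this set.
    if 0 in possibilities:
        nxt |= set(symbol)
    return frozenset(nxt)


def dfa_accepts(sets):
    state = frozenset({0})  # assumption: a block is about to start
    for symbol in sets:
        state = dfa_step(state, symbol)
        if not state:
            return False  # no possibility left: the automaton hangs
    return 0 in state


def reachable_states(n):
    """Non-empty sets of possibilities reachable over subsets of {0..n-1}."""
    alphabet = [frozenset(c) for r in range(n + 1)
                for c in combinations(range(n), r)]
    start = frozenset({0})
    seen, frontier = {start}, [start]
    while frontier:
        state = frontier.pop()
        for symbol in alphabet:
            nxt = dfa_step(state, symbol)
            if nxt and nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


print(len(reachable_states(3)))  # 7
print(len(reachable_states(4)))  # 15
```

For these small values of n the count of reachable non-empty states equals 2^n − 1, because reading any non-empty set from the start state yields that set itself.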

Easily, this algorithm can be implemented on a 2dfa (which does not actually use its bidirectionality) that has exactly 1 state for each possible non-empty¹ set of possibilities, for a total number of 2^n − 1 states. Overall, we see that the difference in size between the automata implementing the two algorithms is exponential. We surely want to call such a difference “significant”.

¹ Note that a 2dfa is allowed to reject by just hanging anywhere along its input. Without this freedom, the number of states required to implement our algorithm would actually be 2^n.

At this point, after the above clarifications, we may want to restate the 2d vs. 2n question as the question whether there exists a regular problem for which the smallest possible 2dfa is still super-polynomially larger than the smallest possible 2nfa. However, a further clarification is due. Given any fixed regular problem, the sizes of the smallest possible 2dfa and the smallest possible 2nfa for it are nothing more than just two numbers. So, asking whether their difference is polynomial or not makes little sense. What the previous example really describes is a family of regular problems Π = (Π_n)_{n≥0}, one for each natural value of n; then, a family of 2nfas N = (N_n)_{n≥0} which solve these problems and whose sizes grow linearly in n; and, finally, a family of 2dfas D = (D_n)_{n≥0} which also solve these problems and whose sizes grow exponentially in n. Therefore, our reference to “the difference in size between the automata” was really a reference to the difference in the rate of growth of the sizes of the automata in the two families. It is this difference that can be characterized as polynomial or not. Naturally, we decide to call it significant if one of the two rates can be bounded by a polynomial but the other one cannot.

So, the 2d vs. 2n question asks whether there exists a family of regular problems such that the family of the smallest 2nfas that solve them have sizes that grow polynomially in n, whereas the family of the smallest 2dfas that solve them have sizes that grow super-polynomially in n. Equivalently, the question is whether there exists a family of regular problems that can be
solved by a polynomial-size family of 2nfas but no polynomial-size family of 2dfas. With this clarification, we are ready to explain the name “2d vs. 2n”. We define 2d as the class of families of regular problems that can be solved by polynomial-size families of 2dfas, and 2n as the corresponding class for 2nfas [48]. Under these definitions, we obviously have 2d ⊆ 2n, and the question is whether 2d and 2n are actually different. Observe how the nature of the question resembles both circuit and Turing machine complexity: like circuits, we are concerned with the rate of growth of size in families of programs; unlike circuits and like Turing machines, each program in a family can work on inputs of any length. Concluding this description of the problem, let us also remark that its formulation in terms of families is not really part of our every-day vocabulary in working on the problem. Instead, we think and speak as if n were a built-in parameter of our world, so that only the n-th members of the three families (of problems, of 2nfas, and of 2dfas) were visible. Under this pretense, the 2d vs. 2n question asks whether there is a regular problem that can be solved by a polynomially large 2nfa but no polynomially large 2dfa—and it is redundant to mention what parameter “polynomially large” refers to. In addition, we also use “small” as a more intuitive substitute for “polynomially large”, and drop the obviously redundant characterization “regular”. So, the every-day formulation of the question is whether there exists a problem that can be solved by a small 2nfa but no small 2dfa. Throughout this thesis, we will occasionally be using this kind of talk, with the understanding that it is a substitute for its formal interpretation in terms of families. 2. Motivation Our motivation for studying the 2d vs. 2n question comes from two distinct sources: the theory of computational complexity and the theory of descriptional complexity. 
We discuss these two different contexts separately. 2.1. Computational complexity. From the perspective of computational complexity, the 2d vs. 2n question falls within the general major goal of understanding the power of nondeterminism. For certain computational models and resources, this quest has been going on for more than four decades now. Of course, the most important (and correspondingly famous) problem of this kind is p vs. np, defined on the Turing machine and for time which is bounded by some polynomial of the length of the input. The next most important question is probably l vs. nl, also defined on the Turing machine and for space which is bounded by the logarithm of some polynomial of the length of the input. It is perhaps fair to say that our progress against the core of these problems has been slow. Although our theory is constantly being enriched

with new concepts and new connections between the already existing ones, major advances of our understanding are rather sparse. At the same time, the common conceptual origin as well as certain combinatorial similarities between these problems has led some to suspect that essentially the same elusive idea lies at the core of all problems of this kind. In particular, the suspicion goes, this idea may be independent of the specifics of the underlying computational model and resource. In this context, a possibly advantageous approach is to focus on weak models of computation. The simple setting provided by these models allows us to work closer to the set-theoretic objects that are produced by their computations. This serves us in two ways. First, it obviously helps our arguments become cleaner and more robust. Second, it helps our reasoning become more objective, by neutralizing our often misleading algorithmic intuitions about what a machine may or may not do. The only problem with this approach is that it may very well lead us to models of computation that are too weak to be relevant. In other words, some of these models are obviously not rich enough to involve ideas of the kind that we are looking for. We should therefore be careful to look for evidence that indeed such ideas are present. We believe that the 2d vs. 2n question passes this test, and we explain why. 2.1-I. Robustness. One of the main reasons why problems like p vs. np and l vs. nl are so attractive is their robustness. The essence of each question remains the same under many different variations of the mathematical definitions of the model and/or the resource. 
It is only on such stable grounds that the theoretical framework around each question could have been erected, with the definition of the classes of problems that can be solved in each case (p, np, l, nl) and the identification of complete problems for the nondeterministic classes, that allowed for a more tangible reformulation of the original question (is satisfiability in p? is connectivity in l?). The coherence and richness of these theories further enhance our confidence that they indeed describe important high-level questions about the nature of computation, as opposed to technical low-level inquiries about the peculiarities of particular models. The 2d vs. 2n problem is also robust. Its resolution is independent of all reasonable variations of the definition of the two-way finite automaton and/or the size measure, including changes in the conventions for accepting and rejecting, for end-marking the input, for moving the head; changes in the size of the alphabet, which may be fixed to binary; changes in the size measure, which may include the number of transitions or equal the length of some fixed binary encoding. It is on this stable ground that the classes 2d and 2n have been defined. In addition, 2n-complete (families of) problems have been identified [48], allowing more concrete forms of the

question. Overall, there is no doubt that in this case, too, our investigations are beyond technical peculiarities and into the high-level properties of computation.

2.1-II. Hardness. A second important characteristic of problems like p vs. np and l vs. nl that contributes to their popularity is their hardness. Until today, a long list of people that cared about these problems have tried to attack them from several different perspectives and with several different techniques. Their limited success constitutes significant evidence that, indeed, answering these questions will require a deeper understanding of the combinatorial structure of computation. In this sense, the answer is likely to be highly rewarding. In contrast, the 2d vs. 2n problem can boast no similar attention on the part of the community, as it has never attracted the efforts of a large group of researchers. However, it does belong to the culture of this same community and several of its members have tried to attack it or problems around it, with the same limited success. In this sense, it is again fair to predict that the answer to this question will indeed involve ideas which are deep enough to be significantly rewarding.

2.1-III. Surprising conjecture. If we open a computational complexity textbook, we will probably find p defined as the class of problems that can be solved by polynomial-time Turing machines, and np as the class of problems that can be solved by the same machines when they are enhanced with the ability to make nondeterministic choices. Then, we will probably also find a discussion of the standard conjecture that p ≠ np, justified by the well-known compelling list of pragmatic and philosophical reasons. In this context, the conjecture does not sound surprising at all. There is certainly nothing strange with the enhanced machines being strictly more powerful than the original, non-enhanced ones.
They start already as powerful, and then the magical extra feature of nondeterminism comes along: how could this result in nothing new? But the conjecture does describe a fairly intriguing situation.

First, if we interpret the conjecture in the context of the struggle of fast deterministic algorithms to subdue fast nondeterministic ones, it says that a particular nondeterministic Turing machine using just one tape and only linear time [37] can actually solve a problem that defies all polynomial-time multi-tape deterministic Turing machines, irrespective of the degree of the associated polynomial and the number of tapes. In other words, the claim is that nondeterministic algorithms can beat deterministic ones even with minimal use of the rest of their abilities. This is an aspect of the p ≠ np conjecture that our definitions do not highlight.

Second, if we interpret the conjecture in the context of the struggle of fast deterministic algorithms to solve a specific np-complete problem, it says that the fastest way to check whether a propositional formula is satisfiable is essentially to try all possible assignments. Although this matches our experience in one sense (in the simple sense that we have no idea how to do anything faster in general), it also seriously clashes with our experience in another sense, equally important: it claims that the obvious, highly inefficient solution is also the optimal one. This is in contrast with what one would first imagine, given that optimality is almost always associated with high sophistication.

Similar comments are valid for l vs. nl. The standard conjecture that l ≠ nl asserts that a particular nondeterministic finite automaton with a small bunch of states and just two heads that only move one-way [57] can actually solve a problem that defies all deterministic multi-head two-way finite automata, irrespective of their number of states and heads. At the same time, it claims that the most space-economical method of checking connectivity is essentially the one by Savitch, a fairly non-trivial algorithm which nevertheless still involves several natural choices (e.g., recursion) and lacks the high sophistication that we usually expect in optimality.

Similarly to p vs. np and l vs. nl, the conjecture for the 2d vs. 2n problem is again that 2d ≠ 2n. Moreover, two stronger variants of it have also been proposed. First, it is conjectured that 2nfas can be exponentially smaller than 2dfas even when they never move their input head to the left. In other words, it is suggested that even one-way nondeterministic finite automata (1nfas) can be exponentially smaller than 2dfas. (Note the similarity with the version of l ≠ nl mentioned above, where the nondeterministic automaton need only move its heads one-way to beat all two-way multi-head deterministic ones.) So, once more we have an instance of the claim that nondeterministic algorithms can beat deterministic ones with minimal use of the rest of their abilities.
Second, it is even conjectured that this alleged exponential difference in size between 1nfas and 2dfas covers the entire gap from n to 2^n − 1 which is known to exist between 1nfas and 1dfas [35].² In other words, according to this conjecture, a 2dfa trying to simulate a 1nfa may as well drop its bidirectionality, since it is going to be totally useless: its optimal strategy is going to be the well-known brute-force one-way deterministic simulation [47]. So, once more we have an instance of the claim that the obvious, highly inefficient solution is also the optimal one.³

² As with 2dfas (cf. Footnote 1), a 1dfa is allowed to reject by just hanging anywhere along its input. Without this freedom, the gap would actually be from n to 2^n.

³ Note that, contrary to the strong versions of p ≠ np and l ≠ nl mentioned above [37, 57], the two conjectures mentioned in these last two paragraphs may be strictly stronger than 2d ≠ 2n. It may very well be that 1nfas cannot be exponentially smaller than 2dfas, but 2nfas can (it is known that 2nfas can be exponentially smaller than 1nfas). Moreover, even if 1nfas can be exponentially smaller than 2dfas, it may very well be that this exponential gap is smaller than the gap from n to 2^n − 1.

For a concrete example of what all this means, recall the problem that we described early in this introduction (Section 1). Remember that we presented it as a problem that is solvable by small 2nfas but is conjectured to require large 2dfas. Notice that the best 2nfa algorithm presented there is actually one-way, namely a 1nfa. So, if the conjecture about this problem is true, then 2nfas can indeed beat 2dfas, and they can do so without using their bidirectionality. Also notice that the best known 2dfa for that problem is one-way, too. In fact, it is simply the brute-force 1dfa simulation of the 1nfa solver. So, if this is really the smallest 2dfa for the problem, then indeed the optimal way of simulating the hardest 2nfa is the obvious, highly inefficient one.

In total, interpreting the questions on the power of nondeterminism (p vs. np, l vs. nl) as a contest between deterministic and nondeterministic algorithms, our conjectures claim that nondeterministic algorithms can win with one hand behind their back; and then, the best that deterministic algorithms can do in their defeat to minimize their losses is essentially not to think. This is a counter-intuitive claim, and our conjectures for 2d vs. 2n make this same claim, too.

2.1-IV. A mathematical connection. Of the similarities that we described above between 2d vs. 2n and the more important questions on the power of nondeterminism, none is mathematical. However, a mathematical connection is known, too. As explained in [3], if we can establish that 2d ≠ 2n using only “short” strings, then we would also have a proof that l ≠ nl. To describe this implication more carefully, we need to first discuss how a proof of 2d ≠ 2n may actually proceed.

To prove the conjecture, we need to start with a 2n-complete family of regular problems Π = (Π_n)_{n≥0}, and prove that it is not in 2d. That is, we must prove that for any polynomial-size family of 2dfas D = (D_n)_{n≥0} there exists an n which is bad, in the sense that D_n does not solve Π_n. Now, two observations are due:
• To prove that some n is bad, we need to find a string w_n that “fools” D_n, in the sense that w_n ∈ Π_n but D_n rejects w_n, or w_n ∉ Π_n but D_n accepts w_n.
• Every D has a bad n iff every D has infinitely many bad n. This is true because, if a polynomial-size family D has only finitely many bad n, then replacing the corresponding D_n with correct automata of any size would result in a new family which is still polynomial-size and has no bad n.
Hence, proving the conjecture amounts to proving that, for any polynomial-size family D of 2dfas for Π, there is a family of strings w = (w_n)_{n≥0} such that, for infinitely many n, the input w_n fools D_n.
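As a compact restatement of the proof obligation described above, the following display may help. It is only a sketch: the quantifier \exists^{\infty} (read "for infinitely many") and the typeset class name \mathsf{2D} are notational assumptions introduced here, and Π stands for the 2n-complete family under consideration.

```latex
\[
  \Pi \notin \mathsf{2D}
  \iff
  \forall\, \text{polynomial-size } D = (D_n)_{n \ge 0}\;\;
  \exists^{\infty} n\;\; \exists\, w_n :\;
  \bigl(w_n \in \Pi_n \text{ and } D_n \text{ rejects } w_n\bigr)
  \;\text{or}\;
  \bigl(w_n \notin \Pi_n \text{ and } D_n \text{ accepts } w_n\bigr).
\]
```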

2. MOTIVATION

17

Now, the connection with l vs. nl says the following: if we can indeed find such a proof and in addition manage to guarantee that the lengths of the strings in w are bounded by some polynomial of n, then l ≠ nl. There is no doubt that this connection increases our confidence in the relevance of the 2d vs. 2n problem to the more important questions on the power of nondeterminism. However, its significance should not be overestimated. First, the two problems may very well be resolved independently. On one hand, if 2d = 2n then the connection is irrelevant, obviously. On the other hand, if 2d ≠ 2n then our tools for short strings are so much weaker than our tools for long strings, that it is hard to imagine us arriving at a proof that uses only short strings before actually having a proof that uses long ones. Second, and perhaps most importantly, ideas do not need mathematical connections to transcend domains. In other words, an idea that works for one type of machines may very well be applicable to other types of machines, too, even if no high-level theorem encodes this transfer. Examples of this situation include the Immerman-Szelepcsényi idea [24, 58], Savitch’s idea [49], and Sipser’s “rewind” idea [54], each of which has been applied to machines of significantly different power [14, 54]. In conclusion, from the computational complexity perspective, the 2d vs. 2n problem is a question on the power of nondeterminism which seems both simple enough to be tractable and, at the same time, robust, hard, and intriguing enough to be relevant to our efforts against other, more important questions of its kind.

2.2. Descriptional complexity. From the perspective of descriptional complexity, the 2d vs. 2n question falls within the general major goal of understanding the relative succinctness of language descriptors.
Here, by “language descriptor” we mean any formal model for recognizing or generating strings: finite automata, regular expressions, pushdown automata, grammars, Turing machines, etc. Perhaps the most famous question in this domain is the one about the relative succinctness of 1dfas and 1nfas. Since both types of automata recognize exactly the regular languages [47], every such language can be described both by the deterministic and by the nondeterministic version. Which type of description is shorter? Or, measuring the size of these descriptions by the number of states in the corresponding automata, which type of automaton needs the fewest states? Clearly, since determinism is a special case of nondeterminism, a smallest 1dfa cannot be smaller than a smallest 1nfa. So, the question really is: How much larger than a smallest 1nfa need a smallest 1dfa be? By the well-known simulation of [47], we know that every n-state 1nfa has an equivalent 1dfa with at most $2^n - 1$ states (cf. Footnote 2 on page 15). Moreover, this simulation is optimal [35], in the sense that certain n-state 1nfas have no equivalent 1dfa with fewer
than $2^n - 1$ states. Hence, this question of descriptional complexity is fully resolved: if the minimal 1nfa description of a regular language is of size n, then the corresponding minimal 1dfa description is of size at most $2^n - 1$, and sometimes is exactly that big.

There is really no end to the list of questions of this kind that can be asked. For the example of finite automata alone, we can change how the size of the descriptions is measured (e.g., use the number of transitions) and/or the resource that differentiates the machines (e.g., use any combination of nondeterminism, bidirectionality, ambiguity, alternation, randomness, pebbles, heads, etc.). Moreover, the models being compared can even be of completely different kind (e.g., 1nfas versus regular expressions) and/or have different power (e.g., 1nfas versus deterministic pushdown automata, or context-free grammars), in which case each model may have its own measure for the size of descriptions.

Typically, every question of this kind is viewed in the context of the corresponding conversion. For example, the question about 1dfas and 1nfas is viewed as follows: Given an arbitrary 1nfa, we would like to convert it into a smallest equivalent 1dfa. What is the increase in the number of states in the worst case? In other words, starting with a 1nfa, we want to trade size for determinism and we would like to know in advance the worst possible loss in size. We encode this information into a function f, called the trade-off of the conversion: for every n, f(n) is the least upper bound for the new number of states when an arbitrary n-state 1nfa is converted into a smallest equivalent 1dfa. In this terminology, our previous discussion can be summarized into the following concise statement: the trade-off from 1nfas to 1dfas is $f(n) = 2^n - 1$. Note that this encodes both the simulation of [47], by saying that $f(n) \leq 2^n - 1$, and the “hard” 1nfas of [35], by saying that $f(n) \geq 2^n - 1$.

The 2d vs.
2n problem can also be concisely expressed in these terms. It concerns the conversion from 2nfas to 2dfas, where again we trade size for determinism, and precisely asks whether the associated trade-off can be upper-bounded by some polynomial:

2d = 2n ⇔ the trade-off from 2nfas to 2dfas is polynomially bounded.

Indeed, if the trade-off is polynomially bounded, then every family of regular problems that is solvable by a polynomial-size family of 2nfas N = (Nn)n≥0 is also solvable by a polynomial-size family of 2dfas: just convert Nn into a smallest equivalent 2dfa Dn, and form the resulting family D := (Dn)n≥0. Since the size sn of Nn is bounded by a polynomial in n and the size of Dn is bounded by a polynomial in sn (the trade-off bound), the size of Dn is also bounded by a polynomial in n. Overall, 2d
= 2n. Conversely, suppose the trade-off is not polynomially bounded. For every n, let Nn be any of the n-state 2nfas that cause the value of the trade-off for n, and let Dn be a smallest equivalent 2dfa. Then the sizes of the automata in the family D := (Dn)n≥0 are exactly the values of the trade-off, and therefore D is not of polynomial size. Moreover, for Πn the language recognized by Nn and Dn, the family Π := (Πn)n≥0 is clearly in 2n (because of the linear-size family N := (Nn)n≥0) but not in 2d (since D is not of polynomial size). Overall, 2d ≠ 2n.

Note the sharp difference in our understanding of the two conversions mentioned so far. On the one hand, our understanding of the conversion from 1nfas to 1dfas is perfect: we know the exact value of the associated trade-off. On the other hand, our understanding of the conversion from 2nfas to 2dfas is minimal: not only do we not know the exact value of the associated trade-off, but we cannot even tell whether it is polynomial or not. The best known upper bound for it is exponential, while the best known lower bound is quadratic. In fact, the details of this gap reveal a much more embarrassing ignorance. The exponential upper bound is the trade-off from 2nfas to 1dfas, while the quadratic lower bound is the trade-off from unary 1nfas to 2dfas. In other words, put in the shoes of a 2dfa that tries to simulate a 2nfa, we have no idea how to benefit from our bidirectionality; at the same time, put in the shoes of a 2nfa that tries to resist being simulated by a 2dfa, we have no idea how to use our bidirectionality or our ability to distinguish between different tape symbols.

A bigger picture is even more peculiar. The 12 arrows in Figure 1 show all possible conversions that can be performed between the four most fundamental types of finite automata: 1dfas, 1nfas, 2dfas, and 2nfas.
For 10 of these conversions, the problem of finding the exact value of the associated trade-off has been completely resolved (as shown in the figure), and therefore our understanding of them is perfect. The only two that remain unresolved are the ones from 2nfas and 1nfas to 2dfas (as shown by the dashed arrows), that is, the ones associated with 2d vs. 2n.

In conclusion, from the descriptional complexity perspective, the 2d vs. 2n problem represents the last two open questions about the relative succinctness of the basic types of automata defined by nondeterminism and bidirectionality. Moreover, the contrast in our understanding between these two questions and the remaining ten is the sharp contrast between minimal and perfect understanding.

3. Progress

Our progress against the 2d vs. 2n question has been in two distinct directions: we have proved lower bounds for automata of restricted information and for automata of restricted bidirectionality. In both cases, our theorems involve a particular computational problem called liveness. We start by describing this problem.

[Figure 1: diagram of the four types of automata (1dfa, 1nfa, 2dfa, 2nfa) with one arrow per conversion; solid arrows are labeled a to e, and the two open conversions are dashed. Arrow labels:
a = $2^n - 1$ (1nfas to 1dfas);
b = $n\bigl(n^n - (n-1)^n\bigr)$ (2dfas to 1dfas);
c = $\sum_{i=0}^{n-1}\sum_{j=0}^{n-1}\binom{n}{i}\binom{n}{j}\bigl(2^i - 1\bigr)^j$ (2nfas to 1dfas);
d = $\binom{2n}{n+1}$ (2dfas or 2nfas to 1nfas);
e = n (weaker to stronger automata).]

Figure 1. The 12 conversions defined by nondeterminism and bidirectionality in finite automata, and the known exact trade-offs.

3.1. Liveness. As mentioned in Section 2.1-III, we currently believe that 2nfas can be exponentially smaller than 2dfas even without using their bidirectionality. That is, we believe that even 1nfas can be exponentially smaller than 2dfas. In computational complexity terms, this is the same as saying that 2d ⊉ 2n holds because already 2d ⊉ 1n, where 1n is the class of families of regular problems that can be solved by polynomial-size families of 1nfas. In descriptional complexity terms, this is the same as saying that the trade-off from 2nfas to 2dfas is not polynomially bounded because already the trade-off from 1nfas to 2dfas is not. In this thesis, we focus on this stronger conjecture.

As in any attempt to show non-containment of one complexity class into another (p ⊉ np, l ⊉ nl), it is important to know specific complete problems, namely problems which witness the non-containment iff the non-containment indeed holds. In our case, we need a family of regular problems that can be solved by a polynomial-size family of 1nfas and, in addition, is such that no polynomial-size family of 2dfas can solve it iff 2d ⊉ 1n. Such families are known. In fact, we have already presented one: the family of problems defined on page 10 over the alphabets Δn. So, it is safe to invest all our efforts in trying to understand that particular family, and prove or disprove the 2d ⊉ 1n conjecture by showing that the family does not or does belong to 2d. However, it is easier (and as safe) to work with another complete family, which is defined over an even larger alphabet and thus brings us closer to the combinatorial core of the conjecture.
This family is called liveness, denoted by B = (Bn)n≥0, and defined as follows [48].

For each n, we consider the alphabet $\Sigma_n := \mathcal{P}(\{1, 2, \ldots, n\}^2)$ of all directed 2-column graphs with n nodes per column and only rightward arrows. For example, for n = 5 this alphabet includes the symbols:

[Figure: sample symbols of Σ5, drawn as 2-column graphs with vertices 1 to 5 per column.]

where, e.g., indexing the vertices from top to bottom, the rightmost symbol is {(1, 2), (2, 1), (4, 4), (5, 5)}. Given an m-long string over Σn, we naturally interpret it as the representation of a directed (m+1)-column graph, the one that we get by identifying the adjacent columns of neighboring symbols. For example, for m = 8 the string of the above symbols represents the graph:

[Figure: the 9-column graph represented by the string, with columns 0 to 8 and vertices 1 to 5 per column.]

where columns are indexed from left to right starting from 0. In this graph, a live path is any path that connects the leftmost column to the rightmost one (i.e., the 0th to the mth column), and a live vertex is any vertex that has a path from the leftmost column to it. The string is live if live paths exist; equivalently, if the rightmost column contains live vertices. Otherwise, the string is dead. For example, in the above string, the 5th node of the 2nd column is live because of the path 3 → 3 → 5, and the string is live because of two live paths, one of which is 3 → 3 → 2 → 5 → 5 → 3 → 3 → 2 → 1. Note that no information is lost if we drop the direction of the arrows, and we do. So, the above string is simply:

[Figure: the same graph, drawn with undirected edges.]

The problem Bn consists in determining whether a given string of Σn* is live or not. In formal dialect, Bn is the language {w ∈ Σn* | w is live} of all live strings over Σn, for all n. As already claimed, B ∈ 1n. That is, there exist small 1nfa algorithms for Bn. The smallest possible one is rather obvious:

We scan the list of graphs from left to right, trying to nondeterministically follow one live path. Initially, we guess the starting vertex among those of the leftmost column. Then, on reading each graph, we find which vertices in the next column are accessible from the most recent vertex. If none is, we hang (in this branch of the nondeterminism). Otherwise, we guess one of them and move on remembering only it. If we ever arrive at the end of the input, we accept.
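The 1nfa algorithm above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: a symbol of Σn is represented as a frozenset of pairs (a, b), and the names `Symbol`, `nfa_accepts` and `branch` are hypothetical. Nondeterminism is simulated by trying every guess in turn, so the sketch says nothing about the size of the automaton; it only mirrors its branches.

```python
from typing import FrozenSet, List, Tuple

# Assumed encoding: one two-column graph is the set of its arrows (a, b),
# from vertex a of the left column to vertex b of the right column.
Symbol = FrozenSet[Tuple[int, int]]


def nfa_accepts(string: List[Symbol], n: int) -> bool:
    """Return True iff some branch of the n-state 1nfa reaches the end of the input."""

    def branch(vertex: int, position: int) -> bool:
        if position == len(string):
            return True  # arrived at the end of the input: accept
        # Vertices of the next column accessible from the remembered vertex.
        successors = [b for (a, b) in string[position] if a == vertex]
        # If there are none, this branch hangs; otherwise try every guess.
        return any(branch(b, position + 1) for b in successors)

    # Initial guess: any vertex of the leftmost column.
    return any(branch(v, 0) for v in range(1, n + 1))


live = [frozenset({(1, 2)}), frozenset({(2, 3)})]
dead = [frozenset({(1, 2)}), frozenset({(1, 3)})]
print(nfa_accepts(live, 3), nfa_accepts(dead, 3))
```

Each branch remembers a single vertex, which is the reason why one state per vertex of a column suffices.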


It is easy to verify that this algorithm can be implemented on a 1nfa with exactly one state per possible vertex in a column. Hence, Bn is solvable by an n-state 1nfa. In contrast, nobody knows how to solve Bn on a 2dfa with fewer than $2^n - 1$ states. The best known 2dfa algorithm is the following:

We scan the list of graphs from left to right, remembering only the set of live vertices in the most recent column. Initially, all vertices of the leftmost column are live. Then, on reading each graph, we use its arrows to compute the set of live vertices in the next column. If it is empty, we simply hang. Otherwise, we move on, remembering only this set. If we ever arrive at the end of the input, we accept.
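As an illustration, the deterministic algorithm just described can be sketched as follows, again under our own assumed encoding of a symbol as a frozenset of pairs (a, b); the names `dfa_accepts` and `reachable_states` are hypothetical. The second function enumerates all of Σn for a small n and collects the sets of live vertices that the automaton can ever remember. It checks only that these sets are reachable, not that all of them are necessary.

```python
from itertools import product


def dfa_accepts(string, n):
    """Scan left to right, remembering only the set of live vertices."""
    live = frozenset(range(1, n + 1))  # initially every vertex is live
    for symbol in string:
        live = frozenset(b for (a, b) in symbol if a in live)
        if not live:
            return False  # empty set of live vertices: hang
    return True


def reachable_states(n):
    """Count the sets of live vertices that can arise on some input over Sigma_n."""
    pairs = [(a, b) for a in range(1, n + 1) for b in range(1, n + 1)]
    # Every subset of the n*n possible arrows is one symbol of the alphabet.
    alphabet = [frozenset(p for p, keep in zip(pairs, bits) if keep)
                for bits in product([False, True], repeat=len(pairs))]
    start = frozenset(range(1, n + 1))
    seen = {start}
    frontier = [start]
    while frontier:
        live = frontier.pop()
        for symbol in alphabet:
            nxt = frozenset(b for (a, b) in symbol if a in live)
            if nxt and nxt not in seen:  # the empty set means hanging, not a state
                seen.add(nxt)
                frontier.append(nxt)
    return len(seen)


print(reachable_states(2), reachable_states(3))
```

The enumeration is exponential in n², so it is only meant for very small n.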

Easily, this algorithm needs exactly one state per possible non-empty set of live vertices in a column, for a total of $2^n - 1$ states, as promised. By the completeness of B, our questions about the relation between 2d and 1n can be encoded into questions about the size of a 2dfa solving Bn. In other words, the following three statements are equivalent:
• 2d ⊇ 1n,
• the trade-off from 1nfas to 2dfas is polynomially bounded,
• Bn can be solved by a 2dfa of size polynomial in n.
Hence, to prove the conjecture that 2d ⊉ 1n, we just need to prove that the number of states in every 2dfa solving Bn is super-polynomial in n. In fact, as explained in Section 2.1-III, a stronger conjecture says that the above 2dfa algorithm is optimal! That is, in every 2dfa solving Bn (the conjecture goes) the number of states is not only super-polynomial but already $2^n - 1$ or bigger. To better understand what this means, observe that the above algorithm is one-way: it is, in fact, the smallest 1dfa for liveness (as we can easily prove). Therefore, the claim is that in solving liveness, a 2dfa has no way of using its bidirectionality to save even 1 of the $2^n - 1$ states that are necessary without it.

3.2. Restricted information: Moles. The first direction in our investigation of the efficiency of 2dfas against liveness is motivated by the particular way of operation of the 1nfa algorithm that we described above. Specifically, consider any branch of the nondeterministic computation of that 1nfa. Along that branch, the automaton moves through the input from left to right, reading one graph after the other. However, although at every step the entire next graph is read, only part of its information is used. In particular, the automaton ‘focuses’ only on one of the vertices in the left column of the graph and ‘sees’ only the arrows which depart from that vertex. The rest of the graph is ignored. In this sense, the automaton operates in a mode of ‘restricted information’.
A more intuitive way to describe this mode of operation is to view the input string as a ‘network of tunnels’ and the 1nfa as an n-state one-way nondeterministic robot that explores this network. Then, at each step, the
robot reads only the index of the vertex that it is currently on and the tunnels that depart from that vertex, and has the option to either follow one of these tunnels or abort, if none exists. In yet more intuitive terms, the automaton behaves like an n-state one-way nondeterministic mole. Given this observation, a natural question to ask is the following: Suppose we apply to this mole the same conversion that defines the question whether 2d ⊇ 1n. Namely, suppose that this mole loses its nondeterminism in exchange for bidirectionality. How much larger does it need to get to still be solving Bn ? That is, can liveness be solved by a small two-way deterministic mole? Equivalently, is there a 2dfa algorithm that can tell whether a string is live or not by simply exploring the graph defined by it? Note that, at first glance, there is nothing to exclude the possibility of some clever graph exploration technique that correctly detects the existence of live paths and can indeed be implemented on a small 2dfa. In Chapter 2 we answer this question in a strongly negative manner: no two-way deterministic mole can solve liveness. To understand the value of this answer, it is necessary to understand both the “good news” and the “bad news” that it contains. The good news is that we have crossed an entire, very natural class of 2dfa algorithms off the list of candidates against liveness. We have thus come to know that every correct 2dfa must be using the information of every symbol in a more complex way than moles. However, note that our answer talks of all two-way deterministic moles, as opposed to only small ones. This might sound like “even better news”, but it is actually bad. Remember that our primary interest is not moles themselves, but rather the behavior of small 2dfas against liveness. So, our hope was that we would get an answer that involves small moles, and this hope did not materialize. 
Put another way, we asked a complexity-theoretic question and we received a computability-theoretic answer. Overall, our understanding has indeed advanced, but not for the class of machines that we were mostly interested in. Nevertheless, some of the tools developed for the proof of this theorem may still be useful for the more general goal. Specifically, if indeed small 2dfas cannot solve liveness, then it is hard to imagine a proof that will not involve very long inputs. Such a proof will probably need tools similar to the dilemmas and generic strings for 2dfas that were used in our argument.

3.3. Restricted bidirectionality: Sweeping automata. The second direction that we explore is motivated by the known fact that 2d is closed under complement [54, 14], whereas the corresponding question for 2n is open. So, one way to prove that 2d ≠ 2n is to show that 2n is not closed under complement. In terms of classes, we can write this goal as 2n ≠ co2n, where co2n is the class of families of regular problems whose
complements can be solved by polynomial-size families of 2nfas. Of course, it is conceivable that 2n = co2n, in which case a proof of this closure would constitute evidence that 2d = 2n. As a matter of fact, 2n = co2n is already known to hold in some special cases. First, the analogue of this question for logarithmic-space Turing machines is known to have been resolved this way: nl = conl [24, 58]. By the argument of [3], this implies that every small 2nfa can be converted into a small 2nfa that makes exactly the opposite decisions on all “short” inputs (in the sense of Section 2.1-IV). In addition, the proof idea of nl = conl has been used to prove that indeed 2n = co2n for the case of unary regular problems [14]. So, 2n and co2n are already known to coincide on short and on unary inputs.

However, there is little doubt that the above special cases avoid the core of the hardness of the 2n vs. co2n question. In this sense, our confidence in the conjecture that 2n ≠ co2n is not seriously harmed. As a matter of fact, in Chapter 2 we prove a theorem that constitutes evidence for it. We consider a restriction on the bidirectionality of the 2nfas and prove that, under this restriction, 2n ≠ co2n. The restricted automata that we consider are the “sweeping” 2nfas. A two-way automaton is sweeping if its head can change direction only on the input end-markers. In other words, each computation of a sweeping automaton is simply a sequence of one-way passes over the input, with alternating direction. We use the notation snfa for sweeping 2nfas, and sn for the class of families of regular problems that can be solved by polynomial-size families of snfas. With these names, our theorem says that:

sn ≠ cosn.

Specifically, our proof uses liveness, which is obviously in sn: B ∈ sn. We show that, in contrast, every snfa for the complement of Bn needs $2^{\Omega(n)}$ states, so that B ∉ cosn. Hence, B ∈ sn \ cosn and the two classes differ.
Another way to interpret this theorem is to view it as a generalization of two other, previously known facts about the complement of liveness: that it is not solvable by small 1nfas [48] and that it is not solvable by small sweeping 2dfas [55, 14], either. So, proving the same for small snfas amounts to generalizing both these facts to sweeping bidirectionality and to nondeterminism, respectively. For another interesting interpretation, note that the smallest known snfa solving the complement of Bn is still the obvious $2^n$-state 1dfa from page 22. Hence, our theorem says that, even after allowing sweeping bidirectionality and nondeterminism together, a 1dfa can still not achieve significant savings in size against the complement of liveness; whether it can save even 1 state is still open. Finally, our proof can be modified so that all strings used in it are drawn from a special subclass of Σn* on which the complement of liveness can actually be determined by a small 2dfa. This immediately implies that:
the trade-off from 2dfas to snfas is exponential,

generalizing a known similar relation between 2dfas and sdfas [55, 2, 36].

4. Other Problems in This Thesis

Apart from the progress against the 2d vs. 2n question explained above, this thesis also contains several other theorems in descriptional complexity.

4.1. Exact trade-offs for regular conversions. As explained already in Section 2.2 (Figure 1 on page 20), the 2d vs. 2n question concerns only 2 of the 12 possible conversions between the four most basic types of finite automata (1dfas, 1nfas, 2dfas, and 2nfas). For each of the remaining conversions our understanding is perfect, in the sense that we know the exact value of the associated trade-off. For the conversion from 1nfas to 1dfas (Figure 1a), the upper bound is due to [47] and the lower bound due to [35]. For any of the conversions from weaker to stronger automata (Figure 1e), the upper bound is obvious by the definitions and the lower bound is due to [6]. For the remaining four conversions (from 2nfas or 2dfas to 1nfas or 1dfas), both the upper and lower bounds are due to this thesis, although the fact that the trade-offs were exponential was known before. We establish these exact values in Chapter 1. For a quick look, see Figure 1b–d.

We stress, however, that the exact values alone do not reveal the depth of the understanding behind the associated proofs. In order to explain what we mean by this, let us revisit the conversion from 1nfas to 1dfas. As already mentioned, we can encode our understanding of this conversion into the concise statement that: the trade-off from 1nfas to 1dfas is exactly $2^n - 1$. A less succinct but more informative description is that, for all n:
• every n-state 1nfa has an equivalent 1dfa with ≤ $2^n - 1$ states, and
• some n-state 1nfa has no equivalent 1dfa with < $2^n - 1$ states.
But even these more verbose statements fail to describe the kind of understanding that led to them.
What we really know is that every 1nfa N can be simulated by a 1dfa that has 1 distinct state for each non-empty subset of states of N which (as an instantaneous description of N) is both realizable and non-redundant. This is exactly the idea where everything else comes from: the value $2^n - 1$ (by a standard counting argument), the simulation for the upper bound (just construct a 1dfa with these states and with the then obvious transitions), and the hard instances for the lower bound (just find 1nfas that manage to keep all of their instantaneous descriptions realizable and non-redundant). In this sense, we know more than just the value of the trade-off; we know the precise, single reason behind it: the non-empty subsets of states of the 1nfa
that is being converted. To be able to pin down the exact source of the difficulty of a conversion in terms of such a simple and well-understood class of set-theoretic objects is a rather elegant achievement. Our analyses in Chapter 1 are supported by this same kind of understanding: in each one of the four trade-offs that we discuss, we first identify the correct set-theoretic object at the core of the conversion and then move on to extract from it the exact value, the simulation, and the hard instances that we need. As a foretaste, here are the objects at the core of the conversion from 2nfas to 1nfas: the pairs of subsets of states of the 2nfa, where the second subset has exactly 1 more state than the first subset. So, every 2nfa can be simulated by a 1nfa that has 1 distinct state for every such pair, and for some 2nfas all these states are necessary. Moreover, the value of the trade-off is exactly the number of such pairs that we can construct out of an n-state 2nfa; a standard counting argument shows that this number is $\binom{2n}{n+1}$.

4.2. Non-recursive trade-offs for non-regular conversions. In contrast to Chapters 1 and 2, the last chapter studies conversions between machines other than the automata of Figure 1, including machines that can also recognize non-regular problems. As we shall see, the trade-offs for such conversions may, in general, behave in a quite different manner. To understand the difference, note that already since [53] we knew how to effectively convert any 2nfa (the strongest type of automata in Figure 1) into a 1dfa (the weakest type). This immediately guaranteed a recursive upper bound for each one of the 12 trade-offs of Figure 1. In contrast, for other conversions, such a recursive upper bound cannot be taken for granted.
As first shown in [35], there are cases where the trade-off of a conversion grows faster than any recursive function: e.g., the conversion from one-way nondeterministic pushdown automata that recognize regular languages to 1dfas. Moreover, this phenomenon cannot be attributed simply to the difference in power between the types of the machines involved. As shown in [56], if the pushdown automata in the previous conversion are deterministic, then the trade-off does admit a recursive upper bound. Such trade-offs, that cannot be recursively bounded, are simply called non-recursive. Note that this name is slightly misleading, as it allows the possibility of a non-recursive trade-off that still admits recursive upper bounds. However, no such cases will appear in this thesis. In Chapter 3 we refine a well-known technique [16] to prove a general theorem that implies the non-recursiveness of the trade-off in a list of conversions involving two-way machines. Roughly speaking, our theorem concerns any two types of machines, a and b, such that the following two conditions are satisfied:
• there exists a problem that can be solved by a machine of type a but cannot be solved by any machine of type b, and
• any two-way deterministic finite automaton that works on a unary alphabet and has access to a linearly-bounded counter can be simulated by some machine of type a.
For any such pair of types of machines, our theorem says that the trade-off from a to b is non-recursive. For example, we can have a be the multi-head finite automata with k + 1 heads and b be the multi-head finite automata with k heads. No matter what k is, the two conditions above are known to be true, and therefore replacing a multi-head finite automaton with an equivalent one that has 1 fewer head results in a non-recursive increase in the size of the automaton’s description, in general. At the core of the argument of this theorem lies a lemma of independent interest: we prove that the emptiness problem remains unrecognizable (non-semidecidable) even for a unary two-way deterministic finite automaton that has access to a linearly-bounded counter and obeys a threshold, in the sense that it either rejects all its inputs or accepts exactly those that are longer than some fixed length.
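As a numerical companion to the exact trade-offs of Figure 1, the following sketch tabulates them for small n. The function names are our own, hypothetical choices. The last function counts, by brute-force summation over sizes, the pairs of subsets described in Section 4.1 (second subset exactly 1 state larger than the first), so that the count can be compared against the closed form for the conversion to 1nfas.

```python
from math import comb


def tradeoff_1nfa_to_1dfa(n):
    # Non-empty subsets of an n-element set of states.
    return 2**n - 1


def tradeoff_2dfa_to_1dfa(n):
    return n * (n**n - (n - 1)**n)


def tradeoff_2nfa_to_1dfa(n):
    # Python evaluates 0**0 as 1, which is the convention needed for i = j = 0.
    return sum(comb(n, i) * comb(n, j) * (2**i - 1)**j
               for i in range(n) for j in range(n))


def tradeoff_to_1nfa(n):
    return comb(2 * n, n + 1)


def count_subset_pairs(n):
    # Pairs (A, B) of subsets of n states with |B| = |A| + 1, grouped by k = |A|.
    return sum(comb(n, k) * comb(n, k + 1) for k in range(n))


for n in range(1, 6):
    print(n, tradeoff_1nfa_to_1dfa(n), tradeoff_2dfa_to_1dfa(n),
          tradeoff_2nfa_to_1dfa(n), tradeoff_to_1nfa(n))
```

That `count_subset_pairs(n)` agrees with `tradeoff_to_1nfa(n)` is an instance of the Vandermonde identity, which is one way to carry out the standard counting argument.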

CHAPTER 1

Exact Trade-Offs

In this chapter we prove the exact values of the trade-offs for the conversions from two-way to one-way finite automata, as pictured in Figure 1 (page 20). In Section 3 we cover the conversion from 2dfas to 1dfas (Figure 1b), whereas the conversion from 2nfas to 1dfas (Figure 1c) is the subject of Section 4. The conversions from 2nfas and 2dfas to 1nfas (Figure 1d) are covered together in Section 5. We begin with a short note on the history of the subject and a summary of our conclusions.

1. History of the Conversions

The conversion of 1nfas into 1dfas is the archetypal problem of descriptional complexity. As already mentioned (Figure 1a), it is fully resolved, in the sense that we know the exact value of the associated trade-off:¹

the trade-off from 1nfas to 1dfas is $2^n - 1$.

The history of the problem began in the late 50’s, when Rabin and Scott [46, 47] introduced 1nfas as a generalization of 1dfas and showed how 1dfas can simulate them. This proved the upper bound for the trade-off. The matching lower bound was established much later, via several examples of “hard” 1nfas [44, 35, 43, 51, 48, 33].² Both bounds are based on the crucial idea that the non-empty subsets of states of the 1nfa capture everything that a simulating 1dfa needs to describe with its states.

¹ Recall our conventions, as explained in Footnote 2 on page 15.

² The earliest ones, both over a binary alphabet, appeared in [35] (an example that was described as a simplification of one contained in an even earlier unpublished report [44]) and in [43] (where [44] is again mentioned as containing a different example with similar properties). Other examples have also appeared, over both large [51, 48] and binary alphabets [33]. A more natural but not optimal example was also mentioned in [35] and attributed to Paterson.

As part of the same seminal work [45, 47], Rabin and Scott also introduced two-way automata and proved “to their surprise” that 1dfas were again able to simulate their generalization. This time, though, the proof was complicated enough to be superseded by a simpler proof by Shepherdson [53] at around the same time. All authors were actually talking about
what we would now call single-pass two-way deterministic finite automata (zdfas), as their definitions did not involve end-markers. However, the automata quickly grew into full-fledged two-way deterministic finite automata (2dfas) and also into nondeterministic counterparts (znfas and 2nfas), while all theorems remained valid or easily adjustable.

Naturally, the descriptional complexity questions arose again. Shepherdson mentioned that, according to his proof, every n-state 2dfa had an equivalent 1dfa with at most $(n+1)^{(n+1)}$ states. Had he cared for his bound to be tight, he would surely have noted that his proof had actually established an upper bound of only $n(n+1)^n$ (e.g., see [20, Section 3.7]). Many years later, Birget [6, Theorem A3.4] claimed that this upper bound is really just $n^n$. On the other hand, towards a lower bound, several authors showed that the trade-off is at least exponential [1, 35, 55] and indeed very close to the upper bound given by Shepherdson [35, 43].³ Here we will prove that both the upper and lower bounds meet at the value $n\bigl(n^n - (n-1)^n\bigr)$. We will thus have arrived at the conclusion that

the trade-off from 2dfas to 1dfas is $n\bigl(n^n - (n-1)^n\bigr)$.

Note that this value is larger than the upper bound $n^n$ claimed by Birget [6]. Indeed, his argument contained an oversight. But it did contain the correct idea, and it is exactly that idea which we apply here. We also note that our lower bound is valid even when the 2dfa being converted is single-pass. More importantly, both the upper and the lower bound are derived in a straightforward manner after we carefully identify the correct set-theoretic objects that ‘live’ in the relationship between the computations of 2dfas and 1dfas. These are the tables of the 2dfa as defined in Section 2.1-II. No big surprise is to be anticipated: we simply follow Birget’s idea from [6] in properly restricting the functions used by Shepherdson [53, proof of Theorem 2].
The relation between the most and least powerful of all automata mentioned so far, namely between 2nfas and 1dfas, has also been examined. Via a straightforward adjustment, Shepherdson’s argument could show very early that every $n$-state 2nfa can be converted into a 1dfa with at most $2^{n^2}(2^n - 1)$ states. Much later, Birget [6, Theorem A3.4] claimed it to be no more than (n−1)2 n n n/2 2 . Towards a lower bound, we just mention the one

³ That the lower bound is close to the upper bound given by Shepherdson was shown by (a slight modification of) the language of [35, Proposition 2], which requires $\geq n^n$ states on every 1dfa, but only $\leq 5n + 5$ states on a 2dfa (a zdfa, even). Similarly, [43] gave a language that requires $\geq n^n + o(n^n)$ states on a 1dfa, but only $\leq 2n + 5$ on a 2dfa. For just an exponential separation, one can look at [1] for $\geq 2^n + 2$ and $\leq 2n + 2$ states (even on a zdfa); at the Paterson example of [35] for $\geq 2^n$ and $\leq n + 2$ states (even on a sdfa); or at [55] for $\geq 2^n$ and $\leq O(n)$ (even on a sdfa).

1. HISTORY OF THE CONVERSIONS

provided by the systematic framework of [48], which was $2^{(n/2-2)^2}$. Here we will show that

the trade-off from 2nfas to 1dfas is $\sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \binom{n}{i} \binom{n}{j} (2^i - 1)^j$,

where the lower bound is valid even when the 2nfa being converted is single-pass. As before, we will first identify the correct set-theoretic objects that relate the computations of 2nfas to those of 1dfas. These are the tables of the 2nfa as defined in Section 2.2. Again, we arrive at them by appropriately restricting the functions in the Shepherdson argument.

The most interesting of the descriptional complexity questions that we consider emerge in the conversions from 2nfas to 1nfas and 2dfas:

(Q1) from 2nfas to 1nfas: is bidirectionality essential to 2nfas? Or, is there a problem that a 1nfa would be able to solve with exponentially fewer states if allowed to move its head to the left?
(Q2) from 2nfas to 2dfas: is nondeterminism essential to 2nfas? Or, is there a problem that a 2dfa would be able to solve with exponentially fewer states if allowed to make nondeterministic choices?

The second question is of course the 2d vs. 2n problem, as explained in the Introduction and covered in Chapter 2. For the first question, the answer is known to be positive, but here we will find the exact value of the trade-off.

For the upper bound, it is straightforward to use Shepherdson’s idea to show that every $n$-state 2nfa has an equivalent 1nfa with at most $n 2^{n^2}$ states. A more economical simulation, with fewer than $(n!)^2$ states, can be achieved by crossing sequences [21, Section 2.6]. However, it is not hard to observe that the order in which the pairs of successive states (after the first state) appear inside a crossing sequence is not important; equivalently, in applying Shepherdson’s idea we can use the nondeterminism of the simulating 1nfa not only for ‘guessing forwards’ but also for ‘guessing backwards’. Based on this observation, we can actually construct a 1nfa with at most $n(n+1)^n$ states. But this would still be wasting exponentially many states, as Birget [6] showed $8^n + 2$ states are always enough.
On the other hand, towards a lower bound, exponential separations between 2dfas and 1nfas have long been known [48, 6], even when the 2dfas are single-pass [51, 8], the best being $2^{(n-1)/2} - 1$.⁴ Here, we will again show that the upper and lower bounds meet exactly at the value $\binom{2n}{n+1}$. We will thus have concluded that

the trade-off from 2nfas to 1nfas is $\binom{2n}{n+1}$.

This will again be possible after we identify the correct set-theoretic objects relating the computations of 2nfas to those of 1nfas. These are the frontiers of the 2nfa as defined in Section 5.1. Essentially, these objects are what remains of the crossing sequences of [21] after we ignore not only the order of the pairs of successive states (as we did for the $n(n+1)^n$ bound above) but also the correspondence between first and second components in this set of pairs.

As a matter of fact, the lower bound is valid even when the 2nfa being converted is deterministic (and single-pass, actually). This immediately implies that

the trade-off from 2dfas to 1nfas is $\binom{2n}{n+1}$, as well.

Hence, the ability of a 2nfa to move its head in both directions strictly inside the input can alone cause all the hardness that a simulating 1nfa must overcome. In other words, the answer to question (Q1) from above is positive, exactly because even the answer to the following question is positive (Figure 1d):

(Q3) from 2dfas to 1nfas: can bidirectionality beat nondeterminism? Is there a problem that a 1nfa would be able to solve with exponentially fewer states if it were allowed to replace nondeterminism with bidirectionality?

Note the similarity with the conjectured resolution of question (Q2). As explained in the Introduction, we believe that the answer to (Q2) is also positive exactly because the answer to the following question is positive:

(Q4) from 1nfas to 2dfas: can nondeterminism beat bidirectionality? Is there a problem that a 2dfa would be able to solve with exponentially fewer states if it were allowed to replace bidirectionality with nondeterminism?

So, it appears that in both cases, the hardness of the simulation may stem entirely from the feature of the simulated machine that is absent in the simulating machine.

Finally, let us also briefly consider the conversions from weaker to stronger automata (Figure 1e). By the definitions, the trade-off for each one of them is trivially upper-bounded by $n$. Moreover, it is also lower-bounded by $n$. To see why, notice that the $n$-th singleton unary language $\{0^{n-1}\}$ can be solved by an $n$-state 1dfa but by no 2nfa with fewer than $n$ states [4]. This proves that the trade-off from 1dfas to 2nfas is $n$ and thus immediately implies the same for all other conversions of this kind.

In the next section, we carefully define the notions that we will be working with in the rest of this chapter.

⁴ In [48] a language was given that requires $\leq 2n + 1$ states on a 2dfa, but $\geq 2^n - 1$ states on a 1nfa. Through a different method, [6] found the same $2^{(n-1)/2} - 1$ lower bound. Seiferas [51] gave a language that needs $\leq 4n + 2$ states on a zdfa, but $\geq 2^n$ states on any 1nfa, while Damanik [8] independently arrived at the same argument. Copying that idea, one can easily see that the restriction of the language $B_n$ of [48] to strings of length 2 has similar properties ($\leq 2n$ and $\geq 2^n$ states).
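The three exact values quoted in this section for the two-way to one-way conversions are easy to tabulate for small $n$. The following Python sketch does so. The function names are illustrative and not part of the thesis, and the double sum is evaluated with the convention $0^0 = 1$ (Python's integer power), which is an assumption about how the term for $i = j = 0$ is meant to be read.

```python
from math import comb


def tradeoff_2dfa_to_1dfa(n: int) -> int:
    """n(n^n - (n-1)^n), the value stated for 2dfas -> 1dfas."""
    return n * (n**n - (n - 1) ** n)


def tradeoff_2nfa_to_1dfa(n: int) -> int:
    """Double sum stated for 2nfas -> 1dfas.

    Assumption: 0**0 is read as 1, as Python does for integers.
    """
    return sum(
        comb(n, i) * comb(n, j) * (2**i - 1) ** j
        for i in range(n)
        for j in range(n)
    )


def tradeoff_two_way_to_1nfa(n: int) -> int:
    """Binomial coefficient (2n choose n+1), stated for 2dfas or 2nfas -> 1nfas."""
    return comb(2 * n, n + 1)


if __name__ == "__main__":
    for n in range(1, 7):
        print(n,
              tradeoff_2dfa_to_1dfa(n),
              tradeoff_2nfa_to_1dfa(n),
              tradeoff_two_way_to_1nfa(n))
```

For $n = 2$ the three values come out as 6, 7, and 4; in particular, the first already exceeds the bound $n^n = 4$ discussed above.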


2. Preliminaries

We write $[n]$ for the set $\{1, \ldots, n\}$. The special objects $l$, $r$, $\bot$ are used for building the disjoint union of two sets and the augmentation of a set:

$A \uplus B = (A \times \{l\}) \cup (B \times \{r\})$ and $A_\bot = A \cup \{\bot\}$.

When $A$, $B$ are disjoint, their union $A \cup B$ is also written as $A + B$ (so that $+$ can replace $\cup$ in both equations above). The size of $A$, the set of subsets of $A$, and the set of non-empty subsets of $A$ are denoted respectively by $|A|$, $\mathcal{P}(A)$, and $\mathcal{P}'(A)$.

For $\Sigma$ an alphabet, we use $\Sigma^*$ for the set of all finite strings over $\Sigma$ and $\Sigma_e$ for $\Sigma + \{\vdash, \dashv\}$, where $\vdash$ and $\dashv$ are two special end-marking symbols. If $u \in \Sigma_e^*$ is a string, $|u|$ is its length and $u_i$ is its $i$-th symbol, for all $i = 1, 2, \ldots, |u|$. By ‘the $i$-th boundary of $u$’ we mean the boundary between $u_i$ and $u_{i+1}$, if $0 < i < |u|$; or the leftmost boundary of $u$, if $i = 0$; or the rightmost boundary of $u$, if $i = |u|$ (Figure 2a). We also write $u_e$ for the end-marked extension $\vdash u \dashv$ of $u$ and $u_{e,i}$ for the $i$-th symbol $(u_e)_i$ of this extension. The empty string is denoted by $\epsilon$.

Of the automata that we consider, the two-way deterministic ones constitute the most natural variety and are described in the next section. Section 2.2 defines the one-way and nondeterministic cases. Section 2.3 discusses some of the problems that we will be solving with all these machines.

2.1. Two-way deterministic finite automata. A two-way deterministic finite automaton (2dfa) over the states of a set $Q$ and the symbols of an alphabet $\Sigma$ consists of a finite control that can represent all states in $Q$, a tape that can represent all symbols in $\Sigma_e$, and a read-only head. An input $w \in \Sigma^*$ is presented on the tape surrounded by the end-markers, as $\vdash w \dashv$. The automaton starts at a designated start state, its head reading the left end-marker $\vdash$. At every step, the symbol under the head is read; based on this symbol and the current state, the automaton selects a next state and whether to move its head left or right; it then simultaneously changes its state and moves its head. The input is accepted if the machine ever moves past the right end-marker $\dashv$ into a designated final state, this being the only case in which violating an end-marker is allowed.⁵

Formally, a 2dfa over $Q$ and $\Sigma$ is defined as a triple $M = (s, \delta, f)$, where $s, f \in Q$ are the start and the final states, respectively, and $\delta$ is the transition function, partially mapping $Q \times \Sigma_e$ to $Q \times \{l, r\}$. In addition, $\delta$ obeys the aforementioned restrictions about end-marker violation: on $\vdash$, it either moves the head to the right or hangs; on $\dashv$, it moves the head to the left, or hangs, or moves the head to the right and enters $f$.

⁵ Note the unusual conventions about end-marker violations and the position of the head at acceptance, borrowed from [6]. They make our definitions and theorems significantly nicer.


Figure 2. (a) Symbols and boundaries on a 6-long string $u$, and a computation that hits left. (b) A computation that hangs. (c) A computation $c$ that hits right, and its $i$-th frontier: $R_i^c$ in circles and $L_i^c$ in boxes.

2.1-I. Computations. Although $M$ is typically started at $s$ and on the tape cell containing the left end-marker $\vdash$, many other possibilities exist: for any string $u$, position $i$, and state $q$, the computation of $M$ when started at $q$ on the $i$-th symbol of $u$ is the unique sequence

$\mathrm{comp}_{M,q,i}(u) = ((q_t, i_t))_{0 \leq t \leq m}$

with $(q_0, i_0) = (q, i)$ and $0 \leq m \leq \infty$, that meets the following restrictions:
• the head is always inside $u$, except possibly at the very end: $0 \leq t < m \implies 1 \leq i_t \leq |u|$, and $m \neq \infty \implies 0 \leq i_m \leq |u| + 1$.
• every two successive pairs respect the transition function: $0 \leq t < m \implies \delta(q_t, u_{i_t}) = (q_{t+1}, d)$, where either $d = l$ and $i_{t+1} = i_t - 1$, or $d = r$ and $i_{t+1} = i_t + 1$.
• a last pair inside $u$ exists only if the transition function allows it: $m \neq \infty$ and $1 \leq i_m \leq |u| \implies \delta(q_m, u_{i_m})$ is undefined.

We say $(q_t, i_t)$ is the $t$-th point and $m$ is the length of this computation. If $m = \infty$, we say the computation loops; otherwise, it hits left into $q_m$, if $i_m = 0$; or it hangs, if $1 \leq i_m \leq |u|$; or it hits right into $q_m$, if $i_m = |u| + 1$ (Figure 2). When $i = 1$ or $i = |u|$ we get the left computation of $M$ from $q$ on $u$ or the right computation of $M$ from $q$ on $u$, respectively:

$\mathrm{lcomp}_{M,q}(u) := \mathrm{comp}_{M,q,1}(u)$ or $\mathrm{rcomp}_{M,q}(u) := \mathrm{comp}_{M,q,|u|}(u)$.

Finally, for $w \in \Sigma^*$, the computation of $M$ on $w$ refers to the typical usage $\mathrm{comp}_M(w) := \mathrm{lcomp}_{M,s}(w_e)$, so that $M$ accepts $w$ iff the computation $\mathrm{comp}_M(w)$ hits right into $f$.

1. Remark. Note that, when $u$ is the empty string, the left computation of $M$ from $q$ on $u$ is just $\mathrm{lcomp}_{M,q}(\epsilon) = ((q, 1))$ and therefore hits right into $q$, whereas the corresponding right computation is just $\mathrm{rcomp}_{M,q}(\epsilon) = ((q, 0))$ and therefore hits left into $q$.
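The four possible fates of a computation (loops, hits left, hangs, hits right) can be made concrete with a small simulator. The Python sketch below is only an illustration under stated assumptions: the transition function is represented as a dictionary whose missing keys play the role of ‘undefined’, the names `comp`, `lcomp`, `rcomp`, `accepts` and the three-state example machine are hypothetical, and looping is detected by noticing a repeated point $(q_t, i_t)$, which is sound because the automaton is deterministic.

```python
LEFT_END, RIGHT_END = "⊢", "⊣"


def comp(delta, u, q, i):
    """Follow comp_{M,q,i}(u) and report how it ends.

    delta maps (state, symbol) -> (state, 'l' or 'r'); a missing key means
    'undefined'. Positions in u are numbered 1..len(u), as in the text.
    Returns (outcome, last_state), where outcome is one of
    'loops', 'hangs', 'hits left', 'hits right'.
    """
    visited = set()
    while 1 <= i <= len(u):
        if (q, i) in visited:          # a repeated point: the computation is infinite
            return ("loops", None)
        visited.add((q, i))
        move = delta.get((q, u[i - 1]))
        if move is None:               # delta undefined: the computation hangs here
            return ("hangs", q)
        q, d = move
        i += 1 if d == "r" else -1
    return ("hits left", q) if i == 0 else ("hits right", q)


def lcomp(delta, u, q):
    return comp(delta, u, q, 1)


def rcomp(delta, u, q):
    return comp(delta, u, q, len(u))


def accepts(s, delta, f, w):
    """M = (s, delta, f) accepts w iff lcomp_{M,s}(⊢w⊣) hits right into f."""
    return lcomp(delta, [LEFT_END, *w, RIGHT_END], s) == ("hits right", f)


# Hypothetical example machine (not from the thesis): it sweeps right in state
# 's', steps back from ⊣ in state 'c', and accepts iff the last symbol is 'a'.
EXAMPLE_DELTA = {
    ("s", LEFT_END): ("s", "r"),
    ("s", "a"): ("s", "r"),
    ("s", "b"): ("s", "r"),
    ("s", RIGHT_END): ("c", "l"),
    ("c", "a"): ("f", "r"),
    ("f", RIGHT_END): ("f", "r"),
}

if __name__ == "__main__":
    for word in ["", "a", "ab", "ba"]:
        print(repr(word), accepts("s", EXAMPLE_DELTA, "f", word))
```

Note that on the empty string the simulator reproduces the conventions of Remark 1: the loop body is never entered, so the left computation hits right and the right computation hits left, both into the starting state.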


2. Remark. Also note that, since $M$ can violate an end-marker only by moving past $\dashv$ into $f$, a computation of $M$ on any end-marked $u$ (e.g., $\mathrm{comp}_M(w)$ is such) can only loop, or hang, or hit right into $f$.

2.1-II. Tables. Let $u \in \Sigma_e^*$ and assume $\mathrm{lcomp}_{M,s}(u)$ hits right into a state $p_u$. Motivated by [53], we define the table of $M$ on $u$ as the function $\mathrm{table}_M(u) := \tau : Q_\bot \to Q$ that satisfies $\tau(\bot) := p_u$ and, for all $q \in Q$,

$\tau(q) := \begin{cases} p & \text{if } \mathrm{rcomp}_{M,q}(u) \text{ hits right into } p, \\ p_u & \text{if } \mathrm{rcomp}_{M,q}(u) \text{ hits left, loops, or hangs.} \end{cases}$

We stress that the table is defined only when $\mathrm{lcomp}_{M,s}(u)$ hits right; in all other cases, no meaning is associated with the notation $\mathrm{table}_M(u)$. Note that, whenever the table of $M$ on $u$ is defined, it almost fully describes the behavior of the $1 + |Q|$ computations

$\mathrm{lcomp}_{M,s}(u)$ and $\mathrm{rcomp}_{M,q}(u)$, for all $q \in Q$,

on the rightmost boundary of $u$, in the sense that, whenever the boundary is indeed hit, $\tau$ returns the resulting last state. The only ambiguity arises when $\tau(q) = p_u$, for some $q \in Q$: then we do not know if this is because the corresponding computation $\mathrm{rcomp}_{M,q}(u)$ misses the rightmost boundary, or because it hits it but it does so into $p_u$. If we allowed $\tau$ to take values in $Q_\bot$ (as opposed to just $Q$), we could easily remove this ambiguity, at the same time making our representation identical to that of [53]. But we will not do so. Our ultimate goal is the construction of a 1dfa that simulates $M$ and, as we shall prove, this slightly ambiguous representation contains exactly the amount of information required for this purpose.

3. Remark. Note that, according to our conventions (Remark 1), the table on the empty string $\mathrm{table}_M(\epsilon)$ is defined, and it equals the constant function $s$ (i.e., the function that maps every element of $Q_\bot$ to the start state of $M$). Similarly, according to our conventions for end-marker violation (Remark 2), whenever the table $\mathrm{table}_M(u)$ on an end-marked $u$ is defined, it necessarily equals the constant function $f$ (i.e., the function that maps every element of $Q_\bot$ to the final state of $M$).

2.1-III. Frontiers. Fix some computation $c = ((q_t, i_t))_{0 \leq t \leq m}$ of $M$ and consider the $i$-th boundary of the string being read (Figure 2c). The computation crosses this boundary 0 or more times, each crossing being either in the left-to-right or in the right-to-left direction. Collect into set $R_i^c$ all states that result from a rightward crossing and do the same for the leftward crossings to get set $L_i^c$:

$R_i^c := \{q_{t+1} \mid 0 \leq t < m \text{ and } i_t = i \text{ and } i_{t+1} = i + 1\}$,
$L_i^c := \{q_{t+1} \mid 0 \leq t < m \text{ and } i_t = i + 1 \text{ and } i_{t+1} = i\}$,


also making the special provision that $R^c_{i_0 - 1}$ necessarily contains $q_0$.⁶ The pair $(L_i^c, R_i^c)$ partially describes the behavior of $c$ over the $i$-th boundary and we call it the $i$-th frontier of $c$. Note that the description is indeed partial, as the pair contains no information about the order in which $c$ exhibits the states around the $i$-th boundary, and says nothing about the number of times each individual state is exhibited. For a full description we would need instead the $i$-th crossing sequence of $c$ (cf. [21]). However, in certain interesting cases, the extra information of the complete description is redundant. In particular, if we only care to decide reachability between two points via cycle-free computations, then the computations’ frontiers contain exactly the amount of information that we need. We will prove and use this in Section 5.

2.2. Nondeterministic, one-way, and single-pass variations. If in the definition of a 2dfa $M = (s, \delta, f)$ more than one next move is allowed at each step, we say the automaton is nondeterministic (2nfa). This formally means that $\delta$ totally maps $Q \times \Sigma_e$ to the powerset of $Q \times \{l, r\}$ and implies that $C := \mathrm{comp}_{M,q,i}(u)$ is now a set of computations. If then $P := \{p \mid \text{some } c \in C \text{ hits right into } p\}$, we say $C$ hits right into $P$. Note that, if $u$ is end-marked, then $P$ is either $\emptyset$ or $\{f\}$ (cf. Remark 2). An input $w \in \Sigma^*$ is considered to be accepted iff the set of computations $\mathrm{comp}_M(w) = \mathrm{lcomp}_{M,s}(w_e)$ hits right into $\{f\}$.

If the head of $M$ never moves to the left, we say $M$ is one-way (a 1nfa; or a 1dfa, if $M$ is deterministic).⁷ If no computation of $M$ ‘continues after reaching an end-marker’, we say $M$ is single-pass (a znfa; or a zdfa).

2.2-I. Tables of a 2nfa. If $M$ is a 2nfa and $u \in \Sigma_e^*$ is any string, we can define the table of $M$ on $u$ similarly to what we did for 2dfas in Section 2.1-II. In particular, the table is defined only if the set of computations $\mathrm{lcomp}_{M,s}(u)$ hits right into some $P_u \neq \emptyset$, and is then the function $\mathrm{table}_M(u) := T : Q_\bot \to \mathcal{P}'(Q)$ that satisfies $T(\bot) := P_u$ and, for all $q \in Q$,

$T(q) := \begin{cases} P \setminus P_u & \text{if } \mathrm{rcomp}_{M,q}(u) \text{ hits right into some } P \not\subseteq P_u, \\ P_u & \text{if } \mathrm{rcomp}_{M,q}(u) \text{ hits right into some } P \subseteq P_u. \end{cases}$

⁶ This reflects the convention that the starting state of any computation is considered to be the result of an ‘invisible’ left-to-right step.

⁷ Note that, under this definition, a one-way finite automaton works on an end-marked input, a deviation from the standard definition. However, it is easy to verify that an automaton that follows either definition can be converted into an equivalent automaton that follows the other definition and has the same set of states. So, all our conclusions concerning the numbers of states in different automata will be valid irrespective of which definition we have in mind.


Note that the definition is consistent with the one for the deterministic case.⁸ Moreover, it suffers the same ambiguities: whenever $T(q) = P_u$ we do not know if this is because all computations in $\mathrm{rcomp}_{M,q}(u)$ miss the right boundary or because some of them hit it but do so only into states that are already in $P_u$.

⁸ If the 2nfa $M$ is actually deterministic and $T$, $\tau$ are its tables on $u$ as described by the definitions for 2nfas and 2dfas respectively, then $T$ is defined iff $\tau$ is. Moreover, when both tables are defined, we have $T(r) = \{\tau(r)\}$ for all $r \in Q_\bot$.

4. Remark. Also note an analogue of Remark 3: If $u$ is the empty string, then $T$ is defined and equal to the constant function $\{s\}$. If $u$ is end-marked, then $T$ is either undefined or equal to the constant function $\{f\}$.

2.3. Problems. Given any alphabet $\Sigma$, a (promise) problem over $\Sigma$ is any pair $\Pi = (\Pi_{yes}, \Pi_{no})$ of disjoint subsets of $\Sigma^*$. An automaton solves $\Pi$ iff it accepts every $w \in \Pi_{yes}$ but no $w \in \Pi_{no}$. Note that the behavior of an automaton on strings outside $\Pi_{yes} + \Pi_{no}$ does not affect whether it solves $\Pi$ or not. When there are no such strings, namely when $\Pi_{yes} + \Pi_{no} = \Sigma^*$, the problem is also called a language and is adequately described by $\Pi_{yes}$ alone (since then $\Pi_{no} = \overline{\Pi_{yes}}$).

We will be interested in problems over the alphabet that contains pairs of the form $(x, d)$ or $(G, d)$, where $x$ is a number in $[n]$, $G$ is a binary relation on $[n]$, and $d$ is a direction tag from $\{l, r\}$. In other words, our alphabet is

$\Gamma := ([n] + \mathcal{P}([n] \times [n])) \times \{l, r\}$.

However, among all strings over $\Gamma$, we will only care about those that have length 4 and happen to follow the specific pattern

$(x, l)(G, l)(h, r)(y, r)$    (1)

where $x$ and $y$ are two numbers in $[n]$, $G$ is a binary relation on $[n]$, and $h$ is a partial function from $[n]$ to $[n]$ which is not defined on $y$. (Note that a partial function is a special case of a binary relation. So, these symbols do exist in $\Gamma$.) We call these strings nice inputs.

Intuitively, given a nice input as above, we think of the two-column graph of Figure 3a, where the columns are two copies of $[n]$, the arrows between the columns are determined by $G$ (left-to-right) and $h$ (right-to-left), and the two special nodes are determined by $x$ (entry point) and $y$ (exit point). On this graph, a path from the entry point to the exit point may or may not exist; if it does, we just say that ‘the (graph of the) input has a path’. For example, the nice input of Figure 3a has a path, but the nice input of Figure 3b does not have a path.

Figure 3. (a) A nice input that has a path. (b) A nice input with no path. (c) A deterministic nice input that has a path.

What makes nice inputs interesting is that a 2nfa can decide whether such an input has a path or not using only $n$ states and a single pass over it. More precisely, consider the promise problem $\Phi = (\Phi_{yes}, \Phi_{no})$ with

$\Phi_{yes} := \{w \in \Gamma^* \mid w \text{ is a nice input that has a path}\}$,
$\Phi_{no} := \{w \in \Gamma^* \mid w \text{ is a nice input that has no path}\}$.

Then $\Phi$ can be solved by a znfa $N_0$ that has $[n]$ as its set of states and implements the following natural algorithm:

    On a nice input like (1) between end-markers, we use the first 2 steps to reach $(G, l)$ at state $x$. Then we repeatedly and alternately read the two middle symbols, each time selecting nondeterministically and following one of the arrows defined by $G$ (if any) or following the (at most one) arrow defined by $h$. If we ever reach $(h, r)$ at a state $z$ from which no $h$-arrow departs, we stay at $z$ and move right to check whether $z = y$. If so, we move 2 more steps to the right and accept.

Formally, $N_0 := (1, \delta, 1)$, where $\delta$ is any total function from $[n] \times \Gamma_e$ to the powerset of $[n] \times \{l, r\}$ that satisfies the following equations:

$\delta(1, \vdash) = \{(1, r)\}$,
$\delta(1, (x, l)) = \{(x, r)\}$,
$\delta(1, \dashv) = \{(1, r)\}$,
$\delta(z, (G, l)) = \{(z', r) \mid (z, z') \in G\}$,
$\delta(z, (h, r)) = $ if $h(z)$ is defined then $\{(h(z), l)\}$ else $\{(z, r)\}$,
$\delta(z, (y, r)) = $ if $z = y$ then $\{(1, r)\}$ else $\emptyset$.

Recall that the behavior of $N_0$ on inputs that are not nice is irrelevant.

We will also be interested in the special case of inputs of the form (1) where, like $h$, the relation $G$ is also a partial function (Figure 3c). It is easy to verify that on such inputs $N_0$ does not use its nondeterminism, so we refer to strings of this form as deterministic nice inputs and we use $g$ in place of $G$ to represent them:

$(x, l)(g, l)(h, r)(y, r)$.    (2)
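Whether a nice input ‘has a path’ is an ordinary reachability question on the two-column graph. The Python sketch below decides it with a breadth-first search; it is a deterministic stand-in that returns the same answer as the nondeterministic search of $N_0$, not an implementation of the automaton itself. The representation (a set of pairs for $G$, a dictionary for the partial function $h$) and the name `has_path` are assumptions made for illustration.

```python
from collections import deque


def has_path(x, G, h, y):
    """Decide whether the nice input (x,l)(G,l)(h,r)(y,r) has a path.

    G: set of pairs (z, z2), an arrow from left node z to right node z2.
    h: dict, a partial function taking a right node back to a left node;
       by the definition of nice inputs it must be undefined on y.
    """
    if y in h:
        raise ValueError("not a nice input: h is defined on y")
    seen_left = {x}                    # left-column nodes reached so far
    queue = deque([x])
    while queue:
        z = queue.popleft()
        for (a, b) in G:
            if a != z:
                continue
            if b == y:                 # reached the exit point in the right column
                return True
            back = h.get(b)            # follow the (at most one) h-arrow, if any
            if back is not None and back not in seen_left:
                seen_left.add(back)
                queue.append(back)
    return False


if __name__ == "__main__":
    print(has_path(1, {(1, 2), (3, 3)}, {2: 3}, 3))
    print(has_path(1, {(1, 2)}, {2: 1}, 3))
```

The search only ever stores left-column nodes, which mirrors the fact that $N_0$ needs no more than the $n$ states in $[n]$.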


Not surprisingly, the promise problem $\Psi = (\Psi_{yes}, \Psi_{no})$ with

$\Psi_{yes} := \{w \in \Gamma^* \mid w \text{ is a deterministic nice input that has a path}\}$,
$\Psi_{no} := \{w \in \Gamma^* \mid w \text{ is a deterministic nice input that has no path}\}$,

can be solved by a zdfa $M_0$ with state set $[n]$, executing the following straightforward modification of the previous algorithm:

    On a deterministic nice input like (2) surrounded by end-markers, we use the first 2 steps to reach $(g, l)$ at state $x$. We then repeatedly and alternately read the two middle symbols, each time following the arrow (if any) defined by $g$ or $h$. If we ever reach $(h, r)$ at a state $z$ from which no $h$-arrow departs, we stay at $z$ and move right to check whether $z = y$. If so, we move 2 more steps to the right and accept.

Formally, $M_0 := (1, \delta, 1)$, where $\delta$ is any partial function from $[n] \times \Gamma_e$ to $[n] \times \{l, r\}$ that satisfies the following equations:

$\delta(1, \vdash) = (1, r)$,
$\delta(1, (x, l)) = (x, r)$,
$\delta(1, \dashv) = (1, r)$,
$\delta(z, (g, l)) = $ if $g(z)$ is defined then $(g(z), r)$ else ‘undefined’,
$\delta(z, (h, r)) = $ if $h(z)$ is defined then $(h(z), l)$ else $(z, r)$,
$\delta(z, (y, r)) = $ if $z = y$ then $(1, r)$ else ‘undefined’.
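The equations for $M_0$ translate almost literally into code. The Python sketch below runs such a machine on a deterministic nice input. Several choices are assumptions made for illustration and are not fixed by the text: symbols are encoded as `(value, tag)` pairs with partial functions as dictionaries, the transitions that the equations leave unspecified (for instance on $\vdash$ in a state other than 1) are taken to be undefined, and looping is cut off by a step bound that exceeds the number of configurations.

```python
LEFT_END, RIGHT_END = "⊢", "⊣"


def m0_delta(z, symbol):
    """One step of M0: returns (next_state, direction) or None for 'undefined'."""
    if symbol in (LEFT_END, RIGHT_END):
        return (1, "r") if z == 1 else None       # assumption: undefined when z != 1
    value, tag = symbol
    if tag == "l" and isinstance(value, int):     # symbol (x, l)
        return (value, "r") if z == 1 else None   # assumption: undefined when z != 1
    if tag == "l":                                # symbol (g, l), g a dict
        return (value[z], "r") if z in value else None
    if isinstance(value, dict):                   # symbol (h, r), h a dict
        return (value[z], "l") if z in value else (z, "r")
    return (1, "r") if z == value else None       # symbol (y, r)


def m0_accepts(n, x, g, h, y):
    """Run M0 on ⊢(x,l)(g,l)(h,r)(y,r)⊣ with states 1..n."""
    tape = [LEFT_END, (x, "l"), (g, "l"), (h, "r"), (y, "r"), RIGHT_END]
    z, pos = 1, 0
    # More iterations than there are (state, position) pairs would mean a loop.
    for _ in range(n * len(tape) + 1):
        if pos == len(tape):
            return z == 1              # moved past ⊣: accept iff in the final state 1
        move = m0_delta(z, tape[pos])
        if move is None:
            return False               # the computation hangs
        z, d = move
        pos += 1 if d == "r" else -1
    return False                       # the computation loops


if __name__ == "__main__":
    print(m0_accepts(3, 1, {1: 2, 3: 3}, {2: 3}, 3))
    print(m0_accepts(3, 1, {1: 2}, {2: 1}, 3))
```

In the first call the head zigzags once between the two middle symbols before leaving through $(y, r)$; in the second, $g$ and $h$ form a cycle and the machine never leaves the middle of the tape.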

Again, only the behavior of $M_0$ on deterministic nice inputs is important.

The lower bounds that we will be proving in the following sections will be based on variants of problems $\Phi$ and $\Psi$. Perhaps the reader has already recognized in them two restrictions of $C_n$, the 2n-complete language of [48]. At the same time, $\Phi$ is a large-alphabet variant of a problem used in [3].

3. From 2DFAs to 1DFAs

Fix an $n$-state 2dfa $M = (s, \delta, f)$ over some set of states $Q$ and an alphabet $\Sigma$. In this section we will build a 1dfa that is equivalent to $M$. We begin with some observations and facts.

3.1. Tables. Consider some non-empty string $u$ and suppose that the table of $M$ on it, $\tau := \mathrm{table}_M(u)$, is defined. We then know that the computation $c := \mathrm{lcomp}_{M,s}(u)$ hits right into $\tau(\bot)$. This implies that $c$ visits the rightmost symbol of $u$ at least once. If $q$ is the state of $M$ during the latest such visit, we easily see that the computation $c' := \mathrm{rcomp}_{M,q}(u)$ is a suffix of $c$. Hence, it also hits right, like $c$. Moreover, it certainly hits right into the same state as $c$, meaning $\tau(q) = \tau(\bot)$. We thus conclude that $\tau$ assigns to $\bot$ one of the values that it uses for the states in $Q$. That is, $\tau(\bot) \in \tau[Q]$. This motivates the following.

5. Definition. A table of $M$ is any $\tau : Q_\bot \to Q$ such that $\tau(\bot) \in \tau[Q]$.
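Definition 5 is simple enough to explore by brute force. The Python sketch below enumerates all functions from $Q_\bot$ to $Q$ for $Q = [n]$, keeps those that satisfy $\tau(\bot) \in \tau[Q]$, and compares the count with the closed form $n(n^n - (n-1)^n)$ that Lemma 7 states. The function names are illustrative, and the enumeration is only practical for very small $n$.

```python
from itertools import product


def count_tables(n: int) -> int:
    """Count functions tau: Q_bot -> Q with tau(bot) in tau[Q], for Q = {1..n}.

    A function is encoded as an (n+1)-tuple: position 0 holds tau(bot),
    positions 1..n hold tau(1), ..., tau(n).
    """
    states = range(1, n + 1)
    return sum(
        1
        for values in product(states, repeat=n + 1)
        if values[0] in values[1:]
    )


def closed_form(n: int) -> int:
    """The count n(n^n - (n-1)^n) stated in Lemma 7."""
    return n * (n**n - (n - 1) ** n)


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, count_tables(n), closed_form(n))
```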


Figure 4. Trying to compute $\tau'(\bot)$: (a) $c$ hangs right away, (b) $c$ hits right in the first step, and (c) $c$ moves left in the first step.

Note that this defines what a “table of $M$” is, whereas Section 2.1-II defined what the “table of $M$ on $u$” is, for any string $u$. The next lemma shows the relation between these two notions. The lemma after it carries out an easy counting argument.

6. Lemma. If the table of $M$ on $u$ is defined, then it is a table of $M$.

Proof. If $u \neq \epsilon$, the proof is the argument before Definition 5. If $u = \epsilon$, the table of $M$ on $u$ is the constant function $s$ (cf. Remark 3), and therefore it obviously qualifies as a table of $M$.

7. Lemma. The number of distinct tables of $M$ is exactly $n(n^n - (n-1)^n)$.

Proof. Easily, the number of distinct tables of $M$ is exactly equal to the number of $(n+1)$-tuples of elements of $[n]$ where the first component equals some other component. Since there is a total of $n^{n+1}$ unrestricted tuples and exactly $n(n-1)^n$ of them violate the restriction about the first component, our number is the difference $n^{n+1} - n(n-1)^n$, as claimed.

3.2. Compatibilities among tables. Consider any string $u$, any symbol $a$, and suppose that the table $\tau := \mathrm{table}_M(u)$ is defined. We would like to know whether the table $\tau' := \mathrm{table}_M(ua)$ is also defined and, if so, to compute it. In this section we will show how this can be done using only $\tau$ and $a$, but not $u$. Our algorithm will be based on the algorithm implied in [53], but it will also need some modifications to account for the ambiguity of our representation.

Recall that $\tau'$ is defined iff the computation $\mathrm{lcomp}_{M,s}(ua)$ hits right (Figure 4). Clearly, this computation ends in $c := \mathrm{rcomp}_{M,\tau(\bot)}(ua)$. So, in order to figure out whether $\tau'$ is defined, we can just check whether $c$ hits right. If it does, our check will also reveal the last state of $c$, which we know is the value of $\tau'$ on $\bot$.

(left)
    p ← τ(⊥)
    repeat:
        if δ(p, a) undefined: fail
        (r, d) ← δ(p, a)
        if d = r: return r
        if τ(r) = τ(⊥): fail
        if τ(r) seen before: fail
        p ← τ(r)

(right)
    p ← q
    repeat:
        if δ(p, a) undefined: return τ′(⊥)
        (r, d) ← δ(p, a)
        if d = r: return r
        if τ(r) = τ(⊥): return τ′(⊥)
        if τ(r) seen before: return τ′(⊥)
        p ← τ(r)

Figure 5. Computing $e_{\tau,a}(\bot)$ (left) and $e_{\tau,a}(q)$ (right).

So, let us set $p := \tau(\bot)$ and consider the first step of $c$. If $\delta(p, a)$ is undefined (Figure 4a), then $c$ hangs inside $ua$ and $\tau'$ is undefined. If $\delta(p, a)$ is defined and equal to some $(r, r)$ (Figure 4b), then $c$ immediately hits right into $r$, so $\tau'$ is defined and $\tau'(\bot) = r$. The last case (Figure 4c) is that $\delta(p, a)$ equals some $(r, l)$, so then $c$ starts behaving as $d := \mathrm{rcomp}_{M,r}(u)$ and we know this behavior is already encoded in $\tau$ as the value $p^* := \tau(r)$. We distinguish two cases.
• If $p^* = \tau(\bot)$, then we can conclude that $\tau'$ is undefined, as we are in one of the following two cases: either $d$ hits left, loops, or hangs inside $u$, therefore $c$ does the same inside $ua$, and hence $\tau'$ is not defined; or $d$ hits right into $\tau(\bot)$, therefore $c$ is back on $a$ and at state $p$ again, so $c$ loops inside $ua$ and hence $\tau'$ is again undefined.
• If $p^* \neq \tau(\bot)$, we know $d$ hits right into $p^*$, so that $c$ is back on $a$. To find out what happens next, we simply ask $\delta$ exactly as before, and continue. However, we should be careful not to ask $\delta$ a question we have already asked. If this is about to happen, then $c$ repeats $p^*$ under $a$, so $c$ actually loops inside $ua$, in which case $\tau'$ is undefined.

This concludes our description of how to check whether $\tau'$ is defined and, if so, also compute $\tau'(\bot)$. Equivalently, we have shown that the algorithm of Figure 5(left) always terminates and either fails, if $\tau'$ is undefined, or correctly returns the value $\tau'(\bot)$, if $\tau'$ is defined.

In the case that $\tau'$ is defined, we also want to compute the rest of its values, namely $\tau'(q)$ for all $q \in Q$. Given the discussion so far, this is easy: we simply run the algorithm of Figure 5(left) again, but starting with $p := q$ (as opposed to $p := \tau(\bot)$). It is easy to verify that, if the algorithm does not fail, then $\mathrm{rcomp}_{M,q}(ua)$ hits right exactly into the state that the algorithm returns. So, if the algorithm does return a value, this value is the correct $\tau'(q)$. On the other hand, if the algorithm fails, this is due to one of its fail statements. We distinguish cases.
• If this is due to the 1st or the 3rd fail statement: Then we know that $\mathrm{rcomp}_{M,q}(ua)$ hangs on $a$ or loops inside $ua$. Therefore, by definition, $\tau'(q)$ equals $\tau'(\bot)$. So, instead of failing, the algorithm should have returned $\tau'(\bot)$.
• If this is due to the 2nd fail statement: Then we know that one of the following is true about the computation $\mathrm{rcomp}_{M,q}(ua)$:
    • at some point, the computation locks itself inside $u$ and eventually hits left, hangs, or loops: Then we again know that, by definition, $\tau'(q) = \tau'(\bot)$. So, instead of failing, the algorithm should have returned $\tau'(\bot)$.
    • at some point, the computation really enters state $\tau(\bot)$ while on $a$: Then we know that, from that point on, the computation will behave identically to $\mathrm{rcomp}_{M,\tau(\bot)}(ua)$ and hence it will eventually hit right into $\tau'(\bot)$. So, once again, the algorithm should have returned $\tau'(\bot)$.

In conclusion, we see that in all cases of failure the algorithm should have returned $\tau'(\bot)$. Hence, if in Figure 5(left) we just replace every fail statement with the statement “return $\tau'(\bot)$”, we have a correct algorithm for computing $\tau'(q)$, provided, of course, that $\tau'(\bot)$ is defined and available. Figure 5(right) shows the algorithm after these modifications.

Overall, we have described an algorithm $e_{\tau,a}$ that can be used for checking if $\tau'$ is defined and, if so, for computing its values. The algorithm can be run on any element of $Q_\bot$. Figure 5(left) shows the computation $e_{\tau,a}(\bot)$. Figure 5(right) shows the computation $e_{\tau,a}(q)$, for $q \in Q$, where every reference to $\tau'(\bot)$ can be understood as a call to $e_{\tau,a}(\bot)$. With $e_{\tau,a}$ in our vocabulary, we can summarize the discussion of this section into the next definition and lemma.

8. Definition. If $\tau$ and $\tau'$ are two tables of $M$ and $a$ some symbol in $\Sigma_e$, we say that $\tau$ is $a$-compatible to $\tau'$ if and only if

$\tau'(\bot) = e_{\tau,a}(\bot)$ and, for all $q \in Q$: $\tau'(q) = e_{\tau,a}(q)$,

where $e_{\tau,a}$ is the algorithm from Figure 5.

9. Lemma. Suppose $u \in \Sigma_e^*$ and the table $\tau := \mathrm{table}_M(u)$ is defined. Then, for any $a \in \Sigma_e$ and any table $\tau'$ of $M$, the following holds:

$\tau$ is $a$-compatible to $\tau'$ $\iff$ $\mathrm{table}_M(ua)$ is defined and equals $\tau'$.

Proof. Suppose $\tau$ is $a$-compatible to $\tau'$. Then $\tau'(\bot) = e_{\tau,a}(\bot)$. Hence, on input $\bot$, the algorithm $e_{\tau,a}$ does not fail. This implies that the table $\mathrm{table}_M(ua)$ is defined. Moreover, its values are exactly those returned by $e_{\tau,a}$. But the values of $\tau'$ are also the same as those returned by $e_{\tau,a}$ (because $\tau$ is $a$-compatible to it). Overall, $\mathrm{table}_M(ua) = \tau'$.

Conversely, assume that the table $\mathrm{table}_M(ua)$ is defined and equals $\tau'$. The argument before Definition 8 proves that $e_{\tau,a}$ returns the same values as $\mathrm{table}_M(ua)$. Hence, $e_{\tau,a}$ returns the same values as $\tau'$. So, $\tau$ is $a$-compatible to $\tau'$, as required.
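The procedure $e_{\tau,a}$ of Figure 5 can be written down directly. The Python sketch below is one possible rendering under stated assumptions: tables are dictionaries keyed by the states and by `None` (standing for $\bot$), $\delta$ is a dictionary whose missing keys mean ‘undefined’, ‘fail’ is signalled by returning `None`, and the names `e`, `next_table` and the example machine are hypothetical. One small deviation from the figure: instead of remembering the values $\tau(r)$ seen so far, the sketch remembers every state $p$ for which $\delta(p, a)$ has been consulted, including the initial one. This returns the same result, at most one iteration earlier.

```python
BOT = None  # stands for the special object ⊥


def e(tau, delta, a, x=BOT):
    """e_{tau,a}(x) for x = BOT (Figure 5, left) or x a state (Figure 5, right).

    Returns a state, or None when the left algorithm fails.
    """
    if x is BOT:
        fallback, p = None, tau[BOT]               # 'fail' is reported as None
    else:
        fallback, p = e(tau, delta, a, BOT), x     # 'return tau'(⊥)'
    asked = {p}                                    # states already consulted under a
    while True:
        move = delta.get((p, a))
        if move is None:                           # delta(p, a) undefined
            return fallback
        r, d = move
        if d == "r":                               # the computation hits right
            return r
        p_star = tau[r]
        if p_star == tau[BOT] or p_star in asked:  # locked inside u, or looping
            return fallback
        asked.add(p_star)
        p = p_star


def next_table(tau, delta, a, states):
    """The unique table to which tau is a-compatible, or None if there is none."""
    first = e(tau, delta, a, BOT)
    if first is None:
        return None
    table = {BOT: first}
    for q in states:
        table[q] = e(tau, delta, a, q)
    return table


# Hypothetical example machine: accepts the strings over {a, b} that end in 'a'.
STATES = ["s", "c", "f"]
DELTA = {
    ("s", "⊢"): ("s", "r"), ("s", "a"): ("s", "r"), ("s", "b"): ("s", "r"),
    ("s", "⊣"): ("c", "l"), ("c", "a"): ("f", "r"), ("f", "⊣"): ("f", "r"),
}

if __name__ == "__main__":
    tau = dict.fromkeys([BOT, *STATES], "s")       # table on the empty string
    for symbol in ["⊢", "a", "⊣"]:
        tau = next_table(tau, DELTA, symbol, STATES)
        print(symbol, tau)
```

On the end-marked input $\vdash a \dashv$ the printed tables end with the constant function `f`, in line with Remark 3.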


3.3. The upper bound. We are now ready to build a 1dfa $M'$ that simulates $M$ with exactly $n\bigl(n^n - (n-1)^n\bigr)$ states. First, we characterize the acceptance of an input by $M$ in terms of the tables of $M$ and the compatibilities between them.

10. Definition. Suppose $w \in \Sigma^*$ is of length $l$ and $\tau_0, \tau_1, \dots, \tau_{l+2}$ is a sequence of tables of $M$. We say the sequence fits $w$ iff:
1. $\tau_0$ is the constant function $s$,
2. for all $i = 0, 1, \dots, l+1$: $\tau_i$ is $w_{e,i+1}$-compatible to $\tau_{i+1}$,⁹
3. $\tau_{l+2}$ is the constant function $f$.

11. Theorem. $M$ accepts $w$ iff some sequence of tables of $M$ fits $w$.

Proof. Fix $w$ and consider the sequence

\[ \mathrm{table}_M(\epsilon),\ \mathrm{table}_M(\vdash),\ \mathrm{table}_M(\vdash w_1),\ \dots,\ \mathrm{table}_M(\vdash w \dashv). \tag{3} \]

Clearly, there is no guarantee that all members in this sequence are defined. But if they all are, then the sequence trivially satisfies Conditions 1 and 3 of Definition 10 (cf. Remark 3) and also Condition 2 (by Lemma 9). Hence, if all tables in the sequence are defined, the sequence fits $w$.

Now assume $M$ accepts $w$. Then the computation $\mathrm{lcomp}_{M,s}(w_e)$ hits right into $f$. This immediately implies that each one of the tables in (3) is defined. So, their sequence fits $w$.

Conversely, suppose some sequence of tables of $M$ fits $w$. By Lemma 9 and an easy induction, this sequence must be identical to the one in (3). This implies that the table $\mathrm{table}_M(w_e)$ is defined and equal to the constant function $f$. Hence, the computation $\mathrm{lcomp}_{M,s}(w_e)$ hits right into $f$. This means that $M$ accepts $w$.

Based on this theorem, the construction of $M'$ is straightforward. To test if its input is accepted by $M$, the automaton checks if there is a sequence of tables of $M$ that fits it. At every step, it 'remembers' the last table of the subsequence found so far. More carefully, the algorithm is as follows:

We start with the constant table $s$ in our memory. On reading a symbol $a$, we check if the table in our memory is $a$-compatible to any table of $M$. If so, there is exactly one such table, so we change our memory to it and move right. If not, there is no sequence that fits $w$ and we hang. We accept if we ever reach the constant table $f$.

Formally, $M' := (s', \delta', f')$ where $Q' := \{\tau \mid \tau \text{ is a table of } M\}$, $s'$ := the table that always returns $s$, and $f'$ := the table that always returns $f$. For each $\tau$ and $a$, the value $\delta'(\tau, a)$ is either the unique $\tau'$ to which $\tau$ is $a$-compatible, if such a $\tau'$ exists; or undefined, if $\tau$ is $a$-compatible to none of the tables of $M$. Clearly, $M'$ is correct and as large as claimed.

⁹ Recall that by $w_{e,i+1}$ we mean the $(i+1)$-st symbol of the end-marked string $\vdash w \dashv$. This is either $\vdash$ (when $i = 0$), or $w_i$ (when $i \ne 0, l+1$), or $\dashv$ (when $i = l+1$).
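The size claim can be sanity-checked for small $n$ by brute force. The sketch below is only illustrative and rests on an assumption: the definition of a table of $M$ is not repeated in this section, so we take a table to be any map $\tau : [n]_\bot \to [n]$ whose value on $\bot$ is also taken on some state (this is what makes $x_\tau$ well defined later on). The names `count_tables` and `claimed_size` are ours.

```python
from itertools import product

def count_tables(n):
    """Count maps tau: {bot, 1..n} -> {1..n} whose value on bot is also
    taken on some state (ASSUMED definition of a table of M)."""
    states = range(1, n + 1)
    total = 0
    for bot_value in states:                      # candidate value tau(bot)
        for values in product(states, repeat=n):  # tau(1), ..., tau(n)
            if bot_value in values:
                total += 1
    return total

def claimed_size(n):
    """The number of states n(n^n - (n-1)^n) claimed for M'."""
    return n * (n**n - (n - 1)**n)

for n in range(1, 5):
    print(n, count_tables(n), claimed_size(n))
```

Under this assumption the two columns printed by the loop agree, since there are $n$ choices for $\tau(\bot)$ and $n^n - (n-1)^n$ functions on $[n]$ that take that value at least once.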


Figure 6. The input $w_\tau^i$ when $n = 6$, table $\tau$ maps $\bot, 1, 2, 3, 4, 5, 6$ to the values $3, 1, 3, 6, 3, 5, 6$ respectively, and (a) $i = 6$, or (b) $i = 4$.

3.4. The lower bound. We will now prove that the construction of the previous section is optimal. In other words, we will prove that some $n$-state 2dfas have no equivalent 1dfas with fewer than $n\bigl(n^n - (n-1)^n\bigr)$ states. To this end, we will actually exhibit such a 2dfa. Our witness will be the automaton $M_0$ from Section 2.3, solving problem $\Psi$. So, for the remainder of this section, we assume that the $n$-state 2dfa $M$ that we kept fixed from the beginning of Section 3 is actually $M_0$. We will prove that the automaton $M'$ constructed in the previous section is smallest among the 1dfas that are equivalent to $M$. In fact, we will show that $M'$ needs all its states not only for staying equivalent to $M$, but even for solving $\Psi$, and on a stricter promise at that.

More precisely, consider any table $\tau : [n]_\bot \to [n]$ of $M$ and any 'query' $i \in [n]$. The pair $(\tau, i)$ gives rise to the deterministic nice input

\[ w_\tau^i := (x_\tau, \mathrm{l})(g_\tau, \mathrm{l})(h_\tau^i, \mathrm{r})(y_\tau^i, \mathrm{r}), \]

where $x_\tau$ is the smallest $x$ for which $\tau(x) = \tau(\bot)$; $g_\tau$ is the restriction of $\tau$ to $[n]$; $h_\tau^i$ is the single arrow from $\tau(\bot)$ to $i$, if $\tau(i) \ne \tau(\bot)$, or else the empty function; and $y_\tau^i$ is just $\tau(i)$ (see Figure 6 for examples):

\[
\begin{aligned}
x_\tau &:= \min\{x \mid \tau(x) = \tau(\bot)\}, & y_\tau^i &:= \tau(i), \\
g_\tau &:= \{(x, \tau(x)) \mid x \in [n]\}, & h_\tau^i &:= \begin{cases} \emptyset & \text{if } \tau(i) = \tau(\bot), \\ \{(\tau(\bot), i)\} & \text{otherwise.} \end{cases}
\end{aligned}
\tag{4}
\]

It is easy to verify the following.

12. Lemma. For any table $\tau$ and any state $i$ of $M$: The table of $M$ on the prefix $\vdash (x_\tau, \mathrm{l})(g_\tau, \mathrm{l})$ of $w_\tau^i$ is exactly $\tau$ and the computation $c$ of $M$ on $w_\tau^i$ is accepting. Moreover, if $\tau(i) \ne \tau(\bot)$, then $c$ contains a left-to-right crossing of the middle boundary from $i$ to $\tau(i)$.

Hence, the inputs in the set $\{w_\tau^i \mid \tau \text{ is a table and } i \text{ a state of } M\}$ push $M'$ to its limits, in the sense that they collectively force every one of its states to be used in every interesting way in some accepting computation. Hence, no state of $M'$ is redundant in some straightforward manner, which intuitively suggests $M'$ is minimal. We will turn this intuition into a proof. We start with the observation that, for every two distinct tables of $M$, there is a 'query' that can distinguish them.

13. Lemma. Two tables $\tau$ and $\tau'$ of $M$ are distinct iff there exist a partial function $h : [n] \to [n]$ and a $y \in [n]$ such that exactly one of the following inputs has a path:

\[ (x_\tau, \mathrm{l})(g_\tau, \mathrm{l})(h, \mathrm{r})(y, \mathrm{r}) \qquad \text{and} \qquad (x_{\tau'}, \mathrm{l})(g_{\tau'}, \mathrm{l})(h, \mathrm{r})(y, \mathrm{r}). \]

Proof. If the two tables are identical, then the two inputs are also identical and either none or both of them have a path. For the interesting direction, suppose $\tau$, $\tau'$ are distinct. We examine two cases.

If $\tau(\bot) \ne \tau'(\bot)$, then choose $h$ to be the empty function and $y$ the smaller of $\tau(\bot)$ and $\tau'(\bot)$. Then the input corresponding to the table from which $y$ takes its value has a path but the other input does not.

If $\tau(\bot) = \tau'(\bot) =: y^*$, then there exists an $x \in [n]$ such that $\tau(x) \ne \tau'(x)$ (or else $\tau$, $\tau'$ would be identical, a contradiction). Pick the smallest such $x$, choose $h$ to contain only the arrow from $y^*$ to $x$, and set $y$ to the smallest of $\tau(x)$, $\tau'(x)$ that is different from $y^*$. Clearly, there is a path in the input corresponding to the table from which $y$ takes its value, while the other input has no path.

Now, for every pair of tables $\tau$ and $\tau'$ of $M$, let us define the input

\[ w_{\tau,\tau'} := (x_\tau, \mathrm{l})(g_\tau, \mathrm{l})(h_{\tau,\tau'}, \mathrm{r})(y_{\tau,\tau'}, \mathrm{r}), \]

where $x_\tau$ and $g_\tau$ are given by (4), while $h_{\tau,\tau'}$ and $y_{\tau,\tau'}$ are either the values given by the proof of Lemma 13, if $\tau \ne \tau'$; or the values $h_\tau^i$, $y_\tau^i$ as defined in (4) for $i = x_\tau$, if $\tau = \tau'$.¹⁰ Strengthening the promise of $\Psi$ to allow only deterministic nice inputs of this form, we get the problem $\Psi'$ with

\[
\begin{aligned}
\Psi'_{\mathrm{yes}} &:= \{w_{\tau,\tau'} \mid \tau, \tau' \text{ are tables of } M \text{ and } w_{\tau,\tau'} \text{ has a path}\}, \\
\Psi'_{\mathrm{no}} &:= \{w_{\tau,\tau'} \mid \tau, \tau' \text{ are tables of } M \text{ and } w_{\tau,\tau'} \text{ has no path}\}.
\end{aligned}
\]
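The choice of the distinguishing query in the proof of Lemma 13 is fully constructive, and can be sketched as follows. This is only an illustration: the representation of a table as a Python dictionary, the use of `None` for $\bot$, and the name `distinguishing_query` are our assumptions. The sketch only selects $h$ and $y$; it does not decide whether an input has a path.

```python
BOT = None  # stands for the symbol ⊥ (an encoding chosen for this sketch)

def distinguishing_query(tau, tau2, n):
    """Return (h, y) as chosen in the proof of Lemma 13, or None if the two
    tables are identical. Tables are dicts from {BOT, 1..n} to {1..n};
    the partial function h is returned as a dict."""
    if tau == tau2:
        return None
    if tau[BOT] != tau2[BOT]:
        # First case: empty h, and y the smaller of the two ⊥-values.
        return {}, min(tau[BOT], tau2[BOT])
    # Second case: common ⊥-value y*, smallest x where the tables differ.
    y_star = tau[BOT]
    x = min(x for x in range(1, n + 1) if tau[x] != tau2[x])
    y = min(v for v in (tau[x], tau2[x]) if v != y_star)
    return {y_star: x}, y

# The table of Figure 6, and a variant that differs only on state 1.
tau = {BOT: 3, 1: 1, 2: 3, 3: 6, 4: 3, 5: 5, 6: 6}
tau2 = dict(tau)
tau2[1] = 3
print(distinguishing_query(tau, tau2, 6))
```

For the two tables above, the second case applies with $y^* = 3$ and $x = 1$, so $h$ is the single arrow from 3 to 1 and $y = 1$.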

Clearly, $M$ solves $\Psi'$, so that a single-pass 2dfa can solve this problem with only $n$ states. However, for 1dfas the problem is maximally hard.

14. Lemma. Every 1dfa for $\Psi'$ has at least $n\bigl(n^n - (n-1)^n\bigr)$ states.

Proof. Towards a contradiction, suppose that $A$ is a 1dfa that solves $\Psi'$ with fewer than $n\bigl(n^n - (n-1)^n\bigr)$ states. For every table $\tau$ of $M$, the automaton accepts $w_{\tau,\tau}$ (Lemma 12), so the computation $c_\tau := \mathrm{comp}_A(w_{\tau,\tau})$ hits right. In particular, it crosses the middle boundary from left to right. Let $q_\tau$ be the state that results from this crossing. Since there are fewer states in $A$ than tables of $M$, two tables $\tau \ne \tau'$ must map to the same state $q := q_\tau = q_{\tau'}$. As a consequence, the computations of $A$ on $w_{\tau,\tau'}$ and $w_{\tau',\tau}$ both cross the middle boundary into $q$ (since the two strings are identical to $w_{\tau,\tau}$ and $w_{\tau',\tau'}$ before this boundary, respectively) and therefore have the same suffix (since the two strings are identical to each other after that boundary). In particular, they are either both accepting or both rejecting, a contradiction to Lemma 13 and the definition of $w_{\tau,\tau'}$ and $w_{\tau',\tau}$.

¹⁰ In the first case, note that the order of the pair $\tau$, $\tau'$ is not important: $h_{\tau,\tau'} = h_{\tau',\tau}$ and $y_{\tau,\tau'} = y_{\tau',\tau}$. In the second case, note that $h_{\tau,\tau} = \emptyset$ and $y_{\tau,\tau} = \tau(\bot)$.

4. From 2NFAs to 1DFAs

Fix an $n$-state 2nfa $N = (s, \delta, f)$ over some set of states $Q$ and some alphabet $\Sigma$. In this section we will generalize the discussion of Section 3, in order to build a 1dfa equivalent to $N$.

4.1. Tables. Consider any non-empty string $u$ and suppose that the table $T := \mathrm{table}_N(u)$ is defined. This means that the set of computations $C := \mathrm{lcomp}_{N,s}(u)$ hits right into the set of states $T(\bot) \ne \emptyset$. Hence, $C$ contains right-hitting computations. Let $c$ be one of them. Clearly, $c$ visits the rightmost symbol of $u$ at least once. If $p$ is the state of $N$ at one of these visits, then combining the prefix of $c$ up to that visit with any of the (possibly 0) right-hitting computations in $\mathrm{rcomp}_{N,p}(u)$ produces a right-hitting computation which is also in $C$. Therefore, the computations of $\mathrm{rcomp}_{N,p}(u)$ can hit right only into states that are already in $T(\bot)$. By the definition of $T$, this implies that $T(p) = T(\bot)$. We thus conclude that $T$ assigns the value $T(\bot)$ to at least one state. Furthermore, a straightforward inspection of the definition of $T$ reveals that every state that is not assigned the set $T(\bot)$ is assigned a set disjoint from $T(\bot)$. This motivates the following definition.

15. Definition. A table of $N$ is any $T : Q_\bot \to \mathcal{P}'(Q)$ such that
1. for every $p \in Q$: $T(p) = T(\bot)$ or $T(p) \cap T(\bot) = \emptyset$,
2. for some $p \in Q$: $T(p) = T(\bot)$.
As in the deterministic case, note that this definition explains what a "table of $N$" is, whereas Section 2.2-I defines what the "table of $N$ on $u$" is, for any string $u$. The relation between the two notions is shown in the next lemma. The lemma after it carries out some counting.

16. Lemma. If the table of $N$ on $u$ is defined, then it is a table of $N$.

Proof. If $u \ne \epsilon$, then the argument before Definition 15 proves the claim. If $u = \epsilon$, then the table of $N$ on $u$ is the constant function $\{s\}$ (cf. Remark 4), and thus it is obviously a table of $N$.


17. Lemma. The number of distinct tables of $N$ is exactly¹¹

\[ \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \binom{n}{i} \binom{n}{j} \bigl(2^i - 1\bigr)^j. \]

Proof. The number of distinct tables for $N$ is equal to the number of distinct $(n+1)$-tuples of non-empty subsets of $[n]$ where the set of the first component appears in other components, too, but intersects no other set in the tuple. For each $i, j = 1, 2, \dots, n$, there are $\binom{n}{i}$ choices for the $i$-element set $S$ in the first component and $\binom{n}{j}$ choices for the set of the $j$ components after it that host the same set $S$. Given $i$ and $j$, each one of the remaining $(n+1) - (j+1)$ components can admit any of the $2^{n-i} - 1$ non-empty sets that avoid intersection with $S$. Overall, we have

\[ \sum_{i=1}^{n} \sum_{j=1}^{n} \binom{n}{i} \binom{n}{j} \bigl(2^{n-i} - 1\bigr)^{n-j} = \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \binom{n}{i} \binom{n}{j} \bigl(2^i - 1\bigr)^j \]

choices for completing this $(n+1)$-tuple, where the right-hand side of the equation is obtained via a straightforward variable substitution.

4.2. Compatibilities among tables. Consider any string $u$, any symbol $a$, and suppose the table $T := \mathrm{table}_N(u)$ is defined. We will describe an algorithm for deciding whether the table $T' := \mathrm{table}_N(ua)$ is defined and, if so, for computing it based on $T$ and $a$ but not $u$.

As in Section 3.2, our algorithm works on any element of $Q_\bot$. On input $\bot$, it either returns $T'(\bot)$ or fails, depending on whether $T'$ is defined or not. On input $q \in Q$ and with $T'(\bot)$ available, it returns $T'(q)$. We call this algorithm $E_{T,a}$ and we derive it from $e_{\tau,a}$ of Section 3.2 by a straightforward generalization that takes into account nondeterminism; see Figure 7 for the two distinct computations, $E_{T,a}(\bot)$ and $E_{T,a}(q)$. Based on $E_{T,a}$, we can again define $a$-compatibility between tables. The proof of the next lemma is similar to that of Lemma 9 and is omitted.

18. Definition. If $T$ and $T'$ are two tables of $N$ and $a$ some symbol in $\Sigma_e$, we say that $T$ is $a$-compatible to $T'$ if and only if

\[ T'(\bot) = E_{T,a}(\bot) \qquad \text{and} \qquad \text{for all } q \in Q\colon\ T'(q) = E_{T,a}(q), \]

where $E_{T,a}$ is the algorithm from Figure 7.

19. Lemma. Suppose $u \in \Sigma_e^*$ and the table $T := \mathrm{table}_N(u)$ is defined. Then, for any $a \in \Sigma_e$ and any table $T'$ of $N$, the following holds:

\[ T \text{ is } a\text{-compatible to } T' \iff \mathrm{table}_N(ua) \text{ is defined and equals } T'. \]

¹¹ For $i = j = 0$, this expression uses the quantity $0^0$. In this context, $0^0 = 1$.
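The count of Lemma 17 can be checked against Definition 15 by brute force for small $n$. The following sketch is only an illustration; the function names are ours, and we read $\mathcal{P}'(Q)$ as the set of non-empty subsets of $Q$, as the proof of Lemma 17 does. Conveniently, Python evaluates `0**0` to 1, in agreement with footnote 11.

```python
from itertools import combinations, product
from math import comb

def nonempty_subsets(n):
    """All non-empty subsets of [n] = {1, ..., n}."""
    items = range(1, n + 1)
    return [frozenset(c) for k in range(1, n + 1)
            for c in combinations(items, k)]

def count_tables(n):
    """Count the tables of Definition 15 for Q = [n] by enumeration."""
    subsets = nonempty_subsets(n)
    total = 0
    for bot in subsets:  # candidate value T(⊥)
        # Condition 1: each T(p) equals T(⊥) or is disjoint from it.
        allowed = [s for s in subsets if s == bot or not (s & bot)]
        # Condition 2: at least one T(p) equals T(⊥).
        total += sum(1 for values in product(allowed, repeat=n)
                     if bot in values)
    return total

def lemma17_formula(n):
    """The double sum of Lemma 17 (with 0**0 == 1)."""
    return sum(comb(n, i) * comb(n, j) * (2**i - 1)**j
               for i in range(n) for j in range(n))

for n in range(1, 4):
    print(n, count_tables(n), lemma17_formula(n))
```

For $n = 2$ both functions return 7: three tables for each of the two singleton values of $T(\bot)$, and one table with $T(\bot) = \{1, 2\}$.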


Computing $E_{T,a}(\bot)$:
    $P \leftarrow T(\bot)$, $S' \leftarrow \emptyset$
    repeat:
        $R \leftarrow \bigcup \{\delta(p, a) \mid p \in P\}$
        $S' \leftarrow S' \cup \{r \mid (r, \mathrm{r}) \in R\}$
        $P^* \leftarrow \bigcup \{T(r) \mid (r, \mathrm{l}) \in R\}$
        $P \leftarrow \{p \in P^* \mid p \text{ not seen before}\}$
        if $P = \emptyset$ then:
            if $S' = \emptyset$: fail
            if $S' \ne \emptyset$: return $S'$

Computing $E_{T,a}(q)$:
    $P \leftarrow \{q\}$, $S' \leftarrow \emptyset$
    repeat:
        $R \leftarrow \bigcup \{\delta(p, a) \mid p \in P\}$
        $S' \leftarrow S' \cup \{r \mid (r, \mathrm{r}) \in R\}$
        $P^* \leftarrow \bigcup \{T(r) \mid (r, \mathrm{l}) \in R\} \setminus T(\bot)$
        $P \leftarrow \{p \in P^* \mid p \text{ not seen before}\}$
        if $P = \emptyset$ then:
            if $S' \subseteq T'(\bot)$: return $T'(\bot)$
            if $S' \not\subseteq T'(\bot)$: return $S' \setminus T'(\bot)$

Figure 7. Computing $E_{T,a}(\bot)$ (first listing) and $E_{T,a}(q)$ (second listing).

4.3. The upper bound. We now construct a 1dfa $M$ that is equivalent to $N$. As in Section 3.3, we base our construction on a characterization of acceptance by $N$ in terms of tables and compatibilities.

20. Definition. Suppose $w \in \Sigma^*$ is of length $l$ and $T_0, T_1, \dots, T_{l+2}$ is a sequence of tables of $N$. We say the sequence fits $w$ iff:
1. $T_0$ is the constant function $\{s\}$,
2. for all $i = 0, 1, \dots, l+1$: $T_i$ is $w_{e,i+1}$-compatible to $T_{i+1}$,
3. $T_{l+2}$ is the constant function $\{f\}$.

21. Theorem. $N$ accepts $w$ iff some sequence of tables of $N$ fits $w$.

The theorem is proved similarly to Theorem 11 and suggests that $M$ should simply try to find a sequence of tables of $N$ that fits its input. Therefore, $M$ implements the following algorithm:

We start with the constant table $\{s\}$. On reading a symbol $a$, we check if the current table is $a$-compatible to any table of $N$. If so, there is exactly one such table, so we change our memory to it and move right. Otherwise, there is no sequence that fits $w$, and we hang. We accept if we ever reach the constant table $\{f\}$.
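One step of this algorithm amounts to running the two procedures of Figure 7. The sketch below is our reading of that figure, not a definitive implementation. We assume that $\delta$ is given as a dictionary from (state, symbol) pairs to sets of (state, direction) pairs with directions `"l"` and `"r"`, that $\bot$ is encoded as `None`, that the states in the initial set $P$ count as already seen, and that failure is reported as `None`. All function names are hypothetical.

```python
BOT = None  # stands for the symbol ⊥ (an encoding chosen for this sketch)

def exits(start, T, a, delta, excluded=frozenset()):
    """Shared loop of Figure 7: the set S' of states entered by stepping
    right off the symbol a, starting on a from the states in `start`."""
    seen = set(start)
    current = set(start)                  # the set P of Figure 7
    result = set()                        # the set S' of Figure 7
    while current:
        moves = set()                     # the set R
        for p in current:
            moves |= delta.get((p, a), set())
        result |= {r for (r, d) in moves if d == "r"}
        back = set()                      # the set P*
        for (r, d) in moves:
            if d == "l":
                back |= T[r]
        back -= excluded
        current = back - seen             # keep only states not seen before
        seen |= current
    return result

def e_bot(T, a, delta):
    """E_{T,a}(⊥): the new ⊥-value, or None for 'fail'."""
    found = exits(T[BOT], T, a, delta)
    return frozenset(found) if found else None

def e_state(T, a, delta, q, new_bot):
    """E_{T,a}(q), given the already computed value new_bot = T'(⊥)."""
    found = exits({q}, T, a, delta, excluded=T[BOT])
    return new_bot if found <= new_bot else frozenset(found - new_bot)

def next_table(T, a, delta, states):
    """The unique table to which T is a-compatible, or None if none exists."""
    new_bot = e_bot(T, a, delta)
    if new_bot is None:
        return None
    table = {BOT: new_bot}
    for q in states:
        table[q] = e_state(T, a, delta, q, new_bot)
    return table
```

With `next_table` in place, the 1dfa $M$ starts from the constant table $\{s\}$, applies one such step per symbol of the end-marked input, hangs as soon as a step returns `None`, and accepts on the constant table $\{f\}$.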

Formally, $M := (s', \delta', f')$ where $Q' := \{T \mid T \text{ is a table of } N\}$, $s'$ := the table that always returns $\{s\}$, and $f'$ := the table that always returns $\{f\}$. For any $T$ and $a$, the value $\delta'(T, a)$ is either the unique table to which $T$ is $a$-compatible, if such a table exists; or undefined, otherwise. It should be clear that $M$ is correct and as large as claimed.

4.4. The lower bound. We will now exhibit an $n$-state 2nfa for which every equivalent 1dfa needs at least one state per table. Our witness will be the automaton $N_0$ from Section 2.3, solving problem $\Phi$. So, for the rest of this section, we assume that the $n$-state 2nfa $N$ that we fixed at the beginning of Section 4 is the automaton $N_0$, and we will show that the 1dfa $M$ constructed in the previous section is minimal. We start with some intuition why this must be the case.

For each table $T : [n]_\bot \to \mathcal{P}'([n])$ of $N$, each $i \in [n]$, and each $j \in T(i)$, we consider the nice input

\[ w_T^{i,j} := (x_T, \mathrm{l})(G_T, \mathrm{l})(h_T^i, \mathrm{r})(j, \mathrm{r}), \]

where $x_T$ is the smallest $x$ for which $T(x) = T(\bot)$; $G_T$ is the binary relation induced by $T$ on $[n]$; and $h_T^i$ contains either exactly the arrows from $T(\bot)$ to $i$, if $T(i) \ne T(\bot)$, or else no arrow at all (see Figure 8 for examples):

\[
\begin{aligned}
x_T &:= \min\{x \mid T(x) = T(\bot)\}, \\
G_T &:= \{(x, y) \mid y \in T(x)\},
\end{aligned}
\qquad
h_T^i := \begin{cases} \emptyset & \text{if } T(i) = T(\bot), \\ \{(y, i) \mid y \in T(\bot)\} & \text{else.} \end{cases}
\tag{5}
\]

Figure 8. The input $w_T^{i,j}$ when $n = 6$; table $T$ maps $\bot$, 1, 2, 3, 4, 5, and 6 to the values $\{3, 5\}$, $\{2, 6\}$, $\{1, 2, 4\}$, $\{3, 5\}$, $\{4\}$, $\{6\}$, $\{3, 5\}$ respectively; and (a) $i = 1$, $j = 6$, or (b) $i = 6$, $j = 5$.

It is easy to verify the following fact.

22. Lemma. For any table $T$ and any two states $i$, $j$ of $N$ such that $j \in T(i)$: The table of $N$ on the prefix $\vdash (x_T, \mathrm{l})(G_T, \mathrm{l})$ of $w_T^{i,j}$ is exactly $T$ and some computations in $\mathrm{comp}_N(w_T^{i,j})$ are accepting. Moreover, if $T(i) \ne T(\bot)$, then each accepting computation contains a step that crosses the middle boundary from $i$ to $j$.

Hence, as $T$, $i$, and $j$ vary as above, the inputs $w_T^{i,j}$ collectively force every interesting use of every state of $M$ in some accepting computation, so that intuitively no state of $M$ is dispensable. To turn this intuition into a proof, we first establish the simple fact that, for every two distinct tables, there exists a 'query' that can distinguish between them.

23. Lemma. Two tables $T$ and $T'$ of $N$ are distinct iff there exist a partial function $h : [n] \to [n]$ and a $y \in [n]$ such that exactly one of the following two inputs has a path:

\[ (x_T, \mathrm{l})(G_T, \mathrm{l})(h, \mathrm{r})(y, \mathrm{r}) \qquad \text{and} \qquad (x_{T'}, \mathrm{l})(G_{T'}, \mathrm{l})(h, \mathrm{r})(y, \mathrm{r}). \]


Proof. If $T = T'$ then, for all $h$ and $y$, the two inputs are identical and therefore $\Phi$ does not distinguish between them. For the interesting direction, we suppose $T \ne T'$ and examine two cases.

If $T(\bot) \ne T'(\bot)$, we pick $h$ to be the empty function and $y$ the smallest state in the symmetric difference of the two $\bot$-values. Then the input that corresponds to the $\bot$-value that contains $y$ is the only one with a path.

If $T(\bot) = T'(\bot) =: Y^*$, consider the smallest $x \in [n]$ with $T(x) \ne T'(x)$ (such an $x$ exists, since $T \ne T'$). It is not hard to see that either both of the two $x$-values avoid intersection with $Y^*$, or exactly one of them does while the other one equals $Y^*$. (Indeed: If an $x$-value intersects $Y^*$, then it is actually equal to $Y^*$, since $T$ and $T'$ are tables. Hence, if both $x$-values intersected $Y^*$, we would have $T(x) = T'(x)$, a contradiction. So, at most one of them intersects $Y^*$. And, if one does, this one is equal to $Y^*$.) In both cases, the symmetric difference of the two $x$-values contains an element which does not belong to $Y^*$. If $y$ is the smallest such element and $h$ contains exactly the arrows from $Y^*$ to $x$, namely $h := \{(y^*, x) \mid y^* \in Y^*\}$, then the input that corresponds to the $x$-value containing $y$ clearly has a path, while the other one does not.

Now, for every pair of tables $T$ and $T'$ of $N$ we define the nice input

\[ w_{T,T'} := (x_T, \mathrm{l})(G_T, \mathrm{l})(h_{T,T'}, \mathrm{r})(y_{T,T'}, \mathrm{r}), \]

where $x_T$, $G_T$ are as in (5), while $h_{T,T'}$ and $y_{T,T'}$ are either the ones given by the proof of Lemma 23, if $T \ne T'$; or the values $h_T^i$ and $\min T(i)$ as defined in (5) for $i = x_T$, if $T = T'$.¹² Strengthening the promise for $\Phi$ to allow only inputs of this particular form, we get a new problem $\Phi'$ with

\[
\begin{aligned}
\Phi'_{\mathrm{yes}} &:= \{w_{T,T'} \mid T, T' \text{ are tables for } N \text{ and } w_{T,T'} \text{ has a path}\}, \\
\Phi'_{\mathrm{no}} &:= \{w_{T,T'} \mid T, T' \text{ are tables for } N \text{ and } w_{T,T'} \text{ has no path}\}.
\end{aligned}
\]

This is clearly still solvable by $N$, so that $n$ states are enough on a single-pass 2nfa against $\Phi'$.
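As in the deterministic case, the query chosen in the proof of Lemma 23 can be computed directly from the two tables. The sketch below is illustrative only: tables are represented as dictionaries whose values are frozensets, $\bot$ is encoded as `None`, and the name `distinguishing_query` is ours. The sketch selects $h$ and $y$ but does not decide whether an input has a path.

```python
BOT = None  # stands for the symbol ⊥ (an encoding chosen for this sketch)

def distinguishing_query(T, T2, n):
    """Return (h, y) as chosen in the proof of Lemma 23, or None if T == T2.
    Tables are dicts from {BOT, 1..n} to non-empty frozensets of states;
    the partial function h is returned as a dict."""
    if T == T2:
        return None
    if T[BOT] != T2[BOT]:
        # Empty h; y is the smallest state in the symmetric difference.
        return {}, min(T[BOT] ^ T2[BOT])
    Y = T[BOT]                                    # the common value Y*
    x = min(x for x in range(1, n + 1) if T[x] != T2[x])
    y = min((T[x] ^ T2[x]) - Y)                   # smallest element outside Y*
    return {y_star: x for y_star in Y}, y

# The table of Figure 8, and a variant that differs only on state 1.
fs = frozenset
T = {BOT: fs({3, 5}), 1: fs({2, 6}), 2: fs({1, 2, 4}), 3: fs({3, 5}),
     4: fs({4}), 5: fs({6}), 6: fs({3, 5})}
T2 = dict(T)
T2[1] = fs({3, 5})
print(distinguishing_query(T, T2, 6))
```

Here $Y^* = \{3, 5\}$ and $x = 1$, so $h$ consists of the arrows from 3 and from 5 to 1, and $y = 2$ is the smallest element of $\{2, 6\}$.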
However, 1dfas need many more states.

24. Lemma. The size of every 1dfa solving $\Phi'$ is at least

\[ \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \binom{n}{i} \binom{n}{j} \bigl(2^i - 1\bigr)^j. \]

Proof. Assume $A$ is a 1dfa solving $\Phi'$. For every table $T$ of $N$, the automaton accepts $w_{T,T}$ (Lemma 22). Hence, the computation $c_T := \mathrm{comp}_A(w_{T,T})$ hits right, and therefore crosses the middle boundary into some state, call it $q_T$. If the states of $A$ were fewer than the tables of $N$, two tables $T \ne T'$ would map to the same state $q_T = q_{T'}$ and (by the standard cut-and-paste argument, as in Lemma 14) the automaton would be deciding identically on $w_{T,T'}$ and $w_{T',T}$, contradicting Lemma 23 and the definition of the two strings. Hence, $A$ must have at least as many states as there are tables of $N$. The rest of the proof is by Lemma 17.

¹² Again, when $T \ne T'$, the order of the two tables in the definition of these values is not important: $h_{T,T'} = h_{T',T}$ and $y_{T,T'} = y_{T',T}$. Also, $h_{T,T} = \emptyset$.

5. From 2NFAs to 1NFAs

Fix an $n$-state 2nfa $N = (s, \delta, f)$ over the states of some set $Q$ and the symbols of some alphabet $\Sigma$. In this section we will build an equivalent $\binom{2n}{n+1}$-state 1nfa via an optimal construction.

5.1. Frontiers. Let us momentarily assume that $N$ is actually deterministic and that $c := \mathrm{comp}_N(w)$ is accepting, for some $l$-long input $w$. Consider the $i$-th frontier $(L_i^c, R_i^c)$ of $c$, for some $i \ne 0, l+2$. The number of states in $R_i^c$ equals the number of times that $c$ left-to-right crosses the $i$-th boundary: each crossing contributes a state into $R_i^c$ and no two crossings contribute the same state, or else $c$ would be looping. Similarly, $L_i^c$ contains as many states as the number of times that $c$ right-to-left crosses the $i$-th boundary. Now, since $c$ accepts, it goes from the leftmost symbol $\vdash$ all the way past the rightmost one $\dashv$, forcing the left-to-right crossings on every boundary to be exactly 1 more than the right-to-left crossings. Hence, $|L_i^c| + 1 = |R_i^c|$, which remains true even on the leftmost boundary ($i = 0$, under our convention from Footnote 6 on page 36) and also on the rightmost one ($i = l+2$, obviously). Therefore, the equality holds over every boundary, and motivates the following definition.

25. Definition. A frontier of $N$ is any pair $(L, R)$ of subsets of $Q$ such that $|L| + 1 = |R|$.

Note that this defines what a "frontier of $N$" is, whereas Section 2.1-III defined what a "frontier of a computation" is. The relation between the two notions is partially explained by the motivating argument above, which shows that if the computation $c$ of a 2dfa on an input is accepting, then all frontiers of $c$ are frontiers of the 2dfa. However, this argument is not valid for our nondeterministic $N$, because a state repetition under a cell does not necessarily imply looping. It does, though, imply a cycle.

26. Definition. A computation is minimal if its points are all distinct.

In other words, a computation is minimal iff it is cycle-free. Obviously, a minimal computation is not looping, and for 2dfas the converse is also true. However, for 2nfas the converse is not always true: accepting computations may not be minimal. So, in order to extend our previous observation to 2nfas, we need to take this detail into account. The following lemma makes the appropriate corrections. The lemma after it is an easy counting.

Figure 9. (a) An accepting minimal $c \in \mathrm{comp}_N(w)$, for a 6-long $w$; we assume $0, 1, \dots, 5$ are states of $N$, and $s = 0$, $f = 5$. (b) The same $c$ arranged in frontiers; only the even-indexed frontiers are drawn.

27. Lemma. If a computation of $N$ on a string $w$ is accepting and minimal, then all frontiers of that computation are frontiers of $N$.

Proof. We just need to modify the argument before Definition 25, as follows: No two left-to-right crossings of the $i$-th boundary contribute the same state into $R_i^c$, or else $c$ would not be minimal. Similarly for $L_i^c$.

28. Lemma. The number of distinct frontiers of $N$ is exactly $\binom{2n}{n+1}$.

Proof. The number of different frontiers of $N$ is equal to the number of distinct pairs of subsets of $[n]$ where the second subset has exactly 1 more element than the first one. In turn, this is equal to the number of different ways of choosing $n + 1$ elements of the set $[2n]$: the unselected elements of $[n]$ determine the first subset, while the selected elements of $[2n] - [n]$ determine the second subset. So, our number is exactly $\binom{2n}{n+1}$,¹³ as claimed.

¹³ Alternatively, this number can be written as $nC_n$, where $C_n$ is the $n$-th Catalan number [12]. So, Catalan strikes again! [10]

5.2. Compatibilities among frontiers. Suppose $c$ is an accepting minimal computation of $N$ on an $l$-long $w$ and let $F_i^c = (L_i^c, R_i^c)$ be its $i$-th frontier, for each $i = 0, 1, \dots, l+2$ (Figure 9). Note that the first and last frontiers in the sequence $F_0^c, F_1^c, \dots, F_{l+2}^c$ are always

\[ F_0^c = (\emptyset, \{s\}) \qquad \text{and} \qquad F_{l+2}^c = (\emptyset, \{f\}), \]

as $c$ starts at $s$, ends in $f$, and never right-to-left crosses an outer boundary. Also note that, for $(L, R) := (L_i^c, R_i^c)$ and $(L', R') := (L_{i+1}^c, R_{i+1}^c)$ two successive frontiers in the sequence (Figure 10a), it should always be that $R \cap L' = \emptyset$: otherwise, $c$ would be using the same state twice under the $(i+1)$-st cell of the tape and would not be minimal. Hence, $R + L'$ contains as many states as there are (occurrences of) states in $L$ and $R'$ together:

\[ |R + L'| = |R| + |L'| = |L| + 1 + |R'| - 1 = |L| + |R'| = |L \uplus R'|. \]

Figure 10. (a) Two successive frontiers. (b) The associated bijection.

Hence, bijections can be found from $R + L'$ to $L \uplus R'$. Among them, there is a very natural one (Figure 10b): for each $q \in R + L'$ find the unique step in $c$ that produces $q$ under the $(i+1)$-st cell (this is either a left-to-right crossing of boundary $i$ or a right-to-left crossing of boundary $i+1$; the minimality of $c$ guarantees uniqueness); the next step left-to-right crosses boundary $i+1$ into some state $p \in R'$ or right-to-left crosses boundary $i$ into some $p \in L$; depending on the case, map $q$ to $(p, \mathrm{r})$ or $(p, \mathrm{l})$ respectively. If $\rho : R + L' \to L \uplus R'$ is this mapping, it is easy to verify that it is injective (because $c$ is minimal) and therefore bijective, as promised. In addition, it is clear that $\rho$ respects the transition function, in the sense that $\rho(q) \in \delta(q, w_{e,i+1})$, for all $q \in R + L'$.

Overall, this discussion shows that every accepting minimal computation in $\mathrm{comp}_N(w)$ exhibits a sequence of frontiers which obeys certain restrictions. The following definitions and lemma summarize these findings.

29. Definition. If $(L, R)$ and $(L', R')$ are two frontiers of $N$ and $a$ some symbol in $\Sigma_e$, we say that $(L, R)$ is $a$-compatible to $(L', R')$ iff
1. $R \cap L' = \emptyset$, and
2. there exists a bijection $\rho : R + L' \to L \uplus R'$ that respects the transition function on $a$: $\rho(q) \in \delta(q, a)$, for all $q \in R + L'$.¹⁴

¹⁴ Note that, if $N$ is deterministic, then $(L, R)$ is $a$-compatible to $(L', R')$ exactly if the following, much simpler, condition holds: $(\forall \xi \in L \uplus R')(\exists q \in L' \cup R)(\delta(q, a) = \xi)$.


30. Definition. Suppose $w \in \Sigma^*$ is of length $l$ and $F_0, F_1, \dots, F_{l+2}$ is a sequence of frontiers of $N$. We say the sequence fits $w$ iff
1. $F_0 = (\emptyset, \{s\})$,
2. for all $i = 0, 1, \dots, l+1$: $F_i$ is $w_{e,i+1}$-compatible to $F_{i+1}$,
3. $F_{l+2} = (\emptyset, \{f\})$.

31. Lemma. For all $w \in \Sigma^*$: if $\mathrm{comp}_N(w)$ contains an accepting computation, then some sequence of frontiers of $N$ fits $w$.

Proof. Suppose $\mathrm{comp}_N(w)$ contains an accepting computation $d$. Removing from $d$ all cycles, we get a computation $c$ which is also in $\mathrm{comp}_N(w)$ and is accepting and minimal. Then, the argument before Definition 29 proves that the sequence of the frontiers of $c$ fits $w$.

The crucial observation, which we prove in the next section, is that the converse of this lemma is also true, and therefore an analogue of Theorems 11 and 21 holds.

32. Lemma. For all $w \in \Sigma^*$: if some sequence of frontiers of $N$ fits $w$, then $\mathrm{comp}_N(w)$ contains an accepting computation.

33. Theorem. $N$ accepts $w$ iff some sequence of frontiers of $N$ fits $w$.

5.3. Proof of the main observation. In this section we establish Lemma 32. So, assume that the sequence of frontiers $F_0 = (L_0, R_0), F_1 = (L_1, R_1), \dots, F_{l+2} = (L_{l+2}, R_{l+2})$ fits $w$. We will prove a stronger statement: for every $i = 0, 1, \dots, l+2$, the states of $R_i$ can be produced by $|R_i|$ right-hitting computations on $\vdash w_1 \cdots w_{i-1}$, one of them starting at $s$ and on $\vdash$ and each one of the remaining $|L_i|$ starting at a distinct $q \in L_i$ and on $w_{i-1}$. More formally, we will prove the following claim.

Claim. For each $i = 0, 1, \dots, l+2$, there is a bijection $\pi_i : (L_i)_\bot \to R_i$ which satisfies the following two conditions:
1. some $c \in \mathrm{lcomp}_{N,s}(\vdash w_1 \cdots w_{i-1})$ hits right into $\pi_i(\bot)$, and
2. for all $q \in L_i$, some $c \in \mathrm{rcomp}_{N,q}(\vdash w_1 \cdots w_{i-1})$ hits right into $\pi_i(q)$.

Note that Lemma 32 indeed follows from this claim, when $i = l+2$: The only bijection from $(L_{l+2})_\bot = \emptyset_\bot = \{\bot\}$ to $R_{l+2} = \{f\}$ is $\pi_{l+2} := \{(\bot, f)\}$. Hence, condition 1 says that some computation in $\mathrm{lcomp}_{N,s}(\vdash w_1 \cdots w_l \dashv)$ hits right into $\pi_{l+2}(\bot) = f$, which is simply another way of saying that $\mathrm{comp}_N(w)$ contains an accepting computation. To prove the claim, we use induction on $i$.

Base Case. The base case $i = 0$ is satisfied by the definitions. The only bijection from $(L_0)_\bot = \emptyset_\bot = \{\bot\}$ to $R_0 = \{s\}$ is $\pi_0 = \{(\bot, s)\}$. Condition 1 is true because $\mathrm{lcomp}_{N,s}(\epsilon)$ contains exactly the 0-length computation $(s, 1)$ which does hit right into $s$ (cf. Remark 1). Condition 2 is true vacuously, as $L_0 = \emptyset$.

Figure 11. An example for the inductive step in the proof of Lemma 32. Note, for instance, that $\sigma$ maps the 3rd and 5th (from the top) states of $R$ to $\bot$, while the 4th state is mapped to the 1st state of $R'$.

Inductive Step. For the inductive step (Figure 11), assume $i < l+2$, let $(L, R) := (L_i, R_i)$, $(L', R') := (L_{i+1}, R_{i+1})$, $a := w_{e,i+1}$, and consider the bijections

\[ \pi := \pi_i : L_\bot \to R \qquad \text{and} \qquad \rho : R + L' \to L \uplus R' \]

guaranteed respectively by the inductive hypothesis and by the assumption that $(L, R)$ is $a$-compatible to $(L', R')$. We need to build a bijection $\pi' := \pi_{i+1} : (L')_\bot \to R'$ that satisfies Conditions 1, 2 of the claim. We will do so based on $\pi$, $\rho$, and one more function $\sigma$ that will emerge from the following discussion.

Definition of $\sigma$. Consider any state $q \in R$ and let us take a trip around under $\vdash w_1 w_2 \cdots w_{i-1} a$ by alternately following bijections $\rho$ and $\pi$

\[ \underbrace{q}_{r_0},\quad \underbrace{\rho(q)}_{r_1},\quad \underbrace{\pi\rho(q)}_{r_2},\quad \underbrace{\rho\pi\rho(q)}_{r_3},\quad \underbrace{\pi\rho\pi\rho(q)}_{r_4},\quad \dots \tag{6} \]

until the first time that '$\rho$ fails to return a state in $L$',¹⁵ and let $r_0, r_1, r_2, \dots$ be the states that we visit. There are two cases about what might happen.

¹⁵ Note that we abuse notation here. Bijection $\rho$ can only return a pair of the form $(p, \mathrm{l})$ or $(p, \mathrm{r})$. So, in the description (6) above, $\rho(\cdot)$ really means 'the first component of $\rho(\cdot)$, if the second component is $\mathrm{l}$'. Similarly, '$\rho$ fails to return a state in $L$' means '$\rho$ returns a pair of the form $(p, \mathrm{r})$'. Hopefully, the abuse does not confuse. We will stick to it throughout the definition of $\sigma$.


Case 1 is that $\rho$ does eventually fail to return a state in $L$ and the trip pays only a finite number of visits $r_0, r_1, \dots, r_k$, for some even $k \ge 0$ and with $r_k \in R$. Then $r_k$ is $\rho$-mapped to some $q' \in R'$.

Case 2 is that $\rho$ always returns a state in $L$ and the trip is infinite. Since all even-indexed and all odd-indexed visits in the trip are inside the finite sets $R$ and $L$ respectively, there have to be repetitions of states both on the even and on the odd indices. Let $k$ be the first index for which some earlier index $j < k$ of the same parity points to the same state: $r_j = r_k$. If $k$ is odd, then $j$ is also odd and hence $j \ge 1$; then

\[ r_j = r_k \implies \rho^{-1}(r_j) = \rho^{-1}(r_k) \implies r_{j-1} = r_{k-1} \]

and $k - 1$ also has the property that $k$ is the earliest one to have, a contradiction. So, $k$ must be even, and so must $j$. In fact, $j$ must be 0; otherwise we can again reach a contradiction (as before, with $\pi^{-1}$ instead of $\rho^{-1}$). Hence, the first state to be revisited is the state $q$ we started from and our trip consists of infinitely many copies of a cycle $r_0, r_1, \dots, r_k$, for some even $k \ge 2$, for $r_k = r_0 = q \in R$, and with no two states in the list $r_0, r_1, \dots, r_{k-1}$ being both equal and at positions of the same parity.

Overall, in the trip that we take, we either reach a state $r_k \in R$ (possibly $k = 0$) that is $\rho$-mapped to a state $q' \in R'$ (Case 1), or we return to the starting state $q \in R$ having previously repeated no state in $L$ and no state in $R$ (Case 2). We define the function $\sigma : R \to (R')_\bot$ to encode exactly this information. Specifically: in Case 1, we set $\sigma(q) := q'$; in Case 2, we set $\sigma(q) := \bot$. In either case, our trip respects $\pi$ and $\rho$, which in turn respect the behavior of $N$ on $\vdash w_1 w_2 \cdots w_{i-1} a$. It should therefore be clear that, according to our construction,
• $\sigma(q) = q'$ implies that some $c \in \mathrm{rcomp}_{N,q}(\vdash w_1 \cdots w_{i-1} a)$ respects $\pi$, $\rho$ and hits right into $q'$, while
• $\sigma(q) = \bot$ implies that some looping $c \in \mathrm{rcomp}_{N,q}(\vdash w_1 \cdots w_{i-1} a)$ respects $\pi$, $\rho$ and repeats a cycle that can only visit states from $R$ when under $a$.
This concludes the definition of $\sigma$ and our discussion of its properties.

We are now ready to return to the construction of bijection $\pi'$. Recall that this must inject $(L')_\bot$ into $R'$ so that Conditions 1 and 2 of the claim above are satisfied. We examine three separate cases about the argument of $\pi'$.

Case (a). The easiest argument is a state $p \in L'$ that is $\rho$-mapped rightward to a state $r \in R'$. Then we just let $\pi'$ also return that state: $\pi'(p) := r$. Since $\rho$ respects the transition function on $a$, the corresponding 1-step computation from $p$ rightward to $r$ is indeed a computation in $\mathrm{rcomp}_{N,p}(\vdash w_1 \cdots w_{i-1} a)$ that hits right into $\pi'(p)$.


Case (b). If the argument is some $p \in L'$ that is $\rho$-mapped leftward to a state $r \in L$, then we consider where in $R$ bijection $\pi$ takes us from there: $q := \pi(r)$. We know some computation of $N$ can start at $p$ under $a$ and eventually reach $q$ under $a$, so the question is what can happen after that if we keep following $\rho$ and $\pi$. To answer this question, we examine $\sigma(q)$.

If $\sigma(q) = \bot$, then we will eventually return to $q$ after a cycle of length at least 2 and having visited only states of $R$ when under $a$. But can this happen? If it does, then the next-to-last and last steps in this cycle will follow $\rho$ and $\pi$ respectively, ending up in $q$. Since $\rho$ and $\pi$ are bijections, the last two states (before $q$) in this cycle must respectively be $p$ and $r$. In particular, $p$ must be in the cycle. But, since the cycle visits only states from $R$ whenever under $a$, we should have $p \in R$. This means $R$ and $L'$ must intersect, and hence $(L, R)$ is not $a$-compatible to $(L', R')$, a contradiction.

It is thus necessary that $\sigma(q) = q' \in R'$, which implies that some computation $c \in \mathrm{rcomp}_{N,q}(\vdash w_1 \cdots w_{i-1} a)$ hits right into $q'$. Prefixed by the computation that takes $p$ to $q$, this $c$ becomes a computation in $\mathrm{rcomp}_{N,p}(\vdash w_1 \cdots w_{i-1} a)$ that hits right into $q'$. So, we can set $\pi'(p) := q'$.

Case (c). It remains to define $\pi'(\bot)$. The reasoning resembles Case (b). We consider the state $q \in R$ where $\pi(\bot)$ takes us, and examine $\sigma(q)$. Again, $\sigma(q) = \bot$ is impossible, as this would imply that $\bot \in L$, a contradiction. Hence, $\sigma(q) = q'$ for some $q' \in R'$. Combining the computation guaranteed by $\pi(\bot) = q$ with the one guaranteed by $\sigma(q) = q'$, we get a computation in $\mathrm{lcomp}_{N,s}(\vdash w_1 \cdots w_{i-1} a)$ that hits right into $q'$. So, we can set $\pi'(\bot) := q'$.

This concludes the definition of $\pi'$. The construction must have made clear that $\pi'$ satisfies the two conditions of the claim; that it is also a bijection should be an easy consequence of the way bijections $\pi$ and $\rho$ are used.
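The passage from $\pi$ and $\rho$ to $\sigma$ and $\pi'$ described above is effectively an algorithm, and can be sketched as follows. The representation is our assumption: $\pi$ is a dictionary on $L_\bot$ (with $\bot$ encoded as `None`), $\rho$ is a dictionary on $R + L'$ whose values are pairs (state, direction) with directions `"l"` and `"r"`, and the names `sigma` and `next_pi` are hypothetical.

```python
BOT = None  # stands for the symbol ⊥ (an encoding chosen for this sketch)

def sigma(q, pi, rho):
    """Take the trip (6) from q in R, alternating rho and pi."""
    start = q
    while True:
        p, direction = rho[q]
        if direction == "r":
            return p           # Case 1: rho leaves rightward, into R'
        q = pi[p]              # rho returned p in L; pi brings us back to R
        if q == start:
            return BOT         # Case 2: the trip cycles back to its start

def next_pi(pi, rho, L_next):
    """Build pi' on (L')_⊥ from pi and rho, following Cases (a), (b), (c)."""
    new_pi = {}
    for p in list(L_next) + [BOT]:
        if p is BOT:
            q = pi[BOT]                     # Case (c)
        else:
            target, direction = rho[p]
            if direction == "r":            # Case (a)
                new_pi[p] = target
                continue
            q = pi[target]                  # Case (b)
        new_pi[p] = sigma(q, pi, rho)       # shown above never to be ⊥ here
    return new_pi

# L = {1}, R = {2, 3}, L' = {4}, R' = {5, 6}
pi = {BOT: 2, 1: 3}
rho = {4: (1, "l"), 2: (5, "r"), 3: (6, "r")}
print(next_pi(pi, rho, {4}))
```

In this small example, state 4 of $L'$ falls under Case (b): $\rho$ sends it leftward to 1, $\pi$ returns 3, and $\sigma(3) = 6$. Hence $\pi'(4) = 6$, while $\pi'(\bot) = \sigma(2) = 5$.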
Hence, the inductive step is complete, and with it the proof of the claim.

5.4. The upper bound. We are now ready to build a $\binom{2n}{n+1}$-state 1nfa N′ that simulates N. Our construction is based on Theorem 33. In other words, the strategy of N′ is to scan the input ‘guessing’ the members of a sequence of frontiers, one after the other, and verifying that this sequence fits the input. Precisely, N′ implements the following algorithm: We start with the frontier (∅, {s}) in our memory. On reading a symbol a, we check if the frontier in our memory is a-compatible to any other frontiers. If not, we just hang. If it is, we find all such frontiers, select one of them nondeterministically, and move right with it as our new memory. If we ever reach the frontier (∅, {f}), we accept.

Formally, N ′ := (s′ , δ ′ , f ′ ) where Q′ := {F | F is a frontier for N }, s′ := (∅, {s}), f ′ := (∅, {f }), and the transition function is such that δ ′ (F, a) := {F ′ | F is a-compatible to F ′ }, for all F ∈ Q′ and a ∈ Σe . It should be clear that N ′ is correct and as large as promised.
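As a sanity check on the size claim, the following sketch enumerates the state set Q′ for small n and compares its size with $\binom{2n}{n+1}$. It assumes that a frontier is any pair (L, R) of subsets of [n] with |R| = |L| + 1, which is how frontiers are listed in Section 5.5; the function name `frontiers` is illustrative only.

```python
from itertools import combinations
from math import comb

def frontiers(n):
    """Yield every pair (L, R) of subsets of {1..n} with |R| = |L| + 1.

    Assumption: these pairs are exactly the frontiers of an n-state 2nfa.
    """
    states = range(1, n + 1)
    for m in range(n):                       # m = |L|, so 0 <= m < n
        for L in combinations(states, m):
            for R in combinations(states, m + 1):
                yield frozenset(L), frozenset(R)

for n in range(1, 7):
    count = sum(1 for _ in frontiers(n))
    # Vandermonde: sum over m of C(n, m) * C(n, m + 1) equals C(2n, n + 1).
    print(n, count, comb(2 * n, n + 1))
```

The two printed counts agree for every n tried, as the Vandermonde identity in the comment predicts.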


Figure 12. (a) The deterministic nice input w_F when n = 6 and F = ({1, 5, 6}, {2, 4, 5, 6}). (b) How to derive it from the corresponding list 2, 2, 1, 4, 5, 5, 6, 6.

5.5. The lower bound. To prove that the construction of the previous section is optimal, we will exhibit an n-state 2nfa that has no equivalent 1nfa with fewer than $\binom{2n}{n+1}$ states. In fact, our witness will simply be the automaton M0 from Section 2.3, which is deterministic and single-pass. Moreover, 1nfas will be shown to need $\binom{2n}{n+1}$ states not only for staying equivalent to M0, but even for solving Ψ (on a stricter promise, actually). So, assume that the n-state 2nfa N that we kept fixed since the beginning of Section 5 is actually M0. We will show that the 1nfa N′ constructed in the previous section is minimal.

We start with some intuition. Consider an arbitrary frontier F = (L, R) of N and let us list the elements of the sets L, R ⊆ [n] in increasing order,

L = {x_1, x_2, . . . , x_m} and R = {y_1, y_2, . . . , y_{m+1}},

for the appropriate 0 ≤ m < n. Since m < n, we know that L is a strict subset of [n]. So, we can name an element that does not belong to L, say x_0 := min([n] \ L). Then the combined list

(7) x_0 y_1 x_1 y_2 x_2 · · · y_m x_m y_{m+1}

gives rise to the following deterministic nice input (see Figure 12): w_F := (x_F, l)(g_F, l)(h_F, r)(y_F, r), where x_F := x_0, the function g_F maps each x of (7) to its following y, the function h_F maps each y ≠ y_{m+1} to its following x, and y_F := y_{m+1}; i.e.:

(8) x_F := min([n] \ L), y_F := max R, g_F := {(x_i, y_{i+1}) | 0 ≤ i ≤ m}, h_F := {(y_i, x_i) | 1 ≤ i ≤ m}.

It is easy to verify that this input has a path and that the following holds.

34. Lemma. For any frontier F of N, the computation of N on w_F is accepting and its frontier under the middle boundary is exactly F.

Hence, every state of N′ is used in some accepting computation and is therefore not redundant in any obvious way. So, N′ appears to be minimal.
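The derivation of w_F from a frontier can be sketched as follows. The encoding is an assumption made for illustration: g_F and h_F are stored as dictionaries, a path is found by alternating g_F and h_F from x_F until the exit y_F is reached, and the names `nice_input` and `follow` are hypothetical. On the frontier of Figure 12 the visited nodes reproduce the list given in the caption.

```python
def nice_input(F, n):
    """Return (x_F, g_F, h_F, y_F) for a frontier F = (L, R) over {1..n}."""
    L, R = sorted(F[0]), sorted(F[1])
    assert len(R) == len(L) + 1 and len(L) < n
    x0 = min(set(range(1, n + 1)) - set(L))    # smallest element outside L
    xs = [x0] + L                              # x_0, x_1, ..., x_m
    g = {xs[i]: R[i] for i in range(len(R))}   # x_i -> y_{i+1}
    h = {R[i]: L[i] for i in range(len(L))}    # y_i -> x_i (y_{m+1} has no arrow)
    return x0, g, h, R[-1]

def follow(x, g, h, y_exit):
    """Alternate g and h from x; return the visited nodes, or None if the
    walk gets stuck or revisits a node before reaching the exit."""
    trail, cur, seen = [x], x, {x}
    while cur in g:
        y = g[cur]
        trail.append(y)
        if y == y_exit:
            return trail
        if y not in h:
            return None
        cur = h[y]
        if cur in seen:
            return None
        seen.add(cur)
        trail.append(cur)
    return None

F = ({1, 5, 6}, {2, 4, 5, 6})
print(follow(*nice_input(F, 6)))   # the combined list (7) for this frontier
```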


Figure 13. (a) Input w_F from Figure 12. (b) A new input w_{F′}, for F′ = ({1, 4, 5}, {2, 3, 4, 5}). (c,d) Inputs w_{F,F′} and w_{F′,F}. Note that only w_{F,F′} has a path.

To prove this intuition, we start by noting that every two frontiers F and F′ of N give rise to the deterministic nice input (cf. Figure 13)

w_{F,F′} := (x_F, l)(g_F, l)(h_{F′}, r)(y_{F′}, r),

where x_F, g_F, h_{F′}, y_{F′} are as in (8). Strengthening the promise of problem Ψ to allow only inputs of this form, we get problem Ψ′′ = (Ψ′′_yes, Ψ′′_no), with

Ψ′′_yes := {w_{F,F′} | F, F′ are frontiers for N and w_{F,F′} has a path},
Ψ′′_no := {w_{F,F′} | F, F′ are frontiers for N and w_{F,F′} has no path}.

Clearly, N solves this problem, so that n states on a (single-pass deterministic) 2nfa are enough to solve Ψ′′. For 1nfas the problem is harder.

35. Lemma. Every 1nfa for Ψ′′ has at least $\binom{2n}{n+1}$ states.

At the heart of the argument for this lemma lies the fact that, in the $\binom{2n}{n+1} \times \binom{2n}{n+1}$ matrix W = [w_{F,F′}]_{F,F′} containing all inputs of the form w_{F,F′}, two distinct inputs sitting in cells that are symmetric with respect to the main diagonal cannot both have a path. In other words:

Claim. For all frontiers F, F′: w_{F,F′}, w_{F′,F} ∈ Ψ′′_yes ⇐⇒ F = F′.

Proof. Let F = (L, R) and F′ = (L′, R′) be any two frontiers of N. If F = F′ then w_{F,F′} = w_{F′,F} = w_F and we have already observed that this input has a path. For the interesting direction, we assume that F ≠ F′ and we will prove that at least one of w_{F,F′}, w_{F′,F} lacks a path. We start by letting m = |L|, m′ = |L′| and considering the combined lists defined by the two frontiers, as in (7):

x_0 y_1 x_1 y_2 x_2 · · · y_m x_m y_{m+1} and x′_0 y′_1 x′_1 y′_2 x′_2 · · · y′_{m′} x′_{m′} y′_{m′+1}.

If the two lists were identical after their first elements, they would agree in their lengths, in their x’s (except possibly at x_0, x′_0), and in their y’s, forcing F = F′, a contradiction. Hence, there have to be positions of disagreement after 0. Consider the earliest one among them.


If this position is occupied by y’s, say y_i and y′_i, then we have either that y_i < y′_i (Case 1) or that y_i > y′_i (Case 2). If it is occupied by x’s, say x_i and x′_i, then we have either that x_i < x′_i or x′_i is not present at all¹⁶ (Case 3) or that x_i > x′_i or x_i is not present at all (Case 4). The four cases are treated with similar arguments. We will present only the argument for the first one in detail, and sketch the rest.

So, suppose that the first disagreement is between y_i and y′_i, and that in fact y_i < y′_i. This implies that all previous positions after 0 contain identical elements:

x_0    y_1    x_1    y_2    x_2    · · ·   y_{i−1}    x_{i−1}    y_i
x′_0   y′_1   x′_1   y′_2   x′_2   · · ·   y′_{i−1}   x′_{i−1}   y′_i

It also implies that y_i is not in R′. Indeed, if it were, it would be in the sublist y′_1, y′_2, . . . , y′_{i−1} (since y_i < y′_i), and hence in the sublist y_1, y_2, . . . , y_{i−1} (since the two sublists coincide), contradicting the fact that y_i is greater than all these elements of R. So y_i ∉ R′, and therefore y_i is not y_{F′} (which is in R′) and has no value under h_{F′} (since the domain of h_{F′} is also in R′). But then searching for a path in w_{F,F′} we travel deterministically

$x_0 \xrightarrow{g_F} (y_1 = y'_1) \xrightarrow{h_{F'}} (x'_1 = x_1) \xrightarrow{g_F} (y_2 = y'_2) \xrightarrow{h_{F'}} \cdots \xrightarrow{h_{F'}} (x'_{i-1} = x_{i-1}) \xrightarrow{g_F} y_i$

reaching a node which is neither the exit y_{F′} nor the start of an h_{F′}-arrow. This means that w_{F,F′} has no path.

In Case 2, we similarly conclude that y′_i ∉ R and that g_{F′}, h_F combine to reach y′_i; but this is neither the exit y_F nor the start of an h_F-arrow, implying w_{F′,F} has no path. In Case 3, we deduce that x_i ∉ L′ and yet g_{F′}, h_F combine to reach it, so w_{F′,F} has no path. Finally, in Case 4, x′_i is outside L while g_F and h_{F′} reach it, so that w_{F,F′} has no path.

Proof of Lemma 35. Towards a contradiction, assume that A is a 1nfa that solves Ψ′′ with fewer than $\binom{2n}{n+1}$ states. For each frontier F for N, we know the input w_F = w_{F,F} is in Ψ′′_yes and therefore A accepts it. Choose any accepting computation c_F ∈ comp_A(w_F) and let q_F be the state immediately after the middle boundary is crossed. Since the states of A are fewer than the frontiers for N, we know q_F = q_{F′} for two frontiers F ≠ F′. But then, the usual cut-and-paste argument on the computations c_F and c_{F′} shows that A must also accept the inputs w_{F,F′} and w_{F′,F}. Since A solves Ψ′′, we conclude that w_{F,F′}, w_{F′,F} ∈ Ψ′′_yes despite F ≠ F′, a contradiction to the last claim.
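The Claim can be checked exhaustively for small n. The sketch below is only an illustration under stated assumptions: frontiers are taken to be all pairs (L, R) of subsets of [n] with |R| = |L| + 1, and an input w_{F,G} is taken to "have a path" exactly when alternating g_F and h_G from x_F reaches the exit y_G (a walk that gets stuck or cycles counts as no path). The helper names are hypothetical.

```python
from itertools import combinations

def frontiers(n):
    """All pairs (L, R) of sorted tuples over {1..n} with |R| = |L| + 1."""
    for m in range(n):
        for L in combinations(range(1, n + 1), m):
            for R in combinations(range(1, n + 1), m + 1):
                yield L, R

def parts(F, n):
    """Return (x_F, g_F, h_F, y_F) as in (8)."""
    L, R = F
    x0 = min(set(range(1, n + 1)) - set(L))
    g = dict(zip((x0,) + L, R))   # x_i -> y_{i+1}
    h = dict(zip(R, L))           # y_i -> x_i; zip stops before y_{m+1}
    return x0, g, h, R[-1]

def has_path(F, G, n):
    """Does w_{F,G} = (x_F, l)(g_F, l)(h_G, r)(y_G, r) have a path?"""
    x, g, _, _ = parts(F, n)
    _, _, h, y_exit = parts(G, n)
    cur, seen = x, set()
    while cur in g and cur not in seen:
        seen.add(cur)
        y = g[cur]
        if y == y_exit:
            return True
        if y not in h:
            return False
        cur = h[y]
    return False

def claim_holds(n):
    """Both w_{F,G} and w_{G,F} have a path iff F == G, for all frontiers."""
    Fs = list(frontiers(n))
    return all((has_path(F, G, n) and has_path(G, F, n)) == (F == G)
               for F in Fs for G in Fs)

print(claim_holds(2), claim_holds(3))
```

Given the Claim, the pigeonhole step in the proof of Lemma 35 is what turns the number of frontiers into a lower bound on the number of states.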

16 This happens if the list for F′ stops at y′_i.


6. Conclusion In this chapter we showed the exact trade-offs in the conversions from two-way (deterministic or nondeterministic) to one-way (deterministic or nondeterministic) finite automata. Our arguments recast those of Birget [6] into a more standard set-theoretic vocabulary and then complement them by carefully removing the redundancies in the associated constructions.17 Introducing frontiers, we provided a set-theoretic characterization of 2nfa acceptance (already present in [6], essentially) that complements the also set-theoretic characterization of 2nfa rejection given in [60]. Moreover, by applying the concept of promise problems even to the domain of regular languages, we nicely confirmed its reputation for always leading us straight to the combinatorial core of the hardness of a computational task. Crucially, the tight simulations performed by one-way automata in our proofs are as ‘meaningful’ as the tight simulation of [47] for the determinization of 1nfas: each state in these automata corresponds to a realizable and non-redundant set-theoretic object (a table, a frontier) that naturally emerges from the computational behavior of the simulated machine. It would be nice to identify similar objects and derive exact tradeoffs for the conversions from and towards other types of automata (e.g., alternating, probabilistic, or pebble automata, or even Hennie machines [4]) and more powerful machines (e.g., pushdown automata). It would also be interesting to know if the large size of the alphabet over which problems Φ and Ψ are defined is necessary for the exactness of the associated trade-offs.

A preliminary version of the contents of Section 5 can be found in [29].

17 First, the reasoning for the improvement on Shepherdson’s idea in the proof of [6, Theorem A3.4] was refined. Second, the universal 1nfa constructed in the proof of [6, Theorem 4.2(1)] was observed to not be minimal: it could be implemented with only 4n + 4 states, as opposed to 8n + 3. Then, a careful application of the reachable-set construction in the proof of [6, Theorem 4.5] (on the minimal universal 1nfa obtained previously) revealed the frontier structure.

CHAPTER 2

2D versus 2N

After Chapter 1, our understanding for almost all conversions shown in Figure 1 (page 20) is perfect. The only exceptions are the two conversions associated with the 2d vs. 2n problem, and our understanding of them is so limited that we cannot even tell whether the associated trade-offs are polynomially bounded. In this chapter we will advance our knowledge about these conversions in two quite different directions. In Section 2, we will focus on the conversion from 1nfas to 2dfas and the associated complete problem of liveness. We will prove that a certain class of 2dfas of restricted information fail to solve this problem, no matter how large they are. In Section 3 we will focus on the conversion from 2nfas to 2dfas and a certain class of 2nfas of restricted bidirectionality, the sweeping 2nfas. We will prove that small automata of this kind are not closed under complement. See Section 3 of the Introduction for the motivation behind these two different approaches. We begin with a brief note on the history of the 2d vs. 2n question.

1. History of the Problem

The 2d vs. 2n question was first studied in the manuscript [51]. In it, Seiferas worked on the conversion from 1nfas to 2dfas. He suggested the strong conjecture (cf. page 15) that the trade-off is at least 2^n − 1, and presented several examples of problems that could serve as witnesses. Soon after that, Sakoda and Sipser [48] invested the question with a robust theoretical framework (cf. Introduction, Section 3.1). Among other things, they defined the classes 1n, 2d, and 2n, along with the appropriate reduction relation that allowed the identification of complete problems. A 2n-complete and a 1n-complete problem were also defined, the latter being liveness. At the same time, the problems from [51] proved to be 1n-complete, too.

In one class of attempts towards 2d ≠ 2n, people have focused on proving exponential lower bounds for the trade-off from 1nfas to 2dfas of limited bidirectionality.
Already in [51], Seiferas showed that the trade-off is at least 2^n − 1 if the 2dfas are single-pass. Later, Sipser [55] did the same for the case of 2dfas that are sweeping; much later, Leung [34] showed the lower bound remains as large even on a binary alphabet, as opposed to the exponentially large one of [55]. Recently, Hromkovic and Schnitger [22] did the same for the case when the 2dfas are oblivious, in the sense that they move identically on all inputs of the same length; they also showed the lower bound remains exponential if we relax the restriction to allow a sub-linear (in the input length) number of distinct trajectories. Unfortunately, we know that none of these theorems resolves the conjecture in its generality, since full 2dfas can be exponentially more succinct than each of these restricted variants [51, 55, 2].

A second class of attempts has focused on unary automata. Under this restriction, Chrobak [7] proved that the trade-off from 1nfas to 2dfas is at most O(n^2) and at least Ω(n^2). Note that, on one hand, this upper bound shows that 2d ⊇ 1n for unary automata, so that the situation on unary inputs is sharply different from what it is conjectured to be in the general case. On the other hand, the lower bound is the best known one even for the trade-off from general 2nfas to 2dfas. In two more recent developments, Geffert, Mereghetti and Pighizzini have established the subexponential upper bound 2^{Θ(lg^2 n)} for the trade-off from unary 2nfas to 2dfas [13], as well as a polynomial upper bound for the trade-off in the complementation of unary 2nfas [14].

Finally, there have also been some variations of the general problem of converting a 2nfa to 2dfa. If we demand that the 2dfa can decide identically to the simulated 2nfa no matter what state and input position the latter is started at (a requirement conceptually stronger than ordinary simulation, but always satisfiable [5]), then the trade-off is at least 2^{lg^k n}, for any k [25].
If we demand that the 2dfa decides identically to the simulated 2nfa only on all polynomially long inputs (a requirement conceptually weaker than ordinary simulation), then an exponential lower bound would confirm the old belief that nondeterminism is essential in logarithmic-space Turing machines (l ≠ nl) [3]. Last, if we allow the starting 2nfa to be a Hennie machine (a more powerful device, but still not powerful enough to solve non-regular problems), then converting to a 2dfa indeed costs exponentially, but only because converting to a 2nfa already does [4].

2. Restricted Information: Moles

In this section we explore the approach that we described in Section 3.2 of the Introduction. After we formally define what it means for a 2nfa to be a mole, we will move on to prove that two-way deterministic moles cannot solve liveness, irrespective of how large they are.

2.1. Preliminaries. Our notation and definitions are as explained in Section 2 of the previous chapter, plus the following few additional concepts. If A and B are two sets, then A ⊖ B denotes their symmetric difference. If f and g are two functions, then f ◦ g and fg denote their composition, returning g(f(x)) for every x, while f^k denotes the k-fold composition of f with itself. In contrast, for u a string of symbols, u^k denotes the concatenation of k copies of u.

Figure 14. (a) Three symbols in Σ5. (b) The string they define, simplified and indexed. (c) A 5-long 2-{1, 2, 4}-1 path, which is 2-disjoint on itself.

2.1-I. Behavior of a 2dfa. Given any 2dfa M over set of states Q and alphabet Σ and any string u ∈ Σ*, the behavior of M on u is the partial mapping γ_u from Q × {l, r} to Q × {l, r} that encodes all possible ‘entry-exit pairs’ as M computes on u: for every q ∈ Q,

$$\gamma_u(q, l) := \begin{cases} (p, l) & \text{if } \mathrm{lcomp}_{M,q}(u) \text{ hits left into } p, \\ (p, r) & \text{if } \mathrm{lcomp}_{M,q}(u) \text{ hits right into } p, \\ \text{undefined} & \text{if } \mathrm{lcomp}_{M,q}(u) \text{ loops or hangs,} \end{cases}$$

while γ_u(q, r) is defined analogously, with rcomp instead of lcomp.

2.1-II. Strings over Σn. Recall the alphabet Σn over which we defined liveness (cf. Introduction, Section 3.1; see also Figure 14a). A concise way to refer to a symbol of Σn is to simply list its arrows in brackets: e.g., the rightmost symbol in Figure 14a is [12,14,25,44]. The symbol [] containing no arrows is called the empty symbol. Given any string x ∈ Σn*, we define the set of its nodes in the following, quite natural way (Figure 14b): V_x := {(i, j) | i ∈ [n] & 0 ≤ j ≤ |x|}. The left-degree of a node (i, j) ∈ V_x is the number of its neighbors on the column to its left (column j − 1), or 0 if j = 0. Similarly, the right-degree of (i, j) is the number of its neighbors on the column to its right, or 0. If x has exactly |x| edges that form one live path, we say x is a path (Figure 14c). For i_l, i_r ∈ I ⊆ [n], we say x is an i_l-I-i_r path if this one live path connects the i_l-th leftmost node to the i_r-th rightmost node and visits only nodes with indices in I. If y ∈ Σn*, then x ∪ y is the unique string of length max(|x|, |y|) that has all edges of x, all edges of y, and no other edges. For k ≥ 0, we say y is k-disjoint on x if in x ∪ ([]^k y) the edges from x and from y meet at no node (Figure 14c; see also Figure 19a on page 78).
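The notions of Section 2.1-II (node degrees, union of strings, k-disjointness) can be made concrete with a small sketch. The encoding is an assumption for illustration only: a string is a list of symbols, each symbol is a set of arrows (a, b) from node a of its left column to node b of its right column, and all function names are hypothetical.

```python
def touched_nodes(x):
    """Nodes (i, j) of V_x that are endpoints of at least one edge of x.

    Symbol number j (1-based) sits between columns j-1 and j.
    """
    nodes = set()
    for j, symbol in enumerate(x, start=1):
        for a, b in symbol:
            nodes.add((a, j - 1))
            nodes.add((b, j))
    return nodes

def degrees(x, i, j):
    """Return (left-degree, right-degree) of node (i, j) in string x."""
    left = sum(1 for a, b in x[j - 1] if b == i) if j > 0 else 0
    right = sum(1 for a, b in x[j] if a == i) if j < len(x) else 0
    return left, right

def union(x, y):
    """The string x ∪ y: all edges of x, all edges of y, and no others."""
    n = max(len(x), len(y))
    pad = lambda s: list(s) + [set()] * (n - len(s))
    return [a | b for a, b in zip(pad(x), pad(y))]

def is_k_disjoint(y, x, k):
    """True iff in x ∪ ([]^k y) the edges of x and of y meet at no node."""
    shifted = [set()] * k + list(y)
    return not (touched_nodes(x) & touched_nodes(shifted))

# A 2-long path over two rows: (1,0) -> (2,1) -> (1,2).
x = [{(1, 2)}, {(2, 1)}]
print(degrees(x, 2, 1), is_k_disjoint(x, x, 1), is_k_disjoint(x, x, 0))
```

In this toy example the path is 1-disjoint on itself but, trivially, not 0-disjoint on itself.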


Figure 15. State p of focus (5, l) is reading the middle symbol: if it moves right, the next focus will be (1, l) or (3, l); if it moves left, the next focus will be (1, r) or (5, r).

2.2. Moles. To define when a 2nfa over Σn is a mole, we need a way of describing the notion of a state ‘focusing on’ some particular node of the current symbol. We define a focus to be any pair (i, s) ∈ [n] × {l, r} of index and side. We write s̄ for the side opposite s. The (i, s)th node of a string x is the ith node of its leftmost (resp., rightmost) column, if s = l (resp., if s = r). The connected component of that node in the graph implied by x is called the (i, s)th component of x. By x ↾ (i, s) we denote the unique string that has the same length as x, all edges of the (i, s)th component of x, and no other edges.

A 2nfa is a mole if each state p of it can be assigned a focus (i_p, s_p) so that, whenever at p, the automaton behaves like a mole located on the (i_p, s_p)th node of the current symbol and facing s_p: (i) it can ‘see’ only the component of that node, and (ii) it can ‘move’ only to nodes in that same component. More carefully:

1. Definition. Let M = (·, δ, ·) be a 2nfa over a set of states Q and the alphabet Σn. An assignment of foci for M is any mapping φ : Q → [n] × {l, r} such that, for any states p, q ∈ Q, symbol a ∈ Σn, and side s ∈ {l, r}: whenever M is at p reading a,
(i) its next move depends only on the component containing the node which p is focused on: δ(p, a) = δ(p, a ↾ φ(p));
(ii) its next state and position can only be such that the new focused node belongs to the same connected component as the node which p is focused on: if δ(p, a) ∋ (q, s), then (∃i ∈ [n]) [φ(q) = (i, s̄) & a ↾ (i, s) = a ↾ φ(p)].

If such φ exists, we say M is a mole; we also say φ(p) is the focus of p.

To understand Condition (ii), consider as an example the case s = r (see also Figure 15): If p on a moves right into q, then in the new position q must focus on the left column (φ(q) = (·, s̄) = (·, l)), the one shared with the previous position. Moreover, if in this column q focuses on the ith node (φ(q) = (i, l)), then in the previous position this node (now the ith node of the right column) must belong to the same connected component as the node which p focused on (a ↾ (i, r) = a ↾ φ(p)).
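The restriction a ↾ (i, s) and the foci permitted by Condition (ii) can be sketched on a single symbol. This is an illustration under assumptions, not part of the formal development: a symbol is encoded as a set of arrows (a, b) from left node a to right node b, the symbol [12,14,25,44] is read as the arrows 1→2, 1→4, 2→5, 4→4, and the names `restrict` and `next_foci` are hypothetical. The function `next_foci` transcribes Condition (ii) literally.

```python
def restrict(symbol, focus):
    """Return a ↾ (i, s): the arrows of `symbol` in the component of node (i, s)."""
    component = {focus}
    changed = True
    while changed:                      # grow the component until it is stable
        changed = False
        for a, b in symbol:
            ends = {(a, "l"), (b, "r")}
            if ends & component and not ends <= component:
                component |= ends
                changed = True
    return {(a, b) for a, b in symbol if (a, "l") in component}

def next_foci(symbol, focus_p, s, n):
    """Foci that Condition (ii) allows for the next state after a move to side s."""
    opposite = {"l": "r", "r": "l"}
    target = restrict(symbol, focus_p)
    return {(i, opposite[s]) for i in range(1, n + 1)
            if restrict(symbol, (i, s)) == target}

A = {(1, 2), (1, 4), (2, 5), (4, 4)}    # assumed reading of [12,14,25,44]
print(restrict(A, (1, "l")))            # the component of left node 1
print(next_foci(A, (1, "l"), "r", 5))   # foci available after moving right
```

In this example a state focused on (1, l) that moves right may only hand over to a state focused on (2, l) or (4, l), since right nodes 2 and 4 are the ones connected to left node 1.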


Note that the 1nfa from page 21 clearly satisfies Definition 1. Hence, small one-way nondeterministic moles can indeed solve liveness—with just n states, actually. In contrast, we will prove the following. 2. Theorem. Two-way deterministic moles cannot solve liveness. Remark that the theorem applies to all two-way deterministic moles, as opposed to only small ones. We also stress that the main purpose of Definition 1 is to disambiguate the intuitive notion of a mole—in contrast, the arguments in our proofs will heavily rely on intuition. 2.3. Mazes. What makes moles so weak is the fact that, as they move through the input, they can only observe the part of the graph directly connected to their current location. The rest of the graph is not observable, even if it occupies the same symbols as the observable part, and therefore does not affect the computation. Lemma 3 below turns this intuition into a clean fact that can be used in proofs. Before stating it, we need to talk about mazes and how moles compute on them and their compositions. Intuitively, a maze is any string on which some nodes have been designated as ‘entry-exit gates’ for moles (Figure 19b on page 78). More carefully, for x ∈ Σ ∗ , let Vx0 ⊆ Vx consist of every node that has exactly one of its two degrees equal to 0 (and can thus serve as a gate). A maze on x is any pair (x, X) where X ⊆ Vx0 . The computation of a mole on a maze is the same object as the computation of any 2nfa on any string, with the extra condition that it ‘starts by entering a gate’ and ‘if it exits a gate, it ends immediately’. Formally, let χ = (x, X) be a maze, u = (i, j) ∈ X a gate with 0-degree side s, and p a state of a mole M with focus φ(p) = (i, s). Then, the computation compM,p,u (χ) of M on χ from p and u (note the overloading of operator comp) is a prefix of either compM,p,j+1 (x) (if s = l) or compM,p,j (x) (if s = r). 
The prefix ends the first time (if ever) it reaches a point (qt , jt ) where the focus φ(qt ) = (it , st ) is on a gate with 0-degree side st . Note that x may contain nodes that have degree 0 on one of their two sides but are not gates; the computation may very well visit the 0-degree side of these nodes without having to terminate. To compose two mazes means to draw their strings on top of each other and then discard all coinciding gates (Figure 19c). More carefully, mazes χ = (x, X) and ψ = (y, Y ) are composable iff |x| = |y| (so that Vx = Vy = V ) and their graphs intersect only at gates and only appropriately: every v ∈ V , either has both its degrees equal to 0 in at least one of x, y; or is a gate in both mazes, with a different 0-degree side in each of them. If χ, ψ are composable, then their composition is the pair χ ◦ ψ := (x ∪ y, X ⊖ Y ). Clearly, the composition is a maze, too. Note that, by the conditions of composability, in each one of the symbols of x ∪ y every non-empty connected component comes entirely from


exactly one of x or y. Hence, when a mole reads a symbol, its next step depends on exactly one of x or y. Generalizing, we can prove the following. 3. Lemma. Let χ and ψ be as above, and ω := χ◦ψ be their composition. Consider any computation c := compM,p,u (χ ◦ ψ) of a mole M from a gate u ∈ X ⊖ Y that comes from X. Then there exists a unique list of computations c1 , c2 , . . . such that: • each ct is a computation of M on χ (resp., on ψ) iff t is odd (even); • c1 starts from p and u, while every ct+1 starts from the state and gate where ct ends; • if we remove the first point of every ct after c1 and then concatenate all computations, the resulting computation is c. Put another way, if we can decompose a maze ω into two mazes χ and ψ, then any computation c of a mole on ω can be uniquely decomposed into ‘subcomputations’ c1 , c2 , . . . that alternate between χ and ψ. We say these computations are the fragments of c with respect to the decomposition ω = χ ◦ ψ. Clearly, either all fragments are finite, and then their list is infinite iff c is; or not all fragments are finite, in which case their list is finite and the only infinite fragment is the last one. Note that different decompositions of ω may lead to different decompositions of c. 2.4. Hard inputs. In Section 2.5 we will fix an arbitrary deterministic mole and prove that it fails against liveness. To this end, we will construct inputs on which the automaton decides incorrectly. Those fatally hard strings will be extremely long. However, we will build them out of other, much shorter (but still very long) strings, which already strain the ability of the automaton to process the information on its tape. In this section we describe those shorter strings. We start with inputs which can be built for any 2dfa and later (Section 2.4-V) focus on inputs that can be built particularly for deterministic moles. So, fix M to be an arbitrary 2dfa over state set Q and alphabet Σ. 2.4-I. Dilemmas. 
Consider any property P ⊆ Σ* of the strings over Σ, and assume that it is infinitely extensible to the right, in the sense that every string that has the property can be right-extended into a strictly longer one that also has it: (∀y ∈ P)(∃z ≠ ε)(yz ∈ P). For example, the property of being of even length is of this kind.

Given any y ∈ P, we can perform the following experiment. For each p ∈ Q, we examine the computation lcomp_{M,p}(y) and check if it hits right: if it does, we set a bit a_{y,p} to 1; otherwise, the computation hangs, loops, or hits left, and a_{y,p} is set to 0. In the end, we build the bit-vector a_y := (a_{y,p})_{p∈Q}. This is our outcome.

How does the outcome change if we right-extend y into some yz ∈ P? How do a_y and a_{yz} compare? For every p, clearly lcomp_{M,p}(y) is a prefix of lcomp_{M,p}(yz). So, if the first computation hits left, loops, or hangs, so


does the second one; but if the first one hits right, there is no guarantee what the second computation does. Hence, all bits in a_y that are 0 keep the same value in a_{yz}; but a bit which is 1 may turn into a 0. Overall, if “≥” is the natural component-wise order, we have the following.

4. Lemma. For all y, yz ∈ P: a_y ≥ a_{yz}.

What happens to the outcome of the experiment if we further right-extend y into yzz′ ∈ P? And then into yzz′z′′ ∈ P? While y is infinitely right-extensible inside P, the outcome may decrease only finitely many times. Obviously then, from some point on it must stop changing. When this happens, the extension of y that we have arrived at is a very useful tool. The following definition and lemma talk about it formally.

5. Definition. Let P ⊆ Σ*. An l-dilemma over P is any string y ∈ P such that:¹ for all extensions yz ∈ P and all states p ∈ Q, lcomp_{M,p}(y) hits right ⇐⇒ lcomp_{M,p}(yz) hits right. An r-dilemma over P is defined similarly, on left-extensions and rcomp.

6. Lemma. Let P ⊆ Σ*. If P is non-empty and infinitely extensible to the right (resp., left), then l-dilemmas over P (r-dilemmas over P) exist.

Proof. Pick any y ∈ P and keep extending it in the direction of infinite extensibility until a_y stops changing. When it does, the extension is a dilemma (as are, of course, all further extensions inside P).

In [51], dilemmas are called “blocking strings”. Both names serve as reminders of the way these strings are used, as we now explain.

7. Lemma. Suppose x ∈ Σ*, y is an l-dilemma over P, yz ∈ P, and some computation c := lcomp_{M,p}(xyz) crosses the xy-z boundary. After the first such crossing, c never visits x again and it eventually hits right.

Proof. Consider the first time c crosses the xy-z boundary (Figure 16a). Let r be the state resulting from this crossing, and q the state resulting from the last crossing of the x-yz boundary before that.
Then, the computation between these two crossings is lcomp_{M,q}(y) and hits right (into r). Since y is an l-dilemma over P and z does not spoil the property (yz ∈ P), we know that lcomp_{M,q}(yz) also hits right. But this computation is a suffix of c. So, c also hits right. Moreover, after crossing the xy-z boundary, it never visits x again.

1 Note that the given condition is the same as (∀yz ∈ P)(a_y = a_{yz}), but rather more informative. Also note that the “⇐=” part of the displayed equivalence is trivially true of all y, by Lemma 4. What is important is the “=⇒” part: on every extension of y in P, the computation will keep hitting right.
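The experiment behind Lemma 4 is easy to simulate. The sketch below is a toy model under stated assumptions: a 2dfa is a dictionary mapping (state, symbol) to (next state, step), a missing entry means the machine hangs, and the three-state machine `DELTA` is a hypothetical example, not one from the text. The function `outcome` computes the bit-vector a_y, and `lemma4_holds` checks a_y ≥ a_{yz} component-wise over all short strings, taking P = Σ*.

```python
from itertools import product

def run(delta, state, u, pos=0):
    """Run a 2dfa on u from `state` at 0-based index `pos`.

    Returns ("left", q) or ("right", q) if the head falls off u in state q,
    ("hang", None) on an undefined move, ("loop", None) on a repeated
    configuration.
    """
    seen = set()
    while 0 <= pos < len(u):
        if (state, pos) in seen:
            return ("loop", None)
        seen.add((state, pos))
        move = delta.get((state, u[pos]))
        if move is None:
            return ("hang", None)
        state, step = move
        pos += step
    return ("left" if pos < 0 else "right", state)

def outcome(delta, states, y):
    """The bit-vector a_y: bit p is 1 iff lcomp_{M,p}(y) hits right."""
    return tuple(int(run(delta, p, y)[0] == "right") for p in states)

STATES = (0, 1, 2)                       # hypothetical toy machine
DELTA = {(0, "a"): (1, +1), (0, "b"): (0, +1),
         (1, "a"): (2, -1), (1, "b"): (1, +1),
         (2, "a"): (0, +1)}              # (2, "b") is undefined: the machine hangs

def words(max_len):
    for k in range(1, max_len + 1):
        for w in product("ab", repeat=k):
            yield "".join(w)

def lemma4_holds(max_len=3):
    return all(a >= b
               for y in words(max_len) for z in words(max_len)
               for a, b in zip(outcome(DELTA, STATES, y),
                               outcome(DELTA, STATES, y + z)))

print(outcome(DELTA, STATES, "b"), outcome(DELTA, STATES, "ba"), lemma4_holds())
```

On this machine, extending y = b by z = a turns one bit of the outcome from 1 to 0, while no bit ever turns from 0 to 1.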


Figure 16. Two-way computations on dilemmas and on generic strings (see text).

In total, once the computation crosses the xy-z boundary, it is restricted inside yz and forced to eventually hit right. Put another way, when M enters y, it faces a ‘dilemma’: either it will stay forever inside xy, never crossing the xy-z boundary; or it will cross it, but then also hit right without visiting x again. In effect, y ‘blocks’ M from returning to x after having seen z, and ‘locks’ it into hitting right. In yet other words, y makes sure that every left computation of M on xyz that hits left, hangs, or loops does so inside xy, before making it to z.

2.4-II. Generic strings. Consider again some P ⊆ Σ* which is infinitely extensible to the right. For each y ∈ P, we can define the set of states that can be produced on the rightmost boundary of y by left computations:

lstates(y) := {q ∈ Q | (∃p ∈ Q) lcomp_{M,p}(y) hits right into q}.

How does this set change if we extend y into a string yz ∈ P? How does it compare to the set lstates(yz)? Consider the function lmap(y, z)(·), defined as follows (Figure 16b): for each q ∈ lstates(y), the computation comp_{M,q,|y|+1}(yz) is examined; if it hits right into some state r, then lmap(y, z)(q) := r; otherwise, it hits left, loops, or hangs, and lmap(y, z)(q) is left undefined.

Note that the values of lmap(y, z) are all in lstates(yz). Indeed, if r is such a value, then r = lmap(y, z)(q) for some q ∈ lstates(y). Hence, the computation comp_{M,q,|y|+1}(yz) hits right into r and some computation lcomp_{M,p}(y) hits right into q. Combining the two, we get the computation lcomp_{M,p}(yz), that hits right into r. We thus conclude that r ∈ lstates(yz), as claimed.

Moreover, the values of lmap(y, z) cover lstates(yz). Indeed, if some state r ∈ lstates(yz), then some computation c := lcomp_{M,p}(yz) hits right into r. We know c crosses the y-z boundary, so let q be the state produced by the first such crossing.
The computation before this crossing is lcompM,p (y) and hits right into q, so q ∈ lstates(y). The computation after the crossing is compM,q,|y|+1 (yz) and, as a suffix of c, hits right into r. We thus conclude that lmap(y, z)(q) = r, namely that lmap(y, z) covers r.
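The two observations above (the values of lmap(y, z) lie in lstates(yz) and cover it) can be checked mechanically on a toy machine. As before, this is only a sketch under assumptions: the 2dfa encoding and the machine `DELTA` are hypothetical, and P is taken to be all of Σ*.

```python
from itertools import product

def run(delta, state, u, pos=0):
    """Run a 2dfa on u from `state` at 0-based index `pos`; a missing
    transition means a hang, a repeated configuration means a loop."""
    seen = set()
    while 0 <= pos < len(u):
        if (state, pos) in seen:
            return ("loop", None)
        seen.add((state, pos))
        move = delta.get((state, u[pos]))
        if move is None:
            return ("hang", None)
        state, step = move
        pos += step
    return ("left" if pos < 0 else "right", state)

def lstates(delta, states, y):
    """States produced on the right boundary of y by left computations."""
    results = (run(delta, p, y) for p in states)
    return {q for kind, q in results if kind == "right"}

def lmap(delta, states, y, z):
    """Partial map q -> r: from q on the first symbol of z, inside yz,
    the machine hits right into r."""
    table = {}
    for q in lstates(delta, states, y):
        kind, r = run(delta, q, y + z, pos=len(y))
        if kind == "right":
            table[q] = r
    return table

STATES = (0, 1, 2)                       # hypothetical toy machine
DELTA = {(0, "a"): (1, +1), (0, "b"): (0, +1),
         (1, "a"): (2, -1), (1, "b"): (1, +1),
         (2, "a"): (0, +1)}

def surjection_holds(max_len=3):
    ws = ["".join(w) for k in range(1, max_len + 1)
          for w in product("ab", repeat=k)]
    return all(set(lmap(DELTA, STATES, y, z).values()) == lstates(DELTA, STATES, y + z)
               and len(lstates(DELTA, STATES, y)) >= len(lstates(DELTA, STATES, y + z))
               for y in ws for z in ws)

print(lstates(DELTA, STATES, "b"), lmap(DELTA, STATES, "b", "a"), surjection_holds())
```

For y = b and z = a the map is defined on only one of the two states in lstates(y), which illustrates why lmap(y, z) is in general a partial surjection rather than a total one.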


Overall, lmap(y, z) is a partial surjection from the set lstates(y) to the set lstates(yz). This immediately implies its domain has enough elements to cover the range, so we know |lstates(y)| ≥ |lstates(yz)|. The next lemma summarizes our findings. Analogously to lstates(y), the set rstates(z) consists of all states that can be produced on the leftmost boundary of z by right computations. Clearly, the symmetric arguments apply. Note that these involve a partial surjection rmap(y, z) from rstates(z) to rstates(yz), defined analogously to lmap(y, z). 8. Lemma. For all y, yz ∈ P , the function lmap(y, z) partially surjects lstates(y) to lstates(yz); hence |lstates(y)| ≥ |lstates(yz)|. In the same manner, for all yz, z ∈ P , the function rmap(y, z) partially surjects rstates(z) to rstates(yz); hence |rstates(yz)| ≤ |rstates(z)|. As in Section 2.4-I, we now ask what happens to the size of the set lstates(y) as we keep right-extending y inside P . Although y is infinitely right-extensible, the size of the set can decrease only finitely many times. Hence, from some point on it must stop changing. When this happens, we have arrived at another useful tool. 9. Definition. Let P ⊆ Σ ∗ . A string y is l-generic over P if y ∈ P and:2 for all extensions yz ∈ P , |lstates(y)| = |lstates(yz)|. An r-generic string over P is defined symmetrically, on left-extensions and rstates. A string that is simultaneously l-generic and r-generic over P is called generic. 10. Lemma. Let P ⊆ Σ ∗ . If P is non-empty and infinitely extensible to the right (resp., left), then l-generic strings over P (r-generic strings over P ) exist. In addition, if yl is l-generic and yr is r-generic, then every string yl zyr ∈ P is generic. Proof. For the last claim, we simply note that every right-extension of an l-generic string inside P is also an l-generic string. Similarly in the other direction. Generic strings were first introduced by Sipser [55], for sdfas and over the property of liveness. 
As we will show in the next section, they strengthen dilemmas. Before presenting that argument, let us prove a last fact about the operators lstates and rstates.

11. Lemma. For any two strings y and z, lstates(yz) ⊆ lstates(z). Similarly, in the other direction, rstates(y) ⊇ rstates(yz).

² Note that the “≥” part of the displayed equality |lstates(y)| = |lstates(yz)| is trivial for all y, by Lemma 8. What is important is the “≤” part: on every extension of y in P, the set will manage to stay as large.


Proof. We prove the first containment. Pick any r ∈ lstates(yz) and any computation d := lcompM,p (yz) that hits right into r. (Figure 16b.) We know d crosses the y-z boundary. Let q ′ be the state produced by the last such crossing. Then lcompM,q′ (z) is a suffix of d, and therefore also hits right into r. So, r ∈ lstates(z). 2.4-III. Dilemmas versus generic strings. To examine the relation between dilemmas and generic strings, it is helpful to have the following alternative characterizations of the two classes of strings, in terms of the functions lmap(y, z) and rmap(y, z). 12. Lemma. Suppose y ∈ P ⊆ Σ ∗ . Then y is an l-dilemma over P iff for all yz ∈ P the function lmap(y, z) is total. Similarly for any z ∈ P and for r-dilemmas and rmap(y, z). Proof. For the forward direction, assume y is an l-dilemma over P . Consider any yz ∈ P and any q ∈ lstates(y). (Figure 16b.) Let c := lcompM,p (y) be a computation that hits right into q. We know c is a prefix of d := lcompM,p (yz). So, d crosses the y-z boundary (the first such crossing is into q), and thus hits right (Lemma 7). Hence, its suffix compM,q,|y|+1 (yz) hits right, too, which implies lmap(y, z)(q) is defined. For the reverse direction, fix y ∈ P and suppose lmap(y, z) is total for all yz ∈ P . Consider any such yz, any p ∈ Q, and assume c := lcompM,p (y) hits right into some state q. (Figure 16b.) Then q ∈ lstates(y). Therefore, lmap(y, z)(q) is defined. This implies c′ := compM,q,|y|+1 (yz) hits right. Combining c and c′ , we get the computation d := lcompM,p (yz). As c′ is a suffix of d, we know d hits right as well. 13. Lemma. Suppose y ∈ P ⊆ Σ ∗ . Then y is l-generic over P iff for all yz ∈ P the function lmap(y, z) is total and bijective. Similarly for z ∈ P being r-generic and for rmap(y, z). Proof. For the forward direction, suppose y is l-generic and pick any yz ∈ P . We know lmap(y, z) is a partial surjection from lstates(y) to lstates(yz). 
Since y is l-generic, we also know the two sets have the same size. So, lmap(y, z) must be total and injective, hence bijective. Conversely, fix y ∈ P and suppose lmap(y, z) is total and bijective for all yz ∈ P. Then clearly, for every such yz, the sets lstates(y) and lstates(yz) must have the same size.

Intuitively, a dilemma guarantees that the computations that manage to survive through it will also survive through every extension that preserves the property. A generic string guarantees that, in addition, these computations will keep exiting each extension into different states.

14. Lemma. Let P ⊆ Σ*. Over P, every l-generic string is an l-dilemma and every l-dilemma is right-extensible into an l-generic string. Similarly for r-generic strings, r-dilemmas, and left-extensions.


Proof. Lemmata 12 and 13 prove the first claim. For the second claim, we simply note that every string in P can be right-extended into an l-generic string.

2.4-IV. Traps. Consider a property P ⊆ Σ* which is infinitely extensible in either direction and closed under concatenation. For this section, fix ϑ as a generic string over P, and let

    L := rstates(ϑ),    R := lstates(ϑ),

denote the sets of states producible on the leftmost and rightmost boundary of ϑ. Note that ϑ is both an l-dilemma and an r-dilemma (Lemma 14). A trap (on ϑ) is any string of the form ϑxϑ, where x ∈ P is the infix. By Lemma 10 and the closure of P under concatenation, traps are still generic strings. However, they further restrict M ’s freedom: By Lemma 13, the function lmap(ϑ, xϑ) is a total bijection from lstates(ϑ) = R to lstates(ϑxϑ). Since lstates(ϑxϑ) ⊆ R (by Lemma 11), lmap(ϑ, xϑ) is a total bijection from R to a subset of R. Clearly, this is possible only if this subset is R itself. So, lmap(ϑ, xϑ) simply permutes R. To simplify notation, we denote this permutation by αx . Namely, αx := lmap(ϑ, xϑ). Similarly, rmap(ϑx, ϑ) permutes L, and we denote this permutation as βx . Overall, we have proved the following. 15. Lemma. For all x ∈ P : αx permutes R and βx permutes L. Intuitively, in each direction, the computations that manage to cross the first copy of ϑ eventually cross the entire trap; but, after this first copy, they collectively do nothing more than simply permute the set of states that they have already produced. As we now show, the two permutations fully describe the behavior of M on the trap. 16. Lemma. For all x, y ∈ P : (αx , βx ) = (αy , βy ) =⇒ γϑxϑ = γϑyϑ . Proof. Suppose (αx , βx ) = (αy , βy ) and consider any p ∈ Q. We show γϑxϑ and γϑyϑ agree on (p, l)—the proof for (p, r) is similar. We examine the computations cx := lcompM,p (ϑxϑ) and cy := lcompM,p (ϑyϑ). Clearly, these behave identically up to the first crossing of the ‘critical’ boundary between ϑ and xϑ or yϑ. If one of them hits left, loops, or hangs, it does so inside ϑ (since ϑ is an l-dilemma) without crossing the critical boundary; so, the other computation behaves identically, thus γϑxϑ (p, l) = γϑyϑ (p, l). 
If one of them hits right, then it crosses the critical boundary into some state q, as does the other one; but then they both hit right, into the same state r := αx (q) = αy (q), so γϑxϑ (p, l) = γϑyϑ (p, l) = (r, r). We call (αx , βx ) the inner-behavior of M on the trap ϑxϑ. Note the distinction from the ‘behavior’ γϑxϑ .
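The finiteness step behind Lemma 15 (an injective map from a finite set into itself must be onto, hence a permutation) can be checked by brute force. The Python sketch below is only an illustration: the three state names are hypothetical and are not tied to any particular mole M.

```python
from itertools import product

def injective_self_maps(states):
    """Yield every injective map from `states` into itself, as a dict."""
    states = list(states)
    for images in product(states, repeat=len(states)):
        if len(set(images)) == len(states):   # no two states share an image
            yield dict(zip(states, images))

R = ["q1", "q2", "q3"]                        # hypothetical state names
maps = list(injective_self_maps(R))

# Every injective self-map of a finite set is onto, i.e. it permutes the set.
all_onto = all(set(m.values()) == set(R) for m in maps)
print(len(maps), all_onto)
```

On a three-element set this enumerates the 3! = 6 permutations and nothing else, which is the situation of lmap(ϑ, xϑ) acting on R.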

Figure 17. A two-way computation on a trap (see text).

An interesting case arises when ϑ is an infix of the infix itself. Then the inner-behavior of M on the trap can be deduced from its inner-behavior on the traps that are induced by the other two pieces of the infix.

17. Lemma. Suppose x, y ∈ P and z := xϑy. Then (α_z, β_z) = (α_x ◦ α_y, β_y ◦ β_x).

Proof. To show that α_z = α_x ◦ α_y (the argument for β_z = β_y ◦ β_x is similar), we pick an arbitrary q ∈ R and show that α_z(q) = α_y(α_x(q)). (Figure 17.) We know q is produced by some right-hitting left computation on ϑ, say c_1 := lcomp_{M,p}(ϑ) for some state p. Since ϑ is an l-dilemma over P and ϑzϑ ∈ P, we know c := lcomp_{M,p}(ϑzϑ) also hits right, into some state s. Therefore, α_z(q) = s. Before hitting right, c surely crosses the ϑxϑ-yϑ boundary; let r be the state produced by the first such crossing. Clearly, the computation c_2 := comp_{M,q,|ϑ|+1}(ϑxϑ) hits right into r, and hence α_x(q) = r. Moreover, the suffix of c after the first crossing of the ϑxϑ-yϑ boundary is c_3 := comp_{M,r,|ϑxϑ|+1}(ϑxϑyϑ) and obviously hits right into s. However, since ϑ is an l-dilemma over P and ϑyϑ ∈ P, we know c_3 never visits the prefix ϑx. Hence, it can also be written as c_3 = comp_{M,r,|ϑ|+1}(ϑyϑ). Since it hits right into s, we conclude that α_y(r) = s. Overall, α_z(q) = s = α_y(r) = α_y(α_x(q)).

An obvious generalization holds when the infix contains multiple copies of ϑ. In a particular case of interest, the infix consists of several ϑ-separated copies of some x ∈ P. Specifically, for any k ≥ 1, we define x^(k) := x(ϑx)^{k−1} and prove the following.

18. Lemma. For any x ∈ P and for any k ≥ 1: (α_{x^(k)}, β_{x^(k)}) = ((α_x)^k, (β_x)^k).
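As a small sanity check of the algebra in Lemmata 17 and 18, the sketch below represents permutations of {0, ..., m−1} as Python tuples, composes them left to right (first α_x, then α_y, as in the proof of Lemma 17), and takes k-th powers as in Lemma 18. The helper names `then` and `power` and the choice m = 4 are assumptions made purely for illustration. The sketch also confirms the standard fact that the (m!)-th power of any permutation of m elements is the identity.

```python
from itertools import permutations
from math import factorial

def then(p, q):
    """Left-to-right composition: apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

def power(p, k):
    """k-fold composition of p with itself (k >= 0)."""
    result = tuple(range(len(p)))            # identity permutation
    for _ in range(k):
        result = then(result, p)
    return result

m = 4                                         # hypothetical number of states
identity = tuple(range(m))
mu = factorial(m)                             # any multiple of m! works as well

# For every permutation p of m elements: p^mu is the identity, so p^(2*mu+1) = p.
powers_collapse = all(
    power(p, mu) == identity and power(p, 2 * mu + 1) == p
    for p in permutations(range(m))
)
print(powers_collapse)
```

The same collapse is what makes a long run of ϑ-separated copies of one infix behave, as far as the permutations are concerned, like a single copy.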

2.4-V. Hard inputs to deterministic moles. We now assume that the M of the previous sections is defined over Σ_n and that it is actually a mole. We will design inputs on which M misses a significant amount of information. All these inputs are going to be paths (cf. Section 2.1-II).

We fix some I ⊆ [n] and i ∈ I, and consider the set Π ⊆ Σ_n* of all i-I-i paths. Clearly, Π is non-empty, infinitely extensible in both directions, and closed under concatenation. Hence, by Lemma 10, generic strings over Π exist. We fix ϑ to be one, and let κ := |ϑ|. We also set L := rstates(ϑ), R := lstates(ϑ), and let µ := lcm(|L|!, |R|!) be the least common multiple of the sizes of the corresponding permutation groups.

For every l ≥ 1, we consider all traps (on ϑ) with infixes of length l and collect into a set Ω_l all inner-behaviors that M exhibits on them:

    Ω_l := {(α_x, β_x) | x is an i-I-i path of length l}.

As shown in the next fact, every inner-behavior that can be induced by an l-long infix can also be induced by an infix of length l + 2µ(l + κ). The subsequent fact explains that sometimes the converse is also true.

19. Lemma. For every l ≥ 1: Ω_l ⊆ Ω_{l+2µ(l+κ)}.

Proof. Pick any behavior (α, β) ∈ Ω_l. We know that some l-long infix x ∈ Π induces this behavior, namely (α, β) = (α_x, β_x). Consider the path x^(2µ+1) = x^(µ) ϑxϑ x^(µ). This is also in Π and of length (2µ + 1)l + 2µκ = l + 2µ(l + κ). Moreover, by Lemma 18 and the selection of µ, we know that this path induces the behavior

    ((α_x)^{2µ+1}, (β_x)^{2µ+1}) = ((α_x)^{2µ} α_x, β_x (β_x)^{2µ}) = (α_x, β_x) = (α, β).

Hence, (α, β) ∈ Ω_{l+2µ(l+κ)}.

20. Lemma. There exists³ l ≥ 1 such that Ω_l = Ω_{l+2µ(l+κ)}.

Proof. The constant (|L|!) × (|R|!) upper bounds the sizes of all sets Ω_1, Ω_2, ..., so at least one of them is of maximum size. Pick l so that Ω_l is such. Then both Ω_l ⊆ Ω_{l+2µ(l+κ)} (by Lemma 19) and |Ω_l| ≥ |Ω_{l+2µ(l+κ)}| (by the selection of l). Necessarily then, the two sets must be equal.

Intuitively, for the lengths l and l + 2µ(l + κ), this last fact says that between two copies of ϑ, every i-I-i path of either length can be replaced by some path of the other length without M noticing the trick (cf. Lemma 16).

2.5. The proof. We now fix an arbitrary deterministic mole M = (s, δ, f) over Σ_5 and prove that it fails to solve liveness. To this end, in Sections 2.5-II and 2.5-III we construct a maze that ‘confuses’ M.
Our most important building blocks are the paths of the next section.

2.5-I. Three special paths. In this section we fix n := 5, i := 2, and I := {1, 2}. For these n, i, and I, we fix Π, ϑ, κ, and µ as in Section 2.4-V, we let λ be a length as in Lemma 20, and we set Λ := 2µ(λ + κ).

21. Lemma. There exist paths π, ρ, σ ∈ Π such that
• M cannot distinguish among them: γ_π = γ_ρ = γ_σ.
• ρ is Λ-disjoint on itself, and π is Λ-disjoint on σ.
• π is Λ-shorter than ρ, and ρ is Λ-shorter than σ: |ρ| − |π| = |σ| − |ρ| = Λ.
• π is non-empty but short: 0 < |π| ≤ Λ.

³ Note that the argument essentially shows the existence of infinitely many such l.

Figure 18. (a) Selecting the paths η and ι; then the ‘mirrors’ ϑ′ (of ϑ) and η′ (of η). (b) The entire path ρ and how it is Λ-disjoint on itself.

Proof. Each one of π, ρ, and σ is going to be a trap on ϑ. So, the proof consists in properly selecting the corresponding infixes x, y, z ∈ Π.

We set ρ := ϑyϑ, where y has length λ + Λ and guarantees ρ is Λ-disjoint on itself. Constructing y is not hard (Figure 18a): We pick paths
    η := any 2-I-1 path of length λ,
    ϑ′ := the 1-I-1 path of length κ that is 0-disjoint on ϑ,
    ι := any 1-I-1 path of length Λ − (2κ + λ), and
    η′ := the 1-I-2 path of length λ that is 0-disjoint on η.
Then, setting y := ηϑ′ιϑ′η′ we see that this is indeed a 2-I-2 path of length λ + Λ; and shifting ρ = ϑyϑ = ϑηϑ′ιϑ′η′ϑ on a copy of itself by Λ = |ϑηϑ′ι| causes only its prefix ϑηϑ′ to overlap with the ‘mirroring’ suffix ϑ′η′ϑ, so that no vertex is shared (Figure 18b).

We set π := ϑxϑ, where x has length λ and guarantees π is indistinguishable from ρ. Selecting x is easy: Since y is of length λ + Λ, the inner-behavior (α_y, β_y) of M on ρ is in Ω_{λ+Λ}, and therefore in Ω_λ (by the selection of λ, as in Lemma 20). Hence, there exist λ-long paths that induce this inner-behavior. Picking x to be such a path, we know that (α_x, β_x) = (α_y, β_y) and hence γ_π = γ_ρ (Lemma 16).

We set σ := ϑzϑ, where z has length λ + 2Λ and guarantees that π is Λ-disjoint on σ and that σ is indistinguishable from π. Note that, given the lengths of x and z, the disjointness condition amounts to saying that π and σ should not intersect when ‘centered’ on top of each other. The construction of z is trickier. We start by selecting a path y′ that is as long as y (i.e., of length λ + Λ) and does not intersect π when the two are ‘centered’ on top of each other (i.e., ϑxϑ is (Λ/2 − κ)-disjoint on y′). This selection is trivial: we just take the unique 1-I-1 path that is as long as π (i.e., of length λ + 2κ) and 0-disjoint on it, and extend it by Λ/2 − κ in both directions into any 2-I-2 path. Now, the inner-behavior (α_{y′}, β_{y′}) of M on ϑy′ϑ is in Ω_{λ+Λ}, and hence in Ω_λ. Therefore, we can find a λ-long x′ ∈ Π that induces the same behavior, (α_{x′}, β_{x′}) = (α_{y′}, β_{y′}). We set

    z := (x′)^(µ) ϑ y′ ϑ (x′)^(µ−1) ϑ x,

the path containing 2µ + 1 ϑ-separated paths, all copies of x′ except the middle and rightmost ones, which copy y′ and x. The length of z is indeed λ + 2Λ. Moreover, σ = ϑzϑ symmetrically extends y′ by |ϑ(x′)^(µ)ϑ| = |ϑ(x′)^(µ−1)ϑxϑ| = Λ/2 + κ, which in turn symmetrically out-lengths π by Λ/2 − κ. Overall, σ symmetrically out-lengths π by Λ without intersecting it. That is, π is Λ-disjoint on σ. Finally, the inner-behavior (α_z, β_z) of M on σ is

    ((α_{x′})^µ α_{y′} (α_{x′})^{µ−1} α_x, β_x (β_{x′})^{µ−1} β_{y′} (β_{x′})^µ) = ((α_{x′})^{2µ} α_x, β_x (β_{x′})^{2µ}) = (α_x, β_x),

by Lemmata 17 and 18, and by the selection of µ. Hence, γ_σ = γ_π.
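The length bookkeeping in the proof above is easy to get wrong, so here is a hedged numeric check in Python. The function name `trap_lengths` and the sample values of λ, κ, µ are hypothetical; the sketch assumes µ ≥ 2 so that (x′)^(µ−1) is defined, and it verifies lengths only, not disjointness or indistinguishability.

```python
def trap_lengths(lam, kappa, mu):
    """Lengths of the infixes x, y, z and of the traps pi, rho, sigma (Lemma 21)."""
    assert lam >= 1 and kappa >= 1 and mu >= 2
    Lam = 2 * mu * (lam + kappa)                      # Lambda := 2*mu*(lambda + kappa)

    def separated(copies, piece):
        # `copies` pieces of length `piece`, with one copy of theta between neighbours
        return copies * piece + (copies - 1) * kappa

    x = lam                                           # |x| = lambda
    # |y| = |eta theta' iota theta' eta'|, with |iota| = Lambda - (2*kappa + lambda)
    y = lam + kappa + (Lam - (2 * kappa + lam)) + kappa + lam
    y_prime = lam + Lam                               # y' is as long as y
    # z = (x')^(mu) theta y' theta (x')^(mu-1) theta x, with |x'| = lambda
    z = (separated(mu, lam) + kappa + y_prime + kappa
         + separated(mu - 1, lam) + kappa + x)
    # each trap wraps its infix in two copies of theta
    pi, rho, sigma = (2 * kappa + infix for infix in (x, y, z))
    return {"Lam": Lam, "x": x, "y": y, "z": z,
            "pi": pi, "rho": rho, "sigma": sigma}

sizes = trap_lengths(lam=3, kappa=2, mu=2)            # hypothetical sample values
print(sizes)
```

With these sample values Λ = 20, and the three traps have lengths 7, 27, and 47, so consecutive traps differ by exactly Λ and 0 < |π| ≤ Λ, as the lemma requires.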

Figure 19. (a) in each of 1, 2, 3: a 29-long string, 6-disjoint on itself; see 5. (b) in each of 1, 2, 3: a maze; gates marked with circles. (c) in 3: the composition of the mazes of 1, 2. (d) in 1, 2, 3: examples of τ_2, τ_1, τ, respectively, for a schematic case Λ = 6, |π| = 4, and a schematic ρ. (e) in 4: a schematic of τ^4; in 5: a snippet of the union of a τ^i with a Λ-shifted copy of itself.

2.5-II. A maze of questions. We start with two strings (Figure 19d)

    τ_1 := []^{3Λ} ρ []   and   τ_2 := [33]^{Λ−1} [32] [22]^{Λ−1} [23] [33]^{Λ−1} [32,34] [45] [55]^{Λ−1} [54] [44]^{|π|−1} [23,43],

which are equally long and each is Λ-disjoint on itself (recall the selection of ρ). Moreover, in τ := τ_1 ∪ τ_2 their graphs intersect only at the endpoints of ρ, so that τ is also Λ-disjoint on itself. This implies that τ^i is Λ-disjoint on itself, too, for all i ≥ 1 (Figure 19e).

Let P := {τ^i | i ≥ 1} be the set of all powers of τ. Select τ_l and τ_r as l- and r-dilemmas over P. Fix m := 2^{|Q|} + 1. The live string z := τ_l τ^m τ_r is also a power of τ and in it we think of the m ‘middle’ copies of τ as distinguished. On this string, we consider the natural maze ω = (z, Z) := (z, {u, v}), where u := (3, 0) and v := (3, |z|).

Consider the |Q| computations of the form comp_{M,p,ε}(ω) that we get as we vary p ∈ Q and pick ε := u when p focuses on the left (φ(p) = (·, l)), and ε := v otherwise. Some of them are infinite (i.e., they loop) or finite but non-crossing (i.e., they hang; or they start and end on the same gate). We disregard them and keep only those that are crossing (i.e., they start and end in different gates). Let k be their number. Clearly, k ≤ |Q|.

Fix d to be any of these k computations and fix 1 ≤ i ≤ m. We know d ‘visits’ the ith distinguished copy of τ, and we want to discuss its behavior there. In particular, we want to consider the parity b_{i,d} ∈ {0, 1} of the number of times that d ‘fully crosses’ the copy of ρ in the ith distinguished copy of τ. A careful definition of b_{i,d} follows.

If we ‘rip off’ ρ from the ith distinguished copy of τ and then add the two endpoints u_i, v_i of the path as new gates, we construct a new maze,

    χ_i := ((τ_l τ^{i−1}) τ_2 (τ^{m−i} τ_r), {u, v, u_i, v_i}).

By the ‘complementary’ operation, where we rip off everything except the particular copy of ρ, we can construct the ‘complementary’ maze,

    ψ_i := (([]^{|τ_l τ^{i−1}|}) []^{3Λ} ρ [] ([]^{|τ^{m−i} τ_r|}), {u_i, v_i}).
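The claim that τ_1 and τ_2 are equally long can be checked by counting symbols. The Python sketch below is an illustration under two stated assumptions: the trailing numbers in the definitions are read as exponents (repetition counts), and |ρ| = |π| + Λ as in Lemma 21. The function name `tau_lengths` is hypothetical.

```python
def tau_lengths(Lam, pi_len):
    """Symbol counts of tau_1 and tau_2 for given Lambda and |pi|."""
    rho_len = pi_len + Lam                    # |rho| - |pi| = Lambda (Lemma 21)
    tau1 = 3 * Lam + rho_len + 1              # []^(3*Lambda), then rho, then one []
    tau2_blocks = [                           # (symbol, repetition count)
        ("[33]", Lam - 1), ("[32]", 1),
        ("[22]", Lam - 1), ("[23]", 1),
        ("[33]", Lam - 1), ("[32,34]", 1), ("[45]", 1),
        ("[55]", Lam - 1), ("[54]", 1),
        ("[44]", pi_len - 1), ("[23,43]", 1),
    ]
    tau2 = sum(count for _, count in tau2_blocks)
    return tau1, tau2

print(tau_lengths(Lam=6, pi_len=4))           # the schematic case of Figure 19
```

For the schematic case Λ = 6, |π| = 4 both counts come out as 29, which agrees with the 29-long strings mentioned in the caption of Figure 19; in general both equal 4Λ + |π| + 1.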

Clearly, ω = χi ◦ ψi , and d is a finite computation on this composition. By Lemma 3, we can break d into its finitely many, finite fragments d1 , d2 , . . . , dν . We know every even(-indexed) fragment is a computation on ψi ; we call it crossing if its starting and ending gates differ. The bit bi,d records the parity of the number of such fragments. In other words: bi,d = 0 ⇐⇒ d exhibits an even number of crossing even fragments. Intuitively, as the mole develops a crossing computation on ω, each distinguished copy of τ asks: “odd or even?” The mole answers this question with the parity of the number of times that it fully crosses ρ in that copy. The bits bi,d record exactly these answers. Organizing these m × k bits into m k-long vectors bi := (bi,d )d , for i = 1, . . . , m, we see that there are more vectors than values for them:

    2^k ≤ 2^{|Q|} < 2^{|Q|} + 1 = m.

Hence, b_{i_1} = b_{i_2} for some 1 ≤ i_1 < i_2 ≤ m. This means that, in each crossing finite computation, the answer to the i_1-th question equals the answer to the i_2-th one.

Figure 20. (a) in 3: a schematic of χ′, focusing on the snippets around the leftmost, i_1-th distinguished, i_2-th distinguished, and rightmost pairs of copies of τ. (b) in 4: a schematic of ψ′, for the same snippets. (c) in 5: a schematic of ω′ = χ′ ◦ ψ′, for the same snippets; in 1, 2: a better view of how σ, π connect the two disjoint graphs of x′ when they replace two copies of ρ.

2.5-III. A more complex maze. We now return to ω = (z, {u, v}). We remove ρ from the i_1-th and i_2-th distinguished copies of τ, and name the four natural new gates u_l, v_l (endpoints of ρ in the i_1-th copy) and u_r, v_r (endpoints of ρ in the i_2-th copy) to get the new maze

    χ = (x, X) := ((τ_l τ^{i_1−1}) τ_2 (τ^{i_2−i_1−1}) τ_2 (τ^{m−i_2} τ_r), {u, v, u_l, v_l, u_r, v_r}).

As before, the ‘complementary’ maze (rip everything except the two ρ’s) is

    ψ = (y, Y) := ((···) []^{3Λ} ρ [] (···) []^{3Λ} ρ [] (···), {u_l, v_l, u_r, v_r}),

where ellipses stand for appropriately many []s. Obviously, ω = χ ◦ ψ. In this section, we will construct a maze ω′ = χ′ ◦ ψ′, where the mazes χ′ and ψ′ are complex versions of χ and ψ.

We start by noting that x is Λ-disjoint on itself (because z is). So, in the union x′ := x ∪ ([]^Λ x) of x with a Λ-shifted copy of itself, the two graphs do not intersect. (Figure 20a.) So, letting χ′ := (x′, X′), where X′ := X ∪ {u′, v′, u′_l, v′_l, u′_r, v′_r} contains all gates of χ plus their counterparts in the shifted copy, we know every computation on χ′ visits and depends on exactly one of the two disjoint graphs.

Similarly, y is Λ-disjoint on itself (because ρ is), the union y ∪ ([]^Λ y) contains two pairs of disjoint copies of ρ, and Y′ := Y ∪ {u′_l, v′_l, u′_r, v′_r} contains their endpoints. Viewing each pair of copies of ρ as a copy of the string ρ ∪ ([]^Λ ρ), we can replace it with a copy of the string ρ′ := σ ∪ ([]^Λ π). If y′ is the new string, we set ψ′ := (y′, Y′). (Figure 20b.) Crucially, this substitution preserves (i) the lengths of strings: |y′| = |y ∪ ([]^Λ y)|, because |ρ′| = |σ ∪ ([]^Λ π)| = |σ| = 2κ + λ + 2Λ = |ρ| + Λ = |ρ ∪ ([]^Λ ρ)|; (ii) the number and disjointness of paths: since π is Λ-disjoint on σ, we know ρ′ also contains two disjoint paths; and (iii) the set of endpoints of paths: for example, on the copy of ρ′ on the left, σ and π have endpoints u_l, v′_l and u′_l, v_l. Note that every computation on ψ′ visits and depends on exactly one of the paths.

Clearly, the graphs of x′ and y′ intersect only at the gates in Y′. So, χ′ and ψ′ are composable, into

    ω′ = (z′, Z′) := χ′ ◦ ψ′ = (x′ ∪ y′, {u, v, u′, v′}).

(Figure 20c.) Note that u and u′ are on the far left; v and v′ are on the far right; and the four paths of y′ connect the two graphs of x′: the mole can switch graphs only if it fully crosses one of the paths.

2.5-IV. The hidden gate. Consider the dead input z′[] and the computation c′ := lcomp_{M,s}(⊢z′[]⊣) on it.
From now on, our goal is to prove that c′ never visits []. Equivalently, we want to show that M never visits the 0-degree side of the rightmost node v ′ of z ′ . Intuitively, this is the same as saying that the maze implied by z ′ hides v ′ from the mole. Note that


this will immediately imply the failure of M: on the live input z′[33] the mole will compute exactly as on the dead input z′[], as it will never visit the 0-degree side of v′ to note the difference.

We start by remarking that, since the first symbol of z′ is [33], any attempt of the mole to depart from ⊢ into a state of focus other than (3, l) is followed by a step back to ⊢. Ignoring these attempts and also noting that the mole can never move past [], we see that c′ consists essentially of zero or more computations of the form comp_{M,p,1}(z′[]) with φ(p) = (3, l). For our purposes, it is enough to study the case where c′ consists of exactly one such computation.

So, suppose c′ := comp_{M,p,1}(z′[]), where φ(p) = (3, l). As a mole, every time M visits the 0-degree side of the nodes u′, v, v′, it changes direction to ‘return into the graph’ of z′. Call every such move a turn and break c′ into segments c′_1, c′_2, ... so that successive segments are joined at a turn: the later segment starts at the state and position following the last state and position of the earlier segment. Clearly: each segment is a computation on ω′; the first segment is c′_1 = comp_{M,p,u}(ω′), but later segments start at a gate in {u′, v, v′}; and either all segments are finite, in which case their list is finite iff c′ is, or not, in which case the list is finite and only the last segment is infinite.

To prove that c′ never visits [], it is enough to show that no segment ends in v′. This, in turn, is a corollary of the following:
• the first segment starts at gate u,
• every finite segment that starts at gate u and does not hang necessarily ends either at gate u or at gate v, and
• every finite segment that starts at gate v and does not hang necessarily ends either at gate u or at gate v.
We only prove the second statement, in the next section. The third statement can be proved similarly, whereas the first statement is already known.

2.5-V. The final argument.
Let d′ be a non-hanging finite segment of c′ that starts at u. As a finite computation on ω′ = χ′ ◦ ψ′, it can be broken into finitely many, finite fragments d′_1, d′_2, ..., d′_ν; odd(-indexed) fragments compute on χ′ and even(-indexed) fragments compute on ψ′ (Lemma 3). By previous remarks, every odd fragment visits and depends on exactly one of the two graphs (non-shifted and shifted) inside x′; and every even fragment visits and depends on exactly one of the four paths in y′. Calling an even fragment crossing if its start and final gates differ, we clearly see that two successive odd fragments visit different graphs in x′ iff the even fragment between them is crossing. Generalizing, and since d′ starts on u, each odd fragment visits the shifted graph in x′ iff the number of crossing even fragments that precede it is odd.

Towards a contradiction, assume d′ does not end in u or v. Then it ends in either u′ or v′. Hence, d′_ν is an odd fragment that visits the shifted graph in x′. This immediately implies that the total number of crossing even fragments (before d′_ν, and so throughout d′) is odd. In particular, even fragments exist and d′_1 necessarily ends at a gate in Y.

To reach a contradiction, we will prove that, by replacing every fragment d′_i of d′ with an appropriate computation d_i on the original maze ω, we can create a computation d on ω that cannot possibly exist. Before we start, let h : X′ → X be the function that maps every gate in X′ to its ‘unprimed’ version in X: for example, h(u_l) = h(u′_l) = u_l. Based on this mapping, we can find the appropriate d_i as follows:
• If d′_i is an odd fragment (a computation on exactly one of the two graphs in χ′) from state q and gate ε to state r and gate ζ, we let d_i be the computation on (the one graph of) χ from q and h(ε). Clearly, d_i ends at r and h(ζ). In particular, d_1 starts at h(u) = u and ends at a gate in h(Y) = Y.
• If d′_i is an even fragment (a computation on exactly one of the four paths in ψ′) from state q and gate ε to state r and gate ζ, we let d_i be the computation on (one of the two copies of ρ in) ψ from q and h(ε). Since ρ is indistinguishable from each of π and σ, we know d_i ends at r and h(ζ). Note here the critical use of the inability of the mole to detect the big difference in the lengths of π, ρ, and σ.
Reviewing the list d_1, d_2, ..., d_ν, we see that: d_1 starts at h(u) = u; for every 1 ≤ i < ν, fragment d_i ends at the state and gate where d_{i+1} starts; fragment d_ν ends on h(u′) = u or h(v′) = v; and every even fragment d_i is crossing (on the path of ψ that it visits) iff d′_i is crossing (on the path of ψ′ that it visits). Therefore, by concatenating all these new fragments, we can build a computation d on χ ◦ ψ = ω that starts at u, ends at u or v, and contains an odd number of crossing even fragments. But is this possible?
If d ends at u, then it never moves beyond τl (if it did, it would traverse the l-dilemma and get ‘blocked’ away from u). In particular, d1 never reaches a gate in Y . But (by a previous remark) this is where it is supposed to end. Contradiction. If d ends in v, then it is a crossing computation on ω. As ω equals each of the compositions χ ◦ ψ, χi1 ◦ ψi1 , and χi2 ◦ ψi2 , we know d can be fragmented in three different ways. Clearly, every even fragment with respect to either χi1 ◦ ψi1 or χi2 ◦ ψi2 is also an even fragment with respect to χ ◦ ψ, and vice versa; and is crossing or not (on the copy of ρ that it visits) irrespective of which composition we look at it through. So, letting ξ, ξ1 , ξ2 be the numbers of crossing even fragments with respect to the three compositions, we know ξ = ξ1 + ξ2 and (as established above) ξ is odd. Yet, by the selection of i1 and i2 , the parities of ξ1 , ξ2 are respectively bi1 ,d , bi2 ,d and hence equal (as bi1 = bi2 ), so that ξ should be even. Contradiction. So, in both cases we reach a contradiction, as desired.
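The counting step of Section 2.5-II, which supplies the indices i_1 < i_2 used in the contradiction above, is a plain pigeonhole argument: among 2^k + 1 bit vectors of length k, two must coincide. A minimal Python sketch follows; the function name `first_repeat` and the toy value k = 2 are assumptions for illustration, and the exhaustive check covers only that toy size.

```python
from itertools import product

def first_repeat(vectors):
    """Return 1-based indices (i1, i2), i1 < i2, of the first repeated vector, or None."""
    seen = {}
    for index, vector in enumerate(vectors, start=1):
        key = tuple(vector)
        if key in seen:
            return seen[key], index
        seen[key] = index
    return None

k = 2                                         # toy vector length
m = 2 ** k + 1                                # one more vector than possible values
bit_vectors = list(product((0, 1), repeat=k))

# Exhaustively confirm: every list of m k-bit vectors contains a repeat.
always_repeats = all(
    first_repeat(choice) is not None
    for choice in product(bit_vectors, repeat=m)
)
print(always_repeats)
print(first_repeat([(0, 1), (1, 1), (0, 0), (1, 1), (0, 1)]))
```

In the proof the vectors are the b_i, one per distinguished copy of τ, and the returned pair plays the role of (i_1, i_2).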


3. Restricted Bidirectionality: Sweeping Automata

In this section we explore the approach that we described in Section 3.3 of the Introduction. After we formally define what it means for a 2nfa to be sweeping, we will prove that every sweeping 2nfa solving B_n needs 2^{Ω(n)} states (here, B_n is the complement of liveness).

3.1. Preliminaries. Our basic notation and definitions are as presented in Section 2 of Chapter 1. Some extra notions and facts, of special interest to this section, are presented below.

3.1-I. Sets, functions, and relations. As usual, for any set U, we write Ū, |U|, P(U), and U² for the complement, the size, the powerset, and the set of pairs of U. The next simple lemma plays a central role in our proof.

22. Lemma. Let I be a set of indices totally ordered by […] |u| + |v|, and thus (u′, v′) > (u, v). Otherwise, u′ = u and v′ = v, and thus (u′, v′) = (u, v). In both cases, (u′, v′) ≮ (u, v), as needed.

3.1-II. Strings over Σ_n. Recall the alphabet Σ_n over which liveness is defined. In this section, it will be convenient to have a concise way of describing how the edges of a string over Σ_n connect the vertices of its outer columns. So, given any z ∈ Σ_n*, we say that z has connectivity ξ ⊆ [n]² if the following holds: (a, b) ∈ ξ iff z contains a path from the a-th node of its leftmost column to the b-th node of its rightmost column. For example, the connectivity of the string on page 21 is {(3, 1), (3, 4)}; the connectivity of the empty string ε is the identity relation {(a, a) | a ∈ [n]}; and the connectivity of any single symbol is the symbol itself. The set of all strings


of connectivity ξ is written as B_{n,ξ}. In this terminology, the dead strings are exactly those with connectivity ∅; in other words, B_n = B_{n,∅}. Every other connectivity implies a string which is live.

3.2. Sweeping automata. One way to define sweeping 2nfas is to start with our standard definition for 2nfas (cf. Chapter 1, Section 2) and simply impose the restriction that the transition function is such that the direction of the input head never changes strictly inside the input, for all inputs and all branches of the corresponding nondeterministic computations. Note that, with a definition of this kind, it becomes meaningful to ask whether a given 2nfa is sweeping or not.

Our approach will be different. We will give an entirely new definition, with the restriction about the direction of motion built-in. The best way to explain what this means is to give the definition right away. So, here it is. As usual, we start with the deterministic version and leave the straightforward generalization to nondeterminism for later.

3.2-I. The deterministic case. By a sweeping deterministic finite automaton (sdfa) over the states of a set Q and the symbols of an alphabet Σ we mean any triple M = (s, δ, f), where δ is the transition function, partially mapping Q × Σ_e to Q, and s, f ∈ Q are the start and the final states. An input w ∈ Σ* is presented to M surrounded by the end-markers, as ⊢w⊣. The computation starts at s and on the symbol to the right of ⊢, heading rightward. The next state is always derived from δ and the current state and symbol. The next position is always the adjacent one in the direction of motion, except when the current symbol is ⊢ or when the current symbol is ⊣ and the next state is not f, in which two cases the next position is the adjacent one in the opposite direction. Note that the computation can either loop, or hang, or move past ⊣ into f. In this last case we say that M accepts w.
We stress that the values of the transition function do not contain any direction information. In contrast, this information is derived implicitly from the assumption that the automaton is sweeping. This greatly simplifies the setting and helps us stay closer to the combinatorial essence of the sweeping automata, avoiding the distraction caused by irrelevant inherited features. Moreover, this definitional shift does not invalidate our conclusions in this section. Specifically, it is not hard to verify that the size of a sweeping 2nfa under the new definition is linearly related to the size of a smallest equivalent sweeping 2nfa under the standard definition, and vice versa. Hence, any exponential lower bound under either definition implies an exponential lower bound under the other one—of course, if we cared about the exact trade-off, the choice of definition would matter.
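The definition of an sdfa translates almost line by line into a simulator. The Python sketch below is one possible reading of it, not the thesis's own formalization: the end-markers are written as the ASCII strings "|-" and "-|", δ is a dictionary (missing keys model the partiality of δ), and looping is detected by remembering visited configurations. The example automaton and all of its state names are hypothetical.

```python
LEFT_END, RIGHT_END = "|-", "-|"              # stand-ins for the two end-markers

def sdfa_accepts(delta, start, final, word):
    """Simulate a sweeping DFA; return True iff it moves past the right end-marker into `final`."""
    tape = [LEFT_END, *word, RIGHT_END]
    state, pos, direction = start, 1, +1      # begin right of the left end-marker, heading right
    seen = set()
    while True:
        config = (state, pos, direction)
        if config in seen:
            return False                      # repeated configuration: the computation loops
        seen.add(config)
        symbol = tape[pos]
        nxt = delta.get((state, symbol))
        if nxt is None:
            return False                      # delta undefined: the computation hangs
        if symbol == RIGHT_END:
            if nxt == final:
                return True                   # moves past the right end-marker into f
            direction = -1                    # otherwise turn around
        elif symbol == LEFT_END:
            direction = +1                    # always turn around on the left end-marker
        state = nxt
        pos += direction

# Hypothetical example: accept words over {a, b} with an even number of a's
# (checked on the first rightward sweep) and an even number of b's (checked
# on the leftward sweep); a last rightward sweep just walks to the end.
delta = {
    ("e", "a"): "o", ("o", "a"): "e", ("e", "b"): "e", ("o", "b"): "o",
    ("e", RIGHT_END): "L0",
    ("L0", "a"): "L0", ("L1", "a"): "L1", ("L0", "b"): "L1", ("L1", "b"): "L0",
    ("L0", LEFT_END): "R",
    ("R", "a"): "R", ("R", "b"): "R", ("R", RIGHT_END): "f",
}
print(sdfa_accepts(delta, "e", "f", "aabb"), sdfa_accepts(delta, "e", "f", "aab"))
```

Note that δ carries no direction information: the head direction changes only at the two branches that test for an end-marker, which is exactly the point stressed in the text.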


The simplified definition allows for a simplified notion of computation, as well. In particular, for any z ∈ Σ* and p ∈ Q, the left computation of M from p on z is the unique sequence lcomp_{M,p}(z) = (q_t)_{1≤t≤m} where q_1 = p; every next state is q_{t+1} = δ(q_t, z_t), provided that t ≤ |z| and the value of δ is defined; and m is the first t for which this last provision fails. If m = |z| + 1, then the computation exits into q_m; otherwise, 1 ≤ m ≤ |z| and the computation hangs at q_m. The right computation of M from p on z is defined symmetrically, as the sequence rcomp_{M,p}(z) = (q_t)_{1≤t≤m} with q_{t+1} = δ(q_t, z_{|z|+1−t}).

3.2-II. The nondeterministic case. If M is allowed more than one next move at each step, we say that it is nondeterministic (snfa). Formally, this means that δ totally maps Q × Σ_e to the powerset of Q and implies that, on input any w ∈ Σ*, M exhibits a set of computations on ⊢w⊣. If at least one of them moves past ⊣ into f, then M accepts w. Similarly, lcomp_{M,p}(z) and rcomp_{M,p}(z) are now sets of computations.

We also introduce a notion to describe how the states of M connect via left and right computations on some string. For left computations on some z ∈ Σ*, we encode these connections into a binary relation lview_M(z) ⊆ Q², which we refer to as the left behavior of M on z, defined as:

    (p, q) ∈ lview_M(z) ⇐⇒ ∃c ∈ lcomp_{M,p}(z) (c exits into q).

Then, for any u ⊆ Q, the set lview_M(z)(u) of states reachable from within u via left computations on z is the left view of u on z. The right behavior rview_M(z) of M on z and the right view rview_M(z)(u) of u on z are defined symmetrically.

We note that, on strings of length 1, the automaton has the same behavior in both directions:

25. Lemma. If |z| = 1, then lview_M(z) and rview_M(z) coincide: lview_M(z) = rview_M(z) = {(p, q) | δ(p, z) ∋ q}.

We also note that, if extending a string z does not cause a view to include new states, then this remains true on all identical further extensions:

26. Lemma. The following implications are true, for all t ≥ 1:
a. lview_M(z)(u) ⊇ lview_M(z z̃)(u) ⇒ lview_M(z)(u) ⊇ lview_M(z z̃^t)(u),
b. rview_M(z)(u) ⊇ rview_M(z̃ z)(u) ⇒ rview_M(z)(u) ⊇ rview_M(z̃^t z)(u).

Proof. For part (a), suppose lview_M(z)(u) contains lview_M(z z̃)(u). To show that it also contains lview_M(z z̃^t)(u), we use induction on t. The case t = 1 is the assumption itself. For t ≥ 1, we calculate:

    lview_M(z z̃^{t+1})(u) = lview_M(z̃)(lview_M(z z̃^t)(u))
                           ⊆ lview_M(z̃)(lview_M(z)(u))
                           = lview_M(z z̃)(u)
                           ⊆ lview_M(z)(u).


The 1st step holds because $\mathrm{lview}_M(z\tilde z^{t+1}) = \mathrm{lview}_M(z\tilde z^{t}) \circ \mathrm{lview}_M(\tilde z)$. In the 2nd step, we use the inductive hypothesis and the monotonicity of $\mathrm{lview}_M(\tilde z)(\cdot)$. The 3rd step holds because $\mathrm{lview}_M(z\tilde z) = \mathrm{lview}_M(z) \circ \mathrm{lview}_M(\tilde z)$. Finally, the last step uses the original assumption. For the implication of part (b), a symmetric argument applies.

3.3. Proof outline. We are now ready to present an outline of our proof. As explained in the introduction, to show that small snfas are not closed under complement (sn ≠ cosn), it is enough to prove the following.

27. Theorem. Every snfa that recognizes $B_{n,\emptyset}$ has $2^{\Omega(n)}$ states.

The rest of Section 3 is a proof of this fact. We fix $n$ and an snfa $M = (s, \delta, f)$ over a set of $k$ states $Q$ that solves $B_{n,\emptyset}$, and we show that $k = 2^{\Omega(n)}$. The proof is based on Lemma 22. We build two sequences $(X_\iota)_{\iota \in I}$ and $(Y_\iota)_{\iota \in I}$ that are related as in the lemma. The indices are all pairs of non-empty subsets of $[n]$, the universe is all sets of 1 or 2 steps of $M$,⁴

$$I := \{(\alpha, \beta) \mid \emptyset \ne \alpha, \beta \subseteq [n]\}, \qquad \mathcal{E} := \{\{e', e\} \mid e', e \in Q^2\},$$

and the total order $<$ is the restriction on $I$ of some nice order on $\mathcal{P}([n])^2$. If we indeed construct these sequences, then the lemma says $|I| \le |\mathcal{E}|$, or

$$(2^n - 1)^2 \le k^2 + \binom{k^2}{2},$$

which implies $k = 2^{\Omega(n)}$: the right-hand side equals $k^2(k^2+1)/2 \le k^4$, so $k \ge (2^n - 1)^{1/2}$. For the remainder, we fix $I$ and $\mathcal{E}$ as here. Note that, from now on, some subscripts in our notation are redundant, and will be dropped: e.g., $B_{n,\emptyset}$ and $\mathrm{lview}_M(z)(u)$ will be referred to simply as $B_\emptyset$ and $\mathrm{lview}(z)(u)$.

⁴ A step of $M$ is any $e \in Q^2$. Also, note that $\{e', e\}$ is a singleton when $e' = e$.

Before moving on, let us also quickly prove a fact that will be useful later: In order to accept a dead string but reject a live one, $M$ must produce on the dead string a single-state view that “escapes” the corresponding view on the live string.

28. Lemma. Suppose $z'$ is live and $z$ is dead. Then at least one of the following two claims is true:
• $\mathrm{lview}(z')(p) \not\supseteq \mathrm{lview}(z)(p)$ for some $p \in Q$.
• $\mathrm{rview}(z')(p) \not\supseteq \mathrm{rview}(z)(p)$ for some $p \in Q$.

Proof. Towards a contradiction, suppose that neither claim is true, namely $\mathrm{lview}(z')(p) \supseteq \mathrm{lview}(z)(p)$ and $\mathrm{rview}(z')(p) \supseteq \mathrm{rview}(z)(p)$, for every state $p$. Pick any accepting computation $c$ of $M$ on $z$ and break it into its traversals $c_1, \ldots, c_m$, in the natural way: for all $j < m$,
• $c_j$ starts at some $p_j$ next to $\vdash$ and ends at some $q_j$ on $\dashv$, if $j$ is odd;
• $c_j$ starts at some $p_j$ next to $\dashv$ and ends at some $q_j$ on $\vdash$, if $j$ is even;


$p_1 = s$ and $p_{j+1}$ is in $\delta(q_j, \dashv)$ or $\delta(q_j, \vdash)$, depending on whether $j$ is odd or even, respectively; $m$ is even and the last fragment is not really a traversal, but simply $c_m = (f)$. Then, for each $j < m$, we know that
• $q_j$ is in $\mathrm{lview}(z)(p_j)$ and thus also in $\mathrm{lview}(z')(p_j)$, if $j$ is odd,
• $q_j$ is in $\mathrm{rview}(z)(p_j)$ and thus also in $\mathrm{rview}(z')(p_j)$, if $j$ is even.
Hence, in both cases, some computation $c'_j$ of $M$ on $z'$ starts and ends identically to $c_j$. If we also set $c'_m := (f)$ and concatenate the computations $c'_1, \ldots, c'_m$, we end up with a computation $c'$ of $M$ on $z'$ which is also accepting. So, $M$ accepts the live string $z'$, a contradiction.

3.4. Hard inputs and the two sequences. In this section, we will construct a set of inputs that collectively force $M$ to use exponentially many states. Similarly to what we did for moles, here we will again need to start with strings that are long and rich enough to strain the ability of $M$ to process their information, and then use those strings as building blocks for constructing the hard inputs. Once again, we call these strings generic, and we base their construction on the same general idea of [55].

3.4-I. Generic strings. Consider any $y \in \Sigma^*$ and the set of views produced via left computations on it (i.e., the range of $\mathrm{lview}(y)(\cdot)$):

$$\mathrm{lviews}(y) := \{\mathrm{lview}(y)(u) \mid u \subseteq Q\}.$$

How does this set change when we extend $y$ into a longer string $yz$? To answer this question, it is useful to consider the function $\mathrm{lmap}(y, z)$ that for every left view produced on $y$ returns its left view on $z$; namely, $\mathrm{lmap}(y, z)$ is the restriction of $\mathrm{lview}(z)(\cdot)$ to $\mathrm{lviews}(y)$.

It is easy to verify that all values of $\mathrm{lmap}(y, z)$ are inside $\mathrm{lviews}(yz)$. Indeed, consider any $u$ in the range of $\mathrm{lmap}(y, z)$. Then some $u'$ in the domain of $\mathrm{lmap}(y, z)$ is such that $\mathrm{lmap}(y, z)(u') = u$. Since this domain is $\mathrm{lviews}(y)$, some $u'' \subseteq Q$ is such that $\mathrm{lview}(y)(u'') = u'$. Then,

$$u = \mathrm{lmap}(y, z)(u') = \mathrm{lmap}(y, z)\big(\mathrm{lview}(y)(u'')\big) = \mathrm{lview}(z)\big(\mathrm{lview}(y)(u'')\big) = \mathrm{lview}(yz)(u''),$$

so that $u$ is indeed in $\mathrm{lviews}(yz)$. (Note that in the last equality we used the fact that $\mathrm{lview}(yz) = \mathrm{lview}(y) \circ \mathrm{lview}(z)$.)

Moreover, the values of $\mathrm{lmap}(y, z)$ cover $\mathrm{lviews}(yz)$. Indeed, consider any $u \in \mathrm{lviews}(yz)$. Then there exists $u'' \subseteq Q$ such that $\mathrm{lview}(yz)(u'') = u$. Letting $u' := \mathrm{lview}(y)(u'')$, we see that $u' \in \mathrm{lviews}(y)$. Therefore, $u'$ is in the domain of $\mathrm{lmap}(y, z)$. Moreover,

$$\mathrm{lmap}(y, z)(u') = \mathrm{lview}(z)(u') = \mathrm{lview}(z)\big(\mathrm{lview}(y)(u'')\big) = \mathrm{lview}(yz)(u'') = u,$$

so that $u$ is indeed in the range of $\mathrm{lmap}(y, z)$. (Again, in the last equality we used the fact that $\mathrm{lview}(yz) = \mathrm{lview}(y) \circ \mathrm{lview}(z)$.)
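The two containments just verified can be checked mechanically on a small example. The following is a minimal sketch, assuming a hypothetical 3-state automaton whose steps on each symbol are given directly as a relation; the names `compose`, `lview`, `view`, `views`, `STATES` and `DELTA` are illustrative and do not come from the text.

```python
from itertools import combinations

def compose(r, s):
    """Relational composition: follow r first, then s."""
    return frozenset((p, q) for (p, m) in r for (m2, q) in s if m == m2)

def lview(delta, z):
    """Left behavior on a non-empty string z: all pairs (p, q) such that
    some left computation from p on z exits into q."""
    rel = frozenset(delta[z[0]])
    for symbol in z[1:]:
        rel = compose(rel, delta[symbol])
    return rel

def view(rel, u):
    """States reachable from within the set u via the behavior rel."""
    return frozenset(q for (p, q) in rel if p in u)

def views(rel, states):
    """All views produced by rel, i.e. the range of u -> view(rel, u)."""
    return {view(rel, frozenset(c))
            for r in range(len(states) + 1)
            for c in combinations(states, r)}

# Hypothetical automaton (an assumption): DELTA[x] lists the steps on symbol x.
STATES = (0, 1, 2)
DELTA = {"a": {(0, 1), (1, 2), (2, 2)}, "b": {(0, 0), (1, 0), (2, 1)}}

y, z = "ab", "a"
lviews_y = views(lview(DELTA, y), STATES)
lviews_yz = views(lview(DELTA, y + z), STATES)
# Range of lmap(y, z): apply lview(z)(.) to every view in lviews(y).
lmap_range = {view(lview(DELTA, z), u) for u in lviews_y}
print(lmap_range == lviews_yz, len(lviews_y) >= len(lviews_yz))
```

Representing a behavior as a set of state pairs keeps the identity $\mathrm{lview}(yz) = \mathrm{lview}(y) \circ \mathrm{lview}(z)$ a one-line composition, which is all the argument above relies on.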


Overall, $\mathrm{lmap}(y, z)$ is a surjection from $\mathrm{lviews}(y)$ to $\mathrm{lviews}(yz)$, which immediately implies that $|\mathrm{lviews}(y)| \ge |\mathrm{lviews}(yz)|$. The next fact encodes this conclusion, along with the obvious remark that $\mathrm{lmap}(y, z)$ is monotone. It also shows the symmetric facts, for left extensions and right views. The set $\mathrm{rviews}(y)$ consists of all views produced on $y$ via right computations, and $\mathrm{rmap}(z, y)$ is the restriction of $\mathrm{rview}(z)(\cdot)$ on $\mathrm{rviews}(y)$.

29. Lemma. For any two strings $y$ and $z$: $\mathrm{lmap}(y, z)$ is a monotone surjection of $\mathrm{lviews}(y)$ onto $\mathrm{lviews}(yz)$, so $|\mathrm{lviews}(y)| \ge |\mathrm{lviews}(yz)|$; similarly, in the other direction, $\mathrm{rmap}(z, y)$ is a monotone surjection of $\mathrm{rviews}(y)$ onto $\mathrm{rviews}(zy)$, so $|\mathrm{rviews}(y)| \ge |\mathrm{rviews}(zy)|$.

Now suppose $y$ belongs to an infinitely right-extensible property $P \subseteq \Sigma^*$. What happens to the size of $\mathrm{lviews}(y)$ if we keep extending $y$ into $yz, yzz', \ldots$ inside $P$? Although there are infinitely many extensions, the size of the set can decrease only finitely many times. So, at some point it must stop changing. When this happens, we have arrived at a very useful tool. We define it as follows (compare with Definition 9 and Lemma 10).

30. Definition. Let $P \subseteq \Sigma^*$. A string $y$ is l-generic over $P$ if $y \in P$ and: for all extensions $yz \in P$, $|\mathrm{lviews}(y)| = |\mathrm{lviews}(yz)|$. An r-generic string over $P$ is defined similarly, on left-extensions and $\mathrm{rviews}(\cdot)$. A string that is simultaneously l-generic and r-generic over $P$ is called generic.

31. Lemma. Let $P \subseteq \Sigma^*$. If $P$ is non-empty and infinitely extensible to the right (resp., left), then l-generic strings over $P$ (resp., r-generic strings over $P$) exist. In addition, if $y_l$ is l-generic and $y_r$ is r-generic, then every string $y_l x y_r \in P$ is generic.

Intuitively, from the perspective of $M$, a generic string is among the richest inputs that have property $P$, in the sense that it exhibits a greatest subset of the “features” that $M$ is “prepared to pay attention to”. This makes generic strings useful in building hard inputs, as described in Lemma 33 below and in Section 3.4-II.

32. Lemma. For any two strings $y$ and $z$, $\mathrm{lviews}(yz) \subseteq \mathrm{lviews}(z)$. Similarly, in the other direction, $\mathrm{rviews}(zy) \subseteq \mathrm{rviews}(z)$.

Proof. By Lemma 29, $\mathrm{lviews}(yz)$ is the range of $\mathrm{lmap}(y, z)$, which is a restriction of $\mathrm{lview}(z)(\cdot)$. So, the first containment follows. The argument in the other direction is similar.

33. Lemma. Let $y$ be generic over $P \subseteq \Sigma^*$. If $yxy \in P$, then
• $\mathrm{lmap}(y, xy)$ is an automorphism on $\mathrm{lviews}(y)$, and
• $\mathrm{rmap}(yx, y)$ is an automorphism on $\mathrm{rviews}(y)$.


Proof. Suppose $yxy \in P$. Then $|\mathrm{lviews}(y)| = |\mathrm{lviews}(yxy)|$ (since $y$ is generic) and $\mathrm{lviews}(yxy) \subseteq \mathrm{lviews}(y)$ (by Lemma 32). Therefore, we know that $\mathrm{lviews}(y) = \mathrm{lviews}(yxy)$. By this and Lemma 29, we conclude $\mathrm{lmap}(y, xy)$ surjects $\mathrm{lviews}(y)$ onto itself, which is possible only if it is injective. Since $\mathrm{lmap}(y, xy)$ is also monotone, Lemma 23 implies it is an automorphism. A similar argument applies for $\mathrm{rmap}(yx, y)$.

3.4-II. Constructing the hard inputs. Fix $\iota = (\alpha, \beta) \in I$ and let $P_\iota := B_{\alpha \times \beta}$ be the property of connecting exactly every leftmost node in $\alpha$ to every rightmost node in $\beta$. Easily, $P_\iota$ is non-empty and infinitely extensible in both directions. Therefore, an l-generic string $y_l$ and an r-generic string $y_r$ exist (Lemma 31). Then, for $\eta := [n]^2$ the complete symbol, we easily see that $y_l \eta y_r \in P_\iota$, too. Hence, this string is generic over $P_\iota$ (Lemma 31). We define $y_\iota := y_l \eta y_r$. We also define the symbol $x_\iota := \overline{\beta \times \alpha}$, where the overline denotes the complement in $[n]^2$.

34. Lemma. For the sequences $(y_\iota)_{\iota \in I}$, $(x_\iota)_{\iota \in I}$ and for all $\iota', \iota \in I$:

$$\iota' < \iota \implies y_\iota x_{\iota'} y_\iota \in P_\iota \qquad\text{and}\qquad \iota' = \iota \implies y_\iota x_{\iota'} y_\iota \in B_\emptyset.$$

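Proof. Fix $\iota' = (\alpha', \beta')$ and $\iota = (\alpha, \beta)$ and let $z := y_\iota x_{\iota'} y_\iota$. Note that the connectivities of $y_\iota$ and $x_{\iota'}$ are respectively $\xi := \alpha \times \beta$ and $\xi' := \overline{\beta' \times \alpha'}$ (the overline denotes the complement in $[n]^2$).

[Figure: two sketches of $z = y_\iota x_{\iota'} y_\iota$, with the node sets $\alpha$, $\beta$, $\alpha'$, $\beta'$ and the nodes $a^*$, $b^*$ marked. Left: the case $\iota' < \iota$. Right: the case $\iota' = \iota$.]

If $\iota' < \iota$ (on the left), then $\alpha' \not\supseteq \alpha$ or $\beta' \not\supseteq \beta$ (since $<$ is nice). Suppose $\beta' \not\supseteq \beta$ (if $\alpha' \not\supseteq \alpha$, use a similar argument) and fix any $b^* \in \beta \setminus \beta'$ and any $a^* \in \alpha$. For any $a, b \in [n]$, consider the $a$-th leftmost and $b$-th rightmost nodes of $z$. If $a \notin \alpha$ or $b \notin \beta$, then the two nodes do not connect in $z$, since neither can “see through” $y_\iota$. If $a \in \alpha$ and $b \in \beta$, then $(a, b^*) \in \xi$ and $(b^*, a^*) \in \xi'$ and $(a^*, b) \in \xi$, so the two nodes connect via a path of the form $a \rightsquigarrow b^* \to a^* \rightsquigarrow b$. Overall, $z \in P_\iota$.

If $\iota' = \iota$ (on the right), then $\xi' = \overline{\beta \times \alpha}$. Suppose $z \notin B_\emptyset$. Then some path in $z$ connects the leftmost to the rightmost column. Suppose it is of the form $a \rightsquigarrow b^* \to a^* \rightsquigarrow b$. Then $b^* \in \beta$ and $(b^*, a^*) \in \xi'$ and $a^* \in \alpha$. That is, $(b^*, a^*)$ is both inside and outside $\beta \times \alpha$, a contradiction.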
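The connectivity argument of this proof can be replayed exhaustively for small $n$. The sketch below is only an illustration under a stated assumption: since the real $y_\iota$ depends on $M$, it is replaced by a stand-in whose connectivity is exactly $\alpha \times \beta$, and $x_{\iota'}$ is taken to be the complement of $\beta' \times \alpha'$ in $[n]^2$. The function names are hypothetical.

```python
from itertools import combinations, product

def nonempty_subsets(n):
    """All non-empty subsets of [n] = {1, ..., n}."""
    for r in range(1, n + 1):
        for c in combinations(range(1, n + 1), r):
            yield frozenset(c)

def compose(r, s):
    """Connectivity of a concatenation: follow r first, then s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def check_lemma_34(n):
    """Check both claims using connectivities only (y replaced by a stand-in)."""
    full = set(product(range(1, n + 1), repeat=2))
    index = [(a, b) for a in nonempty_subsets(n) for b in nonempty_subsets(n)]
    for (a1, b1) in index:                    # iota' = (alpha', beta')
        x = full - set(product(b1, a1))       # complement of beta' x alpha'
        for (a, b) in index:                  # iota = (alpha, beta)
            xi = set(product(a, b))           # connectivity alpha x beta
            conn = compose(compose(xi, x), xi)
            if (a1, b1) == (a, b):
                assert conn == set()          # the string is dead
            elif not (a1 >= a) or not (b1 >= b):
                assert conn == xi             # the string stays in P_iota
    return len(index)

print(check_lemma_34(3))
```

The second branch tests the only consequence of $\iota' < \iota$ that the proof uses, namely $\alpha' \not\supseteq \alpha$ or $\beta' \not\supseteq \beta$, so no particular nice order has to be fixed.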

3.4-III. Constructing the two sequences. Suppose $\iota' < \iota$. Since the extension $y_\iota x_{\iota'} y_\iota$ of $y_\iota$ preserves $P_\iota$ (Lemma 34), each of $\mathrm{lmap}(y_\iota, x_{\iota'} y_\iota)$ and $\mathrm{rmap}(y_\iota x_{\iota'}, y_\iota)$ is an automorphism (Lemma 33). Put another way, the interaction between the steps of $M$ on $x_{\iota'}$ and its two behaviors on $y_\iota$ is such that these two mappings are automorphisms. Put formally, both
• the restriction of $\big(E_{\iota'} \circ \mathrm{lview}(y_\iota)\big)(\cdot)$ on $\mathrm{lviews}(y_\iota)$ and
• the restriction of $\big(E_{\iota'} \circ \mathrm{rview}(y_\iota)\big)(\cdot)$ on $\mathrm{rviews}(y_\iota)$
are automorphisms, where $E_{\iota'} := \{(p, q) \mid \delta(p, x_{\iota'}) \ni q\} = \mathrm{lview}(x_{\iota'}) = \mathrm{rview}(x_{\iota'})$ (cf. Lemma 25).

What if $\iota' = \iota$? What is then the status of the mappings $\mathrm{lmap}(y_\iota, x_\iota y_\iota)$ and $\mathrm{rmap}(y_\iota x_\iota, y_\iota)$? We can show that, since $y_\iota x_\iota y_\iota$ is dead (Lemma 34), we cannot have both of them be automorphisms.⁵ However, something stronger is also true: we can even convince ourselves that one of the functions is not an automorphism by pointing at only 1 or 2 of the steps of $M$ on $x_\iota$. The next figure shows three examples of this. In each, we sketch the left behavior of $M$ on $y_\iota$ and all single-state views.

[Figure: three examples (i), (ii), (iii) on the string $y_\iota x_\iota y_\iota$, each showing one or two steps $e'$, $e$ of $M$ on $x_\iota$ together with the views $u$, $u'$, $v$, $v'$ on $y_\iota$.]

Example i shows only 1 of the steps of $M$ on $x_\iota$, say $e = (p, q)$; many more may be included in $E_\iota$. Is $\mathrm{lmap}(y_\iota, x_\iota y_\iota)$ an automorphism? Normally, we would need to know the entire $E_\iota$ to answer this question. Yet, in this case $e$ is enough to answer no. To see why, note that the view $v$ of $q$ on $y_\iota$ has height 2, while one of the views that contain $p$ is $u$, of height 1. Irrespective of the rest of $E_\iota$, $\mathrm{lmap}(y_\iota, x_\iota y_\iota)$ will map $u$ to a view that contains $v$ and thus has height 2 or more. So, it does not respect heights, which implies it is not an automorphism.

Example ii shows 2 of the steps in $E_\iota$, say $e' = (p', q')$ and $e = (p, q)$. Is $\mathrm{lmap}(y_\iota, x_\iota y_\iota)$ an automorphism? Observe that neither step alone can force a negative answer: the view $v'$ of $q'$ on $y_\iota$ has height 1, as does the lowest view $u'$ containing $p'$; similarly for $e$, $u$, $v$, and height 2. So, individually each of $e'$ and $e$ may very well participate in sets of steps that induce automorphisms. Yet, they cannot belong to the same such set. To see why, suppose they do. Since $u' \subseteq u$, the image of $u$ would be $v' \cup v$ or a superset. Since $v' \not\subseteq v$, the height of that image would be greater than the height of $v$, and thus greater than the height of $u$, violating the respect to heights.

Example iii also shows 2 of the steps in $E_\iota$, say $e' = (p', q')$ and $e = (p, q)$, neither of which can disqualify $\mathrm{lmap}(y_\iota, x_\iota y_\iota)$ from being an automorphism. Yet, together they can. To see why, suppose both steps participate in the same automorphism. Then the image of $u'$ must be exactly $v'$: otherwise, it would be some strict superset of $v'$, of height 2 or more, disrespecting the height of $u'$. On the other hand, $u$ must map to a set that contains $v$, and thus also $v' \subseteq v$. Hence, $v'$ must be the exact image of some $u^* \subseteq u$. But then both $u^*$ and $u'$ map to $v'$, when $u^* \ne u'$ (since $u' \not\subseteq u$), a contradiction to the map being injective.

⁵ If they were, they would be bijections (because each of $\mathrm{lviews}(y_\iota)$ and $\mathrm{rviews}(y_\iota)$ has a maximum). Hence, $M$ would not be able to distinguish between the live $y_\iota$ and the dead $y_\iota (x_\iota y_\iota)^t$, for $t$ any exponent that turns both bijections into identities. (Note that this is true even for the $n$-state snfa that solves liveness. Therefore, this observation alone can give rise to no interesting lower bound for $k$.)

In short, each step in $E_\iota$ severely restricts the form of $\mathrm{lmap}(y_\iota, x_\iota y_\iota)$ and $\mathrm{rmap}(y_\iota x_\iota, y_\iota)$. And, either individually or in pairs, some steps can be so restrictive that they cannot be part of any set of steps that induces an automorphism in both directions. This motivates the following.

35. Definition. A set of steps $E \subseteq Q^2$ is compatible with $y_\iota$ if there exists a set $E \subseteq \hat E \subseteq Q^2$ such that the following are both automorphisms:
• the restriction of $\big(\hat E \circ \mathrm{lview}(y_\iota)\big)(\cdot)$ on $\mathrm{lviews}(y_\iota)$, and
• the restriction of $\big(\hat E \circ \mathrm{rview}(y_\iota)\big)(\cdot)$ on $\mathrm{rviews}(y_\iota)$.

E.g., $\{e\}$ in Example i and $\{e', e\}$ in Examples ii, iii are incompatible with $y_\iota$. We are now ready to define the sequences promised in Section 3.3. For each $\iota \in I$, we let $X_\iota$ consist of all sets of 1 or 2 steps of $M$ on $x_\iota$, and $Y_\iota$ consist of all sets of 1 or 2 steps of $M$ that are incompatible with $y_\iota$:

$$X_\iota := \{E \in \mathcal{E} \mid E \subseteq E_\iota\}, \qquad Y_\iota := \{E \in \mathcal{E} \mid E \text{ is incompatible with } y_\iota\}.$$
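For very small state sets, compatibility in the sense of Definition 35 can be decided by brute force over all supersets of the given steps. The following is a rough sketch under loud assumptions: the two behaviors on $y_\iota$ are passed in as plain relations (here a hypothetical identity behavior on two states), and a map is accepted as an automorphism when it sends the family of views onto itself, which for this monotone map on a finite family is the criterion the text obtains from its Lemma 23. All names are illustrative.

```python
from itertools import combinations

def subsets(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    for r in range(len(items) + 1):
        for c in combinations(items, r):
            yield frozenset(c)

def view(rel, u):
    """States reachable from within u via the relation rel."""
    return frozenset(q for (p, q) in rel if p in u)

def views(rel, states):
    """Range of u -> view(rel, u) over all u that are subsets of states."""
    return {view(rel, u) for u in subsets(states)}

def induces_automorphism(steps, behavior, states):
    """Does u -> behavior(steps(u)) map views(behavior) onto itself?
    A surjection of a finite set onto itself is a bijection."""
    family = views(behavior, states)
    return {view(behavior, view(steps, u)) for u in family} == family

def compatible(E, lbehavior, rbehavior, states):
    """Search every superset E_hat of E inside Q x Q (exponential in |Q|^2)."""
    rest = [(p, q) for p in states for q in states if (p, q) not in E]
    for extra in subsets(rest):
        e_hat = frozenset(E) | extra
        if (induces_automorphism(e_hat, lbehavior, states)
                and induces_automorphism(e_hat, rbehavior, states)):
            return True
    return False

# Hypothetical behaviors on y (an assumption): the identity on two states.
Q = (0, 1)
IDENT = {(0, 0), (1, 1)}
print(compatible({(0, 1)}, IDENT, IDENT, Q))
print(compatible({(0, 0), (0, 1)}, IDENT, IDENT, Q))
```

With the identity behavior, the single step $(0,1)$ extends to the swap $\{(0,1),(1,0)\}$ and is therefore compatible, while the pair $\{(0,0),(0,1)\}$ forces the view $\{0\}$ onto a view of larger height, in the spirit of Example i.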

We need, of course, to show that the sequences relate as in Lemma 22. The case $\iota' < \iota$ is easy. Every $E \in X_{\iota'}$ can be extended to the set of all steps of $M$ on $x_{\iota'}$ (namely to $\hat E := E_{\iota'}$), which does induce automorphisms (by Lemmata 33 and 34), so $X_{\iota'} \cap Y_\iota = \emptyset$. The case $\iota' = \iota$ is harder. We analyze it in the next section.

3.5. The main argument. Suppose $\iota' = \iota$. We will exhibit a singleton or two-set $E \subseteq E_\iota$ that is incompatible with $y_\iota$. First, some preparation.

3.5-I. The witness. Consider the strings $y_\iota (x_\iota y_\iota)^t = (y_\iota x_\iota)^t y_\iota$, for all $t \ge 1$. Since $y_\iota x_\iota y_\iota$ is dead, the same is true of all these strings. Since $y_\iota$ is live, Lemma 28 says that for all $t \ge 1$:
• $\mathrm{lview}(y_\iota)(p) \not\supseteq \mathrm{lview}\big(y_\iota (x_\iota y_\iota)^t\big)(p)$ for some $p \in Q$, or
• $\mathrm{rview}(y_\iota)(p) \not\supseteq \mathrm{rview}\big((y_\iota x_\iota)^t y_\iota\big)(p)$ for some $p \in Q$.
Namely, in order to accept the extensions $y_\iota (x_\iota y_\iota)^t = (y_\iota x_\iota)^t y_\iota$ but reject the original $y_\iota$, $M$ must exhibit on each of them a single-state view that ‘escapes’ its counterpart on the original. In a sense, among all $2k$ single-state views on each extension, the escaping one is a ‘witness’ for the fact that the extension is accepted, and Lemma 28 says that every extension has a witness. Of course, this allows for the possibility that different extensions may have different witnesses. However, we can actually find the same witness for all extensions:

36. Lemma. At least one of the following is true:
• $\mathrm{lview}(y_\iota)(p) \not\supseteq \mathrm{lview}\big(y_\iota (x_\iota y_\iota)^t\big)(p)$ for some $p \in Q$ and all $t \ge 1$.
• $\mathrm{rview}(y_\iota)(p) \not\supseteq \mathrm{rview}\big((y_\iota x_\iota)^t y_\iota\big)(p)$ for some $p \in Q$ and all $t \ge 1$.


Proof. Suppose neither is true. Then each of the $2k$ single-state views has an extension on which it fails to escape from its counterpart on $y_\iota$. That is, for every $p$ there exists a $t_{p,l} \ge 1$ such that

$$\mathrm{lview}(y_\iota)(p) \supseteq \mathrm{lview}\big(y_\iota (x_\iota y_\iota)^{t_{p,l}}\big)(p)$$

as well as a $t_{p,r} \ge 1$ such that

$$\mathrm{rview}(y_\iota)(p) \supseteq \mathrm{rview}\big((y_\iota x_\iota)^{t_{p,r}} y_\iota\big)(p).$$

Consider then the extension $z := y_\iota (x_\iota y_\iota)^{t^*} = (y_\iota x_\iota)^{t^*} y_\iota$, where

$$t^* := \prod_{p \in Q} t_{p,l} \cdot \prod_{p \in Q} t_{p,r}.$$

Clearly, every $p$ has a $t \ge 1$ such that $z = y_\iota \big((x_\iota y_\iota)^{t_{p,l}}\big)^t$, and thus Lemma 26 implies that $\mathrm{lview}(y_\iota)(p) \supseteq \mathrm{lview}(z)(p)$. By the same argument, we also conclude that $\mathrm{rview}(y_\iota)(p) \supseteq \mathrm{rview}(z)(p)$. Overall, all single-state views on $z$ fall within their counterparts on $y_\iota$, contradicting Lemma 28.

We fix $p$ to be a witness as in Lemma 36. We assume $p$ is of the first type, involving left views (otherwise, a symmetric argument applies). Moreover, among all witnesses of this type, we select $p$ so as to minimize the height of $\mathrm{lview}(y_\iota)(p)$ in $\mathrm{lviews}(y_\iota)$. For convenience, in the rest of the argument we let

$$v_0 := \mathrm{lview}(y_\iota)(p) \qquad\text{and}\qquad V := \mathrm{lviews}(y_\iota), \quad h := h_V.$$

Note that, by the selection of $p$, no $\tilde p$ with $\mathrm{lview}(y_\iota)(\tilde p) \subsetneq v_0$ can be a witness of the first type. Hence, every such $\tilde p$ has a $\tilde t \ge 1$ satisfying

$$\mathrm{lview}(y_\iota)(\tilde p) \supseteq \mathrm{lview}\big(y_\iota (x_\iota y_\iota)^{\tilde t}\big)(\tilde p).$$

We fix $t^*$ to be the product of all such $\tilde t$. Then the following lemma holds.

37. Lemma. If $\tilde p$ is such that $\mathrm{lview}(y_\iota)(\tilde p) \subsetneq v_0$, then for all $\lambda \ge 1$:

$$\mathrm{lview}(y_\iota)(\tilde p) \supseteq \mathrm{lview}\big(y_\iota (x_\iota y_\iota)^{\lambda t^*}\big)(\tilde p).$$

Proof. Fix any $\tilde p$ as in the statement and consider the $\tilde t$ for which $\mathrm{lview}(y_\iota)(\tilde p) \supseteq \mathrm{lview}\big(y_\iota (x_\iota y_\iota)^{\tilde t}\big)(\tilde p)$. For any $\lambda \ge 1$, we know $\lambda t^*$ is a multiple of $\tilde t$, and therefore Lemma 26 applies.

3.5-II. Escape computations. For all $t \ge 1$, collect into a set $C_t$ all computations $c \in \mathrm{lcomp}_p\big(y_\iota (x_\iota y_\iota)^t\big)$ that exit into some state $q \notin v_0$. We will be calling these computations ‘the escape computations for $p$ on the $t$-th extension’. We also define $C := \bigcup_{t \ge 1} C_t$.

Let us see what an escape computation looks like. (See Figure 21a.) Pick any $c \in C$, say on the $t$-th extension, exiting into $q$. Let $e_1, e_2, \ldots, e_t$ be the steps of $c$ on $x_\iota$, where $e_j = (p_j, q_j) \in E_\iota$. These are the critical steps along $c$. Let $v_j := \mathrm{lview}(y_\iota)(q_j)$ be the view of the right end-point of $e_j$. Along with $v_0$, these views form the list $v_0, v_1, \ldots, v_t$ of the major views along $c$. Clearly, each of them contains the left end-point of the next critical step: $v_{j-1} \ni p_j$ (similarly, $v_t \ni q$). So, for each $e_j$ there exist views $u \in V$ that contain its left end-point and are contained in the preceding major view: $v_{j-1} \supseteq u \ni p_j$ (similarly, $v_t \supseteq u \ni q$). Among them, let $u_{j-1}$ be of minimum height in $V$ (select $u_t$ similarly). Then the list $u_0, \ldots, u_{t-1}, u_t$ are the minor views along $c$. This concludes our description.

3.5-III. The incompatible set. We are now ready to find the incompatible set $E$ that we are looking for. We will find its one or two steps among the critical steps of escape computations. We distinguish two cases.

Case 1: Some $c \in C$ contains some critical step $e$ such that the singleton $\{e\}$ is incompatible with $y_\iota$. Then we can select $E := \{e\}$, and we are done.

Case 2: For all $c \in C$ and all critical steps $e$ in $c$, the singleton $\{e\}$ is compatible with $y_\iota$. In this case, we will find an incompatible two-set.

Steepness. First of all, every $c \in C$ (say with $t$, $e_j$, $v_j$, $u_j$ as above) has every major view at least as high as the next minor one ($h(v_j) \ge h(u_j)$, since $v_j \supseteq u_j$) and every minor view at least as high as the next major one ($h(u_j) \ge h(v_{j+1})$, or $\{e_{j+1}\}$ would be incompatible, as in Example i). Hence, every $c \in C$ has views of monotonically decreasing height:

$$h(v_0) \ge h(u_0) \ge h(v_1) \ge h(u_1) \ge \cdots \ge h(v_t) \ge h(u_t).$$

To capture the “rate” of this decrease, we record the list of minor view heights $H_c := \big(h(u_j)\big)_{0 \le j \le t}$, and order each $C_t$ lexicographically:

$$c' \le c \quad\text{iff}\quad H_{c'} \le_{\mathrm{lex}} H_c.$$

With respect to this total order, “smaller” computation means “steeper”.

Long and steepest computation. We now fix $t$ to be a multiple of $t^*$ which is at least $|V|$, and we select $c$ to be one of the steepest computations in $C_t$. We let $q$, $e_j$, $v_j$, $u_j$ be as usual. Since $t \ge |V|$, the list $u_0, \ldots, u_t$ contains repetitions. Let $j' < j$ be the indices for the earliest one. Then $u_{j'} = u_j$, so $h(u_{j'}) = h(u_j)$, and thus all views in between have the same height: $h(u_{j'}) = h(v_{j'+1}) = \cdots = h(v_j) = h(u_j)$. As a result, each major view equals the next minor one:

$$v_{j'+1} = u_{j'+1}, \quad \ldots, \quad v_j = u_j.$$

The rest of the argument depends on whether the earliest repetition occurs at the very beginning or not. So, we distinguish two cases.

Figure 21. (a) An escape computation $c \in C_5$, exiting into $q$. (b) An example of Case 2A, for $j = 3$ and $l = 2$; in dashes, the new computation $c' \in C_j$. (c) An example of Case 2B, for $j' = 2$ and $j = 4$; in dashes, the hypothetical case $u_{j'-1} \supseteq u_{j-1}$ and $c'$.

Case 2A: $j' = 0$. Then $h(u_0) = h(v_1) = \cdots = h(v_j) = h(u_j)$, and therefore $v_1 = u_1, \ldots, v_j = u_j$. In fact, we also have $h(v_0) = h(u_0)$, and thus $v_0 = u_0$. To see why, suppose $h(v_0) \ne h(u_0)$. Then $v_0 \supsetneq u_0$. Since $u_0 \in V$, some state $\tilde p$ has $\mathrm{lview}(y_\iota)(\tilde p) = u_0$ (Figure 21a), and therefore Lemma 37 applies to it (since $u_0 \subsetneq v_0$). In particular,

$$\mathrm{lview}(y_\iota)(\tilde p) \supseteq \mathrm{lview}\big(y_\iota (x_\iota y_\iota)^t\big)(\tilde p)$$

(since $t$ is a multiple of $t^*$). On the other hand, $u_0$ contains the left end-point of $e_1$, so the part of $c$ after $e_1$ shows that $q \in \mathrm{lview}\big(y_\iota (x_\iota y_\iota)^t\big)(\tilde p)$, and thus $q \in \mathrm{lview}(y_\iota)(\tilde p) = u_0$. Since $u_0 \subseteq v_0$, this means that $c$ is not an escape computation, a contradiction.

So, if the earliest repetition occurs at the very beginning, we know

$$h(v_0) = h(u_0) = \cdots = h(v_j) = h(u_j) \qquad\text{and}\qquad v_0 = u_0, \ \ldots, \ v_j = u_j$$

(see Figure 21b). Now, by the selection of $p$, its view on the $j$-th extension escapes $v_0$. Pick any $c' \in C_j$, with exit state $q' \notin v_0$, critical steps $e'_1, \ldots, e'_j$, and major views $v'_0, \ldots, v'_j$. Then $v'_0 = v_0$ (since both $c'$ and $c$ start at $p$) and $q' \in v'_j \setminus v_j$ (since $v_j = u_j = u_0 = v_0$ and $q' \notin v_0$). So, the respective major views start with inclusion $v'_0 \subseteq v_0$ but end with non-inclusion $v'_j \not\subseteq v_j$. So there is $1 \le l \le j$ so that $v'_{l-1} \subseteq v_{l-1}$ but $v'_l \not\subseteq v_l$.

We are now ready to prove that $\{e'_l, e_l\}$ is incompatible with $y_\iota$. The argument is as in Example ii. Suppose the two steps participate in a set inducing an automorphism $\zeta$. Since $v'_{l-1} \subseteq v_{l-1}$, both $e'_l$ and $e_l$ have their left end-points in $v_{l-1}$. Hence, $\zeta(v_{l-1}) \supseteq v'_l \cup v_l$. Since $v'_l \not\subseteq v_l$, the height of $\zeta(v_{l-1})$ is greater than that of $v_l$. But $h(v_{l-1}) = h(v_l)$. Therefore $h\big(\zeta(v_{l-1})\big) > h(v_{l-1})$, a contradiction.

Case 2B: $j' \ne 0$. Then we can talk of the minor views $u_{j'-1}$ and $u_{j-1}$ that precede the first repetition. Of course, $u_{j'-1} \ne u_{j-1}$. In fact, $u_{j'-1} \not\supseteq u_{j-1}$. To see why, suppose $u_{j'-1} \supseteq u_{j-1}$ (Figure 21c). Then $u_{j'-1} \supsetneq u_{j-1}$ (since $u_{j'-1} \ne u_{j-1}$) and thus $h(u_{j'-1}) > h(u_{j-1})$. Moreover, $e_j$ has its left end-point in $v_{j'-1}$ (since $v_{j'-1} \supseteq u_{j'-1} \supseteq u_{j-1}$) while its right end-point has view $u_{j'}$ (since $v_j = u_j = u_{j'}$). Hence, by replacing $e_{j'}$ with $e_j$, we get a new computation $c'$ that is also in $C_t$. In addition, $H_{c'}$ differs from $H_c$ only in that $h(u_{j'-1})$ is replaced by $h(u_{j-1})$. But then $c'$ is strictly steeper than $c$, a contradiction.

We are now ready to prove that $\{e_{j'}, e_j\}$ is incompatible with $y_\iota$. The argument is as in Example iii. Suppose the two steps participate in a set inducing an automorphism $\zeta$. Because of $e_j$, $\zeta(u_{j-1}) \supseteq u_j$; but $h(u_{j-1}) = h(u_j)$ and $\zeta$ respects heights, so in fact $\zeta(u_{j-1}) = u_j$. Because of $e_{j'}$, $\zeta(u_{j'-1}) \supseteq u_{j'} = u_j$; so there exists $u^* \subseteq u_{j'-1}$ such that $\zeta(u^*) = u_j$. Overall, $u^* \ne u_{j-1}$ (since exactly one is in $u_{j'-1}$) and $\zeta(u^*) = \zeta(u_{j-1})$. Hence $\zeta$ is not injective, a contradiction.

This concludes the analysis of the case $\iota' = \iota$ and thus the overall proof.


3.6. 2DFAs versus SNFAs. As already mentioned, an easy modification of the proof of Theorem 27 allows us to also establish the following.

38. Theorem. The trade-off from 2dfas to snfas is exponential.

In other words, there exists a problem that can be solved by small 2dfas but cannot be solved by small snfas. This problem is simply an appropriate restriction of liveness and the small 2dfa solving it is actually single-pass.

To describe this restriction, let us use $\Sigma'_n$ to denote the subset of $\Sigma_n$ containing only the $2^n$ ‘parallel’ symbols of the form $\{(a, a) \mid a \in \alpha\}$ for $\alpha \subseteq [n]$. For example, the leftmost symbol in Figure 14a is in $\Sigma'_5$, for $\alpha = \{2, 3, 4, 5\} \subseteq [5]$. Let us also recall the complete symbol $\eta = [n]^2$ from Section 3.4-II. The restriction of liveness that we have in mind is the problem $B'_n$ where all inputs are promised to follow the pattern:

$$\Sigma'_n \big(\eta\, \Sigma'_n \Sigma_n \Sigma'_n\big)^* \eta\, \Sigma'_n.$$

In other words, according to this promise, every input $z$ starts and ends with a parallel symbol and the rest of it consists of one or more copies of the complete symbol separated by 3-symbol snippets of the form $\Sigma'_n \Sigma_n \Sigma'_n$, where the outer symbols have to be parallel and the middle one can be anything.

[Figure: an example input that obeys this promise.]

Notice that every such string is live iff its first and last symbols are nonempty and its snippets are all live. Intuitively, the copies of the complete symbol ‘reset’ liveness every four symbols. This last observation immediately suggests a small 2dfa algorithm for solving liveness under this promise: just check that the first and last symbols are non-empty and that every snippet is live. More carefully: We read the first symbol. If it is empty, we hang. Otherwise, we start scanning the input from left to right. Every time that we read a copy of η, we use a depth-first search to check whether the string of the next 3 symbols is live. If it is not, we hang. If it is, we move to the next copy of η. If there is only 1 next symbol, we check whether it is empty or not. If it is, we hang. Otherwise, we accept.
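The procedure just described can be sketched in a few lines. This is a minimal illustration rather than the automaton itself: symbols are represented as sets of pairs over $\{1, \ldots, n\}$, the depth-first search on each snippet is swapped for a plain forward-reachability pass, and "hang" is modelled by returning `False`. All function names are assumptions made for the example.

```python
import random

def full_symbol(n):
    """The complete symbol eta = [n]^2."""
    return {(a, b) for a in range(1, n + 1) for b in range(1, n + 1)}

def is_live(symbols, n):
    """Unrestricted liveness: some path joins the leftmost and rightmost columns."""
    reachable = set(range(1, n + 1))
    for symbol in symbols:
        reachable = {b for (a, b) in symbol if a in reachable}
    return bool(reachable)

def is_live_under_promise(symbols, n):
    """Single left-to-right pass, assuming the input obeys the promise."""
    eta = full_symbol(n)
    if not symbols[0]:                        # empty first symbol: hang
        return False
    i = 1                                     # position of the current copy of eta
    while True:
        if symbols[i] != eta:
            raise ValueError("input violates the promise")
        if len(symbols) - (i + 1) == 1:       # only the last symbol remains
            return bool(symbols[i + 1])
        if not is_live(symbols[i + 1:i + 4], n):   # dead 3-symbol snippet: hang
            return False
        i += 4                                # move to the next copy of eta

def random_promised_input(n, snippets, rng):
    """Random input following the promised pattern (for experiments only)."""
    def parallel():
        return {(a, a) for a in range(1, n + 1) if rng.random() < 0.6}
    def arbitrary():
        return {e for e in full_symbol(n) if rng.random() < 0.3}
    s = [parallel()]
    for _ in range(snippets):
        s += [full_symbol(n), parallel(), arbitrary(), parallel()]
    return s + [full_symbol(n), parallel()]

example = random_promised_input(3, 2, random.Random(1))
print(is_live_under_promise(example, 3) == is_live(example, 3))
```

Comparing the promise-based check against the unrestricted `is_live` on random promised inputs is a quick sanity test of the observation that the copies of $\eta$ reset liveness.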

Easily, this can be implemented on a zdfa with only $O(n^2)$ states.⁶

⁶ In fact, replacing the depth-first search on the 3-symbol snippets with a cleverer search, we can reduce the size of this zdfa to only $O(n^2 / \log n)$ states, which, by the way, is asymptotically optimal for this restriction of liveness. However, the algorithm is too complicated to describe here and is not necessary for Theorem 38, anyway.

On the other hand, even under this promise, every snfa solving liveness still needs $2^{\Omega(n)}$ states. This is true simply because the promise does not invalidate our argument for the general case, as the hard inputs constructed in Section 3.4-II can all be drawn so as to obey the promise. More specifically, we can replace property $P_\iota$ with the property $P'_\iota \subseteq P_\iota$ which contains only the strings of $P_\iota$ that obey the promise. Easily, $P'_\iota$ is still non-empty and infinitely extensible in both directions. So, we can again find an l-generic string $y'_l$ and an r-generic string $y'_r$ over $P'_\iota$ and construct $y'_\iota := y'_l \eta y'_r$, which is clearly in $P'_\iota$ and is thus generic over $P'_\iota$. Then, it is trivial to verify that the sequence $(y'_\iota)_{\iota \in I}$ is related to $(x_\iota)_{\iota \in I}$ exactly as $(y_\iota)_{\iota \in I}$ is in Lemma 34. The rest of the proof remains the same.

4. Conclusion

In the first part of this chapter, we focused on a natural class of restricted but still fully bidirectional 2nfa algorithms for liveness, which includes the small 1nfa solvers. We asked whether small 2dfas of that kind can succeed and proved that they cannot, no matter how large they are. It is certainly good to know that graph exploration alone can never be a sufficient strategy. However, as already mentioned in the Introduction, in the context of the full conjecture the emphasis above stresses an alarming mismatch: a complexity question received a computability answer. This suggests that the reasons why deterministic moles fail against liveness are only loosely related to the reasons why small 2dfas fail (if they really do). In order for this approach to ultimately be of any use against the full conjecture, we need restricted versions of fully bidirectional 2dfas that are both weak enough to succumb to our arguments and strong enough to keep us in complexity: large 2dfas of this kind should be able to solve liveness.

In the second part of the chapter, we proved that snfas must be exponentially large to solve the complement of liveness, and hence small snfas are not closed under complement. With an easy modification, our proof also showed that zdfas can be exponentially more succinct than snfas. An interesting next question concerns the exact value of our lower bound. The smallest known snfa for $B_{n,\emptyset}$ is the obvious $2^n$-state 1dfa. Is this really the best snfa algorithm? If so, then nondeterminism and sweeping bidirectionality together are completely useless in this context.

A preliminary version of the contents of Section 2 can be found in [27], whereas Section 3 contains the material of [30].

CHAPTER 3

Non-Recursive Trade-Offs

In Chapters 1 and 2 we compared the relative succinctness of several pairs of types of machines. In each case, the two types had the same computational power, in the sense that they could solve the same class of problems: the regular languages. Moreover, the associated trade-off was easily seen to be bounded from above by some recursive function. In contrast, the comparisons that we will consider in this chapter are more general. We will discuss conversions between types of machines of different computational power and, most often, the associated trade-offs will be growing faster than any computable function.

One of the earliest studies of the relative succinctness of types of machines of different power was conducted by Stearns [56], as part of a proof that we can algorithmically check whether the language of a deterministic pushdown automaton (1dpa) is regular or not. Stearns showed that, although not every 1dpa can be converted into an equivalent 1dfa, whenever such equivalent automata exist, the smallest among them are at most triple-exponentially larger than the 1dpa itself. This naturally led to the corresponding question for one-way nondeterministic pushdown automata (1npas): whenever a 1npa has equivalent 1dfas, what is an upper bound for the size of the smallest among them? The answer was qualitatively new, by Meyer and Fischer [35], who showed that every such bound grows (as a function of the size of the 1npa) faster than any computable function. Hence, among the cases where it is possible to turn a 1npa into a 1dfa, the trade-off in the description size is in general non-recursive.¹ Several refinements of this result followed [59, 50].

¹ As already noted in the introduction, this name can be misleading. Our intention here is to characterize the trade-offs that admit no recursive upper bounds. Clearly, every such trade-off is non-recursive. However, it is conceivable that there exist non-recursive trade-offs that admit recursive upper bounds. (It is easy to present natural functions of this kind. However, we do not know of one that is also the trade-off of some natural conversion.) So, strictly speaking, the class of non-recursive trade-offs is a subclass of the class of trade-offs that do not admit recursive upper bounds and, if the two classes are in fact equal, then an argument is needed to support this. With this clarification, we will move on using the popular choice. Note that, under this choice, a “recursive trade-off” is one that admits recursive upper bounds, and the “(non-)recursiveness of a trade-off” refers to the (non-)existence of a recursive upper bound for it.


3. NON-RECURSIVE TRADE-OFFS

In an important development, Hartmanis [16] later explained that the recursiveness of the trade-off from a type of machines a to a not-as-powerful type of machines b typically implies the recognizability (semi-decidability) of the corresponding inadequacy problem: “given a machine of type a, check that it has no equivalent machines of type b (i.e., that the machines of type b are inadequate to describe the language of the given machine)”. This greatly simplified the proofs of [35, 59, 50], while it nicely revealed the connections of the entire discussion to Gödel’s theorem that the addition of an extra axiom to a formal system typically results in non-recursively shorter proofs for some of its theorems [17]. Today several refinements of the above results are known and non-recursive trade-offs have emerged in many other comparisons between different types of machines. Comprehensive surveys can be found in [15, 32].

1. Two-Way Multi-Pointer Machines

In a remark in [19], Hartmanis and Baker showed that a non-recursive trade-off can occur even when an optimal algorithm replaces a near-optimal one.^2 For example, converting an $n^{2+\epsilon}$-space deterministic Turing machine (dtm) into one that uses only $n^2$ space involves a non-recursive blowup in the size of description. In the pattern of [16], they derived this observation from the unrecognizability of the inadequacy problem from near-optimal to optimal machines (from $n^{2+\epsilon}$-space to $n^2$-space dtms), which in turn was shown to be a consequence of the fact that the near-optimal complexity class is strictly larger than the optimal one (some $n^{2+\epsilon}$-space dtms have no $n^2$-space equivalent).

In this chapter we will refine that argument. We will prove a general theorem that directly shows the non-recursiveness of the trade-off in many conversions between machines of different power. In loose terms, our theorem states the following:

If two types of machines a and b are such that
1. some machine of type a has no equivalent machine of type b, and
2. every unary two-way deterministic finite automaton with access to a linearly-bounded counter can be simulated by some machine of type a,
then the trade-off from type a to type b is non-recursive.

For example, in order to arrive at the previous remark on space, we can argue that, since $n^2 = o(n^{2+\epsilon})$, we know there exist $n^{2+\epsilon}$-space dtms with no $n^2$-space equivalent and therefore condition 1 is indeed true; that condition 2 is also true follows from the easy observation that any $\Omega(\lg n)$ amount of space suffices for the simulation of a linearly-bounded counter.

^2 The reader is referred to [19, 17, 18] for a quite interesting discussion of the implications that this might have for our search for optimal algorithms.

The most characteristic applications of our theorem concern the successive levels of hierarchies of two-way multi-pointer automata, where by ‘pointer’ we mean any of the following accessories (in order of nondecreasing power): a linearly-bounded counter; a blind read-only head, namely a head that cannot distinguish between different input symbols (but can distinguish between input symbols and end-markers); an ordinary read-only head; a sensing read-only head, namely one that can sense which of the other heads are at the same cell as itself; or a pebble. For example, we can establish the non-recursiveness of the following trade-offs (in each case, the reference indicates where condition 1 of the theorem has been established; for condition 2, it is always easy to see that it is also satisfied):
• from k + 1 to k counters, on linearly-bounded two-way deterministic counter automata (unary or not) and for all k [42],
• from k + 1 to k heads, on two-way multi-head finite automata (deterministic or not, unary or not) and for all k [40, 41, 42],
• from k + 1 to k heads, on two-way multi-head pushdown automata (deterministic or nondeterministic) and for all k [23].

Sometimes, we can only be as refined as the hierarchy is known to be:
• from k + 2 to k registers, on linearly-bounded register machines (deterministic or nondeterministic) and for all k [42],
• from k + 2 to k counters, on linearly-bounded two-way nondeterministic counter automata (unary or not) and for all k [42].

Similarly, the trade-off is non-recursive:
• from 3 to 2 heads, on a simple two-way deterministic finite automaton [9] (a multi-head automaton is simple if every input head after the first one is blind). It remains non-recursive even when we start from a 2-head two-way deterministic finite automaton, or from a 1-head two-way deterministic pushdown automaton [9].

Finally, we can also conclude the non-recursiveness of the trade-off:
• from k + 1 to k work-tape symbols, on Turing machines (deterministic or not) that use at most $\lg n$ work-tape cells on every input of length n, and for all k ≥ 2 [52]. It remains non-recursive even when we start from a Turing machine with a unary input alphabet, but then only for all sufficiently large k [52].

Several other conversions between machines of distinct computational power can be treated in a similar way.

Returning to the statement of the theorem above, we warn that it is, in fact, incomplete. Additional conditions have to be met, concerning a and b, their descriptions, and how ‘size’ is measured. Still, in most interesting cases these conditions are trivially satisfied (in the above examples they are), so that listing them in this introduction would be a distraction. The complete list is contained in the formal statement of the theorem in Section 3.

The next section describes the formal framework of this study in more detail. Section 3 states and proves the theorem, except for an important lemma, which is proved in Section 5, after some preparation in Section 4. We warn that the discussion in this chapter is going to be much more abstract than in Chapters 1 and 2, so as to ensure that the conclusions cover as many conversions as possible, including cases where a or b denote types of language descriptors other than machines (e.g., regular expressions, grammars). A more concrete discussion can be found in [26], where the theorem is proved specifically for the conversion from k + 1 to k heads on two-way multi-head finite automata.

2. Preliminaries

We write $\mathbb{N}$ for the set of positive integers, and $\lg_a n$ for $\lfloor \log_a n \rfloor$. As usual, given any problem $\Pi = (\Pi_{\mathrm{yes}}, \Pi_{\mathrm{no}})$ over some alphabet $\Sigma$ and any dtm $M$, we say that $M$ recognizes $\Pi$ if $M$ accepts all $w \in \Pi_{\mathrm{yes}}$ and rejects (possibly by looping) all $w \in \Pi_{\mathrm{no}}$. If some dtm recognizes $\Pi$, we say $\Pi$ is (Turing-) recognizable. If $\Pi'$ is also a problem over $\Sigma$, we write $\Pi \le \Pi'$ and say that $\Pi$ reduces to $\Pi'$ iff there is a dtm that, on input $w \in \Pi_{\mathrm{yes}} \cup \Pi_{\mathrm{no}}$, eventually halts with an output $w'$ such that

$$w \in \Pi_{\mathrm{yes}} \implies w' \in \Pi'_{\mathrm{yes}} \qquad\text{and}\qquad w \in \Pi_{\mathrm{no}} \implies w' \in \Pi'_{\mathrm{no}}.$$

Clearly, if some unrecognizable problem $\Pi$ reduces to a problem $\Pi'$, then $\Pi'$ must also be unrecognizable.

In the case where $\Pi$ is a language (namely, $\Pi_{\mathrm{yes}} + \Pi_{\mathrm{no}} = \Sigma^*$) and $\Pi_{\mathrm{yes}}$ contains exactly all sufficiently long strings, for some interpretation $0 \le l \le \infty$ of ‘sufficiently long’,

$$\Pi_{\mathrm{yes}} = \{w \in \Sigma^* \mid |w| \ge l\},$$

we say $\Pi$ obeys a threshold; clearly then, $\Pi_{\mathrm{yes}}$ is empty iff this threshold is infinite. A machine that solves $\Pi$ is also said to obey the same threshold.

2.1. Descriptional systems. A descriptional system over the alphabets $\Gamma$ and $\Sigma$ is any set $D \subseteq \Gamma^*$ of names (or descriptors), along with two total functions $(\cdot)_D$ and $|\cdot|_D$, mapping every name $d \in D$ to its language $(d)_D \subseteq \Sigma^*$ and its size $|d|_D \in \mathbb{N}$, respectively. For example, suppose that we fix a binary encoding of all 1dfas over, say, the input alphabet {a, b, c}. This induces the descriptional system over {0, 1} and {a, b, c} that contains all encoding strings as names and maps each of them to the language accepted by the corresponding 1dfa (as its language) and to the number of states in that 1dfa (as its size). Alternatively, the size of every name could just be its length (i.e., the number of binary symbols in it).

A system $D$ is decidable if the membership problem for its names is decidable. That is, if there is a dtm $U_D$ that always halts and is such that

$$\text{for all } d \in D \text{ and } w \in \Sigma^*: \qquad U_D(d, w) \text{ accepts} \iff w \in (d)_D.$$
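As a rough illustration of the 1dfa example and of the decidability condition, the following Python sketch models a toy descriptional system. It is only a sketch under stated assumptions: a name is represented by a Python tuple instead of a binary string, and the identifiers `Dfa`, `size`, `universal` and `even_a` are hypothetical, chosen here for illustration.

```python
from typing import Dict, NamedTuple, Set, Tuple


class Dfa(NamedTuple):
    """Hypothetical 'name' d: a 1dfa over the input alphabet {a, b, c}."""
    states: int                        # states are 0 .. states-1, start state is 0
    delta: Dict[Tuple[int, str], int]  # total transition function
    final: Set[int]                    # accepting states


def size(d: Dfa) -> int:
    """|d|: here, the number of states of the automaton."""
    return d.states


def universal(d: Dfa, w: str) -> bool:
    """Plays the role of U_D: always halts, accepts iff w is in (d)."""
    q = 0
    for symbol in w:
        q = d.delta[(q, symbol)]
    return q in d.final


# A 2-state name whose language is "strings with an even number of a's".
even_a = Dfa(
    states=2,
    delta={(q, s): (1 - q if s == "a" else q) for q in (0, 1) for s in "abc"},
    final={0},
)
```

Since `universal` reads its input once and stops, it halts on every pair (d, w), which is exactly what makes this toy system decidable in the sense defined above.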

Thus, the system of the previous example is clearly decidable, whereas a system containing binary encodings of dtms would be undecidable.

In order to be able to compare two descriptional systems $D$ and $E$ in terms of their relative succinctness, we require that they be comparable, in the sense that [i] they are defined over the same alphabets, and that [ii] their $(\cdot)$ and $|\cdot|$ mappings agree on all common names,^3

$$\text{for all } z \in D \cap E: \qquad (z)_D = (z)_E \quad\text{and}\quad |z|_D = |z|_E,$$

so that subscripts can be dropped: for all $z \in D \cup E$, $(z)$ and $|z|$ are unambiguous. For such systems, the comparison of $E$ against $D$ involves two natural notions:

i. For a name $e \in E$, there may or may not exist a name in $D$ that maps to the same language. In the latter case, we say that $D$ is inadequate for describing the language of $e$ and, accordingly, we call the associated computational problem, “given an $e \in E$, check that no $d \in D$ maps to $(e)$”, the inadequacy problem from $E$ to $D$. Formally, this is the promise problem $I = (I_{\mathrm{yes}}, I_{\mathrm{no}})$, with

$$I_{\mathrm{yes}} := \{e \in E \mid (d) \ne (e) \text{ for all } d \in D\}, \qquad I_{\mathrm{no}} := \{e \in E \mid (d) = (e) \text{ for some } d \in D\}.$$

Notice that $e$ is promised to be in $E$, so that solving $I$ does not require checking membership in $E$ (which might be hard, even impossible).

ii. When a name $e \in E$ does have equivalent names in $D$ (i.e., names mapping to $(e)$), we naturally ask how much larger than $e$ the smallest of these $D$-equivalents are. As usual, we answer this question with a function $f : \mathbb{N} \to \mathbb{N}$ that upper bounds this increase in size in terms of the size of $e$. Namely, $f$ is such that for all $s \in \mathbb{N}$ and all $e \in E$ of size $s$: if $D$ contains names that are equivalent to $e$, then at least one of them is of size at most $f(s)$. We say that $f$ is an upper bound for the trade-off (for the conversion) from $E$ to $D$.^4 When a computable such upper bound exists, we say that the trade-off from $E$ to $D$ is recursive.^5

As first noted in [16], discussions i and ii above are not unrelated: the unrecognizability of the inadequacy problem typically implies the non-recursiveness of the trade-off, as explained in the following lemma.

^3 The agreement for $|\cdot|$ is not necessary but we do require it, for simplicity.

^4 Note that this defines directly the notion of an upper bound for the trade-off from $E$ to $D$. A more natural approach would be to first define the notion of the trade-off from $E$ to $D$ (in the sense of Chapters 1 and 2), and only then say what an upper bound for it is. However, that would be redundant, as our goal in this chapter is to show the non-recursiveness of the upper bounds.

^5 See the discussion in Footnote 1 on what “recursive trade-off” means.

1. Lemma (Hartmanis). Let $D$ and $E$ be two comparable descriptional systems over alphabets $\Gamma$ and $\Sigma$, satisfying the following conditions:
H1. both $D$ and $E$ are decidable,
H2. for every $e \in E$, we can effectively compute its size $|e|$, and
H3. there exists a halting dtm that, on input any $s \in \mathbb{N}$, outputs a list of names $Z \subseteq \Gamma^*$ such that
   i. the non-$D$ names can be recognized in $Z$: the problem $(Z \cap \overline{D},\, Z \cap D)$ is recognizable;
   ii. the languages of the $D$-names in $Z$ cover all and only those languages over $\Sigma$ that are supported by names in $D$ of size $\le s$: $\{(z) \mid z \in Z \cap D\} = \{(d) \mid d \in D \;\&\; |d| \le s\}$.
If the trade-off from $E$ to $D$ is recursive, then the inadequacy problem from $E$ to $D$ is recognizable.

Before the proof, let us remark how mild conditions H1–H3 are. For most interesting cases, the first two of them are trivially true and H3 is satisfied via the dtm that simply lists all names in $D$ that have size $\le s$ (so that the problem of H3 i is trivially decidable and the sets of H3 ii trivially identical). Stating H3 in this more complicated form simply covers some special cases (e.g., comparing general to unambiguous context-free grammars [17, Example 2]).

Proof. Suppose $D$, $E$ are as in the statement and $f$ is a computable upper bound for the trade-off from $E$ to $D$. To check that a given $e \in E$ has no $D$-equivalents, we first compute $s := f(|e|)$ (by H2 and since $f$ is computable). We then run the dtm guaranteed by H3 on $s$, to produce a (finite, since the dtm is halting) list of names $Z := \{z_1, z_2, \ldots, z_k\}$. At that moment, we know (by the selection of $f$ and H3 ii) that we should accept iff every $D$-name in $Z$ maps to a language different from $(e)$. Equivalently, we should accept iff:

for every $z \in Z$, either $z$ is not a $D$-name or $z$ is a $D$-name and $(z)$, $(e)$ differ at one or more $w \in \Sigma^*$.

In order to check this, we start simulating, in two parallel threads:
i. the recognizer given by H3 i on each of $z_1, z_2, \ldots, z_k$ in parallel, and
ii. for all $w \in \Sigma^*$: the machines $U_E$ and $U_D$ (guaranteed by H1) respectively on $(e, w)$ and on each of $(z_1, w), (z_2, w), \ldots, (z_k, w)$.

Whenever a $z \in Z$ is accepted in thread i, we cross it off the list. Whenever a $z \in Z$ is found to disagree with $e$ on some $w$ in thread ii, it is crossed off the list, as well. Finally, if the list ever gets empty, we accept.

Clearly, every string in $Z$ that is not a $D$-name will eventually be crossed off, in thread i. Similarly, each $D$-name that is inequivalent to $e$ will also be eventually removed, in thread ii. Moreover, neither thread can delete a $D$-name that is equivalent to $e$. Hence, the list will eventually get empty iff $e$ had no $D$-equivalent in the original list $Z$, which (by H3 ii and the selection of $f$) is true iff $e$ has no $D$-equivalent at all.
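The dovetailing in the proof above can be sketched in Python. This is a minimal illustration under stated assumptions, not the construction of [16] itself: every parameter name (`list_names`, `non_d_steps`, `u_e`, `u_d`, ...) and the whole toy system at the bottom are hypothetical, `u_d` is assumed to halt on every listed string, and the `max_words` cutoff exists only so that the sketch can be run (the actual recognizer has no cutoff and may loop).

```python
from itertools import count, islice, product
from typing import Callable, Iterable, Iterator, Optional


def words(alphabet: str) -> Iterator[str]:
    """Enumerate Sigma* in length-lexicographic order."""
    for length in count(0):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)


def recognize_inadequacy(
    e: str,
    f: Callable[[int], int],                       # computable trade-off bound
    size_of: Callable[[str], int],                 # H2
    list_names: Callable[[int], Iterable[str]],    # H3: the list Z for size s
    non_d_steps: Callable[[str], Iterator[bool]],  # H3 i: yields True on accept
    u_e: Callable[[str, str], bool],               # H1: U_E
    u_d: Callable[[str, str], bool],               # H1: U_D
    alphabet: str,
    max_words: Optional[int] = None,               # cutoff for experiments only
) -> Optional[bool]:
    """Return True once every listed name is crossed off, None if cut off."""
    remaining = set(list_names(f(size_of(e))))
    probes = {z: non_d_steps(z) for z in remaining}
    for w in islice(words(alphabet), max_words):
        if not remaining:
            return True
        for z in list(remaining):
            if next(probes[z], False):       # thread i: one more recognizer step
                remaining.discard(z)
            elif u_d(z, w) != u_e(e, w):     # thread ii: z and e disagree on w
                remaining.discard(z)
    return True if not remaining else None


# Toy systems over the unary alphabet {a} (all hypothetical):
#   D-names "d<k>" describe the finite language {a^i : i < k};
#   E-names "e<k>" describe {a^i : i >= k}, and "f<k>" describe {a^i : i < k}.
def toy_size(name: str) -> int:
    return int(name[1:])


def toy_list(s: int) -> Iterable[str]:
    return [f"d{k}" for k in range(s + 1)] + ["junk"]


def toy_non_d(z: str) -> Iterator[bool]:
    if z.startswith("d"):
        while True:          # a D-name is never accepted by the non-D recognizer
            yield False
    yield False
    yield False
    yield True               # "junk" is recognized as a non-D name after 3 steps


def toy_u_d(z: str, w: str) -> bool:
    return z.startswith("d") and len(w) < int(z[1:])


def toy_u_e(e: str, w: str) -> bool:
    k = int(e[1:])
    return len(w) >= k if e.startswith("e") else len(w) < k


def toy_run(e: str) -> Optional[bool]:
    return recognize_inadequacy(e, lambda s: s, toy_size, toy_list,
                                toy_non_d, toy_u_e, toy_u_d, "a", max_words=50)
```

In this toy setting, `toy_run("e2")` accepts, because the co-finite language of `e2` has no finite equivalent, while `toy_run("f2")` never accepts (the name `d2` can never be crossed off) and is only stopped by the artificial cutoff.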

2.2. Multi-counter automata. Our main theorem will need to make use of the natural notion of a unary two-way deterministic finite automaton that has additional access to a number of counters. Such models are of course known and well studied, but mainly for non-unary alphabets; see, for example, the two-way multi-counter machines of [11, 42]. Since we will only be interested in the unary case, it is possible to simplify the model in helpful ways. Most notably, we can avoid the notion of input tape, and assume instead that the input is the upper bound for one of the counters.^6 The simplified definition follows.

A deterministic automaton with k counters (dca$_k$) consists of a finite state control and k counters, each of which can store a nonnegative integer. One of the counters is distinguished as primary, the rest being referred to as secondary. The input to the automaton is a nonnegative upper bound n for the primary counter. The machine starts at a designated start state with all its counters set to 0. At each step, based on its current state, the automaton decides which counter it should act upon and whether it should decrease it or increase it. Then the action is attempted. An attempt to decrease fails iff the counter already contains 0; an attempt to increase fails iff the counter is the primary one and it already contains n; an attempt to increase a secondary counter never fails. A failed attempt leaves the counter contents intact; a successful attempt updates the counter contents accordingly. Based on its current state and on whether the attempt succeeded or not, the automaton selects a new state and moves to it. The input is accepted if the machine ever enters a designated final state. The language of the machine is exactly the set of inputs that it accepts.

If, for all n, the behavior of the automaton is such that no secondary counter ever grows larger than n, we say that the automaton is (linearly) bounded.
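The definition above translates almost directly into a small interpreter. The following Python sketch is an illustration under stated assumptions: the program format (a dictionary from states to instructions), the state names, the convention that a state without an instruction halts and rejects, and the `max_steps` cutoff are all choices made here, not part of the definition.

```python
from typing import Dict, List, Optional, Tuple

# (counter index, '+' or '-', next state on success, next state on failure)
Instr = Tuple[int, str, str, str]


def run_dca(program: Dict[str, Instr], k: int, n: int,
            start: str = "start", final: str = "accept",
            max_steps: int = 10_000) -> Optional[bool]:
    """Run a dca_k on input n; counter 0 is the primary one, bounded by n.

    Returns True on acceptance, False if the machine reaches a state with no
    instruction (assumed here to halt and reject), None if cut off.
    """
    counters: List[int] = [0] * k
    state = start
    for _ in range(max_steps):
        if state == final:
            return True
        if state not in program:
            return False
        i, op, on_success, on_failure = program[state]
        if op == "-":
            ok = counters[i] > 0                    # fails iff counter holds 0
            if ok:
                counters[i] -= 1
        else:
            ok = not (i == 0 and counters[i] == n)  # fails iff primary holds n
            if ok:
                counters[i] += 1
        state = on_success if ok else on_failure
    return None


# A dca_1 that accepts exactly the inputs n >= 2, so it obeys the threshold 2.
at_least_two: Dict[str, Instr] = {
    "start": (0, "+", "one", "reject"),
    "one": (0, "+", "accept", "reject"),
}
```

For instance, `run_dca(at_least_two, k=1, n=1)` fails on its second increment and rejects, while any input n ≥ 2 is accepted.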
We will be interested in a special version of the emptiness problem for multi-counter automata. One way to introduce this problem is to start with the emptiness problem for dtms (“given a description of a dtm, check that the language of the machine is empty”), which is well known to be unrecognizable [21], and to consider certain ways of ‘simplifying’ it:
• What happens if, instead of a full-fledged dtm, the machine we are given is ‘simpler’? Say, a multi-counter automaton? Or just a dca$_2$? Clearly, checking emptiness becomes ‘simpler’, too. Does it also become recognizable?
• What if the given dca$_2$ is also promised to be bounded? And terminating, too? And to also obey a threshold? As the promise gets stronger, checking emptiness again becomes ‘simpler’. But does it become recognizable?

So, the problem that we want to define is the following: “given a description of a dca$_2$ that is promised to be bounded and terminating and to obey a threshold, check that its language is empty.” Formally, $E = (E_{\mathrm{yes}}, E_{\mathrm{no}})$ with

$$E_{\mathrm{yes}} := \{z \in \langle \mathrm{dca}_2^* \rangle \mid (z) = \emptyset\}, \qquad E_{\mathrm{no}} := \{z \in \langle \mathrm{dca}_2^* \rangle \mid (z) \ne \emptyset\}.$$

Here, we use $\langle \mathrm{dca}_2^* \rangle$ to denote the set of descriptions (under a fixed encoding) of all terminating, bounded dca$_2$s that obey a threshold, whereas $(z)$ stands for the language of the machine described by $z$. Interestingly, although not surprisingly, even for such a weak automaton and under such a strong promise, emptiness remains unrecognizable:^7

2. Lemma. $E$ is unrecognizable.

We use this fact in the next section, but defer proving it until Section 5. In between, Section 4 discusses the capabilities of multi-counter automata.

^6 Note a difference from [26], where the upper bound is applied to all counters.

^7 Clearly $E \in \Pi_1$, and the proof of Lemma 2 will show that $E$ is $\Pi_1$-complete. Recall that, under no promise and after non-trivially modifying the definition of dca$_2$s, $E$ is the emptiness problem for 2-register machines, which is well-known to be $\Pi_1$-complete [38].

3. The Main Theorem

We are now ready to state and prove the main theorem.

3. Theorem. Let $D$, $E$ be two comparable descriptional systems that satisfy conditions H1–H3 of Lemma 1. If they also satisfy the following:
C1. there exists a name $e_0 \in E$ that has no equivalent in $D$,
C2. given a description $z$ of a terminating, bounded dca$_2$ that obeys a threshold, we can effectively construct a name $e_z \in E$ such that

(9)   $(e_z) = (e_0) \cup \{w \in \Sigma^* \mid |w| \in (z)\}$,

C3. every co-finite language has a name in $D$ that maps to it,
then the trade-off from $E$ to $D$ is non-recursive.

Before proving the theorem, note how mild conditions C1–C3 really are. Since every co-finite language is regular, C3 is trivially satisfied whenever the names in $D$ describe machines that have some kind of finite control. The second condition essentially says that the machines described by $E$ have enough resources to simulate a bounded dca$_2$. Because then, given $z$, we can always construct the description $e_z$ of the following $E$-machine:

on input $w \in \Sigma^*$: first simulate on $|w|$ the dca$_2$ described by $z$; if this accepts, then halt and accept; otherwise, simulate on $w$ the machine described by $e_0$ and accept, reject, or loop accordingly,

which obviously satisfies (9) (note the importance of the promise that $z$ describes a dca$_2$ that never loops). Given how weak bounded dca$_2$s are, most two-way machines with non-regular capabilities will easily meet C2.
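The E-machine described for condition C2 is just a sequential composition, which the following Python sketch makes explicit. It is an illustration under stated assumptions: machines are modelled as total Python predicates (so the possibility that the machine of $e_0$ loops is not represented), and `build_ez`, `dca_accepts` and `e0_accepts` are hypothetical names.

```python
from typing import Callable


def build_ez(dca_accepts: Callable[[int], bool],
             e0_accepts: Callable[[str], bool]) -> Callable[[str], bool]:
    """Return a predicate for (e_z) = (e_0) united with {w : |w| in (z)}."""
    def ez(w: str) -> bool:
        # First simulate the (terminating) dca_2 on the length of the input.
        if dca_accepts(len(w)):
            return True
        # Otherwise behave exactly like the machine described by e_0.
        return e0_accepts(w)
    return ez


def is_palindrome(w: str) -> bool:
    """Stand-in for the language of e_0 (a hypothetical choice)."""
    return w == w[::-1]


# Case 1: the dca_2 has empty language, so e_z is equivalent to e_0.
ez_empty = build_ez(lambda n: False, is_palindrome)
# Case 2: the dca_2 obeys the threshold 3, so (e_z) is co-finite.
ez_cofinite = build_ez(lambda n: n >= 3, is_palindrome)
```

The two cases mirror the dichotomy that the threshold promise provides: either the language of $e_z$ is that of $e_0$, or it contains every sufficiently long string.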

The important condition is C1, which requires that the machines described by $D$ are not as powerful as those described by $E$; in other words, a separation is needed between the complexity classes that correspond to the two systems.

Proof. We essentially repeat Hartmanis’ argument from [17, Example 4] (see also [31, Theorem 7]). Suppose $D$, $E$ are as in the statement of the theorem. Since H1–H3 are satisfied, Lemma 1 implies that we only need to prove that the inadequacy problem $I$ from $E$ to $D$ is unrecognizable. By Lemma 2, we just need to reduce the emptiness problem $E$ to it: $E \le I$. Given a $z \in \langle \mathrm{dca}_2^* \rangle$, we simply construct the name $e_z \in E$ guaranteed by conditions C1 and C2, so that $(e_z) = (e_0) \cup \{w \in \Sigma^* \mid |w| \in (z)\}$. If $z \in E_{\mathrm{yes}}$, then the language of $z$ is empty, so that $(e_z) = (e_0)$ and $e_z$ has no $D$-equivalent (because $e_0$ does not); hence $e_z \in I_{\mathrm{yes}}$. On the other hand, if $z \in E_{\mathrm{no}}$, then the language of $z$ is non-empty and, by the threshold promise, contains all sufficiently large $n$; therefore $(e_z)$ contains all sufficiently long $w \in \Sigma^*$, so that $(e_z)$ is co-finite and has $D$-equivalents (by C3); hence $e_z \in I_{\mathrm{no}}$. This concludes the proof.

As a side remark, we note that the proof has shown a slightly stronger fact: problem $I$ remains unrecognizable even under the promise that the given $e \in E$ either has no $D$-equivalent or its language is co-finite. In addition, the promise that the given dca$_2$ obeys a threshold can be slightly relaxed: we only need to know that its language is either empty or co-finite.

4. Programming Counters

In order to present the capabilities of multi-counter automata, we introduce some ‘program’ notation. First, the two atomic operations, the attempt to decrease a counter $X$ and the attempt to increase it, are denoted respectively by

$$X \xleftarrow{f} X - 1 \qquad\text{and}\qquad X \xleftarrow{f} X + 1,$$

where, in each case, flag $f$ is set to true iff the attempt succeeds. Then, the compound operation of setting $X$ to 0, denoted by $X \leftarrow 0$, can be described by

(10)   repeat $X \xleftarrow{f} X - 1$ until $\neg f$.

If a second counter $Y$ is present, we can transfer the contents of $Y$ into $X$: we set $X$ to 0, then repeatedly decrease $Y$ and increase $X$ until $Y$ is 0. We denote this by

$$(X, Y) \xleftarrow{f} (Y, 0),$$

and describe it by a line similar to (10). Note that, if $X$ is the primary counter and $Y > n$, then one of the attempts to increase $X$ will fail; in that case, we restore the original value of $Y$, returning $X$ to 0, and set flag $f$ to false. So, $X$’s original contents are always lost, but this never happens to the original contents of $Y$.

Changing how fast $X$ increases as $Y$ decreases, we can multiply/divide $Y$ into $X$ by any constant $a \in \mathbb{N}$. We denote these operations by

$$(X, Y) \xleftarrow{f} (aY, 0) \qquad\text{and}\qquad (X, Y) \xleftarrow{f,r} (\lfloor Y/a \rfloor, 0),$$

where, in the second operation, we also find the remainder and return it in $r$. As before, if $X$ is the primary counter and $aY > n$ (respectively, $\lfloor Y/a \rfloor > n$) then one of the attempts to increase $X$ will fail; we then restore the original value of $Y$, returning $X$ to 0, and set flag $f$ to false.

At a higher level, we can try to multiply $Y$ by a constant $a$ (into $Y$) using $X$ as an auxiliary counter and making sure $Y$ changes only if the operation succeeds:

$$(X, Y) \xleftarrow{f} (aY, 0); \quad \text{if } f \text{ then } (Y, X) \xleftarrow{t} (X, 0).$$
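The operations introduced so far can be sketched in Python using nothing but the two atomic attempts and a bounded amount of bookkeeping, as a finite control would. This is a hedged illustration: the class `Counters` and its method names are hypothetical, and the restore-on-failure loop is one possible way to realize the behavior described above.

```python
class Counters:
    """Toy model: named counters, one of which (the primary) is bounded by n."""

    def __init__(self, n: int, names: str = "XY", primary: str = "X") -> None:
        self.n = n
        self.primary = primary
        self.val = {name: 0 for name in names}

    def dec(self, c: str) -> bool:
        """Atomic attempt to decrease c; fails iff c already contains 0."""
        if self.val[c] == 0:
            return False
        self.val[c] -= 1
        return True

    def inc(self, c: str) -> bool:
        """Atomic attempt to increase c; fails iff c is primary and holds n."""
        if c == self.primary and self.val[c] == self.n:
            return False
        self.val[c] += 1
        return True

    def zero(self, c: str) -> None:
        """Operation (10): repeat the decrement until it fails."""
        while self.dec(c):
            pass

    def times_into(self, x: str, y: str, a: int = 1) -> bool:
        """(x, y) <- (a*y, 0); with a = 1 this is the plain transfer.

        On failure, y gets its original value back and x is left at 0.
        """
        self.zero(x)
        while self.dec(y):
            for done in range(a):
                if not self.inc(x):
                    for _ in range(done):      # undo the partial unit
                        self.dec(x)
                    self.inc(y)
                    while self.dec(x):         # undo the complete units
                        for _ in range(a - 1):
                            self.dec(x)
                        self.inc(y)
                    return False
        return True

    def times_in_place(self, y: str, a: int, aux: str) -> bool:
        """y <- a*y using aux; y changes only if the operation succeeds."""
        if not self.times_into(aux, y, a):
            return False
        self.times_into(y, aux, 1)             # guaranteed to succeed
        return True
```

For example, with `c = Counters(n=10)` and `Y` holding 3, a call `c.times_in_place("Y", 3, "X")` succeeds and leaves 9 in `Y`, whereas a second call fails (27 does not fit in the primary counter) and leaves `Y` at 9.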

Note the use of $t$ in the place of a flag, indicating that the action is guaranteed to be successful. Division (with remainder) can be performed in a similar manner. We denote the two operations by

(11)   $Y \xleftarrow{f,X} aY$   and   $Y \xleftarrow{f,r,X} \lfloor Y/a \rfloor$.

Now, if $X$ is primary, we can set $Y$ to the largest power of $a$ that fits in $n$:

(12)   $X \leftarrow 0$; $X \xleftarrow{f} X + 1$; if $f$ then $\{Y \leftarrow 0$; $Y \xleftarrow{t} Y + 1$; repeat $Y \xleftarrow{g,X} aY$ until $\neg g\}$,

an operation that fails iff $n = 0$. We denote it by

$$Y \xleftarrow{f,X} a^{\lg_a n},$$

where, as already mentioned, $\lg_a n := \lfloor \log_a n \rfloor$. If a third counter $Z$ is present, we can modify (12) to also count (in $Z$) the number of iterations performed. This gives us a way to calculate $\lg_a n$:

$$Z \xleftarrow{f,X,Y} \lg_a n,$$

an operation that fails iff $n = 0$. In another variation, we can modify the multiplication in (11) so that the success of the operation depends on the contents of $X$ (as opposed to its upper bound $n$):

$$Y \xleftarrow{f,X,Z} aY,$$

meaning that, using $Z$ as auxiliary and without affecting $X$: if $aY \le X$, then $Y$ is set to $aY$; otherwise, $Y$ is unaffected. Specifically, to implement this, we first set $Z$ to 0. Then, we repeatedly decrease $Y$, increase $Z$, and

decrease $X$ by $a$. If $X$ becomes 0 before $Y$, then $aY > X$ and the operation should fail: we restore the original values of $Y$ and $X$ by repeatedly decreasing $Z$, increasing $Y$, and increasing $X$ by $a$, until $Z$ becomes 0. Otherwise, $aY \le X$ and the operation will succeed: we copy the correct value to $Y$ and restore the value of $X$ by repeatedly decreasing $Z$ and increasing each of $Y$, $X$ by $a$, until $Z$ becomes 0. Note that if originally $Y, Z \le X$, then at no point during the operation does any of the counters assume a value greater than the original value of $X$.

Using this last operation, we can program the following variant of (12):

$$X \xleftarrow{f} X - 1; \ \text{if } f \text{ then } \{X \xleftarrow{t} X + 1;\ Y \leftarrow 0;\ Y \xleftarrow{t} Y + 1;\ \text{repeat } Y \xleftarrow{g,X,Z} aY \text{ until } \neg g\},$$

which implements the attempt to set $Y$ to the largest power of $a$ that is at most $X$, using $Z$ as auxiliary and leaving $X$ unaffected (and failing iff $X$ is 0). We denote this operation by

$$Y \xleftarrow{f,Z} a^{\lg_a X}.$$

It is important to note that, by the remark at the end of the previous paragraph, if originally $Y, Z \le X$, then during this operation no counter ever assumes a value greater than the original value of $X$.

Hopefully, the reader is convinced of the quite significant capabilities of dca$_k$s that have 2 or more counters. We will be using these capabilities in the next section.

5. Proof of the Main Lemma

We now prove that $E$ is unrecognizable. We do this by presenting a reduction from the complement of the halting problem: $\overline{\mathrm{HALTING}} \le E$, where

$$\overline{\mathrm{HALTING}} := \{z \in \{0, 1\}^* \mid z \text{ encodes a dtm that loops on } z\}$$

is well-known to be unrecognizable [21]. That is, we present an algorithm that, given a description $z$ of a dtm $M$, returns a description $z'$ of a terminating, bounded, threshold-obeying dca$_2$ $M'$ such that

(13)   $M$ loops on $z \implies (z') = \emptyset$   and   $M$ halts on $z \implies (z') \ne \emptyset$.

In describing this algorithm, we will be calling a machine (dtm or dca$_k$) good, if it is terminating, bounded (for dca$_k$s), obeys a threshold, and its language satisfies (13) when it replaces $(z')$. Thus, e.g., $M'$ will be good. On its way to $z'$, the algorithm will construct descriptions of two other machines: a description $z_A$ of a dtm $A$, and a description $z_B$ of a dca$_3$ $B$. In the sequence $M$, $A$, $B$, $M'$ each machine after $M$ will be defined

in terms of the previous one and will be good. Our constructions use the ideas of [61] and [38] (also found in [39, 21]).

5.1. The first machine. $A$ is a dtm with one tape, infinite in both directions; the tape alphabet is $\{\sqcup, 0, 1, \dot{0}, \dot{1}\}$, while the input alphabet is $\{0\}$. On input $0^n$, $A$ starts with tape contents

$$\cdots \sqcup \sqcup \sqcup \underbrace{0 0 0 \cdots 0 0}_{n \text{ times}} \sqcup \sqcup \sqcup \cdots$$

and its input head on the $\sqcup$ next to the leftmost 0 (or on any $\sqcup$, if $n = 0$). It then computes as follows:
1. For all $w \in \{0, 1\}^n$, from $0^n$ up to $1^n$:
   if $w$ encodes a halting computation history of $M$ on $z$, accept.
2. Reject.

The check inside the loop presupposes some fixed reasonable encoding of sequences of configurations of $M$ into binary strings, with the additional property that if $w$ encodes a computation history, then every string of the form $w0^*$ encodes the same computation history. Note that, using the extra dotted symbols, $A$ can easily perform this check without ever writing a non-blank symbol on a $\sqcup$, or a $\sqcup$ on a non-blank symbol; and without ever visiting any $\sqcup$ that lies beyond the two that originally delimit the input. As a consequence, throughout its computation on $0^n$, $A$ keeps exactly $n$ non-blank symbols on its tape, occupying the same $n$ cells as the symbols of the input. Also note that, by the selection of the encoding scheme for $M$’s computation histories, if $A$ accepts an input $0^n$, it necessarily accepts all longer inputs, as well.

5.2. The second machine. $B$ is a dca$_3$ that, on input $n \ge 30$, simulates the behavior of $A$ on input $0^{\lg_5 \lg_{30} n}$; on input $n < 30$, $B$ just rejects. Note the strange length $\lg_5 \lg_{30} n$. This is chosen as a function of $n$ that is (i) computable by a dca$_3$ and (ii) increasing, but also (iii) small enough. Goodness of $B$ rests on (ii), whereas (iii) facilitates the simulation performed by $M'$ in the next section.

To explain $B$’s behavior, let $J$, $L$, $R$ be its three counters. $J$ is primary and helps performing operations on $L$ and $R$, while $L$ and $R$ together encode tape configurations of $A$. To see the encoding, consider the following example of a configuration:

$$\begin{array}{cccccccccccccc}
\cdots & l_4 & l_3 & l_2 & l_1 & l_0 & h & r_0 & r_1 & r_2 & r_3 & r_4 & r_5 & \cdots \\
\cdots & \sqcup & \sqcup & \times & \times & \times & \times & \times & \times & \times & \times & \sqcup & \sqcup & \cdots \\
 & & & & & & \uparrow & & & & & & &
\end{array}$$

where $\times$ stands for any non-blank symbol, and $\uparrow$ shows the head position. Mapping the symbols $\sqcup$, $\dot{1}$, $\dot{0}$, $1$ and $0$ to the numbers 0, 1, 2, 3 and 4, respectively (in fact, any mapping that maps symbol $\sqcup$ to code 0 and symbol 0 to code 4 will do), we get each tape cell to map to a digit of the 5-ary numbering system. Then, the head position splits the tape into three portions, which define the integers

$$l = \sum_{i=0}^{\infty} l_i \cdot 5^i \qquad\text{and}\qquad h \qquad\text{and}\qquad r = \sum_{i=0}^{\infty} r_i \cdot 5^i,$$

where the two sums are finite, exactly because $\sqcup$ maps to 0. Of these three values, $l$ and $r$ are kept respectively in $L$ and $R$, while $h$ is kept in a register $H$ in $B$’s finite memory.

More specifically, on input $n$, $B$ starts with a two-part initialization. First, it computes $\lg_{30} n$ into $J$, leaving 0s in $L$ and $R$ (this is only if $n \ge 1$; if $n = 0$, then $B$ rejects):

$$R \xleftarrow{f,J,L} \lg_{30} n; \quad \text{if } \neg f \text{ then reject else } \{(J, R) \xleftarrow{t} (R, 0);\ L \leftarrow 0\}.$$

Then, it computes into $R$ the value $5^m - 1$, where $m := \lg_5 \lg_{30} n$, leaving 0s in $L$ and $H$ (this is only if $J \ge 1$, that is if $n \ge 30$; otherwise, $n < 30$ and $B$ rejects):

$$R \xleftarrow{f,L} 5^{\lg_5 J}; \quad \text{if } \neg f \text{ then reject else } \{R \xleftarrow{t} R - 1;\ L \leftarrow 0;\ H \leftarrow 0\}.$$

This completes the initialization, with $L = H = 0$ and $R = 5^m - 1$. Hence, the 5-ary representations of the three counters are

$$L = 0 \qquad\text{and}\qquad H = 0 \qquad\text{and}\qquad R = \underbrace{4 4 4 \cdots 4}_{m \text{ times}},$$

which means that they correctly represent $A$’s starting tape configuration on input $0^m$ (recall that symbol 0 maps to code 4):

$$\begin{array}{ccccccccccc}
\cdots & l_1 & l_0 & h & r_0 & r_1 & r_2 & \cdots & r_{m-1} & r_m & r_{m+1} & \cdots \\
\cdots & \sqcup & \sqcup & \sqcup & 0 & 0 & 0 & \cdots & 0 & \sqcup & \sqcup & \cdots \\
 & & & \uparrow & & & & & & & &
\end{array}$$

At this point, $B$ is ready to start a faithful step-by-step simulation of $A$. The automaton remembers in its finite memory the current state of $A$ as well as the code of the currently read symbol (in $H$). If $s$ is the code of the new symbol to be written on the tape, $B$ computes

$$L \xleftarrow{t,J} 5L; \quad \text{repeat } s \text{ times: } L \xleftarrow{t} L + 1; \quad R \xleftarrow{t,r_0,J} \lfloor R/5 \rfloor; \quad H \leftarrow r_0$$

to simulate writing $s$ and moving to the right; similarly, it computes

$$R \xleftarrow{t,J} 5R; \quad \text{repeat } s \text{ times: } R \xleftarrow{t} R + 1; \quad L \xleftarrow{t,l_0,J} \lfloor L/5 \rfloor; \quad H \leftarrow l_0$$

to simulate writing $s$ and moving to the left.

It is important to note the range of the values assumed by the counters. By the design of its main operation, the second part of the initialization phase never assigns to a counter a value greater than the original value of $J$, which is $\lg_{30} n$. Then, in the simulation phase, the behavior of $A$ (the

tape starts with m 0s and always contains exactly m non-blank symbols) and the selection of the symbol codes (0 gets the largest code) are such that the initial value 5m − 1 of R upper bounds all possible values that may appear in B’s counters. One consequence of this is that all operations in the previous paragraph are guaranteed to be successful (hence the t reminder). Another consequence is that, since 5m − 1 < lg30 n, the entire computation of B after the first part of its initialization phase keeps all values of all counters at or below lg30 n. This will prove crucial in the next section. 5.3. The final machine. M ′ is a dca2 that simulates the behavior of B. If U , V are its two counters, then U is primary and helps performing operations on V , while V encodes the contents of the counters of B: whenever J, L, R, contain j, l, r respectively, V contains 2j 3l 5r . The automaton starts by computing into V the product 30t = 2t 3t 5t , where t := lg30 n (this is only if n ≥ 1; if n = 0, then M ′ rejects, exactly as B would do): f,U

V ←− 30lg30 n ; if ¬f then reject. It then removes all 3s and 5s of this product, so that V becomes 2lg30 n 30 50 . Specifically, in order to remove all 3s, M ′ divides V by 3 repeatedly: t,r,U V ←− V3 ,

until a non-zero remainder $r$ is returned, which implies there were no 3s in $V$ before the last division. Then the correction

\[ V \xleftarrow{t,U} 3V; \quad \text{repeat } r \text{ times: } V \xleftarrow{t} V + 1 \]

undoes the damage caused by the last division. After this, $M'$ performs a similar computation to remove from $V$ all 5s.

At this point, the value $2^{\lg_{30} n} 3^0 5^0$ correctly encodes the values of the counters of $B$ right after the first part of its initialization phase and $M'$ is ready to start a faithful step-by-step simulation of $B$. The current state of $B$ is stored in the finite memory of $M'$. Whenever $B$ tries to decrease $J$,

\[ J \xleftarrow{f} J - 1, \]

$M'$ divides $V$ by 2. If this division returns no remainder, then it has simulated a successful decrement; otherwise, the simulated attempt has failed, and $M'$ restores the initial value of $V$:

\[ V \xleftarrow{t,r,U} \frac{V}{2}; \quad \text{if } r = 0 \text{ then } f \leftarrow \text{true else } \{ f \leftarrow \text{false};\ V \xleftarrow{t,U} 2V;\ \text{repeat } r \text{ times: } V \xleftarrow{t} V + 1 \}. \]

The attempts of $B$ to decrease $L$ or $R$ are implemented similarly, but with 3 or 5 in the place of 2.
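The encoding of three counters into one and the divide-then-restore steps described above can be illustrated with a small sketch. The Python below is only an illustration under simplifying assumptions, not the construction itself: it uses ordinary unbounded integers, so it models neither the helper counter $U$ nor the upper bound $n$ on $V$, and all function names (`encode`, `undo_division`, `strip_factor`, `try_decrement`) are hypothetical names introduced here.

```python
def encode(j, l, r):
    """Pack three counter values into one integer as 2^j * 3^l * 5^r."""
    return 2**j * 3**l * 5**r


def undo_division(q, r, p):
    """Correction step: multiply the quotient back by p, then add 1 exactly r times."""
    v = p * q
    for _ in range(r):
        v += 1
    return v


def strip_factor(v, p):
    """Remove every factor p from v: divide until a non-zero remainder
    appears, then undo that last division. Assumes v >= 1."""
    assert v >= 1, "sketch assumes a positive value (the text handles n = 0 separately)"
    while True:
        q, r = divmod(v, p)
        if r != 0:
            return undo_division(q, r, p)
        v = q


def try_decrement(v, p):
    """Simulate an attempt to decrease the counter encoded by prime p.
    Returns (new_v, f), where f tells whether the decrement succeeded."""
    q, r = divmod(v, p)
    if r == 0:
        return q, True                      # exponent of p was positive
    return undo_division(q, r, p), False    # exponent was 0: restore v
```

For instance, with $t = 3$, calling `strip_factor(strip_factor(30**3, 3), 5)` leaves $8 = 2^3 3^0 5^0$, and `try_decrement(8, 3)` reports failure while leaving the value unchanged, mirroring a failed attempt to decrease $L$ when it holds 0.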


Attempts of $B$ to increase its counters are of course simulated by appropriate multiplications of $V$. The only subtlety involves failure during increment attempts. To be faithful, the simulation must ensure that

an attempt of $B$ to increase a counter fails iff the corresponding attempt of $M'$ to multiply $V$ fails.

How is this condition satisfied? If it is, this does not happen in some obvious way. In $B$ the upper bound for $J$ is always the same (the input of $B$), whereas in $M'$ the upper bound for its representation is the base 2 logarithm of a value that depends (on the input of $M'$ and) on the values of the other two counters. Similarly, in $B$ counter $L$ is unrestricted, whereas in $M'$ its representation is bounded by a value that depends on the other two counters, and the same is true for $R$.

The crucial observation (from the previous section) is that, since we are after the first part of $B$'s initialization phase, no counter of $B$ ever assumes a value greater than $t = \lg_{30} n$. This immediately implies that, after the initialization phase of $M'$, no counter of $M'$ ever assumes a value greater than $2^t 3^t 5^t$. Now, since $t < n$ and $2^t 3^t 5^t \leq n$, we conclude that both
• all increment attempts of $B$ are successful, and
• all corresponding multiplication attempts of $M'$ are successful, as well.
Hence, the equivalence above is satisfied vacuously. Put another way, when $M'$ multiplies $V$ to simulate a counter increment in $B$, it knows in advance that this increment does not fail and therefore that the multiplication will not fail, either. Overall, $B$'s atomic operation

\[ J \xleftarrow{t} J + 1 \]

is simulated by

\[ V \xleftarrow{t,U} 2V, \]

and similarly for $L$ and $R$.

As a final remark, we note the following immediate by-product of our last argument: Since $V$ clearly never exceeds $n$ during the initialization phase of $M'$ and it also never exceeds $n$ during the simulation of $B$, it follows that $M'$ is bounded.

This concludes the definitions of all three machines in our reduction. It should be clear that $M'$ is good and that a description $z'$ of it can be effectively computed out of $z$.

6. Conclusion

Using old ideas [61, 38], we showed the unrecognizability of the emptiness problem for dca$_2$s that are promised to be bounded, always terminate, and obey a threshold. We then combined this with the idea of [19] to show that, if machines a have the resources to simulate dca$_2$s of the particular kind and can also solve problems that machines b cannot, then typically the trade-off from a to b is non-recursive. Applying the theorem, we derived such trade-offs in many conversions.


We do not know if the emptiness problem of Section 2.2 remains unrecognizable even when the underlying machine is a 2-register automaton [38] (that is, a dca$_2$ that starts with $n$ in its primary counter and where increments of that counter never fail). If it does, then our main theorem can be made slightly stronger.

A preliminary and more concrete version of the contents of this chapter can be found in [26]. An improved but more abstract version appeared in [28].

End Note

I would like to thank my research advisor, Michael Sipser, for suggesting to me the 2d vs. 2n problem and for being a constant source of encouragement and ideas as I was working on it during the past five years. I learned a lot from him about computation and (more generally, and perhaps more importantly) about how to think and how to explain. I enjoyed the kindness and humanity of his personality and I particularly admire his ability to just take a few silent seconds and then return with the most valuable advice for whatever problem needs to be solved.

I would also like to thank Albert Meyer, from whom I also learned a lot over the past years as a teaching assistant for his class on discrete mathematics. I really enjoyed the honesty of his character and I cannot but admire his eagerness and ability to talk and think efficiently through almost any kind of problem.

This is probably a good place to also thank the people of MPLA in Greece, where my graduate studies actually began. When I moved to Athens in 1997, it was not certain that the university was indeed going to be the place where I would be burning my energy. If it turned out this way, it is mainly due to the quality of the people and the academic program at MPLA. I am particularly grateful to Yiannis Moschovakis for encouraging me to continue my studies abroad.

On my way out of Greece, I also had the honor to meet General Leftheris and Mrs. Roula Kanellakis. Like so many other students, I am proud to have been a Paris Kanellakis Fellow and I have often drawn inspiration from Paris’s academic conduct and achievements.

Finally, many thanks are due to my uncle Demos Fokas. When I first came to Cambridge, he had already been around for a while and he helped me a lot with the transition. Most importantly, having been through graduate school himself, Demos knew very well what this process is all about and on many occasions used his experience to provide me with valuable advice and inspiration. He is now in the ‘southern provinces’ already, and this is exactly where I am also heading.


Bibliography

[1] Bruce H. Barnes. A two-way automaton with fewer states than any equivalent one-way automaton. IEEE Transactions on Computers, C-20(4):474–475, 1971.
[2] Piotr Berman. A note on sweeping automata. In Proceedings of the International Colloquium on Automata, Languages, and Programming, pages 91–97, 1980.
[3] Piotr Berman and Andrzej Lingas. On complexity of regular languages in terms of finite automata. Report 304, Institute of Computer Science, Polish Academy of Sciences, Warsaw, 1977.
[4] Jean-Camille Birget. Two-way automata and length-preserving homomorphisms. Report 109, Department of Computer Science, University of Nebraska, 1990.
[5] Jean-Camille Birget. Positional simulation of two-way automata: proof of a conjecture of R. Kannan and generalizations. Journal of Computer and System Sciences, 45:154–179, 1992.
[6] Jean-Camille Birget. State-complexity of finite-state devices, state compressibility and incompressibility. Mathematical Systems Theory, 26:237–269, 1993.
[7] Marek Chrobak. Finite automata and unary languages. Theoretical Computer Science, 47:149–158, 1986.
[8] David Damanik. Finite automata with restricted two-way motion. Master’s thesis, J. W. Goethe-Universität Frankfurt, 1996. In German.
[9] Pavol Ďuriš and Zvi Galil. Fooling a two-way automaton or one pushdown store is better than one counter for two-way machines. Theoretical Computer Science, 21:39–53, 1982.
[10] Roger B. Eggleton and Richard K. Guy. Catalan strikes again! How likely is a function to be convex? Mathematics Magazine, 61(4):211–219, 1988.
[11] Michael J. Fischer, Albert R. Meyer, and Arnold L. Rosenberg. Counter machines and counter languages. Mathematical Systems Theory, 3:265–283, 1968.
[12] Martin Gardner. Catalan numbers: an integer sequence that materializes in unexpected places. Scientific American, 234(6):120–125, June 1976.
[13] Viliam Geffert, Carlo Mereghetti, and Giovanni Pighizzini. Converting two-way nondeterministic unary automata into simpler automata. Theoretical Computer Science, 295:189–203, 2003.
[14] Viliam Geffert, Carlo Mereghetti, and Giovanni Pighizzini. Complementing two-way finite automata. In Proceedings of the International Conference on Developments in Language Theory, pages 260–271, 2005.
[15] Jonathan Goldstine, Martin Kappes, Chandra M. R. Kintala, Hing Leung, Andreas Malcher, and Detlef Wotschke. Descriptional complexity of machines with limited resources. Journal of Universal Computer Science, 8(2):193–234, 2002.
[16] Juris Hartmanis. On the succinctness of different representations of languages. SIAM Journal on Computing, 9(1):114–120, 1980.
[17] Juris Hartmanis. On Gödel speed-up and succinctness of language representations. Theoretical Computer Science, 26:335–342, 1983.


[18] Juris Hartmanis. On the importance of being $\Pi_2$-hard. Bulletin of the EATCS, 37:117–127, 1989.
[19] Juris Hartmanis and Theodore P. Baker. Relative succinctness of representations of languages and separation of complexity classes. In Proceedings of the International Symposium on Mathematical Foundations of Computer Science, pages 70–88, 1979.
[20] John E. Hopcroft and Jeffrey D. Ullman. Formal languages and their relation to automata. Addison-Wesley, Reading, MA, 1969.
[21] John E. Hopcroft and Jeffrey D. Ullman. Introduction to automata theory, languages, and computation. Addison-Wesley, Reading, MA, 1979.
[22] Juraj Hromkovič and Georg Schnitger. Nondeterminism versus determinism for two-way finite automata: generalizations of Sipser’s separation. In Proceedings of the International Colloquium on Automata, Languages, and Programming, pages 439–451, 2003.
[23] Oscar H. Ibarra. On two-way multihead automata. Journal of Computer and System Sciences, 7:28–36, 1973.
[24] Neil Immerman. Nondeterministic space is closed under complementation. SIAM Journal on Computing, 17(5):935–938, 1988.
[25] Ravi Kannan. Alternation and the power of nondeterminism. In Proceedings of the Symposium on the Theory of Computing, pages 344–346, 1983.
[26] Christos Kapoutsis. From k + 1 to k heads the descriptive trade-off is non-recursive. In Proceedings of the Workshop on Descriptional Complexity of Formal Systems, pages 213–224, 2004.
[27] Christos Kapoutsis. Deterministic moles cannot solve liveness. In Proceedings of the Workshop on Descriptional Complexity of Formal Systems, pages 194–205, 2005.
[28] Christos Kapoutsis. Non-recursive trade-offs for two-way machines. International Journal of Foundations of Computer Science, 16:943–956, 2005.
[29] Christos Kapoutsis. Removing bidirectionality from nondeterministic finite automata. In Proceedings of the International Symposium on Mathematical Foundations of Computer Science, pages 544–555, 2005.
[30] Christos Kapoutsis. Small sweeping 2NFAs are not closed under complement. In Proceedings of the International Colloquium on Automata, Languages, and Programming, pages 144–156, 2006.
[31] Martin Kutrib. On the descriptional power of heads, counters, and pebbles. In Proceedings of the Workshop on Descriptional Complexity of Formal Systems, pages 138–149, 2003.
[32] Martin Kutrib. The phenomenon of non-recursive trade-offs. In Proceedings of the Workshop on Descriptional Complexity of Formal Systems, pages 83–97, 2004.
[33] Hing Leung. Separating exponentially ambiguous finite automata from polynomially ambiguous finite automata. SIAM Journal on Computing, 27(4):1073–1082, 1998.
[34] Hing Leung. Tight lower bounds on the size of sweeping automata. Journal of Computer and System Sciences, 63(3):384–393, 2001.
[35] Albert R. Meyer and Michael J. Fischer. Economy of description by automata, grammars, and formal systems. In Proceedings of the Symposium on Switching and Automata Theory, pages 188–191, 1971.
[36] Silvio Micali. Two-way deterministic finite automata are exponentially more succinct than sweeping automata. Information Processing Letters, 12(2):103–105, 1981.
[37] Pascal Michel. An NP-complete language accepted in linear time by a one-tape Turing machine. Theoretical Computer Science, 85(1):205–212, 1991.
[38] Marvin L. Minsky. Recursive unsolvability of Post’s problem of “tag” and other topics in theory of Turing machines. Annals of Mathematics, 74(3):437–455, 1961.
[39] Marvin L. Minsky. Computation: finite and infinite machines. Prentice-Hall, Englewood Cliffs, NJ, 1967.


[40] Burkhard Monien. Transformational methods and their application to complexity problems. Acta Informatica, 6:95–108, 1976.
[41] Burkhard Monien. Corrigenda: Transformational methods and their application to complexity problems. Acta Informatica, 8:383–384, 1977.
[42] Burkhard Monien. Two-way multihead automata over a one-letter alphabet. RAIRO Informatique Théorique/Theoretical Informatics, 14(1):67–82, 1980.
[43] Frank R. Moore. On the bounds for state-set size in the proofs of equivalence between deterministic, nondeterministic, and two-way finite automata. IEEE Transactions on Computers, 20(10):1211–1214, 1971.
[44] G. Ott. On multipath automata I. Research report 69, SRRC, 1964.
[45] Michael O. Rabin. Two-way finite automata. In Proceedings of the Summer Institute of Symbolic Logic, pages 366–369, Cornell, 1957.
[46] Michael O. Rabin and Dana Scott. Remarks on finite automata. In Proceedings of the Summer Institute of Symbolic Logic, pages 106–112, Cornell, 1957.
[47] Michael O. Rabin and Dana Scott. Finite automata and their decision problems. IBM Journal of Research and Development, 3:114–125, 1959.
[48] William J. Sakoda and Michael Sipser. Nondeterminism and the size of two-way finite automata. In Proceedings of the Symposium on the Theory of Computing, pages 275–286, 1978.
[49] Walter J. Savitch. Relationships between nondeterministic and deterministic tape complexities. Journal of Computer and System Sciences, 4:177–192, 1970.
[50] Erik M. Schmidt and Thomas G. Szymanski. Succinctness of descriptions of unambiguous context-free languages. SIAM Journal on Computing, 6(3):547–553, 1977.
[51] Joel I. Seiferas. Manuscript communicated to Michael Sipser. October 1973.
[52] Joel I. Seiferas. Relating refined space complexity classes. Journal of Computer and System Sciences, 14(1):100–129, 1977.
[53] John C. Shepherdson. The reduction of two-way automata to one-way automata. IBM Journal of Research and Development, 3:198–200, 1959.
[54] Michael Sipser. Halting space-bounded computations. Theoretical Computer Science, 10:335–338, 1980.
[55] Michael Sipser. Lower bounds on the size of sweeping automata. Journal of Computer and System Sciences, 21(2):195–202, 1980.
[56] Richard E. Stearns. A regularity test for pushdown machines. Information and Control, 11:323–340, 1967.
[57] Ivan H. Sudborough. On tape-bounded complexity classes and multihead finite automata. Journal of Computer and System Sciences, 10(1):62–76, 1975.
[58] Róbert Szelepcsényi. The method of forced enumeration for nondeterministic automata. Acta Informatica, 26(3):279–284, 1988.
[59] Leslie G. Valiant. A note on the succinctness of descriptions of deterministic languages. Information and Control, 32:139–145, 1976.
[60] Moshe Y. Vardi. A note on the reduction of two-way automata to one-way automata. Information Processing Letters, 30:261–264, 1989.
[61] Hao Wang. A variant of Turing’s theory of computing machines. Journal of the ACM, 4(1):63–92, 1957.