Visualizing Why Nondeterministic Finite-State Automata Reject
Abstract.
Students find their first course in Formal Languages and Automata Theory challenging. In addition to the development of formal arguments, most students struggle to understand nondeterministic computation models. In part, the struggle stems from the course exposing them for the first time to nondeterminism. Often, students find it difficult to understand why a nondeterministic machine accepts or rejects a word. Furthermore, they may feel uncomfortable with there being multiple computations on the same input and with a machine not consuming all of its input. This article describes a visualization tool developed to help students understand nondeterministic behavior. The tool is integrated into FSM, a domain-specific language for the Automata Theory classroom. The strategy is based on the automatic generation of computation graphs given a machine and an input word. Unlike previous visualization tools, the computation graphs generated reflect the structure of the given machine's transition relation and not the structure of the computation tree.
1. Introduction
In every Formal Languages and Automata Theory course, students are asked to design nondeterministic finite-state automata (NDFA). More often than not, students find this very challenging for several reasons. The first is that they are inexperienced with nondeterministic design. Second, in all likelihood, students lack the benefit of a programming language in which to implement their machines and receive immediate feedback. Such feedback is essential for students to debug a machine much like they debug a program (Landa and Martinez-Trevino, 2019; Marwan et al., 2020; Venables and Haywood, 2003). Third, most machine visualization tools are ill-equipped to help students understand why a word is rejected (or accepted) by a nondeterministic automaton. In addition, students need help understanding how a machine may perform multiple computations on the same input and how, unlike deterministic machines, there can be computations that do not consume all of the input word.
In the age of domain-specific languages (Bettini, 2016; Kleppe, 2008; Felleisen et al., 2018; Walter et al., 2009), there is an opportunity to provide students with a programming language that makes it relatively easy to program NDFAs. To this end, FSM (Functional State Machines) has been developed. FSM is a domain-specific language for the Automata Theory and Formal Languages classroom embedded in Racket. Its types include NDFA. Using FSM, students may test and debug their machines before attempting a proof or submitting a machine for grading. Furthermore, students can implement the algorithms they develop as part of their constructive proofs. An important feature of FSM is that programmers are not burdened with implementing nondeterminism. Instead, they may use nondeterminism just like they use any feature in their favorite programming language. Students design, implement, validate, and prove algorithms correct assuming nondeterminism is a built-in language feature.
One of the challenges with implementing such a language is providing support to understand and debug a nondeterministic machine, addressing the problems outlined above. To understand the difficulties, it is important to recall when a nondeterministic machine accepts or rejects a given word. A nondeterministic machine accepts a given word if there is a computation that takes the machine from its starting configuration to an accepting configuration. For an NDFA, an accepting configuration is one in which there is no more input to consume and the machine's state is a final/accepting state. A word is rejected if there is no computation that takes the machine from its starting configuration to an accepting configuration. In other words, all possible computations must fail to end in an accepting configuration. If any possible computation ends in an accepting configuration then the machine accepts the word.
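The acceptance condition just described can be made concrete with a short executable sketch. The following Python function is an illustrative re-implementation, not FSM code; the encoding of transition rules as (source, read, destination) triples with "" marking an empty transition is an assumption for this example. It explores all computations and accepts exactly when some computation reaches a final state with the input exhausted:

```python
from collections import deque

def ndfa_accepts(rules, start, finals, word):
    """Accept iff SOME computation ends in a final state with no input left."""
    seen = set()
    todo = deque([(start, tuple(word))])   # configurations: (state, unconsumed input)
    while todo:
        state, ui = todo.popleft()
        if (state, ui) in seen:            # never re-explore a configuration
            continue
        seen.add((state, ui))
        if not ui and state in finals:     # accepting configuration reached
            return True
        for (src, read, dst) in rules:
            if src != state:
                continue
            if read == "":                 # empty transition: consume nothing
                todo.append((dst, ui))
            elif ui and read == ui[0]:     # consume the next input element
                todo.append((dst, ui[1:]))
    return False                           # all computations failed to accept
```

Because every computation is explored, the function returns False only when all possible computations fail to end in an accepting configuration, mirroring the rejection condition above.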
Providing support to demonstrate why a nondeterministic machine accepts a word, for example, is relatively easy. The programmer is provided with a trace of the configurations that take the machine from the starting state to a final state: C0 ⊢ C1 ⊢ … ⊢ Cn. In such a sequence, Ci ⊢ Ci+1 represents the application of a single rule, C0 is the starting configuration, and Cn is an accepting configuration. It is more challenging to provide support when a word is rejected, because for a nondeterministic machine we must show that all possible computations reject. The number of possible computations easily and quickly becomes unwieldy. Thus, providing a trace of all possible computations is of little use other than for the simplest nondeterministic machines. For this reason, until recently, FSM only informed the programmer that a word is rejected. This left students wondering why the machine they designed rejected. To address this shortcoming, FSM now provides the generation of computation graphs for NDFAs. A computation graph is a finite visualization to help explain nondeterministic behavior.
The article is organized as follows. Section 2 discusses and contrasts related work on visualization techniques for nondeterministic NDFAs. Section 3 presents a brief introduction to FSM. Section 4 presents and discusses computation graphs in FSM. Section 5 discusses how NDFA computation graphs are generated. Finally, Section 6 presents concluding remarks and directions for future work.
2. Related Work
In most Formal Languages and Automata Theory textbooks, computations performed by a nondeterministic machine are visualized as a computation tree (e.g., (Gurari, 1989; Hopcroft et al., 2006; Lewis and Papadimitriou, 1997; Linz, 2011; Martin, 2003; Rich, 2019; Sipser, 2013)). In such trees, a node represents a machine configuration and an edge represents a transition. An NDFA configuration is a state and the unconsumed input. Nondeterministic decisions are represented by a node having multiple children reached reading the same element or having a child with the same unconsumed input (i.e., a transition without consuming input). Computation trees provide a trace of every possible configuration a machine may reach as all possible computations progress. A configuration, therefore, may appear multiple times in a computation tree. As a consequence, there are two potential hazards for students. The first is that the tree may become unwieldy, making it difficult to understand. The second is that students are easily tempted to look for differences between two identical configurations simply because they occur in different parts of the tree.

Classical automated tools to visualize why an NDFA rejects a word rely on exhaustively tracing all possible configurations a machine may reach when processing a given word. That is, they trace every path in a computation tree. JFLAP (Rodger, 2006; Rodger et al., 2006) and OpenFLAP (Mohammed, 2020), for example, provide a trace of every possible computation. Figure 1 displays an NDFA trace after 6 steps in JFLAP. Although a great deal of information is conveyed, there are two salient features that may present obstacles for learning. The first is that it is difficult to determine how a machine reached any of the configurations. The second is that the number of configurations that must be displayed can grow unwieldy and, therefore, become difficult to mentally organize. Another visualization tool is jFAST (White and Way, 2006). Similar to JFLAP, jFAST also has users manually render graph-based NDFA representations. Users can execute a machine by providing an input word. Unlike JFLAP, jFAST does not provide an option to see the series of transitions made during machine execution nor a trace of said transitions.


Modern approaches to automatically visualize NDFA computations present users with a computation graph. Like a computation tree, a computation graph has nodes that represent configurations. In a computation graph, however, a configuration is represented by a single node. Thus, different computations may share nodes. An undeniable benefit of the approach is that the size of the visualization is tamed. Gidayu, for example, uses computation graphs to visualize NDFA computations (Cogumbreiro and Blike, 2022). In a Gidayu NDFA computation graph, each node is rendered using only the configuration's state. The unconsumed input is suppressed given that it may be reconstructed by tracing back to the starting configuration's node. Thus, a machine state may appear repeatedly when it is part of different machine configurations. For instance, consider the sample NDFA displayed in Figure 2(a) (Cogumbreiro and Blike, 2022, p. 111) used to describe Gidayu. The computation graph generated for bb is displayed in Figure 2(b) (Cogumbreiro and Blike, 2022, p. 112). Observe that it has three nodes labeled with the same state. These nodes represent, from left to right, the configurations whose unconsumed input is bb, b, and the empty word. Gidayu's computation graphs are useful for students trying to understand nondeterminism, because it is possible to determine how a configuration is reached and to determine if no accepting configuration is reached. A troublesome characteristic for some students is that a state may appear repeatedly in a computation graph. This is troublesome because students naturally associate a node with a state and not with a configuration. Another troublesome characteristic for some students is that a computation graph with nodes representing configurations does not derive its structure from the machine's state transition diagram. For instance, the computation graph in Figure 2(b) is not contained in the state transition diagram displayed in Figure 2(a).
For many students, therefore, it is not immediately obvious how a computation graph is related to the machine’s state transition diagram. Finally, there is no explicit indication that some computations do not consume the input word. For example, some students wonder what happens when the machine reaches the node representing (q bb) in Figure 2(b) (i.e., the leftmost occurrence of q). To the readers of this article, it is clear that the machine halts and rejects on that computation. When students are first exposed to nondeterministic behavior, however, it is helpful to have a computation graph that explicitly indicates that computations may end in a node without consuming all the input.
Like Gidayu, FSM uses computation graphs, albeit different, to visualize NDFA computations. In contrast to the approaches outlined above, the FSM visualization reflects the structure of the machine’s state transition diagram. That is, FSM computation graphs are rendered based on the states traversed, not the configurations reached, by any potential computation. In this manner, the size of the visualization is bounded by the number of states and the size of the transition relation. Other than the addition of a dead state, to illustrate that some computations do not consume the entire input word, a computation graph is always a subgraph of the machine’s state diagram. Thus, students can easily see the connection between the machine’s state diagram and the visualization provided by FSM. Unlike the related work discussed above, the goal is not to trace every possible computation based on the configurations traversed. Instead, students are provided with a graphic that summarizes the behavior of all possible computations. To make it easy to determine if a word is accepted or rejected, state coloring is used to highlight all the states reached at the end of every computation when the input word is consumed. If none of the colored nodes reached is a final state then the machine rejects. If at least one of the colored nodes reached is a final state then the machine accepts. Finally, to highlight that some computations do not consume all the input, a dashed edge is added to a fresh dead state from a node in which a computation halts without completely consuming the input word. In the dead state, the rest of the input is consumed. Therefore, in an FSM computation graph the entire input word is consumed by every computation and students’ concerns about not consuming the entire input are eased. 
Nonetheless, as their understanding matures, students are able to understand that dashed edges are not part of the machine and highlight states in which computations end without consuming the entire input word.
3. A Brief Introduction to FSM
3.1. Core Definitions
In FSM, a state is an uppercase Roman alphabet symbol and an input alphabet element is a lowercase Roman alphabet symbol or a number. Based on these, an input alphabet, Σ, is represented as a list of alphabet symbols. A word is either EMP, denoting the empty word, or a nonempty list of alphabet symbols. Words are given as input to a state machine to decide if the given word is in the machine's language.
The machine constructors of interest for this article are those for deterministic and nondeterministic finite-state automata:
make-dfa:  K Σ s F δ ['no-dead] → dfa
make-ndfa: K Σ s F δ → ndfa

Here, K is a list of states, Σ is the input alphabet, s is the starting state in K, F is a list of final states in K, and δ is a transition relation. A transition relation is represented as a list of transition rules. This relation must be a function for a DFA. A DFA transition rule is a triple, (K Σ K), containing a source state, the element to read, and a destination state. An NDFA transition rule is a triple, (K Σ∪{EMP} K), containing a source state, the element to read (possibly none), and a destination state. The optional 'no-dead argument for make-dfa indicates to the constructor that the relation given is a total function. Omitting this argument indicates that the transition function is incomplete and the constructor adds a fresh dead state along with transitions to this dead state for any missing transitions.
The observers are:
(sm-states M)  (sm-sigma M)  (sm-start M)  (sm-finals M)
(sm-rules M)   (sm-type M)   (sm-apply M w)  (sm-showtransitions M w)

The first 5 observers extract a component from the given state machine. The given state machine's type is returned by (sm-type M). To apply a given machine to a given word, (sm-apply M w) is used. It returns 'accept or 'reject. A trace of the configurations traversed by applying a given machine to a given word is obtained using (sm-showtransitions M w). A trace is only returned, however, if the machine is a DFA or if the word is accepted by an NDFA. For the latter, a single trace (of potentially many traces) is returned, thus illustrating why a nondeterministic machine accepts a word. No trace is provided when an NDFA rejects a word.
3.2. Visualization
FSM provides machine rendering and machine execution visualization. The current visualization primitives are:
(sm-graph M) (sm-visualize M [(s p)∗])
The first returns an image of the given machine's transition diagram. The second launches the FSM visualization tool. Each optional list of length 2 contains a state of the given machine and an invariant predicate for that state. Machine execution may always be visualized if the machine is a DFA. Similarly to sm-showtransitions, NDFA machine execution may only be visualized if the given word is in the machine's language. For further details on machine execution visualization in FSM, the reader is referred to a previous publication (Morazán et al., 2020).
3.3. An Illustrative Example
To illustrate programming in FSM, consider the following example:
;; L(M) = ab∗
(define ab* (make-dfa '(S F)
'(a b)
'S
'(F)
'((S a F) (F b F))))
;; Tests
(check-equal? (sm-apply ab* '()) 'reject)
(check-equal? (sm-apply ab* '(b)) 'reject)
(check-equal? (sm-apply ab* '(a)) 'accept)
(check-equal? (sm-apply ab* '(a b b)) 'accept)


The language of this machine is the set of all words that start with an a and end with an arbitrary number of bs. The constructor adds a dead state and the omitted transitions to the dead state to create a transition function. The tests validate ab*. Machine traces are:
> (sm-showtransitions ab* '(a b))
'(((a b) S) ((b) F) (() F) accept)
> (sm-showtransitions ab* '(b a a))
'(((b a a) S) ((a a) ds) ((a) ds) (() ds) reject)

The trace is a list of machine configurations ending with the result. Each configuration contains the unconsumed input and the machine's state. The transition diagram obtained using (sm-graph ab*) is:
![[Uncaptioned image]](https://cdn.awesomepapers.org/papers/94c34557-4a40-4c9e-9e2d-755920244a84/a-bstar.png)
The starting state is denoted in a green circle. Final states are denoted in double black circles.
Machine execution may be visualized in two views: control view and transition diagram view. Figure 3(a) displays ab*’s state after one step using the transition diagram view. The edge representing the last transition used is highlighted in blue. Figure 3(b) displays ab*’s same state using the control view. The last transition used is depicted with a dashed line from the previous state to the center and a solid arrow from the center to the current state. Visualization of NDFA execution is done in the same manner.
4. Computation Graphs in FSM
A primary goal for computation graphs is to visually demonstrate why a nondeterministic machine rejects a given word. They are equally useful in visually explaining why a nondeterministic machine accepts a given word. Given that, more often than not, students find computation graphs based on configurations confusing, a different representation is used in FSM. In an FSM computation graph, nodes represent states and edges represent transitions used by any possible computation that entirely consumes the given word. To guarantee that every possible computation entirely consumes the given word, if necessary, transitions to a fresh dead state are added. In an FSM computation graph, there are no repeated states. If a computation ends in a final state then we have visual proof that the word is accepted. If no computation ends in a final state then we have visual proof that the word is rejected.
There are two immediate concerns that arise with this approach. The first is that there must exist a way to distinguish states that are merely traversed by computations from states reached at the end of some computation. In an FSM computation graph, this is achieved by using color. States in which any computation ends are highlighted in crimson and all other nodes are black. Observe that a state highlighted in crimson may also be an intermediate state in some computations.
The second is how to represent a computation that does not entirely consume the given word. A nondeterministic machine, unlike a deterministic machine, may reach a non-accepting configuration in which no transition rules apply. By definition, the machine halts and rejects. This makes computation graphs difficult to read, because a state may be highlighted in crimson whether or not the input is completely consumed. To eliminate this potential pitfall, when no transitions apply, a transition consuming the next word element to a fresh non-accepting dead state is added. In the dead state, the remaining input is consumed. In this manner, all computations end in a state in which all the input is consumed. A student may easily conclude that the input word is not in the machine's language because it is completely consumed and rejected by all computations.
5. NDFA Computation Graphs
A nice characteristic of DFAs and NDFAs is that they decide a language. That is, they always accept or reject the given word. This means that a computation graph can always be constructed because every possible computation is guaranteed to end. This holds in FSM even when there are loops only involving empty (ε) transitions, because FSM's implementation of nondeterminism does not explore the same configuration more than once.
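This termination guarantee can be illustrated with a small sketch (in Python rather than FSM; the encoding of rules as triples with "" marking an empty transition is an assumption for illustration). Even for a machine whose only transitions form a loop of empty transitions, exploration remains finite because a configuration is never explored twice:

```python
def reachable_configs(rules, start, word):
    """Return every configuration reachable from the start, each explored once."""
    seen = set()
    todo = [(start, tuple(word))]
    while todo:
        cfg = todo.pop()
        if cfg in seen:          # never explore the same configuration twice
            continue
        seen.add(cfg)
        state, ui = cfg
        for (src, read, dst) in rules:
            if src != state:
                continue
            if read == "":       # empty transition: state changes, input does not
                todo.append((dst, ui))
            elif ui and read == ui[0]:
                todo.append((dst, ui[1:]))
    return seen

# A machine whose only transitions form a loop of empty transitions:
eps_loop = [("S", "", "A"), ("A", "", "S")]
# Exploration terminates: only the configurations (S, (a)) and (A, (a)) exist.
assert len(reachable_configs(eps_loop, "S", "a")) == 2
```

Without the seen-configuration check, the loop S → A → S would be re-entered forever; with it, the search visits each of the finitely many configurations exactly once.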
5.1. Design Idea
To build an NDFA computation graph from a given machine and a given word, a breadth-first traversal of the computation tree is performed. A configuration is represented as a structure, ndfa-stuci, that is defined as follows:
;; An ndfa configuration is a structure, (ndfa-stuci state word),
;; containing a state and the unconsumed input.
(struct ndfa-stuci [state ui] #:transparent)

The search takes as input a (listof ndfa-stuci) (this denotes a type: a list that contains zero or more ndfa-stucis) for the configurations to explore and an accumulator for the explored configurations. The accumulator is used to prevent processing the same configuration more than once. For each unexplored ndfa-stuci, new computation-graph edges and new ndfa-stucis are created by using all applicable transitions. If the list of new ndfa-stucis is empty then the search stops and the new list of computation-graph edges is returned. Otherwise, the new list of computation-graph edges is appended to the result of recursively processing the new ndfa-stucis and a new accumulator that is obtained by appending the given ndfa-stucis and the accumulator. In essence, at every step the next level of the computation tree is generated for exploration.
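The level-by-level search just described can be sketched generically. This is a Python illustration, not the FSM implementation; `step` is a hypothetical helper that returns the edges and successor configurations produced by one configuration:

```python
def computation_tree_to_edges(start_cfgs, step):
    """BFS over the computation tree: one frontier per tree level,
    accumulating graph edges and never revisiting a configuration."""
    edges, frontier, visited = [], list(start_cfgs), set()
    while frontier:
        visited.update(frontier)
        nxt = []
        for cfg in frontier:
            cfg_edges, succs = step(cfg)           # edges and children of cfg
            edges.extend(cfg_edges)
            nxt.extend(c for c in succs if c not in visited)
        frontier = nxt                             # next level of the tree
    return edges
```

At each iteration the frontier holds one level of the computation tree, and the visited set plays the role of the accumulator of explored ndfa-stucis.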
There are two varieties of edges: regular edges, ndfa-edges, which have a destination state that does not need to be highlighted (i.e., not the last transition used in a computation), and special edges, ndfa-spedges, which have a destination state that needs to be highlighted (i.e., a transition used in a computation that leaves the unconsumed input empty). They are defined as follows:

;; An ndfa-Edge is a structure that is either:
;; 1. (ndfa-edge state symb state)
;; 2. (ndfa-spedge state symb state)

For a given ndfa-stuci, the new edges generated may be of either type. A conditional expression, outlined as follows, is used to generate the new edges:
- Unconsumed input is empty: New ndfa-spedges are generated from applicable empty transitions. Only ndfa-spedges are created because for any state reached the word is entirely consumed.
- Length of unconsumed input is 1: This means that the given ndfa-stuci represents a configuration that has a single input element left to process. A list of new edges is created by appending the list of ndfa-spedges obtained from applicable transition rules that consume the last input element and the list of ndfa-edges obtained from applicable empty transition rules.
- Length of unconsumed input is greater than 1: This means that the given ndfa-stuci represents a configuration that has more than a single input element left to process. A list of new ndfa-edges is created using all applicable rules. No new ndfa-spedges are generated because none of the applicable rules empty the unconsumed input.
On an empty list of new edges, a list containing an ndfa-spedge to the fresh dead state consuming the word’s next element is returned. On a new list of edges that only represent empty transitions, an ndfa-spedge to the fresh dead state consuming the word’s next element is added to the list of new edges. It is safe to make these ndfa-spedges because the rest of the input for any such computation is consumed in the fresh dead state. Otherwise, the list of new edges is returned.
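The case analysis above, including the dead-state completion, can be condensed into an executable sketch. This is Python, not FSM code; the tuple encodings of rules and edges and the dead-state name ds are assumptions for illustration, with "spedge" marking a destination to highlight:

```python
DEAD = "ds"  # hypothetical fresh dead-state name

def new_edges(rules, state, ui):
    """Edges generated for one configuration (state, unconsumed input ui)."""
    if not ui:  # unconsumed input empty: only spedges from empty transitions
        return [("spedge", s, r, d) for (s, r, d) in rules
                if s == state and r == ""]
    edges = []
    for (s, r, d) in rules:
        if s != state:
            continue
        if r == ui[0]:  # consuming the last element empties the input: spedge
            edges.append(("spedge" if len(ui) == 1 else "edge", s, r, d))
        elif r == "":   # empty transition: input stays nonempty, regular edge
            edges.append(("edge", s, r, d))
    if not edges:       # no rule applies: route to the dead state
        return [("spedge", state, ui[0], DEAD)]
    if all(e[2] == "" for e in edges):  # only empty transitions apply
        edges.append(("spedge", state, ui[0], DEAD))
    return edges
```

For instance, for a state whose only applicable rule is an empty transition, the sketch produces that ndfa-edge plus an ndfa-spedge to the dead state, so the rest of the input can be consumed there.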
The returned list of computation graph edges is an overestimation. It may contain repetitions, transitions represented as both an ndfa-edge and an ndfa-spedge, and redundant edges if the word is accepted by the given machine. Therefore, the returned list of edges is filtered to remove duplicates and to remove any ndfa-edges that are also ndfa-spedges. The ndfa-edges, not the ndfa-spedges, are removed because the spedges are needed to correctly highlight states reached when the word is entirely consumed. Finally, if the given word is accepted by the given machine then the computation graph ought to only contain edges that take the machine from the starting state to an accepting state consuming the given word. This is achieved using sm-showtransitions. From the result returned by sm-showtransitions, the set of edges traversed is extracted. Any computation-graph edge not in this set is discarded.
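The first two filtering steps can also be sketched in Python (the (kind, source, read, destination) tuple encoding of edges is an illustrative assumption, not FSM's representation): drop exact duplicates, then drop any ndfa-edge whose transition also appears as an ndfa-spedge, keeping the spedge so the destination is highlighted.

```python
def filter_edges(edges):
    """Remove duplicate edges and any plain edge shadowed by a spedge."""
    edges = list(dict.fromkeys(edges))   # deduplicate, preserving order
    spedge_transitions = {(s, r, d) for (kind, s, r, d) in edges
                          if kind == "spedge"}
    return [e for e in edges
            if not (e[0] == "edge" and (e[1], e[2], e[3]) in spedge_transitions)]
```

Keeping the spedge rather than the edge is what guarantees that every state reached at the moment the word is entirely consumed ends up highlighted.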
Finally, to generate the computation graph, ndfa-edges are rendered as solid lines given that they correspond to edges in the machine’s state transition diagram. The rendering of ndfa-spedges requires examining the destination state. If the destination state is a machine state then the edge is rendered as a solid black arrow. If the destination state is the fresh dead state then the edge is rendered as a dashed black arrow. In both cases, the destination state is highlighted in crimson.
5.2. Illustrative Example
To illustrate the result of implementing the algorithm outlined, consider the following NDFA designed by a student:
;; L = ((a b a) ∪ (a (b b a)∗ b))∗
(define M (make-ndfa '(S A B C D E F G)
                     '(a b)
                     'S
                     '(S)
                     `((S a A) (S a B) (A b C) (B b D) (B b F)
                       (C a E) (D ,EMP S) (E ,EMP S) (F b G) (G a B))))

;; Tests
(check-equal? (sm-apply M '(b b)) 'reject)
(check-equal? (sm-apply M '(a b a a b b)) 'reject)
(check-equal? (sm-apply M '()) 'accept)
(check-equal? (sm-apply M '(a b a a b)) 'accept)
(check-equal? (sm-apply M '(a b a a b a)) 'accept)

The state transition diagram is displayed in Figure 4. The language is the concatenation of an arbitrary number of subwords of the form (a b a) and (a (b b a)∗ b).
The computation graph obtained from applying M to the word '(a b b a b b) is displayed in Figure 5. The only states highlighted in crimson are G and ds, which are not final states. Thus, no computation ends in an accepting state and, therefore, it is straightforward to see that the word is rejected by the machine. Also notable is that there are both computations that do not consume all the input and computations that do. The existence of computations that do not consume all the input is extrapolated from the transitions into ds from S, C, and D. These transitions indicate that these states are reached, but the machine is unable to consume the rest of the input. The existence of computations that consume all the input is extrapolated from machine states, like G, that are highlighted in crimson. Such a state is reached by a computation after consuming the given word, but the computation rejects given that G is not a final state. Thus, we have a clear visualization that informs us that the word is rejected, that not all computations completely consume the word, and that some computations completely consume the word.



Finally, it is enlightening to illustrate the computation graph generated when the given word is accepted. Figure 6 displays the FSM computation graph obtained from applying M to the word '(a b a a b a). It is easily observed that the word is accepted given that a single final state is highlighted in crimson. Furthermore, only the edges representing the transitions used in a single accepting computation are displayed. This suffices to conclude that the machine accepts the word. There is no need to explore nor display other computations, if any, that also accept the word.
5.3. Implementation Sketch
The function to generate computation-graph edges may be elegantly implemented using higher-order functions. This function composes the functions to traverse the computation tree, to remove duplicates, to remove redundant ndfa-edges, and to remove redundant edges when the machine accepts the word as follows:
;; fsa word → (listof ndfa-Edge)
;; Purpose: Return computation-graph edges for given fsa and word.
(define (make-cg-edges M word)
  ((compose remove-redundant-edges-on-accept
            remove-duplicate-Edges
            remove-duplicates
            computation-tree->cg-edges)
   (list (ndfa-stuci (sm-start M) word))
   '()))

The arguments are the initial values for the two accumulators needed to traverse the computation tree: the only configuration to explore is the initial configuration and there are no configurations that have been explored.
;; (listof ndfa-stuci) (listof ndfa-stuci) → (listof ndfa-Edge)
;; Purpose: Return the list of ndfa-edges and ndfa-spedges obtained by
;;          traversing the computation tree using BFS
;; Accumulator Invariants:
;;   to-visit = list of unexplored ndfa-stucis
;;   visited  = list of explored ndfa-stucis
(define (computation-tree->cg-edges to-visit visited)
  (let* [(new-edges (remove-duplicates
                     (append-map (λ (s) (add-to-edges s)) to-visit)))
         (new-stucis (add-to-stucis new-edges to-visit visited))]
    (if (empty? new-stucis)
        new-edges
        (append new-edges
                (computation-tree->cg-edges new-stucis
                                            (append to-visit visited))))))
The function that performs the computation tree’s breadth-first search is displayed in Figure 7. The function creates new edges from the ndfa-stucis that must be visited. From these edges, the next ndfa-stucis needed are computed without recreating any in to-visit nor visited. If there are no new ndfa-stucis to explore then the new list of edges is returned. Otherwise, the new list of edges is appended to the result of recursively exploring the new ndfa-stucis and adding the given ndfa-stucis to visited.
;; ndfa-stuci → (listof ndfa-Edge)
;; Purpose: Compute ndfa-Edges for given ndfa-stuci
(define (add-to-edges stuci)
  (if (empty? (ndfa-stuci-ui stuci))
      (map (λ (r) (ndfa-spedge (ndfa-rule-fromst r)
                               (ndfa-rule-read r)
                               (ndfa-rule-tost r)))
           (filter (λ (r) (and (eq? EMP (ndfa-rule-read r))
                               (eq? (ndfa-stuci-state stuci)
                                    (ndfa-rule-fromst r))))
                   (sm-rules M)))
      (let* [(new-rules
              (if (= 1 (length (ndfa-stuci-ui stuci)))
                  (append (map (λ (r) (ndfa-spedge (ndfa-rule-fromst r)
                                                   (ndfa-rule-read r)
                                                   (ndfa-rule-tost r)))
                               (filter nonempty-rules (sm-rules M)))
                          (map (λ (r) (ndfa-edge (ndfa-rule-fromst r)
                                                 (ndfa-rule-read r)
                                                 (ndfa-rule-tost r)))
                               (filter empty-rules (sm-rules M))))
                  (map (λ (r) (ndfa-edge (ndfa-rule-fromst r)
                                         (ndfa-rule-read r)
                                         (ndfa-rule-tost r)))
                       (filter empty-or-nonempty-rules (sm-rules M)))))]
        (cond [(empty? new-rules)
               (list (ndfa-spedge (ndfa-stuci-state stuci)
                                  (first (ndfa-stuci-ui stuci))
                                  my-ds))]
              [(andmap empty-trans? (map ndfa-Edge-read new-rules))
               (cons (ndfa-spedge (ndfa-stuci-state stuci)
                                  (first (ndfa-stuci-ui stuci))
                                  my-ds)
                     new-rules)]
              [else new-rules]))))
The function to compute new edges from an unexplored ndfa-stuci is displayed in Figure 8. If the given ndfa-stuci’s unconsumed input is empty, ndfa-spedges are created for all applicable empty transition rules. If the unconsumed input is not empty, for all applicable rules, new computation graph edges are created. If the length of the unconsumed input is 1 then ndfa-spedges and ndfa-edges are created using, respectively, the non-empty and the empty transition rules. Otherwise, only ndfa-edges are created using all applicable rules. As outlined in Section 5.1, if the new list of edges is empty then a single ndfa-spedge to the fresh dead state is generated. If the new list of edges is generated from only empty transitions then an ndfa-spedge to the fresh dead state is added to the new rules. Otherwise, the new rules are returned.
;; (listof ndfa-Edge) (listof ndfa-stuci) (listof ndfa-stuci) → (listof ndfa-stuci)
;; Purpose: Compute new ndfa-stucis from the given ndfa-stucis to explore
;; Accumulator Invariant: visited = list of generated ndfa-stucis
(define (add-to-stucis Edges to-visit visited)
  ;; ndfa-stuci → (listof ndfa-stuci)
  ;; Purpose: Compute new ndfa-stucis from the given ndfa-stuci
  (define (add-to-stucis-helper stuci)
    (if (empty? (ndfa-stuci-ui stuci))
        (map (λ (e) (ndfa-stuci (ndfa-Edge-tost e) (ndfa-stuci-ui stuci)))
             (filter (λ (e) (and (eq? (ndfa-Edge-read e) EMP)
                                 (eq? (ndfa-Edge-fromst e)
                                      (ndfa-stuci-state stuci))))
                     Edges))
        (map (λ (e) (ndfa-stuci (ndfa-Edge-tost e)
                                (if (eq? (ndfa-Edge-read e)
                                         (first (ndfa-stuci-ui stuci)))
                                    (rest (ndfa-stuci-ui stuci))
                                    (ndfa-stuci-ui stuci))))
             (filter (λ (e) (and (or (eq? (ndfa-Edge-read e)
                                          (first (ndfa-stuci-ui stuci)))
                                     (eq? (ndfa-Edge-read e) EMP))
                                 (eq? (ndfa-Edge-fromst e)
                                      (ndfa-stuci-state stuci))))
                     Edges))))
  (if (empty? to-visit)
      '()
      (let* [(new-stucis (add-to-stucis-helper (first to-visit)))]
        (if (or (empty? new-stucis)
                (andmap (λ (s) (member s visited)) new-stucis))
            (add-to-stucis Edges (rest to-visit) visited)
            (append new-stucis
                    (add-to-stucis Edges
                                   (rest to-visit)
                                   (append new-stucis visited)))))))
Finally, the function to create new ndfa-stucis from the list of ndfa-Edges is presented in Figure 9. If the list of ndfa-stucis to visit is empty, the function halts and returns an empty list of ndfa-stucis. Otherwise, the given ndfa-stucis are traversed to compute the new ndfa-stucis at the next level of the computation tree. If there are no new ndfa-stucis, or all of them have already been visited, then the rest of the ndfa-stucis are processed recursively. Otherwise, the new ndfa-stucis are appended to the result of recursively processing the rest of the given ndfa-stucis after adding the new ndfa-stucis to the accumulator of generated ndfa-stucis. The list of new ndfa-stucis for a given ndfa-stuci is generated by add-to-stucis-helper in Figure 9. New ndfa-stucis are created using the applicable edges’ destination states. If the first element of the given ndfa-stuci’s unconsumed input matches the element read by an edge, then the new unconsumed input is the rest of the given ndfa-stuci’s unconsumed input. Otherwise, the edge represents an empty transition and the unconsumed input is unchanged.
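The successor computation performed by add-to-stucis-helper can be mirrored in a short Python sketch; the pair encoding and all names are illustrative assumptions rather than FSM's actual code:

```python
EMP = "ε"  # assumed marker for an empty transition

def next_stucis(stuci, edges):
    """Successors of one (state, unconsumed-input) pair, mirroring
    add-to-stucis-helper: an edge that reads the first input element
    consumes it; an empty transition leaves the input unchanged."""
    state, ui = stuci
    if not ui:
        # no input left: only empty transitions out of this state apply
        return [(t, ui) for (f, r, t) in edges if f == state and r == EMP]
    return [(t, ui[1:] if r == ui[0] else ui)
            for (f, r, t) in edges
            if f == state and (r == ui[0] or r == EMP)]
```

For example, from the pair ("A", "ab") an edge ("A", "a", "B") yields the successor ("B", "b"), while an empty transition ("A", EMP, "C") yields ("C", "ab") with the input untouched.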
6. Concluding Remarks
This article presents a novel graphical approach to help students understand why nondeterministic finite-state automata reject a given word. The approach is integrated into FSM, a domain-specific language that allows programmers to easily build and execute such machines. Unlike previous approaches, which focus on tracing machine configurations, the work presented here builds on the transition diagram of the given machine. The nodes in an FSM computation graph represent states traversed by any computation, and the edges represent the transitions used by any computation. Computation graphs are purposely built with a fresh dead state, if needed, so that all computations may consume all of their input. States that satisfy this criterion in any possible computation are highlighted in crimson to indicate where at least one computation ends. The result is the generation of computation graphs that summarize what happens in all computations. They allow students to easily determine whether any crimson state is a final state. If none is, then they can see that the machine rejects because no computation ends in an accepting configuration. In addition, students can easily associate a computation graph with a machine’s state transition diagram and can readily determine the transitions used and the states traversed during any computation. Finally, computation graphs for accepted words only contain the states and transitions traversed to reach an accepting configuration.
Future work includes the generation of computation graphs for pushdown automata and for Turing machines that decide and semidecide a language. Generation of such graphs requires care, because the length of a computation may be infinite. This means that in some cases it may only be possible to approximate a computation graph.
References
- Bettini (2016) L. Bettini. 2016. Implementing Domain-Specific Languages with Xtext and Xtend. Packt Publishing.
- Cogumbreiro and Blike (2022) Tiago Cogumbreiro and Gregory Blike. 2022. Gidayu: Visualizing Automaton and Their Computations. In Proceedings of the 27th ACM Conference on Innovation and Technology in Computer Science Education Vol. 1 (Dublin, Ireland) (ITiCSE ’22). Association for Computing Machinery, New York, NY, USA, 110–116. https://doi.org/10.1145/3502718.3524742
- Felleisen et al. (2018) Matthias Felleisen, Robert Bruce Findler, Matthew Flatt, Shriram Krishnamurthi, Eli Barzilay, Jay McCarthy, and Sam Tobin-Hochstadt. 2018. A Programmable Programming Language. Commun. ACM 61, 3 (March 2018), 62–71. https://doi.org/10.1145/3127223
- Gurari (1989) Eitan M. Gurari. 1989. An Introduction to the Theory of Computation. Computer Science Press.
- Hopcroft et al. (2006) John E. Hopcroft, Rajeev Motwani, and Jeffrey D. Ullman. 2006. Introduction to Automata Theory, Languages, and Computation (3rd Edition). Addison-Wesley Longman Publishing Co., Inc., USA.
- Kleppe (2008) A. Kleppe. 2008. Software Language Engineering: Creating Domain-Specific Languages Using Metamodels. Pearson Education.
- Landa and Martinez-Trevino (2019) Raquel Landa and Yolanda Martinez-Trevino. 2019. Relevance of Immediate Feedback in an Introduction to Programming Course. In 2019 ASEE Annual Conference & Exposition. ASEE Conferences, Tampa, Florida. https://doi.org/10.18260/1-2--33235
- Lewis and Papadimitriou (1997) Harry R. Lewis and Christos H. Papadimitriou. 1997. Elements of the Theory of Computation (2nd ed.). Prentice Hall PTR, Upper Saddle River, NJ, USA. https://doi.org/10.1145/300307.1040360
- Linz (2011) Peter Linz. 2011. An Introduction to Formal Languages and Automata (5th ed.). Jones and Bartlett Publishers, Inc., USA.
- Martin (2003) John C. Martin. 2003. Introduction to Languages and the Theory of Computation (3 ed.). McGraw-Hill, Inc., New York, NY, USA.
- Marwan et al. (2020) Samiha Marwan, Ge Gao, Susan Fisk, Thomas W. Price, and Tiffany Barnes. 2020. Adaptive Immediate Feedback Can Improve Novice Programming Engagement and Intention to Persist in Computer Science. In Proceedings of the 2020 ACM Conference on International Computing Education Research (ICER ’20). Association for Computing Machinery, New York, NY, USA, 194–203. https://doi.org/10.1145/3372782.3406264
- Mohammed (2020) Mostafa Kamel Osman Mohammed. 2020. Teaching Formal Languages through Visualizations, Simulators, Auto-graded Exercises, and Programmed Instruction. In Proceedings of the 51st ACM Technical Symposium on Computer Science Education, SIGCSE 2020, Portland, OR, USA, March 11-14, 2020, Jian Zhang, Mark Sherriff, Sarah Heckman, Pamela A. Cutter, and Alvaro E. Monge (Eds.). ACM, 1429. https://doi.org/10.1145/3328778.3372711
- Morazán et al. (2020) Marco T. Morazán, Joshua M. Schappel, and Sachin Mahashabde. 2020. Visual Designing and Debugging of Deterministic Finite-State Machines in FSM. Electronic Proceedings in Theoretical Computer Science 321 (Aug. 2020), 55–77. https://doi.org/10.4204/eptcs.321.4
- Rich (2019) Elaine Rich. 2019. Automata, Computability and Complexity: Theory and Applications. Pearson Prentice Hall.
- Rodger (2006) Susan H. Rodger. 2006. JFLAP: An Interactive Formal Languages and Automata Package. Jones and Bartlett Publishers, Inc., USA.
- Rodger et al. (2006) Susan H. Rodger, Bart Bressler, Thomas Finley, and Stephen Reading. 2006. Turning automata theory into a hands-on course. In Proceedings of the 37th SIGCSE Technical Symposium on Computer Science Education, SIGCSE 2006, Houston, Texas, USA, March 3-5, 2006, Doug Baldwin, Paul T. Tymann, Susan M. Haller, and Ingrid Russell (Eds.). ACM, 379–383. https://doi.org/10.1145/1121341.1121459
- Sipser (2013) Michael Sipser. 2013. Introduction to the Theory of Computation (3rd ed.). Cengage Learning.
- Venables and Haywood (2003) Anne Venables and Liz Haywood. 2003. Programming Students NEED Instant Feedback!. In Proceedings of the Fifth Australasian Conference on Computing Education - Volume 20 (Adelaide, Australia) (ACE ’03). Australian Computer Society, Inc., AUS, 267–272.
- Walter et al. (2009) Tobias Walter, Fernando Silva Parreiras, and Steffen Staab. 2009. OntoDSL: An Ontology-Based Framework for Domain-Specific Languages. In Model Driven Engineering Languages and Systems, Andy Schürr and Bran Selic (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 408–422.
- White and Way (2006) Timothy M. White and Thomas P. Way. 2006. jFAST: A Java Finite Automata Simulator. SIGCSE Bull. 38, 1 (March 2006), 384–388. https://doi.org/10.1145/1124706.1121460