We have shown that any regular expression can be converted into a finite automata. Now we will prove the opposite, that any finite automata can be converted into a regular expression.

First we will need a new type of finite automata, the Generalized Nondeterministic Finite Automata.

A GNFA is an NFA where each transition can have arbitrary regular expressions as labels rather than just elements of $\Sigma$.

GNFAs can consume multiple input symbols on each transition.

We also require GNFAs follow these rules:

- The start state has a transition going to every other state, but no transitions into it.
- There is a single accept state, and every state transitions to it, but it doesn't transition to any state.
- Except for the start and accept state, each other state has transitions to every other state (including itself).

These rules will make the conversion simpler.

To convert a DFA into a GNFA we follow these steps:

- Add a new start state with an $\epsilon$-transition to the old start state.
- Add a new accept state with $\epsilon$-transitions from each old accept state.
- If any transitions have multiple symbols, merge them with a union.
- If there are transitions missing, create transitions with $\emptyset$ as the label.

How would we convert the following DFA into a GNFA?

Now we need to convert the GNFA into a Regular Expression.

This is done by removing the states of the GNFA one at a time until only the start and accept states remain. The label on the transition from the start state to the accept state will be our regular expression:

We can do this by choosing any state, ripping it out, then repairing the transitions so that the machine functions the same as before.

Formally: we can define a GNFA as:

- $Q$ is the set of states.
- $\Sigma$ is the alphabet.
- $\delta: (Q - \{q_{accept}\}) \times (Q - \{q_{start}\}) \rightarrow R$ is the transition function.
- $q_{start}$ is the start state.
- $q_{accept}$ is the accept state.

- Select a state $q_{rip} \in (Q - \{q_{accept}, q_{start}\})$.
- $Q^\prime = Q - \{q_{rip}\}$.
- For any $q_i \in (Q^\prime - \{q_{accept}\})$ and any $q_j \in (Q^\prime - \{q_{start}\})$, let

$\delta^\prime(q_i, q_j) = (R_1)(R_2)^\ast(R_3) \cup (R_4)$, where

$R_1 = \delta(q_i, q_{rip})$,

$R_2 = \delta(q_{rip}, q_{rip})$,

$R_3 = \delta(q_{rip}, q_j)$,

$R_4 = \delta(q_i, q_j)$

Below is an illustration:

How could we convert this DFA into a regular expression?

How could we take this more complex DFA all the way to a regular expression?

We now have three ways to talk about regular languages:

- DFAs
- NFAs
- Regular Expressions

All three are equivalent.

Next we will discuss languages that are not regular and thus have no finite automata recognizing them, or regular expression producing them.

Copyright © 2018 Ian Finlayson | Licensed under a Creative Commons Attribution 4.0 International License.