Implementing a non-deterministic finite state machine (NFA)
I am trying to develop a simulation that does a non-deterministic state machine in Java. The first command line argument is a text file that defines the machine. The second argument is the input string. If it accepts a string, it prints "accept" to standard output, followed by a list of accept states in which it can end. If it rejects, it issues "reject" followed by a list of all possible end states.
For example, the text:
state 1 start
state 2
state 7 accept
transition 1 0 1
transition 1 1 1
transition 1 0 2
transition 2 0 2
transition 2 0 1
transition 2 1 1
transition 2 0 7
transition 7 0 7
transition 7 1 7
which looks like this:
with an input string of 10 outputs
reject 1 2
and with an input string 000 will output
accept 7
I need to know what data structures to use. I've been thinking about using hashmaps and sets, but I'm not sure.
source to share
I think you should convert your automaton to deterministic and then do the obvious.
There are three general ways to implement a state machine in software:
- Imagine the transition function as a table (2D array) and after each character read, find the next state in it.
- Use nested statements
switch
for the current state and the input symbol to explicitly define the next state in your code. - Use a state pattern: think of each state as a class with a method that returns the next state given the input character.
Since you will need to create your automaton at runtime, the last two options don't really apply, so you have to go with a lookup table. If your alphabet is small, you can write down all the transitions. If the alphabet is huge this can waste a lot of space and you might consider compressing the spreadsheet, which used to be an active research field.
For Downvoters:You cannot write a deterministic program that "executes" a non-deterministic automaton. However, by a fairly fundamental theorem of theoretical computer science, you can turn any non-deterministic finite state machine into a deterministic one by a fully automated procedure, that is, a deterministic algorithm that can be easily implemented in software. Every computer science student was required to complete this procedure at least once using a pencil and paper. If you do this, you will quickly see how to track the state of the states of the original automaton, which includes the states of the deterministic. Then just simulate the latter, see what state it ends in, and print the states of the original non-deterministic automaton that make it up.
source to share
NFA means you can have a set of states at a time. Thus, to represent the current state of the NFA, use Set.
Set interface ensures that there are no duplicate states. This is important because NFA has more than one condition at a time. If you are using any other dataset this gurrentee is not. In the case of NFA, the probability that a duplicate state in each transition will be exponential. But the set of states is always finite. The Set interface ensures that your current set is filled with duplicates.
For space and performance, you can use an Enumset as an Enumset using bit vectors to store the state internally.
Algorithm:
initialize to start state
Process line from right to left, starting at index 0;
for a symbol when updated using a state transition;
If a final state occurs for any of this transition, which means the string is accepted.
source to share