This is a simple regex engine written in Scala that I made following this article.
(borrowed and modified from Matt Might)
<regex> ::= <term> '|' <regex>
::= <term>
<term> ::= <factor>+
<factor> ::= <factor> ('*'| '?'|'+')?
::= <base> ('*'| '?'|'+')?
<base> ::= <char>
::= '\' <char>
::= '(' <regex> ')'
The RegexParser
class is used to parse a string that represents a regex. Create a new RegexParser
instance and use its parse()
method in order to create a Machine instance that represents that
regex.
The MachineRunner
object has a method called testInput
that takes a Machine
and some inputString
and returns a tuple of a boolean followed by a sequence of ProgressionWithChar
instances. The boolean is true
if inputString
matches the regex described by the Machine
and false otherwise. The sequences of ProgressionWithChars
are a record of the sets of states that Machine was evaluating (still considered valid) as it consumed inputString
.
The Machine
companion object also has a helper function, called toDOTFileFormat
that takes a Machine
, and returns a string representation
of that machine using the DOT graph description language.
This graph has nodes as states, unlabeled edges representing transitions that aren't contingent on any character, and labeled edges
representing transitions that are contingent on the character that the edge is labeled with.
Example of a graph generated from toDOTFileFormat
's output from the regex ((ab|b?)*w)
, after a single b
was consumed. The valid states are highlighted in red:
-test cases?
-work on visualization of state machine