Simple compile time DFA based regular expression for C++20.
Single header file - compile_time_regexp.h, without dependencies.
Tested compilers, with flag -std=c++20
:
- clang 15+
- gcc 13+
// Compile-time DFA build.
constexpr auto dfa = ctre::Compile<"(a|b)*ab">();
// Runtime matching.
auto b = dfa.Match("ababab");
// Runtime DFA build and matching.
auto b1 = ctre::Match("(a|b)*ab", "ababab");
// Compile-time DFA build and matching.
// Compile faster than: ctre::Compile<pattern>().Match(s);
constexpr auto b2 = ctre::Match("(a|b)*ab", "ababab");
// Compile-time DFA build and matching.
auto b3 = ctre::Match<"(a|b)*ab", "ababab">();
ab
matchesab
.a|b
matchesa
orb
.a*
matches 0 or morea
.a+
matches 1 or morea
.a?
matches 0 or 1a
.[0-9a-z]
matches characters in some ranges.(a|b)*
parentheses to force priority
For instance, (a|b)*ab
matches ababab
.
make install-deps
make build
make tests
-
Compiling with clang causes error:
constexpr evaluation hit maximum step limit; possible infinite loop?
Try to increase value of option
-fconstexpr-steps
, e.g.clang++ ./example/example.cc -std=c++20 -O3 -fconstexpr-steps=1000000000 -I.
.
Three main steps:
NfaParser
- Parse a given pattern string to NFA via dijkstra two stack algorithm.DfaBuilder
- Build a DFA from given NFA via subset-construction.DfaMinifier
- Minify a DFA to an equivalent DFA that has a minimum number of states via hopcroft alogrithm.
Chinese blog, 中文博客 - 实现一个简单的正则表达式引擎
BSD.