I have no idea what this does, or why dummy rules would have to be made in the first place. I'm not even clear on what a rule is, although I have a vague idea.
This code has no documentation. At best, it's mostly a reimplementation of lex in C#, so I kind of know what inputs it accepts from the lex man pages. It has some flex extensions too, but not all of them. Who knows what it supports? I'm not even sure the original author does, and the primary engine hasn't been updated in 6 years.
Even if I get this working how I want, it will probably always be C# only, unless I want to debug Slang enough to get it to work with it, or retool all of the code generation to use the CodeDOM by hand. Ugh.
And that's not even the worst bit.
I have half a mind to leave the parser in place, preparse my desired document format, write out a document in this parser's spec format in memory, and then feed it to the parser that way, but what a nasty mess!
That's why I wrote a parser and lexer from scratch. Sure, code generation helps, but it just shifts the problem to one of getting the grammar right and fixing it for unusual cases. Ever looked at the C++ quasi-grammar sprinkled throughout cppreference.com? I'd hardly know where to begin with such shite. Of course, it's not the fault of that site, which I consult frequently. It's probably inevitable when a language continually evolves while being reluctant to deprecate anything.
I mean, my previous version of Rolex used my own hand-rolled parser, but this version is using the GPLEX engine and I want to keep the regex syntax the same as GPLEX. That, and it's near impossible to build up the regex trees for GPLEX on my own. The trees are so convoluted that I'm basically stuck using the parser they gave me.
I found that the regex parsing part, that subset at least, is hand-rolled recursive descent, so that helps, but it's still ugly.
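For anyone following along who hasn't seen one, here is roughly what a hand-rolled recursive descent regex parser looks like. This is a minimal Python sketch over a tiny made-up subset (literals, grouping, `|`, `*`, `+`, `?`). The grammar, the `RegexParser` class and the tuple-based tree shape are all my own assumptions for illustration; none of it is GPLEX's actual code or syntax.

```python
# Hypothetical grammar (a small subset, not GPLEX's real regex syntax):
#   alt    := concat ('|' concat)*
#   concat := repeat*
#   repeat := atom ('*' | '+' | '?')*
#   atom   := '(' alt ')' | literal character

class RegexParser:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def peek(self):
        # Current character, or None at end of input.
        return self.text[self.pos] if self.pos < len(self.text) else None

    def take(self):
        ch = self.peek()
        self.pos += 1
        return ch

    def parse(self):
        node = self.alt()
        if self.peek() is not None:
            raise ValueError(f"unexpected {self.peek()!r} at {self.pos}")
        return node

    def alt(self):
        node = self.concat()
        while self.peek() == "|":
            self.take()
            node = ("alt", node, self.concat())
        return node

    def concat(self):
        parts = []
        while self.peek() is not None and self.peek() not in "|)":
            parts.append(self.repeat())
        if not parts:
            return ("empty",)
        node = parts[0]
        for part in parts[1:]:
            node = ("cat", node, part)
        return node

    def repeat(self):
        node = self.atom()
        while self.peek() is not None and self.peek() in "*+?":
            op = self.take()
            node = ({"*": "star", "+": "plus", "?": "opt"}[op], node)
        return node

    def atom(self):
        ch = self.take()
        if ch == "(":
            node = self.alt()
            if self.take() != ")":
                raise ValueError("missing ')'")
            return node
        if ch is None or ch in "*+?":
            raise ValueError(f"unexpected {ch!r}")
        return ("lit", ch)
```

Each grammar rule becomes one method, which is why this style is comparatively easy to read and patch even without documentation. A real engine would also need character classes, escapes and ranges on top of this.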
But I'm just looking for an efficient DFA lexer engine that handles Unicode. GPLEX does it, but the output is ugly and multi-file, and the input doc is fugly and looks like a lex spec, so I'm gutting GPLEX and changing the input and output formats but keeping the engine, if that makes sense.
Right now I've got the output where I like it, but I'm still working on changing the input spec.
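To make "DFA lexer engine" concrete, this is the general shape of the thing: run a state machine over the input, remember the last accepting state you passed through, and emit the longest match. It's a Python sketch with a hand-built three-token DFA, purely an assumption for illustration. The state numbers, token names and the `make_dfa`, `step` and `tokenize` helpers are hypothetical, and a generated lexer would use compact transition tables rather than predicate lists.

```python
def make_dfa():
    # Hypothetical hand-built DFA; state 0 is the start state.
    # Each state lists (character predicate, next state) pairs.
    transitions = {
        0: [(str.isalpha, 1), (str.isdigit, 2), (str.isspace, 3)],
        1: [(str.isalnum, 1)],
        2: [(str.isdigit, 2)],
        3: [(str.isspace, 3)],
    }
    accepting = {1: "ident", 2: "number", 3: "space"}
    return transitions, accepting


def step(transitions, state, ch):
    # Return the next state for ch, or None if the DFA has no move.
    for predicate, next_state in transitions.get(state, []):
        if predicate(ch):
            return next_state
    return None


def tokenize(text):
    transitions, accepting = make_dfa()
    tokens = []
    pos = 0
    while pos < len(text):
        state = 0
        last_accept = None  # (token kind, end offset) of the longest match so far
        i = pos
        while i < len(text):
            state = step(transitions, state, text[i])
            if state is None:
                break
            i += 1
            if state in accepting:
                last_accept = (accepting[state], i)
        if last_accept is None:
            raise ValueError(f"no token matches at offset {pos}")
        kind, end = last_accept
        tokens.append((kind, text[pos:end]))
        pos = end
    return tokens
```

Python strings are sequences of Unicode code points, so the predicates above see whole characters. In C#, strings are UTF-16, so an engine there has to deal with surrogate pairs itself, which is a big part of why Unicode support is the hard bit.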