I think I can parse anything with Parsley. I'm pretty confident in it at this point, despite its immaturity. It still is cranky because I'm not finished with it, but now all of the foundation is there.
It can parse C#. That's a tall order. With a little work (mostly lexer support for significant vs insignificant whitespace) I could parse Python. Given that lexer support, Python would be the easier of the two because its grammar isn't as ambiguous.
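For anyone wondering what "significant whitespace" support in a lexer might look like, here is a rough sketch in Python. It is not Parsley code. The function name `layout_tokens`, the `INDENT`/`DEDENT`/`LINE` token kinds, and the spaces-only rule are all assumptions made up for illustration. The idea is to keep a stack of indentation widths and turn changes in leading whitespace into explicit tokens, so that an LL(1) parser can treat blocks like bracketed regions.

```python
def layout_tokens(source):
    """Return (kind, text) pairs, turning leading spaces into INDENT/DEDENT.

    Hypothetical sketch: only spaces count as indentation, and each
    non-blank line is passed through whole as a single LINE token.
    """
    stack = [0]          # indentation widths of the currently open blocks
    out = []
    for line in source.splitlines():
        stripped = line.lstrip(" ")
        if not stripped:
            continue     # blank lines carry no layout meaning
        width = len(line) - len(stripped)
        if width > stack[-1]:
            # Deeper than the current block: open a new one.
            stack.append(width)
            out.append(("INDENT", ""))
        while width < stack[-1]:
            # Shallower: close blocks until we are back at a known level.
            stack.pop()
            out.append(("DEDENT", ""))
        if width != stack[-1]:
            raise ValueError(f"inconsistent dedent: {line!r}")
        out.append(("LINE", stripped))
    # Close whatever blocks are still open at end of input.
    while len(stack) > 1:
        stack.pop()
        out.append(("DEDENT", ""))
    return out
```

Whitespace inside a line stays insignificant here. Only the leading run of spaces is turned into tokens, which is roughly the split between significant and insignificant whitespace that the lexer would have to make.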
I also never had to break from an LL(1) parser algorithm except in my hand-rolled routines.
Parsley will eventually be LL(k), I think. Or maybe LL(*). It depends on how big the code that gets generated winds up being.
It *did* generate a 1.3MB source file, but unlike the 800k source file I got from ANTLR for parsing C# 6 (which didn't work), you can follow it in a debugger pretty easily. And Parsley actually parses C#, or at least the subset that is useful to me, though I'm confident by now I could pretty easily parse the rest of it too. I just don't have anywhere in the CodeDOM to put what I'd parse! CodeDOM knows nothing of things like anonymous methods or lambda expressions or LINQ.
This is just so cool. I've been wanting to do this for a long time. (Parse C# with one of my parser generators, that is.)
I never thought it would wind up being a recursive descent parser generator that did it. I wrote it that way for a lark, but it worked out really well.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
Yes, error checking isn't usually added until it starts to "work", so it can cause upheaval. But it's so much easier to debug if the parser--and also what is basically an interpreter, for me--can pinpoint where they failed, with a message that makes sense. And keep going, so that you don't have to uncover one problem at a time.
end is the position of the delimiter where the expression ends. It's found by doing a lookahead for the closing punctuation, which is one of ;},)]:= depending on the context. If something goes wrong while parsing the expression, a log containing the expression is generated and the parse resumes after end. I guess that makes it LL(*) if I remember correctly.
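Roughly, that recovery scheme could be sketched like this in Python. This is a toy, not my actual parser. The names `find_end`, `parse_sum` and `parse_statements` are made up, the expression grammar is just numbers joined by `+`, and the bracket-depth tracking in the lookahead is an added assumption so that a closer inside nested brackets isn't mistaken for the end.

```python
OPENERS = {"(", "[", "{"}
NESTED_CLOSERS = {")", "]", "}"}

class ParseError(Exception):
    pass

def find_end(tokens, start, closers):
    """Look ahead from `start` for the first token in `closers` at depth 0."""
    depth = 0
    for i in range(start, len(tokens)):
        tok = tokens[i]
        if depth == 0 and tok in closers:
            return i
        if tok in OPENERS:
            depth += 1
        elif tok in NESTED_CLOSERS and depth > 0:
            depth -= 1
    return len(tokens)  # no delimiter found: the expression runs to the end

def parse_sum(tokens, start, end):
    """Toy expression parser: NUMBER ('+' NUMBER)* over tokens[start:end]."""
    total = 0
    expect_number = True
    for tok in tokens[start:end]:
        if expect_number:
            if not tok.isdigit():
                raise ParseError(f"expected number, got {tok!r}")
            total += int(tok)
        elif tok != "+":
            raise ParseError(f"expected '+', got {tok!r}")
        expect_number = not expect_number
    if expect_number:
        raise ParseError("expression ends where a number was expected")
    return total

def parse_statements(tokens):
    """Parse ';'-terminated expressions, logging failures and carrying on."""
    results, errors = [], []
    pos = 0
    while pos < len(tokens):
        end = find_end(tokens, pos, {";"})
        try:
            results.append(parse_sum(tokens, pos, end))
        except ParseError as exc:
            # Log the whole failed expression, then resume after `end`.
            errors.append(f"{exc} in: {' '.join(tokens[pos:end])}")
        pos = end + 1
    return results, errors
```

Feeding it `"1 + 2 ; 3 + + 4 ; 5 ;".split()` gives two results and one logged error, so a single bad expression doesn't hide whatever problems come after it.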