|
Normally I'd have to add a lot to these statements, but since you're generating code, I'll just say this:
Use whatever works for you. If VS can't handle the code your generator produces, then use another generator option, or a different generator. Discussing the pros and cons of the available methods is meaningless if your code doesn't compile.
That said, I'm not an expert on parsers, but I wonder whether a state machine is the usual way to implement one. Wouldn't it be straightforward to just implement each rule of the grammar as a function that (recursively) converts a string of tokens into a syntax tree? IIRC, there's typically no need to maintain a global state, and therefore no need to construct a state engine. A state would only be needed for a context-dependent grammar/language. I may have missed it, but is that what you have to deal with - a context-dependent language?
GOTOs are a bit like wire coat hangers: they tend to breed in the darkness, such that where there once were few, eventually there are many, and the program's architecture collapses beneath them. (Fran Poretto)
|
|
|
|
|
For the lexer, a state machine is the typical approach.
The state-machine technique for the parser is to implement LL(k) similarly to how ANTLR implements LL(*).
The reason is that creating an LL(k) parser with your technique requires far too much code.
It's both more efficient and more compact to use a state machine rather than a giant parse table.
There's a research paper on doing it this way somewhere, but I just googled for it and I can't dredge it up again.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
That's not what I expected, but as I said, I'm not an expert. Thanks for the feedback.
GOTOs are a bit like wire coat hangers: they tend to breed in the darkness, such that where there once were few, eventually there are many, and the program's architecture collapses beneath them. (Fran Poretto)
|
|
|
|
|
I'm no expert on parsers, but can't you use some parser generator, such as Yacc - Wikipedia[^]? Just an idea...
GOTOs are a bit like wire coat hangers: they tend to breed in the darkness, such that where there once were few, eventually there are many, and the program's architecture collapses beneath them. (Fran Poretto)
|
|
|
|
|
I could, but I don't see why I'd write a parser generator, throw it away, and then use Yacc, which doesn't even use the same parsing algorithm as the generator I just wrote.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
Time after can change colour (4)
|
|
|
|
|
TINT
"I have no idea what I did, but I'm taking full credit for it." - ThisOldTony
AntiTwitter: @DalekDave is now a follower!
|
|
|
|
|
ya
|
|
|
|
|
There's some obvious cross-platform rich UI skills here currently being wasted on trivial pursuits [^].
What do you think about this approach: Quote: Dear Russell,
Here at CodeProject, so many of us are impressed by your work in Material Design multimedia; our fourteen million++++++ member community can offer you a chance to develop your talents with support ... and friendly challenge. Here you can write your good name and ideas in the sky of technical eternity (good to the last byte).
You can enjoy the bonhomie of the Lounge Parser Discussion Forum, and I guarantee you the mix of esoteric conversations, personal revelations, daily-life minutiae, and heart-warming stories of geeks who love gadgets too much, will not only amuse you, but, stimulate deep reflections on the meaning of it all.
I want to assure you that: while I am just a way-past-use-by-date pedantic oddball, there's so many people here to bond with, learn from, teach, ventilate with, enjoy, celebrate, etc., who can write intelligible English uncluttered by archaic vocabulary. So, please don't let my flourishes and foofaraw put you off.
I hope you will not be offended if I dare to say your talents deserve a wider audience than mere photographers.
best wishes, Bill
«One day it will have to be officially admitted that what we have christened reality is an even greater illusion than the world of dreams.» Salvador Dali
|
|
|
|
|
But but but... I'm a photographer!
|
|
|
|
|
PIEBALDconsult wrote: But but but... I'm a photographer! You are so much more than that to me.
«One day it will have to be officially admitted that what we have christened reality is an even greater illusion than the world of dreams.» Salvador Dali
|
|
|
|
|
Isn't that kinda like asking someone who organises his wardrobe well to become a carpenter?
I wanna be a eunuchs developer! Pass me a bread knife!
|
|
|
|
|
After using them and building several of them, I still prefer building my parsers by hand.
The only reasons I don't are error handling and maintenance, although maintenance with generated parsers is a double-edged sword. For starters, a generator typically produces a lot more code than you would write by hand. Second, with a hand-rolled parser you can parse directly into your final tree. With a generated parser you must first parse into a parse tree before converting *that* to your final tree, so maintenance becomes an issue again because that second phase is usually hand-written. Although, to be fair, by the second phase the data you're working with has already been validated and error-checked for syntax and structure.
I provide macros to help with the second phase, but they don't help that much. It would be nice if I could write a grammar description that could completely shape a tree and create arbitrary objects off of it, to mimic the parse-into-the-final-tree that hand-rolled parsers can do.
I don't like that the generated parser will take my CodeDOM Go Kit source from 500KB to about 1.5-2MB.
But I also don't like how sketchy my hand rolled parser is.
Is the extra 1+MB worth it? I guess that's probably 400-500k compiled based on previous experience. Currently CodeDOM Go Kit is about 250K compiled.
I don't know. It's something to think about. I really want a Better Way(TM) - something that handles that second phase and rolls it into the first.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
honey the codewitch writes: I still prefer building my parsers by hand. I even built lexical code by hand, so no arguments from this 🦕.
With a generated parser you must first parse into a parse tree, before converting that to your final tree... 🤮
I provide macros to help with the second phase... Naughty! I'd deprecate macros with prejudice. No, not even deprecate: they'd just vanish overnight. I've only seen one instance where they were used elegantly: to generate assembly opcodes of all things. Maybe I've led a sheltered life. But I'll confess that I never "got" templates until I read a comment describing them as "macros on steroids".
But I also don't like how sketchy my hand rolled parser is. Would you like to borrow my rubber 🦆, who's a refactoring taskmaster? He promises to be gentle.
Quote: Is the extra 1+MB worth it? I'd bet this is a problem with all non-trivial generated code.
|
|
|
|
|
I'm glad you read this post. I was thinking of you when I wrote it. I enjoy our conversations, as we have similar passions, it seems.
My "Macros" are actually type-checked. After the code gets parsed and dumped into an AST, I go back and resolve the macros, and I do it in a type-safe manner. They resolve to function calls. It isn't a simple preprocessor, so it doesn't have a lot of the traditional drawbacks.
Sure, templates are "macros on steroids" but more importantly perhaps, they're *type checked* "macros".
I love templates in C++. Once you learn Generic Programming(TM) you never go back unless you have to.
By the way, you might want to consider using a generated lexer like flex or lex for your parser.
The reason is speed, not just ease.
If you learn how to use it (the learning curve isn't big; lex is stupid simple, and mostly you write it in C/C++ code), it will do things like shortest-string discovery. It uses these optimizations to skip examining characters in the input stream, speeding up the parse. This is near impossible in a hand-built lexer, because you need a lot of tabular data to do it properly.
It also makes your parser less buggy than hand lexing, in my experience. Evaluating tokens is a hell of a lot less error prone than evaluating individual characters, although I'm only assuming you're doing scannerless/lexerless parsing.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
If you could recommend a link for lex or flex, I'll take a look, though it'd probably be hard to retrofit it at this point. The shortest-string stuff sounds nice. Even worse is having to back up and re-parse an entire identifier. I ran into this today while writing code to support a label that's the target of a goto. Not that they're used much, but goto is on the list of keywords that I never got around to supporting.
My lexer code is very string-oriented, so there aren't distinct lexer and parser phases, which is what I assume you mean by "scannerless/lexerless". It's probably a trade-off between things like the shortest-string optimization you mentioned and the ability to override rules (e.g. to allow punctuation in a name so that a template instance can be referred to by its verbatim name rather than some mangled monstrosity).
|
|
|
|
|
The man pages are actually VERY good. But also Using flex - An Overview of flex, with Examples[^]
That's a very basic tutorial on using flex.
Download Flex[^]
Now, here's something that for some reason nobody really talks about.
In practice, forget those samples. They give you the basics but nothing real-world. Real-world use, though, is even easier.
Have a set of constants for all your symbols. If you intend to parse >=, you need a GREATERTHANOREQUAL/GTE constant. Give it an int value. Do it for all your terminals. It doesn't matter what the values are. You're going to use these to distinguish your tokens.
In your scanner:
%option unicode, codepage:raw, out: outputfile.c
%{
%}
%%
<<EOF>> { return EOS; }
"int" {return INT_TYPE; }
"uint" { return UINT_TYPE; }
(_|[[:IsLetter:]])(_|[[:IsLetterOrDigit:]])* { return IDENT; }
[\r\n\t\v\f ]
[\n]|[^\n] { return ERROR; }
Basically, you just return a constant value for every token you capture. You can do more: whatever code you need in those brackets is fine. I use them for building my skip lists inside those little brace breakouts above where we return consts. Like
{ yyskip(LINE_COMMENT); }
Since I didn't return a value, flex simply "hides" it. It only gives you something back when you return; otherwise it just keeps going. yyskip() is a function I made that would have been included in the "extra code" part above. It adds the current value to my skip list.
Oh, to get access to these values you use int symid = yylex();
and the text matched is in yytext
It's all pretty simple. It's a bit of work defining your tokens and consts up front but hey, it's a lot less work than parsing them yourself.
And it will make this FAST.
I actually generate these files using Parsley
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
One thing I forgot to mention to you - regarding skipping comments.
I learned the hard way: if you're going to skip, you need to manually clear the skip list periodically.
I was trying to clear it after every read, and it doesn't work, because trailing items get lost if you aren't checking *everywhere*.
The way I exposed these things doesn't leave me room to do this. (Partly my fault, and partly a limitation of the lexer generator I'm using - skip lists weren't really in the design.)
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
I used to build databases by hand, and use PostScript and mark-up languages (the vast majority of which were far better suited to the job than XML) to produce documents.
... But then better tools came along, which allowed me to spend my time and energy on doing the work, rather than on setting up the infrastructure for it.
I wanna be a eunuchs developer! Pass me a bread knife!
|
|
|
|
|
There's one main reason I don't switch back to Linux, and that is Visual Studio. MonoDevelop is good, but not great. Linux C++ IDE offerings are bunk even if gcc itself is good. KDevelop had potential way back when, but oh well.
Has anyone ever tried to get Visual Studio (not VSCode!) running in linux in something other than a virtual machine?
I don't want all that hassle of running windows at all if I don't have to.
I just got my computer back from Microsoft. I had to wait for it to install some unplanned updates that rebooted the machine 3 times and took almost 10 minutes. And MS says Windows boot times are fast.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
The last VS version that was better than garbage on Wine was 2008... and even that only got a bronze rating... 2005 got gold...
Maybe here you could find something... .NET tools & editors for Windows, Linux and macOS[^] (I'm using VS Code)
"The only place where Success comes before Work is in the dictionary." Vidal Sassoon, 1928 - 2012
|
|
|
|
|
honey the codewitch wrote: Has anyone ever tried to get Visual Studio (not VSCode!) running in linux in something other than a virtual machine? Little chance of that; there are too many components in there that will fail to install.
honey the codewitch wrote: I had to wait for it to install some unplanned updates that rebooted the machine 3 times and took almost 10 minutes. And MS says windows boot times are fast. An update is not a normal boot. I love my Kubuntu machine, but to be fair, I also love VS on Windows, and it certainly doesn't take 10 minutes for a normal boot.
In Linux you'd be losing more time after each update, since not every package's dependencies will update nicely; you end up going through weird errors and copy/pasting a lot of commands in the console window.
Bastard Programmer from Hell
If you can't read my code, try converting it here[^]
"If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
|
|
|
|
|
Eddy Vluggen wrote: In Linux you'd be losing more time after each update, since not every package's dependencies will update nicely, going through weird errors and copy/pasting a lot of commands in the console window.
You know, for some reason Linux and I just get along. I don't tend to have a lot of dependency problems, and I understand the various package tools pretty well.
It tends to be less cranky for me than Windows overall. Usually if it doesn't work, it's because I did something wrong. Windows won't work if I look at it sideways.
Not that I'm down on windows. I love windows, but I get sick of windows' $*(#@& sometimes and I want a good old ubuntu machine or something.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
honey the codewitch wrote: Usually if it doesn't work, it's because I did something wrong. That's not what most people report, as you already know.
honey the codewitch wrote: I love windows, but I get sick of windows' $*(#@& sometimes and I want a good old ubuntu machine or something. And you can, but Linux is not stable enough for VS.
I'd be surprised if VS came past the installation of .NET; it's not a question of whether it runs, it won't even install.
Bastard Programmer from Hell
If you can't read my code, try converting it here[^]
"If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
|
|
|
|
|
Wouldn't cost you anything to try it with a trial of CodeWeavers. I choose to run it in a W10 VM in Mint. I haven't had any stability problems with many versions of Linux over the past 4 or 5 years, and I've been using it for servers and as my main system for some 10 years or more. It used to be problematic on my old ThinkPad, but it ran OK if I didn't suspend it. That was some 5 years ago. I run a MacBook now, with Fusion. VMs run fine.
I don't hate Windows... I don't hate anything. I am just a contrarian.
Lou
If you can keep your head while those about you are losing theirs, perhaps you don't understand the situation.
|
|
|
|
|