|
I'd have to adapt your strategy because my code is totally hand-rolled. This function would probably have to assemble comments and save them in an adjunct vector of strings. It's invoked from many places and I'm loath to spend time speculatively constructing and passing around strings.
size_t Lexer::NextPos(size_t pos) const
{
   while(pos < size_)
   {
      auto c = source_->at(pos);
      switch(c)
      {
      case SPACE:
      case CRLF:
      case TAB:
         ++pos;
         break;
      case '/':
         if(++pos >= size_) return string::npos;
         switch(source_->at(pos))
         {
         case '/':
            pos = source_->find(CRLF, pos);
            if(pos == string::npos) return pos;
            ++pos;
            break;
         case '*':
            if(++pos >= size_) return string::npos;
            pos = source_->find(COMMENT_END_STR, pos);
            if(pos == string::npos) return string::npos;
            pos += 2;
            break;
         default:
            return --pos;
         }
         break;
      case BACKSLASH:
         if(++pos >= size_) return string::npos;
         if(source_->at(pos) != CRLF) return pos - 1;
         ++pos;
         break;
      default:
         return pos;
      }
   }
   return string::npos;
}
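One way the comment-collecting variant might look, as a rough sketch only: the function name, the free-function form, and the optional `comments` out-parameter are all hypothetical, and plain `' '`, `'\n'`, `'\t'`, `'\\'` and `"*/"` stand in for the SPACE, CRLF, TAB, BACKSLASH and COMMENT_END_STR constants. Passing a null pointer means no strings are built, so callers that don't care about comments pay nothing extra.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical variant of NextPos: returns the position of the next character
// that is not whitespace, a comment, or a line continuation (npos if none).
// If `comments` is non-null, the text of each skipped comment is appended.
std::size_t NextPosCollecting(const std::string& src, std::size_t pos,
                              std::vector<std::string>* comments = nullptr)
{
   const auto npos = std::string::npos;

   while(pos < src.size())
   {
      switch(src[pos])
      {
      case ' ':
      case '\n':
      case '\t':
         ++pos;
         break;
      case '/':
         // Mirrors the original: a '/' as the very last character yields npos.
         if(pos + 1 >= src.size()) return npos;
         if(src[pos + 1] == '/')
         {
            // Line comment: runs up to, but not including, the newline.
            auto end = src.find('\n', pos);
            if(comments != nullptr)
               comments->push_back(
                  src.substr(pos, end == npos ? npos : end - pos));
            if(end == npos) return npos;
            pos = end + 1;
         }
         else if(src[pos + 1] == '*')
         {
            // Block comment: an unterminated one is not recorded.
            auto end = src.find("*/", pos + 2);
            if(end == npos) return npos;
            if(comments != nullptr)
               comments->push_back(src.substr(pos, end + 2 - pos));
            pos = end + 2;
         }
         else
         {
            return pos;  // a lone '/' is a real token
         }
         break;
      case '\\':
         if(pos + 1 >= src.size()) return npos;
         if(src[pos + 1] != '\n') return pos;
         pos += 2;  // skip the line continuation
         break;
      default:
         return pos;
      }
   }
   return npos;
}
```

A caller that wants the comments passes the address of a vector; everyone else calls it with two arguments, exactly as before.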
modified 4-Jan-20 10:04am.
|
Well, yours does work better than mine, since you accounted for line continuation and I forgot about it.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
A relatively recent addition.
|
I still have to figure out how to match multiline tokens with gplex/lex/flex.
Have you ever used those tools? Had any luck with them?
I'm using gplex now, and it works, but it's a bit clunky sometimes. I'd like to write my own scanner generator that supports Unicode, but I need to understand NFA regex first, and I only understand DFA regex.
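For anyone coming from the DFA side, the main difference when running an NFA is that you track a set of current states instead of a single one, and follow epsilon edges without consuming input. Below is a minimal sketch of that simulation, not tied to gplex or any real tool: `Nfa`, `Closure`, `Matches` and `MakeSample` are made-up names, the NFA is hand-built for the pattern a(b|c)*, and a `'\0'` label is assumed to mean "epsilon".

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical NFA representation: edges[s] holds the out-edges of state s.
struct Nfa
{
   struct Edge { char label; std::size_t target; };  // label '\0' = epsilon
   std::vector<std::vector<Edge>> edges;
   std::size_t start = 0;
   std::size_t accept = 0;
};

// Adds every state reachable through epsilon edges alone.
std::set<std::size_t> Closure(const Nfa& nfa, std::set<std::size_t> states)
{
   std::vector<std::size_t> work(states.begin(), states.end());
   while(!work.empty())
   {
      auto s = work.back();
      work.pop_back();
      for(const auto& e : nfa.edges[s])
         if(e.label == '\0' && states.insert(e.target).second)
            work.push_back(e.target);
   }
   return states;
}

// Runs the NFA by tracking the set of states it could be in.
bool Matches(const Nfa& nfa, const std::string& input)
{
   assert(nfa.start < nfa.edges.size());
   auto current = Closure(nfa, {nfa.start});
   for(char c : input)
   {
      if(c == '\0') return false;  // '\0' is reserved for epsilon here
      std::set<std::size_t> next;
      for(auto s : current)
         for(const auto& e : nfa.edges[s])
            if(e.label == c) next.insert(e.target);
      current = Closure(nfa, std::move(next));
      if(current.empty()) return false;  // no state survived this character
   }
   return current.count(nfa.accept) != 0;
}

// Hand-built NFA for a(b|c)*:
//   0 -a-> 1,  1 -eps-> 2,  1 -eps-> 4,
//   2 -b-> 3,  2 -c-> 3,    3 -eps-> 2,  3 -eps-> 4 (accepting)
Nfa MakeSample()
{
   Nfa nfa;
   nfa.edges = {
      {{'a', 1}},
      {{'\0', 2}, {'\0', 4}},
      {{'b', 3}, {'c', 3}},
      {{'\0', 2}, {'\0', 4}},
      {}
   };
   nfa.start = 0;
   nfa.accept = 4;
   return nfa;
}
```

With this sample, `Matches(MakeSample(), "abcb")` is true and `Matches(MakeSample(), "ad")` is false. Converting to a DFA amounts to precomputing those state sets ahead of time instead of building them while matching.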
|
No, I've never used those tools. Usually I start something that grows organically, at which point it's too late to use them. But even if they might help, I'd rather write the code from scratch than search for a tool, read its manual, get it running, and maybe discover, after investing lots of time on it, that it has show-stopping deficiencies. Mostly it's because I hate spending time configuring stuff.
|
Those scanner generators are really easy to use once you know how. The issue I have with building my own tokenizer/scanner by hand is that they get awfully complicated for real-world languages, as you've probably found. I prefer to use regex to define my lexemes/terminal tokens because it means less code that I have to write and debug. Regex is like nothing to me. I'd almost rather write my own scanner generator and then use that than write my own scanner by hand.
I have an unrelated question for you.
I have two options with respect to parsing C#. I can parse in two passes, parsing only as far as types the first time, just to get complete type information so I can parse the rest accurately (I need type information to disambiguate the parse, just like you do to parse C, only worse).
My other option, and what I've done with the hand-rolled parser, is simply to punt: *correcting* the AST after the fact rather than touching the parse tree. Basically, *after* I've finished building my AST out of my parse tree, I go back and correct the AST with type information.
I have this nasty method called "Patch" which visits my entire tree, with type info, looking for bits in the tree it needs to patch. Currently it's slow, but I have some ideas to speed it up.
The first way might be more efficient, but it also might be a dead end. I've never tried it.
Any thoughts? What would you do?
|
This is after a rather large dinner with wine, so I hope it helps.
I don't understand the difference between the parse tree and the AST. I only have one tree, but maybe that's because I'm only doing C++ whereas you're converting one language to another.
Once I have a subtree for something that is "executable" (e.g. an enum, typedef, data declaration or definition, function declaration or definition), I invoke a virtual EnterBlock function on its root, which is like invoking an interpreter. Each node invokes EnterBlock on its descendants, so it proceeds depth first. A QualName (a possibly qualified name) or DataSpec (a QualName tagged with pointers, references, and/or const) implements this by resolving its name based on the current scope. I hope this is what you mean by "getting the type information". It also causes stuff to be pushed onto the operand (types) and operator stacks.
Could this wait until all of the code is parsed? I don't see why not, and I don't see how it would be more or less efficient. In fact, name resolution can occur later if there are errors during the parsing or interpretation. If you run the >check tool on the code, one of the things it does (to clean up #include lists) is to ask each file for all of the things that it uses. Any nodes that have names but that weren't "interpreted" because an error caused them to be skipped will then try to resolve their names.
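A stripped-down illustration of that depth-first EnterBlock idea follows. It is only a sketch under heavy assumptions, not the actual code: `Context`, `Token`, the flat name-to-type map used as the "scope", and the `"<unresolved>"` marker are all invented for the example, and the real QualName handles qualification, which this one ignores.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical interpreter state: a flat scope and an operand (types) stack.
struct Context
{
   std::map<std::string, std::string> scope;  // name -> type
   std::vector<std::string> operands;
};

// Hypothetical base class for tree nodes.
class Token
{
public:
   virtual ~Token() = default;

   // Default behavior: "execute" each child in order, i.e. depth first.
   virtual void EnterBlock(Context& ctx)
   {
      for(auto& child : children_) child->EnterBlock(ctx);
   }

   void Add(std::unique_ptr<Token> child)
   {
      assert(child != nullptr);
      children_.push_back(std::move(child));
   }
private:
   std::vector<std::unique_ptr<Token>> children_;
};

// A name that resolves itself against the current scope when "executed".
class QualName : public Token
{
public:
   explicit QualName(std::string name) : name_(std::move(name)) { }

   void EnterBlock(Context& ctx) override
   {
      auto it = ctx.scope.find(name_);
      resolved_ = (it != ctx.scope.end());
      // Push the resolved type (or a marker) onto the operand stack.
      ctx.operands.push_back(
         resolved_ ? it->second : std::string("<unresolved>"));
   }

   bool Resolved() const { return resolved_; }
private:
   std::string name_;
   bool resolved_ = false;
};
```

Because resolution happens in EnterBlock rather than in the parser, it can run right after a subtree is built or be deferred until the whole file has been parsed; a name that failed to resolve can simply be given another EnterBlock call later.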
|
Greg Utas wrote: Could this wait until all of the code is parsed? I don't see why not
That's what I've been doing. I think I just need to implement a more efficient visit on the tree I'm using.
Maybe I'll make it so it can visit only marked nodes.
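A rough sketch of what "visit only marked nodes" could look like, written in C++ to match the rest of the thread rather than C#. Everything here is hypothetical (`Node`, `Mark`, `PatchMarked`, and the string edit standing in for the real patch); the idea is just that marking a node also flags its ancestors, so the visitor can skip any subtree with nothing to patch.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical AST node with two flags used to prune the patch pass.
struct Node
{
   std::string text;
   bool needsPatch = false;   // this node must be corrected
   bool markedBelow = false;  // this node or a descendant needs patching
   Node* parent = nullptr;
   std::vector<std::unique_ptr<Node>> children;

   Node* Add(std::string t)
   {
      children.push_back(std::make_unique<Node>());
      auto* child = children.back().get();
      child->text = std::move(t);
      child->parent = this;
      return child;
   }
};

// Called while building the tree, whenever a node looks ambiguous.
// Flags the path to the root, stopping once it meets an already flagged node.
void Mark(Node* node)
{
   assert(node != nullptr);
   node->needsPatch = true;
   for(auto* n = node; n != nullptr && !n->markedBelow; n = n->parent)
      n->markedBelow = true;
}

// Visits only flagged subtrees. Returns the number of nodes visited and
// adds the number of nodes actually patched to `patched`.
std::size_t PatchMarked(Node& node, std::size_t& patched)
{
   if(!node.markedBelow) return 0;  // nothing to do anywhere below here

   std::size_t visited = 1;
   if(node.needsPatch)
   {
      node.text += " [patched]";  // stand-in for the real correction
      node.needsPatch = false;
      ++patched;
   }
   for(auto& child : node.children)
      visited += PatchMarked(*child, patched);
   node.markedBelow = false;
   return visited;
}
```

With one marked grandchild in a tree of six nodes, only three are visited: the root, the child on the path, and the marked node itself. The cost is one extra pointer and two flags per node, plus remembering to call `Mark` during construction.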
|
return _Compare(x.Key.Key, y.Key.Key);
|
It's sunny here too. Wall to wall sunshine and blue skies. Nary a cloud.
-15 right now, but it's a dry cold and the sun feels nice on your numb face.
|
Quote: -15 right now
I am glad to be a "Florida Man". It was cold this morning. 55 F. That is: PLUS 55!
|
I thought all the "Florida Mans" were here!
As well as everyone from every other state and walk of life for the holiday craziness in Summit County Colorado.
|
Quote: I thought all the "Florida Mans" were here!
Are you telling me there is more than one? That I am not unique? Rats! Double Rats!
|
Ditto.....
A human being should be able to change a diaper, plan an invasion, butcher a hog, navigate a ship, design a building, write a sonnet, balance accounts, build a wall, set a bone, comfort the dying, take orders, give orders, cooperate, act alone, solve equations, analyze a new problem, pitch manure, program a computer, cook a tasty meal, fight efficiently, die gallantly. Specialization is for insects! - Lazarus Long
|
Living in x, it's grey and cold here. Quite the contrast from 74 and "clear with periodic clouds" in y.Key.Key
But hey, I moved to the East Coast from San Diego years ago to experience actual seasons and weather.
|
Who writes the subtitles on the Code Project Daily News?
They really crack me up.
Example:
Scientists say they've found a way to solve the 'oldest open question in astrophysics' (3 body problem)
OK, now do four
|
Kent Sharkey does.
"I have no idea what I did, but I'm taking full credit for it." - ThisOldTony
AntiTwitter: @DalekDave is now a follower!
|
But does Kent Sharkey really exist? Same with Sean: does he really exist?
Part of me thinks these are aspects of Bob's split personality.
It's much easier to enjoy the favor of both friend and foe, and not give a damn who's who. -- Lon Milo DuQuette
|
I do not.
Chris wrote me back in grad school. If I weren't still running on that PDP-11, I'd thank him.
Sean on the other hand, is great. I think he's on the other rack though.
TTFN - Kent
|
If you ever reach MVP / MVE on CP, you'll know that Sean does indeed exist. And has the most deplorable taste in "clothes".
Mankinis are not a good look, and when he is standing there on your doorstep to deliver the Certificate dressed only in a red mankini and a thin layer of oil, you do begin to wish he perhaps didn't. He is the reason why "Casual Friday" was banned in Toronto after just one week.
|
And the place for that news is The Insider News, just in case you want to thank him personally.
M.D.V.
If something has a solution... Why do we have to worry about?. If it has no solution... For what reason do we have to worry about?
Help me to understand what I'm saying, and I'll explain it better to you
Rating helpful answers is nice, but saying thanks can be even nicer.
|
bleahy48 wrote: They really crack me up.
I think many people here are subscribed to the newsletter for exactly that reason.
Bastard Programmer from Hell
If you can't read my code, try converting it here[^]
"If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
|