|
So many lonely people, in an overpopulated world.
Your own choice. It doesn't take much to reach out.
Bastard Programmer from Hell
If you can't read my code, try converting it here[^]
"If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
|
|
|
|
|
Quote: It doesn't take much to reach out.
Now why would I want to do that?
|
|
|
|
|
|
Frequently when I generate code, I require a LOT of state. This, I've found, is the nature of code generation. You sometimes have to keep track of over a dozen things at once.
I also tend to find that it lends itself to procedural-style coding rather than OO coding: not the generated code, mind you, but the process of generating it.
The problem is, this results in an anti-pattern wherein I'm constantly passing at least 6 parameters per method.
Now, I could keep state in a struct and pass that around but the trouble is, those 6 (or more) different parameters - what they are - varies wildly depending on what I'm calling.
Again, the problem is the amount and variation of state I must work with at any given time.
Creating a bunch of types (classes or structs) just to hold it increases maintenance.
I don't have much of a problem reading the code. It's just the pattern crops up and I don't like it.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
Have you tried the obvious?
Have one "state" class that aggregates the entire state for the current parsing tree. Each method gets a pointer to this state, and extracts (or modifies) the necessary stuff from (in) it.
Admittedly, this is one level up from putting everything in global variables, but if your state is used globally, it should be available globally.
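A minimal sketch of that idea, in C++ like the parser snippet further down the thread. Every name here (`GenContext`, `EmitLine`, `EmitClass`) is hypothetical and not taken from anyone's actual code base; it only shows methods taking one pointer to a shared state object instead of a long parameter list.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical aggregate of generator state; the fields are illustrative only.
struct GenContext
{
    std::ostringstream out;   // generated code accumulates here
    std::size_t indent = 0;   // current nesting depth
    std::string className;    // name of the class being generated
};

// Writes one line at the current indentation (4 spaces per level).
void EmitLine(GenContext* ctx, const std::string& text)
{
    assert(ctx != nullptr);
    ctx->out << std::string(ctx->indent * 4, ' ') << text << '\n';
}

// Emits an empty class shell, reading and modifying state through the context.
void EmitClass(GenContext* ctx)
{
    EmitLine(ctx, "class " + ctx->className);
    EmitLine(ctx, "{");
    ++ctx->indent;
    EmitLine(ctx, "// members go here");
    --ctx->indent;
    EmitLine(ctx, "}");
}
```

Whether this helps depends on how much of the state really is shared; state that exists for only one call would still be better off as a plain argument.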
Freedom is the freedom to say that two plus two make four. If that is granted, all else follows.
-- 6079 Smith W.
|
|
|
|
|
Unfortunately, a lot of it is not used globally, but manufactured inside loop bodies which then delegate to methods.
I could store whole arrays in there, but I've already processed that information, and I'd need to process it again, so that's code in two places.
I do actually use the pattern you talk about elsewhere in my project, to good effect. I just don't think it will be effective here. That's not to knock your suggestion in general. It's just that you haven't seen this code, so you're flying blind, as it were, in this case.
|
|
|
|
|
It's hard to answer this without looking at your code, and I haven't used C# and the other things that you write about.
I also don't know what you mean by procedural vs OO. OO still has procedures, but ideally they're small, with many being private or protected. If you're saying that your code looks more like a C free-for-all than C++, my guess is that you haven't yet found a division of responsibilities that yields good encapsulation. And it might just be that there isn't one.
I also don't know what the "input" to your code generator is. When I parse C++ to do static analysis, I also "execute" it using operand and operator stacks, which can emit a sort of stack machine pseudo-code to verify that the code was properly understood. My guess is that it wouldn't be difficult, although lots of work, to turn this into (inefficient) machine code. But this isn't at the same level of abstraction as generating class definitions, like you are.
You say that you can read your code easily, so it doesn't much matter if you think you'd still be able to pick it up again, with modest effort, in a year. If this is the first time you're doing code generation, be patient. Soon you'll probably have an epiphany about how it should be structured, presenting you with the painful choice between leaving it alone and doing a big refactoring!
|
|
|
|
|
Greg Utas wrote: my guess is that you haven't yet found a division of responsibilities that yields good encapsulation. And it might just be that there isn't one.
Yes to this. I'd divide the labor more, but again, the variation and amount of state involved in the generation process makes that so cumbersome as to be more trouble than it solves.
Greg Utas wrote: When I parse C++ to do static analysis, I also "execute" it using operand and operator stacks, which can emit a sort of stack machine pseudo-code to verify that the code was properly understood.
Slang does something very similar. It has to go and turn a parse tree into an abstract syntax tree of code using type resolution. That requires some level of evaluation. For example, I have a routine called GetTypeOfExpression(expr) that lets you retrieve the type of any expression you get, including method return values, and such.
Because of metadata I don't have to evaluate as much as you do. I just have to find types. All of that is nicely encapsulated in CodeDomResolver and CodeDomBinder. Still, it makes Slang take awhile to process.
But I use those ASTs when I go to generate code. I modify them, and pick parts out of them (for maintainability reasons elsewhere; this allows me to "template" parts of the code).
And I also use things like parse tables.
I've been generating code for decades. It's kind of my thing. The code I generate has come from more and more complex sources.
It's why I'm generating a backtracking parser with syntax directed actions and semantic constraints in a language independent manner right now.
It's very challenging. For both me and my CPU!
|
|
|
|
|
Another way I confirm that the C++ code was properly understood is to regenerate it. The comments have been stripped out, and class members aren't in their original order, but the functions are there (with the type for each auto variable, to make sure it got that right). The code is regenerated by one or more Display methods on each object in the parse tree, which turned out to be much easier than I had anticipated. Just in case this helps...
|
|
|
|
|
I'm doing the exact same thing in Slang to the same effect, with reordered members and everything. I even resolve var (the C# equivalent of auto) to the appropriate types.
|
|
|
|
|
Adding to that, right now my show-stopping problem is my inability to factor a grammar for C#.
I'm running into problems with ambiguities between a cast and a parenthesized expression, and no matter how I apply semantic constraints I can't seem to resolve the conflicts. Plus it says my grammar is directly left recursive and I can't figure out why it would be.
|
|
|
|
|
Too theoretical! I just wrote the parser without using any tools. C++ probably doesn't even have a proper grammar!
But I might get theoretical too, if I was trying to support multiple languages.
Parentheses were also one of my problems:
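bool Parser::HandleParentheses(ExprPtr& expr)
{
   //  A '(' right after a qualified name: try to parse it as a
   //  function call's argument list.
   auto back = expr->Back();
   if((back != nullptr) && (back->Type() == Cxx::QualName))
   {
      TokenPtr call;
      if(GetArgList(call))
      {
         expr->AddItem(call);
         return true;
      }
   }

   //  Otherwise try a cast and, failing that, treat the parentheses
   //  as grouping for precedence.
   if(GetCast(expr)) return true;
   return GetPrecedence(expr);
}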
|
|
|
|
|
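I already do that in my hand-written parser:
CodeExpression expr = null;
// Speculatively try to parse a cast on a lookahead copy of the parse context.
var pc2 = pc.GetLookAhead();
pc2.EnsureStarted();
try
{
    expr = _ParseCast(pc2);
}
catch { } // any failure here just means it isn't a cast
if (null != expr)
{
    // It is a cast, so parse it again on the real context.
    return _ParseCast(pc);
}
else
{
    // Otherwise treat it as a parenthesized subexpression.
    try
    {
        if (!pc.Advance())
            _Error("Unterminated cast or subexpression", pc.Current);
        expr = _ParseExpression(pc);
        _SkipComments(pc);
        if (ST.rparen != pc.SymbolId)
            _Error("Invalid cast or subexpression", pc.Current);
        pc.Advance();
        return expr;
    }
    catch (Exception)
    {
        throw; // rethrow without resetting the stack trace
    }
}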
Ironically that TODO: is already handled by my generated parser (it counts tokens and dumps lookahead) but the above works.
|
|
|
|
|
Have you tried using local functions?
Just an idea!
|
|
|
|
|
There's an idea. I have not. I haven't really messed with them yet, though I wonder if I should use the newest language features or not. I like to stay a generation or two back with my codebase so it can compile for more people.
|
|
|
|
|
Mm... I like to mess around with new features as soon as they come out... so that I get comfy with them!
And, in the particular case of local functions, I like them because I don't like to have a method that is used by only one other method. Over time and refactoring, its origin and purpose might get obscured and funny things might start to happen... like dead code, or people reusing what should have been a private method, which then gets changed and breaks everything...
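A rough illustration of why this might also help with the parameter-passing complaint, sketched in C++, where the nearest analog of a C# local function is a lambda. `GenerateSwitch` and `emit` are made-up names for this sketch. The helper lives inside its only caller and captures that caller's locals, so none of them have to be passed as parameters.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical generator routine: builds a switch statement over some labels.
std::string GenerateSwitch(const std::string& subject,
                           const std::vector<std::string>& labels)
{
    std::string out;
    std::size_t indent = 0;

    // Local helper: visible only here, and it captures `out` and `indent`
    // by reference instead of receiving them as arguments.
    auto emit = [&](const std::string& text)
    {
        out += std::string(indent * 4, ' ') + text + '\n';
    };

    emit("switch (" + subject + ")");
    emit("{");
    ++indent;
    for (const auto& label : labels)
        emit("case " + label + ": break;");
    --indent;
    emit("}");
    return out;
}
```

The trade-off is that captured state is implicit, so a long enclosing function with many such helpers can become hard to follow in its own way.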
|
|
|
|
|
Yeah, I can see that. For now I just distinguish by putting private in front of my one-caller private methods. I leave the modifier off for other private methods.
The exception is with Slang. I explicitly qualify members of types with access modifiers in the C# code I feed it, because VB demands it.
|
|
|
|
|
One state one class, especially in case code is generated
It does not solve my Problem, but it answers my question
modified 19-Jan-21 21:04pm.
|
|
|
|
|
One word: XML
Or, ya know, globals.
|
|
|
|
|
Don't know the code-base or exactly what you're doing but why not make data objects? Yes more classes is more maintenance, but it also expands the vocabulary of your code and if they're just data holders there's not much to maintain. If the data is commonly used together then it makes sense.
|
|
|
|
|
The data isn't really commonly used together. There's a lot of state, but what state I need at any given time varies wildly, and some very expensive/elaborate state is sometimes used purely to satisfy a single operation and then tossed (like follow sets).
It's the nature of generating this parser code.
|
|
|
|
|
You should try creating larger and larger tuples to pass back and forth. Be sure never to name any of the components of the tuples, so that your code becomes more and more unreadable:
(int, string, double, double, DateTime, bool) Generate(int X, (int, string, bool) Parameters)
{
....
}
...and so on. This way, you will be sure to develop schizophrenia even more quickly.
|
|
|
|
|
LOL, I do use exactly one tuple in my codebase, but it has named elements.
|
|
|
|
|
Hi there. Take a minute, breathe and rethink.
There may be nothing at all wrong with what you are doing.
Procedural style can be the correct solution for a given situation. I once had a new programmer come to me and tell me he wanted to rewrite a program piece because it wasn't OO. Cool, go for it. While some of the resultant code was really quite good, a lot of it was un-needed and difficult to read.
Depending on the application, you sometimes start out thinking that a class is going to be the best thing on earth. Later, you find that it never gets reused, will be a lot shorter and clearer in procedural style, so why do it?
Passing 6 or more parameters may be just fine, but the question is whether you have overridden, expanded, or extended the method beyond its scope, and whether multiple methods would be more concise. It may be a place where global vars are indeed correct. If properties are only there for 'set and get', you need to reconsider their existence.
In a certain sense, structures really are only a form of global. Their use is generally just clarity: tree.species, tree.color, etc. when they are not being acted upon to where a class is more appropriate. Passing it to a method(s) is probably a class function.
It boils down to architecture. What is the scope and scalability (i.e. future) of what you are writing? In a desktop business app with little or no interaction other than a database (still need all the db handlers obviously), 'down and dirty procedural code' may be just dandy in many cases. Of course, it may not work, as well. But we really can't tell that without an architect/analyst view of the project and entire codebase.
edit: Then there's waterfall vs agile vs continuous evolution. Sometimes you know everything you need up front. Small part of my world. My normal world is "the more you give them, the more they want" - a continuously evolving large codebase. Continual change means refactoring and reviewing that you are still writing effectively instead of creating a nice bowl of spaghetti. Anyone can make spaghetti but nicely layered lasagna is more work!
modified 28-Dec-19 10:05am.
|
|
|
|
|
I understand what you mean. This is my favorite reply, as you were very thorough.
This is despite the fact that I've evaluated a lot of what you wrote and it's covered ground for me: I used to be a software architect by trade after I was strictly a developer. Still, I appreciate the time and effort, and so, I'm sure, do people visiting the thread. Thank you for this.
I have run into some maintenance issues, but not insurmountable. I don't like how the code will look to me in a month, and so some refactoring is in order, but I haven't figured out which way to go with it.
A lot of times when I need ideas, I tend to go my own way, having received inspiration from others nonetheless. Asking helps "unblock" me, so don't worry about not being able to address my code without seeing it. I appreciate the input.
I think this is still best procedural, but maybe some of the arguments I pass could be encapsulated as structs or even classes. The other option is to go full bore and make a complete and sensible model out of the whole thing but this requires a lot of code I'll never use, despite being one of my go to techniques. For example I developed a document object model for my regular expressions even though I didn't absolutely need it. It was useful in some cases though. I don't believe that to be true of this particular bit of the code generation process.
Either that, or I haven't landed on the Right Way(TM) to do this and that's why I'm running into anti-patterns. Maybe there's a concise model I just haven't figured out.
I'm great at modeling though. It's one reason I became an architect: the ability to think in several levels of abstraction and model complicated systems even in my head. However, I'd be a fool to think I was the smartest person in any given room, so it's always possible someone more clever than me came up with A Better Way(TM).
|
|
|
|
|