|
I'd rather have an empty check wasting time (and we're talking nanoseconds here, hardly ever a big deal) than suddenly have AlwaysDoThis only execute if the if statement is true, which is an actual bug that users are going to see and feel.
Of course, also commenting out the if statement is a simple solution, but maybe only after that bug has gone to production.
Now, if you have automated tests with sufficient coverage and testing processes to find these bugs before they're released, the damage is nothing more than wasted time and effort.
Experience learns that tests aren't always sufficient and that code like that does find its way into production.
In fact, according to this blog[^] a single line if statement caused a bug in Apple's TLS code!
So, I'd rather always use braces and minimize the risk of bugs in production, which is far more serious than one or two extra lines of code
|
|
|
|
|
here's why I don like braces everywhere
}
}
}
}
}
}
++i;
}
}
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
Your code shouldn't be so nested in the first place.
I get it, it happens.
How I solve such code is as follows:
void ProcessThing(Thing thingy)
{
try
{
ProcessFoos(thingy.Foos);
}
catch (Exception ex)
{
}
}
void ProcessFoos(List<Foo> foos)
{
foreach (var foo in foos)
{
ProcessBars(foo.Bars);
}
}
void ProcessBars(List<Bar> bars)
{
foreach (var bar in bars)
{
ProcessStuff(bar.Stuff);
}
}
You get the idea, same for if statements and switches.
It's not even that bad, it's quite readable, it's maintainable (you'll always know where the bars are processed) and it saves you from nesting so many levels in one code block.
|
|
|
|
|
That doesn't work when you're "lalring" - there is simply too much state to pass between the methods, and a bunch of methods with lots of ref parameters doesn't do anyone any favors.
Factoring is great when it works, but it's not the solution to everything.
Here's what I'm up against. This is the problem, broken down for mortals:
LALR(1) Parsing[^]
I based all of my code off of this. Scary, I know. But I verified that it works using Gold Parser's tools.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
I can't read and understand all of that, show me some code instead!
Oh... But it'll be C++, right?
|
|
|
|
|
Here's a private method that computes the LR Item sets (one of the first steps at the link). In C#
it may as well be in C++. All LRItem is is basically a "tuple" (string Left, IList<string> Right, int RightIndex) except it implements value semantics (they can be compared)
static void _FillLRClosureInPlace(CfgDocument cfg,IProgress<CfgLalr1Progress> progress, ICollection<LRItem> result)
{
var done = false;
while (!done)
{
done = true;
var l = result.ToArray();
for (var i = 0; i < l.Length; ++i)
{
if(null!=progress)
progress.Report(new CfgLalr1Progress(CfgLalr1Status.ComputingClosure, i));
var item = l[i];
var next = item.RightIndex < item.Rule.Right.Count ? item.Rule.Right[item.RightIndex] : null;
if (item.RightIndex < item.Rule.Right.Count)
{
if (cfg.IsNonTerminal(next))
{
for (int jc = cfg.Rules.Count, j = 0; j < jc; ++j)
{
var r = cfg.Rules[j];
if (r.Left == next)
{
var lri = new LRItem(r, 0);
if (!result.Contains(lri))
{
done = false;
result.Add(lri);
}
}
}
}
}
}
}
}
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
modified 16-Aug-19 17:24pm.
|
|
|
|
|
I have a few questions:
1. Why use ++i rather than i++? i++ is far more common so I had to stop and think about what you were doing (it seems in this context ++i and i++ are the same).
2. Do you favor a for loop over foreach because of performance?
3. Why do you take an ICollection as parameter and immediately call ToArray? Most of the time, the implementation is an IList as well, and in that case you wouldn't need ToArray because it also supports indexing.
So without knowing anything about the code I've rewritten it a bit.
I couldn't test it, but I think it does the same as the original code, but with less nesting.
I've also got rid of the single line if-statement without braces by using progress?.Report .
And I've introduced a new single line if-statement by getting done = true; out of the if.
If this were my code I'd probably go even further by putting the remaining code inside the for loop in it's own method too.
Anyway, it's all a matter of style, I know plenty of people who'd prefer a single method, and maybe these extra method calls slow down your code to much, but here's how I'd handle it anyway
The comments in the code are specifically just for you
static void _FillLRClosureInPlace(CfgDocument cfg, IProgress<CfgLalr1Progress> progress, ICollection<LRItem> result)
{
var done = false;
while (!done)
{
done = true;
var l = result.ToArray();
for (var i = 0; i < l.Length; i++)
{
progress?.Report(new CfgLalr1Progress(CfgLalr1Status.ComputingClosure, i));
var item = l[i];
var next = item.RightIndex < item.Rule.Right.Count ? item.Rule.Right[item.RightIndex] : null;
if (item.RightIndex < item.Rule.Right.Count && cfg.IsNonTerminal(next))
{
done = HandleRules(cfg, next, result);
}
}
}
}
private static bool HandleRules(CfgDocument cfg, object next, ICollection<LRItem> result)
{
bool done = true;
for (int j = 0; j < cfg.Rules.Count; j++)
{
var r = cfg.Rules[j];
done = HandleRule(r, next, result);
}
return done;
}
private static bool HandleRule(Rule r, object next, ICollection<LRItem> result)
{
bool done = true;
if (r.Left == next)
{
var lri = new LRItem(r, 0);
done = result.Contains(lri);
if (!done)
{
result.Add(lri);
}
}
return done;
}
|
|
|
|
|
1. They are not the same. One is prefix and the other is postfix for a couple of reasons. But it *would* be the same here, sort of. Except in C++, i++ creates a copy of I, whereas ++i is just the simple asm "inc ax" basically. i++ isn't quite as efficient.
I code in C++, so this is habit. In fact I want to call this ++C You'll find others with C++ backgrounds sometimes use ++i too.
2. Yes I really like foreach, and I use it a lot. But not if I don't have to. Frankly, if it didn't create an object on the heap every foreach, I'd probably use it almost everywhere.
3. Because look carefully. In that loop I'm modifying the passed in collection. I work only on a copy. You cannot modify a collection you are currently enumerating. Hence I must make a copy, enumerate that and work on result. Also note that's in a loop. I create this copy each time around, because I must.
And a note on your code.
for(var i = 0;i<list.Count;++i); is somewhat less efficient than
for(int ic=list.Count,i=0;i<ic;++i); because you're calling Count on every iteration when you don't have to.
alse HandleRules is unnecessary. Literally all you factored was the loop itself.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
modified 17-Aug-19 9:26am.
|
|
|
|
|
honey the codewitch wrote: Because look carefully. In that loop I'm modifying the passed in collection. The passed in collection isn't the Rules, right?
You're using the Count of rules, but you're adding to result.
In any case, I'd either use Count directly, like I did, or create the variable out of the for statement for readability.
I didn't know i++ created a copy where ++i doesn't (I know pre- and postfix, and that they're more efficient than i += 1, but not their inner workings).
That said, I always use i += 1 because the i++ is something for C(++)'s
honey the codewitch wrote: Yes I really like foreach, and I use it a lot. But not if I don't have to. I use foreach always, except when I can't (because I need that index)
C++ vs. C# I guess.
honey the codewitch wrote: alse HandleRules is unnecessary Except to get out of that nasty seven times nesting
It also reads a bit better, in my opinion, as it's clear that we're handling the rules in the for loop (which isn't directly clear in the original).
One level of nesting could easily be eliminated by taking two if's and combining them with &&.
honey the codewitch wrote: Literally all you factored was the loop itself. Well, what did you expect, that was the code I was given.
To me, it's more readable with less nesting.
I wish I could do something about the while loop, but that would require a lot more in-depth knowledge on what you're doing.
Personally, I (almost) never need while loops, so I'm pretty sure you could somehow do without.
You're dealing with difficult logic in a difficult language.
You were complaining about the many nestings that you couldn't change because of the many states you keep, and for that I've given you one solution in this particular method.
Whether you like the solution and if it meets your (performance) requirements is up to you
|
|
|
|
|
What I meant by HandleRules is unnecessary is it would make more sense to combine handlerules and handlerule into one method.
I need the while loop because I don't know how many times in advance I will be iterating.
The break condition can't be known in advance. I have to keep iterating until there are no more unique LRItem lists to add, and you don't know that point until you hit it.
This is exactly what a while loop was invented for. If you haven't needed one you haven't been iterating over things with no fixed/knowable ending condition. There's a math for this. "While" is a necessary iteration construct, short of using goto instead. "for/foreach" *isn't* - they can be done by a "while", if that makes sense.
As far as copying the array, I'm not copying an array of rules there. I'm copying the item sets.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
I'm really not sure what you mean.
You don't have a break condition, and in any case even with HandleRule and HandleRules you're looping as often as when you combine them...
The done variable is always true and doesn't get set until the most inner loop, so it's safe to declare that in those methods and set it to true by default, then set it and return it to the caller.
It is possible that done is overwritten if the loop isn't at an end yet, but that's possible in the original code too.
Usually, my logic applies to all items in a list so I rarely need a while.
My logic is usually business logic though, and very few business have while logic.
while (form.Color == Colors.Yellow)...
No, foreach (Form form in yellowForms)
Maybe because my apps don't need those few extra milliseconds so I get to filter my yellow forms beforehand
|
|
|
|
|
done gets set to false in the innermost if it ends up adding a unique itemset to the result.
If it makes it to the end and hasn't added a unique itemset to the result the loop will break.
the done variable is the break condition - or the exit condition if you prefer.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
Yeah, I get that.
But as far as I can tell, it hasn't changed in my code, even with those two methods
|
|
|
|
|
nah, it needs to be changed. You're overwriting done either way, and that won't work. I can fix it, but basically, you still have to keep the done var in the outer code
and then do
if(!HandleRules(cfg,next,result))
done=false;
but yeah basically otherwise your code works
Still if I was going to do that, I'd probably just do a lambda against a foreach over rules instead of making a whole new routine.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
Sounds...inefficient. In an "abusive" sort of way.
|
|
|
|
|
Oh it is. This is what math and computer science doctorate holders with a masochistic streak do to unwind.
It's part of a LALR(1) parser table generation algorithm.
The whole thing is abusive. I feel like I'm gaslighting my CPU whenever I run it, because the damn thing doesn't make any sense and shouldn't work, but it does - expertly.
It is literally the most convoluted, ridiculous and monstrous thing I have ever implemented aside from pure hacks.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
And you've been writing articles about it. Are you saying you're writing these so they serve as warnings to others?
|
|
|
|
|
If you're smart, that's how you'll take them. Get too interested and you'll turn into a parsing gollum like me.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
honey the codewitch wrote: If you're smart, that's how you'll take them.
Genuine LOL.
|
|
|
|
|
ugh, i've covered so much of the Pck project and I've barely even touched the API which is Microsoft big.
It has 4 different document object models in it - with two different kinds of expression trees
It has 2 different parser generators and a lexer generator in it.
It also has transformations to other grammar formats.
And the entire process, from building the grammar to generating the code can be done using the API.
Plus there are runtime parsers.
It's ridiculous. this is huge. i need like, 5 dedicated devs on this.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
|
The DOMs *are* the facade. You should see what they cover up!
Actually the API is easy to use. It's just big.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
honey the codewitch wrote: Actually the API is easy to use. It's just big.
I haven't had a chance to really dive into the project but I was thinking of writing some adapters so this project could be used in one of my parser-oriented projects once I get it finished. So that's good to hear!
|
|
|
|
|
Hit me up if you have questions. I check back at the article periodically but i don't get notified when there are new comments there. The best place to find me is in the lounge, tbh
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
var lex = LexDocument.ReadFrom(filename);
var cfg = CfgDocument.ReadFrom(filename);
var tokenizer = lex.ToTokenizer(cfg.FillSymbols());
var parser = cfg.ToLalr1Parser(tokenizer);
that's not that bad for creating a LALR parser and tokenizer from a PCK spec at runtime
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|