|
I have a few questions:
1. Why use ++i rather than i++? i++ is far more common so I had to stop and think about what you were doing (it seems in this context ++i and i++ are the same).
2. Do you favor a for loop over foreach because of performance?
3. Why do you take an ICollection as parameter and immediately call ToArray? Most of the time, the implementation is an IList as well, and in that case you wouldn't need ToArray because it also supports indexing.
So without knowing anything about the code I've rewritten it a bit.
I couldn't test it, but I think it does the same as the original code, but with less nesting.
I've also got rid of the single line if-statement without braces by using progress?.Report .
And I've introduced a new single line if-statement by getting done = true; out of the if.
If this were my code I'd probably go even further by putting the remaining code inside the for loop in it's own method too.
Anyway, it's all a matter of style, I know plenty of people who'd prefer a single method, and maybe these extra method calls slow down your code to much, but here's how I'd handle it anyway
The comments in the code are specifically just for you
static void _FillLRClosureInPlace(CfgDocument cfg, IProgress<CfgLalr1Progress> progress, ICollection<LRItem> result)
{
var done = false;
while (!done)
{
done = true;
var l = result.ToArray();
for (var i = 0; i < l.Length; i++)
{
progress?.Report(new CfgLalr1Progress(CfgLalr1Status.ComputingClosure, i));
var item = l[i];
var next = item.RightIndex < item.Rule.Right.Count ? item.Rule.Right[item.RightIndex] : null;
if (item.RightIndex < item.Rule.Right.Count && cfg.IsNonTerminal(next))
{
done = HandleRules(cfg, next, result);
}
}
}
}
private static bool HandleRules(CfgDocument cfg, object next, ICollection<LRItem> result)
{
bool done = true;
for (int j = 0; j < cfg.Rules.Count; j++)
{
var r = cfg.Rules[j];
done = HandleRule(r, next, result);
}
return done;
}
private static bool HandleRule(Rule r, object next, ICollection<LRItem> result)
{
bool done = true;
if (r.Left == next)
{
var lri = new LRItem(r, 0);
done = result.Contains(lri);
if (!done)
{
result.Add(lri);
}
}
return done;
}
|
|
|
|
|
1. They are not the same. One is prefix and the other is postfix for a couple of reasons. But it *would* be the same here, sort of. Except in C++, i++ creates a copy of I, whereas ++i is just the simple asm "inc ax" basically. i++ isn't quite as efficient.
I code in C++, so this is habit. In fact I want to call this ++C You'll find others with C++ backgrounds sometimes use ++i too.
2. Yes I really like foreach, and I use it a lot. But not if I don't have to. Frankly, if it didn't create an object on the heap every foreach, I'd probably use it almost everywhere.
3. Because look carefully. In that loop I'm modifying the passed in collection. I work only on a copy. You cannot modify a collection you are currently enumerating. Hence I must make a copy, enumerate that and work on result. Also note that's in a loop. I create this copy each time around, because I must.
And a note on your code.
for(var i = 0;i<list.Count;++i); is somewhat less efficient than
for(int ic=list.Count,i=0;i<ic;++i); because you're calling Count on every iteration when you don't have to.
alse HandleRules is unnecessary. Literally all you factored was the loop itself.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
modified 17-Aug-19 9:26am.
|
|
|
|
|
honey the codewitch wrote: Because look carefully. In that loop I'm modifying the passed in collection. The passed in collection isn't the Rules, right?
You're using the Count of rules, but you're adding to result.
In any case, I'd either use Count directly, like I did, or create the variable out of the for statement for readability.
I didn't know i++ created a copy where ++i doesn't (I know pre- and postfix, and that they're more efficient than i += 1, but not their inner workings).
That said, I always use i += 1 because the i++ is something for C(++)'s
honey the codewitch wrote: Yes I really like foreach, and I use it a lot. But not if I don't have to. I use foreach always, except when I can't (because I need that index)
C++ vs. C# I guess.
honey the codewitch wrote: alse HandleRules is unnecessary Except to get out of that nasty seven times nesting
It also reads a bit better, in my opinion, as it's clear that we're handling the rules in the for loop (which isn't directly clear in the original).
One level of nesting could easily be eliminated by taking two if's and combining them with &&.
honey the codewitch wrote: Literally all you factored was the loop itself. Well, what did you expect, that was the code I was given.
To me, it's more readable with less nesting.
I wish I could do something about the while loop, but that would require a lot more in-depth knowledge on what you're doing.
Personally, I (almost) never need while loops, so I'm pretty sure you could somehow do without.
You're dealing with difficult logic in a difficult language.
You were complaining about the many nestings that you couldn't change because of the many states you keep, and for that I've given you one solution in this particular method.
Whether you like the solution and if it meets your (performance) requirements is up to you
|
|
|
|
|
What I meant by HandleRules is unnecessary is it would make more sense to combine handlerules and handlerule into one method.
I need the while loop because I don't know how many times in advance I will be iterating.
The break condition can't be known in advance. I have to keep iterating until there are no more unique LRItem lists to add, and you don't know that point until you hit it.
This is exactly what a while loop was invented for. If you haven't needed one you haven't been iterating over things with no fixed/knowable ending condition. There's a math for this. "While" is a necessary iteration construct, short of using goto instead. "for/foreach" *isn't* - they can be done by a "while", if that makes sense.
As far as copying the array, I'm not copying an array of rules there. I'm copying the item sets.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
I'm really not sure what you mean.
You don't have a break condition, and in any case even with HandleRule and HandleRules you're looping as often as when you combine them...
The done variable is always true and doesn't get set until the most inner loop, so it's safe to declare that in those methods and set it to true by default, then set it and return it to the caller.
It is possible that done is overwritten if the loop isn't at an end yet, but that's possible in the original code too.
Usually, my logic applies to all items in a list so I rarely need a while.
My logic is usually business logic though, and very few business have while logic.
while (form.Color == Colors.Yellow)...
No, foreach (Form form in yellowForms)
Maybe because my apps don't need those few extra milliseconds so I get to filter my yellow forms beforehand
|
|
|
|
|
done gets set to false in the innermost if it ends up adding a unique itemset to the result.
If it makes it to the end and hasn't added a unique itemset to the result the loop will break.
the done variable is the break condition - or the exit condition if you prefer.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
Yeah, I get that.
But as far as I can tell, it hasn't changed in my code, even with those two methods
|
|
|
|
|
nah, it needs to be changed. You're overwriting done either way, and that won't work. I can fix it, but basically, you still have to keep the done var in the outer code
and then do
if(!HandleRules(cfg,next,result))
done=false;
but yeah basically otherwise your code works
Still if I was going to do that, I'd probably just do a lambda against a foreach over rules instead of making a whole new routine.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
Sounds...inefficient. In an "abusive" sort of way.
|
|
|
|
|
Oh it is. This is what math and computer science doctorate holders with a masochistic streak do to unwind.
It's part of a LALR(1) parser table generation algorithm.
The whole thing is abusive. I feel like I'm gaslighting my CPU whenever I run it, because the damn thing doesn't make any sense and shouldn't work, but it does - expertly.
It is literally the most convoluted, ridiculous and monstrous thing I have ever implemented aside from pure hacks.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
And you've been writing articles about it. Are you saying you're writing these so they serve as warnings to others?
|
|
|
|
|
If you're smart, that's how you'll take them. Get too interested and you'll turn into a parsing gollum like me.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
honey the codewitch wrote: If you're smart, that's how you'll take them.
Genuine LOL.
|
|
|
|
|
ugh, i've covered so much of the Pck project and I've barely even touched the API which is Microsoft big.
It has 4 different document object models in it - with two different kinds of expression trees
It has 2 different parser generators and a lexer generator in it.
It also has transformations to other grammar formats.
And the entire process, from building the grammar to generating the code can be done using the API.
Plus there are runtime parsers.
It's ridiculous. this is huge. i need like, 5 dedicated devs on this.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
|
The DOMs *are* the facade. You should see what they cover up!
Actually the API is easy to use. It's just big.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
honey the codewitch wrote: Actually the API is easy to use. It's just big.
I haven't had a chance to really dive into the project but I was thinking of writing some adapters so this project could be used in one of my parser-oriented projects once I get it finished. So that's good to hear!
|
|
|
|
|
Hit me up if you have questions. I check back at the article periodically but i don't get notified when there are new comments there. The best place to find me is in the lounge, tbh
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
var lex = LexDocument.ReadFrom(filename);
var cfg = CfgDocument.ReadFrom(filename);
var tokenizer = lex.ToTokenizer(cfg.FillSymbols());
var parser = cfg.ToLalr1Parser(tokenizer);
that's not that bad for creating a LALR parser and tokenizer from a PCK spec at runtime
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
I've used a dictionary keyed by dictionaries once.
I had to take a shower afterwards.
How's that on a scale?
|
|
|
|
|
that's pretty good. Conceptually i view dictionaries as indexed groups most of the time. In fact, half the time i use dictionaries they have a collection as the TValue member, never mind the key, so they're a fancy collection with an index affixed to it that i use for grouping. In SQL a dictionary is a GROUP BY call.
So I can definitely see using one as a key.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
Hmm. I use this:
Dictionary<Token,Dictionary<Token,Value>> a lot. Now that I look at it, yours is a lot more perverse.
Software Zen: delete this;
|
|
|
|
|
but where's your LALR(1) parse table?
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|
This is the last post I read from you. The fact that you use words or alogrithms that I do not understand plus the fact that they seem so evident to you makes me feel real dumb.
|
|
|
|
|
You think they're evident to me LOL. I understand them about as well as you do. The trick is to practice the ability to code things you don't understand. That way you don't have to be smart.
Duh.
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
|
|
|
|