Click here to Skip to main content
14,331,975 members
Rate this:
Please Sign up or sign in to vote.
See more:
so i have these (as structured data)

test -> expr
expr -> ambig1 ambig2 ambig3
expr -> ambig1 ambig2 ambig4
expr -> ambig1 ambig2
expr2 -> ambig2 ambig3
expr2 -> ambig2

what I need is this

test -> expr

expr -> ambig1 ambig2 ambig3
expr -> ambig1 ambig2 ambig4
expr -> ambig1 ambig2

expr2 -> ambig2 ambig3
expr2 -> ambig2


where each section is in it's own container (array/dictionary whatever)

i have ideas but they're tough to implement and I'm not sure they will work

I already have a GetCommonPrefix method

I'll accept answers in any code, any language - but no LINQ

These data structures can be anything you like. Arrays, strings, whatever. The general algorithm will be the same, even if the particulars are different.

What I have tried:

I've tried several things. None worked. Pasting the code here would take too much space.
Posted
Updated 10-May-19 4:17am
v2
Comments
CHill60 10-May-19 8:48am
   
What do you mean by "structured data"? Are these strings or objects or lists or what?
There's not enough here to even get started on helping you
honey the codewitch 10-May-19 8:49am
   
you can consider them strings for the purpose of the question.

but they are arrays of System.Object - where the object is usually a string. Consider the objects to be comparable with equals

Consider the Left of a -> to be the first element, if that helps

The problem is independent of whether they are arrays or strings, the algorithm will be the same either way.
Rate this:
Please Sign up or sign in to vote.

Solution 2

Paste and run code here:
https://www.onlinegdb.com/online_python_interpreter[^]

I would not call it exactly hard.
lines = [   "test -> expr",
            "expr -> ambig1 ambig2 ambig3",
            "expr -> ambig1 ambig2 ambig4",
            "expr -> ambig1 ambig2",
            "expr2 -> ambig2 ambig3", 
            "expr2 -> ambig2"]

def get_matcher(s):
    return s[:s.find("->")+2]
    
def group_data(lines):
    groups = []
    new_group = []
    matcher = get_matcher(lines[0])
    for line in lines:
        if line.startswith(matcher):
            new_group.append(line)
        else:
            groups.append(new_group)
            new_group = [line]
            matcher = get_matcher(line)
            
    groups.append(new_group)
    return groups    
    
print(group_data(lines))
   
Comments
"mega"-in-a-fake-beard 10-May-19 10:30am
   
The above assumes that data is "ordered" like the data you provided.

If it is unordered eg:
lines = [
"test -> expr",
"expr -> ambig1 ambig2 ambig3",
"test -> ambig1 ambig2 ambig4",

Then it is better to make a dictionary [Python] or map [C++] where the keys are "matcher" and values are "lines". So instead of making a list of lists it will be a dictionary of lists
honey the codewitch 10-May-19 10:34am
   
yeah I'm using a dictionary in my code, but thanks for the heads up anyway. I'm basically adapting what you solved to see if it works. It looks too simple. I'm almost positive the solution will be recursive or at least use nested loops, but maybe I'm wrong. If you're interested, this is part of a larger problem called left factoring LL grammars

https://www.tutorialspoint.com/compiler_design/left_factoring.asp

"mega"-in-a-fake-beard 10-May-19 11:03am
   
Well i would think that IF your data is recursive (i.e. if the "ambig" items contain new lists of "L -> R" expressions) then the code might need to be recursive, or at at least the parser needs to build a recursive data structure.
honey the codewitch 10-May-19 11:39am
   
I don't think I'm explaining myself well. I mean recursive in the same way a sort() method might be. recursive comparisons, not comparisons over a recursive data structure, if that makes sense.

adding, it's possible to make those things recursive.

expr -> term expr

honey the codewitch 10-May-19 11:51am
   
Aaand I solved it. Sure enough, an inner loop was required (could have been done recursively)

It's kind of involved. Your solution didn't quite do it, but it almost worked. I am using a variation of the concept in it anyway
Rate this:
Please Sign up or sign in to vote.

Solution 1

I tend to implement this like

Create dictionary collection to store results with string as key, list of string as value
For each string
    Get prefix
    If dictionary contains prefix as a key
        Retrieve value from dictionary and add result to it
    Else
        Add new item to dictionary with prefix as key and result in value collection
Loop
   
Comments
honey the codewitch 10-May-19 9:08am
   
but how do I know what the prefix is without other items to compare it to?

like in

expr -> ambig2 ambig3

is the prefix expr->ambig2 or expr-> ambig2 ambig3

the trouble is I only know the prefix when compared to other items.
F-ES Sitecore 10-May-19 9:21am
   
In that case it might be better to simply store the data in a tree structure with each item a node. Similar to the above algorithm you'll split the item into its parts, then for each part look for a node in the tree and if it doesn't exist add it, then move down a level for each item. So you'll end up with something like

test
-- expr
expr
-- ambig1
---- ambig2
------ ambig3
------ ambig4
etc

you can store the original data as the value of each node and at the end of the process you'll have a representation of the data nested by prefix.
honey the codewitch 10-May-19 9:28am
   
you might have something there but i have to wrap my head around it. MOAR coffee. =) Thanks.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)




CodeProject, 503-250 Ferrand Drive Toronto Ontario, M3C 3G8 Canada +1 416-849-8900 x 100