AWK script to find string based on pattern and search for its corresponding rows in column

Question

1.00/5 (2 votes)

See more:

Experts,
Need your support for this awk script. we have only one input file, all these column 1 and column 2 are in same file and have to do lookup for values in one file(column1 and column2) but output we need in another file
Need to grep row whose string contains 9K in column 1 and then grep its corresponding row in column 2. Suppose check for 9K1 then grep A1, check for A1 in column 1 if not, output result as shown in below expected output file
Then again check for 9K1, grep A2, look for A2 in column1, if found grep its corresponding row value B2.
Now check for B2 value in column 1, then grep for C2 instead of A2 as we have already considered previously in lookup. Now look for C2 in column and grep D2 instead of B2

This needs to be checked for all rows in column 1 that contains 9K, as there can be rows with value 9K2, 9K3, 9K4 in column 1 with corresponding data in column2

Input file :

9K1,A1
9K1,A2
9K1,A3
9K1,A4 
9K1,A5 
A2,B2
B2,A2
B2,C2
C2,B2
C2,D2
A5,B5
B5,C5
B5,A5
9K1,A6
A6,B6
B6,A6
B6,C6

Below is required and expected output which is needed(another file) :

9K1,A1
9K1,A2,B2,C2,D2
9K1,A3
9K1,A4
9K1,A5,B5,C5
9K1,A6,B6,C6

Output returned by above code: (which is not as per expectation)

9K1,A3
9K1,A4
9K1,A5
9K1,A6,B6,C6

What I have tried:

awk '
NR==FNR{
    assoc[$1]=$2
    next
}
(FNR!=1)&&($1~/9K/){
    printf "%s,%s", $1,$2
    seen[$1]; seen[$2]
    search=$2
    while((search in assoc) && !(assoc[search] in seen)){
        search=assoc[search]
        printf ",%s", search
        seen[search]
    }
    print ""
    for(var in seen){
         delete seen[var]
    }
}' input.csv input.csv > output.csv

Posted 25-Jul-19 9:12am

as7951

Updated 7-May-23 0:46am

Add a Solution

1 solution

Add a Solution

Add your solution here

Treat my content as plain text, not as HTML

Preview 0

…

Existing Members

Sign in to your account

...or Join us

Download, Vote, Comment, Publish.

Your Email
Password
Forgot your password?

Your Email
This email is in use. Do you need your password?
Optional Password

I have read and agree to the Terms of Service and Privacy Policy
Please subscribe me to the CodeProject newsletters

When answering a question please:

Read the question carefully.
Understand that English isn't everyone's first language so be lenient of bad spelling and grammar.
If a question is poorly phrased then either ask for clarification, ignore it, or edit the question and fix the problem. Insults are not welcome.
Don't tell someone to read the manual. Chances are they have and don't get it. Provide an answer or move on to the next question.

Let's work to help developers, not make them feel stupid.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Member 10601191 · Answer 1 · 2023-05-07T00:46:00

/^9K/ {
        kees[$0];
        next;
}
{
        for ( key in kees )
        {
                #
                # is it part of the key ...
                #
                if ( match( key, $1) )
                {
                        if( match( kees[key],$2) == 0 )
                                kees[key]=kees[key]"," $2
                        break;
                }
                else if ( match( key, $2 ) )
                {
                        if( match( kees[key],$1 ) == 0 )
                                kees[key]=kees[key]"," $1
                        break;
                }

                #
                # not part of the key but a member
                #
                if ( match( kees[key], $1 ) )
                {
                        if( match( kees[key],$2) == 0 )
                                kees[key]=kees[key]"," $2
                }
                else if ( match( kees[key], $2 ) )
                {
                        if( match( kees[key],$1) == 0 )
                                kees[key]=kees[key]"," $1
                }

        }
}
END{
        for ( key in kees )
                print key kees[key]
}

Given your input, a very rough job,
(NB: you need to test/check ... this can be further reduced, I leave that to you.

awk -f s1.awk FS=',' input.csv
9K1,A1
9K1,A2,B2,C2,D2
9K1,A3
9K1,A4
9K1,A5,B5,C5
9K1,A6,B6,C6