Click here to Skip to main content
15,881,600 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
See more:
A is a set of strings defined by alphabet {a, b, c}. Can it be represented as composition of two other sets of strings B and C? A = BC, but A != C and A != B.

Composition means joining each string of set B with each string from set B (which creates set A). For example A = {aba, ca, abb, cb} then answer is positive, because B = {ab, c} and C = {a, b}. Empty string ε is also allowed.

Example input:
List of strings:
aba ca abb cb

Expected output:
true

All I'm asking for is there an official name of this problem so I can look up solution myself. I don't mind getting solution here, but this is extremely hard problem to solve.

What I have tried:

Brute force solution but the deeper I go the more I realize how collosal amount of possibilities I need to cover to create solution that works for all examples.
Posted
Updated 8-Mar-22 2:51am

I thought the term for this was the product, not the composition, of two sets. But I don't know what the reverse operation is called.

One approach would be to factor |C| (the cardinality of C). In your example, this yields 4x1 or 2x2, so you'd be looking for an |A| and |B| of 4 and 1, or 2 and 2.

But let's say that A contains a and ab, and B contains bc and c. AxB then produces abc twice, so one of them would be dropped. In this case, trying to factor |C| no longer works. So we're back to a more brute force approach.

However, as you use a brute force approach, you're constructing candidate sets A and B. If |A| x |B| > |C|, it must be that AxB produces some number of duplicates, so checking that this is true would probably prune a lot of the searches quickly.

EDIT: Another thing that would help is to consider each string in C individually. ca implies that c must be in A, and a must be in B. cb is similar. These are simple examples, but even a long string must be split this way. This could quickly prune the search when a candidate AxB produces a string that isn't in C.
 
Share this answer
 
v2
Here's an algorithm:

Alphabet: {a,b,c}

Strings: aba ca abb cb

The most complete solution to this is to build a dictionary M, mapping each string onto the letters, that existed in {a,b,c}, which have been occurred once or multiple times. Here, it's not necessary to set the mappings of the duplicate letters, so that the dictionary M will be a rectangular matrix 4 x 3, consisting of the following elements:

a b c

aba : 1, 1, 0
ca : 1, 0, 1
abb : 1, 1, 0
cb : 0, 1, 1


For instance, the string "abb" contains 1 - 'a' and 2 - 'b', such as that the row of matrix M is { 1, 1, 0 }, for which 1 and 1 values are corresponding to letters 'a' and 'b', respectively, and 0 corresponds to the letter 'c', which has not been found in the string "abb".

Next, take the inner product of the matrix M and its transpose, to obtain the symmetric matrix I=MM^T of shape 4 x 4.

I=MM^T (Upper triangular):

0 1 2 0
0 0 1 0
0 0 0 0
0 0 0 0

Finally, check whether at least one element, on the I matrix's diagonal, is equal to 0. If so, the composition does not exist and the result is "FALSE", or "TRUE", unless otherwise.

Here's a code snippet, written in plain C89, illustrating the algorithm, above:

C++
#include <vector>
#include <fstream>
#include <iostream>

int main()
{
    const char* filename = "tZwciVLk.txt";

    std::ifstream ifs(filename, \
        std::ifstream::ios_base::in);

    std::vector<std::string> strings;

    char* line = (char*)new char[256];

    while (ifs.getline(line, 256))
        strings.push_back(line);

    std::string chars = "abc";

    if (strings.size() == 1) {

        std::size_t i = 0; bool f = false;
        const char* string = \
            strings[strings.size() - 1].c_str();
        f = strcmp("\0", string) == 0;
        while (i < strlen(string) && !f)
            f = strchr(chars.data(), string[i++]);

        printf("Output: %s\n", f ? "True" : "False");

        return 0;
    }

    int** M = (int**)malloc(strings.size() * sizeof(int*));
    int** I = (int**)malloc(strings.size() * sizeof(int*));

    std::size_t n = 0;
    for (std::size_t i = 0; i < strings.size(); i++) {
        M[i] = (int*)malloc(chars.size() * sizeof(int));
        const char* s = strings[i].data();
        if (strcmp("\0", s) < 0) {
            for (std::size_t j = 0; j < chars.size(); j++)
                M[n][j] = strchr(s, chars[j]) ? 1 : 0;

            n++;
        }
    }

    bool f = true;
    for (std::size_t i = 0; i < n && f; i++) {
        I[i] = (int*)malloc(strings.size() * sizeof(int));
        memset((void*)I[i], 0x00, strings.size() * sizeof(int));
        for (std::size_t j = 0; j < n && f; j++) {
            for (std::size_t k = 0; k < chars.size(); k++)
                I[i][j] += M[i][k] * M[j][k];

        }

        f = I[i][i] == 0 ? 0 : 1;
    }

    for (std::size_t i = 0; i < n; i++)
         std::cout << strings[i] << " ";

    std::cout << std::endl;

    std::cout << "Output: " << (f == true ? "True" : "False") << std::endl;

    return 0;
}


Outputs:

Alphabet: {a,b,c}
Strings: aba ca abb cb
Output: True

Alphabet: {a,b,c}
Strings: aba ca abb cb d
Output: False

Alphabet: {a,b,c}
Strings: tt yyyy zz dd ppp
Output: False

Alphabet: {a,b,c}
Strings: tt yyyy zz dd ppp
Output: False

Alphabet: {a,b,c}
Strings: xp
Output: False

Alphabet: {a,b,c}
Strings: ax
Output: True

Edit:

Also, this is implementation-specific. You can also find an intersection of the alphabet and each string to do that, something like:

C#
using System.Linq;

char[] chars = {'a','b','c'};
String[] strings = { "aba", "ca", "abb", "cb", "zz" };

bool isComposition(String[] strings, char[] chars) {
    return strings.Select(s => s.Intersect(chars).Any()).ToArray().All(s => s);
}

Console.WriteLine(isComposition(strings, chars));


Good Luck! :)
 
Share this answer
 
v7
Comments
Member 15047625 8-Mar-22 9:22am    
Program crashes for set of length 1 "abc". I can provide you my code (I prefer C# myself).
Arthur V. Ratz 8-Mar-22 10:32am    
Here's a small update to the code, working for 1-string sets. Probably, this one is what you need. :)
Member 15047625 8-Mar-22 10:53am    
Still crashing for me. j at some point becomes 1 and it gets out of range because I has size 1x1 for set of size 1. So it only has available indexes 0 and 0.
Arthur V. Ratz 8-Mar-22 11:12am    
You might want also to try this. See the code fragment, from above.
Member 15047625 8-Mar-22 11:23am    
Now it works. But there is still one thing off. This set of strings: https://pastebin.com/tZwciVLk (first line is empty string of length 0). Returns false, but it should return true. I trust that this algorithm works fine so I'm unsure where is the problem.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900