Click here to Skip to main content
15,886,258 members
Please Sign up or sign in to vote.
1.00/5 (1 vote)
See more:
You are given a list of N words (strings containing only lower case letters of the English alphabet).
We consider two words to be equivalent if they contain the same letters, i.e. we can rearrange the letters of one word in order to obtain the other word.
Compute the size of the largest subset of equivalent words.You should assume the input is quite large (there are about 10^5 letters in total).

Input
The first line contains a single integer value
Each of the following N lines contains a single string, representing one of the words.

Output
The output should contain a single integer representing the size of the largest subset of equivalent words.


Sample Input
6
cats
caller
dogs
cellar
parrots
recall

Sample Output
3

Explanation
caller, cellar and recall are (the only) equivalent words.


What I have tried:

Java
import java.util.Scanner;
import java.io.*;
import java.util.Arrays;
class Main
{
    public static void main(String[] args)
    {
        Scanner sc=new Scanner(System.in);
        int n=sc.nextInt();
        String list[]=new String[n];
        for(int i=0;i<n;i++)
        {
            list[i]=sc.next();
        }
        int ans=0;
        for(int i=0;i<n;i++)
        {
            int count=0;
          for(int i2=i+1;i2<n;i2++)
          {
              if(isAnagram(list[i],list[i2]))
              count++;
            }
            if(count>ans)
            ans=count;
        }
        System.out.println(ans+1);
    }
    public static boolean isAnagram(String s1, String s2){
  if (s1.length() != s2.length()) return false;
 
  char[] c1 = s1.toCharArray();
  char[] c2 = s2.toCharArray();
 
  Arrays.sort(c1);
  Arrays.sort(c2);
 
  for(int i = 0; i < c1.length; i++) {
    if(c1[i] != c2[i]) return false;
  }
 
  return true;
}
}
Posted
Updated 30-May-17 0:32am
v4
Comments
wseng 30-May-17 5:55am    
Please define TLE.
Member 13230582 30-May-17 6:06am    
Time Limit Extended!

1 solution

The way I'd do it is in several passes:
The first pass goes through each word and adds up all the characters to form an integer value, storing these in an array (or a value with each word in a class would be better).
This forms a crude hash value where anagrams have the same value (because addition is commutative - 1 + 2 + 3 is the same as 3 + 1 + 2). This is crude, because "equal value" does not mean "equal letters" (1 + 2 + 6 is equal to 1 + 3 + 5) but it's quick, and it should discard a significant number of your words that do not have anagram matches.
The next pass sorts the words so that equal hash values are together.
The final pass looks at each set of equal hash words. If the current longest count is greater than the set count, discard it because even if they are all anagrams, it's not what we are looking for. Otherwise, sort each word (so in your example "caller", "cellar", and "recall" all become "acellr") and group them again based on identical strings. If the group count exceeds the current longest, that's the new current longest.

The reason for this is that hash creation and comparison is very quick - even if it isn't totally accurate - but sorting is an expensive operation, so you want to minimise the number of sorts you need to do.

Make sense?
 
Share this answer
 
v2
Comments
Member 13230582 30-May-17 7:06am    
I don't understand the final pass, bud
Member 13230582 30-May-17 7:07am    
i would be higly obliged if you could programmatically solve it!
OriginalGriff 30-May-17 7:13am    
No, that's not going to happen: we aren't here to do your work for you, even if it is your homework - especially if it's your homework.
What part of it doesn't make sense? You have groups of words with the same hash value. You need to check which words in that group are anagrams, but there may be two or more different word sets in there. So you sort them, so the letters in each word are all in the same order, and compare. This tells you how large the biggest sub group is.
Member 13230582 30-May-17 7:24am    
Still a TLE! One has to solve it in 1 sec! This one is taking 3 secs for the largest input which no one knows!
OriginalGriff 30-May-17 7:29am    
So start timing it and finding out where the time is being taken.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900