Click here to Skip to main content
Rate this: bad
good
Please Sign up or sign in to vote.
See more: C# .NET strings
Hi, I'm trying out a sample problem on strings wherein I take a string as input and find out the similarity of each of its suffix with itself.
 
For two strings A and B, the similarity of the strings is the length of the longest prefix common to both strings. For example, the similarity of strings “abc” and “abd” is 2, while the similarity of strings “aaa” and “aaab” is 3.
 
For example, if the input string is ababaa. Then its suffixes would be ababaa, babaa, abaa, baa, aa, a. The corresponding similarities would be 6, 0, 3, 0, 1, 1. These are added and printed on the console.
 
Now in my problem, I'm taking input as follows. First line is the number of strings whose sum of similarities (with each of it's suffixes) have to be printed. In the next lines, I enter that many number of strings. I have written code for the problem and it is running fine for normal inputs. But when i give a very large string, it is giving the following runtime exception message "Could not allocate 8192 bytes". My code is as follows
 
 
using System;
using System.Collections.Generic;
using System.IO;
class Solution {
    static List<int> listOfSimilarities = new List<int>();
    static void Main(String[] args) {
        
        int t = 0;
        
        t = Convert.ToInt32(Console.ReadLine());
        
        if( t >= 1 && t <=10)
        {
            
            List<string> listOfStrings = new List<string>();
            string input = string.Empty;
            
            for(int i = 1 ; i <= t ; i++)
            {
                
                input = Console.ReadLine().ToString();
                if(input.Length < 100000)
                {
                    listOfStrings.Add(input);
                }
            }
            
            foreach(string str in listOfStrings)
            {
                List<string> listOfSuffixes = new List<string>();
                listOfSuffixes = GetSuffixes(str);
                
                List<int> listOfSimilaritiesForEachSuffix = new List<int>();
                listOfSimilaritiesForEachSuffix = GetSimilaritiesWithString(str, listOfSuffixes);
                
                int sumOfSimilarities = GetSumOfSimilarities(listOfSimilaritiesForEachSuffix);
                
                listOfSimilarities.Add(sumOfSimilarities);
                
            }
            
            foreach(int similarity in listOfSimilarities)
            {
                Console.Write(similarity + "\n");
            }
        }
    }
    
static int GetSumOfSimilarities(List<int> listOfSimilaritiesForEachSuffix)
    {
        int result = 0;
        foreach(int i in listOfSimilaritiesForEachSuffix)
        {
            result += i;
        }
        return result;
    }
    
static List<int> GetSimilaritiesWithString(string str, List<string> listOfSuffixes)
    {
        List<int> listOfSimilaritiesForEachSuffix = new List<int>();
        char[] arrayOfActualString = str.ToCharArray();
        foreach(string suffix in listOfSuffixes)
        {
            int count = 0;
            char[] suffixArray = suffix.ToCharArray();
            for(int i = 0; i < suffixArray.GetLength(0) ; i++)
            {
                if(suffixArray[i] == arrayOfActualString[i])
                {
                    count++;
                }
                else
                {
                    break;
                }
            }
            listOfSimilaritiesForEachSuffix.Add(count);
        }
        return listOfSimilaritiesForEachSuffix;
    }
    
static List<string> GetSuffixes(string str)
    {
        List<string> listOfSuffixes = new List<string>();
        for(int i = 0; i < str.Length ; i++)
        {
            listOfSuffixes.Add(str.Substring(i));
        }
        return listOfSuffixes;
    }
}
 
 
Any idea as to why this is happening?
Posted 11-May-13 9:42am
Comments
Zoltán Zörgő at 11-May-13 15:38pm
   
In which code line do you get this exception?
Rate this: bad
good
Please Sign up or sign in to vote.

Solution 1

When I run your code with 99,999 byte strings (I cheated and read it from a file) I get "out of system memory" on the first run through GetSuffixes, so start by looking at that - and what do we find?
 
Input string length: 99,999 bytes. "i" is 8,847, so it doesn't get near the end of the string.
 
Look at the code: each time round, you create and store a string which is one character shorter than the previous one. So the strings you create with Substring are:
99999 chars
99998 chars
99997 chars
...
So by the time you get the error, you have directly allocated 845,652,024 characters, each of two bytes (unicode), or 1.6 GB in string space. Add to that the Object overhead for each string, and the List space you are using to store the references, and you are surprised when it starts to have problems?
 
I strongly suggest that you go back a stage or two in your algorithm, and think again about exactly how you should do this.
  Permalink  
Rate this: bad
good
Please Sign up or sign in to vote.

Solution 2

Instead of storing all sub-strings (suffixes) in list object, calculate the sum of similarities for each suffix. I could not figure out your purpose of calculating the sum of similarity. You can look into KMP failure function if that helps.
  Permalink  

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

  Print Answers RSS
0 Sergey Alexandrovich Kryukov 575
1 Kornfeld Eliyahu Peter 409
2 Maciej Los 369
3 DamithSL 196
4 OriginalGriff 188
0 OriginalGriff 6,353
1 DamithSL 4,854
2 Maciej Los 4,476
3 Kornfeld Eliyahu Peter 4,058
4 Sergey Alexandrovich Kryukov 3,917


Advertise | Privacy | Mobile
Web01 | 2.8.141220.1 | Last Updated 12 May 2013
Copyright © CodeProject, 1999-2014
All Rights Reserved. Terms of Service
Layout: fixed | fluid

CodeProject, 503-250 Ferrand Drive Toronto Ontario, M3C 3G8 Canada +1 416-849-8900 x 100