Click here to Skip to main content
15,942,449 members
Please Sign up or sign in to vote.
4.00/5 (1 vote)
See more:
I want to find the longest repeating character in a sorted string.
C#
class Program
    {
        static string input = "aaaabbcddddddddddeeffffg";
        // expected: "d"
        static void Main(string[] args)
        {
            char[] c = input.ToCharArray();
            List<string> list = new List<string>();
            string temp = String.Empty;
            for (int i = 1; i < c.Length - 1; i++)
            {
                char curr = c[i];
                char next = c[i + 1];
                temp += curr.ToString();
                if (curr != next)
                {
                    list.Add(temp);
                    temp = next.ToString();
                }
            }

            Console.WriteLine(list.OrderByDescending(n => n.Length).First()[0]);
            Console.WriteLine("Press any key...");
            Console.ReadKey();
        }
    }

My code has two flaws.
1) It missed the last char.
2) It used Linq and stored the blocks to a list, so it may not be efficient. I hope some kind of in-place algorithm.

Thank you.
Posted
Comments
Andreas Gieriet 13-Dec-14 7:02am    
It seems to me a bit of an overkill to go the "bin" approach: for each character throw it into the respective bin. Then sort the bin by amount of characters in each bin.
This approach is adequate if you have an arbitrary string. But you state that the string is already "sorted", so you only have to loop over the string and count plus decide if a new maximum is found while counting. See solution 4 for this approach.
Cheers
Andi

Arrays are zero-based; by starting with i = 1 you skip one element. And by using c.Length - 1, you also skip the last element. You also don't need to store the whole list, you can store only the longest. It's also better to use StringBuilder than string concatenation. And the temp = next.ToString(); line shouldn't be there, because it causes a char to be counted twice.
C#
char[] c = input.ToCharArray();
string longest = "";
StringBuilder sb = new StringBuilder();
for (int i = 0; i < c.Length; i++)
{
    char curr = c[i];
    char next = i != c.Length - 1 ? c[i + 1] : '\0'; // check whether the current char is the last
    sb.Append(curr);
    if (curr != next)
    {
        string temp = sb.ToString();
        if (temp.Length > longest.Length)
        {
            longest = temp; // if the new string with the new chars is longer, replace the longest string
        }
        sb.Clear(); // clear the StringBuilder
    }
}

Console.WriteLine(longest[0]);
Console.WriteLine("Press any key...");
Console.ReadKey();

The above code shows how you can do it with a StringBuilder and without LINQ.

Why StringBuilder? Take a look at this article: http://yoda.arachsys.com/csharp/stringbuilder.html[^]
Quote:
Definitely use StringBuilder when you're concatenating in a non-trivial loop - especially if you don't know for sure (at compile time) how many iterations you'll make through the loop. For example, reading a file a character at a time, building up a string as you go using the += operator is potentially performance suicide.

That is the case here; we don't know how many iterations it'll make, because that depends on the input.
 
Share this answer
 
v2
Comments
Maciej Los 12-Dec-14 13:43pm    
Hawk eye, +5!
Thomas Daniels 12-Dec-14 13:45pm    
Thank you!
[no name] 12-Dec-14 13:47pm    
Thank you. +5!
Thomas Daniels 12-Dec-14 13:51pm    
You're welcome!
[no name] 12-Dec-14 19:12pm    
5 for all this
I'd suggest to read this thread: Counting A Character[^].
I do not see any performance issues by using Linq, unless the set of strings is very, very LARGE.
It's really fast. Below query is executed in 0.001 sec.
C#
string input = "aaaabbcddddddddddeeffffg";

var qry = (from c in input
		group c by c into grp
		orderby grp.Count() descending
		select new
		{
			Character = grp.Key,
			Count= grp.Count()
		}).Take(1);

foreach(var v in qry)
{
	Console.WriteLine("'{0}' is counted {1} time(s)",v.Character,v.Count);
}

Result:
'd' is counted 10 time(s)
 
Share this answer
 
v2
Comments
[no name] 12-Dec-14 19:15pm    
My 5. I see I have a hard way in front of me to stay up to date with this linq things :)
Maciej Los 13-Dec-14 4:02am    
Thank you, Bruno ;)
Andreas Gieriet 13-Dec-14 6:57am    
My 5 too! Your solution and solution 5 are similar with some minor differences.
Cheers
Andi
Maciej Los 13-Dec-14 7:02am    
Thank you, Andi ;)
Here's a solution that just counts the characters and keeps track of the longest sequence seen so far.
C#
class Program
{
  static string input = "aaaabbcddddddddddeeffffg";
  // expected: "d"
  static void Main(string[] args)
  {
    var longest = FindLongestRepeat(input);
    Console.WriteLine("Char {0} Count {1}", longest.Item1, longest.Item2);
    Console.WriteLine("Press any key...");
    Console.ReadKey();
  }

  static Tuple<char, int> FindLongestRepeat(string str)
  {
    if (string.IsNullOrEmpty(str))
      return null;
    char seqChar = str[0];
    int seqCount = 1;
    char longChar = seqChar;
    int longCount = seqCount;
    for (int i = 1; i < str.Length; i++)
    {
      char c = str[i];
      if (c != seqChar)
      {
        seqChar = c;
        seqCount = 1;
      }
      else
      {
        ++seqCount;
        if (c == longChar)
        {
          ++longCount;
        }
        else if (seqCount > longCount)
        {
          longCount = seqCount;
          longChar = seqChar;
        }
      }
    }
    return Tuple.Create(longChar, longCount);
  }
}

In case of a tie, it returns the first.
 
Share this answer
 
I prefer to encapsulate the status and access to it.
The idea behind this: since the string is already grouped or sorted by characters, we can traverse the string from left to right and maintain a current state (which character with how many instances at that point). If the number of character instances at that point is greater than the ones from the maintained max-state, the max-state is also updated (which character and how many instances of it).
C#
class State
{
    public char Char { get; private set; }
    public int Count { get; private set; }
    public void Update(char c, ref State max)
    { 
        if (Char != c) { Char = c; Count = 0; }
        Count++;
        if (Count > max.Count) { max.Char = Char; max.Count = Count; }
    }
};
public static void Main()
{
    State curr = new State(), max = new State();
    foreach (char c in "aaaabbcddddddddddeeffffg") curr.Update(c, ref max);
    Console.WriteLine("max = {0} ({1} x '{2}')"
                     , new string(max.Char, max.Count)
                     , max.Count
                     , max.Char
                     );
}

Results in
max = dddddddddd (10 x 'd')

Cheers
Andi
 
Share this answer
 
v4
Comments
[no name] 12-Dec-14 20:33pm    
I like to vote 5, but I miss some comments for noobs like me in the code :( trust the comments will follow so 5 ;)
Andreas Gieriet 13-Dec-14 6:52am    
Hello Bruno,
I've updated the solution with the description of what the intent of the solution is.
Cheers
Andi
[no name] 13-Dec-14 7:09am    
Thanks. I can not vote more than 5 :(
Bruno
BillWoodruff 12-Dec-14 23:16pm    
Hi Andreas, imho, what you implement by means of the Class 'State is equivalent to creating two 'KeyValuePair instances where the Key is a 'char and the Value is an 'int.

Linq can easily generate a Dictionary:

var dictionary = someString.Distinct().ToDictionary(value => value, key => 0);

Of course, that dictionary will have a KeyValuePair for each unique char in the string, with the value of each KeyValuePair set to #0.

I mention this not to, in any way, criticize your solution.
Andreas Gieriet 13-Dec-14 6:45am    
Hello Bill,
thanks for your feedback!
I was contemplating on the grouping approach too before I wrote this solution.
If the problem was to have an arbitrary string and need to get the number of each character and from these the one with the most instances, then of course I would have done the bin or grouping approach.
Since here we have a string that is already "grouped" (i.e. the OP calls it "sorted"), we simply have to iterate over the string and there is no need to store all characters again. I'm a bit reluctant to store the data in a different structure again without cause. Maybe this is from my background of also programming in small memory systems where you get sooner or later sensitive to redundant storage structures ;-)
So, this solution is just one of many possible ones: it is a cursor that traverses and updates its character count state and decides if it needs updating the max state.
Cheers
Andi
PS: I find criticism brings us all ahead. So, please tell what's nice/interesting/.../not so good/wrong/... :-)
Given the fact that the string is an ordered sequence, this gets very easy[1] to do with Linq:
C#
string test = @"aaaabbcddddddddddeeffffg";

var longestGroup = test.GroupBy(ch => ch).OrderByDescending(grp => grp.Count()).First();
char longestChar = longestGroup.Key;
int longestContinuousCharCount = longestGroup.Count();
[1]"Easy" referring to the length of the code; what's going on under-the-hood with Linq is as complex as any of the other solutions posted here.
 
Share this answer
 
Comments
Maciej Los 13-Dec-14 4:05am    
Another linq solution ;)
+5!
Andreas Gieriet 13-Dec-14 6:54am    
My 5 too! It's interesting to see the various approaches to solve also "trivial" problems. I like that! :-)
Cheers
Andi

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900