I have a text file.
I want to display all the repeated lines in it and store those repeated lines in a new text file.

How can I do this using C#?
Updated 5-May-21 14:12pm
Comments
syed shanu 27-Mar-14 5:09am
   
Check this link: http://stackoverflow.com/questions/1245500/c-sharp-remove-duplicate-lines-from-text-file

You can see sample code there to remove duplicate lines. Instead of removing them, you can add your own logic to get the duplicate lines.
KUMAR619 27-Mar-14 5:11am
   
Please help me to get repeated lines
Give me some code
syed shanu 27-Mar-14 5:13am
   
You can see it on that website. They read the file line by line and compare each line with the previous one. If the previous line and the current line are the same, the sample deletes the line; instead of deleting, you can apply your own logic. Try that sample code, and if you are not able to solve it, paste your code here and I will work on it.
Member 15185609 5-May-21 20:13pm
   
Hello, thank you very much for this code, it has helped me a lot. But I have a problem: once I have removed the duplicates from a text file, how do I store the clean lines in another file or in a variable? I hope for an answer, thanks.

1 solution

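The easy way is to use Linq:
C#
// Requires: using System.IO; using System.Linq;
string[] lines = File.ReadAllLines(path);
// Keep one copy of each line that occurs more than once.
lines = lines.GroupBy(x => x).Where(g => g.Count() > 1).Select(g => g.Key).ToArray();
File.WriteAllLines(newPath, lines);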




OP:
string[] lines = File.ReadAllLines(path);
lines = lines.GroupBy(x => "Books").Where(g => g.Count() > 1).Select(g => g.Key).ToArray();
File.WriteAllLines(newPath, lines);

Was that right?

No, it's not right!
Instead of making changes at random (which could take us both all day!) let's look at it logically:

C#
lines.GroupBy(x => x).Where(g => g.Count() > 1).Select(g => g.Key).ToArray();

lines is a collection (lists and arrays are both examples of collections).
So we can use a Linq method to "collect them together" so identical lines are "grouped":
C#
lines.GroupBy(x => x)
The x => x is a lambda which uses the whole line itself as the data to group by.
We can then select only those groups where the number of lines in the group (i.e. the number of lines that are identical) is greater than one, so just the duplicates:
C#
lines.GroupBy(x => x).Where(g => g.Count() > 1)

Then we select the whole text from the group:
C#
lines.GroupBy(x => x).Where(g => g.Count() > 1).Select(g => g.Key);
(The Key to the Group is the value you grouped by)
And convert it into an array:
C#
lines.GroupBy(x => x).Where(g => g.Count() > 1).Select(g => g.Key).ToArray();
So we can put it back into the lines array.
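To see the steps above working together, here is a minimal sketch that runs the same query on an in-memory array instead of a file. The class name DuplicateDemo, the method name FindDuplicates, and the sample lines are made up for illustration. The last part shows what happens when you group by a constant such as "Books": every line lands in one group, so the only "duplicate" reported is the key "Books" itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateDemo
{
    // Returns one copy of every line that occurs more than once.
    public static string[] FindDuplicates(IEnumerable<string> lines)
    {
        return lines
            .GroupBy(x => x)              // identical lines end up in the same group
            .Where(g => g.Count() > 1)    // keep only groups with two or more members
            .Select(g => g.Key)           // the key is the line text itself
            .ToArray();
    }

    public static void Main()
    {
        // Hypothetical sample data standing in for File.ReadAllLines(path).
        string[] lines = { "apple", "Books on C#", "apple", "pear", "Books on C#", "apple" };

        // Prints "apple" and then "Books on C#" (groups come out in order of first appearance).
        foreach (string dup in FindDuplicates(lines))
        {
            Console.WriteLine(dup);
        }

        // Grouping by a constant puts all six lines into a single group keyed "Books".
        string[] constantKey = lines.GroupBy(x => "Books")
                                    .Where(g => g.Count() > 1)
                                    .Select(g => g.Key)
                                    .ToArray();

        // Prints "Books": the constant key, not any line from the data.
        Console.WriteLine(string.Join(",", constantKey));
    }
}
```

Note that "pear" is not printed, because its group has only one member.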

Now, if you had a string like:
C#
string s = "hello, this line is about books with hard back covers.";

How would you find out if it contained your search word?
   
Comments
KUMAR619 27-Mar-14 5:38am
   
But it was not added to the new text file.

What should I do, sir?
OriginalGriff 27-Mar-14 5:47am
   
Um...it is if you set a value in the newPath variable...Does the file get created?
KUMAR619 27-Mar-14 5:56am
   
Thanks it worked for me

Now I want to search the repeated lines for a particular string.
Example:

sampleString="Books";

If any repeated line contains this string, I need to write it to a new file.
OriginalGriff 27-Mar-14 5:59am
   
So what have you tried?
(You actually have 95% of the code you need already - so it's a pretty simple change if you understand what I gave you!)
KUMAR619 27-Mar-14 6:04am
   
string[] lines = File.ReadAllLines(path);
lines = lines.GroupBy(x => "Books").Where(g => g.Count() > 1).Select(g => g.Key).ToArray();
File.WriteAllLines(newPath, lines);

Was that right?
KUMAR619 27-Mar-14 6:20am
   
How do I do this, sir?
OriginalGriff 27-Mar-14 6:44am
   
Answer updated.
KUMAR619 27-Mar-14 6:50am
   
if (filelines[tempCurrentLine].Contains("Books"))
{
    lstLines.Add(filelines[tempCurrentLine]);
}

Was that right?
OriginalGriff 27-Mar-14 7:03am
   
Basically, yes - you use String.Contains.
So why didn't you put that in your version? :laugh:

So we want to use string.Contains - but where...
Can you tell? I'll give you a clue: which one of the Linq methods we used here works on a boolean true/false value?
KUMAR619 27-Mar-14 7:23am
   
Yes, I've completed the task.
1. I first found the lines with the string using the string.Contains method.

2. Then I wrote them to a separate text file.

3. Then I used your Linq query to find the repeated lines.

4. Then I wrote a new file for the repeated lines.

That gave me my output.
But is there a simpler way to do this?
OriginalGriff 27-Mar-14 7:41am
   
Yes.
Look at my original code, and change:
Where(g => g.Count() > 1)
To
Where(g => g.Count() > 1 && g.Key.Contains(sampleString))
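Put together, the change above might look like the following sketch. The class and method names (RepeatedLineFilter, FindRepeatedContaining, WriteRepeatedContaining) and the parameter names are hypothetical, chosen only for this example; the query itself is the one from the answer with the extra Contains condition. Keep in mind that string.Contains is case-sensitive, so "Books" will not match "books".

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class RepeatedLineFilter
{
    // Returns one copy of each line that occurs more than once AND contains searchText.
    public static string[] FindRepeatedContaining(IEnumerable<string> lines, string searchText)
    {
        return lines
            .GroupBy(x => x)                                            // group identical lines
            .Where(g => g.Count() > 1 && g.Key.Contains(searchText))   // repeated and matching
            .Select(g => g.Key)                                         // back to the line text
            .ToArray();
    }

    // Hypothetical helper: reads inputPath and writes the matching repeated lines to outputPath.
    public static void WriteRepeatedContaining(string inputPath, string outputPath, string searchText)
    {
        string[] lines = File.ReadAllLines(inputPath);
        File.WriteAllLines(outputPath, FindRepeatedContaining(lines, searchText));
    }
}
```

With this in place, a call such as RepeatedLineFilter.WriteRepeatedContaining(path, newPath, sampleString) would do the whole job in one pass, without the intermediate text file.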
KUMAR619 27-Mar-14 8:03am
   
Thanks Sir
OriginalGriff 27-Mar-14 8:11am
   
You're welcome!
Member 10285969 1-Mar-21 13:39pm
   
Great! Just what I have been searching for.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



