
Find Duplicate Files and Delete Them

1 Sep 2012, CPOL, 2 min read
Application for searching for similar files in different folders.

Introduction 

I originally created this application to find duplicate songs on my computer, and with it I found many songs duplicated across different folders. It can be used to search for any kind of similar files, though. In this article we will see how to search for similar files in a specified folder. Following readers' suggestions, I have also updated the tool so that it can match files either by file length or by file hash. Matching by length is fast, but two different files can happen to have the same length, so it may occasionally group files that are not really duplicates. Matching by hash makes that kind of false grouping practically impossible, but it takes more time than comparing lengths because every file has to be read in full.

Using the Code

We have used C# and WPF to develop the application. My background is in Windows Forms, so my WPF code may not be optimal. Suggestions are welcome.

First of all, we need to find all the files in the specified folder. For this I have created a recursive function that collects the files from the folder and all of its subfolders. The method below takes a DirectoryInfo object and returns all the files contained in that directory as a List<FileAttrib>. If searchExtension is "*", every file is included; otherwise only files with that extension are included.

C#
/// <summary>
/// Method to get all the files from given folder and subfolders.
/// </summary>
/// <param name="dinfo">DirectoryInfo object for the directory.</param>
/// <returns>Gives list of FileAttrib object with all files</returns>
private List<FileAttrib> GetFiles(DirectoryInfo dinfo)
{
    List<FileAttrib> files = new List<FileAttrib>();

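    if (this.searchExtension == "*")
    {
        // "*" means no filter: take every file in this directory.
        files.AddRange(dinfo.GetFiles().Select(s => this.ConvertFileInfo(s)));
    }
    else
    {
        // Case-insensitive match on the extension, e.g. "mp3" also matches ".MP3".
        string wantedExtension = "." + this.searchExtension;
        files.AddRange(dinfo.GetFiles()
            .Where(g => string.Equals(g.Extension, wantedExtension, StringComparison.OrdinalIgnoreCase))
            .Select(s => this.ConvertFileInfo(s)));
    }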

    foreach (var directory in dinfo.GetDirectories())
    {
        files.AddRange(this.GetFiles(directory));
    }
    
    return files;
}
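
A recursive walk like this can stop with an exception such as UnauthorizedAccessException when it reaches a folder the current user is not allowed to read. The following is only a minimal sketch of one way to skip such folders, not the code shipped with the tool. The names SafeFileWalker and EnumerateFilesSafe are hypothetical, and it returns plain FileInfo objects rather than FileAttrib.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class SafeFileWalker
{
    // Hypothetical helper: yields every file under 'root' (subfolders included)
    // and silently skips directories that cannot be read.
    public static IEnumerable<FileInfo> EnumerateFilesSafe(DirectoryInfo root, string extension = "*")
    {
        var pending = new Stack<DirectoryInfo>();
        pending.Push(root);

        while (pending.Count > 0)
        {
            DirectoryInfo current = pending.Pop();
            FileInfo[] files;
            DirectoryInfo[] subDirectories;

            try
            {
                files = current.GetFiles();
                subDirectories = current.GetDirectories();
            }
            catch (UnauthorizedAccessException)
            {
                continue; // no permission: skip this folder
            }
            catch (IOException)
            {
                continue; // folder disappeared or could not be read
            }

            foreach (FileInfo file in files)
            {
                // "*" means no filter; otherwise compare extensions ignoring case.
                if (extension == "*" ||
                    string.Equals(file.Extension, "." + extension, StringComparison.OrdinalIgnoreCase))
                {
                    yield return file;
                }
            }

            foreach (DirectoryInfo directory in subDirectories)
            {
                pending.Push(directory);
            }
        }
    }
}
```

An explicit stack is used instead of recursion so that one unreadable folder only skips that folder, and the rest of the tree is still visited.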

We have created a class, FileAttrib, to store the information we need from each FileInfo object. Here is the code of the FileAttrib class.

C#
public class FileAttrib
{
    public string fileName { get; set; }
    public string filePath { get; set; }
    public string fileImpression { get; set; }     
    public double fileLength { get; set; }
} 

We store only the file name, the file path, the file length and the file impression (the hash that identifies the file's content). The function below converts a FileInfo object into a FileAttrib object. The impression is computed only when hash search is selected (isHashSearch); otherwise it is left null.

C#
/// <summary>
/// Convertor for converting FileInfo object into FileAttrib
/// </summary>
/// <param name="finfo">Original FileInfo object</param>
/// <returns>Generated FileAttrib object</returns>
private FileAttrib ConvertFileInfo(FileInfo finfo)
{
    return new FileAttrib
    {
        fileName = finfo.Name,
        filePath = finfo.FullName,
        fileImpression = isHashSearch ? this.FileToMD5Hash(finfo.FullName) : null,
        fileLength = finfo.Length
    };
}

ConvertFileInfo calls the FileToMD5Hash function to generate the file impression from the given file location. Note that, despite its name, the function computes a SHA-256 hash, not an MD5 hash. Below is the code for generating the file impression.

C#
/// <summary>
/// Function to get file impression in form of string from a file location.
/// Despite the method name, the hash algorithm used is SHA-256, not MD5.
/// </summary>
/// <param name="_fileName">File Path to get file impression.</param>
/// <returns>Hash of the file content as a hexadecimal string</returns>
private string FileToMD5Hash(string _fileName)
{
    // Both the stream and the hash algorithm are disposed when the block ends.
    using (var stream = new BufferedStream(File.OpenRead(_fileName), 1200000))
    using (var sha = SHA256.Create())
    {
        byte[] checksum = sha.ComputeHash(stream);
        return BitConverter.ToString(checksum).Replace("-", string.Empty);
    }
}
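
To illustrate why a hash is a stronger identity than a length, here is a small sketch that compares two files by length first and by SHA-256 hash second. It follows the same idea as FileToMD5Hash above, but the names HashDemo, Sha256Hex and LooksIdentical are hypothetical and are not part of the tool.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class HashDemo
{
    // Streams the file through SHA-256 and returns the hash as an uppercase hex string.
    public static string Sha256Hex(string path)
    {
        using (var stream = File.OpenRead(path))
        using (var sha = SHA256.Create())
        {
            return BitConverter.ToString(sha.ComputeHash(stream)).Replace("-", string.Empty);
        }
    }

    // Returns true only when both files have the same length AND the same hash.
    public static bool LooksIdentical(string pathA, string pathB)
    {
        // Cheap check first: files of different length can never be identical.
        if (new FileInfo(pathA).Length != new FileInfo(pathB).Length)
        {
            return false;
        }

        return Sha256Hex(pathA) == Sha256Hex(pathB);
    }
}
```

Two files containing "abcd" and "abce" have the same length, so a length-only comparison would group them, while LooksIdentical returns false because their hashes differ.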

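Here is the LINQ query that groups the results. It groups the files by length, keeps only the groups that contain more than one file, and flattens those groups back into a single list. The same pattern may also help you in other projects.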

C#
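e.Result = this.GetFiles(dinfo)
    .GroupBy(i => i.fileLength)     // files with the same length end up in one group
    .Where(g => g.Count() > 1)      // keep only groups with at least two files
    .SelectMany(list => list)       // flatten the groups back into one sequence
    .ToList();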
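
Since hashing is the slow part, one possible refinement is to hash only those files whose length is shared with at least one other file. The following is a sketch of that idea under stated assumptions, not the tool's actual code: FileEntry is a hypothetical, simplified stand-in for FileAttrib, DuplicateFinder is a hypothetical name, and the hash function is passed in as a delegate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for FileAttrib.
public class FileEntry
{
    public string FullPath { get; set; }
    public long Length { get; set; }
}

public static class DuplicateFinder
{
    // Groups files by length first, hashes only the files whose length is shared,
    // and returns the groups of paths that have both equal length and equal hash.
    public static List<List<string>> FindDuplicates(
        IEnumerable<FileEntry> files, Func<string, string> hashOf)
    {
        return files
            .GroupBy(f => f.Length)
            .Where(sameLength => sameLength.Count() > 1)       // a unique length cannot be a duplicate
            .SelectMany(sameLength => sameLength
                .GroupBy(f => hashOf(f.FullPath))              // hash only the remaining candidates
                .Where(sameHash => sameHash.Count() > 1)
                .Select(sameHash => sameHash.Select(f => f.FullPath).ToList()))
            .ToList();
    }
}
```

With this arrangement a file whose length is unique in the folder is never opened for hashing, which can save time on large collections.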

The rest of the code is simple to understand. This article covers only the C# code; I have not explained the WPF parts, as I am not an expert in WPF. All your suggestions, complaints and appreciation are welcome. I hope this application will be helpful to you.

Step 1: Select the folder in which to search for similar files.

Image 1

Step 2: Select the folder for searching similar files.

Image 2

Step 3: Choose whether to search by file length or by file hash (more time consuming).

Image 3

Step 4: Click the Start button to search; similar files are shown together in groups.

Image 4

Step 5: Delete the selected files using the context menu.

Image 5
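
Step 5 deletes whatever you select by hand. If you wanted to preselect files automatically, one possible rule is to keep the first file of each duplicate group and mark the rest. The sketch below assumes that rule; DeletionPlanner and PickDeletionCandidates are hypothetical names, this is not part of the tool, and the code does not delete anything by itself.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeletionPlanner
{
    // Given groups of duplicate file paths, keeps the first path of each group
    // and returns every other path as a deletion candidate.
    public static List<string> PickDeletionCandidates(IEnumerable<IEnumerable<string>> duplicateGroups)
    {
        return duplicateGroups
            .SelectMany(paths => paths.Skip(1))
            .ToList();
    }
}
```

Each returned path could then be passed to File.Delete, ideally only after the user has confirmed the list.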

I will keep updating this utility in the future. Keep posting your suggestions and queries. You can also follow this project on CodePlex @ FindDuplicateFile_CodePlex

History  

31 Aug 2012 - Added Multiple File delete functionality

19 Jun 2012 - Added Search option through file length

25 May 2012 - Initial Release

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior)
India

Comments and Discussions

 
[My vote of 1] My 1 for not keeping the code up-to-date
Andreas Gieriet, 20-Jun-12 13:19

Re: [My vote of 1] My 1 for not keeping the code up-to-date
AmitGajjar, 20-Jun-12 18:09
I update my code on a daily basis, sometimes even twice a day, so changing the article every day is not good practice: people would see the same article on the front page every day.

So I update the attachment once a week.

I hope you don't mind getting the code from CodePlex. You can find the link in one of the comments.

Thanks
-amit