Duplicate songs detector via audio fingerprinting

4.96/5 (337 votes)
23 Jun 2020 | MIT | 44 min read
Explains a sound fingerprinting algorithm, with a practical example of detecting duplicate files on the user's local drive.
The aim of this article is to show an efficient signal processing algorithm that can serve as the basis for a working sound fingerprinting and signal recognition system. I'll explain how the algorithm works and how it can be implemented in C#. I'll also cover the digital signal processing topics the algorithm relies on, so that you get a clearer picture of the entire system. As a proof of concept, I'll show you how to build a simple WPF MVVM application.
// Sound Fingerprinting framework
// https://code.google.com/p/soundfingerprinting/
// Code license: GNU General Public License v2
// ciumac.sergiu@gmail.com
using System;

namespace SoundfingerprintingLib.NeuralHashing.MMI
{
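    /// <summary>
    ///   Calculates the minimal mutual information between 2 samples
    /// </summary>
    public static class MutualInformation
    {
        /// <summary>
        ///   Compute minimal mutual information between 2 samples.
        ///   Values below 0.1 are read as 0 and values above 0.9 are read as 1.
        /// </summary>
        /// <param name = "samples1">Sample 1</param>
        /// <param name = "samples2">Sample 2 (must have the same length as sample 1)</param>
        /// <returns>Value of minimal mutual information, in nats (natural logarithm)</returns>
        /// <exception cref = "ArgumentException">The two arrays differ in length</exception>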
        public static double Compute(float[] samples1, float[] samples2)
        {
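            if (samples1 == null)
            {
                throw new ArgumentNullException("samples1");
            }
            if (samples2 == null)
            {
                throw new ArgumentNullException("samples2");
            }
            if (samples1.Length != samples2.Length)
            {
                throw new ArgumentException("The length of arrays should be equal");
            }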
            int length = samples1.Length;

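            // Joint counts of the binarized pairs: fXY counts the positions where
            // samples1[k] is read as X and samples2[k] is read as Y.
            float f00 = 0;
            float f01 = 0;
            float f10 = 0;
            float f11 = 0;
            for (int k = 0; k < length; k++)
            {
                if (samples1[k] < 0.1 && samples2[k] < 0.1)
                    f00++;
                else if (samples1[k] > 0.9 && samples2[k] > 0.9)
                    f11++;
                else if (samples1[k] < 0.1 && samples2[k] > 0.9)
                    f01++;
                else
                    // Catch-all: (1, 0) pairs, plus any pair with at least one value in [0.1, 0.9]
                    f10++;
            }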

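            // Replace empty cells with a count of 1 so that no logarithm below is taken of zero.
            // After this adjustment the four counts may sum to more than length.
            if (f00 == 0.0)
                f00++;
            if (f10 == 0.0)
                f10++;
            if (f01 == 0.0)
                f01++;
            if (f11 == 0.0)
                f11++;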

            float pX0Y0 = f00/length;
            float pX0Y1 = f01/length;
            float pX1Y0 = f10/length;
            float pX1Y1 = f11/length;
            float pX0 = pX0Y0 + pX0Y1;
            float pX1 = pX1Y0 + pX1Y1;
            float pY0 = pX0Y0 + pX1Y0;
            float pY1 = pX0Y1 + pX1Y1;

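            // Math.Log is the natural logarithm. The sum is kept in double precision;
            // the original cast to float discarded precision before returning a double.
            double mutualInformation = pX0Y0*Math.Log(pX0Y0/(pX0*pY0)) +
                                       pX0Y1*Math.Log(pX0Y1/(pX0*pY1)) +
                                       pX1Y0*Math.Log(pX1Y0/(pX1*pY0)) +
                                       pX1Y1*Math.Log(pX1Y1/(pX1*pY1));
            return mutualInformation;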
        }
    }
}
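
The arithmetic in `Compute` can be read as the standard mutual information formula for two binary variables. The block below is only a sketch of that reading. The symbols $N$, $f_{ab}$, $p$, $p_X$ and $p_Y$ are notation introduced here for illustration: they correspond to `length`, the counters `f00` to `f11`, the joint probabilities `pX0Y0` to `pX1Y1`, and the marginals `pX0`, `pX1`, `pY0`, `pY1` in the listing. The worked numbers are a made-up example, not data from the article.

```latex
% N = samples1.Length; f_{ab} = number of positions k where the pair
% (samples1[k], samples2[k]) is counted as (a, b), after empty cells are set to 1
\begin{align*}
p(a,b) &= \frac{f_{ab}}{N}, \qquad a, b \in \{0, 1\} \\
p_X(a) &= p(a,0) + p(a,1), \qquad p_Y(b) = p(0,b) + p(1,b) \\
I(X;Y) &= \sum_{a \in \{0,1\}} \sum_{b \in \{0,1\}}
          p(a,b)\,\ln\frac{p(a,b)}{p_X(a)\,p_Y(b)}
\end{align*}

% Hypothetical example: N = 8, f_{00} = f_{11} = 3, f_{01} = f_{10} = 1
\begin{align*}
p(0,0) &= p(1,1) = 0.375, \qquad p(0,1) = p(1,0) = 0.125 \\
p_X(0) &= p_X(1) = p_Y(0) = p_Y(1) = 0.5 \\
I(X;Y) &= 2 \cdot 0.375 \ln\frac{0.375}{0.25} + 2 \cdot 0.125 \ln\frac{0.125}{0.25}
        = 0.75 \ln 1.5 + 0.25 \ln 0.5 \approx 0.131 \text{ nats}
\end{align*}
```

Because the code uses `Math.Log`, the result is in nats rather than bits. When a zero count is bumped to 1, the four probabilities no longer sum exactly to 1, so the returned value is then an approximation of the formula above.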


License

This article, along with any associated source code and files, is licensed under The MIT License


Written By
Software Developer
Moldova (Republic of)
Interested in computer science, math, research, and everything that relates to innovation. Fan of agnostic programming, don't mind developing under any platform/framework if it explores interesting topics. In search of a better programming paradigm.
