I want to remove the singer's voice from audio or video files and keep only the backing audio.

How can I implement this functionality?
Could someone give me some advice?

Thanks a lot.
Posted 21-Feb-11 21:47pm
afgkidy 22-Feb-11 21:57pm
Yes, I have an audio file with a singer over a backing track, and I want to isolate either the singer's voice or the backing track.
afgkidy 23-Feb-11 0:36am
According to answers 1 and 2:
1. Most mixed recordings cannot be separated into the singer's voice and the backing track, but a few can.
My question is: which file formats can be separated?
I searched on Google and other sites and found that CDs are recorded well, so it may be possible with them.
What about other formats, such as .mp3, .wma, .aac, .rm, .rmvb, etc.?
Dylan Morley 23-Feb-11 6:33am
File format doesn't matter. Whether it's wav, mp3, wma or *anything else*, if the file has been mixed down you can't simply grab a certain part out of it, you're left with isolating audio within a frequency range. That's all you can do!

Why not download something like Reason and have a look at some of the demo songs? This will give you an idea of audio mixing, individual tracks, and a final 'mix down'.

afgkidy 23-Feb-11 21:06pm
Now my question is: if I have an audio file (wav, mp3, wma, or any other format), how do we know whether it has been mixed down so that we cannot grab a certain part out of it, or recorded in a way that lets us separate it?
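
For the question above, one rough check is to look at how many channels the file actually stores. The sketch below uses Python's standard `wave` module and a small demo file it writes itself; the helper name `channel_count` and the file name are made up for illustration. A file holding only one or two channels carries at most a mono or stereo mix, with no separate per-instrument tracks to pull out. Compressed formats would need to be decoded first, which is not shown here.

```python
import os
import struct
import tempfile
import wave

def channel_count(path):
    """Return how many channels a WAV file stores."""
    with wave.open(path, "rb") as wav:
        return wav.getnchannels()

# Build a tiny stereo WAV so the sketch runs without any external file.
path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(path, "wb") as wav:
    wav.setnchannels(2)      # left + right
    wav.setsampwidth(2)      # 16-bit samples
    wav.setframerate(44100)
    # Two stereo frames of arbitrary sample values.
    wav.writeframes(struct.pack("<4h", 100, -100, 200, -200))

channels = channel_count(path)
print("channels stored in file:", channels)
```

The channel count says nothing about how the mix was made; it only shows that whatever went into the recording now lives in those few channels.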

Solution 1

I don't think you can do it, at least not in the general case with acceptable quality.
I would be quite thankful if anyone proves me wrong, but I will be extremely surprised if it is possible.

Just one note: masters of scat singing can mimic an instrument amazingly well; even a human can hardly tell the difference. Even if such a voice can be recognized, where should it go: into the backing track (the "minus phonogram") or be filtered out?

I'm updating my explanation based on the comment by Emilio. He is absolutely right.
This task is as well-posed as reversing the direction of time from a thermodynamic standpoint: the previous state does exist in principle, but it cannot be recovered from the information in the later state if entropy has increased (entropy can only increase with time or stay the same).

Suppose the audio was recorded in several separate channels, so the vocal channels are separate. This is how a good recording is made in a good studio. For a consumer product, all channels are then mixed together into one channel or just two stereo channels. This is a really big increase in information entropy. The solution is to restore the recording of each original channel; let's say these recordings still physically exist, so a solution exists and can be validated against them. But an exact solution based only on the already-mixed recording would be work against entropy, which is theoretically impossible.
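
A tiny sketch of why the mix alone does not determine the original tracks, assuming the simplest possible mixdown (sample-by-sample addition). The sample values and the name `mix_down` are invented for illustration. Two different voice/backing pairs produce exactly the same mixed signal, so nothing in the mix says which pair was recorded.

```python
def mix_down(tracks):
    """Sum several equal-length tracks sample by sample into one channel."""
    return [sum(samples) for samples in zip(*tracks)]

# Two different hypothetical studio sessions (made-up sample values).
voice_a, backing_a = [3, -1, 4, 0], [1, 5, -2, 2]
voice_b, backing_b = [0, 2, 1, 1], [4, 2, 1, 1]

mix_a = mix_down([voice_a, backing_a])
mix_b = mix_down([voice_b, backing_b])

# Both sessions yield the same mixed signal even though the voice
# tracks differ, so the mix cannot be inverted without extra information.
same_mix = mix_a == mix_b
different_voice = voice_a != voice_b
print(mix_a, same_mix, different_voice)
```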

Emilio Garavaglia 22-Feb-11 9:01am
"I would be quite thankful if anyone proves me wrong"
That's impossible.
"Unmixing" mixed things reduces the entropy. It cannot be done without messing up something else.
SAKryukov 22-Feb-11 13:42pm
You're absolutely right (I would vote 5+++++ if I could :-).
I added explanation of what you say for those unfamiliar with entropy concept :-)
afgkidy 23-Feb-11 0:09am
SAKryukov 23-Feb-11 0:12am
You're welcome. Sorry it's not so encouraging :-)

Solution 2

If I understand correctly, you have an audio file with a singer over a backing track, and you want to isolate either the singer's voice or the backing track?

You can't really do this to a high quality. Even with professional applications such as Cubase or Logic, it's extremely difficult.

All you can do if you've got a stereo audio mix is isolate certain frequencies and either boost or reduce them.

For example, you would need to determine the singer's vocal range, which should be roughly between 80 Hz and 1100 Hz.

You could then apply a band-pass filter set tightly around that frequency range, which would cut out the audio outside it.

While you do get a result this way, it's not very good. Some of the original harmonics from the rest of the mix always remain after filtering, and the filter itself will affect the audio you've isolated.
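
A minimal sketch of the band-pass idea in plain Python, assuming a single second-order (biquad) filter with the commonly used "audio EQ cookbook" band-pass coefficients. The function names, the 44100 Hz sample rate, and the test tones are assumptions made for illustration, not something taken from the linked page. It feeds in one tone inside the 80 Hz to 1100 Hz range and one far above it, then compares how much of each survives.

```python
import math

def band_pass(samples, sample_rate, low_hz, high_hz):
    """Second-order (biquad) band-pass with unity gain at the centre frequency."""
    centre = math.sqrt(low_hz * high_hz)      # geometric centre of the band
    q = centre / (high_hz - low_hz)           # wide band -> low Q
    w0 = 2.0 * math.pi * centre / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    # Normalised coefficients; the b1 term is zero for this filter shape.
    b0, b2 = alpha / a0, -alpha / a0
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out

def tone(freq_hz, sample_rate, seconds):
    """Generate a sine wave of the given frequency."""
    n = int(sample_rate * seconds)
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def rms(samples):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

RATE = 44100
in_band = tone(300.0, RATE, 0.5)       # inside the assumed vocal range
out_of_band = tone(8000.0, RATE, 0.5)  # well above it
kept = rms(band_pass(in_band, RATE, 80.0, 1100.0)) / rms(in_band)
cut = rms(band_pass(out_of_band, RATE, 80.0, 1100.0)) / rms(out_of_band)
print(f"300 Hz gain: {kept:.2f}, 8000 Hz gain: {cut:.2f}")
```

Note that the filter keeps the 300 Hz tone whether it came from a singer or from a guitar, and the 8000 Hz tone is only reduced, not erased. That is the limitation described above: the filter selects frequencies, not sources.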

Basically, you want to get the eggs back after you've baked the cake!

Linky, found this[^]

That's all they're doing there: setting a couple of band-pass filters to narrow the frequency range.
SAKryukov 22-Feb-11 13:47pm
You're right, my 5.
By mentioning "quality", I did mean that some limited voice removal is possible.
I've added a note about the theoretical impossibility of an exact solution, prompted by a comment by Emilio (much appreciated).
Yusuf 22-Feb-11 14:03pm
Excellent! +5
afgkidy 23-Feb-11 0:09am

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 22 Feb 2011