See more: C++ Video audio
I want to remove the singer's voice from audio or video files and keep only the backing music.
 
How can I implement this functionality?
Could someone give me some advice?
 
Thanks a lot.
Posted 21-Feb-11 22:47pm
afgkidy263
Comments
afgkidy at 22-Feb-11 21:57pm
   
Yes, I have an audio file with a singer over a backing track, and I want to isolate either the singer's voice or the backing track.
afgkidy at 23-Feb-11 0:36am
   
According to answers 1 and 2:
1. Most mixed recordings cannot be separated into the singer's voice and the backing track, but a few can.
My question is: which file formats can be separated?
I searched on Google and other sites and found that CDs are recorded cleanly, so it may be possible with them.
What about other formats, such as .mp3, .wma, .aac, .rm, .rmvb, etc.?
Dylan Morley at 23-Feb-11 6:33am
   
File format doesn't matter. Whether it's wav, mp3, wma or *anything else*, if the file has been mixed down you can't simply grab a certain part out of it; you're left with isolating audio within a frequency range. That's all you can do!
 
Why not download something like Reason (http://www.propellerheads.se/products/reason/) and have a look at some of the demo songs? This will give you an idea about audio mixing, individual tracks and a final 'mix down'.
 
afgkidy at 23-Feb-11 21:06pm
   
Thanks.
Now my question is: given an audio file (wav, mp3, wma or any other format), how do we know whether it has been mixed down, so that we cannot grab a certain part out of it, or whether it still has separate tracks that we can pull apart?

Solution 1

I don't think you can do it, at least not in the general case with acceptable quality.
I would be quite thankful if anyone proves me wrong, but I will be extremely surprised if it is possible.
 
Just one note: masters of scat singing can mimic an instrument amazingly well; even a human can hardly tell the difference. Even if such a voice could be recognized, where should it go: into the backing track, or filtered out with the vocals?
 
[EDIT]
I'm updating my explanation based on the comment by Emilio. He is absolutely right.
This task is ill-posed in the same way that reversing the direction of time is ill-posed from a thermodynamic standpoint: the previous state did exist in principle, but it cannot be recovered from information about the later state if entropy has increased (over time, entropy can only increase or stay the same).
 
Suppose the audio was recorded cleanly on several separate channels, so that the vocal channels are separate. This is how a good recording is made in a good studio. For a consumer product, all channels are then mixed down to one channel or just two stereo channels. This is a really big increase in the entropy of the information. The true solution would be to restore each originally recorded channel; those recordings physically existed at that time, so a solution exists and could be validated against them. But finding the exact solution from the already-mixed recording alone would mean working against entropy, which is theoretically impossible.
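 
To make the information-loss argument concrete, here is a minimal sketch (my own illustration, not code from any audio library). It assumes a mix-down is a plain sample-by-sample sum; the names mixDown and sameMixFromDifferentSessions and the sample values are hypothetical. Two different voice/band pairs produce exactly the same mixed signal, so the mix alone cannot tell you which pair it came from.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mix two mono tracks into one by summing them sample by sample.
// (Assumption: real mixing also applies gain, panning and effects;
// a plain sum is enough to show the loss of information.)
std::vector<int> mixDown(const std::vector<int>& a, const std::vector<int>& b) {
    assert(a.size() == b.size());
    std::vector<int> mix(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        mix[i] = a[i] + b[i];
    }
    return mix;
}

// Two different hypothetical studio sessions that end up as the same mix.
bool sameMixFromDifferentSessions() {
    const std::vector<int> voice1 = {3, -1,  4, 0};
    const std::vector<int> band1  = {1,  5, -2, 2};
    const std::vector<int> voice2 = {0,  2,  1, 1};
    const std::vector<int> band2  = {4,  2,  1, 1};
    // Both pairs sum to {4, 4, 2, 2}.
    return mixDown(voice1, band1) == mixDown(voice2, band2);
}
```

Since sameMixFromDifferentSessions() returns true, no algorithm that sees only the mixed samples can decide which voice track was the real one without extra assumptions about what a voice sounds like.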
 
—SA
Comments
Emilio Garavaglia at 22-Feb-11 9:01am
   
"I would be quite thankful if anyone proves me wrong"
That's impossible.
"Unmixing" mixed things reduces the entropy. It cannot be done without messing up something else.
SAKryukov at 22-Feb-11 13:42pm
   
You're absolutely right (I would vote 5+++++ if I could :-).
I added an explanation of your point for those unfamiliar with the concept of entropy :-)
--SA
afgkidy at 23-Feb-11 0:09am
   
thanks.
SAKryukov at 23-Feb-11 0:12am
   
You're welcome. Sorry it's not so encouraging :-)
--SA

Solution 2

If I understand correctly, you have an audio file with a singer over a backing track and you want to isolate either the singer's voice or the backing track?
 
You can't really do this to a high quality. Even with professional applications such as Cubase or Logic, it's extremely difficult.
 
All you can do if you've got a stereo audio mix is isolate certain frequencies and either boost or reduce them.
 
For example, you would need to determine the singer's vocal range, which should be roughly between 80 Hz and 1100 Hz.
 
You could then apply a band-pass filter set tightly around that frequency range, which would cut out all the other audio.
 
You do get a result this way, but it's not very good. Some of the harmonics present before the filter was applied always remain, and the filter itself will affect the audio you've isolated.
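 
As a rough sketch of the band-pass idea above (not what any particular product does), the following second-order filter uses the widely published RBJ "Audio EQ Cookbook" band-pass coefficients. The class name BandPass, the helper filteredRms, the 44100 Hz sample rate, and the way the centre frequency and Q are derived from the 80 Hz and 1100 Hz band edges are all my own assumptions for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Second-order (biquad) band-pass filter, RBJ cookbook form with
// 0 dB gain at the centre frequency.
class BandPass {
public:
    BandPass(double sampleRate, double lowHz, double highHz) {
        assert(lowHz > 0.0 && lowHz < highHz && highHz < sampleRate / 2.0);
        const double pi = 3.14159265358979323846;
        const double centerHz = std::sqrt(lowHz * highHz);  // geometric centre of the band
        const double q = centerHz / (highHz - lowHz);       // Q = centre / bandwidth (approximate)
        const double w0 = 2.0 * pi * centerHz / sampleRate;
        const double alpha = std::sin(w0) / (2.0 * q);
        const double a0 = 1.0 + alpha;
        b0_ = alpha / a0;                   // b1 is zero for this filter type
        b2_ = -alpha / a0;
        a1_ = -2.0 * std::cos(w0) / a0;
        a2_ = (1.0 - alpha) / a0;
    }

    // Filter one sample (direct form I difference equation).
    double process(double x) {
        const double y = b0_ * x + b2_ * x2_ - a1_ * y1_ - a2_ * y2_;
        x2_ = x1_; x1_ = x;
        y2_ = y1_; y1_ = y;
        return y;
    }

private:
    double b0_, b2_, a1_, a2_;
    double x1_ = 0.0, x2_ = 0.0, y1_ = 0.0, y2_ = 0.0;
};

// RMS level of one second of a unit-amplitude sine tone after it has
// passed through an 80..1100 Hz band-pass at 44100 Hz.
double filteredRms(double toneHz) {
    const double pi = 3.14159265358979323846;
    const double sampleRate = 44100.0;
    BandPass filter(sampleRate, 80.0, 1100.0);
    const std::size_t n = 44100;
    double sumSquares = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double x = std::sin(2.0 * pi * toneHz * static_cast<double>(i) / sampleRate);
        const double y = filter.process(x);
        sumSquares += y * y;
    }
    return std::sqrt(sumSquares / static_cast<double>(n));
}
```

A 300 Hz tone inside the band comes through at close to its original level, while an 8000 Hz tone is strongly attenuated. The catch is the one described above: guitars, keyboards and drums also have energy between 80 Hz and 1100 Hz, so they pass through along with the voice, and a single second-order stage has gentle slopes, so what lies outside the band is only reduced, not removed.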
 
Basically, you want to get the eggs back after you've baked the cake!
 

Here's a link I found:
 
http://www.musicmorpher.com/free-tutorials/voice-extractor.htm
 
That's all they're doing there: setting a couple of band-pass filters to narrow the frequency range.
Comments
SAKryukov at 22-Feb-11 13:47pm
   
You're right, my 5.
By mentioning "quality", I did mean that some limited voice-removal effect is possible.
I've added a note about the theoretical impossibility of an exact solution, prompted by Emilio's comment (I appreciate it very much).
--SA
Yusuf at 22-Feb-11 14:03pm
   
Excellent! +5
afgkidy at 23-Feb-11 0:09am
   
thanks.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 22 Feb 2011
Copyright © CodeProject, 1999-2014