I may be totally wrong, but it seems that there should be a fairly good piece of software that can "listen" to a WAV or MP3 of a song, and transcribe (polyphonically) the various parts into staves of notes for that instrument.
Anyone know of such an application? Anyone aware of one being developed using machine learning?
You want software that can hear a song and transcribe each instrument, telling two different guitars apart, or a guitar from a bass, and write it all out in the right key? The key is determined by the average of the notes played and is not always evident, and the notation depends on it.
Easy in a three-piece band perhaps, but in an orchestra?
I think it's probably only working with videos so they can match the music to the motion of the instrument to some extent. There are also different tonal qualities, of course. But I wouldn't want to try differentiating between the first violin and the second.
And how do you determine the key? There are only 12 notes. Working out the key is very difficult.
Not really, as long as it's a "standard" key without accidentals. And, score notation-wise, there isn't a difference between C Maj and A Min, etc.
I'm no musician but have a basic grasp. By your logic, it must therefore also be "impossible" for anyone to listen to a piece of music and transcribe it. I do know there are several tools (I've used some) that will take normal written music and re-write it in a different key. (The interface to do this was pretty complicated though, requiring two keys pressed simultaneously: "shift" and either "up" or "down", as I recall.) By implication, even if an automated tool gets the initial key "wrong", it would be a simple matter to adjust it into a more normally accepted key.
No, because (some) people (after many years in music) are very good at recognising scales, mood, and hence the key.
"take normal written music and re-write it in a different key"
This is simpler, much simpler. The notes and key are given. Trying to determine the key is difficult.
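To show why transposing is the easy direction, here is a minimal sketch. It assumes the notes are already available as MIDI note numbers (60 = C4 by the common convention); the function names and the note spellings are just illustrative choices, not taken from any particular notation tool.

```python
# Note spellings are an arbitrary choice for this sketch.
NAMES = ["C", "C#", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]

def transpose(midi_notes, semitones):
    """Shift every note by the same number of semitones."""
    return [n + semitones for n in midi_notes]

def name(midi_note):
    """Name a MIDI note, assuming MIDI 60 is written as C4."""
    octave = midi_note // 12 - 1
    return f"{NAMES[midi_note % 12]}{octave}"

c_major_chord = [60, 64, 67]                  # C4, E4, G4
d_major_chord = transpose(c_major_chord, 2)   # up a whole tone
print([name(n) for n in d_major_chord])
```

Once the pitches are known, changing key is a single addition per note. Nothing here helps with finding the pitches or the key in the first place.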
For example, the key of G is a happy key.
The key of E is moody and blue. A is simple, but fun.
Deep Purple used G a lot.
Blues uses E.
AC/DC used A a lot.
How does a machine recognise mood? Emotion?
And then there is the issue of taking apart the sound and determining which instruments are playing which notes. That is very hard to do even for a human who knows music and already knows the make-up of the band.
For example, it takes a good ear to recognise that two guitars are playing a solo. It is often noticeable because the combination of notes is not physically possible to play on one guitar. But you have to be a guitarist to recognise that.
A computer program, well, it can only do what it is told. It cannot interpret for itself and decide.
Working out the key is a trivially simple task that the human can do, either as an input to the program or later when editing its output. In fact, since the computer already knows the *pitches* of every note, a good heuristic would be to go through all keys and see which produces the fewest accidentals in the resulting score.
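A rough sketch of that heuristic, assuming the transcription step has already produced pitch classes (0 = C, 1 = C#/Db, ... 11 = B). It only tries the 12 major keys; relative minors share the same notes, so they come out the same. The function names are made up for this example.

```python
MAJOR_SCALE_STEPS = (0, 2, 4, 5, 7, 9, 11)
NOTE_NAMES = ["C", "Db", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]

def count_accidentals(pitch_classes, tonic):
    """Count the notes that fall outside the major scale built on tonic."""
    scale = {(tonic + step) % 12 for step in MAJOR_SCALE_STEPS}
    return sum(1 for pc in pitch_classes if pc % 12 not in scale)

def best_major_keys(pitch_classes):
    """Return every major key that needs the fewest accidentals."""
    scores = {t: count_accidentals(pitch_classes, t) for t in range(12)}
    fewest = min(scores.values())
    return [NOTE_NAMES[t] for t in range(12) if scores[t] == fewest]

# G A B C D E F# G: only G major fits without accidentals.
print(best_major_keys([7, 9, 11, 0, 2, 4, 6, 7]))
# C D E alone is ambiguous: C, F and G major all contain those notes.
print(best_major_keys([0, 2, 4]))
```

The second call shows the limit of the heuristic: a short or sparse passage can fit several keys equally well, so some tie-break (or a human) is still needed.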
The really hard bit, as you pointed out first, is isolating the sounds of different instruments.
As you say, humans can detect the key, if they are musicians, because of the mood or feel of a song. How do you teach a computer to understand mood?
Indeed, this is difficult for computers, but it's easy for us and therefore not the problem we need them to solve -- accurate transcription (of heavily polyphonic music) is very hard for us and that's where we need help from the machine.
You would need to count the notes of each frequency of the whole piece, assume most are not accidentals, then compare the highest counts to a table of notes per key. If there is a question, you can look at the first and last notes of the piece to decide.
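Something like this, as a sketch only. It assumes the notes arrive as a non-empty list of MIDI note numbers in playing order, uses major keys for the table, and takes "the highest counts" to mean the seven most frequent pitch classes. All names are invented for the example.

```python
from collections import Counter

MAJOR_STEPS = (0, 2, 4, 5, 7, 9, 11)
NAMES = ["C", "Db", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]
# Table of notes per key: tonic pitch class -> pitch classes in that major key.
KEY_TABLE = {t: frozenset((t + s) % 12 for s in MAJOR_STEPS) for t in range(12)}

def guess_key(notes):
    if not notes:
        raise ValueError("need at least one note")
    counts = Counter(n % 12 for n in notes)
    # Assume the most frequent pitch classes are not accidentals.
    top = {pc for pc, _ in counts.most_common(7)}
    overlap = {t: len(top & scale) for t, scale in KEY_TABLE.items()}
    best = max(overlap.values())
    candidates = [t for t in range(12) if overlap[t] == best]
    if len(candidates) > 1:
        # If there is a question, look at the last note, then the first.
        for endpoint in (notes[-1] % 12, notes[0] % 12):
            if endpoint in candidates:
                return NAMES[endpoint]
    return NAMES[candidates[0]]

print(guess_key([60, 62, 64, 65, 64, 62, 60]))  # C D E F E D C
print(guess_key([65, 64, 62, 60, 62, 64, 65]))  # same notes, starts and ends on F
```

Both example melodies use the same four pitches, which fit C major and F major equally well, so the first and last notes are what decide between them.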
In a musical score you don't really show the key (the music could be in a mode instead) - you just apply the convenient number of sharps or flats after the clef sign. You'd just need a table of flats and sharps in the order they appear on the circle of 5ths.
The program can notice that there are way more b-flats and e-flats than naturals, and put them by the clef. You'd want the table so you wouldn't end up with non-western key signatures with just an a-flat and a c-sharp. Modulation would be more difficult, but noticing that now there are all b and e-naturals, with c and f-sharps would be an indication.
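A sketch of the table idea, assuming the transcriber already spells its notes ("Bb", "B", "F#") and hands over a count per spelling. Only the 15 conventional signatures are ever considered, so an a-flat plus c-sharp signature cannot come out. The function names are hypothetical.

```python
from collections import Counter

# Letters in the order they gain a sharp or flat around the circle of fifths.
SHARP_ORDER = ["F", "C", "G", "D", "A", "E", "B"]
FLAT_ORDER = ["B", "E", "A", "D", "G", "C", "F"]

def standard_signatures():
    """Yield the 15 conventional signatures as {letter: '#' or 'b'} maps."""
    yield {}
    for n in range(1, 8):
        yield {letter: "#" for letter in SHARP_ORDER[:n]}
        yield {letter: "b" for letter in FLAT_ORDER[:n]}

def marks_needed(note_counts, signature):
    """Count the notes that would still need a sharp, flat or natural sign."""
    total = 0
    for spelled, count in note_counts.items():
        letter, alteration = spelled[0], spelled[1:]
        if signature.get(letter, "") != alteration:
            total += count
    return total

def pick_signature(note_counts):
    """Pick the standard signature needing the fewest marks in the score."""
    return min(standard_signatures(),
               key=lambda sig: marks_needed(note_counts, sig))

piece = Counter({"Bb": 10, "Eb": 8, "B": 1, "E": 1, "C": 5, "F": 6, "G": 4})
print(pick_signature(piece))
```

Ties go to the signature with fewer sharps or flats, simply because those are generated first. For modulation you could run the same function over a sliding window of bars and watch for the result changing, though that is untested speculation on my part.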
With instruments that throw out lots of overtones it would be difficult to determine which notes are actually being played. A Mixture Stop on a pipe organ or an old Hammond drawbar would be pretty difficult to deal with.
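To illustrate the overtone problem, this sketch lists which equal-tempered notes the first few harmonics of a single tone land nearest to. It assumes A4 = 440 Hz and ideal integer-multiple harmonics; real instruments will deviate, and the function names are made up.

```python
import math

NAMES = ["C", "C#", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]

def nearest_note(freq_hz):
    """Nearest equal-tempered note name, assuming A4 = 440 Hz = MIDI 69."""
    midi = round(69 + 12 * math.log2(freq_hz / 440.0))
    return f"{NAMES[midi % 12]}{midi // 12 - 1}"

def harmonics(fundamental_hz, count):
    """Nearest note names for the first `count` harmonics of one tone."""
    return [nearest_note(fundamental_hz * k) for k in range(1, count + 1)]

# One A2 (110 Hz) already contains energy near A3, E4, A4, C#5 and E5.
print(harmonics(110.0, 6))
```

So a single held note can look like a whole A major chord in the spectrum, and the program has to decide which peaks are played notes and which are only overtones.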