Using FiddlerCore to Capture Streaming Audio

ProgrammerTim

24 Sep 2013 | GPL3 | 11 min read

Introduction

I tend to reinvent the wheel out of boredom, and decided to write a streaming music capture program a few years back. It watched the caption on a browser window running a stream from last.fm and recorded everything coming out of the speakers. When the caption changed, it took the recorded chunk, parsed the previous caption to determine the song title, saved it off as a .wav file, and then started recording for the next song. It would then spawn a new thread and run Lame.exe from the command line to convert the .wav to an .mp3. It worked great other than the occasional email beep embedded in a song when I would forget to turn off my system sounds or something.

Then I was working on another project, using Fiddler to debug some web traffic, and noticed Pandora sending m4a files across. I did some more digging and found identifying XML data, and PandoraCapture was born. This article is a discussion of the use of FiddlerCore to capture the audio and XML streams, along with the UltraID3Lib library to add metadata to mp3 files.

Using the code

The source zip contains all of my code. It started as a single Windows Form and has been refactored from there, so it's not pretty. An Options form provides a GUI wrapper to the Settings class, which loads and saves all settings to the registry. The defaults are set up for my particular work style, not what they should probably be "in the wild". For instance, the default is to not set the app up as the system proxy. The reason for this is that Outlook at our company often fails when Fiddler takes over. So instead I have a Firefox profile specifically for internet radio that is hardcoded to use the PandoraCapture proxy, and everything else bypasses it. But for the general populace, setting the system proxy is probably the easiest way to get it set up.

On to the guts.

The backbone of this app is FiddlerCore. The usage is fairly simple, as it is a .NET object-oriented library and there are no P/Invokes or anything like that to deal with. The following block initializes the proxy server:

public void ResetProxy() {
   Fiddler.FiddlerApplication.AfterSessionComplete -= FiddlerApplication_AfterSessionComplete;
   Fiddler.FiddlerApplication.BeforeRequest -= FiddlerApplication_BeforeRequest;
   Fiddler.FiddlerApplication.Shutdown();
   var settings = Fiddler.FiddlerCoreStartupFlags.Default;
   if (!Settings.SetSystemProxy) {
      settings &= ~Fiddler.FiddlerCoreStartupFlags.RegisterAsSystemProxy;
   }
   Fiddler.FiddlerApplication.AfterSessionComplete += FiddlerApplication_AfterSessionComplete;
   Fiddler.FiddlerApplication.BeforeRequest += FiddlerApplication_BeforeRequest;
   Fiddler.FiddlerApplication.Startup(Settings.ProxyPort, settings);
}

Because I also use this routine when Settings.ProxyPort or Settings.SetSystemProxy is changed, I first detach the event handlers (so they are never attached twice) and shut down the proxy. This doesn't affect anything if the proxy hasn't started yet. I then tell Fiddler whether or not it will be the system proxy, set up the event handlers again, and start up the proxy.

The FiddlerApplication_AfterSessionComplete event handler is the only one that matters. I use FiddlerApplication_BeforeRequest simply to display a status showing URLs being requested.

Inside FiddlerApplication_AfterSessionComplete, I check the host name against a regex to determine if it is a Pandora or last.fm session. Since these are handled differently, they branch to separate routines. I pass the Fiddler session to the routines so that they have all of the information they need.

if (Regex.IsMatch(session.hostname, @"(^|\.)((pandora\.com)|(p-cdn\.com))$")) {
   handlePandoraSession(session);
} else if (Regex.IsMatch(session.hostname, @"(^|\.)last\.fm$")) {
   handleLastFmSession(session);
}
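
As a small, standalone illustration of the host check above, the same two patterns can be wrapped in a helper that is easy to try against sample host names. This is only a sketch: the StreamSource enum, the HostClassifier class, and the case-insensitive option are assumptions added here for illustration and are not part of PandoraCapture.

```csharp
using System.Text.RegularExpressions;

// Hypothetical names for illustration; not taken from the PandoraCapture source.
public enum StreamSource { None, Pandora, LastFm }

public static class HostClassifier {
    // Same patterns as in the article, compiled once instead of once per session.
    // RegexOptions.IgnoreCase is an assumption in case a host name arrives in mixed case.
    private static readonly Regex PandoraHost =
        new Regex(@"(^|\.)((pandora\.com)|(p-cdn\.com))$", RegexOptions.IgnoreCase);
    private static readonly Regex LastFmHost =
        new Regex(@"(^|\.)last\.fm$", RegexOptions.IgnoreCase);

    // Returns which handler (if any) a host name would be routed to.
    public static StreamSource Classify(string hostname) {
        if (string.IsNullOrEmpty(hostname)) return StreamSource.None;
        if (PandoraHost.IsMatch(hostname)) return StreamSource.Pandora;
        if (LastFmHost.IsMatch(hostname)) return StreamSource.LastFm;
        return StreamSource.None;
    }
}
```

The leading (^|\.) is what keeps a look-alike host such as notpandora.com from matching: the domain has to start the string or follow a dot.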

Inside handlePandoraSession, I determine whether the page being requested is an XML file. If so, I use an XPath query to determine whether it is the XML I actually want; namely, the one that holds the metadata and links for the audio files. I break the XML fragments out into a dictionary keyed off of the URL for the audio.

if (session.oResponse.MIMEType == "text/xml") {
   var doc = new XmlDocument();
   doc.LoadXml(session.GetResponseBodyAsString());
   if (doc.SelectSingleNode("/methodResponse/params/param/value/array/data/value/struct/member/name[text()=\"audioURL\"]") != null) {
      foreach(XmlElement node in doc.SelectNodes("/methodResponse/params/param/value/array/data/value/struct/member/name[text()=\"audioURL\"]")) {
         var audioUrl = node.SelectSingleNode("../value/text()").Value;
         var fragment = (XmlElement)node.SelectSingleNode("../..");
         pandoraXmlFragments.Add(audioUrl, fragment);
      }
   }
} else if (session.oResponse.MIMEType == "audio/mp4") {

Now, it gets a little weird. When I get an audio file from Pandora, its URL doesn't actually match the URLs I have stored. The URL has a token on it, and that token changes slightly between what was received in the XML and what is sent by the client app. So instead of being able to do something like pandoraXmlFragments[url], I chop 75 characters off of the end of each stored URL, then compare the remainder to the same-length prefix of the incoming URL. The URLs are different lengths, so I can't hardcode the prefix length or anything.

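string test = session.fullUrl;
foreach (string key in pandoraXmlFragments.Keys) {
   // Ignore the last 75 characters of the stored URL (the part of the token that changes)
   int prefixLength = key.Length - 75;
   if (prefixLength > 0 && test.Length >= prefixLength && string.CompareOrdinal(key, 0, test, 0, prefixLength) == 0) {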
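
The prefix comparison can also be pulled into a small helper so that it can be tried without a live session. The following is a minimal sketch under a few assumptions: TokenUrlMatcher and FindMatchingKey are hypothetical names, and the 75-character tail is simply the value used above, not something Pandora documents.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the PandoraCapture source.
public static class TokenUrlMatcher {
    // Number of trailing characters to ignore; 75 is the value the article uses.
    public const int TokenTailLength = 75;

    // Returns the stored URL whose leading part matches the incoming URL, or null if none does.
    public static string FindMatchingKey(IEnumerable<string> storedUrls, string incomingUrl) {
        foreach (string key in storedUrls) {
            int prefixLength = key.Length - TokenTailLength;
            // Skip keys that are too short to compare, and incoming URLs shorter than the prefix.
            if (prefixLength <= 0 || incomingUrl.Length < prefixLength) continue;
            if (string.CompareOrdinal(key, 0, incomingUrl, 0, prefixLength) == 0) return key;
        }
        return null;
    }
}
```

Finding the key first and acting on it after the search has finished also makes it safe to remove the entry from the dictionary by that key, since a Dictionary cannot be modified while its Keys collection is still being enumerated.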

In the last.fm handler, the exact URL is used, but that URL is then redirected to a different one. So I watch for 302 statuses and move the fragment from the original URL to the new one in the dictionary so that I have the proper key when the audio comes down.

if (session.responseCode == 302) {
   if (lastFmXmlFragments.ContainsKey(session.fullUrl)) {
      string station = lastFmXmlFragments[session.fullUrl].Key;
      XmlElement fragment = lastFmXmlFragments[session.fullUrl].Value;
      lastFmXmlFragments.Remove(session.fullUrl);
      lastFmXmlFragments.Add(session.oResponse.headers["Location"], new KeyValuePair<string, XmlElement>(station, fragment));
   }
}
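
The re-keying step is generic enough to sketch on its own. In the version below, RedirectRekeyer and Rekey are hypothetical names, and the helper assumes the redirect target is already an absolute URL; a relative Location header would need to be resolved against the original URL first, which is not handled here.

```csharp
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the PandoraCapture source.
public static class RedirectRekeyer {
    // Moves the entry stored under oldUrl to newUrl.
    // Returns false when there is no entry to move or no target URL.
    public static bool Rekey<TValue>(IDictionary<string, TValue> map, string oldUrl, string newUrl) {
        if (string.IsNullOrEmpty(newUrl)) return false;
        TValue value;
        if (!map.TryGetValue(oldUrl, out value)) return false;
        map.Remove(oldUrl);
        // The indexer overwrites an existing entry, whereas Add would throw on a duplicate key.
        map[newUrl] = value;
        return true;
    }
}
```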

Once I have determined the proper entry, I use XPath again to parse each of the major pieces of metadata from it so that I can build a filename for the audio file. I then decode the response (because it may be gzip-ed or something) and save the file with FiddlerCore.

var fragment = pandoraXmlFragments[key];
var title = fragment.SelectSingleNode("member/name[text()=\"songTitle\"]/../value/text()").Value;
var album = fragment.SelectSingleNode("member/name[text()=\"albumTitle\"]/../value/text()").Value;
var artist = fragment.SelectSingleNode("member/name[text()=\"artistSummary\"]/../value/text()").Value;
var filePrefix = string.Format("{0} - {1} - {2}", title, album, artist);

session.utilDecodeResponse();
var audioFile = Path.Combine(folder,string.Format("{0}.m4a", filePrefix));
var xmlFile = Path.Combine(folder, string.Format("{0}.xml", filePrefix));
session.SaveResponseBody(audioFile);

var xmlFragmentData = new StringBuilder();
using (var x = XmlWriter.Create(xmlFragmentData)) {
   fragment.WriteTo(x);
   x.Close();
}
File.WriteAllText(xmlFile, xmlFragmentData.ToString());

ProcessPandoraFiles(audioFile, xmlFile);

pandoraXmlFragments.Remove(session.fullUrl);
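
One thing to watch when building a file name from metadata is that titles and artist names can contain characters that are not legal in file names (a slash, for example). A minimal sketch of one way to guard against that follows; SafeFileName is a hypothetical helper, and replacing each invalid character with an underscore is an arbitrary choice made for this example.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration; not part of the PandoraCapture source.
public static class SafeFileName {
    // Builds "title - album - artist" and replaces characters the file system rejects.
    public static string Build(string title, string album, string artist) {
        string raw = string.Format("{0} - {1} - {2}", title, album, artist);
        char[] invalid = Path.GetInvalidFileNameChars(); // the set differs between platforms
        var sb = new StringBuilder(raw.Length);
        foreach (char c in raw) {
            sb.Append(Array.IndexOf(invalid, c) >= 0 ? '_' : c);
        }
        return sb.ToString();
    }
}
```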

When I save the file, I save the XML fragment alongside it. Pandora sends m4a files instead of mp3, and m4a can't have tags (or at least not the same tags as mp3). So I spawn a new thread that runs FFmpeg from the command line and converts the m4a file to an mp3. One caveat here is that FFmpeg, by default, uses ID3v2.4, whereas UltraID3Lib currently supports only ID3v2.3. So when I call FFmpeg, I have to pass -id3v2_version 3 on the command line to make sure that they are compatible.
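
A rough sketch of what such an FFmpeg invocation might look like is shown below. FfmpegCommand is a hypothetical helper, the path to the FFmpeg executable is left to the caller, and the -y (overwrite output) flag is an extra added for this example. The sketch only builds the process description; it assumes the FFmpeg build in use includes an mp3 encoder.

```csharp
using System.Diagnostics;
using System.IO;

// Hypothetical helper for illustration; not part of the PandoraCapture source.
public static class FfmpegCommand {
    // Builds (but does not start) the process description for an m4a -> mp3 conversion.
    public static ProcessStartInfo Build(string ffmpegPath, string m4aFile) {
        string mp3File = Path.ChangeExtension(m4aFile, ".mp3");
        return new ProcessStartInfo {
            FileName = ffmpegPath,
            // -y overwrites an existing output file; -id3v2_version 3 asks for ID3v2.3 tags.
            Arguments = string.Format("-y -i \"{0}\" -id3v2_version 3 \"{1}\"", m4aFile, mp3File),
            UseShellExecute = false,
            CreateNoWindow = true
        };
    }
}
```

On the worker thread, the conversion could then be run with something like using (var p = Process.Start(FfmpegCommand.Build(ffmpegPath, audioFile))) { p.WaitForExit(); }, checking p.ExitCode before tagging the result.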

Once I have a valid mp3, either from Pandora via FFmpeg or directly from last.fm, I start setting the metadata tags. UltraID3Lib is another .NET, object-oriented library, so there aren't any P/Invokes and I don't have to know the format of an mp3 or anything. I simply load up the file, set all of the tags I have, and write it back out.

var id3 = new HundredMilesSoftware.UltraID3Lib.UltraID3();
id3.Read(audioFile);
id3.Album = fragment.SelectSingleNode("member/name[text()=\"albumTitle\"]/../value/text()").Value;
id3.Artist = fragment.SelectSingleNode("member/name[text()=\"artistSummary\"]/../value/text()").Value;
id3.Title = fragment.SelectSingleNode("member/name[text()=\"songTitle\"]/../value/text()").Value;
foreach (XmlNode genre in fragment.SelectNodes("member/name[text()=\"genre\"]/../value/array/data/value/text()")) {
   id3.ID3v2Tag.Frames.Add(new HundredMilesSoftware.UltraID3Lib.ID3v23GenreFrame(genre.Value));
}
var node = fragment.SelectSingleNode("member/name[text()=\"composerName\"]/../value/text()");
if (node != null && node.Value != null && node.Value.Length > 0) {
   var composers = new HundredMilesSoftware.UltraID3Lib.ID3v23ComposersFrame();
   composers.Composers.Add(node.Value);
   id3.ID3v2Tag.Frames.Add(composers);
}
node = fragment.SelectSingleNode("member/name[text()=\"amazonUrl\"]/../value/text()");
if (node != null && node.Value != null && node.Value.Length > 0) {
   id3.ID3v2Tag.Frames.Add(new HundredMilesSoftware.UltraID3Lib.ID3v23CommentsFrame(node.Value, "Amazon URL"));
}
id3.ID3v2Tag.Frames.Add(new HundredMilesSoftware.UltraID3Lib.ID3v23CommentsFrame(xml, "Pandora XML Fragment"));
id3.Write();

last.fm and Pandora have different sets of data, so not all of the tags can be set in both. For instance, I don't have a way to get the genre from last.fm, so I use the radio station name, simply so I can sort later.

I write out all of the tags other than the album artwork first. I want to make sure that the write doesn't fail just because something is wrong with the artwork, and I don't want to do any complicated exception handling. So I do a write before loading the artwork, then do a second write for the album art. To get it, I use an HttpWebRequest to download it from the URL specified in the XML file. I basically trap and ignore any exceptions that occur at that stage.

var artPath = "member/name[text()=\"artRadio\"]/../value/text()";
node = fragment.SelectSingleNode(artPath);
if (node != null && node.Value != null && node.Value.Length > 0) {
   var web = System.Net.HttpWebRequest.Create(node.Value);
   try {
      using (var response = web.GetResponse()) {
         using (var stream = response.GetResponseStream()) {
            using (var bmp = new Bitmap(stream)) {
               var pic = new HundredMilesSoftware.UltraID3Lib.ID3v23PictureFrame(bmp, HundredMilesSoftware.UltraID3Lib.PictureTypes.CoverFront, "Album art", HundredMilesSoftware.UltraID3Lib.TextEncodingTypes.Unicode);
               pic.MIMEType = response.ContentType;
               id3.ID3v2Tag.Frames.Add(pic);
               id3.Write();
            }
            stream.Close();
         }
         response.Close();
      }
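   } catch (Exception) {
      // Album art is optional: ignore download or decode failures, since the other tags are already written.
   }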
}

Points of Interest

Obviously, last.fm and Pandora have different XML formats, so each one uses different XPath queries. I like last.fm better, I think, because they are using a standard playlist namespace. However, this causes me problems with my XPath queries because I have to specify the namespace every time. So before I load the last.fm XML, I rip off the namespace attributes.

// without ripping off the namespace
doc.LoadXml(xml);
var ns = new XmlNamespaceManager(doc.NameTable);
ns.AddNamespace("xspf", "http://xspf.org/ns/0/");
if (doc.SelectSingleNode("/lfm/xspf:playlist/xspf:trackList/xspf:track/xspf:location", ns) != null) {

// with ripping it off
doc.LoadXml(xml.Replace(" xmlns=\"http://xspf.org/ns/0/\"", ""));
if (doc.SelectSingleNode("/lfm/playlist/trackList/track/location") != null) {

So if I don't rip it off, all of my XPath queries get longer because I have to add the namespace prefix, and I always have to pass the XmlNamespaceManager around. Now, it may turn out at some point in the future that this bites me in the butt if last.fm ever adds some other type of tracklist or something in there, but I doubt that will happen.

Another thing you may notice in the code is that I don't handle any of the UltraID3Lib exceptions. Some exceptions it will actually throw, and others it merely returns; basically, they are warnings. I don't care about them, so I don't worry about them. Another thing you'll notice is that I spell out the full namespace for UltraID3Lib in most places. I tend to do that with any new library I use. It helps me start memorizing the structure, and helps me bypass the help file; I often find methods or objects in a library that I wouldn't have found otherwise as they pop up through IntelliSense.

This app is also a good(?) intro to slightly complicated XPath queries, especially in the Pandora XML files. Pandora doesn't use tag names to differentiate the parts, but instead uses a member element with name and value child elements. You have to look at the text of the name element to determine what the text of the value element applies to. So the XPath query member/name[text()="albumTitle"]/../value/text() says to pull the text of the value child element of a member tag whose name element contains the text "albumTitle".
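
To make the member/name/value pattern concrete, here is a small sketch that runs the same kind of query against a made-up struct. The sample XML is a hypothetical, trimmed-down stand-in for the real Pandora response, and PandoraStructReader and GetMember are names invented for this example. The helper also assumes the member name contains no double quotes, since it is concatenated straight into the query.

```csharp
using System.Xml;

// Hypothetical helper for illustration; not part of the PandoraCapture source.
public static class PandoraStructReader {
    // Made-up sample in the member/name/value shape described above.
    public const string SampleXml =
        "<struct>" +
        "<member><name>songTitle</name><value>Example Song</value></member>" +
        "<member><name>albumTitle</name><value>Example Album</value></member>" +
        "<member><name>composerName</name><value></value></member>" +
        "</struct>";

    // Parses the sample and returns its root struct element.
    public static XmlElement LoadSample() {
        var doc = new XmlDocument();
        doc.LoadXml(SampleXml);
        return doc.DocumentElement;
    }

    // Returns the text of the value whose sibling name element equals memberName,
    // or null when the member is missing or its value is empty.
    public static string GetMember(XmlNode structNode, string memberName) {
        XmlNode node = structNode.SelectSingleNode(
            "member/name[text()=\"" + memberName + "\"]/../value/text()");
        return node == null ? null : node.Value;
    }
}
```

Returning null for an empty value mirrors the null checks the tagging code performs before adding optional frames such as the composer.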

A final weirdness with last.fm is that sometimes the mp3 files I receive already have tags. I'm not doing any check for this, so I have a few last.fm mp3s that have a proper genre and then have the station name added to the list.

This app is very specifically coded for last.fm and Pandora audio files. However, if you wanted to find out the APIs needed for any of the music search sites, you could capture any incoming audio file and then query the music services to gather tags. You could also capture the HTML page that is loading the audio and possibly pull other data from there. For instance, if I parsed the last.fm HTML file, I could take the tags associated with that song and add those as genres.

I've also found that this works on various video sites. YouTube sends flv files across, but it is a bad example because it sends each chunk as a separate file. Of course, it's possible all of the video sites work the same way. I tested with another site and received a single complete flv file, but it was a one-minute video, so it may simply have been shorter than the chunk size. I imagine the chunks only need concatenating, although I haven't tested it. After that, you would simply have to parse the HTML or look for a similar XML to tell you what the file was.

Future Steps

I would like this to be able to catch any other audio and run it through a music tagging service. I would also like to get a .NET library for FFmpeg so that I don't have to shell out to the command line. And of course, I'd like to make some money off of this, so if anyone wants to donate, let me know. :)

The main thing that really needs to be done, though, is to determine the actual licensing requirements of the various pieces. The binary zip contains UltraID3Lib and FiddlerCore, but not FFmpeg. The code is written to work without FFmpeg; last.fm doesn't need it, and the Pandora handler will simply leave the XML fragment in a file beside the m4a file. But I don't speak legalese well enough to know whether any of the libraries can actually be placed in the same zip or not. So if anyone can clarify that, it would be much appreciated.

Update

I have a new version posted to SourceForge. This version has a plugin architecture for the session handlers and for UI option pages. The last.fm and Pandora handlers are both refactored into plugins (albeit system plugins within the exe), and I have written two new plugins. One is for Fora.Tv, but it doesn't work. Fora.Tv sends f4f files, which are fragmented Flash video, and I haven't gotten to the point of recombining them (the plugin tries to concatenate them together, but that didn't work). The other is for Vimeo. Vimeo sends m4v files, so it seems to work fine. The one caveat is that Vimeo uses HTTPS for some of its traffic. On my machine, the FiddlerCore proxy is handling the HTTPS traffic just fine, but I already had the full version of Fiddler installed and had accepted Fiddler's own SSL certificate. I don't know what the experience will be for new users, and I don't have a machine without Fiddler to test on. So I'm relying on the community to let me know if there are problems. I may not be able to fix them, but maybe I can at least document them.

3rd-Party Links

FiddlerCore
UltraID3Lib
Zeranoe FFmpeg builds

History

9/11/2013 - Uploaded. Also available on SourceForge at https://sourceforge.net/projects/pandoracapture/
9/24/2013 - Uploaded new version to SourceForge, including plugin architecture and a Vimeo plugin.

License

This article, along with any associated source code and files, is licensed under The GNU General Public License (GPLv3)

Written By

ProgrammerTim

Systems Engineer

United States

I'm a senior developer at an independent broker/dealer in Irving, TX. Mostly I program back-end systems involving data automation and processing.
