NOTE: If you spent the last decade 100 miles below the surface of the
earth studying the growth of algae under an Antarctic glacier, then you
will be surprised to learn that we can now listen to radio over the
internet. Just like tuning in to particular frequencies for particular
stations in the olden days, we can now "tune in" to specific radio
stations over the internet. Without further ado, let's jump into the
world of "Internet Radio".
Most apps claiming to support Internet Radio in fact support an industry standard: Shoutcast. It was a protocol devised by Nullsoft in the '90s and first implemented in their popular player Winamp
(... reading if you haven't heard of Winamp!). In spite of being a
proprietary protocol without much documentation to go with it, Shoutcast
has become the de facto industry standard for streaming audio. This is
mainly due to its simplicity and its similarity to the existing hypertext
transfer protocol (dear old HTTP! wink wink). Icecast is a similar
open-source implementation compatible with Shoutcast.
Initial handshake between Shoutcast client-server
|High-level overview of Internet-radio over Shoutcast. |
[STEP1] Station Listing
The client app connects to a station listing/aggregator on the internet and obtains a list of stations along with their details such as genre, language, now-playing, and bitrate, among other things.
[STEP2] Station Lookup
The user can then select one of the stations as desired. The client then obtains the IP address (and port) of the server running that particular station from the station listing/aggregator. Networking enthusiasts will notice that this step is exactly like a DNS lookup, i.e. the client obtains the network address for a particular station name, with the station listing/aggregator acting like a DNS server for radio stations. Also note that sometimes the station listing will provide only a domain name, and then an additional, actual DNS lookup is needed to obtain the IP address of the streaming server. Popular station listing/aggregator sites like Xiph provide huge web-friendly lists of live radio stations.
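As a small illustration of the lookup result, here is a sketch that splits an address of the form "host:port" (the same form passed to the client later in this post) into its two parts. The function name `split_host_port` is hypothetical and not part of any Shoutcast library; it assumes an IPv4 address or domain name and does not handle bracketed IPv6 addresses.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits "host:port" into a host string and a port.
 * Assumes an IPv4 address or domain name (no IPv6 bracket syntax).
 * Returns 0 on success, -1 if the input is malformed or host is too long. */
int split_host_port(const char *addr, char *host, size_t host_len, int *port)
{
    const char *colon = strrchr(addr, ':');
    if (colon == NULL || colon == addr)
        return -1;                      /* no port, or empty host */

    size_t len = (size_t)(colon - addr);
    if (len >= host_len)
        return -1;                      /* host does not fit in the buffer */

    char *end = NULL;
    long p = strtol(colon + 1, &end, 10);
    if (end == colon + 1 || *end != '\0' || p < 1 || p > 65535)
        return -1;                      /* port missing or out of range */

    memcpy(host, addr, len);
    host[len] = '\0';
    *port = (int)p;
    return 0;
}
```

If the host part turns out to be a domain name rather than an IP address, the extra DNS lookup mentioned above is still needed before connecting.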
[STEP3|4] Station Connection
1. The client attempts to connect to the server using the IP address (and port) obtained during station lookup.
2. The server responds with "ICY 200 OK" (a custom 200 OK success code)
3. ...and the stream header...
4. ...and finally the server starts sending encoded audio in a continuous stream of packets (which the client app can decode and play back) until the client disconnects (stops ACK-ing and signals a disconnect).
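The check a client performs on the first line of the reply in step 2 can be sketched roughly as below. The function name `is_stream_reply_ok` is made up for this example, and accepting a plain "HTTP/1.x 200" status line next to "ICY 200" is an assumption, made to cover servers that answer with ordinary HTTP.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if the first line of the server reply
 * signals success, 0 otherwise.
 * Accepts the Shoutcast-style "ICY 200 ..." and, as an assumption,
 * an ordinary "HTTP/1.x 200 ..." status line as well. */
int is_stream_reply_ok(const char *status_line)
{
    if (strncmp(status_line, "ICY 200", 7) == 0)
        return 1;

    if (strncmp(status_line, "HTTP/1.", 7) == 0 && strlen(status_line) >= 12) {
        /* In "HTTP/1.x 200" the space before the status code is at index 8. */
        return strncmp(status_line + 8, " 200", 4) == 0;
    }
    return 0;
}
```

Anything other than a 200 reply means the stream is not available, and the client should not try to decode what follows as audio.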
Download the entire Wireshark capture of the packets exchanged by the Shoutcast client and server during the initial station connection.
The above steps are similar to what a browser does when it connects to a website (and hence in-browser streaming audio playback of shoutcast streams IS possible).
Shoutcast has subtle differences from HTTP during the station connection step above. Shoutcast supports "Icy-MetaData", an additional field in the request header. When set, it is a request to the Shoutcast server to embed metadata about the stream at periodic intervals (once every "icy-metaint" bytes) in the encoded audio stream itself. The value of "icy-metaint" is decided by the Shoutcast server configuration and is sent to the client as part of the initial reply.
|Shoutcast stream format when Icy-MetaData is set to 1|
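Pulling the interval out of the initial reply could look something like the sketch below. The function name `parse_icy_metaint` is hypothetical, and the sketch assumes the whole reply header is available as one NUL-terminated string; it simply searches for the field name anywhere in the text rather than strictly at the start of a line.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: case-insensitive search for "icy-metaint:" in a
 * NUL-terminated reply header. Returns the interval in bytes, or 0 if the
 * field is absent (i.e. the server will not embed metadata). */
long parse_icy_metaint(const char *header)
{
    static const char key[] = "icy-metaint:";
    const size_t key_len = sizeof key - 1;

    for (const char *p = header; *p != '\0'; ++p) {
        size_t i = 0;
        while (i < key_len && p[i] != '\0' &&
               tolower((unsigned char)p[i]) == key[i])
            ++i;
        if (i == key_len) {
            /* strtol skips any spaces between the colon and the number. */
            long value = strtol(p + key_len, NULL, 10);
            return value > 0 ? value : 0;
        }
    }
    return 0;
}
```

A return value of 0 tells the client that the bytes after the header are pure audio.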
This poses a slight complication during playback. If the received audio stream is directly queued for playback, then the embedded metadata appears as periodic glitches. Following is one such sample recording.
This audio clip was retrieved from a radio stream whose icy-metaint =
32768, i.e. the metadata is embedded in the audio stream once every
32 KB. The stream bit-rate is 4 KB/s. So during playback a glitch is
present once every 32 KB / 4 KB/s = 8 seconds (0:08s, 0:16s, 0:24s, 0:32s, ...).
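One way to get rid of such glitches while still requesting metadata is to skip the metadata blocks before queuing the audio for playback. Below is a minimal sketch of that idea. The function name `strip_icy_metadata` is hypothetical, and it assumes the buffer starts at the first audio byte (reply header already removed) and that each metadata block is one length byte N followed by N x 16 bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies only the audio bytes from `in` to `out`,
 * skipping the embedded metadata blocks.
 * Assumes `in` begins at the first audio byte and that `out` has room for
 * `in_len` bytes. Returns the number of audio bytes written. */
size_t strip_icy_metadata(const unsigned char *in, size_t in_len,
                          size_t metaint, unsigned char *out)
{
    size_t in_pos = 0, out_pos = 0;

    if (metaint == 0) {                 /* no metadata embedded at all */
        memcpy(out, in, in_len);
        return in_len;
    }

    while (in_pos < in_len) {
        /* Copy up to `metaint` bytes of audio. */
        size_t audio = in_len - in_pos;
        if (audio > metaint)
            audio = metaint;
        memcpy(out + out_pos, in + in_pos, audio);
        in_pos += audio;
        out_pos += audio;
        if (in_pos >= in_len)
            break;

        /* Length byte N, followed by N * 16 bytes of metadata to skip. */
        size_t meta_len = (size_t)in[in_pos] * 16;
        in_pos += 1 + meta_len;
    }
    return out_pos;
}
```

A real player receives the stream in arbitrary-sized chunks, so it would have to carry the position within the current interval from one chunk to the next; this sketch works on one complete buffer to keep the idea visible.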
To view/analyse the stream data in a hex editor, download the actual clip and check out the following offsets:
[0:08s] 0x0815A - 0x0817A  count N = 2, meta = 1 + (16 x 2) = 33 (0x21) bytes
[0:16s] 0x1017B - 0x1017B  count N = 0, meta = 1 byte
[0:24s] 0x1817C - 0x1817C  count N = 0, meta = 1 byte
[0:32s] 0x2017D - 0x2017D  count N = 0, meta = 1 byte
|Embedded metadata from 0x0815A to 0x0817a.|
Note that the first byte is 02, i.e. 2 x 16 = 32 (0x20) bytes of metadata follow it.
Also note that the first 345 (0x159) bytes of the actual clip are the reply header of the
stream (plain text in ASCII) sent by the Shoutcast server. Technically
these are NOT part of the audio stream either.
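The offsets and the 8-second spacing listed above follow from simple arithmetic, sketched below. The function names are invented for this example; the only inputs are the values quoted in this post (icy-metaint = 32768, the length byte N, and a byte rate of 4096 bytes per second, which is how "4 KB/s" is read here).

```c
#include <stddef.h>

/* Size of one embedded metadata block: the length byte N plus N * 16 bytes. */
size_t icy_meta_block_size(unsigned char length_byte)
{
    return 1 + (size_t)length_byte * 16;
}

/* Offset of the next length byte, given the offset and length byte of the
 * current metadata block and the icy-metaint interval. */
size_t icy_next_meta_offset(size_t meta_offset, unsigned char length_byte,
                            size_t metaint)
{
    return meta_offset + icy_meta_block_size(length_byte) + metaint;
}

/* Seconds of audio between two metadata blocks at a constant byte rate. */
double icy_meta_period_seconds(size_t metaint, size_t bytes_per_second)
{
    return (double)metaint / (double)bytes_per_second;
}
```

Starting from the first block at 0x0815A with N = 2, `icy_next_meta_offset` gives 0x1017B, then 0x1817C and 0x2017D for the following blocks with N = 0, matching the offsets in the clip.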
NOTE: If you simply want to obtain the audio stream (no embedded metadata) then set the "Icy-MetaData" field in the request header to 0 or simply do NOT pass it as part of the initial request header.
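To make the role of this header field concrete, here is a rough sketch of how the text of a request could be assembled with and without it. The function name `build_stream_request` and the choice of a plain "GET ... HTTP/1.0" request with a "Host" line are assumptions for illustration, not taken from the client code discussed in this post.

```c
#include <stdio.h>

/* Hypothetical helper: writes a minimal stream request into `buf`.
 * If want_metadata is non-zero the "Icy-MetaData: 1" field is added,
 * otherwise the field is left out entirely.
 * Returns the request length, or -1 if `buf` is too small. */
int build_stream_request(char *buf, size_t buf_len, const char *host,
                         const char *path, int want_metadata)
{
    int n = snprintf(buf, buf_len,
                     "GET %s HTTP/1.0\r\n"
                     "Host: %s\r\n"
                     "%s"
                     "\r\n",
                     path, host,
                     want_metadata ? "Icy-MetaData: 1\r\n" : "");

    return (n < 0 || (size_t)n >= buf_len) ? -1 : n;
}
```

With `want_metadata` set to 0 the server should send plain audio after the reply header, so nothing needs to be stripped before playback.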
Finally here is a small bit of code that implements all that we have learnt so far - a simple shoutcast client in a few lines of C, that connects to any shoutcast server and logs the audio stream data to stdout. It uses the curl library to initiate connection requests to the shoutcast server.
Stripping off the comments and the clean-up code following line 50, it comes down to 13 lines of C code. Pretttty neat eh?...
$> sudo apt-get install libcurl4-gnutls-dev
$> gcc simple.c -o simple -lcurl
$> ./simple <shoutcast-server-ip-addr:port> > <test-file>
After running the above commands, the <test-file> will contain the audio stream of that particular internet radio station. The <test-file> can be played back in any player that supports decoding the stream format (AAC, MP3, OGG etc. depending on the radio station). Make sure to comment out line 38 in simple.c to have a glitch-free (no embedded metadata) audio stream.
This concludes part 1 of the series on how internet radio works. In part 2 we will analyse the challenges and issues faced during de-packetising, parsing and queuing the audio stream buffers for local playback. Stay tuned for updates.