In the first article in this series, I showed you the basics of constructing a signed request to search the Netflix catalog for titles that matched a given search term. In this second article I will demonstrate how to parse the search results, and how to request and parse the details for a specific title.
This article will introduce you to:
- Accessing the Netflix Catalog Resource using Signed requests
- Understanding and parsing Netflix data documents
Title Search Results Structure
To describe the Netflix catalog I will start by analyzing the results of a title search. In the previous article, I showed you how to perform a title search, and the result of that request was an XML document. The top-level document structure looks like this, as viewed in XML Notepad:
Right away it is obvious that several discrete variables and a couple of objects will be needed to store the results of parsing this information. To illustrate, the following figure shows an example of each object type contained in the search results, expanded to show its contents. A parser will translate these into local data storage structures, which the client application will in turn use to hold and display the information.
The following data structures correspond to the objects in the preceding illustration:
||Contains basic information about a data resource, with additional information available through a hyperlink.
||Contains all the information about a title.
||Contains a short and long form of a title's name.
||Contains hyperlinks to three versions of the title's cover image.
||Contains information about a title's MPAA rating or genre.
Rather than describe the mundane implementation of these data storage objects here, I will refer you to the contents of the NetflixParser.cs code module.
Search Results Parser
The parser is a rather simple implementation that uses the Microsoft XmlDocument class to do a lot of the work. The parser actually does double duty, parsing both the search results and the title details results. I'll describe the search results parsing first and describe the title details parsing when I cover the title information retrieval processing.
The search response parser takes its input directly from the HTTP response stream and loads it into an XmlDocument object. As shown in the following code, the parser then:
- Skips past the preliminary information, which you will find documented in the Netflix API documentation if it isn't already obvious, looking for the first <catalog_title> node.
- Processes each <catalog_title> node and inserts the resulting CatalogTitle object into a list that is returned to the caller upon completion.
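private List<CatalogTitle> _titleList = new List<CatalogTitle>();
public bool ParseSearchResults(Stream str)
{
    XmlDocument xDoc = new XmlDocument();
    xDoc.Load(str); // read the HTTP response stream into the document
    int rank = 0;
    XmlNode xNode = xDoc.DocumentElement;
    if (xNode.Name != "catalog_titles" || !xNode.HasChildNodes)
        return false; // not a search results document, or no results
    foreach (XmlNode subNode in xNode)
    {
        if (subNode.Name == "catalog_title" && subNode.HasChildNodes)
        {
            CatalogTitle title = ParseSingleCatalogTitle(subNode);
            if (title == null)
                continue; // skip titles that could not be parsed
            title.rank = rank++; // rank follows the order of the results
            _titleList.Add(title);
        }
    }
    return true;
}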
Catalog Title Parser
The catalog title parser method, ParseSingleCatalogTitle, walks the <catalog_title> node extracting data into a CatalogTitle object. The extraction is quite simple; a switch statement identifies the data from an element or node and populates the CatalogTitle object. Note that there is a special case for Link objects that handles "title expansion", which I'll be describing in the next section.
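The switch-based extraction described above can be sketched roughly as follows. This is not the code from NetflixParser.cs: the CatalogTitleSketch class, its fields, and the element and attribute names handled here (id, title with short and regular attributes, release_year, link) are assumptions made for illustration, trimmed to a few cases.

```csharp
using System.Collections.Generic;
using System.Xml;

// Hypothetical, trimmed-down stand-in for the article's CatalogTitle class.
public class CatalogTitleSketch
{
    public string Id;
    public string ShortName;
    public string RegularName;
    public int ReleaseYear;
    // Maps a link's "title" attribute to its "href" attribute.
    public Dictionary<string, string> Links = new Dictionary<string, string>();
}

public static class CatalogTitleParserSketch
{
    // Walks the children of a <catalog_title> node and copies the values
    // it recognizes into a CatalogTitleSketch; unknown elements are skipped.
    public static CatalogTitleSketch Parse(XmlNode catalogTitleNode)
    {
        CatalogTitleSketch result = new CatalogTitleSketch();
        foreach (XmlNode child in catalogTitleNode.ChildNodes)
        {
            XmlElement element = child as XmlElement;
            if (element == null)
                continue; // ignore comments and other non-element nodes

            switch (element.Name)
            {
                case "id":
                    result.Id = element.InnerText;
                    break;
                case "title":
                    // Assumed attribute names for the short and long forms.
                    result.ShortName = element.GetAttribute("short");
                    result.RegularName = element.GetAttribute("regular");
                    break;
                case "release_year":
                    int year;
                    if (int.TryParse(element.InnerText, out year))
                        result.ReleaseYear = year;
                    break;
                case "link":
                    result.Links[element.GetAttribute("title")] =
                        element.GetAttribute("href");
                    break;
            }
        }
        return result;
    }
}
```

Elements that the switch does not recognize simply fall through, so the sketch tolerates fields it does not care about.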
Title Details Retrieval
At this point in the application's development, there are three possible ways of retrieving details about a title. Before I compare them, I want to explain the mechanisms available for detail retrieval.
The Link objects returned in the catalog search represent items that "link" to additional details, and they contain three attributes:
- href - A hyperlink that can be used as a request URL for directly obtaining the details associated with the Link.
- rel - A URI that identifies the class of detail for the Link.
- title - A type name for the Link.
The purpose and usage of these attributes are described in the following sections.
Title Expansion Requests
A "title expansion" request is a request that specifies an additional query string parameter named "expand". The value of the parameter is one or more of the Link object title attributes. For example, a search request might produce a response that contains the following <link> elements:
rel="http://schemas.netflix.com/catalog/titles/synopsis" title="synopsis" />
rel="http://schemas.netflix.com/catalog/people.cast" title="cast" />
rel="http://schemas.netflix.com/catalog/people.directors" title="directors" />
To request "expansion" of these links in a subsequent request, you can include the following additional query string parameter in your request. Note that the values are a) comma delimited, and b) correspond to the "title" attributes of the <link> elements in the preceding example.
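For the three links in the preceding example, the parameter value would be along the lines of synopsis,cast,directors. A minimal sketch of building it follows. The BuildExpandParameter helper is hypothetical and is not taken from the article's source; it also leaves any percent-encoding of the value to whatever code assembles and signs the request, since that depends on how the request is built.

```csharp
public static class ExpandParameterSketch
{
    // Joins the link "title" values with commas and returns the
    // resulting "expand" query string parameter (name=value).
    public static string BuildExpandParameter(params string[] linkTitles)
    {
        return "expand=" + string.Join(",", linkTitles);
    }
}
```

Calling BuildExpandParameter("synopsis", "cast", "directors") yields a single name=value pair that can be appended to the other query string parameters of the request.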
You will see this implemented in the code for the lvResults_DoubleClick handler as:
The result of including this title expansion specification in the request is that the links in the response now include the details to which the links refer.
rel="http://schemas.netflix.com/catalog/person" title="Christina Ricci"></link>
rel="http://schemas.netflix.com/catalog/person" title="Bill Pullman"></link>
rel="http://schemas.netflix.com/catalog/person" title="Cathy Moriarty"></link>
rel="http://schemas.netflix.com/catalog/person" title="Eric Idle"></link>
rel="http://schemas.netflix.com/catalog/person" title="Malachi Pearson"></link>
rel="http://schemas.netflix.com/catalog/person" title="Ben Stein"></link>
rel="http://schemas.netflix.com/catalog/person" title="Don Novello"></link>
rel="http://schemas.netflix.com/catalog/person" title="Joe Nipote"></link>
rel="http://schemas.netflix.com/catalog/person" title="Joe Alaskey"></link>
rel="http://schemas.netflix.com/catalog/person" title="Brad Garrett"></link>
rel="http://schemas.netflix.com/catalog/person" title="Brad Silberling"></link>
There are several features to note in the preceding example:
- The synopsis detail is returned in a <![CDATA[...]]> section because the returned data contains embedded hyperlinks for references to cast members, directors, etc. Fortunately, the XmlDocument class handles unwrapping this for us when it loads the XML document. However, you should be aware that the synopsis is not plain text, and you may wish to strip the embedded HTML before using it, as I do in the example code for this article.
- Note that both the cast and directors links contain people objects. When parsing the expanded data you need to be aware of the context in which an object is returned, as in this case where it represents either cast members or directors.
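One possible way to reduce the synopsis to plain text is sketched below. It is not the stripping code from the article's download; the PlainSynopsis helper is hypothetical, and the regular expression is a naive tag remover that is adequate for simple anchor tags but is not a general HTML parser.

```csharp
using System.Text.RegularExpressions;
using System.Xml;

public static class SynopsisSketch
{
    // Returns the text of a <synopsis> element with embedded markup removed.
    // XmlDocument exposes the content of a CDATA section through InnerText.
    public static string PlainSynopsis(XmlNode synopsisNode)
    {
        string raw = synopsisNode.InnerText;
        // Naive tag removal: drops anything that looks like <...>.
        string noTags = Regex.Replace(raw, "<[^>]+>", string.Empty);
        // Collapse runs of whitespace left behind by the removed tags.
        return Regex.Replace(noTags, @"\s+", " ").Trim();
    }
}
```

HTML entities such as &amp;amp; inside the synopsis are left untouched by this sketch and would need a separate decoding step.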
Link Details Retrieval
The other option for detail retrieval is quite simple to understand now that I have described how title expansion works. As noted earlier, the Link object contains a fully qualified URL in the href attribute that can be used in a separate request to obtain the same detail information that title expansion returns within a single request.
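As a rough sketch of the first step of link detail retrieval, the following helper picks the href for a given link out of a <catalog_title> node. The FindHref name is hypothetical, and the sketch assumes the <link> elements are direct children of the node. The returned URL still has to be submitted as its own signed request, which is not shown here.

```csharp
using System.Xml;

public static class LinkLookupSketch
{
    // Returns the href of the first <link> child whose "title" attribute
    // matches linkTitle, or null when the node has no such link.
    public static string FindHref(XmlNode catalogTitleNode, string linkTitle)
    {
        foreach (XmlNode child in catalogTitleNode.ChildNodes)
        {
            XmlElement link = child as XmlElement;
            if (link != null && link.Name == "link" &&
                link.GetAttribute("title") == linkTitle)
            {
                return link.GetAttribute("href");
            }
        }
        return null;
    }
}
```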
Detail Retrieval Options
Now that I've explained title expansion and link detail retrieval, I'll return to the point I was making on the three potential options for retrieving the title details, which option I chose for the example code, and why.
- We could have specified title expansion in the original search request to obtain the details of interest for every title returned.
- We can use the information in the Link objects returned in the catalog search to fill in the details for each of the parts of the particular title of interest.
- We can submit a new request for information about the one title we're interested in, specifying title expansion for the details we need.
The pros and cons of each of these methods are summarized in the following table:
| Method | Pros | Cons |
|---|---|---|
| Search with title expansion | Retrieves detailed title information in a single request. | Returns additional unneeded data if only one title is of interest. |
| Link detail retrieval | Retrieves only the additional information needed. | Requires caching the catalog search results in order to obtain the links for the title of interest. |
| Title request with title expansion | Does not require caching the search results other than the title retrieval information. | Returns information that was already returned by the search request. |
The example application consists of two activities:
- The main application form is used to submit a title search request and to show the results of the search.
- When a title is double-clicked in the search results on the main form, a separate dialog window is displayed that shows the details for the title.
The Netflix service requests are handled a little more elegantly than in the previous article in this series. The features of the NetflixRequest class are:
- The class is derived from the OAuth.OAuthBase class, so the additional step of instantiating an OAuthBase class is not required. Because the public OAuthBase class members are inherited, they are also publicly available from the NetflixRequest class.
- There are two general purpose methods in the NetflixRequest class, one for Non-Authenticated requests and one for Signed requests. See the previous article in this series for further information about these two types of service requests.
- The service request results are streamed into an XmlDocument, which is the input parameter format for the parser.
- You will notice there is no exception handling in the NetflixRequest class. Exceptions are intentionally unhandled so that the application (or the caller) can handle them appropriately.
- The HTTP request is still synchronous, meaning that the service request does not return until the response has been received and loaded into the XmlDocument for return.
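The division of labor described above, where the request code lets exceptions propagate and the caller decides what to do with them, might look something like the following sketch. The ResponseLoaderSketch class and its method names are hypothetical and are not part of the NetflixRequest class; only the XML loading step is shown, not the HTTP request or the signing.

```csharp
using System.IO;
using System.Xml;

public static class ResponseLoaderSketch
{
    // Loads a response stream into an XmlDocument. Exceptions are
    // deliberately left unhandled here so that the caller sees them.
    public static XmlDocument Load(Stream responseStream)
    {
        XmlDocument doc = new XmlDocument();
        doc.Load(responseStream); // throws XmlException on malformed XML
        return doc;
    }

    // Caller-side handling: returns null instead of throwing when the
    // response body is not well-formed XML.
    public static XmlDocument TryLoad(Stream responseStream)
    {
        try
        {
            return Load(responseStream);
        }
        catch (XmlException)
        {
            return null;
        }
    }
}
```

A real caller would typically also catch the exceptions raised by the HTTP layer and report them to the user rather than silently returning null.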
The parser class is designed to receive an XmlDocument object that is loaded with the results of a service request. There are two public methods, ParseSearchResults and ParseTitleInfo, which are called by the application to parse the data returned from a catalog search or a title information search, respectively. Because both of these are dealing with catalog_title XML objects, they both funnel into a common set of private functions starting with ParseCatalogTitle.
ParseCatalogTitle walks through the catalog_title object and extracts the information for each node, and in the cases where the node is an object, constructs an equivalent data object for the node contents. A special case is the Link object, which may contain "expanded" title information. If this is the case, the title parser hands off to the ParseExpansion function. Note that the ParseExpansion function is only partially implemented in this example, with just a few of the link expansions, for the purpose of demonstrating the technique. I didn't want the code to be overly complicated and confusing, so I leave the full title expansion parsing implementation up to you.
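To make the hand-off concrete, here is a minimal sketch of an expansion parser that covers only the synopsis, cast, and directors links. It is not the ParseExpansion function from the download. The ExpandedDetailsSketch class is hypothetical, and the nesting it expects (a synopsis element inside the synopsis link, and a people element holding person links inside the cast and directors links) is an assumption based on the examples earlier in this article.

```csharp
using System.Collections.Generic;
using System.Xml;

// Hypothetical container for the expanded details of one title.
public class ExpandedDetailsSketch
{
    public string Synopsis;
    public List<string> Cast = new List<string>();
    public List<string> Directors = new List<string>();
}

public static class ExpansionParserSketch
{
    // Inspects one <link> element. If it carries expanded content, the
    // content is stored according to the link's own "title" attribute,
    // which is the context that tells cast and directors apart.
    public static void ParseExpansion(XmlElement link, ExpandedDetailsSketch details)
    {
        if (!link.HasChildNodes)
            return; // plain link, nothing was expanded

        switch (link.GetAttribute("title"))
        {
            case "synopsis":
                details.Synopsis = link.InnerText;
                break;
            case "cast":
                CollectPeople(link, details.Cast);
                break;
            case "directors":
                CollectPeople(link, details.Directors);
                break;
            // Other expansions are ignored in this sketch.
        }
    }

    // Adds the "title" attribute (the person's name) of every nested
    // person <link> to the target list.
    private static void CollectPeople(XmlElement link, List<string> target)
    {
        foreach (XmlNode person in link.SelectNodes("people/link"))
            target.Add(((XmlElement)person).GetAttribute("title"));
    }
}
```

Because the person links look identical in both lists, the only thing that separates a cast member from a director is which parent link the people element was found under.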
For this example I chose to use separate requests for the catalog search and the title details (remember the three detail retrieval options I was just talking about?). For the catalog search I chose to not use any title expansion, but rather request the minimum amount of information for the results. However, you can add a title expansion request parameter to see how title expansion works in a search request.
Title expansion is specified on the title details request in the lvResults_DoubleClick handler. To handle the title expansion results, the basic parser for the catalog search results was extended to process the additional data contained in expanded Link objects. As previously mentioned, the parser in this code example does not support all of the title expansions, only the few link types that have been discussed in this article, for the purpose of demonstration.
Running the Example Application
The main form for the example application is based on the code from the previous article in this series as an application that searches for Netflix titles. It requires three inputs: your consumer key, your consumer secret, and the term for which to search. Optionally, you can specify the maximum number of results to return (up to the Netflix-imposed limit of 100), or choose zero to return the default limit of 25 items.
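The handling of the optional result limit can be sketched as follows. The BuildQuery helper is hypothetical, and the parameter names term and max_results are assumptions about the search request rather than something shown in this article. Zero means "omit the parameter and accept the service default", and larger values are capped at the limit of 100 mentioned above.

```csharp
using System;

public static class SearchQuerySketch
{
    // The Netflix-imposed upper limit mentioned in the text.
    public const int NetflixMaxResults = 100;

    // Builds the query string for a title search. A maxResults of zero
    // (or less) leaves the parameter out so the service default applies.
    public static string BuildQuery(string term, int maxResults)
    {
        string query = "term=" + Uri.EscapeDataString(term);
        if (maxResults > 0)
        {
            int capped = Math.Min(maxResults, NetflixMaxResults);
            query += "&max_results=" + capped;
        }
        return query;
    }
}
```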
The results are returned in a ListView, showing a few key elements of each catalog item that was returned: the relevance (Rank), Netflix's title identity (ID), the title name, and the year the title was released. Double-clicking a title launches the TitleInfo dialog, which then requests the title information, this time asking for expanded information for the synopsis and cast. Extended details for the title are then displayed on the dialog form.
Note: Any resemblance between my example color scheme and that used by Netflix is purely coincidental.
Now I have to admit that the brute-force approach to parsing I took in this article may not be very elegant, but it was sufficient to illustrate how the Netflix catalog data is structured. There are many other ways the XML results can be processed, of course, so I will leave that up to you as an exercise now that I've explained the fundamentals.
This article has described and demonstrated how to submit search and title detail requests using the Netflix API, and how to parse and use the results. It also explains some of the concepts, and the options, for performing these tasks. In the next article in this series, I'll show you how to access a Netflix subscriber's account information using Protected Requests.
- September 3, 2009 - Original submission
- September 8, 2009 - Updated source code