Sorting Book Titles in XSLT

By Emma Burrows

An example of using XSL to achieve a custom sort order - in this case, sorting titles by the first word that isn't an article.
XML, XSLT, Windows, Visual Studio, Dev
Posted:28 Jan 2006

Introduction

One of my sites allows people to post their short stories, and I'd always been bothered by the fact that when the titles were sorted alphabetically, I would have a long list under T, where all the titles beginning with "The" would congregate.

Traditional libraries use a sort order called the grammatic order, where titles are categorized based on the first significant word. In theory, this word could be any one in the title, but in practice nowadays, it'll usually be the first word that isn't an article (i.e. A/An or The). So instead of this plain alphabetical list:

  • A Tale of Two Cities
  • The Bostonians
  • The Importance of Being Earnest
  • War and Peace

The titles should be sorted as follows:

  • The Bostonians
  • The Importance of Being Earnest
  • A Tale of Two Cities
  • War and Peace

(In fact, to reduce confusion, the titles could be listed as, for example, "Tale of Two Cities, A", but I'll just concentrate on sorting in this particular article.)

When I was looking into methods to achieve this on my web site, I soon discovered that the XSL <xsl:sort> element was the solution to my problem, as it can be used to create a completely custom sort order for a set of XML data. Similar solutions to this one can be found elsewhere on the web, but they don't often include explanations of how the methods work. The rest of this article explains a simple way to build up a select expression to do this type of title sorting.

Customising xsl:sort

In the examples below, I will assume that we are working with an XML source that has the following structure (a complete XML file and the corresponding stylesheet are included in the source zip file):

<Stories>
  <Story>
    <Title>War and Peace</Title>
  </Story>
  <Story>
    <Title>The Bostonians</Title>
  </Story>
...
</Stories>

In order to sort these elements by the complete title, you would simply use code like this:

<xsl:for-each select="Story">
  <xsl:sort select="Title"/>
  <xsl:value-of select="Title"/>
</xsl:for-each>

When an xsl:sort element is present, the XSLT processor determines the sort order by evaluating its select expression for each element to be sorted and using the result, converted to a string, as that element's sort key. In this case, as it iterates through the Story elements, the processor will check the value of Title and work out where that particular element should go in the sorted list. The result will be a straightforward alphabetical list similar to the first list in the Introduction.

However, we actually want some of the titles to be sorted according to the second word, not the first, and in those cases, we need the processor to evaluate the Title string starting after the first space. The substring-after function is ideal for this purpose:

<xsl:sort select="substring-after(Title, ' ')"/>

The result of substring-after(Title, ' ') will be everything after the first space in the Title element. As it iterates through the elements, the XSLT processor will be sorting this set:

  • and Peace
  • Bostonians
  • Importance of Being Earnest
  • Tale of Two Cities

Unfortunately, while this will indeed put The Bostonians under "B", it will also put War and Peace under "A". What's more, titles which do not contain any spaces all end up at the beginning of the list, in document order: substring-after returns an empty string when there is no space to find, so these titles all share the same empty sort key, which sorts ahead of everything else.

We need to be more specific about which titles need to be sorted by their second word. Fortunately, the substring function can help with this. The result of the following function will be everything after the first space, but only if the Title element starts with "The " (we need to include the space after "The" so that titles like "Thesaurus" aren't included as well):

substring(substring-after(Title, ' '), 0 div starts-with(Title, 'The '))

The second parameter of the substring function is a number giving the position at which the substring should start. Positions are counted from 1, but a start of 0 simply returns the whole string, and there is the added benefit that if the number is NaN (not a number), the function returns an empty string. Here the parameter is built from a starts-with function, which returns a boolean true if the string starts with "The " and false if not. In XPath 1.0, true and false become 1 and 0 when converted to numbers, which is what the div operator does to its operands.

In this case, dividing 0 by the boolean value returned by the starts-with function toggles the value between 0 (0 div 1) and NaN (0 div 0 - not a number). If the value is 0, the function returns the title minus its leading word, so it will be sorted by the second word as described above. If the value is NaN, however, the function returns an empty string. Used as a sort key on its own, this would push all the other titles to the top of the list in document order, which is why a second expression is needed for them.
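
To see the toggle in action, here is a minimal diagnostic stylesheet. It is a sketch written for this explanation, not part of the source zip file, and it assumes an XSLT 1.0 processor and the Stories/Story/Title structure shown earlier. For each title it prints the value of the division and the resulting substring:

```xml
<?xml version="1.0"?>
<!-- Diagnostic sketch only: assumes XSLT 1.0 and a Stories root element. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/Stories">
    <xsl:for-each select="Story">
      <!-- Column 1: the original title. -->
      <xsl:value-of select="Title"/>
      <xsl:text> | </xsl:text>
      <!-- Column 2: 0 when the title starts with "The ", NaN otherwise. -->
      <xsl:value-of select="0 div starts-with(Title, 'The ')"/>
      <xsl:text> | </xsl:text>
      <!-- Column 3: the title minus its first word, or an empty string. -->
      <xsl:value-of select="substring(substring-after(Title, ' '),
          0 div starts-with(Title, 'The '))"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For The Bostonians, the second column should read 0 and the third Bostonians; for War and Peace, the second column should read NaN and the third should be empty.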

At this point, we can sort the titles beginning with "The " correctly, so we now need to sort the other titles as well. This is done using a similar substring function:

substring(Title, 0 div not(starts-with(Title, 'The ')))

In this case, the first parameter is simply Title, since we do want these titles to be sorted by the whole value of the element. The second parameter relies on the same evaluation as above, but the not function reverses the test: the value is 0 when the title doesn't start with "The " and NaN when it does.

We now have functions to sort ordinary titles alphabetically and titles starting with "The " by their second word. The next step is to put them together. Since the two substring functions are mutually exclusive (for any given title, one returns a string and the other returns an empty string), we can use the concat function to stick them together:

<xsl:sort select="concat(substring(substring-after(Title, ' '), 
       0 div starts-with(Title, 'The ')),
       substring(Title, 0 div not(starts-with(Title, 'The '))))"/>

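The result is a list in which the titles are sorted correctly. This functionality can be extended further by simply adding extra criteria to the substring functions. Because div binds more tightly than or, the combined test has to be wrapped in boolean() (or not()) so that it is evaluated as a whole before the division, as in the following example: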

<xsl:sort select="concat(substring(substring-after(Title, ' '), 
  0 div boolean(starts-with(Title, 'A ') or starts-with(Title, 'An ') 
  or starts-with(Title, 'The '))),
  substring(Title, 0 div not(starts-with(Title, 'A ') 
  or starts-with(Title, 'An ') or starts-with(Title, 'The '))))"/>
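
For readers who want to try the sort outside of a larger project, the expression above can be dropped into a complete stylesheet along the following lines. This is a minimal sketch rather than the stylesheet from the source zip file: the text output method and the one-title-per-line layout are assumptions made for illustration, and it targets XSLT 1.0, since the trick relies on XPath 1.0 converting booleans to numbers (a 2.0 processor may reject it unless it is running in backwards-compatible mode).

```xml
<?xml version="1.0"?>
<!-- Sketch only: assumes XSLT 1.0 and a Stories root containing Story/Title elements. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/Stories">
    <xsl:for-each select="Story">
      <!-- Sort key: the title without its leading article, or the whole title otherwise. -->
      <xsl:sort select="concat(
          substring(substring-after(Title, ' '),
            0 div boolean(starts-with(Title, 'A ') or starts-with(Title, 'An ')
              or starts-with(Title, 'The '))),
          substring(Title,
            0 div not(starts-with(Title, 'A ') or starts-with(Title, 'An ')
              or starts-with(Title, 'The '))))"/>
      <!-- Output the original, unmodified title, one per line. -->
      <xsl:value-of select="Title"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Run against the four titles from the Introduction, this should list The Bostonians, The Importance of Being Earnest, A Tale of Two Cities and War and Peace, in that order. Only the sort key is altered; the titles themselves are output unchanged.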

Points of interest

While it can be cumbersome and rather verbose at times, the XSL language makes up for it by being incredibly powerful. While designing my website, I was able to achieve things which would have been extremely difficult with just straight ASP and ADO. I hope this example has been a useful introduction to custom sorting.

History

  • January, 2006 - First version.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Emma Burrows


Emma's first steps in programming took place at primary school over twenty years ago, thanks to a TI-99/4A and the LOGO language. Following a Master's degree in English Studies (obtained, strangely enough, with a paper on the birth of the microcomputer), Emma started her career in IT.

Over the last ten years, she has worked as a localiser, technical writer, editor, web designer, systems administrator, team leader and support engineer for companies ranging from Microsoft to the more modest British software company Equisys.

Emma is currently expecting another baby after taking a break to have a first baby. Some day, perhaps when both babies have gone to university, she hopes to improve her C# and web development skills and, of course, write more articles for Code Project.
Occupation: Web Developer
Location: United Kingdom

Last Updated: 28 Jan 2006
Editor: Rinish Biju
Copyright 2006 by Emma Burrows