It is in our nature to use tools, and it is in our nature to make them. And one of our most powerful traits is our ability to take tools made by others and customize them to our own needs and preferences. Sadly, this is often an area where software tools, potentially some of the most flexible at our disposal, fail us. Consider the applications and utilities installed on your computer: how many of them nearly fit your mental model of how a tool could be used for certain tasks, but fall short in one area or another? If only you could make a few small modifications, how much easier and faster could your day-to-day work be accomplished?
Because of this, software that enables customization of other software - meta-tools, if you will - has existed for nearly as long as software itself. From the lowly macro assembler, to scripting frameworks, to complex patching tools, to the hardware-software combinations that allow today's game consoles to be used in ways their designers never intended, we have always found ways of fulfilling our drive to bend the work of others to our own ends.
Web applications, themselves a novel take on technology originally created for much simpler tasks, are ripe for customization. The underlying design, a client-server system with well-documented communications protocols and data formats, is far more open to enhancement than nearly any other platform in common use. Even the much-lamented existence of "browser wars" speaks of opportunities that just don't exist in other areas - even the ability on *nix to choose from among different window managers pales in comparison.
One of the most recent developments toward this end has been the creation of tools that allow bits of code, "user scripts", to be inserted into a web page when it loads, thereby allowing users to enhance, automate, and extend websites and web apps in ways the authors of those sites were unable or unwilling to consider. One of the most popular of these meta-tools is the GreaseMonkey extension for the Firefox browser. Since its release, thousands of scripts have been written to do everything from making poorly-designed sites more accessible for those with disabilities, to the so-called "mashup scripts" that combine information or functionality from two or more sites into a single interface.
In this article, I will present a number of simple enhancements to forums on The Code Project website, and attempt to give some insight into the power - and the limitations - of tools such as GreaseMonkey. As you may know, The Code Project is a website dedicated to articles on code and programming-related topics, primarily for the Windows platform. One of its best attributes is the forums it provides, both those attached to each article and those dedicated to general topics such as MFC and .NET. Unlike many such forums, Code Project's integrate well into the rest of the site, and provide an especially good venue for discussion and debate on the topics presented in articles. Although they can be viewed in several different ways, the scripts I present here are intended for use in the most common format: the DHTML "Message View", where discussion threads and replies appear as subject lines that expand when clicked to display the full message.
Tools: This Is GreaseMonkey
We'll start out by gathering some useful tools. I'm not going to go into detail on how any of these are used, but consider spending some time on your own playing with them.
GreaseMonkey is the most important, so go download and install it first:
http://greasemonkey.mozdev.org/ (Of course, you should download the version for Firefox 1.5. If you don't have Firefox 1.5, then download and install it.)
Finally, download and install the FireBug extension. Logging, error display, DOM inspection and modification... this tool has too many features to properly summarize. Just trust me, if you're doing any sort of web development, you want it:
DHTML Tweaks: Expand All
(note: I can't upload .js files to CP, so I'm hosting these myself.)
Although the default view (messages collapsed) is generally preferable, it is occasionally useful to make the text of all messages visible at once (for instance: in order to use the browser's Find feature to search through message texts). This script adds a button next to the (Refresh) link on each forum; when clicked, it expands all displayed messages:
var refreshLink = document.evaluate("//a[text()='Refresh']", document, null,
    XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
if ( refreshLink )
{
    var expandBtn = document.createElement("BUTTON");
    expandBtn.className = "FormButton";
    expandBtn.textContent = "Expand All";
    expandBtn.addEventListener("click", function()
    {
        var forumTable = document.getElementById("ForumTable");
        if ( !forumTable )
            return;
        // collect the message rows and make each one visible
        var messages = document.evaluate("./tbody/tr/td/table/tbody/tr",
            forumTable, null, XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
        if ( messages )
            for (var i=0; i<messages.snapshotLength; ++i)
                messages.snapshotItem(i).style.display = "";
    }, false);
    refreshLink.parentNode.insertBefore(expandBtn, refreshLink);
}
Things to note:
- The comment block: This provides information to Greasemonkey, including patterns for determining the URLs that the script should be applied to, a namespace for avoiding naming collisions, a description, and the name itself. This is intended to be human-readable as well, so it makes sense to give the user some good introductory information.
- document.evaluate(): Use XPath on a DOM document. Mozilla provides this for both XML and HTML documents, so it makes sense to take advantage of it - manually traversing the DOM can be considerably slower, not to mention more tedious to write. Keep in mind, though, that it is easy to write inefficient XPath - in this script, I start out by searching the entire document for a link with the text "Refresh"; I could easily provide an explicit path to the link I'm after, but doing it this way makes the script better able to handle changes to the document structure. This resiliency is important, and since the search only happens once, I consider the cost acceptable. By contrast, in the event handler for the Expand button I use a much more specific XPath to collect the collapsed elements. While this will likely break if the forum structure changes, it is quite possible the method for expanding posts would also need to change in that case, so little resiliency would be gained by using a slower, less-specific XPath.
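For reference, a metadata block for a script like this might look roughly as follows. The name, namespace URL, and @include pattern shown here are illustrative placeholders, not the actual values from the scripts in this article:

```javascript
// ==UserScript==
// @name          CP Expand All
// @namespace     http://www.example.com/gmscripts
// @description   Adds an "Expand All" button next to the Refresh link on Code Project forums
// @include       http://www.codeproject.com/*
// ==/UserScript==
```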
While developing these and other scripts, I often found it helpful to create dummy threads to test various things. Once I'd finished testing, deleting these posts would be rather tedious - the [delete] link on each post navigates first to a confirmation page, where pressing a button finally deletes the post and reloads the original forum. Creating a faster means of deleting posts was essential.
This script uses GM_xmlhttpRequest() to communicate with the server behind the scenes: first loading the confirmation page to retrieve the password, then making the actual delete request. This is similar to using the XMLHttpRequest objects available in most modern browsers, but with the added advantage of being able to make requests across domains. I don't use that particular feature in this script, so it could actually be written to use XMLHttpRequest instead; I use GM_xmlhttpRequest() because it is convenient.
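The two-step flow might be sketched like this. This is only an outline of the approach, not the script's actual code: the function names, the "Delete" form field, and the error markers are my assumptions (the hidden ArticlePassword field does appear on the real confirmation page), and the password extraction is passed in as a callback:

```javascript
// Placeholder error check: a quick scan of the start of the response for
// an error page.  The marker text here is an assumption, not CP's real wording.
function checkError(responseText) {
    return /\berror\b/i.test(responseText.substring(0, 500));
}

// confirmUrl: href of a [delete] link; extractPassword: callback that pulls
// the hidden password field out of the confirmation page's HTML.
function deletePost(confirmUrl, extractPassword, onDone) {
    // step 1: fetch the confirmation page behind the scenes
    GM_xmlhttpRequest({
        method: "GET",
        url: confirmUrl,
        onload: function(details) {
            if (details.status != 200 || checkError(details.responseText))
                return onDone(false);
            var pass = extractPassword(details.responseText);
            // step 2: submit the confirmation form, performing the delete
            GM_xmlhttpRequest({
                method: "POST",
                url: confirmUrl,
                headers: { "Content-Type": "application/x-www-form-urlencoded" },
                data: "ArticlePassword=" + encodeURIComponent(pass) +
                      "&Delete=Delete",   // assumed submit-button field
                onload: function(d) {
                    onDone(d.status == 200 && !checkError(d.responseText));
                }
            });
        }
    });
}
```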
Points of interest:
- Event handling: I don't add onclick handlers to the delete links themselves, but to the forum table - I determine whether or not to act within the click handler. I wouldn't recommend this in most situations, but it comes in handy now and then, so I thought it was worth demonstrating.
- checkError(): If I screwed up somewhere (allowed deleting posts not owned by the user, retrieved an incorrect password, etc.) then CP would return a page detailing the error (this might also occur if the protocol for deleting posts changes at some point). This function does a quick and dirty parsing of the response text, looking for such errors. Other potential failure scenarios include server errors and network timeouts - these will be caught by examining the response status code.
- HTML parsing: While a simple RegExp would be enough to extract the password from the first response, such parsing would quickly become unwieldy for large, complex forms. Fortunately, Mozilla has an excellent HTML parser, and with a couple of simple hacks its power is ours:
var tmpDiv = document.createElement("DIV");
tmpDiv.innerHTML = responseDetails.responseText;
This is all it takes to parse the response text as HTML. Although the HTML we're parsing potentially contains many elements which aren't at all valid children of a DIV element, Mozilla's HTML parser was built to handle the often-atrocious pages of the World-Wide Web - it keeps its sanity. Note also the extraction step:
var pass = document.evaluate(".//input[@name='ArticlePassword']/@value",
tmpDiv, null, XPathResult.STRING_TYPE, null).stringValue;
Here we use XPath to extract the desired information from our newly-parsed DOM. Note that the XPath is evaluated relative to tmpDiv, so it works even though tmpDiv isn't actually in the primary DOM tree.
The power of DOM and XSL: Printer-friendly forums
By now, you've probably realized that Mozilla's powerful DOM implementation makes many enhancements a good deal easier than they would be on other platforms. But wait! There's more! If you've ever used XSL templates to transform XML, you know how they can reduce the drudgery of such tasks. Well, Mozilla's XSLTProcessor works just as well on HTML DOMs as on XML DOMs. That makes this script pretty boring - it adds a "Printer friendly" link to each forum which, when clicked, replaces the page with the forum contents, formatted suitably for printing. The fun comes in with the two XSL templates the script uses to actually accomplish the reformatting.
<xsl:variable name="forumid"
    select="substring-before(substring-after(//a[text()='Refresh']/@href, 'forumid='), '&amp;')"/>

<!-- $title1: lounge -->
<!-- $title2: article -->
<!-- $title3: forum -->
<!-- $title4: survey / news -->
<xsl:choose>
    <xsl:when test="$title1"><xsl:value-of select="$title1"/></xsl:when>
    <xsl:when test="$title2"><xsl:value-of select="$title2"/></xsl:when>
    <xsl:when test="$title3"><xsl:value-of select="$title3"/></xsl:when>
    <xsl:when test="$title4"><xsl:value-of select="$title4"/></xsl:when>
</xsl:choose>
This handles the ugly task of extracting information from the various forums. Lots of hairy XPath and special cases (shown: the code used to determine the forum title). The output is an XML representation of the forum, which is then fed into the second template.
This has the much more pleasant job of transforming the XML representation into a printer-friendly HTML representation. Naturally, it's much smaller and easier to read. The one mildly interesting bit (shown) involves keeping rowdy signature HTML from appearing outside the containing element.
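Driving the transformation itself takes very little code. The sketch below shows the general shape under the assumption that the template source has already been parsed into a DOM (xslDoc); the function name is mine, not the script's:

```javascript
// Transform a document with an XSL template using Mozilla's built-in engine.
// xslDoc:   the XSL template, parsed into a DOM
// sourceDoc: the document to transform - an HTML DOM works just as well as XML
function printerFriendly(xslDoc, sourceDoc) {
    var proc = new XSLTProcessor();
    proc.importStylesheet(xslDoc);
    return proc.transformToDocument(sourceDoc);
}
```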
Using libraries: Syntax Highlighting
Greasemonkey scripts are run in a slightly separate environment from those embedded in the actual web page being modified. This is partially for security reasons (since GM scripts can do things like add menus and make cross-site requests, it would be dangerous to make it too easy for malicious pages to hook into them), and partially for practical reasons (it would be unfortunate if pages started breaking because of namespace collisions or other unforeseen script interference). Regardless, it is almost always in the best interest of the user to keep it this way. However, there's no reason why, on sites where we have a good understanding of the scripting already in place, we can't use GM to insert additional unprivileged scripts into the page itself.
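Inserting such a script is just a matter of creating a <script> element and appending it to the page; a minimal sketch, with an illustrative function name:

```javascript
// Append an external script to the page; it will run with the page's own
// (unprivileged) scripts, not in Greasemonkey's privileged environment.
function injectScript(url) {
    var script = document.createElement("script");
    script.type = "text/javascript";
    script.src = url;
    document.getElementsByTagName("head")[0].appendChild(script);
}
```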
var forumLUT =
{
    1649  : "c#",
    12076 : "c#",
    1650  : "c#",
    1646  : "vb",
    1725  : "sql",
    3421  : "xml"
};

var langLUT =
{
    "c#"  : "shBrushCSharp.js.txt",
    "vb"  : "shBrushVb.js.txt",
    "sql" : "shBrushSql.js.txt",
    "xml" : "shBrushXml.js.txt"
};
My script has explicit mappings for several programming forums, so that the default coloring will be correct for those. For the rest, adding a class="language" attribute to the <pre> tag surrounding the code will trigger the "brush" (language-specific code) for that language to be loaded and applied.
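The selection logic is simple enough to sketch; the brushFor helper and its fallback parameter are my own names for illustration, using the lookup tables shown above:

```javascript
// Lookup tables as in the script: forum id -> language, language -> brush file
var forumLUT = { 1649: "c#", 12076: "c#", 1650: "c#",
                 1646: "vb", 1725: "sql", 3421: "xml" };
var langLUT  = { "c#": "shBrushCSharp.js.txt", "vb": "shBrushVb.js.txt",
                 "sql": "shBrushSql.js.txt",   "xml": "shBrushXml.js.txt" };

// Pick the brush file for a post: an explicit forum mapping wins; otherwise
// fall back to the class attribute on the <pre> tag (e.g. class="vb").
function brushFor(forumId, preClass) {
    var lang = forumLUT[forumId] || preClass;
    return lang ? (langLUT[lang] || null) : null;
}
```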
Now, everyone likes easy-to-read code, but stop and think for a minute about how powerful this technique is. I'm loading these scripts from CP to avoid adding to the load on Alex's server, but for sites that don't allow uploads I could just as easily host them on my own server. There are a growing number of excellent 3rd-party libraries that can be used in this way.
The ever-increasing capability of tools such as Greasemonkey provides vast opportunity for bringing new and useful changes to sites like CodeProject. I hope this article has sparked your interest in writing site-specific enhancements, and I look forward to seeing what those of you inclined to experiment are able to accomplish.
For a more in-depth tutorial, please read Mark Pilgrim's Dive Into Greasemonkey:
For an alternate method of scripting websites, see Chickenfoot: