Introduction
This article describes a client-side JScript workaround for Microsoft Internet Explorer's missing support for adding language-dependent quotation marks around <q> elements in particular, and also around <blockquote> elements.

Motivation
In the daily life of a webmaster you frequently need to cite an external source, write spoken dialogue, or do something similar. HTML 4.01, and therefore XHTML 1.0, defines the elements <q>, <blockquote>, and <cite> for these tasks. HTML 4.01 defines <q> and <blockquote> as follows.
So, according to the specification, the user agent, in our case Microsoft Internet Explorer 6, should automatically add language-dependent quotation marks before and after the text. Alas, life isn't always as rosy as the specifications picture it: Microsoft Internet Explorer 6 doesn't support this behaviour at all. Not only does it ignore the language dependence, it doesn't add quotation marks in the first place. There are, of course, ways to remedy this shortcoming, but they all require some work, they may not be equally useful, and some may even cause inconsistencies with other user agents, e.g. Mozilla and Opera. The list below sums up a few different approaches.

Locating the Hunting Grounds
Before we can start building the script we need to be aware of a few things from the specification:
As supporting content generation with Cascading Style Sheets would require writing a script that accurately parses CSS and applies formatting and content generation to the document elements, we will settle for a simpler solution: specifying in the script whether to add quotation marks to blockquotes. This should pose no great problem for the webmaster, but you lose a bit of flexibility.

Investigating the Document Structure
With what we've summarised so far, we should be able to figure out what the script should do with <q> and <blockquote> elements, but we also need to get to the elements somewhere, somehow. If we, for a moment, presume that our webpage is well-formed XML, then we might have a structure much like this:
The more programming-inclined among us will invariably recognize this as a tree, and what better way is there to traverse a tree than to use recursive functions? In particular, I will be using a preorder traversal of the tree. The diagram above doesn't entirely depict the internal document tree that Microsoft Internet Explorer generates from the page source, as it also has text-nodes for text in elements (as far as I can tell, text-nodes are only generated for block elements, i.e. they should never be generated for <q>, <a>, etc.). These text-nodes are characterized by having their
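To make the preorder traversal mentioned above concrete, here is a minimal sketch. It uses plain objects as a stand-in for the document tree rather than real Internet Explorer DOM nodes, and the names `tree`, `preorder`, and `order` are made up for this illustration; they are not taken from the script itself.

```javascript
// Minimal stand-in for a document tree: plain objects, not real DOM nodes.
var tree = {
  tag: 'body',
  children: [
    { tag: 'p', children: [ { tag: 'q', children: [] } ] },
    { tag: 'blockquote', children: [
      { tag: 'img', children: [] },
      { tag: 'p', children: [] }
    ] }
  ]
};

// Preorder traversal: handle the node itself first,
// then recurse into each child from left to right.
function preorder(node, visit) {
  visit(node);
  for (var i = 0; i < node.children.length; i++) {
    preorder(node.children[i], visit);
  }
}

// Record the order in which the elements are reached.
var order = [];
preorder(tree, function (node) { order.push(node.tag); });
// order is now: body, p, q, blockquote, img, p
```

Because a parent is always handled before its children, anything the parent establishes (its language, for example) is known by the time the children are visited.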
As we can see from the diagram, we could be in the situation where an image is the first element of a blockquote. An image element cannot contain HTML, so we need to take an alternate course of action in this case: inserting an extra text-node before the image. Fortunately this can be achieved easily using methods on the blockquote element. The same applies if the image element is the last child of the blockquote element.

Languages/Sprache/Sprog
The next big deal to cover before we go overboard and code happily through
the night is the tiny little phrase in the specification:
What language dependence is there to this? Quotation marks are just "..." and '...', are they not? It would be much too simple if all languages used the same quotation marks; life just doesn't work like that! Those of us who speak more than one language will, of course, notice this difference between languages more easily. For instance in Denmark text is quoted like this:
It is also possible to have quotations inside quotations. In general this means using a single-sign version of the outer quotation mark (except in English). To simplify matters I have chosen just to alternate the quotation marks as quotes are nested, and not to support any of the alternate quotation styles for the various languages. For instance, Danish and Norwegian each have two commonly used alternatives to the one presented in the table below.
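The alternation of marks with nesting depth could be sketched roughly as follows. The array layout is borrowed from the 'en' definition shown in the Adding Languages section. The helper names `quotePair` and `wrapQuote` are hypothetical, and the choice of double marks for the outermost level is an assumption rather than something the script is known to do.

```javascript
// Quote marks for English, laid out as in the article's 'en' definition:
// [0] opening single, [1] opening double, [2] closing single, [3] closing double.
var EN_QUOTES = ['\u2018', '\u201c', '\u2019', '\u201d'];

// Hypothetical helper: pick the pair of marks for a nesting level.
// Assumption: level 0 (outermost) uses the double marks, level 1 the
// single marks, level 2 the double marks again, and so on.
function quotePair(quotes, level) {
  var i = (level + 1) % 2; // 1 = double marks, 0 = single marks
  return { open: quotes[i], close: quotes[i + 2] };
}

// Hypothetical helper: wrap a piece of text in the marks for its level.
function wrapQuote(text, quotes, level) {
  var pair = quotePair(quotes, level);
  return pair.open + text + pair.close;
}

// A quotation nested inside another quotation.
var inner = wrapQuote('No', EN_QUOTES, 1);
var outer = wrapQuote('She said ' + inner + ' twice', EN_QUOTES, 0);
```

Supporting another language in this sketch would only mean passing a different four-element array. The alternation logic stays the same, which is the simplification the text describes.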
The table above has been constructed from the following references: English/American, Norwegian, German/French, Swedish. Only the Norwegian reference is an official one; most language councils do not publish their language's grammar and usage online (at least not anywhere I was able to locate). The Danish quotation marks have been taken from the official publication by the Danish Language Council. If you want to make corrections, give references for further languages, etc., feel free to contact me.

Harvesting the Fruits
Now that we have come all this way, from reading the specification to linguistic analysis, we are finally able to construct the script. There are a few things that we would like to keep optional, so the script is configured through a few global variables placed at the top of the script. This includes: whether to use
Apart from the configurability the script isn't much more than a few functions:
get_quotes
q_fix
As I have only had the time to test the script with Microsoft Internet Explorer 6, the function limits the script to that browser. It should be fairly straightforward to extend it to other versions if they support the full range of methods and properties as well. Following that, it queries whether the <html> element has the
parse_element
This is probably the most interesting part of the script, as it is the function that resolves all elements, places all quotation marks, and well... you get the picture. The first part examines the language of the passed element. If it is different from the language of the parent, the new language will be used (
The second part examines whether the current element is one of the elements listed in the
<blockquote>, on the other hand, is a great deal trickier, as it is a block element and as such can contain a lot of elements, including elements that cannot contain HTML/text themselves, e.g. <img>. The problems with placing the first quotation mark are mirrored in placing the last quotation mark within a <blockquote> element, so I will settle for explaining the first: If the element has no children then its
Lastly the quotation level will be increased if the element was in
That is all there is to it, really.

Integrating the Script
Integrating the script into your own pages is fairly painless: all it takes is an extra line added to your <head> section and calling
<html>
<head>
<title>My Page Title</title>
<script type="text/javascript" src="q_fix.js"></script>
</head>
<body onload="q_fix();">
...
</body>
</html>
That should be doable even for the most JScript-phobic webmasters out there (I hope).

Customizing the script
If you do not wish to reset the quotation nesting when the language changes somewhere down through the document, then find the variable
If you are only using HTML 4.01 and thus don't want to support the
If you do not wish to have quotation marks added to <blockquote> then find the line
Lastly, if you write your pages in a language other than English and don't want to place manual

Adding Languages
If the need arises you can manually add language definitions to the script, or change existing ones. If you navigate to the
case 'en': quotes[0] = '\u2018'; quotes[1] = '\u201c';
quotes[2] = '\u2019'; quotes[3] = '\u201d';
break;
First off you will want to copy this to a new block and change

The Future Pursuits
There are, of course, always things to improve, always things to add, always things to do, and never really enough time to do it in (ah, the joys of having a job). I rarely work with JScript so I can only presume what the efficiency of the script will be, but as far as I can reckon it touches each element only once, so it should be fairly efficient (we do need to touch every element down the tree to see whether the language changes). This might be extremely inefficient if you only have a few quotes on a page; in that case it might be more efficient to find just the quotation elements and walk up the document tree to determine the language.
The next step would be to support alternate quotation marks for various languages automatically, and also to expand the list of quotation marks for languages. The current number of languages is still fairly limited, but with a bit of luck it can increase steadily. If you want to contribute knowledge of quotation marks for some language, please include a book and/or web reference so that I can validate your claims.
Of course, the big pursuit would be to write a custom CSS parser in JScript that overrides the computations by IE so that we can support the content generation capabilities of CSS2, in particular the :before and :after pseudo elements. This is, however, a large endeavour to take on and not one that I am prepared to spend a lot of time on.

Notes and Acknowledgements
Thanks to Sean Kent for reviewing the article prior to submission.

References
Specifications
Language-related pages
Not all of the references above are formal, and some even contain errors, but in general they are informative and have been, in some form or another, useful.
Development-related pages
History