XHTML, from an HTML starting point

29 Jan 2002
A starting point for people familiar with HTML who want to start using XHTML

Summary

This article will help people familiar with HTML start producing XHTML 1.0 compliant documents. The approach is very simple, as is XHTML. However, it is this very simplicity that baffles some web developers and designers.

This article is not an in-depth look at XML or at why XHTML has replaced HTML.

Requirements

A basic understanding of HTML and CSS is recommended for this article. No JavaScript, ASP or XML knowledge is required, though XML knowledge will help in understanding the XHTML approach and the reasons for it.

Why change to XHTML?

In this article I won't be going into the whys of using XHTML or the benefits involved. That will be a topic for a later article. However if you want some good reasons to use XHTML then check these links out:

Won't XHTML break my sites in visitors' browsers?

Put simply, no. XHTML is very backwards compatible, and a page coded using XHTML 1.0 Transitional will work in all browsers that support HTML 4.01. The W3C have done a very good job of moving web documents closer to XML without breaking compatibility or sending more web developers over the proverbial cliff.

HTML vs. XHTML Examples

So you want to get started, either creating new XHTML compliant documents or converting your current HTML 4.01 documents into XHTML 1.0 documents. Let's start with some actual HTML vs. XHTML, and then move on to the differences in point form.

An HTML document

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
    <head>
        <title>HTML to XHTML Example: HTML page</title>
        <link rel="Stylesheet" href="htmltohxhtml.css" type="text/css" media="screen">
        <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
    </head>
    <body>
        <p>This is the HTML page. It works and is encoded just like any HTML page you    
         have previously done. View <a href="htmltoxhtml2.htm">the XHTML version</a> of 
         this page to view the difference between HTML and XHTML.</p>
        <p>You will be glad to know that no changes need to be made to any of your CSS files.</p>
        <hr>
        <h1>Standards</h1>
        <p>Standards are important for, and this is only one reason, the simple fact that with a 
         standardised web you will only have to code your site once and it will work on all 
         browsers, on all platforms and on all devices.</p>
        <p>Following are some useful web standards links.</p>
        <h2>Useful Links</h2>
        <table cellpadding="0" cellspacing="0">
            <tr class="tblheader">
                <td>Name</td>
                <td>Link</td>
            </tr>
            <tr>
                <td class="tbldata">Web Standards Project, WASP</td>
                <td class="tbldata"><a href="http://www.webstandards.org">webstandards.org</a></td>
            </tr>
            <tr>
                <td class="tbldata">The W3C</td>
                <td class="tbldata"><a href="http://www.w3c.org">w3c.org</a></td>
            </tr>
            <tr>
                <td class="tbldata">XHTML, HTML Validator</td>
                <td class="tbldata"><a 
                 href="http://validator.w3.org/">validator.w3.org/</a></td>
            </tr>
            <tr>
                <td class="tbldata">New York Public Library Style Guide</td>
                <td class="tbldata"><a 
                 href="http://www.nypl.org/styleguide/">nypl.org/styleguide/</a></td>
            </tr>
            <tr>
                <td class="tbldata">Standards Evangelist, Paul Watson</td>
                <td class="tbldata"><a 
                 href="mailto:paulmwatson@email.com">paulmwatson@email.com</a></td>
            </tr>
        </table>
        <hr>
        <p>
            <a href="http://validator.w3.org/check/referer"><img border="0" 
             src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!" 
             height="31" width="88"></a>
        </p>
    </body>
</html>
This is a well formed and valid HTML 4.01 Transitional document. You can validate it against the W3C HTML Validator Service.

An XHTML document

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>HTML to XHTML Example: XHTML page</title>
        <link rel="Stylesheet" href="htmltohxhtml.css" type="text/css" media="screen" />
        <meta http-equiv="Content-Type" content="text/html;charset=UTF-8" />
    </head>
    <body>
        <p>This is the XHTML page. As you can see the result between the two pages 
         is identical, even though one is in HTML 4.01 and the other is in XHTML 1.0. View 
         <a href="htmltoxhtml.htm">the HTML version</a> of this page to view the difference 
         between HTML and XHTML.</p>
        <hr />
        <h1>Standards</h1>
        <p>Standards are important for, and this is only one reason, the simple fact that 
         with a standardised web you will only have to code your site once and it will work 
         on all browsers, on all platforms and on all devices.</p>
        <h2>Useful Links</h2>
        <p>Following are some useful web standards links.</p>
        <table cellpadding="0" cellspacing="0">
            <tr class="tblheader">
                <td>Name</td>
                <td>Link</td>
            </tr>
            <tr>
                <td class="tbldata">Web Standards Project, WASP</td>
                <td class="tbldata"><a 
                  href="http://www.webstandards.org">webstandards.org</a></td>
            </tr>
            <tr>
                <td class="tbldata">The W3C</td>
                <td class="tbldata"><a href="http://www.w3c.org">w3c.org</a></td>
            </tr>
            <tr>
                <td class="tbldata">XHTML, HTML Validator</td>
                <td class="tbldata"><a 
                  href="http://validator.w3.org/">validator.w3.org/</a></td>
            </tr>
            <tr>
                <td class="tbldata">New York Public Library Style Guide</td>
                <td class="tbldata"><a href="http://www.nypl.org/styleguide/">nypl.org/styleguide/</a></td>
            </tr>
            <tr>
                <td class="tbldata">Standards Evangelist, Paul Watson</td>
                <td class="tbldata"><a 
                  href="mailto:paulmwatson@email.com">paulmwatson@email.com</a></td>
            </tr>
        </table>
        <hr />
        <p>
            <a href="http://validator.w3.org/check/referer"><img border="0" 
             src="http://www.w3.org/Icons/valid-xhtml10" alt="Valid XHTML 1.0!" 
             height="31" width="88" /></a>
        </p>
    </body>
</html>    
    
This is a well formed and valid XHTML 1.0 Transitional document. You can validate it against the W3C HTML Validator Service.

The Differences

Frankly, the difference between HTML 4.01 and XHTML 1.0 is almost laughably small. Don't think you are missing something important just because it is so easy; you aren't. I will list the differences and then explain each one in detail:

  • DOCTYPE reference has changed
  • xmlns reference in the HTML tag
  • All tags in lowercase
  • Valid structure
  • Attribute quotes are mandatory
  • "Empty" tags must be closed now
That is it, nothing very earth-shattering at all. Let's get into the details.

DOCTYPE

Naturally from HTML 3 to HTML 4.01 your DOCTYPE changed. Similarly from HTML 4.01 to XHTML 1.0 your DOCTYPE must change.

What is a DOCTYPE? Simply put, it is a declaration at the top of your document that states which standard or specification the document follows. You are telling the web browser (and the validator) that what follows conforms to a certain specification, e.g. XHTML 1.0 or HTML 4.01, and the browser can then take advantage of this knowledge. It is becoming very important to use a DOCTYPE declaration, and in fact it is mandatory for XHTML 1.0. If you leave it out, your document is not valid XHTML 1.0, and browsers that choose a rendering mode based on the DOCTYPE will typically fall back to their older, less standards-compliant behaviour.

If you are writing ASP pages then put the DOCTYPE just under the <%@ Language=VBScript %> declaration. Essentially the client's web browser must see the DOCTYPE at the very top of the web document.

An HTML 4.01 DOCTYPE looks like this: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

So for your XHTML documents simply put <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> at the top of your page.
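If you want to check which DOCTYPE a page actually declares, a few lines of script will do. The following is only a rough sketch, assuming Python and its standard xml.dom.minidom module; the function name doctype_public_id and the sample page are made up for illustration, and the sample uses the absolute W3C URL as its system identifier. It reads the public identifier back out and shows that a lowercase doctype keyword is rejected by an XML parser.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical sample page, trimmed down to the bare minimum.
XHTML_PAGE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n'
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<head><title>Example</title></head>'
    '<body><p>Hello</p></body></html>'
)


def doctype_public_id(markup):
    """Return the DOCTYPE public identifier, or None when no DOCTYPE is present."""
    doctype = minidom.parseString(markup).doctype
    return doctype.publicId if doctype is not None else None


print(doctype_public_id(XHTML_PAGE))

# The keyword DOCTYPE must be uppercase; an XML parser refuses the lowercase form.
try:
    doctype_public_id(XHTML_PAGE.replace("<!DOCTYPE", "<!doctype"))
except ExpatError as error:
    print("rejected:", error)
```

Note that this only tells you what the page claims to be. Checking that the page really conforms to that DOCTYPE is still a job for the validator.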

xmlns

The xmlns, or XML namespace, declaration tells the browser that the elements in the document belong to the XHTML vocabulary, which is identified by a W3C URI. The mechanism is carried over from XML and has no equivalent in HTML 4.01. People familiar with VML will recognise this usage.

You should place xmlns="http://www.w3.org/1999/xhtml" in the html tag, together with the xml:lang="en" and lang="en" attributes that declare the document's language, like so:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
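To see what the namespace declaration does from an XML tool's point of view, here is a small sketch, assuming Python and its standard xml.etree.ElementTree module. The sample page and the constant names are invented for illustration. The parser reports every element with the XHTML namespace attached, including children that never mention xmlns themselves.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
XML_NS = "http://www.w3.org/XML/1998/namespace"  # namespace behind the xml: prefix

# Hypothetical minimal page using the html tag shown above.
PAGE = (
    '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">'
    '<head><title>Example</title></head>'
    '<body><p>Hello</p></body></html>'
)

root = ET.fromstring(PAGE)

# The default namespace applies to the html element and everything inside it.
print(root.tag)
paragraph = root.find("{%s}body/{%s}p" % (XHTML_NS, XHTML_NS))
print(paragraph.tag, paragraph.text)

# The two language attributes are separate: one plain, one in the xml namespace.
print(root.get("lang"), root.get("{%s}lang" % XML_NS))
```

Without the xmlns attribute the same parser would report a plain html element, which an XML-aware tool has no reason to treat as XHTML.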

All tags in lowercase

Since XHTML is an application of XML, it is case sensitive. This means that <STRONG> is not the same thing as <strong>.

What this means to you is that from now on you should put all tags and attributes in lowercase, not in uppercase or a mix of the two.

*On this topic: As with the English language, there are exceptions to every rule. In this case, ensure that your DOCTYPE declaration has DOCTYPE in uppercase. If you don't, then it is not valid and the browser or validator won't pick the declaration up. I found this out the hard way :)
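The case sensitivity is easy to demonstrate with any XML parser. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name parses is my own. An opening <STRONG> closed by </strong> is a mismatch, and even a consistently uppercase element is simply a different element from the lowercase one.

```python
import xml.etree.ElementTree as ET


def parses(markup):
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Mixed case: the closing tag does not match the opening tag.
print(parses("<p><STRONG>mixed case</strong></p>"))

# Consistent case parses, but the two element names are not equal.
upper = ET.fromstring("<STRONG>text</STRONG>")
lower = ET.fromstring("<strong>text</strong>")
print(upper.tag, lower.tag, upper.tag == lower.tag)
```

A parser accepting <STRONG> only means the markup is well formed. The XHTML 1.0 DTD defines strong in lowercase only, so the validator will still reject the uppercase version.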

Valid structure

A lot of web developers create invalidly structured HTML; I know I used to. For instance, this snippet:

<p><b>This is invalid</p></b>
is not valid because the paragraph tag is closed inside the bold tag, while the bold tag is opened inside the paragraph tag. However, browsers will render it as HTML without even a warning.

XHTML 1.0, however, will crack down on this and your web document will not be valid. To be valid you should maintain a valid structure, like so:

<p><b>This is valid</b></p>
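You can catch this kind of mistake before it ever reaches the validator. The following is a rough sketch, assuming Python and its standard xml.etree.ElementTree module; is_well_formed is a name I made up. It feeds the overlapping and the properly nested snippets to an XML parser and reports which one it accepts.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup is well-formed XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


overlapping = "<p><b>This is invalid</p></b>"  # p closes while b is still open
nested = "<p><b>This is valid</b></p>"          # b closes before p does

print(is_well_formed(overlapping))  # False
print(is_well_formed(nested))       # True
```

The rule the parser enforces is simple: tags must be closed in the reverse order to the one they were opened in.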

Mandatory attribute quotes

Attribute quotes are the quotes around the value of an attribute. For instance the src attribute of an image must have its value surrounded by quotes, like so: src="images/bob.gif"

Culprits like Microsoft Visual Interdev do not put quotes around attribute values and web browsers allow this (though Netscape can sometimes get confused, as it is wont to do). An XML parser, however, will reject your XHTML document if you leave the quotes out. Single quotes, by the way, are also permitted by XML, although double quotes are the more common convention.

So for XHTML 1.0 never do <p style=font-weight: bold>Where are your quotes?</p> but rather do <p style="font-weight: bold">Ahhh, there they are!</p>
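Here is how an XML parser reacts to the quoting styles. This is a small sketch, assuming Python's standard xml.etree.ElementTree module; the helper name style_of is invented for the example. It returns the style attribute when the markup parses and None when the parser gives up.

```python
import xml.etree.ElementTree as ET


def style_of(markup):
    """Return the style attribute of the root element, or None if parsing fails."""
    try:
        return ET.fromstring(markup).get("style")
    except ET.ParseError:
        return None


unquoted = "<p style=font-weight: bold>Where are your quotes?</p>"
double_quoted = '<p style="font-weight: bold">Ahhh, there they are!</p>'
single_quoted = "<p style='font-weight: bold'>Single quotes work as well</p>"

print(style_of(unquoted))       # None, the parser rejects the document
print(style_of(double_quoted))  # font-weight: bold
print(style_of(single_quoted))  # font-weight: bold
```

Either kind of quote satisfies XML, as long as the value starts and ends with the same one.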

Close "empty" tags

An empty tag is a tag such as <img> or <br>. Essentially it is a tag without a closing tag.

Because XHTML is an application of XML, all tags must be closed, either with a closing tag, as in <p>closed</p>, or with a trailing slash, as in <br />.

So for XHTML all you need to do is make sure you put a / before the closing bracket of any empty tags.

Note that you should also put a space between the / and the rest of the tag's attributes, like so: <img src="images/bob.gif" width="50" height="50" alt="Bob, cavorting" />. The reason for this is that Netscape will definitely fall over if you put the / in without a space.
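To convince yourself that the trailing slash matters, try the HTML and XHTML forms against an XML parser. This is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module; the helper name closes_properly is made up. The old style <br> leaves an element open, while the self-closing forms parse cleanly.

```python
import xml.etree.ElementTree as ET


def closes_properly(markup):
    """Return True if every element in the markup is closed, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


html_style = "<p>Line one<br>Line two</p>"     # br is never closed
xhtml_style = "<p>Line one<br />Line two</p>"  # br closes itself
image = '<p><img src="images/bob.gif" width="50" height="50" alt="Bob, cavorting" /></p>'

print(closes_properly(html_style))   # False
print(closes_properly(xhtml_style))  # True
print(closes_properly(image))        # True
```

As far as XML is concerned the space before the / is optional. It is there purely to keep older browsers such as Netscape happy.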

Wrapping Up

Yes, I am dead serious. That is all there is to it.

Remember to use a DOCTYPE, put in your xmlns, use lowercase for attributes and tags, always use valid structure, put attribute values in quotes and always close empty tags. Once you do that, you are well ahead of the curve and preparing your web documents for the promises of XML.

Please note that this article is based on the Transitional XHTML spec and not the Strict spec. The reason for this choice is that the Strict spec is nowhere near as backwards compatible as the Transitional spec.

So XHTML is really simple and really only involves a bit more dedication and concentration from web developers. If you want another article on the why of XHTML please write to me and I will do it.

I learnt XHTML through zeldman.com and the incredibly to-the-point New York Public Library Style Guide.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

About the Author

Paul Watson
Web Developer TSSG
Ireland
Paul is an internet developer living in Dublin, Ireland though home is still South Africa.
 
He believes in self-taught programming skills, standards based thinking and in the power of the common man.
 
Oh, and he loves photography. Make sure you don't get caught in the corner of a party when he has that photographic gleam in his eye. And if you were wondering about that bed-head photograph, wonder no longer...

Comments and Discussions

Re: Woohoo! (Paul Watson, 17-Nov-02 8:18)
Rohit Sinha wrote:
On an aside, what is the level of support in different browsers for XHTML? Or even HTML?
 
Well that is just the thing with XHTML. It does not add new tags or change how things work. All it does really is make HTML an application of XML by ensuring XHTML is well formed (virtually all of my tips in the article can be applied to doing XML.)
 
So XHTML works in any browser that supports HTML 3 and up, because those browsers treat the XHTML as just funny HTML. They ignore the DOCTYPE mainly and just "run" with the XML nature of it.
 

As for HTML support that is a huge subject and something that nobody has ever covered thoroughly. Here is a good enough browser chart though.
 

 
Rohit Sinha wrote:
Plus a run down on all the caveats, surprises and gotchas coupled with your own experiences would help everyone a lot.
 
I could write a book on that.
 
Plus, backwards-compatibility is no longer the focus. The focus is on forwards-compatibility (sites that don't break in NEWER browsers), and that is being covered in many new books, including some by Zeldman.
 

 
Really, I recommend reading Zeldman.com every day. It is the best source of practical web-development info.
 

Rohit Sinha wrote:
And I had thought the small disclaimer after that remark had put things right.
 
LOL, well I was just covering my ass :)
 
Paul Watson
Bluegrass
Cape Town, South Africa
Last Updated 30 Jan 2002
Article Copyright 2002 by Paul Watson
Everything else Copyright © CodeProject, 1999-2014