The World Wide Web (WWW), or simply the Web, has come of age. For many of us, especially the digital natives, one of the most common everyday activities is web surfing.
As a daily ritual, we launch a web browser, type into the address bar some text that looks like http://www.codeproject.com, and wait while a web page loads in the browser window. A typical web page consists of text, images, and links to other web pages. Apart from differences in content, the web pages of different websites also differ in look and feel, as each site seeks to establish its own identity on the Web.
If you have ever wondered how these web pages that come to your screen are created and rendered in such myriad ways, then you have come to the right place. Let's welcome the two core technologies behind the many web pages that you and I see in any browser today - HTML and CSS.
In a nutshell, HTML provides the scaffolding while CSS provides the facial therapy for web pages. As the saying goes, "A picture is worth a thousand words." This is best illustrated in Figure 1, which compares the appearance of a web page before and after the application of CSS.
Figure 1: Before and After CSS (left: before CSS; right: after CSS)
Eager to know more? Read on...
Once Upon a Time
Birth of the HTML
The clock rolls back to 1989, when a talented young software engineer at CERN by the name of Tim Berners-Lee invented the World Wide Web. The following year, he created three technologies that laid the foundation of today’s Web:
- HTML: HyperText Markup Language. The de facto standard for structuring, publishing, and linking documents on the Web.
- URI: Uniform Resource Identifier. A unique “address” assigned to each resource on the Web to facilitate accessibility.
- HTTP: Hypertext Transfer Protocol. A communication protocol used by the Web to define how web pages and messages are formatted and transmitted.
As the name suggests, HTML marks up the text on a web page by enclosing it in predefined tags, e.g. <title>, <head>, and <p>. Any browser that reads these tags knows how to display the enclosed text correctly.
In tandem with the rapid growth of the Web, new HTML tags like <img> and <table> were added to improve users' web experience. The <table> tag was initially introduced for tabulating data but was later used to lay out web pages. However, this way of mixing presentation with structure later proved disastrous.
State of Anarchy
Attracted by the popularity of the Web, many different browsers emerged. One by one, Mosaic, Netscape, and then Microsoft made their forays into the browser market, each bringing its own proprietary stylistic tags in a bid to increase market share and to meet the demands of web developers. HTML had started to drift away from its original role as a pure structure provider.
The browser wars that ensued in the mid-1990s brought chaos and confusion to the Web, much to the frustration of users. A common complaint was that pages using proprietary tags displayed differently, or did not display at all, in rival browsers. This state of anarchy gave rise to browser compatibility issues.
In the late 1990s, it finally dawned on the World Wide Web Consortium (W3C) that it had to do something to rein in the situation. It decided to restore HTML to its original role of structure provider, while introducing a new technology to take on the presentation role of web pages. This wise move led to the introduction of CSS.
Dawn of the CSS
The full name of CSS is Cascading Style Sheets. It is the presentation language of the Web. It adds styles to web pages by assigning values for fonts, colors, or layout to the respective HTML tags. However, CSS is not just for HTML; it can also be used with any XML-based markup language.
This separation of concerns brings many benefits. For example, it becomes possible to cascade stylistic descriptions down to different pages of a website from a single CSS file. This is a far cry from having to hard-code the same information on each and every page of your site. In other words, the use of CSS has eased website maintenance tremendously.
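As a rough sketch of what sharing one style sheet looks like, two pages of the same site can both point at the same CSS file from their <head>. The file names index.html, about.html, and style.css are made up for illustration.

```html
<!-- index.html (hypothetical page) -->
<head>
  <title>Home</title>
  <!-- style.css is a hypothetical file shared by every page of the site -->
  <link rel="stylesheet" href="style.css">
</head>

<!-- about.html (hypothetical page) -->
<head>
  <title>About</title>
  <!-- Same file again: change style.css once and both pages follow -->
  <link rel="stylesheet" href="style.css">
</head>
```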
In addition, it is possible to apply different style sheets to the same document for presentation in different environments such as large screens, small screens, or printers, much to the delight of users.
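One common way to do this, shown here only as a sketch, is the media attribute of the <link> tag. The three file names and the 600px threshold are assumptions chosen for illustration.

```html
<head>
  <title>Hello HTML</title>
  <!-- screen.css, small.css, and print.css are hypothetical file names -->
  <!-- Used when the page is shown on a screen -->
  <link rel="stylesheet" href="screen.css" media="screen">
  <!-- Used in addition on screens no wider than 600 pixels -->
  <link rel="stylesheet" href="small.css" media="screen and (max-width: 600px)">
  <!-- Used when the page is printed -->
  <link rel="stylesheet" href="print.css" media="print">
</head>
```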
HTML5 and CSS3
HTML5 is the latest standard for HTML, replacing the previous HTML 4.01.
HTML5 was born out of cooperation between the W3C and the Web Hypertext Application Technology Working Group (WHATWG). HTML5 was created with the following objectives in mind:
- Reduce dependency on plugins (like Flash)
- Replace scripting with markup
- Be independent of devices and platforms
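To give a feel for the first two objectives, here is a small sketch using two HTML5 features: the <video> element, which plays media without a plugin, and form attributes that ask the browser to validate input without any script. The file name movie.mp4 and the field name are placeholders.

```html
<!-- Plays video natively, no plugin needed; movie.mp4 is a hypothetical file -->
<video src="movie.mp4" controls>
  Your browser does not support the video element.
</video>

<!-- Validation declared in markup: the browser checks that the field
     is filled in and looks like an email address before submitting -->
<form>
  <input type="email" name="email" required>
  <input type="submit" value="Subscribe">
</form>
```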
Reading the HTML Design Principles helps you better understand why HTML5 is the way it is today.
As for CSS, its latest standard is CSS3, which is completely backwards-compatible with earlier versions. The CSS3 specification is still under development by the W3C, and the latest version is Cascading Style Sheets (CSS) Snapshot 2010.
To learn the essence of HTML in its authentic form, I strongly recommend a text editor like Notepad for PC, TextEdit for Mac, or any open source text editor such as Notepad++. At this stage, stay away from any professional HTML editors that promise WYSIWYG (What You See Is What You Get) but effectively deprive you of any real learning.
Go ahead and launch your text editor now.
Step 1 - Type the text shown in Figure 2 "diligently" and "faithfully" into the text editor. I have deliberately chosen a screenshot over text for displaying the code snippet to discourage copying and pasting. I shall defer explanation of the code to the next section.
Figure 2: Writing HTML Code
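If the screenshot is not available to you, the listing below is a reconstruction rather than the original code. It assumes only what this article says about the example: an HTML5 doctype, the title "Hello HTML", the headings <h1>, <h2>, <h4>, and <h6>, three paragraphs, and deliberate extra white space. The heading and paragraph wording is invented. In the spirit of the exercise, it is still worth typing it by hand.

```html
<!DOCTYPE html>
<html>
    <head>
        <title>Hello HTML</title>
    </head>
    <body>
        <!-- The text inside the tags below is placeholder wording -->
        <h1>This is heading 1</h1>
        <p>This   is   the      first paragraph.</p>

        <h2>This is heading 2</h2>
        <p>This is the
           second        paragraph.</p>

        <h4>This is heading 4</h4>
        <p>This is the third paragraph.</p>

        <h6>This is heading 6</h6>
    </body>
</html>
```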
Step 2 - Create a new folder, say "mysite". Inside this folder, save your newly created HTML document using either the .htm or .html file extension. In fact, I would advise creating this folder and saving your file as soon as you open the text editor (so there should really be a Step 0). I have chosen "hello.html" as the file name.
Step 3 - Double-click your HTML file to view your first web page in a browser (Figure 3). Congratulations! You have just successfully created a web page in HTML.
Figure 3: Viewing on Browser
Step 4 - Place your text editor and browser side by side and compare them. On one side you see your HTML source code, while on the other you see how the browser has interpreted and rendered it. The contents enclosed in the respective tags have shown up in the browser, but the tags themselves have not. Wait a minute, something is not quite right. Why have the extra white spaces and indentations that you "faithfully" inserted not shown up? Where is the "Hello HTML" in the title tag? Can you find it? Read on...
As you will have noticed, content in HTML is enclosed in pairs of tags such as <title></title>, <h1></h1>, and <p></p>. Let's get started by familiarizing ourselves with some basic HTML tags.
The very first line of any HTML document must start with the <!DOCTYPE> declaration. It tells the browser what version of HTML the page is written in so that the browser can render it correctly. Strictly speaking, <!DOCTYPE> is not an HTML tag but an instruction to the browser.
The <!DOCTYPE> for HTML 4.01 looks like this:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
The <!DOCTYPE html> in our example declares the document type as HTML5, the latest standard for HTML. It is much more concise and readable than its predecessor.
More information on <!DOCTYPE> can be found at W3C.
The <html> tag signals the start of an HTML document and must be closed by the </html> tag on the last line of the document.
The area between the opening <head> tag and the closing </head> tag serves as the container for other tags like <title>, <script>, <style> and <meta>.
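As an illustrative sketch, a <head> that holds all four of these tags could look like the following. The style rule and the script line are arbitrary examples, not part of the article's hello.html.

```html
<head>
  <!-- Metadata about the page, here its character encoding -->
  <meta charset="utf-8">
  <!-- Text shown by the browser as the page title and as the bookmark name -->
  <title>Hello HTML</title>
  <!-- CSS rules embedded in the page -->
  <style>
    h1 { color: navy; }
  </style>
  <!-- JavaScript embedded in the page -->
  <script>
    console.log("Hello from the head");
  </script>
</head>
```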
The area between the opening <title> tag and the closing </title> tag is the place for the title of your HTML document. The content of the <title> tag appears in the browser's title bar or tab. You will have noticed the title of our example, "Hello HTML", there. It also appears as the bookmark name when you bookmark the page.
The area between the opening <body> tag and the closing </body> tag serves as the main container for the visible web contents that you see in the browser window, as well as other HTML tags like <h1>, <p>, <img>, and <table>.
<h1>, <h2>, ...<h6>
There are 6 heading tags in total, from <h1> to <h6>. We have used 4 of them, i.e. <h1>, <h2>, <h4>, and <h6>, in our example. Each must be accompanied by its corresponding closing tag. Browsers automatically add some space above and below each heading. Their usage is self-explanatory.
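For reference, all six levels can be tried out with a fragment like this one, placed inside <body>. The heading text is placeholder wording.

```html
<!-- h1 is the most important heading, h6 the least -->
<h1>Heading level 1</h1>
<h2>Heading level 2</h2>
<h3>Heading level 3</h3>
<h4>Heading level 4</h4>
<h5>Heading level 5</h5>
<h6>Heading level 6</h6>
```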
We use the <p> tag to divide and organize web content into paragraphs. Every <p> tag must come with a closing </p> tag. We have 3 paragraphs in our example. Like the heading tags, browsers automatically add some space above and below each paragraph. I have deliberately inserted extra white space in the paragraphs, but it did not show up in the browser: any run of spaces, tabs, or line breaks is displayed as a single space. You will have noticed that indentation and extra spacing between tags did not show up either. The message is clear: "Extra white space is ignored".
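A quick way to convince yourself, sketched below with made-up sentences, is to put two differently spaced paragraphs inside <body> and compare them in the browser. Both should be displayed the same way.

```html
<!-- Typed with single spaces on one line -->
<p>Extra white space is ignored.</p>

<!-- Typed with extra spaces and line breaks; the browser collapses
     each run of white space into a single space -->
<p>Extra     white
        space    is
   ignored.</p>
```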
Points to Note
It is time to take stock of today's learning by noting these salient points:
- Extra white space is ignored
- HTML tags are wrapped in angle brackets like <html>
- HTML tags normally come in pairs, with a few exceptions
- Every closing tag has a slash added before its tag name like </html>
- HTML tags are not case sensitive: <H1> means the same as <h1>. W3C recommends lowercase, so do I.
- It is not uncommon to make typographical errors in code. One effective way to alleviate the pain of spotting errors is through proper indentation of code as shown in my example.
- Last but not least, I have observed that one of the commonest mistakes is forgetting to add closing tags. My solution is this: write the closing tag as soon as you have written the opening tag, after which you can take your time to insert the content that goes in between the tags.
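The "few exceptions" mentioned above are tags that have no content and therefore no closing tag. The fragment below sketches some common ones, together with the case-insensitivity point. The image file name photo.jpg is a placeholder.

```html
<!-- Tags without a closing tag -->
<p>First line<br>second line after a line break</p>
<hr>
<!-- photo.jpg is a hypothetical image file -->
<img src="photo.jpg" alt="A sample photo">

<!-- Tag names are not case sensitive: these two headings are equivalent,
     but lowercase is the recommended style -->
<H1>Hello HTML</H1>
<h1>Hello HTML</h1>
```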
We shall pause for a breather.