For the novice web developer, discovering the code behind the web pages you see in the browser can be a frightening thing. This article, and the series it begins, will provide an easily understandable breakdown of the mysteries of HTML5 and CSS3, along with a structured web site project, built in small increments throughout the chapters, that will demonstrate the functionality of HTML and CSS.
From SGML to HTML5
Standard Generalized Markup Language (SGML) exists to provide a means of structuring information for institutions such as government and industry in such a fashion that the information can be parsed by computer applications many years after its creation. Due to its extreme generalization it is sometimes called a meta-language.
HTML was conceived in 1989 by Tim Berners-Lee, a computer scientist at CERN. Berners-Lee initially wanted HTML to be a product of SGML, but it instead ended up being created independently, inspired by SGML's strengths. Most notably, HTML borrowed the concept of the element.
HTML went through several revisions between 1993 and today. At first it was merely a proposal, but the mere idea of a standardized method of transferring and viewing documents generated a significant amount of interest. As the net grew, so did the sophistication of its page designers, and what followed were improvements upon the HTML standards. Conceivably, that improvement will continue into the foreseeable future.
HTML5 is the fifth version of the Hypertext Markup Language standard, released by the W3C as a Candidate Recommendation (CR) in December 2012.
The Rise of CSS
As web page design became more complex and sophisticated, web designers demanded additional capabilities. In response, browser vendors began to create proprietary elements: tags that would make text blink or set the font, for example. Unfortunately, this led to incompatibilities between different browsers. Designers either had to create pages for one browser with bells and whistles, or they pared down the pages to suit all browsers, sans the glitz.
The web community came to the realization that something new was warranted: the structure needed to be separated from the styling. Several proposals were submitted, and in 1996 the W3C released its CSS Level 1 Recommendation. CSS Level 2 was released in 1998, and work on the CSS3 specifications began immediately after Level 2 was released. CSS3 is still a work in progress as of the publication of this article.
Today's web designers rely heavily on CSS.
HTML provides the structure in the document, or web page. By that, we mean to say that the browser, or user-agent, builds a page to display according to the semantics defined in the HTML document.
The semantics are defined as elements, and these elements consist of an opening tag, content, and a closing tag. According to the Mozilla Developer Network, there are 109 valid HTML5 elements, and almost every contemporary browser supports all of them. The full W3C standard, on the other hand, defines 116 separate elements, some of which are not implemented anywhere and are subject to removal.
The elements are named in such a way as to provide semantic meaning to the code. When elements are arranged properly in the HTML document, the browser displays a legible and (hopefully) visually-appealing web page. Likewise, when the web designer examines the HTML code, the elements give the document meaning.
For example, HTML defines the navigation items on a page through the <nav> element. Summaries of other pages can be placed in <article> elements. Things that show up at the bottom of the page will be placed in the <footer> element. All the elements work together to give the web designer a clear picture of the structure of the document, as well as providing the browser with correct instructions.
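As a rough sketch of how these three elements might sit together on a page, consider the fragment below. The link targets and all of the text are placeholders chosen for illustration, not part of the series project.

```html
<!-- Hypothetical page fragment; file names and text are placeholders. -->
<nav>
  <a href="index.html">Home</a>
  <a href="about.html">About</a>
</nav>
<article>
  <h1>Latest News</h1>
  <p>A short summary of another page goes here.</p>
</article>
<footer>
  <p>Contact information and copyright notice.</p>
</footer>
```

Even without seeing the page in a browser, the element names alone tell you which part is navigation, which is content, and which is the footer.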
CSS, alternatively, provides the presentation.
Well, mostly it does. CSS can provide some formatting functionality by setting widths and positioning, but its main purpose lies in what it can do to the presentation of the content.
When you want a paragraph to display a different font from everything else, you use CSS. When you want to set the colors on a page to the ochre palette, you use CSS. The new CSS3 specifications allow the web designer to paint gradients, and you can also fade text in or out when you pass the cursor over the content. Simply stated, style sheets breathe life into otherwise dreary text.
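To give a taste of what that can look like, here is a minimal sketch that changes a paragraph's font, paints an ochre gradient behind it, and fades the text while the cursor is over it. The class name "fancy" and the colors are assumptions made for this illustration, and the style element would normally sit in the head of the document.

```html
<style>
  /* "fancy" is an arbitrary class name chosen for this sketch. */
  p.fancy {
    font-family: Georgia, serif;   /* a different font for this paragraph */
    background: linear-gradient(to right, #cc7722, #f5deb3); /* ochre to wheat */
    transition: opacity 0.5s;      /* animate changes in opacity */
  }
  p.fancy:hover {
    opacity: 0.3;                  /* fade while the cursor is over the text */
  }
</style>
<p class="fancy">Pass the cursor over this paragraph to fade it.</p>
```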
You Need an Editor
Because HTML documents are constructed from plain text, any one of many editors can be used to create your HTML pages. Here is a list of some free editors:
The installation of the listed editors is fully explained in the links. Once you have installed one (or have chosen to use an editor already installed with your operating system), you are ready to begin.
For this series, I will use Notepad++ as an example. When Notepad++ opens, a default blank document appears.
If you need to create a new document, then click on File, and then New.
Since all of the code examples are shown in plain format, the text can be copied and pasted directly into whatever editor you choose to use. Then you can save the document by clicking on File, and then Save As. After saving the file for the first time, clicking on File and Save will save the document. Ctrl-s is a shortcut that does the same thing.
One nice thing about Notepad++ is that it will automatically open the documents you had open when you last closed the application.
The Very First Step
Create a new file in the editor and save it as basic.html. If you use Notepad++, you should see something like this:
Code can then be copied and pasted or just transcribed into this file. The method of showing the HTML file in a browser comes later.
Browsers Play Nice
The text entered above is all you need to display "Hello browser!" in a browser. HTML5 rules are extremely lax, and you can type just about anything to get something displayed.
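As a minimal sketch, assuming basic.html holds nothing more than that one line of text, the entire file could be:

```html
Hello browser!
```

There are no tags at all here, yet the browser will still display the text.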
<strong>This AWESOME text should show up as bold.</STRONG>
Except for displayed content, HTML is case-insensitive where element names are concerned. Notice above that the element name in the opening tag is in lower case letters, while the element name in the closing tag is in upper case letters. The two still match, because the browser considers casing to have no effect on valid HTML syntax. XHTML documents required all tags in lower case, but the W3C recommendations for HTML5 tossed that rule out.
All elements must start with an opening tag.
Element tags are wrapped with the '<' and '>' characters. The element name falls between the angle brackets. There should be no whitespace between letters in the element name, or before the element name:
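The sketch below contrasts a well-formed tag with two malformed ones. The comments describe how browsers that follow the HTML5 parsing rules treat each line; the sample text is made up for illustration.

```html
<!-- Recognized as a p element: the name directly follows '<'. -->
<p>This is a paragraph.</p>

<!-- Not a tag: '<' followed by a space is displayed as literal text. -->
< p>This line shows up with its angle brackets visible.

<!-- Whitespace inside the name: read as an unknown element named "str". -->
<str ong>This text is not displayed in bold.</str ong>
```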
Elements do not need a closing tag, but...
The element content is what falls between the opening and closing tags. A closing tag starts with the characters '</' and ends with '>'. And, just like opening tags, closing tags require the element name between the brackets. Consider the following:
<p>The rain in Spain falls mainly on the plain.</p> But not in the forest.
The opening tag is '<p>', and the closing tag is '</p>'. The text between the <p> and </p> tags constitutes the content of the p element. The text "But not in the forest", however, is not within the tags and is not considered content of the p element.
Some elements are called "empty" because they contain no content. Empty elements have no closing tag; they can instead be closed with a '/' character before the '>' character. Here is an example of an empty element:
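A minimal sketch of an empty element in use, with made-up sample text. In HTML5 the trailing slash is optional, so '<br>' and '<br />' behave the same way.

```html
First line of text.<br />
Second line of text.
```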
The br element inserts a line break. There can be no content in a line break because it just provides vertical spacing. Another empty element is the img element.
Just because the browser does not insist on closing tags, it does not follow that you can do without them. Many elements can have other elements nested inside them, and the closing tags mark where each one ends.
<p>The rain in <b>Spain</b> falls mainly on the plain.</p> Oh yeah.
Notice how the b element has its opening and closing tags between the p element's opening and closing tags. If we left out the closing tag for the b element, the display would not be what we intended.
With the tag '</b>' in place:
Without the tag '</b>' in place:
Notice that all of the text after "in" is in bold.
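For reference, the markup behind that second result would look something like this sketch, which is the earlier sample with the closing tag of the b element removed:

```html
<!-- The closing </b> tag is deliberately omitted. -->
<p>The rain in <b>Spain falls mainly on the plain.</p> Oh yeah.
```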
The same thing happens if you place the closing tag of the nesting element outside the closing tag of the parent element:
<p>The rain in <b>Spain falls mainly on the plain.</p> Oh yeah.</b>
Because the b element does not close before the p element does, the text after "in" is all in bold. Several good editors provide a highlighting mechanism that indicates paired tags and helps with keeping your code valid. For the most part, however, you should be able to easily see nesting issues before they become a problem.
Handy Tip: Since browsers ignore extra whitespace in documents, you can easily create a visual structure to your code that eliminates most nesting issues. This is discussed in part 2 of the series.
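As a small preview, here is one possible indentation style, in which each nested element is indented two spaces further than its parent. This is only a sketch; the convention is an assumption, the heading text is made up, and the approach shown in part 2 may differ.

```html
<body>
  <h1>Weather Notes</h1>
  <p>
    The rain in <b>Spain</b> falls mainly on the plain.
  </p>
</body>
```

With this layout, an opening tag and its closing tag line up in the same column, so a missing or misplaced closing tag tends to stand out.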
If you want valid HTML5 code, then there are just a few conditions which need to be met. The template included below in the HTML Elements section will pass validation. I recommend using it or something similar when you begin each HTML5 document.
Valid code means all necessary conditions were met and all elements are well-formed. There are validation tools online, such as Validator.nu, which give you the ability to have your document scanned for valid syntax.
Validation should be performed frequently. Creating valid HTML documents distinguishes professionals from everyone else and promotes excellent work ethics. In this author's humble opinion, nothing sets web designers apart more than the validation of their product. Validation is not necessary, but the output from validation services will point out problems that were missed when coding.
W3C provides an online HTML5 document validation service. You can click on any of the tabs at the top of the page to choose the page source. When you have chosen the document, you can press the Check button and get an instant report.
So, to demonstrate the validation of code, perform the following:
- Go to http://validator.w3.org/.
- Enter the following in the Address field: www.google.com.
- Click the Check button.
- Notice the litany of errors reported and the red bar near the top of the window. This means the Google home page is not constructed in valid HTML5 syntax, even though it runs just fine in the browser.
- Enter the following in the Address field: www.jjensen2.com/wgu/index.html.
- Click the Revalidate button.
- Notice the green bar near the top of the page. This means the page in the address field passed validation.
There is a basic set of elements that you can use to form a structure in the document. The browser, as said before, will forgive many mistakes, but the aspiring web developer should practice forming their HTML5 documents with a modicum of structure. Please note the following:
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
<title>Beginner's HTML5 Practice Page</title>
The above code can be used as a template to facilitate structure in your document. None of the tags above are necessary for a browser to display content, but these elements are needed if you want to ensure your document passes validation services.
Please also note that these elements may not need closing tags, but I included them to clarify the arrangement of the elements.
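Written out in full, such a template might look like the following sketch. The meta and title lines are the ones shown in this section; the arrangement of the remaining elements is reconstructed from the descriptions in this article, and the indentation and comment are my own assumptions.

```html
<!DOCTYPE html>
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
    <title>Beginner's HTML5 Practice Page</title>
  </head>
  <body>
    <!-- Page content goes here. -->
  </body>
</html>
```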
Below, you will find a brief description of the elements in the template above. I left the meta element out, because it is used for more advanced needs. Nonetheless, it is necessary for validation. An explanation of the element will come later in the series.
Although not an element, DOCTYPE is a required preamble. It tells the browser to expect HTML5 code for the rest of the document. That's it. DOCTYPE definitions for earlier versions of HTML and XHTML were very strict and a bit confusing. The HTML5 recommendations simplified things a lot.
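To see how much simpler things became, compare the HTML5 preamble with the one used for HTML 4.01 Strict. In this sketch the older version is wrapped in a comment so that only the HTML5 DOCTYPE is active.

```html
<!-- HTML 4.01 Strict, for comparison only:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
-->

<!-- HTML5: -->
<!DOCTYPE html>
```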
The html element is called the 'root' element of the document. All other elements nest inside the html element.
The head element is the first element to be nested in the root element. Within the head element, several other elements may be expressed. The meta element in the template is one. External CSS files can also be linked here, but the details on that will come later.
The title element shows up inside the head element. The text between the opening and closing tags becomes the title of the page. As such, the title will display on the browser tab. Without a title element, browsers usually display the file name.
The body element follows the head element. The browser regards all material between the opening and closing tags of the body element as the web page content. 99% or more of your HTML5 code will be nested in the body element.
The basic elements we will briefly look at now will all nest inside the body element.
The p element denotes a paragraph of text. For an HTML page, the paragraph is more a structural concept than a logical one. Browsers automatically add a margin above and below each paragraph, which provides a visual sense of isolation on presentation. The Chrome browser, for example, applies a default margin of 1em (16 pixels at the default font size) above and below the text.
<p>Someone once told me that a stitch in time saves nine. I
never really understood that saying. Perhaps, if he gave it
some context, I would not feel so idiotic.</p>
looks like this in the Chrome browser:
Paragraphs are one of several elements that can contain other elements, along with or instead of text content.
There are six heading elements, with h1 as the most important heading and h6 being the least. In a browser page, the heading level is usually directly related to the font size of the displayed content.
Heading elements also display with a default margin, the exact distance initially dependent upon the browser itself. Here is the code to show all the heading elements compared to normal text:
<p>This is normal text.</p>
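A fuller sketch of that comparison might look like the following. The heading text is placeholder wording, not taken from the original listing.

```html
<h1>Heading level 1</h1>
<h2>Heading level 2</h2>
<h3>Heading level 3</h3>
<h4>Heading level 4</h4>
<h5>Heading level 5</h5>
<h6>Heading level 6</h6>
<p>This is normal text.</p>
```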
This is what it looks like in the browser:
Semantically, the heading elements are meant to indicate a structural heading, and not a shorthand way to format some text at a different size or in bold. Also, some elements, like the p element, cannot nest inside a heading element.
A br element inserts a line break, which is also called a carriage return. Without using p elements, one can separate unadorned content with br elements and display paragraph-like information. For example, the following:
This is the default font style and size.<br />
This text is continued on a new line.
looks like this in a browser:
Notice that the spacing is not the same as in the p element sample above. This is because the br element simply moves the following content down one line, to the left edge of the current container. It adds no margins of its own, so the distance between the lines depends only on the font size.
Putting it All Together
What follows now is HTML code which can be copied and placed in the body content of your practice page. This code demonstrates all of the elements we just reviewed above. You can see how HTML elements alone have the ability to provide a semantically obvious structure in the code:
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
<title>Beginner's HTML5 Practice Page</title>
<h1>My HTML5 Practice Page</h1>
<p>SGML stands for Standard Generalized Markup Language.</p>
<p>I found out that HTML is a derivative of SGML.</p>
<p>!DOCTYPE, html, head, title, meta, body</p>
<p>The encoded URL in the browser looks normal in the file system.</p>
<h2>Just the Beginning</h2>
<p>I just realized how easy HTML can be!</p>
As well as what it looks like in the browser:
Now this web page is pathetically dull. But, as you will see in later chapters, CSS can make this drab text look very pleasing.
If you have been attempting to view the preceding samples on your own system by cutting and pasting and so forth, you may have been confused about the way you can view a file on your hard drive in the browser.
Luckily, choosing a file is extremely simple. Pressing Ctrl-o in a browser on a Windows machine will pop up a file picker dialog box. Just search for your file, click on it, and then click the Open button. The browser will then display the file you chose.
In the address bar of the browser, things might look a little funny, especially if your path or file name contains characters that don't play nice with URL standards. For example, this:
D:\Articles\Beginner's Guide to HTML\begin.html
will look something like this in the address bar:
file:///D:/Articles/Beginner's%20Guide%20to%20HTML/begin.html
In the case of the example above, the spaces were converted to '%20'. This is a percent sign followed by 20, the hexadecimal ASCII code for the space character. A URL in which characters that are not allowed in URLs have been replaced by such hexadecimal representations is said to be encoded. This is an important concept for you to understand as your experience with web development continues, but we don't need to discuss it for now.
Now, You are Really Developing
The repeated act of editing a file, saving it, and then viewing it (followed by more editing, saving, and so on) is a normal development cycle for web pages. This process differs very little across the industry. Although you may use high-powered integrated applications to edit the file or an obscure browser to view it, the edit-save-view cycle remains essentially unchanged. So, if you have followed the exercise instructions, you just became an official web developer. Congratulations!
In this chapter, we examined the history of HTML and CSS, how they relate to each other, and how a small selection of HTML5 elements cause flat text to appear in different ways on a web page. I also briefly demonstrated how one editor can be used in your web design process.
Part 2 of this series will be published soon.
2014-03-27: First posting.