Posted 22 Jul 2014

Unassuming Unicode, The Secret to Characters on the Web

Matthew Edmondson, 23 Jul 2014 CPOL
How Unicode can lead to eye-catching symbols. The article was originally published at

Recently, I got an e-mail with an interesting title:

How did they do that?

Just how did KLM insert an airplane into the subject of an e-mail? Unicode!

I needn't give a full description here, but Unicode is the standard that assigns a unique identifier (a code point) to every single character your computer is capable of displaying. Yes: Chinese, Yiddish, Maldivian, airplane symbols, the lot!
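As a quick illustration of "a unique identifier for every character", here is a minimal Python sketch that looks up the code point and official name of a character. The helper name describe is made up for this example; ord and unicodedata.name come from the Python standard library.

```python
import unicodedata

def describe(ch):
    """Return a 'U+XXXX NAME' label for a single character (hypothetical helper)."""
    code_point = ord(ch)                      # the character's integer code point
    name = unicodedata.name(ch, "<unnamed>")  # official Unicode name, if it has one
    return f"U+{code_point:04X} {name}"

print(describe("\u2708"))  # the airplane from the e-mail subject
print(describe("A"))
```

The airplane is written as the escape "\u2708" so the snippet does not depend on how the source file itself is encoded.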

So what does this look like under the hood?

To find out, I copied the character into Notepad and saved it, ensuring I selected 'Unicode' (Notepad's name for little-endian UTF-16 with a byte order mark) as the encoding at the bottom of the 'Save As' dialog.

Then, I viewed the raw binary of the file in a hex editor (I just happened to pick this online one). The results were simply:

FF FE 08 27

What we're seeing here is the hexadecimal representation of the bytes in the file. You can confirm this using Windows Calculator in Programmer mode, but for simplicity, here is each byte in binary:

FF 11111111
FE 11111110
08 00001000
27 00100111
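If you don't have Notepad and a hex editor to hand, the same four bytes can be reproduced in a few lines of Python. This is only a sketch: it assumes Notepad's 'Unicode' option means little-endian UTF-16 with a byte order mark, and the helper names hex_dump and binary_table are invented for the example.

```python
import codecs

AIRPLANE = "\u2708"

# Assumption: Notepad's 'Unicode' encoding is UTF-16 little-endian preceded by a BOM.
file_bytes = codecs.BOM_UTF16_LE + AIRPLANE.encode("utf-16-le")

def hex_dump(data):
    """Format bytes the way a hex editor shows them, e.g. 'FF FE'."""
    return " ".join(f"{b:02X}" for b in data)

def binary_table(data):
    """Pair each byte's hex value with its 8-bit binary form."""
    return [f"{b:02X} {b:08b}" for b in data]

print(hex_dump(file_bytes))
for row in binary_table(file_bytes):
    print(row)
```

The encoding is spelled out as "utf-16-le" plus an explicit BOM because Python's plain "utf-16" codec writes in the byte order of the machine it runs on.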

The first two bytes, FF FE, are the byte order mark (BOM), and they tell us that the file is little-endian UTF-16. Endianness simply tells us the order in which the bytes of a multi-byte value are stored. Little-endian means the least significant byte comes first, so we have to swap each pair of bytes to read the value in the usual order.

So doing this, we now have (omitting the byte order mark):

27 08

Which just so happens to be the unique identifier, U+2708, for the airplane symbol: ✈
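The BOM check and byte swap described above can be sketched in Python as follows. The function read_utf16_code_unit is a hypothetical helper written for this example. It only reads the first 16-bit unit, which matches the code point for characters like this one but not for characters that need a surrogate pair.

```python
import codecs

raw = bytes([0xFF, 0xFE, 0x08, 0x27])  # the four bytes from the hex editor

def read_utf16_code_unit(data):
    """Use the BOM to pick a byte order, then read the first 16-bit unit."""
    if data.startswith(codecs.BOM_UTF16_LE):    # FF FE
        order = "little"
    elif data.startswith(codecs.BOM_UTF16_BE):  # FE FF
        order = "big"
    else:
        raise ValueError("no UTF-16 byte order mark found")
    return int.from_bytes(data[2:4], order)

code_unit = read_utf16_code_unit(raw)
print(f"U+{code_unit:04X}")

# The built-in "utf-16" codec performs the same BOM check and drops the BOM.
text = raw.decode("utf-16")
print(text == "\u2708")
```

In everyday code the decode call on its own is enough; the manual version is only there to show what the byte order mark is used for.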

But why do you care about this? You could've just copied and pasted the original symbol, right?

Well, it just so happens that HTML character references are written using these very Unicode code points. So if I wanted to use this character myself, I'd want to be absolutely certain it'll render correctly.

To do this, I'd first make sure my page is declared as being encoded in UTF-8, a Unicode encoding, using the correct meta tag:

<meta charset="utf-8">

Then I can create the character using &#xnnnn; where nnnn is the Unicode code point in hexadecimal. Therefore &#x2708; creates our airplane: ✈
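To check a reference without opening a browser, Python's standard html module can resolve it. A small sketch, where to_hex_reference is a made-up helper name:

```python
import html

def to_hex_reference(ch):
    """Build an HTML hexadecimal character reference such as '&#x2708;'."""
    return f"&#x{ord(ch):X};"

reference = to_hex_reference("\u2708")
print(reference)

# html.unescape resolves the reference back to the character it stands for.
assert html.unescape(reference) == "\u2708"
# The decimal form points at the same code point: 0x2708 == 9992.
assert html.unescape("&#9992;") == "\u2708"
```

Either form works in markup. The hexadecimal one is simply easier to match against code charts, which list code points in hex.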

That's just one. There are 109,383 other characters out there, so go and use 'em.


This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Matthew Edmondson
Software Developer
United Kingdom
Selected articles are published on CodeProject. For all of my content, including how to contact me, please visit my blog.

Last Updated 23 Jul 2014
Article Copyright 2014 by Matthew Edmondson