Unicode is not a "format". Unfortunately, there is no such thing as Unicode-enabled domain names in URLs that you could use in the same clear manner as ASCII-based ones. There are Internationalized Domain Names (IDN), first implemented in 1998 and approved in 2009. Please see:
Unicode URL not accessible.
In my opinion, a huge part of Web standards is quite ugly, and I think that IDN is the ugliest thing on the Web. The problem really strikes in the representation of Unicode domain names, especially top-level domains, which is done via some ASCII "equivalents" (quotation marks intended). For example, ".中國" encodes as ".xn--fiqz9s", ".台湾" as ".xn--kprw13d", and so on. How do you like it? Please see:
http://en.wikipedia.org/wiki/Internationalized_country_code_top-level_domain
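Just to illustrate where those ASCII "equivalents" come from, here is a minimal sketch in Python. It assumes Python's built-in "idna" codec, which follows the older IDNA 2003 rules, so results for some other labels may differ from what current registries use. The helper names to_ascii and to_unicode are my own, made up for this example. Labels are passed without the leading dot, because the codec rejects empty labels.

```python
def to_ascii(label: str) -> str:
    """Return the ASCII ("xn--" prefixed Punycode) form of one domain label.

    Relies on Python's built-in "idna" codec (IDNA 2003 rules).
    """
    return label.encode("idna").decode("ascii")


def to_unicode(label: str) -> str:
    """Convert an ASCII "xn--" label back to its Unicode form."""
    return label.encode("ascii").decode("idna")


# The two top-level domains mentioned above, without the leading dot.
for tld in ("中國", "台湾"):
    ascii_form = to_ascii(tld)
    print(f".{tld} -> .{ascii_form} -> .{to_unicode(ascii_form)}")
```

The round trip is lossless for these labels, but a human reader still sees only the "xn--" string wherever the software does not convert it back.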
I don't know what to do about it. Maybe it is best to look at this philosophically (human stupidity is limitless). I understand that the pre-Unicode legacy was very difficult to overcome, and Unicode is probably not quite perfect, and I understand the specific Chinese problems, where the characters are pictographs or ideographs and don't have a common nationwide pronunciation, but was it really necessary to create such a mess? Maybe it would be better to live with ASCII than to "solve" the problem in such a messy way? Personally, I'll probably try to avoid such domain naming as much as I can. :-)
—SA