When making an application world-ready, it is important to isolate localizable resources but it is also important to identify those resources that are culture-invariant. That is to say, those resources that will be identical regardless of the culture the application is running in. Let me tell you a story to illustrate the point.
During 1997 I was developing a photo-real colour rendering application for paint retailers. However, when we tested the application on french or spanish versions of Windows all the colours rendered to black. Investigation showed that the problem lay with our file data. We were storing colour reflectance data in text format files (suffice to say that these contain either 16 or 32 floating point values, e.g. "(15.362,3.297,...)"). The problem was that we were using the intrinsic VB library functions to read this data. Now, let me be clear - the problem was not with VB, rather with our initial awareness of the functions. It transpired that VB has some locale-aware functions, and we were using these. So, VB was parsing the data using the current locale. The root cause was that french and spanish use a comma as a decimal point... If you glance at our data formatting, it is immediately apparent what went wrong!
There were two potential resolutions considered:
- Change the separator from a comma to some other character that no culture uses as a decimal point, for example a tilde (~). This solution would have worked fine, because the VB libraries gracefully fell back to English decimal points when the locale-specific decimal point failed. However, we could not do that because we would have two file format implementations: English and non-English... not good.
- We wrote our own string parser. This meant significantly more dev effort and testing than should have been required but it meant that our files were valid anywhere.
I tell this story as a real-world example of the pain that can be encountered by not addressing globalization issues.
DateTime values are difficult to handle in a truly consistent, reliable manner. They have amongst the most variable representations. I don't propose to enter into a deep discussion of these variabilities, but consider the following list of representations of the current
LongDateTime of my writing this article:
en-GB = 27 September 2003 16:34
en-US = Saturday, 27 September, 2003 4:34 PM
fr-FR = samedi 27 septembre 2003 16 h 34
fr-CA = 27. septembre 2003 16:34
There are two factors to consider here:
- A datetime value represented as above does not fully express all of the information we need. Why? Because in order to express the information fully we are missing
TimeZone data. I am writing this post in England. The representation is entirely dependent upon the fact that I am in the British Summer Time (BST) time zone. If I display this data in New York, it will be inaccurate by 5 hours unless I factor in the time difference. Indeed, if the time were 01:34 then not only would the time be inaccurate but also the day part (ditto for month part and year part).
- The representation is culture-specific. In the examples above, you could deduce the culture if you wanted to. However, consider the following classic US-UK date problem: is 01/02/2003 the 1st February (UK) or the 2nd January (US)?
The way to resolve these issues is to represent the
DateTime value in a culture-invariant manner. You can do this by always storing the value in a specified culture format, for example en-US. However, this is not standards compliant and requires that third party consumers of your data need to know the culture-specificity. The answer to this is to use a standard culture-invariant format. There are a number of these, for example NATO format (an example of which, I think, is 01 JAN 2003 16:34:00) but the best one to use is the ISO 8601 standard. Discussions of this standard can be found at the W3C, University of Cambridge, and University of Wellington. In this standard format, the datetime string would be 2003-09-27T16:34:00. In addition, this standard provides a mechanism for representing the timezone. The simplest way to make the data totally culture-invariant is to factor the data to Greenwich Mean Time (GMT) which is represented by a 'Z' suffix. So the date of this post would be represented as 2003-09-27T15:34:00Z.
So, this sounds like quite a lot of work.
But no! We are using .NET, so our life is easy:
DateTime _datetime = DateTime.Now;
string _formattedDateTime = _datetime.ToUniversalTime().ToString("s",
DateTimeFormatInfo.InvariantInfo) + "Z";
Actually, the only sore point is that the
DateTime class does not seem to natively support the TimeZone requirements of ISO 8601 (unless someone can point me to how...) which is why I have appended a 'Z' in the code above.