Beating the Daylight Savings Time bug and getting correct file modification times
By Jonathan Gilligan
Windows reports erroneous file modification times, which change according to daylight savings. This article describes why this is so and how to determine correct file modification times and avoid the DST bug.
VC6, Windows, Dev
Not many Windows developers seem aware of it, but Microsoft deliberately designed Windows NT to report incorrect file creation, modification, and access times. This decision is documented in the Knowledge Base in articles Q128126 and Q158588. For most purposes, this behavior is innocuous, but as Microsoft writes in Q158588,
After the automatic correction for Daylight Savings Time, monitoring programs comparing current time/date stamps to reference data that were not written using Win32 API calls which directly obtain/adjust to Universal Coordinated Time (UTC) will erroneously report time/date changes on files. Programs affected by this issue may include version-control software, database-synchronization software, software-distribution packages, backup software ....
This behavior is responsible for a flood of questions to the various support lists for CVS, following the first Sunday in April and the last Sunday in October, with scores of people complaining that CVS now reports erroneously that their files have been modified. This is commonly known as the "red file bug" because the WinCVS shell uses red icons to indicate modified files.
Over the past two years, several people have made concerted efforts to fix this bug and determine the correct file modification times for files both on NTFS and FAT volumes. It has proved surprisingly difficult to solve this problem correctly. I believe that I have finally gotten everything right and would like to share my solution with anyone else who cares about this issue.
For example, run the following batch file on a computer where C: is an NTFS volume and A: is a FAT-formatted floppy disk. You will need write access to C:\ and A:\. This script will change your system time and date, so be prepared to manually restore them afterwards.
    REM Test_DST_Bug.bat
    REM File Modification Time Test
    Date /T
    Time /T
    Date 10/27/2001
    Time 10:00 AM
    Echo Foo > A:\Foo.txt
    Time 10:30 AM
    Echo Foo > C:\Bar.txt
    dir a:\Foo.txt c:\Bar.txt
    Date 10/28/2001
    dir a:\Foo.txt c:\Bar.txt
    REM Prompt the user to reset the date and time.
    date
    time
The result looks something like this (abridged to save space):
    C:\>Date 10/27/2001
    C:\>dir a:\Foo.txt c:\Bar.txt
    Directory of a:\
    10/27/01  10:00a  6 foo.txt
    Directory of c:\
    10/27/01  10:30a  6 Bar.txt
    C:\>Date 10/28/2001
    C:\>dir a:\Foo.txt c:\Bar.txt
    Directory of a:\
    10/27/01  10:00a  6 foo.txt
    Directory of c:\
    10/27/01  09:30a  6 Bar.txt
On 27 October, Windows correctly reports that Bar.txt was modified half an hour after Foo.txt, but the next day, Windows has changed its mind and decided that actually, Bar.txt was modified half an hour before Foo.txt. A naïve programmer might think this was a bug, but as Microsoft emphasized, this is how they want Windows to behave.
Having spent a lot of time thinking about this problem, I wanted to share the information with people who want to get to the bottom of it, but I realize that most readers here will just want to grab my solutions and use them. Thus, I am putting the instructions here for using the solution.
The library I supply contains one exported function: BOOL GetUTCFileModTime ( LPCTSTR name, time_t * utc_mod_time ). Just pass this function a filename
(it can be a fully qualified path or just a file name in the current directory). The function will return TRUE for success, FALSE for failure, and will store
the UTC file modification time in * utc_mod_time. Link with the JmgStat.lib library and you're off and running.
Oh, yes. I have wrapped the library in the namespace Jonathan_M_Gilligan_95724E90_4A88_11d5_80F3_006008C7B14D.
Someone else might have a function called GetUTCFileModTime() and I don't want to collide with it, so I have concatenated my initials with a GUID to
produce a unique but recognizable namespace. Rather than typing such a long string for every invocation of the function, you may want to assign a namespace alias,
namespace jmg = Jonathan_M_Gilligan_95724E90_4A88_11d5_80F3_006008C7B14D; and then you can call jmg::GetUTCFileModTime().
Example:
    namespace jmg = Jonathan_M_Gilligan_95724E90_4A88_11d5_80F3_006008C7B14D;

    time_t mod_time_1, mod_time_2;
    if ( jmg::GetUTCFileModTime( _T("foo.txt"), & mod_time_1 ) &&
         jmg::GetUTCFileModTime( _T("bar.txt"), & mod_time_2 ) )
    {
        // A smaller time_t is an earlier modification, so foo is the older file.
        if (mod_time_1 < mod_time_2)
        {
            _tprintf( _T("foo is older.\n") );
        }
    }
The origin of this file-time problem lies in the early days of MS-DOS and PC-DOS. Unix and other operating systems designed for continuous use and network communications have long tended to store times in GMT (later UTC) format so computers in different time zones can accurately determine the order of different events. However, when Microsoft adapted DOS for the IBM PC, the personal computer was not envisioned in the context of wide-area networks, where it would be important to compare the modification times of files on the PC with those on another computer in another time zone.
In the interest of efficiently using the very limited resources of the
computer, Microsoft wisely decided not to waste bits or processor cycles
worrying about time zones. To put this decision in context, recall that the
first two generations of PCs did not have battery-backed real-time clocks,
so you would generally put DATE and TIME commands
into your AUTOEXEC.BAT file to prompt you to enter the date and
time manually when the computer booted.
By the time of WinNT, wide-area networks had become sufficiently common that Microsoft realized that the OS should measure time in some universal format that would allow different computers to compare the order (and separation) of events irrespective of their particular time zones. Although the details vary (different time structures measure time relative to different events), the net effect is that all times used internally in Win32 measure time with respect to UTC (what used to be called GMT).
Having once worked down the hall from the master atomic clock array for the United States at the National Institute of Standards and Technology in Boulder, I feel obligated to add a few words about time and systems for reporting time. Long ago, we used to refer time to GMT, or Greenwich Mean Time, which was kept by the Royal Observatory in Greenwich, England and was ultimately referred to the position of the sun as measured by the observatory. When atomic clocks became the standard for timekeeping, a new standard, called UTC, emerged. UTC is a bastard acronym. In English, it stands for "Coordinated Universal Time," while in French it stands for "le temps universel coordonné." Rather than using either CUT or TUC, the nonsense compromise acronym UTC was adopted.
To understand UTC, we must first understand the more abstract International Atomic Time (TAI, le temps atomique international), which measures the number of seconds that have elapsed since approximately 1 Jan 1958, as measured by caesium atomic clocks. The second is defined to be the amount of time required for 9 192 631 770 cycles of the caesium hyperfine frequency. However, neither the day nor the year is an exact multiple of the second, so we take TAI and correct it so that it corresponds to the actual motion of the earth by adding corrections such as "leap seconds." TAI measures raw atomic time. UTC measures time coordinated to the motion of the earth (i.e., so we don't end up having midnight while the sun is shining or January in midsummer). Details of what UTC really means, together with a more detailed history of timekeeping, can be found at http://ecco.bsee.swin.edu.au/chronos/GMT-explained.html.
So what does this all have to do with file modification
times on Windows computers? Windows is stuck with some serious problems
integrating FAT and NTFS files compatibly. FAT records file modification times
with respect to the local time zone, while NTFS records file modification (as
well as creation and access times, which FAT does not record) in UTC. The first
question you may want to ask is, "How should Windows report these file
times?" Clearly it would be stupid for dir
and Windows Explorer to report FAT file times in the local time zone and
NTFS file times in UTC. If inconsistent formats were used, users would have
great difficulty determining which of two files was more recent. We
must thus choose to translate one of the two file time formats
when we report to the user. Most users are likely to want to know the file
modification time in their local time zone. This keeps things consistent with
what people learned to expect under DOS and Win16. It also is more useful to
most users, who may want to know how long ago they modified a file without
looking up the offset of their local time zone from UTC.
It is straightforward to translate UTC to local time. You look up the offset, in minutes, between the local time zone and UTC, determine whether daylight savings is in effect and add either the standard or the daylight offset to the UTC time. However, daylight time throws a subtle wrench in the works if we try to go backwards...
If you want to translate a time in your local time zone into UTC, it seems a straightforward matter of determining whether daylight time is in effect locally and then subtracting either the standard or the daylight offset from the local time to arrive at UTC. A subtle problem emerges due to the fact that the mapping from UTC to local time is not one-to-one. Specifically, when we leave daylight savings time and set our clocks back, there are two distinct hour-long intervals of UTC time that map onto the same hour-long interval of local time. Consider the concrete case of 1:30 AM on the last Sunday in October. Let's suppose the local time zone is US Central Time (-6 hours offset from UTC when daylight time is not in effect, -5 hours when it is). At 06:00 UTC on Sunday 28 October 2001, the time in the US Central zone will be 01:00 (1:00 AM) and daylight time will be in effect. At 06:30 UTC, it will be 01:30 local. At 07:00 UTC, it will be 01:00:00 local and daylight time will not be in effect. At 07:30 UTC, it will be 01:30 local. Thus, for all times 01:00 ≤ t < 02:00 local, there will be two distinct UTC times that correspond to the given local time. This degenerate mapping means that we can't be sure which UTC time corresponds to 01:30 local time. If a FAT file is marked as having been modified at 01:30 on 28 Oct. 2001, we can't determine the UTC time.
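The two-to-one mapping can be checked with a few lines of arithmetic. The sketch below is only an illustration under stated assumptions: it hard-codes the US Central offsets and the 28 October 2001 switch from the example above, measures time in minutes since 00:00 UTC on that date, and uses hypothetical names (UtcToCentral and the k... constants) that do not come from any library.

```cpp
// Times are minutes since 00:00 UTC on 28 Oct 2001 (an illustrative scale).
// Local results are wall-clock minutes since 00:00 on 28 Oct.
constexpr int kDstEndsUtc     = 7 * 60;   // 07:00 UTC: 02:00 CDT falls back to 01:00 CST
constexpr int kDaylightOffset = -5 * 60;  // CDT = UTC - 5 hours
constexpr int kStandardOffset = -6 * 60;  // CST = UTC - 6 hours

// Correct UTC -> local conversion: the offset depends on the time being
// converted. Valid only near the October 2001 transition (assumption).
constexpr int UtcToCentral(int utc_minutes) {
    return utc_minutes +
           (utc_minutes < kDstEndsUtc ? kDaylightOffset : kStandardOffset);
}

// 06:30 UTC and 07:30 UTC both land on 01:30 local (minute 90), so a local
// time of 01:30 alone cannot be mapped back to a single UTC time.
static_assert(UtcToCentral(6 * 60 + 30) == 90, "06:30 UTC is 01:30 CDT");
static_assert(UtcToCentral(7 * 60 + 30) == 90, "07:30 UTC is 01:30 CST");
```

Because the checks are static_assert statements, compiling the file is enough to confirm the arithmetic.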
When translating local file times to UTC and vice versa, Microsoft made a strange decision.
We would like the following code to produce out_time equal
to in_time:
    FILETIME in_time, local_time, out_time;
    // assign in_time, then do this...
    FileTimeToLocalFileTime(& in_time, & local_time);
    LocalFileTimeToFileTime(& local_time, & out_time);
The problem is that if the local time zone is US Central (UTC - 6 hours for standard time,
UTC - 5 hours for daylight time) then in_time = 06:30:00 Oct 28 2001
and in_time = 07:30:00 Oct 28 2001 both map onto the same local time, 01:30:00
Oct 28 2001 and we don't know which branch to choose when we execute
LocalFileTimeToFileTime().
Microsoft picked an incorrect, but unambiguously invertible algorithm: move all times up an hour when
daylight time is in effect on the local computer, irrespective of the DST state of the time
being converted. Thus, if DST is in effect on my local computer, FileTimeToLocalFileTime
converts 06:30:00 Oct 28 2001 UTC to 01:30:00 CDT and 07:30:00 Oct 28 2001 UTC to 02:30:00 CDT. If
I call the same function with the same arguments, but when DST is not in effect on my local computer,
FileTimeToLocalFileTime will convert 06:30:00 UTC to 00:30:00 CST and 07:30:00 UTC to
01:30:00 CST.
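The behavior just described can be modeled in a few lines. This is a sketch of the rule as the article states it, not Microsoft's code: ModelFileTimeToLocal and ModelLocalToFileTime are hypothetical stand-ins for the two Win32 calls, the offsets are hard-coded for US Central, and times are minutes since 00:00 UTC on 28 Oct 2001.

```cpp
constexpr int kDaylightOffset = -5 * 60;  // CDT = UTC - 5 hours
constexpr int kStandardOffset = -6 * 60;  // CST = UTC - 6 hours

// Model of FileTimeToLocalFileTime(): the offset is chosen from the DST state
// of the system clock, not from the time being converted.
constexpr int ModelFileTimeToLocal(int utc_minutes, bool system_in_dst) {
    return utc_minutes + (system_in_dst ? kDaylightOffset : kStandardOffset);
}

// Model of LocalFileTimeToFileTime(): the exact inverse for the same state.
constexpr int ModelLocalToFileTime(int local_minutes, bool system_in_dst) {
    return local_minutes - (system_in_dst ? kDaylightOffset : kStandardOffset);
}

// System in daylight time: 06:30 UTC -> 01:30, 07:30 UTC -> 02:30.
static_assert(ModelFileTimeToLocal(390, true) == 90, "01:30");
static_assert(ModelFileTimeToLocal(450, true) == 150, "02:30");
// System in standard time: 06:30 UTC -> 00:30, 07:30 UTC -> 01:30.
static_assert(ModelFileTimeToLocal(390, false) == 30, "00:30");
static_assert(ModelFileTimeToLocal(450, false) == 90, "01:30");
// The round trip is always exact, which is what makes the rule invertible.
static_assert(ModelLocalToFileTime(ModelFileTimeToLocal(390, true), true) == 390,
              "round trip");
```

The price of the exact round trip is visible in the middle assertions: the same UTC time yields different local times depending on when the conversion runs.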
It may seem strange that this would affect the C library call stat, which allegedly
returns the UTC modification time of a file. If you examine the source code for Microsoft's C library,
you find that it gets the modification time thus:
    // pseudocode listing
    WIN32_FIND_DATA find_buf;
    HANDLE hFile;
    FILETIME local_ft;
    time_t mod_time;

    // FindFirstFile returns times in UTC.
    //
    // For NTFS files, it just returns the modification time
    // stored on the disk.
    //
    // For FAT files, it converts the modification time from
    // local (which is stored on the disk) to UTC using
    // LocalFileTimeToFileTime()
    //
    hFile = FindFirstFile ( file_name, &find_buf );

    // convert UTC mod time to local...
    FileTimeToLocalFileTime ( &find_buf.ftLastWriteTime, &local_ft );

    // Now use a private, undocumented function to convert local time to UTC
    // time according to the DST settings appropriate to the time being
    // converted!
    mod_time = __secret_microsoft_converter(local_ft);
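For a FAT file, the conversions work like this:
1. FindFirstFile() converts the raw modification time (stored on disk as local time) to UTC using LocalFileTimeToFileTime().
2. The UTC time is converted back to local time by FileTimeToLocalFileTime(). Note that this
exactly reverses the effect of step 1, so we are left with the correct local modification time.
3. The private converter translates this local time to UTC using the DST setting appropriate to the time being converted.
For an NTFS file, the conversions work like this:
1. The raw modification time is already stored in UTC, so FindFirstFile() returns it unchanged.
2. The UTC time is converted to local time by FileTimeToLocalFileTime(). This applies a DST correction
according to the DST setting of the computer's system time, irrespective of the DST setting at the
file modification time.
3. The private converter translates this local time to UTC, so any one-hour error introduced in step 2 carries through to the result.
This explains the problem I showed at the top of this article: The time
reported by dir for a file on an NTFS volume changes by an hour
as we move into or out of daylight savings
time, despite the fact that I haven't touched the file. FAT modification times
are stable across DST.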
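The two conversion chains can be simulated to see why only the NTFS result moves. The sketch below is a model under assumptions, not the C library source: StatNtfs, StatFat and LocalToUtcByDate are hypothetical names, LocalToUtcByDate merely stands in for the private converter, the offsets are hard-coded for US Central, and times are minutes relative to 00:00 on 28 Oct 2001.

```cpp
constexpr int kDaylightOffset = -5 * 60;  // CDT = UTC - 5 hours
constexpr int kStandardOffset = -6 * 60;  // CST = UTC - 6 hours
constexpr int kFallBackLocal  = 2 * 60;   // clocks fall back at 02:00 local

// Offset Windows applies, chosen from the DST state of the system clock.
constexpr int SystemOffset(bool system_in_dst) {
    return system_in_dst ? kDaylightOffset : kStandardOffset;
}

// Stand-in for the private converter: local -> UTC using the DST state that
// applied at the local time itself (the ambiguous hour is treated as daylight).
constexpr int LocalToUtcByDate(int local_minutes) {
    return local_minutes -
           (local_minutes < kFallBackLocal ? kDaylightOffset : kStandardOffset);
}

// NTFS: stored UTC -> FileTimeToLocalFileTime() -> private converter.
constexpr int StatNtfs(int stored_utc, bool system_in_dst) {
    return LocalToUtcByDate(stored_utc + SystemOffset(system_in_dst));
}

// FAT: stored local -> LocalFileTimeToFileTime() -> FileTimeToLocalFileTime()
// -> private converter. The first two steps cancel each other.
constexpr int StatFat(int stored_local, bool system_in_dst) {
    return LocalToUtcByDate(stored_local - SystemOffset(system_in_dst) +
                            SystemOffset(system_in_dst));
}

constexpr int kBarUtc   = -(8 * 60 + 30);  // Bar.txt: 15:30 UTC on 27 Oct (10:30 CDT)
constexpr int kFooLocal = -(14 * 60);      // Foo.txt: 10:00 local on 27 Oct
constexpr int kFooUtc   = -(9 * 60);       // which is 15:00 UTC on 27 Oct

static_assert(StatNtfs(kBarUtc, true) == kBarUtc, "correct while DST is on");
static_assert(StatNtfs(kBarUtc, false) == kBarUtc - 60, "an hour early after DST ends");
static_assert(StatFat(kFooLocal, true) == kFooUtc, "FAT result is stable");
static_assert(StatFat(kFooLocal, false) == kFooUtc, "FAT result is stable");
```

With the system in standard time the modeled NTFS result falls half an hour before the FAT result, matching the dir output from the batch file.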
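There are three ways I can think of in which this inconsistency in reporting file times may cause problems:
1. You compare the modification time of a file on an NTFS volume with a time_t stored in a file or in memory.
2. You compare the modification time of a file on a FAT volume with a stored time_t.
3. You compare the modification time of a file on a FAT volume with that of a file on an NTFS volume.
For case (1), call GetFileTime()
instead of using the C library stat(), and convert the FILETIME
to time_t by dividing by 10,000,000 to
convert 100-nanosecond units to seconds and subtracting the offset between the FILETIME origin (Jan 1 1601) and the time_t origin (Jan 1 1970).
For case (2), stat()
will work and return a time_t that you can compare to the stored one. If you must use GetFileTime() do not
use LocalFileTimeToFileTime(). This function will apply the daylight state of the
current system time, not the daylight status of the file time in the argument. Fortunately, the C library
mktime() function will correctly convert the time if you correctly set the tm_isdst
field of the tm struct.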
There is a bit of a chicken-and-egg problem here. Windows does not supply a good API call to let
you determine whether DST was in effect at a given time. Fortunately, residents of the US and other countries
that use the same rule (daylight time starts at 2:00 AM on the first Sunday of April and ends at 2:00 AM on
the last Sunday in October) can set tm_isdst to a negative number, and
mktime() will automatically
determine whether daylight time applies. If the file was modified in
the window 1:00-2:00 AM on the last Sunday in October, the result is ambiguous:
mktime() cannot tell which of the two possible UTC times is meant.
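A minimal sketch of the tm_isdst trick follows, using only the standard C library. LocalToTimeT is a hypothetical helper name, and the result depends on the time zone rules the C runtime has been configured with, so treat it as an illustration and not as the library's implementation.

```cpp
#include <cassert>
#include <ctime>

// Convert a broken-down local time to time_t. A negative tm_isdst asks the
// C runtime to decide for itself whether daylight time applied at that moment.
// LocalToTimeT is a hypothetical helper name, not part of any library.
std::time_t LocalToTimeT(int year, int month, int day,
                         int hour, int minute, int second) {
    assert(month >= 1 && month <= 12);
    std::tm local = {};
    local.tm_year  = year - 1900;  // tm_year counts from 1900
    local.tm_mon   = month - 1;    // tm_mon is zero-based
    local.tm_mday  = day;
    local.tm_hour  = hour;
    local.tm_min   = minute;
    local.tm_sec   = second;
    local.tm_isdst = -1;           // let mktime() work out the DST state
    return std::mktime(&local);    // returns (time_t)-1 on failure
}
```

For example, LocalToTimeT(2001, 10, 27, 10, 30, 0) yields the time_t for 10:30 local time on 27 October 2001 under whatever rules the runtime knows. For a time inside the repeated hour the runtime has to pick one of the two candidates, and which one it picks is not something this sketch controls.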
People in time zones that do not follow the usual US daylight rule must brute-force the daylight time
problem by retrieving the applicable TIME_ZONE_INFORMATION structure with
GetTimeZoneInformation and manually calculating whether daylight time applies.
For case (3), obtain the FAT file's modification time as in case (2) and compare that time_t with the time
for the NTFS file determined in case (1).
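The FILETIME-to-time_t conversion used for NTFS files is plain integer arithmetic. The sketch below avoids windows.h so that it compiles anywhere: FileTimeParts is a hypothetical stand-in for the two 32-bit halves of a Win32 FILETIME, and FileTimeToUnixSeconds is an illustrative name, not the function from the author's library.

```cpp
#include <cstdint>

// Portable stand-in for the Win32 FILETIME layout: a 64-bit count of
// 100-nanosecond ticks since 1 Jan 1601 UTC, split into two 32-bit halves.
struct FileTimeParts {
    std::uint32_t low;   // corresponds to dwLowDateTime
    std::uint32_t high;  // corresponds to dwHighDateTime
};

constexpr std::int64_t kTicksPerSecond  = 10000000;       // 100 ns ticks per second
constexpr std::int64_t kEpochGapSeconds = 11644473600LL;  // 1 Jan 1601 to 1 Jan 1970

// Seconds since 1 Jan 1970 UTC for a UTC FILETIME; no DST logic is involved.
constexpr std::int64_t FileTimeToUnixSeconds(FileTimeParts ft) {
    return static_cast<std::int64_t>(
               (static_cast<std::uint64_t>(ft.high) << 32) | ft.low) /
               kTicksPerSecond -
           kEpochGapSeconds;
}

// 0x019DB1DED53E8000 ticks is exactly 1 Jan 1970 00:00:00 UTC.
static_assert(FileTimeToUnixSeconds({0xD53E8000u, 0x019DB1DEu}) == 0,
              "FILETIME of the Unix epoch maps to 0");
```

On Windows the two halves would come from the dwLowDateTime and dwHighDateTime members of the FILETIME filled in by GetFileTime(). Because the value is already UTC, the result does not shift when daylight time starts or ends.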
Last Updated: 28 May 2001. Editor: Chris Maunder.
Copyright 2001 by Jonathan Gilligan. Everything else Copyright © CodeProject, 1999-2009.