In part 1
of this 2 part series I presented a service to perform logging
on behalf of a multi-module application. In this part I present a Log File Viewer which provides near real-time display of the log
messages processed by the logging system.
The viewer is an MFC MDI application which provides per-view filtering of the logged data. Filtering can be done on the message logging level,
the originating module, the originating machine, the originating thread, or any combination of the above.
Each view in the app corresponds to a logfile. I used the doc/view architecture but the
derived class only performs
placeholder duty. All the work is done in the
class which is a member variable of the view.
When the application starts it presents a dialog box where you select which logfile you want to view. More on this dialog box anon.
After you choose a log file (or more than one), the MFC framework creates one
CDocument and one
for each log file. The view in turn creates a
CLogFileInstance creates two threads for current log files and one thread for old log files. An aged log file is a file whose name
is of the pattern myproduct.yyyymmdd.log. A current log files name is of the pattern myproduct.log. The first thread,
LogMonitorThread monitors the log file and, when it sees a change, reads the changes and creates one or more
where each object corresponds to a new log entry. This thread is started for all log files but it terminates itself once it's done reading the log file
if the log file is an aged one. The assumption is that aged log files don't change.
CLogData instance is added to an array of
CLogData instances and a second thread,
LogUpdateThread(), is signalled.
That thread notices the new objects added to the
CLogData array and, subject to filtering, adds them to the
ClistCtrl embedded in the view.
LogUpdateThread runs for all log files, not just current log files.
The flow of control looks like this.
The Log file
This is created by the logging service. A specific format is assumed, as documented in my previous article
This thread runs once per second as specified for the timeout in the
call. The object it waits on is an event handle used to tell
the thread to exit. Most of the time that object won't be signalled and the significant work is done in the
code. The first time through
the loop the timeout is actually set to zero and the
call returns immediately. The code in the
sets the wait time to 1000 milliseconds for all subsequent passes through the loop.
When the wait timeout occurs the thread first checks if the file it's monitoring has changed. Changes are simple to detect. Either the logging service has appended
something to file or it hasn't. So the first check is, has the length of the file changed? If not, we have nothing to do so we just go back to the top of the
loop and wait another second. If the file length changed we then check if the current file length is less than the length we last knew about. (Remember that
the logging service rolls files over at local midnight). If the current length of file is less than the length we last knew about we clear the
array and tell our view that it should delete what it's currently displaying and reread the file from the beginning.
If the file length has changed and it's greater than the length we last knew about then obviously a new log entry has been added. The code goes into a loop
reading lines (terminated by an endline (but see below)) and creates new
CLogData objects which are added to the array. When it hits end of file it goes
back to the top of the loop and waits another second, after signalling the
m_eSignal event object, which causes the
LogUpdateThread() to run.
void CLogFileInstance::LogMonitorThread(LPVOID data)
CLogDataArray *me = (CLogDataArray *) data;
BOOL bStop = FALSE;
CLogData *pLog = (CLogData *) NULL;
int iWait = 0;
while (bStop == FALSE)
switch (WaitForSingleObject(me->m_eStop, iWait))
bStop = TRUE;
iWait = 1000;
if (me->m_dwFilePosition == me->m_updateFile.GetLength())
if (me->m_dwFilePosition > me->m_updateFile.GetLength())
me->m_dwFilePosition = 0;
while (me->m_bContinue && me->m_updateFile.ReadString(csData))
pLog = new CLogData(csData, me->m_logFilterData);
else if (pLog != (CLogData *) NULL)
me->m_dwFilePosition = me->m_updateFile.GetPosition();
bStop == FALSE
bStop = TRUE;
There are a couple of things of note in this thread procedure. The first is that the inner loop (the one inside the
case) is controlled
by a boolean variable,
. This is necessary so that the thread can be told to stop when it's in the midst of reading very large log files.
If the view wants to close it sets
and then waits on the thread handle. The boolean ensures the loop exits as soon
as the work for the current line of the log file is done.
The second thing to note is that I wanted the logfile format to be flexible. It's not enough to assume that an endline represents the end of this log record
and the next line is the start of a new record. I wanted to be able to log error messges that contained embedded linefeeds (ODBC messages for example) without
having to modify the messages in the logging service or the log client code. So we make a simplifying assumption. If a line commences with a digit we'll assume
it's a new record. If not this line is part of the message text for the preceding record.
Why not use FindFirstChangeNotification() or ReadDirectoryChangesW()?
I considered using both these functions to detect changes in the log files. The problem with them is that they only watch for changes in a directory or a tree of
directories. If I could call a function that watched just one file I'd use it. As it is, the NT watch functions notice a change in a directory (admittedly filtered)
and leave it up to the user to determine if the change notification is relevant. It's just simpler to open a file and monitor it's length periodically. I've run this
software for periods of 48 hours plus with no updates and Task Manager shows 0 seconds time accumulated against it. That's enough justification in my mind to opt for
the simpler method presented here.
After writing this article I found myself wondering if the reasons I'd given above were sufficient and then thinking through the consquences of attempting to use either API.
One could use those API's if one uses a single thread to update all instances of logfiles being monitored. The downside to doing this is that your thread code will have to cope with multiple log files. Well it does anyway, but spreading the load across multiple threads means that each log file gets a crack at the CPU based on thread scheduling - collapsing this into a single thread means that either you 'round robin' each log file, handling one new record per log file, or you handle each log file serially, which means that one log file will dominate at the expense of the second, which in turn dominates at the expense of the third and so on. Spreading the load across multiple threads means that you get to use the thread scheduler for free.
The alternative to a single thread is pathological. If each log file is monitored on a separate thread and you try to use the
ReadDirectoryChangesW() API's you have two choices. Do it on a per thread basis, or collapse it into one thread.
Doing it on a per thread basis buys you nothing. One might just as well do it the way I chose, monitoring the length of the log file. Or you could use another thread to monitor all folders containing log files being monitored. When that thread notices a change it signals the thread that handles that log file. Again, you have two alternatives. Maintain a mapping between a logfile and a thread so that you can signal just that thread, or signal all threads and let them sort it out.
Signalling just the one thread means that you have to maintain a mapping from thread to directory via an event object. Extra code to be debugged and maintained, and a more subtle bug. If two threads are monitoring different log files in the same directory and you're using the
ReadDirectoryChangesW() API's there's a very real possibility that only one thread will be notified of a change! The other thread might possibly wait for hours until it recieves an update notification.
Signalling all threads means that you have to use a manual reset event object to ensure that all threads will see the notification. This leads to the question of just who resets that manual reset object? The only way I can think of is to make the thread that runs the
ReadDirectoryChangesW() API's signal the manual reset event object and then wait on an array of objects, one per monitor thread and when each of those threads have completed their work they signal their object in the array that the monitor thread waits on, which means code to add and remove event objects on the monitor thread. You see where this is going...
In the end, it's just a hell of a lot simpler to monitor the file length.
The Log Update Thread
Now the Log Update Thread takes over. This thread waits on two objects. The first is the same event handle that the monitor thread waits on, the one that tells the
thread to exit. The second object is the
event object which tells this thread to check the
array for new entries. As
new entries are found they are added to the
s to the main UI thread.
void CLogFileInstance::LogUpdateThread(LPVOID data)
CLogDataArray *me = (CLogDataArray *) data;
BOOL bStop = FALSE;
handleArray = me->m_eStop;
handleArray = me->m_eSignal;
while (bStop == FALSE)
switch (WaitForMultipleObjects(2, handleArray, FALSE, INFINITE))
bStop = TRUE;
case WAIT_OBJECT_0 + 1:
while (me->m_bContinue && me->m_currentEntry < me->GetSize())
pLog = (CLogData *) me->GetAt(me->m_currentEntry++);
The main thing of note in this thread procedure is the
to the view. As pointed out by Neville Franks and Mike Dimmick in replies to a
post I made on the Visual C++ forum, it's necessary to be very careful when attempting to touch Windows objects created in a different thread. My first draft of
this thread procedure passed a pointer to the
to the thread and the thread itself directly manipulated the underlying ListView Control. It
worked most of the time but would randomly hang the application when attempting to shut it down. Much headscratching later I followed recommendations and started
. Voila! Problem solved. Thanks Neville and Mike.
When the application starts it registers a custom windows message using the
RegisterWindowMessage() API. The registered message value is stored in a
guiAdvise. This is the message used to communicate between the
LogUpdateThread() and the view.
The View Class
The List Control is created in report mode (of course). The view contains the handler for the
message which looks like this:
LRESULT CLogViewerView::OnAdvise(WPARAM wp, LPARAM lp)
CListCtrl& ctl = GetListCtrl();
pLog = (CLogText *) lp;
LVIF_IMAGE | LVIF_PARAM,
if (ctl.GetSelectedCount() == 0)
ctl.EnsureVisible(ctl.GetItemCount() - 1, FALSE);
sub-message adds an item to the List Control. Each item has an associated image and the text is set to
which means the text to be displayed isn't stored in the List Control at all. Instead, when the List Control
needs to draw the text, it calls back into our view asking for the text. Finally, we set the
for this item as the pointer to
object corresponding to this item.
LogUpdateThread() finishes noticing new records in the
CLogData array it sends the view a message,
When the view receives this message:
- If no line is selected in the List Control we scroll to the end of the display to ensure the last item added is visible. If a line is selected we assume
it's a line of interest that ought not to scroll out of sight.
We also do an automatic column resize once the update has finished.
CLogText? As alluded to earlier in this article, the code presented here doesn't assume one line in the text file corresponds to
one record. We want to allow a log message string to contain embedded newlines so that we don't have to pre-process each line before writing it to the log file.
(What would we replace an endline with anyway?).
CLogData object (which represents a log record) contains all the data necessary for a complete record (timestamp, sequence, code, module, thread
and the first line of the message text. If the message text doesn't contain an endline (the next line of the file starts with a digit) that's it. If, however,
the next line of the file doesn't start with a digit, we assume we've seen an embedded newline and this line is a continuation of the preceding record. In this case
we don't need the overhead of a complete record. Instead we create a
CLogText object which contains only the text of this line and a pointer to
the preceding complete record (a
CLogData object and add the new object to an array of
CLogText objects in the parent
CLogText is the base class for the
CLogData class. The class looks like this.
class CLogText : public CObject
friend class CLogData;
CLogText(const CLogData *pOwner, LPCTSTR szText);
virtual int Image() const;
virtual BOOL IncludeThis(CLogFilter& pFilter) const;
void RenderText(LVITEM& rItem, CLogFilterData& rFilter);
virtual void FormatForTip(
LPTSTR szBuffer, int iBuffLen,
virtual int Code() const;
virtual int Sequence() const;
virtual LPCTSTR Thread(CLogFilterData& rFilter) const;
virtual LPCTSTR Machine(CLogFilterData& rFilter) const;
virtual LPCTSTR Module(CLogFilterData& rFilter) const;
virtual LPCTSTR TimeStamp() const;
virtual LPCTSTR Text() const;
CLogData *Owner() const;
Notice there are a bunch of protected virtual functions. Those functions are duplicated in the derived class (
) and are only called
function (which is why they're protected). The code in the view that requests display data treats all objects as though they
's and lets the object itself sort it out. The base class functions either return data in the base class (the
data), or they vector through the
pointer to return data stored in the owner object. (Actually this isn't entirely true. For aesthetic
reasons the code displays blanks for all data but the message string if the object is a
. I prefer to see the timestamp etc only on the
main record and blanks for continuation records).
The view handles requests from the List Control for data through the
OnGetDispInfo function. The code is thus.
void CLogViewerView::OnGetDispInfo(NMHDR* pNMHDR, LRESULT* pResult)
LV_DISPINFO *pDispInfo = (LV_DISPINFO *) pNMHDR;
LV_ITEM *pItem = &(pDispInfo)->item;
if (pItem->mask & LVIF_TEXT)
CLogText *pLog = (CLogText *) pItem->lParam;
*pResult = 0;
Pretty simple. The function checks that the list control is asking for text and if so, passes the request on to the underlying object. A pointer to the object
was set as the
for the item when the item was created.
All of this is well and good. We now have a program that reads log files, displays them, draws pretty icons and notices almost immediately when a new record has
been appended. Heck it's even smart enough to automatically scroll when new records are added unless you have a record selected (in which case it assumes that you
want to read that record and doesn't scroll).
So you're interested in seeing error records but the logging level is set to log everything. The signal to noise ratio might be 100 information messsages to 1
error message (or much higher). This is where filtering comes in.
As each new log record is read it's parsed into it's various parts. I can't imagine wanting to filter on the timestamp or the message text (or the sequence for that
matter) but I can well imagine needing to filter on code, module, machine or thread. Filtering on the code is easy. If the code is less than 1000 it's an error. If less than 2000 but
more than 999 it's a warning, and so on.
For the other filterable criteria we need to record what we've seen. I don't want to have to guess at thread ID's. Nor do you. Likewise for modules.
object contains three
s. Each array contains each unique string corresponding to a module name,
a machine name, or a threadID. The corresponding data stored (per record) in the
objects are actually offsets into the arrays in the
objects. In other words, if we see 'ModuleName' in the Module field of the log record we search the Module array of the
object for a match. If we find a match we store the appropriate index as part of the
object; if not we
add 'ModuleName' to the Module array returning the new index. Either way, an entry in
will resolve to a string stored in the array.
In addition to providing lookup for the
CLogData objects the filter data is also used to populate three combo boxes on the mainframe dialog bar.
As the user switches between views the dialog bar is updated with filter data derived from that view and the users last selection is reselected.
Unfortunately, changing the filter criteria means that all currently displayed data must be dropped and the List Control repopulated. If there's a way to set an item
in a List Control and not have it display I haven't found it.
Thus, changing the filter criteria causes the List Control to be emptied, the
m_currentEntry variable in the
CLogFileInstance object to be
set to zero and the Log Update thread to be signalled. Aha, so that's why it's a separate thread!
The offsets to strings in the string arrays are zero based. The combo boxes are populated with the contents of the string array and, if the string array contains
more than one entry (ie, more than one module name, machine name or thread id) an extra item (Show All) is included. The code in the filter dialog bar must
account for this extra item. I used -1 as the signal to the filtering code to say 'ignore this item when filtering'.
Anatomy of a bug
As I found to my cost today (December 9th 2003) it's a right royal mistake to store the combobox current selection as the data being used in the filter. The problem was the extra phantom entry, (Show All)
. It threw off the indices by 1. Now that wouldn't have been a problem except that something tells me that a combo box containing two items
is just plain wrong. If there's only one choice it seems to me that the combobox should only contain that one choice and that it should also be automatically selected and the combobox disabled.
If you believe that then you find yourself writing code to add the (Show All) entry when it's needed, and not otherwise. But then it's necessary at some later time to know if the (Show All) entry has been added. You see where this is going...
Anyway, my code to handle the comboboxes was just plain wrong because I was trying to conflate the users selection with the index into the filter arrays and trying to handle the case where all array entries should match. Doing it this way degenerated into a game of trying to find all cases where the choice was the sentinel value (-1) and trying to do something sensible with that value when selecting an entry in the combobox. That way madness (and MFC ASSERT's in debug builds) lies.
It was much simpler in the end to maintain two variables per combobox. The first is the users selection - the value returned by
CComboBox::GetCurSel(). Given the way this application maintains filter data (the number of entries in a combobox never decreases unless the log file is cleared) that's a reasonable design choice.
The second variable is the item data associated with the chosen combobox item. That value is always the offset into the string array.
The Select Log Files Dialog Box
I showed you this earlier in the article and hinted there'd be more anon. It's now anon :) Here it is again to refresh your memory.
The dialog class (
) is derived from the
class and provides a custom dialog template which is added to the standard
common controls file open dialog. The extra controls are two buttons, 'Load all log files' and 'Load all of todays log files'. Should be simple. The first button loads
any file in the directory with a .log extension, the second button applies a filter to ensure the files are of the form 'anytext.log' rather than 'anytext.yyyymmdd.log'.
I knew that it would be necessary to have my own array of strings containing either the list of log files selected using the normal file open dialog methods (clicking on
files and holding down the control key or shift key to select more than one), or a list of log files fitting the criteria used by the custom buttons.
My first problem was closing the dialog if one of the custom buttons was clicked. The dialog didn't want to close. It seems that the dialog won't close using the OK
button if the File name: field is empty. (Thanks to Neville Franks again for telling me this). I tried the obvious things such as sending a
message to the dialog specifying
IDOK. Didn't close. Try again with
IDCANCEL. Still didn't close. Tried using
EndDialog(IDOK);. Still didn't close. This was starting to get silly so I used spy++ and discovered that the custom dialog template is created as a
child dialog of the
CFileDialog class. If you want to send messages to the dialog from your custom message handlers you need to send them to the
parent dialog. Aha!
I still had the problem of closing the dialog. You can't send
IDOK to the parent dialog and expect it to close if the File name: field is empty. But
you can't easily make that field be not empty. You can close the dialog by sending
IDCANCEL but then how do you know if the user actually selected a
file? My solution was to override the
DoModal() function and, in code internal to the class, do one of two things.
- If the user hits either of the two custom buttons populate my own string array with files that match (either aged or not aged).
- If the user selects files using the file listing combo box and then presses the 'Open' button, populate my own string array with the files using the standard
The class lies to the caller! If the user hits either of the two custom buttons the dialog is told to cancel itself after populating an array with files that match
the criteria. If the user hits the Open button and no file is selected the dialog doesn't close. If the user selects a file and hits the Open button the dialog returns
after populating the string array.
DoModal returns either
IDCANCEL to the caller depending on whether the string array is empty or not.
This is the desired result regardless of whether we're lying about the exact path we followed.
As it stands this program does everything I want. But you might want it to do more. I can imagine extending it to email notifications to someone if a particular
error code is encountered. (thanks to Paul Perkins
for the idea).
- Version 1 - 5 December 2003.
- Version 2 - 9 December 2003. Fixed a problem brought about by inadequate testing :-( where the
LogUpdateThread was exiting after the initial file load for current log files. Reworked the way the filter state is handled (separated the filter data from the combo box selection).
- Version 3 - 10 December 2003. As part of writing a mea culpa for bugs in version 2 I realised there was another hole in the filtering schema. Specifically, when a log file rolled over, the filter data wasn't cleared and thus filter criteria that were relevant to yesterdays file would appear in todays viewer if the viewer were left running over the rollover moment. Now fixed.
- Version 4 - 12 December 2003. Added the extra check for whether the thread is being terminated as suggested by c-sharp below.
- Version 5 - 29 January 2004. Expanded the explanation of why I didn't use the
- Version 5 - 25 February 2004. Included a suggested fix for the duplicate lines problem by rnoda (see below). My fault entirely, I tested this under high load conditions which masked the problem. It appeared under low load conditions. Bummer.