MTCopy: A Multi-threaded Single/Multi file copying tool

By LiYS

An article on the implementation and usage of a multi-threaded single/multi-file copying tool
C++ (VC6, VC7, VC7.1, VC8.0), C++/CLI, C, Windows, Win32, Dev
Posted:6 Apr 2008
Updated:28 May 2008

Introduction

Running multiple threads in the right way can make an application more efficient and more responsive. MTCopy takes advantage of this. It spawns a number of threads that cooperatively perform the I/O (the multiple files, multiple threads scenario, MMT) and, if necessary, divides a large file into several sector-aligned blocks that are copied in parallel (the single file, multiple threads scenario, SMT). As a result the program is not blocked until a lengthy I/O operation finishes, as it would be with a single thread. More threads do not automatically mean more efficiency, though: you have to take thread context-switching overhead into consideration, otherwise the program will spend much of its CPU time on context switching instead of on real work.

Background

Writing an article with an accompanying program for CodeProject has always been a dream of mine. A few weeks ago, after seeing a similar tool, I wondered whether I could write a file copying tool myself, and thought it would be great to finish it and post it here. So here I am, a couple of weeks later, with this article and a completed sample program! This tool is more a by-product of the practice I do in my spare time on thread modeling, thread synchronization and file operations than a serious, product-level file copy tool.

Usage of sample program

The tool is a console application, so to run it you need to specify a command line of the form "[source file/folder] [destination file/folder] /t:[number of threads to create for the copying process] /b:[buffer size in KB/MB used during copying] /l:[size threshold at or above which a file is divided into blocks and copied by multiple threads]". Once it starts running it creates a log file named "MTCopy.txt" in the root of the C: drive.

So you can copy a folder with: "MTCopy.exe "I:\Something" "I:\Something_" /t:2 /b:5m /l:10m" (2 threads are used for multiple file copying; another 2 threads are used for single file copying, but only when the file size is equal to or over 10MB), or simply copy a single large file with multiple threads: "MTCopy.exe "I:\Something\something.wmv" "I:\Something_" /t:2 /b:5m /l:10m". It is recommended that the specified buffer size be a multiple of the sector size; that way the program allocates exactly that amount of memory instead of adjusting the buffer size to make it sector-aligned.

Please note: the [source file/folder] and [destination file/folder] options should be double-quoted, otherwise a path containing spaces will be interpreted as several options.
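
The handling of the /b: and /l: values and the sector alignment mentioned above can be illustrated with a small sketch. This is not MTCopy's own parsing code: parseSize and alignToSector are hypothetical helper names, and the sketch assumes that k means 1024 bytes, m means 1024 * 1024 bytes, and that a bare number is a byte count.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical helper: parses a size argument such as "5m" or "512k" into bytes.
// Assumption: k = 1024 bytes, m = 1024 * 1024 bytes, no suffix = bytes.
// Returns 0 for malformed input.
std::uint64_t parseSize(const std::string& text)
{
    if (text.empty() || !std::isdigit(static_cast<unsigned char>(text[0])))
        return 0;

    char* end = nullptr;
    const std::uint64_t value = std::strtoull(text.c_str(), &end, 10);
    const std::string suffix(end);

    if (suffix.empty())
        return value;                         // plain byte count
    if (suffix.size() != 1)
        return 0;                             // e.g. "5mb" is rejected in this sketch

    switch (std::tolower(static_cast<unsigned char>(suffix[0])))
    {
    case 'k': return value * 1024ULL;
    case 'm': return value * 1024ULL * 1024ULL;
    default:  return 0;
    }
}

// Rounds size up to the next multiple of sectorSize, so the buffer is sector-aligned.
std::uint64_t alignToSector(std::uint64_t size, std::uint64_t sectorSize)
{
    assert(sectorSize != 0);
    return ((size + sectorSize - 1) / sectorSize) * sectorSize;
}
```

With a 512-byte sector, a buffer size that is already a multiple of 512 (such as 5m) comes back unchanged from alignToSector, which is the case recommended above; any other value is rounded up.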

How it was implemented

The program starts by creating two thread pools and a file traverse thread:

File traverse thread: This thread recursively searches for all the files inside the specified directory and runs in parallel with the MMT copying thread pool. Once it finds a file, it signals the MMT thread pool to copy the file and puts itself to sleep until the MMT pool wakes it up to continue traversing. If it finds a folder, it creates that folder directly.

Multiple file multithreaded copying thread pool: When the file traverse thread signals this thread pool that it has found a file, a pool thread makes a local copy of the source file path inside a critical section, which only one thread can enter at any given time, and then performs the actual file copy on its own. Each thread in the pool has its own set of local variables that are not shared with its neighbours. The pool runs as long as the file traverse thread is unsignaled (thread and process objects become signaled when they terminate); that is to say, the pool terminates when the file traverse thread finishes traversing the specified directory.

As a thread receives a file to be copied, it checks whether the file's size is equal to or over the threshold given with the /l: command line option. If so, the file is put into a dedicated queue as a work item and a dummy destination file of the same size is created for later processing. During this step the newly created item is given a set of block metrics, such as how many blocks the file should be divided into; the block size is the one specified by the '/b:' option. With all of this information in hand, the MMT pool simply copies the remaining, unfiltered files over. The skeleton pseudocode goes something like:

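// Keep looping until the waited-on thread handle becomes signaled.
// IGNORE equals zero, so this wait only polls the handle and never blocks.
for(;(WaitForSingleObject(g_hCopyThd, IGNORE)) != WAIT_OBJECT_0;)
{
    // Get a copy of a file for the local thread
    EnterCriticalSection(&lpCS);
    WaitForSingleObject(g_hFound, INFINITE);
    // Found one file: take a private copy of its path
    _tcscpy(tSrcPath, FileName);
    // Start find process again
    SetEvent(g_hStartFind);
    LeaveCriticalSection(&lpCS);
    ...
    // Files at or over the /l: threshold are queued for the SMT pool
    if (filesize >= threshold)
    {
        AppendToSMT();
    }
    ...
    CopyFile(tSrcPath, tDstName, TRUE);
    ...
}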
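
The handshake between the traverse thread and the pool (publish one path, sleep until a pool thread has taken its private copy, then continue) can be sketched in portable C++. This is only an illustration of the idea, not the article's Win32 code: it swaps the events and the critical section for std::mutex and std::condition_variable, and PathSlot and runHandoff are hypothetical names.

```cpp
#include <algorithm>
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical single-slot handoff between one traverse thread and several copy threads.
class PathSlot
{
public:
    // Traverse thread: publish one path, then sleep until a copy thread has taken it.
    void publish(const std::string& path)
    {
        std::unique_lock<std::mutex> lock(m_);
        path_ = path;
        full_ = true;
        cv_.notify_all();
        cv_.wait(lock, [this] { return !full_; });
    }

    // Traverse thread: no more files will be published.
    void finish()
    {
        std::lock_guard<std::mutex> lock(m_);
        done_ = true;
        cv_.notify_all();
    }

    // Copy thread: blocks until a path is available; returns false once traversal is done.
    bool take(std::string& out)
    {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return full_ || done_; });
        if (!full_)
            return false;
        out = path_;          // private copy for this thread
        full_ = false;
        cv_.notify_all();     // wake the traverse thread so it can continue
        return true;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::string path_;
    bool full_ = false;
    bool done_ = false;
};

// Runs one traverse thread and workerCount copy threads; returns the received paths, sorted.
std::vector<std::string> runHandoff(const std::vector<std::string>& paths, int workerCount)
{
    PathSlot slot;
    std::mutex resultLock;
    std::vector<std::string> received;
    std::vector<std::thread> workers;

    for (int i = 0; i < workerCount; ++i)
        workers.emplace_back([&] {
            std::string local;
            while (slot.take(local))
            {
                // A real tool would copy the file here, outside the slot's lock.
                std::lock_guard<std::mutex> guard(resultLock);
                received.push_back(local);
            }
        });

    std::thread traverser([&] {
        for (const std::string& p : paths)
            slot.publish(p);
        slot.finish();
    });

    traverser.join();
    for (std::thread& w : workers)
        w.join();
    std::sort(received.begin(), received.end());
    return received;
}
```

Because the path is copied while the lock is held and the slow work happens afterwards, the threads only serialize on the short handoff, which is the same reason the skeleton above leaves the critical section before copying.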

Single file multithreaded copying thread pool: Each thread in this pool starts by waiting on a semaphore object. The semaphore is signaled when there is an item in the to-be-copied queue; otherwise the threads just sleep. Each of the item's blocks is handled by one of the threads running in this pool. When a thread finishes copying its part of the item, it suspends itself and waits for the next item to come. The code skeleton looks like:

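// Copy the uiLength consecutive blocks assigned to thread uiIndex.
for (int i = 0; i < m_pHeadWorkItem->csa[uiIndex].uiLength; i++ )
{
    // Seek both files to the start of block i of this thread's range.
    // Note: with a NULL high-order pointer the offset is limited to 32 bits.
    SetFilePointer(hSrcFile, (LONG)(m_pHeadWorkItem->csa[uiIndex].uiCurrentBlock * CS->m_dwCopyBlockSize)
                                    + (CS->m_dwCopyBlockSize * i), NULL/*&lHigh*/, FILE_BEGIN);
    SetFilePointer(hDstFile, (LONG)(m_pHeadWorkItem->csa[uiIndex].uiCurrentBlock * CS->m_dwCopyBlockSize)
                                    + (CS->m_dwCopyBlockSize * i), NULL, FILE_BEGIN);
    // Read one block from the source and write the bytes actually read.
    fFileOp = ReadFile(hSrcFile, pTmp, CS->m_dwCopyBlockSize, &dwRead, NULL);
    ...
    WriteFile(hDstFile, pTmp, dwRead, &dwWritten, NULL);
    ...
    SuspendThread(m_hThd[uiIndex]);
}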
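
The block metrics used by the skeleton (a starting block and a block count per thread) could be computed along the following lines. This is a sketch under assumptions, not the article's code: BlockRange, partitionBlocks and blockOffset are hypothetical names, the blocks are simply spread as evenly as possible over the threads, and offsets are kept in 64 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-thread slice; plays the role of uiCurrentBlock and uiLength above.
struct BlockRange
{
    std::uint64_t firstBlock;  // index of the first block this thread copies
    std::uint64_t blockCount;  // number of consecutive blocks it copies
};

// Splits a file of fileSize bytes into blocks of blockSize bytes and hands each
// thread a contiguous run of blocks; the first few threads get one extra block
// when the total does not divide evenly.
std::vector<BlockRange> partitionBlocks(std::uint64_t fileSize,
                                        std::uint64_t blockSize,
                                        unsigned threadCount)
{
    assert(blockSize != 0 && threadCount != 0);
    const std::uint64_t totalBlocks = (fileSize + blockSize - 1) / blockSize;
    const std::uint64_t base = totalBlocks / threadCount;
    const std::uint64_t extra = totalBlocks % threadCount;

    std::vector<BlockRange> ranges;
    std::uint64_t next = 0;
    for (unsigned i = 0; i < threadCount; ++i)
    {
        const std::uint64_t count = base + (i < extra ? 1 : 0);
        ranges.push_back(BlockRange{next, count});
        next += count;
    }
    return ranges;
}

// Byte offset of the i-th block inside a thread's range.
std::uint64_t blockOffset(const BlockRange& range, std::uint64_t i, std::uint64_t blockSize)
{
    return (range.firstBlock + i) * blockSize;
}
```

For a 25MB file with /b:5m and /t:2 this gives five blocks, three for the first thread and two for the second. Keeping the offset in a 64-bit value matters for large files, since a 32-bit offset cannot address data beyond the first few gigabytes.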

Points of Interest

CloseHandle when a thread has just been created: When I first saw code like CloseHandle((HANDLE)_beginthreadex(...)) I was confused: what is the point of closing the handle of a thread you have just created? After using this code for a while I realized that it does not cause the thread to terminate; it simply has the system decrement the usage count of the thread object from 2 to 1 (a thread is born with a usage count of 2). When the thread exits, the usage count is decremented to 0, which in turn frees the object's memory. This saves you the trouble of writing extra code that waits for the thread to terminate just to close its handle. Of course, this only applies when you don't need the thread kernel object any more after creating it and just want the thread to run to completion. The same behavior applies to process kernel objects.

There is another case where CloseHandle(...) comes in handy: suppose you have a child process whose primary thread spawns another thread and then terminates. At this point the system can free the child's primary thread object only if the parent process doesn't have an outstanding handle to it. Otherwise, the system can't free the object until the parent process closes the handle. So you can use code like:

BOOL fSuccess = CreateProcess(..., &pi);
if (fSuccess) 
{
    // Close the thread handle as soon as it is no longer needed!
    CloseHandle(pi.hThread);
    ...
}

Successful wait side effects: For some kernel objects, a successful wait (one that returns because the object is signaled) in WaitForSingleObject/WaitForMultipleObjects actually alters the object's state. These side effects apply to auto-reset events and semaphores. This program uses both. It uses the semaphore's side effect to guard the SMT items: the threads are unleashed when an item is available and put into a wait state when there is none.

When an item is queued, the semaphore count is incremented by:

// Signal the copying thread pool to let it run
ReleaseSemaphore(m_hsemNumElements, CS->m_nThdCnt, NULL);

// For queue to have element && There's next item available
for (;(WaitForSingleObject(m_hsemNumElements, INFINITE) == WAIT_OBJECT_0);)
{
    ...
}

The wait in the for statement above returns only when an item is available in the queue. Each successful wait also decrements the semaphore's resource count by one, so once all CS->m_nThdCnt threads have been released the count is back at zero, and the threads then sit idle in the wait state until the next item arrives.
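
The side effect of a successful semaphore wait can be made visible with a small portable model. This is a sketch built on std::mutex and std::condition_variable, not the Win32 semaphore itself, and CountingSemaphore with its member names is hypothetical; release plays the role of ReleaseSemaphore, and acquire the role of a successful wait.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical minimal counting semaphore for illustration.
class CountingSemaphore
{
public:
    explicit CountingSemaphore(long initial = 0) : count_(initial) {}

    // Adds n to the count and wakes waiting threads.
    void release(long n = 1)
    {
        std::lock_guard<std::mutex> lock(m_);
        count_ += n;
        cv_.notify_all();
    }

    // Blocks until the count is positive, then decrements it by one.
    void acquire()
    {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }

    // Non-blocking variant: decrements and returns true only if the count is positive.
    bool tryAcquire()
    {
        std::lock_guard<std::mutex> lock(m_);
        if (count_ == 0)
            return false;
        --count_;
        return true;
    }

    long count() const
    {
        std::lock_guard<std::mutex> lock(m_);
        return count_;
    }

private:
    mutable std::mutex m_;
    std::condition_variable cv_;
    long count_;
};
```

Releasing the semaphore by the number of pool threads lets exactly that many waits succeed; the next wait blocks again until another item is queued.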

Non-blocking wait:

(WaitForSingleObject(m_hFindThd, IGNORE) == WAIT_OBJECT_0) // where IGNORE equals zero. 

I overlooked this parameter in MSDN when I was looking for a function that would test whether an object is signaled and return as quickly as possible. Passing a timeout of zero does exactly that: the object's state is tested and the call returns immediately. So my lesson in retrospect is: read MSDN carefully before asking any question.
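
The same zero-timeout polling idea exists in standard C++, which may help readers who are not on Win32. The sketch below is an analogue, not a replacement for WaitForSingleObject: it polls a std::future instead of a kernel object, and isFinished is a hypothetical helper name.

```cpp
#include <chrono>
#include <future>

// Returns true if the result behind the future is ready, without blocking.
// A zero timeout makes wait_for report the current state and return at once.
// Assumes the future is valid and was not created with std::launch::deferred.
template <typename T>
bool isFinished(const std::future<T>& f)
{
    return f.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}
```

A worker loop can call isFinished on every iteration to decide whether to keep running, much like the polling wait on the thread handle shown above.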

Critical section leak (orphan critical section)

for(...)
{
    EnterCriticalSection(&CS);
    break;
    ...
    LeaveCriticalSection(&CS);
}

Watch out for this kind of problem, especially in a thread pool where a number of threads execute the code above. During development I ran into this issue when I put a break statement between the two calls. The thread that broke out of the loop never released the critical section it had entered, so no other thread could ever enter it again, and I ended up with the whole thread pool blocked.
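
One common way to rule out this kind of leak in C++ is to tie the lock to a scope, so that every exit path, including break, releases it. The sketch below uses std::mutex and std::lock_guard as a portable stand-in for a critical section; g_lock, findFirstNegative and lockIsFree are hypothetical names.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

std::mutex g_lock;  // stand-in for the critical section (hypothetical name)

// Returns the index of the first negative value, or -1 if there is none.
int findFirstNegative(const std::vector<int>& values)
{
    int index = -1;
    for (std::size_t i = 0; i < values.size(); ++i)
    {
        std::lock_guard<std::mutex> guard(g_lock);  // lock taken here
        if (values[i] < 0)
        {
            index = static_cast<int>(i);
            break;  // guard's destructor still runs and releases the lock
        }
    }   // lock released at the end of every iteration
    return index;
}

// Reports whether the lock can currently be taken by the calling thread.
bool lockIsFree()
{
    if (g_lock.try_lock())
    {
        g_lock.unlock();
        return true;
    }
    return false;
}
```

On Win32 the same effect can be had with a small wrapper class whose constructor calls EnterCriticalSection and whose destructor calls LeaveCriticalSection.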

Thread Local Storage (TLS)

Initially I thought I should use TLS in the thread pool for the variables that belong to a particular thread, but later I found out that you only need TLS for global or static variables. If you minimize the use of such variables and rely on automatic (stack-based) variables in the thread pool instead, you can leave TLS alone, because those variables are already local to each individual thread.
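
The distinction can be shown with the C++11 thread_local keyword, used here as a portable stand-in for the Win32 TLS functions. The names t_counter and countInThread are hypothetical.

```cpp
#include <thread>

// A namespace-scope variable would normally be shared by all threads;
// thread_local gives every thread its own instance instead.
thread_local int t_counter = 0;

// Runs a new thread that increments its own t_counter and reports the final value.
int countInThread(int iterations)
{
    int result = 0;  // automatic variable: lives on the calling thread's stack
    std::thread worker([&] {
        for (int i = 0; i < iterations; ++i)
            ++t_counter;      // touches only the worker thread's instance
        result = t_counter;
    });
    worker.join();
    return result;
}
```

Each call starts from zero because every new thread gets a fresh t_counter, and the calling thread's own instance is never touched. Variables declared inside the thread function, such as the loop index here, need no such treatment.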

Future thoughts

I did a little research on maximizing I/O performance after this program's basic functions were implemented. To get the best I/O performance there are, among others, two things you need to consider:

1. Bypassing the file system cache (calling CreateFile(...) with FILE_FLAG_NO_BUFFERING):

2. Using overlapped I/O (asynchronous I/O):

This program takes advantage of bypassing system buffering, which means the data moves directly into the application's buffer via the SCSI adapter using DMA (direct memory access) instead of going to the system cache first and then to the application. Overlapped I/O can achieve the same effect as this program with a single thread instead of several: ReadFile()/WriteFile() return immediately after issuing the request to the OS, so it is the OS that does the I/O work for you in the background and notifies you once the operation has finished. This increases throughput by giving the I/O subsystem more work to do at any instant. This program does not use this mechanism yet; let's leave it for the future.

3. A more user-friendly UI (progress bar, option settings, etc.) also counts.

History

7 April, 2008 -- Original version posted

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

LiYS
Occupation: Software Developer
Location: China

Copyright 2008 by LiYS