Introduction

Running multiple threads in your application, when done the right way, can make it more efficient and more responsive, to name just two benefits. MTCopy takes advantage of this. It spawns a number of threads that cooperatively perform the I/O operations (the multiple files, multiple threads scenario, MMT) and, if necessary, divides a large file into several sector-aligned blocks that are copied in parallel (the single file, multiple threads scenario, SMT). As a result, your program is not blocked until a lengthy I/O operation finishes, as it would be in the single-threaded scenario. More threads do not automatically mean more efficiency, however: you have to take thread context switching overhead into consideration, otherwise the program will spend much of its precious CPU time on context switching instead of on the real work it should do.

Background

Writing an article and an accompanying program on CodeProject has always been a dream of mine. A few weeks ago, after seeing a similar tool, I wondered whether I could write a file copying tool myself, and thought it would be great to finish it and post it here. So here I am, a couple of weeks later, with this article and a completed sample program! This tool is more a by-product of the self-practice I do in my spare time on thread modeling, thread synchronization and file operations than a serious, product-level file copy tool.

Usage of sample program

The tool is a console application, so to run it you need to specify a command line of the form:

"[source file/folder] [destination file/folder] /t:[how many threads will be created during the copying process] /b:[buffer size in KB/MB used during copying] /l:[size threshold over which files will be divided into small blocks and copied by multiple threads]"

Once it starts running, it creates a log file named "MTCopy.txt" in the root of the C: drive.
So you can copy a folder like:

Please note: the [source file/folder] and [destination file/folder] options should be enclosed in double quotes, otherwise they will be interpreted as several options if they contain spaces.

How it was implemented

The program starts by creating two thread pools and a file traverse thread:

File traverse thread: This thread recursively searches for all the files inside the specified directory and runs in parallel with the MMT copying thread pool. Once it finds a file, it signals the MMT thread pool to copy the file and puts itself to sleep until the MMT pool tells it to wake up and continue traversing. If it finds a folder, it creates that folder directly.

Multiple file multithreaded copying thread pool: When the file traverse thread signals this thread pool that it has found a file, a pool thread makes a local copy of the source file path inside a critical section, which only one thread can access at any given time, and then performs the actual file copy itself. Each thread has its own set of local variables that are not shared with its neighbours. This thread pool runs as long as the file traverse thread is unsignaled (thread and process objects become signaled when they terminate); that is to say, the pool terminates when the file traverse thread finishes traversing the specified directory.

As the pool receives each file to be copied, it checks whether the file's size is equal to or over the value specified with the /l: command line option. If so, the file is put into a dedicated queue as a work item and a dummy file of the same size is created for future processing. During this process the newly created item is equipped with a series of block metrics, such as how many blocks the file should be divided into (the block size is specified by the '/b:' option). With all of this information in hand, the MMT pool simply copies the unfiltered files over.
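The block metrics mentioned above can be illustrated with a small, portable C++ sketch. It only shows the arithmetic of dividing a file into blocks and handing each thread a contiguous range of them. The names `BlockRange`, `CountBlocks` and `SplitBlocks` are hypothetical and do not come from the MTCopy source, and the sketch assumes the block size is already a multiple of the sector size, so it does no alignment of its own.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-thread slice of a large file: the index of the first
// block the thread copies and how many consecutive blocks it owns.
struct BlockRange {
    std::uint64_t firstBlock;
    std::uint64_t blockCount;
};

// Number of blocks needed to cover fileSize bytes.
// The last block may be only partially filled.
std::uint64_t CountBlocks(std::uint64_t fileSize, std::uint64_t blockSize) {
    assert(blockSize > 0);
    return fileSize / blockSize + (fileSize % blockSize != 0 ? 1 : 0);
}

// Split totalBlocks into threadCount contiguous ranges. The first
// (totalBlocks % threadCount) ranges receive one extra block each.
std::vector<BlockRange> SplitBlocks(std::uint64_t totalBlocks,
                                    unsigned threadCount) {
    assert(threadCount > 0);
    std::vector<BlockRange> ranges;
    const std::uint64_t base = totalBlocks / threadCount;
    const std::uint64_t extra = totalBlocks % threadCount;
    std::uint64_t next = 0;
    for (unsigned i = 0; i < threadCount; ++i) {
        const std::uint64_t count = base + (i < extra ? 1 : 0);
        ranges.push_back(BlockRange{next, count});
        next += count;
    }
    return ranges;
}
```

For example, a file that needs 10 blocks and is copied by 4 threads yields ranges of 3, 3, 2 and 2 blocks starting at blocks 0, 3, 6 and 8. Keeping each thread's blocks contiguous keeps its file offsets sequential, which is only one possible design choice.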
The skeleton pseudo code of the MMT pool loop goes something like:

for (; WaitForSingleObject(g_hCopyThd, IGNORE) != WAIT_OBJECT_0;)
{
    // Get a copy of a file for the local thread
    EnterCriticalSection(&lpCS);
    WaitForSingleObject(g_hFound, INFINITE);
    // Found one file
    _tcscpy(tSrcPath, FileName);
    // Start find process again
    SetEvent(g_hStartFind);
    LeaveCriticalSection(&lpCS);
    ...
    if (filesize > threshold)
    {
        AppendToSMT();
    }
    ...
    CopyFile(tSrcPath, tDstName, TRUE);
    ...
}

Single file multithreaded copying thread pool: This thread pool starts by waiting for a semaphore object to be signaled. The semaphore is signaled when there is an item in the to-be-copied queue; otherwise the pool just puts itself to sleep. Each item's blocks are handled by the threads running in this pool. When the item has been copied, the pool's threads suspend themselves and wait for the next item to come. The code skeleton looks like:

for (int i = 0; i < m_pHeadWorkItem->csa[uiIndex].uiLength; i++)
{
    // Seek both files to block i of this thread's range.
    // Note: with a NULL high-order pointer the offset is limited to 32 bits.
    SetFilePointer(hSrcFile,
        (LONG)(m_pHeadWorkItem->csa[uiIndex].uiCurrentBlock * CS->m_dwCopyBlockSize) + (CS->m_dwCopyBlockSize * i),
        NULL/*&lHigh*/, FILE_BEGIN);
    SetFilePointer(hDstFile,
        (LONG)(m_pHeadWorkItem->csa[uiIndex].uiCurrentBlock * CS->m_dwCopyBlockSize) + (CS->m_dwCopyBlockSize * i),
        NULL, FILE_BEGIN);
    fFileOp = ReadFile(hSrcFile, pTmp, CS->m_dwCopyBlockSize, &dwRead, NULL);
    ...
    // Write only as many bytes as were actually read
    WriteFile(hDstFile, pTmp, dwRead, &dwWritten, NULL);
    ...
    SuspendThread(m_hThd[uiIndex]);
}

Points of Interest

CloseHandle when a thread is just created: When I saw code like

There's another case that

BOOL fSuccess = CreateProcess(..., &pi);
if (fSuccess)
{
    // Close the thread handle as soon as it is no longer needed!
    CloseHandle(pi.hThread);
    ...
}

Closing the handle does not terminate the thread or process; it only releases this reference to the kernel object.

Successful wait side effects: For some kernel objects, a successful wait (one that returns because the object is signaled) in WaitForSingleObject/WaitForMultipleObjects actually alters the object's state. These side effects apply to auto-reset event and semaphore objects.
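To make the semaphore case concrete, here is a minimal portable sketch of a counting semaphore built from a standard mutex and condition variable. It is an analogue for illustration only, not the Win32 kernel object: the class name `CountingSemaphore` and its methods `Release`, `WaitFor` and `Count` are hypothetical. The point it shows is that a successful wait consumes one unit of the count, while a failed wait leaves the count untouched.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical portable stand-in for a Win32 semaphore object.
class CountingSemaphore {
public:
    explicit CountingSemaphore(long initial) : count_(initial) {
        assert(initial >= 0);
    }

    // Roughly what ReleaseSemaphore(h, n, NULL) does: add n to the
    // count and wake any waiting threads.
    void Release(long n) {
        assert(n > 0);
        {
            std::lock_guard<std::mutex> lock(mutex_);
            count_ += n;
        }
        available_.notify_all();
    }

    // Wait up to 'timeout' for the count to become positive.
    // Side effect of a successful wait: the count is decremented by one.
    // A timeout of zero tests the state and returns immediately.
    bool WaitFor(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        if (!available_.wait_for(lock, timeout,
                                 [this] { return count_ > 0; })) {
            return false;  // timed out, count unchanged
        }
        --count_;
        return true;
    }

    // Current count, exposed only so the side effect can be observed.
    long Count() {
        std::lock_guard<std::mutex> lock(mutex_);
        return count_;
    }

private:
    std::mutex mutex_;
    std::condition_variable available_;
    long count_;
};
```

With this class, calling `Release(2)` and then `WaitFor` twice brings the count back to zero, and a third `WaitFor` with a zero timeout fails immediately instead of blocking.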
This program uses both side effects. For the semaphore, the program uses the successful wait side effect to guard the SMT items: it unleashes the threads when an item is available and puts them into a wait state when there is none. When an item is queued, the semaphore is incremented by the thread count (CS->m_nThdCnt):

// Signal the copying thread pool to let it run
ReleaseSemaphore(m_hsemNumElements, CS->m_nThdCnt, NULL);

// For queue to have element && There's next item available
for (; WaitForSingleObject(m_hsemNumElements, INFINITE) == WAIT_OBJECT_0;)
{
    ...
}

The non-blocking wait:

(WaitForSingleObject(m_hFindThd, IGNORE) == WAIT_OBJECT_0) // where IGNORE equals zero.

I overlooked this parameter in MSDN while I was looking for a function that could test whether an object is signaled and return in the shortest possible time. Used this way, the call tests the object's state and returns immediately. So my lesson in retrospect is: read MSDN carefully before asking any question.

Critical section leak (orphan critical section)

for(...)
{
    EnterCriticalSection(&CS);
    break;  // leaves the loop while still owning CS
    ...
    LeaveCriticalSection(&CS);
}
Watch out for this kind of problem, especially in a thread pool where a number of threads execute the above code. During development I once ran into this issue when I put a break statement between the two calls. The thread that broke out of the loop never released the critical section it had entered, so the rest of the threads could never enter it again and the whole thread pool ended up blocked.

Thread Local Storage (TLS): Initially I thought I should use TLS in the thread pool for the variables that belong to a particular thread, but later I found out that you only need TLS for global or static variables. If you minimize the use of such variables and rely on automatic (stack-based) variables in the thread pool, you can leave TLS alone, because those variables are already local to each individual thread.

Future thoughts

I did a little research on maximizing I/O performance after this program's basic functions were implemented. To achieve maximum I/O performance there are, among others, two things you need to consider:

1. Bypassing the file system cache (when
2. Using overlapped I/O (asynchronous I/O):

This program takes advantage of bypassing system buffering, which means the data moves directly into the application via the SCSI adapter using DMA (direct memory access) instead of going to the system first and then to the application. Overlapped I/O can achieve the same effect as this program with just a single thread instead of multiple: the ReadFile()/WriteFile() call returns immediately after issuing the command to the OS, so it is the OS that does the I/O work for you in the background and notifies you once the operation has finished. This increases throughput by giving the I/O subsystem more work to do at any instant. This program does not use this mechanism yet; let's leave it for the future.

3. A more user-friendly UI (progress bar, option settings, etc.) also counts.
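Coming back to the critical section leak described under Points of Interest: one common way to rule it out is to let a scope-bound guard object release the lock. The sketch below uses the portable `std::mutex` and `std::lock_guard` as stand-ins for `CRITICAL_SECTION` and the Enter/Leave pair, so it is an illustration of the idea rather than the code MTCopy uses. The function names `Worker` and `RunWorkers` are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Each worker increments a shared total under a lock and leaves the loop
// early with break, the same pattern that orphaned the critical section.
void Worker(std::mutex& cs, int& sharedTotal, int stopAfter) {
    assert(stopAfter >= 0);
    for (int i = 0;; ++i) {
        // The lock is taken here and released by the guard's destructor
        // whenever this scope is left, including through break.
        std::lock_guard<std::mutex> guard(cs);
        if (i == stopAfter) {
            break;  // safe: guard unlocks cs on the way out
        }
        ++sharedTotal;
    }
}

// Run threadCount workers against one lock and return the shared total.
// If any worker left the lock held, the others would block forever
// and join() would never return.
int RunWorkers(int threadCount, int stopAfter) {
    assert(threadCount > 0);
    std::mutex cs;
    int sharedTotal = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threadCount; ++t) {
        pool.emplace_back(Worker, std::ref(cs), std::ref(sharedTotal),
                          stopAfter);
    }
    for (std::thread& th : pool) {
        th.join();
    }
    return sharedTotal;
}
```

`RunWorkers(4, 100)` returns 400 because every worker performs 100 increments before breaking out, and no worker is left waiting on an orphaned lock. In Win32 code the same effect could be had with a small class whose constructor calls EnterCriticalSection and whose destructor calls LeaveCriticalSection.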
History

7 April, 2008 -- Original version posted