hi everyone,
In my application I need to collect information about all the files in a folder and send it to a database. Since this has to be done for the whole disk, it is a very time-consuming process. What I want to do is create a separate thread for each file in a folder and write the files to the DB simultaneously. I have no idea how to start. Could anyone please help me create an array of threads, one per file, that run simultaneously? Below is the code snippet where I need to do this.

C#
private static void collectFileInfo(String dir)
{
    try
    {
        String[] files = Directory.GetFiles(dir);
        DirectoryInfo dir1 = new DirectoryInfo(dir);
        string chkFlPath = Convert.ToString(dir1.FullName);
        Int32 ParentID = GetParentID(dir1.Name.ToString(), chkFlPath);
        if (files.Length > 0)
        {
            for (Int32 i = 0; i < files.Length; i++)
            {
                FileInfo File1 = new FileInfo(files[i]);
                double FileLength = File1.Length;// File size
                string strExtn = System.IO.Path.GetExtension(File1.Name.ToString());
                String fileName = File1.Name;
                String filePath = File1.FullName;
                string ParentName = dir.ToString();
                // Size in KB. Divide the long directly: Convert.ToInt32 throws for files of 2 GB or more.
                String fileSize = Convert.ToString(File1.Length / 1024);
                String fileExtension = File1.Extension;
                DateTime fileCreated = Convert.ToDateTime(File1.CreationTime);
                DateTime FileModified = Convert.ToDateTime(File1.LastWriteTime);
                string md5hash = GetMD5Hash(filePath.ToString());
                string sh1hash = GetSHA1Hash(filePath.ToString());
                DateTime lastAccessedDate = Convert.ToDateTime(GetLastAccessedDate(filePath.ToString()));
                // One new thread per file; only the database write runs on it.
                Thread t = new Thread(delegate()
                    {
                        WriteToTable(ParentID, fileName, filePath, ParentName, true, dir1.FullName, fileExtension, fileSize, fileCreated, FileModified, md5hash, sh1hash, lastAccessedDate);
                    }
                );
                t.Start();
            }
        }
    }
    catch (Exception ex)
    {
        // Note: any exception is silently swallowed here.
    }
}


Thanks & Regards
Anurag
Posted
Updated 27-Oct-10 0:30am
Comments
@nuraGGupta@ 27-Oct-10 6:34am    
Hi Ankur, please help me. I can see that you edited something here ("Edited 3 mins ago" by Ankur\m/), which means you read the post. Can you provide any solution?
@nuraGGupta@ 27-Oct-10 6:59am    
Thanks for giving me 4 points on my question, whoever did it :-)
Ankur\m/ 27-Oct-10 9:02am    
Yeah, I did a minor edit and I voted you 4. But I am sorry I can't help you much with this.


Ahh.. well, okay, let me read the question again. :)

For large-scale operations like this I used SQLite and queued all the statements first, with something like SQLiteDB.Query('update table'); for each row,
and only after all of those updates did I call Commit();
It ended up way faster.
I can't know which API is used inside WriteToTable, but if your DB supports it, I suggest you build one SQL script with all those updates and send it to the DB in one go.
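
To make the "many statements, one commit" idea concrete, here is a minimal sketch that uses only the provider-neutral ADO.NET interfaces in System.Data. It is not what WriteToTable does (that code isn't shown in the question): the FileRow type, the FileInfo table and its columns are hypothetical, and the @name style of parameter marker is an assumption, since the marker depends on your database provider.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public sealed class FileRow   // hypothetical row type, not from the question
{
    public string Name;
    public string Path;
    public long SizeBytes;
}

public static class BatchWriter
{
    // Inserts all rows inside a single transaction, reusing one command.
    // Returns the sum of the row counts reported by the provider.
    public static int InsertAll(IDbConnection conn, IEnumerable<FileRow> rows)
    {
        int written = 0;
        using (IDbTransaction tx = conn.BeginTransaction())
        using (IDbCommand cmd = conn.CreateCommand())
        {
            cmd.Transaction = tx;
            // Hypothetical table and column names.
            cmd.CommandText =
                "INSERT INTO FileInfo (Name, Path, SizeBytes) VALUES (@name, @path, @size)";
            IDbDataParameter name = AddParameter(cmd, "@name", DbType.String);
            IDbDataParameter path = AddParameter(cmd, "@path", DbType.String);
            IDbDataParameter size = AddParameter(cmd, "@size", DbType.Int64);

            foreach (FileRow row in rows)
            {
                name.Value = row.Name;
                path.Value = row.Path;
                size.Value = row.SizeBytes;
                written += cmd.ExecuteNonQuery();
            }
            tx.Commit(); // one commit for the whole batch
        }
        return written;
    }

    private static IDbDataParameter AddParameter(IDbCommand cmd, string name, DbType type)
    {
        IDbDataParameter p = cmd.CreateParameter();
        p.ParameterName = name;
        p.DbType = type;
        cmd.Parameters.Add(p);
        return p;
    }
}
```

The connection has to be open before you call InsertAll. If anything throws before Commit(), the transaction is disposed without being committed, so a failed batch should not leave half of its rows behind.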

As for threading, I use:
C#
private readonly object lockObject = new object();
private int jobCount;          // only read or written while holding lockObject
private volatile bool Stopped; // set to true to make the workers leave their loop

private void btnGo_Click(object sender, EventArgs e)
{
    btnGo.Enabled = false;
    ThreadPool.QueueUserWorkItem(new WaitCallback(Start));
}

private void Start(object obj)
{
    Stopped = false;
    int workers = Environment.ProcessorCount; // change this to fit your needs
    lock (lockObject)
    {
        jobCount = workers; // set before queuing so no worker decrements it early
    }
    for (int i = 0; i < workers; i++)
    {
        // you could pass a FileInfo here, for example, instead of null
        ThreadPool.QueueUserWorkItem(new WaitCallback(execute), null);
    }
    lock (lockObject)
    {
        while (jobCount > 0)
        {
            Monitor.Wait(lockObject);
        }
    }
    // Runs Finish on this worker thread. Inside a Form, call Invoke(finisher)
    // instead so that btnGo is touched on the UI thread.
    ThreadStart finisher = new ThreadStart(Finish);
    finisher.Invoke();
}

private void execute(object obj)
{
    //FileInfo file = obj as FileInfo;
    try
    {
        // Loops until Stopped is set to true elsewhere; break out when the work is done.
        while (!Stopped)
        {
            Thread.Yield(); // needs .NET 4; use Thread.Sleep(0) on older versions
            //execute code here
        }
    }
    finally
    {
        Decrease(); // always signal, even if the work throws
    }
}

private void Decrease()
{
    lock (lockObject)
    {
        jobCount--;
        Monitor.Pulse(lockObject);
    }
}

private void Finish()
{
    Thread.Yield();
    btnGo.Enabled = true;
    // Other code goes here. The next call is optional: I prefer it because it releases RAM after large operations, even though it slows my process down a lot. Skipping it would make your process seem a lot faster, so it's your decision.
    GC.Collect();
    Application.DoEvents();
}


Edit: forgot to declare and initialize Stopped variable :P
Edit2: Changed Code to assume not within a form (i.e finisher.Invoke())
 
Comments
@nuraGGupta@ 4-Nov-10 1:07am    
Hello qyte64,
Here the Invoke method is not defined, and I don't understand how to use it. Please explain.
qyte64 4-Nov-10 6:34am    
It stands for Form1.Invoke(), and since this code is inside the body of Form1 I omitted it.
Invoke basically executes a delegate on the main thread, i.e. a thread asks the main thread to execute a delegate.
qyte64 4-Nov-10 11:00am    
I edited the Code and replaced the Invoke() with a ThreadStart finisher delegate that does the job.
@nuraGGupta@ 11-Nov-10 5:08am    
OK, but one more thing: what is this Thread.Yield()? I don't get this method (Yield()) when I type a period (.) after the "Thread" keyword. Please explain.
@nuraGGupta@ 11-Nov-10 5:14am    
I would like to clarify that I am using WPF in my application, not the traditional ASP.NET Forms application, so that may be why I am not getting these methods (Application.DoEvents(), Thread.Yield(), etc.) in my application. Is there any other way around this?
Thanks
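
A note on the missing methods, with a small sketch. Thread.Yield() only exists from .NET 4 onwards, so on an older framework it won't show up whichever UI toolkit you use; Thread.Sleep(0) is the usual stand-in. Application.DoEvents() belongs to Windows Forms and has no direct WPF counterpart. For getting back onto the UI thread, one option that works in both toolkits is to capture the UI thread's SynchronizationContext and post to it. The UiNotifier class below is a hypothetical helper written for this thread, not part of any framework.

```csharp
using System;
using System.Threading;

public sealed class UiNotifier   // hypothetical helper name
{
    private readonly SynchronizationContext context;

    // Construct this on the UI thread (for example in a button click handler)
    // so that the UI thread's context is the one that gets captured.
    public UiNotifier()
    {
        // Current is null on threads without a context (e.g. a console app);
        // the base class used as a fallback runs callbacks on the thread pool.
        context = SynchronizationContext.Current ?? new SynchronizationContext();
    }

    // Safe to call from any worker thread; the action runs via the captured context.
    public void Post(Action action)
    {
        context.Post(delegate { action(); }, null);
    }
}
```

Assuming a notifier field created in the click handler, the body of Finish() could then become something like notifier.Post(() => btnGo.IsEnabled = true); (WPF buttons use IsEnabled rather than Enabled). In WPF you can also call the window's Dispatcher.BeginInvoke directly, which is the closest match to Form1.Invoke().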
Actually, your code snippet doesn't look that bad. What is the actual problem with it?
To improve the speed, you could move more of your logic into the thread and take more code out of the "for" loop.
By the way, it doesn't make sense to create as many threads as there are files. You should limit the number of threads to the number of CPUs/cores in the system.
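
One way to cap the number of threads is Parallel.ForEach with MaxDegreeOfParallelism, which is available from .NET 4. Treat this as a sketch under assumptions: FileMeta, BoundedScanner and Collect are hypothetical names, and the hashing and the WriteToTable call from the question are left out to keep it short.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

public sealed class FileMeta   // hypothetical result type
{
    public string Path;
    public long SizeBytes;
    public DateTime LastWriteUtc;
}

public static class BoundedScanner
{
    // Reads metadata for every path using at most maxThreads concurrent workers.
    public static List<FileMeta> Collect(IEnumerable<string> paths, int maxThreads)
    {
        var results = new ConcurrentBag<FileMeta>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxThreads };

        Parallel.ForEach(paths, options, path =>
        {
            var info = new FileInfo(path);
            results.Add(new FileMeta
            {
                Path = info.FullName,
                SizeBytes = info.Length,
                LastWriteUtc = info.LastWriteTimeUtc
            });
        });

        // The bag is unordered, so the list is not in the order of the input paths.
        return new List<FileMeta>(results);
    }
}
```

You would call it as BoundedScanner.Collect(Directory.GetFiles(dir), Environment.ProcessorCount) and then write the returned list to the database in one go, instead of starting a thread per row.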
 
Comments
@nuraGGupta@ 27-Oct-10 5:51am    
At present, using this code, I am writing the metadata of approximately 5000 files to the table per minute, but my client wants it to be faster. So I am looking for a way to fit a Boeing 747 engine to a chopper :-)

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


