I have a folder containing, let's say, 5 files of ~300 MB each.
I am trying to read all the files concurrently into a ConcurrentDictionary so that there are no duplicate values. When a new item is found, I also push it to a ConcurrentQueue. A separate writer dequeues from the ConcurrentQueue and writes to a final file.

The problem is that I still see duplicate values in the final file.

Below is the code:

C#
class Program
    {
        readonly static ConcurrentDictionary<string, byte> _hashSetList = new ConcurrentDictionary<string, byte>();
        readonly static ConcurrentQueue<IList<string>> _concurrentQueue = new ConcurrentQueue<IList<string>>();
        private static bool _flag;
        private readonly object _lock = new object();
        DateTime time = DateTime.Now;

        static void Main(string[] args)
        {
            var obj = new Program();

            Console.WriteLine("Starting");
            Task[] tasks = new Task[5];
            for (int i = 1; i <= 5; i++)
            {
                int temp = i;
                string path = @"C:\TJ\TEMP\TradeDefFiles\1" + " (" + temp + ").csv";
                tasks[i - 1] = Task.Factory.StartNew(() => obj.ReadValues(path));
            }

            Task.Factory.StartNew(() => obj.WriteValues());

            Task.WaitAll(tasks);

            _flag = true;

            Console.ReadLine();
        }

        #region CSVFileHelper

        private void WriteValues()
        {
            int count = 0;
            using (var writer = new CsvFileWriter("WriteTest.csv"))
            {
                // Write each row of data
                IList<string> columns;
                while (!_flag || _concurrentQueue.TryDequeue(out columns))
                {
                    if (_concurrentQueue.TryDequeue(out columns))
                    {
                        var templist = new List<string>(columns);
                        writer.WriteRow(templist);
                        count++;
                    }
                }
                Console.WriteLine("Done");
                Console.WriteLine("Total time taken : " + (DateTime.Now - time).TotalSeconds);
                Console.WriteLine("Total records : " + count);
            }
        }

        private void ReadValues(string path)
        {
            Console.WriteLine(Thread.CurrentThread.ManagedThreadId + " Starting work on Thread");
            Console.WriteLine(Thread.CurrentThread.ManagedThreadId + " Reading file  : " + path);

            var columns = new List<string>();

            int totalrecords = 0;
            int uniquerecords = 0;

            using (var reader = new CsvFileReader(path))
            {
                while (reader.ReadRow(columns))
                {
                    totalrecords++;

                    lock (_lock)
                    {
                        if (!_hashSetList.ContainsKey(columns[0]))
                        {
                            _hashSetList.GetOrAdd(columns[0], Byte.MinValue);
                            _concurrentQueue.Enqueue(columns);
                            uniquerecords++;
                        }
                    }
                }
            }

            Console.WriteLine(Thread.CurrentThread.ManagedThreadId + " Stopping work on Thread : " + Thread.CurrentThread.ManagedThreadId);
            Console.WriteLine(Thread.CurrentThread.ManagedThreadId + " Total Records Read = " + totalrecords + " Unique Records = " + uniquerecords);
        }

        #endregion
    }
}


Sample data:

CSV
TradeRef,Risk Source System,Source System Id,Legal Entity Id,Branch,Business,Portfolio Name,Book Id,Book Name,Deal Group Id,Pabulum Id,Deal Version,Status,Instrument,Product Group,Product Code,Instrument Sub Type,Valuation Model,Number of Simulation Paths,Underlying,PL Currency,Buy Sell,Trade Date,Expiry Date,Delivery Date,Strike Rate,Forward Rate,Interest Rate,Upper Barrier Rate,Adjusted Upper Barrier Rate,Lower Barrier Rate,Adjusted Lower Barrier Rate,Notional Currency 1,Notional Amount 1,Notional Currency 2,Notional Amount 2,Strike Delta Percentage,Vol For Strike,Premium Currency,Premium Amount,Premium Date,Counterparty Code,Counterparty Name,Close Out Deal Id,Closed Out Day On Day,Close Out Date,Fixing Day on Day,Last Fixing Date,Triggered Day On Day,Triggered Date,Trader Id,Modification Date,Modification Time,Sales Person,AV Amount,ACTV Amount,CTV Amount,TCV Currency,TCV Amount,MonteCarlo Error
L141212072279,SYXLDN,GFXLDN,,,,,PRIME,,L141212072279,,0,Live,Forward,Forward,FWD,SPOT,ModelFree,0,USDCAD,USD,Sell,12-Dec-14,,15-Dec-14,,1.15471,,,,,,USD,-800000,CAD,923768,,,,,,XDEUH,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
L141212072280,SYXLDN,GFXLDN,,,,,PRIME,,L141212072280,,0,Live,Forward,Forward,FWD,IDT,ModelFree,0,USDCAD,USD,Buy,12-Dec-14,,15-Dec-14,,1.15471,,,,,,USD,800000,CAD,-923768,,,,,,RBSMARG,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
L141212073467,SYXLDN,GFXLDN,,,,,PRIME,,L141212073467,,0,Live,Forward,Forward,FWD,SPOT,ModelFree,0,USDCAD,USD,Sell,12-Dec-14,,15-Dec-14,,1.15487,,,,,,USD,-1500000,CAD,1732305,,,,,,XLUCIDP,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
L141212073468,SYXLDN,GFXLDN,,,,,PRIME,,L141212073468,,0,Live,Forward,Forward,FWD,IDT,ModelFree,0,USDCAD,USD,Buy,12-Dec-14,,15-Dec-14,,1.15487,,,,,,USD,1500000,CAD,-1732305,,,,,,RBSMARG,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
L141212073443,SYXLDN,GFXLDN,,,,,PRIME,,L141212073443,,0,Live,Forward,Forward,FWD,SPOT,ModelFree,0,USDCAD,USD,Buy,12-Dec-14,,15-Dec-14,,1.155,,,,,,USD,1000000,CAD,-1155000,,,,,,XCITLH,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
L141212073444,SYXLDN,GFXLDN,,,,,PRIME,,L141212073444,,0,Live,Forward,Forward,FWD,IDT,ModelFree,0,USDCAD,USD,Sell,12-Dec-14,,15-Dec-14,,1.155,,,,,,USD,-1000000,CAD,1155000,,,,,,RBSMARG,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
L141212073446,SYXLDN,GFXLDN,,,,,PRIME,,L141212073446,,0,Live,Forward,Forward,FWD,SPOT,ModelFree,0,USDCAD,USD,Sell,12-Dec-14,,15-Dec-14,,1.155,,,,,,USD,-1000000,CAD,1155000,,,,,,XCAMPCO,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
L141212073447,SYXLDN,GFXLDN,,,,,PRIME,,L141212073447,,0,Live,Forward,Forward,FWD,IDT,ModelFree,0,USDCAD,USD,Buy,12-Dec-14,,15-Dec-14,,1.155,,,,,,USD,1000000,CAD,-1155000,,,,,,RBSMARG,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
L141212073464,SYXLDN,GFXLDN,,,,,PRIME,,L141212073464,,0,Live,Forward,Forward,FWD,SPOT,ModelFree,0,USDCAD,USD,Buy,12-Dec-14,,15-Dec-14,,1.15487,,,,,,USD,1500000,CAD,-1732305,,,,,,XBARLHS,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
L141212073465,SYXLDN,GFXLDN,,,,,PRIME,,L141212073465,,0,Live,Forward,Forward,FWD,IDT,ModelFree,0,USDCAD,USD,Sell,12-Dec-14,,15-Dec-14,,1.15487,,,,,,USD,-1500000,CAD,1732305,,,,,,RBSMARG,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
L141212076208,SYXLDN,GFXLDN,,,,,PRIME,,L141212076208,,0,Live,Forward,Forward,FWD,SPOT,ModelFree,0,USDCAD,USD,Buy,12-Dec-14,,15-Dec-14,,1.15459,,,,,,USD,3500000,CAD,-4041065,,,,,,BGFX07B,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
L141212076209,SYXLDN,GFXLDN,,,,,PRIME,,L141212076209,,0,Live,Forward,Forward,FWD,IDT,ModelFree,0,USDCAD,USD,Sell,12-Dec-14,,15-Dec-14,,1.15459,,,,,,USD,-3500000,CAD,4041065,,,,,,RBSMARG,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
L141212076096,SYXLDN,GFXLDN,,,,,PRIME,,L141212076096,,0,Live,Forward,Forward,FWD,SPOT,ModelFree,0,USDCAD,USD,Sell,12-Dec-14,,15-Dec-14,,1.15459,,,,,,USD,-3500000,CAD,4041065,,,,,,XDEUBCM,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
L141212076097,SYXLDN,GFXLDN,,,,,PRIME,,L141212076097,,0,Live,Forward,Forward,FWD,IDT,ModelFree,0,USDCAD,USD,Buy,12-Dec-14,,15-Dec-14,,1.15459,,,,,,USD,3500000,CAD,-4041065,,,,,,RBSMARG,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
L141212075478,SYXLDN,GFXLDN,,,,,PRIME,,L141212075478,,0,Live,Forward,Forward,FWD,SPOT,ModelFree,0,USDCAD,USD,Buy,12-Dec-14,,15-Dec-14,,1.15524999892305,,,,,,USD,784619.78,CAD,-906432,,,,,,XLUCIDP,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
L141212072279,SYXLDN,GFXLDN,,,,,PRIME,,L141212072279,,0,Live,Forward,Forward,FWD,SPOT,ModelFree,0,USDCAD,USD,Sell,12-Dec-14,,15-Dec-14,,1.15471,,,,,,USD,-800000,CAD,923768,,,,,,XDEUH,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
L141212072280,SYXLDN,GFXLDN,,,,,PRIME,,L141212072280,,0,Live,Forward,Forward,FWD,IDT,ModelFree,0,USDCAD,USD,Buy,12-Dec-14,,15-Dec-14,,1.15471,,,,,,USD,800000,CAD,-923768,,,,,,RBSMARG,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
L141212073467,SYXLDN,GFXLDN,,,,,PRIME,,L141212073467,,0,Live,Forward,Forward,FWD,SPOT,ModelFree,0,USDCAD,USD,Sell,12-Dec-14,,15-Dec-14,,1.15487,,,,,,USD,-1500000,CAD,1732305,,,,,,XLUCIDP,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
L141212074836,SYXLDN,GFXLDN,,,,,PRIME,,L141212074836,,0,Live,Forward,Forward,FWD,SPOT,ModelFree,0,USDCAD,USD,Buy,12-Dec-14,,15-Dec-14,,1.15508,,,,,,USD,500000,CAD,-577540,,,,,,XCSSFX,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
L141212074837,SYXLDN,GFXLDN,,,,,PRIME,,L141212074837,,0,Live,Forward,Forward,FWD,IDT,ModelFree,0,USDCAD,USD,Sell,12-Dec-14,,15-Dec-14,,1.15508,,,,,,USD,-500000,CAD,577540,,,,,,SOLIDFX,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
L141212074791,SYXLDN,GFXLDN,,,,,PRIME,,L141212074791,,0,Live,Forward,Forward,FWD,SPOT,ModelFree,0,USDCAD,USD,Sell,12-Dec-14,,15-Dec-14,,1.1551,,,,,,USD,-500000,CAD,577550,,,,,,XVFSCIT,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
L141212074792,SYXLDN,GFXLDN,,,,,PRIME,,L141212074792,,0,Live,Forward,Forward,FWD,IDT,ModelFree,0,USDCAD,USD,Buy,12-Dec-14,,15-Dec-14,,1.1551,,,,,,USD,500000,CAD,-577550,,,,,,SOLIDFX,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
L141212076093,SYXLDN,GFXLDN,,,,,PRIME,,L141212076093,,0,Live,Forward,Forward,FWD,SPOT,ModelFree,0,USDCHF,USD,Buy,12-Dec-14,,16-Dec-14,,0.96518,,,,,,USD,1000000,CHF,-965180,,,,,,XDEUBCM,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
L141212076094,SYXLDN,GFXLDN,,,,,PRIME,,L141212076094,,0,Live,Forward,Forward,FWD,IDT,ModelFree,0,USDCHF,USD,Sell,12-Dec-14,,16-Dec-14,,0.96518,,,,,,USD,-1000000,CHF,965180,,,,,,RBSMARG,,0,,,,,,,,,00:01:00,,0.0000,0.0000,0.0000,USD,0.0000,
Updated 16-Dec-14 22:50pm
Comments
Tomas Takac 17-Dec-14 4:00am    
You are using concurrent collections and then locking explicitly. Why not use ConcurrentDictionary.TryAdd()?
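A minimal sketch of the TryAdd suggestion, for illustration only. `TryAddSketch` and `KeepFirstByKey` are hypothetical names, not part of the posted code, and rows are simplified to string arrays. The point is that `TryAdd` returns true for exactly one caller per key, so the separate `ContainsKey` check and the explicit lock are not needed for de-duplication.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

static class TryAddSketch
{
    // Hypothetical helper: keeps one row per key, where the key is column 0.
    public static List<string[]> KeepFirstByKey(IEnumerable<string[]> rows)
    {
        var seen = new ConcurrentDictionary<string, byte>();
        var unique = new ConcurrentQueue<string[]>();

        // TryAdd is atomic: only one thread gets 'true' for a given key,
        // so no lock or ContainsKey check is required around it.
        Parallel.ForEach(rows, row =>
        {
            if (seen.TryAdd(row[0], 0))
                unique.Enqueue(row);
        });

        return new List<string[]>(unique);
    }
}
```

Which of two rows with the same key is kept depends on thread timing, so this only suits data where duplicates are interchangeable.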
[no name] 17-Dec-14 4:09am    
Hi Tomas,
I had tried TryAdd as well; it still behaves the same way.
Tomas Takac 17-Dec-14 4:35am    
What else bothers me is that you call TryDequeue twice every time. But that would not produce duplicate records, rather skip some. Maybe post some sample data so we can see the duplicates.
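One way to avoid the double dequeue is to let the collection signal completion itself. The sketch below swaps the `ConcurrentQueue` plus `_flag` pair for a `BlockingCollection<string>`, which is a different technique from the posted code. `ConsumerSketch`, `Drain` and `Demo` are hypothetical names, and rows are simplified to plain strings.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

static class ConsumerSketch
{
    // Takes each item exactly once and returns when the producer
    // has called CompleteAdding() and the collection is empty.
    public static List<string> Drain(BlockingCollection<string> queue)
    {
        var written = new List<string>();
        foreach (var line in queue.GetConsumingEnumerable())
            written.Add(line); // stand-in for writer.WriteRow(...)
        return written;
    }

    // One producer, one consumer, 100 rows.
    public static List<string> Demo()
    {
        using (var queue = new BlockingCollection<string>())
        {
            var consumer = Task.Run(() => Drain(queue));
            for (int i = 0; i < 100; i++)
                queue.Add("row " + i);
            queue.CompleteAdding(); // replaces the shared _flag
            return consumer.Result; // wait for the consumer before disposing
        }
    }
}
```

Because the loop blocks while the collection is empty, it also avoids spinning on a flag while the readers are still working.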
[no name] 17-Dec-14 4:53am    
Please find the sample data attached. You can make 5 files out of the same data and then test the code.

Yes, you are correct. Thanks for pointing it out. However, even after correcting that code I still encounter the issue; it's just a little more correct now.

I have analyzed it further and found that the problem is with the TryDequeue method. It's writing the same line again and again at random.
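One possible explanation for the repeated lines, offered as an assumption because the source of `CsvFileReader` is not shown here: if `ReadRow` refills the list passed to it instead of allocating a new one, every `Enqueue(columns)` call stores a reference to the same list, and the writer sees whatever row the reader loaded most recently. The sketch below simulates that with a reused buffer. `SharedListSketch` and `Run` are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

static class SharedListSketch
{
    // Simulates a reader that refills one buffer per row, then returns
    // the first column of every row that comes out of the queue.
    public static List<string> Run(bool copyBeforeEnqueue)
    {
        var queue = new ConcurrentQueue<IList<string>>();
        var buffer = new List<string>();

        foreach (var key in new[] { "A", "B", "C" })
        {
            buffer.Clear();   // assumed behaviour of ReadRow(columns)
            buffer.Add(key);
            // Either enqueue the shared buffer or a snapshot of it.
            queue.Enqueue(copyBeforeEnqueue ? new List<string>(buffer) : buffer);
        }

        var firstColumns = new List<string>();
        IList<string> row;
        while (queue.TryDequeue(out row))
            firstColumns.Add(row[0]);
        return firstColumns;
    }
}
```

`Run(false)` returns C, C, C and `Run(true)` returns A, B, C. If the assumption holds, copying the list in the writer after dequeuing would be too late, since the reader may already have overwritten it; the copy would need to happen before `Enqueue`.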
Tomas Takac 17-Dec-14 5:25am    
Some of the values in the first column have trailing spaces. Could that be the problem? Try trimming them before adding to or checking the dictionary.
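A small sketch of the trimming idea, assuming the key is a plain string taken from the first column. `TrimSketch` and `CountUnique` are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Concurrent;

static class TrimSketch
{
    // Counts distinct keys after removing leading and trailing whitespace.
    public static int CountUnique(string[] keys)
    {
        var seen = new ConcurrentDictionary<string, byte>();
        int unique = 0;
        foreach (var key in keys)
        {
            // "L141212072279" and "L141212072279 " become the same key.
            if (seen.TryAdd(key.Trim(), 0))
                unique++;
        }
        return unique;
    }
}
```

Without the `Trim()` call, keys that differ only by whitespace would each be treated as new and would all reach the output file.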

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


