How do I synchronise tasks transforming string[][] to datarow

Question


Hi community,

I am new to parallel coding and have some trouble understanding what I have read about it.

Let's assume the following scenario:

I have 5 tables in a SQL Server database and I receive data from source X which needs to be filtered, sorted and validated. The resulting string[][] then needs to be transformed to DataRow[] and uploaded to the SQL Server tables.

My sequential solution works, but to be honest - I have an eight-core processor...

To parallelise my solution I thought I might use the TPL with a for loop (iterating 0 to 4 = 1 task per table). Based on the iteration index I would perform a specific LINQ query on my array and then simply take a new DataRow with the respective table schema and populate its fields. I have set the tables' ID columns to increment automatically. My sequential solution therefore does not provide a value for the ID column of my DataRow (it will be done by the SQL Server anyway).

Problem:
The string[][] exists outside of the TPL for loop and therefore needs synchronisation - is that correct?

Is it further correct that all variables that I create within the TPL for loop are thread-safe, and that I therefore do not need to synchronise them? Meaning that creating the DataRow within the for loop should be fine in terms of exceptions?

Last question regarding the TPL for loop: will it automatically wait for all tasks before the main thread continues, or do I have to call Task.WaitAll()? In that case, wouldn't it be better to create individual tasks, add them to an array and call Task.WaitAll(arrayOfTasks)?

I am using ADO DataTables -> therefore I want to wait for all the updates/changes to the local tables and then simply update the entire database at once.

I am happy with pseudo solutions, but I am more interested in understanding the concept correctly. Am I approaching this problem correctly, or should I create a normal for loop and create Tasks within it?

As always - thanks for your help and time.

-DK

Posted 2-Sep-14 12:10pm

KergalBerlin


kbrandwijk · Accepted Answer · 2014-09-02T12:37:00

You have a string[][]:

C#

string[][] myArray = new string[][] {};

You can either use one method for all tables, or define a separate method for each table.

Using one method for all tables, you define a list of tables:

C#

List<string> tables = new List<string> { "table1", "table2", "table3", "table4", "table5" };

Next, you use PLINQ to execute your method for each table in parallel:

C#

tables.AsParallel().ForAll(table => DoSomethingWithTheArray(myArray, table));

...

private static void DoSomethingWithTheArray(string[][] myArray, string table)
{
    switch (table)
    {
        case "table1":
            // Perform transformation of array to table 1
            break;
        // etc.
    }
}
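To make the single-method approach concrete, here is a minimal sketch. It assumes, purely for illustration, that column 0 of each source row names the target table; the class name SingleMethodSketch and the method CountRowsPerTable are hypothetical and not part of the code above. It only counts matching rows, but it shows two points relevant to the question: the shared array is only read, so it needs no lock, and ForAll does not return until the delegate has run for every table.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

public static class SingleMethodSketch
{
    // Hypothetical layout: column 0 of each source row names the target table.
    public static IDictionary<string, int> CountRowsPerTable(
        string[][] source, IEnumerable<string> tables)
    {
        // A thread-safe collection, because every parallel call writes into it.
        var counts = new ConcurrentDictionary<string, int>();

        // ForAll blocks until the delegate has finished for every table.
        tables.AsParallel().ForAll(table =>
        {
            // The shared array is only read here, never modified,
            // so concurrent access needs no synchronisation.
            counts[table] = source.Count(row => row[0] == table);
        });

        return counts;
    }
}
```

Anything the parallel calls write to in common (here the dictionary) is what needs to be thread-safe; the input array does not, as long as nothing modifies it while the query runs.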

If you want to structure your code a bit better, and define a separate method for each table, you change the list of tables to be a list of methods:

C#

List<Action<string[][]>> transformations = new List<Action<string[][]>>
{
    TransformArrayForTableOne,
    // etc...
};

Then, change your PLINQ query to run all the methods in parallel:

C#

transformations.AsParallel().ForAll(transformation => transformation(myArray));

...

private static void TransformArrayForTableOne(string[][] obj)
{
    // Do something with the array specific for this table
}

private static void TransformArrayForTableTwo(string[][] obj)
{
    // Do something with the array specific for this table
}

// etc.

In each of the methods, you would perform the transformation to the specific DataRows. It's up to you how you want to commit these to the database.
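As a rough illustration of what such a transformation could look like, the sketch below uses Parallel.For (the construct mentioned in the question) rather than the PLINQ call above. The one-column schema, the rule that column 0 holds the zero-based table index, and the names ParallelForSketch and BuildTables are all assumptions made for the example. Each iteration creates and fills its own DataTable, so no DataTable is ever shared between threads, and Parallel.For returns only after all iterations have completed, so no Task.WaitAll is needed afterwards.

```csharp
using System;
using System.Data;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelForSketch
{
    public static DataTable[] BuildTables(string[][] source, int tableCount)
    {
        var tables = new DataTable[tableCount];

        // Parallel.For returns only after every iteration has completed.
        Parallel.For(0, tableCount, i =>
        {
            // Created inside the iteration, so only this thread touches it.
            var table = new DataTable("table" + (i + 1));
            table.Columns.Add("Value", typeof(string)); // hypothetical schema

            // Hypothetical filter: column 0 holds the zero-based table index.
            // The source array is only read, so it needs no lock.
            string key = i.ToString();
            foreach (string[] fields in source.Where(r => r[0] == key))
            {
                DataRow row = table.NewRow();
                row["Value"] = fields[1];
                table.Rows.Add(row);
            }

            // Each iteration writes only to its own array slot.
            tables[i] = table;
        });

        return tables;
    }
}
```

Once BuildTables returns, all tables are fully populated and can be sent to the database from the calling thread in whatever way you prefer.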