Click here to Skip to main content
14,690,956 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
I have data table which contains many duplicate rows i need to filter those rows from data table based upon multiple columns to get distinct rows in resultant data table....
Barcode Itemid PacktypeId

1      100      1

1      100      2

1      100      3

1      100      1

1      100      3

need only rows which contains packtypeid 1,2,3 remaining 4th and 5th row should be removed

I have tried using two methods but none didn't turns for better result

Data table contains more than 10 columns but unique column's is "Barcode", "ItemID", "PackTypeID"

Method-1:

dt_Barcode = dt_Barcode.DefaultView.ToTable(true, "Barcode", "ItemID", "PackTypeID");

The above method filter's the rows but it returns columns only 3 column values i need entire 10 column values.

Method-2:
List<string> keyColumns = new List<string>();
keyColumns.Add("Barcode");
keyColumns.Add("ItemID");
keyColumns.Add("PackTypeID");   
RemoveDuplicates(DataTable table, List<string> keyColumns)
{
	var uniqueness = new HashSet<string>();
	StringBuilder sb = new StringBuilder();
	int rowIndex = 0;
	DataRow row;
	DataRowCollection rows = table.Rows;             
	int i = rows.Count;
	while (rowIndex < i)
	{
	row = rows[rowIndex];
	sb.Length = 0;
	foreach (string colname in keyColumns)
	{
		sb.Append(row[colname]);
		sb.Append("|");
	}
	
	if (uniqueness.Contains(sb.ToString()))
	{
		rows.Remove(row);
	}
	else
	{
		uniqueness.Add(sb.ToString());
		rowIndex++;
	}
}

The Above Method returns exception like there is no rows at position 5
Posted
Updated 5-Aug-20 0:10am
v2

try with LINQ
dt_Barcode = dt_Barcode.AsEnumerable()
            .GroupBy(r => new { Itemid = r.Field<int>("Itemid"), PacktypeId = r.Field<int>("PacktypeId")})
            .Select(g => g.First())
            .CopyToDataTable();


Sample Test Code:
void Main()
{
    DataTable dt_Barcode =GetTable();
	dt_Barcode = dt_Barcode.AsEnumerable()
            .GroupBy(r => new { Itemid = r.Field<int>("Itemid"), PacktypeId = r.Field<int>("PacktypeId")})
            .Select(g => g.First())
            .CopyToDataTable();
}

DataTable GetTable()
{
	DataTable table = new DataTable();
	table.Columns.Add("Barcode", typeof(int));
	table.Columns.Add("Itemid", typeof(int));
	table.Columns.Add("PacktypeId", typeof(int));
	table.Rows.Add(1,100,1);
	table.Rows.Add(1,100,2);
	table.Rows.Add(1,100,3);
	table.Rows.Add(1,100,1);
	table.Rows.Add(1,100,3);
	return table;
}
   
v6
Comments
pradeepbliss 11-Nov-14 4:14am
   
Hi DamithSL partially u r correct but i need to pass both packtypeid and itemid at the same time via linq how can i accomplish both in where clause...
DamithSL 11-Nov-14 4:48am
   
check my updated answer
pradeepbliss 11-Nov-14 4:57am
   
hi i am getting incorrect syntax near , (invalid expression term ,)
DamithSL 11-Nov-14 5:13am
   
Oh sorry, new keyword need to be used
pradeepbliss 11-Nov-14 5:20am
   
hi i am sorry i am new to linq i am getting syntax error->invalid anonymous type member declarator........
DamithSL 11-Nov-14 5:28am
   
I think you take the intermediate version of my update, check the latest code.
pradeepbliss 11-Nov-14 5:33am
   
checked with latest code but getting runtime exception->Specified cast is not valid.but works for single parameter....
DamithSL 11-Nov-14 5:37am
   
do you have null values in these columns or do you have any text or characters with any of these column values?
I have used r.Field<int> by assuming you have integer data in your columns, if you have text fields you better change it to r.Field<string>
pradeepbliss 11-Nov-14 6:08am
   
wow superb it's works great...Great job man
Manas Bhardwaj 11-Nov-14 4:50am
   
+5!
DamithSL 11-Nov-14 5:16am
   
thank you, Manas
Example code for your problem.
List<items> arritems = new List<items>();
items item = new items();
items item1 = new items();
items item2 = new items();
items item3 = new items();
items item4 = new items();
item.barcode = 1;
item.itemid = 100;
item.packtypeid = 1;

item1.barcode = 1;
item1.itemid = 100;
item1.packtypeid = 2;

item2.barcode = 1;
item2.itemid = 100;
item2.packtypeid = 3;

item3.barcode = 1;
item3.itemid = 100;
item3.packtypeid = 1;

item4.barcode = 1;
item4.itemid = 100;
item4.packtypeid = 3;


arritems.Add(item);
arritems.Add(item1);
arritems.Add(item2);
arritems.Add(item3);
arritems.Add(item4);

var distinctList = arritems.Select(x => new{x.itemid , x.packtypeid}).Distinct().ToList();


public class items
    {
        public int barcode;
        public int itemid;
        public int packtypeid;
    }
   
Comments
sudhakarkoganti 11-Nov-14 5:54am
   
I have similar problem....this works for me.
pradeepbliss 11-Nov-14 8:00am
   
based upon performance and code optimization which is better either linq or datatable select........
Rajesh Varma Buddaraju 12-Nov-14 2:02am
   
There is no much performance difference on both. You can go with linq.
public void RemoveDuplicatesFromDataTable(ref DataTable table, List<string> keyColumns)
        {

            Dictionary<string, string> uniquenessDict = new Dictionary<string, string>(table.Rows.Count);
            StringBuilder stringBuilder = null;
            int rowIndex = 0;
            DataRow row;
            DataRowCollection rows = table.Rows;
            string error = string.Empty;

            try
            {
                while (rowIndex < rows.Count)
                {

                    row = rows[rowIndex];

                    stringBuilder = new StringBuilder();

                    foreach (string colname in keyColumns)
                    {
                        try
                        {
                            if (row[colname].ToString() != string.Empty)
                            {
                                stringBuilder.Append(((string)row[colname]));
                            }
                            else
                            {
                                //If it comes here, means one of the keys are blank
                                error += "One of the key values is blank.";
                            }
                        }
                        catch (Exception ss)
                        {
                            error += "Error " + ss.Message + ".";
                        }
                    }

                    if (uniquenessDict.ContainsKey(stringBuilder.ToString()))
                    {
                        rows.Remove(row);
                    }
                    else
                    {
                        uniquenessDict.Add(stringBuilder.ToString().Replace(",", ""), string.Empty);
                        rowIndex++;
                    }

                }
            }
            catch (Exception ex)
            {
                error = "Failed - " + ex.Message;
            }

            if(error != string.Empty)
                Show(error);
        }
   
Comments
Richard Deeming 5-Aug-20 7:02am
   
An unexplained code-dump is not a "solution" to this already-solved question.

You need to explain why you think your code is better than the existing accepted solution. (Hint: It's not.)

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)




CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900