Click here to Skip to main content
15,663,531 members
Please Sign up or sign in to vote.
1.00/5 (1 vote)
See more:
I have been experimenting but haven't found a solution that does exactly what I need it to do. So I am turning to the group here to ask for help.

I have two List<string> objects that I want to compare, and I want to find any item that is not the same in both lists. Both lists are already sorted into alphabetical order.

So I need a way to iterate through the lists to find anything that does not appear in both lists.

For example...

List 1 = Apple, Banana, Carrot, Grape, Plum
List 2 = Apple, Carrot, Grape, Mango, Plum


What I want is to be able to find "Banana" in List 1 and "Mango" in List 2 - and in a way that I know which list it came from.

Thanks!
Posted
Updated 20-Oct-15 3:20am
v3
Comments
Sergey Alexandrovich Kryukov 20-Oct-15 2:12am    
Programming is not about "looking for an algorithm"; it includes creation of algorithms, too. And what you want is not normally called "compare"...
Why not just using common logic?
—SA

Use Linq!

C#
List<string> List1 = new List<string>(){"Apple", "Banana", "Carrot", "Grape", "Plum"};
List<string> List2 = new List<string>(){"Apple", "Carrot", "Grape", "Mango", "Plum"};

var result = List1.Concat(List2) //concatenate list into single recordset
		.Except(List1.Intersect(List2)) //eliminate common items
		.Select(a=>new
			{
				Value = a, //get value
				List = List1.Any(c=>c==a) ? "List1" : "List2" //define list
			});
			
foreach(var d in result)
{
	Console.WriteLine("Item: '{0}' found on: '{1}'", d.Value, d.List);
}


Returns:
Item: 'Banana' found on: 'List1'
Item: 'Mango' found on: 'List2'


Try!
 
Share this answer
 
Comments
Sergey Alexandrovich Kryukov 20-Oct-15 9:56am    
Sure, a 5.
—SA
Maciej Los 20-Oct-15 10:24am    
Thank you, Sergey.
Maciej Los 21-Oct-15 0:26am    
Thank you, Tadit
C#
List<string> lstIn1NotIn2 = lst1.Except(lst2).ToList();
List<string> lstIn2NotIn1 = lst2.Except(lst1).ToList();
 
Share this answer
 
v2
Comments
Maciej Los 20-Oct-15 9:32am    
A 4! Nice trick, Tadit. As per my understanding, OP wants to fetch items that do not exist on both lists + extra info about the source of item. Please, see my answer.
Yeah. Thanks :)
Use LINQ to compare two lists of strings[^]
class CompareLists
{        
    static void Main()
    {
        // Create the IEnumerable data sources.
        string[] names1 = System.IO.File.ReadAllLines(@"../../../names1.txt");
        string[] names2 = System.IO.File.ReadAllLines(@"../../../names2.txt");

        // Create the query. Note that method syntax must be used here.
        IEnumerable<string> differenceQuery =
          names1.Except(names2);

        // Execute the query.
        Console.WriteLine("The following lines are in names1.txt but not names2.txt");
        foreach (string s in differenceQuery)
            Console.WriteLine(s);

        // Keep the console window open in debug mode.
        Console.WriteLine("Press any key to exit");
        Console.ReadKey();
    }
}
/* Output:
     The following lines are in names1.txt but not names2.txt
    Potra, Cristina
    Noriega, Fabricio
    Aw, Kam Foo
    Toyoshima, Tim
    Guy, Wey Yuan
    Garcia, Debra
     */


-KR
 
Share this answer
 
Comments
Maciej Los 20-Oct-15 9:29am    
The example does not meet OP's requirements. Above code returns only "names" from first list of names, which do not exist on second list. OP wants to return those items which do not exist on both. In other words, he wants to reject common items. Please, see my answer.
Please see my comment to the question.

The problem is quite trivial, the only question is how efficient your search can be. First of all, your "in a way that I know which list it came from" means pretty big thing. As your solution will list all encountered words, and each word can be found in both lists or in a subset of lists (in case of just two, in one or another) "the way to know" would mean that each element of solution should contain the following components: 1) a word itself, 2) a set (or any other collection) of references to original lists. The whole solution would be something like
using System.Collections.Generic;

C#
using System.Collections.Generic;

//...

Dictionary<string, HashSet<string>> = //...</string>

The key of your dictionary would represent a word, and the value type — the set of references to your original lists. I suggested the type HashSet for purely "ideological" reasons: the order of references may not matter; but the reason to choose dictionary is more important: this is the performance of finding already added word you will need to do on each step, which will be of the time complexity of O(1), in contrast to list's O(N).

Now, as you will need to perform the search many times, the List type is not the best choice. I would populate the data in System.Collections.Generic.HashSet, which provides the time complexity for search of O(1):
https://msdn.microsoft.com/en-us/library/bb359438%28v=vs.110%29.aspx[^],
https://msdn.microsoft.com/en-us/library/bb356440%28v=vs.110%29.aspx[^] (search method).

At the same time, if the lists are small, the overhead of re-population of data may not pay off, so using lists in all cases will be good enough; and you can use LINQ to Objects: https://msdn.microsoft.com/en-us/library/bb397919.aspx[^].

You can iterate this collection is a usual way with the usual way, foraach… in, because this type implements System.Collections.Generic.IEnumerable<>:
https://msdn.microsoft.com/en-us/library/9eekhta0(v=vs.110).aspx[^].

So, you would need to iterate one set, checking presence of each word in a second one and adding it in the solution dictionary. After that, you will need to do it with the second set, because some words may never appear in the first one. When you add data to the dictionary, you use the method TryGetValue:
https://msdn.microsoft.com/en-us/library/xfhwa508%28v=vs.110%29.aspx[^],
https://msdn.microsoft.com/en-us/library/bb347013(v=vs.110).aspx[^].

If the word is already found, you try to add the source set reference(s) to the value part of the dictionary element, if not, you add the element by the new key (source word).

That's all. Your overall time complexity will be of O(1), which is the fastest possible. However, the absolute time could be better for, say, lists, if the input data set is very small, but combination of set and dictionary will very quickly outperform any other solution as size of data grows.

—SA
 
Share this answer
 
Comments
Maciej Los 20-Oct-15 9:24am    
5ed!
Using Dictionary + Hashsets is one of possible ways. Probably - the best. For less portion of data, it's possible to use Linq.
Please, see my answer, Sergey.
Sergey Alexandrovich Kryukov 20-Oct-15 9:55am    
Thank you, Maciej.
Yes, LINQ, as I mentioned. For compactness and elegance of code. ;-)
—SA
Maciej Los 20-Oct-15 10:23am    
Seems, you're very busy these days...
Sergey Alexandrovich Kryukov 20-Oct-15 10:44am    
Well, yes, more than average. Hope to overcome time crisis during the rest of the year. But, by the way, I work at new CodeProject articles. There are not many interesting questions anyway...
—SA
Maciej Los 20-Oct-15 10:59am    
Agree. I'm during writing an article too. Take care!

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900