On top of the other answers, you could make the process faster by creating a low-cost hash, built simply by XORing all the elements of each List.
You wouldn't have to sort your lists, and all you would get is an int,
which is easier and faster to store and compare than a string.
Then you only need to use the resulting XORed number as a key to a Hashtable and check for the existence of the key before inserting it.
If there is already an existing key, only then do you sort the corresponding Lists and compare them.
You still need to compare them if you find a match because there may be some collisions using a simple XOR.
I think, though, that the result would be much faster and have a much lower memory footprint than re-ordering arrays and converting them to strings.
If you were to have your own implementation of List<>, you could build the generation of the XOR key into it, so that the key is updated on every operation on the List.
This would make the process of checking duplicate lists even faster.
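As a rough illustration, here is a minimal sketch of what such a wrapper might look like (the XorList class and its members are hypothetical names used only for this example, not part of the code below); it keeps the key up to date incrementally instead of recomputing it:

// Hypothetical sketch of a list wrapper that maintains its XOR key incrementally.
public class XorList {
    private readonly List<int> items = new List<int>();

    public int XorKey { get; private set; }

    public void Add(int value) {
        items.Add(value);
        XorKey ^= value;               // fold the new element into the key
    }

    public bool Remove(int value) {
        bool removed = items.Remove(value);
        if (removed) XorKey ^= value;  // XOR is its own inverse, so this undoes the add
        return removed;
    }

    public int Count { get { return items.Count; } }
    public int this[int index] { get { return items[index]; } }
}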
Code
Below is a first attempt at implementing this.
Dictionary<int, List<List<int>>> checkHash = new Dictionary<int, List<List<int>>>();

public bool CheckDuplicate(List<int> theList) {
    bool isIdentical = false;
    int xorkey = 0;
    foreach (int v in theList) xorkey ^= v;

    List<List<int>> existingLists;
    checkHash.TryGetValue(xorkey, out existingLists);

    if (existingLists != null) {
        // The key is already in the dictionary: check each stored list.
        foreach (List<int> li in existingLists) {
            isIdentical = (theList.Count == li.Count);
            if (isIdentical) {
                // Compare sorted copies element by element so that
                // repeated values are handled correctly.
                List<int> a = new List<int>(theList);
                List<int> b = new List<int>(li);
                a.Sort();
                b.Sort();
                for (int i = 0; i < a.Count; i++) {
                    if (a[i] != b[i]) {
                        isIdentical = false;
                        break;
                    }
                }
            }
            if (isIdentical) break;
        }
    }

    if (existingLists == null) {
        // Never seen this key before: start a new bucket for it.
        List<List<int>> newList = new List<List<int>>();
        newList.Add(theList);
        checkHash.Add(xorkey, newList);
    } else if (!isIdentical) {
        // Key collision, but the list itself is new: add it to the existing bucket.
        existingLists.Add(theList);
    }
    return isIdentical;
}
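A small usage sketch, assuming the field and method above are placed in a helper class (here called DuplicateChecker purely for illustration):

var checker = new DuplicateChecker();

Console.WriteLine(checker.CheckDuplicate(new List<int> { 1, 2, 3 }));  // False: first time seen
Console.WriteLine(checker.CheckDuplicate(new List<int> { 3, 2, 1 }));  // True: same elements, different order
Console.WriteLine(checker.CheckDuplicate(new List<int> { 1, 2, 4 }));  // False: different elements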
It's not the most elegant or easiest code to read at first sight, it's rather 'hacky', and I'm not even sure it performs better than the more elegant version from Guffa.
What it does do, though, is take care of collisions in the XOR key by storing Lists of List<int> in the Dictionary.
If a duplicate key is found, we loop through each previously stored List until we find a match or run out of candidates.
The good point about the code is that it should be about as fast as you can get in most cases, and still faster than building and comparing strings when a collision does occur.
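To make the collision point concrete: two quite different lists can produce the same XOR key, which is why the element-by-element comparison is still needed. A minimal illustration:

// Both lists XOR to 0, so they land in the same bucket
// even though they are clearly not duplicates of each other.
Console.WriteLine((1 ^ 2 ^ 3) == (4 ^ 5 ^ 1));  // True: same key, different lists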