I have a List of paths of files stored on my computer. My aim is to first filter out the files which have the same name and then filter out those which have the same size.
To do so, I have made two classes implementing IEqualityComparer<string>, and implemented the Equals and GetHashCode methods. The query is:
    var query = FilesList.Distinct(new CustomTextComparer())
                         .Distinct(new CustomSizeComparer());
The code for both classes is given below:
    public class CustomTextComparer : IEqualityComparer<string>
    {
        public bool Equals(string x, string y)
        {
            if (Path.GetFileName(x) == Path.GetFileName(y))
            {
                return true;
            }
            return false;
        }

        public int GetHashCode(string obj)
        {
            return obj.GetHashCode();
        }
    }

    public class CustomSizeComparer : IEqualityComparer<string>
    {
        public bool Equals(string x, string y)
        {
            if (new FileInfo(x).Length == new FileInfo(y).Length)
            {
                return true;
            }
            else
            {
                return false;
            }
        }

        public int GetHashCode(string obj)
        {
            return obj.GetHashCode();
        }
    }
But the code does not work. It doesn't throw any exceptions and there are no compiler errors, but it doesn't exclude the duplicate files.
How can I correct this problem so that the code works correctly?
Your GetHashCode must return the same value for any two objects that Equals considers equal:
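A minimal sketch of that fix for the name comparer, assuming the comparison should stay a case-sensitive match on the file name as in the question. Only GetHashCode changes: it now hashes the same value that Equals compares.

```csharp
using System.Collections.Generic;
using System.IO;

public class CustomTextComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y)
    {
        // Two paths are "equal" when their file names match.
        return Path.GetFileName(x) == Path.GetFileName(y);
    }

    public int GetHashCode(string obj)
    {
        // Hash the file name, not the full path, so that paths which
        // Equals treats as equal also produce the same hash code.
        return Path.GetFileName(obj).GetHashCode();
    }
}
```

With the original version, two different paths to identically named files almost always hashed differently, so Distinct never got as far as calling Equals on them.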
But there is a much easier way to solve the whole problem without the extra classes:
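One way this could look is to group by the compared value and keep the first item of each group. This is only a sketch of the idea, not necessarily the code the answer originally showed; the class and method names (FileDeduplication, DistinctByNameThenSize) are made up for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of the question's code.
public static class FileDeduplication
{
    public static List<string> DistinctByNameThenSize(IEnumerable<string> paths)
    {
        return paths
            .GroupBy(p => Path.GetFileName(p))     // paths sharing a file name form one group
            .Select(g => g.First())                // keep the first path of each name group
            .GroupBy(p => new FileInfo(p).Length)  // then group the survivors by file size
            .Select(g => g.First())                // keep the first path of each size group
            .ToList();
    }
}
```

Note that new FileInfo(p).Length throws a FileNotFoundException if the file does not exist, so this assumes every path in the list points at a real file.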
The hash code is used before Equals is ever called. Since your code gives different hash codes for items that are equal, you're not getting the desired result. Instead, you have to make sure the hash code returned is equal when the items are equal, so for example:
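For the size comparer, a sketch along these lines would keep the two methods consistent. It assumes every path refers to an existing file, since FileInfo.Length throws otherwise.

```csharp
using System.Collections.Generic;
using System.IO;

public class CustomSizeComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y)
    {
        // Two paths are "equal" when the files have the same length in bytes.
        return new FileInfo(x).Length == new FileInfo(y).Length;
    }

    public int GetHashCode(string obj)
    {
        // Hash the file length, the same value that Equals compares.
        return new FileInfo(obj).Length.GetHashCode();
    }
}
```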
However, as Piotr pointed out, this isn't exactly a good way to go about your goal, since you're going to be calling Path.GetFileName and new FileInfo a lot, respectively, which is going to be a significant performance hit, especially since you're dealing with the file system, which is not exactly known for its speed of response.

Change your GetHashCode to work on the compared value, i.e. the file size for your size comparer and the file name for the other.
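One way to make it hard to get this wrong is to derive both Equals and GetHashCode from a single key selector. The KeyComparer class below is a hypothetical helper written for illustration, not something from the question or from the framework.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: compares items by a key extracted from each item.
public class KeyComparer<T, TKey> : IEqualityComparer<T>
{
    private readonly Func<T, TKey> _keySelector;
    private readonly IEqualityComparer<TKey> _keyComparer = EqualityComparer<TKey>.Default;

    public KeyComparer(Func<T, TKey> keySelector)
    {
        _keySelector = keySelector;
    }

    // Both methods go through the same key, so they cannot disagree.
    public bool Equals(T x, T y)
    {
        return _keyComparer.Equals(_keySelector(x), _keySelector(y));
    }

    public int GetHashCode(T obj)
    {
        return _keyComparer.GetHashCode(_keySelector(obj));
    }
}

// Possible usage with the question's list:
//   FilesList.Distinct(new KeyComparer<string, string>(p => Path.GetFileName(p)))
//            .Distinct(new KeyComparer<string, long>(p => new FileInfo(p).Length));
```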
According to this answer, What's the role of GetHashCode in the IEqualityComparer<T> in .NET?, the hash code is evaluated first. Equals is only called when two hash codes collide.

Obviously it would be sensible to work on FileInfos, not on strings. So maybe:
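A rough sketch of that approach, with the paths converted to FileInfo objects once up front. The names FileNameComparer, FileSizeComparer and FileInfoQuery are placeholders chosen for this example.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Placeholder comparer: equality by file name (without the directory).
public class FileNameComparer : IEqualityComparer<FileInfo>
{
    public bool Equals(FileInfo x, FileInfo y) { return x.Name == y.Name; }
    public int GetHashCode(FileInfo obj) { return obj.Name.GetHashCode(); }
}

// Placeholder comparer: equality by file length in bytes.
public class FileSizeComparer : IEqualityComparer<FileInfo>
{
    public bool Equals(FileInfo x, FileInfo y) { return x.Length == y.Length; }
    public int GetHashCode(FileInfo obj) { return obj.Length.GetHashCode(); }
}

public static class FileInfoQuery
{
    public static IEnumerable<FileInfo> DistinctFiles(IEnumerable<string> filesList)
    {
        return filesList
            .Select(path => new FileInfo(path))  // create each FileInfo only once
            .Distinct(new FileNameComparer())
            .Distinct(new FileSizeComparer());
    }
}
```

The comparers then reuse the same FileInfo instance for every comparison instead of constructing a new one on each call.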
Of course, then you have to change your comparers to work on the correct type.