Old question
My understanding is that C# has, in some sense, both HashSet and set types. I understand what HashSet is. But why is set a separate word? Why isn't every set simply a HashSet<Object>?
New question
Why does C# have no generic Set type, similar to the Dictionary type? From my point of view, I would like to have a set with standard lookup/addition/deletion performance. I wouldn't care much whether it is implemented with hashes or something else. So why not make a set class that would actually be implemented as a HashSet in this version of C# but perhaps somewhat differently in a future version?
Or why not at least an ISet interface?
Answer
Learned thanks to everyone who answered below: ICollection provides a lot of what you'd expect from ISet. From my point of view, though, ICollection extends IEnumerable, while sets don't have to be enumerable. Example: the set of real numbers between 1 and 2 (moreover, sets can be generated dynamically). I agree this is a minor rant, as 'normal programmers' rarely need uncountable sets.
OK, I think I get it. HashSet was absolutely meant to be called Set, but the word Set is reserved in some sense. More specifically, the creators of the .NET architecture wanted to have a consistent set (sic!) of classes for different languages. This means that the name of a standard class must not coincide with any keyword in the .NET languages. The word Set, however, is used in VB.NET, which is actually case-insensitive (is it?), so unfortunately there is no room for manoeuvre there.
Mystery solved :)
Epilogue
The new answer by Alex Y. links to the MSDN page which describes the upcoming .NET 4.0 interface ISet, which behaves pretty much as I thought it should and is implemented by HashSet. Happy end.
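A minimal sketch of what coding against the .NET 4.0 ISet<T> interface can look like. The class name ISetSketch and the helper AddIfNew are illustrative assumptions, not anything from the MSDN page; HashSet<T> and SortedSet<T> are two BCL types that implement the interface.

```csharp
using System;
using System.Collections.Generic;

public static class ISetSketch
{
    // Accepts any ISet<T>; the caller decides which implementation backs it.
    public static bool AddIfNew(ISet<string> names, string name)
    {
        // ISet<T>.Add returns false when the element is already present.
        return names.Add(name);
    }

    public static void Main()
    {
        ISet<string> hashed = new HashSet<string>();
        ISet<string> sorted = new SortedSet<string>();

        Console.WriteLine(AddIfNew(hashed, "a")); // True: first insert
        Console.WriteLine(AddIfNew(hashed, "a")); // False: duplicate
        Console.WriteLine(AddIfNew(sorted, "a")); // True: a different set instance
    }
}
```

The calling code never names the hashing strategy, which is roughly what the question asked for.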
There is no Set<T>. This BCL team blog post has lots of details on HashSet, including a not entirely conclusive discussion on including Hash in the name. I suspect not everyone on the BCL team liked the decision to use the name HashSet<T>.
Ah right, I understand your question now.
Not sure I can 100% see the need for an ISet<T>.
I guess the question is: which behaviour do you see as essential for a set?
Is it Add, Remove, Contains, etc.? If so, then ICollection<T> already provides an interface for that.
If it's set operations such as Union, Intersect, etc., then is that something you'd consider generic enough to abstract out to a contract-style enforcement?
I have to say I don't know the right answer to this one. I think it's open to debate, and I suspect the BCL team may end up putting something like this in a future version, but that's up to them. I personally don't see it as a massive missing piece of functionality.
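To make the ICollection<T> point concrete, here is a small sketch that depends only on that interface. The names CollectionContractSketch and AddAll are hypothetical. It also shows one thing the interface does not promise: whether duplicates are kept depends entirely on the implementation passed in.

```csharp
using System;
using System.Collections.Generic;

public static class CollectionContractSketch
{
    // Depends only on ICollection<T>, which offers Add, Remove, Contains and Count.
    public static int AddAll(ICollection<int> target, IEnumerable<int> items)
    {
        foreach (int item in items)
        {
            target.Add(item); // ICollection<T>.Add returns void
        }
        return target.Count;
    }

    public static void Main()
    {
        int[] input = { 1, 2, 2, 3 };
        Console.WriteLine(AddAll(new HashSet<int>(), input)); // 3: the duplicate is dropped
        Console.WriteLine(AddAll(new List<int>(), input));    // 4: the duplicate is kept
    }
}
```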
Original Post
The BCL doesn't have a Set collection at all, at least not as far as I know.
There are a few third-party set libraries out there, like Iesi.Collections.
HashSet<T> was introduced in .NET 3.5 to create a fast set collection, i.e. where you want a collection with no duplicates. It also has typical set operations such as union and intersection (UnionWith, IntersectWith). Check out this link from the BCL team on HashSet.
You'd typically use it where previously you had to use List<T> and check for duplicates when adding.
Adding items to a HashSet<T> can also be significantly faster than with a List<T> that has to be scanned for duplicates.
Some further details:
Another nice feature of HashSet is that it doesn't throw an exception if you try to add a duplicate; Add simply returns false and leaves the set unchanged, which saves you from putting try/catch blocks around every add. Nice :)
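A short sketch contrasting the two approaches described above. The class name DuplicateCheckSketch and the helper AddUnique are illustrative assumptions; the HashSet<T> calls (Add, UnionWith, Count) are the real BCL members.

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateCheckSketch
{
    // With List<T> the caller must check for duplicates; Contains is a linear scan.
    public static bool AddUnique(List<string> list, string item)
    {
        if (list.Contains(item))
        {
            return false;
        }
        list.Add(item);
        return true;
    }

    public static void Main()
    {
        var list = new List<string>();
        var set = new HashSet<string>();

        Console.WriteLine(AddUnique(list, "x")); // True
        Console.WriteLine(AddUnique(list, "x")); // False
        Console.WriteLine(set.Add("x"));         // True
        Console.WriteLine(set.Add("x"));         // False, and no exception

        // UnionWith merges another sequence into the set in place.
        var other = new HashSet<string> { "x", "y" };
        set.UnionWith(other);
        Console.WriteLine(set.Count);            // 2
    }
}
```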
I'm pretty sure there's no Set<T> class in the BCL, at least in .NET 3.5 (and not .NET 4.0 either, it seems). What would you expect the need for such a class to be, anyway?
HashSet<T> is itself just an ordinary set data structure that uses hash codes (the GetHashCode method of an object) to locate elements, with Equals confirming a match. This is simply an efficient way of implementing a set type. (Other methods for checking equality would likely have lower performance.)
(Your original question about set has been answered. IIRC, "set" is the word with the most different meanings in the English language... obviously this has an impact in computing too.)
I think it's fine to have HashSet<T> with that name, but I'd certainly welcome an ISet<T> interface. Given that HashSet<T> only arrived in .NET 3.5 (which in itself was surprising), I suspect we may eventually get a more complete collection of set-based types. In particular, the equivalent of Java's LinkedHashSet, which maintains insertion order, would be useful in some cases.
To be fair, the ICollection<T> interface actually covers most of what you'd want in ISet<T>, so maybe that isn't required. However, you could argue that the core purpose of a set (which is mostly about containment, and only tangentially about being able to iterate over the elements) isn't quite the same as that of a collection. It's tricky. In fact, a truly mathematical set may not be iterable or countable: for instance, you could have "the set of real numbers between 1 and 2." If you had an arbitrary-precision numeric type, the count would be infinite and iterating over it wouldn't make any sense.
Likewise, the idea of "adding" to a set doesn't always make sense. Mutability is a tricky business when naming collections :(
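The containment-only view of a set can be sketched as follows. Everything here is hypothetical: IMembershipSet<T> is not a BCL interface, OpenInterval is an illustrative class, double only approximates the real numbers, and treating the bounds as exclusive is an assumption about what "between 1 and 2" means.

```csharp
using System;

// Hypothetical interface, not part of the BCL: a set defined only by containment.
public interface IMembershipSet<T>
{
    bool Contains(T item);
}

// The numbers strictly between two bounds, approximated with double.
public sealed class OpenInterval : IMembershipSet<double>
{
    private readonly double low;
    private readonly double high;

    public OpenInterval(double low, double high)
    {
        this.low = low;
        this.high = high;
    }

    // Membership is a predicate; there is nothing to enumerate, count or add to.
    public bool Contains(double item)
    {
        return item > low && item < high;
    }
}

public static class MembershipSketch
{
    public static void Main()
    {
        IMembershipSet<double> between = new OpenInterval(1.0, 2.0);
        Console.WriteLine(between.Contains(1.5)); // True
        Console.WriteLine(between.Contains(2.0)); // False: the bound is excluded
    }
}
```

Such a type could not sensibly implement ICollection<T>, since it has no Count, Add or enumerator to offer.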
EDIT: Okay, responding to the comment: the keyword set is in no way a legacy to do with Visual Basic. It's the operation which sets the value of a property, vs get, which retrieves the value. This has nothing to do with the idea of a set as a collection.
Imagine that instead the keywords were actually fetch and assign: a property would have a fetch part that returns the value and an assign part that stores it. Is the purpose clear there? The real equivalent of that in C# is just get and set.
So if you read a property's value, that will use the get part of the property. If you assign to the property, that will use the set part.
Is that any clearer?
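A minimal sketch of a property with both parts, in case it helps. The names Counter, Foo, fooField and myObject are illustrative assumptions; only the get, set and value keywords are the language features under discussion.

```csharp
using System;

public class Counter
{
    private int fooField; // backing field; the name is illustrative

    public int Foo
    {
        get { return fooField; }  // runs when the property is read
        set { fooField = value; } // runs when the property is assigned; value is the new value
    }
}

public static class PropertySketch
{
    public static void Main()
    {
        var myObject = new Counter();
        myObject.Foo = 10;     // uses the set part
        int x = myObject.Foo;  // uses the get part
        Console.WriteLine(x);  // 10
    }
}
```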
Set is a reserved keyword in VB.NET (it's the equivalent of set in C#). VB.NET can use classes, methods, etc. with the same name as a keyword, but they have to be written between square brackets, which is ugly.
set is a C# language keyword that has been around since version 1.0. It is used to define the value-assigning part of a property (and get is used to implement the value-reading part of a property). In this context you should understand the word 'set' as a verb, as in setting a value.
HashSet<T> is a particular implementation of the mathematical concept of a set. It was first introduced in .NET 3.5. This blog post by the BCL team explains more about the reasoning behind it, as well as giving some clues as to why the name is HashSet<T> and not just Set<T>: http://blogs.msdn.com/bclteam/archive/2006/11/09/introducing-hashset-t-kim-hamilton.aspx.
In the case of HashSet<T> you should understand the word 'set' as a noun.