For example, I often see this:
Set<Integer> s = new TreeSet<Integer>();
Set<Integer> s = new HashSet<Integer>();
Map<Integer, String> m = new HashMap<Integer, String>();
over
TreeSet<Integer> ts = new TreeSet<Integer>();
HashSet<Integer> hs = new HashSet<Integer>();
HashMap<Integer, String> hm = new HashMap<Integer, String>();
What are the advantages/disadvantages of the former vs. the latter?
You should read On Understanding Data Abstraction, Revisited by William R. Cook and also his Proposal for Simplified, Modern Definitions of "Object" and "Object Oriented".
Basically: if you use Java classes as anything other than factories, i.e. if a class name appears anywhere except after a new operator, then you are not doing object-oriented programming. Following this rule does not guarantee that you are doing OO, but violating it means that you aren't. Note: there's nothing wrong with not doing OO.
The maintenance of an application can cost three times as much as the development. This means you want the code to be as simple and as clear as possible.
If you use a List instead of an ArrayList, you make it clear that you are not using any method specific to ArrayList and that it can be swapped for another List implementation. The problem with declaring an ArrayList when you don't have to is that it takes a long time to determine safely that it could really have been a List, i.e. it's very hard to prove you never needed something. (It is much easier to add something than to remove it.)
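A minimal sketch of that point, assuming a hypothetical class ListDeclarationSketch and helper sum (neither comes from the question): because everything except the constructor call is written against List, swapping the implementation touches a single expression.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical illustration; the class and method names are made up.
class ListDeclarationSketch {
    // Depends only on the List interface, so any implementation is accepted.
    static int sum(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        // The constructor call is the only place the concrete class is named.
        List<Integer> values = new ArrayList<>();
        values.add(1);
        values.add(2);
        values.add(3);
        System.out.println(sum(values)); // prints 6

        // Switching to another implementation changes only the 'new' expression.
        List<Integer> linked = new LinkedList<>(values);
        System.out.println(sum(linked)); // prints 6
    }
}
```

Had values been declared as ArrayList<Integer>, a reader could not tell at a glance whether something like ensureCapacity or trimToSize was being relied on somewhere.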
A similar example is using Vector when a List will do. If you see a Vector, you say: the developer chose a Vector for a good reason, it is thread-safe. But I need to change it now and check that the code is still thread-safe. Then you say: I can't see how it is used in a multi-threaded way, so I have to check all the ways it could possibly be used. Or do I need to add synchronized when I iterate over it? All this when it actually never needed to be thread-safe. Using a thread-safe collection when it doesn't need to be is not just a waste of CPU time but, more importantly, a waste of the developer's time.
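To make the Vector concern concrete, here is a small sketch under the assumption of a hypothetical class VectorVersusListSketch. Individual Vector methods are synchronized, but iterating is a compound operation, so code that shares a Vector between threads still has to hold the lock for the whole loop. That is exactly the question a maintainer is forced to ask when they see Vector in a declaration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

// Hypothetical illustration; the class and method names are made up.
class VectorVersusListSketch {
    // Written against List: makes no claim about thread safety.
    static int countPositive(List<Integer> values) {
        int count = 0;
        for (int v : values) {
            if (v > 0) {
                count++;
            }
        }
        return count;
    }

    // If the Vector really is shared between threads, the iteration must
    // hold the Vector's own lock, otherwise a concurrent modification can
    // make the iterator throw ConcurrentModificationException.
    static int countPositiveLocked(Vector<Integer> shared) {
        synchronized (shared) {
            return countPositive(shared);
        }
    }

    public static void main(String[] args) {
        // Single-threaded use: a plain List says so in the declaration.
        List<Integer> local = new ArrayList<>();
        local.add(-1);
        local.add(2);
        local.add(3);
        System.out.println(countPositive(local)); // prints 2

        // A Vector in the declaration signals "shared across threads".
        Vector<Integer> shared = new Vector<>(local);
        System.out.println(countPositiveLocked(shared)); // prints 2
    }
}
```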
For me it comes down to a number of points.
Do you care about the implementation? Does your code need to know that the Map is a HashMap or a TreeMap? Or does it just care that it's got a key/value structure of some kind?
It also means that when I'm building my implementation code, if I expose a method that returns Map, I can change the implementation over time without affecting any code that relies on it (which is why it's a bad idea to try to cast these types of values).
The other is that it becomes easier to move these structures around the code, such that any method that can accept a Map is going to be easier to deal with than one that relies on a HashMap, for instance.
The convention (that I follow) is basically to use the lowest functional interface that meets the need of the API. There is no point using an interface that does not provide the functionality your API needs (for example, if you need a SortedMap, there is no point using a Map).
IMHO
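As a rough sketch of the "lowest functional interface" convention, assuming a hypothetical class LeastSpecificInterfaceSketch with made-up methods describe and smallestId: the first method only looks up a key, so it asks for Map; the second relies on ordering, so it asks for SortedMap and nothing more specific.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical illustration; the class and method names are made up.
class LeastSpecificInterfaceSketch {
    // Only needs key/value lookup, so Map is enough.
    static String describe(Map<Integer, String> names, int id) {
        return names.getOrDefault(id, "unknown");
    }

    // Needs the keys to be ordered, so SortedMap is the least specific
    // interface that still provides firstKey().
    static int smallestId(SortedMap<Integer, String> names) {
        return names.firstKey();
    }

    public static void main(String[] args) {
        SortedMap<Integer, String> sorted = new TreeMap<>();
        sorted.put(2, "two");
        sorted.put(1, "one");

        Map<Integer, String> hashed = new HashMap<>(sorted);

        // Both implementations are accepted where only Map is required.
        System.out.println(describe(sorted, 2)); // prints two
        System.out.println(describe(hashed, 9)); // prints unknown

        // Only a SortedMap is accepted where ordering is required.
        System.out.println(smallestId(sorted)); // prints 1
    }
}
```

Passing hashed to smallestId would not compile, which is the point: the signature documents what the method actually depends on.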
Generally, you want to declare the most general type that has the behavior you're actually using. That way you don't have to change as much code if you decide to switch to a different concrete class, and you give callers of the function more freedom.