Why continue to use getters with immutable objects

Posted 2019-02-16 07:44

Question:

Using immutable objects has become more and more common, even when the program at hand is never meant to be run in parallel. And yet we still use getters, which require 3 lines of boilerplate for every field and 5 extra characters on every access (in your favorite mainstream OO language). While this may seem trivial, and many editors remove most of the burden from the programmer anyway, it is still seemingly unnecessary effort.

What are the reasons for the continued use of accessors versus direct field access of immutable objects? Specifically, are there advantages to forcing the user to use accessors (for the client or library writer), and if so what are they?


Note that I am referring to immutable objects, unlike this question, which refers to objects in general. To be clear, there are no setters on immutable objects.

Answer 1:

I'd say this is actually language-dependent. If you'll excuse me I'll talk about C# a bit, since I think it'll help answer this question.

I'm not sure if you're familiar with C#, but its design, tools, etc. are very intuitive and programmer-friendly.
One feature of C# (which also exists in Python, D, etc.) that helps this is the property; a property is basically a pair of methods (a getter and/or a setter) which, on the outside, look just like an instance field: you can assign to it and you can read from it just like an instance variable.
Internally, of course, it's a method, and it can do anything.

But C# data types also sometimes have GetXYZ() and SetXYZ() methods, and sometimes they even expose their fields directly... and that raises the question: how do you choose which to do when?

Microsoft has a great guideline for C# properties and when to use getters/setters instead:

Properties should behave as if they are fields; if the method cannot, it should not be changed to a property. Methods are better than properties in the following situations:

  • The method performs a time-consuming operation. The method is perceivably slower than the time that is required to set or get the value of a field.
  • The method performs a conversion. Accessing a field does not return a converted version of the data that it stores.
  • The Get method has an observable side effect. Retrieving the value of a field does not produce any side effects.
  • The order of execution is important. Setting the value of a field does not rely on the occurrence of other operations.
  • Calling the method two times in succession creates different results.
  • The method is static but returns an object that can be changed by the caller. Retrieving the value of a field does not allow the caller to change the data that is stored by the field.
  • The method returns an array.

Notice that the entire goal of these guidelines is to make all properties look like fields externally.

So the only real reasons to use properties instead of fields would be:

  1. You want encapsulation, yada yada.
  2. You need to verify the input.
  3. You need to retrieve the data from (or send the data to) somewhere else.
  4. You need forwards binary (ABI) compatibility. What do I mean? If at some point down the road you decide you need to add some sort of verification (for example), then changing a field to a property and recompiling your library will break any other binaries that depend on it. But, at the source-code level, nothing will change (unless you're taking addresses/references, which you probably shouldn't be anyway).

Now let's get back to Java/C++, and immutable data types.

Which of those points apply to our scenario?

  1. Sometimes it doesn't apply, because the whole point of an immutable data structure is to store data, not to have (polymorphic) behavior (say, the String data type).
    What's the point of storing data if you're going to hide it and do nothing with it?
    But sometimes it does apply (e.g. say you have an immutable tree) -- you might not want to expose metadata.
    But then in that case, you would obviously hide the data you don't want to expose, and you wouldn't have been asking this question in the first place! :)
  2. Doesn't apply; there's no input to verify because nothing is changing.
  3. Doesn't apply, otherwise you can't use fields!
  4. May or may not apply.

Now Java and C++ don't have properties, but methods take their place -- and so the advice above still applies, and the rule for languages without properties becomes:

If (1) you don't need ABI compatibility, and (2) your getter would behave just like a field (i.e. it satisfies the requirements in the MSDN documentation above), then you should use a field instead of a getter.
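As a rough Java sketch of what that rule permits (the Point class and its member names are hypothetical, not taken from the guidelines): the stored values behave exactly like fields, so they are exposed as fields, while the derived value performs a computation and therefore stays a method.

```java
// Hypothetical immutable value type. Both fields are final primitives,
// so exposing them directly cannot leak any mutable state.
final class Point {
    public final int x;
    public final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // A derived value involves a computation, so it is a method rather than a field.
    double distanceFromOrigin() {
        return Math.hypot(x, y);
    }
}

class PointDemo {
    public static void main(String[] args) {
        Point p = new Point(3, 4);
        System.out.println(p.x + ", " + p.y);       // direct field reads
        System.out.println(p.distanceFromOrigin()); // the call syntax signals that work is done
    }
}
```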

The important point to realize is that none of this is philosophical; these guidelines are all based on what the programmer expects. Obviously, the goal at the end of the day is to (1) get the job done, and (2) keep the code readable/maintainable. The guide above has been found to be helpful in making this happen -- and your goal should be to do whatever suits your fancy that will make that happen.



Answer 2:

Encapsulation serves several useful purposes, but the most important one is that of information hiding. By hiding the field as an implementation detail, you protect clients of the object from depending on there actually being a field there. For example, a future version of your object may want to compute or fetch the value lazily, and that can only be done if you can intercept a request to read the field.

That said, there is no reason for getters to be particularly verbose. In the Java world in particular, even where the "get" prefix is very well entrenched, you'll still find getter methods named after the value itself (that is, a method foo() instead of getFoo()), and that's a fine way to save a few characters. In many other OO languages, you can define a getter and still use syntax that looks like a field access, so there's no extra verbosity at all.
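To make both points concrete, here is a small sketch (Report, its fields, and summary() are made-up names for illustration). The accessor is named after the value rather than getSummary(), and because clients only ever call the method, the class is free to compute the value lazily without any caller noticing.

```java
// Hypothetical class: clients only ever call summary().
final class Report {
    private final String title;
    private final int pageCount;
    private String summary; // cache; stays null until the first call to summary()

    Report(String title, int pageCount) {
        this.title = title;
        this.pageCount = pageCount;
    }

    // An earlier version could have returned a precomputed field here.
    // This version computes on first use; the calling code is unchanged.
    String summary() {
        String s = summary;
        if (s == null) {
            s = title + " (" + pageCount + " pages)";
            // Unsynchronized on purpose: racing threads would each compute
            // an equal, immutable String, so the worst case is duplicate work.
            summary = s;
        }
        return s;
    }
}

class ReportDemo {
    public static void main(String[] args) {
        Report r = new Report("Q1", 12);
        System.out.println(r.summary()); // prints "Q1 (12 pages)"
    }
}
```

Had summary been a public final field, switching to lazy computation would have meant changing every call site.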



Answer 3:

Immutable objects should use direct field access for uniformity and because it allows one to design objects that perform exactly how the client expects they should.

Consider a system where every mutable field was hidden behind accessors while every immutable field was not. Now consider the following code snippet:

class Node {
    private final List<Node> children;

    Node(List<Node> children) {
        this.children = new LinkedList<>(children);
    }

    public List<Node> getChildren() {
        return /* Something here */;
    }
}

Without knowing the exact implementation of Node, as you must do when you design by contract, anywhere you see root.getChildren(), you can only assume one of three things is occurring:

  • Nothing. The field children is returned as is, and you can't modify the list because you will break the immutability of Node. In order to modify the List you must copy it, an O(n) operation.
  • It is copied, for example: return new LinkedList<>(children);. This is an O(n) operation. You can modify this list.
  • An unmodifiable version is returned, for example: return new UnmodifiableList<>(children);. This is an O(1) operation. Again, in order to modify this List you must copy it, an O(n) operation.

In all cases, modifying the returned list requires an O(n) operation to copy it, while read-only access takes anywhere from O(1) to O(n). The important thing to note here is that by following design by contract you cannot know which implementation the library writer chose and thus must assume the worst case, O(n). Hence, O(n) access and O(n) to create your own modifiable copy.
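For reference, the three possibilities can be written out side by side. This is only a sketch: the class is renamed AccessorNode, each option gets its own hypothetical method name so they can coexist, and Collections.unmodifiableList from the JDK stands in for the UnmodifiableList used in the text.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class AccessorNode {
    private final List<AccessorNode> children;

    AccessorNode(List<AccessorNode> children) {
        this.children = new ArrayList<>(children); // private copy, O(n) once
    }

    // Option 1: return the field as is. O(1), but a caller who mutates
    // the returned list silently breaks the node's immutability.
    List<AccessorNode> childrenRaw() {
        return children;
    }

    // Option 2: defensive copy. O(n) on every call; the caller may mutate the result.
    List<AccessorNode> childrenCopy() {
        return new ArrayList<>(children);
    }

    // Option 3: read-only view. O(1); mutation attempts throw
    // UnsupportedOperationException.
    List<AccessorNode> childrenView() {
        return Collections.unmodifiableList(children);
    }
}

class AccessorNodeDemo {
    public static void main(String[] args) {
        AccessorNode leaf = new AccessorNode(Collections.emptyList());
        AccessorNode root = new AccessorNode(Collections.singletonList(leaf));
        root.childrenCopy().clear();                    // only the copy is cleared
        System.out.println(root.childrenView().size()); // prints 1
    }
}
```

A caller holding only the signature List<AccessorNode> getChildren() cannot tell which of these three bodies is behind it.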

Now consider the following:

class Node {
    public final UnmodifiableList<Node> children;

    Node(List<Node> children) {
        this.children = new UnmodifiableList<>(children);
    }
}

Now, everywhere you see root.children, there is exactly one possibility, namely it is an UnmodifiableList and thus you can assume O(1) access and O(n) for creating a locally mutable copy.
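One caveat when trying this in plain Java: as far as I know the JDK does not expose a public type named UnmodifiableList, so the snippet above assumes a custom or third-party class. A sketch using only the standard library (FieldNode is a hypothetical name) has to declare the field as List and rely on the constructor for the guarantee:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class FieldNode {
    // Declared as List because the wrapper type returned by the JDK is not public.
    // The read-only guarantee comes from the constructor, not from the declared type.
    public final List<FieldNode> children;

    FieldNode(List<FieldNode> children) {
        // Copy first so later changes to the caller's list are not visible here,
        // then wrap so nobody can mutate through the public field.
        this.children = Collections.unmodifiableList(new ArrayList<>(children));
    }
}

class FieldNodeDemo {
    public static void main(String[] args) {
        List<FieldNode> source = new ArrayList<>();
        source.add(new FieldNode(Collections.emptyList()));
        FieldNode root = new FieldNode(source);
        source.clear();                           // does not affect root
        System.out.println(root.children.size()); // prints 1
    }
}
```

The O(1) read access is the same, but the static type no longer tells the reader that the list is unmodifiable, which is exactly the information the dedicated type in the answer is meant to carry.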

Clearly, one can draw conclusions about the performance characteristics of accessing the field in the latter case, whereas the only conclusion that can be made in the former is that the performance, in the worst case, and thus the case we must assume, is far worse than the direct field access. As a reminder, that means the programmer must take into account an O(n) complexity function on every access.


In summary, with this type of system, wherever one sees a getter the client automatically knows that either the getter corresponds to a mutable field, or the getter performs some sort of operation, whether it be a time consuming O(n) defensive copy operation, lazy initialization, conversion, or otherwise. Whenever the client sees a direct field access, they immediately know the performance characteristics of accessing that field.

By following this style, more information can be inferred by the programmer as to the contract provided by the object he/she is interacting with. This style also promotes uniform immutability because as soon as you change the above snippet's UnmodifiableList to the interface List, the direct field access allows the object to be mutated, thus forcing your object hierarchy to be carefully designed to be immutable from top to bottom.

The good news is, not only do you gain all the benefits of immutability, you are also able to infer the performance characteristics of accessing a field no matter where it is, without looking at the implementation and with confidence that it will never change.



Answer 4:

Joshua Bloch, in Effective Java (2nd Edition) "Item 14: In public classes, use accessor methods, not public fields," has the following to say about exposing immutable fields:

While it’s never a good idea for a public class to expose fields directly, it is less harmful if the fields are immutable. You can’t change the representation of such a class without changing its API, and you can’t take auxiliary actions when a field is read, but you can enforce invariants.

and summarizes the chapter with:

In summary, public classes should never expose mutable fields. It is less harmful, though still questionable, for public classes to expose immutable fields.
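To illustrate the "you can enforce invariants" part, here is a minimal sketch of my own (Percentage is a hypothetical class, not code from the book). Because the field is final and only the constructor assigns it, the range check there holds for the lifetime of the object even though the field is public.

```java
// Hypothetical value class with a public immutable field.
final class Percentage {
    public final int value; // invariant: always within 0..100

    Percentage(int value) {
        if (value < 0 || value > 100) {
            throw new IllegalArgumentException("value out of range: " + value);
        }
        this.value = value;
    }
}

class PercentageDemo {
    public static void main(String[] args) {
        System.out.println(new Percentage(50).value); // prints 50
        try {
            new Percentage(101);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

What this design cannot do, as the quote says, is change the int representation later or run any code when value is read.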



Answer 5:

You can have public final fields (to imitate some kind of immutability), but that doesn't mean the referenced objects can't change their state. We still need a defensive copy in some cases.

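import java.util.ArrayList;
import java.util.List;

public class Temp {
    public final List<Integer> list;

    public Temp() {
        this.list = new ArrayList<Integer>();
        this.list.add(42);
    }

    public static void foo() {
        Temp temp = new Temp();
        // temp.list = null; // not valid: does not compile, a final field cannot be reassigned
        temp.list.clear(); // perfectly fine: the reference didn't change, but the list's contents did
    }
}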
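One case where the defensive copy really does need an accessor is a component type that is mutable and has no read-only wrapper, such as java.util.Date. A minimal sketch, with Appointment and start() as made-up names:

```java
import java.util.Date;

// Hypothetical immutable class holding a mutable Date.
final class Appointment {
    private final Date start; // Date is mutable, so this reference must never escape

    Appointment(Date start) {
        this.start = new Date(start.getTime()); // copy in: detach from the caller's object
    }

    Date start() {
        return new Date(start.getTime()); // copy out: callers get their own instance
    }
}

class AppointmentDemo {
    public static void main(String[] args) {
        Date d = new Date(1000L);
        Appointment a = new Appointment(d);
        d.setTime(5L);         // mutating the original has no effect on a
        a.start().setTime(7L); // mutating a returned copy has no effect either
        System.out.println(a.start().getTime()); // prints 1000
    }
}
```

A public final Date field would let any client call setTime on the shared instance.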


Answer 6:

What are the reasons for the continued use of accessors versus direct field access of immutable objects? Specifically, are there advantages to forcing the user to use accessors (for the client or library writer), and if so what are they?

You sound like a procedural programmer asking why you cannot access fields directly but have to create accessors. The main problem is that even the way you put your question is wrong. This is not how OO design works: you design an object's behavior through its methods and expose that. Then, if necessary, you create the internal fields you need to implement that behavior. So putting it as "I am creating these fields and then exposing each one through a getter, and this is verbose" is a clear sign of improper OO design.



Answer 7:

It's an OOP practice to encapsulate fields and then expose them only through getter methods. If you expose a field directly, you have to make it public. Making fields public is not a good idea, as it exposes the inner state of the object.

So making your fields/data members public is not good practice, and it violates the encapsulation principle of OOP. Also, I would say it's not specific to immutable objects; it is true for mutable objects as well.

Edit: As pointed out by @Thilo, another reason: maybe you don't need to expose how a field is stored.

Thanks, @Thilo.



Answer 8:

One very practical reason for the continued practice of generating (I hope nobody writes them by hand nowadays) getters in Java programs, even for immutable "value" objects where, in my opinion, they are unnecessary overhead:

Many libraries and tools rely on the old JavaBeans conventions (or at least the getters and setters part of it).

These tools, which use reflection or other dynamic techniques to access field values via getters, cannot handle simple public fields. JSP is one example that comes to mind.
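The JDK's own java.beans.Introspector shows the effect. In the sketch below (BeanDemo, WithGetter, WithField, and readableProperties are hypothetical names), the class with a getter reports a "name" property, while the class with only a public final field reports none.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;

class BeanDemo {
    // Immutable value exposed the JavaBeans way.
    public static class WithGetter {
        private final String name;
        public WithGetter(String name) { this.name = name; }
        public String getName() { return name; }
    }

    // Immutable value exposed as a public final field.
    public static class WithField {
        public final String name;
        public WithField(String name) { this.name = name; }
    }

    // Lists the readable bean properties the Introspector finds on a type.
    // Object.class is the stop class, so the inherited "class" property is excluded.
    static List<String> readableProperties(Class<?> type) {
        try {
            List<String> names = new ArrayList<>();
            for (PropertyDescriptor pd
                    : Introspector.getBeanInfo(type, Object.class).getPropertyDescriptors()) {
                if (pd.getReadMethod() != null) {
                    names.add(pd.getName());
                }
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readableProperties(WithGetter.class)); // prints [name]
        System.out.println(readableProperties(WithField.class));  // prints []
    }
}
```

Any tool built on this kind of introspection simply does not see the public field.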

Also, modern IDEs make it trivial to generate getters for one or many fields at a time, and to rename the getter when the field is renamed.

So we just keep writing getters even for immutable objects.