In class today, my professor was discussing how to structure a class. The course primarily uses Java and I have more Java experience than the teacher (he comes from a C++ background), so I mentioned that in Java one should favor immutability. My professor asked me to justify my answer, and I gave the reasons that I've heard from the Java community:
- Safety (especially with threading)
- Reduced object count
- Allows certain optimizations (especially for garbage collector)
The professor challenged my statement by saying that he'd like to see some statistical measurement of these benefits. I cited a wealth of anecdotal evidence, but even as I did so, I realized he was right: as far as I know, there hasn't been an empirical study of whether immutability actually provides the benefits it promises in real-world code. I know it does from experience, but others' experiences may differ.
So, my question is, have there been any statistical studies done on the effects of immutability in real-world code?
Immutability is a good thing for value objects. But what about other things? Imagine an object that keeps a running statistic.
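Something along these lines, where the `Stats` class and its method names are just an illustrative mutable sketch:

```java
import java.util.List;

// A deliberately mutable Stats object: count() increments internal state.
class Stats {
    private long rows = 0;
    private final long start = System.nanoTime();

    void count() { rows++; }  // mutates the object in place

    double rowsPerSecond() {
        double seconds = (System.nanoTime() - start) / 1e9;
        return rows / seconds;
    }
}

public class StatsDemo {
    public static void main(String[] args) {
        List<String> source = List.of("a", "b", "c");  // stand-in for real input
        Stats s = new Stats();
        for (String row : source) {
            // ... process the row ...
            s.count();
        }
        System.out.printf("Processed %.2f rows/s%n", s.rowsPerSecond());
    }
}
```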
This should print something like "Processed 536.21 rows/s". How do you plan to implement `count()` with an immutable object? Even if you use an immutable value object for the counter itself, `s` can't be immutable, since it would have to replace the counter object inside itself. The only way out would be to have `count()` return a new object and reassign it on every round of the loop (something like `s = s.count();`), which means copying the state of `s` for every iteration. While this can be done, it surely isn't as efficient as incrementing an internal counter. Moreover, most people would fail to use such an API correctly, because they would expect `count()` to modify the state of the object instead of returning a new one. So in this case, immutability would create more bugs.

I'd argue that your professor is just being obtuse -- not necessarily intentionally, or even as a bad thing. It's just that the question is too vague, and there are two real problems with it.
For what it's worth, the ability of the compiler to optimize immutable objects is well documented. Off the top of my head:

- Haskell compilers can rewrite `map f . map g` to `map (f . g)`. Since Haskell functions are immutable, these expressions are guaranteed to produce equivalent output, but the second one runs twice as fast since we don't need to create an intermediate list.
- Rewriting `x = foo(12); y = foo(12)` to `temp = foo(12); x = temp; y = temp;` is only possible if the compiler can guarantee `foo` is a pure function. To my knowledge, the D compiler can perform substitutions like this using the `pure` and `immutable` keywords.
- If I remember correctly, some C and C++ compilers will aggressively optimize calls to functions marked "pure" (or whatever the equivalent keyword is).

Regarding concurrency, the pitfalls of concurrency using mutable state are well documented and don't need to be restated.
Sure, this is all anecdotal evidence, but that's pretty much the best you'll get. The immutable vs mutable debate is largely a pissing match, and you are not going to find a paper making a sweeping generalization like "functional programming is superior to imperative programming".
At most, you'll probably find that you can summarize the benefits of immutable vs mutable in a set of best practices rather than as codified studies and statistics. For example, mutable state is the enemy of multithreaded programming; on the other hand, mutable queues and arrays are often easier to write and more efficient in practice than their immutable variants.
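To make that last point concrete, here is a toy comparison (my own sketch, not a benchmark); the copy-on-append shown on the immutable side is what you get without a purpose-built persistent data structure:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class AppendComparison {
    public static void main(String[] args) {
        int n = 10_000;

        // Mutable: amortized O(1) appends into one backing array.
        List<Integer> mutable = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            mutable.add(i);
        }

        // Naive immutable style: each "append" copies the whole list,
        // so building the list this way is O(n^2) overall.
        List<Integer> immutable = List.of();
        for (int i = 0; i < n; i++) {
            List<Integer> next = new ArrayList<>(immutable);
            next.add(i);
            immutable = Collections.unmodifiableList(next);
        }

        System.out.println(mutable.size() + " " + immutable.size());
    }
}
```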
It takes practice, but eventually you learn to use the right tool for the job, rather than shoehorning your favorite pet paradigm into every project.
As other comments have claimed, it would be very, very hard to collect statistics on the merits of immutable objects, because it would be virtually impossible to find control cases - pairs of software applications that are alike in every way except that one uses immutable objects and the other does not. (In nearly every case, I would claim, one version of that software was written some time after the other and learned numerous lessons from the first, so any improvement in performance would have many causes.) Any experienced programmer who thinks about this for a moment ought to realize it. I think your professor is trying to deflect your suggestion.
Meanwhile, it is very easy to make cogent arguments in favor of immutability, at least in Java, and probably in C# and other OO languages. As Yishai states, Effective Java makes this argument well. So does the copy of Java Concurrency in Practice sitting on my bookshelf.
I would point to Item 15 in Effective Java, "Minimize mutability". The value of immutability is in the design (it isn't always appropriate - it is just a good first approximation), and design preferences are rarely argued from a statistical point of view. But we have seen mutable objects (Calendar, Date) go really bad, and the serious replacements (JodaTime, JSR-310) have opted for immutability.
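To make the Calendar/JSR-310 contrast concrete, here is a small illustration (my own sketch, not an example from the book) of how the mutable Calendar API invites aliasing bugs that the immutable java.time types rule out:

```java
import java.time.LocalDate;
import java.util.Calendar;

public class MutableVsImmutableDates {
    public static void main(String[] args) {
        // Mutable: handing out the Calendar lets any caller change it in place.
        Calendar deadline = Calendar.getInstance();
        Calendar reminder = deadline;            // alias, not a copy
        reminder.add(Calendar.DAY_OF_MONTH, -7); // silently moves the deadline too
        System.out.println(deadline.getTime());  // deadline is now a week earlier

        // Immutable (JSR-310): "modification" returns a new object,
        // so the original date cannot change out from under anyone.
        LocalDate deadline2 = LocalDate.of(2024, 6, 30);
        LocalDate reminder2 = deadline2.minusDays(7); // new instance
        System.out.println(deadline2);                // still 2024-06-30
        System.out.println(reminder2);                // 2024-06-23
    }
}
```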
The biggest advantage of immutability in Java, in my opinion, is simplicity. It becomes much simpler to reason about the state of an object, if that state cannot change. This is of course even more important in a multi-threaded environment, but even in simple, linear single-threaded programs it can make things far easier to understand.
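For example, a minimal immutable class might look like this (the Money type here is invented purely for illustration): once constructed, its state can never change, so any code holding a reference can reason about it locally.

```java
import java.util.Objects;

// A minimal immutable value class: final class, final fields, no setters,
// and "modifying" operations return new instances instead of changing state.
public final class Money {
    private final long cents;
    private final String currency;

    public Money(long cents, String currency) {
        this.cents = cents;
        this.currency = Objects.requireNonNull(currency);
    }

    public Money plus(Money other) {
        if (!currency.equals(other.currency)) {
            throw new IllegalArgumentException("currency mismatch");
        }
        return new Money(cents + other.cents, currency); // new object, no mutation
    }

    public long cents()      { return cents; }
    public String currency() { return currency; }

    @Override
    public String toString() {
        return String.format("%d.%02d %s", cents / 100, Math.abs(cents % 100), currency);
    }
}
```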
See this page for more examples.
Immutable objects allow code which wants to share an object's value to do so by sharing a reference. Mutable objects allow code which wants to share an object's identity to do so by sharing a reference. Both kinds of sharing are essential in most applications. If one doesn't have immutable objects available, it's possible to share values by copying them into either new objects or objects supplied by the intended recipient of those values. Getting by without mutable objects is much harder. One could somewhat "fake" mutable objects by saying `stateOfUniverse = stateOfUniverse.withSomeChange(...)`, but that would require that nothing else modify `stateOfUniverse` while its `withSomeChange` method is running [precluding any sort of multi-threading]. Further, if one were e.g. trying to track a fleet of trucks, and part of the code was interested in one particular truck, it would be necessary for that code to look that truck up in a table of trucks any time it might have changed.

A better approach is to subdivide the universe into entities and values. Entities would have changeable characteristics but an immutable identity, so a storage location of e.g. type `Truck` could continue to identify the same truck even as the truck itself changes position, loads and unloads cargo, etc. Values would not generally have a particular identity, but would have immutable characteristics. A `Truck` might store its location as type `WorldCoordinate`. A `WorldCoordinate` that represents 45.6789012N 98.7654321W would continue to do so as long as any reference to it exists; if a truck at that location moved north slightly, it would create a new `WorldCoordinate` to represent 45.6789013N 98.7654321W, abandon the old one, and store a reference to the new one.

It is generally easiest to reason about code when everything encapsulates either an immutable value or an immutable identity, and when the things which are supposed to have an immutable identity are mutable. If one didn't want to use any mutable objects outside a variable `stateOfUniverse`, updating a truck's position would require rebuilding and reassigning the whole universe, and reasoning about that code would be more difficult than reasoning about a direct update of the truck itself; a sketch of both styles follows.
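Here is a minimal sketch of the two styles, assuming invented `Truck`, `WorldCoordinate`, `ImmutableTruck`, and `Universe` types; all of the names and `with.../set...` methods are illustrative only:

```java
import java.util.HashMap;
import java.util.Map;

// Value type: immutable characteristics, no identity of its own.
final class WorldCoordinate {
    final double lat, lon;
    WorldCoordinate(double lat, double lon) { this.lat = lat; this.lon = lon; }
}

// Entity: stable identity, mutable state.
final class Truck {
    private WorldCoordinate location;
    Truck(WorldCoordinate location) { this.location = location; }
    WorldCoordinate location()                 { return location; }
    void setLocation(WorldCoordinate location) { this.location = location; }
}

// All-immutable counterpart: "changing" a truck builds a new object.
final class ImmutableTruck {
    final WorldCoordinate location;
    ImmutableTruck(WorldCoordinate location) { this.location = location; }
    ImmutableTruck withLocation(WorldCoordinate l) { return new ImmutableTruck(l); }
}

// All-immutable root: every change rebuilds the containing structure.
final class Universe {
    private final Map<Integer, ImmutableTruck> trucks;
    Universe(Map<Integer, ImmutableTruck> trucks) { this.trucks = new HashMap<>(trucks); }
    ImmutableTruck getTruck(int id) { return trucks.get(id); }
    Universe withTruck(int id, ImmutableTruck t) {
        Map<Integer, ImmutableTruck> copy = new HashMap<>(trucks);
        copy.put(id, t);
        return new Universe(copy);
    }
}

public class FleetDemo {
    public static void main(String[] args) {
        WorldCoordinate newLocation = new WorldCoordinate(45.6789013, -98.7654321);

        // Entity style: the truck keeps its identity; only its state changes.
        Map<Integer, Truck> fleet = new HashMap<>();
        fleet.put(23, new Truck(new WorldCoordinate(45.6789012, -98.7654321)));
        fleet.get(23).setLocation(newLocation);

        // All-immutable style: rebuild and reassign the whole universe.
        Universe stateOfUniverse = new Universe(
                Map.of(23, new ImmutableTruck(new WorldCoordinate(45.6789012, -98.7654321))));
        stateOfUniverse = stateOfUniverse.withTruck(23,
                stateOfUniverse.getTruck(23).withLocation(newLocation));

        System.out.println(fleet.get(23).location().lat);
        System.out.println(stateOfUniverse.getTruck(23).location.lat);
    }
}
```

The entity-style update is a single, local statement; the all-immutable update has to thread the change through every enclosing structure, which is exactly what makes it harder to reason about.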