I find the term rather pejorative. Hence, I was flabbergasted by these two sentences in Wikipedia:
Imperative programming is known for employing side effects to make programs function. Functional programming in turn is known for its minimization of side effects. [1]
Since I am somewhat math-biased, the latter sounds excellent. What are the arguments for side effects? Do they mean the loss of control or the acceptance of uncertainty? Are they a good thing?
Side effects are essential for a significant part of most applications, but pure functions still have a lot of advantages. They are easier to reason about because you don't have to worry about pre- and post-conditions, and since they don't change state, they are easier to parallelize, which will become increasingly important as processor counts go up.
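To make that contrast concrete, here's a minimal C++ sketch (the function names are mine, purely illustrative): an impure function whose result depends on hidden state, and a pure one that is safe to hand to a C++17 parallel algorithm.

    #include <algorithm>
    #include <execution>
    #include <vector>

    // Impure: reads and writes state outside its own scope, so the result
    // depends on call order and the function is unsafe to run in parallel.
    int counter = 0;
    int impure_scale(int x) { return x * ++counter; }

    // Pure: the result depends only on the argument, so calls can be
    // reordered, cached, or run on any number of threads.
    int pure_scale(int x) { return x * 2; }

    int main() {
        std::vector<int> values(1'000'000, 3);
        // Because pure_scale has no side effects, a parallel policy is safe.
        std::transform(std::execution::par, values.begin(), values.end(),
                       values.begin(), pure_scale);
    }

Calling impure_scale the same way would give results that depend on which thread increments counter first, which is exactly the kind of history-dependence purity avoids.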
Side effects are inevitable, and they should be used whenever they are a better choice than a more complicated but pure solution. The same goes for pure functions: sometimes a problem is better approached with a functional solution.
It's all good =) You should use different paradigms according to the problem you're solving.
Side effects are a necessary evil, and one should seek to minimize/localize them.
Other comments on the thread say effect-free programming is sometimes not as intuitive, but I think that what people consider "intuitive" is largely a result of their prior experience, and most people's experience has a heavy imperative bias. Mainstream tools are becoming more and more functional each day, because people are discovering that effect-free programming leads to fewer bugs (though admittedly sometimes a new/different class of bugs) due to less possibility of separate components interacting via effects.
Almost no one has mentioned performance. Effect-free programming usually has worse performance than effectful programming, since computers are von Neumann machines designed to work well with effects (rather than designed to work well with lambdas). Now that we're in the midst of the multi-core revolution, this may change the game as people discover they need to take advantage of multiple cores to gain performance: whereas parallelization sometimes takes a rocket scientist to get right effect-fully, it can be easy to get right when you're effect-free.
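A small C++ sketch of that last point (my own, not from the answer above): the effectful loop mutates shared state and would be a data race if spread across threads, while the effect-free reduction hands the library a pure operation it is free to split across cores.

    #include <execution>
    #include <numeric>
    #include <vector>

    int main() {
        std::vector<int> data(1'000'000, 1);

        // Effectful version: each iteration mutates shared state (sum1).
        // Spreading this loop across cores would be a data race.
        long long sum1 = 0;
        for (int x : data) sum1 += x;  // safe only single-threaded

        // Effect-free version: std::reduce is given a pure, associative
        // operation, so the library may split the work across cores.
        long long sum2 = std::reduce(std::execution::par,
                                     data.begin(), data.end(), 0LL);

        return (sum1 == sum2) ? 0 : 1;
    }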
Every so often I see a question on SO that forces me to spend half an hour editing a really bad Wikipedia article. The article is now only moderately bad. In the part that bears on your question, the gist of what I wrote is that in the presence of side effects, a program's behavior depends on its history: the order of evaluation matters. Because understanding an effectful program requires thinking about all possible histories, side effects often make a program harder to understand.
Without side effects, you can't perform I/O operations, so you can't make a useful application.
Without side-effects, you simply can't do certain things. One example is I/O, since making a message appear on the screen is, by definition, a side-effect. This is why it's a goal of functional programming to minimize side-effects, rather than eliminate them entirely.
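One common way to act on that goal (my framing, not the answer's) is a "pure core, thin effectful shell" split: keep the logic in pure functions and confine the unavoidable I/O to one place, as in this minimal C++ sketch.

    #include <iostream>
    #include <string>

    // Pure core: all the decision logic lives here. No side effects, so
    // it is trivial to test and reason about.
    std::string greeting(const std::string& name) {
        return "Hello, " + name + "!";
    }

    // Thin effectful shell: the unavoidable I/O is confined to one place.
    int main() {
        std::string name;
        std::getline(std::cin, name);           // side effect: input
        std::cout << greeting(name) << '\n';    // side effect: output
    }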
Setting that aside, there are often instances where minimizing side-effects conflicts with other goals, like speed or memory efficiency. Other times, there's already a conceptual model of your problem that lines up well with the idea of mutating state, and fighting against that existing model can be wasted energy and effort.
That quote really made me chuckle. That said, I find the minimization of side effects to really translate to code that is much easier to reason about and maintain. However, I don't have the luxury of exploring functional programming quite as much as I would like.
When working in object-oriented and procedural languages that revolve around side effects, the way I look at it is to contain and isolate those side effects.
As a basic example, a video game has a necessary side effect of rendering graphics to a screen. However, there are two different kinds of design paths here with respect to side effects.
One seeks to minimize and loosen coupling by making the renderer very abstract: it is basically just told what to render. The other parts of the system tell the renderer what to draw, whether that's a batch of primitives like triangles and points with projection and modelview matrices, or something higher-level like abstract models, cameras, lights, and particles. Either way, such a design revolves around many things causing external side effects, since potentially many parts of the codebase will be pushing changes to the renderer. No matter how abstract or indirect the communication, the net effect is a whole bunch of things in the system triggering external rendering side effects.
The other way is to contain/isolate those side effects. Instead of being told what to render, the renderer becomes coupled to the game world (though this could just be some basic abstractions, and maybe access to a scene graph). Now it accesses the scene on its own with read-only access, looks through it, and figures out what to render: more of a pull-style design. That leads to more coupling from renderer to game world, but it also means the side effects related to screen output are now completely contained inside the renderer.
This latter design contains or isolates side effects, and I find that type of design much easier to maintain and keep correct. It still causes side effects but all the side effects related to outputting graphics to a screen are now entirely contained in the renderer. If there's an issue there, you know the bug is going to be in the renderer code and not the result of something external misusing it and telling it the wrong things to do.
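A rough C++ sketch of that pull-style shape (the Scene and Renderer types here are hypothetical, just to show the dependency direction):

    #include <vector>

    struct Model { /* geometry, materials, ... */ };

    // The game world knows nothing about rendering; it only exposes
    // read-only access to its contents.
    class Scene {
    public:
        const std::vector<Model>& models() const { return models_; }
    private:
        std::vector<Model> models_;
    };

    // Pull-style renderer: it depends on the scene, walks it on its own,
    // and decides what to draw. Every screen side effect lives in render().
    class Renderer {
    public:
        explicit Renderer(const Scene& scene) : scene_(scene) {}
        void render() const {
            for (const Model& m : scene_.models())
                draw(m);  // the only place screen output happens
        }
    private:
        void draw(const Model&) const { /* issue draw calls here */ }
        const Scene& scene_;
    };

The dependency arrow points from the renderer to the scene, so everything that touches the screen sits in one spot.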
Because of this, when it comes to coupling, I have always found it more desirable to maximize efferent (outgoing) couplings in things that cause external side effects and minimize afferent (incoming) couplings into them. This applies regardless of abstractions. In the context of side effects, a dependency to IRenderer is still a dependency to a concrete Renderer as far as communication goes with respect to what side effects are going to happen; the abstraction makes no difference to which side effects occur. The renderer should depend on the rest of the world so that it can completely isolate those side effects to the screen; the rest of the world shouldn't depend on the renderer. The same analogy applies to a file saver. The file saver shouldn't be told what to save by the outside world; it should look at the world around it and figure out what to save on its own. Such is the design path that seeks to isolate and contain side effects, and it tends to be more pull-based than push-based. The result tends to introduce a bit more coupling (though it could be loose) if you graph out the dependencies, since the saver might need to be coupled with things it's not even interested in saving, and the renderer might need read-only access to things it's not even interested in rendering in order to discover the things it is interested in rendering.
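Sketching the file-saver analogy in the same spirit (all names here are hypothetical):

    #include <fstream>
    #include <string>
    #include <vector>

    struct Document {
        std::string path;
        std::string contents;
        bool dirty = false;  // the world records state; it never calls the saver
    };

    // Pull-style saver: it depends on the documents, scans them itself,
    // and keeps every file side effect inside save_all().
    class FileSaver {
    public:
        explicit FileSaver(const std::vector<Document>& docs) : docs_(docs) {}
        void save_all() const {
            for (const Document& d : docs_) {
                if (!d.dirty) continue;  // decides on its own what to save
                std::ofstream(d.path) << d.contents;  // the only file side effect
            }
        }
    private:
        const std::vector<Document>& docs_;
    };

(A real saver would also need a way to clear the dirty flags; it's omitted here since the saver only has read-only access.)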
However, the end result is that the dependencies flow away from side effects instead of towards them. When we have a system with many dependencies flowing towards pushing external side effects, I have always found it the hardest to reason about, since so many parts of the system could potentially be changing external state; it becomes hard to figure out not only what is going to happen, but also when and where. So the most straightforward way to correct/prevent that problem is to seek to make dependencies flow away from side effects, not towards them.
Anyway, I have found favoring these types of designs a practical way to help avoid bugs, and also to help detect and isolate them when they exist, making them easier to reproduce and correct.
Another useful strategy I find is to make side effects more homogeneous for any given loop/phase of the system. For example, instead of one loop which removes associated data from something, delinks it, and then removes it, I have found it much easier to do three homogeneous loops. The first loop removes the associated data. The second loop delinks the node. The third loop removes it from the rest of the system. That's a lower-level note, related more to implementation than design, but I have often found the result easier to reason about, maintain, and even optimize (easier to parallelize, for example, and with improved locality of reference): you take those non-homogeneous loops triggering multiple different types of side effects and break them down into multiple homogeneous loops, each triggering just one uniform kind of side effect.
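A small C++ sketch of that decomposition (the Node type and its dead flag are hypothetical stand-ins):

    #include <vector>

    struct Node {
        std::vector<int> data;   // "associated data"
        Node* prev = nullptr;
        Node* next = nullptr;
        bool dead = false;
    };

    // Instead of one loop that clears data, delinks, and removes in the
    // same iteration, each pass triggers one uniform kind of side effect.
    void remove_dead(std::vector<Node*>& nodes) {
        // Pass 1: only removes associated data.
        for (Node* n : nodes)
            if (n->dead) n->data.clear();

        // Pass 2: only delinks nodes from their neighbors.
        for (Node* n : nodes)
            if (n->dead) {
                if (n->prev) n->prev->next = n->next;
                if (n->next) n->next->prev = n->prev;
            }

        // Pass 3: only removes (and frees) the dead nodes (C++20 std::erase).
        for (Node*& n : nodes)
            if (n->dead) { delete n; n = nullptr; }
        std::erase(nodes, nullptr);
    }

Each pass touches the data in exactly one way, which is what makes the whole thing easier to verify, and to parallelize pass by pass if it ever matters.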