I'm learning computer programming and have stumbled upon the concept of cohesion in several places. I understand that it is desirable for software to have "high cohesion", but what does that mean? I'm a Java, C and Python programmer learning C++ from the book C++ Primer, which mentions cohesion without listing it in the index. Could you point me to some links about this topic? I did not find the Wikipedia page on cohesion in computer science informative, since it just says it's a qualitative measure and doesn't give real code examples.
Question:
Answer 1:
High cohesion is when you have a class that does a well-defined job. Low cohesion is when a class does a lot of jobs that don't have much in common.
Let's take this example:
You have a class that adds two numbers, but the same class also creates a window displaying the result. This class has low cohesion because the window and the adding operation don't have much in common. The window is the visual part of the program and the adding function is the logic behind it.
To create a highly cohesive solution, you would create a class Window and a class Sum. The window calls Sum's method to get the result and displays it. This way you develop the logic and the GUI of your application separately.
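The split described above could be sketched roughly like this. The method names add, render and show are assumptions made for illustration, and the "window" only formats text so the example stays free of any GUI toolkit:

```java
// Hypothetical sketch: method names are assumptions, and the "window"
// renders plain text instead of using a real GUI toolkit.

/** Logic only: knows how to add, nothing about display. */
final class Sum {
    int add(int a, int b) {
        return a + b;
    }
}

/** Presentation only: asks Sum for the result and formats it for display. */
final class Window {
    private final Sum sum = new Sum();

    /** Builds the text that would be shown to the user. */
    String render(int a, int b) {
        return a + " + " + b + " = " + sum.add(a, b);
    }

    /** Displays the rendered text (standard output stands in for a GUI). */
    void show(int a, int b) {
        System.out.println(render(a, b));
    }
}
```

Calling new Window().show(2, 3) would display the sum computed by Sum. Changing how the result is displayed never touches Sum, and changing the arithmetic never touches Window.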
Answer 2:
An explanation of what it is from Steve McConnell's Code Complete:
Cohesion refers to how closely all the routines in a class or all the code in a routine support a central purpose. Classes that contain strongly related functionality are described as having strong cohesion, and the heuristic goal is to make cohesion as strong as possible. Cohesion is a useful tool for managing complexity because the more code in a class supports a central purpose, the more easily your brain can remember everything the code does.
One way of achieving it, from Uncle Bob's Clean Code:
Classes should have a small number of instance variables. Each of the methods of a class should manipulate one or more of those variables. In general the more variables a method manipulates the more cohesive that method is to its class. A class in which each variable is used by each method is maximally cohesive.
In general it is neither advisable nor possible to create such maximally cohesive classes; on the other hand, we would like cohesion to be high. When cohesion is high, it means that the methods and variables of the class are co-dependent and hang together as a logical whole.
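As a rough illustration of that rule (a sketch of my own, not code quoted from Clean Code), here is a small stack with two instance variables. The class name IntStack and its fields are assumptions. Two of the three methods use both variables, so the class is highly, though not maximally, cohesive:

```java
import java.util.Arrays;

// Hypothetical example: two instance variables, three methods.
final class IntStack {
    private int size = 0;
    private int[] elements = new int[8];

    /** Uses only 'size'. */
    int size() {
        return size;
    }

    /** Uses both 'size' and 'elements'. */
    void push(int value) {
        if (size == elements.length) {
            // Grow the backing array when it is full.
            elements = Arrays.copyOf(elements, size * 2);
        }
        elements[size++] = value;
    }

    /** Uses both 'size' and 'elements'. */
    int pop() {
        if (size == 0) {
            throw new IllegalStateException("stack is empty");
        }
        return elements[--size];
    }
}
```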
The notion of cohesion is closely related to the notion of coupling. There is also a principle based on the heuristic of high cohesion: the Single Responsibility Principle (the S in SOLID).
Answer 3:
High cohesion is a software engineering concept. Basically, it says that a class should do only what it is supposed to do, and do it fully. Do not overload it with functions that do not belong to it, and whatever is directly related to it should not appear in the code of some other class either.
Examples are quite subjective, since we also have to consider scale. A simple program should not be too modularized or it will become fragmented, while a complex program may need more levels of abstraction to handle the complexity.
For example, take an Email class. It should contain the data members to, from, cc, bcc, subject and body, and may contain the methods saveAsDraft(), send() and discardDraft(). But login() should not be here, since there are a number of email protocols, and logging in should be implemented separately.
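One possible shape for this, as a sketch only: the MailSession interface, the InMemorySession class, the isDraft() helper and the session parameter of send() are all assumptions added for illustration. The point is that login lives with the protocol-specific code, not in Email:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical protocol abstraction: login belongs here, not in Email. */
interface MailSession {
    boolean login(String user, String password);
    void deliver(Email email);
}

/** Holds only message data and the operations on a single message. */
final class Email {
    final String from;
    final List<String> to = new ArrayList<>();
    final List<String> cc = new ArrayList<>();
    final List<String> bcc = new ArrayList<>();
    String subject = "";
    String body = "";
    private boolean draft = false;

    Email(String from) {
        this.from = from;
    }

    void saveAsDraft() { draft = true; }

    void discardDraft() { draft = false; }

    boolean isDraft() { return draft; }

    /** Hands the message to whichever protocol implementation is in use. */
    void send(MailSession session) {
        session.deliver(this);
        draft = false;
    }
}

/** Toy in-memory session, used only to exercise the sketch. */
final class InMemorySession implements MailSession {
    final List<Email> outbox = new ArrayList<>();

    public boolean login(String user, String password) {
        return !user.isEmpty();
    }

    public void deliver(Email email) {
        outbox.add(email);
    }
}
```

A real SMTP or IMAP implementation would replace InMemorySession without any change to Email.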
Answer 4:
Cohesion is usually measured using one of the LCOM (Lack of Cohesion in Methods) metrics. The original LCOM metric came from Chidamber and Kemerer. See for example: http://www.computing.dcu.ie/~renaat/ca421/LCOM.html
A more concrete example: if a class has one private field and three methods, and all three methods use this field to perform an operation, then the class is very cohesive.
Pseudo code of a cohesive class:
class FooBar {
    private SomeObject _bla = new SomeObject();

    public void FirstMethod() {
        _bla.FirstCall();
    }

    public void SecondMethod() {
        _bla.SecondCall();
    }

    public void ThirdMethod() {
        _bla.ThirdCall();
    }
}
If a class has three private fields and three methods, and each method uses only one of the fields (a different one each time), then the class is poorly cohesive.
Pseudo code of a poorly cohesive class:
class FooBar {
    private SomeObject _bla = new SomeObject();
    private SomeObject _foo = new SomeObject();
    private SomeObject _bar = new SomeObject();

    public void FirstMethod() {
        _bla.Call();
    }

    public void SecondMethod() {
        _foo.Call();
    }

    public void ThirdMethod() {
        _bar.Call();
    }
}
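To put numbers on the two FooBar classes, here is a minimal sketch of one common formulation of LCOM attributed to Chidamber and Kemerer: count the method pairs that share no field (P) and the pairs that share at least one (Q), then take max(P - Q, 0). The class name Lcom and the representation (one set of field names per method) are assumptions for illustration. Real tools extract this information from the code itself:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: each set lists the fields that one method touches.
final class Lcom {
    /** Returns max(P - Q, 0); 0 means no lack of cohesion was detected. */
    static int lcom1(List<Set<String>> fieldsUsedByMethod) {
        int disjointPairs = 0; // P: method pairs with no field in common
        int sharingPairs = 0;  // Q: method pairs with at least one common field
        for (int i = 0; i < fieldsUsedByMethod.size(); i++) {
            for (int j = i + 1; j < fieldsUsedByMethod.size(); j++) {
                Set<String> shared = new HashSet<>(fieldsUsedByMethod.get(i));
                shared.retainAll(fieldsUsedByMethod.get(j));
                if (shared.isEmpty()) {
                    disjointPairs++;
                } else {
                    sharingPairs++;
                }
            }
        }
        return Math.max(disjointPairs - sharingPairs, 0);
    }
}
```

For the cohesive FooBar, all three method pairs share _bla, so P = 0, Q = 3 and the result is 0. For the poorly cohesive FooBar no pair shares a field, so P = 3, Q = 0 and the result is 3.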
The principle that a class should do one thing is the Single Responsibility Principle, which comes from Robert C. Martin and is one of the SOLID principles. It prescribes that a class should have only one reason to change.
Staying close to the Single Responsibility Principle could possibly result in more cohesive code, but in my opinion these are two different things.
Answer 5:
This is an example of low cohesion:
class Calculator
{
    public static void main(String[] args)
    {
        int a = 5, b = 7;
        int result;
        // calculating sum here
        result = a + b;
        // calculating difference here
        result = a - b;
        // same for multiplication and division
    }
}
High cohesion, by contrast, implies that the functions in a class do what they are supposed to do (what their names say), and that no function does the job of some other function. So the following can be an example of high cohesion:
class Calculator
{
    public static void main(String[] args)
    {
        Calculator myObj = new Calculator();
        System.out.println(myObj.SumOfTwoNumbers(5, 7));
    }

    public int SumOfTwoNumbers(int a, int b)
    {
        return a + b;
    }
    // similarly for other operations
}
Answer 6:
A general way to think of the principle of cohesion is that you should place code alongside other code that either depends on it or that it depends on. Cohesion can and should be applied at levels of composition above the class level. For instance, a package or namespace should ideally contain classes that relate to some common theme and that depend more heavily on one another than on other packages or namespaces. In other words, keep dependencies local.
Answer 7:
Cohesion means that a class or a method does just one defined job. The name of the method or class should also be self-explanatory. For example, if you write a calculator you should name the class "calculator" and not "asdfghj". You should also consider creating a method for each task, e.g. subtract(), add(), etc., so that a programmer who uses your program in the future knows exactly what your methods do. Good naming can reduce commenting effort.
Another related principle is DRY (don't repeat yourself).
Answer 8:
MSDN's article on it is probably more informative than Wikipedia in this case.
Answer 9:
The term cohesion was originally used to describe modules of source code, as a qualitative measure of how closely the parts of a module's source code were related to one another. The idea of cohesion is used in a variety of fields. For instance, a group of people such as a military unit may be cohesive, meaning the people in the unit work together towards a common goal.
The essence of source code cohesion is that the source code in a module works together towards a common, well-defined goal. The module contains the minimum amount of source code needed to produce its outputs and no more. The interface is well defined: the inputs flow in through the interface and the outputs flow back out through it. There are no side effects and the emphasis is on minimalism.
A benefit of functionally cohesive modules is that developing and automating unit tests is straightforward. In fact a good measure of the cohesion of a module is how easy it is to create a full set of exhaustive unit tests for the module.
A module may be a class in an object-oriented language, or a function in a functional language or in a non-object-oriented language such as C. Much of the original work on measuring cohesion involved COBOL programs at IBM back in the 1970s, so cohesion is definitely not just an object-oriented concept.
The research that produced the concept of cohesion and the associated concept of coupling set out to find the characteristics of programs that were easy to understand, maintain, and extend. The goal was to learn best practices of programming, codify those best practices, and then teach them to other programmers.
The goal of good programmers is to write source code whose cohesion is as high as possible given the environment and the problem being solved. This implies that in a large application the level of cohesion will vary from one module or class to another. Sometimes the best you can get is temporal or sequential cohesion because of the problem you are trying to solve.
The best level of cohesion is functional cohesion. A module with functional cohesion is similar to a mathematical function in that you provide a set of inputs and you get a specific output. A truly functional module will not have side effects in addition to the output nor will it maintain any kind of state. It will instead have a well defined interface which encapsulates the functionality of the module without exposing any of the internals of the module and the person using the module will provide a particular set of inputs and get a particular output in return. A truly functional module should be thread safe as well.
Many programming language libraries contain examples of functional modules, whether classes, templates, or functions. The most functionally cohesive examples are mathematical functions such as sine, cosine, square root, etc.
Other functions may have side effects or maintain state of some kind resulting in making the use of those functions more complicated.
For instance, a function that throws an exception, sets a global error variable (errno in C), must be used in a sequence (strtok() from the Standard C library is an example, as it maintains internal state), provides a pointer that must then be managed, or writes to some log utility is no longer functionally cohesive.
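To make the contrast concrete, here is a rough Java sketch. Both class names are assumptions, and StatefulTokenizer only loosely imitates the calling convention of strtok(); it is not a port of it. The first function depends only on its arguments, while the second hides state between calls:

```java
import java.util.ArrayList;
import java.util.List;

/** Functionally cohesive: the output depends only on the arguments. */
final class PureTokenizer {
    static List<String> tokens(String text, char delimiter) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c == delimiter) {
                // End of a token; empty tokens are skipped.
                if (current.length() > 0) {
                    result.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            result.add(current.toString());
        }
        return result;
    }
}

/** Not functionally cohesive: hidden static state forces a call sequence. */
final class StatefulTokenizer {
    private static String remaining = "";

    /** Pass the text on the first call and null afterwards; returns null when done. */
    static String next(String text, char delimiter) {
        if (text != null) {
            remaining = text;
        }
        int start = 0;
        while (start < remaining.length() && remaining.charAt(start) == delimiter) {
            start++; // skip leading delimiters
        }
        if (start == remaining.length()) {
            remaining = "";
            return null;
        }
        int end = remaining.indexOf(delimiter, start);
        if (end < 0) {
            end = remaining.length();
        }
        String token = remaining.substring(start, end);
        remaining = remaining.substring(end);
        return token;
    }
}
```

The pure version can be called from any thread in any order. The stateful version gives wrong answers if two callers interleave their calls, which is the kind of complication the paragraph above describes.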
I have read both Yourdon and Constantine's original book, Structured Design, where I first came across the idea of cohesion in the 1980s, and Meilir Page-Jones' book The Practical Guide to Structured Systems Design. Page-Jones did a much better job of describing both coupling and cohesion; the Yourdon and Constantine book seems a bit more academic. Steve McConnell's book Code Complete is quite good and practical, and the revised edition has quite a bit to say about good programming practice.