Question:
I've recently started a course on data compression at my university. However, I find the use of the term "entropy" as it applies to computer science rather ambiguous. As far as I can tell, it roughly translates to the "randomness" of a system or structure.
What is the proper definition of computer science "entropy"?
Answer 1:
Entropy can mean different things:
Computing
In computing, entropy is the randomness collected by an operating system or application for use in cryptography or other uses that require random data. This randomness is often collected from hardware sources, either pre-existing ones such as mouse movements or specially provided randomness generators.
Information theory
In information theory, entropy is a measure of the uncertainty associated with a random variable. The term by itself in this context usually refers to the Shannon entropy, which quantifies, in the sense of an expected value, the information contained in a message, usually in units such as bits. Equivalently, the Shannon entropy is a measure of the average information content one is missing when one does not know the value of the random variable.
Entropy in data compression
Entropy in data compression may denote the randomness of the data that you are feeding to the compression algorithm. The higher the entropy, the lower the achievable compression ratio; that is, the more random the text is, the less you can compress it.
Shannon's entropy represents an absolute limit on the best possible lossless compression of any communication: treating messages to be encoded as a sequence of independent and identically-distributed random variables, Shannon's source coding theorem shows that, in the limit, the average length of the shortest possible representation to encode the messages in a given alphabet is their entropy divided by the logarithm of the number of symbols in the target alphabet.
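To make that bound concrete, here is a minimal Python sketch (not from the original answer; the message "abracadabra" and the i.i.d. assumption are mine) that estimates a message's entropy from its symbol frequencies and the corresponding lower bound on its lossless encoding:

    import math
    from collections import Counter

    def shannon_entropy(message):
        counts = Counter(message)
        total = len(message)
        # H = -sum over symbols of p * log2(p)
        return -sum((c / total) * math.log2(c / total) for c in counts.values())

    msg = "abracadabra"
    h = shannon_entropy(msg)                     # bits per symbol
    print(f"entropy: {h:.3f} bits/symbol")
    print(f"lower bound: {h * len(msg):.1f} bits for the whole message")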
Answer 2:
My favorite definition, with a more practical focus, is found in Chapter 1 of the excellent book The Pragmatic Programmer: From Journeyman to Master by Andrew Hunt and David Thomas:
Software Entropy
While software development is immune from almost all physical laws, entropy hits us hard. Entropy is a term from physics that refers to the amount of "disorder" in a system. Unfortunately, the laws of thermodynamics guarantee that the entropy in the universe tends toward a maximum. When disorder increases in software, programmers call it "software rot."
There are many factors that can contribute to software rot. The most important one seems to be the psychology, or culture, at work on a project. Even if you are a team of one, your project's psychology can be a very delicate thing. Despite the best laid plans and the best people, a project can still experience ruin and decay during its lifetime. Yet there are other projects that, despite enormous difficulties and constant setbacks, successfully fight nature's tendency toward disorder and manage to come out pretty well.
...
...
A broken window.
One broken window, left unrepaired for any substantial length of time, instills in the inhabitants of the building a sense of abandonment—a sense that the powers that be don't care about the building. So another window gets broken. People start littering. Graffiti appears. Serious structural damage begins. In a relatively short space of time, the building becomes damaged beyond the owner's desire to fix it, and the sense of abandonment becomes reality.
The "Broken Window Theory" has inspired police departments in New York and other major cities to crack down on the small stuff in order to keep out the big stuff. It works: keeping on top of broken windows, graffiti, and other small infractions has reduced the serious crime level.
Tip 4
Don't Live with Broken Windows
Don't leave "broken windows" (bad designs, wrong decisions, or poor code) unrepaired. Fix each one as soon as it is discovered. If there is insufficient time to fix it properly, then board it up. Perhaps you can comment out the offending code, or display a "Not Implemented" message, or substitute dummy data instead. Take some action to prevent further damage and to show that you're on top of the situation.
Text taken from: http://pragprog.com/the-pragmatic-programmer/extracts/software-entropy
Answer 3:
I have always encountered entropy in the sense of Shannon entropy.
From http://en.wikipedia.org/wiki/Information_entropy:
In information theory, entropy is a measure of the uncertainty associated with a random variable. The term by itself in this context usually refers to the Shannon entropy, which quantifies, in the sense of an expected value, the information contained in a message, usually in units such as bits. Equivalently, the Shannon entropy is a measure of the average information content one is missing when one does not know the value of the random variable.
Answer 4:
(image source: mit.edu, from University of Mexico)
The information theoretic notion of entropy is a generalization of the physical notion. There are many ways to describe entropy. It is a measure of the randomness of a random variable. It is also a measure of the amount of information a random variable or stochastic process contains. It is also a lower bound on the amount a message can be compressed. And finally it is the average number of yes/no questions that need to be asked about a random entity to determine its value.
Equation for entropy, applied to a probability distribution:
it is the negative of the sum, over all values of a random variable, of the probability of that value times the log of that probability, i.e. H(X) = -∑ p(x) log2 p(x), or equivalently ∑ p(x) log2(1/p(x)). This equation can be derived from first principles of the properties of information.
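As a small illustration of that formula (my own sketch, not part of the quoted material): the entropy of a uniform distribution over 8 values comes out to 3 bits, exactly the number of yes/no questions a binary search would need to pin down the value.

    import math

    def entropy(probs):
        # H(X) = -sum(p * log2(p)); skip zero-probability values
        return -sum(p * math.log2(p) for p in probs if p > 0)

    print(entropy([1/8] * 8))          # 3.0 bits: 3 yes/no questions
    print(entropy([0.5, 0.25, 0.25]))  # 1.5 bits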
Answer 5:
In terms of compression and information theory, the entropy of a source is the average amount of information (in bits) that symbols from the source can convey. Informally speaking, the more unlikely a symbol is, the more surprise its appearance brings.
If your source has two symbols, say A and B, and they are equally likely, then each symbol conveys the same amount of information (one bit). A source with four equally likely symbols conveys two bits per symbol.
For a more interesting example, if your source has three symbols, A, B, and C, where the first two are twice as likely as the third, then the third is more surprising but is also less likely. There's a net entropy of 1.52 for this source, as calculated below.
You calculate entropy as the "average surprise", where the "surprise" for each symbol is its probability times the negative binary log of the probability:
symbol   weight   probability   binary log   surprise
A        2        0.4           -1.32        0.53
B        2        0.4           -1.32        0.53
C        1        0.2           -2.32        0.46
total    5        1.0                        1.52
The negative of the binary log is used (of course) because logs of values between 0 and 1 (exclusive) are negative.
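For completeness, here is a small Python sketch (my own, using the 2:2:1 weights from the answer) that reproduces the table:

    import math

    # Weights 2:2:1 are taken from the answer above.
    weights = {"A": 2, "B": 2, "C": 1}
    total = sum(weights.values())
    entropy = 0.0
    for sym, w in weights.items():
        p = w / total
        surprise = p * -math.log2(p)   # probability times negative binary log
        entropy += surprise
        print(f"{sym}: p={p:.1f}  log2(p)={math.log2(p):.2f}  surprise={surprise:.2f}")
    print(f"entropy = {entropy:.2f} bits/symbol")   # about 1.52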
Answer 6:
Here is a great alternate explanation for entropy in information theory.
Entropy is a measure of uncertainty involved in making a prediction.
We can also describe entropy as how surprised we would be to see an outcome after making our initial prediction.
Let's say we have a bent coin that gives us a head 99% of the time and a tail 1% of the time. Since there is only a one percent chance of getting a tail, we would be very surprised if we actually got a tail. On the other hand, it won't be too surprising if we get a head, as we already have a 99 percent chance of getting one.
Let's assume we have a function called Surprise(x) that gives us the amount of surprise for each outcome; then we can average the amount of surprise over a probability distribution. This average amount of surprise can also be used as a measure of how uncertain we are. This uncertainty is called entropy.
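A minimal sketch of that idea (the Surprise name comes from the answer; the code itself is my illustration), using the 99%/1% bent coin:

    import math

    def surprise(p):
        # Surprise of an outcome with probability p, in bits
        return -math.log2(p)

    def entropy(probs):
        # Entropy = average (expected) surprise over the distribution
        return sum(p * surprise(p) for p in probs if p > 0)

    print(surprise(0.99))         # ~0.014 bits: a head barely surprises us
    print(surprise(0.01))         # ~6.64 bits: a tail is very surprising
    print(entropy([0.99, 0.01]))  # ~0.08 bits: very little uncertainty overall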
Answer 7:
Super SIMPLE definition
The word entropy can be defined in one sentence:
"The amount of information needed to describe a system."
Imagine, for example, the expansion of the universe: in the beginning, all matter was collected in a small point before the big bang, so we could have described the system with "all matter is within one point," whereas today significantly more information is required to describe the system (the universe, that is): one would need to describe all planetary positions, their movements, what's on them, and so on.
In terms of information theory, the definition also works. For example, the more letters you add to a password (the system), the more information is needed to describe the password. You can then measure it in different units, e.g. bits or characters: "hello" = 5 characters, or 40 bits of entropy (if the character size is 8 bits).
From this it also follows that the more information you have, the more ways you can arrange it. If you have 40 bits, there are 2^40 different ways they can be arranged. If we are talking about passwords, then the more possible arrangements of the information (bits) there are, the longer it is going to take to crack them (with brute force or dictionary attacks).
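A rough sketch of that estimate, assuming every character is chosen independently and uniformly from the alphabet (an assumption of mine; real passwords are rarely uniform):

    import math

    def password_entropy_bits(length, alphabet_size):
        # Each character contributes log2(alphabet_size) bits
        return length * math.log2(alphabet_size)

    print(password_entropy_bits(5, 256))       # 40.0 bits, the "hello" example
    print(password_entropy_bits(8, 62))        # ~47.6 bits for 8 alphanumeric chars
    print(2 ** password_entropy_bits(5, 256))  # number of possible arrangements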
Answer 8:
http://en.wikipedia.org/wiki/Entropy_(information_theory)
Answer 9:
In simpler words, entropy defines randomness. It's more like how unpredictable something is. In more technical words: "In computing, entropy is the randomness collected by an operating system or application for use in cryptography or other uses that require random data. This randomness is often collected from hardware sources, either pre-existing ones such as mouse movements or specially provided randomness generators," as defined by Wikipedia.
One can now easily see that the entropy of a file is a measurement of how disordered its bytes are. There are various units for entropy, such as the nat, the shannon, and the hartley; the most commonly used is the shannon (bit). Measured per byte with Shannon's formula, a file's entropy ranges from 0 to 8. When the entropy value is 0, the outcome is certain; on the contrary, when the entropy value is 8, the outcome is as unpredictable as it can be. The formula given by Shannon to measure the randomness of the outcome of events is:
Entropy = ∑ pi log2(1/pi)
where i ranges over the possible events and pi is the probability of event i.
When the events are the 256 possible byte values, this always yields a result between 0 and 8.
For more information, go through the link: https://www.talentcookie.com/2016/02/file-entropy-in-malware-analysis/
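As an illustration only (not taken from the linked article), a minimal Python sketch of byte-level file entropy using the formula above; the file name sample.bin is a placeholder:

    import math
    from collections import Counter

    def file_entropy(path):
        data = open(path, "rb").read()
        counts = Counter(data)
        total = len(data)
        # sum of p * log2(1/p) over the byte values that actually occur
        return sum((c / total) * math.log2(total / c) for c in counts.values())

    print(f"{file_entropy('sample.bin'):.2f} bits per byte")  # 0 (constant) .. 8 (random)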
Answer 10:
Entropy refers to the extent to which software is repeatedly reshaped based on customer requirements, until the cost of reshaping it to meet those requirements becomes prohibitive.
Answer 11:
Entropy is also like a hash code for virus researchers. The more entropy you measure, the more likely the code is compressed or encrypted, which could potentially indicate a packed virus.
A standard binary has lower entropy than a compressed or encrypted one.
Answer 12:
Entropy has many meanings in computer science; it depends on the context. In security, entropy means how much randomness you supply. For instance, when you generate a private key, many applications ask you to move the mouse around to generate entropy. This generates entropy by taking the "human" element of randomness and adding it to the hashing process of generating the key.
There is also a definition of entropy in software engineering. This definition refers to out-of-date code, or code that has had many developers writing it. It is typically used when it is nearly time to refactor a software project: "The code for this project has an enormous amount of entropy because many of the individuals who maintained it are not on the project currently."
Here is a third example usage that I remember too. In the topic of simulated annealing (as far as computer science is concerned), entropy is described as how much decay has happened during the evaluation of the algorithm.
I guess to answer your question, though: there is no concrete definition of the word "entropy" except for the ones you can find in a dictionary. How computer science tends to apply the term depends on the context in which it is used and what it is being applied to.
Answer 13:
It's easy to make a big deal out of entropy. To my mind it is a pretty simple and useful concept.
Basically it quantifies what, on average, you will learn from an event, like flipping a coin, taking a branch instruction, or indexing an array.
For example, a comparison operation in the middle of a search algorithm has a certain probability P of taking one branch, and 1-P of taking the other.
Suppose P is 1/2, as it is in a binary search. Then if you take that branch, you know 1 bit more than you did before, because log(2/1), base 2, is 1. On the other hand, if you take the other branch you also learn 1 bit.
To get the average amount of information you will learn, multiply what you learn on the first branch times the probability you take that branch, plus what you learn on the second branch times the probability of that branch.
1/2 times 1 bit, plus 1/2 times 1 bit, is 1/2 bit plus 1/2 bit, or total 1 bit of entropy. That's what you can expect to learn on average from that decision.
On the other hand, suppose you are doing linear search in a table of 1024 entries.
On the first == test, the probability of YES is 1/1024, so the entropy of YES at that decision is
1/1024 times log(1024/1)
or 1/1024 * 10 = about 1/100 bit.
So if the answer is YES, you learn 10 bits, but the chance of that is about 1 in a thousand.
On the other hand, NO is much more likely. Its entropy is
1023/1024 * log(1024/1023)
or roughly 1 times roughly zero = about zero.
Add the two together, and on average you will learn about 1/100 of a bit on that decision.
That's why linear search is slow. The entropy (how much you can expect to learn) at each decision is too small, since you're going to have to learn 10 bits to find the entry in the table.
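A small sketch of that comparison (my own illustration of the answer's arithmetic): the expected information gained from a single branch taken with probability p.

    import math

    def decision_entropy(p):
        # Expected bits learned from a branch taken with probability p
        return sum(x * math.log2(1 / x) for x in (p, 1 - p) if x > 0)

    print(decision_entropy(1 / 2))     # 1.0 bit per comparison (binary search)
    print(decision_entropy(1 / 1024))  # ~0.011 bit per comparison (linear search)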
Answer 14:
Entropy in computer science commonly refers to how random a string of bits is.
The following question is about making that precise:
How do I compute the approximate entropy of a bit string?
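One crude way to do it, shown here purely as an illustration and assuming the bits are independent and identically distributed (the linked question treats this more carefully), is to compute the entropy of the 0/1 frequencies:

    import math

    def bitstring_entropy(bits):
        p1 = bits.count("1") / len(bits)
        # Entropy per bit, from the frequencies of 0s and 1s
        return sum(p * math.log2(1 / p) for p in (p1, 1 - p1) if p > 0)

    print(bitstring_entropy("1100101011"))  # ~0.97 bits per bit
    print(bitstring_entropy("1111111111"))  # 0.0: completely predictable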
Answer 15:
In simple words, if you know the probabilities of the symbols in a language, one can compute the average information content of a symbol in the language.
Or:
The entropy of a language is a measure of the information content of an average symbol in the language.
Consider a fair coin: there are two symbols, each with probability 1/2, so the entropy is calculated as
h = -(1/2 · log2(1/2) + 1/2 · log2(1/2)) = 1 bit.
Answer 16:
I've heard people misuse the thermodynamic definition of entropy with respect to CS.
E.g., "Entropy is definitely increasing in this system."
When what they mean is "this code is getting worse and worse!"