I've recently started a course on data compression at my university. However, I find the use of the term "entropy" as it applies to computer science rather ambiguous. As far as I can tell, it roughly translates to the "randomness" of a system or structure.
What is the proper definition of computer science "entropy"?
[Images removed: slides defining entropy (source: mit.edu) and an equation for entropy in a sample probability-calculation application, from the University of Mexico. The equation for entropy is given at http://en.wikipedia.org/wiki/Entropy_(information_theory).]
Entropy is like a hash code for virus researchers as well: the more entropy you find, the more likely the code is compressed or encrypted, which could potentially indicate a packed virus. A standard binary has lower entropy than a compressed or encrypted one.
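To make that concrete, here is a minimal Python sketch (my illustration, not from the original answer) of how one might measure the byte-level entropy of a file; the file name sample.bin is a placeholder:

```python
import math
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return sum((n / total) * -math.log2(n / total)
               for n in Counter(data).values())

# Hypothetical usage: values near 8 bits/byte suggest compressed
# or encrypted content; typical executables sit noticeably lower.
with open("sample.bin", "rb") as f:
    print(f"{byte_entropy(f.read()):.2f} bits/byte")
```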
My favorite definition, with a more practical focus, is found in Chapter 1 of the excellent book The Pragmatic Programmer: From Journeyman to Master by Andrew Hunt and David Thomas, which uses entropy as a metaphor for the disorder that gradually accumulates in a software project.

The excerpt is available at http://pragprog.com/the-pragmatic-programmer/extracts/software-entropy
I have always encountered entropy in the sense of Shannon entropy.
From http://en.wikipedia.org/wiki/Information_entropy:
In information theory, entropy is a measure of the uncertainty associated with a random variable. The term by itself in this context usually refers to the Shannon entropy, which quantifies, in the sense of an expected value, the information contained in a message, usually in units such as bits. Equivalently, the Shannon entropy is a measure of the average information content one is missing when one does not know the value of the random variable.
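Concretely, the quote is describing the standard formula: for a discrete random variable X taking values x_i with probabilities p(x_i), the Shannon entropy is

H(X) = -Σ_i p(x_i) log2 p(x_i)

which is measured in bits when the logarithm is taken base 2.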
In terms of compression and information theory, the entropy of a source is the average amount of information (in bits) that symbols from the source can convey. Informally speaking, the more unlikely a symbol is, the more surprise its appearance brings.
If your source has two symbols, say `A` and `B`, and they are equally likely, then each symbol conveys the same amount of information (one bit). A source with four equally likely symbols conveys two bits per symbol.

For a more interesting example, if your source has three symbols, `A`, `B`, and `C`, where the first two are twice as likely as the third, then the third is more surprising but is also less likely. There's a net entropy of 1.52 for this source, as calculated below.

You calculate entropy as the "average surprise", where the "surprise" for each symbol is its probability times the negative binary log of the probability:
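The calculation referred to above, reconstructed from the stated probabilities (0.4, 0.4, and 0.2), works out as follows:

```
symbol   probability   -log2(probability)   surprise = p × -log2(p)
  A          0.4              1.32                  0.53
  B          0.4              1.32                  0.53
  C          0.2              2.32                  0.46
                                        entropy ≈   1.52 bits/symbol
```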
The negative of the binary log is used (of course) because logs of values between 0 and 1 (exclusive) are negative.
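If it helps to see this computed, here is a minimal Python sketch (my illustration, not part of the original answers) of the "average surprise" formula:

```python
import math

def entropy(probabilities):
    """Average surprise in bits: the sum of p * -log2(p) over all symbols."""
    return sum(p * -math.log2(p) for p in probabilities if p > 0)

# Three symbols where A and B are each twice as likely as C.
print(round(entropy([0.4, 0.4, 0.2]), 2))  # prints 1.52
```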