What is "Mutual Information"?

January 6th, 2009
  • What is "Mutual Information"?


  • Hi herbjmy, I believe I know the meaning of the term you are referring to, but could you put it into context just so I can confirm my answer? Thanks, answerguru-ga


  • Mutual information is a term used in information theory. Information theory, developed by Claude Shannon at Bell Labs in the 1940s ( http://www.lucent.com/minds/infotheory/ ), has applications particularly in machine learning (also pattern classification, artificial intelligence, statistics, and the like) and communication (including coding and cryptography). See http://www.math.psu.edu/gunesch/Entropy/infcode.html for various links on information theory; see, for example, http://131.111.48.24/pub/mackay/info-theory/course.html for a short course. The standard textbook on information theory is Elements of Information Theory by Thomas Cover and Joy Thomas.

    In order to understand the notion of mutual information, two preliminary concepts must first be understood: entropy and conditional entropy. Of the many links on the net to these topics, one of the clearest expositions seems to be the course notes to the Stanford course in information theory at: http://www.stanford.edu/class/ee376a/ . The general introduction is in lecture 1 at http://www.stanford.edu/class/ee376a/handouts/lect01.pdf and mutual information is defined, with examples, at: http://www.stanford.edu/class/ee376a/handouts/lect02.pdf . (You will need a pdf viewer to access these files, see: http://www.adobe.com/products/acrobat/readstep2.html .)

    Rather than recap the entire first two lectures, I will give a very quick bird's eye view here. Entropy is a measure of the uncertainty of a random variable. For a discrete random variable X, the entropy H(X) is the average number of bits per symbol required to represent a long string of observations of X. The more uniform the distribution of X is, the more information any particular value gives us, and the higher the entropy. The conditional entropy of two random variables X and Y, H(X|Y), is a measure of the uncertainty that remains in X once we know Y. Once Y is known, we can represent X with on average H(X|Y) bits.

    The difference, H(X) - H(X|Y), can therefore be interpreted as a measure of how much information Y gives about X on average. This difference is the mutual information of X and Y, usually written I(X;Y).

    An example of the use of mutual information in the interpretation of MRI images of breasts is at: http://www-ipg.umds.ac.uk/d.rueckert/research/breast/breast.html . In this case, the researcher was interested in the amount of information one image, A, gives about a second image, B.

    Search strategy: "information theory" "mutual information" "mutual information" example.

    OTHER LINKS:
    A short course on information theory: http://www.inference.phy.cam.ac.uk/mackay/info-theory/course.html
    Primer for biologists: http://www.lecb.ncifcrf.gov/~toms/paper/primer/
    Mathematical definitions: http://cgm.cs.mcgill.ca/~soss/cs644/projects/simon/Entropy.html
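
For readers who prefer to see the definitions as arithmetic, here is a minimal Python sketch of entropy, conditional entropy, and their difference. The joint distribution `joint` and the helper names (`marginal`, `entropy`, `conditional_entropy`) are made up for illustration and do not come from the course notes; logarithms are base 2, so results are in bits.

```python
from math import log2

# Hypothetical joint distribution p(x, y) for two binary variables (illustrative only).
joint = {
    (0, 0): 0.4, (0, 1): 0.1,
    (1, 0): 0.1, (1, 1): 0.4,
}

def marginal(joint, axis):
    """Sum the joint distribution down to one variable (axis 0 = X, axis 1 = Y)."""
    out = {}
    for pair, p in joint.items():
        out[pair[axis]] = out.get(pair[axis], 0.0) + p
    return out

def entropy(dist):
    """H = -sum p log2 p, in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def conditional_entropy(joint):
    """H(X|Y) = -sum over (x, y) of p(x,y) log2( p(x,y) / p(y) )."""
    p_y = marginal(joint, 1)
    return -sum(p * log2(p / p_y[y]) for (_, y), p in joint.items() if p > 0)

h_x = entropy(marginal(joint, 0))          # uncertainty in X alone
h_x_given_y = conditional_entropy(joint)   # uncertainty left in X once Y is known
mutual_info = h_x - h_x_given_y            # I(X;Y) = H(X) - H(X|Y)

print(f"H(X)   = {h_x:.4f} bits")
print(f"H(X|Y) = {h_x_given_y:.4f} bits")
print(f"I(X;Y) = {mutual_info:.4f} bits")
```

In this made-up example X on its own is a fair coin (one full bit of uncertainty), and knowing Y removes part of that uncertainty; the part removed is the mutual information.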


  • Can you provide more information about "Mutual Information" itself? The answer you provided before reads more like an answer to "Information Theory - Entropy". For example, can you give me the formula for "mutual information", or any related technology with a similar purpose? Thank you very much.


  • Intuitively speaking, the mutual information of two random variables is the amount of information they have in common ( http://www.stanford.edu/class/ee376a/handouts/lect02.pdf , page 4). The formula for mutual information *is*:

        I(X;Y) = H(X) - H(X|Y)

    where H(X) is the entropy of X and H(X|Y) is the entropy of X given Y. It is possible, though, to expand this out in terms of fundamental definitions, as shown on page 3 of the same reference: http://www.stanford.edu/class/ee376a/handouts/lect02.pdf . Suppose we have a random variable X that takes value x with probability p(x) and a random variable Y that takes value y with probability p(y). Let p(x,y) be the probability that X=x and Y=y. Then the mutual information of X and Y is the sum, over all pairs (x,y), of

        p(x,y) log( p(x,y) / (p(x) p(y)) )

    Intuitively speaking, applications of mutual information would include situations where we are interested in finding out how much information two items have in common. Here are a couple of examples:

    The thesis "Alignment by Maximization of mutual information" http://citeseer.nj.nec.com/cache/papers/cs/9621/http:zSzzSzwww.ai.mit.eduzSz~violazSzresearchzSzpublicationszSzPHD-thesis.pdf/viola95alignment.pdf describes how we can use mutual information to determine the correct alignment of two images. The correct alignment will be the alignment that maximizes the mutual information between the two images.

    Suppose you are given two photographs and you want to determine if they are photographs of the same object. The mutual information formula might be used (albeit with additional mathematical massaging) to determine if the photographs have a lot of information in common, in which case they might show the same object.

    Consider an assembly line that has a camera that takes pictures of parts coming down the line and then feeds the image to a robot that has to put the part into the correct orientation. Mutual information can be used to help determine the correct orientation.

    In biology, we might be interested in how similar two sequences of DNA are, to determine, for instance, if they represent the same gene. The mutual information formula might be involved here; see "Average mutual information of coding and noncoding DNA", http://www.smi.stanford.edu/projects/helix/psb00/grosse.pdf .

    However, I am not sure I would call mutual information a "technology", to use your phrase; it's just a mathematical definition.
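
To make the sum formula and the alignment idea concrete, here is a toy Python sketch. It is not the method from the thesis linked above; it simply estimates p(x,y), p(x) and p(y) from counts of paired samples and tries every candidate shift of a one-dimensional "image". All names and numbers (`mutual_information`, `image_a`, `image_b`, `remap`, `true_shift`, the seed, the search range) are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Estimate I(X;Y) in bits from paired samples, using
    sum over (x, y) of p(x,y) * log2( p(x,y) / (p(x) p(y)) )
    with probabilities taken as relative frequencies."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    # p(x,y) / (p(x) p(y)) simplifies to c * n / (count_x * count_y).
    return sum(
        (c / n) * log2(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in joint.items()
    )

# Toy 1-D "image": 200 pixels with 4 intensity levels (made-up data).
rng = random.Random(0)
n = 200
image_a = [rng.randrange(4) for _ in range(n)]

# Second image: the same scene, cyclically shifted and with intensities relabelled,
# so a plain pixel-by-pixel difference would not reveal the match.
remap = {0: 3, 1: 0, 2: 1, 3: 2}
true_shift = 7
image_b = [remap[image_a[(i + true_shift) % n]] for i in range(n)]

def score(shift):
    """Mutual information between image_a shifted by `shift` and image_b."""
    shifted_a = [image_a[(i + shift) % n] for i in range(n)]
    return mutual_information(shifted_a, image_b)

# Exhaustive search: the alignment is taken to be the shift with the highest score.
best_shift = max(range(-20, 21), key=score)
print("best shift:", best_shift, "true shift:", true_shift)
```

At the correct shift each pixel of the second image is completely determined by the matching pixel of the first, so the two share as much information as possible; at any other shift the pairs look nearly independent and the score drops. Real images need more care (continuous intensities, noise, rotations), which is the "additional mathematical massaging" mentioned above.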






