An intuitive introduction to information theory Ivo Grosse Leibniz Institute of Plant Genetics and Crop Plant Research Gatersleben Bioinformatics Centre Gatersleben-Halle
Outline • Why information theory? • An intuitive introduction
History of biology St. Thomas Monastery, Brno
Genetics Gregor Mendel 1822 – 1884 1866: Mendel's laws, the foundation of genetics Ca. 1900: Biology becomes a quantitative science
50 years later … 1953 James Watson & Francis Crick
DNA Watson & Crick 1953 Double helix structure of DNA 1953: Biology becomes a molecular science
1989: Human Genome Project Goals: • Identify all of the ca. 30,000 genes • Identify all of the ca. 3,000,000,000 base pairs • Store all information in databases • Develop new software for data analysis
2003 Human Genome Project officially finished 2003: Biology becomes an information science
2003 – 2053 … biology = information science Systems Biology
What is information? • Many intuitive definitions • Most of them wrong • One clean definition since 1948 • Requires 3 steps • Entropy • Conditional entropy • Mutual information
Before starting with entropy … Who is the father of information theory? Claude Shannon, 1916 – 2001 A Mathematical Theory of Communication. Bell System Technical Journal, 27, 379–423 & 623–656, 1948
Before starting with entropy … Who is the grandfather of information theory? Simon bar Kochba, ca. 100 – 135 Jewish guerrilla fighter against the Roman Empire (132 – 135)
Entropy • Given a text composed from an alphabet of 32 letters (each letter equally probable) • Person A chooses a letter X (randomly) • Person B wants to know this letter • B may ask only binary questions • Question: how many binary questions must B ask in order to learn which letter X was chosen by A? • Answer: the entropy H(X) • Here: H(X) = 5 bit
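To make the question-counting picture concrete, here is a minimal Python sketch (my own illustration, not part of the slides; the function name and toy alphabet are arbitrary) that computes the entropy of a uniform distribution over 32 letters.

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H = -sum p * log2(p)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# 32 equally probable letters: each yes/no question can at best halve
# the set of remaining candidates, so B needs log2(32) = 5 questions.
uniform = [1 / 32] * 32
print(entropy(uniform))  # 5.0 bit
```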
Conditional entropy (1) • The sky is blu_ • How many binary questions? • 5? • No! • Why? • What’s wrong? • The context tells us “something” about the missing letter X
Conditional entropy (2) • Given a text composed from an alphabet of 32 letters (each letter equally probable) • Person A chooses a letter X (randomly) • Person B wants to know this letter • B may ask only binary questions • A may tell B the letter Y preceding X • E.g. • L_ • Q_ • Question: how many binary questions must B ask in order to learn which letter X was chosen by A? • Answer: the conditional entropy H(X|Y)
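A minimal sketch (assumed, not from the slides) of how H(X|Y) can be estimated from counts of (preceding letter Y, letter X) pairs; the toy text and names are illustrative only.

```python
import math
from collections import Counter

def conditional_entropy(pairs):
    """H(X|Y) in bits from a list of (y, x) pairs,
    using H(X|Y) = -sum_{x,y} p(x,y) * log2 p(x|y)."""
    joint = Counter(pairs)
    y_counts = Counter(y for y, _ in pairs)
    n = len(pairs)
    h = 0.0
    for (y, x), c in joint.items():
        p_xy = c / n                   # joint probability p(x, y)
        p_x_given_y = c / y_counts[y]  # conditional probability p(x | y)
        h -= p_xy * math.log2(p_x_given_y)
    return h

# Toy text: "q" is always followed by "u", so whenever Y = "q"
# person B needs no further questions about X.
text = "quick queue quits"
pairs = list(zip(text, text[1:]))  # (preceding letter Y, letter X)
print(conditional_entropy(pairs))
```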
Conditional entropy (3) • H(X|Y) <= H(X) • Clear! • In the worst case – namely if B ignores all “information” in Y about X – B needs H(X) binary questions • Under no circumstances does B need more than H(X) binary questions • Knowledge of Y cannot increase the number of binary questions • Knowledge can never harm! (mathematical statement, perhaps not true in real life)
Mutual information (1) • Compare two situations: • I: learn X without knowing Y • II: learn X with knowing Y • How many binary questions in case I? H(X) • How many binary questions in case II? H(X|Y) • Question: how many binary questions can B save in case II, i.e. by knowing Y? • Answer: I(X;Y) = H(X) – H(X|Y) • I(X;Y) = information in Y about X
Mutual information (2) • H(X|Y) <= H(X) ⇒ I(X;Y) >= 0 • In the worst case – namely if B ignores all information in Y about X, or if there is no information in Y about X – I(X;Y) = 0 • Information in Y about X can never be negative • Knowledge can never harm! (mathematical statement, perhaps not true in real life)
Mutual information (3) • Example 1: random sequence composed of A, C, G, T (equally probable) • I(X;Y) = ? • H(X) = 2 bit • H(X|Y) = 2 bit • I(X;Y) = H(X) – H(X|Y) = 0 bit • Example 2: deterministic sequence … ACGT ACGT ACGT ACGT … • I(X;Y) = ? • H(X) = 2 bit • H(X|Y) = 0 bit • I(X;Y) = H(X) – H(X|Y) = 2 bit
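Both examples can be checked numerically. The sketch below (my own illustration, not from the slides; sequence lengths and names are arbitrary) estimates H(X), H(X|Y) and I(X;Y) from bigram counts, where Y is the letter preceding X.

```python
import math
import random
from collections import Counter

def entropy(counts):
    """Entropy in bits of the empirical distribution given by counts."""
    n = sum(counts.values())
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def analyse(seq):
    """Return (H(X), H(X|Y), I(X;Y)) estimated from bigrams of seq."""
    pairs = list(zip(seq, seq[1:]))              # (Y, X) pairs
    h_x = entropy(Counter(x for _, x in pairs))
    h_y = entropy(Counter(y for y, _ in pairs))
    h_joint = entropy(Counter(pairs))            # H(X, Y)
    h_x_given_y = h_joint - h_y                  # chain rule
    return h_x, h_x_given_y, h_x - h_x_given_y

random.seed(0)
iid = "".join(random.choice("ACGT") for _ in range(100_000))
periodic = "ACGT" * 25_000

print(analyse(iid))       # roughly (2, 2, 0) bit
print(analyse(periodic))  # roughly (2, 0, 2) bit
```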
Mutual information (4) • I(X;Y) = I(Y;X) • Always! For any X and any Y! • Information in Y about X = information in X about Y • Examples: • How much information is there in the amino acid sequence about the secondary structure? How much information is there in the secondary structure about the amino acid sequence? • How much information is there in the expression profile about the function of the gene? How much information is there in the function of the gene about the expression profile? • Hence the name: mutual information
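A small numerical check of the symmetry claim (again an illustrative sketch, not from the slides; the joint distribution is hypothetical): H(X) – H(X|Y) and H(Y) – H(Y|X) come out identical.

```python
import math

# Hypothetical joint distribution p(x, y) over two small alphabets.
p = {("a", 0): 0.4, ("a", 1): 0.1, ("b", 0): 0.2, ("b", 1): 0.3}

def H(probs):
    """Entropy in bits of a list of probabilities."""
    return -sum(q * math.log2(q) for q in probs if q > 0)

def cond_entropy(p, given):
    """H(X|Y) for given=1, H(Y|X) for given=0: sum_g p(g) * H(. | g)."""
    h = 0.0
    for g in {k[given] for k in p}:
        block = [v for k, v in p.items() if k[given] == g]
        pg = sum(block)
        h += pg * H([v / pg for v in block])
    return h

h_x = H([sum(v for k, v in p.items() if k[0] == a) for a in {"a", "b"}])
h_y = H([sum(v for k, v in p.items() if k[1] == b) for b in {0, 1}])
print(h_x - cond_entropy(p, given=1))  # I(X;Y) = H(X) - H(X|Y), ~0.12 bit
print(h_y - cond_entropy(p, given=0))  # I(Y;X) = H(Y) - H(Y|X), same value
```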
Summary • Entropy • Conditional entropy • Mutual information • There is no such thing as information content • Information is not defined for a single variable • 2 random variables are needed to talk about information • Information in Y about X • I(X;Y) = I(Y;X): info in Y about X = info in X about Y • I(X;Y) >= 0: information is never negative, knowledge cannot harm • I(X;Y) = 0 if and only if X and Y are statistically independent • I(X;Y) > 0 otherwise