Statistical Applications in Biology and Genetics Tian Zheng Wednesday, March 12, 2003
Outline • Biological Background • Overview of quantitative research areas related to genetics • Sample project I: Bayesian Regression Analysis with application to Microarray studies • Sample project II: BHTA algorithm for complex traits
Chromosomes and genes • Video from the Human Genome Project • You can also find links to background readings at: http://www.stat.columbia.edu/~tzheng/research/statgen.html • Celebrating the 50th anniversary of the discovery of the DNA double-helix structure.
Biology: Science of the 21st century • Everybody talks about it!
Computational Biology (1) • Sequence to function • Sequence alignment using wet-lab results • Model the aligned sequences • Predict the function of sequences with unknown function using the fitted model • Sequence to structure of proteins • Significance: sequence → structure → function
Computational Biology (2) • Motif detection • Homology detection
Bioinformatics/Genomics • Gene expression analysis (using DNA chips or Microarray) • Protein regulatory network inference • Pedigree inference • Phylogeny inference
Genetic Epidemiology • Linkage mapping • Association mapping • Mapping for complex traits: quantitative traits, epistasis etc.
Linkage and Association • Genes and alleles • Haplotype • Transmission • Cross-over and recombination • Linkage
Sample Project I: Bayesian Regression Analysis • Mike West et al. (2000), Bayesian Regression Analysis in the “large p, small n” Paradigm with application in DNA Microarray studies.
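To make the “large p, small n” setting concrete, here is a minimal Python sketch of Bayesian linear regression with an independent Gaussian prior on the coefficients. It is not the model of the cited paper; the function name `posterior_mean`, the prior and noise variances, and the simulated data are all assumptions chosen for illustration. The point it shows is that when p is much larger than n, the posterior mean can be computed from an n by n linear system instead of a p by p one.

```python
import numpy as np

def posterior_mean(X, y, noise_var=1.0, prior_var=1.0):
    """Posterior mean of beta for y = X beta + noise, with beta ~ N(0, prior_var * I).

    Hypothetical helper for illustration only. Uses the identity
    (X'X + lam I_p)^{-1} X' y = X' (X X' + lam I_n)^{-1} y,
    so only an n-by-n system is solved even when p >> n.
    """
    n, p = X.shape
    lam = noise_var / prior_var
    K = X @ X.T + lam * np.eye(n)          # n-by-n matrix
    return X.T @ np.linalg.solve(K, y)     # length-p coefficient vector

# Simulated example (assumed sizes): 5 experiments, 50 genes.
rng = np.random.default_rng(0)
n, p = 5, 50
X = rng.normal(size=(n, p))
y = rng.normal(size=n)
beta_hat = posterior_mean(X, y)
print(beta_hat.shape)
```

Without a prior, the least-squares problem has infinitely many solutions when p > n; the prior is what makes the estimate well defined in this sketch.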
What is a Microarray/DNA chip? How do chips work?
Oligonucleotide Arrays • Current “Gold Standard”!
Gene Expression Data • n experiments (patients, types of cell lines, types of cancer tissues, etc.) • p genes on one array • The subtracted and normalized gene expression data form an n by p matrix
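The data layout described above can be sketched in Python as follows. This is only an illustration under stated assumptions: "subtracted" is taken to mean background subtraction, normalization is taken to be a log2 transform followed by centering each array at its median, and the function name `expression_matrix` and the simulated intensities are hypothetical.

```python
import numpy as np

def expression_matrix(foreground, background):
    """Return an n-by-p matrix: rows are experiments, columns are genes.

    Assumed preprocessing (not specified in the slides): background
    subtraction, log2 transform, then per-array median centering.
    """
    fg = np.asarray(foreground, dtype=float)
    bg = np.asarray(background, dtype=float)
    # Subtract background; floor at 1 so the logarithm is defined.
    signal = np.maximum(fg - bg, 1.0)
    log_signal = np.log2(signal)
    # Center each experiment (row) at its own median.
    return log_signal - np.median(log_signal, axis=1, keepdims=True)

# Simulated intensities (assumed sizes): n = 4 experiments, p = 10 genes.
rng = np.random.default_rng(0)
n, p = 4, 10
foreground = rng.uniform(200, 5000, size=(n, p))
background = rng.uniform(50, 150, size=(n, p))
X = expression_matrix(foreground, background)
print(X.shape)  # (4, 10)
```

In real microarray studies p is typically in the thousands while n is in the tens, which is why the matrix is much wider than it is tall.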