Applied Statistics – Challenges and Reward • Wenjiang Fu, Ph.D. • Computational Genomics Lab, Department of Epidemiology, Michigan State University • fuw@msu.edu • www.msu.edu/~fuw
What is Statistics? • “Lies, Damned Lies, and Statistics” • “Figures fool when fools figure” • A branch of mathematical science that studies data through probability distributions and modeling. • Fields: probability theory, actuarial science, biostatistics, financial statistics, industrial statistics, etc. • Related fields: biometrics, bioinformatics, geostatistics, statistical mechanics, econometrics, etc.
Grand challenges we are facing … • “Data” → Statistics → Knowledge & Information → Decision • The 21st century will be the golden age of statistics!
Grand challenges we are facing … • Data collection technology has advanced dramatically, but without sufficient statistical sampling design and experimental design. • Advancement of technology for discovering and retrieving useful information has been lagging and has become the bottleneck. • More sophisticated approaches are needed for decision making and risk management.
Statistical Challenges – Functional Data, Graph (Network) Data, and Shape Data
Statistics in Science • Cosmic microwave background radiation • High energy physics • Genomic/proteomic data • Tick-by-tick stock data
Statistics in Science • Microarray • Fingerprints
What do we do? • Find new ways of thinking about and attacking problems. • Find sub-optimal but computationally feasible solutions. • Develop new paradigms for new types of data. • Be satisfied with ‘very rough’ approximations. • Turn research results into easy-to-use, publicly available software and programs. • Join forces with computer scientists.
Some ‘hot’ research directions • Dimension reduction • Visualization • Dynamic systems • Simulation and real time computation • Uncertainty and risk management • Interdisciplinary research
Example 3 Medical study data: Ob/Gyn Modeling of PlGF: Placental Growth Factor
SNP: Single Nucleotide Polymorphism • Homologous pairs of chromosomes • Paternal allele • Maternal allele
Paternal allele: ACGAACAGCT / TGCTTGTCGA
Maternal allele: ACGAGCAGCT / TGCTCGTCGA
SNP: A/G
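As a rough illustration of the slide above, the following sketch compares the two 10-base sequences and reports where they differ. The function name `snp_positions` is hypothetical, and labeling the first sequence as paternal and the second as maternal is an assumption based on the order in which they appear on the slide.

```python
def snp_positions(seq1, seq2):
    """Return (index, base1, base2) for every site where two aligned sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned and of equal length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq1, seq2)) if a != b]


# Sequences taken from the slide; the paternal/maternal labels are assumed.
paternal = "ACGAACAGCT"
maternal = "ACGAGCAGCT"

print(snp_positions(paternal, maternal))  # [(4, 'A', 'G')]
```

The single difference sits at 0-based index 4 (the fifth base), which is the A/G SNP shown on the slide.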
Allele, Haplotype and Diplotype • SNP1: two alleles, A and a • SNP2: two alleles, B and b • Haplotypes: [AB], [ab] • Diplotype: [AB][ab]
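One way to see why the distinction between genotype and diplotype matters: an individual who is heterozygous at both SNPs (Aa and Bb) could carry either [AB][ab] or [Ab][aB], and unphased genotype data alone cannot tell these apart. The sketch below enumerates the diplotypes compatible with a set of unphased genotypes. The function name `compatible_diplotypes` and the two-character genotype encoding are illustrative assumptions, not something defined on the slide.

```python
from itertools import product


def compatible_diplotypes(genotypes):
    """List the haplotype pairs consistent with unphased genotypes.

    genotypes: one 2-character string per SNP, e.g. ["Aa", "Bb"]
    (a hypothetical encoding chosen for this sketch).
    """
    diplotypes = set()
    # For each SNP, choose which of its two alleles goes on the first haplotype;
    # the other allele then goes on the second haplotype.
    for choice in product((0, 1), repeat=len(genotypes)):
        hap1 = "".join(g[c] for g, c in zip(genotypes, choice))
        hap2 = "".join(g[1 - c] for g, c in zip(genotypes, choice))
        # Sort the pair so that [AB][ab] and [ab][AB] count as one diplotype.
        diplotypes.add(tuple(sorted((hap1, hap2))))
    return sorted(diplotypes)


print(compatible_diplotypes(["Aa", "Bb"]))  # [('AB', 'ab'), ('Ab', 'aB')]
print(compatible_diplotypes(["AA", "Bb"]))  # [('AB', 'Ab')]
```

With at most one heterozygous SNP the diplotype is determined; with two or more, the phase is ambiguous and has to be inferred statistically.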
Microarray Technology: 2 channels
Hybridization:
A T C G T A G
| | | | | | |
T A G C A T C
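The base pairing drawn on this slide (A with T, C with G) can be checked with a few lines of code. This is a minimal sketch: the names `complement` and `hybridizes` are hypothetical, and strands are compared position by position as drawn on the slide, ignoring 5'/3' orientation.

```python
# Watson-Crick pairing rules for DNA.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(seq):
    """Return the base-by-base complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in seq)


def hybridizes(probe, target):
    """True if every probe base pairs with the target base at the same position."""
    return len(probe) == len(target) and complement(probe) == target


print(complement("ATCGTAG"))             # TAGCATC
print(hybridizes("ATCGTAG", "TAGCATC"))  # True
```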
Microarray normalization: between slides • Boxplots of log ratios from 3 replicate self-self hybridizations. • Left panel: before normalization. • Middle panel: after within-print-tip-group normalization. • Right panel: after a further between-slide scale normalization.
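The slide does not say how the between-slide scale normalization was computed. One common choice, assumed here purely for illustration, is to rescale each slide's log ratios so that all slides share the same median absolute deviation (MAD), using the geometric mean of the per-slide MADs as the common scale. The sketch assumes the log ratios have already been centered by within-slide normalization; the function names and the toy data are hypothetical.

```python
from math import exp, log
from statistics import median


def mad(values):
    """Median absolute deviation from the median."""
    m = median(values)
    return median([abs(v - m) for v in values])


def scale_normalize(slides):
    """Rescale each slide's log ratios so that all slides share one MAD.

    slides: list of lists of log ratios, one inner list per slide.
    Assumption: MAD-based scaling; the slide's actual method is not stated.
    """
    mads = [mad(s) for s in slides]
    if any(m == 0 for m in mads):
        raise ValueError("a slide has zero MAD and cannot be rescaled")
    # Geometric mean of the per-slide MADs serves as the common target scale.
    target = exp(sum(log(m) for m in mads) / len(mads))
    return [[v * target / m for v in s] for s, m in zip(slides, mads)]


# Toy data (not from the talk): three slides with spreads 1, 2 and 4.
slides = [[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0], [-4.0, 0.0, 4.0]]
for s in scale_normalize(slides):
    print([round(v, 6) for v in s])
```

After rescaling, boxplots of the slides would have comparable widths, which is the effect the right-hand panel describes.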
Affymetrix SNP Array • ‘AB’ annotation: for a SNP with alleles A/C, allele A = A and allele B = C. • Illustration of SNP annotation on the Affymetrix SNP array. Adapted from Matsuzaki et al. 2004.
Computational Genomics Data: SNP Genotype • Error rate: 1–5% • GIGO: Garbage In, Garbage Out
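To get a feel for why a 1–5% per-SNP error rate matters, the sketch below computes the chance that a genotype spanning several SNPs contains no error at all. It assumes errors occur independently across SNPs with the same rate, which is a simplification not stated on the slide; the function name is hypothetical.

```python
def prob_all_correct(error_rate, n_snps):
    """Probability that all n_snps genotype calls are correct,
    assuming independent errors with a common per-SNP error rate."""
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    return (1.0 - error_rate) ** n_snps


for rate in (0.01, 0.05):
    for k in (1, 10, 100):
        print(f"error rate {rate:.0%}, {k:3d} SNPs: "
              f"P(all correct) = {prob_all_correct(rate, k):.3f}")
```

Under this assumption the error-free probability decays geometrically with the number of SNPs, so multi-SNP analyses such as haplotype inference are especially sensitive to genotyping errors.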
Prospects I: Genome-oriented Medicine • Genetic variation influences: disease susceptibility, disease progression, therapeutic response, unwanted drug effects. • Genetics is pointing the way to personalized medicine… • With the development of the human HapMap project, coupled with advanced statistical approaches, we are entering an era of designing personalized medicine based on an individual’s genetic profile.
Genome-Wide Association Studies (GWAS) • A successful study: the Wellcome Trust Case-Control Consortium. • GWAS of 7 common diseases with 14,000 cases (about 2,000 per disease) and 3,000 shared controls (Nature 2007). • Diseases include hypertension, diabetes, etc.
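At its simplest, a case-control association test at one SNP compares allele counts between cases and controls. The following sketch runs a 1-degree-of-freedom chi-square test on a 2×2 table of allele counts. It is an illustration only, not the consortium's actual analysis; the function name and the counts are hypothetical.

```python
from math import erfc, sqrt


def allelic_chi_square(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square test (1 df, no continuity correction) on a 2x2 table.

    Rows are cases/controls; columns are counts of allele A and allele B.
    Returns (chi-square statistic, p-value).
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    denom = ((case_a + case_b) * (ctrl_a + ctrl_b)
             * (case_a + ctrl_a) * (case_b + ctrl_b))
    if denom == 0:
        raise ValueError("every row and column must have a nonzero total")
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / denom
    # For 1 degree of freedom, the chi-square tail probability is erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts, for illustration only.
chi2, p = allelic_chi_square(case_a=30, case_b=10, ctrl_a=10, ctrl_b=30)
print(f"chi-square = {chi2:.2f}, p = {p:.2e}")
```

In a genome-wide scan this test is repeated for hundreds of thousands of SNPs, so the significance threshold has to be far stricter than the usual 0.05.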
Recruiting Graduate Students • Epidemiology: the study of the distribution of disease. • Biostatistics: data modeling and computation. • Quantitative Biology Initiative: an MSU cross-disciplinary center. • Background: mathematics, statistics, physics, biology, chemistry, and others. • Opportunity: contact your department's graduate director/chairman about funding from the Ministry of Education. MSU Epi/Biostatistics provides partial funding and covers tuition fees. • Qualifications: TOEFL, GRE, GPA, reference letters. • My contact: fuw@msu.edu www.msu.edu/~fuw • Application: WWW.MSU.EDU
Thank you! • Q and A. • Office: CMS 415.