Bioinformatics: One Minute and One Hour at a Time • Laurie J. Heyer, L.R. King Asst. Professor of Mathematics, Davidson College • laheyer@davidson.edu
What is Bioinformatics? (Diagram labels: Computer Science, Mathematics, Biology, Bioinformatics)
Genomics, Proteomics and Systems Biology • Primary audience: junior bio majors • Prerequisites: Bioinformatics and intro molecular biology, or one of several 300-level biology courses • Course home page: http://www.bio.davidson.edu/genomics • “Math Minutes” • Taught by A. Malcolm Campbell (Biology)
Plotting Expression Data • One highlighted gene is induced 16-fold • One highlighted gene is repressed 16-fold • But when raw ratios are plotted, induction looks much more dramatic
Log Transformation • Calculate log2 of each ratio • Ratio of 16 becomes value of 4 • Ratio of 0.0625 (1/16) becomes value of -4 • Induction and repression look equal in magnitude, but opposite in sign
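The arithmetic on this slide can be checked in a few lines. This is a minimal sketch; the function name `log2_ratios` is a label chosen here for illustration and does not come from the course materials.

```python
import math

def log2_ratios(ratios):
    """Return the base-2 logarithm of each expression ratio."""
    return [math.log2(r) for r in ratios]

# 16-fold induction, no change, and 16-fold repression
print(log2_ratios([16, 1, 1 / 16]))  # [4.0, 0.0, -4.0]
```

On the log scale, a 16-fold induction and a 16-fold repression sit the same distance from zero.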
Hierarchical Clustering • Join two most similar genes • Join next two most similar “objects” (genes or clusters of genes) • Distance from one gene to a set of genes is minimum of all distances from the gene to the individual members (Single Linkage) • Repeat until all genes have been joined
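The joining procedure above can be sketched as follows. This is an illustration only: the Euclidean distance, the function names, and the toy two-point "expression profiles" are assumptions made here, since the slide does not specify a similarity measure or an implementation.

```python
def euclidean(p, q):
    """Assumed distance measure; the slide does not name one."""
    return sum((x - y) ** 2 for x, y in zip(p, q)) ** 0.5

def single_linkage(points, distance):
    """Agglomerative clustering with single linkage.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a tuple of indices into `points`.
    """
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: the distance between two objects is the
                # minimum over all pairs of their individual members.
                d = min(distance(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges

# Hypothetical toy data: four genes measured under two conditions
profiles = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 7.0)]
for left, right, d in single_linkage(profiles, euclidean):
    print(left, right, round(d, 3))
```

The merge history is enough to draw a dendrogram: each tuple records which two objects were joined and at what distance.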
Genome Consortium for Active Teaching (GCAT) http://www.bio.davidson.edu/GCAT
High School Chips • See Kathy Gabric’s page: http://cstaff.hinsdale86.org/~kgabric/honorscalendar.html
Bioinformatics Course • Prerequisites: Genomics or experience with modeling and “algorithmic thinking” • Goals: • To understand and apply various algorithms and statistical tests for analyzing DNA, RNA and protein sequences, and DNA microarray data. • To gain practical experience with Perl, a programming language widely used in molecular biology, web design, and text processing. • Course home page: http://gcat.davidson.edu/bioinformatics/bioinf.html
Bioinformatics Topics • Determining sequences • Comparing sequences • Finding genes • Predicting structure • Comparing genomes • Inferring phylogenies • Analyzing images • Clustering gene expression patterns • Designing experiments
Image Segmentation • Locate spot (signal) pixels • Measure intensity of signal and background in each channel • Compute ratio
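Once the spot pixels are located, the last two steps might look like the sketch below. Several choices here are assumptions rather than anything stated on the slide: the spot mask is taken as given, intensity is summarized by the mean, background is subtracted from signal, and the ratio is red over green. All names and pixel values are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def channel_intensity(image, mask):
    """Mean spot intensity minus mean background intensity for one channel.

    `image` and `mask` are same-shaped lists of rows; mask entries are
    True for spot (signal) pixels and False for background pixels.
    """
    signal = [px for row, mrow in zip(image, mask)
              for px, on in zip(row, mrow) if on]
    background = [px for row, mrow in zip(image, mask)
                  for px, on in zip(row, mrow) if not on]
    return mean(signal) - mean(background)

def spot_ratio(red, green, mask):
    """Ratio of background-corrected red intensity to green intensity."""
    return channel_intensity(red, mask) / channel_intensity(green, mask)

# Hypothetical 3x3 spot: only the center pixel is signal
mask = [[False, False, False],
        [False, True, False],
        [False, False, False]]
red = [[10, 10, 10], [10, 170, 10], [10, 10, 10]]
green = [[10, 10, 10], [10, 20, 10], [10, 10, 10]]
print(spot_ratio(red, green, mask))  # 16.0
```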
Adaptive Circle Algorithm • Specify threshold % between darkest and lightest pixel • Pixels above threshold are “on”, others are “off” • Combine two binary images – if pixel is “on” in either image, it is “on” in combined image • Search for radius and center that maximize percent of “on” pixels
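The four steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the algorithm as used in the course: the exhaustive search over integer centers, the list of candidate radii, and the tie-break toward larger radii are all choices made here, and the function names are hypothetical.

```python
def binarize(image, fraction):
    """Turn pixels "on" if they exceed a cutoff set `fraction` of the way
    from the darkest to the lightest pixel value."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    cutoff = lo + fraction * (hi - lo)
    return [[px > cutoff for px in row] for row in image]

def combine(a, b):
    """A pixel is "on" in the result if it is "on" in either input."""
    return [[x or y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def best_circle(binary, radii):
    """Search for the circle with the highest fraction of "on" pixels.

    Returns (score, center_row, center_col, radius). Only circles that fit
    entirely inside the image are tried; ties go to the larger radius
    (an assumption made for this sketch).
    """
    rows, cols = len(binary), len(binary[0])
    best = None
    for r in radii:
        for cy in range(r, rows - r):
            for cx in range(r, cols - r):
                inside = [binary[y][x]
                          for y in range(cy - r, cy + r + 1)
                          for x in range(cx - r, cx + r + 1)
                          if (y - cy) ** 2 + (x - cx) ** 2 <= r * r]
                score = sum(inside) / len(inside)
                if best is None or (score, r) > (best[0], best[3]):
                    best = (score, cy, cx, r)
    return best

# Hypothetical 5x5 channels: a vertical bar in red, a horizontal bar in green
red = [[10] * 5 for _ in range(5)]
green = [[10] * 5 for _ in range(5)]
for k in (1, 2, 3):
    red[k][2] = 200
    green[2][k] = 200

spot = combine(binarize(red, 0.5), binarize(green, 0.5))
print(best_circle(spot, radii=[1, 2]))  # (1.0, 2, 2, 1)
```

Neither channel alone covers the whole spot in this toy example; combining the two binary images is what lets the radius-1 circle at the center score 100%.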
Adaptive Circle V2 (Dapple) • Compute 4-neighbor second-difference approximation to the Laplacian • Find sharply defined “upper” edge by convolving Laplacian with annular filters (From “Dapple: Improved Techniques for Finding Spots on DNA Microarrays,” UW CSE Technical Report UWTR-2000-08-05)
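The first step can be illustrated with the standard 4-neighbor stencil: the sum of the four adjacent pixels minus four times the center pixel. This sketch covers only the Laplacian; the annular filtering step is not attempted. Leaving border pixels at zero is an assumption made here, and the code is not taken from Dapple itself.

```python
def laplacian4(image):
    """4-neighbor second-difference approximation to the Laplacian.

    Interior pixels get up + down + left + right - 4 * center.
    Border pixels are left at 0 (an assumption for this sketch).
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            out[y][x] = (image[y - 1][x] + image[y + 1][x]
                         + image[y][x - 1] + image[y][x + 1]
                         - 4 * image[y][x])
    return out

# A single bright pixel gives a strongly negative response at its center
print(laplacian4([[1, 1, 1],
                  [1, 5, 1],
                  [1, 1, 1]])[1][1])  # -16
```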
Quality Clustering: QT Clust • 1. Each gene builds a supervised cluster • 2. Gene with “best” list, and genes in its list, become next cluster • 3. Remove these genes from consideration, and repeat • 4. Stop when all genes are clustered, or largest cluster is smaller than user-specified threshold
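The four steps above can be sketched as follows. The slide does not say how each gene's list is built or what makes a list "best", so this sketch assumes a diameter limit (`max_diameter`), greedy growth that adds whichever gene keeps the diameter smallest, and "best" meaning largest. The parameter names, the absolute-difference distance, and the toy data are all hypothetical.

```python
def qt_clust(points, distance, max_diameter, min_size=1):
    """Return (clusters, unclustered) as lists of indices into `points`."""
    remaining = set(range(len(points)))
    clusters = []
    while remaining:
        best = []
        # Step 1: each remaining gene builds its own candidate list.
        for seed in sorted(remaining):
            candidate, diameter = [seed], 0.0
            pool = remaining - {seed}
            while pool:
                # Pick the gene whose farthest current member is closest.
                g, d = min(((g, max(distance(points[g], points[m])
                                    for m in candidate))
                            for g in sorted(pool)), key=lambda t: t[1])
                new_diameter = max(diameter, d)
                if new_diameter > max_diameter:
                    break
                candidate.append(g)
                diameter = new_diameter
                pool.remove(g)
            # Step 2: assumed here that the "best" list is the largest one.
            if len(candidate) > len(best):
                best = candidate
        # Step 4: stop if the largest cluster is below the threshold.
        if len(best) < min_size:
            break
        clusters.append(sorted(best))
        # Step 3: remove the clustered genes and repeat.
        remaining -= set(best)
    return clusters, sorted(remaining)

# Hypothetical one-dimensional "expression values"
values = [0.0, 0.1, 0.2, 5.0, 5.1, 9.0]
print(qt_clust(values, lambda a, b: abs(a - b), max_diameter=0.5, min_size=2))
# ([[0, 1, 2], [3, 4]], [5])
```

Unlike the hierarchical method, this procedure can leave genes unclustered: the outlier at index 5 never joins a group because no cluster containing it would meet the size threshold.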
Why teach Bioinformatics? • Critical thinking • Interdisciplinary • Integrative • Modeling • Data analysis • Computational science • Discrete math • Probability and statistics • Student research opportunities