Interdisciplinary Faculty Collaboration: Creating a Bioinformatics Minor. June 5, 2013. Michael Werner, John Russo, Hongsheng Wu, David Rilett, Ali Ahrabi, Laurie Grove, Douglas Dow, Paloma Valverde. The Current & Future Vision. Making Wentworth even better.
Bioinformatics: Courses and Faculty Collaborations Douglas E. Dow, Ph.D. Department of Biomedical Engineering Department of Electrical Engineering and Technology Wentworth Institute of Technology dowd@wit.edu
What is Bioinformatics? Related Fields • Biomedical Engineering – application of engineering and science to problems in medicine and biology • Biotechnology – utilization of living systems to make useful products, typically involving analysis or manipulation of genomic information • Bioinformatics – “conceptualizing biology in terms of molecules (in the sense of physical-chemistry) and then applying "informatics" techniques (derived from disciplines such as applied math, CS, and statistics) to understand and organize the information associated with these molecules, on a large-scale.” (From http://bioinfo.mbb.yale.edu/what-is-it/; Luscombe, N. M.; Greenbaum, D.; Gerstein, M. Methods Inf Med. 2001, 40, 346-58.) [Figure labels: Computer Science & Math, Bioinformatics, Biology]
What is Bioinformatics? Applications • Bioinformatics employs a wide range of computational techniques including: • sequence and structural alignment • database design and data mining • macromolecular geometry • phylogenetic tree construction • prediction of protein structure and function • gene finding • expression data clustering (Luscombe, N. M.; Greenbaum, D.; Gerstein, M. Methods Inf Med. 2001, 40, 346-58.)
Interdisciplinary Minor in Bioinformatics • Collaboration of faculty in 3 departments: • Sciences • Computer Science & Networking • Biomedical Engineering • Computer Science started offering bioinformatics courses toward a concentration in 2006 • Biomedical Engineering started as a major in Fall 2011 • Five core chemistry and biology courses • One each of computer programming and biostatistics • Strong interest among many students in going deeper into biology and biochemistry (3 new electives) • Introduction to Medical Biotechnology • Directed Study in Biological Research • Proteins, Medicine and Disease • Minor developed during Spring 2013 • Pathway to a life sciences major? Faculty: Paloma Valverde (Chair of Sciences), Michael Werner (Computer Science), Ali Ahrabi (Biology), Hongsheng Wu (Computer Science), Laurie Grove (Chemistry), John P. Russo (Computer Science), Douglas Dow (Biomedical Eng.), David Rilett (Computer Science)
Course Requirements • Minimum of 20 credits required • Three core courses (12 credits) • BIOL 130: Cell and Molecular Biology • COMP 601: Intro to Bioinformatics • COMP 611: Intro to Biostatistics • Two electives chosen from the options below (8 credits) • COMP 602: Bioinformatics Algorithms • COMP 612: Biological Data Mining • CHEM 410: Basics of Organic and Biochemistry • CHEM 420: Proteins, Medicine and Disease • BIOL 250: Introduction to Medical Biotechnology • BIOL 406: Directed Study in Biological Research • Course descriptions and additional information can be found at: http://www.wit.edu/sciences/minors/minor-bioinformatics.html
Biol250: Intro to Medical Biotechnology Ali Ahrabi, Ph.D. Department of Sciences Wentworth Institute of Technology ahrabia@wit.edu
Many scientific disciplines contribute to biotechnology
BIOTECHNOLOGY • Biotechnology is the use of living systems and organisms to develop or make useful products, or "any technological application that uses biological systems, living organisms or derivatives thereof, to make or modify products or processes for specific use" (UN Convention on Biological Diversity) • How medical diagnosis has changed as a result of biotechnology • How data from the Human Genome Project will be used to diagnose and treat human medical disease conditions. Types of Biotechnology • Microbial Biotechnology • Agricultural Biotechnology • Animal Biotechnology • Forensic Biotechnology • Bioremediation Biotechnology • Aquatic Biotechnology • Medical Biotechnology
Dolly (5 July 1996 – 14 February 2003): Dolly was a female domestic sheep and the first mammal to be cloned from an adult somatic cell, using the process of nuclear transfer.
OPPONENTS Battling Bioterrorism
Bioremediation Strains of the bacterium Pseudomonas used to help clean Alaskan beaches following the Exxon Valdez oil spill
Aquatic Biotechnology Casper: A transparent Zebrafish A normal Zebrafish
Forensic Biotechnology DNA Fingerprinting PCR
Human Genome Project • Gene therapy • SNPs (single-nucleotide polymorphisms, pronounced "snips") are the cause of some genetic diseases
PRESENT & FUTURE Nanotechnology Regenerative Medicine
The Biotechnology Workforce Development of Skills for Internships & Co-Ops Wentworth
Distribution of U.S. Biotechnology Companies Successful Biotech Companies Revenue of $3B to $20B
Chem420: Proteins, Medicine and Disease Laurie Grove, Ph.D. Department of Sciences Wentworth Institute of Technology grovel@wit.edu
Chem 420: Proteins, Medicine and Disease Incorporates faculty research into undergraduate curriculum • This course introduced students to: • General biochemistry including protein structure and enzyme catalysis • Basic concepts and mechanisms of disease • Development of medicines to treat diseases • Laboratory methods and computational work used in drug discovery • Students were introduced to tools used in structural bioinformatics • Structural bioinformatics – analysis and prediction of 3D structure of biological macromolecules • Molecular recognition (binding interactions) • Protein folding • Structure/function relationships • Why/How/When these tools are used in industry to design new medicines Each student applied these tools to solve a problem related to drug discovery.
Structural Bioinformatics Application: Drug Discovery • How is structural bioinformatics used? • Enzymes catalyze chemical reactions in the body by binding to a substrate and converting it to the product (substrate → product). Enzymes are the “machines” of the body. • Many medicines work by blocking substrate access to the enzyme binding site (competitive inhibition). • Structural bioinformatics can be used to detect inhibitor binding sites.
Protein mapping • Faculty research: protein mapping • Computational method used to find a binding site on a protein surface • Detects regions of the protein surface that bind small organic probe molecules with high affinity • Student projects: use protein visualization, mapping and docking tools to investigate three problems in drug discovery: • Drug resistance • Medicines versus toxins • Drug selectivity
Examples of Student Projects: Chem 420 • Student Project: Drug Selectivity (Kirsten Wilde), comparing CA II, CA IX, and CA XII • Carbonic anhydrases (CAs) play a vital role in the regulation of pH and fluid balance. • 14 different types (I – XIV) are found in different tissues throughout the body, and each is related to a variety of different diseases. • Because their structures are similar, targeting just one type of CA for disease treatment is difficult. Poor selectivity often leads to side effects. (Alterio, V.; Di Fiore, A.; D’Ambrosio, K.; Supuran, C. T.; De Simone, G. Chem. Rev. 2012, 112, 4421-4424.)
Examples of Student Projects: Chem 420 • Project Goal: This research compared structures of CA II, CA IX, and CA XII to identify differences in the binding sites that could be exploited to improve drug selectivity. • Mapping results: The CA XII site is smaller than the others and has a different shape. CA II: 42 probe molecules CA XII: 29 probe molecules
Course Outcomes and Beyond • A goal of this course was to introduce techniques used in drug discovery, with a focus on structural bioinformatics and how it can be used to solve problems. • This course can be easily modified depending on the student population: more scripting for CS students, more applications for a general population. • This course is a great starting point for undergraduate research projects.
Biol406: Directed Studies in Biological Research Paloma Valverde, Ph.D. Department of Sciences Wentworth Institute of Technology valverdep@wit.edu
Directed Studies in Biological Research (Biol406) • The main goal of this course was to integrate the real scientific research process early into the undergraduate curriculum to improve the technical and soft skills of the students. • This course is part of the Minor in Biology and the Minor in Interdisciplinary Bioinformatics offered by the Sciences Department (WIT). • It was offered in Fall 2012 and Spring 2013. Students were sophomores from the biomedical engineering and civil engineering majors.
Biol406: Project-Based with Emphasis on Research Skill Development • Traditional format of science courses (weekly unrelated labs): three 50-minute face-to-face lectures plus one 2-hour face-to-face lab in the bio lab • Biol406, Directed Studies in Biological Research: modular course with integrated lectures, lab projects and research skill development, in three 2-hour blocks
Biol406: Modules in First Half of the Semester • Module 1 (wk 1), Safety and Bioethics: safety guidelines in a new biology lab; introduction to bioethics • Module 2 (wk 2-4), Bioinstrumentation & Experimental Design: learning how to set up a biology lab and prepare solutions; practicing with commonly used bioinstrumentation (to make measurements of DNA and protein, to detect fluorescence); testing, designing or improving labs used in other biology courses • Module 3 (wk 4-7), Experimental Design with Model Organisms, Bioinformatics, Molecular Biology: controlled experiments with small animals; bioinformatics projects; restriction enzyme analysis, DNA electrophoresis and experiments with bacteria
Biol406: Second Half of the Semester is More Student-Driven • Module 4 (wk 7-11) • Module 5 and Final Exam Week (wk 11-15) 1) Projects on critical reading of scientific literature 2) Experimental design with small animals (planarians) and/or based on their critical reading of the scientific literature 3) Bioinstrumentation optimization (new equipment or software) 4) Learning gains in advanced technical skills and soft skills to enhance their resume for internships/co-ops 5) Encouraged to submit their work for publication
Example of Students’ Projects (Fall 2012) • Use of bioinformatics databases and algorithms to identify the planarian homologue of the WW45 (Salvador) gene that I discovered in 2000 (Valverde, 2000) • Construction of cDNA from RNA isolated from heads and tails of regenerating planarians for expression analysis of WW45 (using semiquantitative RT-PCR) [Figure panels: RNA, cDNA; putative planarian WW45 cDNA and protein sequence; RT-PCR/expression analysis of WW45 in regenerating planarians. Figures from Valverde, 2000 and Gentile et al., 2011]
Example of Students’ Projects from Spring 2013 • Uses of fluorescence in biomedical research applications: properties of Green Fluorescent Protein (GFP); use of SYBR Green for DNA detection after PCR (polymerase chain reaction); GFP protein expression and purification • Exploration of synthetic biology (the design and construction of new biological parts/devices/systems, and the re-design of existing, natural biological systems for useful purposes): growing banana-scented bacteria; bacterial constructs expressing ATF1 enzyme, generously provided by the BioBuilder Educational Foundation; generation of banana odor by growing bacteria expressing ATF1 enzyme • Future students’ projects? Design and test other biological devices (Gene Designer + BioBricks)
Advanced Biotechnology Skills Animal Cell Culture Techniques and Effective Team-Work
The Role of Bioinformatics Algorithms Michael Werner Professor Computer Science and Networking Wentworth Institute of Technology wernerm@wit.edu
Supporting Biologists with Algorithms • Define the problem biologists are trying to solve • Start with a standard algorithm • Refine it to fit the problem • Code the algorithm in C++, Java or C# • Test for correctness • Benchmark it with inputs of increasing size • Optimize the algorithm • Parallelize it? [Figure: run-time curve of an algorithm whose run-time varies as the square of the input size. Tweaking the algorithm may lower the curve, but it remains a parabola.]
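The benchmarking point above can be illustrated with a small sketch. It counts loop steps instead of measuring wall-clock time so the result is repeatable. The two functions are hypothetical stand-ins, not code from the course: one visits every ordered pair of inputs, the other is a "tweaked" version that visits each unordered pair once.

```python
def count_all_pairs(n):
    """Visit every ordered pair of n items: n * n steps."""
    steps = 0
    for i in range(n):
        for j in range(n):
            steps += 1
    return steps


def count_unordered_pairs(n):
    """Tweaked version: visit each unordered pair once, n * (n - 1) / 2 steps."""
    steps = 0
    for i in range(n):
        for j in range(i + 1, n):
            steps += 1
    return steps


def growth_table(sizes):
    """Return (n, naive steps, tweaked steps) for each input size."""
    return [(n, count_all_pairs(n), count_unordered_pairs(n)) for n in sizes]


if __name__ == "__main__":
    for n, naive, tweaked in growth_table([100, 200, 400, 800]):
        print(f"n={n:4d}  naive={naive:7d}  tweaked={tweaked:7d}")
```

The tweak roughly halves the work, yet doubling the input still multiplies the step count by about four in both versions, which is the sense in which the curve stays a parabola.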
Suffix Trees for Finding Motifs • Example text: BANANA$ • We came up with a parallelized version. Instead of a single suffix tree, we used a forest of small trees, built by dividing the text into segments that overlap by the length of the longest pattern. The work of building and searching the trees can be done in parallel by different threads.
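The segmentation idea can be sketched as follows. This is only an illustration under stated assumptions: it replaces the per-segment suffix tree with Python's built-in `str.find`, and the function names and the `segment_len` parameter are hypothetical. What it does show is why the overlap matters: a match that starts near the end of one segment is still fully contained in that segment.

```python
from concurrent.futures import ThreadPoolExecutor


def split_with_overlap(text, segment_len, overlap):
    """Return (offset, segment) pairs; each segment runs `overlap` characters into the next."""
    return [
        (start, text[start:start + segment_len + overlap])
        for start in range(0, len(text), segment_len)
    ]


def find_in_segment(offset, segment, patterns):
    """Find all pattern occurrences in one segment, reported as positions in the full text."""
    hits = set()
    for pattern in patterns:
        pos = segment.find(pattern)
        while pos != -1:
            hits.add((pattern, offset + pos))
            pos = segment.find(pattern, pos + 1)
    return hits


def parallel_search(text, patterns, segment_len=4, workers=4):
    """Search every segment in its own task and merge the (pattern, position) hits."""
    overlap = max(len(p) for p in patterns)  # overlap by the longest pattern length
    segments = split_with_overlap(text, segment_len, overlap)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda s: find_in_segment(s[0], s[1], patterns), segments)
        hits = set()
        for segment_hits in results:
            hits |= segment_hits  # the set removes hits found twice in overlap regions
    return sorted(hits)


if __name__ == "__main__":
    print(parallel_search("BANANA$", ["ANA", "NA"]))
```

In CPython, threads mainly demonstrate the structure here; a real speed-up for CPU-bound searching would more likely come from processes or from a language with native threads such as those named on the previous slide.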
Sequence alignment • The dynamic programming approach is O(|S1|·|S2|). We looked at ways to reduce this, including divide-and-conquer approaches and applying heuristics to eliminate computing unlikely paths.
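A minimal sketch of the dynamic programming approach, assuming a global (Needleman-Wunsch style) alignment with illustrative scores of +1 for a match and -1 for a mismatch or gap; the slides do not state which scoring scheme the course used. The two nested loops fill one table cell per pair of positions, which is where the O(|S1|·|S2|) cost comes from.

```python
def global_alignment_score(s1, s2, match=1, mismatch=-1, gap=-1):
    """Return the best global alignment score of s1 and s2.

    Only two rows of the table are kept, so memory is O(|s2|)
    while the running time stays O(|s1| * |s2|).
    """
    # Row 0: aligning the empty prefix of s1 against prefixes of s2 costs only gaps.
    prev = [j * gap for j in range(len(s2) + 1)]
    for i in range(1, len(s1) + 1):
        curr = [i * gap]  # column 0: prefix of s1 against the empty prefix of s2
        for j in range(1, len(s2) + 1):
            diag = prev[j - 1] + (match if s1[i - 1] == s2[j - 1] else mismatch)
            up = prev[j] + gap        # gap in s2
            left = curr[j - 1] + gap  # gap in s1
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("GCATGCG", "GATTACA"))
```

Recovering the alignment itself, and not just its score, would require keeping the full table or using a divide-and-conquer scheme of the kind the slide mentions.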
Hierarchical clustering • This is a DNA heat map. The rows represent gene expression as found in a blood sample. The columns represent samples from subjects who had been diagnosed with prostate cancer and subjects free of it. Hierarchical clustering was performed on the samples to find predictive patterns of gene expression. Columns were rearranged to place similar samples next to each other. The dendrogram shows 4 clusters. (Source: Jonathan R. Pollack, "A Perspective on DNA Microarrays in Pathology Research and Practice".)
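The clustering step can be sketched in plain Python. The assumptions here are mine, not the slide's: samples are compared by Euclidean distance, clusters are merged by average linkage, and the expression values in the usage example are made up for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_distance(c1, c2, profiles):
    """Average linkage: mean distance over all pairs of samples drawn from the two clusters."""
    total = sum(euclidean(profiles[i], profiles[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def hierarchical_clusters(profiles, k):
    """Merge the two closest clusters repeatedly until only k clusters remain.

    Returns a list of clusters, each a sorted list of sample indices.
    """
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = cluster_distance(clusters[a], clusters[b], profiles)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    # Hypothetical expression levels of two genes in four samples.
    samples = [[1.0, 1.1], [0.9, 1.0], [5.0, 5.2], [5.1, 4.9]]
    print(hierarchical_clusters(samples, 2))
```

Recording the order of the merges, which this sketch discards, is what would produce the dendrogram and the column ordering described on the slide.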
Hidden Markov models • We observe whether the subject walks, shops or cleans. We hypothesize whether it is rainy or sunny.
Trans-membrane receptors • Odorant receptors are proteins that enable our sense of smell. They cross in and out of the cell membrane. Segments (regions) that lie in watery solutions either outside or inside the cell are hydrophilic (water loving); regions lying in the cell membrane are hydrophobic (water hating). The receptors sense odor molecules outside the cell and send signals to the inside of the cell. Humans have about 400 odorant receptors recognizing 10,000 different odors. Dogs have more, and human ancestors had more.
State Path and Viterbi Algorithm The Viterbi algorithm finds the most likely state path given the observed emissions.
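A minimal sketch of the Viterbi algorithm, using the rainy/sunny weather model with walk, shop and clean as the observed emissions. The start, transition and emission probabilities below are illustrative assumptions and do not come from the slides.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most likely state path, its probability) for the observed emissions."""
    # best[t][s]: probability of the best path that ends in state s at time t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prob, prev = max((best[t - 1][r] * trans_p[r][s], r) for r in states)
            best[t][s] = prob * emit_p[s][observations[t]]
            back[t][s] = prev
    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Assumed model parameters, chosen only for illustration.
STATES = ["Rainy", "Sunny"]
START_P = {"Rainy": 0.6, "Sunny": 0.4}
TRANS_P = {
    "Rainy": {"Rainy": 0.7, "Sunny": 0.3},
    "Sunny": {"Rainy": 0.4, "Sunny": 0.6},
}
EMIT_P = {
    "Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}

if __name__ == "__main__":
    path, prob = viterbi(["walk", "shop", "clean"], STATES, START_P, TRANS_P, EMIT_P)
    print(path, prob)
```

With these assumed numbers, the observations walk, shop, clean give the state path Sunny, Rainy, Rainy. For long sequences such as proteins, an implementation would normally work with log probabilities to avoid numerical underflow.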
BIOLOGICAL DATA MINING John P. Russo Associate Professor Department of Computer Science and Networking Wentworth Institute of Technology
Overview • Follow-on to other Bioinformatics courses in curriculum • Required Background • Database Management Systems • Introduction to Bioinformatics • Probability and Statistics for Engineers • (Optional) Biostatistics
Course Culture • Three 50-minute lectures per week • One two-hour lab per week • Additional homework assignments • Semester-long team project • Textbook: • Witten, Frank and Hall: Data Mining: Practical Machine Learning Tools and Techniques, 3rd Edition. Morgan Kaufmann. ISBN: 978-0-12-374856-0 • Additional selected readings