Overview of Bioinformatics

By Dr. C. Amruthavalli, HOD of Bioinformatics, CIST, University of Mysore

• Bioinformatics is the field of science in which molecular biology, statistics, computer science, and information technology merge into a single discipline. • Bioinformatics is the science of managing and analyzing molecular biology data using advanced computing techniques. • Bioinformatics is the computer-assisted data management discipline that helps us gather, store, analyze, integrate, and efficiently represent biological and genetic information (data). • Bioinformatics is the electronic infrastructure of molecular biology.

Significance of the protein folding problem [Figure: sequence → structure → function. A sequence (VLSEGEWQLVLV . . .) folds into a 3D structure to perform a function; the figure's example is labeled O2.]

Software of Bioinformatics • Many bioinformatics tools are available over the Internet free of charge to whoever wishes to use them. • There are also many commercial software packages, used by researchers who can afford them. • The number of software products is growing constantly, so it is impossible to list them all: software developers working in the life sciences (and life scientists with software development talents) are constantly updating existing applications and producing useful new ones. • Development and implementation of tools that enable efficient access to and management of different types of information, such as various databases and integrated mapping information.

Bioinformatics is typically associated with massive databases of gene and protein sequence and structure/function information. • Newly discovered sequences, structures, or protein/gene functions are searched (compared) against what is already known (gathered) and deposited into the databases. • These searches are done by remote computer access using various bioinformatics tools. • Analysis and interpretation of various types of biological data, including nucleotide and amino acid sequences, protein domains, and protein structures. • Development of new algorithms and statistics with which to assess biological information, such as relationships among members of large data sets.

What type of information do we deal with in bioinformatics? • DNA (Genome) • RNA (Transcriptome) • Protein (Proteome) • Sequence • Structure • Evolution • Pathways • Interactions • Mutations
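The DNA, RNA, and protein levels listed above are linked by transcription and translation. As a minimal sketch, assuming the standard genetic code and a coding-strand sequence made only of A, C, G, and T (the names `transcribe` and `translate` are illustrative, not from any particular library), the flow from DNA to protein sequence might look like this:

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
# '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(dna: str) -> str:
    """Return the mRNA matching the coding DNA strand (T replaced by U)."""
    return dna.upper().replace("T", "U")


def translate(dna: str) -> str:
    """Translate a coding DNA sequence codon by codon, stopping at a stop codon.

    Assumption: the sequence starts in frame and contains only A, C, G, T;
    any other character raises KeyError.
    """
    dna = dna.upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)
```

For example, `translate("ATGGTGCTGTCTGAAGGTGAATGGTAA")` reads eight codons before the stop codon and returns `"MVLSEGEW"`.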

DNA • Simple Sequence Analysis • Database searching • Pairwise analysis • Regulatory Regions • Gene Finding • Whole Genome Annotations • Comparative Genomics (analyses between species and strains)
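Pairwise analysis, one of the items above, usually means aligning two sequences and scoring the result. The following is a minimal sketch of the Needleman-Wunsch global alignment score, assuming a simple scoring scheme chosen here for illustration (match +1, mismatch -1, linear gap penalty -2); real tools use substitution matrices and more refined gap models.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Score of the best global alignment of a and b (Needleman-Wunsch).

    Only two rows of the dynamic-programming table are kept, so the
    function returns the score but not the alignment itself.
    """
    # Row 0: aligning the empty prefix of a with b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] with the empty prefix of b costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diagonal, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]
```

With these illustrative parameters, two identical sequences of length n score n, and each unavoidable gap lowers the score by 2.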

RNA • Splice Variants • Tissue specific expression • Structure • Single gene analysis (various cloning techniques) • Experimental data involving thousands of genes simultaneously • DNA Chips, MicroArray, and Expression Array Analyses
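Expression experiments such as those listed above are commonly summarized by comparing each gene's signal between two conditions. A minimal sketch, assuming expression values are given as non-negative numbers per gene; the function name, the gene names in the example, and the pseudocount of 1 are illustrative assumptions, not a prescribed method:

```python
import math


def log2_fold_changes(control: dict, treated: dict,
                      pseudocount: float = 1.0) -> dict:
    """Log2 ratio of treated to control expression for genes in both samples.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for genes with no measured expression.
    """
    return {
        gene: math.log2((treated[gene] + pseudocount)
                        / (control[gene] + pseudocount))
        for gene in control
        if gene in treated
    }


# Hypothetical measurements for two made-up genes.
changes = log2_fold_changes({"geneA": 99, "geneB": 7},
                            {"geneA": 399, "geneB": 1})
```

A positive value indicates higher expression in the treated sample, a negative value lower expression.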

Protein • Proteome of an Organism • 2D gels • Mass Spec • 2D Structure • 3D Structure

Sequence Analysis Software • What is the information contained in a biological sequence? • How can we analyze it to gain knowledge? • Does it contain any functional clues?
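Some of the simplest information in a sequence is its composition. The sketch below, assuming DNA sequences written with the letters A, C, G, and T (function names are illustrative), computes GC content, the reverse complement, and k-mer counts:

```python
from collections import Counter

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(dna: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    dna = dna.upper()
    if not dna:
        return 0.0
    return (dna.count("G") + dna.count("C")) / len(dna)


def reverse_complement(dna: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping substring of length k."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
```

Unusual composition or over-represented k-mers can serve as first functional clues, although they are only a starting point for further analysis.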

Biological problems that computers can help with: • I cloned a gene. Is it a known gene? • Does the sequence match? Is the sequence any good? • Does it look like anything else in the database? • Which family does it belong to? How can I find more family members? • I have an orphan receptor. How can I find its ligand? • The gene I'm interested in was found in another organism, but not mine. How can I look for it? • I have linkage to a specific region on chromosome x. How do I find genes in that region?
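The question "does it look like anything else in the database?" can be illustrated with a toy search. This sketch ranks database entries by the Jaccard similarity of their k-mer sets with the query. It is a deliberately simplified stand-in, not how BLAST or FASTA work, and the entry names and the second sequence in the example are made up.

```python
def kmers(seq: str, k: int) -> set:
    """Set of all overlapping substrings of length k."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_database_hits(query: str, database: dict, k: int = 3) -> list:
    """Return (name, score) pairs sorted from most to least similar.

    The score is the Jaccard similarity of k-mer sets: shared k-mers
    divided by all distinct k-mers in either sequence.
    """
    query_kmers = kmers(query, k)
    hits = []
    for name, seq in database.items():
        entry_kmers = kmers(seq, k)
        union = query_kmers | entry_kmers
        score = len(query_kmers & entry_kmers) / len(union) if union else 0.0
        hits.append((name, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical two-entry "database" of protein fragments.
toy_database = {
    "entry1": "VLSEGEWQLVLV",
    "entry2": "MKTAYIAKQRQISFVK",
}
hits = rank_database_hits("LSEGEWQLV", toy_database)
```

Here `entry1` ranks first because it shares seven of the ten distinct 3-mers, while `entry2` shares none with the query.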

The Potential of Bioinformatics The potential of bioinformatics to identify useful genes, leading to new gene products, drug discovery, and drug development, has caused a paradigm shift in biology and biotechnology: these fields are becoming more and more computationally intensive. In the new paradigm, now emerging, all the genes will be known "in the sense of being resident in databases available electronically", and the starting point of a biological investigation will be theoretical: a scientist will begin with a theoretical conjecture and only then turn to experiment to follow or test the hypothesis. With a much deeper understanding of biological processes at the molecular level, bioinformatics scientists have developed new techniques to analyse genes on an industrial scale, resulting in a new area of science known as 'Genomics'.

Bioinformatics - Industry Overview The bioinformatics industry has grown to keep up with the information explosion, growing at 25-50% a year. In 2000, the US market research company Oscar Gruss estimated that the value of the bioinformatics industry would touch $5 billion. It now seems likely that this figure will be reached within the coming year. Demand for individuals capable of doing bioinformatics is soaring: industry's demand for scientists with bioinformatics skills far exceeds the supply of qualified specialists in the field. Therefore, companies are developing methods of spotting potential bioinformatics experts and then training them on the job.

Assigning fold and function utilizing similarity to experimentally characterized proteins: • Sequence similarity: BLAST and others • Beyond sequence similarity: matching sequences and shapes (threading)

Aims of Bioinformatics: • The aims of bioinformatics are basically three-fold. • First, to organize data in a way that allows researchers to access existing information and to submit new entries as they are produced. While data creation is an essential task, the information stored in these databases is useless unless analyzed. Thus the purpose of bioinformatics extends well beyond mere volume control. • Second, to develop tools and resources that help in the analysis of data. For example, having sequenced a particular protein, it is of interest to compare it with previously characterized sequences. This requires more than just a straightforward database search: programs such as FASTA and PSI-BLAST must consider what constitutes a biologically significant resemblance. Development of such resources requires extensive knowledge of computational theory, as well as a thorough understanding of biology. • Third, to use these tools to analyze individual systems in detail and, frequently, to compare them with a few related systems.

Analysis activity in Bioinformatics • Development of methods to predict the structure and/or function of newly discovered proteins and structural RNA sequences. • Clustering protein sequences into families of related sequences and the development of protein models. • Aligning similar proteins and generating phylogenetic trees to examine evolutionary relationships
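The clustering step above can be sketched in a few lines. This is a simplified illustration, assuming the sequences have already been aligned to the same length; the greedy strategy, the 70% identity threshold, and the function names are assumptions made for the example, and building a phylogenetic tree would require a further method that is not shown.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions in two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    same = sum(x == y for x, y in zip(a, b))
    return 100.0 * same / len(a)


def cluster_into_families(seqs: dict, threshold: float = 70.0) -> list:
    """Greedy clustering of sequences into families.

    Each sequence joins the first family whose representative (its first
    member) is at least `threshold` percent identical; otherwise it
    starts a new family.
    """
    families = []  # each family is a list of names, representative first
    for name, seq in seqs.items():
        for family in families:
            if percent_identity(seq, seqs[family[0]]) >= threshold:
                family.append(name)
                break
        else:
            families.append([name])
    return families


# Hypothetical aligned sequences.
families = cluster_into_families({
    "a": "ACGTACGTAC",
    "b": "ACGTACGTAA",
    "c": "TTTTGGGGCC",
})
```

In this example `a` and `b` are 90% identical and form one family, while `c` is only 30% identical to `a` and starts its own. The result of greedy clustering depends on input order, which is one reason real pipelines use more robust methods.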

Sub-disciplines within bioinformatics There are three important sub-disciplines within bioinformatics involving computational biology: • The development of new algorithms and statistics with which to assess relationships among members of large data sets. • The analysis and interpretation of various types of data, including nucleotide and amino acid sequences, protein domains, and protein structures. • The development and implementation of tools that enable efficient access and management of different types of information.

Medical applications: • Understand life processes in healthy and diseased states. • Genetic disease (SNPs) Pharmaceutical and biotech industry: • Find (develop) new and better drugs. • Gene-based or structure-based drug design Agricultural applications: • Disease- and drought-resistant plants • Higher-yield crops

Careers in Bioinformatics Genomics: • Genome sequencing of – Bacteria, viruses – Animals – Plants • Comparative genomics • Annotation and Mapping • Gene Discovery Functional Genomics (Gene Expression and Regulation): • Control Regions – Switches – Circuits – Bypass – Feedback loops • Environmental Effects • Diseased States • Chemical Consequences

Careers in Bioinformatics Pharmacogenomics: • SNPs – Regional, ethnic variations – Inheritance patterns – Radiological/ecological modifications • Therapeutic target recognition • Correlation of drug and expression effects • Pathway Effects Proteomics: • Protein Profiling – Alternate splice variants – Orphan genes – Cryptic introns • Gene Therapy

Careers in Bioinformatics Structural Genomics: • Experimental Protein structures – Apo state – Holo state – Structural modifications • Membrane Proteins • Homology Modelling • Comparative Modelling Drug and Vaccine Design: • Screening Natural Products – Plants – Fungi – Bacteria • Chemicals • In silico modifications of ligands • Vaccine design and delivery

Conclusion With the confluence of biology and computer science, computer applications in molecular biology are drawing greater attention from life science researchers and scientists these days. As it becomes imperative for biologists to seek the help of information technology professionals to meet the ever-growing computational requirements of a host of exciting and pressing biological problems, the synergy between modern biology and computer science is set to blossom in the days to come. Thus the research scope for mathematical techniques and algorithms, coupled with programming languages and software development and deployment tools, is set to get a real boost. In addition, information technologies such as databases, middleware, graphical user interface (GUI) design, distributed object computing, storage area networks (SAN), data compression, networking and communication, and remote management are all set to play a critical role in advancing the goals for which the field of bioinformatics came into existence.

The genomics revolution has transformed the landscape of drug discovery. DNA and protein sequences are yielding a host of new therapeutic targets and an enormous amount of associated information. The challenges in the genomics arena are to securely and reliably manage and analyze huge quantities of sequence and associated data, and to extract useful information from that data. Detailed study of the three-dimensional molecular structure of DNA, proteins, and other biological compounds can be critical to understanding their function and to designing therapeutics to control their effects. Modeling, simulation, and other computational techniques to predict and analyze this structure are essential components of today's discovery research.

Thank you
