Information Technology as a Catalyst in Basic Biological Research • Sudha Bhattacharya, J.N.U., New Delhi
Mining of gene Sequence Data • Pattern finding in DNA
Specific Example • The Retrotransposons in Entamoeba histolytica genome
Retrotransposons • Mobile DNA elements • Some insert in a sequence specific manner • Others are widely distributed • Can disrupt the function of genes resulting in diseases
What information can bioinformatics provide? • I. Defining the element. • II. Where is the element located in the genome? • III. Pattern finding in preinsertion sites.
I. Defining the Element • Its size • Copy number in the genome • Are all copies full length? • Are all copies functional? • To which group does this element belong (DNA transposon, LTR retrotransposon, non LTR retrotransposon)?
Figure labels: empty site; post-insertion copy (could be truncated); reconstructed consensus element • Defining the end points of the element by sequence alignment • Constructing a consensus sequence with no mutations • Type of element: deduced by BLAST search, using the sequence of the reconstructed element
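The consensus step can be illustrated with a minimal sketch. It assumes the copies have already been aligned to equal length, with "-" marking gaps, and it uses a simple majority vote per column. The function name `consensus` and the toy sequences are illustrative only and are not taken from the original analysis.

```python
from collections import Counter


def consensus(aligned_copies):
    """Return a majority-vote consensus from equal-length aligned sequences.

    Gap characters ('-') do not vote. A column made only of gaps is
    skipped, so the consensus can be shorter than the alignment.
    """
    if not aligned_copies:
        raise ValueError("need at least one sequence")
    length = len(aligned_copies[0])
    if any(len(s) != length for s in aligned_copies):
        raise ValueError("sequences must be aligned to the same length")

    out = []
    for column in zip(*aligned_copies):
        counts = Counter(base.upper() for base in column if base != "-")
        if counts:
            # most_common(1) gives the most frequent base in this column
            out.append(counts.most_common(1)[0][0])
    return "".join(out)


# Toy alignment: the second copy carries a point mutation,
# the third copy has a one-base deletion.
copies = ["ACGTAC", "ACGAAC", "A-GTAC"]
print(consensus(copies))
```

A real reconstruction would start from a proper multiple alignment and would need a rule for tied columns; this sketch simply takes the first base that reaches the highest count.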
Consensus structures of EhLINEs/SINEs Bakre Abhijeet
Genomic abundance of full-length and truncated copies of EhLINEs and EhSINEs.
II. Where is the element located in the genome? • Element Analyzer (ELAN) – a tool that searches the genome and locates all the elements.
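The slides do not describe how ELAN works internally, so the following is not its algorithm. It is only a minimal sketch of the underlying idea of locating copies of an element on both strands, assuming exact matches and an in-memory genome string. The names `reverse_complement` and `find_exact_copies` are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_exact_copies(genome, element):
    """Return sorted (start, strand) pairs for exact copies of `element`.

    `start` is a 0-based position on the forward strand. A '-' hit means
    the reverse complement of the element occurs at that position.
    """
    if not element:
        raise ValueError("element must be a non-empty sequence")
    genome = genome.upper()
    hits = []
    for strand, query in (("+", element.upper()),
                          ("-", reverse_complement(element))):
        start = genome.find(query)
        while start != -1:
            hits.append((start, strand))
            start = genome.find(query, start + 1)
    return sorted(hits)


toy_genome = "TTGCATTAAAATGCAA"
print(find_exact_copies(toy_genome, "GCATT"))
```

Exact matching misses truncated or mutated copies, which is why a practical tool would rely on similarity search (for example BLAST hits) rather than string search.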
Occurrence of genes and other elements near EhLINEs/SINEs
Genes located downstream of EhLINE 1 • From analysis of genes both upstream and downstream, it is clear that EhLINE 1 has invaded the genome widely
III. Pattern Finding • Although the element inserts in many locations, it has some preferences. What are these?
Preferred sites • The sites that are preferred by the endonuclease for nicking (GCATT) • Amongst these, the sites that have a preferred structure (figure: GCATT sites with flanking regions marked "? ?")
DNA structure criteria tested based on dinucleotide frequencies • Thymine Excess • Bendability • Propeller Twist • Stacking Energy • Free Energy • DNA Denaturation Energy • Protein induced deformability • Nucleosome positioning
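As a rough sketch of how a dinucleotide-based criterion can be turned into a profile along a sequence: count dinucleotide frequencies, then average a per-dinucleotide score over a sliding window. The score table is supplied by the caller; the `TOY_SCALE` values below are made-up placeholders, not published bendability or stacking-energy parameters, and the function names are hypothetical.

```python
from collections import Counter


def dinucleotide_frequencies(seq):
    """Relative frequency of each overlapping dinucleotide in `seq`."""
    seq = seq.upper()
    pairs = (seq[i:i + 2] for i in range(len(seq) - 1))
    counts = Counter(p for p in pairs if set(p) <= set("ACGT"))
    total = sum(counts.values())
    return {p: n / total for p, n in counts.items()} if total else {}


def window_profile(seq, scale, window=10):
    """Mean score per sliding window of `window` dinucleotide steps.

    `scale` maps a dinucleotide (e.g. "AA") to a number. Steps missing
    from `scale` are ignored; a window with no scored step yields None.
    """
    seq = seq.upper()
    steps = [scale.get(seq[i:i + 2]) for i in range(len(seq) - 1)]
    profile = []
    for start in range(len(steps) - window + 1):
        values = [v for v in steps[start:start + window] if v is not None]
        profile.append(sum(values) / len(values) if values else None)
    return profile


# Placeholder scores for illustration only (NOT real physical parameters).
TOY_SCALE = {"AA": 1.0, "AT": 2.0, "TT": 3.0}
print(dinucleotide_frequencies("AATT"))
print(window_profile("AATT", TOY_SCALE, window=2))
```

Swapping in a different table gives a profile for a different criterion, so the same scan can be reused across the properties listed on the slide, provided a suitable dinucleotide table exists for each.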
Computational analysis of preinsertion loci (figure, panels a-d)
Conclusion • EhLINEs/SINEs insert in a rigid region that can melt easily and lies 10-35 nucleotides upstream of the preferred EN (endonuclease) sequence (GCATT)
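To make the conclusion concrete, here is a small sketch that finds GCATT on the forward strand and extracts the region 10-35 nucleotides upstream of it. How the distance is measured is an assumption here: it is counted from the first base of GCATT, inclusive at both ends, giving a 26-base window. The function name `upstream_windows` is hypothetical, and reverse-strand sites are not handled.

```python
import re


def upstream_windows(genome, motif="GCATT", near=10, far=35):
    """Return (motif_start, window) for each forward-strand motif hit.

    The window covers the bases lying `near` to `far` positions upstream
    of the motif's first base (both ends inclusive). Hits too close to
    the start of the sequence for a full window are skipped.
    """
    genome = genome.upper()
    results = []
    for match in re.finditer(re.escape(motif.upper()), genome):
        pos = match.start()
        if pos < far:
            continue  # not enough upstream sequence for a full window
        results.append((pos, genome[pos - far:pos - near + 1]))
    return results


toy_genome = "ACGT" * 10 + "GCATT"
for pos, window in upstream_windows(toy_genome):
    print(pos, window, len(window))
```

The extracted windows are the natural input for structural scoring, so candidate GCATT sites can be ranked by how rigid or easily melted their upstream region appears.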
Nucleic Acids Research, 2006, Vol. 00, No. 00, 1–12, doi:10.1093/nar/gkl710 • Identification of insertion hot spots for non LTR retrotransposons: computational and biochemical application to Entamoeba histolytica • Prabhat K. Mandal3, Kamal Rawal1, Ram Ramaswamy1,2, Alok Bhattacharya1,3 and Sudha Bhattacharya* • School of Environmental Sciences, Jawaharlal Nehru University, New Mehrauli Road, New Delhi 110 067, India, 1School of Information Technology, Jawaharlal Nehru University, New Delhi 110 067, India, 2School of Physical Sciences, Jawaharlal Nehru University, New Delhi 110 067, India and 3School of Life Sciences, Jawaharlal Nehru University, New Delhi 110 067, India • Received June 26, 2006; Revised August 22, 2006; Accepted September 14, 2006