The biology of Leishmania beyond the Genome Project
Leishmania life cycle
[Figure: life cycle stages in the human host and in the sandfly]
Leishmania: some features
• No chromosome condensation
• Generally accepted as a diploid, asexual organism
• No mutants available
A friendly parasite for genome research
• 35 Mb genome in 36 size-polymorphic chromosomes
• High GC content (~60%)
• High gene density
• Low frequency of repetitive sequences
• Transcription is polycistronic
• RNA processing by trans-splicing (no introns)
Transcription in Leishmania
[Figure: a polycistronic pre-mRNA is processed into individual mRNAs by trans-splicing plus poly(A) addition. The miniexon (39 bp) from the 5’ end of the SL-RNA (139 nt) is joined at AG sites on the pre-mRNA, and each resulting mRNA carries a 3’ poly(A) tail.]
Leishmania: some features
• No introns (virtually)
• Unclear rules for the regulation of gene expression
• Few genes detected as stage-specific messages
• No mutants available
A friendly parasite for genome & genetic manipulation
• Reverse genetics is well established
  • overexpression of genes/variety of shuttle vectors
  • gene knockout through homologous recombination
  • genetic complementation
Leishmania Genome Network: LGN
• Genome Project of Parasites
• WHO/TDR initiative launched in 1994
• International laboratories network established (LGN)
• Funding expansion 1996-98: NIH, WT, BW, EC
• http://www.sanger.ac.uk/Projects/L_major/
• http://www.genedb.org/genedb/leish/index.jsp
Leishmania Genome Project: LGN
Main activities
• Molecular karyotype: definition of syntenic groups
• A physical map for the entire L. major Friedlin genome (cosmid cLHYG library)
• Sequencing:
  • Gene discovery/expressed sequences
  • Genomic sequencing
• Functional genomics:
  • systematic gene knockout
  • microarray
  • proteomics, metabolomics
Molecular karyotype
[Figure: molecular karyotype gel. Lane labels: L. amazonensis, L. braziliensis, LEM 1317, LEM 1958, LEM 1163, LV 39, LGN CC1, L. donovani, L. major and C. Size markers: 630, 194 and 97 kb.]
Definition of syntenic groups (P. Bastien & coworkers)
Syntenic groups: a molecular karyotype for L. major Friedlin (LGN laboratories)
Leishmania Genome Project: LGN
Main activities
• Molecular karyotype: definition of syntenic groups
• A physical map for the entire L. major Friedlin genome (cosmid cLHYG library)
• Sequencing:
  • Gene discovery/expressed sequences
  • Genomic sequencing
Leishmania species and leishmaniases
• Tegumentary leishmaniasis: L. (L.) amazonensis, L. (V.) braziliensis, L. (V.) guyanensis, L. (V.) lainsoni
• Visceral leishmaniasis: L. (L.) chagasi
Comparative genomics: L. braziliensis vs. L. major
Choosing an approach
• Complete genome sequencing
• Expressed genome
  • ESTs
  • microarrays
• Genome survey sequences
GSS (genome survey sequences):
• gene discovery
• comparative genomics
Leishmania braziliensis GSS library
• Genomic DNA
• Mechanical fragmentation
• Fragment size selection (1-2 kb)
• Genomic library in pUC18
• Sequencing of ~12,000 randomly picked clones
• Sorting of reads: empty clones, E. coli sequences, clones with small inserts, valid sequences
• Assembly and clustering
• Non-redundant GSS data bank
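The read-sorting step of this workflow can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Read` record, the `is_ecoli` flag (standing in for an upstream contamination screen), and the 100 nt minimum insert length are all hypothetical, and exact-duplicate removal is used here as a simple stand-in for the assembly and clustering step, which the slides do not describe in detail.

```python
from dataclasses import dataclass


@dataclass
class Read:
    clone_id: str
    insert_seq: str          # insert after vector trimming ("" for an empty clone)
    is_ecoli: bool = False   # hypothetical flag from a contamination screen


MIN_INSERT_LEN = 100  # assumed cutoff for "small insert"; not from the slides


def classify(read):
    """Assign a read to one of the four categories named on the slide."""
    if not read.insert_seq:
        return "empty clone"
    if read.is_ecoli:
        return "E. coli sequence"
    if len(read.insert_seq) < MIN_INSERT_LEN:
        return "small insert"
    return "valid"


def valid_nonredundant(reads):
    """Keep valid reads, dropping exact duplicate inserts (first copy wins)."""
    seen = set()
    kept = []
    for read in reads:
        if classify(read) != "valid":
            continue
        seq = read.insert_seq.upper()
        if seq in seen:
            continue
        seen.add(seq)
        kept.append(read)
    return kept
```

A real pipeline would replace the duplicate check with an assembler or clustering tool; the point here is only the order of the filters.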
Leishmania braziliensis GSS: analysis in data banks
• 2,309 sequences with BlastX hits in data banks
• 9,128 sequences with no hits
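As a quick worked check of these counts, the share of sequences with a BlastX hit can be computed directly. The sketch assumes the two figures on the slide are thousands (2,309 and 9,128) and that together they make up the whole analysed set.

```python
hits = 2309      # sequences with BlastX hits
no_hits = 9128   # sequences with no hits

total = hits + no_hits
hit_pct = 100 * hits / total

print(f"{total} sequences analysed; {hit_pct:.1f}% with BlastX hits")
```

Under these assumptions, roughly one sequence in five matched something already in the data banks.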
Leishmania braziliensis GSS: ongoing analysis
• Use of different data banks for the similarity search
• Definition of functional classes
• Definition of repetitive elements and classes present in L. braziliensis
• Construction of synteny maps
Tackling mapping and sequencing information in Leishmania
• Functional organization of chromosomal ends
• Unusual Leishmania transcripts
The organization of the chromosomal ends
• Repetitive sequences and organism-specific genes
• Sequence shuffling and genetic diversity
In protozoan parasites:
• Genes involved in antigenic variation
• Surface proteins
Tools:
• A sheared genomic library, cLHYG + LV39 L. major
• The telomeric hexamer repeat from T. brucei
Characterization of three chromosomal ends
• Presence and distribution of reiterated sequences
[Figure: restriction maps (sites labelled H, G, E, B) of three telomeric (Tel) clones, B1, B2 and E8, with hybridization to digested genomic DNA and to PFGE-separated chromosomes of reference strains; chromosome labels Chr3, Chr7 and Chr20. Map annotations: "Poor in reiterated sequences" and "Reiterated and unique sequences are interspersed".]
Pedrosa et al., MBP, 2001
Sequence and annotation of one end of chr. 20
[Figure: annotated map of the chromosome end. Labels: Tel, RNA Pol III, ORF7, ORF6, ORF5, ORF4, ORF3, ORF2, ORF1, PGKC, PGKB, LST-R378, LCTAS ORF; scale bar 1.0 kb.]
• Are annotated ORFs real genes?
• Are they transcribed? When?
• Is sequence shuffling (gene truncation) a common event?
Pedrosa et al., MBP, 2001
Transfection of episomes for transcript detection at the chromosomal ends
• Telomeric clones (cL-Hyg, Tel)
• Transfectants carrying the telomeric clones
• Total RNA extraction
• Northern blot hybridization
Overexpression of episomal genes: mapping transcripts
[Figure: northern blots of total RNA from Leishmania promastigotes, comparing the non-transfected control with transfectants B1, B2 and E8, shown with the restriction maps of the telomeric clones. Other labels: B2-HH-3.2, DHFRTS, Chr3, chr7, Chr20, 4.5.]
Transcriptional silencing, sequence shuffling and gene truncation at chromosomal ends
[Figure: three northern blot panels, each comparing the control (C) with transfectants B1, B2 and E8, shown beside the restriction maps of clones B1, B2 and E8. Other labels: chr3, chr7, chr20, DHFRTS; sizes marked: 0.6, 1.5, 2.2, 2.3, 3.0, 3.7, 4.5, 6.4, 6.6.]
At the chromosome extremities: findings and perspectives
• Genes may be silenced at the very end of chromosomes
• Shuffling and potential gene truncation were also observed
Perspectives:
• further investigation of the sequence shuffling process and the repetitive sequences involved
• investigation of a transcriptional silencing process
[Figure: right end of chromosome 1 (~280 kb), with a hybridization across strains LEM1958, LEM1317, LEM1163 and LV39 plus lane C; band labels Chr.1, Chr.2 and >chr.30.]
Transcripts of unpredicted function
A tool from the Leishmania genome project: 2500 ESTs from cDNA libraries
Genes of unknown function: ~50% of the ESTs
• Do they have any unexpected features?
• Could they be functional molecules?
  • proteins
  • non-coding RNAs
>100 ESTs classified as genes of unknown function
• Re-sequenced, “re-blasted”
• GC content analysed
• Length of the transcripts (ORFs?)
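The two measurements listed for the unknown-function ESTs, GC content and ORF length, can be sketched as below. This is a minimal illustration, not the analysis actually used: it assumes the EST is given as a DNA string in the sense orientation, scans only the three forward frames, and counts an ORF only if it runs from an ATG to an in-frame stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


def longest_orf(seq):
    """Length (nt, stop codon included) of the longest ATG-to-stop ORF
    found in the three forward reading frames; 0 if there is none."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best
```

A transcript with an unusually low GC content for Leishmania, or with no ORF of meaningful length, would be flagged for the kind of follow-up described on the later slides.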
Some features of a sample of orphan genes (ESTs)
[Figure: three panels (unidentified ESTs, identified ESTs, identified proteins) plotting GC content (%) against ORF length (kb). Percentages shown for "transcript with ORF" / "transcript with no ORF", in the order they appear: 38% / 62%, 84% / 16%, 95% / 5%.]
Transcripts with odd characteristics
Possible explanations:
• polycistronic transcription is promiscuous
• high level of ‘junk’ transcripts
• regulatory RNA molecules/unusual small proteins
• how to sort them out?
ODD1: a spurious or a functional transcript?
• single-copy sequence (chr. 6)
• conserved in several species
• the transcript is 448 bases long
• GC content of 53.8%
• two internal stop codons
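Checking a transcript for internal stop codons, as reported for ODD1, amounts to scanning one reading frame and ignoring the final codon. The sketch below assumes a hypothetical helper `internal_stops` and uses a made-up toy sequence, since the ODD1 sequence itself is not given in the slides.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def internal_stops(transcript, frame=0):
    """Return (position, codon) for every stop codon in the given frame,
    excluding the last complete codon. Positions are 0-based."""
    rna = transcript.upper().replace("T", "U")
    codons = [rna[i:i + 3] for i in range(frame, len(rna) - 2, 3)]
    return [
        (frame + 3 * k, codon)
        for k, codon in enumerate(codons[:-1])
        if codon in STOP_CODONS
    ]


# Toy sequence (not ODD1): AUG UGA AAA UAG CCC UAA
toy = "AUGUGAAAAUAGCCCUAA"
print(internal_stops(toy))
```

For a candidate selenoprotein, an internal UGA would be the codon of interest, since that is the one recoded as selenocysteine.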
ODD1 transcript is processed
[Figure: northern blots of poly(A)+, total and cytoplasmic RNA from promastigotes and amastigotes; size markers from 4.4 kb down to 0.2 kb.]
• RNA is processed and transported to the cytoplasm
• Higher levels of the transcript are present in promastigotes
• Transcript is present in many Old and New World species
ODD1: a spurious or a functional transcript?
• Two internal “stop codons” conserved in several species:
  • selenoprotein (selenocysteine): a SECIS element?
  • read-through mechanism
  • tryptophan (AT-rich genomes)
  • tRNA suppressor
  • non-coding RNAs (ncRNAs)
Is ODD1 translated?
• Selenium incorporation assays did not identify ODD1 as a selenoprotein gene
• 7 selenoproteins were detected in Leishmania
• Anti-ODD1 antibody (rabbit) failed to detect a protein in Western blots
• No “overexpressed protein” was detected in ODD1 Leishmania transfectants
ODD1 transcript has a potential hairpin at the 3’ end
…that does not seem to be a SECIS…
ncRNAs involved in control of gene expression:
• short transcripts
• no poly(A) tails
• presence of hairpins
• 20-25 nucleotides complementary to one of the organism’s genes
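A potential hairpin can be spotted by looking for a short stretch followed, a few bases later, by its reverse complement. The sketch below is a deliberately naive search under stated assumptions: perfect Watson-Crick stems only (no G-U wobble, no bulges), a fixed stem length, an input containing only A, C, G and U (or T), and hypothetical default parameters. It is not a substitute for a thermodynamic folding program.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_hairpins(rna, stem_len=6, min_loop=3, max_loop=10):
    """Return (start, loop_length, left_arm, right_arm) for every perfect
    stem of stem_len bases separated by a loop of min_loop..max_loop bases."""
    rna = rna.upper().replace("T", "U")
    hits = []
    for i in range(len(rna) - 2 * stem_len - min_loop + 1):
        left = rna[i:i + stem_len]
        target = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem_len + loop
            if j + stem_len > len(rna):
                break
            if rna[j:j + stem_len] == target:
                hits.append((i, loop, left, rna[j:j + stem_len]))
    return hits


# Toy sequence (not ODD1): stem GCGCAU, loop UUCG, stem AUGCGC
print(find_hairpins("AAGCGCAUUUCGAUGCGCAA"))
```

To focus on the 3’ end, the same function can be applied to the last few dozen bases of the transcript only.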
Screening for potential ncRNA genes
• 12 odd transcripts were selected for further investigation:
  • chromosomal assignment
  • conservation among species
  • search for similarities in data banks
  • search for a transcript in pro- and amastigotes (of different species)
  • investigation of RNA secondary structure
• Four candidate ncRNAs
André L. Pedrosa, Eliane C. Laurentino, Jeronimo C. Ruiz, M. Pilar Iribar, Simone A. Antoniazi
Luiz R. O. Tosi, Faculdade de Medicina Ribeirão Preto, USP, Brasil (University of São Paulo at Ribeirão Preto)
Marla Berry, Harvard Medical School, Boston, USA
Financial support by FAPESP