Using genomic information to understand Leishmania biology
• Leishmania braziliensis genome survey
• Functional organization of chromosomal ends
Leishmania life cycle
[Figure: life cycle alternating between the vertebrate host and the insect vector]
Leishmania: some features
• No chromosome condensation
• Generally accepted as a diploid, asexual organism
• No mutants available
• No successful antisense-RNA experiments

A friendly parasite for genome research
• 35 Mb genome, 36 size-polymorphic chromosomes
• High gene density
• Low frequency of repetitive sequences
• Transcription is polycistronic
• RNA processing by trans-splicing (no introns)
• L. major genome complete/annotated (L. infantum finished)
Leishmania & leishmaniasis
• Mucocutaneous form: L. (V.) braziliensis
• Cutaneous form: L. (L.) amazonensis, L. (L.) major, etc.
• Visceral form: L. (L.) chagasi, L. (L.) donovani
Comparative genomics: L. braziliensis × L. major
Choosing an approach
• Complete genome sequencing
• Expressed genome: ESTs, microarrays
• Genome survey sequences
GSS (genome survey sequences):
• gene discovery
• comparative genomics
L. braziliensis GSS library
• Genomic DNA
• Mechanical fragmentation
• Fragment size selection: 1-2 kb
• Genomic library in pUC18
• Sequencing of 10,848 randomly picked clones
• Library validation: empty clones, E. coli sequences, clones with small inserts
• Valid sequences
• Assembly, clusterization
• (Non) redundant data banks
L. braziliensis NR DB vs L. major genome: BLASTN
• 10,848 DNA templates (1,440 single end; 7,968 single end)
• CAP3: 6,569 assemblies (5,187 singletons)
• Total 5.15 Mb, 15% of the genome
• BLASTN (1E-7), filter off (20-50 nt, >90%): 3,957 contigs (60.2%)
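The percentages on this slide can be recomputed from the raw counts. The following is a minimal sketch, assuming the 35 Mb genome size quoted in the features slide; the constant and function names are illustrative and do not come from the original analysis.

```python
# Counts taken from the slides; names are illustrative only.
GENOME_MB = 35.0       # genome size quoted in the features slide
SURVEY_MB = 5.15       # total GSS sequence
ASSEMBLIES = 6569      # CAP3 assemblies
BLASTN_HITS = 3957     # contigs matching the L. major genome by BLASTN


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


coverage = percent(SURVEY_MB, GENOME_MB)
hit_rate = percent(BLASTN_HITS, ASSEMBLIES)

print(f"survey coverage: {coverage:.1f}% of the genome")
print(f"BLASTN hit rate: {hit_rate:.1f}% of assemblies")
```

The coverage figure on the slide is rounded to the nearest whole percent.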
L. braziliensis NonRedundant DB vs L. major genome: BLASTX
• 10,848 DNA templates (1,440 single end; 7,968 single end)
• CAP3: 6,569 assemblies (5,187 singletons)
• BLASTX (1E-12): 2,980 contigs (45.3%)
L. braziliensis NR vs L. major
• BLASTX (1E-12): 2,980 contigs (45.3%); significant matches: 2,646 contigs (88.9%)
• Left over: 3,589 contigs, searched with BLASTN (1E-7); significant matches: 1,312 (36.5%)
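The two-pass search on this slide can be sketched as a simple partition: contigs are first screened at the protein level, and only those left over are screened at the nucleotide level. This reading of the slide is an assumption, and the contig IDs and hit sets below are toy values invented for illustration, not survey data.

```python
def two_pass(contig_ids, blastx_hits, blastn_hits):
    """Partition contigs into protein-level hits, DNA-only hits and unmatched.

    blastx_hits / blastn_hits are sets of contig IDs with a significant match.
    Contigs with a BLASTX hit are never passed on to the BLASTN step.
    """
    protein = [c for c in contig_ids if c in blastx_hits]
    left_over = [c for c in contig_ids if c not in blastx_hits]
    dna_only = [c for c in left_over if c in blastn_hits]
    unmatched = [c for c in left_over if c not in blastn_hits]
    return protein, dna_only, unmatched


# Toy example (hypothetical IDs).
contigs = ["c1", "c2", "c3", "c4", "c5"]
blastx = {"c1", "c2"}
blastn = {"c2", "c4"}
protein, dna_only, unmatched = two_pass(contigs, blastx, blastn)
print(protein, dna_only, unmatched)

# Consistency check on the slide counts: assemblies minus BLASTX contigs.
left_over_count = 6569 - 2980
print(left_over_count)
```

Percentages recomputed from the slide counts may differ from the printed ones in the last digit.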
L. braziliensis GSSs anchored to L. major chromosomes 1, 2 and 3

L. major chromosome   Non-coding G+C (%)   Non-coding hits   Coding G+C (%)   Coding hits
1                     -                    -                 64.9             21
2                     49.8                 12                63.7             29
3                     56.8                 3                 66.5             31
L. braziliensis GSSs: functional classification (COG/GO)
[Figure: bar chart, scale 0 to 700. Categories: apoptosis regulator activity, transporter activity, transcription regulator activity, structural molecule activity, signal transducer activity, enzyme regulator activity, catalytic activity, binding]
Further down on comparative and functional genomics
• in vitro transposition system: Tn5
• synteny studies
• gene trapping
[Map: vector pMOD_neosat, 4533 bp. Features: mosaic ends, 'lacZ-NEO, SAT, AG (Nsi-Sma). Restriction sites: Ban II, Ecl136II, Sac I, Not I, Xba I, HincII, Sal I, Sbf I, Pst I, Sph I, HindIII]
Further down on comparative and functional genomics
[Map: Tn_neosat, 2169 bp. Features: vector ends, mosaic ends, MCS, EM-7 promoter, SMB-587, 'NEO, SAT, AG splice acceptor. Restriction sites: Ban II, Ecl136II, HincII, Sph I, Sac I, Sal I, HindIII, Not I, Sbf I, Xba I, Pst I, Cla I, Ava I, EcoRI, Acc65I, BamHI, Kas I, Sca I, Kpn I, Nar I, Nco I, Xmn I, Nae I, Dra III, Stu I, Psi I]
Tn5 in vitro transposition reaction
• Target DNA
• Insertion at the recognition site
• 9 bp gaps
• DNA repair
• Duplication of target site
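The steps above can be sketched on a string model of the top strand: the staggered cut leaves 9 nt gaps on either side of the inserted element, and filling them in duplicates the 9 bp target site on both flanks. The sequences below are toy placeholders, and the function name is hypothetical.

```python
TSD_LEN = 9  # gap length given on the slide


def insert_transposon(target, transposon, pos, tsd_len=TSD_LEN):
    """Return the repaired product of inserting `transposon` at `pos`.

    target[pos:pos + tsd_len] is the target site; after gap repair it
    appears once on each side of the transposon.
    """
    if pos < 0 or pos + tsd_len > len(target):
        raise ValueError("insertion site must leave room for the staggered cut")
    site = target[pos:pos + tsd_len]
    left = target[:pos]
    right = target[pos + tsd_len:]
    return left + site + transposon + site + right


target = "AAAAACGTACGTACTTTTT"  # toy target DNA (hypothetical)
transposon = "[Tn5]"            # placeholder for the element
product = insert_transposon(target, transposon, pos=5)
print(product)
```

The product is longer than target plus transposon by exactly the length of the duplicated site, which is the signature used to confirm a genuine transposition event in sequence data.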
L. braziliensis cosmids/genomic DNA: in vitro transposition with Tn5
[Figure labels: vector, BamHI fragment]
L. major LmjF05 × L. braziliensis 10D02: a synteny survey
• 10D02 clone: sequencing from the inserted Tn5 element
[Figure labels: 10D02 sequencing (Tn5-based), repetitive elements, LmjF05 fragment]
Further down on functional genomics: in silico identification of protein fusions
• In silico translation of the target DNA sequence (identified with an L. major RNA helicase)
[Figure labels: 3'-end, 35 nucleotides; transposon insertion site; target DNA sequence]
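The in silico translation step can be sketched as follows: trim the transposon-derived part of a read, then translate the remaining target sequence in the three forward frames to see which one gives an open reading frame. This sketch assumes the 35 nucleotides on the slide are the transposon end at the start of the read and uses the standard genetic code; the read and function names are hypothetical.

```python
# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def translate(dna):
    """Translate a DNA string codon by codon; unknown codons become 'X'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # drop a trailing partial codon
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def flank_frames(read, tn_len=35):
    """Translate the target part of a read in the three forward frames.

    tn_len is the assumed number of transposon-derived nucleotides
    at the start of the read.
    """
    target = read[tn_len:]
    return {frame: translate(target[frame:]) for frame in range(3)}


# Toy read: 35 placeholder nucleotides followed by a short target sequence.
read = "A" * 35 + "ATGGCCAAATGA"
frames = flank_frames(read)
for frame, peptide in frames.items():
    print(frame, peptide)
```

In practice the translated frames would be compared against a protein database, which is how the slide's match to an L. major RNA helicase would be obtained.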
Localization of a protein fusion in a transfectant
[Figure: antibody anti-endoplasmic reticulum]
Localization of a protein fusion in a transfectant
[Figure: antibody anti-neo (green); antibody anti-endoplasmic reticulum (red) + DAPI]
Further down on functional genomics: where are we now
Transposon-based gene trapping limitations:
  • trafficking and localization: 5', 3', insertion size
  • overexpression leading to misdirection (artifacts)
• 4 recombinant genomic clones sequenced
• identification of trapped genes in silico (genes of interest = transfected)
• Transfection of pools of transposed clones into L. braziliensis
• neo resistance selection
The organization of the chromosomal ends
• Repetitive sequences and organism-specific genes
• Sequence shuffling and genetic diversity
In protozoan parasites:
• Genes involved in antigenic variation
• Surface proteins
Characterization of three chromosomal ends of L. major
• Presence and distribution of reiterated sequences may vary
• Genes may be silenced at the very end of chromosomes
• Shuffling and potential gene truncation were also observed
[Figure: PFGE of reference strains 1, 2, 3; restriction maps of clones B1 (Chr3), B2 (Chr7) and E8 (Chr20). Map features: Tel, RNA Pol III, ORF1 to ORF7, PGKB, PGKC]
A.L. Pedrosa et al., MBP, 2001
Transcription at the very end of some L. major chromosomes
[Figure panels: chromosome 1, chromosome 20, chromosome 3; peptidyl-dipeptidase]
Peptidyl-dipeptidase chromosomal assignment
[Figure panels: peptidyl-dipeptidase; miniexon gene]
L. major chromosome 2 annotation
[Figure legend: pseudogenes; metabolism; miniexon array; hypothetical proteins; putative protein; hypothetical, conserved protein]
LmjF02_45R
• LmjF02.0690: hypothetical protein, unknown function
• LmjF02.0700: hypothetical protein, conserved, unknown function
• LmjF02.0710: ATP-dependent Clp protease, ATP-binding
• LmjF02.0720: hypothetical protein, unknown function
• LmjF02.0730: dehydrogenase/oxidoreductase-like protein
• LmjF02.0740: peptidyl-dipeptidase, putative
LmjF27

LmjF02_45R
• LmjF02.0690: hypothetical protein, unknown function
• LmjF02.0700: hypothetical protein, conserved, unknown function
• LmjF02.0710: ATP-dependent Clp protease, ATP-binding
• LmjF02.0720: hypothetical protein, unknown function
• LmjF02.0730: dehydrogenase/oxidoreductase-like protein
• LmjF02.0740: peptidyl-dipeptidase, putative
LmjF01
LmjF02_45L
• LmjF02.0010: phosphoglycan beta 1,3 galactosyltransferase
• LmjF02.0020: histone H4, putative
• LmjF02.0030: tagatose-6-phosphate kinase-like protein
• LmjF02.0040: aminopeptidase, putative
LmjF25_R

LmjF02_45L
• LmjF02.0010: phosphoglycan beta 1,3 galactosyltransferase
• LmjF02.0020: histone H4, putative
LmjF36_L

LmjF02_45L
• LmjF02.0010: phosphoglycan beta 1,3 galactosyltransferase
• LmjF02.0020: histone H4, putative
LmjF35

LmjF02_45L
• LmjF02.0010: phosphoglycan beta 1,3 galactosyltransferase
• LmjF02.0020: histone H4, putative
LmjF21_L

LmjF02_45L
• LmjF02.0010: phosphoglycan beta 1,3 galactosyltransferase
• LmjF02.0020: histone H4, putative
LmjF31_R
Eliane C. Laurentino, Jeronimo C. Ruiz, André L. Pedrosa, Loislene O. Brito, Tania Defina, Ana Lúcia Massini
Faculdade de Medicina, Universidade de São Paulo - Ribeirão Preto
Financial support by FAPESP, CNPq
Collaborators: Luiz R. O. Tosi, José Marcos Ribeiro, Peter Myler