Department of Plant Biology and Biotechnology, VKR Research Center Pro-Active Plants, Faculty of Life Sciences, University of Copenhagen
Genes involved in metabolism of cyanogenic glucosides in Zygaena filipendulae determined by 454 pyrosequencing

Mika Zagrobelny^a, Karsten Scheibye-Alsing^a,b, Niels Bjerg Jensen^a, Birger Lindberg Møller^a, Jan Gorodkin^b, Søren Bak^a

a: Plant Biochemistry Laboratory, Department of Plant Biology, University of Copenhagen, 40 Thorvaldsensvej, DK-1871 Frederiksberg C, Denmark
The VKR Research Centre "Proactive Plants", University of Copenhagen, 40 Thorvaldsensvej, DK-1871 Frederiksberg C, Denmark
b: Department of Basic Animal and Veterinary Sciences/Genetics & Bioinformatics, University of Copenhagen, 3 Grønnegårdsvej, DK-1871 Frederiksberg C, Denmark
Center for Applied Bioinformatics, University of Copenhagen, 40 Thorvaldsensvej, DK-1871 Frederiksberg C, Denmark

Introduction
Zygaena filipendulae is a brightly colored diurnal moth that can both biosynthesize the cyanogenic glucosides (CNglcs) linamarin and lotaustralin and sequester them from its food plant, Lotus corniculatus. The CNglcs are toxic and act as defense compounds, and they also carry out other important functions in the life cycle of Z. filipendulae. In insects, the biosynthetic pathway of CNglcs is unknown, but in plants the pathway has been resolved in Sorghum bicolor. Sorghum contains the CNglc dhurrin, and its biosynthesis involves the P450s CYP79A1 and CYP71E1 and the glycosyltransferase UGT85B1. The CNglcs are bio-activated by degradation involving a β-glucosidase and an α-hydroxynitrile lyase (Figure 1).

UGTs
Forty-one putative UGTs were identified in the Z. filipendulae transcriptome, three of which are full length (Figure 4).

Figure 4. Neighbor-joining bootstrap tree of full-length UGT genes. Genes from Z. filipendulae are marked in red and genes from plants in green.
Ae: Aedes aegypti, AM: Antheraea mylitta, At: Arabidopsis thaliana, BM: Bombyx mori, Ce: Caenorhabditis elegans, Cq: Culex quinquefasciatus, Hs: Homo sapiens, SF: Spodoptera frugiperda, Tc: Tribolium castaneum.

[Figure 1 scheme. Biosynthesis: isoleucine → lotaustralin and valine → linamarin; the plant enzymes CYP79, CYP71 and UGT85B are indicated. Bio-activation: β-glucosidase + α-hydroxynitrile lyase release Glc and HCN, yielding 2-butanone from lotaustralin and acetone from linamarin.]
Figure 1. Metabolism of cyanogenic glucosides. Enzymes are shown in green.

To elucidate the pathways of CNglc metabolism in insects, we had the transcriptome of Z. filipendulae feeding on acyanogenic L. corniculatus plants sequenced.

Results
We received 320,000 reads, which were assembled into 30,000 contigs and 40,000 singletons (Figure 2).

[Figure 2, panels A to C; axes in bp]
Figure 2. Distribution of sequence lengths and cluster sizes. A: The lengths of individual reads. B: The lengths of contigs. C: Cluster sizes.

All sequences in our dataset similar to P450s, glycosyltransferases (UGTs), α-hydroxynitrile lyases (HNLs) and β-glucosidases were found by BLAST searches, aligned with CLUSTAL W in MEGA, and refined by hand.

P450s
Approximately 120 putative P450s were identified in the Z. filipendulae transcriptome. Five of these are full length, and seven more were extended by RACE PCR (Figure 3).

Figure 3. Neighbor-joining bootstrap tree of full-length P450 genes. Genes from Z. filipendulae are marked in red, with the original full-length genes encircled in red. Genes from plants are green. Ag: Anopheles gambiae, Bm: Bombyx mori, Dm: Drosophila melanogaster, Lj: Lotus japonicus, Sb: Sorghum bicolor.

Phylogenetic analyses
We tested full-length sequences from the four gene families for selection in PAML4.1.
The glucocerebrosidases and HNLs were not tested, since there were too few full-length sequences in each group. We tested Models 0, 1, 2, 3, 5, 7 and 8 from codeml on our sequences with likelihood ratio tests. The ω-values are low, which signifies purifying selection, but with less constraint in some areas for P450s and β-glucosidases, since they have three classes of ω. UGTs have larger areas where they seem to undergo neutral evolution. No sites with positive selection were detected in any of the four gene families.

HNLs
HNLs are divided into four groups, three of which are represented in Z. filipendulae: FAD HNLs, serine carboxypeptidase-related HNLs, and non-FAD HNLs. Fifty-two putative HNLs were identified in the Z. filipendulae transcriptome, but none of them is full length. Six are longer than 1000 nucleotides, two from each HNL group.

β-Glucosidases
Seventeen putative β-glucosidases were identified in the Z. filipendulae transcriptome, three of which are full length. However, protein sequences obtained earlier from a Z. filipendulae β-glucosidase were similar to glucocerebrosidases. Our cyanogenic β-glucosidase could therefore be a glucocerebrosidase. Four glucocerebrosidases were identified in the Z. filipendulae transcriptome, one of which is full length.

Figure 5. Neighbor-joining bootstrap tree of full-length β-glucosidase and glucocerebrosidase genes. Genes from Z. filipendulae are marked in red and plant genes in green. AM: Antheraea mylitta, Am: Apis mellifera, BM: Bombyx mori, HE: Heliconius erato, Hs: Homo sapiens, Lj: Lotus japonicus, Tc: Tribolium castaneum.

Perspectives
Based on analysis of the Z. filipendulae transcriptome generated by 454 pyrosequencing, gene candidates for biosynthesis and bioactivation of CNglcs were identified. The number of Z. filipendulae sequences within the examined gene families closely corresponds to the number of genes in the same families in other sequenced insects, signifying the good coverage achieved with 454 pyrosequencing.
Full-length sequences of the gene candidates are being generated by RACE PCR. They will be heterologously expressed, and the recombinant enzymes will be biochemically characterized to determine their biological function.
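
Supplementary sketch: likelihood ratio tests between codeml site models
The likelihood ratio tests mentioned under Phylogenetic analyses compare the log-likelihoods of nested codeml models. The Python sketch below shows how such a test could be computed; it is not the authors' script. The poster lists the models tested but not which pairs were compared, so the pairs and degrees of freedom in MODEL_PAIRS_DF are assumptions based on the comparisons commonly made with codeml (M0 vs M3 with three site classes, M1 vs M2, M7 vs M8). The function names and the log-likelihood values in the usage example are hypothetical.

```python
import math

# ASSUMPTION: commonly used nested codeml site-model pairs and their
# degrees of freedom; the poster does not state which pairs were compared.
MODEL_PAIRS_DF = {
    ("M0", "M3"): 4,  # M3 with three omega classes adds four free parameters
    ("M1", "M2"): 2,
    ("M7", "M8"): 2,
}


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square distribution.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!,
    which is valid only for positive even degrees of freedom.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (test statistic, p-value) for two nested models."""
    # The statistic is twice the log-likelihood difference, floored at zero
    # in case the alternative model converged to a slightly worse optimum.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, chi2_sf_even_df(stat, df)


if __name__ == "__main__":
    # HYPOTHETICAL log-likelihoods, for illustration only (not poster data).
    example_lnl = {"M7": -5234.8, "M8": -5233.9}
    stat, p = likelihood_ratio_test(
        example_lnl["M7"], example_lnl["M8"], MODEL_PAIRS_DF[("M7", "M8")]
    )
    print(f"M7 vs M8: 2*deltaLnL = {stat:.2f}, p = {p:.3f}")
```

A non-significant M7 vs M8 or M1 vs M2 comparison would be consistent with the statement that no sites under positive selection were detected. The closed-form tail probability avoids an external dependency; a statistics library could be substituted if odd degrees of freedom are needed.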