Thesis capstone report: DNA sequence evolution in Sunflower and Lettuce
Yi Zou
Advisors: Dr. Loren Rieseberg, Dr. Sun Kim
Major: Bioinformatics
07/16/2004
Background
• Sunflower and lettuce represent two major subfamilies of the Compositae, one of the largest and most diverse families of flowering plants.
• Sunflower is an important oilseed crop; its domestication and breeding have focused on seed traits.
• Lettuce is an important leaf crop; its domestication and breeding have focused on leaf traits.
• Extensive lettuce and sunflower EST databases are available (CGPDB).
Background
• Examination of DNA differences between closely related species of Compositae will provide insight into the nature of mutational rates and processes in this family.
• Hypothesis: genes associated with primary domestication traits (seeds in sunflower and leaves in lettuce) will evolve faster than genes expressed in other tissues.
• Hypothesis: upstream enzymes in metabolic pathways will evolve less rapidly than downstream enzymes.
Goals
• Compare the distribution of indels and base substitutions among closely related lettuce and sunflower EST sequences
• Compare rates of EST sequence divergence for genes from different tissue types
• Compare rates of EST sequence divergence among different metabolic pathways, and rates of protein evolution among specific genes along major metabolic pathways
Data http://cgpdb.ucdavis.edu
CGPDB
• Contains about 112,000 individual ESTs sequenced from sunflower and lettuce.
• Sunflower: about 44,000 individual ESTs, previously assembled into 4430 unique contigs, sequenced from two Helianthus annuus cultivars: RHA801 (exotic) and RHA280 (oil).
• Lettuce: around 68,000 ESTs, previously assembled into 8179 unique contigs (genes), sequenced from two species: Lactuca serriola (wild) and L. sativa (cultivated).
Data Analysis – Example from sunflower
[Figure: example comparing Genotype 1 and Genotype 2]
Data Analysis – comparison of complete EST sequences
Data Analysis – comparison of coding region only
[Figure labels: sun_vs_ath_TIGR_unique, lettuce_vs_ath_TIGR_unique]
Result – Sequence information for assembling contigs and consensus sequences
Result – Comparison of complete EST sequences between two sunflower and two lettuce genotypes
Conclusion 1
• Substitutions are 3-6 times more frequent than indels in both sunflower and lettuce, regardless of whether coding regions or complete EST sequences are analyzed.
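To make the substitution-versus-indel comparison concrete, here is a minimal Python sketch of how both kinds of difference could be tallied from a gapped pairwise alignment of two genotypes. The slides do not say how indels were counted, so the rule used here (each contiguous run of gaps counts as one indel event, and substitutions are counted only in columns where both sequences carry a base) is an assumption, and the function name `count_differences` is hypothetical.

```python
def count_differences(aligned_a: str, aligned_b: str) -> dict:
    """Tally substitutions and indel events in a gapped pairwise alignment.

    Assumptions (not stated in the slides): '-' marks a gap, a contiguous
    run of gaps in one sequence is a single indel event, and columns
    containing 'N' are not scored as substitutions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    substitutions = 0
    indel_events = 0
    compared_sites = 0   # columns where both sequences carry a base
    gap_owner = None     # which sequence the currently open gap run belongs to

    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information
        if a == "-" or b == "-":
            owner = "a" if a == "-" else "b"
            if owner != gap_owner:
                indel_events += 1  # a new gap run starts here
            gap_owner = owner
            continue
        gap_owner = None
        compared_sites += 1
        if a != b and "N" not in (a, b):
            substitutions += 1

    return {
        "substitutions": substitutions,
        "indel_events": indel_events,
        "compared_sites": compared_sites,
        "substitution_rate": substitutions / compared_sites if compared_sites else 0.0,
        "indel_rate": indel_events / compared_sites if compared_sites else 0.0,
    }


# Toy alignment, invented for illustration only.
result = count_differences("ACGT--ACGTAC", "ACCTGGAC-TAC")
print(result)
```

Counting a gap run as one event rather than one event per gapped column keeps a single long insertion from dominating the indel total; a per-column count would be an equally defensible choice and would change the ratio.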
Data Analysis – EST divergence for different tissue types
Lettuce:
TAG0 - callus - "cls"
TAG1 - roots - "rot"
TAG2 - none (leaf) - "not"
TAG3 - flowers pre-fert - "flr"
TAG4 - flowers post-fert - "flo"
TAG5 - chemical induction - "chi"
TAG6 - none - "nos"
TAG7 - roots env stress - "rts"
TAG8 - shoots env stress - "shs"
TAG9 - germinating seeds - "gsd"
TAG10 - flowers env stress - "fls"
TAG11 - leaves (dark grown) - "lvd"
Tag_1_7: all root-related contigs
Tag_3_4_10: all flower-related contigs
Tag_7_8_10: all contigs related to environmental stress
Sunflower:
TAG0 - callus - "cls"
TAG1 - roots - "rot"
TAG2 - disk & ray flowers - "drf"
TAG3 - flowers pre-fert - "flr"
TAG4 - developing kernel - "dkn"
TAG5 - chemical induction - "chi"
TAG6 - none - "nos"
TAG7 - roots env stress - "rts"
TAG8 - shoots env stress - "shs"
TAG9 - germinating seeds - "gsd"
TAG10 - flowers env stress - "fls"
TAG11 - hulls - "hls"
Tag_1_7: all root-related contigs
Tag_3_10: all flower-related contigs
Tag_7_8_10: all contigs related to environmental stress
Result – number of tissue-specific contigs in lettuce
[Figure: bar chart "Lettuce TAG-specific contig information". y-axis: contigs with coding region found in both genotypes (0 to 800). x-axis: Tissue and Treatment, with categories TAG3(flr), TAG1(rot), TAG2(no), TAG7(rts), TAG4(flo), TAG5(chi), TAG0(cls), TAG6(nos), TAG9(gsd), TAG11(lvd), TAG8(shs), TAG10(fls), TAG_1_7(root), TAG_7_8_10(stress), TAG_3_4_10(flower). Values as extracted (mapping to categories not recoverable): 738, 336, 278, 140, 85, 78, 69, 60, 30, 29, 18, 14, 11, 0, 0, 0]
Result – Rates of sequence divergence among tissue-specific contigs in sunflower and lettuce
Result – Comparison of rates of sequence divergence for genes expressed in seeds versus other tissues
[Figure: bar chart of indel rate and substitution rate for seed contigs (DknHls) versus other contigs (Non-DknHls); y-axis labeled "Content", 0.00% to 16.00%]
T-test: SubRateKH vs SubRateNonKH, P-value = 0.0009414
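The slide reports a t-test on substitution rates for seed-expressed versus other contigs but does not say which variant was used. As an illustration only, the sketch below computes Welch's unequal-variance t statistic and its approximate degrees of freedom with the Python standard library; the choice of Welch's test, the function name `welch_t`, and the input values are all assumptions, not the author's data or procedure.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's two-sample t-test.

    t is the difference in means divided by its standard error, and df is
    the Welch-Satterthwaite approximation to the degrees of freedom.
    """
    na, nb = len(sample_a), len(sample_b)
    if na < 2 or nb < 2:
        raise ValueError("each sample needs at least two observations")
    # Squared standard error contributed by each sample.
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical per-contig substitution rates (percent), invented for illustration.
seed_rates = [2.0, 4.0, 6.0, 8.0]
other_rates = [1.0, 2.0, 3.0, 4.0]
t_stat, dof = welch_t(seed_rates, other_rates)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

Turning the statistic into a P-value requires the Student t distribution, which the standard library does not provide; a statistics package would be needed for that last step.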
Result – Rates of sequence divergence among treatment-specific contigs in sunflower and lettuce
Conclusion 2
• As predicted, sunflower genes expressed in seeds evolve significantly faster than genes expressed in other tissues. Artificial selection for large seeds and high seed oil content may contribute to these higher rates.
• For lettuce, there are no significant differences in rates of sequence evolution among different tissues.
• No differences were found in sunflower or lettuce among biotic and abiotic stress treatments.
Data Analysis – EST divergence among metabolic pathways
• To identify contigs for specific pathways, metabolic pathway information from the TAIR database (The Arabidopsis Information Resource: http://www.arabidopsis.org/) was used.
• Each contig in the CGPDB was assigned to an Arabidopsis gene locus (or remained unassigned) based on the BLAST results.
• Genes (contigs) for different metabolic pathways were clustered and protein divergence was estimated.
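The assignment step above can be sketched as follows. This is an illustration under stated assumptions, not the author's pipeline: it assumes BLAST results in the standard 12-column tabular format (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score), keeps the highest-scoring hit per contig, and uses an arbitrary E-value cutoff. The function names, the cutoff, and all identifiers in the example data are hypothetical.

```python
from collections import defaultdict


def best_hits(blast_lines, max_evalue=1e-10):
    """Map each query contig to its best-scoring subject locus.

    Contigs with no hit at or below max_evalue remain unassigned
    (they are simply absent from the returned dict).
    """
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def group_by_pathway(contig_to_locus, locus_to_pathways):
    """Cluster contigs by the pathways annotated for their assigned locus."""
    pathways = defaultdict(set)
    for contig, locus in contig_to_locus.items():
        for pathway in locus_to_pathways.get(locus, ()):
            pathways[pathway].add(contig)
    return dict(pathways)


# Placeholder data: contig names, locus names and pathway labels are invented.
example_blast = [
    "contig1\tLOCUS_A\t85.0\t300\t45\t0\t1\t300\t1\t300\t1e-50\t200",
    "contig1\tLOCUS_B\t80.0\t300\t60\t0\t1\t300\t1\t300\t1e-30\t150",
    "contig2\tLOCUS_C\t60.0\t100\t40\t0\t1\t100\t1\t100\t0.5\t30",
    "contig3\tLOCUS_B\t82.0\t280\t50\t0\t1\t280\t1\t280\t1e-40\t180",
]
example_pathways = {"LOCUS_A": ["lipid"], "LOCUS_B": ["lipid", "lignin"]}

assignments = best_hits(example_blast)
print(assignments)
print(group_by_pathway(assignments, example_pathways))
```

A locus can belong to more than one pathway, so a contig may appear in several clusters; whether the original analysis allowed that is not stated in the slides.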
Data Analysis – protein evolution along major metabolic pathways
• Metabolic pathways:
• Lipid metabolic pathways
• Phenylpropanoid biosynthetic pathways
• Cellulose, lignin, sucrose, etc. metabolic pathways
• The nonsynonymous substitution rate (Ka) was calculated for enzymes at different positions along the pathways.
• DnaSP 4.0 was used for this calculation.
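For readers unfamiliar with Ka, the sketch below shows one common way to estimate it from two aligned coding sequences: the Nei-Gojobori counting method with a Jukes-Cantor correction. This is not DnaSP's code, and whether it matches the settings used in the thesis is an assumption. Two simplifications are made and labeled in the comments: changes to stop codons count as nonsynonymous when counting sites, and mutational paths that pass through a stop codon are skipped. All function names are hypothetical.

```python
from itertools import permutations, product
from math import log

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def syn_sites(codon):
    """Number of synonymous sites in a codon (0 to 3).

    Simplification: a change to a stop codon counts as nonsynonymous.
    """
    s = 0.0
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                    s += 1 / 3
    return s


def nonsyn_diffs(c1, c2):
    """Nonsynonymous differences averaged over all shortest mutational paths.

    Simplification: paths through a stop codon are skipped; returns None
    if no valid path exists.
    """
    diff_pos = [i for i in range(3) if c1[i] != c2[i]]
    if not diff_pos:
        return 0.0
    totals = []
    for order in permutations(diff_pos):
        current, nonsyn, valid = c1, 0, True
        for pos in order:
            nxt = current[:pos] + c2[pos] + current[pos + 1:]
            if CODON_TABLE[nxt] == "*":
                valid = False
                break
            if CODON_TABLE[nxt] != CODON_TABLE[current]:
                nonsyn += 1
            current = nxt
        if valid:
            totals.append(nonsyn)
    return sum(totals) / len(totals) if totals else None


def estimate_ka(seq1, seq2):
    """Estimate Ka for two aligned, in-frame coding sequences."""
    if len(seq1) != len(seq2) or len(seq1) % 3:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    n_sites = n_diffs = 0.0
    for i in range(0, len(seq1), 3):
        c1, c2 = seq1[i:i + 3].upper(), seq2[i:i + 3].upper()
        if c1 not in CODON_TABLE or c2 not in CODON_TABLE:
            continue  # codon contains a gap or ambiguous base
        if "*" in (CODON_TABLE[c1], CODON_TABLE[c2]):
            continue  # skip stop codons
        nd = nonsyn_diffs(c1, c2)
        if nd is None:
            continue
        n_sites += 3 - (syn_sites(c1) + syn_sites(c2)) / 2
        n_diffs += nd
    if n_sites == 0:
        raise ValueError("no comparable codons")
    p_n = n_diffs / n_sites  # proportion of nonsynonymous differences
    if p_n >= 0.75:
        raise ValueError("divergence too high for Jukes-Cantor correction")
    return -0.75 * log(1 - 4 * p_n / 3)


# Toy sequences (Met-Ala-Lys vs Met-Asp-Lys), invented for illustration.
print(estimate_ka("ATGGCTAAA", "ATGGATAAA"))
```

Comparing this value between enzymes at upstream and downstream positions of a pathway is the kind of comparison the slide describes.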
Result – comparison of metabolic pathway genes between CGPDB and TAIR
• Based on the BLAST results, the contigs in CGPDB were compared with genes in TAIR and assigned to the appropriate pathways.
• Currently there are 186 pathways with more than 800 unique reactions in the TAIR database. For these reactions, 1144 unique locus IDs were assigned to the enzymes involved.
• Among the TAIR loci, 72.1% match contigs in the sunflower database and 83.15% match contigs in the lettuce database.
Result – Rates of sequence evolution for sunflower metabolic pathway-specific contigs
Result – Rates of sequence evolution for lettuce metabolic pathway-specific contigs
Result – Nonsynonymous substitution rate (Ka) for genes along four metabolic pathways in sunflower and lettuce
Conclusion 3
• Rates of sequence divergence did not differ among metabolic pathways.
• Rates of protein evolution (Ka) did not vary along metabolic pathways (i.e., upstream genes evolved at the same rate as downstream genes).
Summary
• Substitutions are much more frequent than indels in both sunflower and lettuce.
• Sunflower genes expressed in seeds evolve significantly faster than genes expressed in other tissues.
• There are no significant differences in rates of sequence evolution among different tissues in lettuce.
• Rates of sequence divergence did not vary significantly either among or along metabolic pathways in either sunflower or lettuce.
Acknowledgements
Thanks to:
• Dr. Loren Rieseberg
• Dr. Sun Kim
• Dr. Sheri Church
• Dr. Zhao Lai