My Master’s Work. Richa Tiwari. Outline of the talk. Analysis of Phylogeny Tree Evaluation Approaches (Project done in CS641). Proteomics and 2-D Gel Electrophoresis (Study done for CS)
My Master’s Work Richa Tiwari
Outline of the talk • Analysis of Phylogeny Tree Evaluation Approaches (Project done in CS641). • Proteomics and 2-D Gel Electrophoresis (Study done for CS) • Coexpression analysis of dimerization between bZIP proteins in groups C, S1 and S2 in Arabidopsis thaliana, under the conditions of differential light and CO2 levels (Project done for BST676).
Phylogenetic Analysis • Aligning the sequences • Determining whether the sequences are related • Choosing the most appropriate tree-building algorithm • Scrutinizing the tree to determine the level of confidence
Algorithmic Method • Defines an algorithm that leads to the determination of a tree. Criteria Based Method • Defines a criterion for comparing different phylogenies, so that phylogenies can be ranked and compared.
Maximum Parsimony Method • “The most parsimonious tree explains the observed character distribution with the minimum tree length.” • Tree selection criterion: minimum tree length (fewest character state transformations)
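The tree-length criterion above can be illustrated with a minimal sketch of Fitch's small-parsimony counting, which is one standard way to count character state changes on a fixed tree. It is not a description of how DNAPARS works internally. The alignment, the two candidate trees, and the function names are made up for illustration and are not the talk's data.

```python
def fitch_length(tree, states):
    """Return (state_set, changes) for one character on a rooted binary tree.

    tree   : a leaf name (str) or a 2-tuple of subtrees
    states : dict mapping leaf name -> observed character state
    """
    if isinstance(tree, str):
        # A leaf contributes its observed state and no changes.
        return {states[tree]}, 0
    left_set, left_cost = fitch_length(tree[0], states)
    right_set, right_cost = fitch_length(tree[1], states)
    common = left_set & right_set
    if common:
        # The children can share a state: no extra change is needed here.
        return common, left_cost + right_cost
    # The children disagree: one extra change is needed at this node.
    return left_set | right_set, left_cost + right_cost + 1


def tree_length(tree, alignment):
    """Sum the Fitch change counts over every column of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        column = {name: seq[i] for name, seq in alignment.items()}
        total += fitch_length(tree, column)[1]
    return total


# Hypothetical toy alignment, for illustration only.
alignment = {
    "Human":   "AAGT",
    "Chimp":   "AAGT",
    "Gorilla": "ACGT",
    "Orang":   "ACTT",
}
tree_a = (("Human", "Chimp"), ("Gorilla", "Orang"))
tree_b = (("Human", "Gorilla"), ("Chimp", "Orang"))

length_a = tree_length(tree_a, alignment)
length_b = tree_length(tree_b, alignment)
print(length_a, length_b)  # the tree with the smaller length is preferred
```

Under the parsimony criterion, the candidate tree with the smaller total length would be ranked higher.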
Maximum Likelihood (ML) • ML evaluates the probability that the chosen evolutionary model will have generated the observed sequences. • Evolutionary Model: Accounts for the changes in sequences. • Phylogenies are then inferred by finding those trees that yield the highest likelihood.
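As a small illustration of evaluating a likelihood under an evolutionary model, the sketch below assumes the Jukes-Cantor (JC69) model, which the talk does not name, and only two sequences, so the "tree" is a single branch of length d (expected substitutions per site). The sequences and function names are hypothetical. It scans a grid of branch lengths for the highest log-likelihood and compares the result with the known closed-form JC69 estimate.

```python
import math


def jc69_log_likelihood(seq_a, seq_b, d):
    """Log-likelihood of two aligned sequences separated by branch length d (JC69)."""
    e = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * e   # probability that a site shows the same base
    p_diff = 0.25 - 0.25 * e   # probability of one specific different base
    log_l = 0.0
    for x, y in zip(seq_a, seq_b):
        # 0.25 is the equilibrium frequency of the base in the first sequence.
        log_l += math.log(0.25 * (p_same if x == y else p_diff))
    return log_l


def jc69_closed_form(seq_a, seq_b):
    """Closed-form maximum likelihood estimate of d under JC69."""
    p = sum(1 for x, y in zip(seq_a, seq_b) if x != y) / len(seq_a)
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical sequences differing at 2 of 20 sites.
seq_a = "ACGTACGTACGTACGTACGT"
seq_b = "ACGTACGAACGTACCTACGT"

# Grid search over branch lengths 0.001 .. 1.000 for the highest likelihood.
grid = [i / 1000 for i in range(1, 1001)]
best_d = max(grid, key=lambda d: jc69_log_likelihood(seq_a, seq_b, d))
closed_d = jc69_closed_form(seq_a, seq_b)
print(best_d, round(closed_d, 4))
```

A program such as DNAML has to repeat this kind of optimization over all branch lengths and many candidate topologies, which is consistent with it being the slowest of the three methods.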
Distance Based Method • Distance-based methods estimate the distance between two taxa, that is, the total number of changes accumulated since they last shared an ancestor. • It is a cluster-based method.
Software used: PHYLIP, to compare the three phylogeny methods. Programs used from the package are:
• Maximum Parsimony: DNAPARS
• Maximum Likelihood: DNAML
• Distance-based: DNADIST and Neighbor
• Tree constructed using: DRAWGRAM
• Consensus tree constructed using: CONSENSUS
Using sample data…
• Maximum parsimony: DNAPARS
• Maximum likelihood: DNAML
• Distance based: Neighbor
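A minimal sketch of the first step of a distance-based analysis is given below: computing pairwise distances and finding the closest pair of taxa. It uses the uncorrected p-distance (fraction of differing sites), which is a simplification; programs such as DNADIST apply model-based corrections. The alignment and function names are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of aligned sites at which the two sequences differ."""
    diffs = sum(1 for x, y in zip(seq_a, seq_b) if x != y)
    return diffs / len(seq_a)


def distance_matrix(alignment):
    """Map each unordered pair of taxa to its p-distance."""
    return {
        frozenset((a, b)): p_distance(alignment[a], alignment[b])
        for a, b in combinations(alignment, 2)
    }


# Hypothetical toy alignment, for illustration only.
alignment = {
    "Human":   "AAGTCA",
    "Chimp":   "AAGTCG",
    "Gorilla": "ACGTTG",
    "Orang":   "ACTTTA",
}

dist = distance_matrix(alignment)
closest = min(dist, key=dist.get)   # pair with the smallest distance
print(sorted(closest), dist[closest])
```

A simple clustering method such as UPGMA would join this closest pair first; neighbor joining uses a corrected criterion that also accounts for each taxon's distance to all the others.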
Consensus tree for given example…

Parsimony Method

                       +------Human
                +--1.0-|
         +--1.0-|      +------Chimp
         |      |
  +------|      +-------------Orang
  |      |
  |      +--------------------Rhesus
  |
  +---------------------------Gorilla

Maximum Likelihood

                +------Human
         +--1.0-|
         |      +------Orang
  +------|
  |      |      +------Rhesus
  |      +--1.0-|
  |             +------Gorilla
  |
  +--------------------Chimp

Distance Based/Neighbor joining

                +-------------Orang
         +--1.0-|
         |      |      +------Chimp
  +------|      +--1.0-|
  |      |             +------Human
  |      |
  |      +--------------------Rhesus
  |
  +---------------------------Gorilla
Observation
• Reliability of branch length estimates: NJ and ML > MP
• Computational speed (n>500): NJ/DNADIST: 0.005 seconds; DNAPARS: 0.5 seconds; DNAML: 230.0 seconds
Conclusion • Our experiments and results indicate that the Distance Based method is better than the other two methods in terms of speed, simplicity and performance for a high number of taxa. • We can also say that, given a fast computer and a large dataset, the Maximum Likelihood method is better than Maximum Parsimony.
Introduction • The entire set of proteins expressed by the genome in a cell, organ or organism is referred to as the proteome. • Proteomics : Methods that discover and quantify proteins and their biochemical changes.
Application of Proteomics • Protein Mining • Network Mapping • Mapping Protein Modifications
Proteomics Analysis Reference: www.mbi.osu.edu/sciprograms/prfmaterials/vandre.ppt
2-D Gel Electrophoresis
The horizontal position tells us about the charge of a protein, whereas the intensity of the gel spot tells us about the amount of that protein in the system.
Steps:
1. Prepare the protein sample in solution.
2. Separate the proteins (in each dimension):
   I. Based on pH: using isoelectric focusing (IEF); using immobilized pH gradient (IPG) strips.
   II. Based on molecular weight (size): using gel electrophoresis.
3. Stain the proteins to enable visualization.
Introduction to the project • This project focuses on 2D gel electrophoretic separation of proteins. • We analyzed a few random spots from the 2D gels of rat mammary tissue. • We used statistical methods to find the variance in pI of the same protein in different gels. • We analyzed the reasons for these differences. • We inferred the relationship between the experimental values and the predicted values.
The gels we used were from a previously completed experiment. 28 random protein spots were selected from each of the three gels based on their intensity. Mass Spectrometry: Differentially expressed proteins identified by image analysis were excised from the 2D gels and trypsin digested. The resulting peptide fragments were analyzed on a MALDI mass spectrometer (MS). The MALDI spectrum displays a “peptide fingerprint” of the protein using the corresponding peptide masses.
Proteins were identified by entering the masses (ions from the MALDI spectrum) of the peptides into a peptide mapping database. Some examples of such protein search engines are: • Mascot (very popular, and also used in this project) • Sequest • Aldente • ProteinLynx • Phenyx
Results • We tabulated the results obtained from the internet database search alongside those obtained from the experiment. • We observed that the pI values as well as the molecular weights of the same protein were not the same in all gels. • The pI values in the three gels were quite similar to each other, but they differed from the predicted pI values.
In a 2D gel the position of a protein spot can change for various reasons, and as a result the observed molecular weight and pI values may also differ.
Graph showing the variance among the predicted pI and observed pI
Observations • The experimental pI values from the three gels are not very different from each other. • So we can interpret that the variation due to non-biological causes is very small in this experiment. • For a few protein spots, the internet search returned the same protein name, but our experiment gave different pI values. This can be because different groups (such as phosphate or sulphate) are attached to the protein. There can be other reasons too.
The average deviation between the pI values observed in the three gels and the predicted pI value was calculated for each protein as: {(pI (gel 1) - pred. pI) + (pI (gel 2) - pred. pI) + (pI (gel 3) - pred. pI)} / 3, with one term for each of the three gels (gel 12_5 being one of them). This gave the results shown in the next slide. We obtained positive as well as negative values for the deviations.
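The averaging described above can be written as a short sketch. The pI numbers below are hypothetical placeholders rather than values from the gels, and the function name is made up for illustration.

```python
def average_deviation(observed_pis, predicted_pi):
    """Mean signed difference between the observed pI values and the predicted pI."""
    return sum(pi - predicted_pi for pi in observed_pis) / len(observed_pis)


# Hypothetical values: one observed pI per gel for a single protein spot.
observed = [5.9, 6.0, 6.1]
predicted = 6.3

deviation = average_deviation(observed, predicted)
print(round(deviation, 2))
```

Because the deviation is signed, a negative value means the protein ran at a lower (more acidic) pI than predicted, and a positive value means it ran at a higher one.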
Average deviations between the three gels and the predicted pI
We can infer that the proteins were modified mostly by negatively charged groups, so that their pI values decreased. • The addition of one phosphate group to a serine, threonine, or tyrosine residue typically decreases the isoelectric point by 0.1 pH unit.
Regression results • A statistical test was performed to determine which of the three gels was closest to the predicted pI values, that is, in which of the three gels the proteins had been least modified. • The test was a calibration test. We prepared a regression model for each gel. The inverse regression equation used was: Predicted pI = (Observed pI from gel - Intercept) / Slope
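A minimal sketch of such a calibration is shown below, assuming an ordinary least-squares line of the form Observed pI = Intercept + Slope × Predicted pI fitted per gel and then inverted. The talk does not state how the regression was fitted, so the fitting method, the function names, and the data points are assumptions for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def calibrate(observed_pi, intercept, slope):
    """Inverse regression: recover the predicted pI from an observed pI."""
    return (observed_pi - intercept) / slope


# Hypothetical data for one gel: database-predicted pI vs. pI read off the gel.
predicted = [5.0, 6.0, 7.0, 8.0]
observed = [4.8, 5.7, 6.6, 7.5]

intercept, slope = fit_line(predicted, observed)
back_predicted = calibrate(6.15, intercept, slope)
print(round(intercept, 3), round(slope, 3), round(back_predicted, 3))
```

Fitting one line per gel and comparing the back-calculated values is one way to see whether the three gels agree with each other more than they agree with the database predictions.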
Predicted pI values from the Calibration test and internet database
The results showed that all three gels gave almost the same predicted pI values, and that these were quite far from the original predicted pI values. • This similarity between the three gels suggests that the difference between the predicted and the experimental pI values is due less to non-biological factors than to chemical modifications of the proteins.
Coexpression analysis of dimerization between bZIP proteins in groups C, S1 and S2 in Arabidopsis thaliana, under the conditions of differential light and CO2 levels.
Introduction: Transcription factors • Transcription factors are proteins involved in the regulation of gene expression that bind to the promoter region upstream of genes. • They are composed of two essential functional regions: DNA binding domain: it binds to DNA. Activator domain: it interacts with other regulatory proteins, thereby affecting the efficiency of DNA binding.
bZIP proteins • bZIP proteins are a class of transcription factors that have a leucine zipper motif, consisting of a periodic repetition of a leucine residue at every seventh position and forming an alpha-helical conformation. • The segment that comprises the basic region and the periodic array of leucine residues is referred to as the ‘basic-region leucine zipper’ or bZIP motif.
Some facts • There are 792 bZIP proteins recorded in the nonredundant database. • The number of bZIP proteins in selected organisms is as follows: yeast: 16; fruit fly: 110; plant (Arabidopsis thaliana): 75; human: 114
Arabidopsis • The Arabidopsis genome sequence contains 75 distinct members of the bZIP family, of which ~50 are not well studied. • Using common domains, the bZIP family can be subdivided into 10 groups: Groups A - S.
C & S protein interaction • Ehlert et al. measured interactions between C and S proteins. • C and S1 proteins heterodimerized. • Two S2 proteins dimerized.
Effect of Light & CO2 on C & S proteins
• Carbohydrate signaling: carbohydrate partitioning increases in elevated CO2 and decreases in low light.
• Seed development: the photosensory system detects the quality, quantity, direction and duration of light, and controls the developmental pattern.
• Stress: light-dependent generation of active oxygen species is a type of stress called photo-oxidative stress.
Experiment Selection Criteria • a) Chose C and S bZIP proteins • Coexpression Engine: http://www.ssg.uab.edu/coexpression • b) Selected tissue and array type • c) Chose specific experiment