Quantitative analysis of 2D gels: Generalities
Applications
• Mutant / wild type
• Physiological conditions
• Tissue-specific expression
• Disease / normal state
• Drug effects
→ comparison of 2 images (or image groups)
• Expression over time
• Multiple conditions analysis
→ serial analysis
Image analysis requirements
• Quality of separation → spot detection
• Reproducibility of migration → matching
• Labelling method → quantification
• Signal / noise → accuracy
2 images comparison
• Statistical analysis is not possible
• Suitable only for large quantitative variations
• Confirmation of the results is essential
2 sets comparison
• Minimum number of images is 3
• The maximum is not limited
• Allows detection of smaller variations
• A t-test can be used
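A minimal sketch of the t-test on one spot, assuming 3 gels per set and a pooled-variance (Student) test. The spot volumes, the variable names and the 5% threshold are hypothetical and chosen only for illustration. The critical value 2.776 is the two-sided 5% value for 4 degrees of freedom, so it only applies to a 3 + 3 design.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Return the pooled-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Hypothetical normalised volumes of the same spot on 3 gels per set.
wild_type = [1.00, 1.10, 0.90]
mutant = [1.60, 1.50, 1.70]

t, df = student_t(wild_type, mutant)

# Two-sided critical value for alpha = 0.05 and df = 4 (3 + 3 gels only).
T_CRIT = 2.776
significant = abs(t) > T_CRIT
print(f"t = {t:.2f}, df = {df}, significant = {significant}")
```

With more gels per set the degrees of freedom increase and the critical value decreases, which is why larger sets allow smaller variations to be detected.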
Serial analysis
The most frequent question is how to find sets of proteins that have correlated expression profiles
• Quantitative evolution of each spot
• Need to group the spots according to their behaviour (clustering)
• Use of Michael Eisen’s software package (http://rana.lbl.gov/EisenSoftware.htm)
  • Cluster
  • TreeView
Results: making sense of the data
2 sets comparison
Image normalisation to obtain comparable spot volumes
• Using the matched spots
• Using a single spot
Data analysis
• Using the analysis program
• Using Excel
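The two normalisation options can be sketched as follows. This is an illustration under assumptions, not the method of any particular analysis program: the gel names, spot identifiers and volumes are hypothetical, and "using the matched spots" is taken to mean expressing each volume as a percentage of the total volume of the spots matched on every gel.

```python
def normalise_by_matched(gels):
    """Express each matched spot as a percentage of the total matched-spot volume of its gel."""
    matched = set.intersection(*(set(spots) for spots in gels.values()))
    result = {}
    for name, spots in gels.items():
        total = sum(spots[s] for s in matched)
        result[name] = {s: 100.0 * spots[s] / total for s in matched}
    return result


def normalise_by_reference(gels, ref):
    """Express each spot volume relative to a single reference spot on the same gel."""
    return {
        name: {s: v / spots[ref] for s, v in spots.items()}
        for name, spots in gels.items()
    }


# Hypothetical raw volumes: gel_B carries twice the protein load of gel_A,
# and spot s4 was detected on gel_A only (unmatched).
gels = {
    "gel_A": {"s1": 200.0, "s2": 300.0, "s3": 500.0, "s4": 50.0},
    "gel_B": {"s1": 400.0, "s2": 600.0, "s3": 1000.0},
}

by_matched = normalise_by_matched(gels)
by_reference = normalise_by_reference(gels, ref="s3")
print(by_matched["gel_A"]["s1"], by_matched["gel_B"]["s1"])
print(by_reference["gel_A"]["s1"], by_reference["gel_B"]["s1"])
```

Normalising on a single spot is simpler but assumes that the reference spot itself does not vary between conditions.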
Serial analysis
• Image normalisation → input data
• Find clusters of genes
• Depending on the method, the number of clusters is either fixed from the beginning (K-means) or determined after the analysis (hierarchical clustering)
Hierarchical clustering
[Figure: dendrogram]
• The length of a branch = the distance between the joined genes or clusters
• The dendrogram induces a linear ordering of the data points
Hierarchical clustering
Two parameters must be defined (1 and 2), with an optional third (3):
1- The similarity between two genes:
• measures how similar two series of numbers are.
• it is based on the Pearson correlation coefficient. Options: centered correlation, uncentered correlation, absolute correlation, Euclidean ...
• a matrix of distances between all pairs of items is computed.
• agglomerative hierarchical clustering is performed by joining the two closest items with a branch.
2- The distance between the new cluster and the others:
• it is measured by different methods.
  Average Linkage: distance between cluster centers
  Single Linkage: distance between closest pair
  Complete Linkage: distance between farthest pair
3- The weight of each series:
• it is possible to give a different weight to a particular experiment.
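The similarity metric and the agglomerative joining step can be sketched in a few lines. This is an illustrative toy, not the code of the Cluster program: the function names and spot profiles are hypothetical, the distance is taken as 1 minus the centered correlation, and only single and complete linkage are implemented (average linkage and experiment weights are left out).

```python
from math import sqrt


def correlation(x, y, centered=True):
    """Pearson correlation; with centered=False the means are taken to be zero."""
    mx = sum(x) / len(x) if centered else 0.0
    my = sum(y) / len(y) if centered else 0.0
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def distance(x, y):
    """Assumed distance: 1 - centered correlation (0 means identical profile shape)."""
    return 1.0 - correlation(x, y)


def cluster(profiles, linkage=min):
    """Agglomerative clustering; linkage=min is single, linkage=max is complete linkage.

    Returns the list of merges (cluster_a, cluster_b, distance) and the leaf order.
    """
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(
                    distance(profiles[a], profiles[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))  # d is the branch length
        joined = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [joined]
    return merges, clusters[0]


# Hypothetical expression profiles of four spots over four time points.
profiles = {
    "spot1": [1.0, 2.0, 3.0, 4.0],
    "spot2": [2.0, 4.1, 5.9, 8.0],
    "spot3": [4.0, 3.0, 2.0, 1.0],
    "spot4": [8.0, 6.2, 3.9, 2.0],
}

merges, order = cluster(profiles, linkage=min)
for a, b, d in merges:
    print(a, "+", b, f"at distance {d:.4f}")
print("leaf order:", order)
```

The absolute correlation corresponds to `abs(correlation(x, y))`, which would group the increasing and decreasing profiles together instead of separating them.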
K-means - centroid method
• iteration = 0: start with random positions of K centroids
• iteration = 1: assign points to centroids, then move centroids to the center of their assigned points
• iteration = n: iterate until centroids are stable
K-means - centroid method
• The user chooses the number of clusters
• The result varies with each run → compare several runs
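A minimal sketch of the centroid method and of the comparison between runs. The data points, the seeds and the choice of criterion are assumptions made for illustration: the random start picks K of the data points as initial centroids, and runs are compared on their within-cluster sum of squared distances (lower is better).

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(group):
    """Center (coordinate-wise mean) of a non-empty group of points."""
    return tuple(sum(coords) / len(group) for coords in zip(*group))


def kmeans(points, k, seed, max_iter=100):
    """Return (centroids, groups, cost) for one K-means run started from the given seed."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # iteration 0: random start
    for _ in range(max_iter):
        # Assign each point to its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            groups[nearest].append(p)
        # Move each centroid to the center of its assigned points.
        moved = [mean_point(g) if g else centroids[i] for i, g in enumerate(groups)]
        if moved == centroids:  # centroids are stable
            break
        centroids = moved
    cost = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, groups, cost


# Hypothetical two-dimensional data with two obvious groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]

# Compare several runs and keep the one with the lowest cost.
runs = [kmeans(points, k=2, seed=s) for s in range(10)]
best_centroids, best_groups, best_cost = min(runs, key=lambda r: r[2])
print("best cost:", round(best_cost, 4))
print("cluster sizes:", sorted(len(g) for g in best_groups))
```

Keeping the lowest-cost run is only one possible way to compare runs; checking which spots are grouped together consistently across runs is another.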