190 likes | 332 Views
Outline. MotivationExperimental designPrevious work ApproachResultsSummary. Motivation. A panel of cell lines for analysis. Morphometric subtyping for a panel of breast cancer cell lines in identifyingsubpopulations with similar morphometric propertiesmolecular predictors for each subpopulat
E N D
1. The title is molecular predictors of 3D morphogenesis by breast cancer cells in 3D culture.
This work focuses on quantitative analysis of 3D morphology of breast cancer cell lines. The title is molecular predictors of 3D morphogenesis by breast cancer cells in 3D culture.
This work focuses on quantitative analysis of 3D morphology of breast cancer cell lines.
2. Outline Motivation
Experimental design
Previous work
Approach
Results
Summary
In this presentation, we will start with the motivation, experimental design and previous work, followed by the details of the technical approach. We will discuss the experimental results and summary at the end of the presentation. In this presentation, we will start with the motivation, experimental design and previous work, followed by the details of the technical approach. We will discuss the experimental results and summary at the end of the presentation.
3. Motivation A panel of cell lines for analysis In this project, we use a panel of cell lines for analysis
Why a panel of cell lines?
First, it will introduce the necessary molecular diversity
Second, it generates heterogeneous responses
And it offers an improved model system for high-content screening, comparative analysis and cell systems biology
In this presentation, we will focus on morphometric subtyping for a panel of breast cancer cell lines in identifying
Subpopulations with similar morphometric properties
And identifying molecular predictors for each subpopulations
In this project, we use a panel of cell lines for analysis
Why a panel of cell lines?
First, it will introduce the necessary molecular diversity
Second, it generates heterogeneous responses
And it offers an improved model system for high-content screening, comparative analysis and cell systems biology
In this presentation, we will focus on morphometric subtyping for a panel of breast cancer cell lines in identifying
Subpopulations with similar morphometric properties
And identifying molecular predictors for each subpopulations
4. Experimental design In this experiment, we have a panel of 24 breast cancer cell lines
All 3D cell cultures were maintained for 4 days with media change every 2 days,
And the samples were then imaged with phase contrast microscopy
The computational pipeline includes
the colony segmentation and representation, and the phenotypic clustering as well.
Furthermore, we will find the molecular predictor for morphometric clusters,
Or the molecular predictor of morphometric features
For example, the colony sizeIn this experiment, we have a panel of 24 breast cancer cell lines
All 3D cell cultures were maintained for 4 days with media change every 2 days,
And the samples were then imaged with phase contrast microscopy
The computational pipeline includes
the colony segmentation and representation, and the phenotypic clustering as well.
Furthermore, we will find the molecular predictor for morphometric clusters,
Or the molecular predictor of morphometric features
For example, the colony size
5. Previous work(Kenny et. al, Gene Ontology, 2007) Here is the previous work on the same data by Kenny and other people in Bissel Lab.
The phase and fluorescence images of different cell lines are visually examined
And mannually divided into 4 distinct classes ……..
However, the manual analysis is labor expensive and subject to user bias.
So we have developed a computational protocol for quantitative analysis
And automatic subtyping of these cell lines.
Here is the previous work on the same data by Kenny and other people in Bissel Lab.
The phase and fluorescence images of different cell lines are visually examined
And mannually divided into 4 distinct classes ……..
However, the manual analysis is labor expensive and subject to user bias.
So we have developed a computational protocol for quantitative analysis
And automatic subtyping of these cell lines.
6. Automatic subtyping a panel of breast cancer cell lines in 3D culture Now we show the subtyping a panel of breast cancer cell lines in 3D culture
Phase images were obtained from cell lines cultured in 3D
Colonies were segmented from the background
Morphogenesis indices were computed
Similarly, we use consensus clustering for subtyping
At the same time, gene expression data were obtained for each cell line
Then, molecular predictors for phenotypic subpopulations and morphogenesis were discovered.
We use phase image as an example here.
This work could be extend to fluorescence images as well.
Now we show the subtyping a panel of breast cancer cell lines in 3D culture
Phase images were obtained from cell lines cultured in 3D
Colonies were segmented from the background
Morphogenesis indices were computed
Similarly, we use consensus clustering for subtyping
At the same time, gene expression data were obtained for each cell line
Then, molecular predictors for phenotypic subpopulations and morphogenesis were discovered.
We use phase image as an example here.
This work could be extend to fluorescence images as well.
7. Colony segmentation and representation (phase images) Colonies are separated from the background based on texture features;
Morphometric features (size and shape) are extracted for each colony. The images on the lest are colony images from different cell lines
We can see that the intensity and gradient are heterogeneous in different colonies
So simple thresholding methods may not work here.
Instead, we segment the colony from the background through a clustering method based on texture features.
The images on the right show segmented colonies
Once the conloies are segmented,
morphometric features, including size and shape, are extracted for each segmented colony.
The images on the lest are colony images from different cell lines
We can see that the intensity and gradient are heterogeneous in different colonies
So simple thresholding methods may not work here.
Instead, we segment the colony from the background through a clustering method based on texture features.
The images on the right show segmented colonies
Once the conloies are segmented,
morphometric features, including size and shape, are extracted for each segmented colony.
8. Clustering of morphometric features Consensus clustering
A proven method in analyzing gene expression data (Monti et. al, Machine Learning 2003)
Repeated random resampling
Determine the number of clusters by evaluating the consensus distribution for different cluster numbers
The next step is the clustering of colony mohrphometric features for different cell lines
Here are some challenges
First Morphometric features are heterogeneous for the same cell line
Second, the number of objects for each cell line is various
Third, there is no prior knowledge of the number of clusters
We use consensus clustering to address these issues
Consensus clustering is a proven method in analyzing gene expression data
It addresses the heterogeneity issue through repeated random resampling
It determines the number of clusters by evaluating the consensus distribution of different cluster number
Which will be discussed in next slide
This is the flowchart
………
The next step is the clustering of colony mohrphometric features for different cell lines
Here are some challenges
First Morphometric features are heterogeneous for the same cell line
Second, the number of objects for each cell line is various
Third, there is no prior knowledge of the number of clusters
We use consensus clustering to address these issues
Consensus clustering is a proven method in analyzing gene expression data
It addresses the heterogeneity issue through repeated random resampling
It determines the number of clusters by evaluating the consensus distribution of different cluster number
Which will be discussed in next slide
This is the flowchart
………
9. Consensus clustering on a panel of 24 breast cancer cell lines in 3D Here are the visualization of the consensus matrices for different number of clusters.
Each matrix is a similarity matrix
Each row or column represents a cell line
The row and column are ordered through hierarical clustering
So that similar cell lines are close to each other
The values represent the similarities between cell lines
They are normalized between 0 and 1
The darker, the more similarity
We can see all the diagonal elements are blank since they are identical
We can visually examine the clustering quality for different number of clusters
That is, the cleaner of the matrix (or the more 0’s and 1’s), the better.
We can also used a systematic way to find the optimal number of clusters
based on the so-called concentration histogram in the literature.
First, the cdfs of the elements in each consensus matrix are computed
Then we have the area under each CDF, n=2……6
the percentage changes of cdf areas with respect to the number of clusters is plotted here
This is so called concentration histogram
For n=2, its cdf is the red curve, and the area under this cdf is just above .2
For n=3, the cdf is the green curve,
Since the area of green cdf almost double the red,
So the percentage change is ~1
There is very little changes of cdf area from n=3 to n=4
So the percentage change for n=4 is very small.
The literature suggests that the peak of the concentration histogram indicate the optimal number of clusters
3 in this case
More details could be found in the literature
Here are the visualization of the consensus matrices for different number of clusters.
Each matrix is a similarity matrix
Each row or column represents a cell line
The row and column are ordered through hierarical clustering
So that similar cell lines are close to each other
The values represent the similarities between cell lines
They are normalized between 0 and 1
The darker, the more similarity
We can see all the diagonal elements are blank since they are identical
We can visually examine the clustering quality for different number of clusters
That is, the cleaner of the matrix (or the more 0’s and 1’s), the better.
We can also used a systematic way to find the optimal number of clusters
based on the so-called concentration histogram in the literature.
First, the cdfs of the elements in each consensus matrix are computed
Then we have the area under each CDF, n=2……6
the percentage changes of cdf areas with respect to the number of clusters is plotted here
This is so called concentration histogram
For n=2, its cdf is the red curve, and the area under this cdf is just above .2
For n=3, the cdf is the green curve,
Since the area of green cdf almost double the red,
So the percentage change is ~1
There is very little changes of cdf area from n=3 to n=4
So the percentage change for n=4 is very small.
The literature suggests that the peak of the concentration histogram indicate the optimal number of clusters
3 in this case
More details could be found in the literature
10. Results with three clusters These images show the three clusters from the consensus clustering:
Round, grape-like and stellate
This results is consistent with previous manual clustering of the same data
In this literature, the round subpopulation is further divided into round and round-mass groups
However, the difference can only be observable through fluorescence microscopy,
Not from these phase images.
One observation is that And 8 out of 9 grape-like cell lines are ERBB2 positive
And all stellate cell lines are triple-negativeThese images show the three clusters from the consensus clustering:
Round, grape-like and stellate
This results is consistent with previous manual clustering of the same data
In this literature, the round subpopulation is further divided into round and round-mass groups
However, the difference can only be observable through fluorescence microscopy,
Not from these phase images.
One observation is that And 8 out of 9 grape-like cell lines are ERBB2 positive
And all stellate cell lines are triple-negative
11. Molecular predictors of morphometric clusters To find molecular predictors of morphometric clusters
We ranked the genes based on bootstrap cross validation error of SVM classifier
These are the heatmaps of top selected genes that best predict each morphometric clusters, round, grape-like and stellate, with the same cut-off
It is shown that significantly more genes were selected as the predictor of stellate clustersTo find molecular predictors of morphometric clusters
We ranked the genes based on bootstrap cross validation error of SVM classifier
These are the heatmaps of top selected genes that best predict each morphometric clusters, round, grape-like and stellate, with the same cut-off
It is shown that significantly more genes were selected as the predictor of stellate clusters
12. Best genes for predicting the stellate cluster This table shows the best genes for predicting the stellate cluster
There are several interesting genes in this list
This table shows the best genes for predicting the stellate cluster
There are several interesting genes in this list
13. Molecular predictors of morphometric features (colony size) To find the molecular predictors of morphometric features,
We rank the genes by nonlinear correlation
It is interesting to see that the PPARG also appears as the top gene in this listTo find the molecular predictors of morphometric features,
We rank the genes by nonlinear correlation
It is interesting to see that the PPARG also appears as the top gene in this list
14. Discussion The gene expression profiles of the stellate colonies are the most distinct from the other two morphometric classes According to these results, we can see that
The gene expression profiles of the stellate colonies are the most distinct from the other two morphometric classes
PPARgamma appears as the top gene on both lists
Currently, we are working on the biological validation for PPARgamma
According to these results, we can see that
The gene expression profiles of the stellate colonies are the most distinct from the other two morphometric classes
PPARgamma appears as the top gene on both lists
Currently, we are working on the biological validation for PPARgamma
15. Validation 1: In vitro experiment on PPARG MDAMB231 was assayed in 3D cell cultures maintained in H14 medium with 1% fetal bovine serum
The 3D cultures were prepared in triplicate by seeding single cells on top of a thin layer of Matrigel at a density of 2200 cells/cm2 and overlaid by 5% final Matrigel diluted in culture medium
GW9662, a PPARG inhibitor, was dissolved in DMSO and added to the 3D cultures in the final concentration of 10 uM at the time of seeding
The vehicle control was pure DMSO
The culture medium and the drug were changed every other day
Five images per well were collected after five full days in 3D culture
16. In vitro validation results
17. Validation 2: In vivo experiment on PPARG These are in vivo experiment results
Normal tissues (A on the left) and triple-negative tissues (B on the right) are stained for PPARgamma expression (the red)
The amount of PPAARgamma were quantified on a cell by cell base
C shows that more cells have PPARgamma expressed in triple-negative tissues than in normal tissues
D shows the histogram of PPARgamma/cell on both tissues
These are in vivo experiment results
Normal tissues (A on the left) and triple-negative tissues (B on the right) are stained for PPARgamma expression (the red)
The amount of PPAARgamma were quantified on a cell by cell base
C shows that more cells have PPARgamma expressed in triple-negative tissues than in normal tissues
D shows the histogram of PPARgamma/cell on both tissues
18. Summary A system for identifying sub-populations for a panel of breast cancer cell lines Here comes to the summary
The content in this work is going to be published in Plos Computational biology. Here comes to the summary
The content in this work is going to be published in Plos Computational biology.
19. Acknowledgement LBNL
Parvin Lab
Hang Chang
Gerald Fontenay
Bahram Parvin
Bissell Lab
Genee Lee
Paraic Kenny
Mina Bissell
Joe Gray At the end, I would like to thank all the people contributing to this work.
Hang is a computer scientist working on image analysis and algorithm development
Gerald helped on database management
I would like to thank Genee and Paraic for the cell culture and data acquisition.
Currently, Genee is with Genetech,
and Paraic is with the Albert Einstein College of Medicine
Orsi is a Postdoc in Kenny Lab.
She worked on the validation for treatment of MDAMB231 with PPARG-inhibitor
Frederick is a pathologist at UCSF
He did the experiment of PPARG expression on triple-negative breast cancer tissue.
Finally, I would like to thank Mina and Joe for their help on this work, and the funding from NIH ICBP program
At the end, I would like to thank all the people contributing to this work.
Hang is a computer scientist working on image analysis and algorithm development
Gerald helped on database management
I would like to thank Genee and Paraic for the cell culture and data acquisition.
Currently, Genee is with Genetech,
and Paraic is with the Albert Einstein College of Medicine
Orsi is a Postdoc in Kenny Lab.
She worked on the validation for treatment of MDAMB231 with PPARG-inhibitor
Frederick is a pathologist at UCSF
He did the experiment of PPARG expression on triple-negative breast cancer tissue.
Finally, I would like to thank Mina and Joe for their help on this work, and the funding from NIH ICBP program