Clustering Algorithms to Make Sense of Microarray Data: Systems Analyses in Biology • Doug Welsh and Brian Davis • BioQuest Workshop • Beloit, Wisconsin, June 2004
Biological Questions • What are the differences between cancer cells and normal cells? • What are the differences in gene expression between cancer cells and normal cells? • Can you guess at the cellular sub-systems that may be affected by cancer? • What are the cellular processes (pathways) that might differ between cancer cells and normal cells? • Can you guess at the components (proteins) of the pathways that might be involved in cancer?
Goals • Systems Biology (shift focus among levels of knowledge) • Biology • Gene Expression (technique) • Pathways • DNA Replication • Individual Proteins • Math • Clustering Algorithms (theory and technique) • Statistics • Medicine (human phenotype)
Goals [diagram; labels: Cluster, Programming, Medicine, Stats, Math, Biology, Cell Biology, Knowledge, Pathway, Physics, Optics, Robotics, Protein]
Goals [diagram; labels: Cluster, Programming, Medicine, Stats, Math, Biology, Cell Biology, Knowledge, Pathway, Physics, “Tools”, Optics, Robotics, Protein]
Problem Space Bedrock web site: http://bioquest.org/bedrock/problem_spaces/
Problem Space • DNA Replication • Cell Cycle (Depends on Paper) • Microarray Files • Gene Annotation • Microarray Analysis • Pathway Analysis • Statistical Analysis
Assumptions • Assume that co-expression of genes has biological significance. • We can generate a lot of data, so the challenge is separating the wheat from the chaff. • Clustering algorithms and viewing software allow a researcher to focus on one (“significant”) subset of the data at a time.
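One common way to put a number on "co-expression" is the Pearson correlation between two genes' expression profiles across samples. The sketch below is illustrative only: the gene names and values are made up, and it is not taken from the workshop materials or from the Cluster program.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("a flat profile has no defined correlation")
    return cov / (sx * sy)

# Hypothetical log-ratio profiles across five samples (made-up values).
gene_a = [0.1, 0.9, 1.8, 0.2, -0.5]
gene_b = [0.3, 1.1, 2.1, 0.4, -0.2]   # rises and falls with gene_a
gene_c = [1.0, -0.8, -1.5, 0.9, 1.2]  # moves opposite to gene_a

r_ab = pearson(gene_a, gene_b)  # close to +1: co-expressed
r_ac = pearson(gene_a, gene_c)  # close to -1: anti-correlated
```

Genes with a correlation near +1 are candidates for the same cluster; whether that co-expression is biologically meaningful is exactly the assumption stated above.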
Project • Paper: Singh D. et al. (2002) Gene expression correlates of clinical prostate cancer behavior. Cancer Cell Mar;1(2):203-9. • Questions: What is the testable hypothesis? How is it tested? What are the results? Are the conclusions valid? • Are there other (better?) ways to test this hypothesis? Are there better hypotheses to formulate?
Biological Questions • What are the differences between cancer cells and normal cells? • What are the differences in gene expression between cancer cells and normal cells? • Can you guess at the cellular sub-systems that may be affected by cancer? • What are the cellular processes (pathways) that might differ between cancer cells and normal cells? • Can you guess at the components (proteins) of the pathways that might be involved in cancer?
Cluster • Download version 1.4 (v2.2 has a bug): • http://rana.lbl.gov/EisenSoftware.htm • Load data • Data transform***
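As a rough illustration of what a data transform step can involve, the sketch below log-transforms each gene's values and then mean-centers them. It assumes the inputs are positive expression ratios; the gene names and numbers are hypothetical, and this is not the Cluster program's own code.

```python
import math

def log2_transform(row):
    """Log2-transform positive expression ratios so 2x up and 2x down are symmetric."""
    if any(v <= 0 for v in row):
        raise ValueError("log transform needs strictly positive values")
    return [math.log2(v) for v in row]

def mean_center(row):
    """Subtract the gene's mean so each profile is centered on zero."""
    m = sum(row) / len(row)
    return [v - m for v in row]

# Hypothetical expression ratios for two genes across four samples.
raw = {
    "gene_x": [1.0, 2.0, 4.0, 8.0],
    "gene_y": [4.0, 4.0, 1.0, 1.0],
}

transformed = {gene: mean_center(log2_transform(values))
               for gene, values in raw.items()}
```

After this step the profiles describe relative change across samples rather than absolute intensity, which is usually what a correlation-based clustering compares.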
TreeView • Download latest version • http://rana.lbl.gov/EisenSoftware.htm • Load data
Analysis • Clustering may reveal organizational units • What are these proteins and what processes are they involved in?
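To make the idea of clusters as "organizational units" concrete, here is a minimal sketch of average-linkage hierarchical clustering using 1 minus the Pearson correlation as the distance. The gene names, profiles, and the `average_linkage` helper are assumptions for illustration; it is a toy version of the general technique, not the implementation used in the workshop software.

```python
import math

def pearson(x, y):
    """Pearson correlation; assumes equal lengths and no flat profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    names = list(profiles)
    # Pairwise gene distances: 0 for perfectly correlated, 2 for anti-correlated.
    dist = {frozenset((a, b)): 1.0 - pearson(profiles[a], profiles[b])
            for i, a in enumerate(names) for b in names[i + 1:]}
    clusters = [[name] for name in names]

    def cluster_distance(c1, c2):
        # Average linkage: mean distance over all cross-cluster gene pairs.
        pairs = [dist[frozenset((a, b))] for a in c1 for b in c2]
        return sum(pairs) / len(pairs)

    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters

# Hypothetical centered log-ratio profiles across four samples.
profiles = {
    "g1": [-1.5, -0.5, 0.5, 1.5],
    "g2": [-1.2, -0.6, 0.4, 1.4],
    "g3": [1.0, 1.0, -1.0, -1.0],
    "g4": [1.1, 0.7, -0.8, -1.0],
}

groups = average_linkage(profiles, n_clusters=2)
```

With these made-up profiles, g1 and g2 end up in one group and g3 and g4 in the other; the follow-up question on the slide is what the genes in each group have in common biologically.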
Next Steps • Hand off clusters of organizational units to Doug (GenMAPP and MAPPFinder: What are these proteins in the context of cellular pathways?) • … • Investigate interesting single proteins (e.g., with NCBI tools). • Are these proteins conserved? (Do yeast get cancer?) • What is the molecular basis of cancer?
Goals [diagram; labels: Cluster, Medicine, Stats, Math, Biology, Cell Biology, Knowledge, Pathway, Physics, Optics, Robotics, Protein]