Gene Ontology Network Visualization and Analysis in Hepatitis C Virus (HCV) Microarray Data Function Analysis

R.B.H. Williams (1), K. Xu (2), X.-X. Huang (1), C.J. Cotsapas (1), S.-H. Hong (2), G.W. McCaughan (3), M.D. Gorrell (3), P.F.R. Little (1)
1. School of Biotechnology and Biomolecular Sciences, University of New South Wales, Australia.
2. VALACON project, National ICT Australia, Australia.
3. Centenary Institute of Cancer Medicine and Cell Biology, University of Sydney, Australia.

Using visualization and network analysis to assist function analysis of microarray data

[Schematic labels: Gene 1, Gene 2, Gene 3, …, Gene n linked to Gene Ontology term 1, Gene Ontology term 2, Gene Ontology term 3, …, Gene Ontology term m]
[Workflow diagram labels: Microarray data; Functions of interest; Candidate genes; Analysis using Gene Ontology]

• Current approaches:
  • Focus on identifying Gene Ontology (GO) terms that are over-represented in a set of genes of interest.
• Problems:
  • Provide biological information at a level that is either too specific or too general.
  • Tend to neglect the diversity of biological function (favoring genes with one or a few GO terms).
  • Focus on individual terms, neglect the connections between them, and cannot explain the functions of a group of genes.
• Our approach:
  • Analyze the data in the context of a custom-built gene-function network.
• Data used:
  • Molecular pathogenesis of liver disease in hepatitis C virus (HCV) infection in humans.

Step 1: Network construction
A k-level gene-term network is constructed according to the parameter k, which specifies the level of abstraction: each gene is connected to the kth parent of its primary annotation. The connections among these terms in the GO hierarchy are also included. Increasing the value of k results in the inclusion of higher-level terms from the GO hierarchy. An example of a 2-level network is shown below.

Figure: 2-level gene-term network for the 100 differentially expressed genes in the HCV data. GO terms are green and genes are blue. GO terms that are highly annotated are shown in shades of red based on their number of connections.

Step 2: Statistical analysis
We use the number of genes annotated to a GO term (the term's degree) as the test statistic to select functions of interest, comparing against a set of background genes. This utilizes the gene-term network structure rather than the number of appearances of a term. The results of this step and the next are shown in the figure below.

Figure: Interrelationships between GO term degree, statistical significance of GO term degree, and cluster membership in the 1-level gene-term network. Left panel: the ranked distribution of GO term degree (i.e. the number of genes annotated to a term; closed black circles) and the bootstrap estimate of the P-value of observing a degree of that magnitude compared to all genes expressed in the experiment (open white circles). GO terms with high significance are highlighted with light grey banding (only terms with P < 0.01 are shown; no multiple comparison correction). Right panel: the membership of each term in the top 6 clusters (ranked by number of genes and labeled c1 to c6) detected by hierarchical clustering. Shading denotes the proportion of genes in each cluster that map to a term (black is 100% and white is 0%).

Step 3: Clustering
Hierarchical clustering is used to identify groups of genes that have related functional annotations. The similarity of the GO annotations is used as the distance metric between genes. The accompanying figure shows cluster 5 from the results in the figure above.
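
The three steps described above (k-level network construction, a bootstrap test on GO term degree, and an annotation-based distance between genes) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the toy hierarchy `GO_PARENT`, the toy annotations `PRIMARY`, the single-parent simplification, the choice of resampling with replacement from the background, the Jaccard distance, and all function names are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical toy hierarchy: child GO term -> parent GO term.
# Real GO terms can have several parents; one parent is assumed here.
GO_PARENT = {"t_leaf1": "t_mid1", "t_leaf2": "t_mid1", "t_leaf3": "t_mid2",
             "t_mid1": "t_top", "t_mid2": "t_top"}

# Hypothetical primary annotation for every gene expressed in the experiment.
PRIMARY = {"g1": "t_leaf1", "g2": "t_leaf1", "g3": "t_leaf2",
           "g4": "t_leaf3", "g5": "t_leaf3", "g6": "t_leaf3"}


def kth_parent(term, level):
    """Walk `level` steps up the hierarchy, stopping early at the root."""
    for _ in range(level):
        if term not in GO_PARENT:
            break
        term = GO_PARENT[term]
    return term


def build_network(genes, level):
    """Step 1: map each term to the genes attached to it at this level."""
    network = defaultdict(set)
    for gene in genes:
        network[kth_parent(PRIMARY[gene], level)].add(gene)
    return dict(network)


def term_degrees(genes, level):
    """Degree of a term = number of genes annotated to it in the network."""
    return {t: len(members) for t, members in build_network(genes, level).items()}


def bootstrap_p_value(term, selected, background, level, n_resamples=2000, seed=0):
    """Step 2: estimate how often a random gene set of the same size
    reaches at least the observed degree for `term`."""
    rng = random.Random(seed)
    observed = term_degrees(selected, level).get(term, 0)
    hits = 0
    for _ in range(n_resamples):
        sample = rng.choices(background, k=len(selected))  # with replacement
        degree = sum(1 for g in sample if kth_parent(PRIMARY[g], level) == term)
        if degree >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


def gene_terms(gene, level):
    """Terms reachable from a gene: its level-k term plus that term's ancestors."""
    term = kth_parent(PRIMARY[gene], level)
    terms = {term}
    while term in GO_PARENT:
        term = GO_PARENT[term]
        terms.add(term)
    return terms


def jaccard_distance(gene_a, gene_b, level):
    """Step 3 input: 1 - |shared terms| / |all terms| for a pair of genes."""
    a, b = gene_terms(gene_a, level), gene_terms(gene_b, level)
    return 1.0 - len(a & b) / len(a | b)


if __name__ == "__main__":
    selected = ["g1", "g2", "g3"]
    print(build_network(selected, level=1))
    print(bootstrap_p_value("t_mid1", selected, list(PRIMARY), level=1))
    print(jaccard_distance("g1", "g4", level=1))
```

A pairwise distance matrix built from `jaccard_distance` could then be passed to any standard hierarchical clustering routine; which linkage method the poster used is not stated, so none is assumed here.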