BioNetwork Biological Modeling and Analysis. Microarray and Visualization. Data Integration. Bioinformatics data is currently spread across the internet and across organizations in different formats. Data integration combines these different sources into a single data set.
BioNetwork Biological Modeling and Analysis Microarray and Visualization
Data Integration
• Bioinformatics data is currently spread across the internet and across organizations in different formats.
• Data integration combines these different sources into a single data set.
• With integrated data, scientists can discover relationships between genes, proteins, etc., which enables better and faster decisions about diseases and drug compounds.
Data Source
• Large, complex data structures reflect the richness of the scientific concepts.
• Bioinformatics data sources cover similar domains, such as genes, proteins, structures, DNA, or microarray results.
• We need an integrated view of all the data sources above that are relevant to a particular line of research.
• Data is often incomplete, in different formats, and missing certain attributes.
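The integration idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two sources, their field names (`gene`, `symbol`, `protein`, `pathway`) and the sample records are hypothetical and are not taken from BioNetwork itself. It shows records in different formats being merged under one key, with missing attributes kept as `None` rather than dropped.

```python
from typing import Dict, List, Optional

# Hypothetical sample records: two sources describing overlapping genes
# with different field names and missing attributes.
source_a: List[Dict[str, Optional[str]]] = [
    {"gene": "TYR", "protein": "Tyrosinase"},
    {"gene": "MITF", "protein": None},
]
source_b: List[Dict[str, Optional[str]]] = [
    {"symbol": "tyr", "pathway": "Melanogenesis"},
    {"symbol": "KIT", "pathway": None},
]

# Attributes of the integrated view (an assumption for this sketch).
FIELDS = ("protein", "pathway")


def integrate(a, b) -> Dict[str, Dict[str, Optional[str]]]:
    """Merge both sources into one record per gene, keyed by upper-cased symbol."""
    merged: Dict[str, Dict[str, Optional[str]]] = {}
    for rec in a:
        key = rec["gene"].upper()
        merged.setdefault(key, dict.fromkeys(FIELDS))["protein"] = rec.get("protein")
    for rec in b:
        key = rec["symbol"].upper()
        merged.setdefault(key, dict.fromkeys(FIELDS))["pathway"] = rec.get("pathway")
    return merged


integrated = integrate(source_a, source_b)
for gene, attrs in sorted(integrated.items()):
    print(gene, attrs)
```

Normalizing the key (here, upper-casing the gene symbol) is what lets records from differently formatted sources line up; a real system would need a more careful identifier mapping.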
Microarray • An array used for expression profiling: monitoring the expression levels of thousands of genes simultaneously
Part I: Microarray of Hair and Skin Epidermis Melanocytes
• Figure 1 legend: A. melanocytes; B. muscle; C. sebaceous gland; D. hair shaft; E. epidermis; F. dermis; G. subcutaneous tissue; H. fat; I. artery; J. sweat gland; K. hair follicle; L. Pacinian corpuscle
Data Source
• Data is from Prof. Des Tobin, School of Life Sciences
• Contains 24,000+ genes
• Includes over-expressed and under-expressed genes
• Data is incomplete
BioNetwork • System Architecture
Functional Functions
• 1. Search by any keyword
• 2. Search by multiple keywords using Boolean operators
• 3. Extract gene information
• 4. Over In expression and Only in expression
• 5. Save the search results back to Excel format
• 6. Excel reader
• 7. Connection to public microarray databases (Gene Ontology, GenBank, KEGG)
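Function 2 above, searching with multiple keywords and a Boolean operator, could look roughly like the sketch below. The function names, the supported operators (`AND`, `OR`, `NOT`) and the sample rows are assumptions made for illustration; the slides do not show how BioNetwork implements its search.

```python
from typing import Dict, Iterable, List


def matches(text: str, keywords: Iterable[str], operator: str = "AND") -> bool:
    """Case-insensitive substring match of keywords against one text."""
    haystack = text.lower()
    hits = [kw.lower() in haystack for kw in keywords]
    if operator == "AND":
        return all(hits)
    if operator == "OR":
        return any(hits)
    if operator == "NOT":
        return not any(hits)
    raise ValueError(f"unsupported operator: {operator}")


def search(rows: List[Dict[str, str]], keywords: Iterable[str],
           operator: str = "AND") -> List[Dict[str, str]]:
    """Return the rows whose combined field values satisfy the query."""
    keywords = list(keywords)
    return [
        row for row in rows
        if matches(" ".join(str(v) for v in row.values()), keywords, operator)
    ]


# Hypothetical rows standing in for a spreadsheet of gene annotations.
rows = [
    {"gene": "TYR", "description": "tyrosinase, melanin synthesis"},
    {"gene": "KRT14", "description": "keratin 14, epidermis"},
    {"gene": "MITF", "description": "melanocyte transcription factor"},
]

print([r["gene"] for r in search(rows, ["melan", "tyrosinase"], "AND")])
print([r["gene"] for r in search(rows, ["melan", "tyrosinase"], "OR")])
print([r["gene"] for r in search(rows, ["melan", "tyrosinase"], "NOT")])
```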
Part II: Modeling and Analysis
• Purpose: to identify and model genes
• Preparation: background adjustment, normalization, summary
• Differential expression: over-expressed genes, under-expressed genes
• Pearson calculation: co-expression of genes
• Biology network: scale-free network, random network
Preparation
• Data source: Lung (http://genome-www5.stanford.edu)
• Normalization and background adjustment
• Fortunately, all downloaded data are already normalized: (x − µ) / λ
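For readers who want to see what a normalization of the form (x − µ) / λ does, here is a minimal sketch. The slide does not define the symbols, so this example assumes µ is the mean of the values and λ is a scale factor taken to be the sample standard deviation; both are assumptions, and the input values are made up.

```python
from statistics import mean, stdev
from typing import List


def normalize(values: List[float]) -> List[float]:
    """Center values on their mean and divide by a scale factor.

    Assumption: the scale factor (lambda on the slide) is the sample
    standard deviation; the slide itself does not say what it is.
    """
    mu = mean(values)
    scale = stdev(values)
    if scale == 0:
        raise ValueError("cannot normalize values with zero spread")
    return [(x - mu) / scale for x in values]


normalized = normalize([2.0, 4.0, 6.0])
print(normalized)
```

After this transformation the values are centered on zero, which is what makes a sign-based reading of expression levels possible.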
Differential Expression • Over-expressed: p > 0 • Under-expressed: p < 0
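The sign rule above is simple enough to state as code. In this sketch `p` stands for whatever per-gene value the slide refers to (presumably the normalized expression value, which is an assumption), and the "unchanged" label for p = 0 is also an assumption, since the slide only covers p > 0 and p < 0. The sample values are invented.

```python
from typing import Dict


def classify(p: float) -> str:
    """Label a gene by the sign of its value p."""
    if p > 0:
        return "over-expressed"
    if p < 0:
        return "under-expressed"
    return "unchanged"  # assumption: the slide does not cover p == 0


# Hypothetical normalized values for three genes.
values: Dict[str, float] = {"geneA": 1.8, "geneB": -0.6, "geneC": 0.0}
labels = {gene: classify(p) for gene, p in values.items()}
print(labels)
```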
Co-expression
• Similarity between genes that are either under- or over-expressed
• Pearson correlation
• How to interpret correlations:
• -1.0 to -0.7: strong negative association
• -0.7 to -0.3: weak negative association
• -0.3 to +0.3: little or no association
• +0.3 to +0.7: weak positive association
• +0.7 to +1.0: strong positive association
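A small sketch of the Pearson calculation and the interpretation bands listed above. The formula is the standard Pearson correlation coefficient; how boundary values such as exactly 0.3 or 0.7 are assigned is an assumption, because the listed ranges overlap at their endpoints. The expression profiles are made-up numbers.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (sx * sy)


def interpret(r: float) -> str:
    """Map a correlation to the bands above (boundary handling is assumed)."""
    if r <= -0.7:
        return "strong negative association"
    if r <= -0.3:
        return "weak negative association"
    if r < 0.3:
        return "little or no association"
    if r < 0.7:
        return "weak positive association"
    return "strong positive association"


# Hypothetical expression profiles of two genes across five samples.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [2.1, 3.9, 6.2, 8.1, 9.8]
r = pearson(gene_a, gene_b)
print(round(r, 3), interpret(r))
```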
Visualization • Matrix of microarray • Biology network (random, scale-free network)
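One way to get from a correlation matrix to a network is to connect two genes whenever their correlation is strong. The sketch below assumes a threshold of |r| ≥ 0.7 (borrowed from the "strong association" band; the slides do not say which cutoff BioNetwork uses) and uses an invented 4×4 matrix. Counting each node's degree is a first step toward judging whether a network looks random or scale-free, since a scale-free network has a few highly connected hubs.

```python
from collections import Counter
from itertools import combinations
from typing import List, Sequence, Tuple


def build_network(genes: Sequence[str], corr: Sequence[Sequence[float]],
                  threshold: float = 0.7) -> List[Tuple[str, str]]:
    """Return an edge for every gene pair with |correlation| >= threshold."""
    edges = []
    for i, j in combinations(range(len(genes)), 2):
        if abs(corr[i][j]) >= threshold:
            edges.append((genes[i], genes[j]))
    return edges


def degree_counts(edges: Sequence[Tuple[str, str]]) -> Counter:
    """Count how many edges touch each gene."""
    degrees: Counter = Counter()
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees


# Hypothetical symmetric correlation matrix for four genes.
genes = ["g1", "g2", "g3", "g4"]
corr = [
    [1.00, 0.90, 0.80, -0.75],
    [0.90, 1.00, 0.10, 0.20],
    [0.80, 0.10, 1.00, 0.00],
    [-0.75, 0.20, 0.00, 1.00],
]

edges = build_network(genes, corr)
degrees = degree_counts(edges)
print(edges)
print(dict(degrees))
```

In this toy matrix g1 acts as a hub connected to every other gene, while the remaining genes have a single edge each.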
BioNetwork Biological Modeling and Analysis Frederik Surya Tjoe