Develop a software package to group genes, represent functional themes, and minimize information loss using Gene Ontology concepts. This project involves learning Python, object-oriented programming, and graph algorithms.
Developing a Software Package for Conceptualizing Molecular Findings
Xinghua Lu, Harry Hocheiser & Vicky Chen
Dept. of Biomedical Informatics
Motivation and Goal
• Bioinformatics research often produces a long list of genes of potential interest, e.g., genes differentially expressed in a cancer tumor
• A key task is to understand the major functional themes of these genes
• Input: a list of genes of interest
• Output: the genes divided into functional groups, with the functional theme of each group represented by a suitable biological concept
Strategies
• Currently, gene function is annotated with specific concepts from the Gene Ontology
• The Gene Ontology consists of a set of concepts organized in a hierarchy, specifically a directed acyclic graph (DAG)
• Given a gene list, find the corresponding annotations on the graph
• Group genes whose functions are closely related within the graph, and search for a general concept that summarizes each group
• Quantitatively assess the information loss and strive to minimize it
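The strategy above can be sketched on a toy example. Everything here is an illustrative assumption rather than the project's actual code: the miniature ontology, the gene names, and the loss measure. The slides do not define the loss, so this sketch uses a common information-theoretic stand-in: the information content of a term is -log2 of the fraction of genes annotated at or below it, and the loss is the information content given up when a specific annotation is replaced by a more general ancestor.

```python
import math

# Hypothetical toy ontology (not real Gene Ontology terms): child -> parent terms.
PARENTS = {
    "root": [],
    "metabolism": ["root"],
    "signaling": ["root"],
    "lipid_metabolism": ["metabolism"],
    "sugar_metabolism": ["metabolism"],
}

# Hypothetical direct annotations: gene -> the specific term it is annotated with.
ANNOTATIONS = {
    "geneA": "lipid_metabolism",
    "geneB": "sugar_metabolism",
    "geneC": "signaling",
    "geneD": "lipid_metabolism",
}


def ancestors(term):
    """Return the term itself plus every term reachable via parent edges."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def information_content(term):
    """Assumed measure: -log2 of the fraction of genes annotated at or below term."""
    covered = sum(1 for t in ANNOTATIONS.values() if term in ancestors(t))
    if covered == 0:
        return math.inf  # no gene uses this term, so it is maximally specific
    return -math.log2(covered / len(ANNOTATIONS))


def best_summary(genes):
    """Pick the most informative common ancestor and the total information lost."""
    terms = [ANNOTATIONS[g] for g in genes]
    common = set.intersection(*(ancestors(t) for t in terms))
    summary = max(common, key=information_content)
    loss = sum(information_content(t) - information_content(summary) for t in terms)
    return summary, loss


summary, loss = best_summary(["geneA", "geneB"])
print(summary, round(loss, 3))
```

Summarizing geneA and geneB selects the shared parent rather than the root, because the root carries no information under this measure. Adding geneC to the group would force the summary up to the root and increase the loss, which is the trade-off the grouping step has to balance.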
What is involved
• Learning the Python programming language
• Object-oriented programming, graph representation, and graph algorithms
• Information theory
• Most of the functionality is already developed
• Need to organize the code into a package or toolkit
• Define the API
• Learn what the existing code does; modify it if needed
• Implement the API by wrapping existing code
• Package into a Python module
• Write documentation
• Submit to the BioPython community
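As a rough illustration of "define the API, then implement it by wrapping existing code", the sketch below puts a small public class in front of a stand-in legacy function. All names (`_legacy_group_by_term`, `GeneGroup`, `Conceptualizer`, `summarize`) are hypothetical and chosen only for this example; the project's real functions and API are not described in the slides.

```python
from dataclasses import dataclass


def _legacy_group_by_term(gene_to_term):
    """Stand-in for already-developed code (hypothetical): bucket genes by term."""
    groups = {}
    for gene, term in gene_to_term.items():
        groups.setdefault(term, []).append(gene)
    return groups


@dataclass(frozen=True)
class GeneGroup:
    """Public result type: one concept and the genes it summarizes."""
    concept: str
    genes: tuple


class Conceptualizer:
    """Hypothetical public API: a thin wrapper that hides the legacy function."""

    def __init__(self, gene_to_term):
        # Copy the mapping so later changes by the caller do not affect us.
        self._gene_to_term = dict(gene_to_term)

    def summarize(self, genes):
        """Return one GeneGroup per concept for the requested genes."""
        unknown = [g for g in genes if g not in self._gene_to_term]
        if unknown:
            raise KeyError(f"genes without annotation: {unknown}")
        selected = {g: self._gene_to_term[g] for g in genes}
        raw = _legacy_group_by_term(selected)
        # Sort concepts and members so the output order is deterministic.
        return [GeneGroup(concept, tuple(sorted(members)))
                for concept, members in sorted(raw.items())]


api = Conceptualizer({"g1": "t1", "g2": "t1", "g3": "t2"})
for group in api.summarize(["g2", "g1", "g3"]):
    print(group.concept, group.genes)
```

Keeping the legacy function private and returning a small, documented result type means the internals can be modified later without changing what users of the package see.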