BIND: the Biomolecular Interaction Network Database. Gary D. Bader, Doron Betel and Christopher W. V. Hogue. Seminar in Bioinformatics, Elinor Heller
Abstract • What is BIND? • Why do we need a tool like BIND? • How does BIND work?
What is BIND? • BIND is a database that holds information about: • Biomolecular interactions • Reactions • Complexes • Pathways http://www.bind.ca
Why do we need Protein Interaction Info? Motivation…
Why do we need Protein Interaction Info? - cont • Learning protein functions: if two proteins interact, it is very likely that their functions are related as well. • Cellular operations are largely carried out through interactions among proteins. • From protein pathways to understanding cells, tissues, … to life and evolution
Why do we need Bind? • Until 2001: • This type of data was stored in journal publications, where it is difficult to mine.
Why do we need BIND? - cont • The genome era has taught us that it is important to have effective tools for storing and managing data before the data become too large. • Preparing for the future: a concerted effort by the biological community is required now to prepare for the interaction information of the near future.
BIND - Goals: • Provide a standard, comprehensive and integrated interaction resource to the scientific community • Define protein function and mechanisms • Recover and integrate biomolecular interaction knowledge • Discover new knowledge through data mining
BIND data specification: • The problem: storing different kinds of interactions, each with a different data structure, in a generic way. • Solution: use ASN.1. • Main concept: ASN.1 is a formal notation for describing data transmitted by telecommunications protocols, independent of the implementation language and of the physical representation of the data, whether the application is complex or very simple.
What is ASN.1? • ASN.1 = Abstract Syntax Notation One • An internationally standardized data specification language used to build complex data types in a hierarchical manner; its origins are at Xerox • Used in telephone systems, air traffic, building and machine control, toll highways, smart cards, security and more • Used by NCBI to store GenBank, PubMed, MMDB and more • For more info - http://www.oss.com/
Objects and data types: what kind of information does BIND store? • BIND stores information about interactions, molecular complexes and pathways. (These are the high-level data types.)
Interactions: • An interaction record stores a description of the binding event between two objects, A and B, which are generally molecules.
Molecular complex: • A generally stable aggregate of molecules that have a function when linked together and are usually described as having sub-units. Example: the ribosome.
Pathways: • A pathway is defined as a group of molecules that are generally free from each other, but form a network of interactions usually to mediate some cellular function.
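As a rough illustration, the three high-level data types defined above (interaction, molecular complex, pathway) could be modelled as shown below. This is a minimal Python sketch, not BIND's actual ASN.1 specification: every class name, field name and example value is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(frozen=True)
class Interaction:
    """A binding event between two objects, A and B (hypothetical model)."""
    a: str  # name of object A, generally a molecule
    b: str  # name of object B


@dataclass
class MolecularComplex:
    """A generally stable aggregate of sub-units that function together."""
    name: str
    subunits: List[str] = field(default_factory=list)


@dataclass
class Pathway:
    """A network of interactions among molecules that are otherwise free."""
    name: str
    interactions: List[Interaction] = field(default_factory=list)

    def molecules(self) -> Set[str]:
        """Return every molecule that takes part in the pathway."""
        return {m for i in self.interactions for m in (i.a, i.b)}


# Made-up placeholder names, for illustration only.
pathway = Pathway("example", [Interaction("P1", "P2"), Interaction("P2", "P3")])
members = sorted(pathway.molecules())
```

The design choice here is that a pathway is stored as a list of interactions rather than a list of molecules, which mirrors the definition of a pathway as a network of interactions.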
BIND objects: • An object in BIND is basically a molecule. It can be: DNA, RNA, protein, photon, or a small/complex molecule.
BIND objects - cont. The object record holds: • its name and a list of name synonyms • its origin (whether natural or not) • where it occurs in the cell • the cell stages in which it occurs • a reference to a sequence database, or a full instantiation of the biological sequence and 3D structure.
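The fields of the object record listed above can be sketched as a simple record type. This is only an illustrative Python sketch under assumed names: `BindObjectRecord`, `Origin` and all field names are hypothetical and are not taken from the BIND specification, and the example values are placeholders.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Origin(Enum):
    """Whether the object is of natural origin or not."""
    NATURAL = "natural"
    NOT_NATURAL = "not natural"


@dataclass
class BindObjectRecord:
    """Hypothetical model of a BIND object record."""
    name: str
    synonyms: List[str] = field(default_factory=list)
    origin: Origin = Origin.NATURAL
    cellular_locations: List[str] = field(default_factory=list)  # where it occurs
    cell_stages: List[str] = field(default_factory=list)         # when it occurs
    sequence_db_ref: Optional[str] = None  # reference to a sequence database
    sequence: Optional[str] = None         # or the full sequence itself

    def all_names(self) -> List[str]:
        """Primary name followed by all synonyms."""
        return [self.name, *self.synonyms]

    def has_sequence_info(self) -> bool:
        """True if either a database reference or a full sequence is present."""
        return self.sequence_db_ref is not None or self.sequence is not None


# Placeholder values, for illustration only.
record = BindObjectRecord(
    name="protein X",
    synonyms=["PX"],
    sequence_db_ref="EXAMPLE-DB:0000",
)
```

Keeping the database reference and the full sequence as two optional fields reflects the "reference or full instantiation" choice described in the slide.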
BIND objects - cont. • Most of the biological information in BIND is stored in an interaction record. • An interaction also stores:
DATA SUBMISSION Data is entered into BIND by either manual or automatic methods. Who enters the data? • Experts on the BIND team enter high-quality records on a continuing basis. • Users are encouraged to enter records into the database through the web-based system, or to contact the BIND staff if they have large data sets they want to process.
DATA SUBMISSION - cont. How is a record submitted? • First stage: entering contact information. • Second stage: entering the PubMed identifier and the two interacting molecules. Every record entered in this way is validated by BIND indexers and by at least one other expert before it is made available in any public data release.
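The submission and validation steps described above can be pictured as a small record with a release check. This is a hedged Python sketch only: the `Submission` class, its field names and the example values are hypothetical and do not describe BIND's actual submission system.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    """Hypothetical model of a user-submitted interaction record."""
    contact: str            # first stage: contact information
    pubmed_id: int          # second stage: PubMed identifier
    molecule_a: str         # second stage: the two interacting molecules
    molecule_b: str
    indexer_approved: bool = False  # validated by BIND indexers
    expert_approvals: int = 0       # validations by other experts

    def is_releasable(self) -> bool:
        """A record is public only after indexer and expert validation."""
        return self.indexer_approved and self.expert_approvals >= 1


# Placeholder values, for illustration only.
sub = Submission("user@example.org", 0, "molecule A", "molecule B")
before = sub.is_releasable()

sub.indexer_approved = True
after_indexer = sub.is_releasable()

sub.expert_approvals += 1
after_expert = sub.is_releasable()
```

In this sketch a record validated by indexers alone is still held back, matching the requirement for at least one additional expert.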
DATA SUBMISSION-cont • Submitters cannot limit the intended use of submitted BIND data • Submitters have the right to edit/alter their records over time • Suggestions made by a third party will be forwarded by us to the submitters to seek approval for any changes or corrections
BIND growth: • The first version of BIND (June 1999): • Interactions: over 1000 • Pathways: 6 • Complexes: 40 • The latest version of BIND: • Interactions: 178004 • Pathways: 8 • Complexes: 3388
FAST = “parallel” RPS BLAST • Used to spot domain similarities in a protein interaction cluster • Server-generated, scalable Flash graphics: zoomable and printable • Followed up by zooming in on FASTA-formatted sequences to see domain superposition and links to SMART/PFAM
More uses for BIND - 1 • Helping direct future interaction studies. Example: the human and mouse variants of the protein tyrosine kinase Fyn: • Each has 9 recorded interactions in BIND • They share 6 similar interactions • The mouse variant is known to interact with the protein Vav • The human variant has no record of interaction with the Vav homologue.
Example - cont. Using BIND in combination with other tools, it has recently been found that human homologues with domain architecture similar to that of the mouse Fyn interaction partners can be identified.
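The Fyn comparison above amounts to a set difference between the recorded partners of the two variants. The Python sketch below illustrates the idea under two loud assumptions: all partner names except Vav are invented placeholders, and homologous partners are assumed to have already been mapped to a common name.

```python
from typing import List, Set

# Placeholder partner names: only Vav is named in the example above.
shared = {f"shared_{i}" for i in range(1, 7)}  # the 6 shared interactions
mouse_fyn = shared | {"Vav", "mouse_only_1", "mouse_only_2"}            # 9 records
human_fyn = shared | {"human_only_1", "human_only_2", "human_only_3"}   # 9 records


def candidate_partners(known: Set[str], other_species: Set[str]) -> List[str]:
    """Partners recorded in the other species but not yet in this one."""
    return sorted(other_species - known)


# Interactions worth testing for human Fyn, based on the mouse records.
candidates = candidate_partners(human_fyn, mouse_fyn)
print(candidates)
```

Partners that appear only in the mouse set, such as Vav, are the ones this kind of comparison would flag as candidates for future interaction studies in human.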
More uses for BIND - 2 • Comparing organisms with different numbers of genes. Example: Drosophila vs. C. elegans
Example - cont. • Which has the higher gene number? • Which has the greater protein interaction complexity?
References: • Bader,G.D., Betel,D. and Hogue,C.W.V. (2003) BIND: the Biomolecular Interaction Network Database. Nucleic Acids Res., 31(1), 248-250. • Bader,G.D., Donaldson,I., Wolting,C., Ouellette,B.F., Pawson,T. and Hogue,C.W. (2001) BIND--the biomolecular interaction network database. Nucleic Acids Res., 29, 242-245. • Bader,G.D. and Hogue,C.W. (2000) BIND--a data specification for storing and describing biomolecular interactions, molecular complexes and pathways. Bioinformatics, 16, 465-477. • Betel,D., Hogue,C.W.V. et al. (2005) The BIND and related tools 2005 update. Nucleic Acids Res., 33, D418-D422. • http://www.bind.ca • http://www.ncbi.nlm.nih.gov/ • http://www.oss.com/