Web Resources for Bioinformatics
Vadim Alexandrov and Mark Gerstein
What is Bioinformatics?
• (Molecular) Bio-informatics
• One idea for a definition? Bioinformatics is conceptualizing biology in terms of molecules (in the sense of physical chemistry) and then applying "informatics" techniques (derived from disciplines such as applied math, CS, and statistics) to understand and organize the information associated with these molecules, on a large scale.
• Bioinformatics is "MIS" for Molecular Biology Information. It is a practical discipline with many applications.
Molecules: sequence, structure, function
Algorithms: HMMs, alignments, simulations
Web Resources:
• Databases
0. Good Starting Point
http://www.ncbi.nlm.nih.gov/
http://www.rcsb.org/pdb/
Web tour of UCL tools and resources: www.biochem.ucl.ac.uk/bsm/biocomp
1. PDBsum capabilities
PDBsum: www.biochem.ucl.ac.uk/bsm/pdbsum
Starting point for looking at a PDB structure. Each entry contains:
a. View
- Schematic pictures of the entry
- Interactive views (RasMol/VRML)
b. Details
- Name, date and description of macromolecules in PDB entry
- Authors, resolution and R-factor
c. Links
- PDB header information
- PDB, NDB, SWISS-PROT
- PQS (protein quaternary structure), MMDB
- CATH, SCOP, FSSP
- Structure check reports: PROCHECK, WHATIF
- Many others: enzyme, PRINTS, etc.
PDBsum capabilities, continued
d. Each chain
- CATH classification
- Plot of sequence, secondary structure and domain assignments
- PROMOTIF analysis
- TOPS topology diagram
- SAS: annotated FASTA alignment of related sequences in PDB
- PROSITE pattern
e. Nucleic acid ligands
- Base sequence
- NUCPLOT diagram of interactions
f. Small molecule ligands
- Schematic diagram of ligand
- LIGPLOT diagram of interactions
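The "Details" fields listed for a PDBsum entry (name, date, authors, resolution) come from the header records of the underlying PDB file. The sketch below shows one way to pull a few of them out of PDB-format text. It is only an illustration, not how PDBsum itself works: the function name `parse_pdb_header` and the example entry `1ABC` with its authors are hypothetical, and the R-factor is left out because its REMARK 3 layout varies between entries.

```python
import re

def parse_pdb_header(lines):
    """Collect a few summary fields from PDB-format header records."""
    info = {"classification": None, "date": None, "id": None,
            "authors": [], "resolution": None}
    for line in lines:
        record = line[:6].strip()
        if record == "HEADER":
            # Fixed columns: 11-50 classification, 51-59 date, 63-66 ID code.
            info["classification"] = line[10:50].strip()
            info["date"] = line[50:59].strip()
            info["id"] = line[62:66].strip()
        elif record == "AUTHOR":
            # Columns 11 onward hold a comma-separated author list.
            names = [name.strip() for name in line[10:].split(",")]
            info["authors"].extend(name for name in names if name)
        elif line.startswith("REMARK   2"):
            # REMARK 2 carries the resolution in Angstroms when available.
            match = re.search(r"RESOLUTION\.\s+([\d.]+)\s+ANGSTROMS", line)
            if match:
                info["resolution"] = float(match.group(1))
    return info

# Hypothetical header, built with explicit padding so the columns line up.
example = [
    f"{'HEADER':<10}{'HYDROLASE':<40}{'01-JAN-00':<9}   1ABC",
    f"{'AUTHOR':<10}A.N.AUTHOR,B.OTHER",
    "REMARK   2 RESOLUTION.    1.80 ANGSTROMS.",
]
summary = parse_pdb_header(example)
print(summary)
```

For real work an established parser is the safer choice; the fixed-column slicing here is only meant to show where the summary fields live.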
2. SAS (Sequence Annotated by Structure): www.biochem.ucl.ac.uk/bsm/sas
Annotation of protein sequences by structural information.
a. Input for FASTA search of rest of PDB
- PDB code
- SWISS-PROT code
- Paste sequence
- Upload own alignment
b. Annotation
- Residue type
- Ligand contacts
- Active site residues
- CATH domains
- Residue similarity
c. Options
- Select inclusion in alignment
- Colour/b&w, secondary structure
d. View 3D structural superposition
- Coloured by SAS annotation
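To make the "residue similarity" annotation concrete, here is a minimal sketch that scores each column of an alignment by how often its most common residue occurs. This is a simple stand-in chosen for illustration; the slides do not say how SAS computes similarity, and the function name and the three toy sequences are assumptions.

```python
from collections import Counter

def column_conservation(aligned):
    """Per column: fraction of non-gap residues equal to the most common residue."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*aligned):
        residues = [r for r in column if r != "-"]  # ignore gap characters
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(residues))
    return scores

# Toy alignment (hypothetical sequences, gaps written as "-").
alignment = ["MKT-LV", "MKSALV", "MRT-LI"]
scores = column_conservation(alignment)
for position, score in enumerate(scores, start=1):
    print(position, round(score, 2))
```

A score like this could then drive the colouring of residues in an alignment view, with fully conserved columns at 1.0.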
3. CATH: www.biochem.ucl.ac.uk/bsm/cath
Hierarchical domain classification of protein structures in the PDB. Four basic levels (a-d), plus a sequence-family level (e):
a. Class (automated): secondary structure composition and packing within the structure
- mainly-α, mainly-β, mixed α-β, low secondary structure
b. Architecture (manual): overall shape of the domain structure as determined by the orientations of the secondary structures. Connectivity is ignored.
- e.g. barrel, sandwich, etc.
c. Topology (semi-automated): fold families determined by shape and connectivity of secondary structures
- e.g. mainly-β two-layer sandwich
d. Homologous superfamily (semi-automated): domains sharing a common ancestor, as determined by sequence and structural similarity
e. Sequence family (automated): domains with highly similar structure and function, as determined by sequence identity
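The four basic levels map naturally onto a four-part identifier. CATH codes are conventionally written as dotted numbers in the order Class.Architecture.Topology.Homologous superfamily, so a small sketch can parse such a code and report the deepest level two domains share. The helper names (`parse_cath`, `deepest_shared_level`) are hypothetical, and the codes are used purely as sample input.

```python
from typing import NamedTuple, Optional

# Class numbers follow the four classes named on the slide.
CLASS_NAMES = {
    1: "mainly-alpha",
    2: "mainly-beta",
    3: "mixed alpha-beta",
    4: "low secondary structure",
}

class CathCode(NamedTuple):
    class_: int
    architecture: int
    topology: int
    superfamily: int

def parse_cath(code: str) -> CathCode:
    """Split a dotted C.A.T.H code into its four integer levels."""
    parts = code.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dotted fields, got {code!r}")
    return CathCode(*(int(part) for part in parts))

def deepest_shared_level(a: CathCode, b: CathCode) -> Optional[str]:
    """Name the deepest level at which two domains are classified together."""
    levels = ["class", "architecture", "topology", "homologous superfamily"]
    shared = None
    for name, x, y in zip(levels, a, b):
        if x != y:
            break
        shared = name
    return shared

first = parse_cath("3.40.50.720")
second = parse_cath("3.40.50.300")
print(CLASS_NAMES[first.class_])            # mixed alpha-beta
print(deepest_shared_level(first, second))  # topology
```

Because the hierarchy is strict, comparing codes field by field from the left is enough: two domains cannot share a topology without also sharing class and architecture.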
4. Other classification databases
a. Enzyme structures database: www.biochem.ucl.ac.uk/bsm/enzymes
- PDB enzyme structures classified by E.C. number
b. Protein-DNA database: www.biochem.ucl.ac.uk/bsm/prot_dna/prot_dna.html
- PDB complex structures classified by binding motif
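Classification by E.C. number works because the number is itself hierarchical: four dotted fields, the first of which names the broad reaction class. The sketch below groups a few well-known enzymes by that top-level class. The function names are hypothetical, and the grouping is only an illustration of the idea, not the layout of the enzyme structures database.

```python
from collections import defaultdict

# Top-level E.C. classes (class 7 was added to the scheme in 2018).
EC_CLASSES = {
    1: "oxidoreductases",
    2: "transferases",
    3: "hydrolases",
    4: "lyases",
    5: "isomerases",
    6: "ligases",
    7: "translocases",
}

def ec_class(ec_number: str) -> str:
    """Return the top-level class name for an E.C. number such as '3.4.21.4'."""
    first_field = ec_number.split(".")[0]
    return EC_CLASSES[int(first_field)]

def group_by_ec_class(entries):
    """Group (name, ec_number) pairs by their top-level E.C. class."""
    groups = defaultdict(list)
    for name, ec_number in entries:
        groups[ec_class(ec_number)].append(name)
    return dict(groups)

enzymes = [
    ("alcohol dehydrogenase", "1.1.1.1"),
    ("lysozyme", "3.2.1.17"),
    ("trypsin", "3.4.21.4"),
]
grouped = group_by_ec_class(enzymes)
print(grouped)
```

The same splitting extends to the deeper fields, so entries can be grouped at the subclass or sub-subclass level by comparing more of the dotted prefix.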
5. Protein sequence analysis: www.biochem.ucl.ac.uk/bsm/dbbrowser
Protein sequence search using protein fingerprints.
- A fingerprint is a group of conserved sequence motifs used to characterize a protein family.
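The fingerprint idea can be sketched as an ordered list of motifs that should all appear, in sequence order, in a member of the family. The example below swaps in regular expressions for the motifs, which is a simplification: fingerprint databases typically score motifs against aligned examples rather than matching exact patterns. The motifs, sequences and function names are all hypothetical.

```python
import re

def match_fingerprint(sequence, motifs):
    """Find each motif in order; return its start index, or None if it is missing."""
    positions = []
    cursor = 0  # each motif must start at or after the end of the previous hit
    for pattern in motifs:
        match = re.compile(pattern).search(sequence, cursor)
        if match is None:
            positions.append(None)
        else:
            positions.append(match.start())
            cursor = match.end()
    return positions

def fingerprint_score(sequence, motifs):
    """Fraction of the fingerprint's motifs found in the sequence, in order."""
    hits = match_fingerprint(sequence, motifs)
    return sum(pos is not None for pos in hits) / len(motifs)

# Hypothetical three-motif fingerprint and two toy sequences.
fingerprint = ["G.GK[ST]", "DE[AV]H", "W..Y"]
full_member = "MAGSGKTLLDEAHQRRWPAYK"
partial_member = "MAGSGKTLLDEAHQRR"

print(match_fingerprint(full_member, fingerprint))
print(fingerprint_score(partial_member, fingerprint))
```

Requiring several motifs at once is what makes a fingerprint more discriminating than a single pattern: a partial match still gives a graded signal instead of a plain miss.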
6. Gross-level protein properties
Protein-DNA interaction server: www.biochem.ucl.ac.uk/bsm/PP/server
Protein-protein interaction server: www.biochem.ucl.ac.uk/bsm/PP/server
7. Atomic-level protein properties
a. PROCAT: www.biochem.ucl.ac.uk/bsm/PROCAT/PROCAT.html
- Database of 3D enzyme active sites
b. Hydrogen bond atlas: www.biochem.ucl.ac.uk/~mcdonald/atlas
- Graphical summary of hydrogen-bonding properties of amino acids
c. Atlas of side chain-side chain/side chain-base interactions: www.biochem.ucl.ac.uk/bsm/sidechains
- Interaction geometries of side chain and side chain-base pairs
8. Publicly available software (protein structure/interaction)
a. HBPLUS - calculation of interactions in PDB structures
b. LIGPLOT - schematic diagrams of protein-ligand interactions
c. NUCPLOT - schematic diagrams of protein-DNA interactions
d. PROMOTIF - analysis of protein secondary structural motifs
e. NACCESS - calculation of atomic accessibilities of protein surfaces
f. SURFNET - visualization of molecular surfaces, cavities, etc.
g. PROCHECK - check of stereochemical quality of protein structures
h. THREADER - prediction of protein tertiary structure
i. MEMSAT - prediction of transmembrane protein structure
j-z. BROWSE THE WEB IN YOUR SPARE TIME AND BOOKMARK 'EM!
‘Domestic’ resources: http://bioinfo.mbb.yale.edu/partslist/