I SUPPORT OPEN ACCESS
PubMed Central www.pubmedcentral.nih.gov/
Public Library of Science www.publiclibraryofscience.org
Nucleic Acids Research nar.oupjournals.org/
Thanks to the authors and reviewers who support NAR. Please introduce yourselves!
rebase.neb.com/rebase/rebase.html
RM Systems
Type   Genes      Functional
I      R S M      94
II     R M        3701
III    res mod    10
IV     R R ( )    3
Type II Subtypes
Genes          Enzyme   Recognition sequence
R M            EcoRI    GAATTC
M C R          BamHI    GGATCC
R M1 M2        HphI     GGTGA
R1 R2 M1 M2    BsrDI    GCAATG
M S R C        AhdI     GAC(N)5GTC
RM S           BcgI     CGA(N)6TGC
V M R          HpaII    CCGG
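The recognition sequences above, including the gapped ones such as GAC(N)5GTC, can be searched for with ordinary regular expressions. The following is a minimal sketch, not part of the original slides; the helper names `site_to_regex` and `find_sites` are hypothetical, and only the top strand is scanned, so asymmetric sites such as GGTGA would also need a reverse-complement pass.

```python
import re


def site_to_regex(site: str) -> str:
    """Turn a recognition sequence such as 'GAC(N)5GTC' into a regex string."""
    # '(N)5' stands for five unspecified bases; all other characters are literal bases.
    return re.sub(r"\(N\)(\d+)", lambda m: "[ACGT]{%s}" % m.group(1), site)


def find_sites(site: str, dna: str) -> list:
    """Return 0-based start positions of every match on the given strand."""
    # A lookahead is used so that overlapping occurrences are all reported.
    pattern = re.compile("(?=(%s))" % site_to_regex(site))
    return [m.start() for m in pattern.finditer(dna.upper())]


if __name__ == "__main__":
    print(site_to_regex("GAC(N)5GTC"))
    print(find_sites("GAATTC", "ttGAATTCaaGAATTC"))
```

Positions are zero-based starts of the recognition sequence, not cut positions.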
REBASE Entries (figure: Types I, II, III, IV; legend: R, M, S, Predicted)
Type II Restriction Enzymes and Methylases
Total number of R specificities: 262
Number of sequenced examples: 188
Total number of M specificities: 253
Number of sequenced examples: 193
The Bioinformatics Problems of RM Systems
1. M genes: easy to find using motifs
2. S and V genes: easy to find using motifs
3. C genes: some are easy (C.BamHI, etc.), some are difficult
4. R genes: very difficult unless homologs exist
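To illustrate what "finding M genes using motifs" can look like at its simplest, here is a hedged sketch that scans a protein sequence for a single short pattern. The pattern is an assumption brought in for illustration: a simplified form of the (D/N/S)PP(Y/F/W) consensus commonly described for motif IV of N6-adenine methyltransferases. It is not the motif set used by the speakers, and the name `motif_hits` is hypothetical. A real search would combine several motifs and score them with profiles rather than a bare regex.

```python
import re

# Assumption: simplified motif IV consensus, (D/N/S)PP(Y/F/W).
# A four-residue pattern like this will also match unrelated proteins by chance.
MOTIF_IV = re.compile(r"[DNS]PP[YFW]")


def motif_hits(protein: str) -> list:
    """Return (0-based position, matched residues) for each motif occurrence."""
    return [(m.start(), m.group()) for m in MOTIF_IV.finditer(protein.upper())]


if __name__ == "__main__":
    # Toy sequence invented for the example, not a real methylase.
    print(motif_hits("MKTLDPPYGSA"))
```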
Sequenced Restriction Enzyme Genes
Columns: Recognition Sequence, Family 1, Family 2, Family 3, Family 4, Family 5
Recognition sequences: AATT, ACGT, AGCT, ATAT, CATG, CCGG, CGCG, CTAG, GATC, GCGC, GGCC, GTAC, TATA, TCGA, TGCA, TTAA
Entries: 1 1 (2) 1 None known 2 (68) 1 (1) 2 (2) 3 (1) 21 (13) 2 5 (4) 1 (1) None known 6 (1) 1 (4) 1 1 (1) 1 (1) 1 3 (7) 1 2 (2) 1 1 1 2 (5) 1 1 1 1
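The sixteen recognition sequences in this table are exactly the 4-bp sequences that equal their own reverse complement. A short sketch, with hypothetical function names, that enumerates them:

```python
from itertools import product

# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def palindromic_sites(length: int) -> list:
    """All DNA words of the given length that equal their reverse complement, sorted."""
    words = ("".join(p) for p in product("ACGT", repeat=length))
    return sorted(w for w in words if w == reverse_complement(w))


if __name__ == "__main__":
    tetramers = palindromic_sites(4)
    print(len(tetramers), " ".join(tetramers))
```

The first two bases can be chosen freely and the last two are then fixed, which is why there are 4 x 4 = 16 such tetramers.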
Analysis of new M gene hits
1. Is the overall sequence of the M gene similar to a known M gene?
2. Is the variable region (DNA recognition domain) highly similar to a known variable region?
3. Are there genes nearby that are similar to known S, V, C, R or other M genes?
4. Are the flanking genes similar to known non-R genes?
Problems: Methylases
1. What cutoff value will distinguish true positives from spurious hits?
   a) How do we avoid just populating the database with more examples of the same?
   b) How do we avoid the degeneration of the database by including marginal examples?
2. The HemK group of “apparent” methylases
Problems: Restriction enzymes
1. Even “true” matches are often very poor.
2. Good matches are “usually”, but not always, real isoschizomers. How do we distinguish?
3. Can we identify the “real” candidates, in the absence of sequence similarity?
HindVP HindVP
BssHII and McaTI have no significant sequence similarity!
• digestion of λ DNA using McaTI expression lysate only
• digestion using BssHII (NEB) only
• double digestion using BssHII and McaTI expression lysate
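The logic of the double digest can be sketched in a few lines: if two enzymes cut at the same sites, adding the second enzyme produces no new fragments, whereas an enzyme with a different site does. This is a toy model under stated assumptions, not the analysis from the talk. BssHII is taken to cut G^CGCGC, McaTI is assumed to recognize and cut the same site (which is what the experiment tests), the DNA is a short invented linear sequence rather than λ, and all function names are hypothetical.

```python
def cut_positions(dna: str, site: str, offset: int) -> list:
    """Top-strand cut coordinates: start of each site occurrence plus the cut offset."""
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start + offset)
        start = dna.find(site, start + 1)
    return positions


def fragment_lengths(dna: str, cuts: list) -> list:
    """Fragment lengths of a linear molecule cut at the given coordinates."""
    bounds = [0] + sorted(set(cuts)) + [len(dna)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Toy linear substrate (invented): two GCGCGC sites and one GAATTC site.
DNA = "AAAA" + "GCGCGC" + "TTGAATTCTT" + "GCGCGC" + "AA"

bsshii = cut_positions(DNA, "GCGCGC", 1)   # BssHII: G^CGCGC
mcati = cut_positions(DNA, "GCGCGC", 1)    # assumption: McaTI cuts the same site
ecori = cut_positions(DNA, "GAATTC", 1)    # EcoRI: G^AATTC, used only as a contrast

single = fragment_lengths(DNA, bsshii)
double_same = fragment_lengths(DNA, bsshii + mcati)
double_diff = fragment_lengths(DNA, bsshii + ecori)

if __name__ == "__main__":
    print(single, double_same, double_diff)
```

An unchanged fragment pattern in the double digest is consistent with the two enzymes sharing a specificity, even though their protein sequences are unrelated.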
Acknowledgements
Janos Posfai: Computer Scientist - Sequence Analysis
Tamas Vincze: Programmer - Sequence Analysis
Yu Zheng: Postdoctoral Fellow - in vitro experiments
Rick Morgan: Staff Scientist - Experimental RE discovery
Dana Macelis: Programmer - REBASE