Position-dependent motif characterization using Non-negative matrix Factorization (NMF)

Position-dependent motif characterization using Non-negative matrix Factorization (NMF). Joel H Graber; Lucie N. Hutchins, Erik McCarthy, Sean Murphy, Priyam Singh. In collaboration with: Thomas Blumenthal, University of Colorado; David Kulp, University of Massachusetts.


Presentation Transcript


  1. Position-dependent motif characterization using Non-negative matrix Factorization (NMF)
     Joel H Graber
     Lucie N. Hutchins, Erik McCarthy, Sean Murphy, Priyam Singh
     The Jackson Laboratory
     In collaboration with: Thomas Blumenthal, University of Colorado; David Kulp, University of Massachusetts
     Funding sources. Current: NIH GM 072706, NIH HD037102. Previous: NIH RR 16463 (INBRE-Maine), NSF 2010 Project DBI 0331497

  2. Motifs are often constrained in positioning
     Figure labels: N position counts; PWC Matrix; M sequence words; Functional site
     Example sequences:
     AUGCACAUAGAGGCAAUUGUGUAUCAAUAUUAAAAAUAAAGUAAAACUUA AAGCAUGUGUAGACCGUGUG
     AUGAAUCCUUGUAUAAGCAACUGCCAAUGAAAUCGGGCUCGCUGUGGUCA UCCGUGAGUGCUUAUCAUUC
     UGGUAAUACCGUGGUCUAUUUAUACAAAUAUUAAAAGUGCUGUUUAUAGA GCCUGUGUCAUGUGGCAACU
     UCCUGUGUCAUGACCUCAGGAAAUAAAUUUCCUUGACUUUAUAAAAGCCA AAACGUUUGCCCUCUUCCUU
     GGAAUUUGAAAUUACUCCAAUUUAAAAUAAAUUACUGGACUGUGGAAAUA ACAUGUAGAAUUGCAGUUUU
     ACACUGUAACAGUUGCUUCUGCCUACCUUAUAAAUAAAGAAUCACUAAGA AAAAGAGUUCUCAGGUCUCC
     CUGAGCUCAGACUGAGGGGAAACGGAGGCAAAUAAAGCUGAGUUUUGAGA ACUCGGUGGCCUGUGUUCCU
     AGCCUGUACUCACCCCUUCCCUUAAUAAUAAUAAAACAACAACUUUGUGA AUUUGAGUUUUCCUUAGAGC
     UCAACAGAUCAUAUUCAGUGUCUUGAAUAAAUUGCUCUAUUUUGAUAUUA GAGAACAUAGUGACUGUGUU
     UGGUACGAUUAUUUUUUUUAACUAAAAUGAGAUAAAAUUCUAUAUUCUUA UGUGUGUGUGGUUUUUGAUG
     GGUGAAACUGUCUCAAUUUGAAUAAAUAUUUUUAUUGCAAUUCUGAACCA AUUUUAAAAGAAAAGAUACA
     AAUGUCCUUCCAAAUAGAGCCUUUUUAUUAAUAAAGGGCCUUGUACUUCA CUUGGAACAAAGGACGUUUC
     AUUUCAUUGUGUUAAAUGUAUACUUGUAAAUAAAAUAGCUGCAAACCUUA AGCCUUUGAGCUACUUGGUG
     UAUCUCACUCGGUAUUACGUGCUCUGCAAUAGAAGUUGGUGUGAACAUUC CCAGGUGACAUGCAGUGUUA
     CCACCACCCCUCCAUCAGUAAGCCACUAAUAAAGUGCAUCUAUGCAGCCA CAGGUCUGUCUGCCUCUUUU
     GGCUGGGCACCUUAAAAGAGAAGUCAAUAAACUGGGCUACACAGUACUUA AAACGCUGAACUGGCUAAGA
     UGUGUAUUUAUGAAUAUUAAUGAAUAAAAACUGCUUGGAUGGUUUACCUA ACUACUGCAUGAGGUUUUUU
     UCCUUUCUUUUCUCUCCACUCAAUAAAUACUUUAAAGCACAUUUGGAAUA AAGGAAGAGACUUUUAAGUG
     GUGCUUAAUGAUAAGGUUUUGACUUGUUAAAUUAAACCAUUUGGAAUAUA UUGUGUGUUUGUAGUAGUCA
     GUGCCUUUGUUUGUAAACCAAAAAGUAAUAAAUGAAUCCCUAUAUUUCUA UUAUAGCAUCUAUUGUAUUU
     AAUAUAGUAUUUUAUUUAAGAAAAUAAACUUUGCAGUUUUUGCAUUGUGA AUUCUCUCUCUUCCCGCCCA
     CUGCCAUGAAAAAUGUUGUUUAUGGAAUAAAAAAAAUGUAACUGCCUUUA AAUUUCCUGGUGGCUGUGUU
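Slide 2 pairs aligned sequences with a matrix of M sequence words by N positions. A minimal sketch of how such a count matrix could be built is given here. The function name `pwc_matrix`, the word length `k=6`, and the choice to count word start positions are assumptions made for illustration; the slide does not state them.

```python
from collections import Counter

def pwc_matrix(sequences, k=6):
    """Count each k-letter word at each start position across aligned sequences.

    Returns (words, counts), where counts[i][j] is the number of sequences
    in which words[i] starts at position j. The word length k is an assumption.
    """
    n_positions = min(len(s) for s in sequences) - k + 1
    tallies = Counter()
    for seq in sequences:
        for j in range(n_positions):
            tallies[(seq[j:j + k], j)] += 1
    words = sorted({word for word, _ in tallies})
    counts = [[tallies[(word, j)] for j in range(n_positions)] for word in words]
    return words, counts

# Left-hand portions of the first three example rows on slide 2.
seqs = [
    "AUGCACAUAGAGGCAAUUGUGUAUCAAUAUUAAAAAUAAAGUAAAACUUA",
    "AUGAAUCCUUGUAUAAGCAACUGCCAAUGAAAUCGGGCUCGCUGUGGUCA",
    "UGGUAAUACCGUGGUCUAUUUAUACAAAUAUUAAAAGUGCUGUUUAUAGA",
]
words, counts = pwc_matrix(seqs)
print(len(words), "words x", len(counts[0]), "positions")
if "AAUAAA" in words:
    row = counts[words.index("AAUAAA")]
    print("AAUAAA start positions:", [j for j, c in enumerate(row) if c])
```

With only three sequences the matrix is almost entirely zeros and ones; positional structure only becomes visible once many aligned sequences are counted.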

  3. NMF decomposes the PWC matrix into characteristic patterns (motifs)
     Counts (M x N) ≈ Bases W (M x r) x Weights H (r x N)
     W_ik = weight of the ith word in the kth motif (content)
     H_kj = abundance of the kth motif at the jth position (positioning)
     r = number of basis functions (patterns)
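The decomposition on slide 3 can be illustrated with a small dependency-free sketch. It uses the standard multiplicative update rules for the squared-error objective, which is an assumption: the slides do not say which NMF algorithm was used. The helper names, the toy matrix, and the iteration count are likewise hypothetical.

```python
import random

def matmul(a, b):
    """Matrix product of two nested lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def nmf(v, r, iters=1000, seed=0, eps=1e-9):
    """Factor a non-negative M x N matrix v into w (M x r) and h (r x N)."""
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(r)] for _ in range(m)]
    h = [[rng.random() + eps for _ in range(n)] for _ in range(r)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n)]
             for k in range(r)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(r)]
             for i in range(m)]
    return w, h

def residual_sum_of_squares(v, w, h):
    approx = matmul(w, h)
    return sum((a - b) ** 2 for rv, ra in zip(v, approx) for a, b in zip(rv, ra))

# Toy counts: words 0-1 occur only at positions 0-2, words 2-3 only at 3-5.
v = [[4, 4, 4, 0, 0, 0],
     [2, 2, 2, 0, 0, 0],
     [0, 0, 0, 3, 3, 3],
     [0, 0, 0, 1, 1, 1]]
w, h = nmf(v, r=2)
rss = residual_sum_of_squares(v, w, h)
print("RSS:", rss)
for k, row in enumerate(h):
    print("motif", k, "positioning:", [round(x, 2) for x in row])
```

Each column of `w` describes a motif's word content and each row of `h` its positional profile, matching the roles of W_ik and H_kj on the slide. The factors are only defined up to scaling and ordering, so the motif indices carry no meaning.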

  4. Synthetic data verifies NMF performance
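Slide 4 gives no details of the synthetic data. One plausible construction, sketched here purely as an assumption, plants a known motif at a position drawn around a chosen centre in otherwise random sequences, so that a factorization can later be checked against a known answer. The function name, motif, and all parameter values are hypothetical.

```python
import random
from collections import Counter

def synthetic_sequences(n_seqs=200, length=60, motif="AAUAAA",
                        center=30, spread=3, seed=0):
    """Random RNA sequences, each with one motif copy planted near `center`."""
    rng = random.Random(seed)
    seqs = []
    for _ in range(n_seqs):
        bases = [rng.choice("ACGU") for _ in range(length)]
        # Draw a start position around the centre and clamp it into range.
        start = round(rng.gauss(center, spread))
        start = max(0, min(length - len(motif), start))
        bases[start:start + len(motif)] = list(motif)
        seqs.append("".join(bases))
    return seqs

seqs = synthetic_sequences()
# Tally where the motif first appears in each sequence.
first_hits = Counter(s.find("AAUAAA") for s in seqs)
mode_position = first_hits.most_common(1)[0][0]
print("most common motif start:", mode_position)
```

Because the planted position and motif are known, a recovered motif's content and positional profile can be compared directly with what was planted.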

  5. RSS provides a robust estimate for the optimal number of vectors (r)
     Figure panels: Test matrix 1; Test matrix 2; Human polyA sequences; Artificial sequences
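Slide 5 does not expand "RSS". Assuming it denotes the residual sum of squares between the count matrix and its rank-r reconstruction, it can be written as follows, where V stands for the M x N count matrix (a symbol introduced here for illustration) and W, H are the bases and weights from slide 3:

```latex
\mathrm{RSS}(r) \;=\; \sum_{i=1}^{M} \sum_{j=1}^{N}
  \Bigl( V_{ij} - \sum_{k=1}^{r} W_{ik} H_{kj} \Bigr)^{2}
```

Under this reading, RSS generally falls as r grows, and a common heuristic is to pick the r at which the decline levels off. Whether the slide's figures apply exactly that criterion is not stated.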

  6. NMF can characterize complex control sequences
     Figure panels: Mouse 3’-processing sequences; Human transcription start sites
