Algorithm Complexity, Regular Expression

Explore the complexity of algorithms and regular expressions in bioinformatics, comparing running times and bounds for efficient problem-solving. Understand how DNA sequences can be represented by graph automata and grammar expressions. Use ScanProsite for motif analysis.


Presentation Transcript


  1. Algorithm Complexity, Regular Expression Ka-Lok Ng Department of Bioinformatics Asia University

  2. Fast versus Slow Algorithms • Typical CPU speed ~ GHz → about 10^-9 s per operation • Estimate the running time of an algorithm by the total number of operations it performs → compare different algorithms that solve the same problem • Suppose algorithm A takes ~ 11n^3 operations and algorithm B ~ 99n^2 + 7 operations • Algorithm B is faster for large n • A brute-force algorithm is often an exponential algorithm, in contrast to polynomial algorithms (n^2, n^3, …)
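The comparison of algorithms A and B on this slide can be checked numerically. The following is a minimal sketch, assuming the operation counts 11n^3 and 99n^2 + 7 from the slide and 10^-9 s per operation; the function names are made up for illustration.

```python
SECONDS_PER_OP = 1e-9  # ~1 GHz CPU, as assumed on the slide


def ops_a(n: int) -> int:
    """Operation count of algorithm A (11n^3)."""
    return 11 * n**3


def ops_b(n: int) -> int:
    """Operation count of algorithm B (99n^2 + 7)."""
    return 99 * n**2 + 7


def crossover(limit: int = 1000) -> int:
    """Smallest n >= 1 at which B needs fewer operations than A."""
    for n in range(1, limit + 1):
        if ops_b(n) < ops_a(n):
            return n
    raise ValueError("no crossover found up to the given limit")


if __name__ == "__main__":
    print("B becomes faster at n =", crossover())
    n = 10_000
    print("A:", ops_a(n) * SECONDS_PER_OP, "s")
    print("B:", ops_b(n) * SECONDS_PER_OP, "s")
```

For small n algorithm A is actually cheaper; B only wins from n = 10 onward. At n = 10,000 the estimate is roughly 11,000 s for A versus about 9.9 s for B.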

  3. Big-O notation (Time complexity) • 5n^2 + 3.2n + 99993 is O(n^2): the dominant term is n^2 • The Big-O relationship establishes an upper bound on the growth of a function • f(n) = O(g(n)) → the function f grows no faster than the function g (upper bound) • f(n) = Ω(g(n)) → the function f grows no slower than the function g (lower bound) • If an algorithm's running time grows no faster than g and no slower than g, then g is a tight bound, written f(n) = Θ(g(n)) • If an algorithm requires 2n log n operations, technically it is an O(n^2) algorithm, although this is a misleadingly loose bound. A tight bound is O(n log n). It is often easier to prove a loose bound than a tight one.
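The difference between a dominant term, a loose bound and a tight bound can be illustrated by looking at ratios as n grows. This is a rough numerical sketch, not a proof; the helper names are hypothetical and the logarithm is assumed to be base 2 (the base only changes the constant).

```python
import math


def f(n: int) -> float:
    """Example from the slide: 5n^2 + 3.2n + 99993."""
    return 5 * n**2 + 3.2 * n + 99993


def g(n: int) -> float:
    """Example from the slide: 2n log n (log base 2 assumed)."""
    return 2 * n * math.log2(n)


def ratio_to_n2(func, n: int) -> float:
    """func(n) divided by n^2."""
    return func(n) / n**2


def ratio_to_nlogn(func, n: int) -> float:
    """func(n) divided by n log n."""
    return func(n) / (n * math.log2(n))


if __name__ == "__main__":
    for n in (10, 1_000, 1_000_000):
        print(n, ratio_to_n2(f, n), ratio_to_n2(g, n), ratio_to_nlogn(g, n))
```

The ratio f(n)/n^2 settles near the constant 5, which is what the dominant term predicts. The ratio g(n)/n^2 shrinks toward 0, so O(n^2) is a valid but loose bound for 2n log n, while g(n)/(n log n) stays at the constant 2, matching the tight bound.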

  4. Graph as automaton • Consider the following four DNA sequences: ACAATG, ACAAATC, AGAATC, ACCGATC • These four sequences can be represented by a special sort of graph, Figure 3.13, called an automaton. • Remarks: (1) draw arrows and circles, (2) write down the characters, (3) loops back to earlier states and self-loops are allowed, and (4) fill in states 1 ~ 8.
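Figure 3.13 is not reproduced in this transcript, so the sketch below is not the automaton from the figure. It is a hand-built deterministic automaton, with hypothetical state names and only six states rather than the eight in the figure, that accepts the four sequences listed on this slide. Circles correspond to states and arrows to entries in the transition table.

```python
# Hypothetical states; they do NOT follow the 1 ~ 8 numbering of Figure 3.13.
TRANSITIONS = {
    ("start", "A"): "after_A",
    ("after_A", "G"): "a_run",
    ("after_A", "C"): "c_run",
    ("c_run", "C"): "c_run",    # self-loop: repeated C
    ("c_run", "G"): "a_run",
    ("c_run", "A"): "a_run",
    ("c_run", "T"): "after_T",
    ("a_run", "A"): "a_run",    # self-loop: repeated A
    ("a_run", "T"): "after_T",
    ("after_T", "G"): "accept",
    ("after_T", "C"): "accept",
}


def accepts(sequence: str) -> bool:
    """Follow one arrow per character; accept only if we end in 'accept'."""
    state = "start"
    for char in sequence:
        state = TRANSITIONS.get((state, char))
        if state is None:  # no arrow for this character
            return False
    return state == "accept"


if __name__ == "__main__":
    for seq in ("ACAATG", "ACAAATC", "AGAATC", "ACCGATC", "ACGT"):
        print(seq, accepts(seq))
```

Because of the self-loops, this automaton also accepts sequences beyond the four examples, for instance AGTG, so it describes a family of sequences rather than a fixed list.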

  5. Expressions and grammar The following four DNA sequences, ACAATG, ACAAATC, AGAATC, ACCGATC, can be represented by an expression, a so-called regular expression: A [ G | C+ | C+G ] A* T [ G | C ], where ‘*’ means ‘zero or more occurrences’, ‘+’ means ‘one or more occurrences’, and ‘[ | ]’ means ‘or’, with the alternatives given on either side of the vertical bar.
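The expression on this slide can be tried out with Python's `re` module. This is a sketch under one assumption: the slide's `[ x | y ]` grouping is translated to a non-capturing group `(?:x|y)`, because in Python syntax square brackets denote a character class rather than alternation.

```python
import re

# Slide notation:  A [ G | C+ | C+G ] A* T [ G | C ]
# Python notation: alternation goes in a group; [GC] is a character class.
PATTERN = re.compile(r"A(?:G|C+|C+G)A*T[GC]")


def matches(sequence: str) -> bool:
    """True if the whole sequence is described by the expression."""
    return PATTERN.fullmatch(sequence) is not None


if __name__ == "__main__":
    for seq in ("ACAATG", "ACAAATC", "AGAATC", "ACCGATC", "ATG"):
        print(seq, matches(seq))
```

`fullmatch` is used so that the entire sequence has to fit the expression; `search` would instead report a hit anywhere inside a longer sequence.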

  6. Expressions and grammar Figure 3.13 can be represented by Table 3.3.

  7. Expressions and grammar • Figure 3.13 can be represented by the following six transition rules, • Those four DNA sequences are represented by the six rules.

  8. ScanProsite • http://expasy.org/tools/scanprosite/ • Parameter setting: human, at least 10 hits, show 100 results only

  9. ScanProsite • A-X-[ST](2)-X(0,1)-V motif
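PROSITE patterns such as the one on this slide use their own syntax: elements are separated by `-`, `x` stands for any residue, `[ST]` means S or T, `{P}` means any residue except P, and `(2)` or `(0,1)` gives a repeat count or range. The sketch below converts a subset of that syntax to a Python regular expression for local experiments. The function name is hypothetical, anchors (`<`, `>`) and other features are not handled, and it is not a replacement for ScanProsite itself.

```python
import re

# One PROSITE element: a residue, x, [allowed] or {forbidden},
# optionally followed by (n) or (n,m).
_ELEMENT = re.compile(
    r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Za-z])(?:\((\d+)(?:,(\d+))?\))?"
)


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE pattern into Python regex syntax."""
    parts = []
    for element in pattern.split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported element: {element!r}")
        core, low, high = m.groups()
        if core in ("x", "X"):
            core = "."                       # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"   # forbidden residues
        if low is not None:
            core += "{" + low + ("," + high if high is not None else "") + "}"
        parts.append(core)
    return "".join(parts)


if __name__ == "__main__":
    regex = prosite_to_regex("A-X-[ST](2)-X(0,1)-V")
    print(regex)
    print(re.search(regex, "MAKSTGVL"))
```

For the motif on the slide the result is `A.[ST]{2}.{0,1}V`: an A, any residue, two residues that are each S or T, an optional residue, then V.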
