1 / 23

Modeling Regular Replacement for String Constraints Solving

Modeling Regular Replacement for String Constraints Solving. Xiang Fu Hofstra University Chung- Chih Li Illinois State University. malicious scripts . Hacker. Cool page! . Server. Background. Problem? Lack of Sufficient Sanitation of Text Inputs. One Typical Error. 1 <? php

sveta
Download Presentation

Modeling Regular Replacement for String Constraints Solving

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Modeling Regular Replacement for String Constraints Solving Xiang Fu Hofstra University Chung-Chih Li Illinois State University NFM 2010

  2. malicious scripts Hacker Cool page! Server Background Problem? Lack of Sufficient Sanitation of Text Inputs NFM 2010

  3. One Typical Error 1 <?php 2 $msg = $_POST[”msg”]; 3 $sanitized = pregreplace( 4 ”/\< s c r i p t .*?\>.*?\<\/ s c r i p t .*?\ >/ i ”, 5 ” ” , 6 $msg ) ; 7 savetodb($sanitized ) 8 ?> Reluctant Kleene Star Attacker’s Input <<script></script>script>alert(’a’)</script> <script>alert(’a’)</script> NFM 2010

  4. Bigger Picture • Objective: Automatic Discovery of Vulnerabilities SUSHI Bytecode Symbolic Execution String Constraint Solver Test Replayer Attack Pattern NFM 2010

  5. Our Contribution • Atomic Replacement Constraints • Consider Two Semantics • Greedy • Reluctant • Modeling Using Finite State Transducer (FST) • Compact Representation of FST • Security Analysis NFM 2010

  6. Finite State Transducer • Accepts Regular Relation • Union, Concat, Composition • Intersection, Complement • Used for Modeling Rewriting Rules [Kaplan94, Karttunen96] ε:1 a:2 1 2 3 4 b:3 A (ab,123) ∈ L(A) NFM 2010

  7. Hierarchical FST &Modeling Declarative Semantics Goal: Replacement Regular Search Pattern Any String not Containing patter r Identical Relation Id(∑* - ∑* r ∑*) r : ω Id(∑* - ∑* r ∑*) 2 1 3 4 ε:ε NFM 2010

  8. Modeling Reluctant Semantics • 2 Steps • Mark the beginning of pattern • Do the replacement Goal: Key: Left-Most Matching NFM 2010

  9. Input Word Search Pattern a a b b c d a b c a b d a+b+c  x Begin Marker # a # a b b c d # a b c a b d s1 reluc(r)#’ : ω #: ε x d x a b d ε: ε s2 f1 Id(∑) NFM 2010

  10. The Challenge: Begin Marker Input Word Search Pattern a a b b c d a b c a b d a+b+c  x # # # # Look-ahead Capability? 3 Steps: End marker Generic end marker Begin marker Non-determinism NFM 2010

  11. 1 2 3 4 Preliminary End Marker Search Pattern a+b+c  x c:c b:b a: a b :b ε:$ Idea: Start with End Marker for Reverse of Search Pattern a: a 5 Reversed Pattern A1 cb+a+ Problem: Input tape accepts cb+a+ only! NFM 2010

  12. Generic End Marker Pattern cb+a+ c:c b:b 1 c:c b:b a:a ε:$ 1 Deterministic! a:a a:a a:a b:b c:c c:c 2 3 4 5 5,1 2,1 4,1 3,1 b:b A2 c c b a a c c b a $ a $ Input Word Output Word NFM 2010

  13. Finally, the Begin Marker Search Pattern a+b+c  x 0 ε:ε c:c b:b ε:ε ε:ε c:c 1 b:b a:a ε:# 1 3 2 4 5 a:a a:a b:b 2,1 5,1 4,1 3,1 c:c c:c b:b A3 NFM 2010

  14. Input Word Search Pattern a a b b c d a b c a b d a+b+c  x Begin Marker # a # a b b c d # a b c a b d s1 reluc(r)#’ : ω #: ε x d x a b d ε: ε s2 f1 Id(∑) NFM 2010

  15. Greedy Semantics Goal: greedy Challenge: Look-ahead longestmatch NFM 2010

  16. Search Pattern a+ x aabab Step 1: Begin Marker #a#ab#ab Step 2: ND End Marker #a#ab#a$b #a$#a$b#a$b #a#a$b#a$b #a#a$b#ab Step 3: Pairing Markers #aa$b#a$b #aaba$b Step 4: Checking Match #a$#a$b#a$b Step 5: Check Longest Step 6: Replacement xbxb NFM 2010

  17. Login Servlet Applications • Solve String Constraints Input: user name After filtering single quote and length restriction NFM 2010

  18. Solving Atomic Constraint Goal: A1 Id(P) Project to Input Tape Solution NFM 2010

  19. SUSHI Constraint Solver Type I Type II Type III • Solves Simple Linear String Constraints (SISE) • Relies on • dk.brics.automaton for FSA operations • Self-made Java package for FST operations • Supports 16-bit Unicode • Compact Transition Representation (I,I) (II,I) (III,II) NFM 2010

  20. Login Servlet Efficiency of Solver 1.4 Seconds on 2Ghz PC Benchmark Equations 1 Flex SDKXSS Attack 2 Equation Size: 565 74 Seconds Shorter than Security Track #1022748 3 4 NFM 2010

  21. Related Work Our Contribution: Precise Modeling of Various Regular Substitution Semantics • Forward String Analysis • Christensen & Møller [SAS’03] • Wasserman & Su [PLDI’07, ICSE’08] • Bjørner & Tillmann [TACAS’09] • Backward String Analysis • Kiezun & Ganesh [ISSTA’09] • Yu & Bultan [SPIN’08, ASE’09] • Fu [COMPSAC’07, TAVWEB’08] • Natural Language Processing • * Kaplan and Kay [CL’1994] NFM 2010

  22. Limitations • SISE String Constraints • All Variables Appear on LHS (Once) • No Easy Solution for Equation System Yet • No string length • Future Directions • Encoding string length in automata • Finite model on bit-vector NFM 2010

  23. Questions? NFM 2010

More Related