COLT: Effective Testing of Concurrent Collection Usage

Mooly Sagiv, Ohad Shacham (Tel Aviv University)
Alex Aiken, Nathan Bronson (Stanford University)
Martin Vechev, Eran Yahav (IBM Research)

Concurrent Data Structures
• Writing highly concurrent data structures is complicated
• Libraries provide efficient concurrent collections with atomic operations
• Clients often need an atomic and efficient composition of several library operations
• Efficiency and correctness of the composition are the client's responsibility
Library operations:
    atomic V put(K,V) {…}
    atomic V get(K) {…}
    atomic V putIfAbsent(K,V) {…}
    atomic V remove(K) {…}
    atomic boolean remove(K,V) {…}
    atomic V replace(K,V) {…}
    atomic boolean replace(K,V,V') {…}
    atomic boolean contains(V) {…}
    atomic boolean containsKey(K) {…}
Client composition:
    V foo (key K) {
        Object val = m.get(K);
        if (val != null) {
            val = m.remove(K);
        }
        return val;
    }

Question
How can we write an atomic operation composed of several library operations?

TOMCAT Motivating Example
Coarse-Grain (TOMCAT 5.*)
    Attribute removeAttribute(String name) {
        Attribute val = null;
        synchronized(attr) {
            found = attr.containsKey(name);
            if (found) {
                val = attr.get(name);
                attr.remove(name);
            }
        }
        return val;
    }
Fine-Grain (TOMCAT 6.*)
    Attribute removeAttribute(String name) {
        Attribute val = null;
        /* synchronized(attr) { */
        found = attr.containsKey(name);
        if (found) {
            val = attr.get(name);
            attr.remove(name);
        }
        /* } */
        return val;
    }
Invariant: removeAttribute(name) returns the value it removes from attr or null

Breaking the invariant: removeAttribute(name) returns the value it removes from attr or null
    Attribute removeAttribute(String name) {
        Attribute val = null;
        found = attr.containsKey(name);
        if (found) {
            val = attr.get(name);
            attr.remove(name);
        }
        return val;
    }
[Figure: threads T1 and T2 both execute removeAttribute("A") concurrently; columns T1, T2, attr, T1.val; attr goes from {<"A", o>} to Φ and T1.val goes from null to o]
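The broken invariant can be replayed on a single thread by executing the statements of the two calls in a fixed order. The sketch below assumes one schedule that is consistent with the figure (T2 runs to completion between T1's get and T1's remove); the exact order drawn on the slide is not recoverable from the text. It also assumes attr is a java.util.concurrent.ConcurrentHashMap. The class name RemoveAttributeReplay and the method replay are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;

final class RemoveAttributeReplay {
    /** Returns {value returned by T1, value returned by T2, value T1's remove call actually removed}. */
    static Object[] replay() {
        ConcurrentHashMap<String, Object> attr = new ConcurrentHashMap<>();
        Object o = new Object();
        attr.put("A", o);                      // attr = {<"A", o>}

        // T1: first half of removeAttribute("A")
        Object val1 = null;
        boolean found1 = attr.containsKey("A");
        if (found1) {
            val1 = attr.get("A");              // T1.val = o
        }

        // T2: removeAttribute("A") runs to completion in between (assumed schedule)
        Object val2 = null;
        boolean found2 = attr.containsKey("A");
        if (found2) {
            val2 = attr.get("A");
            attr.remove("A");                  // attr is now empty
        }

        // T1: second half; its remove call finds nothing left to remove
        Object removedByT1 = null;
        if (found1) {
            removedByT1 = attr.remove("A");
        }
        return new Object[] { val1, val2, removedByT1 };
    }

    public static void main(String[] args) {
        Object[] r = replay();
        System.out.println("T1 returned a value: " + (r[0] != null));
        System.out.println("T1 actually removed a value: " + (r[2] != null));
    }
}
```

Under this schedule both calls return o although only T2 removed it, so T1 returns a value it did not remove.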

TOMCAT Fix
Coarse-Grain (Before)
    Attribute removeAttribute(String name) {
        Attribute val = null;
        synchronized(attr) {
            found = attr.containsKey(name);
            if (found) {
                val = attr.get(name);
                attr.remove(name);
            }
            return val;
        }
    }
Fine-Grain (After)
    Attribute removeAttribute(String name) {
        Attribute val = attr.get(name);
        if (val != null) {
            val = attr.remove(name);
        }
        return val;
    }
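A minimal Java sketch of the fine-grain fix, assuming attr is a java.util.concurrent.ConcurrentHashMap. The class AttributeStore and the helpers setAttribute and countNonNullRemovals are hypothetical names introduced for illustration; they are not Tomcat code. Because the returned value comes from the atomic remove itself, at most one of several concurrent callers can receive the stored object.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class AttributeStore {
    private final ConcurrentHashMap<String, Object> attr = new ConcurrentHashMap<>();

    void setAttribute(String name, Object value) {
        attr.put(name, value);
    }

    // Fine-grain version from the slide: the result is whatever remove() actually removed.
    Object removeAttribute(String name) {
        Object val = attr.get(name);
        if (val != null) {
            val = attr.remove(name);
        }
        return val;
    }

    // Runs removeAttribute("A") from several threads and counts the non-null results.
    static int countNonNullRemovals(int threads) {
        AttributeStore store = new AttributeStore();
        store.setAttribute("A", new Object());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Object>> tasks = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                tasks.add(() -> store.removeAttribute("A"));
            }
            int nonNull = 0;
            for (Future<Object> f : pool.invokeAll(tasks)) {
                if (f.get() != null) {
                    nonNull++;
                }
            }
            return nonNull;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("non-null removals: " + countNonNullRemovals(8));
    }
}
```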

Very hard to reconstruct in TOMCAT
• Specific thread interleaving
• Same input key
[Figure: threads T1 and T2 both execute removeAttribute("A") concurrently; columns T1, T2, attr, T1.val; attr goes from {<"A", o>} to Φ and T1.val goes from null to o]

TOMCAT Motivating Example
• removeAttribute is a new operation of attr, composed of attr's operations
• It should be atomic in any client environment
• It should be resilient to future client changes
• Backed up by our experimental results
Composed operation:
    Attribute removeAttribute(String name) {
        Attribute val = null;
        found = attr.containsKey(name);
        if (found) {
            val = attr.get(name);
            attr.remove(name);
        }
        return val;
    }
Operations of attr:
    atomic V put(K,V) {…}
    atomic V get(K) {…}
    atomic V putIfAbsent(K,V) {…}
    atomic V remove(K) {…}
    atomic boolean remove(K,V) {…}
    atomic V replace(K,V) {…}
    atomic boolean replace(K,V,V') {…}
    atomic boolean contains(V) {…}
    atomic boolean containsKey(K) {…}
[Figure label: No operation on attr]

Challenge
Checking atomicity of composed operations in any client environment

Memoization Example
Coarse-Grain
    V compute(K) {
        synchronized(m) {
            val = m.get(K);
            if (val == null) {
                val = calculateVal(K); // @Pure
                m.put(K, val);
            }
            return val;
        }
    }

Fine-Grained Concurrent Clients
• Concurrent collections provide basic operations for fine-grained concurrency (read-check-modify, CAS-like)
• V putIfAbsent(K, V)
    • Store <K, V> and return null, or return the current value if K already exists
• boolean replace(K, V, V')
    • Replace <K, V> by <K, V'> if <K, V> already exists
    • Return true on success and false on failure
• boolean remove(K, V)
    • Remove <K, V> if it exists
    • Return true on success and false on failure
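The return values described above can be observed directly on java.util.concurrent.ConcurrentHashMap, which provides these three conditional operations. The class name ConditionalOpsDemo and the key and values used are illustrative assumptions.

```java
import java.util.concurrent.ConcurrentHashMap;

final class ConditionalOpsDemo {
    // Records the result of each conditional operation, separated by spaces.
    static String run() {
        ConcurrentHashMap<String, Integer> m = new ConcurrentHashMap<>();
        StringBuilder log = new StringBuilder();
        log.append(m.putIfAbsent("k", 1)).append(' ');  // key absent: stores <k,1>, returns null
        log.append(m.putIfAbsent("k", 2)).append(' ');  // key present: map unchanged, returns 1
        log.append(m.replace("k", 5, 6)).append(' ');   // <k,5> not present: false
        log.append(m.replace("k", 1, 6)).append(' ');   // <k,1> present: replaced by <k,6>, true
        log.append(m.remove("k", 1)).append(' ');       // <k,1> no longer present: false
        log.append(m.remove("k", 6));                   // <k,6> present: removed, true
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());  // prints "null 1 false true false true"
    }
}
```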

Memoization Example
Coarse-Grain (Before)
    V compute(K) {
        synchronized(m) {
            val = m.get(K);
            if (val == null) {
                val = calculateVal(K); // @Pure
                m.put(K, val);
            }
            return val;
        }
    }
Fine-Grain (After)
    V compute(K) {
        val = m.get(K);
        if (val == null) {
            val = calculateVal(K); // @Pure
            m.putIfAbsent(K, val);
        }
        return m.get(K);
    }
Interfering operation from another thread: m.remove(K)

Memoization Fix
    // atomic
    V compute(K) {
        val = m.get(K);
        if (val == null) {
            val = calculateVal(K); // @Pure
            tmpVal = m.putIfAbsent(K, val);
            if (tmpVal != null) val = tmpVal;
        }
        return val;
    }
• Atomic implementation of compute
• What do we mean by atomic?
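A runnable Java rendering of the fixed compute, as a sketch. It assumes m is a java.util.concurrent.ConcurrentHashMap and that calculateVal is a pure function that never returns null (ConcurrentHashMap rejects null values). The class name Memoizer and the constructor parameter are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class Memoizer<K, V> {
    private final ConcurrentHashMap<K, V> m = new ConcurrentHashMap<>();
    private final Function<K, V> calculateVal;  // assumed pure and never returning null

    Memoizer(Function<K, V> calculateVal) {
        this.calculateVal = calculateVal;
    }

    V compute(K key) {
        V val = m.get(key);
        if (val == null) {
            val = calculateVal.apply(key);
            // If another thread stored a value first, adopt that value instead of ours.
            V tmpVal = m.putIfAbsent(key, val);
            if (tmpVal != null) {
                val = tmpVal;
            }
        }
        return val;  // never re-reads the map, so a later remove cannot affect the result
    }

    public static void main(String[] args) {
        Memoizer<Integer, Integer> memo = new Memoizer<>(k -> k * 2);
        System.out.println(memo.compute(7));  // prints 14
    }
}
```

Since Java 8, ConcurrentHashMap.computeIfAbsent offers a similar compute-once operation as a single library call; the version here keeps the structure of the slide.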

Linearizability [Herlihy and Wing, TOPLAS'90]
    V compute(K) {
        val = m.get(K); //@LP val != null
        if (val == null) {
            val = calculateVal(K); // @Pure
            tmpVal = m.putIfAbsent(K, val); //@LP
            if (tmpVal != null) val = tmpVal;
        }
        return val;
    }
[Figure: two overlapping executions of compute(7) with map state {<7,8>}; one executes val=m.get(7), …, tmpVal=m.putIfAbsent(7,8), if (tmpVal != null), return val; the other executes val=m.get(7), if (tmpVal != null), return val]

Thread Modular Linearizability
ConcurrentHashMap:
    atomic V put(K,V) {…}
    atomic V get(K) {…}
    atomic V putIfAbsent(K,V) {…}
    atomic V remove(K) {…}
    atomic boolean remove(K,V) {…}
    atomic V replace(K,V) {…}
    atomic boolean replace(K,V,V') {…}
    atomic boolean contains(V) {…}
    atomic boolean containsKey(K) {…}
MUT:
    V compute(K) {
        val = m.get(K);
        if (val == null) {
            val = calculateVal(K);
            m.putIfAbsent(K, val);
        }
        return m.get(K);
    }

Thread Modular Linearizability
[Figure: invocations of compute running concurrently with arbitrary map operations from the environment; fragments listed in extraction order]
    compute(7) { val=m.get(7) … tmpVal=m.putIfAbsent(7,8) return m.get(7)
    compute(8) { val=m.get(8) … m.put(9,10) tmpVal=m.putIfAbsent(8,8) return m.get(8)
    compute(12) { val=m.get(12) … tmpVal=m.putIfAbsent(12,8) m.put(19,12) return m.get(12)
    compute(5) { val=m.get(5) … tmpVal=m.putIfAbsent(5,13) return m.get(5) m.put(5,12)
    compute(20) { m.put(20,10) val=m.get(20) … return m.get(20) m.remove(20)
    compute(2) { val=m.get(2) … tmpVal=m.putIfAbsent(2,8) return m.get(2) m.put(30,12)
    compute(20) { val=m.get(20) … return m.get(20) m.put(9,10) m.remove(14) m.remove(14)
    compute(14) { val=m.get(14) … tmpVal=m.putIfAbsent(14,8) return m.get(14)

Hard to test even when modular
[Figure: the interleaved compute invocations and environment operations from the previous slide]
• Large number of interleavings
• Bug exhibited very infrequently

Leveraging Commutativity
[Figure: the positions at which the environment operation m.put(9,10) can be interleaved with compute(8); fragments listed in extraction order]
    m.put(9,10) compute(8) { val=m.get(8) … tmpVal=m.putIfAbsent(8,8) return m.get(8)
    m.put(9,10) compute(8) { … val=m.get(8) tmpVal=m.putIfAbsent(8,8) return m.get(8)
    compute(8) { val=m.get(8) … m.put(9,10) tmpVal=m.putIfAbsent(8,8) return m.get(8)
    compute(8) { val=m.get(8) … tmpVal=m.putIfAbsent(8,8) m.put(9,10) return m.get(8)
    compute(8) { val=m.get(8) … tmpVal=m.putIfAbsent(8,8) m.put(9,10) return m.get(8)
• Operations on different keys are guaranteed to commute
• They cannot lead to a linearizability violation
• No need to try such interleavings
• Use commutativity for partial order reduction at the library level

Leveraging Commutativity
• To expose bugs, try operations that do not commute
• Execute thread in adversarial environment
• Adversarial environment picks non-commutative operations
• Adversarial scheduler uses non-commutative
• No bugs are lost due to the reduction
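The distinction between commuting and non-commuting operations can be illustrated with a small check. This is a sketch under an assumption: two operations are treated as commuting on a given map state if both execution orders produce the same return values and the same final map. The slides do not say how the tool decides commutativity, so the class CommuteCheck and its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;

final class CommuteCheck {
    // Runs a;b and b;a on copies of the same state and compares return values and final maps.
    static <K, V> boolean commute(Map<K, V> state,
                                  Function<Map<K, V>, Object> a,
                                  Function<Map<K, V>, Object> b) {
        Map<K, V> ab = new HashMap<>(state);
        Object a1 = a.apply(ab);
        Object b1 = b.apply(ab);

        Map<K, V> ba = new HashMap<>(state);
        Object b2 = b.apply(ba);
        Object a2 = a.apply(ba);

        return Objects.equals(a1, a2) && Objects.equals(b1, b2) && ab.equals(ba);
    }

    // putIfAbsent(8,8) and put(9,10) touch different keys.
    static boolean differentKeysExample() {
        Map<Integer, Integer> empty = new HashMap<>();
        return commute(empty, m -> m.putIfAbsent(8, 8), m -> m.put(9, 10));
    }

    // putIfAbsent(8,8) and put(8,10) touch the same key, so their return values depend on the order.
    static boolean sameKeyExample() {
        Map<Integer, Integer> empty = new HashMap<>();
        return commute(empty, m -> m.putIfAbsent(8, 8), m -> m.put(8, 10));
    }

    public static void main(String[] args) {
        System.out.println("different keys commute: " + differentKeysExample());
        System.out.println("same key commutes: " + sameKeyExample());
    }
}
```

An adversary built on this idea would skip the first kind of operation and schedule only the second kind against the MUT.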

Implementing Dynamic Modular Analyzer
• Compare concurrent execution to a specific sequential execution
• Concurrent execution emulated using an adversary
• Check that every concurrent operation returns the same result as its sequential counterpart
[Figure: the MUT executes operations interleaved with Adversary operations; at each potential linearization point the Reference Method is run, and results are compared]

Running Example
    V compute(K) {
        val = m.get(K); //@LP val != null
        if (val == null) {
            val = calculateVal(K);
            m.putIfAbsent(K, val); //@LP
        }
        return m.get(K);
    }

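Columns: Map (m), MUT op, Adversary op, Reference Method op, Map (refM)
    1. MUT: compute(7) {                                          m = Φ, refM = Φ
    2. MUT: val = m.get(7); if (val == null) val = calculateVal(7)
    3. Adversary: m.put(7,12) and refM.put(7,12), results compared  m = {(7,12)}, refM = {(7,12)}
    4. MUT: m.putIfAbsent(7, 14)
    5. Reference Method: compute(7) { val = refM.get(7); if (val == null) …; return refM.get(7) returns 12
    6. Adversary: m.remove(7) and refM.remove(7), results compared  m = Φ, refM = Φ
    7. MUT: return m.get(7) returns null
    8. Compare results: null != 12
MUT:
    V compute(K) {
        val = m.get(K); //@LP val != null
        if (val == null) {
            val = calculateVal(K);
            m.putIfAbsent(K, val); //@LP
        }
        return m.get(K);
    }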
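The trace above can be replayed with a small single-threaded harness. This is a sketch of the idea, not the COLT implementation: the class AdversarySketch, the fixed schedule, and calculateVal(k) = 2 * k are assumptions chosen so that the values match the slide (14 for key 7). Adversary actions are applied to both maps, the reference method runs once at the first linearization point, and the MUT's result is compared with the reference result.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Consumer;

final class AdversarySketch {
    private final Map<Integer, Integer> m = new HashMap<>();     // state seen by the MUT
    private final Map<Integer, Integer> refM = new HashMap<>();  // state seen by the reference method
    private final List<Consumer<Map<Integer, Integer>>> schedule; // adversary action before each MUT map operation
    private int nextOp = 0;
    private boolean linearized = false;
    private Integer expected;  // reference result, fixed at the linearization point

    AdversarySketch(List<Consumer<Map<Integer, Integer>>> schedule) {
        this.schedule = schedule;
    }

    static int calculateVal(int k) { return 2 * k; }  // hypothetical pure function

    // Adversary actions are applied to both maps so that the two stay in sync.
    private void adversaryTurn() {
        if (nextOp < schedule.size()) {
            schedule.get(nextOp).accept(m);
            schedule.get(nextOp).accept(refM);
        }
        nextOp++;
    }

    // At the first linearization point, run the sequential compute atomically on refM.
    private void linearizationPoint(int k) {
        if (linearized) return;
        linearized = true;
        Integer val = refM.get(k);
        if (val == null) {
            val = calculateVal(k);
            refM.put(k, val);
        }
        expected = val;
    }

    private Integer get(int k) {
        adversaryTurn();
        Integer v = m.get(k);
        if (v != null) linearizationPoint(k);  // @LP val != null
        return v;
    }

    private Integer putIfAbsent(int k, int v) {
        adversaryTurn();
        Integer previous = m.putIfAbsent(k, v);
        linearizationPoint(k);                 // @LP
        return previous;
    }

    // MUT from the running example: re-reads the map before returning.
    Integer compute(int k) {
        Integer val = get(k);
        if (val == null) {
            val = calculateVal(k);
            putIfAbsent(k, val);
        }
        return get(k);
    }

    // Variant that returns the value observed by putIfAbsent instead of re-reading the map.
    Integer computeFixed(int k) {
        Integer val = get(k);
        if (val == null) {
            val = calculateVal(k);
            Integer tmpVal = putIfAbsent(k, val);
            if (tmpVal != null) val = tmpVal;
        }
        return val;
    }

    // Replays the slide's schedule; true means the MUT and reference results differ.
    static boolean violation(boolean fixed) {
        List<Consumer<Map<Integer, Integer>>> schedule = Arrays.asList(
            map -> { },              // before the first get: no adversary action
            map -> map.put(7, 12),   // before putIfAbsent
            map -> map.remove(7));   // before the final get (reached only by compute)
        AdversarySketch run = new AdversarySketch(schedule);
        Integer actual = fixed ? run.computeFixed(7) : run.compute(7);
        return !Objects.equals(actual, run.expected);
    }

    public static void main(String[] args) {
        System.out.println("compute violates: " + violation(false));
        System.out.println("computeFixed violates: " + violation(true));
    }
}
```

With this schedule, compute returns null while the reference returns 12, and computeFixed returns 12 in agreement with the reference.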

Benchmark
• Tested 89 MUTs from 48 applications
    • Apache's Tomcat, Derby, Cassandra, MyFaces Trinidad, etc.

Results
• Discovered 56 linearizability violations
    • 6 reports were proven to be linearizable, and the adversary was refined accordingly
    • Other MUTs were manually proven linearizable
• Filed bug reports; many are already fixed
• A random adversary failed to find even a single violation in 10 hours

Reasons For Success
• Deterministic MUTs
• For most MUTs, a bug exists for a key iff a bug exists for every key
• Control flow is influenced by the results of the collection's operations
• Not value dependent
    V compute(K) {
        val = m.get(K);
        if (val == null) {
            val = calculateVal(K); // @Pure
            tmpVal = m.putIfAbsent(K, val);
            if (tmpVal != null) val = tmpVal;
        }
        return val;
    }

Future Work
• Verify thread modular linearizability
• Small MUT
• Use ADT semantics to remove implementation
• Small model verification

Summary
• Fine-grained concurrency is hard
• Employing atomic library operations is error prone
• Leverage commutativity
• Sweet spot
• Identify important bugs
• Hard to find
• Simple and efficient technique
