Armed Bandits, Machine Learning and Fast Java: Practical Advice for Real-Time APIs
Breandan Considine
OneSpot, Inc.
Why Real-Time?
• The world is full of hard problems
• Types of real-time applications
  • Hard (nuclear reactor control)
  • Firm (auction bidding)
  • Soft (train scheduling)
• Real-time is a good thing
• Real-world applications
• Performance over scalability
Benefits of Real-Time Processing
• Forces us to narrow our priorities
• Focus on constant, stable solutions rather than time-varying, exact solutions
• Abundance of data but scarce processing power
• Lifespan of actionable data is extremely short
• Tradeoff between optimality and throughput
• Speed and parallelism will come over time
• Upfront investment with long-term benefits
Real-time Interactive Tasks (RITs)
• Online auctions: DSPs, SSPs
• Multivariate testing
• Inventory control, SCM
• Scheduling, navigation, routing
• Recommendation systems
• High-frequency trading
• Fraud prevention
Common Thread
• Agent is offered a context and a set of choices
• Each choice has an unknown payoff distribution
• Choose an option, measure the outcome
• Goal: maximize cumulative payoff
• Many instances
• Time sensitive
• Nontrivial features
Challenges
• Impractical to test every action in context
• Computationally intractable to consider
• Cost of full survey outweighs benefit
• Exploration-Exploitation Tradeoff
• Opportunity cost for suboptimal choices
• Local extrema conceal optimal solutions
• Latency comes at the cost of throughput
• Every clock cycle must count
• Firm real-time characteristics
Dis/advantages
• Starts from scratch, training is expensive
• Credit assignment problem & reward structure
• Issues with non-stationary systems
• Continuously integrates feedback
• Adapts to real-time decisions
• No assumptions about data
• Follows signal on-line
• Similar to how we learn
Non-blocking Algorithms
• Critical for high-performance I/O
• Relatively difficult to implement correctly
• Can offer a large speedup over lock-based variants
• Types of non-blocking guarantees
  • Wait-freedom
  • Lock-freedom
  • Obstruction-freedom
Lock-Freedom
• Guarantees progress for at least one thread
• Does not guarantee starvation-freedom
• May be slower overall, see Amdahl's law
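To make the progress guarantee above concrete, here is a minimal sketch of a lock-free stack (the well-known Treiber stack), which is not taken from the slides. The class name `LockFreeStack` and its methods are hypothetical. A failed `compareAndSet` means some other thread's update succeeded, which is the "at least one thread makes progress" property; an individual thread may still retry indefinitely, so starvation is possible.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration of lock-freedom: a Treiber stack.
class LockFreeStack<T> {
    private static final class Node<T> {
        final T value;
        final Node<T> next;
        Node(T value, Node<T> next) { this.value = value; this.next = next; }
    }

    private final AtomicReference<Node<T>> head = new AtomicReference<>();

    void push(T value) {
        Node<T> oldHead;
        Node<T> newHead;
        do {
            oldHead = head.get();
            newHead = new Node<>(value, oldHead);
            // Retry only if another thread changed head in the meantime.
        } while (!head.compareAndSet(oldHead, newHead));
    }

    // Returns null when the stack is empty.
    T pop() {
        Node<T> oldHead;
        do {
            oldHead = head.get();
            if (oldHead == null) return null;
        } while (!head.compareAndSet(oldHead, oldHead.next));
        return oldHead.value;
    }
}
```

No thread ever holds a lock here, so a stalled thread cannot block the others.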
Java Memory Model
• Built on the happens-before relation
• Threaded operations follow a partial order
• Ensures the JVM does not reorder operations arbitrarily
• Sequential consistency is guaranteed for race-free programs
• Does not prevent threads from seeing operations differently unless synchronization is explicitly declared
The volatile keyword
• Mechanics governed by two simple rules
  • Each action within a thread happens in program order
  • A volatile write happens-before every subsequent read of that same field
• Reads from and writes to main memory
• Memory semantics resemble acquiring a lock on read and releasing it on write, with a similar performance toll, but without mutual exclusion
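The visibility rule above can be sketched with a stop flag. The class `StoppableWorker` and its methods are hypothetical names, not from the slides. Without `volatile`, the loop might never observe the write made by another thread; with it, the write to `running` happens-before the next read in the loop.

```java
// Hypothetical worker whose loop is stopped from another thread.
class StoppableWorker implements Runnable {
    // volatile: a write by one thread is visible to later reads in others.
    private volatile boolean running = true;
    private long iterations = 0;

    @Override
    public void run() {
        while (running) {
            iterations++;   // only ever written by the worker thread
        }
    }

    void stop() {
        running = false;
    }

    // Safe to call after joining the worker thread.
    long iterations() {
        return iterations;
    }
}
```

Note that `volatile` only provides visibility and ordering; a compound update such as `counter++` on a shared volatile field is still not atomic.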
Java Concurrency
• ConcurrentHashMap, ConcurrentLinkedQueue
• Need to benchmark carefully
• Can be significantly slower depending on implementation
• Avoid using the default hash map constructor
• Faster, lock-free implementations exist
• Java 8 improvements in the pipeline
• Compound operations (check-then-act) are prone to atomicity violations
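ConcurrentHashMap<String, Data> map;

Data updateAndGet(String key) {
    Data d = map.get(key);
    if (d == null) {
        // Atomicity violation: another thread can insert the same key
        // between get() and put(), and one of the two values is lost.
        d = new Data();
        map.put(key, d);
    }
    return d;
}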
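One way to close the gap in the check-then-act sequence above is to let the map perform the whole step atomically. The sketch below is an assumption about how the fix might look, not the author's code: `DataCache`, `getOrCreate`, and `getOrCreatePreJava8` are hypothetical names, and `Data` is an empty placeholder for the slide's type. `computeIfAbsent` requires Java 8; `putIfAbsent` is available on earlier versions.

```java
import java.util.concurrent.ConcurrentHashMap;

class Data { }   // placeholder for the slide's Data type

class DataCache {
    private final ConcurrentHashMap<String, Data> map = new ConcurrentHashMap<>();

    // Java 8+: the lookup and the insert happen as one atomic step.
    Data getOrCreate(String key) {
        return map.computeIfAbsent(key, k -> new Data());
    }

    // Pre-Java 8: putIfAbsent returns the existing value if another
    // thread won the race, so every caller ends up with the same object.
    Data getOrCreatePreJava8(String key) {
        Data d = map.get(key);
        if (d == null) {
            Data fresh = new Data();
            d = map.putIfAbsent(key, fresh);
            if (d == null) {
                d = fresh;   // our value was the one inserted
            }
        }
        return d;
    }
}
```

The `putIfAbsent` variant may construct a `Data` that is then discarded, which matters if construction is expensive.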
Java Atomics
• Guarantee lock-free thread safety
• Use CAS primitives to ensure atomic execution
• Better performance than volatile under low to moderate contention; must be tested in a production setting
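// Semantics of a CAS primitive, simulated with a lock for illustration.
// Hardware performs the same compare and update as one atomic instruction.
class SimulatedCAS<T> {
    private T current;

    public synchronized T compareAndSet(T expected, T newValue) {
        T previous = current;
        if (current == expected) {
            current = newValue;
        }
        return previous;   // caller succeeded if previous == expected
    }
}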
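In practice the JDK's atomic classes expose this primitive directly, and callers wrap it in a retry loop. The following is a minimal sketch under that assumption; `MaxTracker` and its methods are hypothetical names chosen for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical example: track the largest sample seen across threads.
class MaxTracker {
    private final AtomicLong max = new AtomicLong(Long.MIN_VALUE);

    void observe(long sample) {
        long current;
        do {
            current = max.get();
            if (sample <= current) {
                return;   // nothing to update
            }
            // If another thread changed max since we read it, re-read and retry.
        } while (!max.compareAndSet(current, sample));
    }

    long get() {
        return max.get();
    }
}
```

On Java 8 and later the same update can be written as `max.accumulateAndGet(sample, Math::max)`, which runs an equivalent loop internally.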
ABA Problem
• Direct equality testing is not sufficient
• A full A-B-A sequence can execute just before the CAS primitive, so the comparison succeeds even though the structure has changed
• Solution: generate a unique tag whenever the value changes, then CAS against the value-tag pair
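The JDK ships one form of the value-tag pair as `AtomicStampedReference`, which compares a reference and an integer stamp together. The wrapper below is a hypothetical sketch (`StampedCell` and its method names are not from the slides) that bumps the stamp on every successful update.

```java
import java.util.concurrent.atomic.AtomicStampedReference;

// Hypothetical cell whose updates are guarded by a version stamp.
class StampedCell {
    private final AtomicStampedReference<String> ref;

    StampedCell(String initial) {
        ref = new AtomicStampedReference<>(initial, 0);
    }

    // Succeeds only if both the reference and the stamp are unchanged.
    boolean replace(String expected, int expectedStamp, String update) {
        return ref.compareAndSet(expected, update, expectedStamp, expectedStamp + 1);
    }

    String value() {
        return ref.getReference();
    }

    int stamp() {
        return ref.getStamp();
    }
}
```

A thread that read the value "A" at stamp 0 will fail its CAS if the cell has since gone A to B and back to A, because the stamp has moved on to 2 even though the reference matches again.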
False Sharing
• Can be prevented by padding out fields
• Java 8 addresses this problem with @Contended
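As a rough sketch of the padding approach, the hypothetical class below surrounds a hot field with unused `long` fields so that two instances are less likely to share a cache line. This assumes 64-byte cache lines and that the JVM keeps the fields roughly in declaration order, which is not guaranteed; the name `PaddedLong` is an illustration only.

```java
// Hypothetical padded holder: 7 longs (56 bytes) on each side of the
// hot field, assuming 64-byte cache lines. Field layout is ultimately
// up to the JVM, so measure before relying on this.
class PaddedLong {
    long p1, p2, p3, p4, p5, p6, p7;   // padding before the hot field
    volatile long value;               // the field that threads contend on
    long q1, q2, q3, q4, q5, q6, q7;   // padding after the hot field
}
```

A typical use is one `PaddedLong` per writer thread, so that each thread's updates do not invalidate the cache line holding another thread's counter.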
Multi-Armed Bandit Problems
• N choices, each with a hidden payoff distribution
• What strategy maximizes cumulative payoff?
• Observation: for each choice, draw a random sample from a distribution representing its observed payoff probability, then pick the choice with the largest sample (argmax)
Bayesian Bandits *http://camdp.com/blogs/multi-armed-bandits
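The sample-then-argmax strategy can be sketched as follows for arms that pay 0 or 1, using a Beta(successes + 1, failures + 1) distribution per arm. This is an illustrative sketch rather than the code behind the slides: `ThompsonSampler` and its methods are hypothetical, and the Beta draw is built from sums of exponential variates, which is only valid for integer parameters and costs time proportional to the number of observations.

```java
import java.util.Random;

// Hypothetical Bayesian bandit for 0/1 rewards.
class ThompsonSampler {
    private final int[] successes;
    private final int[] failures;
    private final Random rng;

    ThompsonSampler(int arms, long seed) {
        successes = new int[arms];
        failures = new int[arms];
        rng = new Random(seed);
    }

    // Gamma(k, 1) for integer k >= 1, as a sum of k exponential draws.
    private double gammaInt(int k) {
        double sum = 0.0;
        for (int i = 0; i < k; i++) {
            sum -= Math.log(1.0 - rng.nextDouble());
        }
        return sum;
    }

    // Beta(a, b) = X / (X + Y) with X ~ Gamma(a) and Y ~ Gamma(b).
    private double sampleBeta(int a, int b) {
        double x = gammaInt(a);
        double y = gammaInt(b);
        return x / (x + y);
    }

    // Draw one sample per arm and return the arm with the largest draw.
    int chooseArm() {
        int best = 0;
        double bestDraw = -1.0;
        for (int arm = 0; arm < successes.length; arm++) {
            double draw = sampleBeta(successes[arm] + 1, failures[arm] + 1);
            if (draw > bestDraw) {
                bestDraw = draw;
                best = arm;
            }
        }
        return best;
    }

    void update(int arm, boolean reward) {
        if (reward) successes[arm]++; else failures[arm]++;
    }

    int pulls(int arm) {
        return successes[arm] + failures[arm];
    }
}
```

Arms with few observations have wide distributions and still win some draws, which provides exploration; as evidence accumulates, the better arm wins most draws. This class is not thread-safe as written.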
Adaptive Control Problems
• Parameter estimation for real-time processes
• Uses continuous feedback to adjust output
Counting/Filtering Problems
• Large domain of inputs (IPs, emails, strings)
• Need to maintain online, streaming aggregates
• See Hadoop libraries for good implementations
• Observation: fast hashing is key
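One structure commonly used for streaming counts over a large input domain is the Count-Min sketch; the slides do not name it, so treat the following as an assumed example. `CountMinSketch`, its parameters, and the hash mixing are all hypothetical. Every row derives from `String.hashCode()`, so two strings with the same hash code collide in every row; a production version would use a stronger hash.

```java
// Hypothetical Count-Min sketch: fixed memory, estimates never undercount.
class CountMinSketch {
    private final long[][] counts;
    private final int depth;
    private final int width;

    CountMinSketch(int depth, int width) {
        this.depth = depth;
        this.width = width;
        this.counts = new long[depth][width];
    }

    // Cheap per-row hash: mix the item's hash code with the row number.
    private int bucket(String item, int row) {
        int h = item.hashCode() * 31 + row * 0x9E3779B9;
        h ^= (h >>> 16);
        h *= 0x85EBCA6B;
        h ^= (h >>> 13);
        return Math.floorMod(h, width);
    }

    void add(String item) {
        for (int row = 0; row < depth; row++) {
            counts[row][bucket(item, row)]++;
        }
    }

    // Minimum across rows: collisions can only inflate a counter.
    long estimate(String item) {
        long min = Long.MAX_VALUE;
        for (int row = 0; row < depth; row++) {
            min = Math.min(min, counts[row][bucket(item, row)]);
        }
        return min;
    }
}
```

Each update touches `depth` counters regardless of how many distinct items have been seen, which is why hashing speed dominates the cost.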
Bloom Filters
• Fast probabilistic membership testing
• Guarantees no false negatives, low space overhead
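A minimal Bloom filter can be sketched with a `BitSet` and a few derived hash positions. The class below is a hypothetical illustration, not one of the referenced implementations: `BloomFilter`, `mightContain`, and the double-hashing scheme built on `String.hashCode()` are assumptions, and sizing the filter for a target false-positive rate is left out.

```java
import java.util.BitSet;

// Hypothetical Bloom filter over strings.
class BloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashes;

    BloomFilter(int size, int hashes) {
        this.size = size;
        this.hashes = hashes;
        this.bits = new BitSet(size);
    }

    // Double hashing: position i is h1 + i * h2, reduced to the bit range.
    private int index(String item, int i) {
        int h1 = item.hashCode();
        // "| 1" keeps h2 odd, so the step between positions is never zero.
        int h2 = Integer.rotateLeft(h1 * 0x9E3779B9, 15) | 1;
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String item) {
        for (int i = 0; i < hashes; i++) {
            bits.set(index(item, i));
        }
    }

    // false means definitely absent; true means possibly present.
    boolean mightContain(String item) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(index(item, i))) {
                return false;
            }
        }
        return true;
    }
}
```

Because bits are only ever set, an added item always tests positive, which is the no-false-negatives guarantee; false positives become more likely as the filter fills.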
References
• http://mechanical-sympathy.blogspot.ie/
• http://camdp.com/blogs/multi-armed-bandits
• http://blog.locut.us/2011/09/22/proportionate-ab-testing
• http://blog.locut.us/2008/01/12/a-decent-stand-alone-java-bloom-filter-implementation/
• http://www.cl.cam.ac.uk/research/srg/netos/lock-free/
• https://github.com/edwardw/high-scale-java-lib
• M. Michael, et al. Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms.
• P. Tsigas, et al. Wait-free queue algorithms for the real-time java specification.