
Behavior Isolation in Enterprise Systems




  1. Behavior Isolation in Enterprise Systems Mohamed Mansour mansour@cc.gatech.edu

  2. Travel Industry Example [Diagram labels: Client 1, Client 2, Client 3, clearinghouse, message queue, GDS, airlines]

  3. GDS Scale [Diagram labels: message queue, GDS] • Mission-critical environment • 24/7 • 11.5 million queries/day • 2-16 seconds processing time • ~10 GB data set, 20% annual increase • 8 updates per day, moving to seamless updates

  4. Effect of Request Stream

  5. Why Do We Care? • Business • Consumer loyalty • Violates contractual agreements • Technical • Occurs even in highly engineered systems • Can cause ripple effects

  6. Let's Just Fix It! • Difficult to identify root cause • Constant data changes • Request stream dependency • Sometimes can't fix root cause • 3rd-party libraries • Interactions with OS and H/W caches • Complex code base

  7. I(solation) Queue • Dynamic management of message streams • Correlate message sequences with server behavior • Learning phase • Isolate undesired sequences • Control phase • Evaluation metrics • Quality of Information metrics (QoI)

  8. Learning Phase • Use online learning methods • Statistical correlation [ICSOC 06] • HMM [GIT-CERCS-06-11] • Behavior Model • Associate undesired behaviors with certain input patterns
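
As a rough illustration of "associate undesired behaviors with certain input patterns", the sketch below counts how often each input pattern coincides with an undesired outcome. It is a plain conditional-frequency estimate, not the statistical correlation or HMM methods cited on the slide. The choice of a (previous request type, current request type) pair as the pattern, and the names `BehaviorModel`, `observe`, `risk` and `learn`, are assumptions made for this example.

```python
from collections import defaultdict


class BehaviorModel:
    """Tracks how often each input pattern coincides with an undesired outcome."""

    def __init__(self):
        self.seen = defaultdict(int)  # pattern -> number of observations
        self.bad = defaultdict(int)   # pattern -> number of undesired outcomes

    def observe(self, pattern, undesired):
        """Record one observation of a pattern and whether the server misbehaved."""
        self.seen[pattern] += 1
        if undesired:
            self.bad[pattern] += 1

    def risk(self, pattern):
        """Fraction of observations of this pattern that were undesired (0.0 if unseen)."""
        n = self.seen.get(pattern, 0)
        return self.bad.get(pattern, 0) / n if n else 0.0


def learn(trace):
    """Build a model from (previous_type, current_type, undesired) records.

    The record layout is hypothetical; a real trace would need its own parser.
    """
    model = BehaviorModel()
    for prev, cur, undesired in trace:
        model.observe((prev, cur), undesired)
    return model
```

Patterns whose `risk` exceeds some chosen threshold would be the candidates to isolate in the control phase.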

  9. Control Phase • Observe input message sequence • Control sequence dispatched to each server to maintain QoI • Dispatcher • Reordering messages in queue
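
One way to picture "reordering messages in queue" is a dispatcher that skips over a pending message when sending it right after the server's previous message would form a pattern flagged as undesired. This is a minimal sketch under that assumption; the slide does not specify the reordering policy. The function name `pick_next`, the use of message types as plain strings, and the fall-back to the head of the queue (to avoid starving messages) are all choices made for this example.

```python
from collections import deque


def pick_next(queue, last_sent, undesired_pairs):
    """Remove and return the next message to dispatch to one server.

    queue:           deque of pending message types, oldest first
    last_sent:       type of the message this server processed most recently
    undesired_pairs: set of (previous_type, next_type) pairs to avoid
    """
    if not queue:
        return None
    # Prefer the oldest message that does not complete a flagged pair.
    for i, msg in enumerate(queue):
        if (last_sent, msg) not in undesired_pairs:
            break
    else:
        # Every pending message is flagged: send the oldest one anyway.
        i = 0
    msg = queue[i]
    del queue[i]
    return msg
```

For example, with pending messages `B, C, B`, a server that last handled `A`, and `("A", "B")` flagged, the sketch dispatches `C` first and leaves both `B` messages queued.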

  10. I-Queue Applied to Worldspan Pricing Engine • Affects customer relations • Possible impact on consumer experience: fewer options • Objective: return maximum number of alternate fares • Problem • Variable number of alternate fares for same query • Root cause unknown

  11. Establishing Behavior Model • Heuristics point to query geographies • Geography based on From/To city pair, e.g. East Coast to EU • Fare data stored in disk files separated by geography • Use geo-locality as our predictor • Goal: improve geo-locality
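
Deriving a geography from a From/To city pair can be sketched as a lookup from each city to a region, with the two regions combined into one code. The table contents, the region names, the `"UNKNOWN"` fall-back and the function name `geography_code` are illustrative assumptions; the actual mapping used in the study is not given in the slides.

```python
# Hypothetical city-to-region table; a real mapping would be far larger.
CITY_REGION = {
    "JFK": "EAST_COAST",
    "BOS": "EAST_COAST",
    "LAX": "WEST_COAST",
    "LHR": "EU",
    "CDG": "EU",
}


def geography_code(origin, destination):
    """Return a geography code such as 'EAST_COAST->EU' for a From/To city pair."""
    try:
        return f"{CITY_REGION[origin]}->{CITY_REGION[destination]}"
    except KeyError:
        # A city missing from the table cannot be classified.
        return "UNKNOWN"
```

Under these assumptions, `geography_code("JFK", "LHR")` yields `EAST_COAST->EU`, matching the slide's "East Coast to EU" example.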

  12. Modified Queue Dispatcher • Dispatcher maintains server execution history • Request routed to an available server with matching geography [Diagram labels: message queue, GDS]
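
The two bullets above can be sketched as a dispatcher that remembers the geography each server handled last and prefers an idle server whose history matches the incoming request. This is a simplified reading, not the Worldspan implementation: the history is reduced to the single most recent geography, and the class name `GeoDispatcher`, its methods, and the fall-back to the first idle server are assumptions for this example.

```python
class GeoDispatcher:
    """Routes requests to idle servers, preferring a matching last geography."""

    def __init__(self, server_ids):
        # Execution history, reduced here to the last geography per server.
        self.last_geo = {sid: None for sid in server_ids}
        self.idle = list(server_ids)

    def dispatch(self, geo):
        """Assign a request with geography `geo`.

        Returns (server_id, matched), or None when no server is idle, in
        which case the caller is expected to keep the request queued.
        """
        if not self.idle:
            return None
        for sid in self.idle:
            if self.last_geo[sid] == geo:
                break
        else:
            sid = self.idle[0]  # no match available: take any idle server
        self.idle.remove(sid)
        matched = self.last_geo[sid] == geo
        self.last_geo[sid] = geo
        return sid, matched

    def release(self, sid):
        """Mark a server as idle again once it finishes its request."""
        self.idle.append(sid)
```

The `matched` flag makes it straightforward to count how often a request lands on a server that already handled the same geography.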

  13. Evaluation • Used real traces from Worldspan • Set of about 1800 requests • 20% process in 16 seconds • Geography extracted from messages • Hand-coded mapping from city pairs to geography code • Processing times measured using Worldspan servers • Completely static environment • Simulations to measure geo-matching • Compare different isolation points
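
A toy version of a geo-matching simulation is sketched below. It is not the simulator used in the evaluation: it ignores processing times and queuing entirely by assuming every server is idle when each request arrives, and it compares a geography-aware policy against plain round-robin. The function name `match_rate`, the least-recently-used fall-back, and the synthetic trace in the usage note are assumptions made for illustration.

```python
def match_rate(trace, n_servers, geo_aware):
    """Fraction of requests sent to a server whose previous request had the same geography.

    trace:     list of geography codes in arrival order
    n_servers: farm size
    geo_aware: True for geography matching, False for round-robin
    """
    last_geo = [None] * n_servers   # last geography handled by each server
    last_used = [-1] * n_servers    # arrival index of each server's last request
    matches = 0
    for t, geo in enumerate(trace):
        if geo_aware and geo in last_geo:
            sid = last_geo.index(geo)  # a server with matching history
        elif geo_aware:
            # No match: overwrite the least recently used server.
            sid = min(range(n_servers), key=lambda s: last_used[s])
        else:
            sid = t % n_servers        # round-robin baseline
        matches += last_geo[sid] == geo
        last_geo[sid] = geo
        last_used[sid] = t
    return matches / len(trace) if trace else 0.0
```

On the synthetic trace `["A", "A", "B", "B", "A", "A"]` with two servers, round-robin never produces a match while the geography-aware policy matches four of the six requests. These numbers describe only this toy trace, not the Worldspan results.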

  14. Improvement in Geo-locality • Matching improves 6 times for min. farm size • Matching can improve further by adding more servers

  15. Choosing the Right Metrics to Monitor • Min. of 28 servers to avoid queuing delays • Geo-match increases with more servers • Queuing delay is not the best metric to monitor

  16. Future Directions
