Understanding Simulators Using Tree-Query Language
New idea: Treatment learners
Tim Menzies (1), Ying Hu (1), James D. Kiper (2)
(1) SE, ECE, UBC, Canada; (2) Com Sci, Miami, Oxford, Ohio, USA
tim@menzies.com, huying_ca@yahoo.com, kiperjd@muohio.edu
Business motivation
• IV&V practitioners seek:
  • Cheapest questions from IV&V to the projects
    • Conversations take time
    • Data collection = $$$
  • Least changes proposed by IV&V to the projects
    • Organizational change is hard
  • Fewest monitors required for the IV&V effort
    • Monitoring = on-going data collection = $$$
• Method:
  • Knowledge farming
  • Treatment learning
Knowledge Farming
• If data is plentiful:
  • Then data mining (e.g. Khoshgoftaar & Allen)
• Else, farm it (a minimal sketch follows below):
  • Plant a seed
    • Quickly build (or borrow) a model
  • Grow the data
    • Monte Carlo (ish) simulations
  • Harvest
    • Summarize
• Happy surprise:
  • When we explore a large space of "maybes"…
  • …stable conclusions exist.
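A minimal sketch of the plant/grow/harvest loop, assuming a toy two-attribute model and a simple score; the attribute names, ranges, and scoring rule here are illustrative placeholders, not the ARRT, CMM2, or COCOMO-R seeds used in the talk.

```python
import random
from collections import Counter

# Plant: a toy "seed" model; unknowns are sampled rather than agonized over.
RANGES = {"reviews": ["none", "some", "lots"], "tracking": ["low", "high"]}

def toy_model(reviews, tracking):
    # Illustrative scoring only: more reviews and tracking -> higher score.
    score = {"none": 0, "some": 1, "lots": 2}[reviews]
    score += {"low": 0, "high": 1}[tracking]
    return score  # higher is better

# Grow: Monte Carlo(ish) runs across the space of "don't knows".
runs = [{"reviews": random.choice(RANGES["reviews"]),
         "tracking": random.choice(RANGES["tracking"])}
        for _ in range(1000)]
for r in runs:
    r["score"] = toy_model(r["reviews"], r["tracking"])

# Harvest: summarize which attribute ranges appear most often in the best runs.
best = sorted(runs, key=lambda r: r["score"], reverse=True)[:100]
print(Counter((k, v) for r in best for k, v in r.items() if k != "score"))
```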
Planting some seeds
• ARRT
• The CMM2 model
  • 350 lines of Prolog
  • Unsure of influence weights? Jiggle!
  • Unsure which of N to do? At runtime, pick at random!
  • e.g.: stableRequirements if effectiveReviews @ 0.3 and requirementsUsed @ 0.3 …. and (workProductsIdentified @ 0.3 or … or softwareTracking @ 0.3).
• The COCOMO-R risk mitigation model
  • Risk of development time over-run
  • Unsure of internal tuning parameters? Pick randomly from tunings in the literature
  • Unsure of inputs (e.g. SLOC)? Pick randomly from possible ranges.
• Generate sketches of local intuitions
  • Qualitative models, balanced score cards, fault trees, …
  • Unsure if X or not X? Add them both in!
Harvest: improve the scores

outlook,   temp, humidity, wind,  hours on course
sunny,     85,   86,       false, none  (2^1 = $2)
sunny,     80,   90,       true,  none
sunny,     72,   95,       false, none
rain,      65,   70,       true,  none
rain,      71,   96,       true,  none
rain,      70,   96,       false, some  (2^2 = $4)
rain,      68,   80,       false, some
rain,      75,   80,       false, some
sunny,     69,   70,       false, lots  (2^3 = $8)
sunny,     75,   70,       true,  lots
overcast,  83,   88,       false, lots
overcast,  64,   65,       true,  lots
overcast,  72,   90,       true,  lots
overcast,  81,   75,       false, lots

Delta(outlook=overcast) = [((8-2)*(4-0)) + ((8-4)*(4-0))] / (4+0+0) = 10
Highlighted ranges: outlook = overcast; humidity = 90..96%
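A minimal sketch of the Delta score shown above, assuming the formula on this slide (differences of class scores times differences of class counts within the treated rows, divided by the number of treated rows); the function and variable names are ours, not TAR2's internals.

```python
# Golf table from the slide: (outlook, temp, humidity, wind, hours on course).
GOLF = [
    ("sunny", 85, 86, False, "none"), ("sunny", 80, 90, True, "none"),
    ("sunny", 72, 95, False, "none"), ("rain", 65, 70, True, "none"),
    ("rain", 71, 96, True, "none"),   ("rain", 70, 96, False, "some"),
    ("rain", 68, 80, False, "some"),  ("rain", 75, 80, False, "some"),
    ("sunny", 69, 70, False, "lots"), ("sunny", 75, 70, True, "lots"),
    ("overcast", 83, 88, False, "lots"), ("overcast", 64, 65, True, "lots"),
    ("overcast", 72, 90, True, "lots"),  ("overcast", 81, 75, False, "lots"),
]
SCORE = {"none": 2, "some": 4, "lots": 8}   # 2^1, 2^2, 2^3 dollars

def delta(rows, column, value, best="lots"):
    """Delta of the treatment column=value, per the formula on the slide."""
    treated = [r for r in rows if r[column] == value]
    n = {c: sum(1 for r in treated if r[-1] == c) for c in SCORE}
    numerator = sum((SCORE[best] - SCORE[c]) * (n[best] - n[c])
                    for c in SCORE if c != best)
    return numerator / len(treated) if treated else 0.0

print(delta(GOLF, 0, "overcast"))   # -> 10.0, as on the slide
```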
Harvesting from data: golf
• If no change, then lots of golf: 6/(6+3+5) = 43%
• If outlook=overcast, then lots of golf: 4/(4) = 100%
  • Least change: bribe DJs to lie about the weather (assumes causality, controllability)
• If humidity=90..97, then lots of golf: 1/(4) = 25%
  • Least monitor: watch the humidity; alert if rising over 90%
• BTW: deltaf >> tree query (faster, simpler).
Harvesting from seed1: ARRT
• ARRT: manual risk balancing
• TAR2+ARRT:
  • 88 possible actions (2^88 ≈ a billion*billion*billion combinations)
  • Random combinations scored as <benefit, cost>
  • Seek actions with high benefit, low cost (see the sketch below)
• Untreated vs. treated (treatment found automatically in 88 secs):
  • do! P697=Y, P919=Y, P753=Y; don't! P692=N
  • (figure: score distributions, from worst to best, for untreated vs. treated runs)
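A minimal sketch of the "random combinations, scored as <benefit, cost>" idea above, assuming hypothetical per-action benefit and cost figures; the action identifiers are taken from the slide, but their values and the combining rule are placeholders, not the real ARRT model.

```python
import random

# Hypothetical benefit/cost figures for a few of the 88 candidate actions.
ACTIONS = {name: {"benefit": random.uniform(0, 10), "cost": random.uniform(0, 5)}
           for name in ("P692", "P697", "P753", "P919")}

def score(selected):
    """Score one combination of actions as <benefit, cost>."""
    benefit = sum(ACTIONS[a]["benefit"] for a in selected)
    cost = sum(ACTIONS[a]["cost"] for a in selected)
    return benefit, cost

# Grow: sample random subsets rather than enumerating all 2^N combinations.
samples = []
for _ in range(10000):
    selected = tuple(a for a in ACTIONS if random.random() < 0.5)
    benefit, cost = score(selected)
    samples.append((benefit, cost, selected))

# Harvest: keep combinations with high benefit and low cost
# (here, simply ranked by benefit minus cost).
best = sorted(samples, key=lambda s: s[0] - s[1], reverse=True)[:3]
print(best)
```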
Harvesting from seed2: CMM-2
• Do periodic software reviews
• Lower cost of formal reviews at milestones (via A?)
• Lower cost of requirements used (via B?)
• Lower cost of unit testing (via C?)
• Where:
  • (A) = using ultra-lightweight formal methods such as proposed by Leveson et al.
  • (B) = sharing requirements documents around the development team in some searchable hypertext format
  • (C) = build test stubs
• Warning: general conclusions may not apply to specific projects (as we shall see)
Harvesting from seed3: COCOMO-R
• Inputs = all ranges:
  • Sced=4: time = 160% of schedule
  • Pcap=4: programmer capability > 90th percentile
  • Pmat=2: CMM level 2
• Inputs = KC1 = changes1 ∪ now1:
  • Sced=2: time = 100% of schedule
  • Acap=2: analyst capability 55th percentile
Harvesting from seed4: a qualitative model
• Qualitative constraint model of a circuit: 9 switches, 9 bulbs, 3 batteries
• Many unknowns: e.g. bulbs may be blown/ok
• Score each run by # of shining light bulbs (max=9)
• Run i: find top treatment T learnt from run i. T must be:
  • Acceptable to users, and
  • Possible
• Constrain run i+1 with T (see the sketch below)
• 1 question culls 90% of options:
  • Runs = 35228
  • Leave Sw2c open → runs = 3264
  • And close Sw1c → runs = 648
  • And close Sw3c (not done) → runs = 32
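A minimal sketch of the run/learn/constrain loop above, assuming a stand-in simulator and a stand-in treatment learner; the switch names, scoring proxy, and learner internals are placeholders, not the actual qualitative circuit model.

```python
import random

SWITCHES = [f"Sw{i}" for i in range(1, 10)]   # placeholder switch names

def simulate(constraints, runs=1000):
    """Stand-in simulator: random switch settings, scored by lit bulbs (0..9)."""
    out = []
    for _ in range(runs):
        setting = {s: constraints.get(s, random.choice(["open", "closed"]))
                   for s in SWITCHES}
        # Placeholder score: count closed switches as a proxy for shining bulbs.
        setting["score"] = sum(v == "closed" for v in setting.values())
        out.append(setting)
    return out

def top_treatment(runs, constrained):
    """Stand-in learner: the switch setting most common among the best runs."""
    best = sorted(runs, key=lambda r: r["score"], reverse=True)[:len(runs) // 10]
    candidates = [(s, v) for r in best for s, v in r.items()
                  if s != "score" and s not in constrained]
    return max(set(candidates), key=candidates.count)

constraints = {}
for i in range(3):                      # run i, then constrain run i+1
    runs = simulate(constraints)
    switch, value = top_treatment(runs, constraints)
    # In practice, first ask the user: is this treatment acceptable and possible?
    constraints[switch] = value
    print(f"run {i}: constrain {switch}={value}")
```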
Discussion
• For coarse-grain decision making
• Don't fret about what you don't know
  • Simulate across the space of don't-knows
• Actions = top ranked treatment(s)
• Monitors = lowest ranked treatment(s)
• To ask the fewest questions:
  • Explore issues in order of treatment ranking (see the sketch below)
• Generality:
  • Average case analysis: use with care for safety-critical sub-systems
  • Works in many domains
  • Assumes low-hanging fruit
  • Elsewhere: theoretically, empirically, many maybes mean (mostly) the same thing
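A minimal sketch of turning a treatment ranking into actions, monitors, and a question order, assuming a small list of candidate treatments already scored by the Delta measure sketched earlier; the treatment names and scores here are illustrative only.

```python
# Illustrative candidate treatments with assumed Delta scores.
treatments = [("outlook=overcast", 10.0),
              ("wind=false", 1.5),
              ("humidity=90..96", -2.4)]

ranked = sorted(treatments, key=lambda t: t[1], reverse=True)
actions = ranked[:1]      # top ranked treatment(s): changes worth proposing
monitors = ranked[-1:]    # lowest ranked treatment(s): conditions worth watching
questions = [name for name, _ in ranked]   # explore issues in order of ranking

print("actions: ", actions)
print("monitors:", monitors)
print("ask about, in order:", questions)
```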
Questions or comments? (Some random other slides follow)
Software
• The TAR2 treatment learner
  • Visual C++ / Windows
• Fast:
  • Thousands of examples in a few seconds
• Free:
  • If you tell us what you are doing with it
• Simple:
  • Easy to use (good documentation)
  • But we're happy to help:
    • Dr. Tim Menzies, professor, UBC, tim@menzies.com
    • Ms. Ying Hu, masters student, UBC
Generality
TAR2 works when a domain generates a small number of outstandingly large deltaf values.