Proactivity = Observation + Analysis + Knowledge extraction + Action planning ?

András Pataricza, Budapest University of Technology and Economics

Contributors: Prof. G. Horváth (BME), I. Kocsis (BME), Z. Micskei (BME), K. Gáti (BME), Zs. Kocsis (IBM), I. Szombath (BME), and many others

There will be nothing new in this lecture

I learned the basics when I was so young

But old professors are happy to have a new audience

What can traditional signal processing do for proactivity?
Proactive stance: builds on foreknowledge (intelligence) and creativity to anticipate the situation as an opportunity, regardless of how threatening or how bad it looks; influences the system constructively instead of reacting

Reactivity vs. proactivity
• Reactive control
  • “acting in response to a situation rather than creating or controlling it”
• Proactive control
  • “controlling a situation rather than just responding to it after it has happened”

Test environment

Test configuration
• Virtual desktop infrastructure
  • A few tens of VMs per host
  • A few tens of hosts per cluster
• vSphere monitoring and supervisory control
• Objective:
  • VM-level SLA control
  • Capacity planning
  • Proactive migration
• “CPU ready” metric:
  • VM is ready to run, but lacks the resources to start

Performance monitoring
Detecting a possible problem at the VM or host level; serves as a failure indicator as well

Actions to prevent performance issues: add limits to neighbouring VMs

Actions to prevent performance issues: live-migrate the VM to another (underutilized) host

Measured data (sampled every 20 seconds)

Aggregation over population
Statistical cluster behavior versus QoS over the VM population

Mean of the goal VM metric (VM_CPU_READY)
• VM application:
  • Ready to run
  • Resource lack -> performance bottleneck -> availability problem
• VMware-recommended thresholds:
  • 5%: watch
  • 10%: action is typically needed
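The two thresholds above can be read as a three-way classification of each CPU-ready sample. The following is a minimal sketch under assumptions: the function name, the status labels, and the choice to treat the thresholds as inclusive are illustrative and do not come from the slides or from VMware tooling.

```python
# Thresholds quoted on the slide: 5% "watch", 10% "action typically needed".
WATCH_THRESHOLD = 0.05
ACTION_THRESHOLD = 0.10


def classify_cpu_ready(ratio: float) -> str:
    """Map a CPU-ready ratio (0.0 to 1.0) to a hypothetical status label.

    Treating the thresholds as inclusive is an assumption of this sketch.
    """
    if ratio >= ACTION_THRESHOLD:
        return "action"
    if ratio >= WATCH_THRESHOLD:
        return "watch"
    return "ok"


# Illustrative values, not measured data.
samples = [0.0, 0.02, 0.07, 0.12]
print([classify_cpu_ready(r) for r in samples])
# → ['ok', 'ok', 'watch', 'action']
```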

The two traps
Visual processing: you believe your eyes
Automated processing: you believe your computer

Mean of the goal VM metric
• Statistics:
  • Mean: 0.007 -> a good system
  • Only 2/3 of the samples are error-free -> a bad system
  • After eliminating failure-free cases below the threshold, mean: 0.023 -> a good system
• Visual inspection: lots of bad values -> this is a bad system
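Different summaries of the same samples can suggest opposite verdicts, which is the trap this slide illustrates. The sketch below computes several of them side by side. The data are synthetic, and the decision to treat zero-valued samples as the "failure-free" cases is an assumption, since the slides do not define the filtering rule precisely.

```python
from statistics import mean


def summarize(samples, threshold=0.05):
    """Return several views of the same metric; no single one tells the whole story."""
    nonzero = [s for s in samples if s > 0]  # assumption: zero means failure-free
    return {
        "mean_all": mean(samples),
        "fraction_error_free": 1 - len(nonzero) / len(samples),
        "mean_nonzero": mean(nonzero) if nonzero else 0.0,
        "fraction_over_threshold": sum(s > threshold for s in samples) / len(samples),
    }


# Synthetic example: two thirds of the samples are clean, one is clearly bad.
samples = [0.0, 0.0, 0.0, 0.0, 0.01, 0.11]
for name, value in summarize(samples).items():
    print(f"{name}: {value:.3f}")
```

The overall mean stays below the 5% "watch" level even though one sample in six is above the 10% "action" level.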

Host shared and used memory over time
• Noisy…
• High-frequency components dominate
• But they correlate (93%!)
• YOU DON’T SEE IT
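A correlation that the eye misses in noisy traces is straightforward to compute. The sketch below implements the Pearson correlation coefficient with the standard library only and applies it to two synthetic series that share a common noisy component. The series, their scales, and the variable names are assumptions for illustration, not the measured host data.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Synthetic stand-ins for "shared" and "used" memory: both follow the same
# high-frequency driver, and the second also has some independent noise.
rng = random.Random(42)
driver = [rng.gauss(0, 1) for _ in range(500)]
shared = [50 + 10 * d for d in driver]
used = [30 + 8 * d + rng.gauss(0, 1) for d in driver]

print(f"correlation: {pearson(shared, used):.3f}")
```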

… and a host of more mundane observations
• Computing power use = CPU use × CPU clock rate (const.)
• Should be purely proportional
• Correlation coefficient: 0.99998477434137
• Well visible, but numerically suppressed
• Origin???

Most important factor: host CPU usage mean
• Host CPU usage vs. VM ratio with “bad” vCPU ready

The battle plan

Impacts of temporal resolution
• Nyquist–Shannon sampling theorem:
  • Sampling frequency = 2 × bandwidth
  • Sampling period = 20 sec -> sampling frequency = 0.05 Hz -> bandwidth = 0.025 Hz
• Additionally:
  • Sampling clock jitter (SW sampling)
  • Clock skew (distributed system)
  • Precision Time Protocol (PTP) (IEEE 1588-2008)
• No fine-granular prediction
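For a 20-second sampling period, the Nyquist arithmetic can be checked in a few lines. This is a minimal sketch; the function name is illustrative.

```python
def nyquist_limits(sampling_period_s: float):
    """Return (sampling frequency, usable bandwidth, shortest resolvable period)."""
    sampling_frequency_hz = 1.0 / sampling_period_s
    bandwidth_hz = sampling_frequency_hz / 2.0        # Nyquist limit
    shortest_period_s = 2.0 * sampling_period_s       # equals 1 / bandwidth
    return sampling_frequency_hz, bandwidth_hz, shortest_period_s


fs, bw, shortest = nyquist_limits(20.0)
print(f"sampling frequency: {fs} Hz")
print(f"bandwidth: {bw} Hz")
print(f"shortest resolvable period: {shortest} s")
```

Any behavior that changes faster than once per 40 seconds cannot be reconstructed from such samples, which is why fine-granular prediction is out of reach.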

Proactivity
• Proactivity needs:
  • Situation recognition based on historical experience
  • What is to be expected?
  • Identification of the principal factors
  • Single factor / multiple factors
  • Operation domains leading to failures
  • Boundaries
  • Predictor design
  • High failure coverage
  • Temporal lookahead sufficient for reaction
  • Design of reaction

Situations to be covered
• Single VM: application demand > resource allocated
• VM-host: overcommissioning, overload due to other VMs
• VM-host-cluster

Data preparation: data cleaning, data reduction

Data reduction
• Huge initial set of samples
• Reduction
  • Object sampling: representative measurement objects
  • Parameter selection/reduction:
    • Aggregation
    • Relevance
    • Redundancy
  • Temporal
    • Sampling
    • Relevance

Object sampling
In pursuit of discovering fine-grained behavior and the reasons for outliers

Subsample: ratio > 0 + random subsampling
• For presentation purposes only
  • Reduction of the sample size to 400
  • Manageability
• Real-life analysis:
  • Keep enough data to maintain a proper correlation with the operation
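The two-step reduction (keep only samples with a nonzero ratio, then draw a random subsample) could be sketched as follows. The record layout with a `ratio` field, the function name, and the fixed seed are assumptions for illustration; the slides do not show how the subsampling was implemented.

```python
import random


def subsample(records, size=400, seed=0):
    """Keep records with ratio > 0, then draw at most `size` of them at random."""
    nonzero = [r for r in records if r["ratio"] > 0]
    if len(nonzero) <= size:
        return nonzero
    # A seeded generator keeps the presentation subset reproducible.
    return random.Random(seed).sample(nonzero, size)


# Synthetic records: every second one has a zero ratio.
records = [{"vm": i, "ratio": 0.0 if i % 2 else 0.001 * (i + 1)} for i in range(2000)]
subset = subsample(records)
print(len(subset))
# → 400
```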

Demo: visual data discovery with parallel (||) coordinates

Visual multifactor analysis
Visual analytics for an arbitrary number of factors
• Inselberg, A.: Parallel Coordinates: Visual Multidimensional Geometry and Its Applications, Springer, 2009
• You can do much, much more
  • Redundancy reduction
  • Correlation analysis
  • Clustering
  • Data mining
  • Approximation
  • Optimization

Prediction at the cluster level
What ratio of the VMs will become problematic?

Pinpointed interval for one VM
Situation of interest
Training time > prediction time

One-minute prediction based on all data sources

One-minute prediction and classification

One-minute prediction with selected variables

Classification error (simplest predictor)
• False alarm rate is low (dominant pattern)
• Feature set selection is critical to detection
• More is less (PROPER selection is needed; cf. PFARM 2010)
• Case separation for different situations
• Long-term prediction is hard (automated reactions)
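The false alarm rate and the detection rate of a binary predictor can be derived from the four confusion-matrix counts. The sketch below uses the usual definitions (false alarms over all actual negatives, detections over all actual positives); the slides do not state which definition was used, so that choice, like the toy labels, is an assumption.

```python
def alarm_rates(actual, predicted):
    """Return (false_alarm_rate, detection_rate) for boolean label sequences."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a and p)
    false_neg = sum(1 for a, p in pairs if a and not p)
    false_pos = sum(1 for a, p in pairs if not a and p)
    true_neg = sum(1 for a, p in pairs if not a and not p)
    negatives = false_pos + true_neg
    positives = true_pos + false_neg
    false_alarm_rate = false_pos / negatives if negatives else 0.0
    detection_rate = true_pos / positives if positives else 0.0
    return false_alarm_rate, detection_rate


# Toy labels: True marks a problematic interval.
actual = [True, True, False, False, False, False]
predicted = [True, False, False, False, True, False]
print(alarm_rates(actual, predicted))
# → (0.25, 0.5)
```

When failure-free intervals dominate, a low false alarm rate says little about how many real problems are caught, so both rates need to be reported.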

Case study: Connectivity testing in Large Networks
In dynamic infrastructures the active internode topology has to be discovered as well…

Large Networks
• Not known explicitly
• Too complex for conventional algorithms
• Social network graph
  • Yahoo! Instant Messenger friend connectivity graph *
  • 1.8M nodes, ~4M edges
• Serve as a model of large infrastructures
• Typical power-law network
  • 75% of the friendships are related to 35% of users
* Yahoo! Research Alliance Webscope program, ydata-yim-friends-graph-v1_0, http://research.yahoo.com/Academic_Relations

Typical model: random graphs
[Figure: adjacency matrices of the Yahoo! Instant Messenger dataset and of a preferential attachment graph, in random order and ordered by degree; limit: graphon]

Approx. edge density by subgraph sampling
• Graph with 800 nodes, 320000 edges
• Subgraph sampling method
  • Random induced subgraph
  • Take k random nodes
  • Repeat n times
• Sample size k = 35, repeated n = 20 times: 2% error, 4% of the graph examined
[Figure: relative error as a function of sample size (k) and number of samples (n); white: error < 5%. Inset: random sample with k = 4]
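The subgraph sampling method (take k random nodes, measure the edge density of the induced subgraph, repeat n times, average) could look like the sketch below. The adjacency-set representation, the function names, and the synthetic random graph are assumptions for illustration; the graph is not the one used on the slide.

```python
import random
from itertools import combinations


def estimate_edge_density(adj, k, n, rng):
    """Average edge density over n random induced subgraphs of k nodes each."""
    nodes = list(adj)
    possible_pairs = k * (k - 1) / 2
    estimates = []
    for _ in range(n):
        sample = rng.sample(nodes, k)
        edges = sum(1 for u, v in combinations(sample, 2) if v in adj[u])
        estimates.append(edges / possible_pairs)
    return sum(estimates) / n


def random_graph(num_nodes, edge_probability, rng):
    """Synthetic undirected graph stored as a dict of neighbor sets."""
    adj = {v: set() for v in range(num_nodes)}
    for u, v in combinations(range(num_nodes), 2):
        if rng.random() < edge_probability:
            adj[u].add(v)
            adj[v].add(u)
    return adj


rng = random.Random(1)
graph = random_graph(800, 0.3, rng)
true_density = sum(len(nbrs) for nbrs in graph.values()) / (800 * 799)
estimate = estimate_edge_density(graph, k=35, n=20, rng=rng)
print(f"true density: {true_density:.4f}, estimate: {estimate:.4f}")
```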

Neighborhood sampling: fault-tolerant services
• Neighborhood sampling
  • Take random nodes
  • Explore neighborhood to a given depth (m)
• Trends
  • No. of 3- and 4-cycles = possible redundancy
  • High node has many substitute nodes (e.g. load balancer)
  • Distributions approximated from samples are very close!
[Figure: root node and its neighborhood as a fault-tolerant domain; redundancy?]
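Neighborhood sampling can be sketched as a breadth-first exploration limited to depth m around a root node, followed by a count of short cycles inside the explored set. The sketch below counts only 3-cycles (triangles) to stay short; 4-cycles would need a separate counter. The function names and the toy graph are assumptions for illustration.

```python
from collections import deque
from itertools import combinations


def neighborhood(adj, root, depth):
    """Nodes reachable from root in at most `depth` hops (breadth-first)."""
    distance = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if distance[node] == depth:
            continue  # do not expand beyond the requested depth
        for neighbor in adj[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return set(distance)


def count_triangles(adj, nodes):
    """Number of 3-cycles whose three nodes all lie in `nodes`."""
    return sum(
        1
        for a, b, c in combinations(sorted(nodes), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )


# Toy graph: a triangle 0-1-2 with a tail 2-3-4.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
area = neighborhood(adj, root=0, depth=1)
print(sorted(area), count_triangles(adj, area))
# → [0, 1, 2] 1
```

In practice the root nodes would be drawn at random and the cycle counts aggregated into a distribution over the samples.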

Summary: proactivity needs
• Observations
  • All relevant cases (stress test)
• Analysis
  • Check of input data
  • Visual analysis
  • UNDERSTANDING
  • Automated methods for calculation
• Knowledge extraction
  • Clustering (situation recognition)
  • Predictor (generalization)
• Action planning
  • Situation-defining principal factors are indicative
Thank you for your attention
