Machine learning, probabilistic modelling
Outline
• Some basic aspects of machine learning
• Example: detecting artifacts in ICU data
• Example: probabilistic data association
  • Multitarget tracking
  • Freeway traffic
  • CiteSeer
  • Sybil attacks on recommender systems
Machine learning: model-free
• Diagram: data → Learning → hypothesis
Model-free learning contd.
• Supervised learning
  • Input: x1, f(x1), …, xn, f(xn) (many possible input and label spaces)
  • Output: a hypothesis h ≈ f
  • E.g., f classifies xi as earthquake/explosion
• Unsupervised learning
  • Input: x1, …, xn
  • Output: a clustering of the inputs into categories
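To make the supervised setting concrete, here is a minimal sketch (not from the original slides; the data and the 1-nearest-neighbour rule are invented stand-ins for an arbitrary hypothesis class):

    # Supervised learning sketch: from labeled pairs (x_i, f(x_i)),
    # produce a hypothesis h that approximates f.
    def fit_1nn(examples):
        """examples: list of (x, label) pairs; returns a hypothesis h."""
        def h(x):
            # Predict the label of the nearest training input.
            _, label = min(examples, key=lambda ex: abs(ex[0] - x))
            return label
        return h

    # Toy earthquake/explosion task: x is a single seismic feature.
    train = [(0.2, "earthquake"), (0.4, "earthquake"),
             (1.1, "explosion"), (1.5, "explosion")]
    h = fit_1nn(train)
    print(h(0.3), h(1.3))   # -> earthquake explosion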
Model-free learning contd.
• The application and the form of the data influence the choice of hypothesis class H:
  • Linear models, logistic regression
  • Decision trees (classification or regression)
  • Nonparametric (instance-based) methods
  • Kernel methods: effectively linear separators in a transformed high-dimensional input space (see the sketch below)
  • Probabilistic grammars for strings
  • Etc.
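A hedged illustration of the kernel idea (invented data; the feature map is written out explicitly here, whereas kernel methods apply it implicitly via k(x, x') = phi(x) . phi(x')):

    # Labels are +1 for |x| > 1, -1 otherwise: no threshold on x separates
    # them, but after the map phi(x) = (x, x^2) a linear separator works.
    def phi(x):
        return (x, x * x)

    data = [(-2.0, +1), (-0.5, -1), (0.3, -1), (1.8, +1)]

    def h(x):
        # Linear separator in feature space: sign(w . phi(x) + b).
        w, b = (0.0, 1.0), -1.0
        z = sum(wi * fi for wi, fi in zip(w, phi(x))) + b
        return +1 if z > 0 else -1

    print(all(h(x) == y for x, y in data))   # -> True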
Model-based learning
• Diagram: prior knowledge + data → Learning → knowledge
Bayesian model-based learning
• Generative approach
  • P(world) describes a prior over what is out there (the source), and also over model parameters and structure
  • P(signal | world) describes the sensor model (the channel)
  • Given a new signal, compute P(world | signal)
• Learning
  • Posterior over parameters (or structure) given the data
  • Or use maximum a posteriori or maximum likelihood point estimates
• Substantial advances in modeling capabilities and general-purpose inference algorithms
• Applications with millions of parameters and gigabytes of data are fairly routine
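On the smallest possible instance of this recipe everything can be written out in closed form. A sketch (all numbers invented): a coin with unknown bias theta, a Beta prior playing the role of P(world), and Bernoulli observations playing the role of the signal:

    # Bayesian parameter learning for a coin with unknown bias theta.
    a, b = 2.0, 2.0          # Beta(a, b) prior over theta
    heads, tails = 7, 3      # observed data

    # Conjugacy: the posterior is Beta(a + heads, b + tails).
    post_a, post_b = a + heads, b + tails

    ml   = heads / (heads + tails)                # maximum likelihood
    map_ = (post_a - 1) / (post_a + post_b - 2)   # maximum a posteriori (mode)
    mean = post_a / (post_a + post_b)             # posterior mean

    print(ml, map_, mean)   # 0.7, 0.666..., 0.642...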
Artifact events
• Goal: detect, categorize, and correct for artifacts in the blood pressure signal
Generative model
• Parameters for event duration and frequency trained on a small sample of one-second data
• Detection uses an equivalent one-minute model based on the measurement and artifact processes
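As a rough illustration of the idea only (this is not the model used in the ICU work; every parameter below is invented), an artifact process with frequency and duration parameters can switch the measurement model on and off during forward sampling:

    import random

    P_START       = 0.01   # per-second probability an artifact begins
    MEAN_DURATION = 10.0   # mean artifact length in seconds (geometric)

    def sample_trace(seconds, true_bp=80.0):
        trace, artifact = [], False
        for _ in range(seconds):
            if not artifact and random.random() < P_START:
                artifact = True                      # artifact begins
            elif artifact and random.random() < 1.0 / MEAN_DURATION:
                artifact = False                     # artifact ends
            if artifact:
                trace.append(random.gauss(0.0, 5.0))      # e.g. line zeroing
            else:
                trace.append(random.gauss(true_bp, 2.0))  # normal reading
        return trace

    print(sample_trace(30))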
Generative model
• World = aircraft, trajectories, blip associations

    #Aircraft ~ NumAircraftPrior();

    State(a, t)
      if t = 0 then ~ InitState()
      else ~ StateTransition(State(a, t-1));

    #Blip(Source = a, Time = t) ~ NumDetectionsCPD(State(a, t));

    #Blip(Time = t) ~ NumFalseAlarmsPrior();

    ApparentPos(r)
      if (Source(r) = null) then ~ FalseAlarmDistrib()
      else ~ ObsCPD(State(Source(r), Time(r)));
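To make the model's generative semantics concrete, here is a rough Python forward-sampler (the concrete distributions, rates, and noise scales are invented stand-ins for the named priors such as NumAircraftPrior and ObsCPD):

    import random

    def sample_scene(T=5):
        n_aircraft = random.randint(1, 4)            # NumAircraftPrior
        states, blips = {}, []                       # State(a, t); observed blips
        for a in range(n_aircraft):
            x = random.gauss(0.0, 10.0)              # InitState (t = 0)
            for t in range(T):
                if t > 0:
                    x += random.gauss(1.0, 0.5)      # StateTransition
                states[a, t] = x
                if random.random() < 0.9:            # NumDetectionsCPD (0 or 1)
                    blips.append((random.gauss(x, 0.3), t))        # ObsCPD
        for t in range(T):                           # NumFalseAlarmsPrior
            for _ in range(random.randint(0, 2)):
                blips.append((random.uniform(-30.0, 30.0), t))     # FalseAlarmDistrib
        return states, blips

    states, blips = sample_scene()
    print(len(states), "states,", len(blips), "blips")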
Aircraft Tracking Results [Oh et al., CDC 2004] (simulated data)
• MCMC has the smallest error, and hardly degrades at all as tracks get dense
• MCMC is nearly as fast as the greedy algorithm; much faster than MHT
[Figures by Songhwai Oh]
Extending the Model: Air Bases

    #Aircraft(InitialBase = b) ~ InitialAircraftPerBasePrior();

    CurBase(a, t)
      if t = 0 then = InitialBase(a)
      elseif TakesOff(a, t-1) then = null
      elseif Lands(a, t-1) then = Dest(a, t-1)
      else = CurBase(a, t-1);

    InFlight(a, t) = (CurBase(a, t) = null);

    TakesOff(a, t)
      if !InFlight(a, t) then ~ Bernoulli(0.1);

    Lands(a, t)
      if InFlight(a, t) then ~ LandingCPD(State(a, t), Location(Dest(a, t)));

    Dest(a, t)
      if TakesOff(a, t) then ~ Uniform({Base b})
      elseif InFlight(a, t) then = Dest(a, t-1);

    State(a, t)
      if TakesOff(a, t-1) then ~ InitState(Location(CurBase(a, t-1)))
      elseif InFlight(a, t) then ~ StateTrans(State(a, t-1), Location(Dest(a, t)));
Unknown Air Bases
• Just add two more lines:

    #AirBase ~ NumBasesPrior();
    Location(b) ~ BaseLocPrior();
Example: traffic surveillance
• Multiple distributed sensors
• Uncertain, time-varying travel time
• Prediction error ≫ object separation
Example: Citation Matching

    [Lashkari et al 94] Collaborative Interface Agents, Yezdi Lashkari, Max Metral, and Pattie Maes, Proceedings of the Twelfth National Conference on Articial Intelligence, MIT Press, Cambridge, MA, 1994.

    Metral M. Lashkari, Y. and P. Maes. Collaborative interface agents. In Conference of the American Association for Artificial Intelligence, Seattle, WA, August 1994.

• Are these descriptions of the same object? (The noise, e.g. "Articial", is in the citation data itself.)
• Core task in CiteSeer, Google Scholar
(Simplified) BLOG model

    #Researcher ~ NumResearchersPrior();
    Name(r) ~ NamePrior();

    #Paper(FirstAuthor = r) ~ NumPapersPrior(Position(r));
    Title(p) ~ TitlePrior();

    PubCited(c) ~ Uniform({Paper p});
    Text(c) ~ NoisyCitationGrammar(Name(FirstAuthor(PubCited(c))), Title(PubCited(c)));
Citation Matching Results
• Four data sets of ~300-500 citations, referring to ~150-300 papers
Example: Sybil attacks
• Typically between 100 and 10,000 real entities
• About 90% are honest and have one identity
• Dishonest entities own between 10 and 1000 identities each
• Transactions may occur between identities
  • If two identities are owned by the same entity (sybils), a transaction is highly likely
  • Otherwise, a transaction is less likely (depending on the honesty of each identity's owner)
• An identity may recommend another after a transaction
  • Sybils with the same owner usually recommend each other
  • Otherwise, the probability of recommendation depends on the honesty of the two entities
    #Entity ~ LogNormal[6.9, 2.3]();
    Honest(x) ~ Boolean[0.9]();

    #Identity(Owner = x) ~
      if Honest(x) then 1 else LogNormal[4.6, 2.3]();

    Transaction(x, y) ~
      if Owner(x) = Owner(y) then SybilPrior()
      else TransactionPrior(Honest(Owner(x)), Honest(Owner(y)));

    Recommends(x, y) ~
      if Transaction(x, y) then
        if Owner(x) = Owner(y) then Boolean[0.99]()
        else RecPrior(Honest(Owner(x)), Honest(Owner(y)));

Evidence: lots of transactions and recommendations, maybe some Honest(.) assertions
Query: Honest(x)
Summary
• The generative approach to machine learning can accommodate:
  • strong prior knowledge
  • heterogeneous data
  • noise and artifacts
• Vertically integrated probability models (not a pipeline) connect events, transmission, detection, and association