Instructor: Eli Upfal Brown University Computer Science

CS145: Probability & Computing. Lecture 10: Continuous Bayes’ Rule, Joint and Marginal Probability Densities. Instructor: Eli Upfal, Brown University Computer Science. Figure credits: Bertsekas & Tsitsiklis, Introduction to Probability, 2008; Pitman, Probability, 1999

Midterm Exam During regular class time - Thursday, October 24 • Material: Up to and including the lecture today. Format: Hand-written answers to probability problems. Similar to “medium” difficulty problems from homeworks. No programming, calculators not needed or allowed. • Formula Sheet: We will distribute in the exam – copy on the course web site. Included: Formulas for the pmf/pdf/cdf of particular distributions, mean/variance of distributions, integral and derivative identities, etc. Not included: Conceptual equations. Joint & marginal distributions, Bayes’ rule, checking independence, computing cdf from pdf, etc.

Classification from Continuous Data X=1: fire, X=0: no fire. $F_{Y|X}(y \mid x=0)$ = distribution of smoke (carbon monoxide gas) level when there is no fire. $F_{Y|X}(y \mid x=1)$ = distribution of smoke (carbon monoxide gas) level when there is fire. Given that $Y=y$, is there a fire? How do we adapt Bayes’ Rule to continuous distributions?

CS145: Lecture 10 Outline • Bayes’ Rule: Classification from Continuous Data • Joint and Marginal Probability Densities

Continuous Random Variables • For any discrete random variable, the CDF is discontinuous and piecewise constant • If the CDF is monotonically increasing and continuous, we have a continuous random variable • The probability that a continuous random variable X lies in the interval $(x_1, x_2]$ is then $P(x_1 < X \le x_2) = F_X(x_2) - F_X(x_1)$

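Probability Density Function (PDF) • If the CDF is differentiable, its first derivative is called the probability density function (PDF): $f_X(x) = \frac{dF_X(x)}{dx}$ • By the fundamental theorem of calculus: $F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$ • For any valid PDF: $f_X(x) \ge 0$ and $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$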
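The link between the CDF and the PDF can be checked numerically. The following is a minimal Python sketch, assuming an exponential distribution with a hypothetical rate `RATE = 2.0` (not a value from the lecture); the helper `integrate` is an illustrative midpoint-rule integrator, not a library function.

```python
import math

RATE = 2.0  # hypothetical rate parameter, chosen only for illustration

def pdf(x):
    """Exponential PDF f_X(x) = RATE * exp(-RATE * x) for x >= 0."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0

def cdf(x):
    """Exponential CDF F_X(x) = 1 - exp(-RATE * x) for x >= 0."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0

def integrate(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

x1, x2 = 0.2, 1.5
prob_from_cdf = cdf(x2) - cdf(x1)        # P(x1 < X <= x2) from the CDF
prob_from_pdf = integrate(pdf, x1, x2)   # same probability as an integral of the PDF
print(prob_from_cdf, prob_from_pdf)
```

The two printed values should agree up to the error of the numerical integration.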

Classification Problems Athitsos et al., CVPR 2004 & PAMI 2008 • Which of the 10 digits did a person write by hand? • Which of the basic ASL phonemes did a person sign? • Is an email spam or not spam (ham)? • Is this image taken in an indoor or outdoor environment? • Is a pedestrian visible from a self-driving car’s camera? • What language is a webpage or document written in? • How many stars would a user rate a movie that they’ve never seen?

Continuous & Discrete Variables • Discrete variable X. Example: Probability of each category in a classification problem. • Continuous variable Y. Example: Output of temperature sensor, motion sensor, camera, etc. For each possible X=x, the distribution of this sensor output is different.

Marginal Distributions are Mixtures Marginal Distribution: Summing over all possible values of X, the marginal CDF (PDF) is a mixture of the conditional CDFs (PDFs): $F_Y(y) = \sum_x p_X(x)\, F_{Y|X}(y \mid x)$ • Then taking the derivative of each term in the sum: $f_Y(y) = \sum_x p_X(x)\, f_{Y|X}(y \mid x)$
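As a rough illustration of the mixture idea, the Python sketch below builds a marginal CDF and PDF for Y from a two-valued X with exponential conditionals. The prior weights and rates in `prior` and `rate` are hypothetical values chosen for the example, and all function names are illustrative.

```python
import math

# Hypothetical prior PMF p_X(x) and exponential rate for each value of X.
prior = {0: 0.9, 1: 0.1}
rate = {0: 1.0, 1: 5.0}

def cond_cdf(y, x):
    """Conditional CDF F_{Y|X}(y | x) of an exponential with rate[x]."""
    return 1.0 - math.exp(-rate[x] * y) if y >= 0 else 0.0

def cond_pdf(y, x):
    """Conditional PDF f_{Y|X}(y | x) of an exponential with rate[x]."""
    return rate[x] * math.exp(-rate[x] * y) if y >= 0 else 0.0

def marginal_cdf(y):
    """Marginal CDF: prior-weighted sum of the conditional CDFs."""
    return sum(prior[x] * cond_cdf(y, x) for x in prior)

def marginal_pdf(y):
    """Marginal PDF: prior-weighted sum of the conditional PDFs."""
    return sum(prior[x] * cond_pdf(y, x) for x in prior)

# The derivative of the marginal CDF should match the marginal PDF.
y, h = 0.7, 1e-6
numeric_derivative = (marginal_cdf(y + h) - marginal_cdf(y - h)) / (2 * h)
print(marginal_pdf(y), numeric_derivative)
```

The central-difference derivative of the mixture CDF agrees with the mixture PDF, which is the term-by-term differentiation step in numerical form.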

Example: Hard Drive Lifetimes Exponential Distributions: • Suppose 90% of hard drives in some laptop computer model have exponentially distributed lifetime param • However, 10% of hard drives have a manufacturing defect that gives them a shorter lifetime • Recall the mean of an exponential distribution with rate $\lambda$: $E[Y] = 1/\lambda$

Example: Hard Drive Lifetimes Exponential Distributions: • Suppose 90% of hard drives in some laptop computer model have exponentially distributed lifetime param • However, 10% of hard drives have a manufacturing defect that gives them a shorter lifetime • If your hard drive has operated for t seconds and has not yet failed, what is the probability it is defective?
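The question on this slide conditions on the event "lifetime > t", which has positive probability, so the ordinary discrete Bayes' rule applies. Below is a minimal Python sketch. The 0.1 prior comes from the slide, but `rate_good` and `rate_defective` are hypothetical per-second failure rates, since the transcript does not preserve the actual parameters.

```python
import math

def posterior_defective_given_survival(t, p_defective=0.1,
                                       rate_good=0.001, rate_defective=0.01):
    """P(defective | lifetime > t) using exponential survival probabilities.

    rate_good and rate_defective are assumed values for illustration only.
    """
    survive_defective = math.exp(-rate_defective * t)  # P(T > t | defective)
    survive_good = math.exp(-rate_good * t)            # P(T > t | not defective)
    numerator = p_defective * survive_defective
    return numerator / (numerator + (1.0 - p_defective) * survive_good)

for t in (0, 100, 500):
    print(t, posterior_defective_given_survival(t))
```

Under these assumed rates the posterior starts at the prior (0.1) for t = 0 and decreases as the drive keeps running, because survival is less likely for defective drives.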

Example: Hard Drive Lifetimes Exponential Distributions: • Suppose 90% of hard drives in some laptop computer model have exponentially distributed lifetime param • However, 10% of hard drives have a manufacturing defect that gives them a shorter lifetime • If your hard drive fails after exactly t seconds of operation, what is the probability it is defective? Problem: For a continuous variable Y, $P(Y=y) = 0$ for every y.

Discrete Inference from Continuous Data • Y has a different PDF for each possible discrete X=x • Given Y=y, we want to find the probability of each possible X=x • But how can we condition on an event of probability zero? Condition on a small interval around y and take the limit using L'Hôpital's rule.

Discrete Inference from Continuous Data
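One way to make this limit argument precise is sketched below, assuming the conditional densities are continuous at y and $f_Y(y) > 0$. The symbols $p_X$, $F_{Y|X}$, $f_{Y|X}$, $F_Y$, $f_Y$ denote the prior PMF, conditional CDF/PDF, and marginal CDF/PDF; $\delta$ is an interval width introduced here for the derivation.

$$
P(X = x \mid y \le Y \le y + \delta)
= \frac{p_X(x)\,\bigl[F_{Y|X}(y + \delta \mid x) - F_{Y|X}(y \mid x)\bigr]}{F_Y(y + \delta) - F_Y(y)}
$$

As $\delta \to 0$ both the numerator and the denominator go to zero, so L'Hôpital's rule applies. Differentiating each with respect to $\delta$ gives

$$
\lim_{\delta \to 0} P(X = x \mid y \le Y \le y + \delta)
= \frac{p_X(x)\, f_{Y|X}(y \mid x)}{f_Y(y)}
= \frac{p_X(x)\, f_{Y|X}(y \mid x)}{\sum_{x'} p_X(x')\, f_{Y|X}(y \mid x')}
$$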

Example: Hard Drive Lifetimes Exponential Distributions: • Suppose 90% of hard drives in some laptop computer model have exponentially distributed lifetime param • However, 10% of hard drives have a manufacturing defect that gives them a shorter lifetime • If your hard drive fails after exactly t seconds of operation, what is the probability it is defective?
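When the drive fails at exactly time t, the conditioning event has probability zero, so densities take the place of probabilities in Bayes' rule. A minimal Python sketch follows. The 0.1 prior is from the slide, while `rate_good` and `rate_defective` are hypothetical per-second rates, because the actual parameters are not preserved in the transcript.

```python
import math

def exp_pdf(t, rate):
    """Exponential PDF rate * exp(-rate * t) for t >= 0."""
    return rate * math.exp(-rate * t) if t >= 0 else 0.0

def posterior_defective_given_failure_at(t, p_defective=0.1,
                                         rate_good=0.001, rate_defective=0.01):
    """P(defective | lifetime = t), with PDFs as likelihoods.

    rate_good and rate_defective are assumed values for illustration only.
    """
    numerator = p_defective * exp_pdf(t, rate_defective)
    evidence = numerator + (1.0 - p_defective) * exp_pdf(t, rate_good)
    return numerator / evidence

for t in (0, 100, 1000):
    print(t, posterior_defective_given_failure_at(t))
```

With these assumed rates, an immediate failure makes a defect more likely than the prior suggests, while a failure after a long time points toward a non-defective drive.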

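Classification from Continuous Data Suppose X is discrete and Y is continuous, and we observe Y=y. Prior probability mass function for X: $p_X(x)$ Conditional probability density function for Y: $f_{Y|X}(y \mid x)$ Posterior probability mass function of X given Y=y: $p_{X|Y}(x \mid y) = \frac{p_X(x)\, f_{Y|X}(y \mid x)}{\sum_{x'} p_X(x')\, f_{Y|X}(y \mid x')}$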

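Moments of Continuous Data Suppose X is discrete and Y is continuous, and we observe Y=y. Prior probability mass function for X: $p_X(x)$ Conditional probability density function for Y: $f_{Y|X}(y \mid x)$ Marginal expected value of random variable Y: $E[Y] = \sum_x p_X(x)\, E[Y \mid X=x] = \sum_x p_X(x) \int_{-\infty}^{\infty} y\, f_{Y|X}(y \mid x)\,dy$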
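The marginal mean can be illustrated with the hard drive setting: the sketch below weights each conditional mean by the prior and compares the result with a simulation. The rates in `rate` are hypothetical, and `marginal_mean` and `sample_lifetime` are illustrative names, not part of the lecture.

```python
import random

prior = {"good": 0.9, "defective": 0.1}      # prior PMF from the hard drive example
rate = {"good": 0.001, "defective": 0.01}    # hypothetical exponential rates

def marginal_mean(prior, rate):
    """E[Y] as the prior-weighted sum of the conditional means 1 / rate[x]."""
    return sum(prior[x] * (1.0 / rate[x]) for x in prior)

def sample_lifetime(rng):
    """Draw X from the prior, then Y from the exponential for that X."""
    x = "defective" if rng.random() < prior["defective"] else "good"
    return rng.expovariate(rate[x])

rng = random.Random(0)
n = 200_000
simulated_mean = sum(sample_lifetime(rng) for _ in range(n)) / n
print(marginal_mean(prior, rate), simulated_mean)
```

The simulated average should land close to the analytic value, which under these assumed rates is 0.9 · 1000 + 0.1 · 100 = 910.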

Mixed Continuous/Discrete Distributions • Representations of distribution of X: • CDF is (always) well defined • Formally, PDF does not exist in this case • Informally, can illustrate via a hybrid of a probability density function (PDF) and a probability mass function (PMF) • Related concepts in physics & engineering: Dirac delta function, impulse response • Distribution of random variable X: • With probability 0.5, X has a continuous uniform distribution on [0,1] • With probability 0.5, X=0.5
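The CDF of this mixed distribution can be written down directly as a mixture of a uniform CDF and a step function. A minimal Python sketch (the function name `mixed_cdf` is illustrative):

```python
def mixed_cdf(x):
    """CDF of X: with prob. 0.5 uniform on [0, 1], with prob. 0.5 equal to 0.5."""
    uniform_part = min(max(x, 0.0), 1.0)   # CDF of Uniform[0, 1], clipped to [0, 1]
    atom_part = 1.0 if x >= 0.5 else 0.0   # CDF of the point mass at 0.5
    return 0.5 * uniform_part + 0.5 * atom_part

# The CDF jumps by 0.5 at x = 0.5, which is the probability of the atom.
jump = mixed_cdf(0.5) - mixed_cdf(0.5 - 1e-9)
print(mixed_cdf(0.25), mixed_cdf(0.5), mixed_cdf(1.0), jump)
```

The jump at 0.5 is why no ordinary PDF exists here: the CDF is not differentiable (or even continuous) at that point.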

CS145: Lecture 10 Outline • Bayes’ Rule: Classification from Continuous Data • Joint and Marginal Probability Densities

Joint Probability Distributions

Example: Joint Distribution Pitman’s Probability, 1999

2D Uniform Distributions

Marginal Distributions Example: Uniform Distributions

Independence

Classification Problems Athitsos et al., CVPR 2004 & PAMI 2008 • Which of the 10 digits did a person write by hand? • Which of the basic ASL phonemes did a person sign? • Is this image taken in an indoor or outdoor environment? • Is a pedestrian visible from a self-driving car’s camera? If raw data is a list of M continuous features (like pixel intensities), scale models to high-dimensional data via independence assumptions: $f_{Y|X}(y_1, \ldots, y_M \mid x) = \prod_{i=1}^{M} f_{Y_i|X}(y_i \mid x)$
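A classifier built on such independence assumptions can be sketched as follows, assuming the M features are conditionally independent given the class and, purely for illustration, that each feature is Gaussian. The class names, means, and standard deviations in `prior` and `params` are hypothetical, and the Gaussian choice is an assumption rather than something the lecture specifies.

```python
import math

def gaussian_log_pdf(y, mean, std):
    """Log of the Gaussian density with the given mean and standard deviation."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (y - mean) ** 2 / (2 * std ** 2)

def naive_bayes_posterior(features, prior, params):
    """Posterior PMF over classes given continuous features.

    params[x] is a list of (mean, std) pairs, one per feature. The conditional
    density factorizes over features, so the log densities are summed.
    """
    log_joint = {}
    for x, p in prior.items():
        log_likelihood = sum(gaussian_log_pdf(y, m, s)
                             for y, (m, s) in zip(features, params[x]))
        log_joint[x] = math.log(p) + log_likelihood
    peak = max(log_joint.values())  # subtract the max for numerical stability
    unnormalized = {x: math.exp(v - peak) for x, v in log_joint.items()}
    total = sum(unnormalized.values())
    return {x: u / total for x, u in unnormalized.items()}

# Hypothetical two-feature example (e.g. two normalized pixel intensities).
prior = {"indoor": 0.5, "outdoor": 0.5}
params = {"indoor": [(0.3, 0.1), (0.4, 0.1)],
          "outdoor": [(0.7, 0.1), (0.6, 0.1)]}
posterior = naive_bayes_posterior([0.65, 0.62], prior, params)
print(posterior)
```

Working in log space keeps the product of many small densities from underflowing when M is large, which is the practical reason the factorized form scales to high-dimensional data.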

Buffon’s Needle
