Logistic Regression
• The logistic function: ƒ(z) = 1 / (1 + e^(-z))
• The logistic function is useful because it accepts any input from negative infinity to positive infinity, while its output is confined to values between 0 and 1.
• The variable z represents the exposure to some set of independent variables, while ƒ(z) represents the probability of a particular outcome, given that set of explanatory variables.
• The variable z measures the total contribution of all the independent variables used in the model and is known as the logit.
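The behaviour described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the function names `logistic` and `logit_score`, and the idea of passing an intercept plus a list of coefficients, are assumptions made for the example.

```python
import math

def logistic(z: float) -> float:
    """Map any real z to a value between 0 and 1.

    The two branches compute the same quantity; splitting on the sign of z
    only avoids calling exp() on a large positive number.
    """
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def logit_score(intercept, coefficients, values):
    """Hypothetical helper: z as the summed contribution of all variables."""
    return intercept + sum(c * x for c, x in zip(coefficients, values))

# z = 0 sits exactly in the middle of the output range.
midpoint = logistic(0.0)
```

Very large negative or positive inputs push the output toward 0 or 1, which is why ƒ(z) can be read as a probability.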
Probabilistic Models: Logistic Regression
• In Information Retrieval, relevance is estimated with a log-linear model that uses various statistical measures of document content as independent variables.
• The log odds of relevance is a linear function of the attributes.
• Term contributions are summed.
• The probability of relevance is recovered from the log odds by the inverse transform, which is the logistic function.
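The step from log odds back to a probability can be written out explicitly. As a sketch, with z standing for the log odds of relevance (the symbols z, O and P are chosen here for illustration and are not taken from the slides):

```latex
\begin{align*}
z &= \log O(R) = \log \frac{P(R)}{1 - P(R)} \\
O(R) &= e^{z} \\
P(R) &= \frac{O(R)}{1 + O(R)} = \frac{e^{z}}{1 + e^{z}} = \frac{1}{1 + e^{-z}}
\end{align*}
```

The last expression is the logistic function applied to the log odds, so a linear model for the log odds always yields a probability between 0 and 1.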
Logistic Regression [Figure: probability of relevance (vertical axis, 0 to 100) plotted against term frequency in document (horizontal axis, 0 to 60)]
Probabilistic Models: Logistic Regression
• The probability of relevance is estimated by logistic regression on a sample set of documents, which determines the values of the coefficients.
• At retrieval, the probability estimate is obtained from those coefficients and the 6 X attribute measures shown on the next slide.
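The two stages above, fitting coefficients on a judged sample and then estimating a probability at retrieval, can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the original work: it fits by plain batch gradient ascent on the log-likelihood, the function names are hypothetical, and the tiny sample uses two made-up attribute values per query-document pair rather than the six real attributes.

```python
import math

def logistic(z):
    # Same quantity in both branches; avoids exp() of a large positive number.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def probability_of_relevance(coeffs, x):
    """coeffs[0] is the intercept; coeffs[1:] pair up with the attributes in x."""
    return logistic(coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], x)))

def fit_coefficients(rows, labels, steps=2000, rate=0.1):
    """Fit coefficients by batch gradient ascent on the log-likelihood."""
    coeffs = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(coeffs)
        for x, y in zip(rows, labels):
            err = y - probability_of_relevance(coeffs, x)
            grad[0] += err
            for i, v in enumerate(x):
                grad[i + 1] += err * v
        # Average the gradient over the sample and take one ascent step.
        coeffs = [c + rate * g / len(rows) for c, g in zip(coeffs, grad)]
    return coeffs

# Made-up illustrative sample: attribute vectors and relevance judgements (1 = relevant).
sample = [[0.2, 1.0], [0.4, 0.8], [1.5, 0.3], [1.8, 0.1]]
judged = [0, 0, 1, 1]

coeffs = fit_coefficients(sample, judged)
estimate = probability_of_relevance(coeffs, [1.6, 0.2])
```

In practice a statistics package would normally do the fitting; the loop is spelled out here only to show where the coefficients come from and how they are reused at retrieval time.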
Probabilistic Models: Logistic Regression attributes (“TREC3”)
• Average Absolute Query Frequency
• Query Length
• Average Absolute Document Frequency
• Document Length
• Average Inverse Document Frequency
• Inverse Document Frequency
• Number of Terms in common between query and document -- logged