Winnowing Algorithm
CSL758. Instructors: Naveen Garg, Kavitha Telikepalli. Scribe: Neha Dahiya. March 7, 2008.
Concept Class
• A sequence of instances, each having n binary attributes, is presented to the algorithm along with a result of (+) or (-), to train it to predict the result.
• The goal is to come up with an adaptive strategy.
• It is assumed that a disjunction of r literals exactly describes whether a particular instance is in the required group or not.
• For example: if x1, x2, x3, x4 and x5 are the attributes of instances, where xi = 1 if attribute i is present, then n = 5. If x1 ∨ x2 ∨ x5 exactly determines whether an instance is in the required group, then r = 3.
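To make the example concrete, here is a minimal Python sketch of the hidden concept x1 ∨ x2 ∨ x5; the names RELEVANT and target are illustrative conventions of mine, not from the lecture, and attributes are 0-indexed.

```python
# Hidden concept from the example: x1 ∨ x2 ∨ x5 over n = 5 attributes.
# Attributes are 0-indexed here, so x1, x2, x5 become indices 0, 1, 4.
RELEVANT = [0, 1, 4]

def target(x):
    """Return +1 if any relevant attribute is present in x, else -1."""
    return +1 if any(x[i] for i in RELEVANT) else -1

print(target([0, 0, 1, 1, 0]))   # -1: no relevant attribute present
print(target([0, 1, 0, 0, 0]))   # +1: x2 is present
```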
The Winnowing algorithm
• Initialize weights w1 = 1, w2 = 1, ..., wn = 1.
• For any input instance:
  • If ∑wi*xi >= n, then declare the current example as (+).
  • Else, declare the current example as (-).
• Now check the actual result.
  • If our result matches the actual result, make no change.
  • If we declared (+) and the actual result was (-), then halve the weights of those attributes present in the current example.
  • If we declared (-) and the actual result was (+), then double the weights of those attributes present in the current example.
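A minimal runnable Python sketch of the algorithm above, assuming examples arrive as (x, label) pairs with x a 0/1 list of length n and label equal to +1 or -1; the function name winnow and this input format are my own conventions, not part of the scribe notes.

```python
def winnow(examples, n):
    """Online Winnow learner over n binary attributes.
    `examples` yields (x, label) pairs: x is a 0/1 list of length n,
    label is +1 or -1.  Returns the final weights and mistake count."""
    w = [1.0] * n                                # w1 = w2 = ... = wn = 1
    mistakes = 0
    for x, label in examples:
        total = sum(w[i] * x[i] for i in range(n))
        prediction = +1 if total >= n else -1    # threshold is n
        if prediction == label:
            continue                             # correct prediction: no change
        mistakes += 1
        if prediction == +1:                     # Type 2: declared (+), was (-)
            for i in range(n):
                if x[i] == 1:
                    w[i] /= 2                    # halve weights of present attributes
        else:                                    # Type 1: declared (-), was (+)
            for i in range(n):
                if x[i] == 1:
                    w[i] *= 2                    # double weights of present attributes
    return w, mistakes
```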
Upper Bound on # of mistakes
• We now find an upper bound on the total number of mistakes the winnowing algorithm can make.
• The mistakes are of two types:
  • Type 1 Mistake: We declare an example as (-) when it was actually (+).
  • Type 2 Mistake: We declare an example as (+) when it was actually (-).
Bound on Type 1 Mistakes
• On a Type 1 mistake, we double the weights of the attributes present in the current example.
• No relevant attribute (an attribute appearing in the disjunction) ever gets its weight reduced: a weight is halved only when an example containing that attribute causes a Type 2 mistake, and an instance containing a relevant attribute is always a (+) example, so it can never cause a Type 2 mistake.
• Once the weight of a relevant attribute reaches n, it is never doubled again: no instance containing that attribute will be declared (-), since ∑wi*xi >= n is then always satisfied (weights are always positive).
Bound on Type 1 Mistakes
• So the weight of any given relevant attribute is doubled at most log2 n times.
• Every Type 1 mistake doubles the weight of at least one relevant attribute, since a (+) example must contain at least one.
• There are r relevant attributes.
• So we can't make more than r*log2 n mistakes of Type 1. (A compact restatement follows below.)
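The counting argument, restated compactly as a math sketch (amsmath assumed):

```latex
% Type 1 bound: each relevant attribute starts at weight 1 and is only
% ever doubled, and a doubling happens only while its weight is below n.
\[
  \text{doublings per relevant attribute} \;\le\; \log_2 n
  \quad\Longrightarrow\quad
  \#\{\text{Type 1 mistakes}\} \;\le\; r \log_2 n .
\]
```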
Bound on Type 2 Mistakes
• Let the number of Type 2 mistakes be C.
• We do an amortized analysis of the total weight W = ∑wi.
• Its initial value is n.
• On a Type 1 mistake, W increases by at most n: ∑wi*xi was less than n before that instance, so doubling the weights of the attributes present increases W by less than n.
• On a Type 2 mistake, W decreases by at least n/2: ∑wi*xi >= n for that instance, so halving the weights of the attributes present decreases W by at least n/2.
Bound on Type 2 Mistakes
• At any point in time, W is positive. So the total value subtracted from W must be less than the initial value plus the total value added to W.
• Initial value = n.
• Value added to W <= n * (number of Type 1 mistakes) <= n*r*log2 n.
• Value subtracted from W >= (n/2) * (number of Type 2 mistakes) = C*n/2.
• Hence C*n/2 <= n*r*log2 n + n, which gives C <= 2*r*log2 n + 2.
• So the upper bound on the number of Type 2 mistakes is 2*r*log2 n + 2. (The chain of inequalities is restated below.)
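The same bookkeeping as one chain of inequalities, with M1 denoting the number of Type 1 mistakes (a sketch, using the bound on M1 from the previous slide):

```latex
% W = sum_i w_i starts at n, gains less than n per Type 1 mistake (M_1
% of them), loses at least n/2 per Type 2 mistake (C of them), and
% always stays positive.
\[
  0 \;<\; n \;+\; n\,M_1 \;-\; \tfrac{n}{2}\,C,
  \qquad M_1 \le r\log_2 n
\]
\[
  \Longrightarrow\quad \tfrac{n}{2}\,C \;<\; n\,(1 + r\log_2 n)
  \quad\Longrightarrow\quad C \;\le\; 2\,r\log_2 n + 2 .
\]
```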
Upper bound on # of mistakes
• Total number of mistakes = (number of Type 1 mistakes) + (number of Type 2 mistakes) <= r*log2 n + 2*r*log2 n + 2 = 3*r*log2 n + 2.
• So the total number of mistakes made by the winnowing algorithm is O(r log n).
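To sanity-check the bound empirically, the following hedged demo feeds random instances to the winnow sketch above against a randomly chosen hidden disjunction; the values of n, r, and rounds, and the data generator, are illustrative choices of mine, not from the notes.

```python
import math
import random

def run_demo(n=64, r=3, rounds=2000, seed=0):
    """Compare winnow's mistake count against the 3*r*log2(n) + 2 bound."""
    rng = random.Random(seed)
    relevant = rng.sample(range(n), r)            # hidden disjunction
    examples = []
    for _ in range(rounds):
        x = [rng.randint(0, 1) for _ in range(n)]
        y = +1 if any(x[i] for i in relevant) else -1
        examples.append((x, y))
    _, mistakes = winnow(examples, n)             # winnow() from the sketch above
    print(f"mistakes = {mistakes}, bound = {3 * r * math.log2(n) + 2:.0f}")

run_demo()
```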