Active Learning: Class Questions Meeting 8 — Feb 7, 2013 CSCE 6933 Rodney Nielsen
Diverse Ensembles for AL: Your Questions • Concerning the artificial selection of diverse data, why not just select a diverse pool to start with?
Diverse Ensembles for AL: Your Questions • Again concerning the artificial selection of diverse data, does this tend to overfit the underlying data distribution?
Diverse Ensembles for AL: Your Questions • How/why is it expected that artificially generated examples, which are automatically labeled inversely to the current ensemble's prediction, provide a 'right' training set for a new classifier? • (NOTE: What I understand by 'inversely to the current ensemble's prediction' is choosing the less probable class.)
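One way to picture "labeled inversely to the current ensemble's prediction" is the sketch below. It assumes labels are drawn with probability inversely proportional to the ensemble's predicted class probabilities, which is one reading of the question rather than a confirmed detail of the paper. The function names and the `eps` floor for zero probabilities are hypothetical.

```python
import random

def inverse_label_distribution(probs, eps=1e-3):
    """Map ensemble class probabilities to label-sampling probabilities
    that are inversely proportional to them (assumed scheme)."""
    # eps is an assumed floor so a zero probability does not divide by zero.
    inverse = [1.0 / max(p, eps) for p in probs]
    total = sum(inverse)
    return [v / total for v in inverse]

def sample_inverse_label(probs, rng, eps=1e-3):
    """Draw a class index for an artificial example; classes the ensemble
    considers unlikely are drawn more often."""
    weights = inverse_label_distribution(probs, eps)
    return rng.choices(range(len(probs)), weights=weights, k=1)[0]

# The ensemble says class 0 with probability 0.8 and class 1 with 0.2.
ensemble_probs = [0.8, 0.2]
label_dist = inverse_label_distribution(ensemble_probs)
label = sample_inverse_label(ensemble_probs, random.Random(0))
```

Under this assumption the less probable class is favored but not always chosen: for the two-class input above, the sampling weights are simply swapped.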
Diverse Ensembles for AL: Your Questions • Comment: Table 2 shows the best error reduction over DECORATE achieved by each of the other learning algorithms (on each dataset), but each error reduction occurs at a different point (number of training examples). It would be interesting to see how many instances each algorithm required to reach that error reduction value. • For example, on Statlog: QBag (11.31), QBoost (10.34), and ActiveDecorate (11.43). It seems that the winner is ActiveDecorate, but maybe QBoost reached its error reduction earlier than ActiveDecorate, and they have almost the same value.
Diverse Ensembles for AL: Your Questions • I could benefit from having boosting explained again.
Diverse Ensembles for AL: Your Questions • When talking about evaluating "the utility of candidate examples based on the margin of the example", is the standard practice for committee members to cast only 1 vote?
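To make the contrast in this question concrete, here is a minimal sketch of two ways a committee could compute an example's margin: from one hard vote per member, or from averaged class probabilities (a difference in confidences). Both functions and their names are illustrative assumptions, not the paper's implementation.

```python
from collections import Counter

def vote_margin(votes):
    """Margin when each committee member casts exactly one vote:
    (top vote count - runner-up vote count) / committee size."""
    counts = Counter(votes).most_common()
    top = counts[0][1]
    second = counts[1][1] if len(counts) > 1 else 0
    return (top - second) / len(votes)

def confidence_margin(member_probs):
    """Margin from averaged class probabilities: highest average
    probability minus the second highest (needs at least two classes)."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    average = [sum(m[c] for m in member_probs) / n_members
               for c in range(n_classes)]
    top, second = sorted(average, reverse=True)[:2]
    return top - second

# Three committee members, two classes.
hard_votes = ["a", "a", "b"]
soft_votes = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7]]
m_vote = vote_margin(hard_votes)
m_conf = confidence_margin(soft_votes)
```

In either variant, a small margin marks an example the committee is uncertain about, which makes it a candidate for labeling.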
Diverse Ensembles for AL: Your Questions • I use the My Library feature of Google Books to keep a collection of the books that I own. Are you aware of a similar service that allows us to collect academic articles that we liked and might want to reference in the future? • I don't see any similar option in Google Scholar.
Diverse Ensembles for AL: Your Questions • The new classifier is trained on the original training data and the diversity data. Can we just use the diversity data? Training on both data sets can make it easier to get a "good" classifier, but I think that may lead to all of the classifiers being similar.
Diverse Ensembles for AL: Your Questions • How do we stop the iteration? Stop when the accuracy changes by less than a certain threshold?
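The threshold idea raised in this question could look roughly like the sketch below. It is only an illustration of that suggestion, not the stopping rule used in the paper; `should_stop`, `threshold`, and `patience` are hypothetical names and values.

```python
def should_stop(accuracies, threshold=0.005, patience=3):
    """Return True when each of the last `patience` changes in accuracy
    is smaller than `threshold` (assumed stopping heuristic)."""
    if len(accuracies) <= patience:
        return False  # not enough history to judge yet
    recent = accuracies[-(patience + 1):]
    changes = [abs(later - earlier)
               for earlier, later in zip(recent, recent[1:])]
    return all(change < threshold for change in changes)

# Accuracy after each iteration.
plateaued = [0.60, 0.70, 0.75, 0.751, 0.752, 0.753]
improving = [0.60, 0.70, 0.80, 0.90]
stop_plateaued = should_stop(plateaued)
stop_improving = should_stop(improving)
```

Requiring several consecutive small changes, rather than one, guards against stopping on a single noisy estimate.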
Diverse Ensembles for AL: Your Questions • For preparing the artificial training set: in DECORATE, the classifier is trained on some artificial data. What do we mean by artificial data here? It was only mentioned that the data is generated using an approximation of the training data distribution. And what are the category labels? Are they the sets we categorize the samples into?
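As a rough illustration of "generated using an approximation of the training data distribution", the sketch below samples each attribute independently: numeric attributes from a Gaussian fitted to the training column, nominal attributes by observed frequency. That per-attribute scheme is an assumption made for this example, and the function name and toy data are hypothetical.

```python
import random
import statistics
from collections import Counter

def generate_artificial_example(rows, nominal_columns, rng):
    """Sample one artificial example, one attribute at a time
    (assumed per-attribute approximation of the training distribution)."""
    example = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        if j in nominal_columns:
            # Nominal attribute: draw a value by its training frequency.
            counts = Counter(column)
            values = list(counts)
            weights = [counts[v] for v in values]
            example.append(rng.choices(values, weights=weights, k=1)[0])
        else:
            # Numeric attribute: Gaussian with the training mean and std.
            mu = statistics.mean(column)
            sigma = statistics.pstdev(column)
            example.append(rng.gauss(mu, sigma))
    return example

# Toy training set: one numeric and one nominal attribute (made-up values).
training_rows = [[5.1, "red"], [4.9, "red"], [6.3, "blue"], [5.8, "blue"]]
rng = random.Random(0)
artificial = [generate_artificial_example(training_rows, {1}, rng)
              for _ in range(3)]
```

The generated rows carry no class label yet. The category labels are the same classes the training samples belong to, assigned to the artificial rows in a separate step.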
Diverse Ensembles for AL: Your Questions • The other parts of the paper were pretty convincing (as the numbers say so). I would like to read more about DECORATE and how it's better than Boosting when I have some time.
Diverse Ensembles for AL: Your Questions • Since there is a lot of research and different techniques for single- and multiple-winner voting systems that use ranked voting rather than simple majority, it would be interesting to consider the consequences of using one type of voting system over another.
Diverse Ensembles for AL: Your Questions • The next section says this paper uses the difference in confidences, a form of cumulative voting. I wonder how this compares in practice to single transferable vote.