Toward Mixed-Initiative Clustering. Yifen Huang Tom M. Mitchell Carnegie Mellon University Agents that Learn from Human Teachers March 23, 2009. Unsupervised clustering: A machine builds the model alone. Semi-supervised clustering: A user performs an oracle role. Mixed-Initiative Clustering.
Toward Mixed-Initiative Clustering. Yifen Huang, Tom M. Mitchell. Carnegie Mellon University. Agents that Learn from Human Teachers, March 23, 2009
Unsupervised clustering: A machine builds the model alone. Semi-supervised clustering: A user performs an oracle role.
Mixed-Initiative Clustering. Unsupervised clustering: A machine builds the model alone. Semi-supervised clustering: A user performs an oracle role.
Key Question • How can autonomous clustering algorithms be extended to enable mixed-initiative clustering approaches, in which an iterative sequence of computer-suggested and user-suggested revisions converges to a useful hierarchical clustering? • From autonomous clustering to mixed-initiative clustering • From flat feedback to hierarchical feedback
Activity X contains this list of emails: • ## ## ## ## ## An email from Andrea Thomaz belongs to your AAAI symposium activity. Adam Cheyer is a key-person to your CALO activity.
Activity X contains this list of emails: • ## ## ## ## ## • What the hell is this?? DELETE! An email from Andrea Thomaz belongs to your AAAI symposium activity. Too lazy to comment. This is correct. Adam Cheyer is a key-person to your CALO activity.
Framework for Mixed-Initiative Clustering Computer-to-user language: hypotheses User-to-computer language: modified hypotheses Model adaptation algorithm
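The three framework components can be pictured as one interaction loop. The following is only a minimal sketch under stated assumptions: the slides do not describe an implementation, so every name here (`Hypothesis`, `Model`, `adapt`, `mixed_initiative_loop`, `propose`, `ask_user`) is hypothetical, and the "model" simply stores the feedback it receives rather than re-clustering anything.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# All names are hypothetical illustrations, not the authors' implementation.

@dataclass
class Hypothesis:
    """One computer-to-user statement, e.g. 'person P is a key person of cluster C'."""
    kind: str                  # e.g. "doc-in-cluster" or "key-person"
    subject: str               # a document, word, or person
    cluster: str
    status: str = "proposed"   # "proposed", "confirmed", or "removed"

@dataclass
class Model:
    """Toy model: only accumulates the constraints the user has expressed."""
    confirmed: List[Hypothesis] = field(default_factory=list)
    removed: List[Hypothesis] = field(default_factory=list)

def adapt(model: Model, feedback: List[Hypothesis]) -> Model:
    """Model adaptation step: fold the modified hypotheses back into the model."""
    for h in feedback:
        if h.status == "confirmed":
            model.confirmed.append(h)
        elif h.status == "removed":
            model.removed.append(h)
    return model

def mixed_initiative_loop(
    model: Model,
    propose: Callable[[Model], List[Hypothesis]],
    ask_user: Callable[[List[Hypothesis]], List[Hypothesis]],
    rounds: int,
) -> Model:
    for _ in range(rounds):
        hypotheses = propose(model)        # computer-to-user language
        feedback = ask_user(hypotheses)    # user-to-computer language
        model = adapt(model, feedback)     # model adaptation algorithm
    return model

# Demo using the two example hypotheses from the earlier slides.
def propose(model: Model) -> List[Hypothesis]:
    return [
        Hypothesis("doc-in-cluster", "email from Andrea Thomaz", "AAAI symposium"),
        Hypothesis("key-person", "Adam Cheyer", "CALO"),
    ]

def ask_user(hypotheses: List[Hypothesis]) -> List[Hypothesis]:
    # Simulated user: leaves the first hypothesis alone, confirms the second.
    hypotheses[1].status = "confirmed"
    return hypotheses

result = mixed_initiative_loop(Model(), propose, ask_user, rounds=1)
```

A hypothesis the user does not touch keeps its "proposed" status, so the sketch treats silence as neither confirmation nor rejection; whether a real system should do the same is a design choice the slides leave open.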
Communicative Languages in Semi-Supervised Clustering (diagram labels: Cluster, Document)
Communicative Languages in Semi-Supervised Clustering (diagram labels: Confirm/Remove; Cluster, Document)
Enriching Languages in Flat Clustering (diagram labels: Confirm/Remove; Cluster, Document, Word, Person)
Enriching Languages in Flat Clustering (diagram labels: Confirm/Remove; Confirm/Remove; Cluster, Document; Confirm/Remove; Word, Person)
Enriching Languages in Hierarchical Clustering (diagram labels: Confirm/Remove; Confirm/Remove; Cluster, Cluster, Document; Confirm/Remove; Word, Person)
Enriching Languages in Hierarchical Clustering (diagram labels: Confirm/Remove; Move/Merge/Add/Split; Confirm/Remove; Cluster, Cluster, Document; Move; Confirm/Remove; Move; Word, Person)
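The Move, Merge, Add, and Split labels on the hierarchical slide can be read as edits to a tree. Below is a rough sketch of what such edits might look like, assuming the hierarchy is stored as a child-to-parent map. The representation, the function names, and the exact semantics (for example, that Merge only joins sibling clusters and that Split creates a sibling of the original cluster) are assumptions made for illustration, not details taken from the slides.

```python
from typing import Dict, List, Optional

# Hypothetical representation: node (cluster or document) -> parent cluster.
# The root maps to None.
Hierarchy = Dict[str, Optional[str]]

def children(h: Hierarchy, node: str) -> List[str]:
    return [n for n, p in h.items() if p == node]

def move(h: Hierarchy, node: str, new_parent: str) -> None:
    """Move a document, or a whole cluster subtree, under another cluster."""
    if new_parent not in h:
        raise KeyError(new_parent)
    # Walk up from new_parent to make sure we do not create a cycle.
    p: Optional[str] = new_parent
    while p is not None:
        if p == node:
            raise ValueError("cannot move a node under itself or its descendant")
        p = h[p]
    h[node] = new_parent

def add(h: Hierarchy, cluster: str, parent: str) -> None:
    """Add a new, empty cluster under an existing one."""
    if cluster in h:
        raise ValueError(f"{cluster} already exists")
    if parent not in h:
        raise KeyError(parent)
    h[cluster] = parent

def merge(h: Hierarchy, keep: str, absorb: str) -> None:
    """Merge two sibling clusters; everything under `absorb` moves to `keep`."""
    if keep == absorb or h[keep] != h[absorb]:
        raise ValueError("this sketch only merges two distinct sibling clusters")
    for child in children(h, absorb):
        h[child] = keep
    del h[absorb]

def split(h: Hierarchy, cluster: str, new_cluster: str, members: List[str]) -> None:
    """Split `members` out of `cluster` into a new sibling cluster.
    The root cannot be split here because it has no parent to hold the sibling."""
    if any(h[m] != cluster for m in members):
        raise ValueError("all members must currently belong to the cluster")
    add(h, new_cluster, h[cluster])
    for m in members:
        h[m] = new_cluster

# Demo on a toy hierarchy (cluster and email names are made up).
h: Hierarchy = {
    "root": None, "work": "root",
    "CALO": "work", "AAAI": "work",
    "e1": "CALO", "e2": "CALO", "e3": "AAAI",
}
move(h, "e2", "AAAI")              # user moves one email
add(h, "travel", "root")           # user adds a new cluster
merge(h, "CALO", "AAAI")           # user merges two sibling clusters
split(h, "CALO", "CALO-2", ["e3"]) # user splits one email out again
```

Storing only parent pointers keeps each operation a small, local change to edges, which fits the later slide's view that user feedback is equivalent to edge modification.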
Experiment Design • Can mixed-initiative clustering help a user achieve the result faster? • Can mixed-initiative clustering help a machine build a better model?
Dataset • An email dataset from one of the authors • 623 emails • 6684 unique words and 135 individual people • Manually sorted into a hierarchy of 15 cluster nodes: a root, 3 intermediate nodes, and 11 leaf nodes
Feedback Sessions • Five initial hierarchical clustering results • Two feedback sessions on each result • Diligent session • Lazy session
Diligent User (diagram labels: Confirm/Remove; Move/Merge/Add/Split; Confirm/Remove; Cluster, Cluster, Document; Move; Confirm/Remove; Move; Word, Person)
Lazy User (diagram labels: Confirm/Remove; Move/Merge/Add/Split; Confirm/Remove; Cluster, Cluster, Document; Move; Confirm/Remove; Move; Word, Person)
Measurement • User feedback is equivalent to edge modification. • Edge Modification Ratio (EMR) is the ratio of edges that need to be modified in order to reach the reference hierarchy.
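One possible reading of EMR is sketched below. The slide gives only the verbal definition, so several things here are assumptions: both hierarchies are child-to-parent maps over the same node names, an edge "needs modification" when a reference edge is absent from the current hierarchy, and the denominator is the number of edges in the reference hierarchy. The function name `edge_modification_ratio` is hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

Hierarchy = Dict[str, Optional[str]]  # node -> parent; the root maps to None

def edges(h: Hierarchy) -> Set[Tuple[str, str]]:
    """Return the set of (child, parent) edges, skipping the root."""
    return {(child, parent) for child, parent in h.items() if parent is not None}

def edge_modification_ratio(current: Hierarchy, reference: Hierarchy) -> float:
    """Fraction of reference edges that are missing from the current hierarchy.
    Assumes shared node names and a reference-edge denominator (not from the slides)."""
    ref_edges = edges(reference)
    if not ref_edges:
        raise ValueError("reference hierarchy has no edges")
    missing = ref_edges - edges(current)
    return len(missing) / len(ref_edges)

# Toy example: the reference has 5 edges; in the current hierarchy one
# document (d2) hangs under the wrong cluster.
reference: Hierarchy = {
    "root": None, "A": "root", "B": "root",
    "d1": "A", "d2": "A", "d3": "B",
}
current: Hierarchy = dict(reference, d2="B")

emr = edge_modification_ratio(current, reference)  # 1 of 5 edges differs
```

Under this reading, an EMR of 0 means the current hierarchy already matches the reference. The sketch does not handle clusters that exist in only one hierarchy or that carry different names in the two, which a real comparison would need to resolve first.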
One More Step Toward Mixed-Initiative Clustering. Yifen Huang, Tom M. Mitchell. Carnegie Mellon University. Agents that Learn from Human Teachers, March 23, 2009
Future Work • Feasibility study of the low-latency mixed-initiative interface