Real-time Human-Computer Interaction with Supervised Learning Algorithms for Music Composition and Performance
Rebecca Fiebrink
Perry Cook, Advisor
Pre-FPO, 6/14/2010
function [x flag hist dt] = pagerank(A,optionsu)
[m n] = size(A);
if (m ~= n)
    error('pagerank:invalidParameter', 'the matrix A must be square');
end;
options = struct('tol', 1e-7, 'maxiter', 500, 'v', ones(n,1)./n, ...
    'c', 0.85, 'verbose', 0, 'alg', 'arnoldi', ...
    'linsys_solver', @(f,v,tol,its) bicgstab(f,v,tol,its), ...
    'arnoldi_k', 8, 'approx_bp', 1e-3, 'approx_boundary', inf, ...
    'approx_subiter', 5);
if (nargin > 1)
    options = merge_structs(optionsu, options);
end;
if (size(options.v) ~= size(A,1))
    error('pagerank:invalidParameter', ...
        'the vector v must have the same size as A');
end;
if (~issparse(A))
    A = sparse(A);
end;
% normalize the matrix
P = normout(A);
switch (options.alg)
    case 'dense'
        [x flag hist dt] = pagerank_dense(P, options);
    case 'linsys'
        [x flag hist dt] = pagerank_linsys(P, options);
    case 'gs'
        [x flag hist dt] = pagerank_gs(P, options);
    case 'power'
        [x flag hist dt] = pagerank_power(P, options);
    case 'arnoldi'
        [x flag hist dt] = pagerank_arnoldi(P, options);
    case 'approx'
        [x flag hist dt] = pagerank_approx(P, options);
    case 'eval'
        [x flag hist dt] = pagerank_eval(P, options);
    otherwise
        error('pagerank:invalidParameter', ...
            'invalid computation mode specified.');
end;
[Diagram: a user interacting with a computer, annotated with the questions: Effective? Efficient? Satisfying?]
[Diagram: the same questions, now with machine learning algorithms as the object of the interaction]
Outline • Overview of computer music and machine learning • The Wekinator: A new interface for using machine learning algorithms • Live demo + video • Completed studies • Findings • Further work for FPO and beyond • Wrap-up
Interactive computer music: sensed action → interpretation → response (music, visuals, etc.), with the computer in the loop.
Computer as instrument: sensed action → interpretation → sound generation.
Computer as instrument: a human with a control interface produces the sensed action; interpretation is a mapping to sound generation inside the computer.
Computer as collaborator: a microphone and/or sensors produce the sensed action; the computer models its meaning to drive sound generation.
A composed system: sensed action → mapping/model/interpretation → response.
Supervised learning, training phase: training data (inputs paired with outputs) → algorithm → model.
Supervised learning, running phase: a new input → model → output. In the example, training inputs are labeled “C Major”, “F minor”, and “G7”; at run time a new input produces “F minor”.
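As a concrete illustration of the two phases above, here is a minimal sketch using the Weka Java API (which the Wekinator builds on). Only the chord labels come from the slide; the two-number feature vectors, their values, the choice of J48, the Weka 3.7 API version, and the example() helper are assumptions made for the example.

import java.util.ArrayList;
import weka.classifiers.Classifier;
import weka.classifiers.trees.J48;
import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instance;
import weka.core.Instances;

public class ChordExample {
    public static void main(String[] args) throws Exception {
        // Inputs: a 2-value feature vector; outputs: chord labels.
        ArrayList<String> labels = new ArrayList<>();
        labels.add("C Major"); labels.add("F minor"); labels.add("G7");
        ArrayList<Attribute> attrs = new ArrayList<>();
        attrs.add(new Attribute("feature1"));
        attrs.add(new Attribute("feature2"));
        attrs.add(new Attribute("chord", labels));

        Instances train = new Instances("chords", attrs, 0);
        train.setClassIndex(2);  // the chord label is what we predict

        // Training phase: pair each input feature vector with its output label.
        train.add(example(train, 0.10, 0.80, "C Major"));
        train.add(example(train, 0.70, 0.20, "F minor"));
        train.add(example(train, 0.40, 0.55, "G7"));

        Classifier model = new J48();   // the algorithm
        model.buildClassifier(train);   // training data -> model

        // Running phase: feed a new input to the trained model.
        Instance query = example(train, 0.68, 0.22, "C Major"); // label ignored
        double predicted = model.classifyInstance(query);
        System.out.println(train.classAttribute().value((int) predicted));
    }

    // Builds one labeled example attached to the dataset's attribute layout.
    private static Instance example(Instances header, double f1, double f2, String label) {
        Instance inst = new DenseInstance(3);
        inst.setDataset(header);
        inst.setValue(0, f1);
        inst.setValue(1, f2);
        inst.setValue(2, label);
        return inst;
    }
}

Run with weka.jar on the classpath; the printed label is the model’s prediction for the new input.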
Supervised learning is useful • Models capture complex relationships from the data and generalize to new inputs. (accurate) • Supervised learning circumvents the need to explicitly define mapping functions or models. (efficient) So why isn’t it used more often?
A lack of usable tools for making music
• Weka (Witten and Frank, 2005): many standard algorithms, applicable to any dataset, graphical interface + API, >10,000 citations (Google Scholar). General-purpose ✓, but does not run on real-time signals ✗ and lacks appropriate user interface and interaction support for this work ✗.
• Existing computer music tools: built by engineer-musicians for specific applications. Run on real-time signals ✓, but not general-purpose ✗.
• ???: no existing tool offers all three of (1) general purpose (many algorithms & applications), (2) running on real-time signals, and (3) appropriate user interface and interaction support.
Outline • Overview of computer music and machine learning • The Wekinator: A new interface for using machine learning algorithms • Live demo + video • Completed studies • Findings • Further work for FPO and beyond • Wrap-up
The Wekinator • A general-purpose, real-time tool with appropriate interfaces for using and constructing supervised learning systems. • Built on Weka APIs • Downloadable at http://code.google.com/p/wekinator/ (Fiebrink, Cook, and Trueman 2009; Fiebrink, Trueman, and Cook 2009; Fiebrink et al. 2010)
A tool for running models in real time
feature extractor(s) → a stream of feature vectors over time (e.g., .01, .59, .03, …) → model(s) → a stream of parameter vectors over time (e.g., 5, .01, 22.7, …) → parameterizable process
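A minimal sketch of the shape of that run-time loop, with hypothetical stand-ins throughout: a random-number "feature extractor", a hand-written function in place of a trained model, and printing in place of a real parameterizable synthesis process. Only the loop structure (features in, parameters out, at every time step) reflects the slide.

import java.util.Random;
import java.util.function.DoubleUnaryOperator;

public class RealTimeLoop {
    public static void main(String[] args) throws Exception {
        Random rng = new Random(0);
        // Stand-in for a trained model mapping one feature to one parameter.
        DoubleUnaryOperator model = x -> 20.0 + 10.0 * x;

        for (int step = 0; step < 10; step++) {          // a slow "control rate"
            double feature = rng.nextDouble();            // feature extractor(s)
            double param = model.applyAsDouble(feature);  // model(s)
            System.out.printf("t=%d param=%.2f%n", step, param); // the process
            Thread.sleep(100);                            // next time step
        }
    }
}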
A tool for real-time, interactive design Wekinator supports user interaction with all stages of the model creation process.
Under the hood
Learning algorithms:
• Classification: AdaBoost.M1, J48 decision tree, support vector machine, k-nearest neighbor
• Regression: MultilayerPerceptron
Inputs such as joystick_x, joystick_y, webcam_1, … are extracted as Feature1, Feature2, Feature3, …, FeatureN. Each of Model1, Model2, …, ModelM receives all features and produces one output, Parameter1, Parameter2, …, ParameterM: a continuous value such as volume or pitch (e.g., 3.3098) or a discrete class (e.g., Class24).
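The fan-out from N features to M independent models can be sketched with the Weka API. Here one regression model (a MultilayerPerceptron, as on the slide) maps the full feature vector to a single continuous parameter; the feature names come from the slide, but the training values, the single-model simplification, and the Weka 3.7 API version are illustrative assumptions, not Wekinator's exact internals.

import java.util.ArrayList;
import weka.classifiers.Classifier;
import weka.classifiers.functions.MultilayerPerceptron; // regression
import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instance;
import weka.core.Instances;

public class FeatureToParamDemo {
    public static void main(String[] args) throws Exception {
        // One dataset per model: the same N features, one output attribute.
        Instances volumeData = numericDataset();   // numeric class => regression
        Classifier volumeModel = new MultilayerPerceptron();
        volumeModel.buildClassifier(volumeData);

        // Run: one incoming feature vector drives the model.
        Instance in = new DenseInstance(1.0, new double[]{0.2, 0.7, 0.5, 0});
        in.setDataset(volumeData);
        System.out.println("volume = " + volumeModel.classifyInstance(in));
    }

    // Toy training set: 3 input features plus one numeric output ("volume").
    private static Instances numericDataset() {
        ArrayList<Attribute> attrs = new ArrayList<>();
        attrs.add(new Attribute("joystick_x"));
        attrs.add(new Attribute("joystick_y"));
        attrs.add(new Attribute("webcam_1"));
        attrs.add(new Attribute("volume"));        // the parameter to predict
        Instances data = new Instances("volume_map", attrs, 0);
        data.setClassIndex(3);
        double[][] rows = {{0.1, 0.9, 0.4, 0.0}, {0.8, 0.2, 0.6, 1.0},
                           {0.5, 0.5, 0.5, 0.5}};
        for (double[] r : rows) data.add(new DenseInstance(1.0, r));
        return data;
    }
}

With M parameters, the Wekinator's structure corresponds to repeating this per parameter: M models, each trained on the same features with its own output column.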
Tailored but not limited to music
The Wekinator:
• Built-in feature extractors for music & gesture
• ChucK API for feature extractors and synthesis classes
• Open Sound Control (UDP) control messages connect it to other feature extraction modules and to other modules for sound synthesis, animation, …?
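For the Open Sound Control link, here is a sketch of sending a feature vector as an OSC message over UDP using only the JDK. The address pattern "/features" and port 6448 are illustrative assumptions, not documented defaults of this version of the Wekinator; the byte layout follows the OSC 1.0 specification.

import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

public class OscSend {
    public static void main(String[] args) throws Exception {
        float[] features = {0.01f, 0.59f, 0.03f};
        byte[] packet = oscMessage("/features", features);
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(new DatagramPacket(packet, packet.length,
                    InetAddress.getLoopbackAddress(), 6448));
        }
    }

    // OSC encoding: null-terminated strings padded to 4-byte boundaries, a
    // type-tag string of 'f's, then big-endian 32-bit floats.
    static byte[] oscMessage(String address, float[] floats) throws Exception {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        writePaddedString(out, address);
        StringBuilder tags = new StringBuilder(",");
        for (int i = 0; i < floats.length; i++) tags.append('f');
        writePaddedString(out, tags.toString());
        for (float f : floats) out.writeFloat(f); // DataOutputStream is big-endian
        return buf.toByteArray();
    }

    static void writePaddedString(DataOutputStream out, String s) throws Exception {
        byte[] bytes = s.getBytes("US-ASCII");
        out.write(bytes);
        int pad = 4 - (bytes.length % 4);  // always at least one null terminator
        for (int i = 0; i < pad; i++) out.write(0);
    }
}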
Outline • Overview of computer music and machine learning • The Wekinator: A new interface for using machine learning algorithms • Live demo + video • Completed studies • Findings • Further work for FPO and beyond • Wrap-up
Recap: what’s new?
• General-purpose and runs on real-time signals
• A single interface for building and running models
• Comprehensive support for interactions appropriate to computer music tasks
Outline • Overview of computer music and machine learning • The Wekinator: A new interface for using machine learning algorithms • Live demo + video • Completed studies • Findings • Further work for FPO and beyond • Wrap-up
Study 1: Participatory design process with 7 composers • Fall semester 2009 • 10 weeks, 3 hours / week • Group discussion, experimentation, and evaluation • Iterative design • Final questionnaire (Fiebrink et al., 2010)
Study 2: Teaching interactive systems building in PLOrk (the Princeton Laptop Orchestra)
• COS/MUS 314, Spring 2010
• Focus on building interactive music systems
• Wekinator midterm assignment: master the process of building continuous and discrete gestural control systems, and use one in a performance
• Logging + questionnaire
• Final projects
Study 3: Bow gesture recognition
• Winter 2010
• Worked with a composer/cellist to build a gesture recognizer for a commercial sensor bow
• Classify standard bowing gestures, e.g., up/down, legato/marcato/spiccato (Fiebrink, Schedel, and Threw, 2010)
• Outcomes: classifiers, improved software, written notes on the engineering process
Study 4: Composition/composer case studies • Completed: Winter 2010 to present • CMMV (Dan Trueman, faculty) • Martlet (v 1.0) (Michelle Nagai, graduate student) • G (Raymond Weitekamp, undergraduate) • Blinky; nets0 (Rebecca Fiebrink) • Interviews completed with Michelle and Raymond
Outline • Overview of computer music and machine learning • The Wekinator: A new interface for using machine learning algorithms • Live demo + video • Completed studies • Findings • Further work for FPO and beyond • Wrap-up
Findings to date
• Interacting with supervised learning
• Training the user
• Supervised learning in a creative context
• Usability summary
Interactively training • Primary means of control: iteratively edit the dataset, retrain, and re-evaluate • A straightforward way of affecting the model • Add data to make a model more complex • Add or delete data to correct errors
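A sketch of that edit-retrain-re-evaluate loop against a single persistent Weka dataset. The choice of k-nearest neighbor and the delete-then-retrain correction policy are illustrative assumptions; the point is that every dataset edit is immediately followed by retraining and running.

import weka.classifiers.Classifier;
import weka.classifiers.lazy.IBk;
import weka.core.Instance;
import weka.core.Instances;

public class IterativeTraining {
    private final Instances data;              // the editable training set
    private final Classifier model = new IBk(1); // k-nearest neighbor, k = 1

    public IterativeTraining(Instances emptyDataset) {
        this.data = emptyDataset;              // class index already set
    }

    // Add data to make the model more complex, then retrain immediately.
    public void addExampleAndRetrain(Instance example) throws Exception {
        data.add(example);
        model.buildClassifier(data);           // fast for small datasets
    }

    // Correct an error by deleting the offending example and retraining.
    public void deleteExampleAndRetrain(int index) throws Exception {
        data.delete(index);
        model.buildClassifier(data);
    }

    // Re-evaluate by running the retrained model on new inputs.
    public double run(Instance input) throws Exception {
        return model.classifyInstance(input);
    }
}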
Exercising control via the dataset
N = 21; students retrained an average of 4.64 times per task (std. dev. 4.91)
The interface to the training data is important
• Real-time example recording and a single interface improve efficiency
• Supports embodiment and higher-level thinking
• Several composers used play-along learning as the dominant method (see the sketch after this list)
• Supports different granularities of control
• K-Bow: visual label editing interface
• The spreadsheet editor is still used
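A sketch of play-along recording under stated assumptions: the "score" of target parameter values and the random stand-in for a feature extractor are invented. The point is that each recorded example pairs the current features with whatever parameter value the system is playing at that moment, so the performer trains by demonstrating along with the sound.

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class PlayAlongRecorder {
    public static void main(String[] args) throws Exception {
        Random rng = new Random(0);
        double[] score = {0.0, 0.5, 1.0};      // target parameter values to play
        List<double[]> examples = new ArrayList<>();

        for (double target : score) {           // the system plays each value...
            for (int i = 0; i < 20; i++) {      // ...while the user performs along
                double[] features = {rng.nextDouble(), rng.nextDouble()}; // stand-in sensor
                // Each example: the features paired with the sounding parameter.
                examples.add(new double[]{features[0], features[1], target});
                Thread.sleep(10);               // recording rate
            }
        }
        System.out.println(examples.size() + " play-along examples recorded");
    }
}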
Interactive evaluation • Evaluation of models is also an interactive process in Wekinator
“Traditional” evaluation (e.g., Weka): the available data is split into a training set and an evaluation set; the model is trained on the training set, then evaluated on the held-out evaluation set.
Evaluation in Wekinator: the model is trained on the training set and evaluated directly, with no held-out set.
Interactive evaluation
• Running models is the primary mode of evaluation
• In the PLOrk study:
• Models were run & used 5.3 times per task (std. dev. 5.3), with an average of 4.0 of the 19 minutes per task spent running them
• Cross-validation was computed 1.4 times per task (std. dev. 2.6)
• Traditional metrics are also useful:
• Comparing different classifiers quickly (K-Bow)
• Validation (of the user’s model-building ability)
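The two evaluation styles can be contrasted in code. This sketch uses the standard Weka Evaluation API for the traditional cross-validated metric, and plain train-then-run for the interactive style; the choice of J48 and 10 folds is illustrative.

import java.util.Random;
import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instance;
import weka.core.Instances;

public class TwoEvaluations {
    // Traditional metric: 10-fold cross-validation accuracy on the training set.
    static double crossValAccuracy(Instances train) throws Exception {
        Evaluation eval = new Evaluation(train);
        eval.crossValidateModel(new J48(), train, 10, new Random(1));
        return eval.pctCorrect();   // percent correctly classified
    }

    // Interactive evaluation: train, then judge the model by running it.
    static double runOnce(Instances train, Instance newInput) throws Exception {
        Classifier model = new J48();
        model.buildClassifier(train);
        return model.classifyInstance(newInput); // the user listens and decides
    }
}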
When is this interaction feasible?
• When it is appropriate and possible for a human to provide and/or modify the data
• When the user has knowledge of (and possibly control over) the future input space
• When the training process is fast
• Training time in PLOrk: median 0.80 seconds; 71% of trainings took under 5 seconds
• Number of training examples in the final PLOrk round: mean 692, std. dev. 610
Related approaches to interactive learning • Building models of the user • Standard in speech recognition systems • Use human experts to improve a model of other phenomena • Vision: Fails and Olsen, 2003 • Document classification: Baker, Bhandari, and Thotakura, 2009 • Web images: Amershi 2010 • Novel in music, novel for a general-purpose tool
Findings to date
• Interacting with supervised learning
• Training the user
• Supervised learning in a creative context
• Usability summary
Interaction is two-way: the user controls the machine learning algorithms, and running & evaluating the models provides feedback to the user.
Training the user to provide better training examples • Minimize noise and choose easily differentiable classes
PLOrk students learned: “In collecting data, it is crucial, especially in Motion Sensor, that the positions recorded are exaggerated (i.e. tilt all the way, as opposed to only halfway.) Usually this will do the trick…” “I tried to use very clear examples of contrast in [input features]... If the examples I recorded had values that were not as satisfactory, I deleted them and rerecorded… until the model understood the difference…”