Seminar by: Vaibhav Krishan, Kanwal Prakash Singh, Tarique Aziz. COMPUTATIONAL MUSIC AND ARTIFICIAL INTELLIGENCE
ROADMAP Introduction to computational music. History. Need for artificial intelligence. Challenges. Terminologies. Some algorithms for tracking a live performance. A genetic algorithm for composing music.
WHAT IS COMPUTATIONAL MUSIC Musicians have many roles: composing music, accompanying a performer, editing and processing audio after recording. Any method by which a computer can help with these roles is studied under computational music.
HISTORY Much of the work has drawn on the relationship between music theory and mathematics. Max Mathews at Bell Laboratories developed the influential “MUSIC I” program and its descendants. Sophisticated audio synthesis has led to a wide variety of algorithms and approaches.
NEED FOR ARTIFICIAL INTELLIGENCE The need arises from challenges in four major areas: Composition of music. Tracking a performance to stay in sync with the performer. The need for a formal generative model of music based on the cognitive sciences. Digital sound processing and editing.
CHALLENGES IN COMPOSITION Extraordinary dedication is required. The human must learn a set of protocols for input, then design and debug the configurations. Even a small piece can mean a thousand numbers as input. Several programs have been written, but many employ a fixed strategy. Flexibility in strategy is needed.
Continued... Problems with the synthesis of sound: It is done in non-real time. The musician may have to wait from several minutes to several hours. As the input is large, typing mistakes are highly probable and can ruin an entire run. Also, the higher-level structure in music is not easily understood by computers. It is clear that musical knowledge can be useful here.
CHALLENGES IN TRACKING A human can make mistakes. The computer must not be misled by these. It must also not take minor variations, such as a small change in note duration, to be mistakes. The computer may be stopped for solos and must pick up at the appropriate position when turned on again. This is tedious for a tape operator to do.
TERMINOLOGIES Note: Atomic musical unit distinguished by frequency. Pitch: A perceptual parameter, the degree of shrillness of a sound that is not noise. Tempo: Rate of music in beats per minute (bpm). Rest: A duration in which nothing is played intentionally. It has importance in music.
continued... Score: A musical pattern already stored in the computer's memory. Needed for accompaniment. Performance: The musical pattern being played by the human performer in real time. It arrives incrementally as input, i.e. one note after another. Cognitive processes: Processes in our mind guiding our actions.
continued... Pulse/Beat: Basic unit of time in music. Interval: Distance between two notes, based on the number of notes between them. Octave: A set of frequencies containing all distinct notes. Bar: A sequence of beats of given duration.
continued... Tone: Steady periodic sound characterized by duration, pitch, intensity and quality. Tonality: Hierarchical ordering of notes based on a musical formula. For example, the C major scale is C D E F G A B C. Tones are characterized by the tonality they belong to.
Tracking a Live Performance How does a computer find out where, what and how fast a performer is playing in real time?
CHALLENGES Tracking is essentially a pattern-matching problem. Input from the performance is to be matched with the score. There are many possibilities to consider in a short response time. The music played must follow the performer closely. Many parameters, such as tempo and the duration of rests, depend on the style of music.
challenges continued... Music must not be unpleasant to human listeners. Sudden large changes in tempo, note frequency or position in the score are generally unpleasant for humans. Such music should not be played by the computer even if the human performer does it. Matching must take into account note sequence, pitch, note length and rests. It is inspired by the cognitive processes of humans while listening to music. For example, most people can recognize Twinkle Twinkle Little Star by listening to only four notes.
challenges continued... Mistakes are often made by the human performer, such as falling off the tempo of the music or playing a note much longer or shorter than its expected duration. Often what looks like a mistake to the computer is not actually a mistake. A small time difference should not be taken as a mistake. A pause between two notes may just be the time required to move a finger from one key to another.
DANNENBERG’S ALGORITHM Uses the notes played as the most important criterion. Derived from Kruskal's dynamic programming algorithm for the longest common subsequence. One string is the score stored in memory; the other is the performance. Given the time available and the need, only a part of the string is considered. Uses heuristics to get the “best match”.
KRUSKAL'S ALGORITHM Populates a two-dimensional matrix. M[i][j] = length(lcs(s1[0..i], s2[0..j])). M[i][j] = max{ M[i-1][j], M[i][j-1], 1 + M[i-1][j-1] if s1[i] = s2[j] }. Initialize the first row and column. Backtrack to find the common subsequence.
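The recurrence on this slide can be sketched in Python as below. This is a minimal illustration, not code from the seminar: the function names `lcs_table` and `lcs` are made up here, and the table is padded with an extra row and column of zeros so that "initialize first row and column" becomes the empty-prefix case.

```python
def lcs_table(s1, s2):
    """Fill the (len(s1)+1) x (len(s2)+1) table of LCS lengths.

    m[i][j] is the LCS length of the first i items of s1 and the
    first j items of s2; row 0 and column 0 stay at zero.
    """
    rows, cols = len(s1) + 1, len(s2) + 1
    m = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            best = max(m[i - 1][j], m[i][j - 1])
            if s1[i - 1] == s2[j - 1]:
                best = max(best, 1 + m[i - 1][j - 1])
            m[i][j] = best
    return m


def lcs(s1, s2):
    """Backtrack through the table to recover one common subsequence."""
    m = lcs_table(s1, s2)
    i, j = len(s1), len(s2)
    out = []
    while i > 0 and j > 0:
        if s1[i - 1] == s2[j - 1]:
            out.append(s1[i - 1])   # matched item is part of the LCS
            i -= 1
            j -= 1
        elif m[i - 1][j] >= m[i][j - 1]:
            i -= 1                  # skip an item of s1
        else:
            j -= 1                  # skip an item of s2
    return out[::-1]


score = list("CDEFG")
performance = list("CDXEG")   # one wrong note (X), one skipped note (F)
common = lcs(score, performance)
```

Here `common` is the list of notes the performance shares, in order, with the score.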
VARIATION FROM KRUSKAL'S The algorithm must be on-line, i.e. it must give the best match as input keeps arriving. Keep the performance in row and score in column. Store the current best match length. As new input arrives, calculate the next column. If a better length is found, report the position; otherwise keep calculating more columns. No unnecessary calculation is done, because only the previous column is needed to calculate the next one.
HEURISTICS Store only one column, as only one is needed. This saves space. As an erroneous note is generally close to the current position, there is no need to consider the whole score and performance. Look at a small window around the expected match. If a match was found in some column, center the window next to the row in which it was found. If no match is found, move the window down by one row.
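A rough sketch of these heuristics follows, under stated assumptions. The class name `OnlineMatcher`, the parameter `half_window` and the choice to carry old ratings over for rows outside the window are illustrative guesses, not details given in the slides. In this sketch the stored column is indexed by score position and is updated once per performance event.

```python
class OnlineMatcher:
    """One-column, windowed version of the LCS table (illustrative only)."""

    def __init__(self, score, half_window=2):
        self.score = score
        self.half_window = half_window          # hypothetical window size
        self.column = [0] * (len(score) + 1)    # ratings for rows 0..len(score)
        self.best = 0                           # best match length so far
        self.center = 1                         # 1-based score row expected next

    def feed(self, event):
        """Process one performance event.

        Returns the 1-based score row of a new best match, or None.
        """
        lo = max(1, self.center - self.half_window)
        hi = min(len(self.score), self.center + self.half_window)
        new = self.column[:]    # rows outside the window keep old ratings
        matched = None
        for i in range(lo, hi + 1):
            rating = max(self.column[i], new[i - 1])
            if self.score[i - 1] == event:
                rating = max(rating, self.column[i - 1] + 1)
            new[i] = rating
            if matched is None and rating > self.best:
                matched = i     # first row where the rating improves
        self.column = new
        if matched is not None:
            self.best = new[matched]
            self.center = matched + 1   # center next to the matched row
        else:
            self.center += 1            # no match: slide the window down
        return matched


matcher = OnlineMatcher(list("CDEFGAB"), half_window=2)
positions = [matcher.feed(note) for note in "CDXEF"]
```

With the wrong note X in the input, `positions` holds None for that event and the tracker resumes at E on the next one.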
ISSUES AND SOLUTIONS If an extra event occurs (a wrong extra note) and it matches some note in the near future, it may be misinterpreted as the skipping of some notes. The algorithm will eventually find out that the wrong choice was made, but that may be too late. Solutions: Limit the change in window position per event. Deduct from the match rating for skipping score events. This is a minimal change in the algorithm. Now M[i][j] = max{ M[i-1][j], M[i][j-1] - 1, 1 + M[i-1][j-1] if s1[i] = s2[j] }.
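The effect of the penalty can be seen in a small full-table sketch. Here s1 is taken to be the performance and s2 the score, so that M[i][j-1] - 1 is the "skip a score event" case; the zero initialization of the first row and column is carried over from the plain LCS table as an assumption, since the slide does not restate it. The function name `track` and the parameter `skip_penalty` are made up for this example.

```python
def track(performance, score, skip_penalty):
    """Return, per performance event, the 1-based score position reported
    when the rating beats the best so far, or None if it does not."""
    rows, cols = len(performance) + 1, len(score) + 1
    m = [[0] * cols for _ in range(rows)]   # assumed zero initialization
    best = 0
    reports = []
    for i in range(1, rows):
        reported = None
        for j in range(1, cols):
            # skipping score event j costs skip_penalty
            rating = max(m[i - 1][j], m[i][j - 1] - skip_penalty)
            if performance[i - 1] == score[j - 1]:
                rating = max(rating, 1 + m[i - 1][j - 1])
            m[i][j] = rating
            if reported is None and rating > best:
                reported = j
        if reported is not None:
            best = m[i][reported]
        reports.append(reported)
    return reports


score = list("CDEFG")
performance = list("CFDE")          # F is a wrong extra note
without_penalty = track(performance, score, skip_penalty=0)
with_penalty = track(performance, score, skip_penalty=1)
```

Without the penalty the tracker jumps ahead to F on the wrong note; with it, the wrong note is ignored and the match continues at D.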
LIMITATIONS No attempt to adjust tempo. No effort to respond to loudness, articulation or other nuances. The input is assumed to be an ordered set, which is not true if multiple sensors are used. The notion of simultaneity differs from a human's: a difference of a microsecond is neither observable nor under the control of a human.
ALGORITHM BY BAIRD, BLEVINS AND ZAHLER Each note together with its duration is called a unit. A sequence of units forms a musical pattern, whose length is determined by the piece. The algorithm matches units in the performance with the score. Units are matched with penalties for differences in pitch or duration. Rests are treated as special notes with different penalties.
algorithm continued... The duration of a completed note is extended to the start time of the next note to smooth over gaps. If the duration of the note being played is less than that in the score, the duration is assumed to be that of the score. If the duration is more, a penalty is assessed. The algorithm is called whenever a note or rest begins. How long to wait before deciding that a rest has been taken depends on the tempo.
algorithm continued... When the algorithm is invoked, a performance pattern is set up based on recent inputs and compared with the score. Ideally all possible score patterns would be considered, but this is not possible due to time constraints. A moving window is used, centered on the previous best match. When units are compared, different kinds of matches are considered, based on the errors that humans are most likely to make.
FOUR KINDS OF MATCHES Verbatim: Direct correspondence between a score note and a performance note.
Amalgamated: Two successive notes in the performance correspond to one in the score. The performer plays an incorrect note, realizes it early enough and corrects himself without losing the tempo. The pitch is taken from the second note and the durations are combined.
Held through: One note in the performance is matched to two notes in the score. Instead of playing a note, the performer held the previous note through the following one. This may be for reasons of tempo or difficulty. The two score notes are considered as one, with the pitch of the earlier note and the combined duration.
Rest: Deciding whether a pause is incidental or an intentional rest. This varies from piece to piece. If the pause is incidental, the note and the subsequent pause are joined and the unit is compared to one note in the score, with the pitch of the note and the combined duration.
algorithm continued... The computer must do this in real time. If the response comes too late, the performance is spoiled. In examining one pattern, each of these matches is considered for each unit in the pattern. Recursion is used heavily to find the best score. An upper limit on the cost is set, and if a pattern exceeds it, the pattern is discarded as a possibility. If no acceptable match is found, the computer “keeps playing” while gradually going back to the original tempo.
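The recursive search over the four kinds of matches might look roughly like the sketch below. Every numeric weight, the cost limit and all names (`unit_cost`, `best_cost`, `best_start`) are hypothetical choices for illustration; the slides give no concrete values. Units are (pitch, duration) pairs with pitch None for a rest, and the special penalties for rests in the score are left out to keep the example short.

```python
import math

# Hypothetical weights; none of these numbers come from the source.
PITCH_PENALTY = 4.0
DURATION_PENALTY = 1.0
KIND_PENALTY = {"amalgamated": 1.0, "held": 1.0, "rest": 0.5}


def unit_cost(perf_unit, score_unit):
    """Penalty for comparing one performance unit with one score unit."""
    (p_pitch, p_dur), (s_pitch, s_dur) = perf_unit, score_unit
    cost = 0.0 if p_pitch == s_pitch else PITCH_PENALTY
    if p_dur > s_dur:               # a shorter note is treated as score length
        cost += DURATION_PENALTY * (p_dur - s_dur)
    return cost


def best_cost(perf, score, p=0, s=0, spent=0.0, limit=6.0):
    """Cheapest way to match perf[p:] against score[s:], or inf if every
    path exceeds the upper limit."""
    if spent > limit:
        return math.inf             # pattern discarded as a possibility
    if p == len(perf):
        return spent
    if s == len(score):
        return math.inf
    # Verbatim: one performance unit against one score unit.
    options = [best_cost(perf, score, p + 1, s + 1,
                         spent + unit_cost(perf[p], score[s]), limit)]
    if p + 1 < len(perf):
        first, second = perf[p], perf[p + 1]
        if second[0] is None:       # Rest: note joined with incidental pause
            merged, kind = (first[0], first[1] + second[1]), "rest"
        else:                       # Amalgamated: pitch of the second note
            merged, kind = (second[0], first[1] + second[1]), "amalgamated"
        options.append(best_cost(
            perf, score, p + 2, s + 1,
            spent + KIND_PENALTY[kind] + unit_cost(merged, score[s]), limit))
    if s + 1 < len(score):          # Held through: two score notes as one
        merged = (score[s][0], score[s][1] + score[s + 1][1])
        options.append(best_cost(
            perf, score, p + 1, s + 2,
            spent + KIND_PENALTY["held"] + unit_cost(perf[p], merged), limit))
    return min(options)


def best_start(perf, score, center, half_window=2, limit=6.0):
    """Try score start positions in a window around the previous best match."""
    lo = max(0, center - half_window)
    hi = min(len(score) - 1, center + half_window)
    return min((best_cost(perf, score, 0, s, 0.0, limit), s)
               for s in range(lo, hi + 1))


score = [(60, 1), (62, 1), (64, 2), (65, 1)]
amalgamated = [(60, 1), (61, 0.25), (62, 0.75), (64, 2)]   # wrong note fixed
held = [(60, 1), (62, 3), (65, 1)]                         # 62 held over 64
paused = [(60, 1), (62, 0.5), (None, 0.5), (64, 2)]        # incidental pause
unrelated = [(70, 1), (71, 1), (72, 1)]
```

An infinite result from `best_cost` corresponds to the "no acceptable match" case, in which the computer would simply keep playing.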
algorithm continued... Now, based on the previous score, the most promising pattern is found. A new but more costly path is abandoned. Both the window size and the length of the performance pattern affect speed. A good setting can be found by increasing both until the response delay becomes noticeable. The length of the performance pattern is also restricted by the style of music: a slow tempo means not many notes can be considered.
ADJUSTING TEMPO The computer constantly tracks the tempo of the performer. If a good match is found but the computer is not at the current performance position, it must sync. A good estimate of the current tempo is required, as not only the position but also the speed needs to be known. A history of recent matches and their times is kept. A weighted least-squares fit is calculated, with more weight given to recent input.
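One way to realize such a weighted least-squares estimate is sketched below. The history format (pairs of time in seconds and score position in beats), the geometric weighting and the `decay` parameter are assumptions made for this example; the slides only say that recent input gets more weight.

```python
def estimate_tempo(history, decay=0.7):
    """Fit beat = intercept + slope * time by weighted least squares.

    history: list of (time_seconds, score_beat) for recent matches,
             oldest first.
    decay:   hypothetical factor; each older match weighs `decay` times
             the next newer one (decay=1.0 gives an unweighted fit).
    Returns (slope, intercept); slope is the tempo in beats per second.
    """
    n = len(history)
    if n < 2:
        raise ValueError("need at least two matches to estimate tempo")
    weights = [decay ** (n - 1 - k) for k in range(n)]
    total = sum(weights)
    mean_t = sum(w * t for w, (t, _) in zip(weights, history)) / total
    mean_b = sum(w * b for w, (_, b) in zip(weights, history)) / total
    sxx = sum(w * (t - mean_t) ** 2 for w, (t, _) in zip(weights, history))
    sxy = sum(w * (t - mean_t) * (b - mean_b)
              for w, (t, b) in zip(weights, history))
    slope = sxy / sxx
    return slope, mean_b - slope * mean_t


steady = [(0.0, 0), (0.5, 1), (1.0, 2), (1.5, 3)]            # 2 beats/s
speeding_up = [(0.0, 0), (1.0, 1), (2.0, 2), (2.5, 3), (3.0, 4)]

slope, intercept = estimate_tempo(steady)
bpm = 60 * slope
recent_weighted, _ = estimate_tempo(speeding_up, decay=0.5)
unweighted, _ = estimate_tempo(speeding_up, decay=1.0)
```

The fitted line gives both quantities the slide asks for: the slope is the current speed and `intercept + slope * now` is the expected score position.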
COMPOSING MUSIC Models for algorithmic composition: Mathematical models. Knowledge-based systems. Grammars. Evolutionary methods (discussed in detail). Systems which learn. Hybrid systems. A single algorithmic model rarely gives satisfying results.
GENETIC ALGORITHM BY DRAGAN MATIC GA implementation. Population initialization and algorithm flow. Creating an individual. Determining fitness. Genetic operators. An example.
GA IMPLEMENTATION Compositions are represented by an array carrying information about the pitches and their durations. Input parameters: Length of composition. Tonality. Number and range of tones. Maximum number of iterations. Criteria for completion. Interpretation of results.
Continued... The algorithm starts with random individuals, partially controlled by the input parameters. It terminates when it reaches the maximum number of iterations or when a good enough individual is found. An individual is “better” the lower its fitness value. New individuals are created by mutation. The output of the algorithm is a composition.
CREATING AN INDIVIDUAL The total number of pitches is n. The GCD of the durations is k (the shortest length). The total number of bars is m. The number of pulses in a bar is p. The number of shortest lengths in a pulse is q. Then the composition array has size m*p*q. Each index holds 0 for a break, a value in [1, n] for a pitch, and n+1 for continuation by one shortest length.
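The array layout described above can be illustrated with a small encoder and decoder. The function names and the (pitch, length) input format are invented for this sketch, and it assumes that a break lasting several shortest lengths is stored as repeated zeros, which the slide does not state explicitly.

```python
def encode(notes, n, m, p, q):
    """Encode (pitch, length) pairs into a composition array of size m*p*q.

    pitch is 0 for a break or 1..n for a tone; length counts shortest
    lengths. A tone occupies one cell with its pitch followed by
    length-1 cells holding n+1 (continuation).
    """
    cells = []
    for pitch, length in notes:
        if not 0 <= pitch <= n or length < 1:
            raise ValueError(f"bad note: {(pitch, length)}")
        if pitch == 0:
            cells.extend([0] * length)          # assumption: breaks repeat 0
        else:
            cells.append(pitch)
            cells.extend([n + 1] * (length - 1))
    if len(cells) != m * p * q:
        raise ValueError("notes do not fill exactly m*p*q cells")
    return cells


def decode(cells, n):
    """Inverse of encode; adjacent break cells are merged into one break."""
    notes = []
    for cell in cells:
        if cell == n + 1:
            if not notes or notes[-1][0] == 0:
                raise ValueError("continuation without a preceding tone")
            notes[-1][1] += 1
        elif cell == 0 and notes and notes[-1][0] == 0:
            notes[-1][1] += 1
        else:
            notes.append([cell, 1])
    return [tuple(note) for note in notes]


# One bar (m=1) of four pulses (p=4), two shortest lengths per pulse (q=2).
notes = [(1, 2), (3, 1), (0, 1), (5, 4)]
array = encode(notes, n=8, m=1, p=4, q=2)
```

Because rhythm is carried entirely by the 0 and n+1 cells, changing only cells in [1, n] alters the melody while leaving the rhythm intact.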
POPULATION INITIALIZATION AND ALGORITHM FLOW The population contains individuals (complete compositions) having the same predefined rhythm as the reference individual. The population is sorted by calculated fitness. Mutation is applied to the best individuals. Duplicates and excess individuals are removed. The algorithm terminates when a good enough individual is found or the maximum number of iterations is reached.
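The flow described on this slide can be sketched as a mutation-only loop. Everything concrete here is an assumption for illustration: the population sizes, the stand-in fitness `out_of_tonality` (a count of tones outside a made-up tonality set), the single mutation `change_one_tone`, and the names themselves. The actual fitness is more elaborate than this stand-in.

```python
import random


def random_individual(reference, n, rng):
    """Random tones on the reference rhythm: break (0) and continuation
    (n+1) cells are copied, tone cells get a random pitch in 1..n."""
    return tuple(rng.randint(1, n) if 1 <= c <= n else c for c in reference)


def change_one_tone(individual, n, rng):
    """Hypothetical mutation: replace one tone cell with a random pitch."""
    tones = [i for i, c in enumerate(individual) if 1 <= c <= n]
    out = list(individual)
    if tones:
        out[rng.choice(tones)] = rng.randint(1, n)
    return tuple(out)


def evolve(reference, n, fitness, mutate, pop_size=20, n_best=5,
           max_iter=200, good_enough=0.0, seed=0):
    """Sort by fitness (lower is better), mutate the best individuals,
    drop duplicates and excess, stop when good enough or out of iterations."""
    rng = random.Random(seed)
    population = [random_individual(reference, n, rng)
                  for _ in range(pop_size)]
    for _ in range(max_iter):
        population.sort(key=fitness)
        if fitness(population[0]) <= good_enough:
            break
        children = [mutate(ind, n, rng) for ind in population[:n_best]]
        merged = list(dict.fromkeys(population + children))  # no duplicates
        merged.sort(key=fitness)
        population = merged[:pop_size]                       # drop excess
    return min(population, key=fitness)


def out_of_tonality(individual, n=8, allowed=frozenset({1, 3, 5, 6, 8})):
    """Stand-in fitness: number of tones outside a made-up tonality."""
    return sum(1 for c in individual if 1 <= c <= n and c not in allowed)


reference = (1, 9, 3, 0, 5, 9, 9, 9)    # n = 8, so 9 marks continuation
start = evolve(reference, 8, out_of_tonality, change_one_tone, max_iter=0)
result = evolve(reference, 8, out_of_tonality, change_one_tone)
```

Keeping the parents in the merged population makes the best fitness non-increasing from one iteration to the next.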
DETERMINING FITNESS [formula not preserved] Each weight applies to the i-th factor (f_i) and the j-th bar. n is the number of criteria and m is the number of bars. f_i may be the ratio of tones outside the tonality to the total number of tones, the number of dissonant tones, etc. Each interval is assigned a value; the lower the value, the greater the quality of the interval.
Continued... The arithmetic mean and variance are calculated for the intervals in each bar. [symbol not preserved] is the influence of the difference of variance for the i-th bar. [symbol not preserved] is the influence of the difference of mean for the i-th bar. [symbols not preserved], a_i and b_i^2 are the mean and variance for the i-th bar of the reference composition and of an individual, respectively.
Continued... [symbol not preserved]*bl is the weighted factor, where bl is the total number of bad tones. Total fitness = f1 + f2 + f3. The lower the overall fitness, the higher the quality of the composition.
GENETIC OPERATORS Three types of mutation are used. Changing a tone by an octave: adjusts the tones so that the interval is not large (usually the same octave). Changing one tone: changes one tone in an individual. Swapping two consecutive tones: randomly swaps a note with its neighbor. No crossover is used because the compositions are short.
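The three mutations could be sketched as follows on the composition array (0 for a break, 1..n for a tone, n+1 for continuation). The function names, the `per_octave` parameter (7 scale steps, assuming pitches are numbered along a diatonic scale) and the rule for when an octave shift applies are assumptions made for this example.

```python
import random


def tone_positions(individual, n):
    """Indices of cells that hold a tone (not a break or continuation)."""
    return [i for i, c in enumerate(individual) if 1 <= c <= n]


def change_octave(individual, n, rng, per_octave=7):
    """Move one tone by an octave toward the tone before it, if that
    shrinks the interval and stays within 1..n."""
    pos = tone_positions(individual, n)
    candidates = []
    for prev, cur in zip(pos, pos[1:]):
        diff = individual[cur] - individual[prev]
        step = -per_octave if diff > 0 else per_octave
        shifted = individual[cur] + step
        if 2 * abs(diff) > per_octave and 1 <= shifted <= n:
            candidates.append((cur, shifted))
    out = list(individual)
    if candidates:
        index, value = rng.choice(candidates)
        out[index] = value
    return tuple(out)


def change_one_tone(individual, n, rng):
    """Replace one tone with a random pitch in 1..n."""
    pos = tone_positions(individual, n)
    out = list(individual)
    if pos:
        out[rng.choice(pos)] = rng.randint(1, n)
    return tuple(out)


def swap_consecutive(individual, n, rng):
    """Swap the pitches of two consecutive tones; the rhythm is untouched."""
    pos = tone_positions(individual, n)
    out = list(individual)
    if len(pos) >= 2:
        k = rng.randrange(len(pos) - 1)
        i, j = pos[k], pos[k + 1]
        out[i], out[j] = out[j], out[i]
    return tuple(out)


rng = random.Random(1)
individual = (1, 16, 12, 0, 5, 16, 6, 3)    # n = 15, so 16 is continuation
shifted = change_octave(individual, 15, rng)
changed = change_one_tone(individual, 15, rng)
swapped = swap_consecutive(individual, 15, rng)
```

All three operators touch only tone cells, so the length and rhythm of the individual stay the same as in the reference.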
LIMITATIONS Not good for long compositions. Out-of-structure compositions are not produced. Odd or complicated time signatures and tempo variation are not handled. The duration of the composition stays the same as that of the reference input throughout the algorithm, as no insertion or deletion of tones is performed. Emotional expression is not considered.
CONCLUSIONS • Computational music and AI can provide assistance to musicians. • Machine improvisation encourages musical creativity. • Computers replacing musicians is still a long way off (challenges discussed earlier). • We still need satisfactory algorithms involving more than one model of generation.