HasSound: Generating Musical Instrument Sounds in Haskell

Paul Hudak
Yale University, Department of Computer Science
Joint work with: Matt Zamec (Yale ’00), Sarah Eisenstat (Brown ’08)
NEPLS, Brown University, October 27, 2005

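Haskore/HasSound Pictorial
[Diagram: Haskore produces either a MIDI file (note level, small) or a csound score file (note level, small). HasSound produces a csound orchestra file (signal-processing description, small). csound combines the orchestra and score files into a wav/snd/mp3 file (signal sample level, large), which a sound file player renders by D/A conversion. The MIDI file is played by a MIDI file player, which uses the synthesizer on the sound card.]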

Background
• Haskore is a Haskell library for describing music at a high level of abstraction.
• MIDI is a low-level representation of music at the note level.
• csound is a low-level DSL both for describing music at the note level and for defining instruments that generate sounds in a signal-processing framework.
• The Haskore implementation translates high-level descriptions into either MIDI or (the note-level version of) csound. (MIDI instruments are pre-defined.)
• What’s missing is a way to define instruments in Haskell.
• HasSound fills this gap. The HasSound implementation translates high-level sound descriptions into (the sound-generating version of) csound.

Haskore
• The primitive element in Haskore is a note (all higher-level concepts are ignored here).
• A note consists of a pitch, a duration, and one or more instrument-dependent parameters, or “p-fields” (which encode things like volume, vibrato, tremolo, reverb, etc.).
• The instrument must know how to handle these:
  • In MIDI, only volume is allowed.
  • But in csound, the instrument is user-defined, so any number of p-fields can be accommodated.

HasSound
• A HasSound program describes an orchestra.
• An orchestra consists of a header and a collection of instruments.
• Each instrument consists of:
  1. An instrument number.
  2. A note extension.
  3. A signal expression describing the output of the instrument in terms of pitch, duration, and p-fields.
  4. A list of global signals to which values (expressed as signal expressions) are communicated.
• The key concept is (3) above.

Example 1
Monophonic output of a 440 Hz tone using wavetable 1 (a sine wave) at amplitude 10,000, sampled at the audio rate:
    mono (oscil ar 10000 440 1)
OK, so that’s not very interesting…
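
To make concrete what a table-lookup oscillator such as `oscil` computes, here is a rough Haskell model. It is only a sketch under stated assumptions: the names `sampleRate`, `sineTable`, and `oscilModel` are hypothetical, the 44100 Hz rate is assumed, and no interpolation is done. It is not HasSound or csound code.

```haskell
-- Illustrative model only; csound's real oscil reads a stored function table.
sampleRate :: Double
sampleRate = 44100  -- assumed audio rate

-- One cycle of a sine wave, stored as a table of n points.
sineTable :: Int -> [Double]
sineTable n = [ sin (2 * pi * fromIntegral i / fromIntegral n) | i <- [0 .. n - 1] ]

-- Table-lookup oscillator without interpolation: the read position advances
-- by freq * n / sampleRate table entries per sample and wraps around.
oscilModel :: Double -> Double -> [Double] -> [Double]
oscilModel amp freq table = [ amp * (table !! idx k) | k <- [0 ..] ]
  where
    n = length table
    idx :: Int -> Int
    idx k = floor (fromIntegral k * freq * fromIntegral n / sampleRate) `mod` n
```

For instance, `take 5 (oscilModel 10000 440 (sineTable 4096))` gives the first five samples of the tone described on the slide.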

Score Files and Orchestra Files
Score file (.sco):
    f1 0 4096 10 1
    i2 0 1 2000 880
    i2 2 1 4000 440
    i2 3 2 8000 220
    i2 4 1 16000 110
    i2 6 1 32000 150
    e
The fields of each i-statement are, in order: instrument number (2), start time, duration, and p-fields 4 and 5 (p4 = volume, p5 = frequency).
Orchestra file (.orc):
    <header>
    instr 1
    …
    endin
    …
    instr 2
    a1 oscil p4, p5, 1
    out a1
    endin
    …
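
The mapping from note-level data to score lines can be sketched in a few lines of Haskell. The `Event` record and the functions `showNum` and `scoreLine` are hypothetical names chosen for this illustration; Haskore's actual score generator is not shown in the slides.

```haskell
-- Hypothetical note-event record; field names are for illustration only.
data Event = Event
  { evInstr :: Int       -- instrument number (p1)
  , evStart :: Double    -- start time in seconds (p2)
  , evDur   :: Double    -- duration in seconds (p3)
  , evArgs  :: [Double]  -- remaining p-fields (p4, p5, ...)
  }

-- Print whole numbers without a trailing ".0", as in the score above.
showNum :: Double -> String
showNum x
  | x == fromIntegral r = show r
  | otherwise           = show x
  where
    r = round x :: Integer

-- Render one event as a csound i-statement.
scoreLine :: Event -> String
scoreLine (Event i s d ps) =
  unwords (("i" ++ show i) : map showNum (s : d : ps))
```

With these definitions, `scoreLine (Event 2 0 1 [2000, 880])` yields the first i-statement of the score file above.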

A Bigger Example
This example chooses one of four waveforms (sine, sawtooth, square, or pulse), and adds chorusing (detuned signals), vibrato (frequency modulation), and various kinds of envelopes.
• Let’s set up the p-fields as follows:
    p3 = duration     p6 = attack time     p9  = vibrato depth
    p4 = amplitude    p7 = release time    p10 = vibrato delay (0-1)
    p5 = pitch        p8 = vibrato rate    p11 = waveform selection
    let irel  = 0.01                               -- vibrato release
        idel1 = p3 * p10                           -- initial delay
        isus  = p3 - idel1 - irel                  -- sustain time
        iamp  = ampdb p4                           -- amplitude
        inote = cpspch p5                          -- frequency
        k3 = linseg 0 idel1 p9 isus p9 irel 0      -- envelope
        k2 = oscil k3 p8 1                         -- vibrato signal
        k1 = linen (iamp/3) p6 p3 p7               -- envelope
        a3 = oscil k1 (inote*0.999+k2) p11         -- chorusing signal
        a2 = oscil k1 (inote*1.001+k2) p11         -- chorusing signal
        a1 = oscil k1 (inote+k2) p11               -- main signal
    in mono (a1+a2+a3)                             -- result
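
The unit conversions and the vibrato envelope used in this instrument can be modelled directly as ordinary Haskell functions. The sketch below is an approximation for illustration: the names `ampdbModel`, `cpspchModel`, and `linsegModel` are hypothetical, middle C is taken as 261.6256 Hz at pitch 8.00, and the envelope is treated as a function of time `t >= 0` that holds its last value.

```haskell
-- Decibels to linear amplitude: 60 dB gives 1000, 80 dB gives 10000.
ampdbModel :: Double -> Double
ampdbModel db = 10 ** (db / 20)

-- Octave.pitch-class notation to Hz: 8.00 is middle C, 8.09 is the A above it.
cpspchModel :: Double -> Double
cpspchModel pch = middleC * 2 ** (fromIntegral oct - 8 + semis / 12)
  where
    oct     = floor pch :: Int
    semis   = (pch - fromIntegral oct) * 100  -- two decimal digits = semitones
    middleC = 261.6256                        -- assumed reference frequency

-- Piecewise-linear envelope: a start value followed by (duration, target)
-- segments. After the last segment the final value is held.
linsegModel :: Double -> [(Double, Double)] -> Double -> Double
linsegModel v0 [] _ = v0
linsegModel v0 ((d, v1) : rest) t
  | t < d     = v0 + (v1 - v0) * t / d
  | otherwise = linsegModel v1 rest (t - d)
```

In these terms, `k3` above corresponds to `linsegModel 0 [(idel1, p9), (isus, p9), (irel, 0)]`: silence ramps up to the vibrato depth during the initial delay, stays there for the sustain time, and drops to zero over `irel` seconds.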

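Executable Specification
    let …
        inote = cpspch p5
        k2 = oscil k3 p8 1
        k1 = linen (iamp/3) p6 p3 p7
        a3 = oscil k1 (inote*0.999+k2) p11
        a2 = oscil k1 (inote*1.001+k2) p11
        a1 = oscil k1 (inote+k2) p11
    in mono (a1+a2+a3)
[Signal-flow diagram of the same instrument: linseg produces k3, which together with p8 drives an oscil on table 1 to give k2; p5 passes through cpspch to give inote; linen produces k1; inote is scaled by 1.001 and by .999, k2 is added to each of the three frequencies, and three oscils on table p11 with amplitude k1 produce a1, a2, and a3, which are summed.]
“A picture is worth a thousand lines of code…”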

csound Orchestra Code
The previous example is equivalent to the following csound code:
    instr 6
      irel  = .01
      idel1 = p3 * p10
      isus  = p3 - (idel1 + irel)
      iamp  = ampdb(p4)
      inote = cpspch(p5)
      k3 linseg 0, idel1, p9, isus, p9, irel, 0
      k2 oscil k3, p8, 1
      k1 linen iamp/3, p6, p3, p7
      a3 oscil k1, inote*.999+k2, p11
      a2 oscil k1, inote*1.001+k2, p11
      a1 oscil k1, inote+k2, p11
      out a1+a2+a3
    endin
• This is not bad!
• But there are some funny things going on:
  • sometimes there is an “equal” sign (=), other times not
  • “out” looks like a variable, not a function
  • what happens when the order of the “statements” is changed?

Motivation and Goals
• HasSound allows the Haskore user to define instruments without leaving the “convenience” of Haskell.
• HasSound provides a simple framework for algorithmic instrument synthesis.
• HasSound is:
  • as expressive as csound.
  • purely functional (thus the “value added” is the power of functional programming).
  • compilable into csound (for efficiency). (This is a big constraint!)

Problems
• There are several problems that arise in meeting these goals:
  • Global variables in csound.
  • Delay lines in csound.
  • Imperative glue in csound.
  • Recursive signals (not allowed in csound).
  • Csound is enormous (the manual is 1200 pages long!)
• I will talk about some of these…

Global Variables
• Suppose we want reverb that lasts past the end of a note.
• One way to do this is to use global variables to “communicate” to another instrument that is “always on”:
    garvb init 0                    ; initialize global (in header)
    instr 9
      …                             ; instrument using reverb
      out a1                        ; normal output
      garvb = garvb + a1 * rvbgain  ; add reverb to global variable
    endin
    instr 99                        ; global reverb instrument
      asig reverb garvb, p4         ; compute reverb
      out asig
      garvb = 0                     ; then clear global!
    endin

Globals in HasSound
• Global variables are replaced by global signals in HasSound. In addition to the normal signal output, an instrument can attach signals to global signal names:
    data InstrBlock a = InstrBlock
           Instr                         -- instrument number
           SigExp                        -- note extension
           a                             -- normal output
                                         -- (i.e. Mono se, Stereo se1 se2, Quad … )
           [(GlobalSigName, SigExp)]     -- global signals
    data GlobalSig = Global
           (SigExp -> SigExp -> SigExp)  -- combining function
           Int                           -- unique identifier

Example in HasSound
So the previous example can be written in HasSound like this:
    let gsig = Global (+) 1
    in [ InstrBlock 9 0 (mono a1) [ (gsig, a1*rvbgain) ]
       , InstrBlock 99 0 (mono (reverb (read gsig) p4)) [ ] ]
Note: more than one “instance” of instrument 9 may be active at any given time.
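
The combining function matters because several instances of an instrument may write to the same global signal at once. One way to picture this, as a per-sample model that is purely illustrative, is a fold over the contributions. The function `combineGlobal` and its argument layout are hypothetical; the slides do not show how HasSound implements this.

```haskell
-- Hypothetical per-sample model of a global signal; not HasSound's code.
-- Each active instrument instance contributes one value per sample, and the
-- contributions are merged with the global's combining function.
combineGlobal :: (Double -> Double -> Double)  -- combining function, e.g. (+)
              -> Double                        -- initial value, e.g. 0
              -> [Double]                      -- contributions in this sample
              -> Double
combineGlobal f z = foldl f z
```

With `(+)` and an initial value of 0 this reproduces the accumulate-then-clear pattern of the csound version: three simultaneous notes contributing 1, 2, and 3 leave the reverb instrument reading 6.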

Imperative Glue
• Global variables in csound are not modular: if you combine different instrument definitions, global variable names may clash.
• So in HasSound we introduce a monad of global signal names at the top level to ensure modularity. For example:
    let a1 = oscI AR (tableNumber 1) 1000 440
        comp = do h <- mkSignal AR (+)
                  addInstr (InstrBlock 1 0 (Mono a1) [(h, a1)])
                  addInstr (InstrBlock 2 0 (Mono (readGlobal h)) [])
    in saveIA (mkOrc (44100, 4410) comp)
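
How a monad can guarantee clash-free names is easiest to see with a small name-supply monad. The following is a minimal sketch, not HasSound's implementation: `Fresh`, `GlobalName`, `freshGlobal`, and the two example definitions are all hypothetical, and only the counter-threading idea is taken from the slide.

```haskell
-- Minimal sketch of a name-supply monad; all names here are hypothetical.
newtype Fresh a = Fresh { runFresh :: Int -> (a, Int) }

instance Functor Fresh where
  fmap f (Fresh g) = Fresh $ \n -> let (a, n') = g n in (f a, n')

instance Applicative Fresh where
  pure a = Fresh $ \n -> (a, n)
  Fresh f <*> Fresh g = Fresh $ \n ->
    let (h, n')  = f n
        (a, n'') = g n'
    in (h a, n'')

instance Monad Fresh where
  Fresh g >>= k = Fresh $ \n -> let (a, n') = g n in runFresh (k a) n'

-- A global signal is identified only by the number it was allocated.
newtype GlobalName = GlobalName Int deriving (Eq, Show)

-- Allocate a name that no earlier call in the same computation has returned.
freshGlobal :: Fresh GlobalName
freshGlobal = Fresh $ \n -> (GlobalName n, n + 1)

-- Two independently written instrument definitions, each needing a global.
reverbPair, chorusPair :: Fresh GlobalName
reverbPair = freshGlobal
chorusPair = freshGlobal

-- Combining them cannot clash: the monad threads the counter through both.
combined :: (GlobalName, GlobalName)
combined = fst (runFresh ((,) <$> reverbPair <*> chorusPair) 1)
```

Neither definition mentions a concrete number, so the two can be written separately and still receive distinct names when sequenced in one computation.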

Delay Lines
• A delay line delays a signal by a given duration. It is useful for reverb, but also for certain percussive sounds.
• In csound, it is unnecessarily imperative:
    a1 delayr max       ; sets max delay
    …
    a2 deltapi atime1   ; taps delay line
    a3 deltapi atime2   ; another tap
    …
       delayw asource   ; input to delay line
• The effect is non-local, there cannot be more than one delay line in the same sequential fragment, and it is error-prone.

Delay Lines in HasSound
• In HasSound we provide equivalent power, but purely functionally:
    data DelayLine = DelayLine
           SigExp                       -- max delay time
           SigExp                       -- signal to be delayed
    data SigExp = …
           | Tap DelayLine [SigExp]     -- create multiple taps
           | Result DelayLine           -- output of delay line
• Thus delay lines are “first-class values”.
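
What “first-class” buys can be shown with a toy model in which a signal is a list of samples and a delay is counted in samples. This is a simplification made for illustration, with hypothetical names (`Signal`, `tap`, `taps`) and a deliberately simpler `DelayLine` than the HasSound type above, which works on signal expressions rather than sample lists.

```haskell
-- Toy model: a signal is a list of samples, a delay is measured in samples.
type Signal = [Double]

-- A delay line is just a value: its maximum length and its input signal.
data DelayLine = DelayLine Int Signal

-- Reading a tap prepends silence; taps beyond the maximum are clamped.
tap :: DelayLine -> Int -> Signal
tap (DelayLine maxLen src) n = replicate (max 0 (min n maxLen)) 0 ++ src

-- Several taps can read the same line, and several lines can coexist,
-- because nothing here depends on statement order.
taps :: DelayLine -> [Int] -> [Signal]
taps dl = map (tap dl)
```

Because a `DelayLine` is an ordinary value, it can be passed to functions, stored in lists, and tapped any number of times, none of which the `delayr`/`delayw` bracket in csound allows directly.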

Recursive Signals
• We would like to be able to write things such as:
    let x = delay (sig + 0.5 * x) 1.0 in …
• This looks like a finite loop. But as a data structure, it’s an infinite tree…
• We could design our own DSL, or require the user to “flag” each recursive reference.
• In HasSound, we introduce implicit looping via a fixed point operator:
    rec (\x -> delay (sig + 0.5 * x) 1.0)

Translating Loops
• Conceptually, this expression:
    x = delay (sig + 0.5 * x) 1.0
  must be translated into this csound code:
    ax init 0                          ; initialize ax
    …
    ax1 delay (sig + 0.5 * ax) 1.0     ; create new sig
    ax = ax1                           ; update old sig
    …
• This is done in two steps, starting from the “rec” form:
  1. Generate a unique variable name, and inject it into the functional:
         rec (\x -> delay (sig + 0.5 * x) 1.0)
     becomes
         Loop n (delay (sig + 0.5 * Var n) 1.0)   -- n unique
  2. For each Loop, generate init code and usage code as above.
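
The first translation step can be sketched on a tiny expression type. Everything below is an assumption made for illustration: `Sig` is a cut-down stand-in for HasSound's `SigExp`, and `unrec`, `bin`, and `render` are hypothetical helpers. Only the `Rec`, `Loop`, and `Var` idea comes from the slide.

```haskell
-- Tiny expression type for illustration; HasSound's SigExp is much larger.
data Sig
  = Const Double
  | Input String      -- a named source signal such as "sig"
  | Add Sig Sig
  | Mul Sig Sig
  | Delay Sig Double  -- delay a signal by a time in seconds
  | Rec (Sig -> Sig)  -- recursive signal, written with a function
  | Loop Int Sig      -- translated form: loop variable n and its body
  | Var Int           -- reference to loop variable n

-- Step 1: replace each Rec by a Loop with a unique number. The counter is
-- threaded by hand; the result is the rewritten term and the next free number.
unrec :: Int -> Sig -> (Sig, Int)
unrec n (Rec f)     = let (body, n') = unrec (n + 1) (f (Var n)) in (Loop n body, n')
unrec n (Add a b)   = bin Add n a b
unrec n (Mul a b)   = bin Mul n a b
unrec n (Delay a t) = let (a', n') = unrec n a in (Delay a' t, n')
unrec n (Loop m a)  = let (a', n') = unrec n a in (Loop m a', n')
unrec n e           = (e, n)

-- Helper for binary nodes: translate left, then right, threading the counter.
bin :: (Sig -> Sig -> Sig) -> Int -> Sig -> Sig -> (Sig, Int)
bin op n a b =
  let (a', n1) = unrec n a
      (b', n2) = unrec n1 b
  in (op a' b', n2)

-- Render a term so the result can be inspected (a Rec cannot be printed).
render :: Sig -> String
render (Const c)   = show c
render (Input s)   = s
render (Add a b)   = "(" ++ render a ++ " + " ++ render b ++ ")"
render (Mul a b)   = "(" ++ render a ++ " * " ++ render b ++ ")"
render (Delay a t) = "delay " ++ render a ++ " " ++ show t
render (Loop n a)  = "Loop " ++ show n ++ " (" ++ render a ++ ")"
render (Var n)     = "Var " ++ show n
render (Rec _)     = "<rec>"

-- The running example from the slide.
example :: Sig
example = Rec (\x -> Delay (Add (Input "sig") (Mul (Const 0.5) x)) 1.0)
```

Evaluating `render (fst (unrec 1 example))` gives `Loop 1 (delay (sig + (0.5 * Var 1)) 1.0)`. Applying the function to `Var n` is what turns the infinite tree into a finite term that a code generator can walk.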

Compiling Into csound
• All of the “functional” translations are straightforward.
• However, common subexpression elimination is critical.
• Delay lines are tedious but straightforward.
• Global variables require special care in order to ensure proper initialization and resetting.
• Recursive signals also require careful sequencing of code.
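
Common subexpression elimination can be illustrated by flattening an expression tree into a list of bindings and reusing a binding whenever the same opcode is applied to the same operands. The sketch below is one simple way to do this, not necessarily the approach HasSound takes. The types `Exp`, `Atom`, and `Binding` and the functions `flatten` and `flattenAll` are hypothetical, and it assumes every opcode is a pure function of its operands.

```haskell
-- Small expression tree: literals, p-fields, and opcode applications.
data Exp = Lit Double | PField Int | Op String [Exp] deriving (Eq, Show)

-- Flattened operand: a literal, a p-field, or a reference to a temporary.
data Atom = ALit Double | AP Int | ATemp Int deriving (Eq, Show)

-- One emitted binding: temporary index, opcode name, operands.
type Binding = (Int, String, [Atom])

-- Flatten an expression, reusing an existing binding when the same opcode
-- has already been applied to the same operands.
flatten :: [Binding] -> Exp -> ([Binding], Atom)
flatten bs (Lit x)     = (bs, ALit x)
flatten bs (PField n)  = (bs, AP n)
flatten bs (Op f args) =
  let (bs', atoms) = flattenAll bs args
  in case [ i | (i, g, ops) <- bs', g == f, ops == atoms ] of
       (i : _) -> (bs', ATemp i)
       []      -> let i = length bs' + 1
                  in (bs' ++ [(i, f, atoms)], ATemp i)

-- Flatten a list of operands left to right, threading the binding list.
flattenAll :: [Binding] -> [Exp] -> ([Binding], [Atom])
flattenAll bs []       = (bs, [])
flattenAll bs (e : es) =
  let (bs1, a)    = flatten bs e
      (bs2, rest) = flattenAll bs1 es
  in (bs2, a : rest)

-- Two oscillators sharing one envelope, in the style of the bigger example.
shared :: Exp
shared = Op "+" [a1, a2]
  where
    k1 = Op "linen" [PField 4, PField 6, PField 3, PField 7]
    a1 = Op "oscil" [k1, PField 5, PField 11]
    a2 = Op "oscil" [k1, Op "*" [PField 5, Lit 1.001], PField 11]
```

In `shared`, the envelope `k1` appears twice in the tree, because a Haskell `let` or `where` does not leave any trace of sharing in the data structure. Flattening it yields five bindings instead of six, with the single `linen` binding referenced by both oscillators.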

Csound is Enormous
• The csound manual is 1200 pages long.
• Chapter 15, “Orchestra Opcodes and Operators”, is 972 pages long!!
• We cannot hope to import everything, but it’s easy to add your favorite operation, as long as it’s functional…

Conclusions
• Certain kinds of imperative ideas can be redesigned, and others can be reengineered.
• Embedding a DSL in Haskell works well, but has some limitations (for example, recursion is not transparent).
• Future work:
  • Better (more type-safe, etc.) interface between instruments and the score.
  • Import more of csound.
  • Graphical interface.
