440 likes | 655 Views
An Algebraic Theory of Polymorphic Temporal Media. Paul Hudak Yale University Department of Computer Science PADL Symposium June 18, 2004. Motivation. Previous work on: Haskore : a library for computer music composition. Fran : a language for functional reactive animation.
E N D
An Algebraic Theory ofPolymorphic Temporal Media Paul Hudak Yale University Department of Computer Science PADL Symposium June 18, 2004
Motivation • Previous work on: • Haskore: a library for computer music composition. • Fran: a language for functional reactive animation. • Dance: a language for humanoid robots. has revealed striking similarities at the highest level of expression. • In particular, notions of: • Sequential composition • Parallel composition • Absence of value • Temporal properties (duration, etc.) • Map- and fold-like operations (scaling, transposing, etc.) • Questions: • Can these notions be captured in a single unified framework? • How do we give meaning to these structures? • How do we manipulate and reason about them?
Outline • Polymorphic media • Syntactic (structural) operations and properties (map, fold, etc.) • Temporal operations and properties (duration, take, drop, etc.) • Semantic operations and properties (sequential and parallel composition) • Axiomatic semantics (with soundness and completeness results) Haskell code is used throughout,with running examples from music and animation.
The Nature of this Talk • Everything is fairly simple. • But it uses lots of ideasfrom PL research. • Therefore it’s good pedagogy. • Not entirely sure if it’s practical…but hopefully it’sfun!
Haskell Types and Classes • Polymorphic data type (new type):data List a = Nil | Cons a (List a) • Type synonym (new name for existing type):type IntList = List Int • Type class:class Eq a where (==) :: a -> a -> Bool • Type class instance with “context”:instance Eq Int => Eq IntList where Nil == Nil = True Cons x xs == Cons y ys = x==y && xs==ys • Class “laws”:a==a a==b = b==a a==b && b==c a==c
Polymorphic Media • Define an algebraic data type:data Media a = Prim a -- base media value | Media a :+: Media a -- sequential composition | Media a :=: Media a -- parallel composition(later we will define a way to express the absence of media) • We refer to T in Media T as the base media type. • So: • Prim x is a media value from the base media type. • m1 :+: m2 is media value m1 followed in time by m2. • m1 :=: m2 is media value m1 occurring simultaneously with m2.
Example 1: Music • For music media, Note is the base media type:type Music = Media Notedata Note = Rest Dur | Note Pitch Durtype Dur = Realtype Pitch = (NoteName, Octave)type Octave = Intdata NoteName = Cf | C | Cs | Df | D | Ds | Ef | E | Es | Ff | F | Fs | Gf | G | Gs | Af | A | As | Bf | B | Bs • For example:let dMinor = Note (D,3) 1 :=: Note (F,3) 1 :=: Note (A,3) 1 gMajor = Note (G,3) 1 :=: Note (B,3) 1 :=: Note (D,4) 1 cMajor = Note (C,3) 2 :=: Note (E,3) 2 :=: Note (G,3) 2 in dMinor :+: gMajor :+: cMajoris a ii-V-I progression in C major.
In Contrast: Haskore type Pitch = (PitchClass, Octave) data PitchClass = Cf | C | Cs | Df | D | Ds | Ef | E | Es | Ff | F | Fs | Gf | G | Gs | Af | A | As | Bf | B | Bs type Octave = Int data Music = Note Pitch Dur [NoteAttribute] -- a note \ atomic | Rest Dur -- a rest / objects | Music :+: Music -- sequential composition | Music :=: Music -- parallel composition | Tempo (Ratio Int) Music -- tempo scaling | Trans Int Music -- transposition | Instr IName Music -- instrument label | Player PName Music -- player label | Phrase [PhraseAttribute] Music -- phrasing attributes type Dur = Ratio Int -- in whole notes type IName = String type PName = String
data NoteAttribute = Volume Float -- convention: 0=min, 100=max | Fingering Int | Dynamics String | PFields [Float] data PhraseAttribute = Dyn Dynamic | Art Articulation | Orn Ornament data Dynamic = Accent Float | Crescendo Float | Diminuendo Float | PPP | PP | P | MP | SF | MF | NF | FF | FFF | Loudness Float | Ritardando Float | Accelerando Float data Articulation = Staccato Float | Legato Float | Slurred Float | Tenuto | Marcato | Pedal | Fermata | FermataDown | Breath | DownBow | UpBow | Harmonic | Pizzicato | LeftPizz | BartokPizz | Swell | Wedge | Thumb | Stopped data Ornament = Trill | Mordent | InvMordent | DoubleMordent | Turn | TrilledTurn | ShortTrill | Arpeggio | ArpeggioUp | ArpeggioDown | Instruction String | Head NoteHead data NoteHead = DiamondHead | SquareHead | XHead | TriangleHead | TremoloHead | SlashHead | ArtHarmonic | NoHead
Example 2: Animation • For animation media, Anim is the base media type:type Animation = Media Animtype Anim = (Dur, Time -> Picture) type Time = Real type Dur = Real data Picture = EmptyPic | Circle Radius Point | Square Length Point | Polygon [Point] Point • For example:let ball1 = (10, \t -> Circle t origin) ball2 = (10, \t -> Circle (10-t) origin box = (20, const (Square 1 (1,1)) in (ball1 :+: ball2) :=: boxis a ball that first grows for 10 seconds and then shrinks, next to a stationary box.
Syntactic Operations • Syntactic operations depend only on the “syntax”, or “structure”, of polymorphic Media values. • For example:instance Functor Media where fmap f (Prim n) = Prim (f n) fmap f (m1 :+: m2) = fmap f m1 :+: fmap f m2 fmap f (m1 :=: m2) = fmap f m1 :=: fmap f m2 • This instance obeys the standard laws of the Functor class; namely:fmap (f . g) = fmap f . fmap g fmap id = id
Example • A function to scale the tempo of a Music value:tempo :: Dur -> Music -> Musictempo r = fmap temp where temp (Rest d) = Rest (r*d) temp (Note p d) = Note p (r*d) • A function to transpose a Music value by a given interval:trans :: Int -> Music -> Musictrans i = fmap tran where tran (Rest d) = Rest d tran (Note p d) = Note (transPitch i p) d • Using Functor class laws, it is straightforward to show that:tempo r1 . tempo r2 = tempo (r1*r2) trans i1 . trans i2 = trans (i1+i2) tempo r1 . tempo r2 = tempo r2 . tempo r1 trans i1 . trans i2 = trans i2 . trans i1 tempo r1 . trans i1 = trans i1 . tempo r1 • Similarly, we can define functions to scalean animation in size, or translate it in 2D space.
Catamorphism • We can also define a fold-like function:foldM :: (a->b) -> (b->b->b) -> (b->b->b) -> Media a -> bfoldM f g h (Prim x) = f xfoldM f g h (m1 :+: m2) = foldM f g h m1 `g` foldM f g h m2foldM f g h (m1 :=: m2) = foldM f g h m1 `h` foldM f g h m2 • For which the following laws hold:foldM (Prim . f) (:+:) (:=:) = fmap ffoldM Prim (:+:) (:=:) = id • As well as this fusion law:k . foldM f g h = foldM f’ g’ h’if the following equalities hold:f’ x = k (f x) g’ (k x) (k y) = k (g x y) h’ (k x) (k y) = k (h x y) • Several examples of catamorphisms are forthcoming.
Reversing a Media Value • We can reverse, in time, a Media value if we can reverse the base media type. We enforce this using type classes:class Reverse a where reverseM :: a -> ainstance Reverse a => Reverse (Media a) where reverseM (Prim a) = Prim (reverseM a) reverseM (m1 :+: m2) = reverseM m2 :+: reverseM m1 reverseM (m1 :=: m2) = reverseM m1 :=: reverseM m2 • But note that reverseM can be defined more succinctly as a catamorphism:instance Reverse a => Reverse (Media a) where reverseM = foldM (Prim . reverseM) (flip (:+:)) (:=:)
Laws Involving reverseM • Theorem: For finite m, if the following holds for reverseM :: T -> T, then it also holds for reverseM :: Media T -> Media T:reverseM (reverseM m) = m • Theorem: For any f :: T -> T, if f . reverseM = reverseM . f, then:fmap f . reverseM = reverseM . fmap f • Theorem: For all finite m :: Media T, functions g, h ::T -> T -> T, and f, f' :: T -> T such that f = f' . reverseM: foldM f g h m = foldM f' (flip g) h (reverse m)
Inductionless Proof Prove:reverse (reverse m) = m Inductionless proof, using fusion law: Let k = reverseM: (reverseM . reverseM) m= (k . foldM (Prim . k) (flip (:+:)) (:=:)) m fusion law= foldM Prim (:+:) (:=:) m fold law= m Justification for use of fusion law: Prim x assumption= Prim (k (k x)) fold k= k (Prim (k x)) fold (.)= k ((Prim . k) x) (:+:) (k x) (k y) fold k= k (y :+: x) fold flip= k (flip (:+:) x y) (:=:) (k x) (k y) unfold k= k (x :=: y)
Example 1: Music • We declare Note to be an instance of class Reverse:instance Reverse Note where reverseM = id(i.e. a note is the same played backwards or forwards) • The constraints in the previous laws are thus satisfied. • Furthermore, we have this corollary to the second law:reverseM . tempo r = tempo r . reverseM reverseM . trans i = trans i . reverseM • And this corollary to the third:foldM f g h m = foldM f (flip g) h (reverse m) • [Note: The reverse of a musical passage is called its retrograde. E.g.: J.S. Bach's “Crab Canons” and Franz Joseph Haydn's Piano Sonata No. 26 in A Major (Menueto al Rovescio). It is also a standard construction in modern twelve-tone music.]
Example 2: Animation • We declare Anim to be an instance of class Reverse:instance Reverse Animation where reverseM (d, f) = (d, \t -> f (d-t)) • It is easy to show that:reverseM (reverseM (d, f)) = (d, f) • Therefore the constraints are satisfied, and the laws hold for continuous animations. • Furthermore, we have this corollary:reverseM . scale s d = scale s d . reverseM
Temporal Properties • So far, all operations have been structural (even reverseM, which purportedly also reverses time). • Let’s now look at temporal properties that depend directly on time, and in particular on the duration of a media value. • Define:class Temporal a where dur :: a -> Dur none :: Dur -> a instance Temporal a => Temporal (Media a) where dur = foldM dur (+) max none = Prim . none • Intuitively,dur mis the duration ofm :: Media T, andnone d :: Media T is an “empty” media value with duration d.
The Intended Semantics of (:=:) • In Haskore, the arguments to (:=:) are left-aligned:m1 :=: m2 • In a recent paper, they are centered symmetrically: m1 :=: m2 • In the current treatment, they must have equal duration:m1 :=: m2 (This results in no loss of generality.) m1 m2 m1 m2 m1 m2
A Definition and an Example • Definition: A well-formed temporal media value m :: Media T is one for which each parallel composition m1 :=: m2 has the property that dur m1 = dur m2. • Example 1: We declare Note to be Temporal:instance Temporal Note where dur (Rest d) = d dur (Note p d) = d none d = Rest dThus dur (Note p1 d1 :+: Note p2 d2), for example, is d1+d2. • Example 2: We declare Anim to be Temporal:instance Temporal Anim where dur (d, f) = d none d = (d, const EmptyPic)
Take and Drop for Media • Analogous to take and drop on lists, except indexed by time:class Take a where takeM :: Dur -> a -> a dropM :: Dur -> a -> ainstance (Take a, Temporal a) => Take (Media a) where takeM d m | d <= 0 = none 0 takeM d (Prim x) = Prim (takeM d x) takeM d (m1 :+: m2) = let d1 = dur m1 in if d <= d1 then takeM d m1 else m1 :+: takeM (d-d1) m2 takeM d (m1 :=: m2) = takeM d m1 :=: takeM d m2 -- note: well-formed media dropM d m = . . .
Laws for TakeM and DropM • The following laws are analogous to ones for lists:For all non-negative d1, d2 :: Dur, if the following laws hold fortakeM, dropM :: Dur -> T -> T, then they also hold fortakeM, dropM :: Dur -> Media T -> Media T:takeM d1 . takeM d2 = takeM (min d1 d2) dropM d1 . dropM d2 = dropM (d1+d2) takeM d1 . dropM d2 = dropM d2 . takeM (d1+d2) dropM d1 . takeM d2 = takeM (d2-d1) . dropM d -- if d2>=d1 • But the following law does not hold:For all finite well-formed m :: Media a and non-negative d :: Dur <= dur m,if the following law holds for takeM, dropM :: Dur -> T -> T, then it also holds for takeM, dropM :: Dur -> Media T -> Media T:takeM d m :+: dropM d m = m(the constraint on the base media type cannot be satisfied)
Example • We declare Note to be an instance of Take:instance Take Note where takeM d1 (Rest d2) = Rest (min d1 d2) takeM d1 (Note p d2) = Note p (min d1 d2) dropM d1 (Rest d2) = Rest (max 0 (d2-d1)) dropM d1 (Note p d2) = Note p (max 0 (d2-d1)) • The constraints in the first four previous laws hold for this instance, and thus they hold for Music values. • But note that Note p 1 :+: Note p 1 /= Note p 2, and thus the last law on the previous slide does not hold. • An example using Animations can be constructed analogously.
Semantics • Consider these two expressions:m1 :+: (m2 :+: m3) (m1 :+: m2) :+: m3 • Intuition tells us that these represent the same media value; i.e. (:+:) should be associative. There are in fact several other examples of this. • What we need is an interpretation of media values that somehow gives meaning to them. • And we wish to do this in a polymorphic way.
The Meaning of Media • We use type classes to structure meanings:class Combine b where concatM :: b -> b -> b merge :: b -> b -> b zero :: Dur -> bclass Combine b => Meaning a b where meaning :: a -> binstance Meaning a b => Meaning (Media a) b where meaning = foldM meaning concatM merge • Intuitively, an instance Meaning T1 T2 means that T1 can be given meaning in terms of T2.
Laws • We require valid instances of Combine to obey these laws:b1 `concatM` (b2 `concatM` b3) = (b1 `concatM` b2) `concatM` b3b1 `merge` (b2 `merge` b3) = (b1 `merge` b2) `merge` b3b1 `merge` b2 = b2 `merge` b1zero 0 `concatM` b = bb `concatM` zero 0 = bzero d1 `concatM` zero d2 = zero (d1+d2)zero d `merge` b = b, if d = dur b(b1 `concatM` b2) `merge` (b3 `concatM` b4) = (b1 `merge` b3) `concatM` (b2 `merge` b4), if dur b1 = dur b3 and dur b2 = dur b4 • Laws for class Meaning:meaning . none = zerodur . meaning = dur
Semantic Equivalence • Definition: m1, m2 :: Media T are equivalent, written m1 === m2, if and only if meaning m1 = meaning m2. • Example: We take the meaning of music to be a pair: the duration, and a sequence of events, where each event marks the start-time, pitch, and duration of a single note:data Event = Event Time Pitch Dur type Time = Ratio Int type Performance = (Dur, [Event]) • This corresponds well to low-level representations of music such as Midi and Csound.
Example, cont’d • Three instance declarations complete the meaning of music:instance Combine Performance where concatM (d1, evs1) (d2, evs2) = (d1 + d2, evs1 ++ map shift evs2) where shift (Event t p d) = Event (t+d1) p d merge (d1, evs1) (d2, evs2) = (d1 `max` d2, sort (evs1 ++ evs2)) zero d = (d, [ ])instance Temporal Performance where dur (d, _) = d none = zeroinstance Meaning Note Performance where meaning (Rest d) = (d, [ ]) meaning (Note p d) = (d, [Event 0 p d])
The Structure of Meaning • Theorem: The following diagram commutes: meaning <Media T,:+:,:=:> <I,concatM,merge> h h-1 g <Media T/(===),:+:,:=:>
An Axiomatic Semantics • Define A to be the axiomatic semantics given by the following nine axioms:(1) associativity of (:+:)m1 :+: (m2 :+: m3) === (m1 :+: m2) :+: m3(2) associative of (:=:)m1 :=: (m2 :=: m3) === (m1 :=: m2) :=: m3(3) commutativity of (:=:) m1 :=: m2 === m2 :=: m1(4) left (sequential) zero none 0 :+: m === m(5) right (sequential) zero m :+: none 0 === m(6) left (parallel) zero none d :=: m === m, if d = dur m(7) right (parallel) zero m :=: none d === m, if d = dur m(8) additivity of nonenone d1 :+: none d2 === none (d1+d2)(9) serial/parallel axiom:(m1 :+: m2) :=: (m3 :+: m4) === (m1 :=: m3) :+: (m2 :=: m4), if dur m1 = dur m3 and dur m2 = dur m4plus the reflexive, symmetric, and transitive axioms implied by (===) being an equivalence relation, and the substitution axioms implied by (===) being a congruence relation.
m1 m2 m1 m2 m3 m4 m3 m4 The Serial/Parallel Axiom • Suppose dur m1 = dur m3anddur m2 = dur m4. • Then, intuitively, these two phrases should be equivalent:(m1 :+: m2) :=: (m3 :+: m4) (m1 :=: m3) :+: (m2 :=: m4) • Or, graphically: • This is a critical axiom to many proofs. ===
Example • Theorem: For all finite x :: T and non-negative d :: Dur <= dur m, iftakeM d x :+: dropM d x === xthen for all finite well-formed m :: Media T,takeM d m :+: dropM d m === m • Proof (partial): By structural induction.Base case: Trivially true from the assumption.Induction step:takeM d (m1 :=: m2) :+: dropM d (m1 :=: m2) -- unfold takeM and dropM= (takeM d m1 :=: takeM d m2) :+: (dropM d m1 :=: dropM d m2)serial/parallel axiom= (takeM d m1 :+: dropM d m1) :=: (takeM d m2 :+: dropM d m2) induction hypothesis= m1 :=: m2
Soundness • We write “A|- m1 = m2” iff m1 === m2 is provable from the axioms in A. • Theorem: The axiomatic semantics A is sound. That is, for all m1, m2 :: Media T: A |- m1 = m2 m1 === m2Proof: By induction on the derivation, and validity of the axioms.
Completeness • In what sense are the axioms complete? That is, if two media values are equivalent, can we always prove it from the axioms? • The answer is “yes, if…” • Definition: A well-formed media term m :: Media T is in normal form iff it is of the form:none d, d >=0--- or ---(none d11 :+: Prim x1 :+: none d12) :=:(none d21 :+: Prim x2 :+: none d22) :=: . . .(none dn1 :+: Prim xn :+: none dn2), n >= 1, where for all(1 <= i <= n), di1 + di2 + dur xi = dur m, and for all(1 <= i < n), (di1,xi,di2) <= d(i+1)1, xi+1, d(i+1)2 • We denote the set of media normal forms as MediaNF T.
Normalization • Theorem: Any m : Media T can be transformed into a media normal-form using only the axioms of A. • Proof: Define a normalization function:normalize :: Media a -> Media aand establish it’s validity using only the axioms of A.
Completeness, cont’d • Theorem: The axiomatic semantics A is complete, that is, for all m1, m2 :: Media T:m1 === m2 A -| m1 = m2if and only if the normal forms in MediaNF T are unique. • The “if and only if” means that our design of the normal forms is rather special.
Example • Elements of MusicNF = MediaNF Note are unique. • To see why, note that each normal form m:(none d11 :+: Prim x1 :+: none d12) :=: (none d21 :+: Prim x2 :+: none d22) :=: . . . (none dn1 :+: Prim xn :+: none dn2)corresponds uniquely to an interpretation:(dur m, [ Event d11 p1 (dur x1), Event d21 p2 (dur x2), . . . Event dn1 pn (dur xn) ]) • This correspondence is invertible, and therefore a bijection, because each di2 is computable from the other durations;i.e. di2 = dur m - di1 – dur xi.
Example • Elements of AnimationNF = MediaNF Anim are not unique. • There are two problems: • There are more equivalences:ball :=: ball === balltakeM d (Prim x) :+: dropM d (Prim x) === Prim xBoth of these imply more equivalences than the axioms alone can establish.Solution:add “domain-specific” axioms to regain completeness. • We have assumed commutativity of (:=:), but this is unlikely to be true for most graphics/animation systems.Solution: devise a non-commutative semantics.
Final Thoughts • Details of domain-specific axioms. • Details of non-commutative semantics. • What about infinite media values?Which laws still hold? • Can we make a unified programming language for multimedia? • Other concrete domains (e.g. robot control language). • What about reactivity? (see Yampa for a start) • haskell.org/yale • haskell.org/yampa
Algebraic Structure • An algebra <S,op1,op2,...> consists of a non-empty carrier set (or sort) S together with one or more n-ary operations op1, op2, ..., on that set. • The algebra of well-formed temporal media over type T is denoted by <Media T,:+:,:=:> • The algebra of interpretations in terms of type I is denoted by <I,concatM,merge> • Theorem: The semantic function meaning is a homomorphism from <Media T,:+:,:=:> to <I,concatM,merge,zero>.
Algebraic Structure, cont’d • Theorem: (===) is a congruence relation on <Media,:+:,:=:>. • Definition: Let [[m]] denote the equivalence class (induced by (===)) that contains m. Let Media/(===) denote the quotient set of such equivalence classes, and let <Media/(===),:+:,:=:> denote the quotient algebra (also called the initial algebra). The function:g :: Media -> Media/(===) g m = [[m]]is called the natural homomorphism from <Media,:+:,:=:> to <Media/(===),:+:,:=:>. Also define:h :: Media/(===) -> I h [[m]] = meaning mwhich is an isomorphism, whose inverse is: h-1 p = [[m]], if p = meaning m
Normalization • Theorem: Any m : Media T can be transformed into a media normal-form using only the axioms of A. • Proof: Define this normalization function:normalize :: (Ord (Media a), Temporal a) => Media a -> Media anormalize m = sortM (norm (dur m) 0 m)norm :: (Ord (Media a), Temporal a) => Dur -> Time -> Media a -> Media anorm d t m | isNone m = mnorm d t (Prim x) = none t :+: Prim x :+: none (d-t-dur x)norm d t (m1 :+: m2) = norm d t m1 :=: norm d (t+dur m1) m2norm d t (m1 :=: m2) = norm d t m1 :=: norm d t m2and establish it’s validity using only the axioms of A.In particular:Lemma: norm (dur m) 0 m === mor more generally: norm d t m === none t :+: m :+: none (d-t-dur m)
Completeness, cont’d • Theorem: The axiomatic semantics A is complete, that is, for all m1, m2 :: Media T:m1 === m2 A -| m1 = m2if and only if the normal forms in MediaNF T are unique. • Proof (reverse direction): • If m1 === m2, then p = meaning m1 = meaning m2. • Let n1 = normalize m1 and n2 = normalize m2. Then:A -| n1 = m1 and A -| n2 = m2. • Thus: meaning n1 = meaning m1 = p = meaning m2 = meaning n2. • But there is an isomorphism between Media/(===) and I. Therefore p corresponds uniquely to a normal form h-1 p. • This implies that n1 = h-1 p = n2, and thusA -| m1 = m2. • Proof in forward direction is by contradiction.