MRCPv2 – the end of proprietary speech APIs? Daniel C. Burnett SpeechTek West 2007
Overview
• What is MRCP?
• Why MRCP?
• Why MRCPv2?
• Why an IETF protocol?/Status
• Relationship to other standards
• Features of MRCP
• Sample call flow with ASR/TTS

What is MRCP?
• IETF protocol allowing a client to control the server’s ASR, TTS, Recording, and SIV resources
• A standard, programming-language agnostic API for using ASR, TTS, and SIV resources

Why MRCP?
• Pre-MRCP
  • Every ASR and TTS vendor has a proprietary API
  • Some vendors support Microsoft’s SAPI
  • Some vendors support JSAPI
• Today: every major ASR and TTS vendor supports MRCP

Why MRCPv2?
• MRCPv1
  • Designed by Cisco, Nuance, and SpeechWorks
  • “Tunneled over” RTSP
  • Published as an Informational RFC, not an IETF standard (http://www.ietf.org/rfc/rfc4463.txt)
• MRCPv2
  • Designed in a public forum by
    • Multiple ASR/TTS vendors
    • Multiple technology integrators
    • Multiple VoiceXML implementers
  • “Top-level” application protocol similar to HTTP
  • IETF standards-track document

Why an IETF protocol?/Status
• IETF protocols are
  • Independent of implementation programming language
  • Public
  • Widely reviewed
  • Well-respected
• Status
  • Developed in the SPEECHSC Working Group (Real-time Applications area)
  • In Working Group Last Call (http://www.ietf.org/internet-drafts/draft-ietf-speechsc-mrcpv2-11.txt)

Relationship to other standards
• TCP: carrier for MRCP messages
• SIP: used to set up calls
• RTP: carries MRCP-controlled media
• VoiceXML: higher-level language for ASR/TTS that is often built on top of an MRCP client
• IMS: framework that allows mobile phones to use MRCP-controlled resources
• SRGS, SSML: ASR grammars and TTS controls that MRCP clients can use to configure ASR/TTS resources
• TLS: secure alternative to plain TCP for carrying MRCP

Features of MRCP
• Control of
  • Synthesizer resource
  • Recognizer resource
  • Recorder resource
  • Speaker Identification and Verification resource
• Optional control channel sharing among resources

Synthesizer
• Two resource types
  • “basicsynth”: concatenated audio clips only
  • “speechsynth”: full SSML support
• Capabilities
  • Start/stop/pause/resume speaking
  • Optional stop on barge-in
  • Live notification of <mark> encounters

Recognizer
• Two resource types
  • “speechrecog”: full speech and DTMF recognition with user-enrolled phrases
  • “dtmfrecog”: DTMF digit string recognition only
• Capabilities
  • Start/stop recognition
  • Support for SRGS grammars
  • Interpretation of text string
  • Hotword mode capability (listen until match)
  • Voice- (user-) enrolled phrases
  • Recording of recognized audio
  • Barge-in support

Recorder
• One resource type
  • “recorder”
• Capabilities
  • Start/stop recording
  • Barge-in support
  • Optional speech activity detection
  • Optional automatic end trimming

SIV
• One resource type
  • “speakverify”
• Capabilities
  • Verification and identification using one or multiple utterances
  • Simultaneous verification and recognition or recording
  • Verification using live or buffered utterances
  • Voiceprint creation, querying, and deletion

NLSML
• XML data format
• Carries results from the MRCP server
• Can store simultaneous recognition, enrollment, and verification results
• W3C’s EMMA is a future replacement for this format
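
To make the NLSML bullet concrete, here is a minimal Python sketch of how a client might read a recognition result. The sample document, its grammar URI, its confidence value, and the helper names `local` and `parse_nlsml` are all made up for illustration; the namespace URI is an assumption, so the code ignores namespaces rather than depending on one.

```python
import xml.etree.ElementTree as ET

# Hypothetical NLSML result. Element names follow the MRCPv2 drafts;
# the namespace URI, grammar URI and values are illustrative assumptions.
SAMPLE = """<?xml version="1.0"?>
<result xmlns="urn:ietf:params:xml:ns:mrcpv2"
        grammar="session:request1@example.com">
  <interpretation confidence="0.92">
    <instance>Andre Roy</instance>
    <input mode="speech">may I speak to Andre Roy</input>
  </interpretation>
</result>"""


def local(tag):
    """Drop the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_nlsml(text):
    """Return one dict per <interpretation> element in the result."""
    root = ET.fromstring(text)
    results = []
    for interp in root:
        if local(interp.tag) != "interpretation":
            continue
        entry = {"confidence": float(interp.get("confidence", "nan"))}
        for child in interp:
            name = local(child.tag)
            if name in ("instance", "input"):
                # Collect all text under the element, trimmed.
                entry[name] = "".join(child.itertext()).strip()
        results.append(entry)
    return results


if __name__ == "__main__":
    for item in parse_nlsml(SAMPLE):
        print(item)
```

Matching on local names keeps the sketch usable even if a server emits a different namespace URI than the one assumed here.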

Sample call flow with ASR/TTS
• Setup
  • Client contacts server using SIP
  • Setup of synthesizer resource
  • Setup of recognizer resource
• Play
  • Client issues SPEAK request
  • <mark> and SPEAK completion
• Play & Recognize (with barge-in)
  • Client issues RECOGNIZE request
  • Client issues bargeable SPEAK request
  • Barge-in occurs
  • Server returns result
• Teardown
  • Client closes session

Setup: Client contacts server using SIP

C->S:
  INVITE sip:mrcp@server.example.com SIP/2.0
  Max-Forwards:6
  To:MediaServer <sip:mrcp@server.example.com>
  From:sarvi <sip:sarvi@example.com>;tag=1928301774
  Call-ID:a84b4c76e66710
  CSeq:314159 INVITE
  Contact:<sip:sarvi@example.com>
  Content-Type:application/sdp
  Content-Length:142

  v=0
  o=sarvi 2890844526 2890842807 IN IP4 126.16.64.4
  s=SDP Seminar
  i=A session for processing media
  c=IN IP4 224.2.17.12/127

S->C:
  SIP/2.0 200 OK
  To:MediaServer <sip:mrcp@server.example.com>
  From:sarvi <sip:sarvi@example.com>;tag=1928301774
  Call-ID:a84b4c76e66710
  CSeq:314159 INVITE
  Contact:<sip:sarvi@example.com>
  Content-Type:application/sdp
  Content-Length:131

  v=0
  o=sarvi 2890844526 2890842807 IN IP4 126.16.64.4
  s=SDP Seminar
  i=A session for processing media
  c=IN IP4 224.2.17.12/127

C->S:
  ACK sip:mrcp@server.example.com SIP/2.0
  Max-Forwards:6
  To:MediaServer <sip:mrcp@server.example.com>;tag=a6c85cf
  From:sarvi <sip:sarvi@example.com>;tag=1928301774
  Call-ID:a84b4c76e66710
  CSeq:314160 ACK
  Content-Length:0

Setup: Setup of synthesizer resource

C->S:
  INVITE sip:mrcp@server.example.com SIP/2.0
  Max-Forwards:6
  To:MediaServer <sip:mrcp@server.example.com>
  From:sarvi <sip:sarvi@example.com>;tag=1928301774
  Call-ID:a84b4c76e66710
  CSeq:314161 INVITE
  Contact:<sip:sarvi@example.com>
  Content-Type:application/sdp
  Content-Length:142

  v=0
  o=sarvi 2890844526 2890842808 IN IP4 126.16.64.4
  s=SDP Seminar
  i=A session for processing media
  c=IN IP4 224.2.17.12/127
  m=application 9 TCP/MRCPv2
  a=setup:active
  a=connection:existing
  a=resource:speechsynth
  a=cmid:1
  m=audio 49170 RTP/AVP 0 96
  a=rtpmap:0 pcmu/8000
  a=recvonly
  a=mid:1

S->C:
  SIP/2.0 200 OK
  To:MediaServer <sip:mrcp@server.example.com>
  From:sarvi <sip:sarvi@example.com>;tag=1928301774
  Call-ID:a84b4c76e66710
  CSeq:314161 INVITE
  Contact:<sip:sarvi@example.com>
  Content-Type:application/sdp
  Content-Length:131

  v=0
  o=sarvi 2890844526 2890842808 IN IP4 126.16.64.4
  s=SDP Seminar
  i=A session for processing media
  c=IN IP4 224.2.17.12/127
  m=application 32416 TCP/MRCPv2
  a=setup:passive
  a=connection:existing
  a=channel:32AECB23433801@speechsynth
  a=cmid:1
  m=audio 48260 RTP/AVP 0
  a=rtpmap:0 pcmu/8000
  a=sendonly
  a=mid:1

C->S:
  ACK sip:mrcp@server.example.com SIP/2.0
  Max-Forwards:6
  To:MediaServer <sip:mrcp@server.example.com>;tag=a6c85cf
  From:Sarvi <sip:sarvi@example.com>;tag=1928301774
  Call-ID:a84b4c76e66710
  CSeq:314162 ACK
  Content-Length:0

Setup: Setup of recognizer resource

C->S:
  INVITE sip:mrcp@server.example.com SIP/2.0
  Max-Forwards:6
  To:MediaServer <sip:mrcp@server.example.com>
  From:sarvi <sip:sarvi@example.com>;tag=1928301774
  Call-ID:a84b4c76e66710
  CSeq:314163 INVITE
  Contact:<sip:sarvi@example.com>
  Content-Type:application/sdp
  Content-Length:142

  v=0
  o=sarvi 2890844526 2890842809 IN IP4 126.16.64.4
  s=SDP Seminar
  i=A session for processing media
  c=IN IP4 224.2.17.12/127
  m=application 9 TCP/MRCPv2
  a=setup:active
  a=connection:existing
  a=resource:speechsynth
  a=cmid:1
  m=audio 49170 RTP/AVP 0 96
  a=rtpmap:0 pcmu/8000
  a=recvonly
  a=mid:1
  m=application 9 TCP/MRCPv2
  a=setup:active
  a=connection:existing
  a=resource:speechrecog
  a=cmid:2
  m=audio 49180 RTP/AVP 0 96
  a=rtpmap:0 pcmu/8000
  a=rtpmap:96 telephone-event/8000
  a=fmtp:96 0-15
  a=sendonly
  a=mid:2

S->C:
  SIP/2.0 200 OK
  To:MediaServer <sip:mrcp@server.example.com>
  From:sarvi <sip:sarvi@example.com>;tag=1928301774
  Call-ID:a84b4c76e66710
  CSeq:314163 INVITE
  Contact:<sip:sarvi@example.com>
  Content-Type:application/sdp
  Content-Length:131

  v=0
  o=sarvi 2890844526 2890842809 IN IP4 126.16.64.4
  s=SDP Seminar
  i=A session for processing media
  c=IN IP4 224.2.17.12/127
  m=application 32416 TCP/MRCPv2
  a=channel:32AECB23433801@speechsynth
  a=cmid:1
  m=audio 48260 RTP/AVP 0
  a=rtpmap:0 pcmu/8000
  a=sendonly
  a=mid:1
  m=application 32416 TCP/MRCPv2
  a=channel:32AECB23433801@speechrecog
  a=cmid:2
  m=audio 48260 RTP/AVP 0
  a=rtpmap:0 pcmu/8000
  a=rtpmap:96 telephone-event/8000
  a=fmtp:96 0-15
  a=recvonly
  a=mid:2

Note: final C->S ACK not shown
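
In the SDP above, each MRCP control channel (`m=application`) is tied to its audio stream (`m=audio`) through matching `a=cmid` and `a=mid` values. The following Python sketch shows one way a client could recover that pairing from the server's answer. The function name `pair_control_with_media` and the returned dictionary layout are illustrative assumptions; the SDP text is the answer from the slide, with the `i=` line dropped for brevity.

```python
# SDP answer as shown on the slide (server side of the recognizer setup).
ANSWER = """v=0
o=sarvi 2890844526 2890842809 IN IP4 126.16.64.4
s=SDP Seminar
c=IN IP4 224.2.17.12/127
m=application 32416 TCP/MRCPv2
a=channel:32AECB23433801@speechsynth
a=cmid:1
m=audio 48260 RTP/AVP 0
a=rtpmap:0 pcmu/8000
a=sendonly
a=mid:1
m=application 32416 TCP/MRCPv2
a=channel:32AECB23433801@speechrecog
a=cmid:2
m=audio 48260 RTP/AVP 0
a=rtpmap:0 pcmu/8000
a=rtpmap:96 telephone-event/8000
a=fmtp:96 0-15
a=recvonly
a=mid:2
"""


def pair_control_with_media(sdp):
    """Map each MRCP resource name to its control port and audio port."""
    sections = []
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # Start of a new media section: "m=<kind> <port> ..."
            kind, port = line[2:].split()[:2]
            sections.append({"kind": kind, "port": int(port)})
        elif line.startswith("a=") and sections:
            # Attribute of the current media section: "a=<name>[:<value>]"
            name, _, value = line[2:].partition(":")
            sections[-1][name] = value

    audio_by_mid = {
        s["mid"]: s["port"]
        for s in sections
        if s["kind"] == "audio" and "mid" in s
    }
    pairs = {}
    for s in sections:
        if s["kind"] == "application" and "channel" in s:
            # Channel identifier has the form "<session-id>@<resource>".
            resource = s["channel"].split("@", 1)[1]
            pairs[resource] = {
                "control_port": s["port"],
                "audio_port": audio_by_mid.get(s.get("cmid")),
            }
    return pairs


if __name__ == "__main__":
    print(pair_control_with_media(ANSWER))
```

Both resources share one control port here, which matches the optional control channel sharing listed under Features of MRCP.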

Play: Client issues SPEAK request

C->S:
  MRCP/2.0 386 SPEAK 543257
  Channel-Identifier:32AECB23433801@speechsynth
  Kill-On-Barge-In:false
  Voice-gender:neutral
  Voice-age:25
  Prosody-volume:medium
  Content-Type:application/ssml+xml
  Content-Length:104

  <?xml version="1.0"?>
  <speak version="1.0"
         xmlns="http://www.w3.org/2001/10/synthesis"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
                   http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
         xml:lang="en-US">
    <p>
      <s>You have 4 new messages.</s>
      <s>The first is from Stephanie Williams
         <mark name="Stephanie"/>
         and arrived at <break/>
         <say-as interpret-as="vxml:time">0345p</say-as>.</s>
      <s>The subject is <prosody
         rate="-20%">ski trip</prosody></s>
    </p>
  </speak>

S->C:
  MRCP/2.0 49 543257 200 IN-PROGRESS
  Channel-Identifier:32AECB23433801@speechsynth
  Speech-Marker:timestamp=857205015059
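
The second field of the MRCPv2 start line is the length of the whole message in bytes, which includes the digits of the length field itself. The Python sketch below shows one way to compute it when serializing a request. The function name `build_request` and its parameters are illustrative assumptions, not an API from any MRCP library.

```python
def build_request(method, request_id, headers, body=""):
    """Serialize an MRCPv2 request with a self-consistent message length.

    `headers` is a dict of header name to value; Content-Length is added
    automatically when a body is present.
    """
    body_bytes = body.encode("utf-8")
    lines = [f"{name}:{value}" for name, value in headers.items()]
    if body_bytes:
        lines.append(f"Content-Length:{len(body_bytes)}")
    # Headers end with an empty line, then the body follows.
    rest = ("\r\n".join(lines) + "\r\n\r\n").encode("utf-8") + body_bytes

    # The length field counts its own digits, so iterate until the value
    # written in the start line equals the real size of the message.
    length = 0
    while True:
        start = f"MRCP/2.0 {length} {method} {request_id}\r\n".encode("ascii")
        total = len(start) + len(rest)
        if total == length:
            return start + rest
        length = total


if __name__ == "__main__":
    message = build_request(
        "SPEAK",
        543257,
        {
            "Channel-Identifier": "32AECB23433801@speechsynth",
            "Kill-On-Barge-In": "false",
            "Content-Type": "application/ssml+xml",
        },
        "<speak/>",  # placeholder body, not the SSML from the slide
    )
    print(message.decode("utf-8"))
```

The loop settles after a few passes because the total can only grow when the digit count of the length grows.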

Play: <mark> and SPEAK completion

S->C:
  MRCP/2.0 46 SPEECH-MARKER 543257 IN-PROGRESS
  Channel-Identifier:32AECB23433801@speechsynth
  Speech-Marker:timestamp=857206027059;Stephanie

S->C:
  MRCP/2.0 48 SPEAK-COMPLETE 543257 COMPLETE
  Channel-Identifier:32AECB23433801@speechsynth
  Speech-Marker:timestamp=857207685213;Stephanie
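
The messages in this call flow use three start-line shapes: a request (`MRCP/2.0 length method request-id`), a response (`MRCP/2.0 length request-id status-code request-state`), and an event (`MRCP/2.0 length event-name request-id request-state`). A client has to tell them apart before it can match responses and events to the request that triggered them. The Python sketch below is one way to do that; the function name `parse_start_line` and the dictionary keys are illustrative assumptions.

```python
def parse_start_line(line):
    """Classify an MRCPv2 start line as a request, response, or event."""
    parts = line.split()
    if len(parts) < 4 or parts[0] != "MRCP/2.0":
        raise ValueError(f"not an MRCPv2 start line: {line!r}")
    length = int(parts[1])

    if parts[2].isdigit():
        # Response: MRCP/2.0 length request-id status-code request-state
        if len(parts) != 5:
            raise ValueError(f"malformed response line: {line!r}")
        return {
            "kind": "response",
            "length": length,
            "request_id": int(parts[2]),
            "status": int(parts[3]),
            "state": parts[4],
        }
    if len(parts) == 5:
        # Event: MRCP/2.0 length event-name request-id request-state
        return {
            "kind": "event",
            "length": length,
            "name": parts[2],
            "request_id": int(parts[3]),
            "state": parts[4],
        }
    if len(parts) == 4:
        # Request: MRCP/2.0 length method-name request-id
        return {
            "kind": "request",
            "length": length,
            "name": parts[2],
            "request_id": int(parts[3]),
        }
    raise ValueError(f"malformed start line: {line!r}")


if __name__ == "__main__":
    for sample in (
        "MRCP/2.0 386 SPEAK 543257",
        "MRCP/2.0 49 543257 200 IN-PROGRESS",
        "MRCP/2.0 46 SPEECH-MARKER 543257 IN-PROGRESS",
        "MRCP/2.0 48 SPEAK-COMPLETE 543257 COMPLETE",
    ):
        print(parse_start_line(sample))
```

All four sample lines carry request-id 543257, which is how the client ties the SPEECH-MARKER and SPEAK-COMPLETE events back to its SPEAK request.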
Setup Play Play & Recognize Teardown
Client issues RECOGNIZE request
• C->S: MRCP/2.0 343 RECOGNIZE 543258
• Channel-Identifier:32AECB23433801@speechrecog
• Content-Type:application/srgs+xml
• Content-Length:104
• <?xml version="1.0"?>
• <!-- the default grammar language is US English -->
• <grammar xmlns="http://www.w3.org/2001/06/grammar"
• xml:lang="en-US" version="1.0" root="request">
• <!-- single language attachment to a rule expansion -->
• <rule id="request">
• Can I speak to
• <one-of xml:lang="fr-CA">
• <item>Michel Tremblay</item>
• <item>Andre Roy</item>
• </one-of>
• </rule>
• </grammar>
• S->C: MRCP/2.0 49 543258 200 IN-PROGRESS
• Channel-Identifier:32AECB23433801@speechrecog
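The grammar in the RECOGNIZE request accepts a fixed prefix followed by one of two names. As an illustration only, the sketch below lists the phrases this particular grammar accepts. It assumes the simple shape on the slide (leading text plus a single `<one-of>` in the root rule) and is not a general SRGS expander; the function name `accepted_phrases` is hypothetical.

```python
import xml.etree.ElementTree as ET

SRGS_NS = {"g": "http://www.w3.org/2001/06/grammar"}

GRAMMAR = """<?xml version="1.0"?>
<!-- the default grammar language is US English -->
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         xml:lang="en-US" version="1.0" root="request">
  <!-- single language attachment to a rule expansion -->
  <rule id="request">
    Can I speak to
    <one-of xml:lang="fr-CA">
      <item>Michel Tremblay</item>
      <item>Andre Roy</item>
    </one-of>
  </rule>
</grammar>"""


def squeeze(text):
    """Collapse runs of whitespace, treating a missing text node as empty."""
    return " ".join((text or "").split())


def accepted_phrases(grammar_xml):
    """List the phrases for a root rule shaped as: text + one <one-of>."""
    root = ET.fromstring(grammar_xml)
    rule = root.find(f"g:rule[@id='{root.get('root')}']", SRGS_NS)
    prefix = squeeze(rule.text)
    items = rule.findall("g:one-of/g:item", SRGS_NS)
    return [f"{prefix} {squeeze(item.text)}" for item in items]


print(accepted_phrases(GRAMMAR))
# ['Can I speak to Michel Tremblay', 'Can I speak to Andre Roy']
```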
Setup Play Play & Recognize Teardown
Client issues bargeable SPEAK request
• C->S: MRCP/2.0 289 SPEAK 543259
• Channel-Identifier:32AECB23433801@speechsynth
• Kill-On-Barge-In:true
• Content-Type:application/ssml+xml
• Content-Length:104
• <?xml version="1.0"?>
• <speak version="1.0"
• xmlns="http://www.w3.org/2001/10/synthesis"
• xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
• xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
• http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
• xml:lang="en-US">
• <p>
• <s>Welcome to ABC corporation.</s>
• <s>Who would you like to talk to?</s>
• </p>
• </speak>
• S->C: MRCP/2.0 52 543259 200 IN-PROGRESS
• Channel-Identifier:32AECB23433801@speechsynth
• Speech-Marker:timestamp=857207696314
Setup Play Play & Recognize Teardown
Barge-in occurs
Recognizer (MRCP server) sends START-OF-INPUT to the client when input is detected:
• S->C: MRCP/2.0 49 START-OF-INPUT 543258 IN-PROGRESS
• Channel-Identifier:32AECB23433801@speechrecog
• Proxy-Sync-Id:987654321
MRCP client notifies the synthesizer (MRCP server) that barge-in has occurred:
• C->S: MRCP/2.0 69 BARGE-IN-OCCURRED 543259
• Channel-Identifier:32AECB23433801@speechsynth
• Proxy-Sync-Id:987654321
Because Kill-On-Barge-In was set to true, the synthesizer stops playing:
• S->C: MRCP/2.0 72 543259 200 COMPLETE
• Channel-Identifier:32AECB23433801@speechsynth
• Active-Request-Id-List:543258
• Speech-Marker:timestamp=857206096314
• S->C: MRCP/2.0 73 SPEAK-COMPLETE 543259 COMPLETE
• Channel-Identifier:32AECB23433801@speechsynth
• Completion-Cause:001 barge-in
• Speech-Marker:timestamp=857207685213
Note that combined ASR/TTS resources can sometimes terminate playback sooner on their own, without waiting for the client.
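The barge-in exchange above has the client relaying an event from one channel (the recognizer) as a request on another (the synthesizer), carrying the Proxy-Sync-Id across unchanged. The following sketch shows one way a client could keep that bookkeeping. The class `BargeInRelay` and its method names are assumptions for illustration, and the request-id for the outgoing request is left to the caller.

```python
from typing import Optional


class BargeInRelay:
    """Relays recognizer START-OF-INPUT events to the synthesizer channel."""

    def __init__(self, synth_channel: str):
        self.synth_channel = synth_channel
        self.active_speak: Optional[int] = None  # request-id of the SPEAK in progress

    def speak_started(self, request_id: int) -> None:
        self.active_speak = request_id

    def speak_completed(self, request_id: int) -> None:
        if self.active_speak == request_id:
            self.active_speak = None

    def on_recognizer_event(self, event_name: str, headers: dict,
                            next_request_id: int):
        """Return (method, request_id, headers) to send, or None if nothing to do."""
        if event_name != "START-OF-INPUT" or self.active_speak is None:
            return None  # no prompt is playing, so there is nothing to interrupt
        out = {"Channel-Identifier": self.synth_channel}
        # Echo the recognizer's Proxy-Sync-Id so the server can correlate
        # this notification with the input it already detected.
        sync_id = headers.get("Proxy-Sync-Id")
        if sync_id is not None:
            out["Proxy-Sync-Id"] = sync_id
        return ("BARGE-IN-OCCURRED", next_request_id, out)


relay = BargeInRelay("32AECB23433801@speechsynth")
relay.speak_started(543259)
to_send = relay.on_recognizer_event(
    "START-OF-INPUT", {"Proxy-Sync-Id": "987654321"}, next_request_id=543260)
print(to_send)
```

Whether the prompt actually stops is decided by the synthesizer based on the Kill-On-Barge-In value of the active SPEAK; the client only reports that input started.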
Setup Play Play & Recognize Teardown
Server returns result
• S->C: MRCP/2.0 412 RECOGNITION-COMPLETE 543258 COMPLETE
• Channel-Identifier:32AECB23433801@speechrecog
• Completion-Cause:000 success
• Waveform-URI:<http://web.media.com/session123/audio.wav>;size=423523;duration=25432
• Content-Type:application/nlsml+xml
• Content-Length:104
• <?xml version="1.0"?>
• <result xmlns="http://www.ietf.org/xml/ns/mrcpv2"
• xmlns:ex="http://www.example.com/example"
• grammar="session:request1@form-level.store">
• <interpretation>
• <instance name="Person">
• <ex:Person>
• <ex:Name> Andre Roy </ex:Name>
• </ex:Person>
• </instance>
• <input> Can I speak to Andre Roy </input>
• </interpretation>
• </result>
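The NLSML body of RECOGNITION-COMPLETE holds both what was heard (`<input>`) and the application-level interpretation (`<instance>`). A minimal sketch of reading both with Python's standard XML parser follows. The function name `summarize` is hypothetical, and the `ex:` element names are simply those used in the slide's example rather than anything fixed by the protocol.

```python
import xml.etree.ElementTree as ET

NS = {
    "r": "http://www.ietf.org/xml/ns/mrcpv2",
    "ex": "http://www.example.com/example",  # example namespace from the slide
}

NLSML = """<?xml version="1.0"?>
<result xmlns="http://www.ietf.org/xml/ns/mrcpv2"
        xmlns:ex="http://www.example.com/example"
        grammar="session:request1@form-level.store">
  <interpretation>
    <instance name="Person">
      <ex:Person>
        <ex:Name> Andre Roy </ex:Name>
      </ex:Person>
    </instance>
    <input> Can I speak to Andre Roy </input>
  </interpretation>
</result>"""


def summarize(nlsml):
    """Return a list of (heard_text, person_name) pairs, one per interpretation."""
    root = ET.fromstring(nlsml)
    results = []
    for interp in root.findall("r:interpretation", NS):
        heard = interp.findtext("r:input", default="", namespaces=NS).strip()
        name = interp.findtext(
            "r:instance/ex:Person/ex:Name", default="", namespaces=NS).strip()
        results.append((heard, name))
    return results


print(summarize(NLSML))
# [('Can I speak to Andre Roy', 'Andre Roy')]
```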
Setup Play Play & Recognize Teardown
Client closes session
• C->S: BYE sip:mrcp@server.example.com SIP/2.0
• Max-Forwards:6
• From:Sarvi <sip:sarvi@example.com>;tag=a6c85cf
• To:MediaServer <sip:mrcp@server.example.com>;tag=1928301774
• Call-ID:a84b4c76e66710
• CSeq:231 BYE
• Content-Length:0
Dan Burnett • Daniel.Burnett@nuance.com