diff --git a/doc/rfc/rfc6787.txt b/doc/rfc/rfc6787.txt
new file mode 100644
index 0000000..ca651b7
--- /dev/null
+++ b/doc/rfc/rfc6787.txt
@@ -0,0 +1,12547 @@
+
+
+
+
+
+
+Internet Engineering Task Force (IETF) D. Burnett
+Request for Comments: 6787 Voxeo
+Category: Standards Track S. Shanmugham
+ISSN: 2070-1721 Cisco Systems, Inc.
+ November 2012
+
+
+ Media Resource Control Protocol Version 2 (MRCPv2)
+
+Abstract
+
+ The Media Resource Control Protocol Version 2 (MRCPv2) allows client
+ hosts to control media service resources such as speech synthesizers,
+ recognizers, verifiers, and identifiers residing in servers on the
+ network. MRCPv2 is not a "stand-alone" protocol -- it relies on
+ other protocols, such as the Session Initiation Protocol (SIP), to
+ coordinate MRCPv2 clients and servers and manage sessions between
+ them, and the Session Description Protocol (SDP) to describe,
+ discover, and exchange capabilities. It also depends on SIP and SDP
+ to establish the media sessions and associated parameters between the
+ media source or sink and the media server. Once this is done, the
+ MRCPv2 exchange operates over the control session established above,
+ allowing the client to control the media processing resources on the
+ speech resource server.
+
+Status of This Memo
+
+ This is an Internet Standards Track document.
+
+ This document is a product of the Internet Engineering Task Force
+ (IETF). It represents the consensus of the IETF community. It has
+ received public review and has been approved for publication by the
+ Internet Engineering Steering Group (IESG). Further information on
+ Internet Standards is available in Section 2 of RFC 5741.
+
+ Information about the current status of this document, any errata,
+ and how to provide feedback on it may be obtained at
+ http://www.rfc-editor.org/info/rfc6787.
+
+Copyright Notice
+
+ Copyright (c) 2012 IETF Trust and the persons identified as the
+ document authors. All rights reserved.
+
+ This document is subject to BCP 78 and the IETF Trust's Legal
+ Provisions Relating to IETF Documents
+ (http://trustee.ietf.org/license-info) in effect on the date of
+ publication of this document. Please review these documents
+
+
+
+Burnett & Shanmugham Standards Track [Page 1]
+
+RFC 6787 MRCPv2 November 2012
+
+
+ carefully, as they describe your rights and restrictions with respect
+ to this document. Code Components extracted from this document must
+ include Simplified BSD License text as described in Section 4.e of
+ the Trust Legal Provisions and are provided without warranty as
+ described in the Simplified BSD License.
+
+ This document may contain material from IETF Documents or IETF
+ Contributions published or made publicly available before November
+ 10, 2008. The person(s) controlling the copyright in some of this
+ material may not have granted the IETF Trust the right to allow
+ modifications of such material outside the IETF Standards Process.
+ Without obtaining an adequate license from the person(s) controlling
+ the copyright in such materials, this document may not be modified
+ outside the IETF Standards Process, and derivative works of it may
+ not be created outside the IETF Standards Process, except to format
+ it for publication as an RFC or to translate it into languages other
+ than English.
+
+Table of Contents
+
+ 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 8
+ 2. Document Conventions . . . . . . . . . . . . . . . . . . . . 9
+ 2.1. Definitions . . . . . . . . . . . . . . . . . . . . . . 10
+ 2.2. State-Machine Diagrams . . . . . . . . . . . . . . . . . 10
+ 2.3. URI Schemes . . . . . . . . . . . . . . . . . . . . . . 11
+ 3. Architecture . . . . . . . . . . . . . . . . . . . . . . . . 11
+ 3.1. MRCPv2 Media Resource Types . . . . . . . . . . . . . . 12
+ 3.2. Server and Resource Addressing . . . . . . . . . . . . . 14
+ 4. MRCPv2 Basics . . . . . . . . . . . . . . . . . . . . . . . . 14
+ 4.1. Connecting to the Server . . . . . . . . . . . . . . . . 14
+ 4.2. Managing Resource Control Channels . . . . . . . . . . . 14
+ 4.3. SIP Session Example . . . . . . . . . . . . . . . . . . 17
+ 4.4. Media Streams and RTP Ports . . . . . . . . . . . . . . 22
+ 4.5. MRCPv2 Message Transport . . . . . . . . . . . . . . . . 24
+ 4.6. MRCPv2 Session Termination . . . . . . . . . . . . . . . 24
+ 5. MRCPv2 Specification . . . . . . . . . . . . . . . . . . . . 24
+ 5.1. Common Protocol Elements . . . . . . . . . . . . . . . . 25
+ 5.2. Request . . . . . . . . . . . . . . . . . . . . . . . . 28
+ 5.3. Response . . . . . . . . . . . . . . . . . . . . . . . . 29
+ 5.4. Status Codes . . . . . . . . . . . . . . . . . . . . . . 30
+ 5.5. Events . . . . . . . . . . . . . . . . . . . . . . . . . 31
+ 6. MRCPv2 Generic Methods, Headers, and Result Structure . . . . 32
+ 6.1. Generic Methods . . . . . . . . . . . . . . . . . . . . 32
+ 6.1.1. SET-PARAMS . . . . . . . . . . . . . . . . . . . . . 32
+ 6.1.2. GET-PARAMS . . . . . . . . . . . . . . . . . . . . . 33
+ 6.2. Generic Message Headers . . . . . . . . . . . . . . . . 34
+ 6.2.1. Channel-Identifier . . . . . . . . . . . . . . . . . 35
+ 6.2.2. Accept . . . . . . . . . . . . . . . . . . . . . . . 36
+
+
+ 6.2.3. Active-Request-Id-List . . . . . . . . . . . . . . . 36
+ 6.2.4. Proxy-Sync-Id . . . . . . . . . . . . . . . . . . . 36
+ 6.2.5. Accept-Charset . . . . . . . . . . . . . . . . . . . 37
+ 6.2.6. Content-Type . . . . . . . . . . . . . . . . . . . . 37
+ 6.2.7. Content-ID . . . . . . . . . . . . . . . . . . . . . 38
+ 6.2.8. Content-Base . . . . . . . . . . . . . . . . . . . . 38
+ 6.2.9. Content-Encoding . . . . . . . . . . . . . . . . . . 38
+ 6.2.10. Content-Location . . . . . . . . . . . . . . . . . . 39
+ 6.2.11. Content-Length . . . . . . . . . . . . . . . . . . . 39
+ 6.2.12. Fetch Timeout . . . . . . . . . . . . . . . . . . . 39
+ 6.2.13. Cache-Control . . . . . . . . . . . . . . . . . . . 40
+ 6.2.14. Logging-Tag . . . . . . . . . . . . . . . . . . . . 41
+ 6.2.15. Set-Cookie . . . . . . . . . . . . . . . . . . . . . 42
+ 6.2.16. Vendor-Specific Parameters . . . . . . . . . . . . . 44
+ 6.3. Generic Result Structure . . . . . . . . . . . . . . . . 44
+ 6.3.1. Natural Language Semantics Markup Language . . . . . 45
+ 7. Resource Discovery . . . . . . . . . . . . . . . . . . . . . 46
+ 8. Speech Synthesizer Resource . . . . . . . . . . . . . . . . . 47
+ 8.1. Synthesizer State Machine . . . . . . . . . . . . . . . 48
+ 8.2. Synthesizer Methods . . . . . . . . . . . . . . . . . . 48
+ 8.3. Synthesizer Events . . . . . . . . . . . . . . . . . . . 49
+ 8.4. Synthesizer Header Fields . . . . . . . . . . . . . . . 49
+ 8.4.1. Jump-Size . . . . . . . . . . . . . . . . . . . . . 49
+ 8.4.2. Kill-On-Barge-In . . . . . . . . . . . . . . . . . . 50
+ 8.4.3. Speaker-Profile . . . . . . . . . . . . . . . . . . 51
+ 8.4.4. Completion-Cause . . . . . . . . . . . . . . . . . . 51
+ 8.4.5. Completion-Reason . . . . . . . . . . . . . . . . . 52
+ 8.4.6. Voice-Parameter . . . . . . . . . . . . . . . . . . 52
+ 8.4.7. Prosody-Parameters . . . . . . . . . . . . . . . . . 53
+ 8.4.8. Speech-Marker . . . . . . . . . . . . . . . . . . . 53
+ 8.4.9. Speech-Language . . . . . . . . . . . . . . . . . . 54
+ 8.4.10. Fetch-Hint . . . . . . . . . . . . . . . . . . . . . 54
+ 8.4.11. Audio-Fetch-Hint . . . . . . . . . . . . . . . . . . 55
+ 8.4.12. Failed-URI . . . . . . . . . . . . . . . . . . . . . 55
+ 8.4.13. Failed-URI-Cause . . . . . . . . . . . . . . . . . . 55
+ 8.4.14. Speak-Restart . . . . . . . . . . . . . . . . . . . 56
+ 8.4.15. Speak-Length . . . . . . . . . . . . . . . . . . . . 56
+ 8.4.16. Load-Lexicon . . . . . . . . . . . . . . . . . . . . 57
+ 8.4.17. Lexicon-Search-Order . . . . . . . . . . . . . . . . 57
+ 8.5. Synthesizer Message Body . . . . . . . . . . . . . . . . 57
+ 8.5.1. Synthesizer Speech Data . . . . . . . . . . . . . . 57
+ 8.5.2. Lexicon Data . . . . . . . . . . . . . . . . . . . . 59
+ 8.6. SPEAK Method . . . . . . . . . . . . . . . . . . . . . . 60
+ 8.7. STOP . . . . . . . . . . . . . . . . . . . . . . . . . . 62
+ 8.8. BARGE-IN-OCCURRED . . . . . . . . . . . . . . . . . . . 63
+ 8.9. PAUSE . . . . . . . . . . . . . . . . . . . . . . . . . 65
+ 8.10. RESUME . . . . . . . . . . . . . . . . . . . . . . . . . 66
+ 8.11. CONTROL . . . . . . . . . . . . . . . . . . . . . . . . 67
+
+
+ 8.12. SPEAK-COMPLETE . . . . . . . . . . . . . . . . . . . . . 69
+ 8.13. SPEECH-MARKER . . . . . . . . . . . . . . . . . . . . . 70
+ 8.14. DEFINE-LEXICON . . . . . . . . . . . . . . . . . . . . . 71
+ 9. Speech Recognizer Resource . . . . . . . . . . . . . . . . . 72
+ 9.1. Recognizer State Machine . . . . . . . . . . . . . . . . 74
+ 9.2. Recognizer Methods . . . . . . . . . . . . . . . . . . . 74
+ 9.3. Recognizer Events . . . . . . . . . . . . . . . . . . . 75
+ 9.4. Recognizer Header Fields . . . . . . . . . . . . . . . . 75
+ 9.4.1. Confidence-Threshold . . . . . . . . . . . . . . . . 77
+ 9.4.2. Sensitivity-Level . . . . . . . . . . . . . . . . . 77
+ 9.4.3. Speed-Vs-Accuracy . . . . . . . . . . . . . . . . . 77
+ 9.4.4. N-Best-List-Length . . . . . . . . . . . . . . . . . 78
+ 9.4.5. Input-Type . . . . . . . . . . . . . . . . . . . . . 78
+ 9.4.6. No-Input-Timeout . . . . . . . . . . . . . . . . . . 78
+ 9.4.7. Recognition-Timeout . . . . . . . . . . . . . . . . 79
+ 9.4.8. Waveform-URI . . . . . . . . . . . . . . . . . . . . 79
+ 9.4.9. Media-Type . . . . . . . . . . . . . . . . . . . . . 80
+ 9.4.10. Input-Waveform-URI . . . . . . . . . . . . . . . . . 80
+ 9.4.11. Completion-Cause . . . . . . . . . . . . . . . . . . 80
+ 9.4.12. Completion-Reason . . . . . . . . . . . . . . . . . 83
+ 9.4.13. Recognizer-Context-Block . . . . . . . . . . . . . . 83
+ 9.4.14. Start-Input-Timers . . . . . . . . . . . . . . . . . 83
+ 9.4.15. Speech-Complete-Timeout . . . . . . . . . . . . . . 84
+ 9.4.16. Speech-Incomplete-Timeout . . . . . . . . . . . . . 84
+ 9.4.17. DTMF-Interdigit-Timeout . . . . . . . . . . . . . . 85
+ 9.4.18. DTMF-Term-Timeout . . . . . . . . . . . . . . . . . 85
+ 9.4.19. DTMF-Term-Char . . . . . . . . . . . . . . . . . . . 85
+ 9.4.20. Failed-URI . . . . . . . . . . . . . . . . . . . . . 86
+ 9.4.21. Failed-URI-Cause . . . . . . . . . . . . . . . . . . 86
+ 9.4.22. Save-Waveform . . . . . . . . . . . . . . . . . . . 86
+ 9.4.23. New-Audio-Channel . . . . . . . . . . . . . . . . . 86
+ 9.4.24. Speech-Language . . . . . . . . . . . . . . . . . . 87
+ 9.4.25. Ver-Buffer-Utterance . . . . . . . . . . . . . . . . 87
+ 9.4.26. Recognition-Mode . . . . . . . . . . . . . . . . . . 87
+ 9.4.27. Cancel-If-Queue . . . . . . . . . . . . . . . . . . 88
+ 9.4.28. Hotword-Max-Duration . . . . . . . . . . . . . . . . 88
+ 9.4.29. Hotword-Min-Duration . . . . . . . . . . . . . . . . 88
+ 9.4.30. Interpret-Text . . . . . . . . . . . . . . . . . . . 89
+ 9.4.31. DTMF-Buffer-Time . . . . . . . . . . . . . . . . . . 89
+ 9.4.32. Clear-DTMF-Buffer . . . . . . . . . . . . . . . . . 89
+ 9.4.33. Early-No-Match . . . . . . . . . . . . . . . . . . . 90
+ 9.4.34. Num-Min-Consistent-Pronunciations . . . . . . . . . 90
+ 9.4.35. Consistency-Threshold . . . . . . . . . . . . . . . 90
+ 9.4.36. Clash-Threshold . . . . . . . . . . . . . . . . . . 90
+ 9.4.37. Personal-Grammar-URI . . . . . . . . . . . . . . . . 91
+ 9.4.38. Enroll-Utterance . . . . . . . . . . . . . . . . . . 91
+ 9.4.39. Phrase-Id . . . . . . . . . . . . . . . . . . . . . 91
+ 9.4.40. Phrase-NL . . . . . . . . . . . . . . . . . . . . . 92
+
+
+ 9.4.41. Weight . . . . . . . . . . . . . . . . . . . . . . . 92
+ 9.4.42. Save-Best-Waveform . . . . . . . . . . . . . . . . . 92
+ 9.4.43. New-Phrase-Id . . . . . . . . . . . . . . . . . . . 93
+ 9.4.44. Confusable-Phrases-URI . . . . . . . . . . . . . . . 93
+ 9.4.45. Abort-Phrase-Enrollment . . . . . . . . . . . . . . 93
+ 9.5. Recognizer Message Body . . . . . . . . . . . . . . . . 93
+ 9.5.1. Recognizer Grammar Data . . . . . . . . . . . . . . 93
+ 9.5.2. Recognizer Result Data . . . . . . . . . . . . . . . 97
+ 9.5.3. Enrollment Result Data . . . . . . . . . . . . . . . 98
+ 9.5.4. Recognizer Context Block . . . . . . . . . . . . . . 98
+ 9.6. Recognizer Results . . . . . . . . . . . . . . . . . . . 99
+ 9.6.1. Markup Functions . . . . . . . . . . . . . . . . . . 99
+ 9.6.2. Overview of Recognizer Result Elements and Their
+ Relationships . . . . . . . . . . . . . . . . . . . 100
+ 9.6.3. Elements and Attributes . . . . . . . . . . . . . . 101
+ 9.7. Enrollment Results . . . . . . . . . . . . . . . . . . . 106
+ 9.7.1. <num-clashes> Element . . . . . . . . . . . . . . . 106
+ 9.7.2. <num-good-repetitions> Element . . . . . . . . . . . 106
+ 9.7.3. <num-repetitions-still-needed> Element . . . . . . . 107
+ 9.7.4. <consistency-status> Element . . . . . . . . . . . . 107
+ 9.7.5. <clash-phrase-ids> Element . . . . . . . . . . . . . 107
+ 9.7.6. <transcriptions> Element . . . . . . . . . . . . . . 107
+ 9.7.7. <confusable-phrases> Element . . . . . . . . . . . . 107
+ 9.8. DEFINE-GRAMMAR . . . . . . . . . . . . . . . . . . . . . 107
+ 9.9. RECOGNIZE . . . . . . . . . . . . . . . . . . . . . . . 111
+ 9.10. STOP . . . . . . . . . . . . . . . . . . . . . . . . . . 118
+ 9.11. GET-RESULT . . . . . . . . . . . . . . . . . . . . . . . 119
+ 9.12. START-OF-INPUT . . . . . . . . . . . . . . . . . . . . . 120
+ 9.13. START-INPUT-TIMERS . . . . . . . . . . . . . . . . . . . 120
+ 9.14. RECOGNITION-COMPLETE . . . . . . . . . . . . . . . . . . 120
+ 9.15. START-PHRASE-ENROLLMENT . . . . . . . . . . . . . . . . 123
+ 9.16. ENROLLMENT-ROLLBACK . . . . . . . . . . . . . . . . . . 124
+ 9.17. END-PHRASE-ENROLLMENT . . . . . . . . . . . . . . . . . 124
+ 9.18. MODIFY-PHRASE . . . . . . . . . . . . . . . . . . . . . 125
+ 9.19. DELETE-PHRASE . . . . . . . . . . . . . . . . . . . . . 125
+ 9.20. INTERPRET . . . . . . . . . . . . . . . . . . . . . . . 125
+ 9.21. INTERPRETATION-COMPLETE . . . . . . . . . . . . . . . . 127
+ 9.22. DTMF Detection . . . . . . . . . . . . . . . . . . . . . 128
+ 10. Recorder Resource . . . . . . . . . . . . . . . . . . . . . . 129
+ 10.1. Recorder State Machine . . . . . . . . . . . . . . . . . 129
+ 10.2. Recorder Methods . . . . . . . . . . . . . . . . . . . . 130
+ 10.3. Recorder Events . . . . . . . . . . . . . . . . . . . . 130
+ 10.4. Recorder Header Fields . . . . . . . . . . . . . . . . . 130
+ 10.4.1. Sensitivity-Level . . . . . . . . . . . . . . . . . 130
+ 10.4.2. No-Input-Timeout . . . . . . . . . . . . . . . . . . 131
+ 10.4.3. Completion-Cause . . . . . . . . . . . . . . . . . . 131
+ 10.4.4. Completion-Reason . . . . . . . . . . . . . . . . . 132
+ 10.4.5. Failed-URI . . . . . . . . . . . . . . . . . . . . . 132
+
+
+ 10.4.6. Failed-URI-Cause . . . . . . . . . . . . . . . . . . 132
+ 10.4.7. Record-URI . . . . . . . . . . . . . . . . . . . . . 132
+ 10.4.8. Media-Type . . . . . . . . . . . . . . . . . . . . . 133
+ 10.4.9. Max-Time . . . . . . . . . . . . . . . . . . . . . . 133
+ 10.4.10. Trim-Length . . . . . . . . . . . . . . . . . . . . 134
+ 10.4.11. Final-Silence . . . . . . . . . . . . . . . . . . . 134
+ 10.4.12. Capture-On-Speech . . . . . . . . . . . . . . . . . 134
+ 10.4.13. Ver-Buffer-Utterance . . . . . . . . . . . . . . . . 134
+ 10.4.14. Start-Input-Timers . . . . . . . . . . . . . . . . . 135
+ 10.4.15. New-Audio-Channel . . . . . . . . . . . . . . . . . 135
+ 10.5. Recorder Message Body . . . . . . . . . . . . . . . . . 135
+ 10.6. RECORD . . . . . . . . . . . . . . . . . . . . . . . . . 135
+ 10.7. STOP . . . . . . . . . . . . . . . . . . . . . . . . . . 136
+ 10.8. RECORD-COMPLETE . . . . . . . . . . . . . . . . . . . . 137
+ 10.9. START-INPUT-TIMERS . . . . . . . . . . . . . . . . . . . 138
+ 10.10. START-OF-INPUT . . . . . . . . . . . . . . . . . . . . . 138
+ 11. Speaker Verification and Identification . . . . . . . . . . . 139
+ 11.1. Speaker Verification State Machine . . . . . . . . . . . 140
+ 11.2. Speaker Verification Methods . . . . . . . . . . . . . . 142
+ 11.3. Verification Events . . . . . . . . . . . . . . . . . . 144
+ 11.4. Verification Header Fields . . . . . . . . . . . . . . . 144
+ 11.4.1. Repository-URI . . . . . . . . . . . . . . . . . . . 144
+ 11.4.2. Voiceprint-Identifier . . . . . . . . . . . . . . . 145
+ 11.4.3. Verification-Mode . . . . . . . . . . . . . . . . . 145
+ 11.4.4. Adapt-Model . . . . . . . . . . . . . . . . . . . . 146
+ 11.4.5. Abort-Model . . . . . . . . . . . . . . . . . . . . 146
+ 11.4.6. Min-Verification-Score . . . . . . . . . . . . . . . 147
+ 11.4.7. Num-Min-Verification-Phrases . . . . . . . . . . . . 147
+ 11.4.8. Num-Max-Verification-Phrases . . . . . . . . . . . . 147
+ 11.4.9. No-Input-Timeout . . . . . . . . . . . . . . . . . . 148
+ 11.4.10. Save-Waveform . . . . . . . . . . . . . . . . . . . 148
+ 11.4.11. Media-Type . . . . . . . . . . . . . . . . . . . . . 148
+ 11.4.12. Waveform-URI . . . . . . . . . . . . . . . . . . . . 148
+ 11.4.13. Voiceprint-Exists . . . . . . . . . . . . . . . . . 149
+ 11.4.14. Ver-Buffer-Utterance . . . . . . . . . . . . . . . . 149
+ 11.4.15. Input-Waveform-URI . . . . . . . . . . . . . . . . . 149
+ 11.4.16. Completion-Cause . . . . . . . . . . . . . . . . . . 150
+ 11.4.17. Completion-Reason . . . . . . . . . . . . . . . . . 151
+ 11.4.18. Speech-Complete-Timeout . . . . . . . . . . . . . . 151
+ 11.4.19. New-Audio-Channel . . . . . . . . . . . . . . . . . 152
+ 11.4.20. Abort-Verification . . . . . . . . . . . . . . . . . 152
+ 11.4.21. Start-Input-Timers . . . . . . . . . . . . . . . . . 152
+ 11.5. Verification Message Body . . . . . . . . . . . . . . . 152
+ 11.5.1. Verification Result Data . . . . . . . . . . . . . . 152
+ 11.5.2. Verification Result Elements . . . . . . . . . . . . 153
+ 11.6. START-SESSION . . . . . . . . . . . . . . . . . . . . . 157
+ 11.7. END-SESSION . . . . . . . . . . . . . . . . . . . . . . 158
+ 11.8. QUERY-VOICEPRINT . . . . . . . . . . . . . . . . . . . . 159
+
+
+ 11.9. DELETE-VOICEPRINT . . . . . . . . . . . . . . . . . . . 160
+ 11.10. VERIFY . . . . . . . . . . . . . . . . . . . . . . . . . 160
+ 11.11. VERIFY-FROM-BUFFER . . . . . . . . . . . . . . . . . . . 160
+ 11.12. VERIFY-ROLLBACK . . . . . . . . . . . . . . . . . . . . 164
+ 11.13. STOP . . . . . . . . . . . . . . . . . . . . . . . . . . 164
+ 11.14. START-INPUT-TIMERS . . . . . . . . . . . . . . . . . . . 165
+ 11.15. VERIFICATION-COMPLETE . . . . . . . . . . . . . . . . . 165
+ 11.16. START-OF-INPUT . . . . . . . . . . . . . . . . . . . . . 166
+ 11.17. CLEAR-BUFFER . . . . . . . . . . . . . . . . . . . . . . 166
+ 11.18. GET-INTERMEDIATE-RESULT . . . . . . . . . . . . . . . . 167
+ 12. Security Considerations . . . . . . . . . . . . . . . . . . . 168
+ 12.1. Rendezvous and Session Establishment . . . . . . . . . . 168
+ 12.2. Control Channel Protection . . . . . . . . . . . . . . . 168
+ 12.3. Media Session Protection . . . . . . . . . . . . . . . . 169
+ 12.4. Indirect Content Access . . . . . . . . . . . . . . . . 169
+ 12.5. Protection of Stored Media . . . . . . . . . . . . . . . 170
+ 12.6. DTMF and Recognition Buffers . . . . . . . . . . . . . . 171
+ 12.7. Client-Set Server Parameters . . . . . . . . . . . . . . 171
+ 12.8. DELETE-VOICEPRINT and Authorization . . . . . . . . . . 171
+ 13. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 171
+ 13.1. New Registries . . . . . . . . . . . . . . . . . . . . . 171
+ 13.1.1. MRCPv2 Resource Types . . . . . . . . . . . . . . . 171
+ 13.1.2. MRCPv2 Methods and Events . . . . . . . . . . . . . 172
+ 13.1.3. MRCPv2 Header Fields . . . . . . . . . . . . . . . . 173
+ 13.1.4. MRCPv2 Status Codes . . . . . . . . . . . . . . . . 176
+ 13.1.5. Grammar Reference List Parameters . . . . . . . . . 176
+ 13.1.6. MRCPv2 Vendor-Specific Parameters . . . . . . . . . 176
+ 13.2. NLSML-Related Registrations . . . . . . . . . . . . . . 177
+ 13.2.1. 'application/nlsml+xml' Media Type Registration . . 177
+ 13.3. NLSML XML Schema Registration . . . . . . . . . . . . . 178
+ 13.4. MRCPv2 XML Namespace Registration . . . . . . . . . . . 178
+ 13.5. Text Media Type Registrations . . . . . . . . . . . . . 178
+ 13.5.1. text/grammar-ref-list . . . . . . . . . . . . . . . 178
+ 13.6. 'session' URI Scheme Registration . . . . . . . . . . . 180
+ 13.7. SDP Parameter Registrations . . . . . . . . . . . . . . 181
+ 13.7.1. Sub-Registry "proto" . . . . . . . . . . . . . . . . 181
+ 13.7.2. Sub-Registry "att-field (media-level)" . . . . . . . 182
+ 14. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 183
+ 14.1. Message Flow . . . . . . . . . . . . . . . . . . . . . . 183
+ 14.2. Recognition Result Examples . . . . . . . . . . . . . . 192
+ 14.2.1. Simple ASR Ambiguity . . . . . . . . . . . . . . . . 192
+ 14.2.2. Mixed Initiative . . . . . . . . . . . . . . . . . . 192
+ 14.2.3. DTMF Input . . . . . . . . . . . . . . . . . . . . . 193
+ 14.2.4. Interpreting Meta-Dialog and Meta-Task Utterances . 194
+ 14.2.5. Anaphora and Deixis . . . . . . . . . . . . . . . . 195
+ 14.2.6. Distinguishing Individual Items from Sets with
+ One Member . . . . . . . . . . . . . . . . . . . . . 195
+ 14.2.7. Extensibility . . . . . . . . . . . . . . . . . . . 196
+
+
+ 15. ABNF Normative Definition . . . . . . . . . . . . . . . . . . 196
+ 16. XML Schemas . . . . . . . . . . . . . . . . . . . . . . . . . 211
+ 16.1. NLSML Schema Definition . . . . . . . . . . . . . . . . 211
+ 16.2. Enrollment Results Schema Definition . . . . . . . . . . 213
+ 16.3. Verification Results Schema Definition . . . . . . . . . 214
+ 17. References . . . . . . . . . . . . . . . . . . . . . . . . . 218
+ 17.1. Normative References . . . . . . . . . . . . . . . . . . 218
+ 17.2. Informative References . . . . . . . . . . . . . . . . . 220
+ Appendix A. Contributors . . . . . . . . . . . . . . . . . . . . 223
+ Appendix B. Acknowledgements . . . . . . . . . . . . . . . . . . 223
+
+1. Introduction
+
+ MRCPv2 is designed to allow a client device to control media
+ processing resources on the network. Some of these media processing
+ resources include speech recognition engines, speech synthesis
+ engines, speaker verification, and speaker identification engines.
+ MRCPv2 enables the implementation of distributed Interactive Voice
+ Response platforms using VoiceXML [W3C.REC-voicexml20-20040316]
+ browsers or other client applications while maintaining separate
+ back-end speech processing capabilities on specialized speech
+ processing servers. MRCPv2 is based on the earlier Media Resource
+ Control Protocol (MRCP) [RFC4463] developed jointly by Cisco Systems,
+ Inc., Nuance Communications, and Speechworks, Inc. Although some of
+ the method names are similar, the way in which these methods are
+ communicated is different. There are also more resources and more
+ methods for each resource. The first version of MRCP was essentially
+ taken only as input to the development of this protocol. There is no
+ expectation that an MRCPv2 client will work with an MRCPv1 server or
+ vice versa. There is no migration plan or gateway definition between
+ the two protocols.
+
+ The protocol requirements of Speech Services Control (SPEECHSC)
+ [RFC4313] include that the solution be capable of reaching a media
+ processing server, setting up communication channels to the media
+ resources, and sending and receiving control messages and media
+ streams to/from the server. The Session Initiation Protocol (SIP)
+ [RFC3261] meets these requirements.
+
+ The proprietary version of MRCP ran over the Real Time Streaming
+ Protocol (RTSP) [RFC2326]. At the time work on MRCPv2 was begun, the
+ consensus was that this use of RTSP would break the RTSP protocol or
+ cause backward-compatibility problems, something forbidden by Section
+ 3.2 of [RFC4313]. This is the reason why MRCPv2 does not run over
+ RTSP.
+
+
+ MRCPv2 leverages these capabilities by building upon SIP and the
+ Session Description Protocol (SDP) [RFC4566]. MRCPv2 uses SIP to set
+ up and tear down media and control sessions with the server. In
+ addition, the client can use a SIP re-INVITE (an INVITE request sent
+ within an existing SIP dialog) to change the characteristics of these
+ media and control sessions while maintaining the SIP dialog
+ between the client and server. SDP is used to describe the
+ parameters of the media sessions associated with that dialog. It is
+ mandatory to support SIP as the session establishment protocol to
+ ensure interoperability. Other protocols can be used for session
+ establishment by prior agreement. This document only describes the
+ use of SIP and SDP.
+
+ MRCPv2 uses SIP and SDP to create the speech client/server dialog and
+ set up the media channels to the server. It also uses SIP and SDP to
+ establish MRCPv2 control sessions between the client and the server
+ for each media processing resource required for that dialog. The
+ MRCPv2 protocol exchange between the client and the media resource is
+ carried on that control session. MRCPv2 exchanges do not change the
+ state of the SIP dialog, the media sessions, or other parameters of
+ the dialog initiated via SIP.  They control and affect the state of
+ the media processing resource associated with the MRCPv2 session(s).
+
+ MRCPv2 defines the messages to control the different media processing
+ resources and the state machines required to guide their operation.
+ It also describes how these messages are carried over a transport-
+ layer protocol such as the Transmission Control Protocol (TCP)
+ [RFC0793] or the Transport Layer Security (TLS) Protocol [RFC5246].
+ (Note: the Stream Control Transmission Protocol (SCTP) [RFC4960] is a
+ viable transport for MRCPv2 as well, but the mapping onto SCTP is not
+ described in this specification.)
+
+2. Document Conventions
+
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+ "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+ document are to be interpreted as described in RFC 2119 [RFC2119].
+
+ Since many of the definitions and syntax are identical to those for
+ the Hypertext Transfer Protocol -- HTTP/1.1 [RFC2616], this
+ specification refers to the section where they are defined rather
+ than copying it. For brevity, [HX.Y] is to be taken to refer to
+ Section X.Y of RFC 2616.
+
+ All the mechanisms specified in this document are described in both
+ prose and an augmented Backus-Naur form (ABNF [RFC5234]).
+
+
+ The complete message format in ABNF form is provided in Section 15
+ and is the normative format definition. Note that productions may be
+ duplicated within the main body of the document for reading
+ convenience. If a production in the body of the text conflicts with
+ one in the normative definition, the latter takes precedence.
+
+2.1. Definitions
+
+ Media Resource
+ An entity on the speech processing server that can be
+ controlled through MRCPv2.
+
+ MRCP Server
+ Aggregate of one or more "Media Resource" entities on
+ a server, exposed through MRCPv2. Often, 'server' in
+ this document refers to an MRCP server.
+
+ MRCP Client
+ An entity controlling one or more Media Resources
+ through MRCPv2 ("Client" for short).
+
+ DTMF
+ Dual-Tone Multi-Frequency; a method of transmitting
+ key presses in-band, either as actual tones (Q.23
+ [Q.23]) or as named tone events (RFC 4733 [RFC4733]).
+
+ Endpointing
+ The process of automatically detecting the beginning
+ and end of speech in an audio stream. This is
+ critical both for speech recognition and for automated
+ recording as one would find in voice mail systems.
+
+ Hotword Mode
+ A mode of speech recognition where a stream of
+ utterances is evaluated for match against a small set
+ of command words. This is generally employed either
+ to trigger some action or to control the subsequent
+ grammar to be used for further recognition.
+
+2.2. State-Machine Diagrams
+
+ The state-machine diagrams in this document do not show every
+ possible method call. Rather, they reflect the state of the resource
+ based on the methods that have moved to IN-PROGRESS or COMPLETE
+ states (see Section 5.3). Note that since PENDING requests
+ essentially have not affected the resource yet and are in the queue
+ to be processed, they are not reflected in the state-machine
+ diagrams.
+
+
+2.3. URI Schemes
+
+ This document defines many protocol headers that contain URIs
+ (Uniform Resource Identifiers [RFC3986]) or lists of URIs for
+ referencing media. The entire document, including the Security
+ Considerations section (Section 12), assumes that HTTP or HTTP over
+ TLS (HTTPS) [RFC2818] will be used as the URI addressing scheme
+ unless otherwise stated. However, implementations MAY support other
+ schemes (such as 'file'), provided they have addressed any security
+ considerations described in this document and any others particular
+ to the specific scheme. For example, implementations where the
+ client and server both reside on the same physical hardware and the
+ file system is secured by traditional user-level file access controls
+ could be reasonable candidates for supporting the 'file' scheme.
+
+3. Architecture
+
+ A system using MRCPv2 consists of a client that requires the
+ generation and/or consumption of media streams and a media resource
+ server that has the resources or "engines" to process these streams
+ as input or generate these streams as output. The client uses SIP
+ and SDP to establish an MRCPv2 control channel with the server to use
+ its media processing resources. MRCPv2 servers are addressed using
+ SIP URIs.
+
+ SIP uses SDP with the offer/answer model described in RFC 3264
+ [RFC3264] to set up the MRCPv2 control channels and describe their
+ characteristics. A separate MRCPv2 session is needed to control each
+ of the media processing resources associated with the SIP dialog
+ between the client and server. Within a SIP dialog, the individual
+ resource control channels for the different resources are added or
+ removed through SDP offer/answer carried in a SIP re-INVITE
+ transaction.
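+
+   As a non-normative sketch (the addresses, ports, and channel
+   identifier here are invented for illustration), an SDP offer
+   requesting a synthesizer control channel, and the matching answer,
+   might carry:
+
+      C->S (offer):  m=application 9 TCP/MRCPv2 1
+                     a=setup:active
+                     a=connection:new
+                     a=resource:speechsynth
+
+      S->C (answer): m=application 32416 TCP/MRCPv2 1
+                     a=setup:passive
+                     a=connection:new
+                     a=channel:32AECB23433801@speechsynth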
+
+ The server, through the SDP exchange, provides the client with a
+ difficult-to-guess, unambiguous channel identifier and a TCP port
+ number (see Section 4.2). The client MAY then open a new TCP
+ connection with the server on this port number. Multiple MRCPv2
+ channels can share a TCP connection between the client and the
+ server. All MRCPv2 messages exchanged between the client and the
+ server carry the specified channel identifier that the server MUST
+ ensure is unambiguous among all MRCPv2 control channels that are
+ active on that server. The client uses this channel identifier to
+ indicate the media processing resource associated with that channel.
+ For information on message framing, see Section 5.
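+
+   As an illustrative (non-normative) sketch, every message on such a
+   channel carries the channel identifier; here the request-id (543257)
+   and channel identifier are invented, and "..." stands for the
+   computed message length:
+
+      C->S:  MRCP/2.0 ... SPEAK 543257
+             Channel-Identifier:32AECB23433802@speechsynth
+             Content-Type:application/ssml+xml
+             Content-Length:...
+
+      S->C:  MRCP/2.0 ... 543257 200 IN-PROGRESS
+             Channel-Identifier:32AECB23433802@speechsynth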
+
+ SIP also establishes the media sessions between the client (or other
+ source/sink of media) and the MRCPv2 server using SDP "m=" lines.
+
+
+ One or more media processing resources may share a media session
+ under a SIP session, or each media processing resource may have its
+ own media session.
+
+ The following diagram shows the general architecture of a system that
+ uses MRCPv2. To simplify the diagram, only a few resources are
+ shown.
+
+ MRCPv2 client MRCPv2 Media Resource Server
+|--------------------| |------------------------------------|
+||------------------|| ||----------------------------------||
+|| Application Layer|| ||Synthesis|Recognition|Verification||
+||------------------|| || Engine | Engine | Engine ||
+||Media Resource API|| || || | || | || ||
+||------------------|| ||Synthesis|Recognizer | Verifier ||
+|| SIP | MRCPv2 || ||Resource | Resource | Resource ||
+||Stack | || || Media Resource Management ||
+|| | || ||----------------------------------||
+||------------------|| || SIP | MRCPv2 ||
+|| TCP/IP Stack ||---MRCPv2---|| Stack | ||
+|| || ||----------------------------------||
+||------------------||----SIP-----|| TCP/IP Stack ||
+|--------------------| || ||
+ | ||----------------------------------||
+ SIP |------------------------------------|
+ | /
+|-------------------| RTP
+| | /
+| Media Source/Sink |------------/
+| |
+|-------------------|
+
+ Figure 1: Architectural Diagram
+
+3.1. MRCPv2 Media Resource Types
+
+ An MRCPv2 server may offer one or more of the following media
+ processing resources to its clients.
+
+ Basic Synthesizer
+ A speech synthesizer resource that has very limited
+ capabilities and can generate its media stream
+ exclusively from concatenated audio clips. The speech
+ data is described using a limited subset of the Speech
+ Synthesis Markup Language (SSML)
+ [W3C.REC-speech-synthesis-20040907] elements. A basic
+ synthesizer MUST support the SSML tags <speak>,
+ <audio>, <say-as>, and <mark>.
+
+ Speech Synthesizer
+ A full-capability speech synthesis resource that can
+ render speech from text. Such a synthesizer MUST have
+ full SSML [W3C.REC-speech-synthesis-20040907] support.
+
+ Recorder
+ A resource capable of recording audio and providing a
+ URI pointer to the recording. A recorder MUST provide
+ endpointing capabilities for suppressing silence at
+ the beginning and end of a recording, and MAY also
+ suppress silence in the middle of a recording. If
+ such suppression is done, the recorder MUST maintain
+ timing metadata to indicate the actual timestamps of
+ the recorded media.
+
+ DTMF Recognizer
+ A recognizer resource capable of extracting and
+ interpreting Dual-Tone Multi-Frequency (DTMF) [Q.23]
+ digits in a media stream and matching them against a
+ supplied digit grammar. It could also do a semantic
+ interpretation based on semantic tags in the grammar.
+
+ Speech Recognizer
+ A full speech recognition resource that is capable of
+ receiving a media stream containing audio and
+ interpreting it to recognition results. It also has a
+ natural language semantic interpreter to post-process
+ the recognized data according to the semantic data in
+ the grammar and provide semantic results along with
+ the recognized input. The recognizer MAY also support
+ enrolled grammars, where the client can enroll and
+ create new personal grammars for use in future
+ recognition operations.
+
+ Speaker Verifier
+ A resource capable of verifying the authenticity of a
+ claimed identity by matching a media stream containing
+ spoken input to a pre-existing voiceprint. This may
+ also involve matching the caller's voice against more
+ than one voiceprint, also called multi-verification or
+ speaker identification.
+
+3.2. Server and Resource Addressing
+
+ The MRCPv2 server is a generic SIP server, and is thus addressed by a
+ SIP URI (RFC 3261 [RFC3261]).
+
+ For example:
+
+ sip:mrcpv2@example.net or
+ sips:mrcpv2@example.net
+
+4. MRCPv2 Basics
+
+ MRCPv2 requires a connection-oriented transport-layer protocol such
+ as TCP to guarantee reliable sequencing and delivery of MRCPv2
+ control messages between the client and the server. In order to meet
+ the requirements for security enumerated in SPEECHSC requirements
+ [RFC4313], clients and servers MUST implement TLS as well. One or
+ more connections between the client and the server can be shared
+ among different MRCPv2 channels to the server. The individual
+ messages carry the channel identifier to differentiate messages on
+ different channels. MRCPv2 encoding is text based with mechanisms to
+ carry embedded binary data. This allows arbitrary data like
+ recognition grammars, recognition results, synthesizer speech markup,
+ etc., to be carried in MRCPv2 messages. For information on message
+ framing, see Section 5.
+
+4.1. Connecting to the Server
+
+ MRCPv2 employs SIP, in conjunction with SDP, as the session
+ establishment and management protocol. The client reaches an MRCPv2
+ server using conventional INVITE and other SIP requests for
+ establishing, maintaining, and terminating SIP dialogs. The SDP
+ offer/answer exchange model over SIP is used to establish a resource
+ control channel for each resource. The SDP offer/answer exchange is
+ also used to establish media sessions between the server and the
+ source or sink of audio.
+
+4.2. Managing Resource Control Channels
+
+ The client needs a separate MRCPv2 resource control channel to
+ control each media processing resource under the SIP dialog. A
+ unique channel identifier string identifies these resource control
+ channels. The channel identifier is a difficult-to-guess,
+ unambiguous string followed by an "@", then by a string token
+ specifying the type of resource. The server generates the channel
+ identifier and MUST make sure it does not clash with the identifier
+ of any other MRCP channel currently allocated by that server. MRCPv2
+ defines the following IANA-registered types of media processing
+ resources. Additional resource types and their associated methods/
+ events and state machines may be added as described below in
+ Section 13.
+
+ +---------------+----------------------+--------------+
+ | Resource Type | Resource Description | Described in |
+ +---------------+----------------------+--------------+
+ | speechrecog | Speech Recognizer | Section 9 |
+ | dtmfrecog | DTMF Recognizer | Section 9 |
+ | speechsynth | Speech Synthesizer | Section 8 |
+ | basicsynth | Basic Synthesizer | Section 8 |
+ | speakverify | Speaker Verification | Section 11 |
+ | recorder | Speech Recorder | Section 10 |
+ +---------------+----------------------+--------------+
+
+ Table 1: Resource Types
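A minimal, non-normative sketch of validating the channel-identifier shape described above (an unambiguous string, "@", and one of the resource types from Table 1); the helper name is illustrative only.

```python
# Sketch: checking the form of an MRCPv2 channel identifier, i.e.
# "<difficult-to-guess-string>@<resource-type>", where the resource
# type is one of the IANA-registered values from Table 1.

RESOURCE_TYPES = {"speechrecog", "dtmfrecog", "speechsynth",
                  "basicsynth", "speakverify", "recorder"}

def parse_channel_identifier(channel):
    ident, sep, resource = channel.partition("@")
    if not sep or not ident or resource not in RESOURCE_TYPES:
        raise ValueError("malformed channel identifier: %r" % channel)
    return ident, resource
```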
+
+ The SIP INVITE or re-INVITE transaction and the SDP offer/answer
+ exchange it carries contain "m=" lines describing the resource
+ control channel to be allocated. There MUST be one SDP "m=" line for
+ each MRCPv2 resource to be used in the session. This "m=" line MUST
+ have a media type field of "application" and a transport type field
+ of either "TCP/MRCPv2" or "TCP/TLS/MRCPv2". The port number field of
+ the "m=" line MUST contain the "discard" port of the transport
+ protocol (port 9 for TCP) in the SDP offer from the client and MUST
+ contain the TCP listen port on the server in the SDP answer. The
+ client may then either set up a TCP or TLS connection to that server
+ port or share an already established connection to that port. Since
+ MRCPv2 allows multiple sessions to share the same TCP connection,
+ multiple "m=" lines in a single SDP document MAY share the same port
+ field value; MRCPv2 servers MUST NOT assume any relationship between
+ resources using the same port other than the sharing of the
+ communication channel.
+
+ MRCPv2 resources do not use the port or format field of the "m=" line
+ to distinguish themselves from other resources using the same
+ channel. The client MUST specify the resource type identifier in the
+ resource attribute associated with the control "m=" line of the SDP
+ offer. The server MUST respond with the full Channel-Identifier
+ (which includes the resource type identifier and a difficult-to-
+ guess, unambiguous string) in the "channel" attribute associated with
+ the control "m=" line of the SDP answer. To remain backwards
+ compatible with conventional SDP usage, the format field of the "m="
+ line MUST have the arbitrarily selected value of "1".
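The rules above for a client's control "m=" line can be sketched as follows; this is not a full SDP implementation, and the function name and parameters are invented for illustration.

```python
# Sketch: composing the control "m=" line and its attributes for a
# client's SDP offer: media type "application", the TCP discard port
# (9), transport "TCP/MRCPv2" or "TCP/TLS/MRCPv2", format field "1",
# and an a=resource attribute naming the resource type.

def control_offer_lines(resource_type, use_tls=False, connection="new"):
    proto = "TCP/TLS/MRCPv2" if use_tls else "TCP/MRCPv2"
    return [
        "m=application 9 %s 1" % proto,  # port 9 = TCP discard port
        "a=setup:active",                # the client is the active party
        "a=connection:%s" % connection,
        "a=resource:%s" % resource_type,
    ]
```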
+
+ When the client wants to add a media processing resource to the
+ session, it issues a new SDP offer, according to the procedures of
+ RFC 3264 [RFC3264], in a SIP re-INVITE request. The SDP offer/answer
+ exchange carried by this SIP transaction contains one or more
+ additional control "m=" lines for the new resources to be allocated
+ to the session. The server, on seeing the new "m=" line, allocates
+ the resources (if they are available) and responds with a
+ corresponding control "m=" line in the SDP answer carried in the SIP
+ response. If the new resources are not available, the re-INVITE
+ receives an error message, and existing media processing going on
+ before the re-INVITE will continue as it was before. It is not
+ possible to allocate more than one resource of each type. If a
+ client requests more than one resource of any type, the server MUST
+ behave as if the resources of that type (beyond the first one) are
+ not available.
+
+ MRCPv2 clients and servers using TCP as a transport protocol MUST use
+ the procedures specified in RFC 4145 [RFC4145] for setting up the TCP
+ connection, with the considerations described below.  Similarly,
+ MRCPv2 clients and servers using TCP/TLS as a transport protocol MUST
+ use the procedures specified in RFC 4572 [RFC4572] for setting up the
+ TLS connection, with the considerations described below.  The
+ a=setup attribute, as described in RFC 4145 [RFC4145], MUST be
+ "active" for the offer from the client and MUST be "passive" for the
+ answer from the MRCPv2 server. The a=connection attribute MUST have
+ a value of "new" on the very first control "m=" line offer from the
+ client to an MRCPv2 server. Subsequent control "m=" line offers from
+ the client to the MRCP server MAY contain "new" or "existing",
+ depending on whether the client wants to set up a new connection or
+ share an existing connection, respectively. If the client specifies
+ a value of "new", the server MUST respond with a value of "new". If
+ the client specifies a value of "existing", the server MUST respond
+ with either "existing", if it prefers to share an existing
+ connection, or "new", if it does not; in the latter case, the client
+ MUST initiate a new transport connection.
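The server's choice of a=connection value in its answer, per the rules above, amounts to the following small decision function (a non-normative sketch; the names are illustrative):

```python
# Sketch of the server's a=connection answer logic: an offer of "new"
# MUST be answered with "new"; an offer of "existing" may be answered
# with "existing" (share the connection) or "new" (the client must
# then open a fresh transport connection).

def answer_connection(offer_value, server_wants_to_share):
    if offer_value == "new":
        return "new"
    if offer_value == "existing":
        return "existing" if server_wants_to_share else "new"
    raise ValueError("unknown a=connection value: %r" % offer_value)
```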
+
+ When the client wants to deallocate the resource from this session,
+ it issues a new SDP offer, according to RFC 3264 [RFC3264], where the
+ control "m=" line port MUST be set to 0. This SDP offer is sent in a
+ SIP re-INVITE request. This deallocates the associated MRCPv2
+ identifier and resource. The server MUST NOT close the TCP or TLS
+ connection if it is currently being shared among multiple MRCP
+ channels. When all MRCP channels that may be sharing the connection
+ are released and/or the associated SIP dialog is terminated, the
+ client or server terminates the connection.
+
+ When the client wants to tear down the whole session and all its
+ resources, it MUST issue a SIP BYE request to close the SIP session.
+ This will deallocate all the control channels and resources allocated
+ under the session.
+
+ All servers MUST support TLS. Servers MAY use TCP without TLS in
+ controlled environments (e.g., not in the public Internet) where both
+ nodes are inside a protected perimeter, for example, preventing
+ access to the MRCP server from remote nodes outside the controlled
+ perimeter. It is up to the client, through the SDP offer, to choose
+ which transport it wants to use for an MRCPv2 session. Aside from
+ the exceptions given above, when using TCP, the "m=" lines MUST
+ conform to RFC 4145 [RFC4145], which describes the usage of SDP for
+ connection-oriented transport. When using TLS, the SDP "m=" line for
+ the control stream MUST conform to Connection-Oriented Media
+ (COMEDIA) over TLS [RFC4572], which specifies the usage of SDP for
+ establishing a secure connection-oriented transport over TLS.
+
+4.3. SIP Session Example
+
+ This first example shows the power of using SIP to route to the
+ appropriate resource. In the example, note the use of a request to a
+ domain's speech server service in the INVITE to
+ mresources@example.com. The SIP routing machinery in the domain
+ locates the actual server, mresources@server.example.com, which gets
+ returned in the 200 OK. Note that "cmid" is defined in Section 4.4.
+
+ This example exchange adds a resource control channel for a
+ synthesizer. Since a synthesizer also generates an audio stream,
+ this interaction also creates a receive-only Real-time Transport
+ Protocol (RTP) [RFC3550] media session over which the server sends
+ audio.  The SIP
+ dialog with the media source/sink is independent of MRCP and is not
+ shown.
+
+ C->S: INVITE sip:mresources@example.com SIP/2.0
+ Via:SIP/2.0/TCP client.atlanta.example.com:5060;
+ branch=z9hG4bK74bf1
+ Max-Forwards:6
+ To:MediaServer <sip:mresources@example.com>
+ From:sarvi <sip:sarvi@example.com>;tag=1928301774
+ Call-ID:a84b4c76e66710
+ CSeq:314161 INVITE
+ Contact:<sip:sarvi@client.example.com>
+ Content-Type:application/sdp
+ Content-Length:...
+
+ v=0
+ o=sarvi 2890844526 2890844526 IN IP4 192.0.2.12
+ s=-
+ c=IN IP4 192.0.2.12
+ t=0 0
+ m=application 9 TCP/MRCPv2 1
+ a=setup:active
+ a=connection:new
+ a=resource:speechsynth
+ a=cmid:1
+ m=audio 49170 RTP/AVP 0
+ a=rtpmap:0 pcmu/8000
+ a=recvonly
+ a=mid:1
+
+
+ S->C: SIP/2.0 200 OK
+ Via:SIP/2.0/TCP client.atlanta.example.com:5060;
+ branch=z9hG4bK74bf1;received=192.0.32.10
+ To:MediaServer <sip:mresources@example.com>;tag=62784
+ From:sarvi <sip:sarvi@example.com>;tag=1928301774
+ Call-ID:a84b4c76e66710
+ CSeq:314161 INVITE
+ Contact:<sip:mresources@server.example.com>
+ Content-Type:application/sdp
+ Content-Length:...
+
+ v=0
+ o=- 2890842808 2890842808 IN IP4 192.0.2.11
+ s=-
+ c=IN IP4 192.0.2.11
+ t=0 0
+ m=application 32416 TCP/MRCPv2 1
+ a=setup:passive
+ a=connection:new
+ a=channel:32AECB234338@speechsynth
+ a=cmid:1
+ m=audio 48260 RTP/AVP 0
+ a=rtpmap:0 pcmu/8000
+ a=sendonly
+ a=mid:1
+
+
+ C->S: ACK sip:mresources@server.example.com SIP/2.0
+ Via:SIP/2.0/TCP client.atlanta.example.com:5060;
+ branch=z9hG4bK74bf2
+ Max-Forwards:6
+ To:MediaServer <sip:mresources@example.com>;tag=62784
+ From:Sarvi <sip:sarvi@example.com>;tag=1928301774
+ Call-ID:a84b4c76e66710
+ CSeq:314161 ACK
+ Content-Length:0
+
+ Example: Add Synthesizer Control Channel
+
+ This example exchange continues from the previous figure and
+ allocates an additional resource control channel for a recognizer.
+ Since a recognizer would need to receive an audio stream for
+ recognition, this interaction also updates the audio stream to
+ sendrecv, making it a two-way RTP media session.
+
+ C->S: INVITE sip:mresources@server.example.com SIP/2.0
+ Via:SIP/2.0/TCP client.atlanta.example.com:5060;
+ branch=z9hG4bK74bf3
+ Max-Forwards:6
+ To:MediaServer <sip:mresources@example.com>;tag=62784
+ From:sarvi <sip:sarvi@example.com>;tag=1928301774
+ Call-ID:a84b4c76e66710
+ CSeq:314162 INVITE
+ Contact:<sip:sarvi@client.example.com>
+ Content-Type:application/sdp
+ Content-Length:...
+
+ v=0
+ o=sarvi 2890844526 2890844527 IN IP4 192.0.2.12
+ s=-
+ c=IN IP4 192.0.2.12
+ t=0 0
+ m=application 9 TCP/MRCPv2 1
+ a=setup:active
+ a=connection:existing
+ a=resource:speechsynth
+ a=cmid:1
+ m=audio 49170 RTP/AVP 0 96
+ a=rtpmap:0 pcmu/8000
+ a=rtpmap:96 telephone-event/8000
+ a=fmtp:96 0-15
+ a=sendrecv
+ a=mid:1
+ m=application 9 TCP/MRCPv2 1
+ a=setup:active
+ a=connection:existing
+ a=resource:speechrecog
+ a=cmid:1
+
+
+ S->C: SIP/2.0 200 OK
+ Via:SIP/2.0/TCP client.atlanta.example.com:5060;
+ branch=z9hG4bK74bf3;received=192.0.32.10
+ To:MediaServer <sip:mresources@example.com>;tag=62784
+ From:sarvi <sip:sarvi@example.com>;tag=1928301774
+ Call-ID:a84b4c76e66710
+ CSeq:314162 INVITE
+ Contact:<sip:mresources@server.example.com>
+ Content-Type:application/sdp
+ Content-Length:...
+
+ v=0
+ o=- 2890842808 2890842809 IN IP4 192.0.2.11
+ s=-
+ c=IN IP4 192.0.2.11
+ t=0 0
+ m=application 32416 TCP/MRCPv2 1
+ a=setup:passive
+ a=connection:existing
+ a=channel:32AECB234338@speechsynth
+ a=cmid:1
+ m=audio 48260 RTP/AVP 0 96
+ a=rtpmap:0 pcmu/8000
+ a=rtpmap:96 telephone-event/8000
+ a=fmtp:96 0-15
+ a=sendrecv
+ a=mid:1
+ m=application 32416 TCP/MRCPv2 1
+ a=setup:passive
+ a=connection:existing
+ a=channel:32AECB234338@speechrecog
+ a=cmid:1
+
+
+ C->S: ACK sip:mresources@server.example.com SIP/2.0
+ Via:SIP/2.0/TCP client.atlanta.example.com:5060;
+ branch=z9hG4bK74bf4
+ Max-Forwards:6
+ To:MediaServer <sip:mresources@example.com>;tag=62784
+ From:Sarvi <sip:sarvi@example.com>;tag=1928301774
+ Call-ID:a84b4c76e66710
+ CSeq:314162 ACK
+ Content-Length:0
+
+ Example: Add Recognizer
+
+ This example exchange continues from the previous figure and
+ deallocates the recognizer channel. Since a recognizer no longer
+ needs to receive an audio stream, this interaction also updates the
+ RTP media session to recvonly.
+
+ C->S: INVITE sip:mresources@server.example.com SIP/2.0
+ Via:SIP/2.0/TCP client.atlanta.example.com:5060;
+ branch=z9hG4bK74bf5
+ Max-Forwards:6
+ To:MediaServer <sip:mresources@example.com>;tag=62784
+ From:sarvi <sip:sarvi@example.com>;tag=1928301774
+ Call-ID:a84b4c76e66710
+ CSeq:314163 INVITE
+ Contact:<sip:sarvi@client.example.com>
+ Content-Type:application/sdp
+ Content-Length:...
+
+ v=0
+ o=sarvi 2890844526 2890844528 IN IP4 192.0.2.12
+ s=-
+ c=IN IP4 192.0.2.12
+ t=0 0
+ m=application 9 TCP/MRCPv2 1
+ a=resource:speechsynth
+ a=cmid:1
+ m=audio 49170 RTP/AVP 0
+ a=rtpmap:0 pcmu/8000
+ a=recvonly
+ a=mid:1
+ m=application 0 TCP/MRCPv2 1
+ a=resource:speechrecog
+ a=cmid:1
+
+
+ S->C: SIP/2.0 200 OK
+ Via:SIP/2.0/TCP client.atlanta.example.com:5060;
+ branch=z9hG4bK74bf5;received=192.0.32.10
+ To:MediaServer <sip:mresources@example.com>;tag=62784
+ From:sarvi <sip:sarvi@example.com>;tag=1928301774
+ Call-ID:a84b4c76e66710
+ CSeq:314163 INVITE
+ Contact:<sip:mresources@server.example.com>
+ Content-Type:application/sdp
+ Content-Length:...
+
+ v=0
+ o=- 2890842808 2890842810 IN IP4 192.0.2.11
+ s=-
+ c=IN IP4 192.0.2.11
+ t=0 0
+ m=application 32416 TCP/MRCPv2 1
+ a=channel:32AECB234338@speechsynth
+ a=cmid:1
+ m=audio 48260 RTP/AVP 0
+ a=rtpmap:0 pcmu/8000
+ a=sendonly
+ a=mid:1
+ m=application 0 TCP/MRCPv2 1
+ a=channel:32AECB234338@speechrecog
+ a=cmid:1
+
+ C->S: ACK sip:mresources@server.example.com SIP/2.0
+ Via:SIP/2.0/TCP client.atlanta.example.com:5060;
+ branch=z9hG4bK74bf6
+ Max-Forwards:6
+ To:MediaServer <sip:mresources@example.com>;tag=62784
+ From:Sarvi <sip:sarvi@example.com>;tag=1928301774
+ Call-ID:a84b4c76e66710
+ CSeq:314163 ACK
+ Content-Length:0
+
+ Example: Deallocate Recognizer
+
+4.4. Media Streams and RTP Ports
+
+ Since MRCPv2 resources either generate or consume media streams, the
+ client or the server needs to associate media sessions with their
+ corresponding resource or resources. More than one resource could be
+ associated with a single media session or each resource could be
+ assigned a separate media session. Also, note that more than one
+ media session can be associated with a single resource if need be,
+ but this scenario is not useful for the current set of resources.
+ For example, a synthesizer and a recognizer could be associated with
+ the same media session (m=audio line), if it is opened in "sendrecv"
+ mode. Alternatively, the recognizer could have its own "sendonly"
+ audio session, and the synthesizer could have its own "recvonly"
+ audio session.
+
+ The association between control channels and their corresponding
+ media sessions is established using a new "resource channel media
+ identifier" media-level attribute ("cmid"). Valid values of this
+ attribute are the values of the "mid" attribute defined in RFC 5888
+ [RFC5888]. If there is more than one audio "m=" line, then each
+ audio "m=" line MUST have a "mid" attribute. Each control "m=" line
+ MAY have one or more "cmid" attributes that match the resource
+ control channel to the "mid" attributes of the audio "m=" lines it is
+ associated with.  Note that a control "m=" line without a "cmid"
+ attribute is not associated with any media, so operations on such a
+ resource are limited.  For example, for a recognizer resource, the
+ RECOGNIZE method requires associated media to process, while the
+ INTERPRET method does not.  The
+ formatting of the "cmid" attribute is described by the following
+ ABNF:
+
+ cmid-attribute = "a=cmid:" identification-tag
+ identification-tag = token
+
+ To allow this flexible mapping of media sessions to MRCPv2 control
+ channels, a single audio "m=" line can be associated with multiple
+ resources, or each resource can have its own audio "m=" line. For
+ example, if the client wants to allocate a recognizer and a
+ synthesizer and associate them with a single two-way audio stream,
+ the SDP offer would contain two control "m=" lines and a single audio
+ "m=" line with an attribute of "sendrecv". Each of the control "m="
+ lines would have a "cmid" attribute whose value matches the "mid" of
+ the audio "m=" line. If, on the other hand, the client wants to
+ allocate a recognizer and a synthesizer each with its own separate
+ audio stream, the SDP offer would carry two control "m=" lines (one
+ for the recognizer and another for the synthesizer) and two audio
+ "m=" lines (one with the attribute "sendonly" and another with
+ attribute "recvonly"). The "cmid" attribute of the recognizer
+ control "m=" line would match the "mid" value of the "sendonly" audio
+ "m=" line, and the "cmid" attribute of the synthesizer control "m="
+ line would match the "mid" attribute of the "recvonly" "m=" line.
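The cmid/mid pairing described above can be sketched with a deliberately minimal SDP walk (non-normative; not a general SDP parser, and the function name is invented for this sketch):

```python
# Sketch: pairing control "m=" lines with audio "m=" lines by matching
# each control line's a=cmid values against the audio lines' a=mid
# values.  Returns {channel identifier: [associated audio m= lines]}.

def associate_channels(sdp_text):
    sections, current = [], None
    for line in sdp_text.splitlines():
        if line.startswith("m="):
            current = {"m": line, "mid": None, "cmid": [], "channel": None}
            sections.append(current)
        elif current is not None and line.startswith("a=mid:"):
            current["mid"] = line[len("a=mid:"):]
        elif current is not None and line.startswith("a=cmid:"):
            current["cmid"].append(line[len("a=cmid:"):])
        elif current is not None and line.startswith("a=channel:"):
            current["channel"] = line[len("a=channel:"):]
    mids = {s["mid"]: s for s in sections if s["mid"] is not None}
    return {s["channel"]: [mids[c]["m"] for c in s["cmid"] if c in mids]
            for s in sections if s["channel"]}
```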
+
+ When a server receives media (e.g., audio) on a media session that is
+ associated with more than one media processing resource, it is the
+ responsibility of the server to receive and fork the media to the
+ resources that need to consume it. If multiple resources in an
+ MRCPv2 session are generating audio (or other media) to be sent on a
+ single associated media session, it is the responsibility of the
+ server either to multiplex the multiple streams onto the single RTP
+ session or to contain an embedded RTP mixer (see RFC 3550 [RFC3550])
+ to combine the multiple streams into one. In the former case, the
+ media stream will contain RTP packets generated by different sources,
+ and hence the packets will have different Synchronization Source
+ Identifiers (SSRCs). In the latter case, the RTP packets will
+ contain multiple Contributing Source Identifiers (CSRCs)
+ corresponding to the original streams before being combined by the
+ mixer. If an MRCPv2 server implementation neither multiplexes nor
+ mixes, it MUST disallow the client from associating multiple such
+ resources to a single audio stream by rejecting the SDP offer with a
+ SIP 488 "Not Acceptable" error. Note that there is a large installed
+ base that will return a SIP 501 "Not Implemented" error in this case.
+ To facilitate interoperability with this installed base, new
+ implementations SHOULD treat a 501 in this context as a 488 when it
+ is received from an element known to be a legacy implementation.
+
+4.5. MRCPv2 Message Transport
+
+ The MRCPv2 messages defined in this document are transported over a
+ TCP or TLS connection between the client and the server. The method
+ for setting up this transport connection and the resource control
+ channel is discussed in Sections 4.1 and 4.2. Multiple resource
+ control channels between a client and a server that belong to
+ different SIP dialogs can share one or more TLS or TCP connections
+ between them; the server and client MUST support this mode of
+ operation. Clients and servers MUST use the MRCPv2 channel
+ identifier, carried in the Channel-Identifier header field in
+ individual MRCPv2 messages, to differentiate MRCPv2 messages from
+ different resource channels (see Section 6.2.1 for details). All
+ MRCPv2 servers MUST support TLS. Servers MAY use TCP without TLS in
+ controlled environments (e.g., not in the public Internet) where both
+ nodes are inside a protected perimeter, for example, preventing
+ access to the MRCP server from remote nodes outside the controlled
+ perimeter. It is up to the client to choose which mode of transport
+ it wants to use for an MRCPv2 session.
+
+ Most examples from here on show only the MRCPv2 messages and do not
+ show the SIP messages that may have been used to establish the MRCPv2
+ control channel.
+
+4.6. MRCPv2 Session Termination
+
+ If an MRCP client notices that the underlying connection has been
+ closed for one of its MRCP channels, and it has not previously
+ initiated a re-INVITE to close that channel, it MUST send a BYE to
+ close down the SIP dialog and all other MRCP channels. If an MRCP
+ server notices that the underlying connection has been closed for one
+ of its MRCP channels, and it has not previously received and accepted
+ a re-INVITE closing that channel, then it MUST send a BYE to close
+ down the SIP dialog and all other MRCP channels.
+
+5. MRCPv2 Specification
+
+ Except as otherwise indicated, MRCPv2 messages are Unicode encoded in
+ UTF-8 (RFC 3629 [RFC3629]) to allow many different languages to be
+ represented. DEFINE-GRAMMAR (Section 9.8), for example, is one such
+ exception, since its body can contain arbitrary XML in arbitrary (but
+ specified via XML) encodings. MRCPv2 also allows message bodies to
+ be represented in other character sets (for example, ISO 8859-1
+ [ISO.8859-1.1987]) because, in some locales, other character sets are
+ already in widespread use. The MRCPv2 headers (the first line of an
+ MRCP message) and header field names use only the US-ASCII subset of
+ UTF-8.
+
+ Lines are terminated by CRLF (carriage return, then line feed).
+ Also, some parameters in the message may contain binary data or a
+ record spanning multiple lines. Such fields have a length value
+ associated with the parameter, which indicates the number of octets
+ immediately following the parameter.
+
+5.1. Common Protocol Elements
+
+ The MRCPv2 message set consists of requests from the client to the
+ server, responses from the server to the client, and asynchronous
+ events from the server to the client. All these messages consist of
+ a start-line, one or more header fields, an empty line (i.e., a line
+ with nothing preceding the CRLF) indicating the end of the header
+ fields, and an optional message body.
+
+generic-message = start-line
+ message-header
+ CRLF
+ [ message-body ]
+
+message-body = *OCTET
+
+start-line = request-line / response-line / event-line
+
+message-header = 1*(generic-header / resource-header / generic-field)
+
+resource-header = synthesizer-header
+ / recognizer-header
+ / recorder-header
+ / verifier-header
+
+ The message-body contains resource-specific and message-specific
+ data. The actual media types used to carry the data are specified in
+ the sections defining the individual messages. Generic header fields
+ are described in Section 6.2.
+
+ If a message contains a message body, the message MUST contain
+ content-headers indicating the media type and encoding of the data in
+ the message body.
+
+ Request, response, and event messages (described in the following
+ sections) include the version of MRCP to which the message conforms.
+ Version compatibility rules follow [H3.1] regarding version ordering,
+ compliance requirements, and upgrading of version numbers. The
+ version information is indicated by "MRCP" (as opposed to "HTTP" in
+ [H3.1]) or "MRCP/2.0" (as opposed to "HTTP/1.1" in [H3.1]). To be
+ compliant with this specification, clients and servers sending MRCPv2
+ messages MUST indicate an mrcp-version of "MRCP/2.0". ABNF
+ productions using mrcp-version can be found in Sections 5.2, 5.3, and
+ 5.5.
+
+ mrcp-version = "MRCP" "/" 1*2DIGIT "." 1*2DIGIT
+
+ The message-length field specifies the length of the message in
+ octets, including the start-line, and MUST be the second token from
+ the beginning of the message. This is to make the framing and
+ parsing of the message simpler to do. This field specifies the
+ length of the message including data that may be encoded into the
+ body of the message. Note that this value MAY be given as a fixed-
+ length integer that is zero-padded (with leading zeros) in order to
+ eliminate or reduce inefficiency in cases where the message-length
+ value would change as a result of the length of the message-length
+ token itself. This value, as with all lengths in MRCP, is to be
+ interpreted as a base-10 number. In particular, leading zeros do not
+ indicate that the value is to be interpreted as a base-8 number.
+
+ message-length = 1*19DIGIT
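The fixed-width, zero-padded message-length trick described above can be sketched as follows (a non-normative illustration; the function and its parameters are invented for this sketch):

```python
# Sketch: framing an MRCPv2 request so that message-length (the second
# token of the start-line) covers the whole message, including itself.
# A fixed-width, zero-padded length avoids the chicken-and-egg problem
# of the length value changing its own number of digits.

def frame_request(method, request_id, headers, body=b"", width=10):
    header_block = "".join("%s:%s\r\n" % kv for kv in headers)
    def build(length_token):
        start = "MRCP/2.0 %s %s %s\r\n" % (length_token, method, request_id)
        return start.encode() + header_block.encode() + b"\r\n" + body
    # Measure with a placeholder of the same width, then fill it in.
    total = len(build("0" * width))
    return build(str(total).zfill(width))
```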
+
+ The following sample MRCP exchange demonstrates proper message-length
+ values. The values for message-length have been removed from all
+ other examples in the specification and replaced by '...' to reduce
+ confusion in the case of minor message-length computation errors in
+ those examples.
+
+ C->S: MRCP/2.0 877 INTERPRET 543266
+ Channel-Identifier:32AECB23433801@speechrecog
+ Interpret-Text:may I speak to Andre Roy
+ Content-Type:application/srgs+xml
+ Content-ID:<request1@form-level.store>
+ Content-Length:661
+
+ <?xml version="1.0"?>
+ <!-- the default grammar language is US English -->
+ <grammar xmlns="http://www.w3.org/2001/06/grammar"
+ xml:lang="en-US" version="1.0" root="request">
+ <!-- single language attachment to tokens -->
+ <rule id="yes">
+ <one-of>
+ <item xml:lang="fr-CA">oui</item>
+ <item xml:lang="en-US">yes</item>
+ </one-of>
+ </rule>
+
+ <!-- single language attachment to a rule expansion -->
+ <rule id="request">
+ may I speak to
+ <one-of xml:lang="fr-CA">
+ <item>Michel Tremblay</item>
+ <item>Andre Roy</item>
+ </one-of>
+ </rule>
+ </grammar>
+
+ S->C: MRCP/2.0 82 543266 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433801@speechrecog
+
+ S->C: MRCP/2.0 634 INTERPRETATION-COMPLETE 543266 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speechrecog
+ Completion-Cause:000 success
+ Content-Type:application/nlsml+xml
+ Content-Length:441
+
+ <?xml version="1.0"?>
+ <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
+ xmlns:ex="http://www.example.com/example"
+ grammar="session:request1@form-level.store">
+ <interpretation>
+ <instance name="Person">
+ <ex:Person>
+ <ex:Name> Andre Roy </ex:Name>
+ </ex:Person>
+ </instance>
+ <input> may I speak to Andre Roy </input>
+ </interpretation>
+ </result>
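As an informal illustration (not part of the protocol definition): because message-length counts every octet of the message, including the start-line that carries the token itself, an encoder faces a small fixed-point computation. The sketch below iterates until the value is self-consistent; the function name and framing are local choices, not from this specification.

```python
def encode_request(version: str, method: str, request_id: str, rest: str) -> str:
    """Build a request whose message-length token is self-consistent.

    `rest` is the already-encoded header section and body.  The loop
    re-derives the length until it stops changing; an encoder could
    instead reserve a fixed-width field padded with leading zeros,
    since the value is always read as base-10.
    """
    length = 0
    while True:
        start_line = f"{version} {length} {method} {request_id}\r\n"
        total = len(start_line) + len(rest)
        if total == length:
            return start_line + rest
        length = total
```

Running the helper on a small message yields a start-line whose second token equals the byte length of the whole message.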
+
+   All MRCPv2 requests, responses, and events MUST carry the Channel-
+   Identifier header field so that the server or client can
+   differentiate messages from different control channels that may
+   share the same transport connection.
+
+ In the resource-specific header field descriptions in Sections 8-11,
+ a header field is disallowed on a method (request, response, or
+ event) for that resource unless specifically listed as being allowed.
+ Also, the phrasing "This header field MAY occur on method X"
+ indicates that the header field is allowed on that method but is not
+ required to be used in every instance of that method.
+
+
+5.2. Request
+
+ An MRCPv2 request consists of a Request line followed by the message
+ header section and an optional message body containing data specific
+ to the request message.
+
+   The Request message from a client to the server includes within the
+   first line the method to be applied, a method tag for that request,
+   and the version of the protocol in use.
+
+ request-line = mrcp-version SP message-length SP method-name
+ SP request-id CRLF
+
+ The mrcp-version field is the MRCP protocol version that is being
+ used by the client.
+
+ The message-length field specifies the length of the message,
+ including the start-line.
+
+ Details about the mrcp-version and message-length fields are given in
+ Section 5.1.
+
+ The method-name field identifies the specific request that the client
+ is making to the server. Each resource supports a subset of the
+ MRCPv2 methods. The subset for each resource is defined in the
+ section of the specification for the corresponding resource.
+
+ method-name = generic-method
+ / synthesizer-method
+ / recognizer-method
+ / recorder-method
+ / verifier-method
+
+ The request-id field is a unique identifier representable as an
+ unsigned 32-bit integer created by the client and sent to the server.
+ Clients MUST utilize monotonically increasing request-ids for
+ consecutive requests within an MRCP session. The request-id space is
+ linear (i.e., not mod(32)), so the space does not wrap, and validity
+ can be checked with a simple unsigned comparison operation. The
+ client may choose any initial value for its first request, but a
+ small integer is RECOMMENDED to avoid exhausting the space in long
+ sessions. If the server receives duplicate or out-of-order requests,
+ the server MUST reject the request with a response code of 410.
+ Since request-ids are scoped to the MRCP session, they are unique
+ across all TCP connections and all resource channels in the session.
+
+ The server resource MUST use the client-assigned identifier in its
+ response to the request. If the request does not complete
+ synchronously, future asynchronous events associated with this
+ request MUST carry the client-assigned request-id.
+
+ request-id = 1*10DIGIT
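As an informal illustration, the monotonicity rule above reduces to one unsigned comparison per request on the server side. The class below is a local sketch (the 404 branch for an unrepresentable value is an assumption, not mandated here; only the 410 behavior is taken from the text above).

```python
MAX_REQUEST_ID = 2**32 - 1  # request-ids are unsigned 32-bit values

class RequestIdChecker:
    """Per-session tracker for the linear, non-wrapping request-id space."""

    def __init__(self) -> None:
        self.highest_seen = None

    def check(self, request_id: int) -> int:
        # duplicate or out-of-order request-ids MUST draw a 410 response
        if self.highest_seen is not None and request_id <= self.highest_seen:
            return 410
        if request_id > MAX_REQUEST_ID:
            return 404  # illustrative: value not representable in 32 bits
        self.highest_seen = request_id
        return 200
```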
+
+5.3. Response
+
+ After receiving and interpreting the request message for a method,
+ the server resource responds with an MRCPv2 response message. The
+ response consists of a response line followed by the message header
+ section and an optional message body containing data specific to the
+ method.
+
+ response-line = mrcp-version SP message-length SP request-id
+ SP status-code SP request-state CRLF
+
+ The mrcp-version field MUST contain the version of the request if
+ supported; otherwise, it MUST contain the highest version of MRCP
+ supported by the server.
+
+ The message-length field specifies the length of the message,
+ including the start-line.
+
+ Details about the mrcp-version and message-length fields are given in
+ Section 5.1.
+
+ The request-id used in the response MUST match the one sent in the
+ corresponding request message.
+
+ The status-code field is a 3-digit code representing the success or
+ failure or other status of the request.
+
+ status-code = 3DIGIT
+
+ The request-state field indicates if the action initiated by the
+ Request is PENDING, IN-PROGRESS, or COMPLETE. The COMPLETE status
+ means that the request was processed to completion and that there
+ will be no more events or other messages from that resource to the
+ client with that request-id. The PENDING status means that the
+ request has been placed in a queue and will be processed in first-in-
+ first-out order. The IN-PROGRESS status means that the request is
+ being processed and is not yet complete. A PENDING or IN-PROGRESS
+ status indicates that further Event messages may be delivered with
+ that request-id.
+
+ request-state = "COMPLETE"
+ / "IN-PROGRESS"
+ / "PENDING"
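As an informal illustration, the response-line grammar above translates directly into a pattern; the group names below are local choices, not protocol elements.

```python
import re

# mrcp-version SP message-length SP request-id SP status-code
# SP request-state CRLF
RESPONSE_LINE = re.compile(
    r"(?P<version>MRCP/\d+\.\d+) "
    r"(?P<length>\d{1,19}) "
    r"(?P<request_id>\d{1,10}) "
    r"(?P<status>\d{3}) "
    r"(?P<state>COMPLETE|IN-PROGRESS|PENDING)\r\n"
)

m = RESPONSE_LINE.fullmatch("MRCP/2.0 82 543266 200 IN-PROGRESS\r\n")
assert m is not None
assert m.group("status") == "200" and m.group("state") == "IN-PROGRESS"
```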
+
+5.4. Status Codes
+
+ The status codes are classified under the Success (2xx), Client
+ Failure (4xx), and Server Failure (5xx) codes.
+
+ +------------+--------------------------------------------------+
+ | Code | Meaning |
+ +------------+--------------------------------------------------+
+ | 200 | Success |
+ | 201 | Success with some optional header fields ignored |
+ +------------+--------------------------------------------------+
+
+ Success (2xx)
+
+ +--------+----------------------------------------------------------+
+ | Code | Meaning |
+ +--------+----------------------------------------------------------+
+ | 401 | Method not allowed |
+ | 402 | Method not valid in this state |
+ | 403 | Unsupported header field |
+ | 404 | Illegal value for header field. This is the error for a |
+ | | syntax violation. |
+ | 405 | Resource not allocated for this session or does not |
+ | | exist |
+ | 406 | Mandatory Header Field Missing |
+ | 407 | Method or Operation Failed (e.g., Grammar compilation |
+ | | failed in the recognizer. Detailed cause codes might be |
+ | | available through a resource-specific header.) |
+ | 408 | Unrecognized or unsupported message entity |
+ | 409 | Unsupported Header Field Value. This is a value that is |
+ | | syntactically legal but exceeds the implementation's |
+ | | capabilities or expectations. |
+ | 410 | Non-Monotonic or Out-of-order sequence number in request.|
+ | 411-420| Reserved for future assignment |
+ +--------+----------------------------------------------------------+
+
+ Client Failure (4xx)
+
+ +------------+--------------------------------+
+ | Code | Meaning |
+ +------------+--------------------------------+
+ | 501 | Server Internal Error |
+ | 502 | Protocol Version not supported |
+ | 503 | Reserved for future assignment |
+ | 504 | Message too large |
+ +------------+--------------------------------+
+
+ Server Failure (5xx)
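As an informal illustration, the three classes defined by the tables above can be dispatched on with simple range checks (the function name and labels are local choices):

```python
def status_class(code: int) -> str:
    # 2xx success, 4xx client failure, 5xx server failure per the
    # tables above; other ranges are not defined by MRCPv2
    if 200 <= code <= 299:
        return "success"
    if 400 <= code <= 499:
        return "client-failure"
    if 500 <= code <= 599:
        return "server-failure"
    return "undefined"
```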
+
+5.5. Events
+
+ The server resource may need to communicate a change in state or the
+ occurrence of a certain event to the client. These messages are used
+ when a request does not complete immediately and the response returns
+ a status of PENDING or IN-PROGRESS. The intermediate results and
+ events of the request are indicated to the client through the event
+ message from the server. The event message consists of an event
+ header line followed by the message header section and an optional
+ message body containing data specific to the event message. The
+ header line has the request-id of the corresponding request and
+ status value. The request-state value is COMPLETE if the request is
+ done and this was the last event, else it is IN-PROGRESS.
+
+ event-line = mrcp-version SP message-length SP event-name
+ SP request-id SP request-state CRLF
+
+ The mrcp-version used here is identical to the one used in the
+ Request/Response line and indicates the highest version of MRCP
+ running on the server.
+
+ The message-length field specifies the length of the message,
+ including the start-line.
+
+ Details about the mrcp-version and message-length fields are given in
+ Section 5.1.
+
+ The event-name identifies the nature of the event generated by the
+ media resource. The set of valid event names depends on the resource
+ generating it. See the corresponding resource-specific section of
+ the document.
+
+ event-name = synthesizer-event
+ / recognizer-event
+ / recorder-event
+ / verifier-event
+
+ The request-id used in the event MUST match the one sent in the
+ request that caused this event.
+
+   The request-state field indicates whether the Request/Command causing
+   this event is complete or still in progress; its semantics are the
+   same as those given in Section 5.3.  The final event for a request
+   has a COMPLETE status indicating the completion of the request.
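As an informal illustration, requests, responses, and events share the `mrcp-version SP message-length` prefix, so a parser can distinguish the three start-lines by the third token: all digits means a response (it is the request-id), otherwise the token is a method or event name. The event-name set below is an illustrative subset only; the authoritative names are resource-specific.

```python
EVENT_NAMES = {
    # illustrative subset; the full sets are defined per resource
    "SPEAK-COMPLETE", "SPEECH-MARKER", "START-OF-INPUT",
    "RECOGNITION-COMPLETE", "INTERPRETATION-COMPLETE",
    "RECORD-COMPLETE", "VERIFICATION-COMPLETE",
}

def classify_start_line(line: str) -> str:
    version, length, third = line.split(" ")[:3]
    if third.isdigit():
        return "response"   # third token is a request-id
    return "event" if third in EVENT_NAMES else "request"
```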
+
+
+6. MRCPv2 Generic Methods, Headers, and Result Structure
+
+ MRCPv2 supports a set of methods and header fields that are common to
+ all resources. These are discussed here; resource-specific methods
+ and header fields are discussed in the corresponding resource-
+ specific section of the document.
+
+6.1. Generic Methods
+
+ MRCPv2 supports two generic methods for reading and writing the state
+ associated with a resource.
+
+ generic-method = "SET-PARAMS"
+ / "GET-PARAMS"
+
+ These are described in the following subsections.
+
+6.1.1. SET-PARAMS
+
+ The SET-PARAMS method, from the client to the server, tells the
+ MRCPv2 resource to define parameters for the session, such as voice
+ characteristics and prosody on synthesizers, recognition timers on
+ recognizers, etc. If the server accepts and sets all parameters, it
+ MUST return a response status-code of 200. If it chooses to ignore
+ some optional header fields that can be safely ignored without
+ affecting operation of the server, it MUST return 201.
+
+ If one or more of the header fields being sent is incorrect, error
+ 403, 404, or 409 MUST be returned as follows:
+
+ o If one or more of the header fields being set has an illegal
+ value, the server MUST reject the request with a 404 Illegal Value
+ for Header Field.
+
+ o If one or more of the header fields being set is unsupported for
+ the resource, the server MUST reject the request with a 403
+ Unsupported Header Field, except as described in the next
+ paragraph.
+
+ o If one or more of the header fields being set has an unsupported
+ value, the server MUST reject the request with a 409 Unsupported
+ Header Field Value, except as described in the next paragraph.
+
+ If both error 404 and another error have occurred, only error 404
+ MUST be returned. If both errors 403 and 409 have occurred, but not
+ error 404, only error 403 MUST be returned.
+
+ If error 403, 404, or 409 is returned, the response MUST include the
+ bad or unsupported header fields and their values exactly as they
+ were sent from the client. Session parameters modified using
+ SET-PARAMS do not override parameters explicitly specified on
+ individual requests or requests that are IN-PROGRESS.
+
+ C->S: MRCP/2.0 ... SET-PARAMS 543256
+ Channel-Identifier:32AECB23433802@speechsynth
+ Voice-gender:female
+ Voice-variant:3
+
+ S->C: MRCP/2.0 ... 543256 200 COMPLETE
+ Channel-Identifier:32AECB23433802@speechsynth
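As an informal illustration, the precedence rules among 404, 403, and 409 reduce to a fixed ordering over the per-header outcomes. The outcome labels below are local names; the 200/201 distinction for safely ignored optional fields is omitted from the sketch.

```python
def set_params_status(outcomes: list) -> int:
    # outcomes: one entry per header field, each "ok", "illegal-value",
    # "unsupported-field", or "unsupported-value"
    if "illegal-value" in outcomes:
        return 404          # 404 takes precedence over all other errors
    if "unsupported-field" in outcomes:
        return 403          # 403 takes precedence over 409
    if "unsupported-value" in outcomes:
        return 409
    return 200
```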
+
+6.1.2. GET-PARAMS
+
+ The GET-PARAMS method, from the client to the server, asks the MRCPv2
+ resource for its current session parameters, such as voice
+ characteristics and prosody on synthesizers, recognition timers on
+ recognizers, etc. For every header field the client sends in the
+ request without a value, the server MUST include the header field and
+ its corresponding value in the response. If no parameter header
+ fields are specified by the client, then the server MUST return all
+ the settable parameters and their values in the corresponding header
+ section of the response, including vendor-specific parameters. Such
+ wildcard parameter requests can be very processing-intensive, since
+ the number of settable parameters can be large depending on the
+ implementation. Hence, it is RECOMMENDED that the client not use the
+ wildcard GET-PARAMS operation very often. Note that GET-PARAMS
+ returns header field values that apply to the whole session and not
+ values that have a request-level scope. For example, Input-Waveform-
+ URI is a request-level header field and thus would not be returned by
+ GET-PARAMS.
+
+ If all of the header fields requested are supported, the server MUST
+ return a response status-code of 200. If some of the header fields
+ being retrieved are unsupported for the resource, the server MUST
+ reject the request with a 403 Unsupported Header Field. Such a
+ response MUST include the unsupported header fields exactly as they
+ were sent from the client, without values.
+
+ C->S: MRCP/2.0 ... GET-PARAMS 543256
+ Channel-Identifier:32AECB23433802@speechsynth
+ Voice-gender:
+ Voice-variant:
+ Vendor-Specific-Parameters:com.example.param1;
+ com.example.param2
+
+ S->C: MRCP/2.0 ... 543256 200 COMPLETE
+ Channel-Identifier:32AECB23433802@speechsynth
+ Voice-gender:female
+ Voice-variant:3
+ Vendor-Specific-Parameters:com.example.param1="Company Name";
+ com.example.param2="124324234@example.com"
+
+6.2. Generic Message Headers
+
+ All MRCPv2 header fields, which include both the generic-headers
+ defined in the following subsections and the resource-specific header
+ fields defined later, follow the same generic format as that given in
+ Section 3.1 of RFC 5322 [RFC5322]. Each header field consists of a
+ name followed by a colon (":") and the value. Header field names are
+ case-insensitive. The value MAY be preceded by any amount of LWS
+ (linear white space), though a single SP (space) is preferred.
+ Header fields may extend over multiple lines by preceding each extra
+ line with at least one SP or HT (horizontal tab).
+
+ generic-field = field-name ":" [ field-value ]
+ field-name = token
+ field-value = *LWS field-content *( CRLF 1*LWS field-content)
+ field-content = <the OCTETs making up the field-value
+ and consisting of either *TEXT or combinations
+ of token, separators, and quoted-string>
+
+ The field-content does not include any leading or trailing LWS (i.e.,
+ linear white space occurring before the first non-whitespace
+ character of the field-value or after the last non-whitespace
+ character of the field-value). Such leading or trailing LWS MAY be
+ removed without changing the semantics of the field value. Any LWS
+ that occurs between field-content MAY be replaced with a single SP
+ before interpreting the field value or forwarding the message
+ downstream.
+
+ MRCPv2 servers and clients MUST NOT depend on header field order. It
+ is RECOMMENDED to send general-header fields first, followed by
+ request-header or response-header fields, and ending with the entity-
+ header fields. However, MRCPv2 servers and clients MUST be prepared
+ to process the header fields in any order. The only exception to
+ this rule is when there are multiple header fields with the same name
+ in a message.
+
+ Multiple header fields with the same name MAY be present in a message
+ if and only if the entire value for that header field is defined as a
+ comma-separated list [i.e., #(values)].
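As an informal illustration, the folding and LWS rules above can be normalized in one pass over the header section (the helper name and tuple representation are local choices):

```python
def unfold_headers(block: str) -> list:
    """Parse a header section, joining folded lines with a single SP."""
    fields = []
    for line in block.split("\r\n"):
        if not line:
            continue
        if line[0] in " \t" and fields:  # continuation of previous field
            name, value = fields[-1]
            fields[-1] = (name, f"{value} {line.strip()}")
        else:
            name, _, value = line.partition(":")
            # field names are case-insensitive; leading/trailing LWS
            # around the value is not significant
            fields.append((name.strip().lower(), value.strip()))
    return fields
```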
+
+
+ Since vendor-specific parameters may be order-dependent, it MUST be
+ possible to combine multiple header fields of the same name into one
+ "name:value" pair without changing the semantics of the message, by
+ appending each subsequent value to the first, each separated by a
+ comma. The order in which header fields with the same name are
+ received is therefore significant to the interpretation of the
+ combined header field value, and thus an intermediary MUST NOT change
+ the order of these values when a message is forwarded.
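As an informal illustration, merging repeated fields into one comma-separated value while preserving arrival order, as required above, is a straightforward fold:

```python
def combine_fields(fields: list) -> list:
    # fields: (name, value) pairs in arrival order
    merged = {}
    order = []
    for name, value in fields:
        if name in merged:
            # append in arrival order; reordering would change semantics
            merged[name] = f"{merged[name]},{value}"
        else:
            merged[name] = value
            order.append(name)
    return [(name, merged[name]) for name in order]
```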
+
+ generic-header = channel-identifier
+ / accept
+ / active-request-id-list
+ / proxy-sync-id
+ / accept-charset
+ / content-type
+ / content-id
+ / content-base
+ / content-encoding
+ / content-location
+ / content-length
+ / fetch-timeout
+ / cache-control
+ / logging-tag
+ / set-cookie
+ / vendor-specific
+
+6.2.1. Channel-Identifier
+
+ All MRCPv2 requests, responses, and events MUST contain the Channel-
+ Identifier header field. The value is allocated by the server when a
+ control channel is added to the session and communicated to the
+ client by the "a=channel" attribute in the SDP answer from the
+ server. The header field value consists of 2 parts separated by the
+ '@' symbol. The first part is an unambiguous string identifying the
+ MRCPv2 session. The second part is a string token that specifies one
+ of the media processing resource types listed in Section 3.1. The
+ unambiguous string (first part) MUST be difficult to guess, unique
+ among the resource instances managed by the server, and common to all
+ resource channels with that server established through a single SIP
+ dialog.
+
+ channel-identifier = "Channel-Identifier" ":" channel-id CRLF
+ channel-id = 1*alphanum "@" 1*alphanum
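As an informal illustration, a channel-id splits at the '@' into the session identifier and the resource type. The resource-type set below is an illustrative recollection; the authoritative list is in Section 3.1.

```python
RESOURCE_TYPES = {"speechrecog", "dtmfrecog", "speechsynth",
                  "basicsynth", "speakverify", "recorder"}  # illustrative

def parse_channel_id(value: str):
    session_id, sep, resource = value.partition("@")
    if not sep or not session_id or resource not in RESOURCE_TYPES:
        raise ValueError(f"malformed channel-id: {value!r}")
    return session_id, resource
```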
+
+
+6.2.2. Accept
+
+ The Accept header field follows the syntax defined in [H14.1]. The
+ semantics are also identical, with the exception that if no Accept
+ header field is present, the server MUST assume a default value that
+ is specific to the resource type that is being controlled. This
+ default value can be changed for a resource on a session by sending
+ this header field in a SET-PARAMS method. The current default value
+ of this header field for a resource in a session can be found through
+ a GET-PARAMS method. This header field MAY occur on any request.
+
+6.2.3. Active-Request-Id-List
+
+ In a request, this header field indicates the list of request-ids to
+ which the request applies. This is useful when there are multiple
+ requests that are PENDING or IN-PROGRESS and the client wants this
+ request to apply to one or more of these specifically.
+
+ In a response, this header field returns the list of request-ids that
+ the method modified or affected. There could be one or more requests
+ in a request-state of PENDING or IN-PROGRESS. When a method
+ affecting one or more PENDING or IN-PROGRESS requests is sent from
+ the client to the server, the response MUST contain the list of
+ request-ids that were affected or modified by this command in its
+ header section.
+
+ The Active-Request-Id-List is only used in requests and responses,
+ not in events.
+
+ For example, if a STOP request with no Active-Request-Id-List is sent
+ to a synthesizer resource that has one or more SPEAK requests in the
+ PENDING or IN-PROGRESS state, all SPEAK requests MUST be cancelled,
+ including the one IN-PROGRESS. The response to the STOP request
+ contains in the Active-Request-Id-List value the request-ids of all
+ the SPEAK requests that were terminated. After sending the STOP
+ response, the server MUST NOT send any SPEAK-COMPLETE or RECOGNITION-
+ COMPLETE events for the terminated requests.
+
+ active-request-id-list = "Active-Request-Id-List" ":"
+ request-id *("," request-id) CRLF
+
+6.2.4. Proxy-Sync-Id
+
+ When any server resource generates a "barge-in-able" event, it also
+ generates a unique tag. The tag is sent as this header field's value
+ in an event to the client. The client then acts as an intermediary
+ among the server resources and sends a BARGE-IN-OCCURRED method to
+ the synthesizer server resource with the Proxy-Sync-Id it received
+ from the server resource. When the recognizer and synthesizer
+ resources are part of the same session, they may choose to work
+ together to achieve quicker interaction and response. Here, the
+ Proxy-Sync-Id helps the resource receiving the event, intermediated
+ by the client, to decide if this event has been processed through a
+ direct interaction of the resources. This header field MAY occur
+ only on events and the BARGE-IN-OCCURRED method. The name of this
+ header field contains the word 'proxy' only for historical reasons
+ and does not imply that a proxy server is involved.
+
+ proxy-sync-id = "Proxy-Sync-Id" ":" 1*VCHAR CRLF
+
+6.2.5. Accept-Charset
+
+ See [H14.2]. This specifies the acceptable character sets for
+ entities returned in the response or events associated with this
+ request. This is useful in specifying the character set to use in
+ the Natural Language Semantic Markup Language (NLSML) results of a
+ RECOGNITION-COMPLETE event. This header field is only used on
+ requests.
+
+6.2.6. Content-Type
+
+ See [H14.17]. MRCPv2 supports a restricted set of registered media
+ types for content, including speech markup, grammar, and recognition
+ results. The content types applicable to each MRCPv2 resource-type
+ are specified in the corresponding section of the document and are
+ registered in the MIME Media Types registry maintained by IANA. The
+   multipart content type "multipart/mixed" is supported to communicate
+   several of the above-mentioned content types in a single message
+   body, in which case the body parts MUST NOT contain any
+   MRCPv2-specific header fields.  This header field MAY occur on all
+   messages.
+
+ content-type = "Content-Type" ":" media-type-value CRLF
+
+ media-type-value = type "/" subtype *( ";" parameter )
+
+ type = token
+
+ subtype = token
+
+ parameter = attribute "=" value
+
+ attribute = token
+
+ value = token / quoted-string
+
+
+6.2.7. Content-ID
+
+ This header field contains an ID or name for the content by which it
+ can be referenced. This header field operates according to the
+ specification in RFC 2392 [RFC2392] and is required for content
+ disambiguation in multipart messages. In MRCPv2, whenever the
+ associated content is stored by either the client or the server, it
+ MUST be retrievable using this ID. Such content can be referenced
+ later in a session by addressing it with the 'session' URI scheme
+ described in Section 13.6. This header field MAY occur on all
+ messages.
+
+6.2.8. Content-Base
+
+ The Content-Base entity-header MAY be used to specify the base URI
+ for resolving relative URIs within the entity.
+
+ content-base = "Content-Base" ":" absoluteURI CRLF
+
+ Note, however, that the base URI of the contents within the entity-
+ body may be redefined within that entity-body. An example of this
+ would be multipart media, which in turn can have multiple entities
+ within it. This header field MAY occur on all messages.
+
+6.2.9. Content-Encoding
+
+ The Content-Encoding entity-header is used as a modifier to the
+ Content-Type. When present, its value indicates what additional
+ content encoding has been applied to the entity-body, and thus what
+ decoding mechanisms must be applied in order to obtain the Media Type
+ referenced by the Content-Type header field. Content-Encoding is
+ primarily used to allow a document to be compressed without losing
+ the identity of its underlying media type. Note that the SIP session
+ can be used to determine accepted encodings (see Section 7). This
+ header field MAY occur on all messages.
+
+ content-encoding = "Content-Encoding" ":"
+ *WSP content-coding
+ *(*WSP "," *WSP content-coding *WSP )
+ CRLF
+
+ Content codings are defined in [H3.5]. An example of its use is
+ Content-Encoding:gzip
+
+ If multiple encodings have been applied to an entity, the content
+ encodings MUST be listed in the order in which they were applied.
+
+
+6.2.10. Content-Location
+
+ The Content-Location entity-header MAY be used to supply the resource
+ location for the entity enclosed in the message when that entity is
+ accessible from a location separate from the requested resource's
+ URI. Refer to [H14.14].
+
+ content-location = "Content-Location" ":"
+ ( absoluteURI / relativeURI ) CRLF
+
+ The Content-Location value is a statement of the location of the
+ resource corresponding to this particular entity at the time of the
+ request. This header field is provided for optimization purposes
+ only. The receiver of this header field MAY assume that the entity
+ being sent is identical to what would have been retrieved or might
+ already have been retrieved from the Content-Location URI.
+
+ For example, if the client provided a grammar markup inline, and it
+ had previously retrieved it from a certain URI, that URI can be
+ provided as part of the entity, using the Content-Location header
+ field. This allows a resource like the recognizer to look into its
+ cache to see if this grammar was previously retrieved, compiled, and
+ cached. In this case, it might optimize by using the previously
+ compiled grammar object.
+
+ If the Content-Location is a relative URI, the relative URI is
+ interpreted relative to the Content-Base URI. This header field MAY
+ occur on all messages.
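As an informal illustration, resolving a relative Content-Location against Content-Base follows ordinary URI-reference rules, available directly from the standard library (the URIs below are illustrative):

```python
from urllib.parse import urljoin

content_base = "http://www.example.com/grammars/"   # Content-Base value
content_location = "directory.grxml"                # relative Content-Location

resolved = urljoin(content_base, content_location)
assert resolved == "http://www.example.com/grammars/directory.grxml"
```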
+
+6.2.11. Content-Length
+
+ This header field contains the length of the content of the message
+ body (i.e., after the double CRLF following the last header field).
+ Unlike in HTTP, it MUST be included in all messages that carry
+ content beyond the header section. If it is missing, a default value
+ of zero is assumed. Otherwise, it is interpreted according to
+ [H14.13]. When a message having no use for a message body contains
+ one, i.e., the Content-Length is non-zero, the receiver MUST ignore
+ the content of the message body. This header field MAY occur on all
+ messages.
+
+ content-length = "Content-Length" ":" 1*19DIGIT CRLF
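As an informal illustration, the body begins after the double CRLF that ends the header section, and Content-Length (defaulting to zero when absent) bounds it:

```python
def split_message(raw: bytes):
    """Return (header section, body) per the Content-Length rules above."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    length = 0  # default when Content-Length is missing
    for line in head.split(b"\r\n")[1:]:  # skip the start-line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value)
    return head, rest[:length]
```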
+
+6.2.12. Fetch Timeout
+
+ When the recognizer or synthesizer needs to fetch documents or other
+ resources, this header field controls the corresponding URI access
+ properties. This defines the timeout for content that the server may
+ need to fetch over the network. The value is interpreted to be in
+ milliseconds and ranges from 0 to an implementation-specific maximum
+ value. It is RECOMMENDED that servers be cautious about accepting
+ long timeout values. The default value for this header field is
+ implementation specific. This header field MAY occur in DEFINE-
+ GRAMMAR, RECOGNIZE, SPEAK, SET-PARAMS, or GET-PARAMS.
+
+ fetch-timeout = "Fetch-Timeout" ":" 1*19DIGIT CRLF
+
+6.2.13. Cache-Control
+
+ If the server implements content caching, it MUST adhere to the cache
+ correctness rules of HTTP 1.1 [RFC2616] when accessing and caching
+ stored content. In particular, the "expires" and "cache-control"
+ header fields of the cached URI or document MUST be honored and take
+ precedence over the Cache-Control defaults set by this header field.
+ The Cache-Control directives are used to define the default caching
+ algorithms on the server for the session or request. The scope of
+ the directive is based on the method it is sent on. If the directive
+ is sent on a SET-PARAMS method, it applies for all requests for
+ external documents the server makes during that session, unless it is
+ overridden by a Cache-Control header field on an individual request.
+ If the directives are sent on any other requests, they apply only to
+ external document requests the server makes for that request. An
+ empty Cache-Control header field on the GET-PARAMS method is a
+ request for the server to return the current Cache-Control directives
+ setting on the server. This header field MAY occur only on requests.
+
+ cache-control = "Cache-Control" ":"
+ [*WSP cache-directive
+ *( *WSP "," *WSP cache-directive *WSP )]
+ CRLF
+
+ cache-directive = "max-age" "=" delta-seconds
+ / "max-stale" [ "=" delta-seconds ]
+ / "min-fresh" "=" delta-seconds
+
+ delta-seconds = 1*19DIGIT
+
+ Here, delta-seconds is a decimal time value specifying the number of
+ seconds since the instant the message response or data was received
+ by the server.
+
+ The different cache-directive options allow the client to ask the
+ server to override the default cache expiration mechanisms:
+
+
+ max-age Indicates that the client can tolerate the server
+ using content whose age is no greater than the
+ specified time in seconds. Unless a "max-stale"
+ directive is also included, the client is not willing
+ to accept a response based on stale data.
+
+   min-fresh             Indicates that the client is willing to accept a
+                         server response with cached data whose expiration
+                         is no less than its current age plus the specified
+                         time in seconds, i.e., data that will remain fresh
+                         for at least that long.  If the remaining time-to-
+                         live of the cached entry is less than the client-
+                         supplied min-fresh value, the server MUST NOT
+                         utilize that cached content.
+
+ max-stale Indicates that the client is willing to allow a server
+ to utilize cached data that has exceeded its
+ expiration time. If "max-stale" is assigned a value,
+ then the client is willing to allow the server to use
+ cached data that has exceeded its expiration time by
+ no more than the specified number of seconds. If no
+ value is assigned to "max-stale", then the client is
+ willing to allow the server to use stale data of any
+ age.
+
+ If the server cache is requested to use stale response/data without
+ validation, it MAY do so only if this does not conflict with any
+ "MUST"-level requirements concerning cache validation (e.g., a "must-
+ revalidate" Cache-Control directive in the HTTP 1.1 specification
+ pertaining to the corresponding URI).
+
+ If both the MRCPv2 Cache-Control directive and the cached entry on
+ the server include "max-age" directives, then the lesser of the two
+ values is used for determining the freshness of the cached entry for
+ that request.
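As an informal illustration, the interaction of max-age, min-fresh, and max-stale can be captured in a single freshness predicate. The parameter shapes are local choices; `max_stale=True` models the valueless form of the directive, and the caller is assumed to pass the lesser of the directive's and the cached entry's max-age, as required above.

```python
def may_use_cached(age, lifetime, max_age=None, min_fresh=None,
                   max_stale=None):
    """Decide whether a cached entry may be served without revalidation.

    age: seconds since the entry was received by the server;
    lifetime: the entry's freshness lifetime in seconds.
    """
    if max_age is not None and age > max_age:
        return False
    remaining = lifetime - age
    if min_fresh is not None and remaining < min_fresh:
        return False
    if remaining < 0:                # entry is stale
        if max_stale is True:        # "max-stale" with no value: any age
            return True
        return max_stale is not None and -remaining <= max_stale
    return True
```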
+
+6.2.14. Logging-Tag
+
+ This header field MAY be sent as part of a SET-PARAMS/GET-PARAMS
+ method to set or retrieve the logging tag for logs generated by the
+ server. Once set, the value persists until a new value is set or the
+ session ends. The MRCPv2 server MAY provide a mechanism to create
+ subsets of its output logs so that system administrators can examine
+ or extract only the log file portion during which the logging tag was
+ set to a certain value.
+
+ It is RECOMMENDED that clients include in the logging tag information
+ to identify the MRCPv2 client User Agent, so that one can determine
+ which MRCPv2 client request generated a given log message at the
+ server. It is also RECOMMENDED that MRCPv2 clients not log
+ personally identifiable information such as credit card numbers and
+ national identification numbers.
+
+ logging-tag = "Logging-Tag" ":" 1*UTFCHAR CRLF
+
+6.2.15. Set-Cookie
+
+ Since the associated HTTP client on an MRCPv2 server fetches
+ documents for processing on behalf of the MRCPv2 client, the cookie
+ store in the HTTP client of the MRCPv2 server is treated as an
+ extension of the cookie store in the HTTP client of the MRCPv2
+ client. This requires that the MRCPv2 client and server be able to
+ synchronize their common cookie store as needed. To enable the
+ MRCPv2 client to push its stored cookies to the MRCPv2 server and get
+ new cookies from the MRCPv2 server stored back to the MRCPv2 client,
+ the Set-Cookie entity-header field MAY be included in MRCPv2 requests
+ to update the cookie store on a server and be returned in final
+ MRCPv2 responses or events to subsequently update the client's own
+ cookie store. The stored cookies on the server persist for the
+ duration of the MRCPv2 session and MUST be destroyed at the end of
+ the session. To ensure support for cookies, MRCPv2 clients and
+ servers MUST support the Set-Cookie entity-header field.
+
+ Note that it is the MRCPv2 client that determines which, if any,
+ cookies are sent to the server. There is no requirement that all
+ cookies be shared. Rather, it is RECOMMENDED that MRCPv2 clients
+ communicate only cookies needed by the MRCPv2 server to process its
+ requests.
+
+   set-cookie        = "Set-Cookie:" SP set-cookie-string CRLF
+ set-cookie-string = cookie-pair *( ";" SP cookie-av )
+ cookie-pair = cookie-name "=" cookie-value
+ cookie-name = token
+ cookie-value = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )
+ cookie-octet = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
+ token = <token, defined in [RFC2616], Section 2.2>
+ cookie-av = expires-av / max-age-av / domain-av /
+ path-av / secure-av / httponly-av /
+ extension-av / age-av
+ expires-av = "Expires=" sane-cookie-date
+ sane-cookie-date = <rfc1123-date, defined in [RFC2616], Section 3.3.1>
+ max-age-av = "Max-Age=" non-zero-digit *DIGIT
+ non-zero-digit = %x31-39
+ domain-av = "Domain=" domain-value
+ domain-value = <subdomain>
+ path-av = "Path=" path-value
+ path-value = <any CHAR except CTLs or ";">
+ secure-av = "Secure"
+ httponly-av = "HttpOnly"
+ extension-av = <any CHAR except CTLs or ";">
+ age-av = "Age=" delta-seconds
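As a rough, non-normative illustration of the grammar above, a client might split a received set-cookie-string like this (quoting rules and the Expires date syntax are simplified for brevity; the function name is ours):

```python
# Split a set-cookie-string (cookie-pair *( ";" SP cookie-av )) into
# the cookie name, its value, and a map of attribute-value pairs.
# Value-less attributes such as Secure and HttpOnly become True.

def parse_set_cookie_string(s):
    parts = s.split("; ")
    name, _, value = parts[0].partition("=")
    attrs = {}
    for av in parts[1:]:
        k, sep, v = av.partition("=")
        attrs[k] = v if sep else True   # flag attributes like "Secure"
    return name, value.strip('"'), attrs
```

For instance, `parse_set_cookie_string('SID=31d4d96e; Path=/; Secure; Age=100')` yields the pair ("SID", "31d4d96e") plus the attribute map.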
+
+ The Set-Cookie header field is specified in RFC 6265 [RFC6265]. The
+ "Age" attribute is introduced in this specification to indicate the
+ age of the cookie and is OPTIONAL. An MRCPv2 client or server MUST
+ calculate the age of the cookie according to the age calculation
+ rules in the HTTP/1.1 specification [RFC2616] and append the "Age"
+ attribute accordingly. This attribute is provided because time may
+ have passed since the client received the cookie from an HTTP server.
+ Rather than having the client reduce Max-Age by the actual age, it
+ passes Max-Age verbatim and appends the "Age" attribute, thus
+ maintaining the cookie as received while still accounting for the
+ fact that time has passed.
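A sketch of that bookkeeping (illustrative only, not from the RFC; clock values are assumed to be seconds since an arbitrary epoch, and the "; " separator follows the set-cookie-string grammar above):

```python
# Rather than shrinking Max-Age by the elapsed time, the sender passes
# the cookie verbatim and appends an "Age" attribute recording how long
# it has held the cookie.  The receiver can then compute what remains.

def add_age_attribute(set_cookie_value, received_at, now):
    """Append '; Age=<delta-seconds>' reflecting time since receipt."""
    delta = max(0, int(now - received_at))
    return "%s; Age=%d" % (set_cookie_value, delta)

def remaining_lifetime(max_age, age):
    """Seconds of validity left, given the original Max-Age and Age."""
    return max(0, max_age - age)
```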
+
+ The MRCPv2 client or server MUST supply defaults for the "Domain" and
+ "Path" attributes, as specified in RFC 6265, if they are omitted by
+ the HTTP origin server. Note that there is no leading dot present in
+ the "Domain" attribute value in this case. Although an explicitly
+ specified "Domain" value received via the HTTP protocol may be
+ modified to include a leading dot, an MRCPv2 client or server MUST
+ NOT modify the "Domain" value when received via the MRCPv2 protocol.
+
+ An MRCPv2 client or server MAY combine multiple cookie header fields
+ of the same type into a single "field-name:field-value" pair as
+ described in Section 6.2.
+
+ The Set-Cookie header field MAY be specified in any request that
+ subsequently results in the server performing an HTTP access. When a
+ server receives new cookie information from an HTTP origin server,
+ and assuming the cookie store is modified according to RFC 6265, the
+ server MUST return the new cookie information in the MRCPv2 COMPLETE
+ response or event, as appropriate, to allow the client to update its
+ own cookie store.
+
+ The SET-PARAMS request MAY specify the Set-Cookie header field to
+ update the cookie store on a server. The GET-PARAMS request MAY be
+ used to return the entire cookie store of "Set-Cookie" type cookies
+ to the client.
+
+6.2.16. Vendor-Specific Parameters
+
+ This set of header fields allows for the client to set or retrieve
+ vendor-specific parameters.
+
+ vendor-specific = "Vendor-Specific-Parameters" ":"
+ [vendor-specific-av-pair
+ *(";" vendor-specific-av-pair)] CRLF
+
+ vendor-specific-av-pair = vendor-av-pair-name "="
+ value
+
+ vendor-av-pair-name = 1*UTFCHAR
+
+ Header fields of this form MAY be sent in any method (request) and
+ are used to manage implementation-specific parameters on the server
+ side. The vendor-av-pair-name follows the reverse Internet Domain
+ Name convention (see Section 13.1.6 for syntax and registration
+ information). The value of the vendor attribute is specified after
+ the "=" symbol and MAY be quoted. For example:
+
+ com.example.companyA.paramxyz=256
+ com.example.companyA.paramabc=High
+ com.example.companyB.paramxyz=Low
+
+ When used in GET-PARAMS to get the current value of these parameters
+ from the server, this header field value MAY contain a semicolon-
+ separated list of implementation-specific attribute names.
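A non-normative helper pair for this header field (the function names are ours, and value quoting is ignored for brevity; the reverse-domain parameter names are the invented examples from above):

```python
# Build and split the Vendor-Specific-Parameters value: a semicolon-
# separated list of "name=value" pairs with reverse-domain names.

def format_vendor_params(params):
    return "Vendor-Specific-Parameters:" + ";".join(
        "%s=%s" % (k, v) for k, v in params.items())

def parse_vendor_params(value):
    out = {}
    for pair in value.split(";"):
        name, _, val = pair.partition("=")
        out[name.strip()] = val.strip()
    return out
```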
+
+6.3. Generic Result Structure
+
+ Result data from the server for the Recognizer and Verifier resources
+ is carried as a typed media entity in the MRCPv2 message body of
+ various events. The Natural Language Semantics Markup Language
+ (NLSML), an XML markup based on an early draft from the W3C, is the
+ default standard for returning results back to the client. Hence,
+ all servers implementing these resource types MUST support the media
+ type 'application/nlsml+xml'. The Extensible MultiModal Annotation
+ (EMMA) [W3C.REC-emma-20090210] format can be used to return results
+ as well. This can be done by negotiating the format at session
+ establishment time with SDP (a=resultformat:application/emma+xml) or
+ with SIP (Allow/Accept). With SIP, for example, if a client wants
+ results in EMMA, an MRCPv2 server can route the request to another
+ server that supports EMMA by inspecting the SIP header fields, rather
+ than having to inspect the SDP.
+
+ MRCPv2 uses this representation to convey content among the clients
+ and servers that generate and make use of the markup. MRCPv2 uses
+   NLSML specifically to convey recognition, enrollment, and
+ verification results between the corresponding resource on the MRCPv2
+ server and the MRCPv2 client. Details of this result format are
+ fully described in Section 6.3.1.
+
+ Content-Type:application/nlsml+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
+ xmlns:ex="http://www.example.com/example"
+ grammar="http://theYesNoGrammar">
+ <interpretation>
+ <instance>
+ <ex:response>yes</ex:response>
+ </instance>
+ <input>OK</input>
+ </interpretation>
+ </result>
+
+ Result Example
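For illustration, the example result above can be consumed with a stock XML parser; the namespace URIs are exactly the ones declared in the example (the variable names are ours):

```python
# Extract the grammar URI, the semantic interpretation, and the raw
# input text from the NLSML result shown above.
import xml.etree.ElementTree as ET

NLSML = "urn:ietf:params:xml:ns:mrcpv2"
EX = "http://www.example.com/example"

doc = """<?xml version="1.0"?>
<result xmlns="urn:ietf:params:xml:ns:mrcpv2"
        xmlns:ex="http://www.example.com/example"
        grammar="http://theYesNoGrammar">
  <interpretation>
    <instance><ex:response>yes</ex:response></instance>
    <input>OK</input>
  </interpretation>
</result>"""

root = ET.fromstring(doc)
grammar = root.get("grammar")                          # attribute on <result>
response = root.find(".//{%s}response" % EX).text      # element in the ex: namespace
heard = root.find(".//{%s}input" % NLSML).text         # element in the default namespace
```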
+
+6.3.1. Natural Language Semantics Markup Language
+
+ The Natural Language Semantics Markup Language (NLSML) is an XML data
+ structure with elements and attributes designed to carry result
+ information from recognizer (including enrollment) and verifier
+ resources. The normative definition of NLSML is the RelaxNG schema
+ in Section 16.1. Note that the elements and attributes of this
+ format are defined in the MRCPv2 namespace. In the result structure,
+ they must either be prefixed by a namespace prefix declared within
+ the result or must be children of an element identified as belonging
+ to the respective namespace. For details on how to use XML
+ Namespaces, see [W3C.REC-xml-names11-20040204]. Section 2 of
+ [W3C.REC-xml-names11-20040204] provides details on how to declare
+ namespaces and namespace prefixes.
+
+ The root element of NLSML is <result>. Optional child elements are
+ <interpretation>, <enrollment-result>, and <verification-result>, at
+ least one of which must be present. A single <result> MAY contain
+ any or all of the optional child elements. Details of the <result>
+ and <interpretation> elements and their subelements and attributes
+ can be found in Section 9.6. Details of the <enrollment-result>
+ element and its subelements can be found in Section 9.7. Details of
+ the <verification-result> element and its subelements can be found in
+ Section 11.5.2.
+
+7. Resource Discovery
+
+ Server resources may be discovered and their capabilities learned by
+ clients through standard SIP machinery. The client MAY issue a SIP
+ OPTIONS transaction to a server, which has the effect of requesting
+ the capabilities of the server. The server MUST respond to such a
+ request with an SDP-encoded description of its capabilities according
+ to RFC 3264 [RFC3264]. The MRCPv2 capabilities are described by a
+ single "m=" line containing the media type "application" and
+ transport type "TCP/TLS/MRCPv2" or "TCP/MRCPv2". There MUST be one
+ "resource" attribute for each media resource that the server
+ supports, and it has the resource type identifier as its value.
+
+ The SDP description MUST also contain "m=" lines describing the audio
+ capabilities and the coders the server supports.
+
+ In this example, the client uses the SIP OPTIONS method to query the
+ capabilities of the MRCPv2 server.
+
+ C->S:
+ OPTIONS sip:mrcp@server.example.com SIP/2.0
+ Via:SIP/2.0/TCP client.atlanta.example.com:5060;
+ branch=z9hG4bK74bf7
+ Max-Forwards:6
+ To:<sip:mrcp@example.com>
+ From:Sarvi <sip:sarvi@example.com>;tag=1928301774
+ Call-ID:a84b4c76e66710
+ CSeq:63104 OPTIONS
+ Contact:<sip:sarvi@client.example.com>
+ Accept:application/sdp
+ Content-Length:0
+
+
+ S->C:
+ SIP/2.0 200 OK
+ Via:SIP/2.0/TCP client.atlanta.example.com:5060;
+ branch=z9hG4bK74bf7;received=192.0.32.10
+ To:<sip:mrcp@example.com>;tag=62784
+ From:Sarvi <sip:sarvi@example.com>;tag=1928301774
+ Call-ID:a84b4c76e66710
+ CSeq:63104 OPTIONS
+ Contact:<sip:mrcp@server.example.com>
+ Allow:INVITE, ACK, CANCEL, OPTIONS, BYE
+ Accept:application/sdp
+ Accept-Encoding:gzip
+ Accept-Language:en
+ Supported:foo
+ Content-Type:application/sdp
+ Content-Length:...
+
+ v=0
+ o=sarvi 2890844536 2890842811 IN IP4 192.0.2.12
+ s=-
+ i=MRCPv2 server capabilities
+ c=IN IP4 192.0.2.12/127
+ t=0 0
+ m=application 0 TCP/TLS/MRCPv2 1
+ a=resource:speechsynth
+ a=resource:speechrecog
+ a=resource:speakverify
+ m=audio 0 RTP/AVP 0 3
+ a=rtpmap:0 PCMU/8000
+ a=rtpmap:3 GSM/8000
+
+ Using SIP OPTIONS for MRCPv2 Server Capability Discovery
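A non-normative sketch of how a client might read the capability answer above, collecting the advertised resource types from "a=resource:" lines and the codecs from "a=rtpmap:" lines (the function name is ours):

```python
# Scan an SDP capability description for MRCPv2 resources and codecs.
sdp = """v=0
o=sarvi 2890844536 2890842811 IN IP4 192.0.2.12
s=-
i=MRCPv2 server capabilities
c=IN IP4 192.0.2.12/127
t=0 0
m=application 0 TCP/TLS/MRCPv2 1
a=resource:speechsynth
a=resource:speechrecog
a=resource:speakverify
m=audio 0 RTP/AVP 0 3
a=rtpmap:0 PCMU/8000
a=rtpmap:3 GSM/8000"""

def parse_capabilities(sdp_text):
    resources, codecs = [], []
    for line in sdp_text.splitlines():
        if line.startswith("a=resource:"):
            resources.append(line[len("a=resource:"):])
        elif line.startswith("a=rtpmap:"):
            codecs.append(line.split(" ", 1)[1])   # keep "NAME/clock-rate"
    return resources, codecs

resources, codecs = parse_capabilities(sdp)
```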
+
+8. Speech Synthesizer Resource
+
+ This resource processes text markup provided by the client and
+ generates a stream of synthesized speech in real time. Depending
+ upon the server implementation and capability of this resource, the
+ client can also dictate parameters of the synthesized speech such as
+ voice characteristics, speaker speed, etc.
+
+ The synthesizer resource is controlled by MRCPv2 requests from the
+ client. Similarly, the resource can respond to these requests or
+ generate asynchronous events to the client to indicate conditions of
+ interest to the client during the generation of the synthesized
+ speech stream.
+
+ This section applies for the following resource types:
+
+ o speechsynth
+
+ o basicsynth
+
+ The capabilities of these resources are defined in Section 3.1.
+
+8.1. Synthesizer State Machine
+
+ The synthesizer maintains a state machine to process MRCPv2 requests
+ from the client. The state transitions shown below describe the
+ states of the synthesizer and reflect the state of the request at the
+ head of the synthesizer resource queue. A SPEAK request in the
+ PENDING state can be deleted or stopped by a STOP request without
+ affecting the state of the resource.
+
+ Idle Speaking Paused
+ State State State
+ | | |
+ |----------SPEAK-------->| |--------|
+ |<------STOP-------------| CONTROL |
+ |<----SPEAK-COMPLETE-----| |------->|
+ |<----BARGE-IN-OCCURRED--| |
+ | |---------| |
+ | CONTROL |-----------PAUSE--------->|
+ | |-------->|<----------RESUME---------|
+ | | |----------|
+ |----------| | PAUSE |
+ | BARGE-IN-OCCURRED | |--------->|
+ |<---------| |----------| |
+ | | SPEECH-MARKER |
+ | |<---------| |
+ |----------| |----------| |
+ | STOP | RESUME |
+ | | |<---------| |
+ |<---------| | |
+ |<---------------------STOP-------------------------|
+ |----------| | |
+ | DEFINE-LEXICON | |
+ | | | |
+ |<---------| | |
+ |<---------------BARGE-IN-OCCURRED------------------|
+
+ Synthesizer State Machine
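The diagram can be restated as a transition table (this is our non-normative reading of the figure; (state, event) pairs not listed leave the state unchanged, e.g. a STOP on an Idle resource):

```python
# Synthesizer resource states keyed by (state, event), for the request
# at the head of the queue, as drawn in the state diagram above.
TRANSITIONS = {
    ("idle", "SPEAK"): "speaking",
    ("idle", "DEFINE-LEXICON"): "idle",
    ("idle", "BARGE-IN-OCCURRED"): "idle",
    ("speaking", "STOP"): "idle",
    ("speaking", "SPEAK-COMPLETE"): "idle",
    ("speaking", "BARGE-IN-OCCURRED"): "idle",
    ("speaking", "CONTROL"): "speaking",
    ("speaking", "SPEECH-MARKER"): "speaking",
    ("speaking", "PAUSE"): "paused",
    ("paused", "RESUME"): "speaking",
    ("paused", "PAUSE"): "paused",
    ("paused", "CONTROL"): "paused",
    ("paused", "STOP"): "idle",
    ("paused", "BARGE-IN-OCCURRED"): "idle",
}

def next_state(state, event):
    """Follow the table; unknown pairs keep the current state."""
    return TRANSITIONS.get((state, event), state)
```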
+
+8.2. Synthesizer Methods
+
+ The synthesizer supports the following methods.
+
+ synthesizer-method = "SPEAK"
+ / "STOP"
+ / "PAUSE"
+ / "RESUME"
+ / "BARGE-IN-OCCURRED"
+ / "CONTROL"
+ / "DEFINE-LEXICON"
+
+8.3. Synthesizer Events
+
+ The synthesizer can generate the following events.
+
+ synthesizer-event = "SPEECH-MARKER"
+ / "SPEAK-COMPLETE"
+
+8.4. Synthesizer Header Fields
+
+ A synthesizer method can contain header fields containing request
+ options and information to augment the Request, Response, or Event it
+ is associated with.
+
+ synthesizer-header = jump-size
+ / kill-on-barge-in
+ / speaker-profile
+ / completion-cause
+ / completion-reason
+ / voice-parameter
+ / prosody-parameter
+ / speech-marker
+ / speech-language
+ / fetch-hint
+ / audio-fetch-hint
+ / failed-uri
+ / failed-uri-cause
+ / speak-restart
+ / speak-length
+ / load-lexicon
+ / lexicon-search-order
+
+8.4.1. Jump-Size
+
+ This header field MAY be specified in a CONTROL method and controls
+ the amount to jump forward or backward in an active SPEAK request. A
+   '+' or '-' indicates a value relative to the current playing
+   position. This header field MAY also be specified in a SPEAK request
+ as a desired offset into the synthesized speech. In this case, the
+ synthesizer MUST begin speaking from this amount of time into the
+ speech markup. Note that an offset that extends beyond the end of
+ the produced speech will result in audio of length zero. The
+ different speech length units supported are dependent on the
+ synthesizer implementation. If the synthesizer resource does not
+ support a unit for the operation, the resource MUST respond with a
+ status-code of 409 "Unsupported Header Field Value".
+
+ jump-size = "Jump-Size" ":" speech-length-value CRLF
+
+ speech-length-value = numeric-speech-length
+ / text-speech-length
+
+ text-speech-length = 1*UTFCHAR SP "Tag"
+
+ numeric-speech-length = ("+" / "-") positive-speech-length
+
+ positive-speech-length = 1*19DIGIT SP numeric-speech-unit
+
+ numeric-speech-unit = "Second"
+ / "Word"
+ / "Sentence"
+ / "Paragraph"
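An illustrative parser for speech-length-value (the function name is ours; it accepts both the signed numeric form required by Jump-Size and the unsigned form used by Speak-Length, plus the "Tag" form):

```python
# Turn a speech-length-value such as "+10 Second", "-2 Word", or
# "chapter1 Tag" into a (unit, amount-or-tag) pair.

def parse_speech_length(value):
    if value.endswith(" Tag"):
        return ("tag", value[:-len(" Tag")])
    sign, body = 1, value
    if body[0] in "+-":                      # numeric-speech-length sign
        sign = -1 if body[0] == "-" else 1
        body = body[1:]
    amount, unit = body.split(" ")
    assert unit in ("Second", "Word", "Sentence", "Paragraph")
    return (unit.lower(), sign * int(amount))
```

A resource that does not recognize the unit would instead answer 409 "Unsupported Header Field Value", as required above.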
+
+8.4.2. Kill-On-Barge-In
+
+ This header field MAY be sent as part of the SPEAK method to enable
+ "kill-on-barge-in" support. If enabled, the SPEAK method is
+ interrupted by DTMF input detected by a signal detector resource or
+ by the start of speech sensed or recognized by the speech recognizer
+ resource.
+
+ kill-on-barge-in = "Kill-On-Barge-In" ":" BOOLEAN CRLF
+
+ The client MUST send a BARGE-IN-OCCURRED method to the synthesizer
+ resource when it receives a barge-in-able event from any source.
+ This source could be a synthesizer resource or signal detector
+ resource and MAY be either local or distributed. If this header
+ field is not specified in a SPEAK request or explicitly set by a
+ SET-PARAMS, the default value for this header field is "true".
+
+ If the recognizer or signal detector resource is on the same server
+ as the synthesizer and both are part of the same session, the server
+ MAY work with both to provide internal notification to the
+ synthesizer so that audio may be stopped without having to wait for
+ the client's BARGE-IN-OCCURRED event.
+
+ It is generally RECOMMENDED when playing a prompt to the user with
+ Kill-On-Barge-In and asking for input, that the client issue the
+ RECOGNIZE request ahead of the SPEAK request for optimum performance
+ and user experience. This way, it is guaranteed that the recognizer
+ is online before the prompt starts playing and the user's speech will
+ not be truncated at the beginning (especially for power users).
+
+8.4.3. Speaker-Profile
+
+ This header field MAY be part of the SET-PARAMS/GET-PARAMS or SPEAK
+ request from the client to the server and specifies a URI that
+ references the profile of the speaker. Speaker profiles are
+ collections of voice parameters like gender, accent, etc.
+
+ speaker-profile = "Speaker-Profile" ":" uri CRLF
+
+8.4.4. Completion-Cause
+
+ This header field MUST be specified in a SPEAK-COMPLETE event coming
+ from the synthesizer resource to the client. This indicates the
+ reason the SPEAK request completed.
+
+ completion-cause = "Completion-Cause" ":" 3DIGIT SP
+ 1*VCHAR CRLF
+
+ +------------+-----------------------+------------------------------+
+ | Cause-Code | Cause-Name | Description |
+ +------------+-----------------------+------------------------------+
+ | 000 | normal | SPEAK completed normally. |
+ | 001 | barge-in | SPEAK request was terminated |
+ | | | because of barge-in. |
+ | 002 | parse-failure | SPEAK request terminated |
+ | | | because of a failure to |
+ | | | parse the speech markup |
+ | | | text. |
+ | 003 | uri-failure | SPEAK request terminated |
+ | | | because access to one of the |
+ | | | URIs failed. |
+ | 004 | error | SPEAK request terminated |
+ | | | prematurely due to |
+ | | | synthesizer error. |
+ | 005 | language-unsupported | Language not supported. |
+ | 006 | lexicon-load-failure | Lexicon loading failed. |
+ | 007 | cancelled | A prior SPEAK request failed |
+ | | | while this one was still in |
+ | | | the queue. |
+ +------------+-----------------------+------------------------------+
+
+ Synthesizer Resource Completion Cause Codes
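The table above, restated as a lookup for building or logging the header field (a non-normative convenience; the helper name is ours):

```python
# Synthesizer Completion-Cause codes from the table above.
COMPLETION_CAUSES = {
    0: "normal",
    1: "barge-in",
    2: "parse-failure",
    3: "uri-failure",
    4: "error",
    5: "language-unsupported",
    6: "lexicon-load-failure",
    7: "cancelled",
}

def format_completion_cause(code):
    """Render the header value, e.g. code 1 -> '001 barge-in'."""
    return "Completion-Cause:%03d %s" % (code, COMPLETION_CAUSES[code])
```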
+
+8.4.5. Completion-Reason
+
+ This header field MAY be specified in a SPEAK-COMPLETE event coming
+ from the synthesizer resource to the client. This contains the
+ reason text behind the SPEAK request completion. This header field
+ communicates text describing the reason for the failure, such as an
+ error in parsing the speech markup text.
+
+ completion-reason = "Completion-Reason" ":"
+ quoted-string CRLF
+
+ The completion reason text is provided for client use in logs and for
+ debugging and instrumentation purposes. Clients MUST NOT interpret
+ the completion reason text.
+
+8.4.6. Voice-Parameter
+
+ This set of header fields defines the voice of the speaker.
+
+ voice-parameter = voice-gender
+ / voice-age
+ / voice-variant
+ / voice-name
+
+ voice-gender = "Voice-Gender:" voice-gender-value CRLF
+ voice-gender-value = "male"
+ / "female"
+ / "neutral"
+ voice-age = "Voice-Age:" 1*3DIGIT CRLF
+ voice-variant = "Voice-Variant:" 1*19DIGIT CRLF
+ voice-name = "Voice-Name:"
+ 1*UTFCHAR *(1*WSP 1*UTFCHAR) CRLF
+
+ The "Voice-" parameters are derived from the similarly named
+ attributes of the voice element specified in W3C's Speech Synthesis
+ Markup Language Specification (SSML)
+ [W3C.REC-speech-synthesis-20040907]. Legal values for these
+ parameters are as defined in that specification.
+
+ These header fields MAY be sent in SET-PARAMS or GET-PARAMS requests
+ to define or get default values for the entire session or MAY be sent
+ in the SPEAK request to define default values for that SPEAK request.
+ Note that SSML content can itself set these values internal to the
+ SSML document, of course.
+
+ Voice parameter header fields MAY also be sent in a CONTROL method to
+ affect a SPEAK request in progress and change its behavior on the
+ fly. If the synthesizer resource does not support this operation, it
+ MUST reject the request with a status-code of 403 "Unsupported Header
+ Field".
+
+8.4.7. Prosody-Parameters
+
+ This set of header fields defines the prosody of the speech.
+
+ prosody-parameter = "Prosody-" prosody-param-name ":"
+ prosody-param-value CRLF
+
+ prosody-param-name = 1*VCHAR
+
+ prosody-param-value = 1*VCHAR
+
+ prosody-param-name is any one of the attribute names under the
+ prosody element specified in W3C's Speech Synthesis Markup Language
+ Specification [W3C.REC-speech-synthesis-20040907]. The prosody-
+ param-value is any one of the value choices of the corresponding
+ prosody element attribute from that specification.
+
+ These header fields MAY be sent in SET-PARAMS or GET-PARAMS requests
+ to define or get default values for the entire session or MAY be sent
+ in the SPEAK request to define default values for that SPEAK request.
+ Furthermore, these attributes can be part of the speech text marked
+ up in SSML.
+
+ The prosody parameter header fields in the SET-PARAMS or SPEAK
+ request only apply if the speech data is of type 'text/plain' and
+ does not use a speech markup format.
+
+ These prosody parameter header fields MAY also be sent in a CONTROL
+ method to affect a SPEAK request in progress and change its behavior
+ on the fly. If the synthesizer resource does not support this
+ operation, it MUST respond back to the client with a status-code of
+ 403 "Unsupported Header Field".
+
+8.4.8. Speech-Marker
+
+ This header field contains timestamp information in a "timestamp"
+ field. This is a Network Time Protocol (NTP) [RFC5905] timestamp, a
+   64-bit number in decimal form. It MUST be synced with the Real-time
+   Transport Protocol (RTP) [RFC3550] timestamp of the media stream
+   through the Real-Time Control Protocol (RTCP) [RFC3550].
+
+ Markers are bookmarks that are defined within the markup. Most
+ speech markup formats provide mechanisms to embed marker fields
+ within speech texts. The synthesizer generates SPEECH-MARKER events
+ when it reaches these marker fields. This header field MUST be part
+ of the SPEECH-MARKER event and contain the marker tag value after the
+ timestamp, separated by a semicolon. In these events, the timestamp
+ marks the time the text corresponding to the marker was emitted as
+ speech by the synthesizer.
+
+ This header field MUST also be returned in responses to STOP,
+ CONTROL, and BARGE-IN-OCCURRED methods, in the SPEAK-COMPLETE event,
+ and in an IN-PROGRESS SPEAK response. In these messages, if any
+ markers have been encountered for the current SPEAK, the marker tag
+ value MUST be the last embedded marker encountered. If no markers
+ have yet been encountered for the current SPEAK, only the timestamp
+ is REQUIRED. Note that in these events, the purpose of this header
+ field is to provide timestamp information associated with important
+ events within the lifecycle of a request (start of SPEAK processing,
+ end of SPEAK processing, receipt of CONTROL/STOP/BARGE-IN-OCCURRED).
+
+ timestamp = "timestamp" "=" time-stamp-value
+
+ time-stamp-value = 1*20DIGIT
+
+ speech-marker = "Speech-Marker" ":"
+ timestamp
+ [";" 1*(UTFCHAR / %x20)] CRLF
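A sketch of splitting a Speech-Marker value into its NTP timestamp and optional marker tag, per the grammar above (the function name is ours):

```python
# "timestamp=<1*20DIGIT>" optionally followed by ";" and the tag text.

def parse_speech_marker(value):
    ts, sep, tag = value.partition(";")
    assert ts.startswith("timestamp=")
    return int(ts[len("timestamp="):]), (tag if sep else None)
```

When no marker has yet been encountered for the current SPEAK, only the timestamp is present and the tag comes back as None.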
+
+8.4.9. Speech-Language
+
+ This header field specifies the default language of the speech data
+ if the language is not specified in the markup. The value of this
+ header field MUST follow RFC 5646 [RFC5646] for its values. The
+ header field MAY occur in SPEAK, SET-PARAMS, or GET-PARAMS requests.
+
+ speech-language = "Speech-Language" ":" 1*VCHAR CRLF
+
+8.4.10. Fetch-Hint
+
+ When the synthesizer needs to fetch documents or other resources like
+ speech markup or audio files, this header field controls the
+ corresponding URI access properties. This provides client policy on
+ when the synthesizer should retrieve content from the server. A
+ value of "prefetch" indicates the content MAY be downloaded when the
+ request is received, whereas "safe" indicates that content MUST NOT
+
+ be downloaded until actually referenced. The default value is
+ "prefetch". This header field MAY occur in SPEAK, SET-PARAMS, or
+ GET-PARAMS requests.
+
+ fetch-hint = "Fetch-Hint" ":" ("prefetch" / "safe") CRLF
+
+8.4.11. Audio-Fetch-Hint
+
+ When the synthesizer needs to fetch documents or other resources like
+ speech audio files, this header field controls the corresponding URI
+   access properties. This provides client policy on whether the
+ synthesizer is permitted to attempt to optimize speech by pre-
+ fetching audio. The value is either "safe" to say that audio is only
+ fetched when it is referenced, never before; "prefetch" to permit,
+ but not require the implementation to pre-fetch the audio; or
+ "stream" to allow it to stream the audio fetches. The default value
+ is "prefetch". This header field MAY occur in SPEAK, SET-PARAMS, or
+ GET-PARAMS requests.
+
+ audio-fetch-hint = "Audio-Fetch-Hint" ":"
+ ("prefetch" / "safe" / "stream") CRLF
+
+8.4.12. Failed-URI
+
+ When a synthesizer method needs a synthesizer to fetch or access a
+ URI and the access fails, the server SHOULD provide the failed URI in
+ this header field in the method response, unless there are multiple
+ URI failures, in which case the server MUST provide one of the failed
+ URIs in this header field in the method response.
+
+ failed-uri = "Failed-URI" ":" absoluteURI CRLF
+
+8.4.13. Failed-URI-Cause
+
+ When a synthesizer method needs a synthesizer to fetch or access a
+ URI and the access fails, the server MUST provide the URI-specific or
+ protocol-specific response code for the URI in the Failed-URI header
+ field in the method response through this header field. The value
+ encoding is UTF-8 (RFC 3629 [RFC3629]) to accommodate any access
+ protocol -- some access protocols might have a response string
+ instead of a numeric response code.
+
+ failed-uri-cause = "Failed-URI-Cause" ":" 1*UTFCHAR CRLF
+
+8.4.14. Speak-Restart
+
+ When a client issues a CONTROL request to a currently speaking
+ synthesizer resource to jump backward, and the target jump point is
+ before the start of the current SPEAK request, the current SPEAK
+ request MUST restart from the beginning of its speech data and the
+ server's response to the CONTROL request MUST contain this header
+ field with a value of "true" indicating a restart.
+
+ speak-restart = "Speak-Restart" ":" BOOLEAN CRLF
+
+8.4.15. Speak-Length
+
+ This header field MAY be specified in a CONTROL method to control the
+ maximum length of speech to speak, relative to the current speaking
+ point in the currently active SPEAK request. If numeric, the value
+ MUST be a positive integer. If a header field with a Tag unit is
+ specified, then the speech output continues until the tag is reached
+ or the SPEAK request is completed, whichever comes first. This
+ header field MAY be specified in a SPEAK request to indicate the
+ length to speak from the speech data and is relative to the point in
+ speech that the SPEAK request starts. The different speech length
+ units supported are synthesizer implementation dependent. If a
+ server does not support the specified unit, the server MUST respond
+ with a status-code of 409 "Unsupported Header Field Value".
+
+ speak-length = "Speak-Length" ":" positive-length-value
+ CRLF
+
+ positive-length-value = positive-speech-length
+ / text-speech-length
+
+ text-speech-length = 1*UTFCHAR SP "Tag"
+
+ positive-speech-length = 1*19DIGIT SP numeric-speech-unit
+
+ numeric-speech-unit = "Second"
+ / "Word"
+ / "Sentence"
+ / "Paragraph"
+
+8.4.16. Load-Lexicon
+
+ This header field is used to indicate whether a lexicon has to be
+ loaded or unloaded. The value "true" means to load the lexicon if
+ not already loaded, and the value "false" means to unload the lexicon
+ if it is loaded. The default value for this header field is "true".
+ This header field MAY be specified in a DEFINE-LEXICON method.
+
+ load-lexicon = "Load-Lexicon" ":" BOOLEAN CRLF
+
+8.4.17. Lexicon-Search-Order
+
+ This header field is used to specify a list of active pronunciation
+ lexicon URIs and the search order among the active lexicons.
+ Lexicons specified within the SSML document take precedence over the
+ lexicons specified in this header field. This header field MAY be
+ specified in the SPEAK, SET-PARAMS, and GET-PARAMS methods.
+
+ lexicon-search-order = "Lexicon-Search-Order" ":"
+ "<" absoluteURI ">" *(" " "<" absoluteURI ">") CRLF
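Extracting the angle-bracketed lexicon URIs from such a value, in search order, can be sketched as follows (an illustrative helper, not from the RFC):

```python
# Pull each "<absoluteURI>" out of a Lexicon-Search-Order value.
import re

def parse_lexicon_search_order(value):
    return re.findall(r"<([^>]*)>", value)
```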
+
+8.5. Synthesizer Message Body
+
+ A synthesizer message can contain additional information associated
+ with the Request, Response, or Event in its message body.
+
+8.5.1. Synthesizer Speech Data
+
+ Marked-up text for the synthesizer to speak is specified as a typed
+ media entity in the message body. The speech data to be spoken by
+ the synthesizer can be specified inline by embedding the data in the
+ message body or by reference by providing a URI for accessing the
+   data. In either case, the data and the format used to mark up the
+   speech need to be of a content type supported by the server.
+
+ All MRCPv2 servers containing synthesizer resources MUST support both
+ plain text speech data and W3C's Speech Synthesis Markup Language
+ [W3C.REC-speech-synthesis-20040907] and hence MUST support the media
+ types 'text/plain' and 'application/ssml+xml'. Other formats MAY be
+ supported.
+
+ If the speech data is to be fetched by URI reference, the media type
+ 'text/uri-list' (see RFC 2483 [RFC2483]) is used to indicate one or
+ more URIs that, when dereferenced, will contain the content to be
+ spoken. If a list of speech URIs is specified, the resource MUST
+ speak the speech data provided by each URI in the order in which the
+ URIs are specified in the content.
+
+ MRCPv2 clients and servers MUST support the 'multipart/mixed' media
+ type. This is the appropriate media type to use when providing a mix
+ of URI and inline speech data. Embedded within the multipart content
+ block, there MAY be content for the 'text/uri-list', 'application/
+ ssml+xml', and/or 'text/plain' media types. The character set and
+ encoding used in the speech data is specified according to standard
+ media type definitions. The multipart content MAY also contain
+   actual audio data. Clients may have recorded audio clips stored in
+   memory or on a local device and wish to play them as part of the SPEAK
+ request. The audio portions MAY be sent by the client as part of the
+ multipart content block. This audio is referenced in the speech
+ markup data that is another part in the multipart content block
+ according to the 'multipart/mixed' media type specification.
+
+ Content-Type:text/uri-list
+ Content-Length:...
+
+ http://www.example.com/ASR-Introduction.ssml
+ http://www.example.com/ASR-Document-Part1.ssml
+ http://www.example.com/ASR-Document-Part2.ssml
+ http://www.example.com/ASR-Conclusion.ssml
+
+ URI List Example
+
+
+ Content-Type:application/ssml+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <speak version="1.0"
+ xmlns="http://www.w3.org/2001/10/synthesis"
+ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
+ http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
+ xml:lang="en-US">
+ <p>
+ <s>You have 4 new messages.</s>
+ <s>The first is from Aldine Turnbet
+ and arrived at <break/>
+ <say-as interpret-as="vxml:time">0345p</say-as>.</s>
+
+ <s>The subject is <prosody
+ rate="-20%">ski trip</prosody></s>
+ </p>
+ </speak>
+
+ SSML Example
+
+
+
+
+
+
+ Content-Type:multipart/mixed; boundary="break"
+
+ --break
+ Content-Type:text/uri-list
+ Content-Length:...
+
+ http://www.example.com/ASR-Introduction.ssml
+ http://www.example.com/ASR-Document-Part1.ssml
+ http://www.example.com/ASR-Document-Part2.ssml
+ http://www.example.com/ASR-Conclusion.ssml
+
+ --break
+ Content-Type:application/ssml+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <speak version="1.0"
+ xmlns="http://www.w3.org/2001/10/synthesis"
+ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
+ http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
+ xml:lang="en-US">
+ <p>
+ <s>You have 4 new messages.</s>
+ <s>The first is from Stephanie Williams
+ and arrived at <break/>
+ <say-as interpret-as="vxml:time">0342p</say-as>.</s>
+
+ <s>The subject is <prosody
+ rate="-20%">ski trip</prosody></s>
+ </p>
+ </speak>
+ --break--
+
+ Multipart Example
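For illustration only, a client-side helper that assembles such a 'multipart/mixed' body might look like the following sketch; the boundary token "break" simply mirrors the example above, and any unique token would do:

```python
# Non-normative sketch: assemble a 'multipart/mixed' message body mixing
# URI-list and inline SSML parts, as in the Multipart Example above.
def build_multipart(parts: list[tuple[str, str]], boundary: str = "break") -> str:
    lines = []
    for ctype, body in parts:
        lines.append("--" + boundary)             # part delimiter
        lines.append("Content-Type:" + ctype)
        lines.append("Content-Length:" + str(len(body)))
        lines.append("")                          # blank line before body
        lines.append(body)
    lines.append("--" + boundary + "--")          # closing delimiter
    return "\r\n".join(lines)
```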
+
+8.5.2. Lexicon Data
+
+ Synthesizer lexicon data from the client to the server can be
+ provided inline or by reference. Either way, they are carried as
+ typed media in the message body of the MRCPv2 request message (see
+ Section 8.14).
+
+ When a lexicon is specified inline in the message, the client MUST
+ provide a Content-ID for that lexicon as part of the content header
+ fields. The server MUST store the lexicon associated with that
+ Content-ID for the duration of the session. A stored lexicon can be
+ overwritten by defining a new lexicon with the same Content-ID.
+
+
+
+
+
+ Lexicons that have been associated with a Content-ID can be
+ referenced through the 'session' URI scheme (see Section 13.6).
+
+ If lexicon data is specified by external URI reference, the media
+ type 'text/uri-list' (see RFC 2483 [RFC2483]) is used to list one or
+ more URIs that may be dereferenced to obtain the lexicon data.
+ All MRCPv2 servers MUST support the "http" and "https" URI access
+ mechanisms, and MAY support other mechanisms.
+
+ If the data in the message body consists of a mix of URI and inline
+ lexicon data, the 'multipart/mixed' media type is used. The
+ character set and encoding used in the lexicon data may be specified
+ according to standard media type definitions.
+
+8.6. SPEAK Method
+
+ The SPEAK request provides the synthesizer resource with the speech
+ text and initiates speech synthesis and streaming. The SPEAK method
+ MAY carry voice and prosody header fields that alter the behavior of
+ the voice being synthesized, as well as a typed media message body
+ containing the actual marked-up text to be spoken.
+
+ The SPEAK method implementation MUST do a fetch of all external URIs
+ that are part of that operation. If caching is implemented, this URI
+ fetching MUST conform to the cache-control hints and parameter header
+ fields associated with the method in deciding whether it is to be
+ fetched from cache or from the external server. If these hints/
+ parameters are not specified in the method, the values set for the
+ session using SET-PARAMS apply. If they were not set for the
+ session, their default values apply.
+
+ When applying voice parameters, there are three levels of precedence.
+ Highest precedence goes to parameters specified within the speech
+ markup text, followed by those specified in the header fields of the
+ SPEAK request (which therefore apply to that SPEAK request only),
+ followed by the session default values, which can be set using the
+ SET-PARAMS request and apply to subsequent methods invoked during the
+ session.
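This precedence can be expressed as a small illustrative lookup, highest-precedence scope first; the dictionaries here are stand-ins for the three scopes, not protocol elements:

```python
# Non-normative sketch: resolve a voice parameter across the three
# precedence levels: markup > SPEAK request headers > session defaults.
def effective_param(name, markup, request_headers, session_defaults):
    for scope in (markup, request_headers, session_defaults):
        if name in scope:
            return scope[name]
    return None  # none set anywhere: the resource's built-in default applies
```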
+
+ If the resource was idle at the time the SPEAK request arrived at the
+ server and the SPEAK method is being actively processed, the resource
+ responds immediately with a success status code and a request-state
+ of IN-PROGRESS.
+
+ If the resource is in the speaking or paused state when the SPEAK
+ method arrives at the server, i.e., it is in the middle of processing
+ a previous SPEAK request, the status returns success with a request-
+ state of PENDING. The server places the SPEAK request in the
+ synthesizer resource request queue. The request queue operates
+
+
+
+
+
+ strictly FIFO: requests are processed serially in order of receipt.
+ If the current SPEAK fails, all SPEAK methods in the pending queue
+ are cancelled and each generates a SPEAK-COMPLETE event with a
+ Completion-Cause of "cancelled".
+
+ For the synthesizer resource, SPEAK is the only method that can
+ return a request-state of IN-PROGRESS or PENDING. When the text has
+ been synthesized and played into the media stream, the resource
+ issues a SPEAK-COMPLETE event with the request-id of the SPEAK
+ request and a request-state of COMPLETE.
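The queueing rule above can be sketched as follows; the class and the event tuples are illustrative only, not protocol elements:

```python
from collections import deque

# Non-normative sketch: the synthesizer's strictly FIFO SPEAK queue.
# If the active request fails, every queued request is cancelled and
# would generate a SPEAK-COMPLETE event with cause "cancelled".
class SpeakQueue:
    def __init__(self):
        self.queue = deque()

    def submit(self, request_id):
        # First request becomes active; later ones queue behind it.
        state = "IN-PROGRESS" if not self.queue else "PENDING"
        self.queue.append(request_id)
        return state

    def fail_active(self):
        self.queue.popleft()               # drop the failed active request
        cancelled = list(self.queue)       # all pending requests cancel
        self.queue.clear()
        return [(rid, "SPEAK-COMPLETE", "cancelled") for rid in cancelled]
```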
+
+ C->S: MRCP/2.0 ... SPEAK 543257
+ Channel-Identifier:32AECB23433802@speechsynth
+ Voice-gender:neutral
+ Voice-Age:25
+ Prosody-volume:medium
+ Content-Type:application/ssml+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <speak version="1.0"
+ xmlns="http://www.w3.org/2001/10/synthesis"
+ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
+ http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
+ xml:lang="en-US">
+ <p>
+ <s>You have 4 new messages.</s>
+ <s>The first is from Stephanie Williams and arrived at
+ <break/>
+ <say-as interpret-as="vxml:time">0342p</say-as>.
+ </s>
+ <s>The subject is
+ <prosody rate="-20%">ski trip</prosody>
+ </s>
+ </p>
+ </speak>
+
+ S->C: MRCP/2.0 ... 543257 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433802@speechsynth
+ Speech-Marker:timestamp=857206027059
+
+ S->C: MRCP/2.0 ... SPEAK-COMPLETE 543257 COMPLETE
+ Channel-Identifier:32AECB23433802@speechsynth
+ Completion-Cause:000 normal
+ Speech-Marker:timestamp=857206027059
+
+ SPEAK Example
+
+
+
+
+
+8.7. STOP
+
+ The STOP method from the client to the server tells the synthesizer
+ resource to stop speaking if it is speaking something.
+
+ The STOP request can be sent with an Active-Request-Id-List header
+ field to stop zero or more specific SPEAK requests that may be in the
+ queue; the response returns a status-code of 200 "Success". If no
+ Active-Request-Id-List header field is sent in the STOP request, the
+ server terminates all outstanding SPEAK requests.
+
+ If a STOP request successfully terminated one or more PENDING or
+ IN-PROGRESS SPEAK requests, then the response MUST contain an Active-
+ Request-Id-List header field enumerating the SPEAK request-ids that
+ were terminated. Otherwise, there is no Active-Request-Id-List
+ header field in the response. No SPEAK-COMPLETE events are sent for
+ such terminated requests.
+
+ If a SPEAK request that was IN-PROGRESS and speaking was stopped, the
+ next pending SPEAK request, if any, becomes IN-PROGRESS at the
+ resource and enters the speaking state.
+
+ If a SPEAK request that was IN-PROGRESS and paused was stopped, the
+ next pending SPEAK request, if any, becomes IN-PROGRESS and enters
+ the paused state.
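A non-normative sketch of this termination logic, with a Python list standing in for the resource's request queue:

```python
# Non-normative sketch: STOP terminates the listed SPEAK requests (or
# all outstanding requests when no Active-Request-Id-List is given) and
# echoes the terminated request-ids back in the response.
def stop(queue, active_ids=None):
    if active_ids is None:
        terminated = list(queue)    # no header: terminate everything
        queue.clear()
    else:
        terminated = [r for r in queue if r in active_ids]
        queue[:] = [r for r in queue if r not in active_ids]
    # None means the response carries no Active-Request-Id-List header.
    return terminated or None
```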
+
+ C->S: MRCP/2.0 ... SPEAK 543258
+ Channel-Identifier:32AECB23433802@speechsynth
+ Content-Type:application/ssml+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <speak version="1.0"
+ xmlns="http://www.w3.org/2001/10/synthesis"
+ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
+ http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
+ xml:lang="en-US">
+ <p>
+ <s>You have 4 new messages.</s>
+ <s>The first is from Stephanie Williams and arrived at
+ <break/>
+ <say-as interpret-as="vxml:time">0342p</say-as>.</s>
+ <s>The subject is
+ <prosody rate="-20%">ski trip</prosody></s>
+ </p>
+ </speak>
+
+
+
+
+
+
+ S->C: MRCP/2.0 ... 543258 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433802@speechsynth
+ Speech-Marker:timestamp=857206027059
+
+ C->S: MRCP/2.0 ... STOP 543259
+ Channel-Identifier:32AECB23433802@speechsynth
+
+ S->C: MRCP/2.0 ... 543259 200 COMPLETE
+ Channel-Identifier:32AECB23433802@speechsynth
+ Active-Request-Id-List:543258
+ Speech-Marker:timestamp=857206039059
+
+ STOP Example
+
+8.8. BARGE-IN-OCCURRED
+
+ The BARGE-IN-OCCURRED method, when used with the synthesizer
+ resource, provides a client that has detected a barge-in-able event a
+ means to communicate the occurrence of the event to the synthesizer
+ resource.
+
+ This method is useful in two scenarios:
+
+ 1. The client has detected DTMF digits in the input media or some
+ other barge-in-able event and wants to communicate that to the
+ synthesizer resource.
+
+ 2. The recognizer resource and the synthesizer resource are in
+ different servers. In this case, the client acts as an
+ intermediary for the two servers. It receives an event from the
+ recognition resource and sends a BARGE-IN-OCCURRED request to the
+ synthesizer. In such cases, the BARGE-IN-OCCURRED method would
+ also have a Proxy-Sync-Id header field received from the resource
+ generating the original event.
+
+ If a SPEAK request is active with kill-on-barge-in enabled (see
+ Section 8.4.2), and a BARGE-IN-OCCURRED request is received, the
+ synthesizer MUST immediately stop streaming out audio. It MUST also
+ terminate any speech requests queued behind the current active one,
+ irrespective of whether or not they have barge-in enabled. If a
+ barge-in-able SPEAK request was playing and it was terminated, the
+ response MUST contain an Active-Request-Id-List header field listing
+ the request-ids of all SPEAK requests that were terminated. The
+ server generates no SPEAK-COMPLETE events for these requests.
+
+
+
+
+
+
+
+
+
+ If there were no SPEAK requests terminated by the synthesizer
+ resource as a result of the BARGE-IN-OCCURRED method, the server MUST
+ respond to the BARGE-IN-OCCURRED with a status-code of 200 "Success",
+ and the response MUST NOT contain an Active-Request-Id-List header
+ field.
+
+ If the synthesizer and recognizer resources are part of the same
+ MRCPv2 session, they can be optimized for a quicker kill-on-barge-in
+ response if the recognizer and synthesizer interact directly. In
+ these cases, the client MUST still react to a START-OF-INPUT event
+ from the recognizer by invoking the BARGE-IN-OCCURRED method to the
+ synthesizer. The client MUST invoke the BARGE-IN-OCCURRED method if
+ it has any outstanding requests to the synthesizer resource in either
+ the PENDING or IN-PROGRESS state.
+
+ C->S: MRCP/2.0 ... SPEAK 543258
+ Channel-Identifier:32AECB23433802@speechsynth
+ Voice-gender:neutral
+ Voice-Age:25
+ Prosody-volume:medium
+ Content-Type:application/ssml+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <speak version="1.0"
+ xmlns="http://www.w3.org/2001/10/synthesis"
+ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
+ http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
+ xml:lang="en-US">
+ <p>
+ <s>You have 4 new messages.</s>
+ <s>The first is from Stephanie Williams and arrived at
+ <break/>
+ <say-as interpret-as="vxml:time">0342p</say-as>.</s>
+ <s>The subject is
+ <prosody rate="-20%">ski trip</prosody></s>
+ </p>
+ </speak>
+
+ S->C: MRCP/2.0 ... 543258 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433802@speechsynth
+ Speech-Marker:timestamp=857206027059
+
+ C->S: MRCP/2.0 ... BARGE-IN-OCCURRED 543259
+ Channel-Identifier:32AECB23433802@speechsynth
+ Proxy-Sync-Id:987654321
+
+
+
+
+
+
+ S->C: MRCP/2.0 ... 543259 200 COMPLETE
+ Channel-Identifier:32AECB23433802@speechsynth
+ Active-Request-Id-List:543258
+ Speech-Marker:timestamp=857206039059
+
+ BARGE-IN-OCCURRED Example
+
+8.9. PAUSE
+
+ The PAUSE method from the client to the server tells the synthesizer
+ resource to pause speech output if it is speaking something. If a
+ PAUSE method is issued on a session when a SPEAK is not active, the
+ server MUST respond with a status-code of 402 "Method not valid in
+ this state". If a PAUSE method is issued on a session when a SPEAK
+ is active and paused, the server MUST respond with a status-code of
+ 200 "Success". If a SPEAK request was active, the server MUST return
+ an Active-Request-Id-List header field whose value contains the
+ request-id of the SPEAK request that was paused.
+
+ C->S: MRCP/2.0 ... SPEAK 543258
+ Channel-Identifier:32AECB23433802@speechsynth
+ Voice-gender:neutral
+ Voice-Age:25
+ Prosody-volume:medium
+ Content-Type:application/ssml+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <speak version="1.0"
+ xmlns="http://www.w3.org/2001/10/synthesis"
+ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
+ http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
+ xml:lang="en-US">
+ <p>
+ <s>You have 4 new messages.</s>
+ <s>The first is from Stephanie Williams and arrived at
+ <break/>
+ <say-as interpret-as="vxml:time">0342p</say-as>.</s>
+
+ <s>The subject is
+ <prosody rate="-20%">ski trip</prosody></s>
+ </p>
+ </speak>
+
+ S->C: MRCP/2.0 ... 543258 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433802@speechsynth
+ Speech-Marker:timestamp=857206027059
+
+
+
+
+
+ C->S: MRCP/2.0 ... PAUSE 543259
+ Channel-Identifier:32AECB23433802@speechsynth
+
+ S->C: MRCP/2.0 ... 543259 200 COMPLETE
+ Channel-Identifier:32AECB23433802@speechsynth
+ Active-Request-Id-List:543258
+
+ PAUSE Example
+
+8.10. RESUME
+
+ The RESUME method from the client to the server tells a paused
+ synthesizer resource to resume speaking. If a RESUME request is
+ issued on a session with no active SPEAK request, the server MUST
+ respond with a status-code of 402 "Method not valid in this state".
+ If a RESUME request is issued on a session with an active SPEAK
+ request that is speaking (i.e., not paused), the server MUST respond
+ with a status-code of 200 "Success". If a SPEAK request was paused,
+ the server MUST return an Active-Request-Id-List header field whose
+ value contains the request-id of the SPEAK request that was resumed.
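A non-normative sketch of the RESUME response logic above, with status codes as plain integers for brevity:

```python
# Non-normative sketch: RESUME response logic. No active SPEAK -> 402
# "Method not valid in this state"; otherwise 200, and when the active
# request actually was paused its request-id is echoed back in
# Active-Request-Id-List.
def resume(active_id, is_paused):
    if active_id is None:
        return (402, None)
    return (200, [active_id] if is_paused else None)
```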
+
+ C->S: MRCP/2.0 ... SPEAK 543258
+ Channel-Identifier:32AECB23433802@speechsynth
+ Voice-gender:neutral
+ Voice-age:25
+ Prosody-volume:medium
+ Content-Type:application/ssml+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <speak version="1.0"
+ xmlns="http://www.w3.org/2001/10/synthesis"
+ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
+ http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
+ xml:lang="en-US">
+ <p>
+ <s>You have 4 new messages.</s>
+ <s>The first is from Stephanie Williams and arrived at
+ <break/>
+ <say-as interpret-as="vxml:time">0342p</say-as>.</s>
+ <s>The subject is
+ <prosody rate="-20%">ski trip</prosody></s>
+ </p>
+ </speak>
+
+
+
+
+
+
+
+
+ S->C: MRCP/2.0 ... 543258 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433802@speechsynth
+ Speech-Marker:timestamp=857206027059
+
+ C->S: MRCP/2.0 ... PAUSE 543259
+ Channel-Identifier:32AECB23433802@speechsynth
+
+ S->C: MRCP/2.0 ... 543259 200 COMPLETE
+ Channel-Identifier:32AECB23433802@speechsynth
+ Active-Request-Id-List:543258
+
+ C->S: MRCP/2.0 ... RESUME 543260
+ Channel-Identifier:32AECB23433802@speechsynth
+
+ S->C: MRCP/2.0 ... 543260 200 COMPLETE
+ Channel-Identifier:32AECB23433802@speechsynth
+ Active-Request-Id-List:543258
+
+ RESUME Example
+
+8.11. CONTROL
+
+ The CONTROL method from the client to the server tells a synthesizer
+ that is speaking to modify what it is speaking on the fly. This
+ method is used to request the synthesizer to jump forward or backward
+ in what it is speaking, change speaker rate, speaker parameters, etc.
+ It affects only the currently IN-PROGRESS SPEAK request. Depending
+ on the implementation and capability of the synthesizer resource, it
+ may or may not support the various modifications indicated by header
+ fields in the CONTROL request.
+
+ When a client invokes a CONTROL method to jump forward and the
+ operation goes beyond the end of the active SPEAK method's text, the
+ CONTROL request still succeeds. The active SPEAK request completes
+ and returns a SPEAK-COMPLETE event following the response to the
+ CONTROL method. If there are more SPEAK requests in the queue, the
+ synthesizer resource starts at the beginning of the next SPEAK
+ request in the queue.
+
+ When a client invokes a CONTROL method to jump backward and the
+ operation jumps to the beginning or beyond the beginning of the
+ speech data of the active SPEAK method, the CONTROL request still
+ succeeds. The response to the CONTROL request contains the speak-
+ restart header field, and the active SPEAK request restarts from the
+ beginning of its speech data.
+
+
+
+
+
+
+
+
+ These two behaviors can be used to rewind or fast-forward across
+ multiple speech requests, if the client wants to break up a speech
+ markup text into multiple SPEAK requests.
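A non-normative sketch of the jump-handling rules, with positions measured in arbitrary units (the spec's Jump-Size can use words, sentences, etc.); the "restart" outcome corresponds to a response carrying the Speak-Restart header field:

```python
# Non-normative sketch: applying a CONTROL jump to the active SPEAK.
# Jumping past the end completes the request (SPEAK-COMPLETE follows);
# jumping to or before the beginning restarts it from the start.
def apply_jump(position, jump, length):
    target = position + jump
    if target >= length:
        return ("complete", None)    # play out; next queued SPEAK starts
    if target <= 0:
        return ("restart", 0)        # response carries Speak-Restart
    return ("seek", target)
```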
+
+ If a SPEAK request was active when the CONTROL method was received,
+ the server MUST return an Active-Request-Id-List header field
+ containing the request-id of the SPEAK request that was active.
+
+ C->S: MRCP/2.0 ... SPEAK 543258
+ Channel-Identifier:32AECB23433802@speechsynth
+ Voice-gender:neutral
+ Voice-age:25
+ Prosody-volume:medium
+ Content-Type:application/ssml+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <speak version="1.0"
+ xmlns="http://www.w3.org/2001/10/synthesis"
+ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
+ http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
+ xml:lang="en-US">
+ <p>
+ <s>You have 4 new messages.</s>
+ <s>The first is from Stephanie Williams
+ and arrived at <break/>
+ <say-as interpret-as="vxml:time">0342p</say-as>.</s>
+
+ <s>The subject is <prosody
+ rate="-20%">ski trip</prosody></s>
+ </p>
+ </speak>
+
+ S->C: MRCP/2.0 ... 543258 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433802@speechsynth
+ Speech-Marker:timestamp=857205016059
+
+ C->S: MRCP/2.0 ... CONTROL 543259
+ Channel-Identifier:32AECB23433802@speechsynth
+ Prosody-rate:fast
+
+ S->C: MRCP/2.0 ... 543259 200 COMPLETE
+ Channel-Identifier:32AECB23433802@speechsynth
+ Active-Request-Id-List:543258
+ Speech-Marker:timestamp=857206027059
+
+
+
+
+
+
+
+ C->S: MRCP/2.0 ... CONTROL 543260
+ Channel-Identifier:32AECB23433802@speechsynth
+ Jump-Size:-15 Words
+
+ S->C: MRCP/2.0 ... 543260 200 COMPLETE
+ Channel-Identifier:32AECB23433802@speechsynth
+ Active-Request-Id-List:543258
+ Speech-Marker:timestamp=857206039059
+
+ CONTROL Example
+
+8.12. SPEAK-COMPLETE
+
+ This is an Event message from the synthesizer resource to the client
+ that indicates the corresponding SPEAK request was completed. The
+ request-id field matches the request-id of the SPEAK request that
+ initiated the speech that just completed. The request-state field is
+ set to COMPLETE by the server, indicating that this is the last event
+ with the corresponding request-id. The Completion-Cause header field
+ specifies the cause code pertaining to the status and reason of
+ request completion, such as the SPEAK completed normally or because
+ of an error, kill-on-barge-in, etc.
+
+ C->S: MRCP/2.0 ... SPEAK 543260
+ Channel-Identifier:32AECB23433802@speechsynth
+ Voice-gender:neutral
+ Voice-age:25
+ Prosody-volume:medium
+ Content-Type:application/ssml+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <speak version="1.0"
+ xmlns="http://www.w3.org/2001/10/synthesis"
+ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
+ http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
+ xml:lang="en-US">
+ <p>
+ <s>You have 4 new messages.</s>
+ <s>The first is from Stephanie Williams
+ and arrived at <break/>
+ <say-as interpret-as="vxml:time">0342p</say-as>.</s>
+ <s>The subject is
+ <prosody rate="-20%">ski trip</prosody></s>
+ </p>
+ </speak>
+
+
+
+
+
+
+ S->C: MRCP/2.0 ... 543260 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433802@speechsynth
+ Speech-Marker:timestamp=857206027059
+
+ S->C: MRCP/2.0 ... SPEAK-COMPLETE 543260 COMPLETE
+ Channel-Identifier:32AECB23433802@speechsynth
+ Completion-Cause:000 normal
+ Speech-Marker:timestamp=857206039059
+
+ SPEAK-COMPLETE Example
+
+8.13. SPEECH-MARKER
+
+ This is an event generated by the synthesizer resource to the client
+ when the synthesizer encounters a marker tag in the speech markup it
+ is currently processing. The value of the request-id field MUST
+ match that of the corresponding SPEAK request. The request-state
+ field MUST have the value "IN-PROGRESS" as the speech is still not
+ complete. The value of the marker tag encountered, describing where
+ the synthesizer is in the speech markup, MUST be returned in the
+ Speech-Marker header field, along with an NTP timestamp indicating
+ the instant in the output speech stream that the marker was
+ encountered. The SPEECH-MARKER event MUST also be generated with a
+ null marker value and output NTP timestamp when a SPEAK request in
+ Pending-State (i.e., in the queue) changes state to IN-PROGRESS and
+ starts speaking. The NTP timestamp MUST be synchronized with the RTP
+ timestamp used to generate the speech stream through standard RTCP
+ machinery.
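For illustration, given the (NTP, RTP) timestamp pair reported in an RTCP Sender Report, a marker's NTP timestamp maps onto the RTP timeline as in this sketch; NTP values are treated as seconds for clarity, and 8000 Hz is assumed as a typical narrowband RTP clock rate (32-bit RTP timestamp wrap-around is ignored here):

```python
# Non-normative sketch: map an NTP timestamp to the RTP timeline using
# the NTP/RTP correspondence from an RTCP Sender Report.
def ntp_to_rtp(ntp, sr_ntp, sr_rtp, clock_rate=8000):
    # Offset from the SR's NTP instant, scaled to RTP clock units.
    return sr_rtp + round((ntp - sr_ntp) * clock_rate)
```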
+
+ C->S: MRCP/2.0 ... SPEAK 543261
+ Channel-Identifier:32AECB23433802@speechsynth
+ Voice-gender:neutral
+ Voice-age:25
+ Prosody-volume:medium
+ Content-Type:application/ssml+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <speak version="1.0"
+ xmlns="http://www.w3.org/2001/10/synthesis"
+ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
+ http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
+ xml:lang="en-US">
+ <p>
+ <s>You have 4 new messages.</s>
+ <s>The first is from Stephanie Williams
+ and arrived at <break/>
+
+
+
+
+
+ <say-as interpret-as="vxml:time">0342p</say-as>.</s>
+ <mark name="here"/>
+ <s>The subject is
+ <prosody rate="-20%">ski trip</prosody>
+ </s>
+ <mark name="ANSWER"/>
+ </p>
+ </speak>
+
+ S->C: MRCP/2.0 ... 543261 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433802@speechsynth
+ Speech-Marker:timestamp=857205015059
+
+ S->C: MRCP/2.0 ... SPEECH-MARKER 543261 IN-PROGRESS
+ Channel-Identifier:32AECB23433802@speechsynth
+ Speech-Marker:timestamp=857206027059;here
+
+ S->C: MRCP/2.0 ... SPEECH-MARKER 543261 IN-PROGRESS
+ Channel-Identifier:32AECB23433802@speechsynth
+ Speech-Marker:timestamp=857206039059;ANSWER
+
+ S->C: MRCP/2.0 ... SPEAK-COMPLETE 543261 COMPLETE
+ Channel-Identifier:32AECB23433802@speechsynth
+ Completion-Cause:000 normal
+ Speech-Marker:timestamp=857207689259;ANSWER
+
+ SPEECH-MARKER Example
+
+8.14. DEFINE-LEXICON
+
+ The DEFINE-LEXICON method, from the client to the server, provides a
+ lexicon and tells the server to load or unload the lexicon (see
+ Section 8.4.16). The media type of the lexicon is provided in the
+ Content-Type header (see Section 8.5.2). One such media type is
+ "application/pls+xml" for the Pronunciation Lexicon Specification
+ (PLS) [W3C.REC-pronunciation-lexicon-20081014] [RFC4267].
+
+ If the server resource is in the speaking or paused state, the server
+ MUST respond with a failure status-code of 402 "Method not valid in
+ this state".
+
+ If the resource is in the idle state and is able to successfully
+ load/unload the lexicon, the status MUST return a 200 "Success"
+ status-code and the request-state MUST be COMPLETE.
+
+
+
+
+
+
+
+
+
+ If the synthesizer could not define the lexicon for some reason, for
+ example, because the download failed or the lexicon was in an
+ unsupported form, the server MUST respond with a failure status-code
+ of 407 and a Completion-Cause header field describing the failure
+ reason.
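A non-normative sketch of this state handling; the failure cause string is a placeholder, as the actual token comes from the synthesizer's Completion-Cause table:

```python
# Non-normative sketch: DEFINE-LEXICON outcome by synthesizer state.
def define_lexicon(state, load_ok):
    if state in ("speaking", "paused"):
        return (402, None)           # "Method not valid in this state"
    if load_ok:
        return (200, "COMPLETE")     # idle, lexicon loaded/unloaded
    # 407 plus a Completion-Cause describing the failure; the cause
    # token below is a placeholder, not a value defined by the spec.
    return (407, "lexicon failure (placeholder)")
```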
+
+9. Speech Recognizer Resource
+
+ The speech recognizer resource receives an incoming voice stream and
+ provides the client with an interpretation of what was spoken in
+ textual form.
+
+ The recognizer resource is controlled by MRCPv2 requests from the
+ client. The recognizer resource can both respond to these requests
+ and generate asynchronous events to the client to indicate conditions
+ of interest during the processing of the method.
+
+ This section applies to the following resource types.
+
+ 1. speechrecog
+
+ 2. dtmfrecog
+
+ The difference between the above two resources is in their level of
+ support for recognition grammars. The "dtmfrecog" resource type is
+ capable of recognizing only DTMF digits and hence accepts only DTMF
+ grammars. It only generates barge-in for DTMF inputs and ignores
+ speech. The "speechrecog" resource type can recognize regular speech
+ as well as DTMF digits and hence MUST support grammars describing
+ either speech or DTMF. This resource generates barge-in events for
+ speech and/or DTMF. By analyzing the grammars that are activated by
+ the RECOGNIZE method, it determines if a barge-in should occur for
+ speech and/or DTMF. When the recognizer decides it needs to generate
+ a barge-in, it also generates a START-OF-INPUT event to the client.
+ The recognizer resource MAY support recognition in the normal or
+ hotword modes or both (although note that a single "speechrecog"
+ resource does not perform normal and hotword mode recognition
+ simultaneously). For implementations where a single recognizer
+ resource does not support both modes, or simultaneous normal and
+ hotword recognition is desired, the two modes can be invoked through
+ separate resources allocated to the same SIP dialog (with different
+ MRCP session identifiers) and share the RTP audio feed.
+
+ The capabilities of the recognizer resource are enumerated below:
+
+ Normal Mode Recognition Normal mode recognition tries to match all
+ of the speech or DTMF against the grammar and returns a no-match
+ status if the input fails to match or the method times out.
+
+
+
+
+
+ Hotword Mode Recognition Hotword mode is where the recognizer looks
+ for a match against specific speech grammar or DTMF sequence and
+ ignores speech or DTMF that does not match. The recognition
+ completes only if there is a successful match of grammar, if the
+ client cancels the request, or if there is a non-input or
+ recognition timeout.
+
+ Voice Enrolled Grammars A recognizer resource MAY optionally support
+ Voice Enrolled Grammars. With this functionality, enrollment is
+ performed using a person's voice. For example, a list of contacts
+ can be created and maintained by recording the person's names
+ using the caller's voice. This technique is sometimes also called
+ speaker-dependent recognition.
+
+ Interpretation A recognizer resource MAY be employed strictly for
+ its natural language interpretation capabilities by supplying it
+ with a text string as input instead of speech. In this mode, the
+ resource takes text as input and produces an "interpretation" of
+ the input according to the supplied grammar.
+
+ Voice enrollment has the concept of an enrollment session. A session
+ to add a new phrase to a personal grammar involves the initial
+ enrollment followed by enough repeated utterances before
+ committing the new phrase to the personal grammar. Each time an
+ utterance is recorded, it is compared for similarity with the other
+ samples and a clash test is performed against other entries in the
+ personal grammar to ensure there are no similar and confusable
+ entries.
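A non-normative sketch of these checks, assuming a hypothetical similarity() function returning a score in [0, 1] and illustrative threshold values (the real thresholds are controlled by the Consistency-Threshold and Clash-Threshold header fields):

```python
# Non-normative sketch: per-utterance enrollment checks. A new sample
# must be consistent with earlier samples of the same phrase and must
# not clash with (be confusable with) existing personal-grammar entries.
def accept_utterance(utt, samples, grammar, similarity,
                     consistency=0.5, clash=0.9):
    if samples and min(similarity(utt, s) for s in samples) < consistency:
        return "inconsistent"   # too unlike earlier samples
    if any(similarity(utt, e) >= clash for e in grammar):
        return "clash"          # confusable with an existing entry
    return "ok"
```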
+
+ Enrollment is done using a recognizer resource. Controlling which
+ utterances are to be considered for enrollment of a new phrase is
+ done by setting a header field (see Section 9.4.39) in the Recognize
+ request.
+
+ Interpretation is accomplished through the INTERPRET method
+ (Section 9.20) and the Interpret-Text header field (Section 9.4.30).
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+9.1. Recognizer State Machine
+
+ The recognizer resource maintains a state machine to process MRCPv2
+ requests from the client.
+
+ Idle Recognizing Recognized
+ State State State
+ | | |
+ |---------RECOGNIZE---->|---RECOGNITION-COMPLETE-->|
+ |<------STOP------------|<-----RECOGNIZE-----------|
+ | | |
+ | |--------| |-----------|
+ | START-OF-INPUT | GET-RESULT |
+ | |------->| |---------->|
+ |------------| | |
+ | DEFINE-GRAMMAR |----------| |
+ |<-----------| | START-INPUT-TIMERS |
+ | |<---------| |
+ |------| | |
+ | INTERPRET | |
+ |<-----| |------| |
+ | | RECOGNIZE |
+ |-------| |<-----| |
+ | STOP |
+ |<------| |
+ |<-------------------STOP--------------------------|
+ |<-------------------DEFINE-GRAMMAR----------------|
+
+ Recognizer State Machine
+
+ If a recognizer resource supports voice enrolled grammars, starting
+ an enrollment session does not change the state of the recognizer
+ resource. Once an enrollment session is started, then utterances are
+ enrolled by calling the RECOGNIZE method repeatedly. The state of
+ the speech recognizer resource goes from the IDLE state to the
+ RECOGNIZING state each time RECOGNIZE is called.
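A non-normative sketch encoding a subset of the transitions in the diagram above; pairs not listed (such as START-OF-INPUT, GET-RESULT, or the self-loops on the Idle state) leave the state unchanged:

```python
# Non-normative sketch: a subset of the recognizer state machine,
# keyed by (current state, method or event name).
TRANSITIONS = {
    ("idle", "RECOGNIZE"): "recognizing",
    ("recognizing", "RECOGNITION-COMPLETE"): "recognized",
    ("recognizing", "STOP"): "idle",
    ("recognized", "RECOGNIZE"): "recognizing",
    ("recognized", "STOP"): "idle",
    ("recognized", "DEFINE-GRAMMAR"): "idle",
}

def step(state, event):
    # Unlisted (state, event) pairs are self-transitions.
    return TRANSITIONS.get((state, event), state)
```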
+
+9.2. Recognizer Methods
+
+ The recognizer supports the following methods.
+
+ recognizer-method = recog-only-method
+ / enrollment-method
+
+
+
+
+
+
+
+
+
+
+ recog-only-method = "DEFINE-GRAMMAR"
+ / "RECOGNIZE"
+ / "INTERPRET"
+ / "GET-RESULT"
+ / "START-INPUT-TIMERS"
+ / "STOP"
+
+ It is OPTIONAL for a recognizer resource to support voice enrolled
+ grammars. If the recognizer resource does support voice enrolled
+ grammars, it MUST support the following methods.
+
+ enrollment-method = "START-PHRASE-ENROLLMENT"
+ / "ENROLLMENT-ROLLBACK"
+ / "END-PHRASE-ENROLLMENT"
+ / "MODIFY-PHRASE"
+ / "DELETE-PHRASE"
+
+9.3. Recognizer Events
+
+ The recognizer can generate the following events.
+
+ recognizer-event = "START-OF-INPUT"
+ / "RECOGNITION-COMPLETE"
+ / "INTERPRETATION-COMPLETE"
+
+9.4. Recognizer Header Fields
+
+ A recognizer message can contain header fields containing request
+ options and information to augment the Method, Response, or Event
+ message it is associated with.
+
+ recognizer-header = recog-only-header
+ / enrollment-header
+
+ recog-only-header = confidence-threshold
+ / sensitivity-level
+ / speed-vs-accuracy
+ / n-best-list-length
+ / no-input-timeout
+ / input-type
+ / recognition-timeout
+ / waveform-uri
+ / input-waveform-uri
+ / completion-cause
+ / completion-reason
+ / recognizer-context-block
+ / start-input-timers
+ / speech-complete-timeout
+
+
+
+
+
+ / speech-incomplete-timeout
+ / dtmf-interdigit-timeout
+ / dtmf-term-timeout
+ / dtmf-term-char
+ / failed-uri
+ / failed-uri-cause
+ / save-waveform
+ / media-type
+ / new-audio-channel
+ / speech-language
+ / ver-buffer-utterance
+ / recognition-mode
+ / cancel-if-queue
+ / hotword-max-duration
+ / hotword-min-duration
+ / interpret-text
+ / dtmf-buffer-time
+ / clear-dtmf-buffer
+ / early-no-match
+
+ If a recognizer resource supports voice enrolled grammars, the
+ following header fields are also used.
+
+ enrollment-header = num-min-consistent-pronunciations
+ / consistency-threshold
+ / clash-threshold
+ / personal-grammar-uri
+ / enroll-utterance
+ / phrase-id
+ / phrase-nl
+ / weight
+ / save-best-waveform
+ / new-phrase-id
+ / confusable-phrases-uri
+ / abort-phrase-enrollment
+
+ For enrollment-specific header fields that can appear as part of
+ SET-PARAMS or GET-PARAMS methods, the following general rule applies:
+ the START-PHRASE-ENROLLMENT method MUST be invoked before these
+ header fields may be set through the SET-PARAMS method or retrieved
+ through the GET-PARAMS method.
+
+ Note that the Waveform-URI header field of the Recognizer resource
+ can also appear in the response to the END-PHRASE-ENROLLMENT method.
+
+
+
+
+
+
+
+Burnett & Shanmugham Standards Track [Page 76]
+
+RFC 6787 MRCPv2 November 2012
+
+
+9.4.1. Confidence-Threshold
+
+ When a recognizer resource recognizes or matches a spoken phrase with
+ some portion of the grammar, it associates a confidence level with
+ that match. The Confidence-Threshold header field tells the
+ recognizer resource what confidence level the client considers a
+ successful match. This is a float value between 0.0-1.0 indicating
+ the recognizer's confidence in the recognition. If the recognizer
+ determines that there is no candidate match with a confidence that is
+ greater than the confidence threshold, then it MUST return no-match
+ as the recognition result. This header field MAY occur in RECOGNIZE,
+ SET-PARAMS, or GET-PARAMS. The default value for this header field
+ is implementation specific, as is the interpretation of any specific
+ value for this header field. Although values for servers from
+ different vendors are not comparable, it is expected that clients
+ will tune this value over time for a given server.
+
+ confidence-threshold = "Confidence-Threshold" ":" FLOAT CRLF
+
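As a non-normative illustration of the threshold rule above, the sketch below filters a candidate list against a Confidence-Threshold; candidates at or below the threshold are discarded, and an empty result models a no-match. Function and variable names are illustrative and not part of this specification.

```python
# Illustrative only: apply a Confidence-Threshold to recognition candidates.
# If no candidate's confidence exceeds the threshold, the recognizer must
# return no-match (modeled here as None).

def filter_candidates(candidates, threshold):
    """candidates: list of (phrase, confidence) pairs; threshold in 0.0-1.0."""
    matches = [c for c in candidates if c[1] > threshold]
    return matches if matches else None   # None models a no-match result

print(filter_candidates([("open sesame", 0.42), ("close sesame", 0.87)], 0.5))
# -> [('close sesame', 0.87)]
```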
+9.4.2. Sensitivity-Level
+
+ To filter out background noise and not mistake it for speech, the
+ recognizer resource supports a variable level of sound sensitivity.
+ The Sensitivity-Level header field is a float value between 0.0 and
+ 1.0 and allows the client to set the sensitivity level for the
+ recognizer. This header field MAY occur in RECOGNIZE, SET-PARAMS, or
+ GET-PARAMS. A higher value for this header field means higher
+ sensitivity. The default value for this header field is
+ implementation specific, as is the interpretation of any specific
+ value for this header field. Although values for servers from
+ different vendors are not comparable, it is expected that clients
+ will tune this value over time for a given server.
+
+ sensitivity-level = "Sensitivity-Level" ":" FLOAT CRLF
+
+9.4.3. Speed-Vs-Accuracy
+
+ Depending on the implementation and capability of the recognizer
+ resource, it may be tunable towards performance or accuracy. Higher
+ accuracy may mean more processing and higher CPU utilization, meaning
+ fewer active sessions per server and vice versa. The value is a
+ float between 0.0 and 1.0. A value of 0.0 means fastest recognition.
+ A value of 1.0 means best accuracy. This header field MAY occur in
+ RECOGNIZE, SET-PARAMS, or GET-PARAMS. The default value for this
+ header field is implementation specific. Although values for servers
+ from different vendors are not comparable, it is expected that
+ clients will tune this value over time for a given server.
+
+ speed-vs-accuracy = "Speed-Vs-Accuracy" ":" FLOAT CRLF
+
+9.4.4. N-Best-List-Length
+
+ When the recognizer matches an incoming stream with the grammar, it
+ may come up with more than one alternative match because of
+ confidence levels in certain words or conversation paths. If this
+ header field is not specified, by default, the recognizer resource
+ returns only the best match above the confidence threshold. The
+ client, by setting this header field, can ask the recognition
+ resource to send it more than one alternative. All alternatives must
+ still be above the Confidence-Threshold. A value greater than one
+ does not guarantee that the recognizer will provide the requested
+ number of alternatives. This header field MAY occur in RECOGNIZE,
+ SET-PARAMS, or GET-PARAMS. The minimum value for this header field
+ is 1. The default value for this header field is 1.
+
+ n-best-list-length = "N-Best-List-Length" ":" 1*19DIGIT CRLF
+
+9.4.5. Input-Type
+
+ When the recognizer detects barge-in-able input and generates a
+ START-OF-INPUT event, that event MUST carry this header field to
+ specify whether the input that caused the barge-in was DTMF or
+ speech.
+
+ input-type = "Input-Type" ":" inputs CRLF
+ inputs = "speech" / "dtmf"
+
+9.4.6. No-Input-Timeout
+
+ When recognition is started and there is no speech detected for a
+ certain period of time, the recognizer can send a RECOGNITION-
+ COMPLETE event to the client with a Completion-Cause of "no-input-
+ timeout" and terminate the recognition operation. The client can use
+ the No-Input-Timeout header field to set this timeout. The value is
+ in milliseconds and can range from 0 to an implementation-specific
+ maximum value. This header field MAY occur in RECOGNIZE, SET-PARAMS,
+ or GET-PARAMS. The default value is implementation specific.
+
+ no-input-timeout = "No-Input-Timeout" ":" 1*19DIGIT CRLF
+
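A minimal, non-normative sketch of emitting a timeout header field such as the one above: the value is a 1*19DIGIT millisecond count terminated by CRLF. The range check and helper name are illustrative assumptions.

```python
# Illustrative only: serialize a millisecond timeout header per the ABNF
# "name" ":" 1*19DIGIT CRLF.  The 10**19 bound enforces the 19-digit limit.

def timeout_header(name, ms):
    if not 0 <= ms < 10**19:
        raise ValueError("value out of range for 1*19DIGIT")
    return "%s: %d\r\n" % (name, ms)

print(repr(timeout_header("No-Input-Timeout", 4000)))
# -> 'No-Input-Timeout: 4000\r\n'
```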
+
+9.4.7. Recognition-Timeout
+
+ When recognition is started and there is no match for a certain
+ period of time, the recognizer can send a RECOGNITION-COMPLETE event
+ to the client and terminate the recognition operation. The
+ Recognition-Timeout header field allows the client to set this
+ timeout value. The value is in milliseconds. The value for this
+ header field ranges from 0 to an implementation-specific maximum
+ value. The default value is 10 seconds. This header field MAY occur
+ in RECOGNIZE, SET-PARAMS, or GET-PARAMS.
+
+ recognition-timeout = "Recognition-Timeout" ":" 1*19DIGIT CRLF
+
+9.4.8. Waveform-URI
+
+ If the Save-Waveform header field is set to "true", the recognizer
+ MUST record the incoming audio stream of the recognition into a
+ stored form and provide a URI for the client to access it. This
+ header field MUST be present in the RECOGNITION-COMPLETE event if the
+ Save-Waveform header field was set to "true". The value of the
+ header field MUST be empty if there was some error condition
+ preventing the server from recording. Otherwise, the URI generated
+ by the server MUST be unambiguous across the server and all its
+ recognition sessions. The content associated with the URI MUST be
+ available to the client until the MRCPv2 session terminates.
+
+ Similarly, if the Save-Best-Waveform header field is set to "true",
+ the recognizer MUST save the audio stream for the best repetition of
+ the phrase that was used during the enrollment session. The
+ recognizer MUST then record the recognized audio and make it
+ available to the client by returning a URI in the Waveform-URI header
+ field in the response to the END-PHRASE-ENROLLMENT method. The value
+ of the header field MUST be empty if there was some error condition
+ preventing the server from recording. Otherwise, the URI generated
+ by the server MUST be unambiguous across the server and all its
+ recognition sessions. The content associated with the URI MUST be
+ available to the client until the MRCPv2 session terminates. See the
+ discussion on the sensitivity of saved waveforms in Section 12.
+
+ The server MUST also return the size in octets and the duration in
+ milliseconds of the recorded audio waveform as parameters associated
+ with the header field.
+
+ waveform-uri = "Waveform-URI" ":" ["<" uri ">"
+ ";" "size" "=" 1*19DIGIT
+ ";" "duration" "=" 1*19DIGIT] CRLF
+
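The Waveform-URI value above carries the URI plus size (octets) and duration (milliseconds) parameters, with an empty value signalling a recording error. A non-normative parsing sketch follows; the regular expression is an illustrative approximation of the ABNF (it does not model optional whitespace), and the URI shown is an example, not a real resource.

```python
# Illustrative only: parse a Waveform-URI header field value of the form
# <uri>;size=NNN;duration=NNN, returning None for the empty (error) value.
import re

WAVEFORM_RE = re.compile(
    r'<(?P<uri>[^>]+)>;size=(?P<size>\d{1,19});duration=(?P<duration>\d{1,19})')

def parse_waveform_uri(value):
    if value == "":                        # empty value: recording failed
        return None
    m = WAVEFORM_RE.fullmatch(value)
    if not m:
        raise ValueError("malformed Waveform-URI value")
    return m.group("uri"), int(m.group("size")), int(m.group("duration"))

print(parse_waveform_uri(
    "<http://web.media.example.com/session123/audio.wav>;size=342456;duration=25435"))
```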
+
+9.4.9. Media-Type
+
+ This header field MAY be specified in the SET-PARAMS, GET-PARAMS, or
+ the RECOGNIZE methods and tells the server resource the media type in
+ which to store captured audio or video, such as the one captured and
+ returned by the Waveform-URI header field.
+
+ media-type = "Media-Type" ":" media-type-value
+ CRLF
+
+9.4.10. Input-Waveform-URI
+
+ This optional header field specifies a URI pointing to audio content
+ to be processed by the RECOGNIZE operation. This enables the client
+ to request recognition from a specified buffer or audio file.
+
+ input-waveform-uri = "Input-Waveform-URI" ":" uri CRLF
+
+9.4.11. Completion-Cause
+
+ This header field MUST be part of a RECOGNITION-COMPLETE event coming
+ from the recognizer resource to the client. It indicates the reason
+ behind the RECOGNIZE method completion. This header field MUST be
+ sent in the DEFINE-GRAMMAR and RECOGNIZE responses, if they return
+ with a failure status and a COMPLETE state. In the ABNF below, the
+ cause-code contains a numerical value selected from the Cause-Code
+ column of the following table. The cause-name contains the
+ corresponding token selected from the Cause-Name column.
+
+ completion-cause = "Completion-Cause" ":" cause-code SP
+ cause-name CRLF
+ cause-code = 3DIGIT
+ cause-name = *VCHAR
+
+
+ +------------+-----------------------+------------------------------+
+ | Cause-Code | Cause-Name | Description |
+ +------------+-----------------------+------------------------------+
+ | 000 | success | RECOGNIZE completed with a |
+ | | | match or DEFINE-GRAMMAR |
+ | | | succeeded in downloading and |
+ | | | compiling the grammar. |
+ | | | |
+ | 001 | no-match | RECOGNIZE completed, but no |
+ | | | match was found. |
+ | | | |
+ | 002 | no-input-timeout | RECOGNIZE completed without |
+ | | | a match due to a |
+ | | | no-input-timeout. |
+ | | | |
+ | 003 | hotword-maxtime | RECOGNIZE in hotword mode |
+ | | | completed without a match |
+ | | | due to a |
+ | | | recognition-timeout. |
+ | | | |
+ | 004 | grammar-load-failure | RECOGNIZE failed due to |
+ | | | grammar load failure. |
+ | | | |
+ | 005 | grammar-compilation- | RECOGNIZE failed due to |
+ | | failure | grammar compilation failure. |
+ | | | |
+ | 006 | recognizer-error | RECOGNIZE request terminated |
+ | | | prematurely due to a |
+ | | | recognizer error. |
+ | | | |
+ | 007 | speech-too-early | RECOGNIZE request terminated |
+ | | | because speech was too |
+ | | | early. This happens when the |
+ | | | audio stream is already |
+ | | | "in-speech" when the |
+ | | | RECOGNIZE request was |
+ | | | received. |
+ | | | |
+ | 008 | success-maxtime | RECOGNIZE request terminated |
+ | | | because speech was too long |
+ | | | but whatever was spoken till |
+ | | | that point was a full match. |
+ | | | |
+ | 009 | uri-failure | Failure accessing a URI. |
+ | | | |
+ | 010 | language-unsupported | Language not supported. |
+ | | | |
+ | 011 | cancelled | A new RECOGNIZE cancelled |
+ | | | this one, or a prior |
+ | | | RECOGNIZE failed while this |
+ | | | one was still in the queue. |
+ | | | |
+ | 012 | semantics-failure | Recognition succeeded, but |
+ | | | semantic interpretation of |
+ | | | the recognized input failed. |
+ | | | The RECOGNITION-COMPLETE |
+ | | | event MUST contain the |
+ | | | Recognition result with only |
+ | | | input text and no |
+ | | | interpretation. |
+ | | | |
+ | 013 | partial-match | Speech Incomplete Timeout |
+ | | | expired before there was a |
+ | | | full match. But whatever was |
+ | | | spoken till that point was a |
+ | | | partial match to one or more |
+ | | | grammars. |
+ | | | |
+ | 014 | partial-match-maxtime | The Recognition-Timeout |
+ | | | expired before full match |
+ | | | was achieved. But whatever |
+ | | | was spoken till that point |
+ | | | was a partial match to one |
+ | | | or more grammars. |
+ | | | |
+ | 015 | no-match-maxtime | The Recognition-Timeout |
+ | | | expired. Whatever was spoken |
+ | | | till that point did not |
+ | | | match any of the grammars. |
+ | | | This cause could also be |
+ | | | returned if the recognizer |
+ | | | does not support detecting |
+ | | | partial grammar matches. |
+ | | | |
+ | 016 | grammar-definition- | Any DEFINE-GRAMMAR error |
+ | | failure | other than |
+ | | | grammar-load-failure and |
+ | | | grammar-compilation-failure. |
+ +------------+-----------------------+------------------------------+
+
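The cause table above can be transcribed into a small non-normative lookup table for building Completion-Cause header field lines; the Python names below are illustrative.

```python
# Illustrative only: Completion-Cause codes and tokens transcribed from the
# table above, plus a helper to render the header field line.
RECOGNIZER_COMPLETION_CAUSES = {
    0: "success", 1: "no-match", 2: "no-input-timeout",
    3: "hotword-maxtime", 4: "grammar-load-failure",
    5: "grammar-compilation-failure", 6: "recognizer-error",
    7: "speech-too-early", 8: "success-maxtime", 9: "uri-failure",
    10: "language-unsupported", 11: "cancelled", 12: "semantics-failure",
    13: "partial-match", 14: "partial-match-maxtime",
    15: "no-match-maxtime", 16: "grammar-definition-failure",
}

def completion_cause_header(code):
    """Render a Completion-Cause line (trailing CRLF omitted for display)."""
    return "Completion-Cause: %03d %s" % (code, RECOGNIZER_COMPLETION_CAUSES[code])

print(completion_cause_header(1))
# -> Completion-Cause: 001 no-match
```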
+
+9.4.12. Completion-Reason
+
+ This header field MAY be specified in a RECOGNITION-COMPLETE event
+ coming from the recognizer resource to the client. This contains the
+ reason text behind the RECOGNIZE request completion. The server uses
+ this header field to communicate text describing the reason for the
+ failure, such as the specific error encountered in parsing a grammar
+ markup.
+
+ The completion reason text is provided for client use in logs and for
+ debugging and instrumentation purposes. Clients MUST NOT interpret
+ the completion reason text.
+
+ completion-reason = "Completion-Reason" ":"
+ quoted-string CRLF
+
+9.4.13. Recognizer-Context-Block
+
+ This header field MAY be sent as part of the SET-PARAMS or GET-PARAMS
+ request. If the GET-PARAMS method contains this header field with no
+ value, then it is a request to the recognizer to return the
+ recognizer context block. The response to such a message MAY contain
+ a recognizer context block as a typed media message body. If the
+ server returns a recognizer context block, the response MUST contain
+ this header field and its value MUST match the Content-ID of the
+ corresponding media block.
+
+ If the SET-PARAMS method contains this header field, it MUST also
+ contain a message body containing the recognizer context data and a
+ Content-ID matching this header field value. This Content-ID MUST
+ match the Content-ID that came with the context data during the
+ GET-PARAMS operation.
+
+ An implementation choosing to use this mechanism to hand off
+ recognizer context data between servers MUST distinguish its
+ implementation-specific block of data by using an IANA-registered
+ content type in the IANA Media Type vendor tree.
+
+ recognizer-context-block = "Recognizer-Context-Block" ":"
+ [1*VCHAR] CRLF
+
+9.4.14. Start-Input-Timers
+
+ This header field MAY be sent as part of the RECOGNIZE request. A
+ value of false tells the recognizer to start recognition but not to
+ start the no-input timer yet. The recognizer MUST NOT start the
+ timers until the client sends a START-INPUT-TIMERS request to the
+ recognizer. This is useful in the scenario when the recognizer and
+ synthesizer engines are not part of the same session. In such
+ configurations, when a kill-on-barge-in prompt is being played (see
+ Section 8.4.2), the client wants the RECOGNIZE request to be
+ simultaneously active so that it can detect and implement kill-on-
+ barge-in. However, the recognizer SHOULD NOT start the no-input
+ timers until the prompt is finished. The default value is "true".
+
+ start-input-timers = "Start-Input-Timers" ":" BOOLEAN CRLF
+
+9.4.15. Speech-Complete-Timeout
+
+ This header field specifies the length of silence required following
+ user speech before the speech recognizer finalizes a result (either
+ accepting it or generating a no-match result). The Speech-Complete-
+ Timeout value applies when the recognizer currently has a complete
+ match against an active grammar, and specifies how long the
+ recognizer MUST wait for more input before declaring a match. By
+ contrast, the Speech-Incomplete-Timeout is used when the speech is an
+ incomplete match to an active grammar. The value is in milliseconds.
+
+ speech-complete-timeout = "Speech-Complete-Timeout" ":" 1*19DIGIT CRLF
+
+ A long Speech-Complete-Timeout value delays the result to the client
+ and therefore makes the application's response to a user slow. A
+ short Speech-Complete-Timeout may lead to an utterance being broken
+ up inappropriately. Reasonable speech complete timeout values are
+ typically in the range of 0.3 seconds to 1.0 seconds. The value for
+ this header field ranges from 0 to an implementation-specific maximum
+ value. The default value for this header field is implementation
+ specific. This header field MAY occur in RECOGNIZE, SET-PARAMS, or
+ GET-PARAMS.
+
+9.4.16. Speech-Incomplete-Timeout
+
+ This header field specifies the required length of silence following
+ user speech after which a recognizer finalizes a result. The
+ incomplete timeout applies when the speech prior to the silence is an
+ incomplete match of all active grammars. In this case, once the
+ timeout is triggered, the partial result is rejected (with a
+ Completion-Cause of "partial-match"). The value is in milliseconds.
+ The value for this header field ranges from 0 to an implementation-
+ specific maximum value. The default value for this header field is
+ implementation specific.
+
+ speech-incomplete-timeout = "Speech-Incomplete-Timeout" ":" 1*19DIGIT
+ CRLF
+
+
+ The Speech-Incomplete-Timeout also applies when the speech prior to
+ the silence is a complete match of an active grammar, but where it is
+ possible to speak further and still match the grammar. By contrast,
+ the Speech-Complete-Timeout is used when the speech is a complete
+ match to an active grammar and no further spoken words can continue
+ to represent a match.
+
+ A long Speech-Incomplete-Timeout value delays the result to the
+ client and therefore makes the application's response to a user slow.
+ A short Speech-Incomplete-Timeout may lead to an utterance being
+ broken up inappropriately.
+
+ The Speech-Incomplete-Timeout is usually longer than the Speech-
+ Complete-Timeout to allow users to pause mid-utterance (for example,
+ to breathe). This header field MAY occur in RECOGNIZE, SET-PARAMS,
+ or GET-PARAMS.
+
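The timer-selection rule in Sections 9.4.15 and 9.4.16 can be sketched non-normatively as follows: which silence timeout applies depends on whether the speech so far is a complete match that cannot be extended, a complete match that can still grow, or an incomplete match. The state names and function are illustrative assumptions.

```python
# Illustrative only: choose between Speech-Complete-Timeout and
# Speech-Incomplete-Timeout based on the current match state.

def silence_timeout(match_state, complete_ms, incomplete_ms):
    """match_state: 'complete-final', 'complete-extendable', or 'incomplete'."""
    if match_state == "complete-final":
        return complete_ms        # Speech-Complete-Timeout applies
    return incomplete_ms          # Speech-Incomplete-Timeout applies

print(silence_timeout("complete-extendable", 500, 1200))
# -> 1200
```

The Speech-Incomplete-Timeout is usually the longer of the two, which matches the guidance above about letting users pause mid-utterance.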
+9.4.17. DTMF-Interdigit-Timeout
+
+ This header field specifies the inter-digit timeout value to use when
+ recognizing DTMF input. The value is in milliseconds. The value for
+ this header field ranges from 0 to an implementation-specific maximum
+ value. The default value is 5 seconds. This header field MAY occur
+ in RECOGNIZE, SET-PARAMS, or GET-PARAMS.
+
+ dtmf-interdigit-timeout = "DTMF-Interdigit-Timeout" ":" 1*19DIGIT CRLF
+
+9.4.18. DTMF-Term-Timeout
+
+ This header field specifies the terminating timeout to use when
+ recognizing DTMF input. The DTMF-Term-Timeout applies only when no
+ additional input is allowed by the grammar; otherwise, the
+ DTMF-Interdigit-Timeout applies. The value is in milliseconds. The
+ value for this header field ranges from 0 to an implementation-
+ specific maximum value. The default value is 10 seconds. This
+ header field MAY occur in RECOGNIZE, SET-PARAMS, or GET-PARAMS.
+
+ dtmf-term-timeout = "DTMF-Term-Timeout" ":" 1*19DIGIT CRLF
+
+9.4.19. DTMF-Term-Char
+
+ This header field specifies the terminating DTMF character for DTMF
+ input recognition. The default value is NULL, which is indicated by
+ an empty header field value. This header field MAY occur in
+ RECOGNIZE, SET-PARAMS, or GET-PARAMS.
+
+ dtmf-term-char = "DTMF-Term-Char" ":" VCHAR CRLF
+
+
+9.4.20. Failed-URI
+
+ When a recognizer needs to fetch or access a URI and the access
+ fails, the server SHOULD provide the failed URI in this header field
+ in the method response, unless there are multiple URI failures, in
+ which case one of the failed URIs MUST be provided in this header
+ field in the method response.
+
+ failed-uri = "Failed-URI" ":" absoluteURI CRLF
+
+9.4.21. Failed-URI-Cause
+
+ When a recognizer method needs a recognizer to fetch or access a URI
+ and the access fails, the server MUST provide the URI-specific or
+ protocol-specific response code for the URI in the Failed-URI header
+ field through this header field in the method response. The value
+ encoding is UTF-8 (RFC 3629 [RFC3629]) to accommodate any access
+ protocol, some of which might have a response string instead of a
+ numeric response code.
+
+ failed-uri-cause = "Failed-URI-Cause" ":" 1*UTFCHAR CRLF
+
+9.4.22. Save-Waveform
+
+ This header field allows the client to request the recognizer
+ resource to save the audio input to the recognizer. The recognizer
+ resource MUST then attempt to record the recognized audio, without
+ endpointing, and make it available to the client in the form of a URI
+ returned in the Waveform-URI header field in the RECOGNITION-COMPLETE
+ event. If there was an error in recording the stream or the audio
+ content is otherwise not available, the recognizer MUST return an
+ empty Waveform-URI header field. The default value for this field is
+ "false". This header field MAY occur in RECOGNIZE, SET-PARAMS, or
+ GET-PARAMS. See the discussion on the sensitivity of saved waveforms
+ in Section 12.
+
+ save-waveform = "Save-Waveform" ":" BOOLEAN CRLF
+
+9.4.23. New-Audio-Channel
+
+ This header field MAY be specified in a RECOGNIZE request and allows
+ the client to tell the server that, from this point on, further input
+ audio comes from a different audio source, channel, or speaker. If
+ the recognizer resource had collected any input statistics or
+ adaptation state, the recognizer resource MUST do what is appropriate
+ for the specific recognition technology, which includes but is not
+ limited to discarding any collected input statistics or adaptation
+ state before starting the RECOGNIZE request. Note that if there are
+ multiple resources that are sharing a media stream and are collecting
+ or using this data, and the client issues this header field to one of
+ the resources, the reset operation applies to all resources that use
+ the shared media stream. This helps in a number of use cases,
+ including where the client wishes to reuse an open recognition
+ session with an existing media session for multiple telephone calls.
+
+ new-audio-channel = "New-Audio-Channel" ":" BOOLEAN
+ CRLF
+
+9.4.24. Speech-Language
+
+ This header field specifies the language of recognition grammar data
+ within a session or request, if it is not specified within the data.
+ The value of this header field MUST follow RFC 5646 [RFC5646] for its
+ values. This MAY occur in DEFINE-GRAMMAR, RECOGNIZE, SET-PARAMS, or
+ GET-PARAMS requests.
+
+ speech-language = "Speech-Language" ":" 1*VCHAR CRLF
+
+9.4.25. Ver-Buffer-Utterance
+
+ This header field lets the client request the server to buffer the
+ utterance associated with this recognition request into a buffer
+ available to a co-resident verifier resource. The buffer is shared
+ across resources within a session and is allocated when a verifier
+ resource is added to this session. The client MUST NOT send this
+ header field unless a verifier resource is instantiated for the
+ session. The buffer is released when the verifier resource is
+ released from the session.
+
+9.4.26. Recognition-Mode
+
+ This header field specifies what mode the RECOGNIZE method will
+ operate in. The value choices are "normal" or "hotword". If the
+ value is "normal", the RECOGNIZE starts matching speech and DTMF to
+ the grammars specified in the RECOGNIZE request. If any portion of
+ the speech does not match the grammar, the RECOGNIZE command
+ completes with a no-match status. Timers may be active to detect
+ speech in the audio (see Section 9.4.14), so the RECOGNIZE method may
+ complete because of a timeout waiting for speech. If the value of
+ this header field is "hotword", the RECOGNIZE method operates in
+ hotword mode, where it only looks for the particular keywords or DTMF
+ sequences specified in the grammar and ignores silence or other
+ speech in the audio stream. The default value for this header field
+ is "normal". This header field MAY occur on the RECOGNIZE method.
+
+ recognition-mode = "Recognition-Mode" ":"
+ "normal" / "hotword" CRLF
+
+9.4.27. Cancel-If-Queue
+
+ This header field specifies what will happen if the client attempts
+ to invoke another RECOGNIZE method when this RECOGNIZE request is
+ already in progress for the resource. The value for this header
+ field is a Boolean. A value of "true" means the server MUST
+ terminate this RECOGNIZE request, with a Completion-Cause of
+ "cancelled", if the client issues another RECOGNIZE request for the
+ same resource. A value of "false" for this header field indicates to
+ the server that this RECOGNIZE request will continue to completion,
+ and if the client issues more RECOGNIZE requests to the same
+ resource, they are queued. When the currently active RECOGNIZE
+ request is stopped or completes with a successful match, the first
+ RECOGNIZE method in the queue becomes active. If the current
+ RECOGNIZE fails, all RECOGNIZE methods in the pending queue are
+ cancelled, and each generates a RECOGNITION-COMPLETE event with a
+ Completion-Cause of "cancelled". This header field MUST be present
+ in every RECOGNIZE request. There is no default value.
+
+ cancel-if-queue = "Cancel-If-Queue" ":" BOOLEAN CRLF
+
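The Cancel-If-Queue behavior above can be modeled non-normatively: the flag travels with the request that is (or becomes) active, and decides whether a later RECOGNIZE cancels it or queues behind it. This is an illustrative model of the rule, not an implementation of a real server.

```python
# Illustrative only: model of the Cancel-If-Queue rule.  A new RECOGNIZE
# cancels the active request if that request carried Cancel-If-Queue: true;
# otherwise the new request is queued behind it.

class RecognizerQueue:
    def __init__(self):
        self.active = None        # (request_id, cancel_if_queue) or None
        self.pending = []

    def recognize(self, request_id, cancel_if_queue):
        """Returns the list of request IDs cancelled by this call."""
        if self.active is None:
            self.active = (request_id, cancel_if_queue)
            return []
        active_id, active_flag = self.active
        if active_flag:                             # active asked to be cancelled
            self.active = (request_id, cancel_if_queue)
            return [active_id]
        self.pending.append((request_id, cancel_if_queue))
        return []                                   # queued, nothing cancelled

q = RecognizerQueue()
q.recognize("r1", True)
print(q.recognize("r2", False))
# -> ['r1']
```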
+9.4.28. Hotword-Max-Duration
+
+ This header field MAY be sent in a hotword mode RECOGNIZE request.
+ It specifies the maximum length of an utterance (in milliseconds)
+ that will be considered for hotword recognition. This header field,
+ along with Hotword-Min-Duration, can be used to tune performance by
+ preventing the recognizer from evaluating utterances that are too
+ short or too long to be one of the hotwords in the grammar(s). The
+ value is in milliseconds. The default is implementation dependent.
+ If present in a RECOGNIZE request specifying a mode other than
+ "hotword", the header field is ignored.
+
+ hotword-max-duration = "Hotword-Max-Duration" ":" 1*19DIGIT
+ CRLF
+
+9.4.29. Hotword-Min-Duration
+
+ This header field MAY be sent in a hotword mode RECOGNIZE request.
+ It specifies the minimum length of an utterance (in milliseconds)
+ that will be considered for hotword recognition. This header field, along
+ with Hotword-Max-Duration, can be used to tune performance by
+ preventing the recognizer from evaluating utterances that are too
+ short or too long to be one of the hotwords in the grammar(s). The
+ value is in milliseconds. The default value is implementation
+ dependent. If present in a RECOGNIZE request specifying a mode other
+ than "hotword", the header field is ignored.
+
+ hotword-min-duration = "Hotword-Min-Duration" ":" 1*19DIGIT CRLF
+
+9.4.30. Interpret-Text
+
+ The value of this header field is used to provide a pointer to the
+ text for which a natural language interpretation is desired. The
+ value is either a URI or text. If the value is a URI, it MUST be a
+ Content-ID that refers to an entity of type 'text/plain' in the body
+ of the message. Otherwise, the server MUST treat the value as the
+ text to be interpreted. This header field MUST be used when invoking
+ the INTERPRET method.
+
+ interpret-text = "Interpret-Text" ":" 1*VCHAR CRLF
+
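A non-normative sketch of resolving an Interpret-Text value: a value matching the Content-ID of a 'text/plain' body part refers to that part; any other value is treated as the literal text to interpret. The Content-ID syntax and sample text below are illustrative assumptions, not taken from this specification.

```python
# Illustrative only: resolve an Interpret-Text value against the message's
# text/plain body parts, falling back to the literal value.

def resolve_interpret_text(value, body_parts):
    """body_parts: dict mapping Content-ID -> text/plain content."""
    if value in body_parts:
        return body_parts[value]       # value was a Content-ID reference
    return value                       # value is the text itself

parts = {"<text1@server.example.com>": "call my office"}
print(resolve_interpret_text("<text1@server.example.com>", parts))
# -> call my office
```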
+9.4.31. DTMF-Buffer-Time
+
+ This header field MAY be specified in a GET-PARAMS or SET-PARAMS
+ method and is used to specify the amount of time, in milliseconds, of
+ the type-ahead buffer for the recognizer. This is the buffer that
+ collects DTMF digits as they are pressed even when there is no
+ RECOGNIZE command active. When a subsequent RECOGNIZE method is
+ received, it MUST look to this buffer to match the RECOGNIZE request.
+ If the digits in the buffer are not sufficient, then it can continue
+ to listen to more digits to match the grammar. The default size of
+ this DTMF buffer is platform specific.
+
+ dtmf-buffer-time = "DTMF-Buffer-Time" ":" 1*19DIGIT CRLF
+
+9.4.32. Clear-DTMF-Buffer
+
+ This header field MAY be specified in a RECOGNIZE method and is used
+ to tell the recognizer to clear the DTMF type-ahead buffer before
+ starting the RECOGNIZE. The default value of this header field is
+ "false", which does not clear the type-ahead buffer before starting
+ the RECOGNIZE method. If this header field is specified to be
+ "true", then the RECOGNIZE will clear the DTMF buffer before starting
+ recognition. This means digits pressed by the caller before the
+ RECOGNIZE command was issued are discarded.
+
+ clear-dtmf-buffer = "Clear-DTMF-Buffer" ":" BOOLEAN CRLF
+
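The DTMF type-ahead behavior of Sections 9.4.31 and 9.4.32 can be sketched non-normatively: digits collected while no RECOGNIZE is active are consulted first when one arrives, unless Clear-DTMF-Buffer is "true". Class and method names are illustrative.

```python
# Illustrative only: a DTMF type-ahead buffer.  Digits accumulate even with
# no RECOGNIZE active; a new RECOGNIZE consults the buffer first, optionally
# clearing it when Clear-DTMF-Buffer: true is specified.
from collections import deque

class DtmfBuffer:
    def __init__(self):
        self.digits = deque()

    def press(self, digit):
        self.digits.append(digit)      # buffered even with no RECOGNIZE active

    def start_recognize(self, clear_dtmf_buffer=False):
        if clear_dtmf_buffer:          # discard digits pressed before RECOGNIZE
            self.digits.clear()
        return "".join(self.digits)    # buffered digits are matched first

buf = DtmfBuffer()
for d in "12":
    buf.press(d)
print(buf.start_recognize())
# -> 12
```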
+
+9.4.33. Early-No-Match
+
+ This header field MAY be specified in a RECOGNIZE method and is used
+ to tell the recognizer that it MUST NOT wait for the end of speech
+ before processing the collected speech to match active grammars. A
+ value of "true" indicates the recognizer MUST do early matching. The
+ default value for this header field if not specified is "false". If
+ the recognizer does not support the processing of the collected audio
+ before the end of speech, this header field can be safely ignored.
+
+ early-no-match = "Early-No-Match" ":" BOOLEAN CRLF
+
+9.4.34. Num-Min-Consistent-Pronunciations
+
+ This header field MAY be specified in a START-PHRASE-ENROLLMENT,
+ SET-PARAMS, or GET-PARAMS method and is used to specify the minimum
+ number of consistent pronunciations that must be obtained to voice
+ enroll a new phrase. The minimum value is 1. The default value is
+ implementation specific and MAY be greater than 1.
+
+ num-min-consistent-pronunciations =
+ "Num-Min-Consistent-Pronunciations" ":" 1*19DIGIT CRLF
+
+9.4.35. Consistency-Threshold
+
+ This header field MAY be sent as part of the START-PHRASE-ENROLLMENT,
+ SET-PARAMS, or GET-PARAMS method. Used during voice enrollment, this
+ header field specifies how similar to a previously enrolled
+ pronunciation of the same phrase an utterance needs to be in order to
+ be considered "consistent". The higher the threshold, the closer the
+ match between an utterance and previous pronunciations must be for
+ the pronunciation to be considered consistent. The range for this
+ threshold is a float value between 0.0 and 1.0. The default value
+ for this header field is implementation specific.
+
+ consistency-threshold = "Consistency-Threshold" ":" FLOAT CRLF
+
+9.4.36. Clash-Threshold
+
+ This header field MAY be sent as part of the START-PHRASE-ENROLLMENT,
+ SET-PARAMS, or GET-PARAMS method. Used during voice enrollment, this
+ header field specifies how similar the pronunciations of two
+ different phrases can be before they are considered to be clashing.
+ For example, pronunciations of phrases such as "John Smith" and "Jon
+ Smits" may be so similar that they are difficult to distinguish
+ correctly. A smaller threshold reduces the number of clashes
+ detected. The range for this threshold is a float value between 0.0
+ and 1.0. The default value for this header field is implementation
+ specific. Clash testing can be turned off completely by setting the
+ Clash-Threshold header field value to 0.
+
+ clash-threshold = "Clash-Threshold" ":" FLOAT CRLF
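Both Consistency-Threshold and Clash-Threshold are FLOAT header fields constrained to the range 0.0 to 1.0. The following non-normative Python sketch shows a client-side helper that range-checks such a threshold before placing it on the wire; the helper name is hypothetical and not part of this specification.

```python
# Hypothetical client-side helper (not part of RFC 6787): range-check a
# FLOAT-valued enrollment threshold and format it as a header line.

def threshold_header(name, value):
    if name not in ("Consistency-Threshold", "Clash-Threshold"):
        raise ValueError("not a threshold header field: " + name)
    if not 0.0 <= value <= 1.0:
        raise ValueError("%s must be within 0.0-1.0, got %r" % (name, value))
    # A Clash-Threshold value of 0 turns clash testing off entirely.
    return "%s:%s" % (name, value)

print(threshold_header("Clash-Threshold", 0.0))
print(threshold_header("Consistency-Threshold", 0.75))
```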
+
+9.4.37. Personal-Grammar-URI
+
+ This header field specifies the speaker-trained grammar to be used or
+ referenced during enrollment operations. Phrases are added to this
+ grammar during enrollment. For example, a contact list for user
+ "Jeff" could be stored at the Personal-Grammar-URI
+ "http://myserver.example.com/myenrollmentdb/jeff-list". The
+ generated grammar syntax MAY be implementation specific. There is no
+ default value for this header field. This header field MAY be sent
+ as part of the START-PHRASE-ENROLLMENT, SET-PARAMS, or GET-PARAMS
+ method.
+
+ personal-grammar-uri = "Personal-Grammar-URI" ":" uri CRLF
+
+9.4.38. Enroll-Utterance
+
+ This header field MAY be specified in the RECOGNIZE method. If this
+ header field is set to "true" and an enrollment session is active,
+ the RECOGNIZE command MUST add the collected utterance to the personal
+ grammar that is being enrolled. The way in which this occurs is
+ engine specific and may be an area of future standardization. The
+ default value for this header field is "false".
+
+ enroll-utterance = "Enroll-Utterance" ":" BOOLEAN CRLF
+
+9.4.39. Phrase-Id
+
+ This header field in a request identifies a phrase in an existing
+ personal grammar for which enrollment is desired. It is also
+ returned to the client in the RECOGNIZE complete event. This header
+ field MAY occur in START-PHRASE-ENROLLMENT, MODIFY-PHRASE, or DELETE-
+ PHRASE requests. There is no default value for this header field.
+
+ phrase-id = "Phrase-ID" ":" 1*VCHAR CRLF
+
+
+
+
+
+
+
+
+
+
+
+
+
+9.4.40. Phrase-NL
+
+ This string specifies the interpreted text to be returned when the
+ phrase is recognized. This header field MAY occur in START-PHRASE-
+ ENROLLMENT and MODIFY-PHRASE requests. There is no default value for
+ this header field.
+
+ phrase-nl = "Phrase-NL" ":" 1*UTFCHAR CRLF
+
+9.4.41. Weight
+
+ The value of this header field represents the occurrence likelihood
+ of a phrase in an enrolled grammar. When using grammar enrollment,
+ the system is essentially constructing a grammar segment consisting
+ of a list of possible match phrases. This can be thought of as
+ similar to the dynamic construction of a <one-of> tag in the W3C
+ grammar specification. Each enrolled phrase becomes an item in the
+ list that can be matched against spoken input, similar to an <item>
+ within a <one-of> list. This header field allows the client to
+ assign a weight to the phrase (i.e., the <item> entry) in the
+ <one-of> list that is enrolled. Grammar weights are normalized to a
+ sum of one at
+ grammar compilation time, so a weight value of 1 for each phrase in
+ an enrolled grammar list indicates all items in that list have the
+ same weight. This header field MAY occur in START-PHRASE-ENROLLMENT
+ and MODIFY-PHRASE requests. The default value for this header field
+ is implementation specific.
+
+ weight = "Weight" ":" FLOAT CRLF
+
+9.4.42. Save-Best-Waveform
+
+ This header field allows the client to request the recognizer
+ resource to save the audio stream for the best repetition of the
+ phrase that was used during the enrollment session. The recognizer
+ MUST attempt to record the recognized audio and make it available to
+ the client in the form of a URI returned in the Waveform-URI header
+ field in the response to the END-PHRASE-ENROLLMENT method. If there
+ was an error in recording the stream or the audio data is otherwise
+ not available, the recognizer MUST return an empty Waveform-URI
+ header field. This header field MAY occur in the START-PHRASE-
+ ENROLLMENT, SET-PARAMS, and GET-PARAMS methods.
+
+ save-best-waveform = "Save-Best-Waveform" ":" BOOLEAN CRLF
+
+
+
+
+
+
+
+
+
+
+9.4.43. New-Phrase-Id
+
+ This header field replaces the ID used to identify the phrase in a
+ personal grammar. The recognizer returns the new ID when using an
+ enrollment grammar. This header field MAY occur in MODIFY-PHRASE
+ requests.
+
+ new-phrase-id = "New-Phrase-ID" ":" 1*VCHAR CRLF
+
+9.4.44. Confusable-Phrases-URI
+
+ This header field specifies a grammar that defines invalid phrases
+ for enrollment. For example, typical applications do not allow an
+ enrolled phrase that is also a command word. This header field MAY
+ occur in RECOGNIZE requests that are part of an enrollment session.
+
+ confusable-phrases-uri = "Confusable-Phrases-URI" ":" uri CRLF
+
+9.4.45. Abort-Phrase-Enrollment
+
+ This header field MAY be specified in the END-PHRASE-ENROLLMENT
+ method to abort the phrase enrollment, rather than committing the
+ phrase to the personal grammar.
+
+ abort-phrase-enrollment = "Abort-Phrase-Enrollment" ":"
+ BOOLEAN CRLF
+
+9.5. Recognizer Message Body
+
+ A recognizer message can carry additional data associated with the
+ request, response, or event. The client MAY provide the grammar to
+ be recognized in DEFINE-GRAMMAR or RECOGNIZE requests. When one or
+ more grammars are specified using the DEFINE-GRAMMAR method, the
+ server MUST attempt to fetch, compile, and optimize the grammar
+ before returning a response to the DEFINE-GRAMMAR method. A
+ RECOGNIZE request MUST completely specify the grammars to be active
+ during the recognition operation, except when the RECOGNIZE method is
+ being used to enroll a grammar. During grammar enrollment, such
+ grammars are OPTIONAL. The server resource sends the recognition
+ results in the RECOGNITION-COMPLETE event and the GET-RESULT
+ response. Grammars and recognition results are carried in the
+ message body of the corresponding MRCPv2 messages.
+
+9.5.1. Recognizer Grammar Data
+
+ Recognizer grammar data from the client to the server can be provided
+ inline or by reference. Either way, grammar data is carried as typed
+ media entities in the message body of the RECOGNIZE or DEFINE-GRAMMAR
+
+
+
+
+
+ request. All MRCPv2 servers MUST accept grammars in the XML form
+ (media type 'application/srgs+xml') of the W3C's XML-based Speech
+ Grammar Markup Format (SRGS) [W3C.REC-speech-grammar-20040316] and
+ MAY accept grammars in other formats. Examples include but are not
+ limited to:
+
+ o the ABNF form (media type 'application/srgs') of SRGS
+
+ o Sun's Java Speech Grammar Format (JSGF)
+ [refs.javaSpeechGrammarFormat]
+
+ Additionally, MRCPv2 servers MAY support the Semantic Interpretation
+ for Speech Recognition (SISR)
+ [W3C.REC-semantic-interpretation-20070405] specification.
+
+ When a grammar is specified inline in the request, the client MUST
+ provide a Content-ID for that grammar as part of the content header
+ fields. If there is no space on the server to store the inline
+ grammar, the server MUST return a Completion-Cause code of 016
+ "grammar-definition-failure". Otherwise, the server MUST associate
+ the inline grammar block with that Content-ID and MUST store it on
+ the server for the duration of the session. However, if the
+ Content-ID is redefined later in the session through a subsequent
+ DEFINE-GRAMMAR, the inline grammar previously associated with the
+ Content-ID MUST be freed. If the Content-ID is redefined through a
+ subsequent DEFINE-GRAMMAR with an empty message body (i.e., no
+ grammar definition), then in addition to freeing any grammar
+ previously associated with the Content-ID, the server MUST clear all
+ bindings and associations to the Content-ID. Unless and until
+ subsequently redefined, this URI MUST be interpreted by the server as
+ one that has never been set.
+
+ Grammars that have been associated with a Content-ID can be
+ referenced through the 'session' URI scheme (see Section 13.6). For
+ example:
+ session:help@root-level.store
+
+ Grammar data MAY be specified using external URI references. To do
+ so, the client uses a body of media type 'text/uri-list' (see RFC
+ 2483 [RFC2483]) to list the one or more URIs that point to the
+ grammar data. The client can use a body of media type 'text/
+ grammar-ref-list' (see Section 13.5.1) if it wants to assign weights
+ to the list of grammar URIs. All MRCPv2 servers MUST support grammar
+ access using the 'http' and 'https' URI schemes.
+
+ If the grammar data the client wishes to be used on a request
+ consists of a mix of URI and inline grammar data, the client uses the
+ 'multipart/mixed' media type to enclose the 'text/uri-list',
+
+
+
+
+
+ 'application/srgs', or 'application/srgs+xml' content entities. The
+ character set and encoding used in the grammar data are specified
+ according to the standard media type definitions.
+
+ When more than one grammar URI or inline grammar block is specified
+ in a message body of the RECOGNIZE request, the server interprets
+ this as a list of grammar alternatives to match against.
+
+ Content-Type:application/srgs+xml
+ Content-ID:<request1@form-level.store>
+ Content-Length:...
+
+ <?xml version="1.0"?>
+
+ <!-- the default grammar language is US English -->
+ <grammar xmlns="http://www.w3.org/2001/06/grammar"
+ xml:lang="en-US" version="1.0" root="request">
+
+ <!-- single language attachment to tokens -->
+ <rule id="yes">
+ <one-of>
+ <item xml:lang="fr-CA">oui</item>
+ <item xml:lang="en-US">yes</item>
+ </one-of>
+ </rule>
+
+ <!-- single language attachment to a rule expansion -->
+ <rule id="request">
+ may I speak to
+ <one-of xml:lang="fr-CA">
+ <item>Michel Tremblay</item>
+ <item>Andre Roy</item>
+ </one-of>
+ </rule>
+
+ <!-- multiple language attachment to a token -->
+ <rule id="people1">
+ <token lexicon="en-US,fr-CA"> Robert </token>
+ </rule>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ <!-- the equivalent single-language attachment expansion -->
+ <rule id="people2">
+ <one-of>
+ <item xml:lang="en-US">Robert</item>
+ <item xml:lang="fr-CA">Robert</item>
+ </one-of>
+ </rule>
+
+ </grammar>
+
+ SRGS Grammar Example
+
+
+ Content-Type:text/uri-list
+ Content-Length:...
+
+ session:help@root-level.store
+ http://www.example.com/Directory-Name-List.grxml
+ http://www.example.com/Department-List.grxml
+ http://www.example.com/TAC-Contact-List.grxml
+ session:menu1@menu-level.store
+
+ Grammar Reference Example
+
+
+ Content-Type:multipart/mixed; boundary="break"
+
+ --break
+ Content-Type:text/uri-list
+ Content-Length:...
+
+ http://www.example.com/Directory-Name-List.grxml
+ http://www.example.com/Department-List.grxml
+ http://www.example.com/TAC-Contact-List.grxml
+
+ --break
+ Content-Type:application/srgs+xml
+ Content-ID:<request1@form-level.store>
+ Content-Length:...
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ <?xml version="1.0"?>
+
+ <!-- the default grammar language is US English -->
+ <grammar xmlns="http://www.w3.org/2001/06/grammar"
+ xml:lang="en-US" version="1.0">
+
+ <!-- single language attachment to tokens -->
+ <rule id="yes">
+ <one-of>
+ <item xml:lang="fr-CA">oui</item>
+ <item xml:lang="en-US">yes</item>
+ </one-of>
+ </rule>
+
+ <!-- single language attachment to a rule expansion -->
+ <rule id="request">
+ may I speak to
+ <one-of xml:lang="fr-CA">
+ <item>Michel Tremblay</item>
+ <item>Andre Roy</item>
+ </one-of>
+ </rule>
+
+ <!-- multiple language attachment to a token -->
+ <rule id="people1">
+ <token lexicon="en-US,fr-CA"> Robert </token>
+ </rule>
+
+ <!-- the equivalent single-language attachment expansion -->
+ <rule id="people2">
+ <one-of>
+ <item xml:lang="en-US">Robert</item>
+ <item xml:lang="fr-CA">Robert</item>
+ </one-of>
+ </rule>
+
+ </grammar>
+ --break--
+
+ Mixed Grammar Reference Example
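As a non-normative illustration, a mixed body like the example above can be assembled programmatically. The Python sketch below shows one way a client library might combine a 'text/uri-list' part and an inline 'application/srgs+xml' part; the boundary string, Content-ID, and grammar text are placeholders, and a real client would compute exact byte lengths for Content-Length as done here.

```python
# Non-normative sketch: assemble a 'multipart/mixed' recognizer grammar
# body from external grammar URIs plus one inline SRGS grammar. The
# boundary, Content-ID, and grammar text below are placeholders.

def build_mixed_body(uris, inline_grammar, content_id, boundary="break"):
    uri_list = "\r\n".join(uris)
    parts = [
        "--%s\r\nContent-Type:text/uri-list\r\n"
        "Content-Length:%d\r\n\r\n%s" % (boundary, len(uri_list), uri_list),
        "--%s\r\nContent-Type:application/srgs+xml\r\n"
        "Content-ID:<%s>\r\nContent-Length:%d\r\n\r\n%s"
        % (boundary, content_id, len(inline_grammar), inline_grammar),
    ]
    return "\r\n".join(parts) + "\r\n--%s--" % boundary

body = build_mixed_body(
    ["http://www.example.com/Directory-Name-List.grxml",
     "http://www.example.com/Department-List.grxml"],
    '<grammar xmlns="http://www.w3.org/2001/06/grammar"/>',
    "request1@form-level.store")
```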
+
+9.5.2. Recognizer Result Data
+
+ Recognition results are returned to the client in the message body of
+ the RECOGNITION-COMPLETE event or the GET-RESULT response message as
+ described in Section 6.3. Element and attribute descriptions for the
+ recognition portion of the NLSML format are provided in Section 9.6
+ with a normative definition of the schema in Section 16.1.
+
+
+
+
+
+ Content-Type:application/nlsml+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
+ xmlns:ex="http://www.example.com/example"
+ grammar="http://www.example.com/theYesNoGrammar">
+ <interpretation>
+ <instance>
+ <ex:response>yes</ex:response>
+ </instance>
+ <input>OK</input>
+ </interpretation>
+ </result>
+
+ Result Example
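The NLSML result above can be consumed with any namespace-aware XML parser. The following non-normative Python sketch extracts the interpreted value and the raw input from that exact document; note that the <instance> child elements live in the application's own namespace (here, the example.com namespace), not the MRCPv2 one.

```python
# Non-normative sketch: parse the NLSML "Result Example" with Python's
# ElementTree, honoring both the MRCPv2 and application namespaces.

import xml.etree.ElementTree as ET

NLSML = "urn:ietf:params:xml:ns:mrcpv2"
EX = "http://www.example.com/example"

doc = """<?xml version="1.0"?>
<result xmlns="urn:ietf:params:xml:ns:mrcpv2"
        xmlns:ex="http://www.example.com/example"
        grammar="http://www.example.com/theYesNoGrammar">
  <interpretation>
    <instance><ex:response>yes</ex:response></instance>
    <input>OK</input>
  </interpretation>
</result>"""

root = ET.fromstring(doc)
interp = root.find("{%s}interpretation" % NLSML)
response = interp.find("{%s}instance/{%s}response" % (NLSML, EX)).text
heard = interp.find("{%s}input" % NLSML).text
print(response, heard)  # yes OK
```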
+
+9.5.3. Enrollment Result Data
+
+ Enrollment results are returned to the client in the message body of
+ the RECOGNITION-COMPLETE event as described in Section 6.3. Element
+ and attribute descriptions for the enrollment portion of the NLSML
+ format are provided in Section 9.7 with a normative definition of the
+ schema in Section 16.2.
+
+9.5.4. Recognizer Context Block
+
+ When a client changes servers while operating on the behalf of the
+ same incoming communication session, this header field allows the
+ client to collect a block of opaque data from one server and provide
+ it to another server. This capability is desirable if the client
+ needs different language support or if the server issued a
+ redirect. Here, the first recognizer resource may have collected
+ acoustic and other data during its execution of recognition methods.
+ After a server switch, communicating this data may allow the
+ recognizer resource on the new server to provide better recognition.
+ This block of data is implementation specific and MUST be carried as
+ media type 'application/octet-stream' in the body of the message.
+
+ This block of data is communicated in the SET-PARAMS and GET-PARAMS
+ method/response messages. In the GET-PARAMS method, if an empty
+ Recognizer-Context-Block header field is present, then the recognizer
+ SHOULD return its vendor-specific context block, if any, in the
+ message body as an entity of media type 'application/octet-stream'
+ with a specific Content-ID. The Content-ID value MUST also be
+ specified in
+ the Recognizer-Context-Block header field in the GET-PARAMS response.
+ The SET-PARAMS request wishing to provide this vendor-specific data
+ MUST send it in the message body as a typed entity with the same
+
+
+
+
+
+ Content-ID that it received from the GET-PARAMS. The Content-ID MUST
+ also be sent in the Recognizer-Context-Block header field of the
+ SET-PARAMS message.
+
+ Each speech recognition implementation choosing to use this mechanism
+ to hand off recognizer context data among servers MUST distinguish
+ its implementation-specific block of data from other implementations
+ by choosing a Content-ID that is recognizable among the participating
+ servers and unlikely to collide with values chosen by another
+ implementation.
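As a non-normative example of the collision-avoidance guidance above, an implementation might derive its context-block Content-ID from a vendor-controlled domain plus a random token. The domain and function name below are placeholders, not values mandated by this specification.

```python
# Hypothetical scheme (not mandated by RFC 6787): build a Content-ID for
# the recognizer context block that is recognizable to this vendor's
# servers and unlikely to collide with other implementations.

import uuid

def context_block_id(vendor_domain="asr.example.com"):
    # uuid4().hex gives 32 random lowercase hex characters.
    return "%s@%s" % (uuid.uuid4().hex, vendor_domain)

cid = context_block_id()
print(cid)
```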
+
+9.6. Recognizer Results
+
+ The recognizer portion of NLSML (see Section 6.3.1) represents
+ information automatically extracted from a user's utterances by a
+ semantic interpretation component, where "utterance" is to be taken
+ in the general sense of a meaningful user input in any modality
+ supported by the MRCPv2 implementation.
+
+9.6.1. Markup Functions
+
+ MRCPv2 recognizer resources employ the Natural Language Semantics
+ Markup Language (NLSML) to interpret natural language speech input
+ and to format the interpretation for consumption by an MRCPv2 client.
+
+ The elements of the markup fall into the following general functional
+ categories: interpretation, side information, and multi-modal
+ integration.
+
+9.6.1.1. Interpretation
+
+ Elements and attributes represent the semantics of a user's
+ utterance, including the <result>, <interpretation>, and <instance>
+ elements. The <result> element contains the full result of
+ processing one utterance. It MAY contain multiple <interpretation>
+ elements if the interpretation of the utterance results in multiple
+ alternative meanings due to uncertainty in speech recognition or
+ natural language understanding. There are at least two reasons for
+ providing multiple interpretations:
+
+ 1. The client application might have additional information, for
+ example, information from a database, that would allow it to
+ select a preferred interpretation from among the possible
+ interpretations returned from the semantic interpreter.
+
+
+
+
+
+
+
+
+
+ 2. A client-based dialog manager (e.g., VoiceXML
+ [W3C.REC-voicexml20-20040316]) that was unable to select between
+ several competing interpretations could use this information to
+ go back to the user and find out what was intended. For example,
+ it could issue a SPEAK request to a synthesizer resource to emit
+ "Did you say 'Boston' or 'Austin'?"
+
+9.6.1.2. Side Information
+
+ These are elements and attributes representing additional information
+ about the interpretation, over and above the interpretation itself.
+ Side information includes:
+
+ 1. Whether an interpretation was achieved (the <nomatch> element)
+ and the system's confidence in an interpretation (the
+ "confidence" attribute of <interpretation>).
+
+ 2. Alternative interpretations (<interpretation>)
+
+ 3. Input formats and Automatic Speech Recognition (ASR) information:
+ the <input> element, representing the input to the semantic
+ interpreter.
+
+9.6.1.3. Multi-Modal Integration
+
+ When more than one modality is available for input, the
+ interpretation of the inputs needs to be coordinated. The "mode"
+ attribute of <input> supports this by indicating whether the
+ utterance was input by speech, DTMF, pointing, etc. The "timestamp-
+ start" and "timestamp-end" attributes of <input> also provide for
+ temporal coordination by indicating when inputs occurred.
+
+9.6.2. Overview of Recognizer Result Elements and Their Relationships
+
+ The recognizer elements in NLSML fall into two categories:
+
+ 1. description of the input that was processed, and
+
+ 2. description of the meaning that was extracted from the input.
+
+ The attributes of each element are described below. In addition,
+ some elements can contain multiple instances of other elements. For
+ example, a
+ <result> can contain multiple <interpretation> elements, each of
+ which is taken to be an alternative. Similarly, <input> can contain
+ multiple child <input> elements, which are taken to be cumulative.
+ To illustrate the basic usage of these elements, as a simple example,
+
+
+
+
+
+
+
+ consider the utterance "OK" (interpreted as "yes"). The example
+ illustrates how that utterance and its interpretation would be
+ represented in the NLSML markup.
+
+ <?xml version="1.0"?>
+ <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
+ xmlns:ex="http://www.example.com/example"
+ grammar="http://www.example.com/theYesNoGrammar">
+ <interpretation>
+ <instance>
+ <ex:response>yes</ex:response>
+ </instance>
+ <input>OK</input>
+ </interpretation>
+ </result>
+
+ This example includes only the minimum required information. There
+ is an overall <result> element, which includes one interpretation and
+ an input element. The interpretation contains the application-
+ specific element "<response>", which is the semantically interpreted
+ result.
+
+9.6.3. Elements and Attributes
+
+9.6.3.1. <result> Root Element
+
+ The root element of the markup is <result>. The <result> element
+ includes one or more <interpretation> elements. Multiple
+ interpretations can result from ambiguities in the input or in the
+ semantic interpretation. If the "grammar" attribute does not apply
+ to all of the interpretations in the result, it can be overridden for
+ individual interpretations at the <interpretation> level.
+
+ Attributes:
+
+ 1. grammar: The grammar or recognition rule matched by this result.
+ The format of the grammar attribute will match the rule reference
+ semantics defined in the grammar specification. Specifically,
+ the rule reference is in the external XML form for grammar rule
+ references. The markup interpreter needs to know the grammar
+ rule that is matched by the utterance because multiple rules may
+ be simultaneously active. The value is the grammar URI used by
+ the markup interpreter to specify the grammar. The grammar can
+ be overridden by a grammar attribute in the <interpretation>
+ element if the input was ambiguous as to which grammar it
+ matched. If all interpretation elements within the result
+ element contain their own grammar attributes, the attribute can
+ be dropped from the result element.
+
+
+
+
+
+ <?xml version="1.0"?>
+ <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
+ grammar="http://www.example.com/grammar">
+ <interpretation>
+ ....
+ </interpretation>
+ </result>
+
+9.6.3.2. <interpretation> Element
+
+ An <interpretation> element contains a single semantic
+ interpretation.
+
+ Attributes:
+
+ 1. confidence: A float value from 0.0-1.0 indicating the semantic
+ analyzer's confidence in this interpretation. A value of 1.0
+ indicates maximum confidence. The values are implementation
+ dependent but are intended to align with the value interpretation
+ for the confidence MRCPv2 header field defined in Section 9.4.1.
+ This attribute is OPTIONAL.
+
+ 2. grammar: The grammar or recognition rule matched by this
+ interpretation (if needed to override the grammar specification
+ at the <result> level). This attribute is only needed
+ under <interpretation> if it is necessary to override a grammar
+ that was defined at the <result> level. Note that the grammar
+ attribute for the interpretation element is optional if and only
+ if the grammar attribute is specified in the <result> element.
+
+ Interpretations MUST be sorted best-first by some measure of
+ "goodness". The goodness measure is "confidence" if present;
+ otherwise, it is some implementation-specific indication of quality.
+
+ The grammar is expected to be specified most frequently at the
+ <result> level. However, it can be overridden at the
+ <interpretation> level because it is possible that different
+ interpretations may match different grammar rules.
+
+ The <interpretation> element includes an optional <input> element
+ containing the input being analyzed, and at least one <instance>
+ element containing the interpretation of the utterance.
+
+ <interpretation confidence="0.75"
+ grammar="http://www.example.com/grammar">
+ ...
+ </interpretation>
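The best-first ordering requirement can be illustrated with a small non-normative sketch. Here each interpretation is modeled as a dictionary; ranking an absent confidence below any explicit value is one implementation-specific choice, not a rule of this specification.

```python
# Non-normative sketch: order interpretations best-first, using the
# "confidence" attribute when present and treating a missing confidence
# as the lowest rank (an implementation-specific choice).

def best_first(interps):
    return sorted(interps,
                  key=lambda i: i.get("confidence", -1.0),
                  reverse=True)

ordered = best_first([{"text": "Austin", "confidence": 0.6},
                      {"text": "Boston", "confidence": 0.9},
                      {"text": "Houston"}])
print([i["text"] for i in ordered])  # ['Boston', 'Austin', 'Houston']
```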
+
+
+
+
+
+
+9.6.3.3. <instance> Element
+
+ The <instance> element contains the interpretation of the utterance.
+ When the Semantic Interpretation for Speech Recognition format is
+ used, the <instance> element contains the XML serialization of the
+ result using the approach defined in that specification. When there
+ is semantic markup in the grammar that does not create semantic
+ objects, but instead only does a semantic translation of a portion of
+ the input, such as translating "coke" to "coca-cola", the instance
+ contains the whole input but with the translation applied. The NLSML
+ looks like the markup in Figure 2 below. If there are no semantic
+ objects created, nor any semantic translation, the instance value is
+ the same as the input value.
+
+ Attributes:
+
+ 1. confidence: Each element of the instance MAY have a confidence
+ attribute, defined in the NLSML namespace. The confidence
+ attribute contains a float value in the range from 0.0-1.0
+ reflecting the system's confidence in the analysis of that slot.
+ A value of 1.0 indicates maximum confidence. The values are
+ implementation dependent, but are intended to align with the
+ value interpretation for the MRCPv2 header field Confidence-
+ Threshold defined in Section 9.4.1. This attribute is OPTIONAL.
+
+ <instance>
+ <nameAddress>
+ <street confidence="0.75">123 Maple Street</street>
+ <city>Mill Valley</city>
+ <state>CA</state>
+ <zip>90952</zip>
+ </nameAddress>
+ </instance>
+ <input>
+ My address is 123 Maple Street,
+ Mill Valley, California, 90952
+ </input>
+
+
+ <instance>
+ I would like to buy a coca-cola
+ </instance>
+ <input>
+ I would like to buy a coke
+ </input>
+
+ Figure 2: NLSML Example
+
+
+
+
+
+
+9.6.3.4. <input> Element
+
+ The <input> element is the text representation of a user's input. It
+ includes an optional "confidence" attribute, which indicates the
+ recognizer's confidence in the recognition result (as opposed to the
+ confidence in the interpretation, which is indicated by the
+ "confidence" attribute of <interpretation>). Optional "timestamp-
+ start" and "timestamp-end" attributes indicate the start and end
+ times of a spoken utterance, in ISO 8601 format [ISO.8601.1988].
+
+ Attributes:
+
+ 1. timestamp-start: The time at which the input began. (optional)
+
+ 2. timestamp-end: The time at which the input ended. (optional)
+
+ 3. mode: The modality of the input, for example, speech, DTMF, etc.
+ (optional)
+
+ 4. confidence: The confidence of the recognizer in the correctness
+ of the input in the range 0.0 to 1.0. (optional)
+
+ Note that it may not make sense for temporally overlapping inputs to
+ have the same mode; however, this constraint is not expected to be
+ enforced by implementations.
+
+ When there is no time zone designator, ISO 8601 time representations
+ default to local time.
+
+ There are three possible formats for the <input> element.
+
+ 1. The <input> element can contain simple text:
+
+ <input>onions</input>
+
+ A future possibility is for <input> to contain not only text but
+ additional markup that represents prosodic information that was
+ contained in the original utterance and extracted by the speech
+ recognizer. This depends on the availability of ASRs that are
+ capable of producing prosodic information. MRCPv2 clients MUST
+ be prepared to receive such markup and MAY make use of it.
+
+ 2. An <input> tag can also contain additional <input> tags. Having
+ additional input elements allows the representation to support
+ future multi-modal inputs as well as finer-grained speech
+ information, such as timestamps for individual words and word-
+ level confidences.
+
+
+
+
+
+
+ <input>
+ <input mode="speech" confidence="0.5"
+ timestamp-start="2000-04-03T0:00:00"
+ timestamp-end="2000-04-03T0:00:00.2">fried</input>
+ <input mode="speech" confidence="1.0"
+ timestamp-start="2000-04-03T0:00:00.25"
+ timestamp-end="2000-04-03T0:00:00.6">onions</input>
+ </input>
+
+ 3. Finally, the <input> element can contain <nomatch> and <noinput>
+ elements, which describe situations in which the speech
+ recognizer received input that it was unable to process or did
+ not receive any input at all, respectively.
+
+9.6.3.5. <nomatch> Element
+
+ The <nomatch> element under <input> is used to indicate that the
+ semantic interpreter was unable to successfully match any input with
+ confidence above the threshold. It can optionally contain the text
+ of the best of the (rejected) matches.
+
+ <interpretation>
+ <instance/>
+ <input confidence="0.1">
+ <nomatch/>
+ </input>
+ </interpretation>
+ <interpretation>
+ <instance/>
+ <input mode="speech" confidence="0.1">
+ <nomatch>I want to go to New York</nomatch>
+ </input>
+ </interpretation>
+
+9.6.3.6. <noinput> Element
+
+ <noinput> indicates that there was no input -- a timeout occurred in
+ the speech recognizer due to silence.
+ <interpretation>
+ <instance/>
+ <input>
+ <noinput/>
+ </input>
+ </interpretation>
+
+ If there are multiple levels of inputs, the most natural place for
+ <nomatch> and <noinput> elements to appear is under the highest level
+ of <input> for <noinput>, and under the appropriate level of
+
+
+
+
+
+ <interpretation> for <nomatch>. So, <noinput> means "no input at
+ all" and <nomatch> means "no match in speech modality" or "no match
+ in DTMF modality". For example, to represent garbled speech combined
+ with DTMF "1 2 3 4", the markup would be:
+ <input>
+ <input mode="speech"><nomatch/></input>
+ <input mode="dtmf">1 2 3 4</input>
+ </input>
+
+ Note: while <noinput> could be represented as an attribute of input,
+ <nomatch> cannot, since it could potentially include PCDATA content
+ with the best match. For parallelism, <noinput> is also an element.
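A client can distinguish the three <input> outcomes with a simple namespace-aware check. The non-normative Python sketch below classifies each child of a multi-modal <input>, mirroring the garbled-speech-plus-DTMF case above.

```python
# Non-normative sketch: classify an NLSML <input> element as "noinput",
# "nomatch", or its matched text, per child modality.

import xml.etree.ElementTree as ET

NS = "urn:ietf:params:xml:ns:mrcpv2"

def classify(input_elem):
    if input_elem.find("{%s}noinput" % NS) is not None:
        return "noinput"
    if input_elem.find("{%s}nomatch" % NS) is not None:
        return "nomatch"
    return (input_elem.text or "").strip()

doc = ET.fromstring(
    '<input xmlns="urn:ietf:params:xml:ns:mrcpv2">'
    '<input mode="speech"><nomatch/></input>'
    '<input mode="dtmf">1 2 3 4</input>'
    '</input>')

modes = {child.get("mode"): classify(child)
         for child in doc.findall("{%s}input" % NS)}
print(modes)  # {'speech': 'nomatch', 'dtmf': '1 2 3 4'}
```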
+
+9.7. Enrollment Results
+
+ All enrollment elements are contained within a single
+ <enrollment-result> element under <result>. The elements are
+ described below and have the schema defined in Section 16.2. The
+ following elements are defined:
+
+ 1. num-clashes
+
+ 2. num-good-repetitions
+
+ 3. num-repetitions-still-needed
+
+ 4. consistency-status
+
+ 5. clash-phrase-ids
+
+ 6. transcriptions
+
+ 7. confusable-phrases
+
+9.7.1. <num-clashes> Element
+
+ The <num-clashes> element contains the number of clashes that this
+ pronunciation has with other pronunciations in an active enrollment
+ session. The associated Clash-Threshold header field determines the
+ sensitivity of the clash measurement. Note that clash testing can be
+ turned off completely by setting the Clash-Threshold header field
+ value to 0.
+
+9.7.2. <num-good-repetitions> Element
+
+ The <num-good-repetitions> element contains the number of consistent
+ pronunciations obtained so far in an active enrollment session.
+
+
+
+
+
+
+9.7.3. <num-repetitions-still-needed> Element
+
+ The <num-repetitions-still-needed> element contains the number of
+ consistent pronunciations that must still be obtained before the new
+ phrase can be added to the enrollment grammar. The number of
+ consistent pronunciations required is specified by the client in the
+ request header field Num-Min-Consistent-Pronunciations. The returned
+ value must be 0 before the client can successfully commit a phrase to
+ the grammar by ending the enrollment session.
+
+9.7.4. <consistency-status> Element
+
+ The <consistency-status> element is used to indicate how consistent
+ the repetitions are when learning a new phrase. It can have the
+ values of consistent, inconsistent, and undecided.
+
+9.7.5. <clash-phrase-ids> Element
+
+ The <clash-phrase-ids> element contains the phrase IDs of clashing
+ pronunciation(s), if any. This element is absent if there are no
+ clashes.
+
+9.7.6. <transcriptions> Element
+
+ The <transcriptions> element contains the transcriptions returned in
+ the last repetition of the phrase being enrolled.
+
+9.7.7. <confusable-phrases> Element
+
+ The <confusable-phrases> element contains a list of phrases from a
+ command grammar that are confusable with the phrase being added to
+ the personal grammar. This element MAY be absent if there are no
+ confusable phrases.
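Taken together, these elements drive the client's enrollment loop: repeat the phrase until <num-repetitions-still-needed> reaches 0 and no clashes remain, then commit via END-PHRASE-ENROLLMENT. The non-normative sketch below models each RECOGNIZE result as a dictionary keyed by element name; send_recognize() is a stand-in for actual MRCPv2 signaling, not a real API.

```python
# Non-normative sketch of client-side enrollment control flow. Keys
# mirror the <enrollment-result> child elements; send_recognize() is a
# hypothetical callable that performs one RECOGNIZE and parses the
# RECOGNITION-COMPLETE enrollment result.

def enroll_phrase(send_recognize, max_attempts=10):
    for _ in range(max_attempts):
        result = send_recognize()
        if (result["num-repetitions-still-needed"] == 0
                and result["num-clashes"] == 0):
            return True   # safe to commit via END-PHRASE-ENROLLMENT
    return False          # give up; send Abort-Phrase-Enrollment:true

attempts = iter([
    {"num-repetitions-still-needed": 2, "num-clashes": 0},
    {"num-repetitions-still-needed": 1, "num-clashes": 0},
    {"num-repetitions-still-needed": 0, "num-clashes": 0},
])
ok = enroll_phrase(lambda: next(attempts))
print(ok)  # True
```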
+
+9.8. DEFINE-GRAMMAR
+
+ The DEFINE-GRAMMAR method, from the client to the server, provides
+ one or more grammars and requests the server to access, fetch, and
+ compile the grammars as needed. The DEFINE-GRAMMAR method
+ implementation MUST do a fetch of all external URIs that are part of
+ that operation. If caching is implemented, this URI fetching MUST
+ conform to the cache control hints and parameter header fields
+ associated with the method in deciding whether the URIs should be
+ fetched from cache or from the external server. If these hints/
+ parameters are not specified in the method, the values set for the
+   session using SET-PARAMS/GET-PARAMS apply.  If they were not set for
+   the session, their default values apply.
+
+
+
+
+
+
+   If the server resource is in the recognition state, the server MUST
+   respond to the DEFINE-GRAMMAR request with a failure status.
+
+   If the resource is in the idle state and is able to successfully
+   process the supplied grammars, the server MUST return a success
+   status-code, and the request-state MUST be COMPLETE.
+
+ If the recognizer resource could not define the grammar for some
+ reason (for example, if the download failed, the grammar failed to
+ compile, or the grammar was in an unsupported form), the MRCPv2
+ response for the DEFINE-GRAMMAR method MUST contain a failure status-
+ code of 407 and contain a Completion-Cause header field describing
+ the failure reason.
+
+ C->S:MRCP/2.0 ... DEFINE-GRAMMAR 543257
+ Channel-Identifier:32AECB23433801@speechrecog
+ Content-Type:application/srgs+xml
+ Content-ID:<request1@form-level.store>
+ Content-Length:...
+
+ <?xml version="1.0"?>
+
+ <!-- the default grammar language is US English -->
+ <grammar xmlns="http://www.w3.org/2001/06/grammar"
+ xml:lang="en-US" version="1.0">
+
+ <!-- single language attachment to tokens -->
+ <rule id="yes">
+ <one-of>
+ <item xml:lang="fr-CA">oui</item>
+ <item xml:lang="en-US">yes</item>
+ </one-of>
+ </rule>
+
+ <!-- single language attachment to a rule expansion -->
+ <rule id="request">
+ may I speak to
+ <one-of xml:lang="fr-CA">
+ <item>Michel Tremblay</item>
+ <item>Andre Roy</item>
+ </one-of>
+ </rule>
+
+ </grammar>
+
+ S->C:MRCP/2.0 ... 543257 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speechrecog
+ Completion-Cause:000 success
+
+
+
+
+
+ C->S:MRCP/2.0 ... DEFINE-GRAMMAR 543258
+ Channel-Identifier:32AECB23433801@speechrecog
+ Content-Type:application/srgs+xml
+ Content-ID:<helpgrammar@root-level.store>
+ Content-Length:...
+
+ <?xml version="1.0"?>
+
+ <!-- the default grammar language is US English -->
+ <grammar xmlns="http://www.w3.org/2001/06/grammar"
+ xml:lang="en-US" version="1.0">
+
+   <rule id="request">
+     I need help
+   </rule>
+
+   </grammar>
+
+ S->C:MRCP/2.0 ... 543258 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speechrecog
+ Completion-Cause:000 success
+
+ C->S:MRCP/2.0 ... DEFINE-GRAMMAR 543259
+ Channel-Identifier:32AECB23433801@speechrecog
+ Content-Type:application/srgs+xml
+ Content-ID:<request2@field-level.store>
+ Content-Length:...
+
+ <?xml version="1.0" encoding="UTF-8"?>
+
+ <!DOCTYPE grammar PUBLIC "-//W3C//DTD GRAMMAR 1.0//EN"
+ "http://www.w3.org/TR/speech-grammar/grammar.dtd">
+
+ <grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en"
+ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://www.w3.org/2001/06/grammar
+ http://www.w3.org/TR/speech-grammar/grammar.xsd"
+ version="1.0" mode="voice" root="basicCmd">
+
+ <meta name="author" content="Stephanie Williams"/>
+
+ <rule id="basicCmd" scope="public">
+ <example> please move the window </example>
+ <example> open a file </example>
+
+ <ruleref
+ uri="http://grammar.example.com/politeness.grxml#startPolite"/>
+
+
+
+
+
+
+
+
+ <ruleref uri="#command"/>
+ <ruleref
+ uri="http://grammar.example.com/politeness.grxml#endPolite"/>
+ </rule>
+
+ <rule id="command">
+ <ruleref uri="#action"/> <ruleref uri="#object"/>
+ </rule>
+
+ <rule id="action">
+ <one-of>
+ <item weight="10"> open <tag>open</tag> </item>
+ <item weight="2"> close <tag>close</tag> </item>
+ <item weight="1"> delete <tag>delete</tag> </item>
+ <item weight="1"> move <tag>move</tag> </item>
+ </one-of>
+ </rule>
+
+ <rule id="object">
+ <item repeat="0-1">
+ <one-of>
+ <item> the </item>
+ <item> a </item>
+ </one-of>
+ </item>
+
+ <one-of>
+ <item> window </item>
+ <item> file </item>
+ <item> menu </item>
+ </one-of>
+ </rule>
+
+ </grammar>
+
+
+ S->C:MRCP/2.0 ... 543259 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speechrecog
+ Completion-Cause:000 success
+
+ C->S:MRCP/2.0 ... RECOGNIZE 543260
+ Channel-Identifier:32AECB23433801@speechrecog
+ N-Best-List-Length:2
+ Content-Type:text/uri-list
+ Content-Length:...
+
+
+
+
+
+
+
+
+ session:request1@form-level.store
+ session:request2@field-level.store
+   session:helpgrammar@root-level.store
+
+ S->C:MRCP/2.0 ... 543260 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433801@speechrecog
+
+ S->C:MRCP/2.0 ... START-OF-INPUT 543260 IN-PROGRESS
+ Channel-Identifier:32AECB23433801@speechrecog
+
+ S->C:MRCP/2.0 ... RECOGNITION-COMPLETE 543260 COMPLETE
+ Channel-Identifier:32AECB23433801@speechrecog
+ Completion-Cause:000 success
+ Waveform-URI:<http://web.media.com/session123/audio.wav>;
+ size=124535;duration=2340
+        Content-Type:application/nlsml+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
+ xmlns:ex="http://www.example.com/example"
+ grammar="session:request1@form-level.store">
+ <interpretation>
+ <instance name="Person">
+ <ex:Person>
+ <ex:Name> Andre Roy </ex:Name>
+ </ex:Person>
+ </instance>
+ <input> may I speak to Andre Roy </input>
+ </interpretation>
+ </result>
+
+ Define Grammar Example
+
+9.9. RECOGNIZE
+
+ The RECOGNIZE method from the client to the server requests the
+ recognizer to start recognition and provides it with one or more
+ grammar references for grammars to match against the input media.
+ The RECOGNIZE method can carry header fields to control the
+ sensitivity, confidence level, and the level of detail in results
+ provided by the recognizer. These header field values override the
+ current values set by a previous SET-PARAMS method.
+
+ The RECOGNIZE method can request the recognizer resource to operate
+ in normal or hotword mode as specified by the Recognition-Mode header
+ field. The default value is "normal". If the resource could not
+ start a recognition, the server MUST respond with a failure status-
+
+
+
+
+
+ code of 407 and a Completion-Cause header field in the response
+ describing the cause of failure.
+
+ The RECOGNIZE request uses the message body to specify the grammars
+ applicable to the request. The active grammar(s) for the request can
+ be specified in one of three ways. If the client needs to explicitly
+ control grammar weights for the recognition operation, it MUST employ
+ method 3 below. The order of these grammars specifies the precedence
+ of the grammars that is used when more than one grammar in the list
+ matches the speech; in this case, the grammar with the higher
+ precedence is returned as a match. This precedence capability is
+ useful in applications like VoiceXML browsers to order grammars
+ specified at the dialog, document, and root level of a VoiceXML
+ application.
+
+ 1. The grammar MAY be placed directly in the message body as typed
+ content. If more than one grammar is included in the body, the
+ order of inclusion controls the corresponding precedence for the
+ grammars during recognition, with earlier grammars in the body
+ having a higher precedence than later ones.
+
+ 2. The body MAY contain a list of grammar URIs specified in content
+ of media type 'text/uri-list' [RFC2483]. The order of the URIs
+ determines the corresponding precedence for the grammars during
+ recognition, with highest precedence first and decreasing for
+ each URI thereafter.
+
+ 3. The body MAY contain a list of grammar URIs specified in content
+ of media type 'text/grammar-ref-list'. This type defines a list
+ of grammar URIs and allows each grammar URI to be assigned a
+ weight in the list. This weight has the same meaning as the
+       weights described in Section 2.4.1 of the Speech Recognition
+       Grammar Specification (SRGS) [W3C.REC-speech-grammar-20040316].
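+   As a non-normative sketch, a body of the third form above can be
+   composed programmatically.  In the helper below (function name and
+   URIs are illustrative, not defined by this specification), URI order
+   encodes precedence and the optional weight parameter carries the
+   SRGS-style weight:

```python
# Build a 'text/grammar-ref-list' body: one <uri> reference per line,
# highest precedence first, each with an optional ;weight="..." part.
def grammar_ref_list(entries):
    """entries: iterable of (uri, weight_or_None), best match first."""
    refs = []
    for uri, weight in entries:
        ref = "<%s>" % uri
        if weight is not None:
            ref += ';weight="%s"' % weight
        refs.append(ref)
    # MRCPv2 message bodies use CRLF line endings.
    return "\r\n".join(refs)

body = grammar_ref_list([
    ("session:request1@form-level.store", "2.0"),
    ("http://grammar.example.com/world-cities.grxml#canada", None),
])
```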
+
+ In addition to performing recognition on the input, the recognizer
+ MUST also enroll the collected utterance in a personal grammar if the
+ Enroll-Utterance header field is set to true and an Enrollment is
+ active (via an earlier execution of the START-PHRASE-ENROLLMENT
+ method). If so, and if the RECOGNIZE request contains a Content-ID
+ header field, then the resulting grammar (which includes the personal
+ grammar as a sub-grammar) can be referenced through the 'session' URI
+ scheme (see Section 13.6).
+
+ If the resource was able to successfully start the recognition, the
+ server MUST return a success status-code and a request-state of
+ IN-PROGRESS. This means that the recognizer is active and that the
+ client MUST be prepared to receive further events with this
+ request-id.
+
+
+
+
+
+ If the resource was able to queue the request, the server MUST return
+ a success code and request-state of PENDING. This means that the
+ recognizer is currently active with another request and that this
+ request has been queued for processing.
+
+ If the resource could not start a recognition, the server MUST
+ respond with a failure status-code of 407 and a Completion-Cause
+ header field in the response describing the cause of failure.
+
+ For the recognizer resource, RECOGNIZE and INTERPRET are the only
+ requests that return a request-state of IN-PROGRESS, meaning that
+ recognition is in progress. When the recognition completes by
+ matching one of the grammar alternatives or by a timeout without a
+ match or for some other reason, the recognizer resource MUST send the
+ client a RECOGNITION-COMPLETE event (or INTERPRETATION-COMPLETE, if
+ INTERPRET was the request) with the result of the recognition and a
+ request-state of COMPLETE.
+
+ Large grammars can take a long time for the server to compile. For
+ grammars that are used repeatedly, the client can improve server
+ performance by issuing a DEFINE-GRAMMAR request with the grammar
+ ahead of time. In such a case, the client can issue the RECOGNIZE
+ request and reference the grammar through the 'session' URI scheme
+ (see Section 13.6). This also applies in general if the client wants
+ to repeat recognition with a previous inline grammar.
+
+ The RECOGNIZE method implementation MUST do a fetch of all external
+ URIs that are part of that operation. If caching is implemented,
+ this URI fetching MUST conform to the cache control hints and
+ parameter header fields associated with the method in deciding
+ whether it should be fetched from cache or from the external server.
+ If these hints/parameters are not specified in the method, the values
+   set for the session using SET-PARAMS/GET-PARAMS apply.  If they were
+   not set for the session, their default values apply.
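+   The fallback chain just described (value on the method itself, then
+   the session value set via SET-PARAMS, then the resource default) can
+   be sketched as follows; the function and dictionary shapes are
+   illustrative, not part of the protocol:

```python
# Resolve the effective value of a header field such as a cache-control
# hint: method header wins, then the session-level value, then default.
def effective_param(name, method_headers, session_params, defaults):
    if name in method_headers:
        return method_headers[name]
    if name in session_params:
        return session_params[name]
    return defaults[name]

# e.g. Fetch-Timeout absent from the request but set for the session:
value = effective_param("Fetch-Timeout", {}, {"Fetch-Timeout": "20"},
                        {"Fetch-Timeout": "10"})
```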
+
+ Note that since the audio and the messages are carried over separate
+ communication paths there may be a race condition between the start
+ of the flow of audio and the receipt of the RECOGNIZE method. For
+ example, if an audio flow is started by the client at the same time
+ as the RECOGNIZE method is sent, either the audio or the RECOGNIZE
+ can arrive at the recognizer first. As another example, the client
+ may choose to continuously send audio to the server and signal the
+ server to recognize using the RECOGNIZE method. Mechanisms to
+ resolve this condition are outside the scope of this specification.
+ The recognizer can expect the media to start flowing when it receives
+ the RECOGNIZE request, but it MUST NOT buffer anything it receives
+ beforehand in order to preserve the semantics that application
+ authors expect with respect to the input timers.
+
+
+
+
+
+ When a RECOGNIZE method has been received, the recognition is
+ initiated on the stream. The No-Input-Timer MUST be started at this
+ time if the Start-Input-Timers header field is specified as "true".
+ If this header field is set to "false", the No-Input-Timer MUST be
+   started when the resource receives the START-INPUT-TIMERS method from
+   the client.  The Recognition-Timeout MUST be started when the
+   recognition resource detects speech or a DTMF digit in the media
+   stream.
+
+ For recognition when not in hotword mode:
+
+ When the recognizer resource detects speech or a DTMF digit in the
+ media stream, it MUST send the START-OF-INPUT event. When enough
+ speech has been collected for the server to process, the recognizer
+ can try to match the collected speech with the active grammars. If
+ the speech collected at this point fully matches with any of the
+ active grammars, the Speech-Complete-Timer is started. If it matches
+ partially with one or more of the active grammars, with more speech
+ needed before a full match is achieved, then the Speech-Incomplete-
+ Timer is started.
+
+ 1. When the No-Input-Timer expires, the recognizer MUST complete
+ with a Completion-Cause code of "no-input-timeout".
+
+ 2. The recognizer MUST support detecting a no-match condition upon
+ detecting end of speech. The recognizer MAY support detecting a
+ no-match condition before waiting for end-of-speech. If this is
+ supported, this capability is enabled by setting the Early-No-
+ Match header field to "true". Upon detecting a no-match
+ condition, the RECOGNIZE MUST return with "no-match".
+
+ 3. When the Speech-Incomplete-Timer expires, the recognizer SHOULD
+ complete with a Completion-Cause code of "partial-match", unless
+ the recognizer cannot differentiate a partial-match, in which
+ case it MUST return a Completion-Cause code of "no-match". The
+ recognizer MAY return results for the partially matched grammar.
+
+ 4. When the Speech-Complete-Timer expires, the recognizer MUST
+ complete with a Completion-Cause code of "success".
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ 5. When the Recognition-Timeout expires, one of the following MUST
+ happen:
+
+ 5.1. If there was a partial-match, the recognizer SHOULD
+ complete with a Completion-Cause code of "partial-match-
+ maxtime", unless the recognizer cannot differentiate a
+ partial-match, in which case it MUST complete with a
+ Completion-Cause code of "no-match-maxtime". The
+ recognizer MAY return results for the partially matched
+ grammar.
+
+ 5.2. If there was a full-match, the recognizer MUST complete
+ with a Completion-Cause code of "success-maxtime".
+
+ 5.3. If there was a no match, the recognizer MUST complete with
+ a Completion-Cause code of "no-match-maxtime".
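+   The timer-driven outcomes in items 1-5 above can be summarized in a
+   short, non-normative sketch (the function and argument names are
+   illustrative; item 2's early no-match detection is event-driven, not
+   timer-driven, and so is omitted):

```python
# Map an expired timer plus the current match state to the
# Completion-Cause a recognizer reports in normal mode.  Servers that
# cannot distinguish a partial match fall back to the no-match
# variants, per the SHOULD clauses in items 3 and 5.1.
def completion_cause(timer, match, can_detect_partial=True):
    if timer == "no-input":
        return "no-input-timeout"
    if timer == "speech-complete":
        return "success"
    if timer == "speech-incomplete":
        return "partial-match" if can_detect_partial else "no-match"
    if timer == "recognition-timeout":
        if match == "full":
            return "success-maxtime"
        if match == "partial" and can_detect_partial:
            return "partial-match-maxtime"
        return "no-match-maxtime"
    raise ValueError("unknown timer: %s" % timer)
```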
+
+ For recognition in hotword mode:
+
+ Note that for recognition in hotword mode the START-OF-INPUT event is
+ not generated when speech or a DTMF digit is detected.
+
+ 1. When the No-Input-Timer expires, the recognizer MUST complete
+ with a Completion-Cause code of "no-input-timeout".
+
+ 2. If at any point a match occurs, the RECOGNIZE MUST complete with
+ a Completion-Cause code of "success".
+
+ 3. When the Recognition-Timeout expires and there is not a match,
+ the RECOGNIZE MUST complete with a Completion-Cause code of
+ "hotword-maxtime".
+
+ 4. When the Recognition-Timeout expires and there is a match, the
+ RECOGNIZE MUST complete with a Completion-Cause code of "success-
+ maxtime".
+
+ 5. When the Recognition-Timeout is running but the detected speech/
+ DTMF has not resulted in a match, the Recognition-Timeout MUST be
+ stopped and reset. It MUST then be restarted when speech/DTMF is
+ again detected.
+
+ Below is a complete example of using RECOGNIZE. It shows the call to
+ RECOGNIZE, the IN-PROGRESS and START-OF-INPUT status messages, and
+ the final RECOGNITION-COMPLETE message containing the result.
+
+
+
+
+
+
+
+
+
+ C->S:MRCP/2.0 ... RECOGNIZE 543257
+ Channel-Identifier:32AECB23433801@speechrecog
+ Confidence-Threshold:0.9
+ Content-Type:application/srgs+xml
+ Content-ID:<request1@form-level.store>
+ Content-Length:...
+
+ <?xml version="1.0"?>
+
+ <!-- the default grammar language is US English -->
+ <grammar xmlns="http://www.w3.org/2001/06/grammar"
+ xml:lang="en-US" version="1.0" root="request">
+
+ <!-- single language attachment to tokens -->
+ <rule id="yes">
+ <one-of>
+ <item xml:lang="fr-CA">oui</item>
+ <item xml:lang="en-US">yes</item>
+ </one-of>
+ </rule>
+
+ <!-- single language attachment to a rule expansion -->
+ <rule id="request">
+ may I speak to
+ <one-of xml:lang="fr-CA">
+ <item>Michel Tremblay</item>
+ <item>Andre Roy</item>
+ </one-of>
+ </rule>
+
+ </grammar>
+
+ S->C: MRCP/2.0 ... 543257 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433801@speechrecog
+
+ S->C:MRCP/2.0 ... START-OF-INPUT 543257 IN-PROGRESS
+ Channel-Identifier:32AECB23433801@speechrecog
+
+ S->C:MRCP/2.0 ... RECOGNITION-COMPLETE 543257 COMPLETE
+ Channel-Identifier:32AECB23433801@speechrecog
+ Completion-Cause:000 success
+ Waveform-URI:<http://web.media.com/session123/audio.wav>;
+ size=424252;duration=2543
+ Content-Type:application/nlsml+xml
+ Content-Length:...
+
+
+
+
+
+
+
+
+ <?xml version="1.0"?>
+ <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
+ xmlns:ex="http://www.example.com/example"
+ grammar="session:request1@form-level.store">
+ <interpretation>
+ <instance name="Person">
+ <ex:Person>
+ <ex:Name> Andre Roy </ex:Name>
+ </ex:Person>
+ </instance>
+ <input> may I speak to Andre Roy </input>
+ </interpretation>
+ </result>
+
+ Below is an example of calling RECOGNIZE with a different grammar.
+ No status or completion messages are shown in this example, although
+ they would of course occur in normal usage.
+
+ C->S: MRCP/2.0 ... RECOGNIZE 543257
+ Channel-Identifier:32AECB23433801@speechrecog
+ Confidence-Threshold:0.9
+ Fetch-Timeout:20
+ Content-Type:application/srgs+xml
+ Content-Length:...
+
+     <?xml version="1.0"?>
+
+     <grammar xmlns="http://www.w3.org/2001/06/grammar"
+              xml:lang="en-US" version="1.0" mode="voice"
+              root="rule_list">
+
+     <rule id="rule_list" scope="public">
+       <one-of>
+         <item weight="10">
+           <ruleref uri=
+            "http://grammar.example.com/world-cities.grxml#canada"/>
+         </item>
+         <item weight="1.5">
+           <ruleref uri=
+            "http://grammar.example.com/world-cities.grxml#america"/>
+         </item>
+         <item weight="0.5">
+           <ruleref uri=
+            "http://grammar.example.com/world-cities.grxml#india"/>
+         </item>
+       </one-of>
+     </rule>
+
+     </grammar>
+
+
+
+
+
+
+
+
+
+
+9.10. STOP
+
+ The STOP method from the client to the server tells the resource to
+ stop recognition if a request is active. If a RECOGNIZE request is
+ active and the STOP request successfully terminated it, then the
+ response header section contains an Active-Request-Id-List header
+ field containing the request-id of the RECOGNIZE request that was
+ terminated. In this case, no RECOGNITION-COMPLETE event is sent for
+ the terminated request. If there was no recognition active, then the
+ response MUST NOT contain an Active-Request-Id-List header field.
+ Either way, the response MUST contain a status-code of 200 "Success".
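+   A client might act on the STOP response as in the following
+   non-normative sketch, which assumes the response header fields have
+   already been parsed into a dictionary:

```python
# The presence of Active-Request-Id-List tells the client which
# RECOGNIZE request(s) were terminated, and therefore that no
# RECOGNITION-COMPLETE event will arrive for them; its absence means
# nothing was active when STOP was processed.
def terminated_requests(stop_response_headers):
    ids = stop_response_headers.get("Active-Request-Id-List")
    if ids is None:
        return []  # no recognition was active
    return [i.strip() for i in ids.split(",")]
```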
+
+ C->S: MRCP/2.0 ... RECOGNIZE 543257
+ Channel-Identifier:32AECB23433801@speechrecog
+ Confidence-Threshold:0.9
+ Content-Type:application/srgs+xml
+ Content-ID:<request1@form-level.store>
+ Content-Length:...
+
+ <?xml version="1.0"?>
+
+ <!-- the default grammar language is US English -->
+ <grammar xmlns="http://www.w3.org/2001/06/grammar"
+ xml:lang="en-US" version="1.0" root="request">
+
+ <!-- single language attachment to tokens -->
+ <rule id="yes">
+ <one-of>
+ <item xml:lang="fr-CA">oui</item>
+ <item xml:lang="en-US">yes</item>
+ </one-of>
+ </rule>
+
+ <!-- single language attachment to a rule expansion -->
+ <rule id="request">
+ may I speak to
+ <one-of xml:lang="fr-CA">
+ <item>Michel Tremblay</item>
+ <item>Andre Roy</item>
+ </one-of>
+ </rule>
+ </grammar>
+
+ S->C: MRCP/2.0 ... 543257 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433801@speechrecog
+
+   C->S: MRCP/2.0 ... STOP 543258
+ Channel-Identifier:32AECB23433801@speechrecog
+
+
+
+
+
+ S->C: MRCP/2.0 ... 543258 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speechrecog
+ Active-Request-Id-List:543257
+
+9.11. GET-RESULT
+
+ The GET-RESULT method from the client to the server MAY be issued
+ when the recognizer resource is in the recognized state. This
+ request allows the client to retrieve results for a completed
+ recognition. This is useful if the client decides it wants more
+ alternatives or more information. When the server receives this
+ request, it re-computes and returns the results according to the
+ recognition constraints provided in the GET-RESULT request.
+
+ The GET-RESULT request can specify constraints such as a different
+ confidence-threshold or n-best-list-length. This capability is
+ OPTIONAL for MRCPv2 servers and the automatic speech recognition
+ engine in the server MUST return a status of unsupported feature if
+ not supported.
+
+ C->S: MRCP/2.0 ... GET-RESULT 543257
+ Channel-Identifier:32AECB23433801@speechrecog
+ Confidence-Threshold:0.9
+
+
+ S->C: MRCP/2.0 ... 543257 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speechrecog
+ Content-Type:application/nlsml+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
+ xmlns:ex="http://www.example.com/example"
+ grammar="session:request1@form-level.store">
+ <interpretation>
+ <instance name="Person">
+ <ex:Person>
+ <ex:Name> Andre Roy </ex:Name>
+ </ex:Person>
+ </instance>
+ <input> may I speak to Andre Roy </input>
+ </interpretation>
+ </result>
+
+
+
+
+
+
+
+
+
+
+9.12. START-OF-INPUT
+
+ This is an event from the server to the client indicating that the
+ recognizer resource has detected speech or a DTMF digit in the media
+ stream. This event is useful in implementing kill-on-barge-in
+ scenarios when a synthesizer resource is in a different session from
+ the recognizer resource and hence is not aware of an incoming audio
+ source (see Section 8.4.2). In these cases, it is up to the client
+ to act as an intermediary and respond to this event by issuing a
+ BARGE-IN-OCCURRED event to the synthesizer resource. The recognizer
+ resource also MUST send a Proxy-Sync-Id header field with a unique
+ value for this event.
+
+ This event MUST be generated by the server, irrespective of whether
+ or not the synthesizer and recognizer are on the same server.
+
+9.13. START-INPUT-TIMERS
+
+ This request is sent from the client to the recognizer resource when
+ it knows that a kill-on-barge-in prompt has finished playing (see
+ Section 8.4.2). This is useful in the scenario when the recognition
+ and synthesizer engines are not in the same session. When a kill-on-
+ barge-in prompt is being played, the client may want a RECOGNIZE
+ request to be simultaneously active so that it can detect and
+ implement kill-on-barge-in. But at the same time the client doesn't
+ want the recognizer to start the no-input timers until the prompt is
+ finished. The Start-Input-Timers header field in the RECOGNIZE
+ request allows the client to say whether or not the timers should be
+ started immediately. If not, the recognizer resource MUST NOT start
+ the timers until the client sends a START-INPUT-TIMERS method to the
+ recognizer.
+
+9.14. RECOGNITION-COMPLETE
+
+ This is an event from the recognizer resource to the client
+ indicating that the recognition completed. The recognition result is
+ sent in the body of the MRCPv2 message. The request-state field MUST
+ be COMPLETE indicating that this is the last event with that
+ request-id and that the request with that request-id is now complete.
+ The server MUST maintain the recognizer context containing the
+ results and the audio waveform input of that recognition until the
+ next RECOGNIZE request is issued for that resource or the session
+ terminates. If the server returns a URI to the audio waveform, it
+ MUST do so in a Waveform-URI header field in the RECOGNITION-COMPLETE
+ event. The client can use this URI to retrieve or playback the
+ audio.
+
+
+
+
+
+
+
+   Note that if an enrollment session was active, the RECOGNITION-
+   COMPLETE event can contain either recognition or enrollment results,
+   depending on what was spoken.  The following example shows a complete
+   exchange with a recognition result.
+
+ C->S: MRCP/2.0 ... RECOGNIZE 543257
+ Channel-Identifier:32AECB23433801@speechrecog
+ Confidence-Threshold:0.9
+ Content-Type:application/srgs+xml
+ Content-ID:<request1@form-level.store>
+ Content-Length:...
+
+ <?xml version="1.0"?>
+
+ <!-- the default grammar language is US English -->
+ <grammar xmlns="http://www.w3.org/2001/06/grammar"
+ xml:lang="en-US" version="1.0" root="request">
+
+ <!-- single language attachment to tokens -->
+ <rule id="yes">
+ <one-of>
+ <item xml:lang="fr-CA">oui</item>
+ <item xml:lang="en-US">yes</item>
+ </one-of>
+ </rule>
+
+ <!-- single language attachment to a rule expansion -->
+ <rule id="request">
+ may I speak to
+ <one-of xml:lang="fr-CA">
+ <item>Michel Tremblay</item>
+ <item>Andre Roy</item>
+ </one-of>
+ </rule>
+ </grammar>
+
+ S->C: MRCP/2.0 ... 543257 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433801@speechrecog
+
+ S->C: MRCP/2.0 ... START-OF-INPUT 543257 IN-PROGRESS
+ Channel-Identifier:32AECB23433801@speechrecog
+
+
+
+
+
+
+
+
+
+
+
+
+ S->C: MRCP/2.0 ... RECOGNITION-COMPLETE 543257 COMPLETE
+ Channel-Identifier:32AECB23433801@speechrecog
+ Completion-Cause:000 success
+ Waveform-URI:<http://web.media.com/session123/audio.wav>;
+ size=342456;duration=25435
+ Content-Type:application/nlsml+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
+ xmlns:ex="http://www.example.com/example"
+ grammar="session:request1@form-level.store">
+ <interpretation>
+ <instance name="Person">
+ <ex:Person>
+ <ex:Name> Andre Roy </ex:Name>
+ </ex:Person>
+ </instance>
+ <input> may I speak to Andre Roy </input>
+ </interpretation>
+ </result>
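+   As a non-normative aid, the fields of an NLSML body like the one
+   above can be extracted with Python's standard XML library.  The main
+   trap is namespace handling: <interpretation>, <instance>, and <input>
+   live in the urn:ietf:params:xml:ns:mrcpv2 namespace, while the
+   example's application data uses its own namespace:

```python
import xml.etree.ElementTree as ET

NLSML = "urn:ietf:params:xml:ns:mrcpv2"
EX = "http://www.example.com/example"

body = """<?xml version="1.0"?>
<result xmlns="urn:ietf:params:xml:ns:mrcpv2"
        xmlns:ex="http://www.example.com/example"
        grammar="session:request1@form-level.store">
  <interpretation>
    <instance name="Person">
      <ex:Person>
        <ex:Name> Andre Roy </ex:Name>
      </ex:Person>
    </instance>
    <input> may I speak to Andre Roy </input>
  </interpretation>
</result>"""

root = ET.fromstring(body)
grammar = root.get("grammar")  # attribute names are not namespaced here
utterance = root.find(
    "{%s}interpretation/{%s}input" % (NLSML, NLSML)).text.strip()
name = root.find(".//{%s}Name" % EX).text.strip()
```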
+
+ If the result were instead an enrollment result, the final message
+ from the server above could have been:
+
+ S->C: MRCP/2.0 ... RECOGNITION-COMPLETE 543257 COMPLETE
+ Channel-Identifier:32AECB23433801@speechrecog
+ Completion-Cause:000 success
+ Content-Type:application/nlsml+xml
+ Content-Length:...
+
+     <?xml version="1.0"?>
+ <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
+ grammar="Personal-Grammar-URI">
+ <enrollment-result>
+ <num-clashes> 2 </num-clashes>
+ <num-good-repetitions> 1 </num-good-repetitions>
+ <num-repetitions-still-needed>
+ 1
+ </num-repetitions-still-needed>
+ <consistency-status> consistent </consistency-status>
+ <clash-phrase-ids>
+ <item> Jeff </item> <item> Andre </item>
+ </clash-phrase-ids>
+ <transcriptions>
+ <item> m ay b r ow k er </item>
+ <item> m ax r aa k ah </item>
+ </transcriptions>
+
+
+
+
+
+ <confusable-phrases>
+ <item>
+ <phrase> call </phrase>
+ <confusion-level> 10 </confusion-level>
+ </item>
+ </confusable-phrases>
+ </enrollment-result>
+ </result>
+
+9.15. START-PHRASE-ENROLLMENT
+
+ The START-PHRASE-ENROLLMENT method from the client to the server
+ starts a new phrase enrollment session during which the client can
+ call RECOGNIZE multiple times to enroll a new utterance in a grammar.
+ An enrollment session consists of a set of calls to RECOGNIZE in
+ which the caller speaks a phrase several times so the system can
+ "learn" it. The phrase is then added to a personal grammar (speaker-
+ trained grammar), so that the system can recognize it later.
+
+ Only one phrase enrollment session can be active at a time for a
+ resource. The Personal-Grammar-URI identifies the grammar that is
+ used during enrollment to store the personal list of phrases. Once
+ RECOGNIZE is called, the result is returned in a RECOGNITION-COMPLETE
+ event and will contain either an enrollment result OR a recognition
+ result for a regular recognition.
+
+ Calling END-PHRASE-ENROLLMENT ends the ongoing phrase enrollment
+ session, which is typically done after a sequence of successful calls
+ to RECOGNIZE. This method can be called to commit the new phrase to
+ the personal grammar or to abort the phrase enrollment session.
+
+ The grammar to contain the new enrolled phrase, specified by
+ Personal-Grammar-URI, is created if it does not exist. Also, the
+ personal grammar MUST ONLY contain phrases added via a phrase
+ enrollment session.
+
+ The Phrase-ID passed to this method is used to identify this phrase
+ in the grammar and will be returned as the speech input when doing a
+ RECOGNIZE on the grammar. The Phrase-NL similarly is returned in a
+ RECOGNITION-COMPLETE event in the same manner as other Natural
+ Language (NL) in a grammar. The tag-format of this NL is
+ implementation specific.
+
+ If the client has specified Save-Best-Waveform as true, then the
+ response after ending the phrase enrollment session MUST contain the
+ location/URI of a recording of the best repetition of the learned
+ phrase.
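+   The enrollment flow this section describes can be sketched as a
+   client-side loop.  The client object and its send_* methods below are
+   illustrative assumptions, not RFC-defined APIs; a real client would
+   issue the corresponding MRCPv2 requests and wait for the
+   RECOGNITION-COMPLETE events:

```python
class EnrollResult:
    def __init__(self, still_needed):
        self.num_repetitions_still_needed = still_needed

class FakeClient:
    """Stand-in for server interaction: pretends the server needs two
    consistent repetitions before the phrase can be committed."""
    def __init__(self):
        self.still_needed = 2
        self.log = []
    def send_start_phrase_enrollment(self, **headers):
        self.log.append("START-PHRASE-ENROLLMENT")
    def send_recognize(self, enroll_utterance):
        self.log.append("RECOGNIZE")
        self.still_needed -= 1
        return EnrollResult(self.still_needed)
    def send_end_phrase_enrollment(self):
        self.log.append("END-PHRASE-ENROLLMENT")

def enroll_phrase(client, grammar_uri, phrase_id):
    client.send_start_phrase_enrollment(
        personal_grammar_uri=grammar_uri, phrase_id=phrase_id)
    # Repeat RECOGNIZE until <num-repetitions-still-needed> reaches 0...
    while client.send_recognize(
            enroll_utterance=True).num_repetitions_still_needed > 0:
        pass
    # ...then commit the phrase by ending the session (the client could
    # instead abort via the Abort-Phrase-Enrollment header field).
    client.send_end_phrase_enrollment()
```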
+
+
+
+
+
+
+ C->S: MRCP/2.0 ... START-PHRASE-ENROLLMENT 543258
+ Channel-Identifier:32AECB23433801@speechrecog
+ Num-Min-Consistent-Pronunciations:2
+ Consistency-Threshold:30
+ Clash-Threshold:12
+ Personal-Grammar-URI:<personal grammar uri>
+ Phrase-Id:<phrase id>
+ Phrase-NL:<NL phrase>
+ Weight:1
+ Save-Best-Waveform:true
+
+ S->C: MRCP/2.0 ... 543258 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speechrecog
+
+9.16. ENROLLMENT-ROLLBACK
+
+ The ENROLLMENT-ROLLBACK method discards the last live utterance from
+ the RECOGNIZE operation. The client can invoke this method when the
+ caller provides undesirable input such as non-speech noises, side-
+ speech, commands, utterance from the RECOGNIZE grammar, etc. Note
+ that this method does not provide a stack of rollback states.
+ Executing ENROLLMENT-ROLLBACK twice in succession without an
+ intervening recognition operation has no effect the second time.
+
+ C->S: MRCP/2.0 ... ENROLLMENT-ROLLBACK 543261
+ Channel-Identifier:32AECB23433801@speechrecog
+
+ S->C: MRCP/2.0 ... 543261 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speechrecog
+
+9.17. END-PHRASE-ENROLLMENT
+
+ The client MAY call the END-PHRASE-ENROLLMENT method ONLY during an
+ active phrase enrollment session. It MUST NOT be called during an
+ ongoing RECOGNIZE operation. To commit the new phrase in the
+ grammar, the client MAY call this method once successive calls to
+ RECOGNIZE have succeeded and Num-Repetitions-Still-Needed has been
+ returned as 0 in the RECOGNITION-COMPLETE event. Alternatively, the
+ client MAY abort the phrase enrollment session by calling this method
+ with the Abort-Phrase-Enrollment header field.
+
+ If the client has specified Save-Best-Waveform as "true" in the
+ START-PHRASE-ENROLLMENT request, then the response MUST contain a
+ Waveform-URI header whose value is the location/URI of a recording of
+ the best repetition of the learned phrase.
+
+ C->S: MRCP/2.0 ... END-PHRASE-ENROLLMENT 543262
+ Channel-Identifier:32AECB23433801@speechrecog
+
+
+
+
+
+ S->C: MRCP/2.0 ... 543262 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speechrecog
+ Waveform-URI:<http://mediaserver.com/recordings/file1324.wav>;
+ size=242453;duration=25432
+
+9.18. MODIFY-PHRASE
+
+ The MODIFY-PHRASE method sent from the client to the server is used
+ to change the phrase ID, NL phrase, and/or weight for a given phrase
+ in a personal grammar.
+
+ If no fields are supplied, then calling this method has no effect.
+
+ C->S: MRCP/2.0 ... MODIFY-PHRASE 543265
+ Channel-Identifier:32AECB23433801@speechrecog
+ Personal-Grammar-URI:<personal grammar uri>
+ Phrase-Id:<phrase id>
+ New-Phrase-Id:<new phrase id>
+ Phrase-NL:<NL phrase>
+ Weight:1
+
+ S->C: MRCP/2.0 ... 543265 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speechrecog
+
+9.19. DELETE-PHRASE
+
+ The DELETE-PHRASE method sent from the client to the server is used
+ to delete a phrase that is in a personal grammar and was added through
+ voice enrollment or text enrollment. If the specified phrase does
+ not exist, this method has no effect.
+
+ C->S: MRCP/2.0 ... DELETE-PHRASE 543266
+ Channel-Identifier:32AECB23433801@speechrecog
+ Personal-Grammar-URI:<personal grammar uri>
+ Phrase-Id:<phrase id>
+
+ S->C: MRCP/2.0 ... 543266 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speechrecog
+
+9.20. INTERPRET
+
+ The INTERPRET method from the client to the server takes as input an
+ Interpret-Text header field containing the text for which the
+ semantic interpretation is desired, and returns, via the
+ INTERPRETATION-COMPLETE event, an interpretation result that is very
+ similar to the one returned from a RECOGNIZE method invocation. Only
+ portions of the result relevant to acoustic matching are excluded
+ from the result. The Interpret-Text header field MUST be included in
+ the INTERPRET request.
+
+ Recognizer grammar data is treated in the same way as it is when
+ issuing a RECOGNIZE method call.
+
+ If a RECOGNIZE, RECORD, or another INTERPRET operation is already in
+ progress for the resource, the server MUST reject the request with a
+ response having a status-code of 402 "Method not valid in this
+ state", and a COMPLETE request state.
+
+ C->S: MRCP/2.0 ... INTERPRET 543266
+ Channel-Identifier:32AECB23433801@speechrecog
+ Interpret-Text:may I speak to Andre Roy
+ Content-Type:application/srgs+xml
+ Content-ID:<request1@form-level.store>
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <!-- the default grammar language is US English -->
+ <grammar xmlns="http://www.w3.org/2001/06/grammar"
+ xml:lang="en-US" version="1.0" root="request">
+ <!-- single language attachment to tokens -->
+ <rule id="yes">
+ <one-of>
+ <item xml:lang="fr-CA">oui</item>
+ <item xml:lang="en-US">yes</item>
+ </one-of>
+ </rule>
+
+ <!-- single language attachment to a rule expansion -->
+ <rule id="request">
+ may I speak to
+ <one-of xml:lang="fr-CA">
+ <item>Michel Tremblay</item>
+ <item>Andre Roy</item>
+ </one-of>
+ </rule>
+ </grammar>
+
+ S->C: MRCP/2.0 ... 543266 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433801@speechrecog
+
+ S->C: MRCP/2.0 ... INTERPRETATION-COMPLETE 543266 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speechrecog
+ Completion-Cause:000 success
+ Content-Type:application/nlsml+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
+ xmlns:ex="http://www.example.com/example"
+ grammar="session:request1@form-level.store">
+ <interpretation>
+ <instance name="Person">
+ <ex:Person>
+ <ex:Name> Andre Roy </ex:Name>
+ </ex:Person>
+ </instance>
+ <input> may I speak to Andre Roy </input>
+ </interpretation>
+ </result>
+
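+ For illustration only, the NLSML result above can be consumed with
+ any XML parser; the following sketch (not part of this
+ specification) extracts the <input> text and the instance data,
+ assuming the namespace URIs used in the example.

```python
import xml.etree.ElementTree as ET

# Illustrative sketch only: the namespace URIs below are those used in
# the INTERPRET example; the <instance> payload is application specific
# and here follows the example's ex:Person structure.
NS = "urn:ietf:params:xml:ns:mrcpv2"
EX = "http://www.example.com/example"

nlsml = """<?xml version="1.0"?>
<result xmlns="urn:ietf:params:xml:ns:mrcpv2"
        xmlns:ex="http://www.example.com/example"
        grammar="session:request1@form-level.store">
  <interpretation>
    <instance name="Person">
      <ex:Person>
        <ex:Name> Andre Roy </ex:Name>
      </ex:Person>
    </instance>
    <input> may I speak to Andre Roy </input>
  </interpretation>
</result>"""

root = ET.fromstring(nlsml)
interp = root.find(f"{{{NS}}}interpretation")
# Pull out the matched name and the literal input text.
name = interp.findtext(f"{{{NS}}}instance/{{{EX}}}Person/{{{EX}}}Name")
spoken = interp.findtext(f"{{{NS}}}input")
print(name.strip())    # Andre Roy
print(spoken.strip())  # may I speak to Andre Roy
```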
+9.21. INTERPRETATION-COMPLETE
+
+ This event from the recognizer resource to the client indicates that
+ the INTERPRET operation is complete. The interpretation result is
+ sent in the body of the MRCP message. The request state MUST be set
+ to COMPLETE.
+
+ The Completion-Cause header field MUST be included in this event and
+ MUST be set to an appropriate value from the list of cause codes.
+
+ C->S: MRCP/2.0 ... INTERPRET 543266
+ Channel-Identifier:32AECB23433801@speechrecog
+ Interpret-Text:may I speak to Andre Roy
+ Content-Type:application/srgs+xml
+ Content-ID:<request1@form-level.store>
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <!-- the default grammar language is US English -->
+ <grammar xmlns="http://www.w3.org/2001/06/grammar"
+ xml:lang="en-US" version="1.0" root="request">
+ <!-- single language attachment to tokens -->
+ <rule id="yes">
+ <one-of>
+ <item xml:lang="fr-CA">oui</item>
+ <item xml:lang="en-US">yes</item>
+ </one-of>
+ </rule>
+
+ <!-- single language attachment to a rule expansion -->
+ <rule id="request">
+ may I speak to
+ <one-of xml:lang="fr-CA">
+ <item>Michel Tremblay</item>
+ <item>Andre Roy</item>
+ </one-of>
+ </rule>
+ </grammar>
+
+ S->C: MRCP/2.0 ... 543266 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433801@speechrecog
+
+ S->C: MRCP/2.0 ... INTERPRETATION-COMPLETE 543266 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speechrecog
+ Completion-Cause:000 success
+ Content-Type:application/nlsml+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
+ xmlns:ex="http://www.example.com/example"
+ grammar="session:request1@form-level.store">
+ <interpretation>
+ <instance name="Person">
+ <ex:Person>
+ <ex:Name> Andre Roy </ex:Name>
+ </ex:Person>
+ </instance>
+ <input> may I speak to Andre Roy </input>
+ </interpretation>
+ </result>
+
+9.22. DTMF Detection
+
+ Digits received as DTMF tones are delivered to the recognition
+ resource in the MRCPv2 server in the RTP stream according to RFC 4733
+ [RFC4733]. The Automatic Speech Recognizer (ASR) MUST support RFC
+ 4733 to recognize digits, and it MAY support recognizing DTMF tones
+ [Q.23] in the audio.
+
+10. Recorder Resource
+
+ This resource captures received audio and video and stores it as
+ content pointed to by a URI. The main uses of a recorder are
+
+ 1. to capture speech audio that may be submitted for recognition at
+ a later time, and
+
+ 2. to record voice or video mails.
+
+ Both these applications require functionality above and beyond that
+ specified by protocols such as RTSP [RFC2326]. This includes audio
+ endpointing (i.e., detecting speech or silence). Support for video
+ is OPTIONAL and is mainly for capturing video mails, which may
+ require the speech or audio processing mentioned above.
+
+ A recorder MUST provide endpointing capabilities for suppressing
+ silence at the beginning and end of a recording, and it MAY also
+ suppress silence in the middle of a recording. If such suppression
+ is done, the recorder MUST maintain timing metadata to indicate the
+ actual time stamps of the recorded media.
+
+ See the discussion on the sensitivity of saved waveforms in
+ Section 12.
+
+10.1. Recorder State Machine
+
+ Idle Recording
+ State State
+ | |
+ |---------RECORD------->|
+ | |
+ |<------STOP------------|
+ | |
+ |<--RECORD-COMPLETE-----|
+ | |
+ | |--------|
+ | START-OF-INPUT |
+ | |------->|
+ | |
+ | |--------|
+ | START-INPUT-TIMERS |
+ | |------->|
+ | |
+
+ Recorder State Machine
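+ For illustration only, the state machine above can be expressed as
+ a transition table; the state, method, and event names are taken
+ from the diagram, and the helper itself is hypothetical.

```python
# Illustrative transition table for the recorder state machine above.
# State names and method/event names follow the diagram; this helper
# is not defined by the specification.
RECORDER_TRANSITIONS = {
    ("idle", "RECORD"): "recording",
    ("recording", "STOP"): "idle",
    ("recording", "RECORD-COMPLETE"): "idle",
    ("recording", "START-OF-INPUT"): "recording",
    ("recording", "START-INPUT-TIMERS"): "recording",
}

def next_state(state: str, event: str) -> str:
    try:
        return RECORDER_TRANSITIONS[(state, event)]
    except KeyError:
        # e.g., a second RECORD while already recording draws a
        # 402 response per Section 10.6
        raise ValueError(f"{event} not valid in {state} state")
```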
+
+10.2. Recorder Methods
+
+ The recorder resource supports the following methods.
+
+ recorder-method = "RECORD"
+ / "STOP"
+ / "START-INPUT-TIMERS"
+
+10.3. Recorder Events
+
+ The recorder resource can generate the following events.
+
+ recorder-event = "START-OF-INPUT"
+ / "RECORD-COMPLETE"
+
+10.4. Recorder Header Fields
+
+ Method invocations for the recorder resource can contain resource-
+ specific header fields containing request options and information to
+ augment the Method, Response, or Event message it is associated with.
+
+ recorder-header = sensitivity-level
+ / no-input-timeout
+ / completion-cause
+ / completion-reason
+ / failed-uri
+ / failed-uri-cause
+ / record-uri
+ / media-type
+ / max-time
+ / trim-length
+ / final-silence
+ / capture-on-speech
+ / ver-buffer-utterance
+ / start-input-timers
+ / new-audio-channel
+
+10.4.1. Sensitivity-Level
+
+ To filter out background noise and not mistake it for speech, the
+ recorder can support a variable level of sound sensitivity. The
+ Sensitivity-Level header field is a float value between 0.0 and 1.0
+ and allows the client to set the sensitivity level for the recorder.
+ This header field MAY occur in RECORD, SET-PARAMS, or GET-PARAMS. A
+ higher value for this header field means higher sensitivity. The
+ default value for this header field is implementation specific.
+
+ sensitivity-level = "Sensitivity-Level" ":" FLOAT CRLF
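+ For illustration only, a client might validate the value before
+ sending it; the helper below is hypothetical and simply enforces
+ the 0.0 to 1.0 range given above.

```python
# Hypothetical client-side check (not mandated by the specification)
# that a Sensitivity-Level value fits the FLOAT range above before it
# is sent in RECORD, SET-PARAMS, or GET-PARAMS.
def sensitivity_level(value: float) -> str:
    if not 0.0 <= value <= 1.0:
        raise ValueError("Sensitivity-Level must be between 0.0 and 1.0")
    return f"Sensitivity-Level:{value}"

print(sensitivity_level(0.7))  # Sensitivity-Level:0.7
```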
+
+10.4.2. No-Input-Timeout
+
+ When recording is started and there is no speech detected for a
+ certain period of time, the recorder can send a RECORD-COMPLETE event
+ to the client and terminate the record operation. The No-Input-
+ Timeout header field can set this timeout value. The value is in
+ milliseconds. This header field MAY occur in RECORD, SET-PARAMS, or
+ GET-PARAMS. The value for this header field ranges from 0 to an
+ implementation-specific maximum value. The default value for this
+ header field is implementation specific.
+
+ no-input-timeout = "No-Input-Timeout" ":" 1*19DIGIT CRLF
+
+10.4.3. Completion-Cause
+
+ This header field MUST be part of a RECORD-COMPLETE event from the
+ recorder resource to the client. This indicates the reason behind
+ the RECORD method completion. This header field MUST be sent in the
+ RECORD responses if they return with a failure status and a COMPLETE
+ state. In the ABNF below, the 'cause-code' contains a numerical
+ value selected from the Cause-Code column of the following table.
+ The 'cause-name' contains the corresponding token selected from the
+ Cause-Name column.
+
+ completion-cause = "Completion-Cause" ":" cause-code SP
+ cause-name CRLF
+ cause-code = 3DIGIT
+ cause-name = *VCHAR
+
+ +------------+-----------------------+------------------------------+
+ | Cause-Code | Cause-Name | Description |
+ +------------+-----------------------+------------------------------+
+ | 000 | success-silence | RECORD completed with a |
+ | | | silence at the end. |
+ | 001 | success-maxtime | RECORD completed after |
+ | | | reaching maximum recording |
+ | | | time specified in record |
+ | | | method. |
+ | 002 | no-input-timeout | RECORD failed due to no |
+ | | | input. |
+ | 003 | uri-failure | Failure accessing the record |
+ | | | URI. |
+ | 004 | error | RECORD request terminated |
+ | | | prematurely due to a |
+ | | | recorder error. |
+ +------------+-----------------------+------------------------------+
+
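+ For illustration only, the table above maps directly to a lookup
+ table; the helper below is hypothetical and formats a
+ Completion-Cause header field from a cause-code.

```python
# Hypothetical helper (not defined by the protocol) that formats a
# recorder Completion-Cause header field from the table above.
RECORDER_CAUSE_NAMES = {
    0: "success-silence",
    1: "success-maxtime",
    2: "no-input-timeout",
    3: "uri-failure",
    4: "error",
}

def completion_cause(code: int) -> str:
    # cause-code is 3DIGIT, followed by a space and the cause-name
    return f"Completion-Cause:{code:03d} {RECORDER_CAUSE_NAMES[code]}"

print(completion_cause(0))  # Completion-Cause:000 success-silence
```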
+10.4.4. Completion-Reason
+
+ This header field MAY be present in a RECORD-COMPLETE event coming
+ from the recorder resource to the client. It contains the reason
+ text behind the RECORD request completion. This header field
+ communicates text describing the reason for the failure.
+
+ The completion reason text is provided for client use in logs and for
+ debugging and instrumentation purposes. Clients MUST NOT interpret
+ the completion reason text.
+
+ completion-reason = "Completion-Reason" ":"
+ quoted-string CRLF
+
+10.4.5. Failed-URI
+
+ When a recorder method needs to post the audio to a URI and access to
+ the URI fails, the server MUST provide the failed URI in this header
+ field in the method response.
+
+ failed-uri = "Failed-URI" ":" absoluteURI CRLF
+
+10.4.6. Failed-URI-Cause
+
+ When a recorder method needs to post the audio to a URI and access to
+ the URI fails, the server MAY provide the URI-specific or protocol-
+ specific response code through this header field in the method
+ response. The value encoding is UTF-8 (RFC 3629 [RFC3629]) to
+ accommodate any access protocol -- some access protocols might have a
+ response string instead of a numeric response code.
+
+ failed-uri-cause = "Failed-URI-Cause" ":" 1*UTFCHAR
+ CRLF
+
+10.4.7. Record-URI
+
+ When a recorder method contains this header field, the server MUST
+ capture the audio and store it. If the header field is present but
+ specified with no value, the server MUST store the content locally
+ and generate a URI that points to it. This URI is then returned in
+ either the STOP response or the RECORD-COMPLETE event. If the header
+ field in the RECORD method specifies a URI, the server MUST attempt
+ to capture and store the audio at that location. If this header
+ field is not specified in the RECORD request, the server MUST capture
+ the audio, MUST encode it, and MUST send it in the STOP response or
+ the RECORD-COMPLETE event as a message body. In this case, the
+ response carrying the audio content MUST include a Content-ID (cid)
+ [RFC2392] value in this header field pointing to the Content-ID in
+ the message body.
+
+ The server MUST also return the size in octets and the duration in
+ milliseconds of the recorded audio waveform as parameters associated
+ with the header field.
+
+ Implementations MUST support 'http' [RFC2616], 'https' [RFC2818],
+ 'file' [RFC3986], and 'cid' [RFC2392] schemes in the URI. Note that
+ implementations already exist that support other schemes.
+
+ record-uri = "Record-URI" ":" ["<" uri ">"
+ ";" "size" "=" 1*19DIGIT
+ ";" "duration" "=" 1*19DIGIT] CRLF
+
+10.4.8. Media-Type
+
+ A RECORD method MUST contain this header field, which specifies to
+ the server the media type of the captured audio or video.
+
+ media-type = "Media-Type" ":" media-type-value
+ CRLF
+
+10.4.9. Max-Time
+
+ When recording is started, this header field specifies the maximum
+ length of the recording in milliseconds, calculated from the time
+ the actual capture and store begins, which is not necessarily the
+ time the RECORD method is received. The duration is measured before
+ any silence suppression is applied by the recorder resource.
+ After this time, the recording stops and the server MUST return a
+ RECORD-COMPLETE event to the client having a request-state of
+ COMPLETE. This header field MAY occur in RECORD, SET-PARAMS, or GET-
+ PARAMS. The value for this header field ranges from 0 to an
+ implementation-specific maximum value. A value of 0 means infinity,
+ and hence the recording continues until one or more of the other stop
+ conditions are met. The default value for this header field is 0.
+
+ max-time = "Max-Time" ":" 1*19DIGIT CRLF
+
+10.4.10. Trim-Length
+
+ This header field MAY be sent on a STOP method and specifies the
+ length of audio to be trimmed from the end of the recording after the
+ stop. The length is interpreted to be in milliseconds. The default
+ value for this header field is 0.
+
+ trim-length = "Trim-Length" ":" 1*19DIGIT CRLF
+
+10.4.11. Final-Silence
+
+ When the recorder is started and the actual capture begins, this
+ header field specifies the length of silence in the audio that is to
+ be interpreted as the end of the recording. This header field MAY
+ occur in RECORD, SET-PARAMS, or GET-PARAMS. The value for this
+ header field ranges from 0 to an implementation-specific maximum
+ value and is interpreted to be in milliseconds. A value of 0 means
+ infinity, and hence the recording will continue until one of the
+ other stop conditions is met. The default value for this header
+ field is implementation specific.
+
+ final-silence = "Final-Silence" ":" 1*19DIGIT CRLF
+
+10.4.12. Capture-On-Speech
+
+ If "false", the recorder MUST start capturing immediately when
+ started. If "true", the recorder MUST wait for the endpointing
+ functionality to detect speech before it starts capturing. This
+ header field MAY occur in RECORD, SET-PARAMS, or GET-PARAMS. The
+ value for this header field is a Boolean. The default value for this
+ header field is "false".
+
+ capture-on-speech = "Capture-On-Speech" ":" BOOLEAN CRLF
+
+10.4.13. Ver-Buffer-Utterance
+
+ This header field is the same as the one described for the verifier
+ resource (see Section 11.4.14). This tells the server to buffer the
+ utterance associated with this recording request into the
+ verification buffer. Sending this header field is permitted only if
+ the verification buffer is for the session. This buffer is shared
+ across resources within a session. It gets instantiated when a
+ verifier resource is added to this session and is released when the
+ verifier resource is released from the session.
+
+10.4.14. Start-Input-Timers
+
+ This header field MAY be sent as part of the RECORD request. A value
+ of "false" tells the recorder resource to start the operation, but
+ not to start the no-input timer until the client sends a START-INPUT-
+ TIMERS request to the recorder resource. This is useful in the
+ scenario when the recorder and synthesizer resources are not part of
+ the same session. When a kill-on-barge-in prompt is being played,
+ the client may want the RECORD request to be simultaneously active so
+ that it can detect and implement kill-on-barge-in (see
+ Section 8.4.2). But at the same time, the client doesn't want the
+ recorder resource to start the no-input timers until the prompt is
+ finished. The default value is "true".
+
+ start-input-timers = "Start-Input-Timers" ":"
+ BOOLEAN CRLF
+
+10.4.15. New-Audio-Channel
+
+ This header field is the same as the one described for the recognizer
+ resource (see Section 9.4.23).
+
+10.5. Recorder Message Body
+
+ If the RECORD request did not have a Record-URI header field, the
+ STOP response or the RECORD-COMPLETE event MUST contain a message
+ body carrying the captured audio. In this case, the message carrying
+ the audio content has a Record-URI header field with a Content ID
+ value pointing to the message body entity that contains the recorded
+ audio. See Section 10.4.7 for details.
+
+10.6. RECORD
+
+ The RECORD request places the recorder resource in the recording
+ state. Depending on the header fields specified in the RECORD
+ method, the resource may start recording the audio immediately or
+ wait for the endpointing functionality to detect speech in the audio.
+ The audio is then made available to the client either in the message
+ body or as specified by Record-URI.
+
+ The server MUST support the 'https' URI scheme and MAY support other
+ schemes. Note that, due to the sensitive nature of voice recordings,
+ any protocols used for dereferencing SHOULD employ integrity and
+ confidentiality, unless other means, such as use of a controlled
+ environment (see Section 4.2), are employed.
+
+ If a RECORD operation is already in progress, invoking this method
+ causes the server to issue a response having a status-code of 402
+ "Method not valid in this state" and a request-state of COMPLETE.
+
+ If the Record-URI is not valid, a status-code of 404 "Illegal Value
+ for Header Field" is returned in the response. If it is impossible
+ for the server to create the requested stored content, a status-code
+ of 407 "Method or Operation Failed" is returned.
+
+ If the type specified in the Media-Type header field is not
+ supported, the server MUST respond with a status-code of 409
+ "Unsupported Header Field Value" with the Media-Type header field in
+ its response.
+
+ When the recording operation is initiated, the response indicates an
+ IN-PROGRESS request state. The server MAY generate a subsequent
+ START-OF-INPUT event when speech is detected. Upon completion of the
+ recording operation, the server generates a RECORD-COMPLETE event.
+
+ C->S: MRCP/2.0 ... RECORD 543257
+ Channel-Identifier:32AECB23433802@recorder
+ Record-URI:<file://mediaserver/recordings/myfile.wav>
+ Media-Type:audio/wav
+ Capture-On-Speech:true
+ Final-Silence:300
+ Max-Time:6000
+
+ S->C: MRCP/2.0 ... 543257 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433802@recorder
+
+ S->C: MRCP/2.0 ... START-OF-INPUT 543257 IN-PROGRESS
+ Channel-Identifier:32AECB23433802@recorder
+
+ S->C: MRCP/2.0 ... RECORD-COMPLETE 543257 COMPLETE
+ Channel-Identifier:32AECB23433802@recorder
+ Completion-Cause:000 success-silence
+ Record-URI:<file://mediaserver/recordings/myfile.wav>;
+ size=242552;duration=25645
+
+ RECORD Example
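+ For illustration only, note that the message-length field in the
+ start-line (elided as "..." in the examples) counts every octet of
+ the message, including its own digits, so serializing a request
+ involves a small fixed-point computation. The sketch below is
+ hypothetical.

```python
# Illustrative serialization of the RECORD request above. The
# message-length covers the entire message, including the digits of
# the length field itself, so it is computed as a fixed point. This
# helper is not part of the protocol.
def mrcp_record_request(request_id, headers):
    header_block = "".join(f"{name}:{value}\r\n" for name, value in headers)
    tail = f" RECORD {request_id}\r\n{header_block}\r\n"
    length = 0
    while True:
        total = len("MRCP/2.0 ") + len(str(length)) + len(tail)
        if total == length:
            break
        length = total
    return f"MRCP/2.0 {length}{tail}"

msg = mrcp_record_request("543257", [
    ("Channel-Identifier", "32AECB23433802@recorder"),
    ("Record-URI", "<file://mediaserver/recordings/myfile.wav>"),
    ("Media-Type", "audio/wav"),
    ("Capture-On-Speech", "true"),
    ("Final-Silence", "300"),
    ("Max-Time", "6000"),
])
# The declared message-length equals the actual octet count.
print(int(msg.split()[1]) == len(msg))
```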
+
+10.7. STOP
+
+ The STOP method moves the recorder from the recording state back to
+ the idle state. If a RECORD request is active and the STOP request
+ successfully terminates it, then the STOP response MUST contain an
+ Active-Request-Id-List header field containing the RECORD request-id
+ that was terminated. In this case, no RECORD-COMPLETE event is sent
+ for the terminated request. If there was no recording active, then
+ the response MUST NOT contain an Active-Request-Id-List header field.
+ If the recording was a success, the STOP response MUST contain a
+ Record-URI header field pointing to the recorded audio content or to
+ a typed entity in the body of the STOP response containing the
+ recorded audio. The STOP method MAY have a Trim-Length header field,
+ in which case the specified length of audio is trimmed from the end
+ of the recording after the stop. In any case, the response MUST
+ contain a status-code of 200 "Success".
+
+ C->S: MRCP/2.0 ... RECORD 543257
+ Channel-Identifier:32AECB23433802@recorder
+ Record-URI:<file://mediaserver/recordings/myfile.wav>
+ Capture-On-Speech:true
+ Final-Silence:300
+ Max-Time:6000
+
+ S->C: MRCP/2.0 ... 543257 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433802@recorder
+
+ S->C: MRCP/2.0 ... START-OF-INPUT 543257 IN-PROGRESS
+ Channel-Identifier:32AECB23433802@recorder
+
+ C->S: MRCP/2.0 ... STOP 543257
+ Channel-Identifier:32AECB23433802@recorder
+ Trim-Length:200
+
+ S->C: MRCP/2.0 ... 543257 200 COMPLETE
+ Channel-Identifier:32AECB23433802@recorder
+ Record-URI:<file://mediaserver/recordings/myfile.wav>;
+ size=324253;duration=24561
+ Active-Request-Id-List:543257
+
+ STOP Example
+
+10.8. RECORD-COMPLETE
+
+ If the recording completes due to no input, silence after speech, or
+ reaching the max-time, the server MUST generate the RECORD-COMPLETE
+ event to the client with a request-state of COMPLETE. If the
+ recording was a success, the RECORD-COMPLETE event contains a Record-
+ URI header field pointing to the recorded audio file on the server or
+ to a typed entity in the message body containing the recorded audio.
+
+ C->S: MRCP/2.0 ... RECORD 543257
+ Channel-Identifier:32AECB23433802@recorder
+ Record-URI:<file://mediaserver/recordings/myfile.wav>
+ Capture-On-Speech:true
+ Final-Silence:300
+ Max-Time:6000
+
+ S->C: MRCP/2.0 ... 543257 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433802@recorder
+
+ S->C: MRCP/2.0 ... START-OF-INPUT 543257 IN-PROGRESS
+ Channel-Identifier:32AECB23433802@recorder
+
+ S->C: MRCP/2.0 ... RECORD-COMPLETE 543257 COMPLETE
+ Channel-Identifier:32AECB23433802@recorder
+ Completion-Cause:000 success
+ Record-URI:<file://mediaserver/recordings/myfile.wav>;
+ size=325325;duration=24652
+
+ RECORD-COMPLETE Example
+
+10.9. START-INPUT-TIMERS
+
+ This request is sent from the client to the recorder resource when it
+ discovers that a kill-on-barge-in prompt has finished playing (see
+ Section 8.4.2). This is useful in the scenario when the recorder and
+ synthesizer resources are not in the same MRCPv2 session. When a
+ kill-on-barge-in prompt is being played, the client wants the RECORD
+ request to be simultaneously active so that it can detect and
+ implement kill-on-barge-in. But at the same time, the client doesn't
+ want the recorder resource to start the no-input timers until the
+ prompt is finished. The Start-Input-Timers header field in the
+ RECORD request allows the client to indicate whether the timers
+ should be started. In the above case, the recorder resource does not
+ start the timers until the client sends a START-INPUT-TIMERS method
+ to the recorder.
+
+10.10. START-OF-INPUT
+
+ The START-OF-INPUT event is returned from the server to the client
+ once the server has detected speech. This event is always returned
+ by the recorder resource when speech has been detected. The recorder
+ resource also MUST send a Proxy-Sync-Id header field with a unique
+ value for this event.
+
+ S->C: MRCP/2.0 ... START-OF-INPUT 543259 IN-PROGRESS
+ Channel-Identifier:32AECB23433801@recorder
+ Proxy-Sync-Id:987654321
+
+11. Speaker Verification and Identification
+
+ This section describes the methods, responses, and events employed by
+ MRCPv2 for speaker verification and identification.
+
+ Speaker verification is a voice authentication methodology that can
+ be used to identify the speaker in order to grant the user access to
+ sensitive information and transactions. Because speech is a
+ biometric, a number of essential security considerations related to
+ biometric authentication technologies apply to its implementation and
+ usage. Implementers should carefully read Section 12 in this
+ document and the corresponding section of the SPEECHSC requirements
+ [RFC4313]. Implementers and deployers of this technology are
+ strongly encouraged to check the state of the art for any new risks
+ and solutions that might have been developed.
+
+ In speaker verification, a recorded utterance is compared to a
+ previously stored voiceprint, which is in turn associated with a
+ claimed identity for that user. Verification typically consists of
+ two phases: a designation phase to establish the claimed identity of
+ the caller and an execution phase in which a voiceprint is either
+ created (training) or used to authenticate the claimed identity
+ (verification).
+
+ Speaker identification is the process of associating an unknown
+ speaker with a member in a population. It does not employ a claim of
+ identity. When an individual claims to belong to a group (e.g., one
+ of the owners of a joint bank account), a group authentication is
+ performed. This is generally implemented as a kind of verification
+ involving comparison with more than one voice model. It is sometimes
+ called 'multi-verification'. If the individual speaker can be
+ identified from the group, this may be useful for applications where
+ multiple users share the same access privileges to some data or
+ application. Speaker identification and group authentication are
+ also done in two phases, a designation phase and an execution phase.
+ Note that, from a functionality standpoint, identification can be
+ thought of as a special case of group authentication (if the
+ individual is identified) where the group is the entire population,
+ although the implementation of speaker identification may be
+ different from the way group authentication is performed. To
+ accommodate single-voiceprint verification, verification against
+ multiple voiceprints, group authentication, and identification, this
+ specification provides a single set of methods that can take a list
+ of identifiers, called "voiceprint identifiers", and return a list of
+ identifiers, with a score for each that represents how well the input
+ speech matched each identifier. The input and output lists of
+ identifiers do not have to match, allowing a vendor-specific group
+ identifier to be used as input to indicate that identification is to
+ be performed. In this specification, the terms "identification" and
+ "multi-verification" are used to indicate that the input represents a
+ group (potentially the entire population) and that results for
+ multiple voiceprints may be returned.
+
+ It is possible for a verifier resource to share the same session with
+ a recognizer resource or to operate independently. In order to share
+ the same session, the verifier and recognizer resources MUST be
+ allocated from within the same SIP dialog. Otherwise, an independent
+ verifier resource, running on the same physical server or a separate
+ one, will be set up. Note that, in addition to allowing both
+ resources to be allocated in the same INVITE, it is possible to
+ allocate one initially and the other later via a re-INVITE.
+
+ Some of the speaker verification methods, described below, apply only
+ to a specific mode of operation.
+
+ The verifier resource has a verification buffer associated with it
+ (see Section 11.4.14). This allows the storage of speech utterances
+ for the purposes of verification, identification, or training from
+ the buffered speech. This buffer is owned by the verifier resource,
+ but other input resources (such as the recognizer resource or
+ recorder resource) may write to it. This allows the speech received
+ as part of a recognition or recording operation to be later used for
+ verification, identification, or training. Access to the buffer is
+ limited to one operation at a time. Hence, when the resource is doing
+ read, write, or delete operations, such as a RECOGNIZE with
+ ver-buffer-utterance turned on, another operation involving the
+ buffer fails with a status-code of 402. The verification buffer can
+ be cleared by a CLEAR-BUFFER request from the client and is freed
+ when the verifier resource is deallocated or the session with the
+ server terminates.
+
+ The verification buffer is different from collecting waveforms and
+ processing them using either the real-time audio stream or stored
+ audio, because this buffering mechanism does not simply accumulate
+ speech to a buffer. The verification buffer MAY contain additional
+ information gathered by the recognizer resource that serves to
+ improve verification performance.
+
+11.1. Speaker Verification State Machine
+
+ Speaker verification may operate in a training or a verification
+ session. Starting one of these sessions does not change the state of
+ the verifier resource, i.e., it remains idle. Once a verification or
+ training session is started, then utterances are trained or verified
+ by calling the VERIFY or VERIFY-FROM-BUFFER method. The state of the
+ verifier resource goes from the IDLE to the VERIFYING state each
+ time VERIFY or VERIFY-FROM-BUFFER is called.
+
+ Idle Session Opened Verifying/Training
+ State State State
+ | | |
+ |--START-SESSION--->| |
+ | | |
+ | |----------| |
+ | | START-SESSION |
+ | |<---------| |
+ | | |
+ |<--END-SESSION-----| |
+ | | |
+ | |---------VERIFY--------->|
+ | | |
+ | |---VERIFY-FROM-BUFFER--->|
+ | | |
+ | |----------| |
+ | | VERIFY-ROLLBACK |
+ | |<---------| |
+ | | |
+ | | |--------|
+ | | GET-INTERMEDIATE-RESULT |
+ | | |------->|
+ | | |
+ | | |--------|
+ | | START-INPUT-TIMERS |
+ | | |------->|
+ | | |
+ | | |--------|
+ | | START-OF-INPUT |
+ | | |------->|
+ | | |
+ | |<-VERIFICATION-COMPLETE--|
+ | | |
+ | |<--------STOP------------|
+ | | |
+ | |----------| |
+ | | STOP |
+ | |<---------| |
+ | | |
+ |----------| | |
+ | STOP | |
+ |<---------| | |
+ | |----------| |
+ | | CLEAR-BUFFER |
+ | |<---------| |
+ | | |
+ |----------| | |
+ | CLEAR-BUFFER | |
+ |<---------| | |
+ | | |
+ | |----------| |
+ | | QUERY-VOICEPRINT |
+ | |<---------| |
+ | | |
+ |----------| | |
+ | QUERY-VOICEPRINT | |
+ |<---------| | |
+ | | |
+ | |----------| |
+ | | DELETE-VOICEPRINT |
+ | |<---------| |
+ | | |
+ |----------| | |
+ | DELETE-VOICEPRINT | |
+ |<---------| | |
+
+ Verifier Resource State Machine
+
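The transitions in the diagram above can be captured as a small table-driven sketch. This is purely illustrative: the state names follow the diagram columns and the 402 behavior follows Section 11.6, but the function and constant names are assumptions, and only a representative subset of transitions is listed.

```python
# Table-driven sketch of the verifier resource state machine shown
# above.  State names follow the diagram columns; the "402" text
# follows Section 11.6.  Everything else is an assumption.
IDLE, SESSION_OPENED, VERIFYING = "idle", "session-opened", "verifying"

TRANSITIONS = {
    (IDLE, "START-SESSION"): SESSION_OPENED,
    (SESSION_OPENED, "START-SESSION"): SESSION_OPENED,  # implicit abort
    (SESSION_OPENED, "END-SESSION"): IDLE,
    (SESSION_OPENED, "VERIFY"): VERIFYING,
    (SESSION_OPENED, "VERIFY-FROM-BUFFER"): VERIFYING,
    (SESSION_OPENED, "VERIFY-ROLLBACK"): SESSION_OPENED,
    (VERIFYING, "GET-INTERMEDIATE-RESULT"): VERIFYING,
    (VERIFYING, "START-INPUT-TIMERS"): VERIFYING,
    (VERIFYING, "VERIFICATION-COMPLETE"): SESSION_OPENED,
    (VERIFYING, "STOP"): SESSION_OPENED,
}

def next_state(state, method):
    """Return the next state, or raise for an invalid transition."""
    try:
        return TRANSITIONS[(state, method)]
    except KeyError:
        raise ValueError("402 Method not valid in this state")
```
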
+11.2. Speaker Verification Methods
+
+ The verifier resource supports the following methods.
+
+ verifier-method = "START-SESSION"
+ / "END-SESSION"
+ / "QUERY-VOICEPRINT"
+ / "DELETE-VOICEPRINT"
+ / "VERIFY"
+ / "VERIFY-FROM-BUFFER"
+ / "VERIFY-ROLLBACK"
+ / "STOP"
+ / "CLEAR-BUFFER"
+ / "START-INPUT-TIMERS"
+ / "GET-INTERMEDIATE-RESULT"
+
+ These methods allow the client to control the mode and target of
+ verification or identification operations within the context of a
+ session. All the verification input operations that occur within a
+ session can be used to create, update, or validate against the
+
+
+
+
+
+Burnett & Shanmugham Standards Track [Page 142]
+
+RFC 6787 MRCPv2 November 2012
+
+
+ voiceprint specified during the session. At the beginning of each
+ session, the verifier resource is reset to the state it had prior to
+ any previous verification session.
+
+ Verification/identification operations can be executed against live
+ or buffered audio. The verifier resource provides methods for
+ collecting and evaluating live audio data, and methods for
+ controlling the verifier resource and adjusting its configured
+ behavior.
+
+ There are no dedicated methods for collecting buffered audio data.
+ This is accomplished by calling VERIFY, RECOGNIZE, or RECORD as
+ appropriate for the resource, with the header field
+ Ver-Buffer-Utterance. Then, when the following method is called,
+ verification is performed using the set of buffered audio.
+
+ 1. VERIFY-FROM-BUFFER
+
+ The following methods are used for verification of live audio
+ utterances:
+
+ 1. VERIFY
+
+ 2. START-INPUT-TIMERS
+
+ The following methods are used for configuring the verifier resource
+ and for establishing resource states:
+
+ 1. START-SESSION
+
+ 2. END-SESSION
+
+ 3. QUERY-VOICEPRINT
+
+ 4. DELETE-VOICEPRINT
+
+ 5. VERIFY-ROLLBACK
+
+ 6. STOP
+
+ 7. CLEAR-BUFFER
+
+ The following method allows the polling of a verification in progress
+ for intermediate results.
+
+ 1. GET-INTERMEDIATE-RESULT
+
+11.3. Verification Events
+
+ The verifier resource generates the following events.
+
+ verifier-event = "VERIFICATION-COMPLETE"
+ / "START-OF-INPUT"
+
+11.4. Verification Header Fields
+
+ A verifier resource message can contain header fields containing
+ request options and information to augment the Request, Response, or
+ Event message it is associated with.
+
+ verification-header = repository-uri
+ / voiceprint-identifier
+ / verification-mode
+ / adapt-model
+ / abort-model
+ / min-verification-score
+ / num-min-verification-phrases
+ / num-max-verification-phrases
+ / no-input-timeout
+ / save-waveform
+ / media-type
+ / waveform-uri
+ / voiceprint-exists
+ / ver-buffer-utterance
+ / input-waveform-uri
+ / completion-cause
+ / completion-reason
+ / speech-complete-timeout
+ / new-audio-channel
+ / abort-verification
+ / start-input-timers
+
+11.4.1. Repository-URI
+
+ This header field specifies the voiceprint repository to be used or
+ referenced during speaker verification or identification operations.
+ This header field is required in the START-SESSION, QUERY-VOICEPRINT,
+ and DELETE-VOICEPRINT methods.
+
+ repository-uri = "Repository-URI" ":" uri CRLF
+
+11.4.2. Voiceprint-Identifier
+
+ This header field specifies the claimed identity for verification
+ applications. The claimed identity MAY be used to specify an
+ existing voiceprint or to establish a new voiceprint. This header
+ field MUST be present in the QUERY-VOICEPRINT and DELETE-VOICEPRINT
+ methods. The Voiceprint-Identifier MUST be present in the START-
+ SESSION method for verification operations. For identification or
+ multi-verification operations, this header field MAY contain a list
+ of voiceprint identifiers separated by semicolons. For
+ identification operations, the client MAY also specify a voiceprint
+ group identifier instead of a list of voiceprint identifiers.
+
+ voiceprint-identifier = "Voiceprint-Identifier" ":"
+                           vid *(";" vid) CRLF
+ vid = 1*VCHAR ["." 1*VCHAR]
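A client-side parse of this header field can be sketched as follows. The function name and error handling are assumptions; the semicolon-separated list form is the one used for identification and multi-verification operations.

```python
# Illustrative parser for the Voiceprint-Identifier header field
# defined by the ABNF above; names and error handling are assumed.
def parse_voiceprint_identifier(header_line):
    name, sep, value = header_line.partition(":")
    if not sep or name.strip() != "Voiceprint-Identifier":
        raise ValueError("not a Voiceprint-Identifier header field")
    # Multiple voiceprint identifiers are separated by semicolons.
    return [vid.strip() for vid in value.split(";") if vid.strip()]

ids = parse_voiceprint_identifier(
    "Voiceprint-Identifier:johnsmith.voiceprint;marysmith.voiceprint")
# ids == ["johnsmith.voiceprint", "marysmith.voiceprint"]
```
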
+
+11.4.3. Verification-Mode
+
+ This header field specifies the mode of the verifier resource and is
+ set by the START-SESSION method. Acceptable values indicate whether
+ the verification session will train a voiceprint ("train") or verify/
+ identify using an existing voiceprint ("verify").
+
+ Training and verification sessions both require the voiceprint
+ Repository-URI to be specified in the START-SESSION. In many usage
+ scenarios, however, the system does not know the speaker's claimed
+ identity until a recognition operation has, for example, recognized
+ an account number to which the user desires access. In order to
+ allow the first few utterances of a dialog to be both recognized and
+ verified, the verifier resource on the MRCPv2 server retains a
+ buffer. In this buffer, the MRCPv2 server accumulates recognized
+ utterances. The client can later execute a verification method and
+ apply the buffered utterances to the current verification session.
+
+ Some voice user interfaces may require additional user input that
+ should not be subject to verification. For example, the user's input
+ may have been recognized with low confidence and thus require a
+ confirmation cycle. In such cases, the client SHOULD NOT execute the
+ VERIFY or VERIFY-FROM-BUFFER methods to collect and analyze the
+ caller's input. A separate recognizer resource can analyze the
+ caller's response without any participation by the verifier resource.
+
+ Once the following conditions have been met:
+
+ 1. the voiceprint identity has been successfully established through
+ the Voiceprint-Identifier header fields of the START-SESSION
+ method, and
+
+ 2. the verification mode has been set to one of "train" or "verify",
+
+ the verifier resource can begin providing verification information
+ during verification operations. If the verifier resource does not
+   reach one of the two major states ("train" or "verify"), it MUST
+ report an error condition in the MRCPv2 status code to indicate why
+ the verifier resource is not ready for the corresponding usage.
+
+ The value of verification-mode is persistent within a verification
+ session. If the client attempts to change the mode during a
+ verification session, the verifier resource reports an error and the
+ mode retains its current value.
+
+ verification-mode = "Verification-Mode" ":"
+                              verification-mode-string CRLF
+
+ verification-mode-string = "train"
+ / "verify"
+
+11.4.4. Adapt-Model
+
+ This header field indicates the desired behavior of the verifier
+ resource after a successful verification operation. If the value of
+ this header field is "true", the server SHOULD use audio collected
+ during the verification session to update the voiceprint to account
+ for ongoing changes in a speaker's incoming speech characteristics,
+ unless local policy prohibits updating the voiceprint. If the value
+ is "false" (the default), the server MUST NOT update the voiceprint.
+ This header field MAY occur in the START-SESSION method.
+
+ adapt-model = "Adapt-Model" ":" BOOLEAN CRLF
+
+11.4.5. Abort-Model
+
+ The Abort-Model header field indicates the desired behavior of the
+ verifier resource upon session termination. If the value of this
+ header field is "true", the server MUST discard any pending changes
+ to a voiceprint due to verification training or verification
+ adaptation. If the value is "false" (the default), the server MUST
+ commit any pending changes for a training session or a successful
+ verification session to the voiceprint repository. A value of "true"
+ for Abort-Model overrides a value of "true" for the Adapt-Model
+ header field. This header field MAY occur in the END-SESSION method.
+
+ abort-model = "Abort-Model" ":" BOOLEAN CRLF
+
+11.4.6. Min-Verification-Score
+
+ The Min-Verification-Score header field, when used with a verifier
+ resource through a SET-PARAMS, GET-PARAMS, or START-SESSION method,
+ determines the minimum verification score for which a verification
+ decision of "accepted" may be declared by the server. This is a
+ float value between -1.0 and 1.0. The default value for this header
+ field is implementation specific.
+
+ min-verification-score = "Min-Verification-Score" ":"
+ [ %x2D ] FLOAT CRLF
+
+11.4.7. Num-Min-Verification-Phrases
+
+ The Num-Min-Verification-Phrases header field is used to specify the
+ minimum number of valid utterances before a positive decision is
+ given for verification. The value for this header field is an
+ integer and the default value is 1. The verifier resource MUST NOT
+ declare a verification 'accepted' unless Num-Min-Verification-Phrases
+ valid utterances have been received. The minimum value is 1. This
+ header field MAY occur in START-SESSION, SET-PARAMS, or GET-PARAMS.
+
+ num-min-verification-phrases = "Num-Min-Verification-Phrases" ":"
+ 1*19DIGIT CRLF
+
+11.4.8. Num-Max-Verification-Phrases
+
+ The Num-Max-Verification-Phrases header field is used to specify the
+ number of valid utterances required before a decision is forced for
+ verification. The verifier resource MUST NOT return a decision of
+ 'undecided' once Num-Max-Verification-Phrases have been collected and
+ used to determine a verification score. The value for this header
+ field is an integer and the minimum value is 1. The default value is
+ implementation specific. This header field MAY occur in START-
+ SESSION, SET-PARAMS, or GET-PARAMS.
+
+ num-max-verification-phrases = "Num-Max-Verification-Phrases" ":"
+ 1*19DIGIT CRLF
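Together with Min-Verification-Score, these two header fields bound when the server may report each decision. The actual scoring algorithm is implementation specific; the sketch below only illustrates the stated constraints, with assumed parameter names and an assumed default threshold.

```python
# Sketch of the decision constraints imposed by
# Min-Verification-Score, Num-Min-Verification-Phrases, and
# Num-Max-Verification-Phrases.  Real scoring is implementation
# specific; the default threshold here is an assumption.
def decide(cumulative_score, num_valid_utterances,
           min_score=0.5, num_min_phrases=1, num_max_phrases=3):
    accept = cumulative_score >= min_score
    if num_valid_utterances >= num_max_phrases:
        # A decision is forced: "undecided" MUST NOT be returned once
        # Num-Max-Verification-Phrases utterances have been scored.
        return "accepted" if accept else "rejected"
    if accept and num_valid_utterances >= num_min_phrases:
        # "accepted" only after Num-Min-Verification-Phrases valid
        # utterances have been received.
        return "accepted"
    return "undecided"
```
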
+
+11.4.9. No-Input-Timeout
+
+ The No-Input-Timeout header field sets the length of time from the
+ start of the verification timers (see START-INPUT-TIMERS) until the
+ VERIFICATION-COMPLETE server event message declares that no input has
+ been received (i.e., has a Completion-Cause of no-input-timeout).
+ The value is in milliseconds. This header field MAY occur in VERIFY,
+ SET-PARAMS, or GET-PARAMS. The value for this header field ranges
+ from 0 to an implementation-specific maximum value. The default
+ value for this header field is implementation specific.
+
+ no-input-timeout = "No-Input-Timeout" ":" 1*19DIGIT CRLF
+
+11.4.10. Save-Waveform
+
+ This header field allows the client to request that the verifier
+ resource save the audio stream that was used for verification/
+ identification. The verifier resource MUST attempt to record the
+ audio and make it available to the client in the form of a URI
+ returned in the Waveform-URI header field in the VERIFICATION-
+ COMPLETE event. If there was an error in recording the stream, or
+ the audio content is otherwise not available, the verifier resource
+ MUST return an empty Waveform-URI header field. The default value
+ for this header field is "false". This header field MAY appear in
+ the VERIFY method. Note that this header field does not appear in
+ the VERIFY-FROM-BUFFER method since it only controls whether or not
+ to save the waveform for live verification/identification operations.
+
+ save-waveform = "Save-Waveform" ":" BOOLEAN CRLF
+
+11.4.11. Media-Type
+
+   This header field MAY be specified in the SET-PARAMS, GET-PARAMS, or
+   VERIFY methods and tells the server resource the media type of the
+   captured audio or video, such as the audio captured and returned via
+   the Waveform-URI header field.
+
+ media-type = "Media-Type" ":" media-type-value
+ CRLF
+
+11.4.12. Waveform-URI
+
+ If the Save-Waveform header field is set to "true", the verifier
+ resource MUST attempt to record the incoming audio stream of the
+ verification into a file and provide a URI for the client to access
+ it. This header field MUST be present in the VERIFICATION-COMPLETE
+ event if the Save-Waveform header field was set to true by the
+ client. The value of the header field MUST be empty if there was
+ some error condition preventing the server from recording.
+ Otherwise, the URI generated by the server MUST be globally unique
+ across the server and all its verification sessions. The content
+ MUST be available via the URI until the verification session ends.
+ Since the Save-Waveform header field applies only to live
+ verification/identification operations, the server can return the
+ Waveform-URI only in the VERIFICATION-COMPLETE event for live
+ verification/identification operations.
+
+ The server MUST also return the size in octets and the duration in
+ milliseconds of the recorded audio waveform as parameters associated
+ with the header field.
+
+ waveform-uri = "Waveform-URI" ":" ["<" uri ">"
+ ";" "size" "=" 1*19DIGIT
+ ";" "duration" "=" 1*19DIGIT] CRLF
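A receiving client might unpack this header field as follows. The regex and function name are assumptions; an empty header field value signals that recording failed, which the sketch maps to None.

```python
import re

# Illustrative parser for the Waveform-URI header field defined by
# the ABNF above; the regex and function name are assumptions.
_WAVEFORM_URI = re.compile(
    r"Waveform-URI:\s*<(?P<uri>[^>]*)>"
    r";size=(?P<size>\d+);duration=(?P<duration>\d+)\s*$")

def parse_waveform_uri(line):
    match = _WAVEFORM_URI.match(line)
    if match is None:
        return None  # empty header field: recording failed
    return (match.group("uri"),
            int(match.group("size")),      # octets
            int(match.group("duration")))  # milliseconds

uri, size, duration = parse_waveform_uri(
    "Waveform-URI:<http://example.com/verify/audio1.wav>"
    ";size=40000;duration=5000")
```
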
+
+11.4.13. Voiceprint-Exists
+
+ This header field MUST be returned in QUERY-VOICEPRINT and DELETE-
+ VOICEPRINT responses. This is the status of the voiceprint specified
+ in the QUERY-VOICEPRINT method. For the DELETE-VOICEPRINT method,
+ this header field indicates the status of the voiceprint at the
+ moment the method execution started.
+
+ voiceprint-exists = "Voiceprint-Exists" ":" BOOLEAN CRLF
+
+11.4.14. Ver-Buffer-Utterance
+
+ This header field is used to indicate that this utterance could be
+ later considered for speaker verification. This way, a client can
+ request the server to buffer utterances while doing regular
+ recognition or verification activities, and speaker verification can
+ later be requested on the buffered utterances. This header field is
+ optional in the RECOGNIZE, VERIFY, and RECORD methods. The default
+ value for this header field is "false".
+
+ ver-buffer-utterance = "Ver-Buffer-Utterance" ":" BOOLEAN
+ CRLF
+
+11.4.15. Input-Waveform-URI
+
+ This header field specifies stored audio content that the client
+ requests the server to fetch and process according to the current
+ verification mode, either to train the voiceprint or verify a claimed
+ identity. This header field enables the client to implement the
+ buffering use case where the recognizer and verifier resources are in
+ different sessions and the verification buffer technique cannot be
+ used. It MAY be specified on the VERIFY request.
+
+ input-waveform-uri = "Input-Waveform-URI" ":" uri CRLF
+
+11.4.16. Completion-Cause
+
+ This header field MUST be part of a VERIFICATION-COMPLETE event from
+ the verifier resource to the client. This indicates the cause of
+ VERIFY or VERIFY-FROM-BUFFER method completion. This header field
+ MUST be sent in the VERIFY, VERIFY-FROM-BUFFER, and QUERY-VOICEPRINT
+ responses, if they return with a failure status and a COMPLETE state.
+ In the ABNF below, the 'cause-code' contains a numerical value
+ selected from the Cause-Code column of the following table. The
+ 'cause-name' contains the corresponding token selected from the
+ Cause-Name column.
+
+ completion-cause = "Completion-Cause" ":" cause-code SP
+ cause-name CRLF
+ cause-code = 3DIGIT
+ cause-name = *VCHAR
+
+ +------------+--------------------------+---------------------------+
+ | Cause-Code | Cause-Name | Description |
+ +------------+--------------------------+---------------------------+
+ | 000 | success | VERIFY or |
+ | | | VERIFY-FROM-BUFFER |
+ | | | request completed |
+ | | | successfully. The verify |
+ | | | decision can be |
+ | | | "accepted", "rejected", |
+ | | | or "undecided". |
+ | 001 | error | VERIFY or |
+ | | | VERIFY-FROM-BUFFER |
+ | | | request terminated |
+ | | | prematurely due to a |
+ | | | verifier resource or |
+ | | | system error. |
+ | 002 | no-input-timeout | VERIFY request completed |
+ | | | with no result due to a |
+ | | | no-input-timeout. |
+ | 003 | too-much-speech-timeout | VERIFY request completed |
+ | | | with no result due to too |
+ | | | much speech. |
+ | 004 | speech-too-early | VERIFY request completed |
+ | | | with no result due to |
+ | | | speech too soon. |
+
+ | 005 | buffer-empty | VERIFY-FROM-BUFFER |
+ | | | request completed with no |
+ | | | result due to empty |
+ | | | buffer. |
+ | 006 | out-of-sequence | Verification operation |
+ | | | failed due to |
+ | | | out-of-sequence method |
+ | | | invocations, for example, |
+ | | | calling VERIFY before |
+ | | | QUERY-VOICEPRINT. |
+ | 007 | repository-uri-failure | Failure accessing |
+ | | | Repository URI. |
+ | 008 | repository-uri-missing | Repository-URI is not |
+ | | | specified. |
+ | 009 | voiceprint-id-missing | Voiceprint-Identifier is |
+ | | | not specified. |
+ | 010 | voiceprint-id-not-exist | Voiceprint-Identifier |
+ | | | does not exist in the |
+ | | | voiceprint repository. |
+ | 011 | speech-not-usable | VERIFY request completed |
+ | | | with no result because |
+ | | | the speech was not usable |
+ | | | (too noisy, too short, |
+ | | | etc.) |
+ +------------+--------------------------+---------------------------+
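For reference, the table above can be held as a simple lookup; the dictionary and helper below are illustrative only, and the three-digit rendering follows the 'cause-code = 3DIGIT' rule.

```python
# Verifier resource Completion-Cause values from the table above;
# the dictionary and helper are illustrative, not part of the
# protocol itself.
VERIFIER_COMPLETION_CAUSES = {
    0: "success",
    1: "error",
    2: "no-input-timeout",
    3: "too-much-speech-timeout",
    4: "speech-too-early",
    5: "buffer-empty",
    6: "out-of-sequence",
    7: "repository-uri-failure",
    8: "repository-uri-missing",
    9: "voiceprint-id-missing",
    10: "voiceprint-id-not-exist",
    11: "speech-not-usable",
}

def completion_cause_header(code):
    # cause-code is always three digits (3DIGIT), e.g. "007".
    return "Completion-Cause:%03d %s" % (
        code, VERIFIER_COMPLETION_CAUSES[code])
```
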
+
+11.4.17. Completion-Reason
+
+   This header field MAY be specified in a VERIFICATION-COMPLETE event
+   coming from the verifier resource to the client.  It contains text
+   describing the reason behind the VERIFY request completion, for
+   example, the cause of a failure.
+
+ The completion reason text is provided for client use in logs and for
+ debugging and instrumentation purposes. Clients MUST NOT interpret
+ the completion reason text.
+
+ completion-reason = "Completion-Reason" ":"
+ quoted-string CRLF
+
+11.4.18. Speech-Complete-Timeout
+
+ This header field is the same as the one described for the Recognizer
+ resource. See Section 9.4.15. This header field MAY occur in
+ VERIFY, SET-PARAMS, or GET-PARAMS.
+
+11.4.19. New-Audio-Channel
+
+ This header field is the same as the one described for the Recognizer
+ resource. See Section 9.4.23. This header field MAY be specified in
+ a VERIFY request.
+
+11.4.20. Abort-Verification
+
+ This header field MUST be sent in a STOP request to indicate whether
+ or not to abort a VERIFY method in progress. A value of "true"
+ requests the server to discard the results. A value of "false"
+ requests the server to return in the STOP response the verification
+ results obtained up to the point it received the STOP request.
+
+   abort-verification = "Abort-Verification" ":" BOOLEAN CRLF
+
+11.4.21. Start-Input-Timers
+
+ This header field MAY be sent as part of a VERIFY request. A value
+ of "false" tells the verifier resource to start the VERIFY operation
+ but not to start the no-input timer yet. The verifier resource MUST
+ NOT start the timers until the client sends a START-INPUT-TIMERS
+ request to the resource. This is useful in the scenario when the
+ verifier and synthesizer resources are not part of the same session.
+ In this scenario, when a kill-on-barge-in prompt is being played, the
+ client may want the VERIFY request to be simultaneously active so
+ that it can detect and implement kill-on-barge-in (see
+ Section 8.4.2). But at the same time, the client doesn't want the
+ verifier resource to start the no-input timers until the prompt is
+ finished. The default value is "true".
+
+ start-input-timers = "Start-Input-Timers" ":"
+ BOOLEAN CRLF
+
+11.5. Verification Message Body
+
+ A verification response or event message can carry additional data as
+ described in the following subsection.
+
+11.5.1. Verification Result Data
+
+ Verification results are returned to the client in the message body
+ of the VERIFICATION-COMPLETE event or the GET-INTERMEDIATE-RESULT
+ response message as described in Section 6.3. Element and attribute
+ descriptions for the verification portion of the NLSML format are
+ provided in Section 11.5.2 with a normative definition of the schema
+ in Section 16.3.
+
+11.5.2. Verification Result Elements
+
+ All verification elements are contained within a single
+ <verification-result> element under <result>. The elements are
+ described below and have the schema defined in Section 16.2. The
+ following elements are defined:
+
+ 1. <voiceprint>
+
+ 2. <incremental>
+
+ 3. <cumulative>
+
+ 4. <decision>
+
+ 5. <utterance-length>
+
+ 6. <device>
+
+ 7. <gender>
+
+ 8. <adapted>
+
+ 9. <verification-score>
+
+ 10. <vendor-specific-results>
+
+11.5.2.1. <voiceprint> Element
+
+ This element in the verification results provides information on how
+ the speech data matched a single voiceprint. The result data
+ returned MAY have more than one such entity in the case of
+ identification or multi-verification. Each <voiceprint> element and
+ the XML data within the element describe verification result
+ information for how well the speech data matched that particular
+   voiceprint.  The list of <voiceprint> elements is ordered according
+   to their cumulative verification match scores, with the highest
+   score first.
+
+11.5.2.2. <cumulative> Element
+
+ Within each <voiceprint> element there MUST be a <cumulative> element
+ with the cumulative scores of how well multiple utterances matched
+ the voiceprint.
+
+11.5.2.3. <incremental> Element
+
+ The first <voiceprint> element MAY contain an <incremental> element
+ with the incremental scores of how well the last utterance matched
+ the voiceprint.
+
+11.5.2.4. <decision> Element
+
+ This element is found within the <incremental> or <cumulative>
+ element within the verification results. Its value indicates the
+ verification decision. It can have the values of "accepted",
+ "rejected", or "undecided".
+
+11.5.2.5. <utterance-length> Element
+
+   This element MAY occur within either the <incremental> or
+   <cumulative> elements within the first <voiceprint> element.  Its
+   value indicates the length in milliseconds of, respectively, the
+   last utterance or the accumulated set of utterances.
+
+11.5.2.6. <device> Element
+
+ This element is found within the <incremental> or <cumulative>
+ element within the verification results. Its value indicates the
+ apparent type of device used by the caller as determined by the
+ verifier resource. It can have the values of "cellular-phone",
+ "electret-phone", "carbon-button-phone", or "unknown".
+
+11.5.2.7. <gender> Element
+
+ This element is found within the <incremental> or <cumulative>
+ element within the verification results. Its value indicates the
+ apparent gender of the speaker as determined by the verifier
+ resource. It can have the values of "male", "female", or "unknown".
+
+11.5.2.8. <adapted> Element
+
+ This element is found within the first <voiceprint> element within
+ the verification results. When verification is trying to confirm the
+ voiceprint, this indicates if the voiceprint has been adapted as a
+ consequence of analyzing the source utterances. It is not returned
+ during verification training. The value can be "true" or "false".
+
+11.5.2.9. <verification-score> Element
+
+ This element is found within the <incremental> or <cumulative>
+ element within the verification results. Its value indicates the
+ score of the last utterance as determined by verification.
+
+   During verification, the higher the score, the more likely it is
+   that the speaker is the one who spoke the voiceprint training
+   utterances.  During training, the higher the score, the more likely
+   it is that the speaker spoke all of the analyzed utterances.  The
+   value is a floating point number between -1.0 and 1.0.  If there
+   are no such utterances, the score is -1.  Note that the
+   verification score is not a probability value.
+
+11.5.2.10. <vendor-specific-results> Element
+
+ MRCPv2 servers MAY send verification results that contain
+ implementation-specific data that augment the information provided by
+ the MRCPv2-defined elements. Such data might be useful to clients
+ who have private knowledge of how to interpret these schema
+ extensions. Implementation-specific additions to the verification
+ results schema MUST belong to the vendor's own namespace. In the
+ result structure, either they MUST be indicated by a namespace prefix
+ declared within the result, or they MUST be children of an element
+ identified as belonging to the respective namespace.
+
+ The following example shows the results of three voiceprints. Note
+ that the first one has crossed the verification score threshold, and
+ the speaker has been accepted. The voiceprint was also adapted with
+ the most recent utterance.
+
+ <?xml version="1.0"?>
+ <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
+ grammar="What-Grammar-URI">
+ <verification-result>
+ <voiceprint id="johnsmith">
+ <adapted> true </adapted>
+ <incremental>
+ <utterance-length> 500 </utterance-length>
+ <device> cellular-phone </device>
+ <gender> male </gender>
+ <decision> accepted </decision>
+ <verification-score> 0.98514 </verification-score>
+ </incremental>
+ <cumulative>
+ <utterance-length> 10000 </utterance-length>
+ <device> cellular-phone </device>
+ <gender> male </gender>
+ <decision> accepted </decision>
+ <verification-score> 0.96725</verification-score>
+ </cumulative>
+ </voiceprint>
+
+ <voiceprint id="marysmith">
+ <cumulative>
+ <verification-score> 0.93410 </verification-score>
+ </cumulative>
+ </voiceprint>
+     <voiceprint id="juniorsmith">
+ <cumulative>
+ <verification-score> 0.74209 </verification-score>
+ </cumulative>
+ </voiceprint>
+ </verification-result>
+ </result>
+
+ Verification Results Example 1
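A client can pull the per-voiceprint cumulative scores out of such a result with standard XML tooling. The sketch below uses Python's ElementTree; the function and constant names are assumptions.

```python
import xml.etree.ElementTree as ET

# Illustrative extraction of cumulative verification scores from an
# NLSML verification result such as Example 1 above.
MRCPV2_NS = "{urn:ietf:params:xml:ns:mrcpv2}"

def cumulative_scores(nlsml_text):
    root = ET.fromstring(nlsml_text)
    scores = []
    for vp in root.iter(MRCPV2_NS + "voiceprint"):
        node = vp.find("./%scumulative/%sverification-score"
                       % (MRCPV2_NS, MRCPV2_NS))
        if node is not None:
            scores.append((vp.get("id"), float(node.text)))
    # Per Section 11.5.2.1, voiceprints arrive ordered by cumulative
    # score, highest first.
    return scores
```
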
+
+ In this next example, the verifier has enough information to decide
+ to reject the speaker.
+
+ <?xml version="1.0"?>
+ <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
+ xmlns:xmpl="http://www.example.org/2003/12/mrcpv2"
+ grammar="What-Grammar-URI">
+ <verification-result>
+ <voiceprint id="johnsmith">
+ <incremental>
+ <utterance-length> 500 </utterance-length>
+ <device> cellular-phone </device>
+ <gender> male </gender>
+ <verification-score> 0.88514 </verification-score>
+ <xmpl:raspiness> high </xmpl:raspiness>
+ <xmpl:emotion> sadness </xmpl:emotion>
+ </incremental>
+ <cumulative>
+ <utterance-length> 10000 </utterance-length>
+ <device> cellular-phone </device>
+ <gender> male </gender>
+ <decision> rejected </decision>
+ <verification-score> 0.9345 </verification-score>
+ </cumulative>
+ </voiceprint>
+ </verification-result>
+ </result>
+
+ Verification Results Example 2
+
+11.6. START-SESSION
+
+ The START-SESSION method starts a speaker verification or speaker
+ identification session. Execution of this method places the verifier
+ resource into its initial state. If this method is called during an
+ ongoing verification session, the previous session is implicitly
+ aborted. If this method is invoked when VERIFY or VERIFY-FROM-BUFFER
+ is active, the method fails and the server returns a status-code of
+ 402.
+
+ Upon completion of the START-SESSION method, the verifier resource
+ MUST have terminated any ongoing verification session and cleared any
+ voiceprint designation.
+
+ A verification session is associated with the voiceprint repository
+ to be used during the session. This is specified through the
+ Repository-URI header field (see Section 11.4.1).
+
+ The START-SESSION method also establishes, through the Voiceprint-
+ Identifier header field, which voiceprints are to be matched or
+ trained during the verification session. If this is an
+ Identification session or if the client wants to do Multi-
+ Verification, the Voiceprint-Identifier header field contains a list
+ of semicolon-separated voiceprint identifiers.
+
+ The Adapt-Model header field MAY also be present in the START-SESSION
+ request to indicate whether or not to adapt a voiceprint based on
+ data collected during the session (if the voiceprint verification
+ phase succeeds). By default, the voiceprint model MUST NOT be
+ adapted with data from a verification session.
+
+   The START-SESSION also determines whether the session will train or
+   verify a voiceprint.  Hence, the Verification-Mode header field
+   MUST be sent in every START-SESSION request.  The value of the
+   Verification-Mode header field MUST be either "train" or "verify".
+
+ Before a verification/identification session is started, the client
+ may only request that VERIFY-ROLLBACK and generic SET-PARAMS and
+ GET-PARAMS operations be performed on the verifier resource. The
+ server MUST return status-code 402 "Method not valid in this state"
+ for all other verification operations.
+
+ A verifier resource MUST NOT have more than a single session active
+ at one time.
+
+ C->S: MRCP/2.0 ... START-SESSION 314161
+ Channel-Identifier:32AECB23433801@speakverify
+ Repository-URI:http://www.example.com/voiceprintdbase/
+         Verification-Mode:verify
+ Voiceprint-Identifier:johnsmith.voiceprint
+ Adapt-Model:true
+
+ S->C: MRCP/2.0 ... 314161 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speakverify
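Assembling such a request on the client side can be sketched as below. The message-length field of the MRCPv2 start-line (shown as "..." in the examples) is left as a literal placeholder, since computing it requires the length of the finished message; the function name and defaults are assumptions. Note that the mode is carried in the Verification-Mode header field defined in Section 11.4.3.

```python
# Sketch of building a START-SESSION request like the example above.
# The start-line's message-length (the "..." in the examples) is left
# as a literal placeholder; names and defaults are assumptions.
CRLF = "\r\n"

def start_session_request(request_id, channel_id, repository_uri,
                          voiceprint_id, mode="verify",
                          adapt_model=False):
    # Adapt-Model defaults to false: by default the voiceprint MUST
    # NOT be adapted with data from a verification session.
    lines = [
        "MRCP/2.0 ... START-SESSION %d" % request_id,
        "Channel-Identifier:%s" % channel_id,
        "Repository-URI:%s" % repository_uri,
        "Verification-Mode:%s" % mode,
        "Voiceprint-Identifier:%s" % voiceprint_id,
        "Adapt-Model:%s" % ("true" if adapt_model else "false"),
    ]
    return CRLF.join(lines) + CRLF
```
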
+
+11.7. END-SESSION
+
+ The END-SESSION method terminates an ongoing verification session and
+ releases the verification voiceprint resources. The session may
+ terminate in one of three ways:
+
+ 1. abort - the voiceprint adaptation or creation may be aborted so
+ that the voiceprint remains unchanged (or is not created).
+
+ 2. commit - when terminating a voiceprint training session, the new
+ voiceprint is committed to the repository.
+
+ 3. adapt - an existing voiceprint is modified using a successful
+ verification.
+
+ The Abort-Model header field MAY be included in the END-SESSION to
+ control whether or not to abort any pending changes to the
+ voiceprint. The default behavior is to commit (not abort) any
+ pending changes to the designated voiceprint.
+
+ The END-SESSION method may be safely executed multiple times without
+ first executing the START-SESSION method. Any additional executions
+ of this method without an intervening use of the START-SESSION method
+ have no effect on the verifier resource.
+
+ The following example assumes there is either a training session or a
+ verification session in progress.
+
+ C->S: MRCP/2.0 ... END-SESSION 314174
+ Channel-Identifier:32AECB23433801@speakverify
+ Abort-Model:true
+
+ S->C: MRCP/2.0 ... 314174 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speakverify
+
+11.8. QUERY-VOICEPRINT
+
+ The QUERY-VOICEPRINT method is used to get status information on a
+ particular voiceprint and can be used by the client to ascertain if a
+ voiceprint or repository exists and if it contains trained
+ voiceprints.
+
+ The response to the QUERY-VOICEPRINT request contains an indication
+ of the status of the designated voiceprint in the Voiceprint-Exists
+ header field, allowing the client to determine whether to use the
+ current voiceprint for verification, train a new voiceprint, or
+ choose a different voiceprint.
+
+ A voiceprint is completely specified by providing a repository
+ location and a voiceprint identifier. The particular voiceprint or
+ identity within the repository is specified by a string identifier
+ that is unique within the repository. The Voiceprint-Identifier
+ header field carries this unique voiceprint identifier within a given
+ repository.
+
+ The following example assumes a verification session is in progress
+ and the voiceprint exists in the voiceprint repository.
+
+ C->S: MRCP/2.0 ... QUERY-VOICEPRINT 314168
+ Channel-Identifier:32AECB23433801@speakverify
+ Repository-URI:http://www.example.com/voiceprints/
+ Voiceprint-Identifier:johnsmith.voiceprint
+
+ S->C: MRCP/2.0 ... 314168 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speakverify
+ Repository-URI:http://www.example.com/voiceprints/
+ Voiceprint-Identifier:johnsmith.voiceprint
+ Voiceprint-Exists:true
+
+ The following example assumes that the URI provided in the
+ Repository-URI header field is a bad URI.
+
+ C->S: MRCP/2.0 ... QUERY-VOICEPRINT 314168
+ Channel-Identifier:32AECB23433801@speakverify
+ Repository-URI:http://www.example.com/bad-uri/
+ Voiceprint-Identifier:johnsmith.voiceprint
+
+ S->C: MRCP/2.0 ... 314168 405 COMPLETE
+ Channel-Identifier:32AECB23433801@speakverify
+ Repository-URI:http://www.example.com/bad-uri/
+ Voiceprint-Identifier:johnsmith.voiceprint
+ Completion-Cause:007 repository-uri-failure
+
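A client deciding between using, training, or replacing a voiceprint needs the status code and the Voiceprint-Exists header from the response. A sketch of a response parser (hypothetical helper; it assumes the "..." message-length placeholder used in the examples above):

```python
def parse_response(text):
    """Split an MRCPv2 response into (request-id, status-code,
    request-state, headers)."""
    start_line, *rest = text.strip().splitlines()
    _version, _length, req_id, status, state = start_line.split()
    headers = dict(line.split(":", 1) for line in rest if ":" in line)
    return int(req_id), int(status), state, headers

raw = ("MRCP/2.0 ... 314168 200 COMPLETE\n"
       "Channel-Identifier:32AECB23433801@speakverify\n"
       "Voiceprint-Exists:true\n")
req_id, status, state, headers = parse_response(raw)
```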
+
+
+
+
+
+11.9. DELETE-VOICEPRINT
+
+ The DELETE-VOICEPRINT method removes a voiceprint from a repository.
+ This method MUST carry the Repository-URI and Voiceprint-Identifier
+ header fields.
+
+ An MRCPv2 server MUST reject a DELETE-VOICEPRINT request with a 401
+ status code unless the MRCPv2 client has been authenticated and
+ authorized. Note that MRCPv2 does not have a standard mechanism for
+ this. See Section 12.8.
+
+ If the corresponding voiceprint does not exist, the DELETE-VOICEPRINT
+ method MUST return a 200 status code.
+
+ The following example demonstrates a DELETE-VOICEPRINT operation to
+ remove a specific voiceprint.
+
+ C->S: MRCP/2.0 ... DELETE-VOICEPRINT 314168
+ Channel-Identifier:32AECB23433801@speakverify
+ Repository-URI:http://www.example.com/bad-uri/
+ Voiceprint-Identifier:johnsmith.voiceprint
+
+ S->C: MRCP/2.0 ... 314168 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speakverify
+
+11.10. VERIFY
+
+ The VERIFY method is used to request that the verifier resource
+ either train/adapt the voiceprint or verify/identify a claimed
+ identity. If the voiceprint is new or was deleted by a previous
+ DELETE-VOICEPRINT method, the VERIFY method trains the voiceprint.
+ If the voiceprint already exists, it is adapted and not retrained by
+ the VERIFY command.
+
+ C->S: MRCP/2.0 ... VERIFY 543260
+ Channel-Identifier:32AECB23433801@speakverify
+
+ S->C: MRCP/2.0 ... 543260 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433801@speakverify
+
+ When the VERIFY request completes, the MRCPv2 server MUST send a
+ VERIFICATION-COMPLETE event to the client.
+
+11.11. VERIFY-FROM-BUFFER
+
+ The VERIFY-FROM-BUFFER method directs the verifier resource to verify
+ buffered audio against a voiceprint. Only one VERIFY or VERIFY-FROM-
+ BUFFER method may be active for a verifier resource at a time.
+
+
+
+
+
+ The buffered audio is not consumed by this method and thus VERIFY-
+ FROM-BUFFER may be invoked multiple times by the client to attempt
+ verification against different voiceprints.
+
+   For the VERIFY-FROM-BUFFER method, the server MAY return an
+   IN-PROGRESS response before the VERIFICATION-COMPLETE event.
+
+ When the VERIFY-FROM-BUFFER method is invoked and the verification
+ buffer is in use by another resource sharing it, the server MUST
+ return an IN-PROGRESS response and wait until the buffer is available
+ to it. The verification buffer is owned by the verifier resource but
+ is shared with write access from other input resources on the same
+ session. Hence, it is considered to be in use if there is a read or
+ write operation such as a RECORD or RECOGNIZE with the
+ Ver-Buffer-Utterance header field set to "true" on a resource that
+ shares this buffer. Note that if a RECORD or RECOGNIZE method
+ returns with a failure cause code, the VERIFY-FROM-BUFFER request
+ waiting to process that buffer MUST also fail with a Completion-Cause
+ of 005 (buffer-empty).
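The buffer-sharing rules above can be sketched as a small server-side decision function. The function name and string outcomes are illustrative only; a real server tracks which resource currently holds the shared verification buffer.

```python
def verify_from_buffer_outcome(buffer_in_use, writer_failed):
    """Outcome of a VERIFY-FROM-BUFFER request under the sharing rules
    described above (a sketch, not a protocol implementation)."""
    if buffer_in_use:
        # The server answers IN-PROGRESS and waits for the buffer to free up.
        if writer_failed:
            # The RECORD/RECOGNIZE writing into the buffer failed, so the
            # waiting request fails with Completion-Cause 005 (buffer-empty).
            return "VERIFICATION-COMPLETE, Completion-Cause 005 buffer-empty"
        return "IN-PROGRESS, then VERIFICATION-COMPLETE"
    return "VERIFICATION-COMPLETE with verification results"
```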
+
+   The following example illustrates the use of the buffering methods.
+   In this scenario, the client first performs a live verification, but
+   the utterance is rejected.  The utterance is also saved to the audio
+   buffer.  Another voiceprint is then verified against the buffered
+   audio, and this time the utterance is accepted.  For the example, we
+   assume both Num-Min-Verification-Phrases and
+   Num-Max-Verification-Phrases are 1.
+
+ C->S: MRCP/2.0 ... START-SESSION 314161
+ Channel-Identifier:32AECB23433801@speakverify
+ Verification-Mode:verify
+ Adapt-Model:true
+ Repository-URI:http://www.example.com/voiceprints
+ Voiceprint-Identifier:johnsmith.voiceprint
+
+ S->C: MRCP/2.0 ... 314161 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speakverify
+
+ C->S: MRCP/2.0 ... VERIFY 314162
+ Channel-Identifier:32AECB23433801@speakverify
+         Ver-Buffer-Utterance:true
+
+ S->C: MRCP/2.0 ... 314162 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433801@speakverify
+
+
+
+
+
+
+
+
+
+ S->C: MRCP/2.0 ... VERIFICATION-COMPLETE 314162 COMPLETE
+ Channel-Identifier:32AECB23433801@speakverify
+ Completion-Cause:000 success
+ Content-Type:application/nlsml+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
+ grammar="What-Grammar-URI">
+ <verification-result>
+ <voiceprint id="johnsmith">
+ <incremental>
+ <utterance-length> 500 </utterance-length>
+ <device> cellular-phone </device>
+ <gender> female </gender>
+ <decision> rejected </decision>
+ <verification-score> 0.05465 </verification-score>
+ </incremental>
+ <cumulative>
+ <utterance-length> 500 </utterance-length>
+ <device> cellular-phone </device>
+ <gender> female </gender>
+ <decision> rejected </decision>
+ <verification-score> 0.05465 </verification-score>
+ </cumulative>
+ </voiceprint>
+ </verification-result>
+ </result>
+
+ C->S: MRCP/2.0 ... QUERY-VOICEPRINT 314163
+ Channel-Identifier:32AECB23433801@speakverify
+ Repository-URI:http://www.example.com/voiceprints/
+         Voiceprint-Identifier:johnsmith.voiceprint
+
+ S->C: MRCP/2.0 ... 314163 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speakverify
+ Repository-URI:http://www.example.com/voiceprints/
+ Voiceprint-Identifier:johnsmith.voiceprint
+ Voiceprint-Exists:true
+
+ C->S: MRCP/2.0 ... START-SESSION 314164
+ Channel-Identifier:32AECB23433801@speakverify
+ Verification-Mode:verify
+ Adapt-Model:true
+ Repository-URI:http://www.example.com/voiceprints
+ Voiceprint-Identifier:marysmith.voiceprint
+
+
+
+
+
+
+
+ S->C: MRCP/2.0 ... 314164 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speakverify
+
+ C->S: MRCP/2.0 ... VERIFY-FROM-BUFFER 314165
+ Channel-Identifier:32AECB23433801@speakverify
+
+ S->C: MRCP/2.0 ... 314165 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433801@speakverify
+
+ S->C: MRCP/2.0 ... VERIFICATION-COMPLETE 314165 COMPLETE
+ Channel-Identifier:32AECB23433801@speakverify
+ Completion-Cause:000 success
+ Content-Type:application/nlsml+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
+ grammar="What-Grammar-URI">
+ <verification-result>
+ <voiceprint id="marysmith">
+ <incremental>
+ <utterance-length> 1000 </utterance-length>
+ <device> cellular-phone </device>
+ <gender> female </gender>
+ <decision> accepted </decision>
+ <verification-score> 0.98 </verification-score>
+ </incremental>
+ <cumulative>
+ <utterance-length> 1000 </utterance-length>
+ <device> cellular-phone </device>
+ <gender> female </gender>
+ <decision> accepted </decision>
+ <verification-score> 0.98 </verification-score>
+ </cumulative>
+ </voiceprint>
+ </verification-result>
+ </result>
+
+
+ C->S: MRCP/2.0 ... END-SESSION 314166
+ Channel-Identifier:32AECB23433801@speakverify
+
+ S->C: MRCP/2.0 ... 314166 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speakverify
+
+ VERIFY-FROM-BUFFER Example
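The NLSML bodies in the example above carry the data a client actually acts on: the cumulative decision and score. A sketch of extracting them with a stock XML parser, run here against a trimmed copy of the example body (`cumulative_verdict` is a hypothetical helper name):

```python
import xml.etree.ElementTree as ET

NS = "{urn:ietf:params:xml:ns:mrcpv2}"

def cumulative_verdict(nlsml):
    """Extract the cumulative <decision> and <verification-score> from an
    NLSML verification result like those shown above."""
    root = ET.fromstring(nlsml)
    cum = root.find(".//%svoiceprint/%scumulative" % (NS, NS))
    decision = cum.find(NS + "decision").text.strip()
    score = float(cum.find(NS + "verification-score").text)
    return decision, score

body = """<?xml version="1.0"?>
<result xmlns="urn:ietf:params:xml:ns:mrcpv2" grammar="What-Grammar-URI">
  <verification-result>
    <voiceprint id="marysmith">
      <cumulative>
        <decision> accepted </decision>
        <verification-score> 0.98 </verification-score>
      </cumulative>
    </voiceprint>
  </verification-result>
</result>"""
```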
+
+
+
+
+
+
+
+11.12. VERIFY-ROLLBACK
+
+   The VERIFY-ROLLBACK method discards the last buffered utterance or
+   the last live utterance (when the mode is "train" or "verify").  The
+   client will likely want to invoke this method when
+ the user provides undesirable input such as non-speech noises, side-
+ speech, out-of-grammar utterances, commands, etc. Note that this
+ method does not provide a stack of rollback states. Executing
+ VERIFY-ROLLBACK twice in succession without an intervening
+ recognition operation has no effect on the second attempt.
+
+ C->S: MRCP/2.0 ... VERIFY-ROLLBACK 314165
+ Channel-Identifier:32AECB23433801@speakverify
+
+ S->C: MRCP/2.0 ... 314165 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speakverify
+
+ VERIFY-ROLLBACK Example
+
+11.13. STOP
+
+ The STOP method from the client to the server tells the verifier
+ resource to stop the VERIFY or VERIFY-FROM-BUFFER request if one is
+ active. If such a request is active and the STOP request
+ successfully terminated it, then the response header section contains
+ an Active-Request-Id-List header field containing the request-id of
+ the VERIFY or VERIFY-FROM-BUFFER request that was terminated. In
+ this case, no VERIFICATION-COMPLETE event is sent for the terminated
+ request. If there was no verify request active, then the response
+ MUST NOT contain an Active-Request-Id-List header field. Either way,
+ the response MUST contain a status-code of 200 "Success".
+
+ The STOP method can carry an Abort-Verification header field, which
+ specifies if the verification result until that point should be
+ discarded or returned. If this header field is not present or if the
+ value is "true", the verification result is discarded and the STOP
+ response does not contain any result data. If the header field is
+ present and its value is "false", the STOP response MUST contain a
+ Completion-Cause header field and carry the Verification result data
+ in its body.
+
+ An aborted VERIFY request does an automatic rollback and hence does
+ not affect the cumulative score. A VERIFY request that was stopped
+ with no Abort-Verification header field or with the Abort-
+ Verification header field set to "false" does affect cumulative
+ scores and would need to be explicitly rolled back if the client does
+ not want the verification result considered in the cumulative scores.
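The Abort-Verification rules above reduce to a simple mapping: an absent header or a value of "true" discards the result (with an automatic rollback), while "false" returns the result and lets it count toward the cumulative scores. A sketch with an illustrative helper name:

```python
def stop_disposition(abort_verification=None):
    """Effect of a STOP request given the Abort-Verification header value,
    or None when the header is absent.  Absent or "true" means discard."""
    discard = abort_verification is None or abort_verification == "true"
    return {
        # Does the STOP response body carry result data?
        "results_in_stop_response": not discard,
        # Discarded results are auto-rolled-back and never affect
        # the cumulative scores.
        "affects_cumulative_scores": not discard,
    }
```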
+
+
+
+
+
+
+ The following example assumes a voiceprint identity has already been
+ established.
+
+ C->S: MRCP/2.0 ... VERIFY 314177
+ Channel-Identifier:32AECB23433801@speakverify
+
+ S->C: MRCP/2.0 ... 314177 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433801@speakverify
+
+ C->S: MRCP/2.0 ... STOP 314178
+ Channel-Identifier:32AECB23433801@speakverify
+
+ S->C: MRCP/2.0 ... 314178 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speakverify
+ Active-Request-Id-List:314177
+
+ STOP Verification Example
+
+11.14. START-INPUT-TIMERS
+
+ This request is sent from the client to the verifier resource to
+ start the no-input timer, usually once the client has ascertained
+ that any audio prompts to the user have played to completion.
+
+ C->S: MRCP/2.0 ... START-INPUT-TIMERS 543260
+ Channel-Identifier:32AECB23433801@speakverify
+
+ S->C: MRCP/2.0 ... 543260 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speakverify
+
+11.15. VERIFICATION-COMPLETE
+
+ The VERIFICATION-COMPLETE event follows a call to VERIFY or VERIFY-
+ FROM-BUFFER and is used to communicate the verification results to
+ the client. The event message body contains only verification
+ results.
+
+ S->C: MRCP/2.0 ... VERIFICATION-COMPLETE 543259 COMPLETE
+ Completion-Cause:000 success
+ Content-Type:application/nlsml+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
+ grammar="What-Grammar-URI">
+ <verification-result>
+ <voiceprint id="johnsmith">
+
+
+
+
+
+
+ <incremental>
+ <utterance-length> 500 </utterance-length>
+ <device> cellular-phone </device>
+ <gender> male </gender>
+ <decision> accepted </decision>
+ <verification-score> 0.85 </verification-score>
+ </incremental>
+ <cumulative>
+ <utterance-length> 1500 </utterance-length>
+ <device> cellular-phone </device>
+ <gender> male </gender>
+ <decision> accepted </decision>
+ <verification-score> 0.75 </verification-score>
+ </cumulative>
+ </voiceprint>
+ </verification-result>
+ </result>
+
+11.16. START-OF-INPUT
+
+ The START-OF-INPUT event is returned from the server to the client
+ once the server has detected speech. This event is always returned
+ by the verifier resource when speech has been detected, irrespective
+ of whether or not the recognizer and verifier resources share the
+ same session.
+
+ S->C: MRCP/2.0 ... START-OF-INPUT 543259 IN-PROGRESS
+ Channel-Identifier:32AECB23433801@speakverify
+
+11.17. CLEAR-BUFFER
+
+ The CLEAR-BUFFER method can be used to clear the verification buffer.
+ This buffer is used to buffer speech during recognition, record, or
+ verification operations that may later be used by VERIFY-FROM-BUFFER.
+ As noted before, the buffer associated with the verifier resource is
+ shared by other input resources like recognizers and recorders.
+ Hence, a CLEAR-BUFFER request fails if the verification buffer is in
+ use. This can happen when any one of the input resources that share
+ this buffer has an active read or write operation such as RECORD,
+ RECOGNIZE, or VERIFY with the Ver-Buffer-Utterance header field set
+ to "true".
+
+ C->S: MRCP/2.0 ... CLEAR-BUFFER 543260
+ Channel-Identifier:32AECB23433801@speakverify
+
+ S->C: MRCP/2.0 ... 543260 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speakverify
+
+
+
+
+
+
+11.18. GET-INTERMEDIATE-RESULT
+
+ A client can use the GET-INTERMEDIATE-RESULT method to poll for
+ intermediate results of a verification request that is in progress.
+ Invoking this method does not change the state of the resource. The
+ verifier resource collects the accumulated verification results and
+ returns the information in the method response. The message body in
+   the response to a GET-INTERMEDIATE-RESULT request contains only
+ verification results. The method response MUST NOT contain a
+ Completion-Cause header field as the request is not yet complete. If
+ the resource does not have a verification in progress, the response
+ has a 402 failure status-code and no result in the body.
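The server-side response rules just described can be sketched as follows. This is illustrative only: the request-id is passed in, the "..." stands in for message-length as in the examples, and note that neither branch emits a Completion-Cause header.

```python
def intermediate_result_response(request_id, verification_active, results_xml):
    """Sketch of GET-INTERMEDIATE-RESULT handling: 200 with accumulated
    results while a verification is in progress, 402 with no body
    otherwise, and never a Completion-Cause header."""
    if not verification_active:
        return "MRCP/2.0 ... %d 402 COMPLETE\r\n\r\n" % request_id
    return ("MRCP/2.0 ... %d 200 COMPLETE\r\n"
            "Content-Type:application/nlsml+xml\r\n"
            "Content-Length:%d\r\n\r\n%s"
            % (request_id, len(results_xml), results_xml))
```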
+
+ C->S: MRCP/2.0 ... GET-INTERMEDIATE-RESULT 543260
+ Channel-Identifier:32AECB23433801@speakverify
+
+ S->C: MRCP/2.0 ... 543260 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speakverify
+ Content-Type:application/nlsml+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
+ grammar="What-Grammar-URI">
+ <verification-result>
+ <voiceprint id="marysmith">
+ <incremental>
+ <utterance-length> 50 </utterance-length>
+ <device> cellular-phone </device>
+ <gender> female </gender>
+ <decision> undecided </decision>
+ <verification-score> 0.85 </verification-score>
+ </incremental>
+ <cumulative>
+ <utterance-length> 150 </utterance-length>
+ <device> cellular-phone </device>
+ <gender> female </gender>
+ <decision> undecided </decision>
+ <verification-score> 0.65 </verification-score>
+ </cumulative>
+ </voiceprint>
+ </verification-result>
+ </result>
+
+
+
+
+
+
+
+
+
+
+12. Security Considerations
+
+ MRCPv2 is designed to comply with the security-related requirements
+ documented in the SPEECHSC requirements [RFC4313]. Implementers and
+ users of MRCPv2 are strongly encouraged to read the Security
+   Considerations section of [RFC4313], because that document discusses
+   a number of important security issues associated with the use of
+   speech as a biometric authentication technology, as well as the
+   threats against systems that store recorded speech, contain large
+   corpora of voiceprints, and send and receive sensitive information
+   based on voice input to a recognizer or speech output from a
+   synthesizer.  Specific security measures employed by MRCPv2
+ are summarized in the following subsections. See the corresponding
+ sections of this specification for how the security-related machinery
+ is invoked by individual protocol operations.
+
+12.1. Rendezvous and Session Establishment
+
+ MRCPv2 control sessions are established as media sessions described
+ by SDP within the context of a SIP dialog. In order to ensure secure
+ rendezvous between MRCPv2 clients and servers, the following are
+ required:
+
+ 1. The SIP implementation in MRCPv2 clients and servers MUST support
+ SIP digest authentication [RFC3261] and SHOULD employ it.
+
+ 2. The SIP implementation in MRCPv2 clients and servers MUST support
+ 'sips' URIs and SHOULD employ 'sips' URIs; this includes that
+ clients and servers SHOULD set up TLS [RFC5246] connections.
+
+   3. If media stream cryptographic keying is done through SDP (e.g.,
+ using [RFC4568]), the MRCPv2 clients and servers MUST employ the
+ 'sips' URI.
+
+ 4. When TLS is used for SIP, the client MUST verify the identity of
+ the server to which it connects, following the rules and
+ guidelines defined in [RFC5922].
+
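The client-side TLS requirements above (validate the server's certificate chain and verify its identity against the host name) map directly onto a standard-library TLS context. A sketch; the TLSv1.2 floor is an assumption of this example, not an MRCPv2 requirement:

```python
import ssl

def mrcp_client_tls_context():
    """TLS client context matching the rendezvous rules above: the server
    certificate is validated and its identity checked against the host
    name.  (These are already the defaults of create_default_context;
    they are made explicit here for emphasis.)"""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    ctx.check_hostname = True               # verify server identity
    ctx.verify_mode = ssl.CERT_REQUIRED     # reject unauthenticated servers
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # assumed modern floor
    return ctx
```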
+12.2. Control Channel Protection
+
+ Sensitive data is carried over the MRCPv2 control channel. This
+ includes things like the output of speech recognition operations,
+ speaker verification results, input to text-to-speech conversion,
+ personally identifying grammars, etc. For this reason, MRCPv2
+ servers must be properly authenticated, and the control channel must
+ permit the use of both confidentiality and integrity for the data.
+ To ensure control channel protection, MRCPv2 clients and servers MUST
+ support TLS and SHOULD utilize it by default unless alternative
+
+
+
+
+
+ control channel protection is used. When TLS is used, the client
+ MUST verify the identity of the server to which it connects,
+ following the rules and guidelines defined in [RFC4572]. If there
+ are multiple TLS-protected channels between the client and the
+ server, the server MUST NOT send a response to the client over a
+ channel for which the TLS identities of the server or client differ
+ from the channel over which the server received the corresponding
+ request. Alternative control-channel protection MAY be used if
+ desired (e.g., Security Architecture for the Internet Protocol
+ (IPsec) [RFC4301]).
+
+12.3. Media Session Protection
+
+ Sensitive data is also carried on media sessions terminating on
+ MRCPv2 servers (the other end of a media channel may or may not be on
+ the MRCPv2 client). This data includes the user's spoken utterances
+ and the output of text-to-speech operations. MRCPv2 servers MUST
+ support a security mechanism for protection of audio media sessions.
+ MRCPv2 clients that originate or consume audio similarly MUST support
+ a security mechanism for protection of the audio. One such mechanism
+ is the Secure Real-time Transport Protocol (SRTP) [RFC3711].
+
+12.4. Indirect Content Access
+
+   MRCPv2 employs content indirection extensively.  Content may be
+ fetched and/or stored based on URI addressing on systems other than
+ the MRCPv2 client or server. Not all of the stored content is
+ necessarily sensitive (e.g., XML schemas), but the majority generally
+ needs protection, and some indirect content, such as voice recordings
+ and voiceprints, is extremely sensitive and must always be protected.
+ MRCPv2 clients and servers MUST implement HTTPS for indirect content
+ access and SHOULD employ secure access for all sensitive indirect
+ content. Other secure URI schemes such as Secure FTP (FTPS)
+ [RFC4217] MAY also be used. See Section 6.2.15 for the header fields
+ used to transfer cookie information between the MRCPv2 client and
+ server if needed for authentication.
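A client or server following the guidance above would screen indirect-content URIs for a confidentiality-providing scheme before fetching. A minimal sketch (the set of accepted schemes reflects the HTTPS/FTPS examples above and is an assumption, not an exhaustive list):

```python
from urllib.parse import urlparse

SECURE_SCHEMES = {"https", "ftps"}   # per the guidance above

def is_secure_indirect_uri(uri):
    """True if an indirect-content URI uses a scheme providing
    confidentiality and integrity protection."""
    return urlparse(uri).scheme.lower() in SECURE_SCHEMES
```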
+
+ Access to URIs provided by servers introduces risks that need to be
+   considered.  Although RFC 6454 [RFC6454] focuses on the same-origin
+   policy, to which MRCPv2 does not restrict URIs, Section 3 of that
+   RFC still provides an excellent description of the pitfalls of
+   blindly following server-provided URIs.  Servers also
+ need to be aware that clients could provide URIs to sites designed to
+ tie up the server in long or otherwise problematic document fetches.
+ MRCPv2 servers, and the services they access, MUST always be prepared
+ for the possibility of such a denial-of-service attack.
+
+
+
+
+
+
+
+ MRCPv2 makes no inherent assumptions about the lifetime and access
+ controls associated with a URI. For example, if neither
+ authentication nor scheme-specific access controls are used, a leak
+ of the URI is equivalent to a leak of the content. Moreover, MRCPv2
+ makes no specific demands on the lifetime of a URI. If a server
+   offers a URI and the client takes a long time to access that URI,
+   the server may have removed the resource in the interim.  MRCPv2
+   deals with this case by using the URI access scheme's
+ 'resource not found' error, such as 404 for HTTPS. How long a server
+ should keep a dynamic resource available is highly application and
+ context dependent. However, the server SHOULD keep the resource
+   available for a reasonable amount of time so that the client is
+   likely to find it available when needed.  Conversely, to mitigate
+   state-exhaustion attacks, MRCPv2
+ servers are not obligated to keep resources and resource state in
+ perpetuity. The server SHOULD delete dynamically generated resources
+ associated with an MRCPv2 session when the session ends.
+
+ One method to avoid resource leakage is for the server to use
+ difficult-to-guess, one-time resource URIs. In this instance, there
+ can be only a single access to the underlying resource using the
+   given URI.  A downside to this approach is that if an attacker uses
+   the URI before the client does, the client is denied the resource.
+   Another method would be to adopt a mechanism similar to the
+ URLAUTH IMAP extension [RFC4467], where the server sets cryptographic
+ checks on URI usage, as well as capabilities for expiration,
+ revocation, and so on. Specifying such a mechanism is beyond the
+ scope of this document.
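The difficult-to-guess, one-time URI idea above can be sketched with a cryptographically random token and a single-use registry. The class, host name, and path prefix are all illustrative:

```python
import secrets

class OneTimeResources:
    """Single-use, hard-to-guess resource URIs, as suggested above."""

    def __init__(self):
        self._live = {}

    def publish(self, content):
        token = secrets.token_urlsafe(32)        # ~256 bits of entropy
        self._live[token] = content
        return "https://media.example.com/once/" + token

    def fetch(self, uri):
        token = uri.rsplit("/", 1)[-1]
        # pop() makes the URI one-time: a second access returns None,
        # which the server would surface as a 404.
        return self._live.pop(token, None)
```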
+
+12.5. Protection of Stored Media
+
+ MRCPv2 applications often require the use of stored media. Voice
+ recordings are both stored (e.g., for diagnosis and system tuning),
+ and fetched (for replaying utterances into multiple MRCPv2
+ resources). Voiceprints are fundamental to the speaker
+ identification and verification functions. This data can be
+ extremely sensitive and can present substantial privacy and
+ impersonation risks if stolen. Systems employing MRCPv2 SHOULD be
+ deployed in ways that minimize these risks. The SPEECHSC
+ requirements RFC [RFC4313] contains a more extensive discussion of
+ these risks and ways they may be mitigated.
+
+
+
+
+
+
+
+
+
+
+
+
+12.6. DTMF and Recognition Buffers
+
+ DTMF buffers and recognition buffers may grow large enough to exceed
+ the capabilities of a server, and the server MUST be prepared to
+   gracefully handle resource consumption.  A server MAY respond with
+   an appropriate recognition-incomplete status if it is in danger of
+   running out of resources.
+
+12.7. Client-Set Server Parameters
+
+ In MRCPv2, there are some tasks, such as URI resource fetches, that
+ the server does on behalf of the client. To control this behavior,
+ MRCPv2 has a number of server parameters that a client can configure.
+ With one such parameter, Fetch-Timeout (Section 6.2.12), a malicious
+ client could set a very large value and then request the server to
+ fetch a non-existent document. It is RECOMMENDED that servers be
+ cautious about accepting long timeout values or abnormally large
+ values for other client-set parameters.
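One common way for a server to exercise the caution recommended above is to clamp client-supplied values to its own policy ceiling. A sketch; the 30-second maximum is an illustrative server policy, not a value from this specification:

```python
MAX_FETCH_TIMEOUT_MS = 30_000   # illustrative server ceiling, not from the spec

def accept_fetch_timeout(requested_ms):
    """Clamp a client-supplied Fetch-Timeout to the server's own policy."""
    return max(0, min(requested_ms, MAX_FETCH_TIMEOUT_MS))
```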
+
+12.8. DELETE-VOICEPRINT and Authorization
+
+ Since this specification does not mandate a specific mechanism for
+ authentication and authorization when requesting DELETE-VOICEPRINT
+ (Section 11.9), there is a risk that an MRCPv2 server may not do such
+ a check for authentication and authorization. In practice, each
+ provider of voice biometric solutions does insist on its own
+ authentication and authorization mechanism, outside of this
+ specification, so this is not likely to be a major problem. If in
+ the future voice biometric providers standardize on such a mechanism,
+ then a future version of MRCP can mandate it.
+
+13. IANA Considerations
+
+13.1. New Registries
+
+ This section describes the name spaces (registries) for MRCPv2 that
+ IANA has created and now maintains. Assignment/registration policies
+ are described in RFC 5226 [RFC5226].
+
+13.1.1. MRCPv2 Resource Types
+
+ IANA has created a new name space of "MRCPv2 Resource Types". All
+ maintenance within and additions to the contents of this name space
+ MUST be according to the "Standards Action" registration policy. The
+ initial contents of the registry, defined in Section 4.2, are given
+ below:
+
+
+
+
+
+
+
+ Resource type Resource description Reference
+ ------------- -------------------- ---------
+ speechrecog Speech Recognizer [RFC6787]
+ dtmfrecog DTMF Recognizer [RFC6787]
+ speechsynth Speech Synthesizer [RFC6787]
+ basicsynth Basic Synthesizer [RFC6787]
+ speakverify Speaker Verifier [RFC6787]
+ recorder Speech Recorder [RFC6787]
+
+13.1.2. MRCPv2 Methods and Events
+
+ IANA has created a new name space of "MRCPv2 Methods and Events".
+ All maintenance within and additions to the contents of this name
+ space MUST be according to the "Standards Action" registration
+ policy. The initial contents of the registry, defined by the
+ "method-name" and "event-name" BNF in Section 15 and explained in
+ Sections 5.2 and 5.5, are given below.
+
+ Name Resource type Method/Event Reference
+ ---- ------------- ------------ ---------
+ SET-PARAMS Generic Method [RFC6787]
+ GET-PARAMS Generic Method [RFC6787]
+ SPEAK Synthesizer Method [RFC6787]
+ STOP Synthesizer Method [RFC6787]
+ PAUSE Synthesizer Method [RFC6787]
+ RESUME Synthesizer Method [RFC6787]
+ BARGE-IN-OCCURRED Synthesizer Method [RFC6787]
+ CONTROL Synthesizer Method [RFC6787]
+ DEFINE-LEXICON Synthesizer Method [RFC6787]
+ DEFINE-GRAMMAR Recognizer Method [RFC6787]
+ RECOGNIZE Recognizer Method [RFC6787]
+ INTERPRET Recognizer Method [RFC6787]
+ GET-RESULT Recognizer Method [RFC6787]
+ START-INPUT-TIMERS Recognizer Method [RFC6787]
+ STOP Recognizer Method [RFC6787]
+ START-PHRASE-ENROLLMENT Recognizer Method [RFC6787]
+ ENROLLMENT-ROLLBACK Recognizer Method [RFC6787]
+ END-PHRASE-ENROLLMENT Recognizer Method [RFC6787]
+ MODIFY-PHRASE Recognizer Method [RFC6787]
+ DELETE-PHRASE Recognizer Method [RFC6787]
+ RECORD Recorder Method [RFC6787]
+ STOP Recorder Method [RFC6787]
+ START-INPUT-TIMERS Recorder Method [RFC6787]
+ START-SESSION Verifier Method [RFC6787]
+ END-SESSION Verifier Method [RFC6787]
+ QUERY-VOICEPRINT Verifier Method [RFC6787]
+ DELETE-VOICEPRINT Verifier Method [RFC6787]
+ VERIFY Verifier Method [RFC6787]
+
+
+
+
+
+ VERIFY-FROM-BUFFER Verifier Method [RFC6787]
+ VERIFY-ROLLBACK Verifier Method [RFC6787]
+ STOP Verifier Method [RFC6787]
+ START-INPUT-TIMERS Verifier Method [RFC6787]
+ GET-INTERMEDIATE-RESULT Verifier Method [RFC6787]
+ SPEECH-MARKER Synthesizer Event [RFC6787]
+ SPEAK-COMPLETE Synthesizer Event [RFC6787]
+ START-OF-INPUT Recognizer Event [RFC6787]
+ RECOGNITION-COMPLETE Recognizer Event [RFC6787]
+ INTERPRETATION-COMPLETE Recognizer Event [RFC6787]
+ START-OF-INPUT Recorder Event [RFC6787]
+ RECORD-COMPLETE Recorder Event [RFC6787]
+ VERIFICATION-COMPLETE Verifier Event [RFC6787]
+ START-OF-INPUT Verifier Event [RFC6787]
+
+13.1.3. MRCPv2 Header Fields
+
+ IANA has created a new name space of "MRCPv2 Header Fields". All
+ maintenance within and additions to the contents of this name space
+ MUST be according to the "Standards Action" registration policy. The
+ initial contents of the registry, defined by the "message-header" BNF
+ in Section 15 and explained in Section 5.1, are given below. Note
+ that the values permitted for the "Vendor-Specific-Parameters"
+ parameter are managed according to a different policy. See
+ Section 13.1.6.
+
+ Name Resource type Reference
+ ---- ------------- ---------
+ Channel-Identifier Generic [RFC6787]
+ Accept Generic [RFC2616]
+ Active-Request-Id-List Generic [RFC6787]
+ Proxy-Sync-Id Generic [RFC6787]
+ Accept-Charset Generic [RFC2616]
+ Content-Type Generic [RFC6787]
+ Content-ID Generic
+ [RFC2392], [RFC2046], and [RFC5322]
+ Content-Base Generic [RFC6787]
+ Content-Encoding Generic [RFC6787]
+ Content-Location Generic [RFC6787]
+ Content-Length Generic [RFC6787]
+ Fetch-Timeout Generic [RFC6787]
+ Cache-Control Generic [RFC6787]
+ Logging-Tag Generic [RFC6787]
+ Set-Cookie Generic [RFC6787]
+ Vendor-Specific Generic [RFC6787]
+ Jump-Size Synthesizer [RFC6787]
+ Kill-On-Barge-In Synthesizer [RFC6787]
+ Speaker-Profile Synthesizer [RFC6787]
+
+
+
+
+
+ Completion-Cause Synthesizer [RFC6787]
+ Completion-Reason Synthesizer [RFC6787]
+ Voice-Parameter Synthesizer [RFC6787]
+ Prosody-Parameter Synthesizer [RFC6787]
+ Speech-Marker Synthesizer [RFC6787]
+ Speech-Language Synthesizer [RFC6787]
+ Fetch-Hint Synthesizer [RFC6787]
+ Audio-Fetch-Hint Synthesizer [RFC6787]
+ Failed-URI Synthesizer [RFC6787]
+ Failed-URI-Cause Synthesizer [RFC6787]
+ Speak-Restart Synthesizer [RFC6787]
+ Speak-Length Synthesizer [RFC6787]
+ Load-Lexicon Synthesizer [RFC6787]
+ Lexicon-Search-Order Synthesizer [RFC6787]
+ Confidence-Threshold Recognizer [RFC6787]
+ Sensitivity-Level Recognizer [RFC6787]
+ Speed-Vs-Accuracy Recognizer [RFC6787]
+ N-Best-List-Length Recognizer [RFC6787]
+ Input-Type Recognizer [RFC6787]
+ No-Input-Timeout Recognizer [RFC6787]
+ Recognition-Timeout Recognizer [RFC6787]
+ Waveform-URI Recognizer [RFC6787]
+ Input-Waveform-URI Recognizer [RFC6787]
+ Completion-Cause Recognizer [RFC6787]
+ Completion-Reason Recognizer [RFC6787]
+ Recognizer-Context-Block Recognizer [RFC6787]
+ Start-Input-Timers Recognizer [RFC6787]
+ Speech-Complete-Timeout Recognizer [RFC6787]
+ Speech-Incomplete-Timeout Recognizer [RFC6787]
+ Dtmf-Interdigit-Timeout Recognizer [RFC6787]
+ Dtmf-Term-Timeout Recognizer [RFC6787]
+ Dtmf-Term-Char Recognizer [RFC6787]
+ Failed-URI Recognizer [RFC6787]
+ Failed-URI-Cause Recognizer [RFC6787]
+ Save-Waveform Recognizer [RFC6787]
+ Media-Type Recognizer [RFC6787]
+ New-Audio-Channel Recognizer [RFC6787]
+ Speech-Language Recognizer [RFC6787]
+ Ver-Buffer-Utterance Recognizer [RFC6787]
+ Recognition-Mode Recognizer [RFC6787]
+ Cancel-If-Queue Recognizer [RFC6787]
+ Hotword-Max-Duration Recognizer [RFC6787]
+ Hotword-Min-Duration Recognizer [RFC6787]
+ Interpret-Text Recognizer [RFC6787]
+ Dtmf-Buffer-Time Recognizer [RFC6787]
+ Clear-Dtmf-Buffer Recognizer [RFC6787]
+ Early-No-Match Recognizer [RFC6787]
+ Num-Min-Consistent-Pronunciations Recognizer [RFC6787]
+
+
+
+
+
+ Consistency-Threshold Recognizer [RFC6787]
+ Clash-Threshold Recognizer [RFC6787]
+ Personal-Grammar-URI Recognizer [RFC6787]
+ Enroll-Utterance Recognizer [RFC6787]
+ Phrase-ID Recognizer [RFC6787]
+ Phrase-NL Recognizer [RFC6787]
+ Weight Recognizer [RFC6787]
+ Save-Best-Waveform Recognizer [RFC6787]
+ New-Phrase-ID Recognizer [RFC6787]
+ Confusable-Phrases-URI Recognizer [RFC6787]
+ Abort-Phrase-Enrollment Recognizer [RFC6787]
+ Sensitivity-Level Recorder [RFC6787]
+ No-Input-Timeout Recorder [RFC6787]
+ Completion-Cause Recorder [RFC6787]
+ Completion-Reason Recorder [RFC6787]
+ Failed-URI Recorder [RFC6787]
+ Failed-URI-Cause Recorder [RFC6787]
+ Record-URI Recorder [RFC6787]
+ Media-Type Recorder [RFC6787]
+ Max-Time Recorder [RFC6787]
+ Trim-Length Recorder [RFC6787]
+ Final-Silence Recorder [RFC6787]
+ Capture-On-Speech Recorder [RFC6787]
+ Ver-Buffer-Utterance Recorder [RFC6787]
+ Start-Input-Timers Recorder [RFC6787]
+ New-Audio-Channel Recorder [RFC6787]
+ Repository-URI Verifier [RFC6787]
+ Voiceprint-Identifier Verifier [RFC6787]
+ Verification-Mode Verifier [RFC6787]
+ Adapt-Model Verifier [RFC6787]
+ Abort-Model Verifier [RFC6787]
+ Min-Verification-Score Verifier [RFC6787]
+ Num-Min-Verification-Phrases Verifier [RFC6787]
+ Num-Max-Verification-Phrases Verifier [RFC6787]
+ No-Input-Timeout Verifier [RFC6787]
+ Save-Waveform Verifier [RFC6787]
+ Media-Type Verifier [RFC6787]
+ Waveform-URI Verifier [RFC6787]
+ Voiceprint-Exists Verifier [RFC6787]
+ Ver-Buffer-Utterance Verifier [RFC6787]
+ Input-Waveform-URI Verifier [RFC6787]
+ Completion-Cause Verifier [RFC6787]
+ Completion-Reason Verifier [RFC6787]
+ Speech-Complete-Timeout Verifier [RFC6787]
+ New-Audio-Channel Verifier [RFC6787]
+ Abort-Verification Verifier [RFC6787]
+ Start-Input-Timers Verifier [RFC6787]
+ Input-Type Verifier [RFC6787]
+
+
+
+
+
+13.1.4. MRCPv2 Status Codes
+
+ IANA has created a new name space of "MRCPv2 Status Codes" with the
+ initial values that are defined in Section 5.4. All maintenance
+ within and additions to the contents of this name space MUST be
+ according to the "Specification Required with Expert Review"
+ registration policy.
+
+13.1.5. Grammar Reference List Parameters
+
+ IANA has created a new name space of "Grammar Reference List
+ Parameters". All maintenance within and additions to the contents of
+ this name space MUST be according to the "Specification Required with
+ Expert Review" registration policy. There is only one initial
+ parameter as shown below.
+
+ Name Reference
+ ---- -------------
+ weight [RFC6787]
+
+13.1.6. MRCPv2 Vendor-Specific Parameters
+
+ IANA has created a new name space of "MRCPv2 Vendor-Specific
+ Parameters". All maintenance within and additions to the contents of
+ this name space MUST be according to the "Hierarchical Allocation"
+ registration policy as follows. Each name (corresponding to the
+ "vendor-av-pair-name" ABNF production) MUST satisfy the syntax
+ requirements of Internet Domain Names as described in Section 2.3.1
+ of RFC 1035 [RFC1035] (and as updated or obsoleted by successive
+   RFCs), with one exception:  the order of the domain labels is
+   reversed.  For example, a vendor-specific parameter "foo" defined by
+   example.com would have the form "com.example.foo".  The first, or
+   top-level, domain is
+ restricted to exactly the set of Top-Level Internet Domains defined
+ by IANA and will be updated by IANA when and only when that set
+ changes. The second-level and all subdomains within the parameter
+ name MUST be allocated according to the "First Come First Served"
+ policy. It is RECOMMENDED that assignment requests adhere to the
+ existing allocations of Internet domain names to organizations,
+ institutions, corporations, etc.
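As a rough illustration of the reversed-domain convention described above, the hypothetical helper below (not part of the RFC; the function name is illustrative) builds a vendor-specific parameter name from a registrant's domain and a parameter name:

```python
def vendor_param_name(domain: str, param: str) -> str:
    """Build a vendor-specific parameter name by reversing the
    registrant's domain labels, per the naming convention above
    (e.g. "example.com" + "foo" -> "com.example.foo")."""
    labels = domain.lower().strip(".").split(".")
    return ".".join(reversed(labels)) + "." + param

# vendor_param_name("example.com", "foo") -> "com.example.foo"
```

This matches the "com.example.foo" example in the text.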
+
+ The registry contains a list of vendor-registered parameters, where
+ each defined parameter is associated with a contact person and
+ includes an optional reference to the definition of the parameter,
+ preferably an RFC. The registry is initially empty.
+
+
+
+
+
+
+
+
+
+13.2. NLSML-Related Registrations
+
+13.2.1. 'application/nlsml+xml' Media Type Registration
+
+ IANA has registered the following media type according to the process
+ defined in RFC 4288 [RFC4288].
+
+ To: ietf-types@iana.org
+
+ Subject: Registration of media type application/nlsml+xml
+
+ MIME media type name: application
+
+ MIME subtype name: nlsml+xml
+
+ Required parameters: none
+
+ Optional parameters:
+
+ charset: All of the considerations described in RFC 3023
+ [RFC3023] also apply to the application/nlsml+xml media type.
+
+ Encoding considerations: All of the considerations described in RFC
+ 3023 also apply to the 'application/nlsml+xml' media type.
+
+ Security considerations: As with HTML, NLSML documents contain links
+ to other data stores (grammars, verifier resources, etc.). Unlike
+ HTML, however, the data stores are not treated as media to be
+ rendered. Nevertheless, linked files may themselves have security
+ considerations, which would be those of the individual registered
+ types. Additionally, this media type has all of the security
+ considerations described in RFC 3023.
+
+ Interoperability considerations: Although an NLSML document is
+ itself a complete XML document, for a fuller interpretation of the
+ content a receiver of an NLSML document may wish to access
+ resources linked to by the document. The inability of an NLSML
+ processor to access or process such linked resources could result
+ in different behavior by the ultimate consumer of the data.
+
+ Published specification: RFC 6787
+
+ Applications that use this media type: MRCPv2 clients and servers
+
+ Additional information: none
+
+ Magic number(s): There is no single initial octet sequence that is
+ always present for NLSML files.
+
+
+
+
+
+ Person & email address to contact for further information:
+ Sarvi Shanmugham, sarvi@cisco.com
+
+ Intended usage: This media type is expected to be used only in
+ conjunction with MRCPv2.
+
+13.3. NLSML XML Schema Registration
+
+ IANA has registered and now maintains the following XML Schema.
+ Information provided follows the template in RFC 3688 [RFC3688].
+
+ XML element type: schema
+
+ URI: urn:ietf:params:xml:schema:nlsml
+
+ Registrant Contact: IESG
+
+ XML: See Section 16.1.
+
+13.4. MRCPv2 XML Namespace Registration
+
+   IANA has registered and now maintains the following XML namespace.
+ Information provided follows the template in RFC 3688 [RFC3688].
+
+ XML element type: ns
+
+ URI: urn:ietf:params:xml:ns:mrcpv2
+
+ Registrant Contact: IESG
+
+ XML: RFC 6787
+
+13.5. Text Media Type Registrations
+
+ IANA has registered the following text media type according to the
+ process defined in RFC 4288 [RFC4288].
+
+13.5.1. text/grammar-ref-list
+
+ To: ietf-types@iana.org
+
+ Subject: Registration of media type text/grammar-ref-list
+
+ MIME media type name: text
+
+   MIME subtype name: grammar-ref-list
+
+ Required parameters: none
+
+
+
+
+
+ Optional parameters: none
+
+ Encoding considerations: Depending on the transfer protocol, a
+ transfer encoding may be necessary to deal with very long lines.
+
+ Security considerations: This media type contains URIs that may
+ represent references to external resources. As these resources
+ are assumed to be speech recognition grammars, similar
+ considerations as for the media types 'application/srgs' and
+ 'application/srgs+xml' apply.
+
+   Interoperability considerations: '>' must be percent-encoded in URIs
+ according to RFC 3986 [RFC3986].
+
+ Published specification: The RECOGNIZE method of the MRCP protocol
+ performs a recognition operation that matches input against a set
+ of grammars. When matching against more than one grammar, it is
+ sometimes necessary to use different weights for the individual
+ grammars. These weights are not a property of the grammar
+ resource itself but qualify the reference to that grammar for the
+ particular recognition operation initiated by the RECOGNIZE
+ method. The format of the proposed 'text/grammar-ref-list' media
+ type is as follows:
+
+ body = *reference
+ reference = "<" uri ">" [parameters] CRLF
+ parameters = ";" parameter *(";" parameter)
+ parameter = attribute "=" value
+
+ This specification currently only defines a 'weight' parameter,
+ but new parameters MAY be added through the "Grammar Reference
+ List Parameters" IANA registry established through this
+ specification. Example:
+
+ <http://example.com/grammars/field1.gram>
+ <http://example.com/grammars/field2.gram>;weight="0.85"
+ <session:field3@form-level.store>;weight="0.9"
+ <http://example.com/grammars/universals.gram>;weight="0.75"
+
+ Applications that use this media type: MRCPv2 clients and servers
+
+ Additional information: none
+
+ Magic number(s): none
+
+ Person & email address to contact for further information:
+ Sarvi Shanmugham, sarvi@cisco.com
+
+
+
+
+
+
+ Intended usage: This media type is expected to be used only in
+ conjunction with MRCPv2.
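The 'text/grammar-ref-list' body defined above is line oriented and straightforward to parse. The sketch below is illustrative only (it is not derived mechanically from the ABNF); it splits a body into URIs and their per-reference parameters such as 'weight':

```python
import re

def parse_grammar_ref_list(body: str):
    """Parse a text/grammar-ref-list body into (uri, params) pairs.
    Each line is "<" uri ">" optionally followed by ;attr="value"
    parameters, per the format shown in the registration above."""
    refs = []
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue
        m = re.match(r'<([^>]+)>(.*)$', line)
        if not m:
            raise ValueError("malformed reference: %r" % line)
        uri, rest = m.group(1), m.group(2)
        # Collect ;attr="value" pairs into a dict.
        params = dict(re.findall(r';\s*([\w-]+)="([^"]*)"', rest))
        refs.append((uri, params))
    return refs
```

With the example body above, the second reference parses to `("http://example.com/grammars/field2.gram", {"weight": "0.85"})`.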
+
+13.6. 'session' URI Scheme Registration
+
+ IANA has registered the following new URI scheme. The information
+ below follows the template given in RFC 4395 [RFC4395].
+
+ URI scheme name: session
+
+ Status: Permanent
+
+ URI scheme syntax: The syntax of this scheme is identical to that
+ defined for the "cid" scheme in Section 2 of RFC 2392 [RFC2392].
+
+ URI scheme semantics: The URI is intended to identify a data
+ resource previously given to the network computing resource. The
+ purpose of this scheme is to permit access to the specific
+ resource for the lifetime of the session with the entity storing
+      the resource.  The media type of the resource can vary.  There is
+ no explicit mechanism for communication of the media type. This
+ scheme is currently widely used internally by existing
+ implementations, and the registration is intended to provide
+ information in the rare (and unfortunate) case that the scheme is
+ used elsewhere. The scheme SHOULD NOT be used for open Internet
+ protocols.
+
+   Encoding considerations:  There are no encoding considerations for
+      'session' URIs beyond those described in RFC 3986 [RFC3986].
+
+ Applications/protocols that use this URI scheme name: This scheme
+ name is used by MRCPv2 clients and servers.
+
+ Interoperability considerations: Note that none of the resources are
+      accessible after the MRCPv2 session ends, hence the name of the
+ scheme. For clients who establish one MRCPv2 session only for the
+ entire speech application being implemented, this is sufficient,
+ but clients who create, terminate, and recreate MRCP sessions for
+ performance or scalability reasons will lose access to resources
+ established in the earlier session(s).
+
+ Security considerations: Generic security considerations for URIs
+ described in RFC 3986 [RFC3986] apply to this scheme as well. The
+ URIs defined here provide an identification mechanism only. Given
+ that the communication channel between client and server is
+ secure, that the server correctly accesses the resource associated
+
+
+
+
+
+
+
+ with the URI, and that the server ensures session-only lifetime
+ and access for each URI, the only additional security issues are
+ those of the types of media referred to by the URI.
+
+ Contact: Sarvi Shanmugham, sarvi@cisco.com
+
+ Author/Change controller: IESG, iesg@ietf.org
+
+ References: This specification, particularly Sections 6.2.7, 8.5.2,
+ 9.5.1, and 9.9.
+
+13.7. SDP Parameter Registrations
+
+ IANA has registered the following SDP parameter values. The
+ information for each follows the template given in RFC 4566
+ [RFC4566], Appendix B.
+
+13.7.1. Sub-Registry "proto"
+
+ "TCP/MRCPv2" value of the "proto" parameter
+
+ Contact name, email address, and telephone number: Sarvi Shanmugham,
+ sarvi@cisco.com, +1.408.902.3875
+
+ Name being registered (as it will appear in SDP): TCP/MRCPv2
+
+   Long-form name in English: MRCPv2 over TCP
+
+ Type of name: proto
+
+   Explanation of name: This name represents the MRCPv2 protocol
+ carried over TCP.
+
+ Reference to specification of name: RFC 6787
+
+ "TCP/TLS/MRCPv2" value of the "proto" parameter
+
+ Contact name, email address, and telephone number: Sarvi Shanmugham,
+ sarvi@cisco.com, +1.408.902.3875
+
+ Name being registered (as it will appear in SDP): TCP/TLS/MRCPv2
+
+   Long-form name in English: MRCPv2 over TLS over TCP
+
+ Type of name: proto
+
+   Explanation of name: This name represents the MRCPv2 protocol
+ carried over TLS over TCP.
+
+
+
+
+
+ Reference to specification of name: RFC 6787
+
+13.7.2. Sub-Registry "att-field (media-level)"
+
+ "resource" value of the "att-field" parameter
+
+ Contact name, email address, and telephone number: Sarvi Shanmugham,
+ sarvi@cisco.com, +1.408.902.3875
+
+ Attribute name (as it will appear in SDP): resource
+
+ Long-form attribute name in English: MRCPv2 resource type
+
+ Type of attribute: media-level
+
+ Subject to charset attribute? no
+
+ Explanation of attribute: See Section 4.2 of RFC 6787 for
+ description and examples.
+
+   Specification of appropriate attribute values: See Section 13.1.1
+      of RFC 6787.
+
+ "channel" value of the "att-field" parameter
+
+ Contact name, email address, and telephone number: Sarvi Shanmugham,
+ sarvi@cisco.com, +1.408.902.3875
+
+ Attribute name (as it will appear in SDP): channel
+
+ Long-form attribute name in English: MRCPv2 resource channel
+ identifier
+
+ Type of attribute: media-level
+
+ Subject to charset attribute? no
+
+ Explanation of attribute: See Section 4.2 of RFC 6787 for
+ description and examples.
+
+ Specification of appropriate attribute values: See Section 4.2 and
+ the "channel-id" ABNF production rules of RFC 6787.
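A channel-id value such as "32AECB23433801@speechsynth" combines a session-wide identifier with a resource type. A minimal, hypothetical parser (not part of the RFC; shown only to make the structure concrete) might look like:

```python
def parse_channel_id(value: str):
    """Split an MRCPv2 channel-id of the form
    "32AECB23433801@speechsynth" into its identifier and resource
    type (a sketch of the "channel-id" structure, not the ABNF)."""
    ident, sep, resource = value.partition("@")
    if not sep or not ident or not resource:
        raise ValueError("malformed channel-id: %r" % value)
    return ident, resource

# parse_channel_id("32AECB23433801@speechsynth")
#   -> ("32AECB23433801", "speechsynth")
```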
+
+ "cmid" value of the "att-field" parameter
+
+ Contact name, email address, and telephone number: Sarvi Shanmugham,
+ sarvi@cisco.com, +1.408.902.3875
+
+
+
+
+
+
+ Attribute name (as it will appear in SDP): cmid
+
+ Long-form attribute name in English: MRCPv2 resource channel media
+ identifier
+
+ Type of attribute: media-level
+
+ Subject to charset attribute? no
+
+ Explanation of attribute: See Section 4.4 of RFC 6787 for
+ description and examples.
+
+ Specification of appropriate attribute values: See Section 4.4 and
+ the "cmid-attribute" ABNF production rules of RFC 6787.
+
+14. Examples
+
+14.1. Message Flow
+
+ The following is an example of a typical MRCPv2 session of speech
+ synthesis and recognition between a client and a server. Although
+ the SDP "s=" attribute in these examples has a text description value
+ to assist in understanding the examples, please keep in mind that RFC
+ 3264 [RFC3264] recommends that messages actually put on the wire use
+ a space or a dash.
+
+ The figure below illustrates opening a session to the MRCPv2 server.
+   This exchange does not allocate a resource or set up media.  It simply
+ establishes a SIP session with the MRCPv2 server.
+
+ C->S:
+ INVITE sip:mresources@example.com SIP/2.0
+ Via:SIP/2.0/TCP client.atlanta.example.com:5060;
+ branch=z9hG4bK74bg1
+ Max-Forwards:6
+ To:MediaServer <sip:mresources@example.com>
+ From:sarvi <sip:sarvi@example.com>;tag=1928301774
+ Call-ID:a84b4c76e66710
+ CSeq:323123 INVITE
+ Contact:<sip:sarvi@client.example.com>
+ Content-Type:application/sdp
+ Content-Length:...
+
+ v=0
+ o=sarvi 2614933546 2614933546 IN IP4 192.0.2.12
+ s=Set up MRCPv2 control and audio
+ i=Initial contact
+ c=IN IP4 192.0.2.12
+
+
+
+
+
+ S->C:
+ SIP/2.0 200 OK
+ Via:SIP/2.0/TCP client.atlanta.example.com:5060;
+ branch=z9hG4bK74bg1;received=192.0.32.10
+ To:MediaServer <sip:mresources@example.com>;tag=62784
+ From:sarvi <sip:sarvi@example.com>;tag=1928301774
+ Call-ID:a84b4c76e66710
+ CSeq:323123 INVITE
+ Contact:<sip:mresources@server.example.com>
+ Content-Type:application/sdp
+ Content-Length:...
+
+ v=0
+ o=- 3000000001 3000000001 IN IP4 192.0.2.11
+ s=Set up MRCPv2 control and audio
+ i=Initial contact
+ c=IN IP4 192.0.2.11
+
+ C->S:
+ ACK sip:mresources@server.example.com SIP/2.0
+ Via:SIP/2.0/TCP client.atlanta.example.com:5060;
+ branch=z9hG4bK74bg2
+ Max-Forwards:6
+ To:MediaServer <sip:mresources@example.com>;tag=62784
+ From:Sarvi <sip:sarvi@example.com>;tag=1928301774
+ Call-ID:a84b4c76e66710
+ CSeq:323123 ACK
+ Content-Length:0
+
+ The client requests the server to create a synthesizer resource
+ control channel to do speech synthesis. This also adds a media
+ stream to send the generated speech. Note that, in this example, the
+ client requests a new MRCPv2 TCP stream between the client and the
+ server. In the following requests, the client will ask to use the
+ existing connection.
+
+ C->S:
+ INVITE sip:mresources@server.example.com SIP/2.0
+ Via:SIP/2.0/TCP client.atlanta.example.com:5060;
+ branch=z9hG4bK74bg3
+ Max-Forwards:6
+ To:MediaServer <sip:mresources@example.com>;tag=62784
+ From:sarvi <sip:sarvi@example.com>;tag=1928301774
+ Call-ID:a84b4c76e66710
+ CSeq:323124 INVITE
+ Contact:<sip:sarvi@client.example.com>
+ Content-Type:application/sdp
+ Content-Length:...
+
+
+
+
+
+ v=0
+ o=sarvi 2614933546 2614933547 IN IP4 192.0.2.12
+ s=Set up MRCPv2 control and audio
+ i=Add TCP channel, synthesizer and one-way audio
+ c=IN IP4 192.0.2.12
+ t=0 0
+ m=application 9 TCP/MRCPv2 1
+ a=setup:active
+ a=connection:new
+ a=resource:speechsynth
+ a=cmid:1
+ m=audio 49170 RTP/AVP 0 96
+ a=rtpmap:0 pcmu/8000
+ a=rtpmap:96 telephone-event/8000
+ a=fmtp:96 0-15
+ a=recvonly
+ a=mid:1
+
+
+ S->C:
+ SIP/2.0 200 OK
+ Via:SIP/2.0/TCP client.atlanta.example.com:5060;
+ branch=z9hG4bK74bg3;received=192.0.32.10
+ To:MediaServer <sip:mresources@example.com>;tag=62784
+ From:sarvi <sip:sarvi@example.com>;tag=1928301774
+ Call-ID:a84b4c76e66710
+ CSeq:323124 INVITE
+ Contact:<sip:mresources@server.example.com>
+ Content-Type:application/sdp
+ Content-Length:...
+
+ v=0
+ o=- 3000000001 3000000002 IN IP4 192.0.2.11
+ s=Set up MRCPv2 control and audio
+ i=Add TCP channel, synthesizer and one-way audio
+ c=IN IP4 192.0.2.11
+ t=0 0
+ m=application 32416 TCP/MRCPv2 1
+ a=setup:passive
+ a=connection:new
+ a=channel:32AECB23433801@speechsynth
+ a=cmid:1
+ m=audio 48260 RTP/AVP 0
+ a=rtpmap:0 pcmu/8000
+ a=sendonly
+ a=mid:1
+
+
+
+
+
+
+
+ C->S:
+ ACK sip:mresources@server.example.com SIP/2.0
+ Via:SIP/2.0/TCP client.atlanta.example.com:5060;
+ branch=z9hG4bK74bg4
+ Max-Forwards:6
+ To:MediaServer <sip:mresources@example.com>;tag=62784
+ From:Sarvi <sip:sarvi@example.com>;tag=1928301774
+ Call-ID:a84b4c76e66710
+ CSeq:323124 ACK
+ Content-Length:0
+
+ This exchange allocates an additional resource control channel for a
+ recognizer. Since a recognizer would need to receive an audio stream
+ for recognition, this interaction also updates the audio stream to
+ sendrecv, making it a two-way audio stream.
+
+ C->S:
+ INVITE sip:mresources@server.example.com SIP/2.0
+ Via:SIP/2.0/TCP client.atlanta.example.com:5060;
+ branch=z9hG4bK74bg5
+ Max-Forwards:6
+ To:MediaServer <sip:mresources@example.com>;tag=62784
+ From:sarvi <sip:sarvi@example.com>;tag=1928301774
+ Call-ID:a84b4c76e66710
+ CSeq:323125 INVITE
+ Contact:<sip:sarvi@client.example.com>
+ Content-Type:application/sdp
+ Content-Length:...
+
+ v=0
+ o=sarvi 2614933546 2614933548 IN IP4 192.0.2.12
+ s=Set up MRCPv2 control and audio
+ i=Add recognizer and duplex the audio
+ c=IN IP4 192.0.2.12
+ t=0 0
+ m=application 9 TCP/MRCPv2 1
+ a=setup:active
+ a=connection:existing
+ a=resource:speechsynth
+ a=cmid:1
+ m=audio 49170 RTP/AVP 0 96
+ a=rtpmap:0 pcmu/8000
+ a=rtpmap:96 telephone-event/8000
+ a=fmtp:96 0-15
+ a=recvonly
+ a=mid:1
+ m=application 9 TCP/MRCPv2 1
+ a=setup:active
+
+
+
+
+
+ a=connection:existing
+ a=resource:speechrecog
+ a=cmid:2
+ m=audio 49180 RTP/AVP 0 96
+ a=rtpmap:0 pcmu/8000
+ a=rtpmap:96 telephone-event/8000
+ a=fmtp:96 0-15
+ a=sendonly
+ a=mid:2
+
+
+ S->C:
+ SIP/2.0 200 OK
+ Via:SIP/2.0/TCP client.atlanta.example.com:5060;
+ branch=z9hG4bK74bg5;received=192.0.32.10
+ To:MediaServer <sip:mresources@example.com>;tag=62784
+ From:sarvi <sip:sarvi@example.com>;tag=1928301774
+ Call-ID:a84b4c76e66710
+ CSeq:323125 INVITE
+ Contact:<sip:mresources@server.example.com>
+ Content-Type:application/sdp
+ Content-Length:...
+
+ v=0
+ o=- 3000000001 3000000003 IN IP4 192.0.2.11
+ s=Set up MRCPv2 control and audio
+ i=Add recognizer and duplex the audio
+ c=IN IP4 192.0.2.11
+ t=0 0
+ m=application 32416 TCP/MRCPv2 1
+ a=channel:32AECB23433801@speechsynth
+ a=cmid:1
+ m=audio 48260 RTP/AVP 0
+ a=rtpmap:0 pcmu/8000
+ a=sendonly
+ a=mid:1
+ m=application 32416 TCP/MRCPv2 1
+ a=channel:32AECB23433801@speechrecog
+ a=cmid:2
+ m=audio 48260 RTP/AVP 0
+ a=rtpmap:0 pcmu/8000
+ a=rtpmap:96 telephone-event/8000
+ a=fmtp:96 0-15
+ a=recvonly
+ a=mid:2
+
+
+
+
+
+
+
+
+ C->S:
+ ACK sip:mresources@server.example.com SIP/2.0
+ Via:SIP/2.0/TCP client.atlanta.example.com:5060;
+ branch=z9hG4bK74bg6
+ Max-Forwards:6
+ To:MediaServer <sip:mresources@example.com>;tag=62784
+ From:Sarvi <sip:sarvi@example.com>;tag=1928301774
+ Call-ID:a84b4c76e66710
+ CSeq:323125 ACK
+ Content-Length:0
+
+   An MRCPv2 SPEAK request initiates speech.
+
+ C->S:
+ MRCP/2.0 ... SPEAK 543257
+ Channel-Identifier:32AECB23433801@speechsynth
+ Kill-On-Barge-In:false
+ Voice-gender:neutral
+ Voice-age:25
+ Prosody-volume:medium
+ Content-Type:application/ssml+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <speak version="1.0"
+ xmlns="http://www.w3.org/2001/10/synthesis"
+ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
+ http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
+ xml:lang="en-US">
+ <p>
+ <s>You have 4 new messages.</s>
+ <s>The first is from Stephanie Williams
+ <mark name="Stephanie"/>
+ and arrived at <break/>
+ <say-as interpret-as="vxml:time">0345p</say-as>.</s>
+ <s>The subject is <prosody
+ rate="-20%">ski trip</prosody></s>
+ </p>
+ </speak>
+
+ S->C:
+ MRCP/2.0 ... 543257 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433801@speechsynth
+ Speech-Marker:timestamp=857205015059
+
+
+
+
+
+
+
+
+ The synthesizer hits the special marker in the message to be spoken
+ and faithfully informs the client of the event.
+
+ S->C: MRCP/2.0 ... SPEECH-MARKER 543257 IN-PROGRESS
+ Channel-Identifier:32AECB23433801@speechsynth
+ Speech-Marker:timestamp=857206027059;Stephanie
+
+ The synthesizer finishes with the SPEAK request.
+
+ S->C: MRCP/2.0 ... SPEAK-COMPLETE 543257 COMPLETE
+ Channel-Identifier:32AECB23433801@speechsynth
+ Speech-Marker:timestamp=857207685213;Stephanie
+
+
+   The recognizer is issued a request to listen for the customer's
+   choices.
+
+ C->S: MRCP/2.0 ... RECOGNIZE 543258
+ Channel-Identifier:32AECB23433801@speechrecog
+ Content-Type:application/srgs+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <!-- the default grammar language is US English -->
+ <grammar xmlns="http://www.w3.org/2001/06/grammar"
+ xml:lang="en-US" version="1.0" root="request">
+ <!-- single language attachment to a rule expansion -->
+ <rule id="request">
+ Can I speak to
+ <one-of xml:lang="fr-CA">
+ <item>Michel Tremblay</item>
+ <item>Andre Roy</item>
+ </one-of>
+ </rule>
+ </grammar>
+
+
+ S->C: MRCP/2.0 ... 543258 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433801@speechrecog
+
+ The client issues the next MRCPv2 SPEAK method.
+
+ C->S: MRCP/2.0 ... SPEAK 543259
+ Channel-Identifier:32AECB23433801@speechsynth
+ Kill-On-Barge-In:true
+ Content-Type:application/ssml+xml
+ Content-Length:...
+
+
+
+
+
+
+ <?xml version="1.0"?>
+ <speak version="1.0"
+ xmlns="http://www.w3.org/2001/10/synthesis"
+ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
+ http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
+ xml:lang="en-US">
+ <p>
+ <s>Welcome to ABC corporation.</s>
+ <s>Who would you like to talk to?</s>
+ </p>
+ </speak>
+
+ S->C: MRCP/2.0 ... 543259 200 IN-PROGRESS
+ Channel-Identifier:32AECB23433801@speechsynth
+ Speech-Marker:timestamp=857207696314
+
+ This next section of this ongoing example demonstrates how kill-on-
+ barge-in support works. Since this last SPEAK request had Kill-On-
+ Barge-In set to "true", when the recognizer (the server) generated
+ the START-OF-INPUT event while a SPEAK was active, the client
+ immediately issued a BARGE-IN-OCCURRED method to the synthesizer
+ resource. The speech synthesizer then terminated playback and
+ notified the client. The completion-cause code provided the
+ indication that this was a kill-on-barge-in interruption rather than
+ a normal completion.
+
+ Note that, since the recognition and synthesizer resources are in the
+ same session on the same server, to obtain a faster response the
+ server might have internally relayed the start-of-input condition to
+ the synthesizer directly, before receiving the expected BARGE-IN-
+ OCCURRED event. However, any such communication is outside the scope
+ of MRCPv2.
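The client-side half of the barge-in sequence described above can be sketched as follows. The event and message structures here are invented for the example (MRCPv2 defines the wire messages, not this API):

```python
def on_recognizer_event(event, active_speaks, send_to_synth):
    """Client-side sketch of the kill-on-barge-in sequence: when
    START-OF-INPUT arrives while a Kill-On-Barge-In SPEAK is
    active, relay a BARGE-IN-OCCURRED carrying the same
    Proxy-Sync-Id so the synthesizer resource can stop playback."""
    if event["name"] != "START-OF-INPUT":
        return None
    if not any(s.get("kill_on_barge_in") for s in active_speaks):
        return None
    msg = {
        "method": "BARGE-IN-OCCURRED",
        "Proxy-Sync-Id": event["Proxy-Sync-Id"],
    }
    send_to_synth(msg)
    return msg
```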
+
+ S->C: MRCP/2.0 ... START-OF-INPUT 543258 IN-PROGRESS
+ Channel-Identifier:32AECB23433801@speechrecog
+ Proxy-Sync-Id:987654321
+
+
+ C->S: MRCP/2.0 ... BARGE-IN-OCCURRED 543259
+ Channel-Identifier:32AECB23433801@speechsynth
+ Proxy-Sync-Id:987654321
+
+
+ S->C: MRCP/2.0 ... 543259 200 COMPLETE
+ Channel-Identifier:32AECB23433801@speechsynth
+ Active-Request-Id-List:543258
+ Speech-Marker:timestamp=857206096314
+
+
+
+
+
+ S->C: MRCP/2.0 ... SPEAK-COMPLETE 543259 COMPLETE
+ Channel-Identifier:32AECB23433801@speechsynth
+ Completion-Cause:001 barge-in
+ Speech-Marker:timestamp=857207685213
+
+
+ The recognizer resource matched the spoken stream to a grammar and
+ generated results. The result of the recognition is returned by the
+ server as part of the RECOGNITION-COMPLETE event.
+
+ S->C: MRCP/2.0 ... RECOGNITION-COMPLETE 543258 COMPLETE
+ Channel-Identifier:32AECB23433801@speechrecog
+ Completion-Cause:000 success
+ Waveform-URI:<http://web.media.com/session123/audio.wav>;
+ size=423523;duration=25432
+ Content-Type:application/nlsml+xml
+ Content-Length:...
+
+ <?xml version="1.0"?>
+ <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
+ xmlns:ex="http://www.example.com/example"
+ grammar="session:request1@form-level.store">
+ <interpretation>
+ <instance name="Person">
+ <ex:Person>
+ <ex:Name> Andre Roy </ex:Name>
+ </ex:Person>
+ </instance>
+ <input> may I speak to Andre Roy </input>
+ </interpretation>
+ </result>
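A client consuming the RECOGNITION-COMPLETE payload above must parse NLSML, which is namespace-qualified XML. A minimal sketch using Python's standard library (illustrative only; a real client would also inspect the <instance> content and any confidence attributes):

```python
import xml.etree.ElementTree as ET

NS = "urn:ietf:params:xml:ns:mrcpv2"

def extract_input(nlsml: str) -> str:
    """Pull the recognized <input> text out of an NLSML <result>
    document like the one above."""
    root = ET.fromstring(nlsml)
    # The NLSML elements live in the mrcpv2 namespace.
    node = root.find(".//{%s}input" % NS)
    return (node.text or "").strip()
```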
+
+ Since the client was now finished with the session, including all
+ resources, it issued a SIP BYE request to close the SIP session.
+ This caused all control channels and resources allocated under the
+ session to be deallocated.
+
+ C->S: BYE sip:mresources@server.example.com SIP/2.0
+ Via:SIP/2.0/TCP client.atlanta.example.com:5060;
+ branch=z9hG4bK74bg7
+ Max-Forwards:6
+ From:Sarvi <sip:sarvi@example.com>;tag=1928301774
+ To:MediaServer <sip:mresources@example.com>;tag=62784
+ Call-ID:a84b4c76e66710
+ CSeq:323126 BYE
+ Content-Length:0
+
+
+
+
+
+
+
+14.2. Recognition Result Examples
+
+14.2.1. Simple ASR Ambiguity
+
+ System: To which city will you be traveling?
+ User: I want to go to Pittsburgh.
+
+ <?xml version="1.0"?>
+ <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
+ xmlns:ex="http://www.example.com/example"
+ grammar="http://www.example.com/flight">
+      <interpretation confidence="0.6">
+        <instance>
+          <ex:airline>
+            <ex:to_city>Pittsburgh</ex:to_city>
+          </ex:airline>
+        </instance>
+        <input mode="speech">
+          I want to go to Pittsburgh
+        </input>
+      </interpretation>
+      <interpretation confidence="0.4">
+ <instance>
+ <ex:airline>
+ <ex:to_city>Stockholm</ex:to_city>
+ </ex:airline>
+ </instance>
+ <input>I want to go to Stockholm</input>
+ </interpretation>
+ </result>
+
+14.2.2. Mixed Initiative
+
+ System: What would you like?
+ User: I would like 2 pizzas, one with pepperoni and cheese,
+ one with sausage and a bottle of coke, to go.
+
+   This example includes an order object, which in turn contains objects
+ named "food_item", "drink_item", and "delivery_method". The
+ representation assumes there are no ambiguities in the speech or
+ natural language processing. Note that this representation also
+ assumes some level of intra-sentential anaphora resolution, i.e., to
+ resolve the two "one"s as "pizza".
+
+ <?xml version="1.0"?>
+ <nl:result xmlns:nl="urn:ietf:params:xml:ns:mrcpv2"
+ xmlns="http://www.example.com/example"
+ grammar="http://www.example.com/foodorder">
+
+
+
+
+
+   <nl:interpretation confidence="1.0">
+ <nl:instance>
+ <order>
+ <food_item confidence="1.0">
+ <pizza>
+ <ingredients confidence="1.0">
+ pepperoni
+ </ingredients>
+ <ingredients confidence="1.0">
+ cheese
+ </ingredients>
+ </pizza>
+ <pizza>
+ <ingredients>sausage</ingredients>
+ </pizza>
+ </food_item>
+ <drink_item confidence="1.0">
+ <size>2-liter</size>
+ </drink_item>
+ <delivery_method>to go</delivery_method>
+ </order>
+ </nl:instance>
+ <nl:input mode="speech">I would like 2 pizzas,
+ one with pepperoni and cheese, one with sausage
+ and a bottle of coke, to go.
+ </nl:input>
+ </nl:interpretation>
+ </nl:result>
+
+14.2.3. DTMF Input
+
+ A combination of DTMF input and speech is represented using nested
+ input elements. For example:
+ User: My pin is (dtmf 1 2 3 4)
+
+ <input>
+     <input mode="speech" confidence="1.0"
+ timestamp-start="2000-04-03T0:00:00"
+ timestamp-end="2000-04-03T0:00:01.5">My pin is
+ </input>
+     <input mode="dtmf" confidence="1.0"
+ timestamp-start="2000-04-03T0:00:01.5"
+ timestamp-end="2000-04-03T0:00:02.0">1 2 3 4
+ </input>
+ </input>
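The nested <input> structure above can be read back mechanically. A small illustrative helper (not part of the RFC) that lists each constituent input's mode and text:

```python
import xml.etree.ElementTree as ET

def input_modes(nlsml_input: str):
    """List the (mode, text) pairs of a nested NLSML <input>
    element such as the mixed speech/DTMF example above."""
    root = ET.fromstring(nlsml_input)
    return [(child.get("mode"), (child.text or "").strip())
            for child in root]
```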
+
+
+
+
+
+
+
+
+ Note that grammars that recognize mixtures of speech and DTMF are not
+ currently possible in SRGS; however, this representation might be
+ needed for other applications of NLSML, and this mixture capability
+ might be introduced in future versions of SRGS.
+
+14.2.4. Interpreting Meta-Dialog and Meta-Task Utterances
+
+ Natural language communication makes use of meta-dialog and meta-task
+ utterances. This specification is flexible enough so that meta-
+ utterances can be represented on an application-specific basis
+ without requiring other standard markup.
+
+ Here are two examples of how meta-task and meta-dialog utterances
+ might be represented.
+
+System: What toppings do you want on your pizza?
+User: What toppings do you have?
+
+<interpretation grammar="http://www.example.com/toppings">
+ <instance>
+ <question>
+      <questioned_item>toppings</questioned_item>
+ <questioned_property>
+ availability
+ </questioned_property>
+ </question>
+ </instance>
+ <input mode="speech">
+ what toppings do you have?
+ </input>
+</interpretation>
+
+User: slow down.
+
+<interpretation grammar="http://www.example.com/generalCommandsGrammar">
+ <instance>
+ <command>
+ <action>reduce speech rate</action>
+ <doer>system</doer>
+ </command>
+ </instance>
+ <input mode="speech">slow down</input>
+</interpretation>
+
+14.2.5. Anaphora and Deixis
+
+ This specification can be used on an application-specific basis to
+ represent utterances that contain unresolved anaphoric and deictic
+ references. Anaphoric references, which include pronouns and
+ definite noun phrases that refer to something that was mentioned in
+ the preceding linguistic context, and deictic references, which refer
+ to something that is present in the non-linguistic context, present
+ similar problems in that there may not be sufficient unambiguous
+ linguistic context to determine what their exact role in the
+ interpretation should be. In order to represent unresolved anaphora
+ and deixis using this specification, one strategy would be for the
+ developer to define a more surface-oriented representation that
+ leaves the specific details of the interpretation of the reference
+ open. (This assumes that a later component is responsible for
+ actually resolving the reference).
+
+   Example (ignoring the issue of representing the input from the
+   pointing gesture):
+
+ System: What do you want to drink?
+ User: I want this. (clicks on picture of large root beer.)
+
+ <?xml version="1.0"?>
+ <nl:result xmlns:nl="urn:ietf:params:xml:ns:mrcpv2"
+ xmlns="http://www.example.com/example"
+ grammar="http://www.example.com/beverages.grxml">
+ <nl:interpretation>
+ <nl:instance>
+ <doer>I</doer>
+ <action>want</action>
+ <object>this</object>
+ </nl:instance>
+ <nl:input mode="speech">I want this</nl:input>
+ </nl:interpretation>
+ </nl:result>
+
+14.2.6. Distinguishing Individual Items from Sets with One Member
+
+ For programming convenience, it is useful to be able to distinguish
+ between individual items and sets containing one item in the XML
+ representation of semantic results. For example, a pizza order might
+ consist of exactly one pizza, but a pizza might contain zero or more
+ toppings. Since there is no standard way of marking this distinction
+ directly in XML, in the current framework, the developer is free to
+ adopt any conventions that would convey this information in the XML
+ markup. One strategy would be for the developer to wrap the set of
+ items in a grouping element, as in the following example.
+
+ <order>
+ <pizza>
+ <topping-group>
+ <topping>mushrooms</topping>
+ </topping-group>
+ </pizza>
+ <drink>coke</drink>
+ </order>
+
+ In this example, the programmer can assume that there is supposed to
+ be exactly one pizza and one drink in the order, but the fact that
+ there is only one topping is an accident of this particular pizza
+ order.
+
+ Note that the client controls both the grammar and the semantics to
+ be returned upon grammar matches, so the user of MRCPv2 is fully
+ empowered to cause results to be returned in NLSML in such a way that
+ the interpretation is clear to that user.
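The grouping-element convention above is straightforward to apply in code. This non-normative Python sketch reads the example order; the element names come from the example, while the traversal is illustrative:

```python
import xml.etree.ElementTree as ET

doc = """<order>
  <pizza>
    <topping-group>
      <topping>mushrooms</topping>
    </topping-group>
  </pizza>
  <drink>coke</drink>
</order>"""

order = ET.fromstring(doc)
# <pizza> and <drink> are singletons by convention, so they are read
# with find(); toppings are always read as a list, because the
# <topping-group> wrapper marks them as a set of zero or more items.
toppings = [t.text for t in order.find("pizza/topping-group")]
print(toppings)  # ['mushrooms']
```

The single-element list makes it explicit that "one topping" is an accident of this order, not a structural constraint.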
+
+14.2.7. Extensibility
+
+ Extensibility in NLSML is provided via result content flexibility, as
+ described in the discussions of meta-utterances and anaphora. NLSML
+ can easily be used in sophisticated systems to convey application-
+ specific information that more basic systems would not make use of,
+ for example, defining speech acts.
+
+15. ABNF Normative Definition
+
+ The following productions make use of the core rules defined in
+ Section B.1 of RFC 5234 [RFC5234].
+
+LWS = [*WSP CRLF] 1*WSP ; linear whitespace
+
+SWS = [LWS] ; sep whitespace
+
+UTF8-NONASCII = %xC0-DF 1UTF8-CONT
+ / %xE0-EF 2UTF8-CONT
+ / %xF0-F7 3UTF8-CONT
+ / %xF8-FB 4UTF8-CONT
+ / %xFC-FD 5UTF8-CONT
+
+UTF8-CONT = %x80-BF
+UTFCHAR = %x21-7E
+ / UTF8-NONASCII
+param = *pchar
+
+quoted-string = SWS DQUOTE *(qdtext / quoted-pair )
+ DQUOTE
+
+qdtext = LWS / %x21 / %x23-5B / %x5D-7E
+ / UTF8-NONASCII
+
+quoted-pair = "\" (%x00-09 / %x0B-0C / %x0E-7F)
+
+token = 1*(alphanum / "-" / "." / "!" / "%" / "*"
+ / "_" / "+" / "`" / "'" / "~" )
+
+reserved = ";" / "/" / "?" / ":" / "@" / "&" / "="
+ / "+" / "$" / ","
+
+mark = "-" / "_" / "." / "!" / "~" / "*" / "'"
+ / "(" / ")"
+
+unreserved = alphanum / mark
+
+pchar = unreserved / escaped
+ / ":" / "@" / "&" / "=" / "+" / "$" / ","
+
+alphanum = ALPHA / DIGIT
+
+BOOLEAN = "true" / "false"
+
+FLOAT = *DIGIT ["." *DIGIT]
+
+escaped = "%" HEXDIG HEXDIG
+
+fragment = *uric
+
+uri = [ absoluteURI / relativeURI ]
+ [ "#" fragment ]
+
+absoluteURI = scheme ":" ( hier-part / opaque-part )
+
+relativeURI = ( net-path / abs-path / rel-path )
+ [ "?" query ]
+
+hier-part = ( net-path / abs-path ) [ "?" query ]
+
+net-path = "//" authority [ abs-path ]
+
+abs-path = "/" path-segments
+
+rel-path = rel-segment [ abs-path ]
+
+rel-segment = 1*( unreserved / escaped / ";" / "@"
+ / "&" / "=" / "+" / "$" / "," )
+
+opaque-part = uric-no-slash *uric
+
+uric = reserved / unreserved / escaped
+
+uric-no-slash = unreserved / escaped / ";" / "?" / ":"
+ / "@" / "&" / "=" / "+" / "$" / ","
+
+path-segments = segment *( "/" segment )
+
+segment = *pchar *( ";" param )
+
+scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
+
+authority = srvr / reg-name
+
+srvr = [ [ userinfo "@" ] hostport ]
+
+reg-name = 1*( unreserved / escaped / "$" / ","
+ / ";" / ":" / "@" / "&" / "=" / "+" )
+
+query = *uric
+
+userinfo = ( user ) [ ":" password ] "@"
+
+user = 1*( unreserved / escaped
+ / user-unreserved )
+
+user-unreserved = "&" / "=" / "+" / "$" / "," / ";"
+ / "?" / "/"
+
+password = *( unreserved / escaped
+ / "&" / "=" / "+" / "$" / "," )
+
+hostport = host [ ":" port ]
+
+host = hostname / IPv4address / IPv6reference
+
+hostname = *( domainlabel "." ) toplabel [ "." ]
+
+domainlabel = alphanum / alphanum *( alphanum / "-" )
+ alphanum
+
+toplabel = ALPHA / ALPHA *( alphanum / "-" )
+ alphanum
+
+IPv4address = 1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT "."
+ 1*3DIGIT
+
+IPv6reference = "[" IPv6address "]"
+
+IPv6address = hexpart [ ":" IPv4address ]
+
+hexpart = hexseq / hexseq "::" [ hexseq ] / "::"
+ [ hexseq ]
+
+hexseq = hex4 *( ":" hex4)
+
+hex4 = 1*4HEXDIG
+
+port = 1*19DIGIT
+
+; generic-message is the top-level rule
+
+generic-message = start-line message-header CRLF
+ [ message-body ]
+
+message-body = *OCTET
+
+start-line = request-line / response-line / event-line
+
+request-line = mrcp-version SP message-length SP method-name
+ SP request-id CRLF
+
+response-line = mrcp-version SP message-length SP request-id
+ SP status-code SP request-state CRLF
+
+event-line = mrcp-version SP message-length SP event-name
+ SP request-id SP request-state CRLF
+
+method-name = generic-method
+ / synthesizer-method
+ / recognizer-method
+ / recorder-method
+ / verifier-method
+
+generic-method = "SET-PARAMS"
+ / "GET-PARAMS"
+
+request-state = "COMPLETE"
+ / "IN-PROGRESS"
+ / "PENDING"
+
+event-name = synthesizer-event
+ / recognizer-event
+ / recorder-event
+ / verifier-event
+
+message-header = 1*(generic-header / resource-header / generic-field)
+
+generic-field = field-name ":" [ field-value ]
+field-name = token
+field-value = *LWS field-content *( CRLF 1*LWS field-content)
+field-content = <the OCTETs making up the field-value
+ and consisting of either *TEXT or combinations
+ of token, separators, and quoted-string>
+
+resource-header = synthesizer-header
+ / recognizer-header
+ / recorder-header
+ / verifier-header
+
+generic-header = channel-identifier
+ / accept
+ / active-request-id-list
+ / proxy-sync-id
+ / accept-charset
+ / content-type
+ / content-id
+ / content-base
+ / content-encoding
+ / content-location
+ / content-length
+ / fetch-timeout
+ / cache-control
+ / logging-tag
+ / set-cookie
+ / vendor-specific
+
+; -- content-id is as defined in RFC 2392, RFC 2046 and RFC 5322
+; -- accept and accept-charset are as defined in RFC 2616
+
+mrcp-version = "MRCP" "/" 1*2DIGIT "." 1*2DIGIT
+
+message-length = 1*19DIGIT
+
+request-id = 1*10DIGIT
+
+status-code = 3DIGIT
+
+channel-identifier = "Channel-Identifier" ":"
+ channel-id CRLF
+
+channel-id = 1*alphanum "@" 1*alphanum
+
+active-request-id-list = "Active-Request-Id-List" ":"
+ request-id *("," request-id) CRLF
+
+proxy-sync-id = "Proxy-Sync-Id" ":" 1*VCHAR CRLF
+
+content-base = "Content-Base" ":" absoluteURI CRLF
+
+content-length = "Content-Length" ":" 1*19DIGIT CRLF
+
+content-type = "Content-Type" ":" media-type-value CRLF
+
+media-type-value = type "/" subtype *( ";" parameter )
+
+type = token
+
+subtype = token
+
+parameter = attribute "=" value
+
+attribute = token
+
+value = token / quoted-string
+
+content-encoding = "Content-Encoding" ":"
+ *WSP content-coding
+ *(*WSP "," *WSP content-coding *WSP )
+ CRLF
+
+content-coding = token
+
+content-location = "Content-Location" ":"
+ ( absoluteURI / relativeURI ) CRLF
+
+cache-control = "Cache-Control" ":"
+ [*WSP cache-directive
+ *( *WSP "," *WSP cache-directive *WSP )]
+ CRLF
+
+fetch-timeout = "Fetch-Timeout" ":" 1*19DIGIT CRLF
+
+cache-directive = "max-age" "=" delta-seconds
+ / "max-stale" ["=" delta-seconds ]
+ / "min-fresh" "=" delta-seconds
+
+delta-seconds = 1*19DIGIT
+
+logging-tag = "Logging-Tag" ":" 1*UTFCHAR CRLF
+
+vendor-specific = "Vendor-Specific-Parameters" ":"
+ [vendor-specific-av-pair
+ *(";" vendor-specific-av-pair)] CRLF
+
+vendor-specific-av-pair = vendor-av-pair-name "="
+ value
+
+vendor-av-pair-name = 1*UTFCHAR
+
+set-cookie = "Set-Cookie:" SP set-cookie-string
+set-cookie-string = cookie-pair *( ";" SP cookie-av )
+cookie-pair = cookie-name "=" cookie-value
+cookie-name = token
+cookie-value = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )
+cookie-octet = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
+token = <token, defined in [RFC2616], Section 2.2>
+
+cookie-av = expires-av / max-age-av / domain-av /
+ path-av / secure-av / httponly-av /
+ extension-av / age-av
+expires-av = "Expires=" sane-cookie-date
+sane-cookie-date = <rfc1123-date, defined in [RFC2616], Section 3.3.1>
+max-age-av = "Max-Age=" non-zero-digit *DIGIT
+non-zero-digit = %x31-39
+domain-av = "Domain=" domain-value
+domain-value = <subdomain>
+path-av = "Path=" path-value
+path-value = <any CHAR except CTLs or ";">
+secure-av = "Secure"
+httponly-av = "HttpOnly"
+extension-av = <any CHAR except CTLs or ";">
+age-av = "Age=" delta-seconds
+
+; Synthesizer ABNF
+
+synthesizer-method = "SPEAK"
+ / "STOP"
+ / "PAUSE"
+ / "RESUME"
+ / "BARGE-IN-OCCURRED"
+ / "CONTROL"
+ / "DEFINE-LEXICON"
+
+synthesizer-event = "SPEECH-MARKER"
+ / "SPEAK-COMPLETE"
+
+synthesizer-header = jump-size
+ / kill-on-barge-in
+ / speaker-profile
+ / completion-cause
+ / completion-reason
+ / voice-parameter
+ / prosody-parameter
+ / speech-marker
+ / speech-language
+ / fetch-hint
+ / audio-fetch-hint
+ / failed-uri
+ / failed-uri-cause
+ / speak-restart
+ / speak-length
+ / load-lexicon
+ / lexicon-search-order
+
+jump-size = "Jump-Size" ":" speech-length-value CRLF
+
+speech-length-value = numeric-speech-length
+ / text-speech-length
+
+text-speech-length = 1*UTFCHAR SP "Tag"
+
+numeric-speech-length = ("+" / "-") positive-speech-length
+
+positive-speech-length = 1*19DIGIT SP numeric-speech-unit
+
+numeric-speech-unit = "Second"
+ / "Word"
+ / "Sentence"
+ / "Paragraph"
+
+kill-on-barge-in = "Kill-On-Barge-In" ":" BOOLEAN
+ CRLF
+
+speaker-profile = "Speaker-Profile" ":" uri CRLF
+
+completion-cause = "Completion-Cause" ":" cause-code SP
+ cause-name CRLF
+cause-code = 3DIGIT
+cause-name = *VCHAR
+
+completion-reason = "Completion-Reason" ":"
+ quoted-string CRLF
+
+voice-parameter = voice-gender
+ / voice-age
+ / voice-variant
+ / voice-name
+
+voice-gender = "Voice-Gender:" voice-gender-value CRLF
+
+voice-gender-value = "male"
+ / "female"
+ / "neutral"
+
+voice-age = "Voice-Age:" 1*3DIGIT CRLF
+
+voice-variant = "Voice-Variant:" 1*19DIGIT CRLF
+
+voice-name = "Voice-Name:"
+ 1*UTFCHAR *(1*WSP 1*UTFCHAR) CRLF
+
+prosody-parameter = "Prosody-" prosody-param-name ":"
+ prosody-param-value CRLF
+
+prosody-param-name = 1*VCHAR
+
+prosody-param-value = 1*VCHAR
+
+timestamp = "timestamp" "=" time-stamp-value
+
+time-stamp-value = 1*20DIGIT
+
+speech-marker = "Speech-Marker" ":"
+ timestamp
+ [";" 1*(UTFCHAR / %x20)] CRLF
+
+speech-language = "Speech-Language" ":" 1*VCHAR CRLF
+
+fetch-hint = "Fetch-Hint" ":" ("prefetch" / "safe") CRLF
+
+audio-fetch-hint = "Audio-Fetch-Hint" ":"
+ ("prefetch" / "safe" / "stream") CRLF
+
+failed-uri = "Failed-URI" ":" absoluteURI CRLF
+
+failed-uri-cause = "Failed-URI-Cause" ":" 1*UTFCHAR CRLF
+
+speak-restart = "Speak-Restart" ":" BOOLEAN CRLF
+
+speak-length = "Speak-Length" ":" positive-length-value
+ CRLF
+
+positive-length-value = positive-speech-length
+ / text-speech-length
+
+load-lexicon = "Load-Lexicon" ":" BOOLEAN CRLF
+
+lexicon-search-order = "Lexicon-Search-Order" ":"
+ "<" absoluteURI ">" *(" " "<" absoluteURI ">") CRLF
+
+; Recognizer ABNF
+
+recognizer-method = recog-only-method
+ / enrollment-method
+
+recog-only-method = "DEFINE-GRAMMAR"
+ / "RECOGNIZE"
+ / "INTERPRET"
+ / "GET-RESULT"
+ / "START-INPUT-TIMERS"
+ / "STOP"
+
+enrollment-method = "START-PHRASE-ENROLLMENT"
+ / "ENROLLMENT-ROLLBACK"
+ / "END-PHRASE-ENROLLMENT"
+ / "MODIFY-PHRASE"
+ / "DELETE-PHRASE"
+
+recognizer-event = "START-OF-INPUT"
+ / "RECOGNITION-COMPLETE"
+ / "INTERPRETATION-COMPLETE"
+
+recognizer-header = recog-only-header
+ / enrollment-header
+
+recog-only-header = confidence-threshold
+ / sensitivity-level
+ / speed-vs-accuracy
+ / n-best-list-length
+ / input-type
+ / no-input-timeout
+ / recognition-timeout
+ / waveform-uri
+ / input-waveform-uri
+ / completion-cause
+ / completion-reason
+ / recognizer-context-block
+
+ / start-input-timers
+ / speech-complete-timeout
+ / speech-incomplete-timeout
+ / dtmf-interdigit-timeout
+ / dtmf-term-timeout
+ / dtmf-term-char
+ / failed-uri
+ / failed-uri-cause
+ / save-waveform
+ / media-type
+ / new-audio-channel
+ / speech-language
+ / ver-buffer-utterance
+ / recognition-mode
+ / cancel-if-queue
+ / hotword-max-duration
+ / hotword-min-duration
+ / interpret-text
+ / dtmf-buffer-time
+ / clear-dtmf-buffer
+ / early-no-match
+
+enrollment-header = num-min-consistent-pronunciations
+ / consistency-threshold
+ / clash-threshold
+ / personal-grammar-uri
+ / enroll-utterance
+ / phrase-id
+ / phrase-nl
+ / weight
+ / save-best-waveform
+ / new-phrase-id
+ / confusable-phrases-uri
+ / abort-phrase-enrollment
+
+confidence-threshold = "Confidence-Threshold" ":"
+ FLOAT CRLF
+
+sensitivity-level = "Sensitivity-Level" ":" FLOAT
+ CRLF
+
+speed-vs-accuracy = "Speed-Vs-Accuracy" ":" FLOAT
+ CRLF
+
+n-best-list-length = "N-Best-List-Length" ":" 1*19DIGIT
+ CRLF
+
+input-type = "Input-Type" ":" inputs CRLF
+
+inputs = "speech" / "dtmf"
+
+no-input-timeout = "No-Input-Timeout" ":" 1*19DIGIT
+ CRLF
+
+recognition-timeout = "Recognition-Timeout" ":" 1*19DIGIT
+ CRLF
+
+waveform-uri = "Waveform-URI" ":" ["<" uri ">"
+ ";" "size" "=" 1*19DIGIT
+ ";" "duration" "=" 1*19DIGIT] CRLF
+
+recognizer-context-block = "Recognizer-Context-Block" ":"
+ [1*VCHAR] CRLF
+
+start-input-timers = "Start-Input-Timers" ":"
+ BOOLEAN CRLF
+
+speech-complete-timeout = "Speech-Complete-Timeout" ":"
+ 1*19DIGIT CRLF
+
+speech-incomplete-timeout = "Speech-Incomplete-Timeout" ":"
+ 1*19DIGIT CRLF
+
+dtmf-interdigit-timeout = "DTMF-Interdigit-Timeout" ":"
+ 1*19DIGIT CRLF
+
+dtmf-term-timeout = "DTMF-Term-Timeout" ":" 1*19DIGIT
+ CRLF
+
+dtmf-term-char = "DTMF-Term-Char" ":" VCHAR CRLF
+
+save-waveform = "Save-Waveform" ":" BOOLEAN CRLF
+
+new-audio-channel = "New-Audio-Channel" ":"
+ BOOLEAN CRLF
+
+recognition-mode         =    "Recognition-Mode" ":"
+                              ("normal" / "hotword") CRLF
+
+cancel-if-queue = "Cancel-If-Queue" ":" BOOLEAN CRLF
+
+hotword-max-duration = "Hotword-Max-Duration" ":"
+ 1*19DIGIT CRLF
+
+hotword-min-duration = "Hotword-Min-Duration" ":"
+ 1*19DIGIT CRLF
+
+interpret-text = "Interpret-Text" ":" 1*VCHAR CRLF
+
+dtmf-buffer-time = "DTMF-Buffer-Time" ":" 1*19DIGIT CRLF
+
+clear-dtmf-buffer = "Clear-DTMF-Buffer" ":" BOOLEAN CRLF
+
+early-no-match = "Early-No-Match" ":" BOOLEAN CRLF
+
+num-min-consistent-pronunciations =
+ "Num-Min-Consistent-Pronunciations" ":" 1*19DIGIT CRLF
+
+consistency-threshold = "Consistency-Threshold" ":" FLOAT
+ CRLF
+
+clash-threshold = "Clash-Threshold" ":" FLOAT CRLF
+
+personal-grammar-uri = "Personal-Grammar-URI" ":" uri CRLF
+
+enroll-utterance = "Enroll-Utterance" ":" BOOLEAN CRLF
+
+phrase-id = "Phrase-ID" ":" 1*VCHAR CRLF
+
+phrase-nl = "Phrase-NL" ":" 1*UTFCHAR CRLF
+
+weight = "Weight" ":" FLOAT CRLF
+
+save-best-waveform = "Save-Best-Waveform" ":"
+ BOOLEAN CRLF
+
+new-phrase-id = "New-Phrase-ID" ":" 1*VCHAR CRLF
+
+confusable-phrases-uri = "Confusable-Phrases-URI" ":"
+ uri CRLF
+
+abort-phrase-enrollment = "Abort-Phrase-Enrollment" ":"
+ BOOLEAN CRLF
+
+; Recorder ABNF
+
+recorder-method = "RECORD"
+ / "STOP"
+ / "START-INPUT-TIMERS"
+
+recorder-event = "START-OF-INPUT"
+ / "RECORD-COMPLETE"
+
+recorder-header = sensitivity-level
+ / no-input-timeout
+ / completion-cause
+ / completion-reason
+ / failed-uri
+ / failed-uri-cause
+ / record-uri
+ / media-type
+ / max-time
+ / trim-length
+ / final-silence
+ / capture-on-speech
+ / ver-buffer-utterance
+ / start-input-timers
+ / new-audio-channel
+
+record-uri = "Record-URI" ":" [ "<" uri ">"
+ ";" "size" "=" 1*19DIGIT
+ ";" "duration" "=" 1*19DIGIT] CRLF
+
+media-type = "Media-Type" ":" media-type-value CRLF
+
+max-time = "Max-Time" ":" 1*19DIGIT CRLF
+
+trim-length = "Trim-Length" ":" 1*19DIGIT CRLF
+
+final-silence = "Final-Silence" ":" 1*19DIGIT CRLF
+
+capture-on-speech        =    "Capture-On-Speech" ":"
+                              BOOLEAN CRLF
+
+; Verifier ABNF
+
+verifier-method = "START-SESSION"
+ / "END-SESSION"
+ / "QUERY-VOICEPRINT"
+ / "DELETE-VOICEPRINT"
+ / "VERIFY"
+ / "VERIFY-FROM-BUFFER"
+ / "VERIFY-ROLLBACK"
+ / "STOP"
+ / "CLEAR-BUFFER"
+ / "START-INPUT-TIMERS"
+ / "GET-INTERMEDIATE-RESULT"
+
+verifier-event = "VERIFICATION-COMPLETE"
+ / "START-OF-INPUT"
+
+verifier-header = repository-uri
+ / voiceprint-identifier
+ / verification-mode
+ / adapt-model
+ / abort-model
+ / min-verification-score
+ / num-min-verification-phrases
+ / num-max-verification-phrases
+ / no-input-timeout
+ / save-waveform
+ / media-type
+ / waveform-uri
+ / voiceprint-exists
+ / ver-buffer-utterance
+ / input-waveform-uri
+ / completion-cause
+ / completion-reason
+ / speech-complete-timeout
+ / new-audio-channel
+ / abort-verification
+ / start-input-timers
+ / input-type
+
+repository-uri = "Repository-URI" ":" uri CRLF
+
+voiceprint-identifier = "Voiceprint-Identifier" ":"
+ vid *[";" vid] CRLF
+vid = 1*VCHAR ["." 1*VCHAR]
+
+verification-mode        =    "Verification-Mode" ":"
+                              verification-mode-string CRLF
+
+verification-mode-string = "train" / "verify"
+
+adapt-model = "Adapt-Model" ":" BOOLEAN CRLF
+
+abort-model = "Abort-Model" ":" BOOLEAN CRLF
+
+min-verification-score = "Min-Verification-Score" ":"
+ [ %x2D ] FLOAT CRLF
+
+num-min-verification-phrases = "Num-Min-Verification-Phrases"
+ ":" 1*19DIGIT CRLF
+
+num-max-verification-phrases = "Num-Max-Verification-Phrases"
+ ":" 1*19DIGIT CRLF
+
+voiceprint-exists = "Voiceprint-Exists" ":"
+ BOOLEAN CRLF
+
+ver-buffer-utterance = "Ver-Buffer-Utterance" ":"
+ BOOLEAN CRLF
+
+input-waveform-uri = "Input-Waveform-URI" ":" uri CRLF
+
+abort-verification       =    "Abort-Verification" ":"
+                              BOOLEAN CRLF
+
+ The following productions add a new SDP session-level attribute. See
+ Paragraph 5.
+
+ cmid-attribute = "a=cmid:" identification-tag
+
+ identification-tag = token
+
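The start-line productions above can be exercised with a short, non-normative sketch. It assembles a minimal SPEAK request and checks it against the request-line production. Per the protocol, message-length covers the entire message, including the digits of the length field itself, hence the fixed-point loop; the header value and function names here are purely illustrative.

```python
import re

# request-line = mrcp-version SP message-length SP method-name
#                SP request-id CRLF
REQUEST_LINE = re.compile(
    r"MRCP/(\d{1,2})\.(\d{1,2}) (\d{1,19}) ([A-Z-]+) (\d{1,10})\r\n")

def build_request(method, request_id, headers):
    """Render a minimal MRCPv2 request.  message-length counts the
    whole message, including the digits of the length field itself,
    so the rendering is iterated until the length is stable."""
    def render(length):
        lines = [f"MRCP/2.0 {length} {method} {request_id}"]
        lines += [f"{name}: {value}" for name, value in headers]
        # An empty line terminates the header section (no body here).
        return "\r\n".join(lines) + "\r\n" + "\r\n"
    length = 0
    while True:
        message = render(length)
        if len(message) == length:
            return message
        length = len(message)

msg = build_request("SPEAK", 543257,
                    [("Channel-Identifier", "32AECB23433801@speechsynth")])
m = REQUEST_LINE.match(msg)
# The rendered length field equals the actual message size.
print(m.group(4), int(m.group(3)) == len(msg))
```

The same regular-expression style can be extended to response-line and event-line, which differ only in the fields that follow message-length.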
+16. XML Schemas
+
+16.1. NLSML Schema Definition
+
+ <?xml version="1.0" encoding="UTF-8"?>
+ <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
+ targetNamespace="urn:ietf:params:xml:ns:mrcpv2"
+ xmlns="urn:ietf:params:xml:ns:mrcpv2"
+ elementFormDefault="qualified"
+ attributeFormDefault="unqualified" >
+ <xs:annotation>
+ <xs:documentation> Natural Language Semantic Markup Schema
+ </xs:documentation>
+ </xs:annotation>
+ <xs:include schemaLocation="enrollment-schema.rng"/>
+ <xs:include schemaLocation="verification-schema.rng"/>
+ <xs:element name="result">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element name="interpretation" maxOccurs="unbounded">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element name="instance">
+ <xs:complexType mixed="true">
+ <xs:sequence minOccurs="0">
+ <xs:any namespace="##other" processContents="lax"/>
+ </xs:sequence>
+ </xs:complexType>
+ </xs:element>
+ <xs:element name="input" minOccurs="0">
+
+ <xs:complexType mixed="true">
+ <xs:choice>
+ <xs:element name="noinput" minOccurs="0"/>
+ <xs:element name="nomatch" minOccurs="0"/>
+ <xs:element name="input" minOccurs="0"/>
+ </xs:choice>
+ <xs:attribute name="mode"
+ type="xs:string"
+ default="speech"/>
+ <xs:attribute name="confidence"
+ type="confidenceinfo"
+ default="1.0"/>
+ <xs:attribute name="timestamp-start"
+ type="xs:string"/>
+ <xs:attribute name="timestamp-end"
+ type="xs:string"/>
+ </xs:complexType>
+ </xs:element>
+ </xs:sequence>
+ <xs:attribute name="confidence" type="confidenceinfo"
+ default="1.0"/>
+ <xs:attribute name="grammar" type="xs:anyURI"
+ use="optional"/>
+ </xs:complexType>
+ </xs:element>
+ <xs:element name="enrollment-result"
+ type="enrollment-contents"/>
+ <xs:element name="verification-result"
+ type="verification-contents"/>
+ </xs:sequence>
+ <xs:attribute name="grammar" type="xs:anyURI"
+ use="optional"/>
+ </xs:complexType>
+ </xs:element>
+
+ <xs:simpleType name="confidenceinfo">
+ <xs:restriction base="xs:float">
+ <xs:minInclusive value="0.0"/>
+ <xs:maxInclusive value="1.0"/>
+ </xs:restriction>
+ </xs:simpleType>
+ </xs:schema>
+
+16.2. Enrollment Results Schema Definition
+
+ <?xml version="1.0" encoding="UTF-8"?>
+
+ <!-- MRCP Enrollment Schema
+ (See http://www.oasis-open.org/committees/relax-ng/spec.html)
+ -->
+
+ <grammar datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
+ ns="urn:ietf:params:xml:ns:mrcpv2"
+ xmlns="http://relaxng.org/ns/structure/1.0">
+
+ <start>
+ <element name="enrollment-result">
+ <ref name="enrollment-content"/>
+ </element>
+ </start>
+
+ <define name="enrollment-content">
+ <interleave>
+ <element name="num-clashes">
+ <data type="nonNegativeInteger"/>
+ </element>
+ <element name="num-good-repetitions">
+ <data type="nonNegativeInteger"/>
+ </element>
+ <element name="num-repetitions-still-needed">
+ <data type="nonNegativeInteger"/>
+ </element>
+ <element name="consistency-status">
+ <choice>
+ <value>consistent</value>
+ <value>inconsistent</value>
+ <value>undecided</value>
+ </choice>
+ </element>
+ <optional>
+ <element name="clash-phrase-ids">
+ <oneOrMore>
+ <element name="item">
+ <data type="token"/>
+ </element>
+ </oneOrMore>
+ </element>
+ </optional>
+ <optional>
+ <element name="transcriptions">
+ <oneOrMore>
+
+ <element name="item">
+ <text/>
+ </element>
+ </oneOrMore>
+ </element>
+ </optional>
+ <optional>
+ <element name="confusable-phrases">
+ <oneOrMore>
+ <element name="item">
+ <text/>
+ </element>
+ </oneOrMore>
+ </element>
+ </optional>
+ </interleave>
+ </define>
+
+ </grammar>
+
+16.3. Verification Results Schema Definition
+
+   <?xml version="1.0" encoding="UTF-8"?>
+
+ <!-- MRCP Verification Results Schema
+ (See http://www.oasis-open.org/committees/relax-ng/spec.html)
+ -->
+
+ <grammar datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
+ ns="urn:ietf:params:xml:ns:mrcpv2"
+ xmlns="http://relaxng.org/ns/structure/1.0">
+
+ <start>
+ <element name="verification-result">
+ <ref name="verification-contents"/>
+ </element>
+ </start>
+
+ <define name="verification-contents">
+ <element name="voiceprint">
+ <ref name="firstVoiceprintContent"/>
+ </element>
+ <zeroOrMore>
+ <element name="voiceprint">
+ <ref name="restVoiceprintContent"/>
+ </element>
+ </zeroOrMore>
+ </define>
+
+ <define name="firstVoiceprintContent">
+ <attribute name="id">
+ <data type="string"/>
+ </attribute>
+ <interleave>
+ <optional>
+ <element name="adapted">
+ <data type="boolean"/>
+ </element>
+ </optional>
+ <optional>
+ <element name="needmoredata">
+ <ref name="needmoredataContent"/>
+ </element>
+ </optional>
+ <optional>
+ <element name="incremental">
+ <ref name="firstCommonContent"/>
+ </element>
+ </optional>
+ <element name="cumulative">
+ <ref name="firstCommonContent"/>
+ </element>
+ </interleave>
+ </define>
+
+ <define name="restVoiceprintContent">
+ <attribute name="id">
+ <data type="string"/>
+ </attribute>
+ <element name="cumulative">
+ <ref name="restCommonContent"/>
+ </element>
+ </define>
+
+ <define name="firstCommonContent">
+ <interleave>
+ <element name="decision">
+ <ref name="decisionContent"/>
+ </element>
+ <optional>
+ <element name="utterance-length">
+ <ref name="utterance-lengthContent"/>
+ </element>
+ </optional>
+ <optional>
+ <element name="device">
+ <ref name="deviceContent"/>
+
+ </element>
+ </optional>
+ <optional>
+ <element name="gender">
+ <ref name="genderContent"/>
+ </element>
+ </optional>
+ <zeroOrMore>
+ <element name="verification-score">
+ <ref name="verification-scoreContent"/>
+ </element>
+ </zeroOrMore>
+ </interleave>
+ </define>
+
+ <define name="restCommonContent">
+ <interleave>
+ <optional>
+ <element name="decision">
+ <ref name="decisionContent"/>
+ </element>
+ </optional>
+ <optional>
+ <element name="device">
+ <ref name="deviceContent"/>
+ </element>
+ </optional>
+ <optional>
+ <element name="gender">
+ <ref name="genderContent"/>
+ </element>
+ </optional>
+ <zeroOrMore>
+ <element name="verification-score">
+ <ref name="verification-scoreContent"/>
+ </element>
+ </zeroOrMore>
+ </interleave>
+ </define>
+
+ <define name="decisionContent">
+ <choice>
+ <value>accepted</value>
+ <value>rejected</value>
+ <value>undecided</value>
+ </choice>
+ </define>
+
+ <define name="needmoredataContent">
+ <data type="boolean"/>
+ </define>
+
+ <define name="utterance-lengthContent">
+ <data type="nonNegativeInteger"/>
+ </define>
+
+ <define name="deviceContent">
+ <choice>
+ <value>cellular-phone</value>
+ <value>electret-phone</value>
+ <value>carbon-button-phone</value>
+ <value>unknown</value>
+ </choice>
+ </define>
+
+ <define name="genderContent">
+ <choice>
+ <value>male</value>
+ <value>female</value>
+ <value>unknown</value>
+ </choice>
+ </define>
+
+ <define name="verification-scoreContent">
+ <data type="float">
+ <param name="minInclusive">-1</param>
+ <param name="maxInclusive">1</param>
+ </data>
+ </define>
+
+ </grammar>
+
+17. References
+
+17.1. Normative References
+
+ [ISO.8859-1.1987]
+ International Organization for Standardization,
+ "Information technology - 8-bit single byte coded graphic
+ - character sets - Part 1: Latin alphabet No. 1, JTC1/
+ SC2", ISO Standard 8859-1, 1987.
+
+ [RFC0793] Postel, J., "Transmission Control Protocol", STD 7,
+ RFC 793, September 1981.
+
+ [RFC1035] Mockapetris, P., "Domain names - implementation and
+ specification", STD 13, RFC 1035, November 1987.
+
+ [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
+ Requirement Levels", BCP 14, RFC 2119, March 1997.
+
+ [RFC2326] Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time
+ Streaming Protocol (RTSP)", RFC 2326, April 1998.
+
+ [RFC2392] Levinson, E., "Content-ID and Message-ID Uniform Resource
+ Locators", RFC 2392, August 1998.
+
+ [RFC2483] Mealling, M. and R. Daniel, "URI Resolution Services
+ Necessary for URN Resolution", RFC 2483, January 1999.
+
+ [RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
+ Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
+ Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.
+
+ [RFC3023] Murata, M., St. Laurent, S., and D. Kohn, "XML Media
+ Types", RFC 3023, January 2001.
+
+ [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
+ A., Peterson, J., Sparks, R., Handley, M., and E.
+ Schooler, "SIP: Session Initiation Protocol", RFC 3261,
+ June 2002.
+
+ [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
+ with Session Description Protocol (SDP)", RFC 3264,
+ June 2002.
+
+ [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
+ Jacobson, "RTP: A Transport Protocol for Real-Time
+ Applications", STD 64, RFC 3550, July 2003.
+
+ [RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO
+ 10646", STD 63, RFC 3629, November 2003.
+
+ [RFC3688] Mealling, M., "The IETF XML Registry", BCP 81, RFC 3688,
+ January 2004.
+
+ [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
+ Norrman, "The Secure Real-time Transport Protocol (SRTP)",
+ RFC 3711, March 2004.
+
+ [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
+ Resource Identifier (URI): Generic Syntax", STD 66,
+ RFC 3986, January 2005.
+
+ [RFC4145] Yon, D. and G. Camarillo, "TCP-Based Media Transport in
+ the Session Description Protocol (SDP)", RFC 4145,
+ September 2005.
+
+ [RFC4288] Freed, N. and J. Klensin, "Media Type Specifications and
+ Registration Procedures", BCP 13, RFC 4288, December 2005.
+
+ [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
+ Description Protocol", RFC 4566, July 2006.
+
+ [RFC4568] Andreasen, F., Baugher, M., and D. Wing, "Session
+ Description Protocol (SDP) Security Descriptions for Media
+ Streams", RFC 4568, July 2006.
+
+ [RFC4572] Lennox, J., "Connection-Oriented Media Transport over the
+ Transport Layer Security (TLS) Protocol in the Session
+ Description Protocol (SDP)", RFC 4572, July 2006.
+
+ [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an
+ IANA Considerations Section in RFCs", BCP 26, RFC 5226,
+ May 2008.
+
+ [RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax
+ Specifications: ABNF", STD 68, RFC 5234, January 2008.
+
+ [RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security
+ (TLS) Protocol Version 1.2", RFC 5246, August 2008.
+
+ [RFC5322] Resnick, P., Ed., "Internet Message Format", RFC 5322,
+ October 2008.
+
+ [RFC5646] Phillips, A. and M. Davis, "Tags for Identifying
+ Languages", BCP 47, RFC 5646, September 2009.
+
+ [RFC5888] Camarillo, G. and H. Schulzrinne, "The Session Description
+ Protocol (SDP) Grouping Framework", RFC 5888, June 2010.
+
+ [RFC5905] Mills, D., Martin, J., Burbank, J., and W. Kasch, "Network
+ Time Protocol Version 4: Protocol and Algorithms
+ Specification", RFC 5905, June 2010.
+
+ [RFC5922] Gurbani, V., Lawrence, S., and A. Jeffrey, "Domain
+ Certificates in the Session Initiation Protocol (SIP)",
+ RFC 5922, June 2010.
+
+ [RFC6265] Barth, A., "HTTP State Management Mechanism", RFC 6265,
+ April 2011.
+
+ [W3C.REC-semantic-interpretation-20070405]
+ Tichelen, L. and D. Burke, "Semantic Interpretation for
+ Speech Recognition (SISR) Version 1.0", World Wide Web
+ Consortium Recommendation REC-semantic-
+ interpretation-20070405, April 2007,
+ <http://www.w3.org/TR/2007/
+ REC-semantic-interpretation-20070405>.
+
+ [W3C.REC-speech-grammar-20040316]
+ McGlashan, S. and A. Hunt, "Speech Recognition Grammar
+ Specification Version 1.0", World Wide Web Consortium
+ Recommendation REC-speech-grammar-20040316, March 2004,
+ <http://www.w3.org/TR/2004/REC-speech-grammar-20040316>.
+
+ [W3C.REC-speech-synthesis-20040907]
+ Walker, M., Burnett, D., and A. Hunt, "Speech Synthesis
+ Markup Language (SSML) Version 1.0", World Wide Web
+ Consortium Recommendation REC-speech-synthesis-20040907,
+ September 2004,
+ <http://www.w3.org/TR/2004/REC-speech-synthesis-20040907>.
+
+ [W3C.REC-xml-names11-20040204]
+ Layman, A., Bray, T., Hollander, D., and R. Tobin,
+ "Namespaces in XML 1.1", World Wide Web Consortium First
+ Edition REC-xml-names11-20040204, February 2004,
+ <http://www.w3.org/TR/2004/REC-xml-names11-20040204>.
+
+17.2. Informative References
+
+ [ISO.8601.1988]
+ International Organization for Standardization, "Data
+ elements and interchange formats - Information interchange
+ - Representation of dates and times", ISO Standard 8601,
+ June 1988.
+
+   [Q.23]    International Telecommunication Union, "Technical
+             Features of Push-Button Telephone Sets", ITU-T Q.23, 1993.
+
+ [RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
+ Extensions (MIME) Part Two: Media Types", RFC 2046,
+ November 1996.
+
+ [RFC2818] Rescorla, E., "HTTP Over TLS", RFC 2818, May 2000.
+
+ [RFC4217] Ford-Hutchinson, P., "Securing FTP with TLS", RFC 4217,
+ October 2005.
+
+ [RFC4267] Froumentin, M., "The W3C Speech Interface Framework Media
+ Types: application/voicexml+xml, application/ssml+xml,
+ application/srgs, application/srgs+xml, application/
+ ccxml+xml, and application/pls+xml", RFC 4267,
+ November 2005.
+
+ [RFC4301] Kent, S. and K. Seo, "Security Architecture for the
+ Internet Protocol", RFC 4301, December 2005.
+
+ [RFC4313] Oran, D., "Requirements for Distributed Control of
+ Automatic Speech Recognition (ASR), Speaker
+ Identification/Speaker Verification (SI/SV), and Text-to-
+ Speech (TTS) Resources", RFC 4313, December 2005.
+
+ [RFC4395] Hansen, T., Hardie, T., and L. Masinter, "Guidelines and
+ Registration Procedures for New URI Schemes", BCP 35,
+ RFC 4395, February 2006.
+
+ [RFC4463] Shanmugham, S., Monaco, P., and B. Eberman, "A Media
+ Resource Control Protocol (MRCP) Developed by Cisco,
+ Nuance, and Speechworks", RFC 4463, April 2006.
+
+ [RFC4467] Crispin, M., "Internet Message Access Protocol (IMAP) -
+ URLAUTH Extension", RFC 4467, May 2006.
+
+ [RFC4733] Schulzrinne, H. and T. Taylor, "RTP Payload for DTMF
+ Digits, Telephony Tones, and Telephony Signals", RFC 4733,
+ December 2006.
+
+ [RFC4960] Stewart, R., "Stream Control Transmission Protocol",
+ RFC 4960, September 2007.
+
+ [RFC6454] Barth, A., "The Web Origin Concept", RFC 6454,
+ December 2011.
+
+ [W3C.REC-emma-20090210]
+ Johnston, M., Baggia, P., Burnett, D., Carter, J., Dahl,
+ D., McCobb, G., and D. Raggett, "EMMA: Extensible
+ MultiModal Annotation markup language", World Wide Web
+ Consortium Recommendation REC-emma-20090210,
+ February 2009,
+ <http://www.w3.org/TR/2009/REC-emma-20090210>.
+
+ [W3C.REC-pronunciation-lexicon-20081014]
+ Baggia, P., Bagshaw, P., Burnett, D., Carter, J., and F.
+ Scahill, "Pronunciation Lexicon Specification (PLS)",
+ World Wide Web Consortium Recommendation
+ REC-pronunciation-lexicon-20081014, October 2008,
+ <http://www.w3.org/TR/2008/
+ REC-pronunciation-lexicon-20081014>.
+
+ [W3C.REC-voicexml20-20040316]
+ Danielsen, P., Porter, B., Hunt, A., Rehor, K., Lucas, B.,
+ Burnett, D., Ferrans, J., Tryphonas, S., McGlashan, S.,
+ and J. Carter, "Voice Extensible Markup Language
+ (VoiceXML) Version 2.0", World Wide Web Consortium
+ Recommendation REC-voicexml20-20040316, March 2004,
+ <http://www.w3.org/TR/2004/REC-voicexml20-20040316>.
+
+ [refs.javaSpeechGrammarFormat]
+ Sun Microsystems, "Java Speech Grammar Format Version
+ 1.0", October 1998.
+
+Appendix A. Contributors
+
+ Pierre Forgues
+ Nuance Communications Ltd.
+ 1500 University Street
+ Suite 935
+ Montreal, Quebec
+ Canada H3A 3S7
+
+ EMail: forgues@nuance.com
+
+
+ Charles Galles
+ Intervoice, Inc.
+ 17811 Waterview Parkway
+ Dallas, Texas 75252
+ USA
+
+ EMail: charles.galles@intervoice.com
+
+
+ Klaus Reifenrath
+   ScanSoft, Inc.
+ Guldensporenpark 32
+ Building D
+ 9820 Merelbeke
+ Belgium
+
+ EMail: klaus.reifenrath@scansoft.com
+
+Appendix B. Acknowledgements
+
+ Andre Gillet (Nuance Communications)
+ Andrew Hunt (ScanSoft)
+ Andrew Wahbe (Genesys)
+ Aaron Kneiss (ScanSoft)
+ Brian Eberman (ScanSoft)
+ Corey Stohs (Cisco Systems, Inc.)
+ Dave Burke (VoxPilot)
+ Jeff Kusnitz (IBM Corp)
+ Ganesh N. Ramaswamy (IBM Corp)
+ Klaus Reifenrath (ScanSoft)
+ Kristian Finlator (ScanSoft)
+ Magnus Westerlund (Ericsson)
+ Martin Dragomirecky (Cisco Systems, Inc.)
+ Paolo Baggia (Loquendo)
+ Peter Monaco (Nuance Communications)
+ Pierre Forgues (Nuance Communications)
+
+ Ran Zilca (IBM Corp)
+ Suresh Kaliannan (Cisco Systems, Inc.)
+ Skip Cave (Intervoice, Inc.)
+ Thomas Gal (LumenVox)
+
+   The chairs of the SPEECHSC working group are Eric Burger (Georgetown
+   University) and Dave Oran (Cisco Systems, Inc.).
+
+ Many thanks go in particular to Robert Sparks, Alex Agranovsky, and
+ Henry Phan, who were there at the end to dot all the i's and cross
+ all the t's.
+
+Authors' Addresses
+
+ Daniel C. Burnett
+ Voxeo
+ 189 South Orange Avenue #1000
+ Orlando, FL 32801
+ USA
+
+ EMail: dburnett@voxeo.com
+
+
+ Saravanan Shanmugham
+ Cisco Systems, Inc.
+ 170 W. Tasman Dr.
+ San Jose, CA 95134
+ USA
+
+ EMail: sarvi@cisco.com