Internet Engineering Task Force                              J. Engelsma
Internet-Draft                                                  Motorola
Expires: February 1, 2008                                       C. Cross
                                                                     IBM
                                                           July 31, 2007


            Distributed Multimodal Synchronization Protocol
                       draft-engelsma-dmsp-04.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on February 1, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   This document proposes a Distributed Multimodal Synchronization
   Protocol (DMSP) designed to enable multimodal interaction for mobile
   devices by accessing services in the network.  More specifically,
   this protocol coordinates events of interest between a visual
   browser or application running on a mobile device with a VoiceXML
   (Voice Extensible Markup Language) browser running in the network.

Engelsma & Cross        Expires February 1, 2008                [Page 1]
Internet-Draft                DMSP Protocol                    July 2007

Table of Contents

   1.  Conventions
   2.  Introduction
   3.  System Architecture
     3.1.  Multimodal Mark-up Language
     3.2.  VoIP Audio Session Control
     3.3.  Distributed Speech Audio Format
     3.4.  Distributed Speech Engine Protocol
     3.5.  View Independent Model
     3.6.  Distributed Multimodal Synchronization Protocol (DMSP)
   4.  The Distributed Multimodal Synchronization Protocol
     4.1.  Conceptual Framework for DMSP
     4.2.  Binary Message Encodings
       4.2.1.  Intrinsic Data Types
       4.2.2.  Message Type Constants and Basic Structures
       4.2.3.  Signal Messages
       4.2.4.  Command Messages
       4.2.5.  Response Messages
       4.2.6.  Event Messages
     4.3.  XML Message Encoding
     4.4.  State Machines
       4.4.1.  GUI User Agent State Machine
       4.4.2.  VoiceXML User Agent State Machine
   5.  Message Transport
   6.  IANA Considerations
   7.  Security Considerations
   8.  Contributors
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Schema for XML Message Encoding
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Conventions

   The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
   SHOULD NOT, RECOMMENDED, NOT RECOMMENDED, MAY, and OPTIONAL, when
   they appear in this document, are to be interpreted as described in
   RFC 2119 [2].

   The following acronyms are used in this document:

   3GPP - 3rd Generation Partnership Project
   ASR  - Automatic Speech Recognition
   DMSP - Distributed Multimodal Synchronization Protocol
   DOM  - Document Object Model
   DSR  - Distributed Speech Recognition
   ETSI - European Telecommunications Standards Institute
   GUA  - GUI User Agent
   GUI  - Graphical User Interface
   IM   - Interaction Manager
   HTTP - Hypertext Transfer Protocol
   J2ME - Java 2 Platform, Micro Edition
   MRCP - Media Resource Control Protocol
   MVC  - Model-View-Controller
   OMA  - Open Mobile Alliance
   SIP  - Session Initiation Protocol
   TTS  - Text-To-Speech Synthesis
   VUA  - VoiceXML User Agent
   VUI  - Voice User Interface
   VXML - VoiceXML (Voice Extensible Markup Language 2.0)
   W3C  - World Wide Web Consortium

2.  Introduction

   The advancement of wireless networking, mobile computing, and voice
   processing technologies has created a mobile computing environment
   where access to network based content and services is possible from
   mobile devices.  While the compact form factor of these devices is
   an essential and desirable characteristic from the mobility
   perspective, it unfortunately gives rise to a "user interface
   bottleneck".  That is, the small displays and limited keypads are
   significant barriers to usability.

   The aim of multimodal user interfaces is to enhance the usability of
   mobile devices.  The term "multimodal interface" in the context of
   this specification refers to augmenting the existing graphical user
   interface (GUI) of mobile devices with a voice user interface (VUI).
   In this way, the strengths of one modality offset the weaknesses of
   the other.
   For example, use speech to enter data that would be tedious to enter
   on a small keypad.  Similarly, use tactile input to disambiguate or
   confirm speech recognition results, or in lieu of speech recognition
   in a noisy environment.  Speech output also improves the
   accessibility of mobile applications for visually impaired users and
   is a means for addressing growing safety concerns by providing
   hands-free operation.  A visual display helps the user better retain
   and interpret information presented audibly via speech prompts.

   Section 3 elaborates on a broad architectural framework that
   combines the existing web and VoiceXML (Voice Extensible Markup
   Language) [5] ecosystems and the industry standards upon which they
   are based.  The scope of this specification is limited to one
   particular architectural configuration within this framework: the
   coordination of a GUI browser or application running on a mobile
   device with a VoiceXML browser running in the network.  The
   Distributed Multimodal Synchronization Protocol (DMSP) accomplishes
   this and is fully specified in Section 4.  Section 5 introduces
   transport possibilities for DMSP, and Section 6 and Section 7 take
   up IANA and Security considerations, respectively.

3.  System Architecture

   A mobile device multimodal reference architecture enables multimodal
   interaction for mobile devices that are used to access web
   applications and data.  The architectures under development by
   various standards bodies acknowledge that devices, networks, and
   servers may distribute the workload based on the capabilities of the
   components.  The W3C Multimodal Interaction Working Group has
   published a working draft [6] that addresses the distribution of
   modalities and employs an Interaction Manager to coordinate and
   synchronize user interface events.
   The Open Mobile Alliance (OMA) Browser and Content, Mobile
   Application Environment (BAC MAE) group has published an
   architecture document [7] that lays out the relationship of
   components, interfaces, and protocols necessary to implement
   multimodal solutions through the interoperation of components from
   diverse vendors.  Both the W3C and OMA assemble relevant standards
   from the W3C and IETF, and both identify the need for a
   synchronization protocol to be created as a standard in order to
   enable the desired open and heterogeneous computing environment.
   This submission documents such a protocol.

   With this in mind, the following areas are identified for
   standardization:

   1.  Multimodal Mark-up Language

   2.  Voice Over IP (VoIP) Audio Session Control

   3.  Distributed Speech Audio Format

   4.  Distributed Speech Engine Control Protocol

   5.  View Independent Model

   6.  Distributed Multimodal Synchronization Protocol

   At present, all of these areas have standardization activity under
   way, with the exception of the distributed multimodal
   synchronization protocol.  This submission proposes such a protocol.

3.1.  Multimodal Mark-up Language

   The mark-up language for multimodal web applications is currently in
   progress in the W3C Multimodal Interaction Activity
   (http://www.w3c.org).

3.2.  VoIP Audio Session Control

   This is the signaling protocol for configurations that use
   distributed voice services.  The Session Initiation Protocol (SIP),
   RFC 3261 [13], specified by the IETF's SIP Working Group, is the
   preferred choice for this.

3.3.  Distributed Speech Audio Format

   This is the media format of the audio data for configurations that
   use network based speech engines.
   Existing media frameworks such as RTP in the IETF provide protocols
   for the transport of compressed speech or distributed speech
   recognition (DSR) features such as those standardised by European
   Telecommunications Standards Institute (ETSI) Aurora [16][17] and
   3rd Generation Partnership Project (3GPP) (http://www.etsi.org,
   http://www.3gpp.org).

3.4.  Distributed Speech Engine Protocol

   This is the protocol for accessing and controlling speech engines in
   configurations that use distributed speech engines.  The IETF
   Working Group on Speech Services Control created the Media Resource
   Control Protocol (MRCP) Version 2 [21].

3.5.  View Independent Model

   This is the data model used in a multimodal system to which all the
   modalities synchronize.  The W3C DOM satisfies this requirement with
   DOM 2 Events [22] being the mechanism for interfacing with
   modalities.

3.6.  Distributed Multimodal Synchronization Protocol (DMSP)

   This is the protocol for synchronizing a GUI user agent (GUA) with a
   remote VoiceXML user agent (VUA).  That is, this protocol enables
   the rendering of VoiceXML markup to be distributed over a wired or
   wireless connection to network based voice servers.  In general, the
   protocol supports the synchronization and data exchange requirements
   of web based multimodal applications.  Since the reference
   architectures are based on a Model-View-Controller (MVC) pattern,
   this protocol needs to support both network based and client based
   View Independent Models.

4.  The Distributed Multimodal Synchronization Protocol

   The goal of the Distributed Multimodal Synchronization Protocol
   (DMSP) is to provide an open infrastructure for multimodal
   interaction based on a loosely coupled web-based architecture.
   Instead of developing a new multimodal browser from scratch, we
   adopt the strategy of coordinating existing and well-established
   single-modality browsers which together form a multimodal
   super-browser.

   There are two architectural configurations of interest that are
   enabled by the protocol:

   1.  The distributed client approach where a GUI browser or other
       user agent running on a mobile device coordinates with a
       VoiceXML browser running in the network in a peer-to-peer
       fashion.

   2.  The approach that uses an Interaction Manager hosted in the
       network to coordinate synchronization between distributed user
       agents supporting separate modalities.

   The first configuration of interest here is that of a web browser or
   other GUI user agent on a mobile device that is coordinated with a
   VoiceXML browser in the network, as shown in Figure 2.  In this
   configuration, the abstract interfaces can be thought of as message
   classes, where each method on the interface is represented with a
   message type.  DMSP is therefore a concrete specification of these
   abstract interfaces in the form of messages.

   +-------------+                              +---------------+
   |             |--commands-----------> O------|               |
   |             |                CommandControl|               |
   | GUI Browser |--registration-------> O------| Voice Browser |
   |             |                  EventControl|               |
   |             |------O <-------------events--|               |
   |             |EventListener                 |               |
   +-------------+                              +---------------+

                A GUI-browser driven multimodal browser.

                                Figure 2

   The second configuration, shown in Figure 3, employs an Interaction
   Manager to effect a loosely coupled coordination of user agents
   across the network.  The IM collects user interface events in one
   mode and publishes them to the other modes being synchronized.  In
   the example, the GUI User Agent and the Voice User Agent each
   constitute one interaction mode.  In this context the IM is
   responsible for:

   1.  Receiving events and signals from one user agent.

   2.  Finding appropriate action to take to reflect that user
       interaction in all other user agents.

   3.  Dispatching cross-markup events and event handlers from one user
       agent to another.

   +-------------+                               +---------------+
   |         (CC)|--O <-2----+       +----2-> O--|(CC)           |
   |             |           |       |           |               |
   | GUI Browser |           |       |           | Voice Browser |
   |         (EL)|--O <-1--+ |       | +--1-> O--|(EL)           |
   |             |         | |       | |         |               |
   |             |--+      | |       | |      +--|               |
   +-------------+  |      | |       | |      |  +---------------+
                    |      | |       | |      |
                    |      | |       | |      |
                    |     +-------------+     |
                    |     |             |     |
                    |     | Interaction |     |
                    |     |   Manager   |     |
                    |     |             |     |
                    +-> O-|(EL)     (EL)|-O <-+
                          |             |
                          +-------------+

          An Interaction Manager Controlled Multimodal System

   Legend:
      (CC) - CommandControl
      (EL) - EventListener
      1 - Registration
      2 - Commands
      3 - Events

                                Figure 3

4.1.  Conceptual Framework for DMSP

   The two architectural configurations shown above can be realized
   with four basic abstract interfaces: Command, Response, Event, and
   Signal.  Each of these interfaces defines a set of methods and
   related data structures which can be exchanged between the user
   agents of a particular multimodal system.

   Signals are one-way methods used to negotiate internal processing
   state among entities.  While responses are not anticipated in the
   typical sunny day scenario, a signaled entity may invoke a response
   method of the sender when an error occurs during signal processing.
   Incorporating rudimentary signaling mechanisms into the protocol
   provides flexibility with regard to the transport mechanism an
   implementation of the protocol uses.

   When an entity invokes a Command method of another entity in the
   system, it anticipates a correlated call on its Response interface.

   The methods of Event interfaces serve as asynchronous notifications.
   Typically, one or more events can occur as a result of a Command
   method invocation.
   The entities within a particular multimodal system may register to
   listen to events generated by other entities within the system.

   While a variety of architectural configurations can be realized with
   these basic abstractions, the primary focus of the protocol and
   state machine specifications that follow is the synchronization of a
   GUA with a remote VUA as shown previously in Figure 2.  In this
   particular configuration, the abstract interfaces can be thought of
   as message classes, where each method on the interface is
   represented with a message type.  DMSP is therefore a concrete
   specification of these abstract interfaces in the form of messages.

   We present two encodings for DMSP, binary and XML.  Given the low
   bandwidth of conventional wide area mobile packet data networks and
   limited memory and processing resources of the vast majority of
   mobile devices in use today, a simple and compact binary message set
   is appropriate for mobile devices communicating with a VoiceXML user
   agent on the network.  For Interaction Manager based systems, the IM
   and Voice Server may be implemented on a high speed network where
   other design considerations dominate over bandwidth and memory.
   Therefore we also present XML bindings in the form of an XML schema
   in Appendix A as an option for implementers of those systems.

4.2.  Binary Message Encodings

4.2.1.  Intrinsic Data Types

   Boolean data is encoded as a single byte having either a 0x00
   (false) or 0x01 (true) value.

   Integer data is encoded in big-endian variable length fields,
   depending on the maximum range of values allowed for the field in
   question.  Integer field lengths must be 1, 2, 4, or 8 bytes long.

   String data is encoded using a 2-byte (big-endian) length prefix
   followed by the specified number of bytes of UTF-8-encoded character
   data.  The longest string that may be transferred using this
   encoding is 65535 characters, assuming none of the characters
   requires more than one byte.

4.2.2.  Message Type Constants and Basic Structures

   The remainder of this section defines the various message type
   constants and basic structures reused by several messages.
   Sections 4.2.3 through 4.2.6 define the messages themselves.

4.2.2.1.  Message Types

   +--------------+-------+
   | Message Type | Value |
   +--------------+-------+
   | MSG_SIGNAL   | 0x01  |
   | MSG_COMMAND  | 0x02  |
   | MSG_RESPONSE | 0x03  |
   | MSG_EVENT    | 0x04  |
   +--------------+-------+

                         Table 1: Message Types

4.2.2.2.  Signal Message Subtypes

   +----------------+-------+---------------+
   | Message Type   | Value | Note          |
   +----------------+-------+---------------+
   | SIG_INIT       | 0x01  | See Table 10. |
   | SIG_VXML_START | 0x02  | See Table 11. |
   | SIG_CLOSE      | 0x03  | See Table 12. |
   +----------------+-------+---------------+

                    Table 2: Signal Message Subtypes

4.2.2.3.  Command Message Subtypes

   +-------------------------+-------+------+
   | Message Type            | Value | Note |
   +-------------------------+-------+------+
   | CMD_ADD_EVT_LISTENER    | 0x01  |      |
   | CMD_REMOVE_EVT_LISTENER | 0x02  |      |
   | CMD_CAN_DISPATCH        | 0x03  |      |
   | CMD_DISPATCH_EVT        | 0x04  |      |
   | CMD_LOAD_URL            | 0x05  |      |
   | CMD_LOAD_SRC            | 0x06  |      |
   | CMD_SET_FOCUS           | 0x07  |      |
   | CMD_GET_FOCUS           | 0x08  |      |
   | CMD_SET_FIELDS          | 0x09  |      |
   | CMD_GET_FIELDS          | 0x0A  |      |
   | CMD_CANCEL              | 0x0B  |      |
   | CMD_EXEC_FORM           | 0x0C  |      |
   | CMD_SET_COOKIES         | 0x0D  |      |
   | CMD_GET_COOKIES         | 0x0E  |      |
   +-------------------------+-------+------+

                   Table 3: Command Message Subtypes

4.2.2.4.  Response Message Subtypes

   +---------------------+---------+-----------------------------------+
   | Message Type        | Value   | Note                              |
   +---------------------+---------+-----------------------------------+
   | RESP_OK             | 0x01    | OK with no return value.          |
   | RESP_BOOL           | 0x02    | OK with boolean return value.     |
   | RESP_STRING         | 0x03    | OK with string return value.      |
   | reserved            | 0x04 -  | Reserved for primitive types.     |
   |                     | 0x40    |                                   |
   | RESP_FIELDS         | 0x41    | OK with list of field name/value  |
   |                     |         | pairs return value.               |
   | reserved            | 0x42 -  | Reserved for complex types.       |
   |                     | 0x64    |                                   |
   | RESP_SYNTAX_ERROR   | 0x65    | Error parsing a message.          |
   | RESP_SYMANTIC_ERROR | 0x66    | Error processing a message.       |
   | reserved            | 0x67 -  | Reserved for implementation error |
   |                     | 0xFF    | codes.                            |
   +---------------------+---------+-----------------------------------+

                   Table 4: Response Message Subtypes

   Reserved ranges are left for implementations to provide response
   messages for additional primitive and complex types and error codes.

4.2.2.5.  Event Message Subtypes

   +--------------------+----------+-----------------------------------+
   | Message Type       | Value    | Note                              |
   +--------------------+----------+-----------------------------------+
   | EVT_DOM_ACTIVATE   | 0x01     | See Table 32                      |
   | EVT_DOM_FOCUS_IN   | 0x02     | See Table 32                      |
   | EVT_DOM_FOCUS_OUT  | 0x03     | See Table 32                      |
   | EVT_CLICK          | 0x04     | See Table 33                      |
   | EVT_MOUSEDOWN      | 0x05     | See Table 33                      |
   | EVT_MOUSEUP        | 0x06     | See Table 33                      |
   | EVT_KEYDOWN        | 0x07     | See Table 35                      |
   | EVT_KEYUP          | 0x08     | See Table 35                      |
   | EVT_KEYPRESSED     | 0x09     | See Table 35                      |
   | EVT_ERROR          | 0x0A     | See Table 36                      |
   | EVT_CHANGE         | 0x0B     |                                   |
   | EVT_HELP           | 0x0C     | See Table 38                      |
   | EVT_NOMATCH        | 0x0D     | See Table 38                      |
   | EVT_NOINPUT        | 0x0E     | See Table 38                      |
   | EVT_VXMLDONE       | 0x0F     | See Table 37                      |
   | EVT_RECO_RESULT    | 0x10     | See Table 39                      |
   | EVT_RECO_RESULT_EX | 0x11     | See Table 39                      |
   | EVT_PLAYBACK_START | 0x12     | See Table 41                      |
   | EVT_PLAYBACK_STOP  | 0x13     | See Table 41                      |
   | EVT_PLAYBACK_MARK  | 0x14     | See Table 42                      |
   | reserved           | 0x15 -   | Reserved for future versions.     |
   |                    | 0x40     |                                   |
   | EVT_CUSTOM         | 0x41 -   | See Table 43                      |
   |                    | 0xFE     |                                   |
   | EVT_ALL            | 0xFF     | Used to denote "All Events" in    |
   |                    |          | event subscriptions.              |
   +--------------------+----------+-----------------------------------+

                    Table 5: Event Message Subtypes

4.2.2.6.  Error Codes

   +-------------------------+--------+--------------------------------+
   | Error Code              | Value  | Note                           |
   +-------------------------+--------+--------------------------------+
   | ERR_BAD_RESPONSE_REF    | 0x01   | Attribute refers to a          |
   |                         |        | non-existent request.          |
   | ERR_UNSUPPORTED_EVENT   | 0x02   | Browser does not support event |
   |                         |        | type.                          |
   | ERR_INVALID_TARGET      | 0x03   | Browser does not know the      |
   |                         |        | target node.                   |
   | ERR_UNSUPPORTED_COMMAND | 0x04   | Browser does not support the   |
   |                         |        | command.                       |
   | ERR_INVALID_STATE       | 0x05   | Invoking a command in the      |
   |                         |        | current state is not legal.    |
   | ERR_UNKNOWN_ERROR       | 0x06   | For implementation specific    |
   |                         |        | errors. See Table 36.          |
   | ERR_ILLEGAL_ARGUMENT    | 0x07   | Invoking a command with        |
   |                         |        | illegal arguments.             |
   +-------------------------+--------+--------------------------------+

                          Table 6: Error Codes

4.2.2.7.  Field Type

   FIELD is a structure consisting of the following fields:

   Field ID:  The name of the field.

   Value Count:  The number of values associated with the field.

   Field Value:  An array of one or more values associated with the
      field.

   +-------------+------------------+-------------+---------+
   | Field       | Type             | Byte Length | Value   |
   +-------------+------------------+-------------+---------+
   | Field ID    | String           | Variable    |         |
   | Value Count | Integer          | 1           | 1 - 255 |
   | Field Value | Array of Strings | Variable    |         |
   +-------------+------------------+-------------+---------+

                          Table 7: FIELD Type

4.2.2.8.  Recognition Result Type

   RESULT is a structure consisting of the following fields:

   Score:  The confidence score associated with the speech recognition
      result.

   Raw Utterance:  A text string containing the raw utterance result.

   Field Count:  The number of fields filled by the recognition
      results.

   Fields:  An array of one or more FIELD structures.

   +---------------+----------------+-------------+--------------+
   | Field         | Type           | Byte Length | Value        |
   +---------------+----------------+-------------+--------------+
   | Score         | Integer        | 1           | 0-100        |
   | Raw Utterance | String         | Variable    |              |
   | Field Count   | Integer        | 1           | 1-255        |
   | Fields        | Array of Field | Variable    | See Table 7. |
   +---------------+----------------+-------------+--------------+

                          Table 8: RESULT Type

4.2.2.9.  Extended Recognition Result Type

   RESULT_EX is a structure consisting of the following fields:

   Score:  The confidence score associated with the speech recognition
      result.

   Raw Utterance:  A text string containing the raw utterance result.

   Grammar:  The URI of the grammar that matched.

   Interpretation Type:  Designates the type of the Semantic
      Interpretation field.

   Semantic Interpretation:  A string containing the semantic
      interpretation data (parse tree and annotations), the format of
      which is determined from the preceding type field.  The W3C
      specification for Semantic Interpretation [10] provides one such
      scheme where the ECMAScript result is serialized into an XML
      fragment.

   +---------------+---------+-------------+--------------+
   | Field         | Type    | Byte Length | Value        |
   +---------------+---------+-------------+--------------+
   | Score         | Integer | 1           | 0-100        |
   | Raw Utterance | String  | Variable    |              |
   | Grammar       | String  | Variable    |              |
   | SI_Type       | Integer | 1           | 0 = SISR XML |
   | SI            | String  | Variable    |              |
   +---------------+---------+-------------+--------------+

                        Table 9: RESULT_EX Type

4.2.3.  Signal Messages

4.2.3.1.  Initialization Signal Message

   A GUA signals an intent to initiate the protocol by sending the VUA
   or IM the initialization message (SIG_INIT).
   If the VUA is able and willing to participate, it will send a
   SIG_INIT message of its own to the GUA that signaled.  VUA
   initiation of the protocol is not supported by this version of DMSP.

   In addition to the Message Type and Message Subtype fields, the
   session initialization message consists of the following fields:

   Version:  The DMSP version.  This field must be set to "1.0".

   Session ID:  A string uniquely identifying the session that is to be
      established.  The value of this field will likely be determined
      by the underlying transport mechanism.  Transport considerations
      are discussed in more detail in Section 5.

   User Agent:  Optional User-Agent field analogous to the HTTP
      request-header of the sending user agent.  The typical use of the
      field in an HTTP application allows a server application to
      tailor markup to a specific user agent.  An example use case for
      the DMSP is to indicate to the VUA that the client has local TTS
      capability and the VUA should send prompt text instead of
      streaming audio.

   +-----------------+---------+-------------+------------+
   | Field           | Type    | Byte Length | Value      |
   +-----------------+---------+-------------+------------+
   | Message Type    | Integer | 1           | MSG_SIGNAL |
   | Message Subtype | Integer | 1           | SIG_INIT   |
   | Version         | String  | Variable    | "1.0"      |
   | Session ID      | String  | Variable    |            |
   | User Agent      | String  | Variable    | optional   |
   +-----------------+---------+-------------+------------+

                Table 10: Session Initialization Message

4.2.3.2.  VXML Start Signal Message

   This message provides an alternate approach to establishing the
   protocol between the VUA and GUA.  The GUA signals its intent to
   initiate the protocol by sending a VXML start message
   (SIG_VXML_START) to the VUA or IM.  Unlike the SIG_INIT message,
   SIG_VXML_START signals the VUA to enter into an active dialog state.
   If the VUA is able and willing to participate, it will initialize
   itself by fetching and loading the specified VoiceXML document and
   beginning execution in the requested VoiceXML form.  Once
   initialized, the VUA will signal its readiness to the GUA by sending
   it a SIG_VXML_START of its own.

   The SIG_VXML_START message is typically used to initiate the
   protocol in use cases where the GUA wishes to expedite initiation of
   the protocol and is willing to listen to all events generated by the
   VUA.  The SIG_INIT message is used when fine-grained control over
   which events the client will listen to is needed, and latency is not
   an issue.

   In addition to the Message Type and Message Subtype fields, the
   SIG_VXML_START message consists of the following fields:

   Version:  The DMSP version.  User agents must send "1.0".

   Session ID:  A string uniquely identifying the session that is to be
      established.

   User Agent:  The user agent string of the GUA or VUA, depending on
      who is sending the message.

   Dialog Type:  An integer value that indicates the type of data
      contained in the Dialog Data field.  This field can be set to
      either CMD_LOAD_URL or CMD_LOAD_SRC.

   Dialog Data:  If the Dialog Type field is set to CMD_LOAD_URL, this
      field contains the URL of the VoiceXML document to be loaded.
      The URL must be a fully qualified URL that conforms to RFC 3986
      [3].  The URL must include a fragment component that identifies
      the VoiceXML form within the document that must be executed.  For
      example:

         http://example.com/avxmldoc.jsp?astringvar=val#initform

      If the Dialog Type field is set to CMD_LOAD_SRC, this field
      contains a conforming VoiceXML document [5].

   Event Type Count:  The number of elements in the subsequent Event
      Type array.

   Event Types:  An array of integer event types the sender is
      subscribing to.  Use EVT_ALL to subscribe to all events.  See
      Table 5.
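
   As an illustration only, the field list above, combined with the
   intrinsic data types of Section 4.2.1, might be serialized as in the
   following Python sketch.  The helper names encode_string and
   encode_vxml_start are hypothetical and not part of DMSP, and the
   sketch assumes that each element of the Event Types array occupies
   one byte, which this section does not state explicitly.

```python
import struct

# Constants copied from Tables 1, 2, 3 and 5 of this document.
MSG_SIGNAL = 0x01
SIG_VXML_START = 0x02
CMD_LOAD_URL = 0x05
CMD_LOAD_SRC = 0x06
EVT_ALL = 0xFF


def encode_string(text):
    """Encode a string as a 2-byte big-endian length plus UTF-8 bytes."""
    data = text.encode("utf-8")
    if len(data) > 0xFFFF:
        raise ValueError("string longer than 65535 encoded bytes")
    return struct.pack(">H", len(data)) + data


def encode_vxml_start(session_id, user_agent, dialog_type, dialog_data,
                      event_types):
    """Hypothetical helper: emit the fields in the order listed above."""
    if dialog_type not in (CMD_LOAD_URL, CMD_LOAD_SRC):
        raise ValueError("Dialog Type must be CMD_LOAD_URL or CMD_LOAD_SRC")
    if not 1 <= len(event_types) <= 255:
        raise ValueError("Event Type Count must be in the range 1-255")
    message = bytearray([MSG_SIGNAL, SIG_VXML_START])
    message += encode_string("1.0")          # Version
    message += encode_string(session_id)     # Session ID
    message += encode_string(user_agent)     # User Agent
    message.append(dialog_type)              # Dialog Type (1 byte)
    message += encode_string(dialog_data)    # Dialog Data
    message.append(len(event_types))         # Event Type Count (1 byte)
    message += bytes(event_types)            # Assumption: 1 byte per type
    return bytes(message)
```

   A GUA that wants every event would pass [EVT_ALL] as the last
   argument.  The session identifier and user agent strings are
   supplied by the caller, since their values depend on the transport
   and on the implementation.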

   The previous discussion on the User Agent field in Section 4.2.3.1
   is equally applicable to the User Agent field in this message.

   +---------------+----------------+-----------+----------------------+
   | Field         | Type           | Byte      | Value                |
   |               |                | Length    |                      |
   +---------------+----------------+-----------+----------------------+
   | Message Type  | Integer        | 1         | MSG_SIGNAL           |
   | Message       | Integer        | 1         | SIG_VXML_START       |
   | Subtype       |                |           |                      |
   | Version       | String         | Variable  | "1.0"                |
   | Session ID    | String         | Variable  |                      |
   | User Agent    | String         | Variable  |                      |
   | Dialog Type   | Integer        | 1         | CMD_LOAD_URL or      |
   |               |                |           | CMD_LOAD_SRC         |
   | Dialog Data   | String         | Variable  |                      |
   | Event Type    | Integer        | 1         | 1-255                |
   | Count         |                |           |                      |
   | Event Types   | Array of       | Variable  | See Table 5          |
   |               | Integers       |           |                      |
   +---------------+----------------+-----------+----------------------+

                      Table 11: VXML Start Message

4.2.3.3.  Close Signal Message

   Used by the user agents to signal that the protocol is being
   terminated.

   +-----------------+---------+-------------+------------+
   | Field           | Type    | Byte Length | Value      |
   +-----------------+---------+-------------+------------+
   | Message Type    | Integer | 1           | MSG_SIGNAL |
   | Message Subtype | Integer | 1           | SIG_CLOSE  |
   +-----------------+---------+-------------+------------+

                 Table 12: Session Close Signal Message

4.2.4.  Command Messages

4.2.4.1.  Add Event Listener Message

   Add event listener messages (CMD_ADD_EVT_LISTENER) are sent by the
   GUA to listen for specific events generated by the VUA.  In addition
   to Message Type and Message Subtype, the message includes the
   following fields:

   Correlation:  The message's sequence number to match the request
      with the corresponding response message.

   Target Node:  The id of the node that is to be listened to.  Setting
      this field to "*" causes the server to forward all events
      generated by all nodes, for the duration of the protocol session.
      Sending the "*" value effectively renders the
      CMD_REMOVE_EVT_LISTENER message a no-op for the duration of the
      protocol session.  See references [9] and [11].

   Event Type:  Set to one of the event types defined in Table 5.  This
      field is ignored if Target Node has a value of "*".

   +-----------------+---------+-------------+----------------------+
   | Field           | Type    | Byte Length | Value                |
   +-----------------+---------+-------------+----------------------+
   | Message Type    | Integer | 1           | MSG_COMMAND          |
   | Message Subtype | Integer | 1           | CMD_ADD_EVT_LISTENER |
   | Correlation     | Integer | 4           |                      |
   | Target Node     | String  | Variable    | Node id or "*"       |
   | Event Type      | Integer | 1           | See Table 5.         |
   +-----------------+---------+-------------+----------------------+

                  Table 13: Add Event Listener Message

   Expected Responses: OK (Table 27), ERROR (Table 31)

4.2.4.2.  Remove Event Listener Message

   The remove event listener message (CMD_REMOVE_EVT_LISTENER) is sent
   by the GUA to indicate it no longer wants to be notified of specific
   events generated by the VUA.  In addition to Message Type and
   Message Subtype, the message includes the following fields:

   Correlation:  The message's sequence number to match the request
      with the corresponding response message.

   Target Node:  The id of the node that the sender is no longer
      interested in listening to.  Setting this field to "*" causes
      subscriptions to all nodes and event types to be removed.  See
      references [9] and [11].

   Event Type:  Set to one of the event types defined in Table 5.  This
      field is ignored if Target Node has a value of "*".
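
   The add and remove listener commands share one field layout, so a
   single encoder and decoder pair can serve both.  The following
   Python sketch is illustrative rather than normative.  The function
   names are hypothetical, and returning None for the Event Type when
   the Target Node is "*" is one possible way for a receiver to model
   an ignored field, not something this document prescribes.

```python
import struct

# Constants copied from Tables 1, 3 and 5 of this document.
MSG_COMMAND = 0x02
CMD_ADD_EVT_LISTENER = 0x01
CMD_REMOVE_EVT_LISTENER = 0x02
EVT_RECO_RESULT = 0x10

LISTENER_SUBTYPES = (CMD_ADD_EVT_LISTENER, CMD_REMOVE_EVT_LISTENER)


def encode_listener_command(subtype, correlation, target_node, event_type):
    """Hypothetical helper: type, subtype, correlation, node, event type."""
    if subtype not in LISTENER_SUBTYPES:
        raise ValueError("not a listener command subtype")
    node = target_node.encode("utf-8")
    # ">BBIH": 1-byte type, 1-byte subtype, 4-byte correlation,
    # 2-byte string length prefix, all big-endian.
    header = struct.pack(">BBIH", MSG_COMMAND, subtype, correlation, len(node))
    return header + node + struct.pack(">B", event_type)


def decode_listener_command(message):
    """Return (subtype, correlation, target_node, event_type)."""
    if len(message) < 9:
        raise ValueError("message too short")
    msg_type, subtype, correlation, node_len = struct.unpack_from(
        ">BBIH", message)
    if msg_type != MSG_COMMAND or subtype not in LISTENER_SUBTYPES:
        raise ValueError("not a listener command")
    if len(message) != 9 + node_len:
        raise ValueError("length does not match the Target Node prefix")
    target_node = message[8:8 + node_len].decode("utf-8")
    event_type = message[8 + node_len]
    if target_node == "*":
        event_type = None  # Event Type is ignored for the wildcard node
    return subtype, correlation, target_node, event_type
```

   The Correlation value chosen by the sender is what a receiver would
   echo in its response so that the two can be matched.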
+-----------------+---------+-------------+-------------------------+
| Field           | Type    | Byte Length | Value                   |
+-----------------+---------+-------------+-------------------------+
| Message Type    | Integer | 1           | MSG_COMMAND             |
| Message Subtype | Integer | 1           | CMD_REMOVE_EVT_LISTENER |
| Correlation     | Integer | 4           |                         |
| Target Node     | String  | Variable    | Node id or "*"          |
| Event Type      | Integer | 1           | See Table 5.            |
+-----------------+---------+-------------+-------------------------+

Table 14: Remove Event Listener Message

Expected Responses: OK (Table 27), ERROR (Table 31)

4.2.4.3. Can Dispatch Message

Sent by the IM to check if the target browser supports a given event type. In addition to Message Type and Message Subtype, the message includes the following fields:

Correlation: The message's sequence number to match the request with the corresponding response message.

Event Type: Set to one of the event types defined in Table 5.

+-----------------+---------+-------------+------------------+
| Field           | Type    | Byte Length | Value            |
+-----------------+---------+-------------+------------------+
| Message Type    | Integer | 1           | MSG_COMMAND      |
| Message Subtype | Integer | 1           | CMD_CAN_DISPATCH |
| Correlation     | Integer | 4           |                  |
| Event Type      | Integer | 1           | See Table 5.     |
+-----------------+---------+-------------+------------------+

Table 15: Can Dispatch Message

Expected Responses: OK Boolean (Table 28), ERROR (Table 31)

4.2.4.4. Dispatch Event Message

The IM fires a given event as if it occurred in the source browser. In addition to Message Type and Message Subtype, the message includes the following fields:

Correlation: The message's sequence number to match the request with the corresponding response message.
Event Type: Set to one of the event types defined in Table 5.

+-----------------+---------+-------------+------------------+
| Field           | Type    | Byte Length | Value            |
+-----------------+---------+-------------+------------------+
| Message Type    | Integer | 1           | MSG_COMMAND      |
| Message Subtype | Integer | 1           | CMD_DISPATCH_EVT |
| Correlation     | Integer | 4           |                  |
| Event Type      | Integer | 1           | See Table 5.     |
+-----------------+---------+-------------+------------------+

Table 16: Dispatch Event Message

Expected Responses: OK (Table 27), ERROR (Table 31)

4.2.4.5. Load URL Message

The GUA sends a load URL message (CMD_LOAD_URL) to the VUA to have it fetch and load a VoiceXML document. In addition to Message Type and Message Subtype, the message includes the following fields:

Correlation: The message's sequence number to match the request with the corresponding response message.

URL: The URL of the VoiceXML document to be loaded. The URL must be a fully qualified URL that conforms to RFC 3986 [3].

+-----------------+---------+-------------+--------------+
| Field           | Type    | Byte Length | Value        |
+-----------------+---------+-------------+--------------+
| Message Type    | Integer | 1           | MSG_COMMAND  |
| Message Subtype | Integer | 1           | CMD_LOAD_URL |
| Correlation     | Integer | 4           |              |
| URL             | String  | Variable    |              |
+-----------------+---------+-------------+--------------+

Table 17: Load URL Message

Expected Responses: OK (Table 27), ERROR (Table 31)

4.2.4.6. Load Source Message

The GUA sends a load source message (CMD_LOAD_SRC) to the VUA to have it load a VoiceXML document from a string. In addition to Message Type and Message Subtype, the message includes the following fields:

This command is useful in a situation in which the GUA has the VoiceXML content to be executed by the VUA, rather than a URL to VoiceXML content in the network.
In practice, this could be a situation in which VoiceXML markup is inlined within another document that was fetched and is being rendered by the GUA.

Correlation: The message's sequence number to match the request with the corresponding response message.

Page Source: A string containing a conforming VoiceXML 2.0 document [5].

Base URI: The base URI of the VoiceXML document, if any. The VoiceXML browser may use this URI to resolve relative URI references within the loaded document. The URI must be fully qualified and conform to RFC 3986 [3].

+-----------------+---------+-------------+--------------+
| Field           | Type    | Byte Length | Value        |
+-----------------+---------+-------------+--------------+
| Message Type    | Integer | 1           | MSG_COMMAND  |
| Message Subtype | Integer | 1           | CMD_LOAD_SRC |
| Correlation     | Integer | 4           |              |
| Page Source     | String  | Variable    |              |
| Base URI        | String  | Variable    |              |
+-----------------+---------+-------------+--------------+

Table 18: Load Source Message

Expected Responses: OK (Table 27), ERROR (Table 31)

4.2.4.7. Get Focus Message

Retrieves the id of the document node that has focus. In addition to Message Type and Message Subtype, the message includes the following fields:

Correlation: The message's sequence number to match the request with the corresponding response message.

+-----------------+---------+-------------+---------------+
| Field           | Type    | Byte Length | Value         |
+-----------------+---------+-------------+---------------+
| Message Type    | Integer | 1           | MSG_COMMAND   |
| Message Subtype | Integer | 1           | CMD_GET_FOCUS |
| Correlation     | Integer | 4           |               |
+-----------------+---------+-------------+---------------+

Table 19: Get Focus Message

Expected Responses: OK String (Field Id, see Table 29), ERROR (Table 31)

4.2.4.8.
Set Focus Message

The GUA sends a set focus message (CMD_SET_FOCUS) to the VUA to explicitly request that the VoiceXML browser visit a particular VoiceXML input item. In addition to Message Type and Message Subtype, the message includes the following fields:

Correlation: The message's sequence number to match the request with the corresponding response message.

Target Node: The name of the input item in the currently active VoiceXML form that is to be visited.

+-----------------+---------+-------------+---------------+
| Field           | Type    | Byte Length | Value         |
+-----------------+---------+-------------+---------------+
| Message Type    | Integer | 1           | MSG_COMMAND   |
| Message Subtype | Integer | 1           | CMD_SET_FOCUS |
| Correlation     | Integer | 4           |               |
| Target Node     | String  | Variable    |               |
+-----------------+---------+-------------+---------------+

Table 20: Set Focus Message

Expected Responses: OK (Table 27), ERROR (Table 31)

4.2.4.9. Get Fields Message

Retrieves the current field values. In addition to Message Type and Message Subtype, the message includes the following fields:

Correlation: The message's sequence number to match the request with the corresponding response message.

Field Count: The number of elements in the subsequent Fields array.

Fields: An array of field names.

+-----------------+------------------+-------------+----------------+
| Field           | Type             | Byte Length | Value          |
+-----------------+------------------+-------------+----------------+
| Message Type    | Integer          | 1           | MSG_COMMAND    |
| Message Subtype | Integer          | 1           | CMD_GET_FIELDS |
| Correlation     | Integer          | 4           |                |
| Field Count     | Integer          | 1           | 1-255          |
| Fields          | Array of Strings | Variable    |                |
+-----------------+------------------+-------------+----------------+

Table 21: Get Fields Message

Expected Responses: Fields (Table 30), ERROR (Table 31)

4.2.4.10.
Set Fields Message

The GUA sends a set fields message (CMD_SET_FIELDS) to fill fields in the currently active VoiceXML form. That is, if the user fills input items displayed visually by the GUA, the VUA can be informed of this by sending it a CMD_SET_FIELDS message. In addition to Message Type and Message Subtype, the message includes the following fields:

Correlation: The message's sequence number to match the request with the corresponding response message.

Field Count: The number of elements in the subsequent Fields array.

Fields: An array of field name/value pairs.

+-----------------+----------------+-------------+----------------+
| Field           | Type           | Byte Length | Value          |
+-----------------+----------------+-------------+----------------+
| Message Type    | Integer        | 1           | MSG_COMMAND    |
| Message Subtype | Integer        | 1           | CMD_SET_FIELDS |
| Correlation     | Integer        | 4           |                |
| Field Count     | Integer        | 1           | 1-255          |
| Fields          | Array of Field | Variable    | See Table 7.   |
+-----------------+----------------+-------------+----------------+

Table 22: Set Fields Message

Expected Responses: OK (Table 27), ERROR (Table 31)

4.2.4.11. Cancel Message

The GUA cancels the execution of a currently running VoiceXML form by sending a cancel message (CMD_CANCEL) to the VUA. In addition to Message Type and Message Subtype, the message includes the following fields:

Correlation: The message's sequence number to match the request with the corresponding response message.
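Because the cancel message consists only of fixed-size fields (a 1-byte Message Type, a 1-byte Message Subtype, and a 4-byte Correlation, per Table 23), it makes a compact illustration of how a command is paired with its OK response (Table 27). The following is a minimal sketch, not a reference encoder: the numeric codes are hypothetical placeholders, and big-endian (network) byte order with an unsigned Correlation is an assumption that is not stated in this section.

```python
import struct

# Hypothetical numeric codes; the actual values are assigned elsewhere in the spec.
MSG_COMMAND = 0x02
MSG_RESPONSE = 0x03
CMD_CANCEL = 0x0B
RESP_OK = 0x01

# Assumed layout: 1-byte type, 1-byte subtype, 4-byte unsigned integer, big-endian.
HEADER = struct.Struct(">BBI")


def pack_cancel(correlation):
    """Build a CMD_CANCEL message carrying the given sequence number."""
    return HEADER.pack(MSG_COMMAND, CMD_CANCEL, correlation)


def is_ok_for(response, correlation):
    """Return True if `response` is a RESP_OK whose Response To matches."""
    if len(response) != HEADER.size:
        return False
    msg_type, subtype, response_to = HEADER.unpack(response)
    return (msg_type, subtype, response_to) == (MSG_RESPONSE, RESP_OK, correlation)
```

A sender would keep the Correlation value it used until a response with the same Response To value arrives.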
+-----------------+---------+-------------+-------------+
| Field           | Type    | Byte Length | Value       |
+-----------------+---------+-------------+-------------+
| Message Type    | Integer | 1           | MSG_COMMAND |
| Message Subtype | Integer | 1           | CMD_CANCEL  |
| Correlation     | Integer | 4           |             |
+-----------------+---------+-------------+-------------+

Table 23: Cancel Message

Expected Responses: OK (Table 27), ERROR (Table 31)

4.2.4.12. Execute Form Message

The GUA runs a form within the currently loaded VoiceXML document by sending an execute form message (CMD_EXEC_FORM). If the VUA is already executing a form upon receipt of this message an implicit cancel is assumed. In addition to Message Type and Message Subtype, the message includes the following fields:

Correlation: The message's sequence number to match the request with the corresponding response message.

Form ID: The id of the VoiceXML form or menu element that is to be executed.

+-----------------+---------+-------------+---------------+
| Field           | Type    | Byte Length | Value         |
+-----------------+---------+-------------+---------------+
| Message Type    | Integer | 1           | MSG_COMMAND   |
| Message Subtype | Integer | 1           | CMD_EXEC_FORM |
| Correlation     | Integer | 4           |               |
| Form ID         | String  | Variable    |               |
+-----------------+---------+-------------+---------------+

Table 24: Execute Form Message

Expected Responses: OK (Table 27), ERROR (Table 31)

4.2.4.13. Set Cookies Message

The GUA synchronizes the HTTP session context by sending the Set Cookies message. In addition to Message Type and Message Subtype, the message includes the following fields:

Correlation: The message's sequence number to match the request with the corresponding response message.

Cookie: A cookie string conforming to RFC 2965 [4].
+-----------------+---------+-------------+-----------------+
| Field           | Type    | Byte Length | Value           |
+-----------------+---------+-------------+-----------------+
| Message Type    | Integer | 1           | MSG_COMMAND     |
| Message Subtype | Integer | 1           | CMD_SET_COOKIES |
| Correlation     | Integer | 4           |                 |
| Cookie          | String  | Variable    |                 |
+-----------------+---------+-------------+-----------------+

Table 25: Set Cookies Message

Expected Responses: OK (Table 27), ERROR (Table 31)

4.2.4.14. Get Cookies Message

The VUA or IM synchronizes the HTTP session context by sending the Get Cookies message. In addition to Message Type and Message Subtype, the message includes the following fields:

Correlation: The message's sequence number to match the request with the corresponding response message.

+-----------------+---------+-------------+-----------------+
| Field           | Type    | Byte Length | Value           |
+-----------------+---------+-------------+-----------------+
| Message Type    | Integer | 1           | MSG_COMMAND     |
| Message Subtype | Integer | 1           | CMD_GET_COOKIES |
| Correlation     | Integer | 4           |                 |
+-----------------+---------+-------------+-----------------+

Table 26: Get Cookies Message

Expected Responses: Cookie String (Table 29), ERROR (Table 31)

4.2.5. Response Messages

Response messages are sent by one user agent to respond to command messages sent by another user agent. Response messages can also be sent when a user agent encounters an error while processing a signal message.

4.2.5.1. OK Response Message

The OK response message (RESP_OK) is a positive confirmation the server sends to the client in response to certain commands. In addition to Message Type and Message Subtype, the message includes the following field:

Response To: The sequence number of the corresponding command that is being confirmed.
+-----------------+---------+-------------+--------------+
| Field           | Type    | Byte Length | Value        |
+-----------------+---------+-------------+--------------+
| Message Type    | Integer | 1           | MSG_RESPONSE |
| Message Subtype | Integer | 1           | RESP_OK      |
| Response To     | Integer | 4           |              |
+-----------------+---------+-------------+--------------+

Table 27: OK Response Message

4.2.5.2. Boolean Response Message

The VUA sends a boolean response value to the GUA using the boolean response message (RESP_BOOL). In addition to Message Type and Message Subtype, the message includes the following fields:

Response To: The sequence number of the corresponding command that is being confirmed.

Value: The Boolean value being returned.

+-----------------+---------+-------------+--------------+
| Field           | Type    | Byte Length | Value        |
+-----------------+---------+-------------+--------------+
| Message Type    | Integer | 1           | MSG_RESPONSE |
| Message Subtype | Integer | 1           | RESP_BOOL    |
| Response To     | Integer | 4           |              |
| Value           | Boolean | 1           | 0x00, 0x01   |
+-----------------+---------+-------------+--------------+

Table 28: Boolean Response Message

4.2.5.3. String Response Message

The VUA sends a string response value to the GUA using the string response message (RESP_STRING). In addition to Message Type and Message Subtype, the message includes the following fields:

Response To: The sequence number of the corresponding command that is being confirmed.

Value: The String value being returned.
+-----------------+---------+-------------+--------------+
| Field           | Type    | Byte Length | Value        |
+-----------------+---------+-------------+--------------+
| Message Type    | Integer | 1           | MSG_RESPONSE |
| Message Subtype | Integer | 1           | RESP_STRING  |
| Response To     | Integer | 4           |              |
| Value           | String  | Variable    |              |
+-----------------+---------+-------------+--------------+

Table 29: String Response Message

4.2.5.4. Fields Response Message

The VUA sends field values to the GUA using the fields response message (RESP_FIELDS). In addition to Message Type and Message Subtype, the message includes the following fields:

Response To: The sequence number of the corresponding command that is being confirmed.

Field Count: Number of fields in the response.

Fields: An array of fields and values.

+-----------------+-----------------+-------------+--------------+
| Field           | Type            | Byte Length | Value        |
+-----------------+-----------------+-------------+--------------+
| Message Type    | Integer         | 1           | MSG_RESPONSE |
| Message Subtype | Integer         | 1           | RESP_FIELDS  |
| Response To     | Integer         | 4           |              |
| Field Count     | Integer         | 1           | 1-255        |
| Fields          | Array of Fields | Variable    | See Table 7. |
+-----------------+-----------------+-------------+--------------+

Table 30: Fields Response Message

4.2.5.5. Error Response Message

The VUA communicates error conditions to the GUA using the error response message (RESP_ERROR). In addition to Message Type and Message Subtype, the message includes the following fields:

Response To: The sequence number of the corresponding command that failed.

Error Code: A code indicating the error condition.

Error Location: A URL conforming to RFC 3986 that marks the location of the error, typically set to the URL of the offending document.
Error Description: A textual description of the error.

+-------------------+---------+-------------+--------------+
| Field             | Type    | Byte Length | Value        |
+-------------------+---------+-------------+--------------+
| Message Type      | Integer | 1           | MSG_RESPONSE |
| Message Subtype   | Integer | 1           | RESP_ERROR   |
| Response To       | Integer | 4           |              |
| Error Code        | Integer | 1           | See Table 6  |
| Error Location    | String  | Variable    |              |
| Error Description | String  | Variable    |              |
+-------------------+---------+-------------+--------------+

Table 31: Error Response Message

4.2.6. Event Messages

Event messages carry asynchronous notifications from one user agent to another. For example, a VoiceXML nomatch or noinput, as well as the filling of a field, are all conveyed to the GUA in event messages. All event messages have a common structure up to the specific event payload. It consists of the following fields:

Message Type: Event Message

Message Subtype: Event Type, see Table 5

Correlation: Sequence number of a related command message, or '0' for a user-initiated event. Prevents an infinite loop in the IM.

Target Node: The id of the DOM Node that is the target of the event.

4.2.6.1. DOMActivate, DOMFocusIn, and DOMFocusOut Event Messages

User agents propagate serialized DOM focus events using these messages. In addition to the common event fields, the payload for the Focus Event includes:

UI Event Detail: For 'DOMActivate', determines the type of activation.
+-------------+---------+-----------+-------------------------------+
| Field       | Type    | Byte      | Value                         |
|             |         | Length    |                               |
+-------------+---------+-----------+-------------------------------+
| Message     | Integer | 1         | MSG_EVENT                     |
| Type        |         |           |                               |
| Message     | Integer | 1         | EVT_DOM_ACTIVATE,             |
| Subtype     |         |           | EVT_DOM_FOCUS_OUT,            |
|             |         |           | EVT_DOM_FOCUS_IN              |
| Correlation | Integer | 4         |                               |
| Target Node | String  | Variable  |                               |
| UI Event    | Integer | 4         | See reference [9]             |
| Detail      |         |           |                               |
+-------------+---------+-----------+-------------------------------+

Table 32: DOMActivate, DOMFocusIn, and DOMFocusOut Event Messages

4.2.6.2. Click, Mouse Down, Mouse Up Event Messages

In IM configurations, user agents may propagate DOM 2 MouseEvents. These messages are serialized versions of the W3C DOM 2 MouseEvents. Starting with 'UI Event Detail', the normative definition of the fields is found in reference [9]. In addition to the common event fields, the payload for the Mouse Events includes:

UI Event Detail: Indicates the number of times a mouse button has been pressed and released over the same screen location during a user action.

clientX, clientY: The horizontal and vertical coordinates at which the event occurred relative to the DOM implementation's client area.

screenX, screenY: The horizontal and vertical coordinates at which the event occurred relative to the origin of the screen coordinate system.

modifiers: Bit map used to indicate the modifier keys shift, alt, and ctrl. See Table 34.

button: During mouse events caused by the depression or release of a mouse button, button is used to indicate which mouse button changed state.
+--------------+---------+-----------+------------------------------+
| Field        | Type    | Byte      | Value                        |
|              |         | Length    |                              |
+--------------+---------+-----------+------------------------------+
| Message Type | Integer | 1         | MSG_EVENT                    |
| Message      | Integer | 1         | EVT_CLICK, EVT_MOUSEDOWN,    |
| Subtype      |         |           | EVT_MOUSEUP                  |
| Correlation  | Integer | 4         |                              |
| Target Node  | String  | Variable  |                              |
| UI Event     | Integer | 4         | See reference [9]            |
| Detail       |         |           |                              |
| screenX      | Integer | 4         | See reference [9]            |
| screenY      | Integer | 2         | See reference [9]            |
| clientX      | Integer | 2         | See reference [9]            |
| clientY      | Integer | 2         | See reference [9]            |
| modifiers    | Integer | 1         | See reference [9]            |
| button       | Integer | 1         | See reference [9]            |
+--------------+---------+-----------+------------------------------+

Table 33: Click, Mouse Down, Mouse Up Event Messages

+-------------+-----------+
| Key         | Bit Field |
+-------------+-----------+
| ctrlKey     | 0x01      |
| shiftKey    | 0x02      |
| altKey      | 0x04      |
| metaKey     | 0x08      |
| altGraphKey | 0x10      |
+-------------+-----------+

Table 34: Key Modifiers

4.2.6.3. keydown, keyup, keypress Event Messages

In IM configurations, user agents may propagate DOM Key events. These messages are serialized versions of the W3C Key Events. Starting with 'UI Event Detail', the normative definition of the fields is found in reference [9]. In addition to the common event fields, the payload for the Key Events includes:

UI Event Detail:

keyIdentifier

keyLocation

modifiers: Bit map used to indicate the modifier keys shift, alt, and ctrl. See Table 34.
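The 1-byte modifiers bit map can be illustrated with the bit values from Table 34. The sketch below is only an illustration: the enum and helper names are hypothetical, while the bit assignments are taken directly from the table.

```python
from enum import IntFlag


class KeyModifier(IntFlag):
    """Bit values from Table 34 (member names are illustrative)."""
    CTRL = 0x01
    SHIFT = 0x02
    ALT = 0x04
    META = 0x08
    ALT_GRAPH = 0x10


def encode_modifiers(*keys):
    """Combine the pressed modifier keys into the 1-byte bit map."""
    value = KeyModifier(0)
    for key in keys:
        value |= key
    return int(value)


def decode_modifiers(byte):
    """Return the set of modifier keys whose bits are set in `byte`."""
    return {m for m in KeyModifier if byte & m}
```

For instance, a key event sent while both ctrl and shift are held would carry the value produced by `encode_modifiers(KeyModifier.CTRL, KeyModifier.SHIFT)` in its modifiers field.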
+---------------+---------+-----------+-----------------------------+
| Field         | Type    | Byte      | Value                       |
|               |         | Length    |                             |
+---------------+---------+-----------+-----------------------------+
| Message Type  | Integer | 1         | MSG_EVENT                   |
| Message       | Integer | 1         | EVT_KEYDOWN, EVT_KEYUP,     |
| Subtype       |         |           | EVT_KEYPRESSED              |
| Correlation   | Integer | 4         |                             |
| Target Node   | String  | Variable  |                             |
| UI Event      | Integer | 4         | See reference [9]           |
| Detail        |         |           |                             |
| keyIdentifier | String  | Variable  | mandatory See reference [9] |
| keyLocation   | Integer | 2         | 0xFFFF means unspecified    |
|               |         |           | See reference [9]           |
| modifiers     | Integer | 1         | See reference [9]           |
+---------------+---------+-----------+-----------------------------+

Table 35: keydown, keyup, keypress Event Messages

4.2.6.4. Error Event Message

The VUA notifies the GUA of a VoiceXML error event by sending an error event message (EVT_ERROR). In addition to the common event fields, the payload for the Error Event includes:

Error Location: A URL conforming to RFC 3986 that marks the location of the error, typically set to the URL of the offending document.

Error Code: A code indicating the error condition.

Error Code String: Used to specify browser-specific error conditions for which the protocol does not predefine a code. This value will be the empty string when the Error Code field is set to a value other than ERR_UNKNOWN_ERROR.

Error Description: A textual description of the error.

+-------------------+---------+-------------+--------------+
| Field             | Type    | Byte Length | Value        |
+-------------------+---------+-------------+--------------+
| Message Type      | Integer | 1           | MSG_EVENT    |
| Message Subtype   | Integer | 1           | EVT_ERROR    |
| Correlation       | Integer | 4           |              |
| Target Node       | String  | Variable    |              |
| Error Location    | String  | Variable    |              |
| Error Code        | Integer | 1           | See Table 6. |
| Error Code String | String  | Variable    |              |
| Error Description | String  | Variable    |              |
+-------------------+---------+-------------+--------------+

Table 36: Error Event Message

4.2.6.5. VXML Done Event Message

The VUA notifies the GUA with a VXML done event message (EVT_VXML_DONE) when it exits the running form. The payload for the VXML Done Event includes only the common event fields.

+-----------------+---------+-------------+--------------+
| Field           | Type    | Byte Length | Value        |
+-----------------+---------+-------------+--------------+
| Message Type    | Integer | 1           | MSG_EVENT    |
| Message Subtype | Integer | 1           | EVT_VXMLDONE |
| Correlation     | Integer | 4           |              |
| Target Node     | String  | Variable    |              |
+-----------------+---------+-------------+--------------+

Table 37: VXML Done Event Message

4.2.6.6. Help, Nomatch, and Noinput Event Message

The VUA notifies the GUA of VoiceXML help, nomatch, and noinput events by sending a corresponding event message (EVT_HELP, EVT_NOMATCH, or EVT_NOINPUT). In addition to the common event fields, the payload for this message includes:

Count: Specifies the occurrence of the event. Different count values allow the GUA to respond to the same event differently, depending on how many times it has occurred.

+---------------+---------+-----------+-----------------------------+
| Field         | Type    | Byte      | Value                       |
|               |         | Length    |                             |
+---------------+---------+-----------+-----------------------------+
| Message Type  | Integer | 1         | MSG_EVENT                   |
| Message       | Integer | 1         | EVT_HELP, EVT_NOMATCH,      |
| Subtype       |         |           | EVT_NOINPUT                 |
| Correlation   | Integer | 4         |                             |
| Target Node   | String  | Variable  |                             |
| Count         | Integer | 1         | 1-255                       |
+---------------+---------+-----------+-----------------------------+

Table 38: Help, Nomatch, and Noinput Event Message

4.2.6.7.
Recognition Result Event Message

The VUA notifies the GUA of recognition results by sending a recognition result event message (EVT_RECO_RESULT). In addition to the common event fields, the payload for the Recognition Result Event includes:

Result Count: The number of recognition results returned.

Results: An array of one or more RESULT structures.

+-----------------+-----------------+-------------+-----------------+
| Field           | Type            | Byte Length | Value           |
+-----------------+-----------------+-------------+-----------------+
| Message Type    | Integer         | 1           | MSG_EVENT       |
| Message Subtype | Integer         | 1           | EVT_RECO_RESULT |
| Correlation     | Integer         | 4           |                 |
| Target Node     | String          | Variable    |                 |
| Result Count    | Integer         | 1           | 1-255           |
| Results         | Array of RESULT | Variable    | See Table 8     |
+-----------------+-----------------+-------------+-----------------+

Table 39: Recognition Result Event Message

In some application use cases, omitting the values of VoiceXML input items that were filled by a particular utterance but have not changed may be viewed as an optimization. In command and control style applications, however, this is problematic and requires the application to perform extra book-keeping. Therefore, the Results array MUST contain values for each VoiceXML input item filled as a consequence of the utterance, even if the value of an input item has not actually changed.

4.2.6.8. Extended Recognition Result Event Message

The VUA notifies the GUA of extended recognition results by sending an extended recognition result event message (EVT_RECO_RESULTEX). In addition to the common event fields, the payload for the Extended Recognition Result Event includes:

Result Count: The number of recognition results returned.

Results: An array of one or more RESULT_EX structures.
+-----------------+--------------+--------------+-------------------+
| Field           | Type         | Byte Length  | Value             |
+-----------------+--------------+--------------+-------------------+
| Message Type    | Integer      | 1            | MSG_EVENT         |
| Message Subtype | Integer      | 1            | EVT_RECO_RESULTEX |
| Correlation     | Integer      | 4            |                   |
| Target Node     | String       | Variable     |                   |
| Result Count    | Integer      | 1            | 1-255             |
| Results         | Array of     | Variable     | See Table 9       |
|                 | RESULT_EX    |              |                   |
+-----------------+--------------+--------------+-------------------+

Table 40: Extended Recognition Result Event Message

In some application use cases, omitting the values of VoiceXML input items that were filled by a particular utterance but have not changed may be viewed as an optimization. In command and control style applications, however, this is problematic and requires the application to perform extra book-keeping. Therefore, the Results array MUST contain values for each VoiceXML input item filled as a consequence of the utterance, even if the value of an input item has not actually changed.

4.2.6.9. Start and Stop Playback Event Message

The VUA notifies the GUA when playback of audio or TTS prompts has started or stopped. The message only contains the common event fields.

+-------------+--------------+--------------+-----------------------+
| Field       | Type         | Byte Length  | Value                 |
+-------------+--------------+--------------+-----------------------+
| Message     | Integer      | 1            | MSG_EVENT             |
| Type        |              |              |                       |
| Message     | Integer      | 1            | EVT_PLAYBACK_START,   |
| Subtype     |              |              | EVT_PLAYBACK_STOP     |
| Correlation | Integer      | 4            |                       |
| Target Node | String       | Variable     |                       |
+-------------+--------------+--------------+-----------------------+

Table 41: Start and Stop Playback Event Message

4.2.6.10. Playback Mark Event Message

The VUA notifies the GUA when playback of TTS encounters a mark in the prompt text.
In addition to the common event fields, the payload for the Playback Mark Event includes:

Mark: The mark label or tag contained in the prompt text.

+-----------------+--------------+--------------+-------------------+
| Field           | Type         | Byte Length  | Value             |
+-----------------+--------------+--------------+-------------------+
| Message Type    | Integer      | 1            | MSG_EVENT         |
| Message Subtype | Integer      | 1            | EVT_PLAYBACK_MARK |
| Correlation     | Integer      | 4            |                   |
| Target Node     | String       | Variable     |                   |
| Mark            | String       | Variable     |                   |
+-----------------+--------------+--------------+-------------------+

Table 42: Playback Mark Event Message

4.2.6.11. Custom Event Message

The VUA and GUA may implement custom events not specified in the protocol. Custom events may hinder interoperation between user agents from different vendors. In addition to the common event fields, the payload for the Custom Event includes:

Name: The custom event name.

Value: The value for the event.

+-----------------+--------------+--------------+-------------------+
| Field           | Type         | Byte Length  | Value             |
+-----------------+--------------+--------------+-------------------+
| Message Type    | Integer      | 1            | MSG_EVENT         |
| Message Subtype | Integer      | 1            | EVT_PLAYBACK_MARK |
| Correlation     | Integer      | 4            |                   |
| Target Node     | String       | Variable     |                   |
| Name            | String       | Variable     |                   |
| Value           | String       | Variable     |                   |
+-----------------+--------------+--------------+-------------------+

Table 43: Custom Event Message

4.3. XML Message Encoding

The DMSP XML Encoding provides a one-for-one equivalence to the binary message set. Thus, it is possible for an IM implementation to use the binary messages with clients on a low bandwidth cellular network and synchronize with user agents on the high speed network using XML.
The XML message set uses batching of messages within an XML document to allow the application to optimize performance. This is provided by , the root of the DMSP XML language. The element can contain one or more , , , or elements. For the complete schema for the XML encoding, see Appendix A.

4.4. State Machines

The sending and receiving of DMSP messages is governed by state machines executing on the user agents. Transitions between states are driven by the receipt of messages from the corresponding remote side, or when the local user agent invokes state machine primitives that cause messages to be sent.

In the state table specifications below, the generalized term "upper layer" refers to the entity invoking the state machine's primitives. That is, when specifying the GUA state machine, "upper layer" refers to entities such as XHTML browsers, or custom applications (e.g. Java MIDlets) that are coordinating with the remote VUA. When specifying the VUA state machine, "upper layer" would refer to the VoiceXML browser that is executing on behalf of the remote client.

The following notational conventions are used:

Sm(p): Denotes the invoking of a state machine primitive p.

Tx(m): Denotes the sending of a DMSP message m.

Rx(m): Denotes the receiving of a DMSP message m.

N(e): Denotes that the state machine is notifying the upper layer of an event e.

As described in Section 4.2, some messages have a sequence number (Correlation field) to match requests with corresponding responses. Sequence numbers are used to synchronize state machines. For instance, when the GUA state machine sends the CMD_EXEC_FORM message and transitions to the DLG_SENT state, it stores the sequence number of the CMD_EXEC_FORM message. Responses to the CMD_EXEC_FORM (i.e. either the RESP_ERROR or RESP_OK) are processed only if their sequence number matches what the GUA state machine is expecting.
If the sequence numbers match, the response is processed and the GUA state machine transitions to the next state. Responses whose sequence number does not match what the GUA is expecting are ignored and no state transition takes place.

Sequence numbers are used in a similar way at the VUA to control processing. For instance, when the VUA state machine receives a CMD_EXEC_FORM message, it stores the sequence number present in the message and notifies the upper layer with a RUN_DLG notification, while transitioning to the DLG_RCVD state. Subsequent primitives (Sm(DLG_ACTIVE) or Sm(DLG_ERROR)) that are received from the upper layer in response to the RUN_DLG notification are processed only if the sequence number in the primitive matches what the state machine is expecting. The LOAD_DOC notification and its responses from the upper layer (Sm(DOC_LOADED) or Sm(DOC_ERROR)) are handled similarly.

In configurations using an Interaction Manager, it is assumed that the user agents operate the same state machines as in the distributed client configuration. The IM operates as a proxy in most cases. In some implementations the IM may synthesize event messages. Event messages carry a correlation number that indicates whether the event was initiated by a user agent or synthesized. The correlation is used to prevent an infinite loop in the IM when synthesizing messages.

In the following sections, the state machines are specified using tables (one per state). In each table, only the relevant events are included. Events that occur in a state that are not specified in these tables are to be ignored and no state transition takes place. This includes any event that occurs without a condition matching what is specified in the table.

The state machine specification is articulated in terms of the client(GUA)/server(VUA) configuration for simplicity.
The state machines are unaffected by the introduction of an Interaction Manager.

4.4.1.  GUI User Agent State Machine

The GUA state machine consists of the following states:

CONN_CLOSED:  Protocol is not established with the VUA. This is an initial state.

INIT_SENT:  GUA has initiated the protocol with the VUA, but it is not yet established.

CONN_OPEN:  The protocol is established.

DOC_SENT:  The GUA has requested the VUA load a document.

DOC_LOADED:  The VUA has acknowledged the requested document is loaded.

DLG_SENT:  The GUA has requested the VUA begin executing a dialog.

DLG_ACTIVE:  The VUA has acknowledged that the requested dialog is now running. This is optionally an initial state.

VXML_START_SENT:  The GUA has initiated the protocol with the VUA and specified the initial dialog to be executed.

It is possible to initiate the GUA state machine in the DLG_ACTIVE state. In this case, it is assumed that the underlying transport protocol has been used to specify a consistent initial DLG_ACTIVE state for both the GUA and VUA state machines. For example, if an implementation utilized SIP as the transport mechanism, it would be possible to have the initial INVITE sent from the GUA to the VUA include enough information to enable the VUA to fetch, load, and execute a particular VoiceXML form without exchanging any DMSP messages.

4.4.1.1.  CONN_CLOSED State

This is the initial state of the GUA state machine. In this state, the underlying transport mechanism that carries the DMSP messages is assumed to be established. The following events can be handled in this state:

Sm(OPEN_CONN):  This is a primitive received from the upper layer. This event is handled by sending a SIG_INIT message to the VUA to establish a connection. The state machine transitions to the INIT_SENT state after handling the primitive.
This primitive is used if the GUA desires fine-grained control of which remote browser events it wishes to listen to.

Sm(VXML_START):  This is a primitive received from the upper layer. This event is handled by sending a SIG_VXML_START message to the VUA. The state machine transitions to the VXML_START_SENT state after handling the primitive. This primitive is used if the GUA wishes to start a VoiceXML dialog on the remote VUA and listen to all the events it generates.

+----------------+-----------------+--------------------+
| Event          | Next State      | Action             |
+----------------+-----------------+--------------------+
| Sm(OPEN_CONN)  | INIT_SENT       | Tx(SIG_INIT)       |
| Sm(VXML_START) | VXML_START_SENT | Tx(SIG_VXML_START) |
+----------------+-----------------+--------------------+

Table 44: CONN_CLOSED State

4.4.1.2.  INIT_SENT State

In this state, the state machine is awaiting a response to the message sent in the CONN_CLOSED state. The following events are handled:

Rx(SIG_INIT):  The state machine receives a SIG_INIT message indicating that the VUA has accepted the connection request. The event is handled by notifying the upper layer with a CONN_OPEN notification and the state machine transitions to the CONN_OPEN state.

Rx(SIG_CLOSE):  The state machine receives a SIG_CLOSE message indicating that the VUA has declined the connection request. The event is handled by notifying the upper layer with a CONN_FAIL notification and the state machine transitions back to the CONN_CLOSED state.

Sm(CLOSE_CONN):  This is a primitive received from the upper layer, used to initiate a connection close. The event is handled by sending a SIG_CLOSE message to the VUA and the state machine transitions back to the CONN_CLOSED state.
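
The CONN_CLOSED and INIT_SENT rules above can be sketched as a table-driven state machine. The following Python fragment is a minimal illustration and not part of the protocol: the GuaStateMachine class, its handle method, and the string encoding of events and actions are assumptions made for the example. Only the state, primitive, message, and notification names are taken from this section.

```python
# Illustrative sketch only; class, method, and string encodings are assumptions.
# (state, event) maps to (next_state, action), transcribed from the
# CONN_CLOSED and INIT_SENT rules of the GUA state machine.
TRANSITIONS = {
    ("CONN_CLOSED", "Sm(OPEN_CONN)"): ("INIT_SENT", "Tx(SIG_INIT)"),
    ("CONN_CLOSED", "Sm(VXML_START)"): ("VXML_START_SENT", "Tx(SIG_VXML_START)"),
    ("INIT_SENT", "Rx(SIG_INIT)"): ("CONN_OPEN", "N(CONN_OPEN)"),
    ("INIT_SENT", "Rx(SIG_CLOSE)"): ("CONN_CLOSED", "N(CONN_FAIL)"),
    ("INIT_SENT", "Sm(CLOSE_CONN)"): ("CONN_CLOSED", "Tx(SIG_CLOSE)"),
}


class GuaStateMachine:
    """Hypothetical holder for the current GUA state."""

    def __init__(self):
        self.state = "CONN_CLOSED"  # initial state
        self.actions = []           # actions performed, in order

    def handle(self, event):
        """Apply one event; events not listed for the state are ignored."""
        entry = TRANSITIONS.get((self.state, event))
        if entry is None:
            return None  # no action and no state transition
        self.state, action = entry
        self.actions.append(action)
        return action


gua = GuaStateMachine()
gua.handle("Rx(SIG_INIT)")    # ignored: not specified for CONN_CLOSED
gua.handle("Sm(OPEN_CONN)")   # sends SIG_INIT, moves to INIT_SENT
gua.handle("Rx(SIG_INIT)")    # VUA accepted, moves to CONN_OPEN
print(gua.state, gua.actions)
```

Keeping the rules in a lookup table mirrors the per-state tables of this document, and the default branch implements the rule that unspecified events are ignored without a state transition.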

+----------------+-------------+---------------+
| Event          | Next State  | Action        |
+----------------+-------------+---------------+
| Rx(SIG_INIT)   | CONN_OPEN   | N(CONN_OPEN)  |
| Rx(SIG_CLOSE)  | CONN_CLOSED | N(CONN_FAIL)  |
| Sm(CLOSE_CONN) | CONN_CLOSED | Tx(SIG_CLOSE) |
+----------------+-------------+---------------+

Table 45: INIT_SENT State

4.4.1.3.  CONN_OPEN State

The protocol has been successfully established with the VUA in this state. The following events are handled:

Sm(CLOSE_CONN):  This is a primitive received from the upper layer, instructing the state machine to close the connection. The event is handled by sending a SIG_CLOSE message to the VUA and the state machine transitions to the CONN_CLOSED state.

Rx(SIG_CLOSE):  The state machine receives a SIG_CLOSE message from the VUA. This instructs it to close the connection. The event is handled by notifying the upper layer with a CONN_CLOSED notification and the state machine transitions to the CONN_CLOSED state.

Sm(LOAD_DOC):  This is a primitive received from the upper layer, instructing the state machine to load a VoiceXML document. The event is handled by sending a CMD_LOAD_URL or CMD_LOAD_SRC message to the VUA and the state machine transitions to the DOC_SENT state.

Sm(ADD_LISTENER):  This is a primitive received from the upper layer, instructing the state machine to listen for an event or set of events. The event is handled by sending a CMD_ADD_EVT_LISTENER message to the VUA. There is no state transition.

Sm(REMOVE_LISTENER):  This is a primitive received from the upper layer, instructing the state machine to no longer listen to an event or set of events. The event is handled by sending a CMD_REMOVE_EVT_LISTENER message to the VUA. There is no state transition.

Rx(RESP_OK):  A RESP_OK message is received for each successful Sm(ADD_LISTENER) or Sm(REMOVE_LISTENER) primitive invoked by the upper layer in this state.
There is no state transition.

Rx(RESP_ERROR):  A RESP_ERROR indicates the failure of a Sm(ADD_LISTENER) or Sm(REMOVE_LISTENER). The state machine informs the upper layer of the failure and transitions to the CONN_CLOSED state.

+---------------------+-------------+----------------------------+
| Event               | Next State  | Action                     |
+---------------------+-------------+----------------------------+
| Sm(CLOSE_CONN)      | CONN_CLOSED | Tx(SIG_CLOSE)              |
| Rx(SIG_CLOSE)       | CONN_CLOSED | N(CONN_CLOSED)             |
| Sm(LOAD_DOC)        | DOC_SENT    | Tx(CMD_LOAD_SRC)           |
|                     |             | || Tx(CMD_LOAD_URL)        |
| Sm(ADD_LISTENER)    | CONN_OPEN   | Tx(CMD_ADD_EVT_LISTENER)   |
| Sm(REMOVE_LISTENER) | CONN_OPEN   | Tx(CMD_REMOVE_EVT_LISTENER)|
| Rx(RESP_OK)         | CONN_OPEN   |                            |
+---------------------+-------------+----------------------------+

Table 46: CONN_OPEN State

4.4.1.4.  DOC_SENT State

A CMD_LOAD_SRC or CMD_LOAD_URL message has been sent to the VUA and the GUA is awaiting a response from the VUA. The following events are handled:

Sm(CLOSE_CONN):  This is a primitive received from the upper layer, instructing the state machine to close the connection. The event is handled by sending a SIG_CLOSE message to the VUA and the state machine transitions to the CONN_CLOSED state.

Rx(SIG_CLOSE):  The state machine receives a SIG_CLOSE message from the VUA. This instructs it to close the connection. The event is handled by notifying the upper layer with a CONN_CLOSED notification and the state machine transitions to the CONN_CLOSED state.

Rx(RESP_ERROR):  The state machine receives a RESP_ERROR message from the VUA, indicating that an error occurred while loading a document.

Condition: The document instance sequence number in the RESP_ERROR message matches the state machine's document instance sequence number. This indicates that an error occurred while loading the document for which the GUA is awaiting a response.
The event is handled by notifying the upper layer with a DOC_ERROR notification and the state machine transitions to the CONN_OPEN state.

Rx(RESP_OK):  The state machine receives a RESP_OK message from the VUA, indicating that a document was loaded successfully.

Condition: The document instance sequence number in the RESP_OK message matches the state machine's document instance sequence number. This indicates that the RESP_OK message corresponds to the CMD_LOAD_SRC or CMD_LOAD_URL message sent earlier, for which the GUA is awaiting a response.

The event is handled by notifying the upper layer with a DOC_LOADED notification and the state machine transitions to the DOC_LOADED state.

Sm(LOAD_DOC):  This is a primitive received from the upper layer, instructing the state machine to load a VoiceXML document while the state machine is still awaiting a response to a previously sent CMD_LOAD_SRC or CMD_LOAD_URL message. The event is handled by sending a CMD_LOAD_SRC or CMD_LOAD_URL message to the VUA and the state machine remains in the DOC_SENT state.

+----------------+-----------------+-------------+------------------+
| Event          | Condition       | Next State  | Action           |
+----------------+-----------------+-------------+------------------+
| Sm(CLOSE_CONN) |                 | CONN_CLOSED | Tx(SIG_CLOSE)    |
| Rx(SIG_CLOSE)  |                 | CONN_CLOSED | N(CONN_CLOSED)   |
| Rx(RESP_ERROR) | Document        | CONN_OPEN   | N(DOC_ERROR)     |
|                | Sequence Match  |             |                  |
| Rx(RESP_OK)    | Document        | DOC_LOADED  | N(DOC_LOADED)    |
|                | Sequence Match  |             |                  |
| Sm(LOAD_DOC)   |                 | DOC_SENT    | Tx(CMD_LOAD_SRC) |
|                |                 |             | ||               |
|                |                 |             | Tx(CMD_LOAD_URL) |
+----------------+-----------------+-------------+------------------+

Table 47: DOC_SENT State

4.4.1.5.  DOC_LOADED State

The VUA has successfully loaded a VoiceXML document.
The following events are handled:

Sm(CLOSE_CONN):  This is a primitive received from the upper layer, instructing the state machine to close the connection. The event is handled by sending a SIG_CLOSE message to the VUA and the state machine transitions to the CONN_CLOSED state.

Rx(SIG_CLOSE):  The state machine receives a SIG_CLOSE message from the VUA. This instructs it to close the connection. The event is handled by notifying the upper layer with a CONN_CLOSED notification and the state machine transitions to the CONN_CLOSED state.

Sm(LOAD_DOC):  This is a primitive received from the upper layer, instructing the state machine to load a VoiceXML document. The event is handled by sending a CMD_LOAD_SRC or CMD_LOAD_URL message to the VUA and the state machine transitions to the DOC_SENT state.

Sm(RUN_DLG):  This is a primitive received from the upper layer, instructing the state machine to run a VoiceXML dialogue. The event is handled by sending a CMD_EXEC_FORM message to the VUA and the state machine transitions to the DLG_SENT state.

Sm(ADD_LISTENER):  This is a primitive received from the upper layer, instructing the state machine to listen for an event or set of events. The event is handled by sending a CMD_ADD_EVT_LISTENER message to the VUA. There is no state transition.

Sm(REMOVE_LISTENER):  This is a primitive received from the upper layer, instructing the state machine to no longer listen to an event or set of events. The event is handled by sending a CMD_REMOVE_EVT_LISTENER message to the VUA. There is no state transition.

Rx(RESP_OK):  A RESP_OK message is received for each successful Sm(ADD_LISTENER) or Sm(REMOVE_LISTENER) primitive invoked by the upper layer in this state. There is no state transition.

+---------------------+-------------+----------------------------+
| Event               | Next State  | Action                     |
+---------------------+-------------+----------------------------+
| Sm(CLOSE_CONN)      | CONN_CLOSED | Tx(SIG_CLOSE)              |
| Rx(SIG_CLOSE)       | CONN_CLOSED | N(CONN_CLOSED)             |
| Sm(RUN_DLG)         | DLG_SENT    | Tx(CMD_EXEC_FORM)          |
| Sm(LOAD_DOC)        | DOC_SENT    | Tx(CMD_LOAD_SRC)           |
|                     |             | || Tx(CMD_LOAD_URL)        |
| Sm(ADD_LISTENER)    | DOC_LOADED  | Tx(CMD_ADD_EVT_LISTENER)   |
| Sm(REMOVE_LISTENER) | DOC_LOADED  | Tx(CMD_REMOVE_EVT_LISTENER)|
| Rx(RESP_OK)         | DOC_LOADED  |                            |
+---------------------+-------------+----------------------------+

Table 48: DOC_LOADED State

4.4.1.6.  DLG_SENT State

The GUA has sent a request to the VUA to activate a dialogue and is awaiting a response. The following events are handled:

Sm(CLOSE_CONN):  This is a primitive received from the upper layer, instructing the state machine to close the connection. The event is handled by sending a SIG_CLOSE message to the VUA and the state machine transitions to the CONN_CLOSED state.

Rx(SIG_CLOSE):  The state machine receives a SIG_CLOSE message from the VUA. This instructs it to close the connection. The event is handled by notifying the upper layer with a CONN_CLOSED notification and the state machine transitions to the CONN_CLOSED state.

Sm(LOAD_DOC):  This is a primitive received from the upper layer, instructing the state machine to load a VoiceXML document. The event is handled by sending a CMD_LOAD_SRC or CMD_LOAD_URL message to the VUA and the state machine transitions to the DOC_SENT state.

Sm(RUN_DLG):  This is a primitive received from the upper layer, instructing the state machine to run another VoiceXML dialogue. The event is handled by sending a CMD_EXEC_FORM message to the VUA. The state machine remains in the DLG_SENT state.
Since only one dialogue can be active at a time, the state machine sends a CMD_ABORT message to the VUA, instructing it to cancel the previous dialogue activation.

Sm(CANCEL_DLG):  This is a primitive received from the upper layer, instructing the state machine to cancel the current dialogue. The event is handled by sending a CMD_ABORT message to the VUA and the state machine transitions to the DOC_LOADED state.

Rx(RESP_ERROR):  The state machine receives a RESP_ERROR message from the VUA indicating that an error occurred in activating a voice dialogue.

Condition: The dialogue instance sequence number of the RESP_ERROR message matches the dialogue instance sequence number of the state machine. This ensures that the error relates to the dialogue activation for which the GUA is awaiting a response.

The event is handled by notifying the upper layer with a DLG_ERROR notification and the state machine transitions to the DOC_LOADED state.

Rx(RESP_OK):  The state machine receives a RESP_OK message from the VUA indicating that a voice dialogue has been activated successfully.

Condition: The dialogue instance sequence number of the RESP_OK message matches the dialogue instance sequence number of the state machine. This ensures the RESP_OK message corresponds to a previously sent CMD_EXEC_FORM message, for which the GUA is awaiting a response.

The event is handled by notifying the upper layer with a DLG_ACTIVE notification and the state machine transitions to the DLG_ACTIVE state.
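
The sequence-match condition on Rx(RESP_OK) and Rx(RESP_ERROR) in the DLG_SENT state can be sketched as follows. This Python fragment is illustrative only: the DlgSentGua class, its run_dlg and on_response methods, and the use of an incrementing counter to allocate sequence numbers are assumptions made for the example, since this section does not prescribe how sequence numbers are allocated. The CMD_ABORT that accompanies a repeated Sm(RUN_DLG) is left out for brevity.

```python
# Illustrative sketch only; names and counter-based numbering are assumptions.
import itertools


class DlgSentGua:
    """Hypothetical GUA fragment covering DOC_LOADED, DLG_SENT, DLG_ACTIVE."""

    def __init__(self):
        self.state = "DOC_LOADED"
        self.expected_seq = None        # sequence number of the pending CMD_EXEC_FORM
        self._seq = itertools.count(1)  # assumed allocation scheme
        self.notifications = []         # notifications raised to the upper layer

    def run_dlg(self):
        """Sm(RUN_DLG): send CMD_EXEC_FORM and remember its sequence number."""
        if self.state not in ("DOC_LOADED", "DLG_SENT"):
            return None  # this fragment does not model other states
        self.expected_seq = next(self._seq)
        self.state = "DLG_SENT"
        return ("CMD_EXEC_FORM", self.expected_seq)

    def on_response(self, kind, seq):
        """Rx(RESP_OK) or Rx(RESP_ERROR): processed only on a sequence match."""
        if self.state != "DLG_SENT" or seq != self.expected_seq:
            return  # ignored, no state transition
        if kind == "RESP_OK":
            self.state = "DLG_ACTIVE"
            self.notifications.append("DLG_ACTIVE")
        elif kind == "RESP_ERROR":
            self.state = "DOC_LOADED"
            self.notifications.append("DLG_ERROR")


gua = DlgSentGua()
_, first = gua.run_dlg()             # first activation request
_, second = gua.run_dlg()            # upper layer asks for another dialogue
gua.on_response("RESP_OK", first)    # stale response: ignored
stale_state = gua.state
gua.on_response("RESP_OK", second)   # matches the pending request
print(stale_state, gua.state, gua.notifications)
```

Storing only the most recent sequence number means a late response to a superseded CMD_EXEC_FORM cannot move the state machine into DLG_ACTIVE for the wrong dialogue.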

+----------------+----------------+-------------+-------------------+
| Event          | Condition      | Next State  | Action            |
+----------------+----------------+-------------+-------------------+
| Sm(CLOSE_CONN) |                | CONN_CLOSED | Tx(SIG_CLOSE)     |
| Rx(SIG_CLOSE)  |                | CONN_CLOSED | N(CONN_CLOSED)    |
| Sm(LOAD_DOC)   |                | DOC_SENT    | Tx(CMD_LOAD_SRC)  |
|                |                |             | ||                |
|                |                |             | Tx(CMD_LOAD_URL)  |
| Sm(RUN_DLG)    |                | DLG_SENT    | Tx(CMD_EXEC_FORM) |
| Sm(CANCEL_DLG) |                | DOC_LOADED  | Tx(CMD_ABORT)     |
| Rx(RESP_ERROR) | Dialog         | DOC_LOADED  | N(DLG_ERROR)      |
|                | Sequence Match |             |                   |
| Rx(RESP_OK)    | Dialog         | DLG_ACTIVE  | N(DLG_ACTIVE)     |
|                | Sequence Match |             |                   |
+----------------+----------------+-------------+-------------------+

Table 49: DLG_SENT State

4.4.1.7.  DLG_ACTIVE State

The VUA has successfully activated a voice dialogue and the GUA can remotely interact with it. As discussed above, this state can optionally serve as an initial state for implementations in which a VUA can be initialized in a running state by the underlying transport mechanism. The following events are handled:

Sm(CLOSE_CONN):  This is a primitive received from the upper layer, instructing the state machine to close the connection. The event is handled by sending a SIG_CLOSE message to the VUA and the state machine transitions to the CONN_CLOSED state.

Rx(SIG_CLOSE):  The state machine receives a SIG_CLOSE message from the VUA. This instructs it to close the connection. The event is handled by notifying the upper layer with a CONN_CLOSED notification and the state machine transitions to the CONN_CLOSED state.

Sm(LOAD_DOC):  This is a primitive received from the upper layer, instructing the state machine to load a VoiceXML document. The event is handled by sending a CMD_LOAD_SRC or CMD_LOAD_URL message to the VUA and the state machine transitions to the DOC_SENT state.

Sm(RUN_DLG):  This is a primitive received from the upper layer, instructing the state machine to run another VoiceXML dialogue. The event is handled by sending a CMD_EXEC_FORM message to the VUA and the state machine transitions to the DLG_SENT state. Since only one dialogue can be active at a time, the VUA state machine will implicitly abort any currently running dialogue.

Sm(CANCEL_DLG):  This is a primitive received from the upper layer, instructing the state machine to cancel the current dialogue. The event is handled by sending a CMD_ABORT message to the VUA and the state machine transitions to the DOC_LOADED state.

Rx(EVT_RECO_RESULT):  The state machine receives an EVT_RECO_RESULT message from the VUA. This message carries the results of a successful voice recognition. The event is handled by notifying the upper layer with a DLG_RESULTS notification. The state machine remains in the same state.

Rx(EVT_ERROR):  The state machine receives an EVT_ERROR message from the VUA. This message indicates the VUA has encountered an error while processing the dialogue. The event is handled by notifying the upper layer with a DLG_ERROR notification. The state machine remains in the same state.

Rx(EVT_HELP):  The state machine receives an EVT_HELP message from the VUA. This message indicates the dialogue has thrown a help event. The event is handled by notifying the upper layer with a DLG_HELP notification. The state machine remains in the same state.

Rx(EVT_NOMATCH):  The state machine receives an EVT_NOMATCH message from the VUA. This message indicates the dialogue has thrown a nomatch event. The event is handled by notifying the upper layer with a DLG_NOMATCH notification. The state machine remains in the same state.

Rx(EVT_NOINPUT):  The state machine receives an EVT_NOINPUT message from the VUA. This message indicates the dialogue has thrown a noinput event.
The event is handled by notifying the upper layer with a DLG_NOINPUT notification. The state machine remains in the same state.

Rx(EVT_VXMLDONE):  The state machine receives an EVT_VXMLDONE message from the VUA. This message indicates the VoiceXML browser has completed filling the current form. The event is handled by notifying the upper layer with a DLG_VXMLDONE notification. The state machine remains in the same state.

Sm(UPDATE_DLG):  This is a primitive received from the upper layer, instructing the state machine to update the values of a VoiceXML field. The event is handled by sending a CMD_SET_FIELDS message to the VUA. The state machine remains in the same state.

Sm(FOCUS_DLG):  This is a primitive received from the upper layer, instructing the state machine to set the focus on a VoiceXML field. The event is handled by sending a CMD_SET_FOCUS message to the VUA. The state machine remains in the same state.

+---------------------+-------------+---------------------+
| Event               | Next State  | Action              |
+---------------------+-------------+---------------------+
| Sm(CLOSE_CONN)      | CONN_CLOSED | Tx(SIG_CLOSE)       |
| Rx(SIG_CLOSE)       | CONN_CLOSED | N(CONN_CLOSED)      |
| Sm(LOAD_DOC)        | DOC_SENT    | Tx(CMD_LOAD_SRC)    |
|                     |             | || Tx(CMD_LOAD_URL) |
| Sm(RUN_DLG)         | DLG_SENT    | Tx(CMD_EXEC_FORM)   |
| Sm(CANCEL_DLG)      | DOC_LOADED  | Tx(CMD_ABORT)       |
| Rx(EVT_RECO_RESULT) | DLG_ACTIVE  | N(DLG_RESULTS)      |
| Rx(EVT_ERROR)       | DLG_ACTIVE  | N(DLG_ERROR)        |
| Rx(EVT_HELP)        | DLG_ACTIVE  | N(DLG_HELP)         |
| Rx(EVT_NOMATCH)     | DLG_ACTIVE  | N(DLG_NOMATCH)      |
| Rx(EVT_NOINPUT)     | DLG_ACTIVE  | N(DLG_NOINPUT)      |
| Rx(EVT_VXMLDONE)    | DLG_ACTIVE  | N(DLG_VXMLDONE)     |
| Sm(UPDATE_DLG)      | DLG_ACTIVE  | Tx(CMD_SET_FIELDS)  |
| Sm(FOCUS_DLG)       | DLG_ACTIVE  | Tx(CMD_SET_FOCUS)   |
+---------------------+-------------+---------------------+

Table 50: DLG_ACTIVE State

4.4.1.8.  VXML_START_SENT State

In this state the GUA is waiting for a response from the VUA indicating the session has been established and the requested VoiceXML dialogue is executing. The following events are handled:

Rx(SIG_VXML_START):  The state machine receives a SIG_VXML_START message indicating that the VUA has accepted the connection request and has started executing the requested VoiceXML dialogue. The event is handled by notifying the upper layer with a VXML_STARTED notification and the state machine transitions to the DLG_ACTIVE state.

Rx(SIG_CLOSE):  The state machine receives a SIG_CLOSE message indicating that the VUA has declined the connection request. The event is handled by notifying the upper layer with a CONN_FAIL notification and the state machine transitions back to the CONN_CLOSED state.

Sm(CLOSE_CONN):  This is a primitive received from the upper layer, used to initiate a connection close. The event is handled by sending a SIG_CLOSE message to the VUA and the state machine transitions back to the CONN_CLOSED state.

+--------------------+-------------+-----------------+
| Event              | Next State  | Action          |
+--------------------+-------------+-----------------+
| Rx(SIG_VXML_START) | DLG_ACTIVE  | N(VXML_STARTED) |
| Rx(SIG_CLOSE)      | CONN_CLOSED | N(CONN_FAIL)    |
| Sm(CLOSE_CONN)     | CONN_CLOSED | Tx(SIG_CLOSE)   |
+--------------------+-------------+-----------------+

Table 51: VXML_START_SENT State

4.4.2.  VoiceXML User Agent State Machine

The VUA state machine consists of the following states:

CONN_CLOSED:  Protocol is not established with the GUA. This is an initial state.

CONN_RCVD:  The VUA has received a SIG_INIT from the GUA.

CONN_OPEN:  The VUA has acknowledged that the session is established.

DOC_RCVD:  The VUA has received a request from the GUA to load a document.

DOC_LOADED:  The VUA has acknowledged the requested document is loaded.

DLG_RCVD:  The VUA has received a request from the GUA to execute a form.

DLG_ACTIVE:  The VUA has acknowledged that the requested dialog is now running. This is optionally an initial state.

VXML_START_RCVD:  The VUA has received a SIG_VXML_START from the GUA to establish a session and invoke the initial VXML dialog.

As discussed in Section 4.4.1, both the VUA and GUA state machines can be initiated in the DLG_ACTIVE state if the underlying transport mechanism has been used to coordinate. In this case, implementations must ensure that both state machines are initiated in their respective DLG_ACTIVE states. The only other possibility allowed is that both state machines are initiated in their respective CONN_CLOSED states.

4.4.2.1.  CONN_CLOSED State

This is an initial state of the VUA server state machine. In this state, the underlying transport mechanism that carries the DMSP message is assumed to be established. The following events can be handled in this state:

Rx(SIG_INIT):  A SIG_INIT message is received from the GUA, indicating its intent to establish the protocol. The event is handled by notifying the upper layer with a CONN_OPEN notification and the state machine transitions to the CONN_RCVD state.

Rx(SIG_VXML_START):  A SIG_VXML_START message is received from the GUA, indicating its intent to establish the protocol and initiate the load and execution of a VoiceXML document. The event is handled by notifying the upper layer with a VXML_START notification and the state machine transitions to the VXML_START_RCVD state.

+--------------------+-----------------+---------------+
| Event              | Next State      | Action        |
+--------------------+-----------------+---------------+
| Rx(SIG_INIT)       | CONN_RCVD       | N(CONN_OPEN)  |
| Rx(SIG_VXML_START) | VXML_START_RCVD | N(VXML_START) |
+--------------------+-----------------+---------------+

Table 52: CONN_CLOSED State

4.4.2.2.  CONN_RCVD State

The state machine has received a SIG_INIT request from the GUA. It is possible to either initiate the protocol with the GUA or to reject the request. The upper layer will determine this and call the appropriate state machine primitive. The following events can be handled in this state:

Sm(ACCEPT_CONN):  This is a primitive received from the upper layer, indicating that the connection has been accepted. The event is handled by sending a SIG_INIT message to the GUA and the state machine transitions to the CONN_OPEN state.

Sm(REJECT_CONN):  This is a primitive received from the upper layer, indicating that the connection has been rejected. The event is handled by sending a SIG_CLOSE message to the GUA and the state machine transitions to the CONN_CLOSED state.

Rx(SIG_CLOSE):  The state machine receives a SIG_CLOSE message from the GUA. This instructs it to close the connection. The event is handled by notifying the upper layer with a CONN_CLOSED notification and the state machine transitions to the CONN_CLOSED state.

+-----------------+-------------+----------------+
| Event           | Next State  | Action         |
+-----------------+-------------+----------------+
| Sm(ACCEPT_CONN) | CONN_OPEN   | Tx(SIG_INIT)   |
| Sm(REJECT_CONN) | CONN_CLOSED | Tx(SIG_CLOSE)  |
| Rx(SIG_CLOSE)   | CONN_CLOSED | N(CONN_CLOSED) |
+-----------------+-------------+----------------+

Table 53: CONN_RCVD State

4.4.2.3.  CONN_OPEN State

The protocol has been successfully initiated with the GUA.
The following events can be handled in this state:

Rx(SIG_CLOSE):  The state machine receives a SIG_CLOSE message from the GUA. This instructs it to close the connection. The event is handled by notifying the upper layer with a CONN_CLOSED notification and the state machine transitions to the CONN_CLOSED state.

Sm(CLOSE_CONN):  This is a primitive received from the upper layer, instructing the state machine to close the connection. The event is handled by sending a SIG_CLOSE message to the GUA and the state machine transitions to the CONN_CLOSED state.

Rx(CMD_LOAD_URL):  This is a message received from the GUA instructing the state machine to load a VoiceXML document from the given URL. The event is handled by notifying the upper layer with a LOAD_DOC notification. The state machine transitions to the DOC_RCVD state.

Rx(CMD_LOAD_SRC):  This is a message received from the GUA instructing the state machine to load a VoiceXML document from the given string. The event is handled by notifying the upper layer with a LOAD_DOC notification. The state machine transitions to the DOC_RCVD state.

Rx(CMD_ADD_EVT_LISTENER):  The VUA has received a CMD_ADD_EVT_LISTENER from the GUA. The event is handled by notifying the upper layer with an ADD_LISTENER notification. There is no state transition.

Rx(CMD_REMOVE_EVT_LISTENER):  The VUA has received a CMD_REMOVE_EVT_LISTENER from the GUA. The event is handled by notifying the upper layer with a REMOVE_LISTENER notification. There is no state transition.
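
Read together with the GUA rules in Section 4.4.1, the VUA rules for CONN_CLOSED and CONN_RCVD describe a SIG_INIT handshake. The following Python sketch walks through that exchange for both the accept and reject cases. It is an illustration under stated assumptions: the step and handshake helpers, the dictionary layout, and the direct in-process hand-off of messages (in place of a real transport) are all hypothetical, and only a subset of each state's events is modelled.

```python
# Illustrative sketch only; helper names and in-process delivery are assumptions.
GUA = {
    ("CONN_CLOSED", "Sm(OPEN_CONN)"): ("INIT_SENT", "Tx(SIG_INIT)"),
    ("INIT_SENT", "Rx(SIG_INIT)"): ("CONN_OPEN", "N(CONN_OPEN)"),
    ("INIT_SENT", "Rx(SIG_CLOSE)"): ("CONN_CLOSED", "N(CONN_FAIL)"),
}
VUA = {
    ("CONN_CLOSED", "Rx(SIG_INIT)"): ("CONN_RCVD", "N(CONN_OPEN)"),
    ("CONN_RCVD", "Sm(ACCEPT_CONN)"): ("CONN_OPEN", "Tx(SIG_INIT)"),
    ("CONN_RCVD", "Sm(REJECT_CONN)"): ("CONN_CLOSED", "Tx(SIG_CLOSE)"),
}


def step(table, state, event):
    """Return (next_state, action); unspecified events leave the state unchanged."""
    return table.get((state, event), (state, None))


def handshake(accept=True):
    """Run one connection attempt; return (gua_state, vua_state, trace)."""
    trace = []
    gua_state, vua_state = "CONN_CLOSED", "CONN_CLOSED"

    # The GUA upper layer opens the connection; SIG_INIT travels to the VUA.
    gua_state, action = step(GUA, gua_state, "Sm(OPEN_CONN)")
    trace.append(("GUA", action))
    vua_state, action = step(VUA, vua_state, "Rx(SIG_INIT)")
    trace.append(("VUA", action))

    # The VUA upper layer decides; its reply travels back to the GUA.
    primitive = "Sm(ACCEPT_CONN)" if accept else "Sm(REJECT_CONN)"
    vua_state, action = step(VUA, vua_state, primitive)
    trace.append(("VUA", action))
    reply = "Rx(SIG_INIT)" if accept else "Rx(SIG_CLOSE)"
    gua_state, action = step(GUA, gua_state, reply)
    trace.append(("GUA", action))
    return gua_state, vua_state, trace


print(handshake(accept=True))
print(handshake(accept=False))
```

In both outcomes the two state machines end in matching states (CONN_OPEN on accept, CONN_CLOSED on reject), which is the property the paired tables are designed to preserve.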

+-----------------------------+-------------+----------------+
| Event                       | Next State  | Action         |
+-----------------------------+-------------+----------------+
| Rx(SIG_CLOSE)               | CONN_CLOSED | N(CONN_CLOSED) |
| Sm(CLOSE_CONN)              | CONN_CLOSED | Tx(SIG_CLOSE)  |
| Rx(CMD_LOAD_URL)            | DOC_RCVD    | N(LOAD_DOC)    |
| Rx(CMD_LOAD_SRC)            | DOC_RCVD    | N(LOAD_DOC)    |
| Rx(CMD_ADD_EVT_LISTENER)    | CONN_OPEN   | N(ADD)         |
| Rx(CMD_REMOVE_EVT_LISTENER) | CONN_OPEN   | N(REMOVE)      |
+-----------------------------+-------------+----------------+

Table 54: CONN_OPEN State

4.4.2.4.  DOC_RCVD State

The VUA has received a request to load a VoiceXML document. The following events can be handled in this state:

Rx(SIG_CLOSE):  The state machine receives a SIG_CLOSE message from the GUA. This instructs it to close the connection. The event is handled by notifying the upper layer with a CONN_CLOSED notification and the state machine transitions to the CONN_CLOSED state.

Sm(CLOSE_CONN):  This is a primitive received from the upper layer, instructing the state machine to close the connection. The event is handled by sending a SIG_CLOSE message to the GUA and the state machine transitions to the CONN_CLOSED state.

Sm(DOC_ERROR):  This is a primitive received from the upper layer, indicating that an error occurred while loading a document.

Condition: The document instance sequence number in the primitive matches the document instance sequence number of the state machine.

The event is handled by sending a RESP_ERROR message to the GUA and the state machine transitions to the CONN_OPEN state.

Sm(DOC_LOADED):  This is a primitive received from the upper layer, indicating that a document has been loaded successfully.

Condition: The document instance sequence number in the primitive matches the document instance sequence number of the state machine.
The event is handled by sending a RESP_OK message to the GUA and the state machine transitions to the DOC_LOADED state.

Rx(CMD_LOAD_URL):  This is a message received from the GUA instructing the state machine to load a VoiceXML document from the given URL. The event is handled by notifying the upper layer with a LOAD_DOC notification. The state machine remains in the DOC_RCVD state.

Rx(CMD_LOAD_SRC):  This is a message received from the GUA instructing the state machine to load a VoiceXML document from a string the GUA has sent. The event is handled by notifying the upper layer with a LOAD_DOC notification. The state machine remains in the DOC_RCVD state.

+------------------+-----------------+-------------+----------------+
| Event            | Condition       | Next State  | Action         |
+------------------+-----------------+-------------+----------------+
| Rx(SIG_CLOSE)    |                 | CONN_CLOSED | N(CONN_CLOSED) |
| Sm(CLOSE_CONN)   |                 | CONN_CLOSED | Tx(SIG_CLOSE)  |
| Sm(DOC_ERROR)    | Document        | CONN_OPEN   | Tx(RESP_ERROR) |
|                  | Sequence Match  |             |                |
| Sm(DOC_LOADED)   | Document        | DOC_LOADED  | Tx(RESP_OK)    |
|                  | Sequence Match  |             |                |
| Rx(CMD_LOAD_URL) |                 | DOC_RCVD    | N(LOAD_DOC)    |
| Rx(CMD_LOAD_SRC) |                 | DOC_RCVD    | N(LOAD_DOC)    |
+------------------+-----------------+-------------+----------------+

Table 55: DOC_RCVD State

4.4.2.5.  DOC_LOADED State

The VUA has successfully loaded a VoiceXML document. The following events are handled:

Rx(SIG_CLOSE):  The state machine receives a SIG_CLOSE message from the GUA. This instructs it to close the connection. The event is handled by notifying the upper layer with a CONN_CLOSED notification and the state machine transitions to the CONN_CLOSED state.

Sm(CLOSE_CONN):  This is a primitive received from the upper layer, instructing the state machine to close the connection.
The event is handled by sending a SIG_CLOSE message to the GUA and the state machine transitions to the CONN_CLOSED state.

Rx(CMD_LOAD_URL): This is a message received from the GUA instructing the state machine to load a VoiceXML document from the given URL. The event is handled by notifying the upper layer with a LOAD_DOC notification. The state machine transitions to the DOC_RCVD state.

Rx(CMD_LOAD_SRC): This is a message received from the GUA instructing the state machine to load a VoiceXML document from a string sent by the GUA. The event is handled by notifying the upper layer with a LOAD_DOC notification. The state machine transitions to the DOC_RCVD state.

Rx(CMD_EXEC_FORM): This is a message received from the GUA instructing the state machine to activate a VoiceXML form. The event is handled by notifying the upper layer with a RUN_DLG notification. The state machine transitions to the DLG_RCVD state.

Rx(CMD_ADD_EVT_LISTENER): The VUA has received a CMD_ADD_EVT_LISTENER from the GUA. The event is handled by notifying the upper layer with an ADD_LISTENER notification. There is no state transition.

Rx(CMD_REMOVE_EVT_LISTENER): The VUA has received a CMD_REMOVE_EVT_LISTENER from the GUA. The event is handled by notifying the upper layer with a REMOVE_LISTENER notification. There is no state transition.
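The DOC_LOADED transitions described above can be sketched as a simple lookup table. This is only an illustration under stated assumptions, not part of the protocol: the Python names DOC_LOADED_TRANSITIONS and handle_event are hypothetical, and the action strings reuse the notation of the state tables in this section, where N(x) notifies the upper layer and Tx(x) sends a message to the GUA.

```python
# Hypothetical sketch of the DOC_LOADED state; all Python names are
# illustrative. Each entry maps an event to (next state, action).
# The listener notifications use the names from the prose; Table 56
# abbreviates them as N(ADD) and N(REMOVE).
DOC_LOADED_TRANSITIONS = {
    "Rx(SIG_CLOSE)": ("CONN_CLOSED", "N(CONN_CLOSED)"),
    "Sm(CLOSE_CONN)": ("CONN_CLOSED", "Tx(SIG_CLOSE)"),
    "Rx(CMD_LOAD_URL)": ("DOC_RCVD", "N(LOAD_DOC)"),
    "Rx(CMD_LOAD_SRC)": ("DOC_RCVD", "N(LOAD_DOC)"),
    "Rx(CMD_EXEC_FORM)": ("DLG_RCVD", "N(RUN_DLG)"),
    "Rx(CMD_ADD_EVT_LISTENER)": ("DOC_LOADED", "N(ADD_LISTENER)"),
    "Rx(CMD_REMOVE_EVT_LISTENER)": ("DOC_LOADED", "N(REMOVE_LISTENER)"),
}


def handle_event(event):
    """Return (next_state, action) for an event received in DOC_LOADED.

    Assumption: an event that is not listed for this state raises an
    error; the text does not say how unlisted events are treated.
    """
    if event not in DOC_LOADED_TRANSITIONS:
        raise ValueError("event not handled in DOC_LOADED: " + event)
    return DOC_LOADED_TRANSITIONS[event]
```

Keeping the transitions in a table rather than in branching code makes it straightforward to compare an implementation row by row against the state table for this state.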
+-----------------------------+-------------+----------------+
| Event                       | Next State  | Action         |
+-----------------------------+-------------+----------------+
| Rx(SIG_CLOSE)               | CONN_CLOSED | N(CONN_CLOSED) |
| Sm(CLOSE_CONN)              | CONN_CLOSED | Tx(SIG_CLOSE)  |
| Rx(CMD_LOAD_URL)            | DOC_RCVD    | N(LOAD_DOC)    |
| Rx(CMD_LOAD_SRC)            | DOC_RCVD    | N(LOAD_DOC)    |
| Rx(CMD_EXEC_FORM)           | DLG_RCVD    | N(RUN_DLG)     |
| Rx(CMD_ADD_EVT_LISTENER)    | DOC_LOADED  | N(ADD)         |
| Rx(CMD_REMOVE_EVT_LISTENER) | DOC_LOADED  | N(REMOVE)      |
+-----------------------------+-------------+----------------+

Table 56: DOC_LOADED State

4.4.2.6. DLG_RCVD State

The state machine has received a dialogue activation request from the GUA and the request has been passed on to the upper layer. The following events are handled:

Rx(SIG_CLOSE): The state machine receives a SIG_CLOSE message from the GUA. This instructs it to close the connection. The event is handled by notifying the upper layer with a CONN_CLOSED notification and the state machine transitions to the CONN_CLOSED state.

Sm(CLOSE_CONN): This is a primitive received from the upper layer, instructing the state machine to close the connection. The event is handled by sending a SIG_CLOSE message to the GUA and the state machine transitions to the CONN_CLOSED state.

Rx(CMD_LOAD_URL): This is a message received from the GUA instructing the state machine to load a VoiceXML document from the given URL. The event is handled by notifying the upper layer with a LOAD_DOC notification. The state machine transitions to the DOC_RCVD state.

Rx(CMD_LOAD_SRC): This is a message received from the GUA instructing the state machine to load a VoiceXML document from a string sent by the GUA. The event is handled by notifying the upper layer with a LOAD_DOC notification. The state machine transitions to the DOC_RCVD state.
Rx(CMD_EXEC_FORM): This is a message received from the GUA instructing the state machine to activate a VoiceXML dialogue. The event is handled by notifying the upper layer with a RUN_DLG notification. The state machine transitions to the DLG_RCVD state.

Rx(CMD_ABORT): This is a message received from the GUA instructing the state machine to deactivate a voice dialogue. The event is handled by notifying the upper layer with a CANCEL_DLG notification. The state machine transitions to the DOC_LOADED state.

Sm(DLG_ERROR): This is a primitive received from the upper layer indicating that an error occurred in activating a voice dialogue. The event is handled by sending a RESP_ERROR message to the GUA. The state machine transitions to the DOC_LOADED state. Condition: The dialogue instance sequence number in the primitive matches the dialogue instance sequence number of the state machine.

Sm(DLG_ACTIVE): This is a primitive received from the upper layer indicating that a voice dialogue has been activated successfully. The event is handled by sending a RESP_OK message to the GUA. The state machine transitions to the DLG_ACTIVE state. Condition: The dialogue instance sequence number in the primitive matches the dialogue instance sequence number of the state machine.
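The sequence-number condition on Sm(DLG_ERROR) and Sm(DLG_ACTIVE) can be sketched as a guard that runs before the transition. This is a minimal sketch under stated assumptions: the class DlgRcvdMachine and its method are hypothetical, the message names RESP_ERROR and RESP_OK are taken from Table 57, and silently ignoring a primitive whose sequence number does not match is an assumption, since the text only states the condition under which the event is handled.

```python
from dataclasses import dataclass


@dataclass
class DlgRcvdMachine:
    """Hypothetical model of the VUA state machine while in DLG_RCVD."""

    dialog_seq: int  # dialogue instance sequence number held by the machine
    state: str = "DLG_RCVD"

    def on_primitive(self, name, seq):
        """Handle Sm(DLG_ERROR) or Sm(DLG_ACTIVE) from the upper layer.

        Returns the message to send to the GUA, or None when nothing is
        sent. Ignoring a mismatched primitive is an assumption, not
        something the text specifies.
        """
        if self.state != "DLG_RCVD" or seq != self.dialog_seq:
            return None
        if name == "DLG_ERROR":
            self.state = "DOC_LOADED"
            return "RESP_ERROR"
        if name == "DLG_ACTIVE":
            self.state = "DLG_ACTIVE"
            return "RESP_OK"
        return None
```

For example, a machine created with dialog_seq=7 would leave its state unchanged for a DLG_ACTIVE primitive carrying sequence number 6, and would move to DLG_ACTIVE for one carrying 7. A guard of this kind is what prevents a late response for a superseded dialogue request from being reported against the current one.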
+-------------------+----------------+-------------+----------------+
| Event             | Condition      | Next State  | Action         |
+-------------------+----------------+-------------+----------------+
| Rx(SIG_CLOSE)     |                | CONN_CLOSED | N(CONN_CLOSED) |
| Sm(CLOSE_CONN)    |                | CONN_CLOSED | Tx(SIG_CLOSE)  |
| Rx(CMD_LOAD_URL)  |                | DOC_RCVD    | N(LOAD_DOC)    |
| Rx(CMD_LOAD_SRC)  |                | DOC_RCVD    | N(LOAD_DOC)    |
| Rx(CMD_EXEC_FORM) |                | DLG_RCVD    | N(RUN_DLG)     |
| Rx(CMD_ABORT)     |                | DOC_LOADED  | N(CANCEL_DLG)  |
| Sm(DLG_ERROR)     | Dialog         | DOC_LOADED  | Tx(RESP_ERROR) |
|                   | Sequence Match |             |                |
| Sm(DLG_ACTIVE)    | Dialog         | DLG_ACTIVE  | Tx(RESP_OK)    |
|                   | Sequence Match |             |                |
+-------------------+----------------+-------------+----------------+

Table 57: DLG_RCVD State

4.4.2.7. DLG_ACTIVE State

The VUA has successfully activated a voice dialogue with which a GUA can interact remotely. The following events are handled:

Rx(SIG_CLOSE): The state machine receives a SIG_CLOSE message from the GUA. This instructs it to close the connection. The event is handled by notifying the upper layer with a CONN_CLOSED notification and the state machine transitions to the CONN_CLOSED state.

Sm(CLOSE_CONN): This is a primitive received from the upper layer, instructing the state machine to close the connection. The event is handled by sending a SIG_CLOSE message to the GUA and the state machine transitions to the CONN_CLOSED state.

Rx(CMD_LOAD_URL): This is a message received from the GUA instructing the state machine to load a VoiceXML document from the given URL. The event is handled by notifying the upper layer with a LOAD_DOC notification. The state machine transitions to the DOC_RCVD state.

Rx(CMD_LOAD_SRC): This is a message received from the GUA instructing the state machine to load a VoiceXML document from a string sent by the GUA. The event is handled by notifying the upper layer with a LOAD_DOC notification.
The state machine transitions to the DOC_RCVD state.

Rx(CMD_EXEC_FORM): This is a message received from the GUA instructing the state machine to activate a VoiceXML dialogue. The event is handled by notifying the upper layer with a RUN_DLG notification. The state machine transitions to the DLG_RCVD state. Since another form is actively executing when the CMD_EXEC_FORM is received in this state, the VUA will implicitly abort that form's execution.

Rx(CMD_ABORT): This is a message received from the GUA instructing the state machine to deactivate a voice dialogue. The event is handled by notifying the upper layer with a CANCEL_DLG notification. The state machine transitions to the DOC_LOADED state.

Sm(DLG_RESULTS): This is a primitive received from the upper layer. It contains the results of a successful voice recognition, or an error if one occurred during voice recognition. The event is handled by sending the appropriate event message to the GUA. There is no transition in the state machine. Condition: The dialogue instance sequence number in the message matches the dialogue instance sequence number of the state machine.

Rx(CMD_SET_FIELDS): This is a message received from a GUA containing updated values of a field in an active voice dialogue. The event is handled by notifying the upper layer with an UPDATE_DLG notification. There is no transition in the state machine.

Rx(CMD_SET_FOCUS): This is a message received from a GUA, instructing the VUA to set the focus on a particular field in an active voice dialogue. The event is handled by notifying the upper layer with a FOCUS_DLG notification. There is no transition in the state machine.
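The handling of Sm(DLG_RESULTS) in the DLG_ACTIVE state, where "the appropriate event message" is chosen and the state does not change, can be sketched as follows. This is a hedged illustration rather than a definitive implementation: the function on_dlg_results and the outcome labels ("reco", "nomatch", and so on) are hypothetical, the event message names are those listed for Sm(DLG_RESULTS) in Table 58, and dropping a primitive with a non-matching sequence number is an assumption.

```python
# Event messages that Table 58 lists as possible actions for
# Sm(DLG_RESULTS). The dictionary keys are hypothetical outcome labels
# that an upper layer might attach to the primitive.
RESULT_MESSAGES = {
    "reco": "EVT_RECO_RESULTS",
    "error": "EVT_ERROR",
    "help": "EVT_HELP",
    "nomatch": "EVT_NOMATCH",
    "noinput": "EVT_NOINPUT",
    "done": "EVT_VXMLDONE",
}


def on_dlg_results(machine_seq, primitive_seq, kind):
    """Pick the event message to send for Sm(DLG_RESULTS) in DLG_ACTIVE.

    Returns (next_state, message). The state is always DLG_ACTIVE
    because this event causes no transition. A message of None means
    nothing is sent; dropping a primitive whose sequence number does
    not match is an assumption.
    """
    if primitive_seq != machine_seq:
        return ("DLG_ACTIVE", None)
    return ("DLG_ACTIVE", RESULT_MESSAGES[kind])
```

Because the state stays DLG_ACTIVE after every result, the GUA can keep receiving recognition events for the same dialogue until it sends CMD_ABORT, loads a new document, or activates another form.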
+--------------------+-----------+-------------+----------------------+
| Event              | Condition | Next State  | Action               |
+--------------------+-----------+-------------+----------------------+
| Rx(SIG_CLOSE)      |           | CONN_CLOSED | N(CONN_CLOSED)       |
| Sm(CLOSE_CONN)     |           | CONN_CLOSED | Tx(SIG_CLOSE)        |
| Rx(CMD_LOAD_URL)   |           | DOC_RCVD    | N(LOAD_DOC)          |
| Rx(CMD_LOAD_SRC)   |           | DOC_RCVD    | N(LOAD_DOC)          |
| Rx(CMD_EXEC_FORM)  |           | DLG_RCVD    | N(RUN_DLG)           |
| Rx(CMD_ABORT)      |           | DOC_LOADED  | N(CANCEL_DLG)        |
| Sm(DLG_RESULTS)    | Dialog    | DLG_ACTIVE  | Tx(EVT_RECO_RESULTS) |
|                    | Sequence  |             |                      |
|                    | Match     |             |                      |
|                    |           |             | || Tx(EVT_ERROR)     |
|                    |           |             | || Tx(EVT_HELP)      |
|                    |           |             | || Tx(EVT_NOMATCH)   |
|                    |           |             | || Tx(EVT_NOINPUT)   |
|                    |           |             | || Tx(EVT_VXMLDONE)  |
| Rx(CMD_SET_FIELDS) |           | DLG_ACTIVE  | N(UPDATE_DLG)        |
| Rx(CMD_SET_FOCUS)  |           | DLG_ACTIVE  | N(FOCUS_DLG)         |
+--------------------+-----------+-------------+----------------------+

Table 58: DLG_ACTIVE State

4.4.2.8. VXML_START_RCVD State

The VUA has received a SIG_VXML_START message from the GUA. The following events are handled:

Sm(VXML_STARTED): This is a primitive received from the upper layer indicating that the session is initialized, the VoiceXML document is loaded, and the dialogue is executing. The event is handled by sending a SIG_VXML_START to the GUA and transitioning to the DLG_ACTIVE state.

Rx(SIG_CLOSE): The state machine receives a SIG_CLOSE message indicating that the GUA has terminated the session. The event is handled by notifying the upper layer with a CONN_FAIL notification and the state machine transitions back to the CONN_CLOSED state.

Sm(CLOSE_CONN): This is a primitive received from the upper layer, used to initiate a connection close. The event is handled by sending a SIG_CLOSE message to the GUA and the state machine transitions back to the CONN_CLOSED state.
+------------------+-------------+--------------------+
| Event            | Next State  | Action             |
+------------------+-------------+--------------------+
| Sm(VXML_STARTED) | DLG_ACTIVE  | Tx(SIG_VXML_START) |
| Rx(SIG_CLOSE)    | CONN_CLOSED | N(CONN_FAIL)       |
| Sm(CLOSE_CONN)   | CONN_CLOSED | Tx(SIG_CLOSE)      |
+------------------+-------------+--------------------+

Table 59: VXML_START_RCVD State

5. Message Transport

DMSP is specified independently of the underlying transport mechanism. The inclusion of rudimentary signaling messages in the protocol provides implementation flexibility: the protocol can be implemented directly on a native transport layer protocol such as TCP, or carried by popular application layer protocols such as SIP and the Hypertext Transfer Protocol (HTTP).

6. IANA Considerations

A media type registration may be required for DMSP, depending on which application layer transport protocol an implementation uses. It is anticipated that bindings of DMSP to specific application protocols such as SIP and HTTP would be specified separately and would include the appropriate IANA requirements.

7. Security Considerations

The DMSP protocol may carry sensitive application information such as account numbers, passwords, and other private information. For this reason it is important that clients have the option of secure communication with the VUA for both the messages they send and the messages they receive, though the GUA is not required to use it. This can be achieved by imposing the following requirement on DMSP VoiceXML user agent implementations:

All DMSP VoiceXML user agents MUST be implemented upon transport mechanisms that can be properly secured.

8. Contributors

The editors acknowledge that the following individuals made significant contributions to DMSP:

Jaroslav Gergic (IBM)
Rafah Hosn (IBM)
Thomas Ling (IBM)
Charles Wiecha (IBM)
Michael Pearce (Motorola)
Rohit Chaudhri (Motorola)
James Ferrans (Motorola)
Paolo Baggia (Loquendo)
Andrew Wahbe (VoiceGenie)

9. References

9.1. Normative References

[1] Bradner, S., "The Internet Standards Process -- Revision 3", BCP 9, RFC 2026, October 1996.

[2] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[3] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform Resource Identifier (URI): Generic Syntax", RFC 3986, January 2005.

[4] Kristol, D. and L. Montulli, "HTTP State Management Mechanism", RFC 2965, October 2000.

9.2. Informative References

[5] Worldwide Web Consortium, "Voice Extensible Markup Language (VoiceXML) Version 2.0", W3C Recommendation (http://www.w3.org/TR/2004/REC-voicexml20-20040316/), March 2004.

[6] Worldwide Web Consortium, "Multimodal Architecture and Interfaces", W3C Working Draft (http://www.w3.org/TR/mmi-arch/), April 2006.

[7] Open Mobile Alliance, "OMA Multimodal and Multi-device Enabler Architecture", Draft OMA-AD-MMMD-V1_0-20060612-D, June 2006.

[8] Krasner, G. and S. Pope, "A cookbook for using the model-view-controller user interface paradigm in Smalltalk-80", Journal of Object-Oriented Programming 1(3):26-49, August/September 1988.

[9] Worldwide Web Consortium, "Document Object Model (DOM) Level 2 Core Specification", W3C Recommendation (http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113/), November 2000.

[10] Worldwide Web Consortium, "Semantic Interpretation for Speech Recognition", W3C Recommendation (http://www.w3.org/TR/semantic-interpretation/), January 2006.
[11] Worldwide Web Consortium, "XML Events, An Events Syntax for XML", W3C Recommendation (http://www.w3.org/TR/xml-events/), October 2003. [12] VoiceXML Forum, "XHTML+Voice Profile 1.2", March 2004. [13] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002. [14] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", RFC 3550, July 2003. [15] Sjoberg, J., Westerlund, M., Lakaniemi, A., and Q. Xie, "Real- Time Transport Protocol (RTP) Payload Format and File Storage Format for the Adaptive Multi-Rate (AMR) and Adaptive Multi- Rate Wideband (AMR-WB) Audio Codecs", RFC 3267, June 2002. [16] European Telecommunications Standards Institute (ETSI) Standard ES 202 050, "Speech Processing, Transmission and Quality Aspects (STQ); Distributed Speech Recognition; Front-end Feature Extraction Algorithm; Compression Algorithms", (http://pda.etsi.org/pda/) , October 2002. [17] European Telecommunications Standards Institute (ETSI) Standard ES 202 211, "Speech Processing, Transmission and Quality Aspects (STQ); Distributed Speech Recognition; Extended front- end feature extraction algorithm; Compression algorithms; Back- end speech reconstruction algorithm", (http://pda.etsi.org/pda/) , November 2003. [18] European Telecommunications Standards Institute (ETSI) Standard ES 202 212, "Speech Processing, Transmission and Quality aspects (STQ); Distributed speech recognition; Extended advanced front-end feature extraction algorithm; Compression algorithms; Back-end speech reconstruction algorithm", (http://pda.etsi.org/pda/) , November 2003. [19] Xie, Q., "RTP Payload Format for European Telecommunications Standards Institute (ETSI) European Standard ES 201 108 Distributed Speech Recognition Encoding", RFC 3557, July 2003. 
[20] Xie, Q. and D. Pearce, "RTP Payload Formats for European Telecommunications Standards Institute (ETSI) European Standard ES 202 050, ES 202 211, and ES 202 212 Distributed Speech Recognition Encoding", RFC 4060, May 2005.

[21] Shanmugham, S. and D. Burnett, "Media Resource Control Protocol Version 2 (MRCPv2)", IETF Internet Draft, September 2006.

[22] Worldwide Web Consortium, "Document Object Model (DOM) Level 2 Events Specification", W3C Recommendation (http://www.w3.org/TR/2000/REC-DOM-Level-2-Events-20001113/), November 2000.

Appendix A. Schema for XML Message Encoding 1.0

Authors' Addresses

Jonathan Engelsma, Editor
Motorola, Inc.
1295 E. Algonquin Road
Schaumburg, IL 60196
US

Phone: +1-616-777-0432
Email: Jonathan.Engelsma@motorola.com

Chris Cross, Editor
IBM
8051 Congress Ave
Boca Raton, FL 33487
US

Phone: +1-561-862-2102
Email: xcross@us.ibm.com

Full Copyright Statement

Copyright (C) The IETF Trust (2007).
This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

Acknowledgment

Funding for the RFC Editor function is provided by the IETF Administrative Support Activity (IASA).