To accomplish this goal, we propose an XML DTD that records the basic events in a Communicator-compliant system. Logged data elements can be annotated with type information indicating that they are "significant" from the point of view of annotators (and annotation tools).
To clarify, we will consider the following data (the term definitions are by no means final and are open to suggestion):
Data | Obligatory | Standard access |
Duration of session | yes | readable directly off the XML representation proposed below |
Duration of turn (input or output) | yes | readable directly off the XML representation proposed below |
Duration of generation of output (in a phone demo, the time the synthesizer takes to generate the audio file) | yes | see 1 |
Duration of display of output (in a phone demo, how long it takes to play the audio file) | yes | see 2 |
Duration of recognition of input (in a phone demo, how long it takes the recognizer to produce its hypotheses) | yes | see 3 |
Duration of arbitrary operations | no | readable directly off the XML representation proposed below |
Number of turns within a session | yes | readable directly off the XML representation proposed below |
Number of sessions (in our current model each session is its own logfile) | yes | readable directly off the XML representation proposed below |
The audio files corresponding to the user input and system output and their formats. The audio files should be stored and distributed with the logs, and the pathnames of these files should be relative to the log. | yes | accessed given an arbitrary search of the logged data (see the "audio_input" and "audio_output" values for the type attribute of the GC_DATA tag, as well as the "mime_type" attribute) |
The text of the user input chosen by the system | yes | accessed given an arbitrary search of the logged data (see the "text_input" values for the type attribute of the GC_DATA tag) |
The text of the system output | yes | accessed given an arbitrary search of the logged data (see the "text_output" value for the type attribute of the GC_DATA tag) |
All possible input sentences (from the recognizer) up to a certain limit (TBD) (N/A to systems that use a word lattice) | no | accessed given an arbitrary search of the logged data (see the "text_input_hypothesis" value for the type attribute of the GC_DATA tag) |
Indication of whether the parse succeeded | no | see 4 |
The full input interpretation | no | accessed given an arbitrary search of the logged data |
The elements which may pose minor complications have been left blank. Here we make tentative proposals for each of these:
We propose that operations should be logged as single XML elements. For example:
<GC_OPERATION name="paraphrase_reply" server="nl" location="localhost:11000"
turnid="-1" stime="941473394.66" etime="941473394.69" tidx="3">
<GC_DATA key=":reply_string" dtype="string">
Hi! Welcome to Mitre's Travel demonstration. This call is being recorded for
system development. You may hang up or ask for help at any time. How can I
help you?
</GC_DATA>
</GC_OPERATION>
Because messages in our distributed architecture are sent asynchronously, and many events may occur before an operation completes, some caching (or post-processing) will be necessary to log operations as single elements.
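The caching mentioned above might look like the following minimal sketch. It assumes two hypothetical logger hooks, `start` and `finish`, called when an operation is dispatched and when its reply arrives, and it assumes that the pair (server, name, tidx) is enough to match the two; none of this is specified by the proposal.

```python
import xml.etree.ElementTree as ET


class OperationCache:
    """Pairs asynchronous start/finish notifications into single elements.

    Matching on (server, name, tidx) is an assumption made for this sketch.
    """

    def __init__(self):
        self._pending = {}   # key -> attributes recorded at start time
        self.completed = []  # finished GC_OPERATION elements, in completion order

    def start(self, server, location, name, turnid, tidx, stime):
        self._pending[(server, name, tidx)] = {
            "name": name, "server": server, "location": location,
            "turnid": str(turnid), "stime": str(stime), "tidx": str(tidx),
        }

    def finish(self, server, name, tidx, etime, data=None):
        # Remove the cached start record and emit one complete element.
        attrs = self._pending.pop((server, name, tidx))
        attrs["etime"] = str(etime)
        op = ET.Element("GC_OPERATION", attrs)
        for key, value in (data or {}).items():
            item = ET.SubElement(op, "GC_DATA", {"key": key, "dtype": "string"})
            item.text = value
        self.completed.append(op)
        return op


cache = OperationCache()
cache.start("nl", "localhost:11000", "paraphrase_reply", -1, 3, "941473394.66")
# Other messages and events may be logged here before the reply arrives.
op = cache.finish("nl", "paraphrase_reply", 3, "941473394.69",
                  {":reply_string": "How can I help you?"})
print(ET.tostring(op, encoding="unicode"))
```

Post-processing a raw log would follow the same pairing logic, only applied after the session has ended rather than while it is running.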
Next we define the main entities in the logfile and their formats. A DTD that defines these terms and their relations is also available. We assume that all time values are measured from a standard base time known as "the epoch" (January 1, 1970, 00:00:00 GMT) and are expressed as the number of milliseconds since that time.
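As an illustration of the epoch convention, the sketch below converts a logged time value to a UTC date. The function name `epoch_to_utc` and its `unit` parameter are hypothetical. The unit is left as a parameter because the sample values in this document contain a decimal point and may be fractional seconds rather than integer milliseconds; that reading is an assumption.

```python
from datetime import datetime, timezone


def epoch_to_utc(value, unit="ms"):
    """Convert a logged time stamp (string or number) to an aware UTC datetime."""
    seconds = float(value) / 1000.0 if unit == "ms" else float(value)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


print(epoch_to_utc("86400000"))           # one day after the epoch, read as milliseconds
print(epoch_to_utc("941473394.66", "s"))  # a sample value from this document, read as seconds
```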
Name | Description | Type | Required |
id | We should attempt to determine a unique identifier for sessions. MIT's solution uses the format IP:process id:session counter. Process ids may not be trivial to obtain in every programming language and OS; however, "equivalent" data are usually available | string | yes |
stime | time when session started | milliseconds | yes |
etime | time when session finished | milliseconds | yes |
GC_TURN | see GC_TURN | GC_TURN | no |
<GC_SESSION
id="129.10.2.200:1010:3"
stime="930254422.720000"
etime="930254434.790000">
...
</GC_SESSION>
Name | Description | Type | Required |
id | A unique identifier within each session | number | yes |
stime | time when turn started | milliseconds | yes |
etime | time when turn ended | milliseconds | yes |
GC_OPERATION | see GC_OPERATION | GC_OPERATION | no |
GC_MESSAGE | see GC_MESSAGE | GC_MESSAGE | no |
GC_EVENT | see GC_EVENT | GC_EVENT | no |
<GC_TURN
id="-01"
stime="930254422.720000"
etime="930254424.790000">
...
</GC_TURN>
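The durations and turn counts described as "readable directly off the XML representation" could be extracted roughly as follows. This is a sketch using Python's standard `xml.etree.ElementTree`; the turn ids and the second turn's times are invented for illustration, and durations are reported in whatever unit the log uses.

```python
import xml.etree.ElementTree as ET

LOG = """
<GC_LOG>
  <GC_SESSION id="129.10.2.200:1010:3" stime="930254422.720000" etime="930254434.790000">
    <GC_TURN id="0" stime="930254422.720000" etime="930254424.790000"/>
    <GC_TURN id="1" stime="930254425.000000" etime="930254434.790000"/>
  </GC_SESSION>
</GC_LOG>
"""


def duration(element):
    """Return etime - stime in whatever unit the log uses."""
    return float(element.get("etime")) - float(element.get("stime"))


root = ET.fromstring(LOG)
print("sessions:", len(root.findall("GC_SESSION")))
for session in root.iter("GC_SESSION"):
    turns = session.findall("GC_TURN")
    print(session.get("id"), "turns:", len(turns),
          "duration:", round(duration(session), 2))
    for turn in turns:
        print("  turn", turn.get("id"), "duration:", round(duration(turn), 2))
```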
Name | Description | Type | Required |
type | the type of operation being executed (specific values TBD) | string | no |
turnid | the turn id that this operation was executed under | number | yes |
stime | time when operation started | milliseconds | yes |
etime | time when operation ended | milliseconds | yes |
server | the name (according to the program file) of the server that executed the operation | string | yes |
location | the server (real server name or IP address) and its port (server_name:port_number) | string | yes |
name | the name of the operation | string | yes |
tidx | the token index associated with the operation | number | no |
reply_type | valid values of reply_type include normal, destroy, and error | string | no |
GC_DATA | see GC_DATA | GC_DATA | no |
<GC_OPERATION name="paraphrase_reply" server="nl" location="localhost:11000"
turnid="-1" stime="941473394.66" etime="941473394.69" tidx="3">
<GC_DATA key=":reply_string" dtype="string">
Hi! Welcome to Mitre's Travel demonstration. This call is being recorded for
system development. You may hang up or ask for help at any time. How can I
help you?
</GC_DATA>
</GC_OPERATION>
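Because each operation is a single element with both stime and etime, the duration of arbitrary operations can be summed per server and operation name. The sketch below assumes an enclosing turn whose times, and a second operation whose times and token index, are invented for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

TURN = """
<GC_TURN id="-1" stime="941473394.00" etime="941473397.00">
  <GC_OPERATION name="paraphrase_reply" server="nl" location="localhost:11000"
                turnid="-1" stime="941473394.66" etime="941473394.69" tidx="3"/>
  <GC_OPERATION name="paraphrase_reply" server="nl" location="localhost:11000"
                turnid="-1" stime="941473395.10" etime="941473395.15" tidx="4"/>
</GC_TURN>
"""

# Accumulate etime - stime for every (server, operation name) pair.
totals = defaultdict(float)
for op in ET.fromstring(TURN).iter("GC_OPERATION"):
    key = (op.get("server"), op.get("name"))
    totals[key] += float(op.get("etime")) - float(op.get("stime"))

for (server, name), total in sorted(totals.items()):
    print(f"{server}.{name}: {total:.2f}")
```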
Name | Description | Type | Required |
type | the type of message being issued (specific values TBD) | string | no |
turnid | the turn id that this message was issued under | number | yes |
time | time when message issued | milliseconds | yes |
server | the name of the server that issued the message | string | yes |
location | the server (real server name or IP address) and its port (server_name:port_number) | string | yes |
name | the name of the message | string | yes |
direction | server_to_hub or hub_to_server | string | yes |
tidx | the token index associated with the message | number | no |
reply_type | valid values of reply_type include normal, destroy, and error | string | no |
GC_DATA | see GC_DATA | GC_DATA | no |
<GC_MESSAGE name="filelog" direction="server_to_hub" server="audio"
location="localhost:15000" turnid="-1" time="941473396.48" tidx="6">
<GC_DATA key=":synth_log_filename" dtype="string">
/home/communicator/test/Travel-demo/../../logs/travel_cfone/19991101/001/
travel_cfone-19991101-001-synth--01-001.wav
</GC_DATA>
</GC_MESSAGE>
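Since audio pathnames are meant to be relative to the log, a logged absolute path like the one above could be normalized as sketched below. `LOG_DIR` is a hypothetical location for the logfile, and removing all whitespace to undo the line wrapping assumes the path itself contains no spaces.

```python
import posixpath
import xml.etree.ElementTree as ET

MESSAGE = """
<GC_MESSAGE name="filelog" direction="server_to_hub" server="audio"
            location="localhost:15000" turnid="-1" time="941473396.48" tidx="6">
  <GC_DATA key=":synth_log_filename" dtype="string">
    /home/communicator/test/Travel-demo/../../logs/travel_cfone/19991101/001/
    travel_cfone-19991101-001-synth--01-001.wav
  </GC_DATA>
</GC_MESSAGE>
"""

# Hypothetical: the directory that holds the logfile itself.
LOG_DIR = "/home/communicator/logs/travel_cfone/19991101/001"

data = ET.fromstring(MESSAGE).find("GC_DATA")
logged = "".join(data.text.split())      # undo the line wrapping inside the element
absolute = posixpath.normpath(logged)    # collapse the "../.." segments
relative = posixpath.relpath(absolute, LOG_DIR)
print(relative)
```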
Name | Description | Type | Required |
etype | the name of hub event (SYSTEM_ERROR, LOCK, etc.) | string | yes |
turnid | the turn id under which this event occurred | number | yes |
time | time when the event occurred | milliseconds | yes |
name | the type of the event (operation) | string | yes |
server | the name of the server that issued the message | string | no |
location | the server (real server name or IP address) and its port (server_name:port_number) | string | no |
tidx | the token index associated with the event | number | no |
GC_DATA | see GC_DATA | GC_DATA | no |
<GC_EVENT etype="LOCK" server="audio" location="localhost:15000" turnid="-1"
time="941473396.19" name=":hub_get_session_lock" tidx="5"/>
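Operations carry stime and etime, while messages and events carry a single time attribute. A tool that wants one chronological view of a turn therefore needs to look at both, as in this sketch; the enclosing turn and its times are invented for illustration.

```python
import xml.etree.ElementTree as ET

TURN = """
<GC_TURN id="-1" stime="941473394.00" etime="941473397.00">
  <GC_MESSAGE name="filelog" direction="server_to_hub" server="audio"
              location="localhost:15000" turnid="-1" time="941473396.48" tidx="6"/>
  <GC_EVENT etype="LOCK" server="audio" location="localhost:15000" turnid="-1"
            time="941473396.19" name=":hub_get_session_lock" tidx="5"/>
  <GC_OPERATION name="paraphrase_reply" server="nl" location="localhost:11000"
                turnid="-1" stime="941473394.66" etime="941473394.69" tidx="3"/>
</GC_TURN>
"""


def start_time(element):
    """Operations carry stime; messages and events carry time."""
    return float(element.get("stime") or element.get("time"))


# Iterating an element yields its direct children.
timeline = sorted(ET.fromstring(TURN), key=start_time)
for entry in timeline:
    print(f"{start_time(entry):.2f}", entry.tag, entry.get("name"))
```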
Name | Description | Type | Required |
key | the name of this data point | string | yes |
turnid | the turn id under which this data point was logged | number | no |
time | time stamp for this data point | milliseconds | no |
type | valid values of type include audio_input, audio_output, text_input, text_output, text_input_hypothesis, and concept. See the Content section. | string | no |
mime_type | the mime type of the data | string | no |
dtype | the data type - valid values include integer, string, etc. (full list of values TBD) | string | no |
GC_FRAME | see GC_FRAME | GC_FRAME | no |
GC_LIST | see GC_LIST | GC_LIST | no |
<GC_DATA key=":reply_string" dtype="string">
Hi! Welcome to Mitre's Travel demonstration. This call is being recorded for
system development. You may hang up or ask for help at any time. How can I
help you?
</GC_DATA>
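The "arbitrary search of the logged data" by type attribute could be implemented as in the sketch below. The operation name `recognize`, the server name `recognizer`, its port, the key `:input_string`, and the simplified ids and times are all invented for illustration.

```python
import xml.etree.ElementTree as ET

LOG = """
<GC_LOG>
  <GC_SESSION id="s1" stime="0" etime="10">
    <GC_TURN id="1" stime="0" etime="10">
      <GC_OPERATION name="recognize" server="recognizer" location="localhost:12000"
                    turnid="1" stime="1" etime="2">
        <GC_DATA key=":input_string" type="text_input" dtype="string">can i get this american flight</GC_DATA>
      </GC_OPERATION>
      <GC_OPERATION name="paraphrase_reply" server="nl" location="localhost:11000"
                    turnid="1" stime="3" etime="4">
        <GC_DATA key=":reply_string" type="text_output" dtype="string">How can I help you?</GC_DATA>
      </GC_OPERATION>
    </GC_TURN>
  </GC_SESSION>
</GC_LOG>
"""


def find_data(root, wanted_type):
    """Return the stripped text of every GC_DATA whose type list contains wanted_type."""
    hits = []
    for data in root.iter("GC_DATA"):
        # The DTD declares type as NMTOKENS, so it may hold several tokens.
        if wanted_type in (data.get("type") or "").split():
            hits.append((data.text or "").strip())
    return hits


root = ET.fromstring(LOG)
print(find_data(root, "text_input"))
print(find_data(root, "text_output"))
```

Searching the whole tree rather than fixed locations means the annotation tool does not need to know which server or operation produced a given data element.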
Name | Description | Type | Required |
frame_type | Galaxy frame type | string | no |
name | the name of the frame | string | no |
turnid | the turn id in which this frame appears | number | no |
GC_DATA | see GC_DATA | GC_DATA | no |
<GC_DATA key=":rec_scores">
<GC_FRAME name="scores" frame_type="clause">
<GC_DATA key=":acoustic_score" dtype="string">
"-617.9270"
</GC_DATA>
<GC_DATA key=":ngram_score" dtype="string">
"-17.4465"
</GC_DATA>
<GC_DATA key=":nwords" dtype="integer">
8
</GC_DATA>
<GC_DATA key=":total_score" dtype="string">
"-651.3735"
</GC_DATA>
<GC_DATA key=":nphones" dtype="integer">
36
</GC_DATA>
</GC_FRAME>
</GC_DATA>
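A consumer might turn a frame into a native mapping, as in the sketch below. It uses the `frame_type` attribute named in the table and the DTD, shortens the example to three entries, and converts only the `integer` dtype, since the full list of dtype values is still to be determined.

```python
import xml.etree.ElementTree as ET

FRAME = """
<GC_DATA key=":rec_scores">
  <GC_FRAME name="scores" frame_type="clause">
    <GC_DATA key=":acoustic_score" dtype="string">"-617.9270"</GC_DATA>
    <GC_DATA key=":nwords" dtype="integer">8</GC_DATA>
    <GC_DATA key=":nphones" dtype="integer">36</GC_DATA>
  </GC_FRAME>
</GC_DATA>
"""


def frame_to_dict(frame):
    """Map each GC_DATA child of a GC_FRAME to key -> value."""
    result = {}
    for data in frame.findall("GC_DATA"):
        text = (data.text or "").strip()
        # Only "integer" is converted; other dtype values are kept as text.
        result[data.get("key")] = int(text) if data.get("dtype") == "integer" else text
    return result


scores = frame_to_dict(ET.fromstring(FRAME).find("GC_FRAME"))
print(scores)
```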
Name | Description | Type | Required |
name | the name of the list | string | no |
turnid | the turn id in which this list appears | number | no |
GC_DATA | see GC_DATA | GC_DATA | no |
<GC_DATA key=":nbest_list">
<GC_LIST name=":nbest_list">
<GC_DATA key=":nbest_list[0]" dtype="string">
can i get this american flight
</GC_DATA>
<GC_DATA key=":nbest_list[1]" dtype="string">
can i get this american difference
</GC_DATA>
<GC_DATA key=":nbest_list[2]" dtype="string">
can i did this american difference
</GC_DATA>
<GC_DATA key=":nbest_list[3]" dtype="string">
can i get this american that flight
</GC_DATA>
</GC_LIST>
</GC_DATA>
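Reading such a list back, for example to retrieve recognizer hypotheses up to some limit, might look like this sketch. The helper name `list_items` and its `limit` parameter are hypothetical, and the example is shortened to two hypotheses.

```python
import xml.etree.ElementTree as ET

NBEST = """
<GC_DATA key=":nbest_list">
  <GC_LIST name=":nbest_list">
    <GC_DATA key=":nbest_list[0]" dtype="string">can i get this american flight</GC_DATA>
    <GC_DATA key=":nbest_list[1]" dtype="string">can i get this american difference</GC_DATA>
  </GC_LIST>
</GC_DATA>
"""


def list_items(gc_list, limit=None):
    """Return the text of each GC_DATA child in document order, optionally truncated."""
    items = [(d.text or "").strip() for d in gc_list.findall("GC_DATA")]
    return items if limit is None else items[:limit]


gc_list = ET.fromstring(NBEST).find("GC_LIST")
print(list_items(gc_list))
print(list_items(gc_list, limit=1))
```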
<?xml version="1.0"?>
<!ELEMENT GC_LOG (GC_SESSION)*>
<!ATTLIST GC_LOG logfile_version CDATA #IMPLIED>
<!ELEMENT GC_SESSION (GC_TURN)* >
<!ATTLIST GC_SESSION id NMTOKEN #REQUIRED>
<!-- time could be defined as CDATA if we choose to use a
non-millisecond format -->
<!ATTLIST GC_SESSION stime NMTOKEN #REQUIRED>
<!ATTLIST GC_SESSION etime NMTOKEN #REQUIRED>
<!ELEMENT GC_TURN ( GC_OPERATION | GC_MESSAGE | GC_EVENT )*>
<!ATTLIST GC_TURN id NMTOKEN #REQUIRED>
<!ATTLIST GC_TURN stime NMTOKEN #REQUIRED>
<!ATTLIST GC_TURN etime NMTOKEN #REQUIRED>
<!ELEMENT GC_OPERATION (GC_DATA)*>
<!ATTLIST GC_OPERATION type NMTOKENS #IMPLIED>
<!ATTLIST GC_OPERATION turnid NMTOKEN #REQUIRED>
<!ATTLIST GC_OPERATION server CDATA #REQUIRED>
<!ATTLIST GC_OPERATION location NMTOKEN #REQUIRED>
<!ATTLIST GC_OPERATION name CDATA #REQUIRED>
<!ATTLIST GC_OPERATION tidx NMTOKEN #IMPLIED>
<!ATTLIST GC_OPERATION reply_type CDATA #IMPLIED>
<!ATTLIST GC_OPERATION stime NMTOKEN #REQUIRED>
<!ATTLIST GC_OPERATION etime NMTOKEN #REQUIRED>
<!ELEMENT GC_MESSAGE (GC_DATA)*>
<!ATTLIST GC_MESSAGE type NMTOKENS #IMPLIED>
<!ATTLIST GC_MESSAGE turnid NMTOKEN #REQUIRED>
<!ATTLIST GC_MESSAGE server CDATA #REQUIRED>
<!ATTLIST GC_MESSAGE location NMTOKEN #REQUIRED>
<!ATTLIST GC_MESSAGE name CDATA #REQUIRED>
<!ATTLIST GC_MESSAGE direction NMTOKEN #REQUIRED>
<!ATTLIST GC_MESSAGE tidx NMTOKEN #IMPLIED>
<!ATTLIST GC_MESSAGE reply_type CDATA #IMPLIED>
<!ATTLIST GC_MESSAGE time NMTOKEN #REQUIRED>
<!ELEMENT GC_EVENT (GC_DATA)*>
<!ATTLIST GC_EVENT etype NMTOKEN #REQUIRED>
<!ATTLIST GC_EVENT turnid NMTOKEN #REQUIRED>
<!ATTLIST GC_EVENT server CDATA #IMPLIED>
<!ATTLIST GC_EVENT location NMTOKEN #IMPLIED>
<!ATTLIST GC_EVENT time NMTOKEN #REQUIRED>
<!ATTLIST GC_EVENT name CDATA #REQUIRED>
<!ATTLIST GC_EVENT tidx NMTOKEN #IMPLIED>
<!ELEMENT GC_DATA ANY>
<!ATTLIST GC_DATA key CDATA #REQUIRED>
<!ATTLIST GC_DATA type NMTOKENS #IMPLIED>
<!ATTLIST GC_DATA mime_type NMTOKEN #IMPLIED>
<!ATTLIST GC_DATA dtype NMTOKEN #IMPLIED>
<!ATTLIST GC_DATA time NMTOKEN #IMPLIED>
<!ATTLIST GC_DATA turnid NMTOKEN #IMPLIED>
<!ELEMENT GC_FRAME (GC_DATA)*>
<!ATTLIST GC_FRAME frame_type NMTOKEN #IMPLIED>
<!ATTLIST GC_FRAME name CDATA #IMPLIED>
<!ATTLIST GC_FRAME turnid NMTOKEN #IMPLIED>
<!ELEMENT GC_LIST (GC_DATA)*>
<!ATTLIST GC_LIST name CDATA #IMPLIED>
<!ATTLIST GC_LIST turnid NMTOKEN #IMPLIED>
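Python's standard library does not validate against a DTD, so the sketch below checks only one aspect of the DTD above: the attributes declared #REQUIRED. The `REQUIRED` table is transcribed by hand from the declarations, and the check is an illustration rather than a substitute for a validating parser.

```python
import xml.etree.ElementTree as ET

# Attributes marked #REQUIRED in the DTD above.
REQUIRED = {
    "GC_SESSION": ["id", "stime", "etime"],
    "GC_TURN": ["id", "stime", "etime"],
    "GC_OPERATION": ["turnid", "server", "location", "name", "stime", "etime"],
    "GC_MESSAGE": ["turnid", "server", "location", "name", "direction", "time"],
    "GC_EVENT": ["etype", "turnid", "time", "name"],
    "GC_DATA": ["key"],
}


def missing_required(root):
    """Return (tag, attribute) pairs for every required attribute that is absent."""
    problems = []
    for element in root.iter():
        for attr in REQUIRED.get(element.tag, []):
            if element.get(attr) is None:
                problems.append((element.tag, attr))
    return problems


good = ET.fromstring('<GC_EVENT etype="LOCK" turnid="-1" time="941473396.19" '
                     'name=":hub_get_session_lock"/>')
bad = ET.fromstring('<GC_DATA name=":nbest_list"/>')  # key is missing
print(missing_required(good))
print(missing_required(bad))
```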