To accomplish this goal, we propose an XML DTD that records the basic events in a Communicator-compliant system. Logged elements can be annotated with type information indicating that a data element is "significant" from the point of view of annotators (and annotation tools).
To clarify, we will consider the following data (term definitions are by no means final and are open to suggestion):
Data | Obligatory | Standard access |
Duration of session | yes | readable directly off the XML representation proposed below |
Duration of turn (input or output) | yes | readable directly off the XML representation proposed below |
Duration of generation of output (in a phone demo, the time the synthesizer takes to generate the audio file) | yes | see 1 |
Duration of display of output (in a phone demo, how long it takes to play the audio file) | yes | see 2 |
Duration of recognition of input (in a phone demo, how long it takes the recognizer to produce its hypotheses) | yes | see 3 |
Duration of arbitrary operations | no | readable directly off the XML representation proposed below |
Number of turns within a session | yes | readable directly off the XML representation proposed below |
Number of sessions (in our current model each session is its own logfile) | yes | readable directly off the XML representation proposed below |
The audio files corresponding to the user input and system output and their formats. The audio files should be stored and distributed with the logs, and the pathnames of these files should be relative to the log. | yes | accessed given an arbitrary search of the logged data (see the "audio_input" and "audio_output" values for the type attribute of the GC_DATA tag, as well as the "mime_type" attribute) |
The text of the user input chosen by the system | yes | accessed given an arbitrary search of the logged data (see the "text_input" value for the type attribute of the GC_DATA tag) |
The text of the system output | yes | accessed given an arbitrary search of the logged data (see the "text_output" value for the type attribute of the GC_DATA tag) |
All possible input sentences (from the recognizer) up to a certain limit (TBD) (N/A to systems that use a word lattice) | no | accessed given an arbitrary search of the logged data (see the "text_input_hypothesis" value for the type attribute of the GC_DATA tag) |
Indication of whether the parse succeeded | no | see 4 |
The full input interpretation | no | accessed given an arbitrary search of the logged data |
The elements which may pose minor complications have been left blank. Here we make tentative proposals for each of these:
We propose that operations should be logged as single XML elements. For example:
<GC_OPERATION name="paraphrase_reply" server="nl" location="localhost:11000"
turnid="-1" stime="941473394.66" etime="941473394.69" tidx="3">
<GC_DATA key=":reply_string" dtype="string">
Hi! Welcome to Mitre's Travel demonstration. This call is being recorded for
system development. You may hang up or ask for help at any time. How can I
help you?
</GC_DATA>
</GC_OPERATION>
Since messages are sent asynchronously in our distributed architecture,
and many events may occur before the completion of an operation, some
caching (or post-processing) will be necessary to log operations as
single elements.
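The caching step can be sketched as follows. This is only a minimal illustration, assuming that the logger is notified separately when an operation starts and when its reply arrives, and that the token index (tidx) identifies the pair. The `OperationCache` class and its `start` and `finish` methods are hypothetical names, not part of any Communicator API.

```python
import xml.etree.ElementTree as ET


class OperationCache:
    """Hold started operations until their reply arrives (hypothetical helper)."""

    def __init__(self):
        self._pending = {}  # tidx -> attributes recorded when the operation started

    def start(self, tidx, name, server, location, turnid, stime):
        # Remember the start of the operation; nothing is written to the log yet.
        self._pending[tidx] = {
            "name": name, "server": server, "location": location,
            "turnid": str(turnid), "stime": repr(stime), "tidx": str(tidx),
        }

    def finish(self, tidx, etime, data=None):
        # The reply has arrived: build one complete GC_OPERATION element.
        attrs = self._pending.pop(tidx)
        attrs["etime"] = repr(etime)
        op = ET.Element("GC_OPERATION", attrs)
        for key, value in (data or {}).items():
            child = ET.SubElement(op, "GC_DATA", {"key": key, "dtype": "string"})
            child.text = value
        return op


cache = OperationCache()
cache.start(3, "paraphrase_reply", "nl", "localhost:11000", -1, 941473394.66)
# ... other events may be logged here before the reply arrives ...
op = cache.finish(3, 941473394.69, {":reply_string": "How can I help you?"})
print(ET.tostring(op, encoding="unicode"))
```

A post-processing pass over a raw event log could use the same pairing logic instead of caching at run time.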
Next we will try to define the main entities
in the logfile and their formats. A DTD is also available which defines
these terms and their relations. We will assume all time values are
measured from a standard base time known as "the epoch" (January 1, 1970,
00:00:00 GMT). In attribute values, these times will be represented as
float seconds since the epoch, with the fractional part carrying the
sub-second (millisecond) precision.
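As an illustration of the time format, the sketch below converts one of the float-seconds attribute values used in the examples of this document into a calendar date. The helper name `epoch_to_utc` is an assumption made for this example only.

```python
from datetime import datetime, timezone


def epoch_to_utc(value):
    """Convert a float-seconds attribute value (a string) to an aware UTC datetime."""
    return datetime.fromtimestamp(float(value), tz=timezone.utc)


started = epoch_to_utc("941473394.66")
print(started.isoformat())
```

The resulting date falls on November 1, 1999, which agrees with the "19991101" component of the log file names shown in the GC_MESSAGE example.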
Name | Description | Type | Required |
logfile_version | A label referring to the system that generated the log | string | no |
human_annotations_included | human_annotations_included="1" indicates that the human annotations file containing transcriptions and objective task completion has been folded into the main log file | number | no |
GC_SESSION | see GC_SESSION | GC_SESSION | no |
<GC_LOG
logfile_version="travel cfone, version 2.0">
...
</GC_LOG>
Name | Description | Type | Required |
id | We should attempt to determine a unique identifier for sessions. MIT's solution for this is of the following format (IP:process id:session counter). Process IDs might not be trivial to obtain in different programming languages and operating systems; however, "equivalent" data is usually available | string | yes |
stime | time when session started | seconds (float) | yes |
etime | time when session finished | seconds (float) | yes |
GC_TURN | see GC_TURN | GC_TURN | no |
<GC_SESSION
id="129.10.2.200:1010:3"
stime="930254422.720000"
etime="930254434.790000">
...
</GC_SESSION>
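The duration of a session can be read directly off the stime and etime attributes. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the function name `session_durations` is hypothetical, and the sample log simply wraps the session element shown above.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<GC_LOG logfile_version="travel cfone, version 2.0">
  <GC_SESSION id="129.10.2.200:1010:3"
              stime="930254422.720000"
              etime="930254434.790000">
  </GC_SESSION>
</GC_LOG>
""".strip()


def session_durations(log_root):
    """Map each session id to its duration in seconds (etime - stime)."""
    return {
        s.get("id"): float(s.get("etime")) - float(s.get("stime"))
        for s in log_root.iter("GC_SESSION")
    }


root = ET.fromstring(SAMPLE)
durations = session_durations(root)
print(durations)  # roughly 12.07 seconds for the sample session
```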
Name | Description | Type | Required |
id | A unique identifier within each session | number | yes |
stime | time when turn started | seconds (float) | yes |
etime | time when turn ended | seconds (float) | yes |
GC_OPERATION | see GC_OPERATION | GC_OPERATION | no |
GC_MESSAGE | see GC_MESSAGE | GC_MESSAGE | no |
GC_EVENT | see GC_EVENT | GC_EVENT | no |
<GC_TURN
id="-01"
stime="930254422.720000"
etime="930254424.790000">
...
</GC_TURN>
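The number of turns in a session and the duration of each turn can be obtained in the same way. In the sketch below the first turn is the one from the example above; the second turn is invented purely for illustration.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<GC_SESSION id="129.10.2.200:1010:3" stime="930254422.720000" etime="930254434.790000">
  <GC_TURN id="-01" stime="930254422.720000" etime="930254424.790000"/>
  <!-- hypothetical second turn, not taken from a real log -->
  <GC_TURN id="0" stime="930254424.790000" etime="930254434.790000"/>
</GC_SESSION>
""".strip()

session = ET.fromstring(SAMPLE)
turns = session.findall("GC_TURN")  # direct children only

turn_count = len(turns)
turn_durations = [
    (t.get("id"), float(t.get("etime")) - float(t.get("stime"))) for t in turns
]

print(turn_count)
for turn_id, seconds in turn_durations:
    print(turn_id, round(seconds, 2))
```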
Name | Description | Type | Required |
type | the type of operation being executed (specific values TBD) | string | no |
turnid | the turn id that this operation was executed under | number | yes |
stime | time when operation started | seconds (float) | yes |
etime | time when operation ended | seconds (float) | yes |
server | the name (according to the program file) of the server that executed the operation | string | yes |
location | the server (real server name or IP address) and its port (server_name:port_number) | string | yes |
name | the name of the operation | string | yes |
tidx | the token index associated with the operation | number | no |
reply_type | valid values of reply_type include normal, destroy, and error | string | no |
reply_status | valid values are normal, error, destroy and asynchronous | string | no |
type_start_task | valid values of type_start_task are true, task and total, and indicate whether the measurement is of on-task time or total call time | string | no |
type_end_task | indicates the end of the task | string | no |
type_new_turn | valid values of type_new_turn are user and system | string | no |
type_start_utt | valid values of type_start_utt include user, system and pacifier | string | no |
type_end_utt | valid values of type_end_utt include user, system and pacifier | string | no |
type_prompt | indicates the system is prompting for a key. the value of type_prompt is the key being prompted | string | no |
GC_DATA | see GC_DATA | GC_DATA | no |
<GC_OPERATION type_new_turn="system" name="paraphrase_reply" server="nl" location="localhost:11000"
turnid="-1" stime="941473394.66" etime="941473394.69" tidx="3">
<GC_DATA type_utt_text="system" key=":reply_string" dtype="string">
Hi! Welcome to Mitre's Travel demonstration. This call is being recorded for
system development. You may hang up or ask for help at any time. How can I
help you?
</GC_DATA>
</GC_OPERATION>
Name | Description | Type | Required |
type | the type of message being issued (specific values TBD) | string | no |
turnid | the turn id that this operation was executed under | number | yes |
time | time when message issued | seconds (float) | yes |
server | the name of the server that issued the message | string | yes |
location | the server (real server name or IP address) and its port (server_name:port_number) | string | yes |
name | the name of the message | string | yes |
direction | server_to_hub or hub_to_server | string | yes |
tidx | the token index associated with the message | number | no |
reply_type | valid values of reply_type include normal, destroy, and error | string | no |
reply_status | valid values are normal, error, destroy and asynchronous | string | no |
type_start_task | valid values of type_start_task are true, task and total, and indicate whether the measurement is of on-task time or total call time | string | no |
type_end_task | indicates the end of the task | string | no |
type_new_turn | valid values of type_new_turn are user and system | string | no |
type_start_utt | valid values of type_start_utt include user, system and pacifier | string | no |
type_end_utt | valid values of type_end_utt include user, system and pacifier | string | no |
type_prompt | indicates the system is prompting for a key. the value of type_prompt is the key being prompted | string | no |
GC_DATA | see GC_DATA | GC_DATA | no |
<GC_MESSAGE name="filelog" direction="server_to_hub" server="audio"
location="localhost:15000" turnid="-1" time="941473396.48" tidx="6">
<GC_DATA key=":synth_log_filename" dtype="string">
/home/communicator/test/Travel-demo/../../logs/travel_cfone/19991101/001/
travel_cfone-19991101-001-synth--01-001.wav
</GC_DATA>
</GC_MESSAGE>
Name | Description | Type | Required |
etype | the name of hub event (SYSTEM_ERROR, LOCK, etc.) | string | yes |
turnid | the turn id under which this event occurred | number | yes |
time | time when message issued | seconds (float) | yes |
name | the type of the event (operation) | string | yes |
server | the name of the server that issued the message | string | no |
location | the server (real server name or IP address) and its port (server_name:port_number) | string | no |
tidx | the token index associated with the event | number | no |
type_start_task | valid values of type_start_task are true, task and total, and indicate whether the measurement is of on-task time or total call time | string | no |
type_end_task | indicates the end of the task | string | no |
type_new_turn | valid values of type_new_turn are user and system | string | no |
type_start_utt | valid values of type_start_utt include user, system and pacifier | string | no |
type_end_utt | valid values of type_end_utt include user, system and pacifier | string | no |
type_prompt | indicates the system is prompting for a key. the value of type_prompt is the key being prompted | string | no |
GC_DATA | see GC_DATA | GC_DATA | no |
<GC_EVENT etype="LOCK" server="audio" location="localhost:15000" turnid="-1"
time="941473396.19" name=":hub_get_session_lock" tidx="5"/>
Name | Description | Type | Required |
turnid | the turn id to which this annotation applies | number | no |
tidx | the token index associated with the event | number | no |
type_task_completion | human annotation indicating whether the task was successfully completed or not | string | no |
GC_DATA | see GC_DATA | GC_DATA | no |
<GC_ANNOT type_task_completion="1"/>
<GC_ANNOT turnid="2" tidx="129">
<GC_DATA type_utt_text="transcription" dtype="string">
i'd like a flight from boston to san francisco
</GC_DATA>
</GC_ANNOT>
Name | Description | Type | Required |
key | the name of this data point | string | yes |
turnid | the turn id that this operation was executed under | number | no |
time | time stamp for this data point | seconds (float) | no |
type | valid values of type include audio_input, audio_output, text_input, text_output, text_input_hypothesis, and concept. See the Content section. | string | no |
mime_type | the mime type of the data | string | no |
direction | valid values are in and out | string | no |
dtype | the data type - valid values include integer, string, etc. (full list of values TBD) | string | no |
type_utt_text | valid values of type_utt_text are transcription, system and asr | string | no |
type_error_msg | valid value is true | string | no |
type_help_msg | valid value is true | string | no |
type_task_completion | human annotation indicating whether the task was successfully completed or not | string | no |
GC_FRAME | see GC_FRAME | GC_FRAME | no |
GC_LIST | see GC_LIST | GC_LIST | no |
<GC_DATA key=":reply_string" dtype="string">
Hi! Welcome to Mitre's Travel demonstration. This call is being recorded for
system development. You may hang up or ask for help at any time. How can I
help you?
</GC_DATA>
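Several of the data items listed earlier are "accessed given an arbitrary search of the logged data" by means of the type attribute of GC_DATA. One possible way to perform such a search is sketched below. The function name `find_data` is hypothetical, and the sample turn (including the server name, operation name and key) is invented for illustration rather than taken from a real log.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<GC_TURN id="1" stime="941473400.10" etime="941473405.30">
  <GC_OPERATION name="recognize" server="recognizer" location="localhost:12000"
                turnid="1" stime="941473400.10" etime="941473401.90">
    <GC_DATA key=":input_string" type="text_input" dtype="string">
      i'd like a flight from boston to san francisco
    </GC_DATA>
  </GC_OPERATION>
</GC_TURN>
""".strip()


def find_data(root, wanted_type):
    """Return (key, text) for every GC_DATA whose type attribute lists wanted_type."""
    hits = []
    for data in root.iter("GC_DATA"):
        # The DTD declares type as NMTOKENS, so it may hold several
        # space-separated values.
        if wanted_type in data.get("type", "").split():
            hits.append((data.get("key"), (data.text or "").strip()))
    return hits


root = ET.fromstring(SAMPLE)
text_inputs = find_data(root, "text_input")
text_outputs = find_data(root, "text_output")
print(text_inputs)
```

Because `iter` walks the whole subtree, the same call works on a turn, a session, or a complete log.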
Name | Description | Type | Required |
frame_type | Galaxy frame type | string | no |
name | the name of the frame | string | no |
turnid | the turn id in which this frame appears | number | no |
GC_DATA | see GC_DATA | GC_DATA | no |
<GC_DATA key=":rec_scores">
<GC_FRAME name="scores" frame_type="clause">
<GC_DATA key=":acoustic_score" dtype="string">
"-617.9270"
</GC_DATA>
<GC_DATA key=":ngram_score" dtype="string">
"-17.4465"
</GC_DATA>
<GC_DATA key=":nwords" dtype="integer">
8
</GC_DATA>
<GC_DATA key=":total_score" dtype="string">
"-651.3735"
</GC_DATA>
<GC_DATA key=":nphones" dtype="integer">
36
</GC_DATA>
</GC_FRAME>
</GC_DATA>
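A nested frame such as the one above maps naturally onto a dictionary. The sketch below is one way a log reader might do this; `data_value` is a hypothetical helper, and stripping the surrounding double quotes from string values is an assumption about how a consumer would want them, not something the format requires.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<GC_DATA key=":rec_scores">
  <GC_FRAME name="scores">
    <GC_DATA key=":acoustic_score" dtype="string">
      "-617.9270"
    </GC_DATA>
    <GC_DATA key=":nwords" dtype="integer">
      8
    </GC_DATA>
    <GC_DATA key=":total_score" dtype="string">
      "-651.3735"
    </GC_DATA>
  </GC_FRAME>
</GC_DATA>
""".strip()


def data_value(data):
    """Convert a GC_DATA element to a Python value, recursing into GC_FRAME."""
    frame = data.find("GC_FRAME")
    if frame is not None:
        # A frame becomes a dict keyed by the key attribute of each child.
        return {child.get("key"): data_value(child) for child in frame.findall("GC_DATA")}
    text = (data.text or "").strip()
    if data.get("dtype") == "integer":
        return int(text)
    return text.strip('"')  # assumption: drop the literal quotes around strings


scores = data_value(ET.fromstring(SAMPLE))
print(scores)
```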
Name | Description | Type | Required |
name | the name of the list | string | no |
turnid | the turn id in which this list appears | number | no |
GC_DATA | see GC_DATA | GC_DATA | no |
<GC_DATA key=":nbest_list">
<GC_LIST name=":nbest_list">
<GC_DATA key=":nbest_list[0]" dtype="string">
can i get this american flight
</GC_DATA>
<GC_DATA key=":nbest_list[1]" dtype="string">
can i get this american difference
</GC_DATA>
<GC_DATA key=":nbest_list[2]" dtype="string">
can i did this american difference
</GC_DATA>
<GC_DATA key=":nbest_list[3]" dtype="string">
can i get this american that flight
</GC_DATA>
</GC_LIST>
</GC_DATA>
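Reading a GC_LIST back into an ordered list of hypotheses is equally direct. The following sketch assumes that document order reflects list order; the function name `list_items` is hypothetical.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<GC_DATA key=":nbest_list">
  <GC_LIST name=":nbest_list">
    <GC_DATA key=":nbest_list[0]" dtype="string">
      can i get this american flight
    </GC_DATA>
    <GC_DATA key=":nbest_list[1]" dtype="string">
      can i get this american difference
    </GC_DATA>
    <GC_DATA key=":nbest_list[2]" dtype="string">
      can i did this american difference
    </GC_DATA>
    <GC_DATA key=":nbest_list[3]" dtype="string">
      can i get this american that flight
    </GC_DATA>
  </GC_LIST>
</GC_DATA>
""".strip()


def list_items(data):
    """Return the text of each GC_DATA inside the GC_LIST child, in document order."""
    gc_list = data.find("GC_LIST")
    if gc_list is None:
        return []
    return [(item.text or "").strip() for item in gc_list.findall("GC_DATA")]


hypotheses = list_items(ET.fromstring(SAMPLE))
for rank, sentence in enumerate(hypotheses):
    print(rank, sentence)
```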
<?xml version="1.0"?>
<!ELEMENT GC_LOG (GC_SESSION)*>
<!ATTLIST GC_LOG logfile_version CDATA #IMPLIED>
<!ATTLIST GC_LOG human_annotations_included NMTOKEN #IMPLIED>
<!ELEMENT GC_SESSION ( GC_TURN | GC_ANNOT )*>
<!ATTLIST GC_SESSION id NMTOKEN #REQUIRED>
<!ATTLIST GC_SESSION stime CDATA #REQUIRED>
<!ATTLIST GC_SESSION etime CDATA #REQUIRED>
<!ELEMENT GC_TURN ( GC_ANNOT | GC_OPERATION | GC_MESSAGE | GC_EVENT )*>
<!ATTLIST GC_TURN id NMTOKEN #REQUIRED>
<!ATTLIST GC_TURN stime NMTOKEN #REQUIRED>
<!ATTLIST GC_TURN etime NMTOKEN #REQUIRED>
<!ELEMENT GC_ANNOT (GC_DATA)*>
<!-- GC_ANNOT can have a sequence of one or more GC_DATA tags
or it can be empty -->
<!ATTLIST GC_ANNOT type_task_completion CDATA #IMPLIED>
<!ATTLIST GC_ANNOT turnid NMTOKEN #IMPLIED>
<!ATTLIST GC_ANNOT tidx NMTOKEN #IMPLIED>
<!ELEMENT GC_OPERATION (GC_DATA)*>
<!ATTLIST GC_OPERATION type NMTOKENS #IMPLIED>
<!ATTLIST GC_OPERATION turnid NMTOKEN #REQUIRED>
<!ATTLIST GC_OPERATION server CDATA #REQUIRED>
<!ATTLIST GC_OPERATION location NMTOKEN #REQUIRED>
<!ATTLIST GC_OPERATION name CDATA #REQUIRED>
<!ATTLIST GC_OPERATION tidx NMTOKEN #IMPLIED>
<!ATTLIST GC_OPERATION reply_type CDATA #IMPLIED>
<!ATTLIST GC_OPERATION reply_status CDATA #IMPLIED>
<!ATTLIST GC_OPERATION stime CDATA #REQUIRED>
<!ATTLIST GC_OPERATION etime CDATA #REQUIRED>
<!ATTLIST GC_OPERATION type_start_task CDATA #IMPLIED>
<!ATTLIST GC_OPERATION type_end_task CDATA #IMPLIED>
<!ATTLIST GC_OPERATION type_new_turn CDATA #IMPLIED>
<!ATTLIST GC_OPERATION type_start_utt CDATA #IMPLIED>
<!ATTLIST GC_OPERATION type_end_utt CDATA #IMPLIED>
<!ATTLIST GC_OPERATION type_prompt CDATA #IMPLIED>
<!ELEMENT GC_MESSAGE (GC_DATA)*>
<!ATTLIST GC_MESSAGE type NMTOKENS #IMPLIED>
<!ATTLIST GC_MESSAGE turnid NMTOKEN #REQUIRED>
<!ATTLIST GC_MESSAGE server CDATA #REQUIRED>
<!ATTLIST GC_MESSAGE location NMTOKEN #REQUIRED>
<!ATTLIST GC_MESSAGE name CDATA #REQUIRED>
<!ATTLIST GC_MESSAGE direction NMTOKEN #REQUIRED>
<!ATTLIST GC_MESSAGE tidx NMTOKEN #IMPLIED>
<!ATTLIST GC_MESSAGE reply_type CDATA #IMPLIED>
<!ATTLIST GC_MESSAGE reply_status CDATA #IMPLIED>
<!ATTLIST GC_MESSAGE time CDATA #REQUIRED>
<!ATTLIST GC_MESSAGE type_start_task CDATA #IMPLIED>
<!ATTLIST GC_MESSAGE type_end_task CDATA #IMPLIED>
<!ATTLIST GC_MESSAGE type_new_turn CDATA #IMPLIED>
<!ATTLIST GC_MESSAGE type_start_utt CDATA #IMPLIED>
<!ATTLIST GC_MESSAGE type_end_utt CDATA #IMPLIED>
<!ATTLIST GC_MESSAGE type_prompt CDATA #IMPLIED>
<!ELEMENT GC_EVENT (GC_DATA)*>
<!ATTLIST GC_EVENT etype NMTOKEN #REQUIRED>
<!ATTLIST GC_EVENT turnid NMTOKEN #REQUIRED>
<!ATTLIST GC_EVENT server CDATA #IMPLIED>
<!ATTLIST GC_EVENT location NMTOKEN #IMPLIED>
<!ATTLIST GC_EVENT time CDATA #REQUIRED>
<!ATTLIST GC_EVENT name CDATA #REQUIRED>
<!ATTLIST GC_EVENT tidx NMTOKEN #IMPLIED>
<!ATTLIST GC_EVENT type_start_task CDATA #IMPLIED>
<!ATTLIST GC_EVENT type_end_task CDATA #IMPLIED>
<!ATTLIST GC_EVENT type_new_turn CDATA #IMPLIED>
<!ATTLIST GC_EVENT type_start_utt CDATA #IMPLIED>
<!ATTLIST GC_EVENT type_end_utt CDATA #IMPLIED>
<!ATTLIST GC_EVENT type_prompt CDATA #IMPLIED>
<!ELEMENT GC_DATA ANY>
<!ATTLIST GC_DATA key CDATA #REQUIRED>
<!ATTLIST GC_DATA type NMTOKENS #IMPLIED>
<!-- mime_type is CDATA because MIME types contain "/", which is not
a legal NMTOKEN character -->
<!ATTLIST GC_DATA mime_type CDATA #IMPLIED>
<!ATTLIST GC_DATA direction NMTOKEN #IMPLIED>
<!ATTLIST GC_DATA dtype NMTOKEN #IMPLIED>
<!ATTLIST GC_DATA time CDATA #IMPLIED>
<!ATTLIST GC_DATA turnid NMTOKEN #IMPLIED>
<!ATTLIST GC_DATA type_utt_text CDATA #IMPLIED>
<!ATTLIST GC_DATA type_error_msg CDATA #IMPLIED>
<!ATTLIST GC_DATA type_help_msg CDATA #IMPLIED>
<!ATTLIST GC_DATA type_task_completion CDATA #IMPLIED>
<!ELEMENT GC_FRAME (GC_DATA)*>
<!ATTLIST GC_FRAME frame_type NMTOKEN #IMPLIED>
<!ATTLIST GC_FRAME name CDATA #IMPLIED>
<!ATTLIST GC_FRAME turnid NMTOKEN #IMPLIED>
<!ELEMENT GC_LIST (GC_DATA)*>
<!ATTLIST GC_LIST name CDATA #IMPLIED>
<!ATTLIST GC_LIST turnid NMTOKEN #IMPLIED>
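Python's standard xml.etree.ElementTree module does not validate documents against a DTD, so a validating parser is needed for a full check. As a lightweight stand-in, the sketch below checks only that the attributes declared #REQUIRED above are present. The `REQUIRED` table is copied by hand from the ATTLIST declarations, `missing_required` is a hypothetical helper, and the sample log deliberately omits one attribute.

```python
import xml.etree.ElementTree as ET

# Attributes declared #REQUIRED in the DTD, copied by hand.
REQUIRED = {
    "GC_SESSION": ("id", "stime", "etime"),
    "GC_TURN": ("id", "stime", "etime"),
    "GC_OPERATION": ("turnid", "server", "location", "name", "stime", "etime"),
    "GC_MESSAGE": ("turnid", "server", "location", "name", "direction", "time"),
    "GC_EVENT": ("etype", "turnid", "time", "name"),
    "GC_DATA": ("key",),
}

SAMPLE = """
<GC_LOG>
  <GC_SESSION id="129.10.2.200:1010:3" stime="930254422.720000">
    <GC_TURN id="-01" stime="930254422.720000" etime="930254424.790000">
      <GC_EVENT etype="LOCK" turnid="-1" time="941473396.19"
                name=":hub_get_session_lock"/>
    </GC_TURN>
  </GC_SESSION>
</GC_LOG>
""".strip()


def missing_required(root):
    """List (tag, attribute) pairs for every required attribute that is absent."""
    problems = []
    for elem in root.iter():
        for attr in REQUIRED.get(elem.tag, ()):
            if attr not in elem.attrib:
                problems.append((elem.tag, attr))
    return problems


problems = missing_required(ET.fromstring(SAMPLE))
print(problems)  # the sample session has no etime attribute
```

This check ignores content models and attribute types (NMTOKEN versus CDATA), which only a real DTD validator would enforce.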