The Speech Recognition Message Set defines the Hub messages for speech recognition usage. The message set consists of messages to send to a server that contains a speech recognition engine.
The Hub uses speech recognition engines that support the SAPI and JSAPI message sets as defined in the speech recognition standards. Message wrappers translate the Hub messages into JSAPI and SAPI messages. The speech recognition JSAPI implementation describes the speech recognition features that could be supported by a JSAPI implementation using IBM ViaVoice.
If other speech recognition message sets need to be supported by the Hub, please send a message to the MITRE Communicator team at bugs-darpacomm@linus.mitre.org.
This Speech Recognition Message Set is part of the Speech Message Set.
Command Line
Default Port
15560
Command Line Arguments
None
reinitialize: The parameters of the reinitialize message initialize the recognition engine on the recognition server. The specific meaning of the parameters depends on the implementation of the recognition engine. A reinitialize message sent after a previous reinitialize message changes the parameters the recognition engine uses. A specific recognition engine, however, may use the reinitialize message to initialize itself again.
Please note that actual usage of these parameters depends on the recognizer
engine, since the engine must provide the support.
| | Parameter | Type | Optional | Depends on | Description |
| IN: | :audio_source_type | GAL_STRING | | | Describes where the input is coming from. One of "brokering", "file", "microphone", or "telephone". |
| | :audio_source_param | GAL_STRING | | :audio_source_type | The name of the file or the port number (for brokering, microphone, or telephone) to use to receive the audio samples. |
| | :automatic_gain | boolean | yes | | A value of true causes the recognition engine to automatically adapt to background noise. |
| | :energy_floor | GAL_FLOAT | yes | | The minimum energy level for the recognition engine to use. |
| | :language_model | GAL_STRING | | | The full path and filename of the language models for the recognition engine to load for recognition. The path is relative to the language model location on the recognition server. |
| | :sub_language_model | GAL_STRING | yes | :language_model | The name of a specific language model dialect. The name is specific to the recognition engine. |
| | :grammar_type | GAL_STRING | yes | | The format of the grammar file. One of "dictation" or "rule". |
| | :grammar_file | GAL_STRING | yes | :grammar_type | The full path and filename of the file that contains the grammar the recognition engine is to use for recognition. The path is relative to the grammar file location on the recognition server. |
| | :grammar_params | GAL_LIST | yes | :grammar_type | An array of name-value pairs that is specific to the particular :grammar_type and is passed to the recognition engine. These name-value pairs are not used by the Hub and cannot appear as a Hub rule condition. |
| | :n_best | GAL_INT | yes | | The maximum number of possible recognition results the recognition engine returns in the Results message. The recognition engine, however, may return fewer than N results. |
| | :noise_sensitivity | GAL_FLOAT | yes | | How sensitive the recognizer is to quiet input and to noise. A value of zero requires the user to speak loudly, but makes the recognizer less sensitive to background noise. The value is between zero and one. |
| | :pause_after_recognition | boolean | yes | | A value of true causes the recognition engine to automatically pause after a successful recognition (see the Pause message). |
| | :real_time_setting | GAL_FLOAT | yes | | How close to real time the recognition process should run, where a value of 1.0 indicates real time. (This parameter is a tradeoff between response time and recognition accuracy.) The value is between zero and one. |
| | :threshold | GAL_FLOAT | yes | | The minimum score that the recognition server requires for a successful recognition. The recognition engine returns all results that exceed the :threshold in the Results message. |
| | :timeout | GAL_INT | yes | | The number of seconds the recognition engine allows for pauses. |
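To make the parameter list more concrete, the sketch below assembles a reinitialize message as a Python dictionary and renders it in a text form modeled on the `{c ... }` notation this page uses for server messages. The helper names `render_value`, `render_frame`, and `build_reinitialize`, the rendering rules (quoted strings, 1/0 for booleans), and the sample values are assumptions made for illustration; they are not part of the Hub API, and a real recognition server may accept or ignore any of these parameters.

```python
# Hypothetical helpers for illustration only; not part of the Hub or any recognizer API.
AUDIO_SOURCE_TYPES = {"brokering", "file", "microphone", "telephone"}


def render_value(value):
    """Render one parameter value as text (assumed rules, not the Hub's own printer)."""
    if isinstance(value, bool):  # test bool first, because bool is a subclass of int
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, (list, tuple)):
        return "( " + " ".join(render_value(v) for v in value) + " )"
    return '"' + str(value) + '"'


def render_frame(name, params):
    """Render a message name and its parameters in a {c name :key value } style."""
    parts = ["{c", name]
    for key, value in params.items():
        parts.append(key)
        parts.append(render_value(value))
    parts.append("}")
    return " ".join(parts)


def build_reinitialize(audio_source_type, audio_source_param, language_model, **optional):
    """Collect reinitialize parameters; parameters not marked optional are required here."""
    if audio_source_type not in AUDIO_SOURCE_TYPES:
        raise ValueError("unknown :audio_source_type: " + repr(audio_source_type))
    # The text states that both of these values lie between zero and one.
    for name in ("noise_sensitivity", "real_time_setting"):
        value = optional.get(name)
        if value is not None and not 0.0 <= value <= 1.0:
            raise ValueError(":" + name + " must be between zero and one")
    params = {
        ":audio_source_type": audio_source_type,
        ":audio_source_param": audio_source_param,
        ":language_model": language_model,
    }
    for name, value in optional.items():
        params[":" + name] = value
    return params


message = build_reinitialize("file", "utt01.wav", "models/en", n_best=3, automatic_gain=True)
print(render_frame("reinitialize", message))
```

Validating the documented value ranges before sending keeps obviously malformed requests away from the engine, although the engine itself remains the final authority on which parameters it supports.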
Configure: The parameters of the Configure message specify which messages to receive, or not to receive, from a recognition server. Setting a parameter overrides its previous setting, and parameters that are not specified retain their current values. Note that not all recognition engines within a recognition server support all of the Configure message parameters.
| | Parameter | Type | Optional | Depends on | Description |
| IN: | :audio_samples | boolean | yes | | A value of true causes the recognition server to issue the Audio_Samples message that contains the spoken audio samples. |
| | :background_noise_level | boolean | yes | :notify_status | A value of true causes the recognition server to issue the Background_Noise_Level message when the background noise level changes. |
| | :current_byte | boolean | yes | | A value of true causes the recognition server to issue the Current_Byte message to indicate the audio stream byte the recognizer is processing. |
| | :interference | boolean | yes | :notify_status | A value of true causes the recognition server to issue the Interference message to indicate that the recognition engine could not obtain a satisfactory recognition. Some reasons for the lack of a recognition result are garbled input, whispered speech (volume level too low), shouted speech (volume level too high), noisy speech, and sound. |
| | :notify_attribute_changes | boolean | yes | :notify_status | A value of true causes the recognition server to issue the Attribute_Changed message when an internal recognition engine attribute changes value. |
| | :notify_status | boolean | yes | | A value of true causes the recognizer to send the Recognition_Event message to indicate different recognition events. This flag also controls whether the Attribute_Changed, Background_Noise_Level, Interference, Phrase, Speech_Started, Speech_Stopped, and Volume_Limits messages are sent. (Default value is true.) |
| | :start_end_speech | boolean | yes | :notify_status | A value of true causes the recognition server to issue the Speech_Started and Speech_Stopped messages. The Speech_Started message indicates that the recognition server detects the beginning of speech. The Speech_Stopped message indicates that the recognition server detects the end of speech. |
| | :volume_limits | boolean | yes | :notify_status | A value of true causes the recognition server to periodically issue the Volume_Limits message to indicate the current volume setting level. |
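The override rule and the :notify_status dependency described above can be sketched as a small model. This is an illustration only: the function names `apply_configure` and `enabled_messages` are hypothetical, and the sketch assumes that every flag other than :notify_status defaults to false, which the text does not state.

```python
# Hypothetical model of Configure semantics; the defaults of all flags except
# :notify_status are assumptions, since only that default is documented.
GATED_BY_NOTIFY_STATUS = {
    ":background_noise_level": ["Background_Noise_Level"],
    ":interference": ["Interference"],
    ":notify_attribute_changes": ["Attribute_Changed"],
    ":start_end_speech": ["Speech_Started", "Speech_Stopped"],
    ":volume_limits": ["Volume_Limits"],
}
INDEPENDENT = {
    ":audio_samples": ["Audio_Samples"],
    ":current_byte": ["Current_Byte"],
}


def apply_configure(current, message):
    """Return the new settings: given parameters override, the rest keep their values."""
    updated = dict(current)
    updated.update(message)
    return updated


def enabled_messages(config):
    """List the server messages that the settings in config would enable."""
    enabled = set()
    for flag, names in INDEPENDENT.items():
        if config.get(flag, False):
            enabled.update(names)
    if config.get(":notify_status", True):  # documented default is true
        enabled.add("Recognition_Event")
        for flag, names in GATED_BY_NOTIFY_STATUS.items():
            if config.get(flag, False):
                enabled.update(names)
    return enabled


config = apply_configure({}, {":start_end_speech": True, ":audio_samples": True})
print(sorted(enabled_messages(config)))
# A later Configure message that only disables :notify_status leaves the other flags set.
config = apply_configure(config, {":notify_status": False})
print(sorted(enabled_messages(config)))
```

Keeping the stored flags separate from the derived set of enabled messages mirrors the text: turning :notify_status off suppresses the dependent messages without erasing the individual settings.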
Grammar_Control: The parameters of the Grammar_Control message allow modifications of the grammar on the recognition server. The :control_type parameter indicates the type of modification. This message may be sent repeatedly, and new values may update existing values. The parameters are as follows:
| | Parameter | Type | Optional | Depends on | Description |
| IN: | :control_type | GAL_STRING | | | The modification to make to the grammar on the recognition server. The value must be one of "add" or "delete". |
| | :words | GAL_LIST | yes | :control_type = "add", "delete" | An array of strings that contains each word to add to, or delete from, the grammar. |
| | :phones | GAL_LIST | yes | :control_type = "add" | An array of phonetic strings to use to pronounce the corresponding words. |
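As a minimal sketch of the add and delete behavior described above, the following function applies one Grammar_Control message to an in-memory vocabulary. The function name `apply_grammar_control`, the dictionary representation of the grammar, and the phone strings in the usage lines are assumptions for illustration; an actual recognition engine stores and updates its grammar in its own way.

```python
def apply_grammar_control(vocabulary, control_type, words, phones=None):
    """Apply one Grammar_Control message to a {word: pronunciation} mapping.

    Hypothetical model: returns a new mapping and leaves the input untouched.
    """
    if control_type not in ("add", "delete"):
        raise ValueError('control_type must be "add" or "delete"')
    updated = dict(vocabulary)
    if control_type == "add":
        if phones is not None and len(phones) != len(words):
            raise ValueError("phones must have one entry per word")
        for index, word in enumerate(words):
            # A repeated "add" replaces the stored pronunciation for that word.
            updated[word] = phones[index] if phones is not None else None
    else:
        for word in words:
            updated.pop(word, None)  # deleting an unknown word is a no-op here
    return updated


vocabulary = apply_grammar_control({}, "add", ["boston", "denver"], ["b ao s t ax n", "d eh n v er"])
vocabulary = apply_grammar_control(vocabulary, "delete", ["denver"])
print(vocabulary)
```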
Pause: The Pause
message pauses the recognition engine on the recognition server until the
recognition server receives the Resume message.
Some audio bytes could be lost when pausing the recognition server.
Resume: The Resume message resumes a recognition engine that is currently paused (see the Pause message or the :pause_after_recognition parameter of the reinitialize message).
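The interaction between Pause, Resume, and :pause_after_recognition can be pictured as a two-state machine. The class below is a sketch under stated assumptions: the name `RecognizerState` and its methods are hypothetical, and the choice to discard audio received while paused is only one way to model the remark that some audio bytes could be lost.

```python
class RecognizerState:
    """Hypothetical model of the paused/running state of a recognition engine."""

    def __init__(self, pause_after_recognition=False):
        self.paused = False
        self.pause_after_recognition = pause_after_recognition
        self.dropped_bytes = 0  # audio discarded while paused (modeling assumption)

    def handle(self, message_name):
        """React to a Pause or Resume message; other message names are ignored."""
        if message_name == "Pause":
            self.paused = True
        elif message_name == "Resume":
            self.paused = False

    def on_audio(self, chunk):
        """Return True if the chunk would be processed, False if it is dropped."""
        if self.paused:
            self.dropped_bytes += len(chunk)
            return False
        return True

    def on_recognition(self):
        """Called after a successful recognition."""
        if self.pause_after_recognition:
            self.paused = True


state = RecognizerState(pause_after_recognition=True)
state.on_recognition()        # the engine pauses itself after a result
state.on_audio(b"\x00\x01")   # dropped while paused
state.handle("Resume")        # the engine accepts audio again
print(state.paused, state.dropped_bytes)
```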
The recognition server issues the following messages unless the :send_notifications parameter of the Configure message is disabled.
Attribute_Changed: The recognition server issues the Attribute_Changed message when certain recognition engine parameters change value. The :notify_attribute_changes parameter of the Configure message controls whether the recognition server issues the Attribute_Changed message. The name and value of the attribute depend on the specific recognition engine. The :notify_status flag of the Configure message must also be set to receive this message.
{c Attribute_Changed .... }
| | Parameter | Type | Optional | Depends on | Description |
| OUT: | :attribute_name | GAL_STRING | | | The name of the attribute. |
| | :attribute_value_new | GAL_STRING | | | The new value of the attribute. |
| | :attribute_value_old | GAL_STRING | | | The old value of the attribute. |
Audio_Samples: The recognition server issues the Audio_Samples message when the :audio_samples parameter of the Configure message requests the audio samples.
{c Audio_Samples ... }
| | Parameter | Type | Optional | Depends on | Description |
| OUT: | :num_samples | GAL_INT | | | The number of audio samples. |
| | :sample_byte_size | GAL_INT | | | The number of bytes corresponding to one audio sample. |
| | :samples | byte array | | | The audio samples that the speaker uttered. The number of bytes is equal to :num_samples times :sample_byte_size. |
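The size relationship among the three Audio_Samples parameters lends itself to a quick consistency check on the receiving side. The sketch below verifies that the byte array has :num_samples times :sample_byte_size bytes and then unpacks it. The function name `decode_audio_samples` is hypothetical, and the sample encoding (signed, little-endian integers of 1, 2, or 4 bytes) is an assumption, because the message set does not specify one.

```python
import struct

# Assumed encodings: signed integers of 1, 2, or 4 bytes (not specified by the message set).
STRUCT_CODES = {1: "b", 2: "h", 4: "i"}


def decode_audio_samples(num_samples, sample_byte_size, samples):
    """Check the documented size relation, then unpack the samples into a list of ints."""
    expected = num_samples * sample_byte_size
    if len(samples) != expected:
        raise ValueError(
            "expected %d bytes (%d samples x %d bytes), got %d"
            % (expected, num_samples, sample_byte_size, len(samples))
        )
    if sample_byte_size not in STRUCT_CODES:
        raise ValueError("unsupported sample size: %d" % sample_byte_size)
    # "<" selects little-endian byte order with standard sizes (assumption).
    layout = "<" + str(num_samples) + STRUCT_CODES[sample_byte_size]
    return list(struct.unpack(layout, samples))


payload = struct.pack("<3h", 1, -2, 300)
print(decode_audio_samples(3, 2, payload))
```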
Background_Noise_Level: The recognition process issues this message when a request has been made using the Configure message to receive notification of when the background noise level changes. The :notify_status flag of the Configure message must also be set.
{c Background_Noise_Level ... }
| | Parameter | Type | Optional | Depends on | Description |
| OUT: | :noise_level | GAL_INT | | | The current background noise level value. |
Current_Byte: The recognition process issues this message when a request has been made using the Configure message to receive notification of the current audio stream byte being processed. The frequency of this message depends on the recognizer.
{c Current_Byte ... }
| | Parameter | Type | Optional | Depends on | Description |
| OUT: | :current_byte | GAL_INT | | | The current audio byte being processed within the recognizer. |
Interference: The recognition process issues this message when a request has been made using the Configure message to receive notification when interference prevents a successful recognition. The notify_status flag of the Configure message must also be set. Examples of interference are garbled input, whispered speech (volume level too low), shouted speech (volume level too high), noisy speech, and sound.
{c Interference ... }
| | Parameter | Type | Optional | Depends on | Description |
| OUT: | :interference | GAL_STRING | yes | | Describes the interference. One of "garbled", "whispered", "shouted", "noisy", or "sound". |
Recognition_Event: The recognition process issues this message when an event unique to the recognizer occurs, provided that the :notify_status parameter has been enabled with the Configure message. The actual value of the :event_name parameter depends on the recognizer.
{c Recognition_Event ... }
| | Parameter | Type | Optional | Depends on | Description |
| OUT: | :event_name | GAL_STRING | | | The value describes the type of event and depends on the recognizer being used. |
Results: The recognition process issues this message when a recognition result is available. The parameters of the reinitialize message control the contents of the Results message. More than one result may be returned when the reinitialize message asks for the :n_best results or for results that exceed a given :threshold.
{c Results ... }
| | Parameter | Type | Optional | Depends on | Description |
| OUT: | :score | GAL_LIST | | | A list of scores, with each score as a float. |
| | :results | GAL_LIST | | | A list of results, with each result as a string. |
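A client that receives a Results message usually wants the hypotheses paired with their scores. The sketch below does that pairing and optionally re-applies a threshold and an N-best cutoff on the client side. It rests on two assumptions that the text does not state: that the :score and :results lists are parallel (entry i of one belongs to entry i of the other) and that a higher score is better. The function name `rank_results` is hypothetical.

```python
def rank_results(scores, results, threshold=None, n_best=None):
    """Pair each result with its score and return (text, score) tuples, best first.

    Assumes parallel lists and higher-is-better scores (not stated by the message set).
    """
    if len(scores) != len(results):
        raise ValueError("scores and results must have the same length")
    pairs = sorted(zip(scores, results), key=lambda pair: pair[0], reverse=True)
    if threshold is not None:
        # Keep only results whose score exceeds the threshold.
        pairs = [pair for pair in pairs if pair[0] > threshold]
    if n_best is not None:
        pairs = pairs[:n_best]
    return [(text, score) for score, text in pairs]


ranked = rank_results(
    [0.4, 0.9, 0.7],
    ["fly to austin", "fly to boston", "fly to bostin"],
    threshold=0.5,
    n_best=2,
)
print(ranked)
```

Because the engine may return fewer than N results, client code should not index into the ranked list without checking its length first.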
Speech_Started: The recognition process issues this message when a request has been made using the Configure message to receive notification of when the recognizer detects the beginning of speech. The notify_status flag of the Configure message must also be set.
{c Speech_Started ... }
Speech_Stopped: The recognition process issues this message when a request has been made using the Configure message to receive notification of when the recognizer detects the end of speech. The notify_status flag of the Configure message must also be set.
{c Speech_Stopped ... }
Volume_Limits: The recognition process issues this message when a request has been made using the Configure message to receive periodic notifications of the current volume setting. This message, for example, allows the application to provide feedback that a microphone is on or how loudly a user is speaking. The :notify_status flag of the Configure message must also be set.
{c Volume_Limits ... }
| | Parameter | Type | Optional | Depends on | Description |
| OUT: | :volume | GAL_INT | | | The volume level setting. |
Copyright (c) 2000
The MITRE
Corporation
ALL RIGHTS RESERVED