Speech Recognition Communicator Message Set

The Speech Recognition Message Set defines the Hub messages for speech recognition usage. The message set consists of messages to send to a server that contains a speech recognition engine.

The Hub uses speech recognition engines that support the SAPI and JSAPI message sets as defined in the speech recognition standards. Message wrappers translate the Hub messages into JSAPI and SAPI messages. The speech recognition JSAPI implementation describes the speech recognition features that could be supported by a JSAPI implementation using IBM ViaVoice.

If other speech recognition message sets need to be supported by the Hub, please send a message to the MITRE Communicator team at bugs-darpacomm@linus.mitre.org.

This Speech Recognition Message Set is part of the Speech Message Set.

Usage

Command Line

Default Port

15560

Command Line Arguments

None

Message Set

reinitialize: The parameters of the reinitialize message initialize the recognition server engine. The specific meaning of the parameters depends on the implementation of the recognition engine. A reinitialize message sent after a previous reinitialize message changes the parameters that the recognition engine uses. A specific recognition engine, however, may use the reinitialize message to initialize itself again.

Please note that actual usage of these parameters depends on the recognizer engine, since the engine must provide the support.

	Parameter	Type	Optional	Depends on	Description / Constraints
IN:	:audio_source_type	GAL_STRING			Describes where the input is coming from: <"brokering" | "file" | "microphone" | "telephone">
	:audio_source_param	GAL_STRING		:audio_source_type	The name of the file, or the port number (for brokering, microphone, or telephone), to use to receive the audio samples.
	:automatic_gain	boolean	yes		A value of true causes the recognition engine to adapt automatically to background noise.
	:energy_floor	GAL_FLOAT	yes		The minimum energy level for the recognition engine to use.
	:language_model	GAL_STRING			The full path and file name of the language models for the recognition engine to load for recognition. The path is relative to the language model location on the recognition server.
	:sub_language_model	GAL_STRING	yes	:language_model	The name of a specific language model dialect. The name is specific to the recognition engine.
	:grammar_type	GAL_STRING	yes		The format of the grammar file: <"dictation" | "rule">
	:grammar_file	GAL_STRING	yes	:grammar_type	The full path and file name of the file that contains the grammar the recognition engine is to use for recognition. The path is relative to the grammar file location on the recognition server.
	:grammar_params	GAL_LIST	yes	:grammar_type	An array of name-value pairs that is specific to the particular :grammar_type and is passed to the recognition engine. These name-value pairs are not used by the Hub and cannot appear as a Hub rule condition.
	:n_best	GAL_INT	yes		The maximum number of possible recognition results the recognition engine returns in the Results message. The recognition engine, however, may return fewer than N results.
	:noise_sensitivity	GAL_FLOAT	yes		How sensitive the recognizer is to quiet input and to noise. A value of zero requires the user to speak loudly, but makes the recognizer less sensitive to background noise. The value is between zero and one.
	:pause_after_recognition	boolean	yes		A value of true causes the recognition engine to pause automatically after a successful recognition (see the Pause message).
	:real_time_setting	GAL_FLOAT	yes		How close to real time the recognition process should run, where a value of 1.0 indicates real time. (This parameter is a tradeoff between response time and recognition accuracy.) The value is between zero and one.
	:threshold	GAL_FLOAT	yes		The minimum threshold that the recognition server requires for a successful recognition. The recognizer engine returns all results that exceed the :threshold in the Results message.
	:timeout	GAL_INT	yes		The number of seconds the recognition engine allows for pauses.
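
The constraints in the table above can be checked before a reinitialize message is sent. The following is a minimal sketch in Python and is not part of the Communicator distribution: a plain dict stands in for a frame, the function name build_reinitialize and its keyword-argument convention are hypothetical, and treating every parameter not marked optional as required is an assumption.

```python
AUDIO_SOURCE_TYPES = {"brokering", "file", "microphone", "telephone"}
GRAMMAR_TYPES = {"dictation", "rule"}
# Parameters documented as "between zero and one".
UNIT_RANGE_KEYS = (":noise_sensitivity", ":real_time_setting")


def build_reinitialize(audio_source_type, audio_source_param, language_model, **optional):
    """Return a dict standing in for a reinitialize frame, or raise ValueError."""
    if audio_source_type not in AUDIO_SOURCE_TYPES:
        raise ValueError(f"unknown :audio_source_type {audio_source_type!r}")
    frame = {
        ":audio_source_type": audio_source_type,
        ":audio_source_param": audio_source_param,
        ":language_model": language_model,
    }
    # Optional parameters are given as keyword arguments without the colon
    # (a convention of this sketch only).
    for name, value in optional.items():
        frame[":" + name] = value

    grammar_type = frame.get(":grammar_type")
    if grammar_type is not None and grammar_type not in GRAMMAR_TYPES:
        raise ValueError(f"unknown :grammar_type {grammar_type!r}")
    # :grammar_file and :grammar_params are listed as depending on :grammar_type.
    for dependent in (":grammar_file", ":grammar_params"):
        if dependent in frame and grammar_type is None:
            raise ValueError(f"{dependent} requires :grammar_type")
    for key in UNIT_RANGE_KEYS:
        if key in frame and not 0.0 <= frame[key] <= 1.0:
            raise ValueError(f"{key} must be between zero and one")
    return frame
```

For example, build_reinitialize("file", "utt.wav", "english.lm", n_best=3) yields a dict with four keys (the file names are made up). Whether a given engine honors each parameter still depends on that engine.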

Configure: The Configure message parameters specify which messages to receive, or not to receive, from a recognition server. The setting of a parameter overrides the previous setting of that parameter, and parameters that are not specified retain their current values. Note that not all recognition engines within a recognition server support all of the Configure message parameters.

	Parameter	Type	Optional	Depends on	Description / Constraints
IN:	:audio_samples	boolean	yes		A value of true causes the recognition server to issue the Audio_Samples message, which contains the spoken audio samples.
	:background_noise_level	boolean	yes	:notify_status	A value of true causes the recognition server to issue the Background_Noise_Level message when the background noise level changes.
	:current_byte	boolean	yes		A value of true causes the recognition server to issue the Current_Byte message to indicate the audio stream byte the recognizer is processing.
	:interference	boolean	yes	:notify_status	A value of true causes the recognition server to issue the Interference message to indicate that the recognition engine could not obtain a satisfactory recognition. Some reasons for the lack of a recognition result are garbled input, whispered speech (volume level too low), shouted speech (volume level too high), noisy speech, and sound.
	:notify_attribute_changes	boolean	yes	:notify_status	A value of true causes the recognition server to issue the Attribute_Changed message when an internal recognition engine attribute changes value.
	:notify_status	boolean	yes		A value of true causes the recognizer to send the Recognition_Event message to indicate different recognition events. This flag also controls whether the Attribute_Changed, Background_Noise_Level, Interference, Phrase, Speech_Started, Speech_Stopped, and Volume_Limits messages are sent. (Default value is true.)
	:start_end_speech	boolean	yes	:notify_status	A value of true causes the recognition server to issue the Speech_Started and Speech_Stopped messages. The Speech_Started message indicates that the recognition server detects the beginning of speech. The Speech_Stopped message indicates that the recognition server detects the end of speech.
	:volume_limits	boolean	yes	:notify_status	A value of true causes the recognition server to periodically issue the Volume_Limits message to indicate the current volume setting level.
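
The override behavior and the dependency on :notify_status described above can be sketched as follows. This is an illustration in Python, not Communicator code: dicts stand in for frames, the names FLAG_TABLE, DEFAULTS, apply_configure, and enabled_messages are hypothetical, and a default of false for every flag other than :notify_status is an assumption (only the :notify_status default is documented).

```python
# (Configure flag, messages it enables, whether :notify_status must also be true)
FLAG_TABLE = [
    (":audio_samples", ["Audio_Samples"], False),
    (":background_noise_level", ["Background_Noise_Level"], True),
    (":current_byte", ["Current_Byte"], False),
    (":interference", ["Interference"], True),
    (":notify_attribute_changes", ["Attribute_Changed"], True),
    (":notify_status", ["Recognition_Event"], False),
    (":start_end_speech", ["Speech_Started", "Speech_Stopped"], True),
    (":volume_limits", ["Volume_Limits"], True),
]

# Only the :notify_status default (true) is documented; False elsewhere is assumed.
DEFAULTS = {flag: False for flag, _, _ in FLAG_TABLE}
DEFAULTS[":notify_status"] = True


def apply_configure(current, message):
    """Return new settings: given parameters override, the rest are retained."""
    updated = dict(current)
    updated.update(message)
    return updated


def enabled_messages(settings):
    """List the messages the server would issue under the given settings."""
    notify = settings[":notify_status"]
    enabled = []
    for flag, messages, needs_notify in FLAG_TABLE:
        if settings[flag] and (notify or not needs_notify):
            enabled.extend(messages)
    return enabled
```

Sending {":volume_limits": True} and later {":notify_status": False} leaves :volume_limits set to true, but Volume_Limits is no longer listed because it depends on :notify_status.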

Grammar_Control: The parameters of the Grammar_Control message allow for modifications of the grammar on the recognition server. The parameter :control_type indicates the type of modification. This message may be sent repeatedly, and new values may update existing values. The current values for the :control_type are as follows:

"add" - Adds a list of words and their phonetic spellings to the existing grammar in the recognizer. This option depends on the parameters :words and :phones.
"delete" - Removes the words contained in the :words array from the grammar in the recognizer. This option depends on the parameter :words.

	Parameter	Type	Optional	Depends on	Description / Constraints
IN:	:control_type	GAL_STRING			The :control_type specifies the necessary modification to the grammar on the recognition server. The value must be one of: <"add" | "delete">
	:words	GAL_LIST	yes	:control_type = "add", "delete"	An array (of strings) that contains each word to add to, or delete from, the grammar.
	:phones	GAL_LIST	yes	:control_type = "add"	An array of phonetic strings, each giving the pronunciation of the corresponding word in :words.
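
The effect of repeated Grammar_Control messages can be sketched as follows. This is a minimal Python illustration under stated assumptions: the recognizer's vocabulary is modeled as a dict from word to phonetic string, the function name apply_grammar_control is hypothetical, the parameter is keyed as :control_type as in the description above, and the phonetic strings in the usage note are made up.

```python
def apply_grammar_control(vocabulary, message):
    """Return a new word -> phones mapping after applying one Grammar_Control message."""
    control = message.get(":control_type")
    words = message.get(":words", [])
    updated = dict(vocabulary)
    if control == "add":
        phones = message.get(":phones", [])
        # Each word needs a corresponding phonetic string.
        if len(words) != len(phones):
            raise ValueError(":words and :phones must have the same length")
        # Adding a word that already exists updates its pronunciation.
        updated.update(zip(words, phones))
    elif control == "delete":
        for word in words:
            updated.pop(word, None)  # words not in the grammar are ignored
    else:
        raise ValueError(f"unknown :control_type {control!r}")
    return updated
```

For example, an "add" message with :words ["boston"] and :phones ["b ao s t ax n"] followed by a "delete" message with :words ["boston"] returns the vocabulary to its starting contents.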

Pause: The Pause message pauses the recognition engine on the recognition server until the recognition server receives the Resume message. Some audio bytes could be lost when pausing the recognition server.

Parameter Type Optional Depends on Description / Constraints

Resume: The Resume message resumes a recognition engine that is currently paused (see the Pause message or the :pause_after_recognition parameter of the reinitialize message).

Parameter Type Optional Depends on Description / Constraints
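
The interaction between Pause, Resume, and the :pause_after_recognition parameter can be pictured as a two-state machine. The following Python sketch is an illustration only: the class RecognizerState and its method names are hypothetical, and modeling a paused engine as one that simply does not accept audio is an assumption about how audio bytes come to be lost.

```python
class RecognizerState:
    """Tracks whether a hypothetical recognition engine is paused."""

    def __init__(self, pause_after_recognition=False):
        self.pause_after_recognition = pause_after_recognition
        self.paused = False

    def handle(self, message_name):
        """Apply a Pause or Resume message; other message names are ignored."""
        if message_name == "Pause":
            self.paused = True
        elif message_name == "Resume":
            self.paused = False

    def on_recognition(self):
        """Called after a successful recognition."""
        if self.pause_after_recognition:
            self.paused = True

    def accepts_audio(self):
        """Audio that arrives while paused is not processed in this sketch."""
        return not self.paused
```

With pause_after_recognition set to true, the state becomes paused after each successful recognition and stays paused until a Resume message is handled.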

Messages Issued

The recognition server issues the following messages unless the :send_notifications parameter of the Configure message is disabled.

Attribute_Changed: The recognition server issues the Attribute_Changed message when certain recognition engine parameters change value. The :notify_attribute_changes parameter of the Configure message controls whether the recognition server issues the Attribute_Changed message. The name and value of the attribute depend on the specific recognition engine. The :notify_status flag of the Configure message must also be set to receive this message.

{c Attribute_Changed ... }

	Parameter	Type	Optional	Depends on	Description / Constraints
OUT:	:attribute_name	GAL_STRING			The name of the attribute.
	:attribute_value_new	GAL_STRING			The new value of the attribute.
	:attribute_value_old	GAL_STRING			The old value of the attribute.

Audio_Samples: The recognition server issues the Audio_Samples message when the :audio_samples parameter of the Configure message requests the audio samples.

{c Audio_Samples ... }

	Parameter	Type	Optional	Depends on	Description / Constraints
OUT:	:num_samples	GAL_INT			The number of audio samples.
	:sample_byte_size	GAL_INT			The number of bytes corresponding to one audio sample.
	:samples	byte array			The audio samples that the speaker uttered. The number of bytes is equal to :num_samples times :sample_byte_size.
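
The size relationship among the three Audio_Samples parameters can be checked on receipt. The following Python sketch is an illustration under stated assumptions: a dict stands in for the frame, the function name split_samples is hypothetical, and the bytes of each sample are left uninterpreted because the sample encoding is not specified here.

```python
def split_samples(frame):
    """Split :samples into one bytes object per audio sample.

    Raises ValueError when the byte count is not :num_samples * :sample_byte_size.
    """
    num_samples = frame[":num_samples"]
    size = frame[":sample_byte_size"]
    data = frame[":samples"]
    if len(data) != num_samples * size:
        raise ValueError(
            f"expected {num_samples * size} bytes, received {len(data)}"
        )
    return [data[i * size:(i + 1) * size] for i in range(num_samples)]
```

A frame with :num_samples 3 and :sample_byte_size 2 must therefore carry exactly six bytes in :samples.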

Background_Noise_Level: The recognition process issues this message when a request has been made using the Configure message to receive notification when the background noise level changes. The :notify_status flag of the Configure message must also be set.

{c Background_Noise_Level ... }

	Parameter	Type	Optional	Depends on	Description / Constraints
OUT:	:noise_level	GAL_INT			The current background noise level value.

Current_Byte: The recognition process issues this message when a request has been made using the Configure message to receive notification of the current byte of the audio stream being processed. The frequency of this message depends on the recognizer.

{c Current_Byte ... }

	Parameter	Type	Optional	Depends on	Description / Constraints
OUT:	:current_byte	GAL_INT			The current audio byte being processed within the recognizer.

Interference: The recognition process issues this message when a request has been made using the Configure message to receive notification when interference prevents a successful recognition. The :notify_status flag of the Configure message must also be set. Examples of interference are garbled input, whispered speech (volume level too low), shouted speech (volume level too high), noisy speech, and sound.

{c Interference ... }

	Parameter	Type	Optional	Depends on	Description / Constraints
OUT:	:interference	GAL_STRING	yes		Describes the interference: <"garbled" | "whispered" | "shouted" | "noisy" | "sound">

Recognition_Event: The recognition process issues this message when an event unique to the recognizer occurs, provided that the :notify_status parameter has been enabled with the Configure message. The actual value of the :event_name parameter depends on the recognizer.

{c Recognition_Event ... }

	Parameter	Type	Optional	Depends on	Description / Constraints
OUT:	:event_name	GAL_STRING			The value describes the type of event and depends on the recognizer being used.

Results: The recognition process issues this message when a recognition result is available. The parameters of the reinitialize message control the contents of the Results message. More than one result may be returned when the reinitialize message asks for the :n_best results or for results that exceed a given :threshold.

{c Results ... }

	Parameter	Type	Optional	Depends on	Description / Constraints
OUT:	:score	GAL_LIST			A list of scores, with each score as a float.
	:results	GAL_LIST			A list of results, with each result as a string.
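
A client typically needs to pair each result with its score. The following Python sketch shows one way to do so and is not taken from the Communicator distribution: a dict stands in for the Results frame, the function name ranked_results is hypothetical, and it assumes that :score and :results are index-aligned and that a higher score is better, neither of which is stated above. The optional threshold and n_best arguments mirror the reinitialize parameters of the same names, applied here on the client side.

```python
def ranked_results(frame, threshold=None, n_best=None):
    """Return (text, score) pairs from a Results frame, best score first."""
    scores = frame[":score"]
    results = frame[":results"]
    if len(scores) != len(results):
        raise ValueError(":score and :results must have the same length")
    # Sort on the score only, highest first (assumed to mean "best").
    pairs = sorted(zip(scores, results), key=lambda pair: pair[0], reverse=True)
    if threshold is not None:
        # Keep only results whose score exceeds the threshold.
        pairs = [pair for pair in pairs if pair[0] > threshold]
    if n_best is not None:
        pairs = pairs[:n_best]
    return [(text, score) for score, text in pairs]
```

An engine may already apply :n_best and :threshold itself, in which case the optional arguments can be omitted and the function only pairs and orders the lists.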

Speech_Started: The recognition process issues this message when a request has been made using the Configure message to receive notification when the recognizer detects the beginning of speech. The :notify_status flag of the Configure message must also be set.

{c Speech_Started ... }

Parameter Type Optional Depends on Description / Constraints

Speech_Stopped: The recognition process issues this message when a request has been made using the Configure message to receive notification when the recognizer detects the end of speech. The :notify_status flag of the Configure message must also be set.

{c Speech_Stopped ... }

Parameter Type Optional Depends on Description / Constraints

Volume_Limits: The recognition process issues this message when a request has been made using the Configure message to receive periodic notifications of the current volume setting. This message allows the application, for example, to provide feedback that a microphone is on or on how loudly a user is speaking. The :notify_status flag of the Configure message must also be set.

{c Volume_Limits ... }

	Parameter	Type	Optional	Depends on	Description / Constraints
OUT:	:volume	GAL_INT			The volume level setting.

Please send comments and suggestions to: bugs-darpacomm@linus.mitre.org
Last updated July 3, 2000.