The Galaxy Communicator infrastructure is
a hub-and-spoke architecture. The distribution contains:
A Hub, implemented in C, which mediates connections between Communicator
servers (speech recognition and synthesis, parsing, dialogue management, etc.);
Server libraries for constructing Communicator-compliant servers in C (and
C++), Java, Python, and Allegro Common Lisp;
Examples illustrating basic and advanced functionality for creating servers
and setting up the Hub to communicate with them;
Extensive documentation (which you're now reading);
Example servers, such as wrappers for the TrueTalk and Festival speech
synthesizers and for Oracle and PostgreSQL database clients.
What you don't get
You don't get an end-to-end dialogue system
The Galaxy Communicator distribution is not
an end-to-end dialogue system; it provides you with tools for constructing
such a system out of a suite of servers. You can obtain the servers you
need in a number of ways:
You may find some of them in the Galaxy Communicator
distribution, as noted above;
You may be able to obtain some of them from
willing participants in the Communicator program;
You may choose to write some of them yourself.
MITRE intends in FY02 to assemble an open-source toolkit of
Communicator-compliant servers that together make up an end-to-end dialogue
system, but this effort is in its infancy. At the moment, if you want to
use this infrastructure to build an end-to-end dialogue system, you'll
need many pieces that aren't provided here, such as audio support, speech
recognition, language parsing, dialogue management, and language generation.
You don't get run-time semantic standards
The Galaxy Communicator infrastructure provides a sophisticated and general
transport layer for connecting servers and Hubs, as well as a message syntax,
but it does not specify the semantics of the messages that travel between
the servers and Hubs. That is, there's no standard run-time API for speech
recognizers, audio devices, or parsers. While the MITRE team has explored
adapting existing APIs or message sets for use with Communicator-compliant
servers, nothing in the infrastructure endorses any of them.
You don't get configuration-time semantic standards
The Galaxy Communicator infrastructure neither supports nor specifies any
standard for configuring individual servers. For instance, the W3C Voice
Browser Working Group is proposing specifications for speech recognition
grammars. Such proposals are completely compatible with the Galaxy
Communicator infrastructure, but the infrastructure does not endorse any
particular proposal.
Prerequisites: technical background, hardware and software requirements
Technical background
The core Galaxy Communicator infrastructure is written in C, and detailed
documentation is provided only for the C libraries, so familiarity with C
is fairly important. Because the infrastructure is distributed, and
distributed processing is a fairly distinct programming paradigm, some
background in it (e.g., RPC, CORBA, Java RMI) is preferable. Object-oriented
programming experience is not needed unless you'll be using the Java or
Allegro Common Lisp server bindings. Finally (and fairly obviously, since
you're already reading this), a reasonable command of English is required
for understanding the documentation.
Supported platforms
The current version of the Galaxy Communicator
infrastructure is actively supported on Sparc Solaris, Intel Linux and
Win32. Consult the installation
instructions for more details.
Software requirements
The Galaxy Communicator infrastructure requires the GNU gcc compiler and
GNU make. Consult the installation instructions for more details.
Technical overview
A Communicator-compliant dialogue system
consists of a process called the Hub,
together with a set of servers. Servers and the Hub communicate with each
other using named attribute-value structures called frames.
These frames form the basis of all structured communication in the Galaxy
Communicator infrastructure.
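To make the idea of a named attribute-value structure concrete, here is a
minimal sketch in Python. It is not the Galaxy Communicator frame API: the
Frame class, its set and get methods, and the "parse", "input_string" and
"turn" names are all hypothetical, chosen only to illustrate a name plus a set
of attribute-value pairs.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Frame:
    """Illustrative stand-in for a frame: a name plus attribute-value pairs.

    This is NOT the real Galaxy Communicator data structure, only a toy model.
    """
    name: str
    attributes: Dict[str, Any] = field(default_factory=dict)

    def set(self, key: str, value: Any) -> "Frame":
        # Return self so that calls can be chained while building a frame.
        self.attributes[key] = value
        return self

    def get(self, key: str, default: Any = None) -> Any:
        # Look up an attribute, falling back to a default when it is absent.
        return self.attributes.get(key, default)


# A hypothetical message asking a parser server to process an utterance.
request = Frame("parse").set("input_string", "flights to boston").set("turn", 1)
```

In this toy model the frame name identifies what the message is about, and the
attributes carry its payload; the real libraries define their own types and
accessors, which are described in the C documentation.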
Hub
Almost all communication between Communicator-compliant
servers passes through the Hub. The Hub has a number of significant capabilities:
The Hub maintains connections to servers (parser,
speech recognizer, backend, etc.), and routes messages among them.
Message routing in the Hub can be programmed via a scripting language that
controls the flow through each dialogue turn; the default is the MIT
scripting language, but users can opt to use no scripting language
or to incorporate their own.
The Hub also incorporates an internal server named Builtin, which
implements user-visible administrative tasks.
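The routing role described above can be sketched with a toy hub written in
Python. This is an illustration only: ToyHub, connect, add_rule and run_turn
are hypothetical names, the servers are plain in-process functions rather than
networked processes, and the ordered list of condition/server rules is a crude
stand-in for a Hub script, not the MIT scripting language.

```python
from typing import Callable, Dict, List, Tuple

Message = Dict[str, object]
Handler = Callable[[Message], Message]
# A rule pairs a condition on the current message with the server to call.
Rule = Tuple[Callable[[Message], bool], str]


class ToyHub:
    """Hypothetical hub: keeps named servers and routes messages by rule."""

    def __init__(self) -> None:
        self.servers: Dict[str, Handler] = {}
        self.rules: List[Rule] = []

    def connect(self, name: str, handler: Handler) -> None:
        # "Connecting" here just means remembering a callable under a name.
        self.servers[name] = handler

    def add_rule(self, condition: Callable[[Message], bool], server: str) -> None:
        self.rules.append((condition, server))

    def run_turn(self, message: Message) -> Message:
        # Walk the rules in order; each matching rule sends the current
        # message to a server and merges the reply back into the message.
        for condition, server in self.rules:
            if condition(message):
                reply = self.servers[server](message)
                message = {**message, **reply}
        return message


hub = ToyHub()
hub.connect("recognizer", lambda m: {"text": "flights to boston"})
hub.connect("parser", lambda m: {"parse": str(m["text"]).split()})
hub.add_rule(lambda m: "audio" in m and "text" not in m, "recognizer")
hub.add_rule(lambda m: "text" in m and "parse" not in m, "parser")

result = hub.run_turn({"audio": b"\x00\x01"})
```

The point of the sketch is that the servers never call each other: the
recognizer's output reaches the parser only because the hub's rules say so,
which is what lets the flow of a dialogue turn be changed without touching
the servers.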
Server libraries
The server libraries provide a number of convenient capabilities for managing
data and communications. Detailed documentation is provided for the C
bindings; the documentation for the other programming language bindings
contains representative examples and equivalence tables.
The server libraries provide a common set of command line arguments and
support for defining dispatch functions, which are invoked in response to
frames from the Hub.
The server libraries provide support for backchannel
connections for high-bandwidth data that can be passed directly from
server to server.
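The dispatch-function idea, a function that the server runs in response to an
incoming frame, can be sketched as a simple registry in Python. This does not
use the Communicator Python bindings: ToyServer, dispatch_function and handle
are hypothetical names, frames are modeled as plain dictionaries, and selecting
the function by an operation name is an assumption made for the example.

```python
from typing import Any, Callable, Dict

Frame = Dict[str, Any]
DispatchFn = Callable[[Frame], Frame]


class ToyServer:
    """Hypothetical server that maps operation names to dispatch functions."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._dispatch: Dict[str, DispatchFn] = {}

    def dispatch_function(self, fn: DispatchFn) -> DispatchFn:
        # Decorator: register the function under its own name.
        self._dispatch[fn.__name__] = fn
        return fn

    def handle(self, operation: str, frame: Frame) -> Frame:
        # Invoke the matching dispatch function, or report that none exists.
        fn = self._dispatch.get(operation)
        if fn is None:
            return {"error": f"{self.name}: no dispatch function {operation!r}"}
        return fn(frame)


parser = ToyServer("parser")


@parser.dispatch_function
def parse(frame: Frame) -> Frame:
    # A stand-in "parser" that just splits the input into tokens.
    return {"tokens": str(frame.get("input_string", "")).split()}


reply = parser.handle("parse", {"input_string": "flights to boston"})
```

A server written this way only declares what it can do; deciding when each
dispatch function is called is left to whatever sends it frames, which in a
Communicator system is normally the Hub.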