The previous article in this series, "Building VoiceXML Applications Using
J2EE" (XML-J, Vol. 2, issue 2), discussed building dynamic and
interactive voice applications using VoiceXML and J2EE. In this issue we'll
focus on the tools available to aid development and testing of VoiceXML-based
components and applications. We'll discuss how to use the tools to test and
debug such applications from a normal Touch-Tone-based phone.
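The tools reviewed are BeVocal Café, WebSphere Voice Server SDK, Mobile ADK,
V-Builder, Tellme Studio, VoiceGenie Developer Workshop, and voxeo community.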
BeVocal Café
To test VoiceXML applications from a live Touch-Tone phone, developers can
dial the BeVocal Café testing number (877-33-VOCAL) and use their personal user
ID/PIN to test the currently active VoiceXML file or URL.
BeVocal Café contains a number of technical resources, including a Getting
Started guide, a FAQ, and a VoiceXML tutorial and reference. A number of
VoiceXML samples are provided to explain the features of its platform and/or
serve as starter applications. Café contains a section called Audio Libraries, a
large set of frequently used recorded prompts. To enhance collaboration between
developers, Café includes Usenet-based threaded forums. Find BeVocal Café at
http://cafe.bevocal.com/.
WebSphere Voice Server SDK
WebSphere Studio 3.5 (see Figure 2), a separate Web development IDE, supports
a syntax checking-enabled VoiceXML editor and wizards to create dynamic VoiceXML
documents. Voice Server SDK is also integrated with WebSphere Studio, providing
the capability to launch the speech browser from within the IDE.
To simulate input for an application, the SDK supports voice input through a
microphone attached to the workstation and text-simulated input through the
keyboard. The DTMF simulator adds the capability to simulate recognition of DTMF
inputs as well.
IBM's WebSphere Voice Server SDK is available at http://www.ibm.com/software/speech/enterprise/ep_11.html. The site
also provides technical resources, such as a programmer's guide, a FAQ, and
access to Usenet-based discussion groups.
Mobile ADK
The wireless IDE provides capabilities such as a color-coded editor. The IDE
is tightly integrated, and a touch of a green toolbar button instantly brings up
the VoiceXML Simulator.
At this writing the Mobile ADK 2.0 beta (the version that supports VoiceXML)
is available from Motorola's Mobile Internet Exchange (MIX) at http://mix.motorola.com/audiences/developers/madk_intro_dev.asp.
The site provides technical information such as white papers, a FAQ, a
searchable knowledge base, and a Web-based discussion group for collaboration.
V-Builder
V-Builder also allows visual development of grammars that can be used by
VoiceXML-based application dialogs. V-Builder supports the Grammar Specification
Language (GSL) format for describing both inline and external grammars.
VoiceXML application components (grammars/dialogs) can be tested using either
native desktop audio or a telephone (with an appropriate hardware and/or audio
provider). A third-party text-to-speech engine is required if the application
incorporates TTS functionality. V-Builder and Nuance's other tools are available at
http://extranet.nuance.com/developer/. The site provides access to
technical documentation (development guides and tutorials), integration with
text-to-speech engines and hardware interfaces, and collaboration tools such as
discussion groups and newsletters.
Tellme Studio
MyStudio includes a set of development, testing, and debugging tools, such as
a syntax checker and a record-by-phone option that allows developers to
record prompts for use in the application. As discussed in my previous article,
grammars are an important component of a VoiceXML application. They define how
the VoiceXML Interpreter and the underlying speech recognition engine should
interpret input phrases. Tellme Studio provides tools to validate a grammar
(from a scratchpad or external URL), a phrase checker to test a particular
grammar, a phrase generator to generate the various phrases the grammar can
recognize, and a DTMF Generator to generate DTMF entries for lists of words.
Figure 5 shows the DTMF Generator producing DTMF entries for the words HTML,
XML, and VXML.
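As a rough illustration of what a DTMF generator does, the Python sketch below maps each word to the digits a caller would press on a standard telephone keypad. The function name `dtmf_for_word` and the lookup table are assumptions made for illustration; this is not Tellme Studio's implementation, and the real tool's output format may differ.

```python
# Standard telephone keypad letter groups.
# Illustrative only; not taken from Tellme Studio.
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}

# Invert the table so that each letter points at its digit.
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def dtmf_for_word(word):
    """Return the keypad digits for a word (hypothetical helper).

    Digits pass through unchanged; any other character raises ValueError.
    """
    digits = []
    for ch in word.upper():
        if ch in "0123456789":
            digits.append(ch)
        elif ch in LETTER_TO_DIGIT:
            digits.append(LETTER_TO_DIGIT[ch])
        else:
            raise ValueError(f"no keypad digit for {ch!r}")
    return "".join(digits)


if __name__ == "__main__":
    for word in ("HTML", "XML", "VXML"):
        print(word, dtmf_for_word(word))
```

A grammar entry built this way lets a caller key in a word when speech recognition is impractical, for example in a noisy environment.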
Tellme Studio also provides a tool to aid in converting prerecorded prompts
to the supported audio file formats. It's available as a plug-in to a popular
sound-editing tool (Sonic Foundry's Sound Forge).
Tellme Studio provides a series of technical resources, including a VoiceXML
reference, grammar reference, a FAQ, guidelines for creating phone-based
applications ("Designing for the Phone"), and technical white papers on
VoiceXML. Example VoiceXML code, grammars, and audio prompts are available for
easier development of applications. Tellme Studio is available on the Web at http://studio.tellme.com/ and
provides Usenet-based threaded forums for communication between developers.
VoiceGenie Developer Workshop
VoiceGenie Developer Workshop provides an extension management tool that
allows the developer to manage up to 20 extensions (per account) to develop and
test multiple applications. The extensions capability allows multiplexing of
applications with a single telephone number.
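One way to picture extension-based multiplexing is a table that maps each extension to the URL of a VoiceXML application, consulted after the caller dials the shared number and keys in an extension. The Python sketch below is an assumption about the general shape of such a table, not VoiceGenie's implementation; the class name, method names, and error handling are hypothetical. Only the limit of 20 extensions per account comes from the description above.

```python
MAX_EXTENSIONS = 20  # per-account limit mentioned in the text


class ExtensionTable:
    """Maps dialed extensions to VoiceXML application URLs (hypothetical)."""

    def __init__(self):
        self._urls = {}

    def register(self, extension, url):
        """Add or replace an extension, enforcing the per-account limit."""
        if extension not in self._urls and len(self._urls) >= MAX_EXTENSIONS:
            raise ValueError("extension limit reached")
        self._urls[extension] = url

    def resolve(self, extension):
        """Return the application URL for an extension, or None if unknown."""
        return self._urls.get(extension)
```

With a table like this, each application under test gets its own extension while every caller still dials the same telephone number.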
Technical resources include a VoiceXML reference, tutorials, how-to documents,
examples, and a FAQ. There are also audio and grammar libraries. VoiceGenie
Developer Workshop is available on the Web at http://developer.voicegenie.com.
Web-based threaded discussions provide a collaborative environment for the
developer community.
voxeo community
Technical resources include a development guide, tutorials, notes, a FAQ, a
Perl module (called vxml.pm) that facilitates generation of VoiceXML code
through Perl, and examples to help jump-start your VoiceXML applications. Find
them on the Web at http://community.voxeo.com/. Usenet- and Web-based threaded
discussions provide a collaborative environment for the developer community.
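The general idea of generating VoiceXML from a scripting language, instead of writing the markup by hand, can be sketched in Python. This is not the vxml.pm interface; the `build_menu` function is hypothetical, and the markup it emits is a minimal VoiceXML 1.0-style menu chosen for illustration.

```python
import xml.etree.ElementTree as ET


def build_menu(prompt_text, choices):
    """Build a one-menu VoiceXML document as a string (hypothetical helper).

    `choices` is a list of (spoken phrase, target URL) pairs.
    """
    vxml = ET.Element("vxml", {"version": "1.0"})
    menu = ET.SubElement(vxml, "menu")
    prompt = ET.SubElement(menu, "prompt")
    prompt.text = prompt_text
    for phrase, target in choices:
        # Each choice pairs what the caller says with where to go next.
        choice = ET.SubElement(menu, "choice", {"next": target})
        choice.text = phrase
    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    # The example.com URLs are placeholders.
    print(build_menu(
        "Say news or weather.",
        [("news", "http://example.com/news.vxml"),
         ("weather", "http://example.com/weather.vxml")],
    ))
```

Building the document through an XML library rather than string concatenation means special characters in prompts are escaped automatically.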
Conclusion
As we've seen thus far, an important aspect of developing VoiceXML
applications is designing and developing grammars, and the strength and
flexibility of an interactive VoiceXML application depend on the richness of
its grammars. In the next part of this series we'll focus on developing grammars
and review the upcoming standards and specifications being developed around them.
Author Bio
www.XML-Journal.com
by Hitesh Seth
BeVocal Café by BeVocal, Inc., is a hosted VoiceXML
platform that allows development and testing of VoiceXML-based applications. It
includes tools (see Table 1) such as File Management for uploading VoiceXML,
grammar, and audio files, VoiceXML Checker to validate VoiceXML content, a vocal
player for replaying sessions, Trace Tool for tracing and debugging the
application, Log Browser for viewing the call trace log, and Port Estimator to
estimate the number of ports necessary to support a required level of service
and concurrent callers. Figure 1 shows the File Management utility. Since Café
allows the developer to upload VoiceXML application files, a Web server isn't
immediately needed to serve VoiceXML content.
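How the Port Estimator arrives at its figure isn't covered here, so the following is only a sketch of one conventional way to answer the same question: the Erlang B formula, which gives the probability that a call is blocked when a given traffic load is offered to a fixed number of ports. The function names, the inputs, and the default 1% blocking target are assumptions for illustration, not BeVocal's method.

```python
def erlang_b(offered_erlangs, ports):
    """Blocking probability for `ports` lines under `offered_erlangs` of traffic.

    Uses the standard recurrence B(0) = 1, B(n) = A*B(n-1) / (n + A*B(n-1)),
    which avoids the large factorials of the closed-form expression.
    """
    blocking = 1.0
    for n in range(1, ports + 1):
        blocking = (offered_erlangs * blocking) / (n + offered_erlangs * blocking)
    return blocking


def ports_needed(calls_per_hour, avg_call_seconds, max_blocking=0.01):
    """Smallest port count that keeps blocking at or below `max_blocking`.

    Hypothetical helper; the 1% default is an assumed service level.
    """
    if not 0 < max_blocking < 1:
        raise ValueError("max_blocking must be between 0 and 1")
    # Offered traffic in erlangs: arrival rate times average holding time.
    offered = calls_per_hour * avg_call_seconds / 3600.0
    if offered <= 0:
        return 0
    ports = 0
    while erlang_b(offered, ports) > max_blocking:
        ports += 1
    return ports


if __name__ == "__main__":
    # Example load: 200 calls per hour averaging three minutes each.
    print(ports_needed(200, 180))
```

Erlang B assumes that blocked callers hang up rather than queue, which is a reasonable first approximation for an application that returns a busy signal when all ports are in use.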
IBM's WebSphere Voice Server SDK (see
Table 2) leverages the multimedia capabilities of a workstation - speakers and
microphone - and provides a desktop-based simulation environment for testing
interactive voice-based applications developed using VoiceXML. Its components
include a desktop-based speech recognition engine (IBM's ViaVoice Speech
Recognition Engine) to recognize speech input through the microphone, a
text-to-speech engine (IBM's ViaVoice Text-to-Speech Engine), a VoiceXML
browser, and a DTMF simulator. Applications developed using Voice Server SDK can
be deployed on the WebSphere Voice Server. The VoiceXML browser supports
grammars specified using JSGF format.
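To make the idea of a grammar concrete: a JSGF rule such as `<drink> = (coffee | tea) [please];` defines a small, finite set of phrases the recognizer will accept. The Python sketch below enumerates those phrases from a hand-written data structure instead of parsing JSGF text; the `expand` function and the list-of-alternatives representation are assumptions for illustration and are not part of the Voice Server SDK.

```python
from itertools import product

# The rule (coffee | tea) [please] written as a list of slots.
# Each slot lists its alternatives; "" stands for an optional word left out.
DRINK_RULE = [["coffee", "tea"], ["", "please"]]


def expand(rule):
    """Return every phrase the rule accepts (hypothetical helper)."""
    phrases = []
    for combo in product(*rule):
        # Drop the empty placeholders before joining the words.
        phrases.append(" ".join(word for word in combo if word))
    return phrases


if __name__ == "__main__":
    for phrase in expand(DRINK_RULE):
        print(phrase)
```

Listing the accepted phrases this way is a quick check that a grammar covers what callers are likely to say and nothing unintended.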
Motorola's Mobile ADK (see Table 3) builds on top of the
Motorola Wireless IDE, a common tool for developing applications for phones,
mobile phones, and interactive two-way pagers. Key components of Mobile ADK are
an IDE that supports validation of VoiceXML applications, a desktop-based
VoiceXML simulator, and a Microsoft Agent-based application that allows
interactive testing of VoiceXML applications. The VoiceXML Simulator supports
both simulated text-based input (in the Simulator Response box) and speech input
through a desktop microphone (see Figure 3). The simulator also supports DTMF
input by clicking on buttons of the phone image.
Nuance Communications' V-Builder (see Table 4) is a
visual IDE for developing VoiceXML-based applications. We're all used to the
concept of assembling a visual dialog from fundamental elements such as text
areas, text fields, buttons, and menus. V-Builder leverages this concept and
represents VoiceXML tags as visual elements, which are contained in the elements
palette and can be dragged and dropped to create a VoiceXML document/dialog (see
Figure 4). V-Builder also incorporates Nuance SpeechObjects by providing a set
of prebuilt speech objects for common dialogs, entry, database and Web queries,
and natural language interfaces. SpeechObjects run on top of the Nuance Advanced
Speech Recognition (ASR) engine and are integrated with the VoiceXML language
using the <object> tag.
Tellme Studio, a product of Tellme Networks, Inc., is
a hosted VoiceXML platform (see Table 5). The tools in Tellme Studio are in two
sections: MyExtensions and MyStudio. The former allows development and
publication of applications to the Tellme platform. Enabling MyExtensions lets
developers make their applications available to consumers.
VoiceGenie Technologies' VoiceGenie
Developer Workshop (see Table 6) is a hosted VoiceXML platform that provides
essential tools such as a call log explorer to view the logs for debugging, an
online audio converter that converts prerecorded audio prompts into the
supported .vox format, and an extension manager that allows a number of
extensions to be defined to test individual applications. A VoiceXML validator
converts VoiceXML applications into a series of "blocks" that allow easier
visualization of the workings of the application (see Figure 6).
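To give a feel for this kind of block view, here is a minimal Python sketch that parses a VoiceXML document and returns an indented outline of its dialog elements. It isn't VoiceGenie's validator; the `outline` function, the set of tags treated as blocks, and the sample document are assumptions for illustration, and the only check performed is XML well-formedness, not validation against the VoiceXML DTD.

```python
import xml.etree.ElementTree as ET

# Tags shown in the outline; an illustrative subset of VoiceXML elements.
BLOCK_TAGS = {"vxml", "form", "menu", "field", "block", "choice", "filled"}

SAMPLE = """<?xml version="1.0"?>
<vxml version="1.0">
  <form id="order">
    <block>Welcome.</block>
    <field name="drink">
      <prompt>Coffee or tea?</prompt>
      <filled><prompt>Thanks.</prompt></filled>
    </field>
  </form>
</vxml>"""


def outline(vxml_text):
    """Return indented outline lines; raises ET.ParseError if not well formed."""
    lines = []

    def walk(element, depth):
        if element.tag in BLOCK_TAGS:
            # Label a block with its id or name attribute when it has one.
            label = element.get("id") or element.get("name") or ""
            suffix = f" [{label}]" if label else ""
            lines.append("  " * depth + element.tag + suffix)
            depth += 1
        for child in element:
            walk(child, depth)

    walk(ET.fromstring(vxml_text), 0)
    return lines


if __name__ == "__main__":
    print("\n".join(outline(SAMPLE)))
```

Even a text outline like this makes it easier to see which fields belong to which form before placing a test call.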
voxeo corporation's voxeo community (see Table 7)
is a hosted VoiceXML platform that provides an online logger to assist in
debugging a VoiceXML application. A URL mapping tool maps phone numbers and
allows a developer to register unique phone numbers for multiple VoiceXML-based
applications (see Figure 7).
VoiceXML is an emerging standard. Although it has
been around for only a short time, a number of tools have emerged to jump-start
the development of next-generation interactive voice portals. Even though
these tools are initial entries, given the enthusiasm and excitement that
VoiceXML has created we can expect substantial improvements, new developments,
and better compatibility.
Hitesh Seth is chief technology evangelist
for SeraNova, a global e-business and mobile solutions consulting firm. He has
extensive experience in the technologies associated with Internet application
development. Hitesh received his bachelor's degree from the Indian Institute of
Technology Kanpur (IITK), India.