Speech Recognition Software

By Peter Lyle DeHaan, PhD

Since the advent of computers, users have dreamed of the day when they could communicate with them using normal speech, something that wasn’t possible until speech recognition software was developed. Speech recognition allows people to interact with these machines, and vice versa.

Initial speech recognition systems were speaker-dependent, meaning that the software needed to be “trained” to understand each user, who had to repeatedly record common words or phrases. These systems had small vocabularies and limited uses since they couldn’t communicate with the general public.

Today’s speaker-independent systems have solved that problem, making them ideal for automating basic or repetitive call center transactions. In a call center, speech recognition occurs via the telephone. Speaker-independent speech recognition systems are becoming increasingly adept at dealing with large variations in speech, be it pronunciation, accent, or dialect, and at accommodating continuous speech (that is, without the user pausing unnaturally after each word).

Speech recognition should not be confused with voice recognition or voice authentication, which confirms callers’ identities by analyzing their voice prints. In other words, speech recognition is a communication technology, while voice recognition is an identification or verification technology.

For today’s call center, speech recognition can be used to supplement or replace touch-tone IVR (Interactive Voice Response), enter information into a database (speech-to-text conversion), or retrieve information from a database. A speech-enabled IVR system can answer a call and prompt the caller for information, such as an account number, phone number, or address. The system takes the response, converts it into text, and pre-populates a form. This information can then be presented to an agent to complete the call. In some situations, the entire interaction with the caller is done via speech recognition. The resulting data is written into a call record, which can then be forwarded to the appropriate individual, department, or even an external database.
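As a rough illustration of the “convert the response to text and pre-populate a form” step, here is a minimal Python sketch. It assumes the recognizer has already returned a plain-text transcript for each prompt; the field names and the functions `transcript_to_digits` and `prepopulate_form` are hypothetical and do not come from any vendor’s product.

```python
# Hypothetical sketch: turn recognizer transcripts into form fields.
# Assumes the speech engine has already produced text for each prompt.

DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}


def transcript_to_digits(transcript: str) -> str:
    """Convert spoken digits such as 'four oh seven' to '407'.

    Words that are not digits (for example 'my account is') are ignored.
    """
    digits = []
    for word in transcript.lower().replace("-", " ").split():
        word = word.strip(".,")
        if word in DIGIT_WORDS:
            digits.append(DIGIT_WORDS[word])
        elif word.isdigit():
            digits.append(word)
    return "".join(digits)


def prepopulate_form(responses: dict) -> dict:
    """Build the record an agent would see from the caller's responses."""
    return {
        "account_number": transcript_to_digits(responses.get("account_number", "")),
        "phone_number": transcript_to_digits(responses.get("phone_number", "")),
        "address": responses.get("address", "").strip(),
    }


form = prepopulate_form({
    "account_number": "my account is four oh seven",
    "phone_number": "five five five, zero one zero one",
    "address": " 12 Main Street ",
})
```

A production system would also need to confirm low-confidence responses with the caller before handing the form to an agent; that step is omitted here.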

To access a database, the caller is prompted for information, which is used to create a query. The database could be a directory of pager numbers, phone extensions, or on-call staff. Alternatively, the database could contain orders, messages, documents, trouble reports, account balances, payment information, and so forth.
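The query step can be sketched in Python with the standard `sqlite3` module. This is only an illustration under stated assumptions: the `directory` table, its columns, the sample entries, and the functions `build_directory` and `lookup_extension` are all invented for the example, and the recognized name is assumed to arrive as text.

```python
import sqlite3


def build_directory() -> sqlite3.Connection:
    """Create a small in-memory directory (sample data is made up)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE directory (name TEXT, extension TEXT, pager TEXT)")
    conn.executemany(
        "INSERT INTO directory VALUES (?, ?, ?)",
        [("Pat Jones", "214", "555-0101"), ("Lee Smith", "305", "555-0102")],
    )
    return conn


def lookup_extension(conn: sqlite3.Connection, spoken_name: str):
    """Use the recognized name as a query parameter; return None if no match."""
    row = conn.execute(
        # A parameterized query keeps the caller's words out of the SQL text.
        "SELECT extension FROM directory WHERE lower(name) = lower(?)",
        (spoken_name.strip(),),
    ).fetchone()
    return row[0] if row else None


conn = build_directory()
extension = lookup_extension(conn, "pat jones")
```

Passing the recognized text as a parameter, rather than pasting it into the SQL string, means that an unexpected transcript cannot alter the query itself.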

Here are call center vendors that provide speech recognition software:

Peter Lyle DeHaan, PhD, is the publisher and editor-in-chief of Connections Magazine. He’s a passionate wordsmith whose goal is to change the world one word at a time. Read more of his articles at PeterDeHaanPublishing.com.

Alston Tascom, Inc: Alston Tascom’s Spoken Response is speaker-independent and uses natural language algorithms that allow callers to speak more naturally. Since people phrase their requests differently, “word spotting” recognizes words in any order, which lets people talk normally. It includes software that lets call centers record their own menus, prompts, and answers. Additionally, it has a drag-and-drop interface and online tutorial, allowing call centers to quickly develop a custom application. Spoken Response has many features, including noise-canceling input, a speech recognition engine, vocabularies, application interfaces, and rudimentary natural-language processing. Users indicate that Spoken Response can answer a telephone-based customer service inquiry for about one-third the cost of an agent.

For more information about Spoken Response, contact Alston Tascom at 866-282-7266 or info@alstontascom.com.
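The general idea behind word spotting can be sketched in a few lines of Python. This is a generic illustration, not a description of how Spoken Response is implemented; the keyword list, the action names, and the function `spot_words` are assumptions made for the example, and the caller’s speech is assumed to have been converted to text already.

```python
import re

# Hypothetical keywords mapped to hypothetical call-flow actions.
KEYWORDS = {
    "balance": "account_balance",
    "payment": "make_payment",
    "agent": "transfer_to_agent",
    "operator": "transfer_to_agent",
}


def spot_words(transcript: str) -> list[str]:
    """Return the action for each keyword heard, wherever it appears.

    Surrounding words are ignored, so callers can phrase requests freely.
    """
    actions = []
    for word in re.findall(r"[a-z']+", transcript.lower()):
        action = KEYWORDS.get(word)
        if action and action not in actions:
            actions.append(action)
    return actions


single = spot_words("I'd like to check my balance, please")
multiple = spot_words("Payment first, and then an operator or agent")
```

Because only the keywords matter, “check my balance” and “my balance, I want to check it” lead to the same action.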

Amcom Software, Inc: Amcom Smart Speech applications enable call centers to automatically process routine phone requests such as directory assistance, messaging, and paging. Smart Speech applications are designed to handle high call volumes and directories with thousands or even millions of records. They can be integrated with other call center applications, enabling use of a single database for combined agent, Web, and speech-enabled directory functions. Callers simply say the name of the person or department they want to reach and are quickly connected. Experienced callers can barge in ahead of system prompts.

Real-time access and multiple call handling ability decrease caller wait time, while integrated feeds from other databases (including the phone system) automate directory updates and ease administration. Directory data can range from employee or department to patient, student, and even guest information. All applications include logs and reports, training, professionally recorded names, maintenance, and upgrades. Also included is dynamic call scripting and text-to-speech capability. Key applications include auto attendant/IVR (for directory assistance and call transfer), paging, meet-me-paging, morale call manager, and on-call scheduling.

Contact Amcom at 800-852-8935 (just say “sales”) or 952-946-7729.

Amtelco: Amtelco’s “Just Say It” provides text-to-speech and automatic speech recognition capabilities. It enables call centers to automate call processing, reduce labor costs, increase accuracy, and offer more services without increasing staff. The Just Say It speech recognition module uses SpeechWorks’ speech recognition technology, which allows IVR scripts to listen for words or phrases spoken by a caller in multiple languages.

Just Say It simplifies voice call processing by allowing callers to speak an option instead of entering DTMF tones. In addition to listening for commands, Just Say It can gather information from the caller and store it in an OLE-compliant database. This can range from simple yes/no commands to a spoken name, ID, or credit card number. Just Say It provides the tools to build dictionaries of the words or phrases needed for each application, making script design and call processing more efficient.

Contact Amtelco at 800-356-9148 or info@amtelco.com.

LumenVox: LumenVox’s speech-driven information system is an intuitive GUI-based toolkit to design, develop, and deploy speech applications or IVR. By connecting to most phone systems and accessing a database, the Designer can create a speech-driven application, which can be exported to a VoiceXML file. The Designer can be extended so that any speech application can be created. These extensions can be written as either a Visual Basic ActiveX EXE or a C/C++ DLL. Finally, the Designer is tied to Windows OS, Intel Dialogic, and LumenVox’s speech recognition engine and post-deployment tuner. Some applications that have been built include a customer satisfaction survey, pizza ordering and delivery status, outbound post-surgery patient checkup, speech-driven technical support, call router, customer service, dealer locator, and auto-attendant.

For more information, call 877-977-0707.

[From Connection Magazine – June 2005]