Speech Recognition Grammar Specification

The Speech Recognition Grammar Specification (SRGS) is a W3C standard that describes how speech recognition grammars can be specified. A speech recognition grammar is a set of word patterns that tells a speech recognition system what a person is expected to say. For instance, when you call an automated switchboard, the system asks for the name of the person you want to talk to. It then invokes a speech recognizer and supplies it with a speech recognition grammar. This grammar contains the names of all persons in the directory and the various sentence patterns with which callers typically respond.

SRGS specifies two different but logically equivalent syntaxes: one is XML-based, the other uses the Augmented BNF (ABNF) format. In practice, the XML syntax is used more often.

If the speech recognizer returned only a string of the spoken words, the voice application would have to do the tedious job of extracting the semantic meaning from those words. For this reason, SRGS grammars can be decorated with tag elements that, when executed, build up the semantic result. SRGS does not specify the content of these tag elements; that is defined by the companion W3C standard Semantic Interpretation for Speech Recognition (SISR). SISR is based on ECMAScript: ECMAScript statements within the SRGS tags create an ECMAScript semantic result object that the voice application can process easily.
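As a rough illustration of how tag elements might build such a result, here is a minimal sketch of an SRGS XML grammar using SISR script tags. The rule names, the property names (action, callee, name, extension) and the extension numbers are hypothetical and chosen only for this example; only the tag-format value, the out variable and the rules object come from SISR itself.

```xml
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         xml:lang="en-US" version="1.0"
         root="request" tag-format="semantics/1.0">

  <!-- Each alternative sets properties on this rule's result object "out".
       Names and extension numbers are made up for illustration. -->
  <rule id="person">
    <one-of>
      <item>Jose <tag>out.name = "Jose"; out.extension = "1234";</tag></item>
      <item>Michel Tremblay <tag>out.name = "Michel Tremblay"; out.extension = "5678";</tag></item>
    </one-of>
  </rule>

  <rule id="request" scope="public">
    may I speak to
    <ruleref uri="#person"/>
    <!-- Copy the referenced rule's result into this rule's result -->
    <tag>out.action = "transfer"; out.callee = rules.person;</tag>
  </rule>
</grammar>
```

Under these assumptions, the utterance "may I speak to Jose" would be expected to yield a result object with action set to "transfer" and callee holding the name and extension for Jose, so the voice application can read those properties directly instead of parsing the recognized words.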

Both SRGS and SISR are W3C Recommendations, i.e. they have reached the final stage of the W3C standards track. The W3C VoiceXML standard, which defines how voice dialogs are specified, relies heavily on SRGS and SISR.

Examples

Here is an example of the Augmented BNF form of SRGS, as might occur in a voice directory application:

#ABNF 1.0 ISO-8859-1;

// Default grammar language is US English
language en-US;

// Single language attachment to tokens
// Note that "fr-CA" (Canadian French) is applied to only
// the word "oui" because of precedence rules
$yes = yes | oui!fr-CA;

// Single language attachment to an expansion
$people1 = (Michel Tremblay | André Roy)!fr-CA;

// Handling language-specific pronunciations of the same word
// A capable speech recognizer will listen for Mexican Spanish and
// US English pronunciations.
$people2 = Jose!en-US | Jose!es-MX;

/**
 * Multi-lingual input possible
 * @example may I speak to André Roy
 * @example may I speak to Jose
 */
public $request = may I speak to ($people1 | $people2);

Here is the same example in SRGS XML form:

<!DOCTYPE grammar PUBLIC "-//W3C//DTD GRAMMAR 1.0//EN"
                  "http://www.w3.org/TR/speech-grammar/grammar.dtd">

<!-- The default grammar language is US English -->
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.w3.org/2001/06/grammar
                             http://www.w3.org/TR/speech-grammar/grammar.xsd"
         xml:lang="en-US" version="1.0">

  <!--
     Single language attachment to tokens
     "yes" inherits US English language
     "oui" is Canadian French language
  -->
  <rule id="yes">
    <one-of>
      <item>yes</item>
      <item xml:lang="fr-CA">oui</item>
    </one-of>
  </rule>

  <!-- Single language attachment to an expansion -->
  <rule id="people1">
    <one-of xml:lang="fr-CA">
      <item>Michel Tremblay</item>
      <item>André Roy</item>
    </one-of>
  </rule>

  <!--
     Handling language-specific pronunciations of the same word
     A capable speech recognizer will listen for Mexican Spanish
     and US English pronunciations.
  -->
  <rule id="people2">
    <one-of>
      <item xml:lang="en-US">Jose</item>
      <item xml:lang="es-MX">Jose</item>
    </one-of>
  </rule>

  <!-- Multi-lingual input is possible -->
  <rule id="request" scope="public">
    <example> may I speak with André Roy </example>
    <example> may I speak with Jose </example>

    may I speak with
    <one-of>
      <item> <ruleref uri="#people1"/> </item>
      <item> <ruleref uri="#people2"/> </item>
    </one-of>
  </rule>
</grammar>

See also

SISR
VoiceXML
