Recognition

Parameters related to speech recognition.

Warning

For settings made in the file /opt/cpqd/asr/config/engine/engine.conf, do not leave blank spaces around the equals sign “=”.
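The warning above can be checked mechanically before the engine is restarted. The following is a minimal sketch, not part of the product: the function name lines_with_spaced_equals is hypothetical, and it assumes that only the first "=" on each line separates the parameter name from its value.

```python
import re

# Matches whitespace immediately before or after the first "=" on a line.
_SPACED_EQUALS = re.compile(r"^[^=]*(\s=|=\s)")


def lines_with_spaced_equals(conf_text):
    """Return 1-based line numbers whose first "=" has whitespace next to it."""
    bad = []
    for number, line in enumerate(conf_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and lines that contain no assignment.
        if not stripped or "=" not in stripped:
            continue
        if _SPACED_EQUALS.search(stripped):
            bad.append(number)
    return bad


sample = "--textify.enabled=true\n--lm.timeToLive = 60\n--am.models= default\n"
print(lines_with_spaced_equals(sample))  # [2, 3]
```

Spaces elsewhere in the value, such as those after the commas in a hints.words list, are not flagged.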

hints.words

Description: Adds new words to the Language Model or increases the probability of existing words appearing.

Values: Word list with or without boost or pronunciation attributes. Default: empty.

Format:

<word>:<boost> [<pronunciation>], <word>:<boost> [<pronunciation>], ...

where <boost> and <pronunciation> are optional.

Important:

  • The pronunciation must always be in brackets [ ], and there cannot be any blank spaces between the brackets and the pronunciation written inside them.

  • Each word, with its attributes, must be separated from the next by a comma (,).

  • Each word can only be assigned one pronunciation. If you wish to add more than one pronunciation per word, you will need to repeat the word.

  • Words can only contain letters and dashes.

Location: /opt/cpqd/asr/config/engine/engine.conf, API

Example:

--hints.words=mexirica:1.8 [mixirica], siciliano:2, castanha-do-pará
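When the word list is generated by a client application, the formatting rules above can be enforced in code. This is only a sketch under stated assumptions: format_hint and format_hints are hypothetical helper names, and "letters" is interpreted as any Unicode letter, so accented words such as castanha-do-pará are accepted.

```python
import re

# Letters (including accented ones) and dashes only, per the rules above.
_WORD = re.compile(r"(?:[^\W\d_]|-)+")


def format_hint(word, boost=None, pronunciation=None):
    """Format one entry; boost and pronunciation are both optional."""
    if not _WORD.fullmatch(word):
        raise ValueError(f"word may contain only letters and dashes: {word!r}")
    entry = word
    if boost is not None:
        entry += f":{boost}"
    if pronunciation is not None:
        # No blank spaces are allowed between the brackets and the pronunciation.
        if pronunciation != pronunciation.strip():
            raise ValueError("pronunciation must not start or end with a space")
        entry += f" [{pronunciation}]"
    return entry


def format_hints(entries):
    """Join (word, boost, pronunciation) tuples into a comma-separated list."""
    return ", ".join(format_hint(*entry) for entry in entries)


value = format_hints([
    ("mexirica", 1.8, "mixirica"),
    ("siciliano", 2, None),
    ("castanha-do-pará", None, None),
])
print("--hints.words=" + value)
```

To give a word more than one pronunciation, pass it as several tuples, one per pronunciation.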

textify.enabled

Description: Enables the automatic formatting of numbers, dates, times, etc.

Values: ‘true’ or ‘false’. Default value: ‘false’.

Location: /opt/cpqd/asr/config/engine/engine.conf, API

Example:

--textify.enabled=true

am.models

Description: Indicates which acoustic models must be loaded by the ASR engine upon startup. Currently, only one model can be defined. The value entered here must be one of the directories contained in /opt/cpqd/asr/lang.

Values: Text. Default value: default.

Location: /opt/cpqd/asr/config/engine/engine.conf

Example:

--am.models=default

lm.preloadModels

Description: Indicates which language models must be loaded by the ASR engine upon startup. Models loaded here are never unloaded from memory and can be updated only by restarting the ASR. It is generally used to load the free speech model or a very large grammar.

Values: The value must be a comma-separated model list.

Location: /opt/cpqd/asr/config/engine/engine.conf

Example:

--lm.preloadModels=builtin:slm/general

lm.timeToLive

Description: Maximum time a language model remains in memory before it is unloaded. This is the model’s lifetime in memory.

Values: Integer value in minutes. Default value: 60

Location: /opt/cpqd/asr/config/engine/engine.conf

Example:

--lm.timeToLive=60

lm.timeToIdle

Description: Maximum idle time, that is, the time after which a language model is unloaded from memory if it has not been used for recognition.

Values: Integer value in minutes. Default value: 10

Location: /opt/cpqd/asr/config/engine/engine.conf

Example:

--lm.timeToIdle=10
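The interaction between lm.timeToLive and lm.timeToIdle can be pictured with a small sketch. It is an illustration only, not the engine's implementation: should_unload is a hypothetical function, it assumes the two limits apply independently, and it assumes a model is unloaded as soon as either limit is reached. Models listed in lm.preloadModels are never unloaded, so they are excluded up front.

```python
def should_unload(now_min, loaded_at_min, last_used_at_min,
                  time_to_live=60, time_to_idle=10, preloaded=False):
    """Return True if a model would be unloaded under the assumed rules.

    All times are in minutes, matching the units of both parameters.
    """
    if preloaded:
        # Preloaded models stay in memory until the ASR is restarted.
        return False
    age = now_min - loaded_at_min          # time since the model was loaded
    idle = now_min - last_used_at_min      # time since the last recognition
    return age >= time_to_live or idle >= time_to_idle


print(should_unload(30, 0, 25))   # recently used and young: kept
print(should_unload(30, 0, 15))   # idle for 15 minutes: unloaded
print(should_unload(65, 0, 64))   # older than 60 minutes: unloaded
```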

decoder.partialResultEnabled

Description: Indicates whether partial results are enabled. A partial result is text recognized before the audio has been completely received.

Values: ‘true’ or ‘false’. Default value: ‘false’.

Location: /opt/cpqd/asr/config/engine/engine.conf

Example:

--decoder.partialResultEnabled=false

decoder.partialResultInterval

Description: Indicates the time interval for generating a partial result.

Values: Integer value in milliseconds. Default value: 1000

Location: /opt/cpqd/asr/config/engine/engine.conf

Example:

--decoder.partialResultInterval=1000

noInputTimeout.enabled

Description: Enables the noInputTimeout timer for all recognitions.

Values: ‘true’ or ‘false’. Default value: ‘true’.

Location: /opt/cpqd/asr/config/engine/engine.conf, API

Example:

--noInputTimeout.enabled=true

noInputTimeout.value

Description: Maximum time to wait for speech to begin. After this period, the system ends the recognition and returns NO_INPUT_TIMEOUT.

Values: Value in milliseconds. Default value: 10000.

Location: /opt/cpqd/asr/config/engine/engine.conf, API

Example:

--noInputTimeout.value=10000

recognitionTimeout.enabled

Description: Enables the recognitionTimeout timer for all recognitions.

Values: ‘true’ or ‘false’. Default value: ‘true’.

Location: /opt/cpqd/asr/config/engine/engine.conf, API

Example:

--recognitionTimeout.enabled=true

recognitionTimeout.value

Description: Maximum time to wait for speech recognition results. If the recognition has not finished by the end of the defined period, the system will end the recognition and return RECOGNITION_TIMEOUT.

Values: Integer value in milliseconds. Default value: 30000

Location: /opt/cpqd/asr/config/engine/engine.conf, API

Example:

--recognitionTimeout.value=30000

decoder.confidenceThreshold

Description: Minimum confidence a recognition result must reach to be considered valid. Results below this value return NO_MATCH.

Values: Integer from 0 to 100. Default value: 30.

Location: /opt/cpqd/asr/config/engine/engine.conf, API

Example:

--decoder.confidenceThreshold=30
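The threshold acts as a simple filter on recognition confidence. The sketch below shows the idea; is_valid_result is a hypothetical name, the sample alternatives are invented for illustration, and treating a confidence equal to the threshold as valid is an assumption.

```python
def is_valid_result(confidence, threshold=30):
    """Return True when the confidence reaches the threshold (both 0 to 100)."""
    if not 0 <= threshold <= 100:
        raise ValueError("threshold must be between 0 and 100")
    # Assumption: a confidence equal to the threshold counts as valid.
    return confidence >= threshold


# Hypothetical (text, confidence) alternatives, invented for illustration.
alternatives = [("siciliano", 82), ("mexirica", 30), ("castanha", 12)]
accepted = [text for text, conf in alternatives if is_valid_result(conf)]
print(accepted or "NO_MATCH")
```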

decoder.startInputTimers

Description: Automatically starts the enabled timers (noInputTimeout and recognitionTimeout) when recognition starts. If set to ‘false’, the enabled timers are started only when the START INPUT TIMERS message is received.

Values: ‘true’ or ‘false’. Default value: ‘true’.

Location: /opt/cpqd/asr/config/engine/engine.conf, API

Example:

--decoder.startInputTimers=true
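The two timers described in this section can be summarized in a short sketch. It is a simplified model rather than the engine's logic: timeout_status is a hypothetical function, all times are measured in milliseconds from the moment the timers start, and checking noInputTimeout before recognitionTimeout is an assumption.

```python
def timeout_status(speech_start_ms, result_ready_ms,
                   no_input_ms=10000, recognition_ms=30000,
                   no_input_enabled=True, recognition_enabled=True):
    """Return the timeout status that would apply, or None if neither fires.

    speech_start_ms / result_ready_ms are offsets from timer start;
    None means the event never happened.
    """
    if no_input_enabled and (speech_start_ms is None
                             or speech_start_ms > no_input_ms):
        return "NO_INPUT_TIMEOUT"
    if recognition_enabled and (result_ready_ms is None
                                or result_ready_ms > recognition_ms):
        return "RECOGNITION_TIMEOUT"
    return None


print(timeout_status(None, None))     # no speech at all
print(timeout_status(2000, 40000))    # speech started, result too late
print(timeout_status(2000, 5000))     # finished in time
```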

decoder.maxSentences

Description: Maximum number of probable results generated by the recognition (alternative sentences).

Values: Integer number greater than zero. Default value: 1.

Location: /opt/cpqd/asr/config/engine/engine.conf, API

Example:

--decoder.maxSentences=1

decoder.continuousMode

Description: Enables continuous mode recognition.

Values: ‘true’ or ‘false’. Default value: ‘false’.

Location: /opt/cpqd/asr/config/engine/engine.conf, API

Example:

--decoder.continuousMode=true

decoder.wordDetails

Description: Controls whether word details are displayed.

Values: Integer (0, 1, 2). Default value: ‘1’.

  0. no details,

  1. only the first n-best result,

  2. all the n-best results.

Location: /opt/cpqd/asr/config/engine/engine.conf, API

Example:

--decoder.wordDetails=1
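In client code, the three levels are easier to read as named constants. The sketch below assumes that the three options listed above correspond, in order, to the values 0, 1 and 2; the WordDetails enum and word_details_option helper are hypothetical names.

```python
from enum import IntEnum


class WordDetails(IntEnum):
    """Assumed mapping of decoder.wordDetails values to their meanings."""
    NONE = 0         # no details
    FIRST_ONLY = 1   # only the first n-best result
    ALL = 2          # all the n-best results


def word_details_option(level):
    """Build the configuration line; rejects values outside 0, 1, 2."""
    return f"--decoder.wordDetails={int(WordDetails(level))}"


print(word_details_option(WordDetails.ALL))
```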

loggingTag

Description: A tag provided by the client application that is registered in the CPQD ASR logs to track a given interaction with a user. The Logging-Tag is defined by the client application and sent to the CPQD ASR through the MRCP, REST, and WS APIs.

Values: The ID the user wishes to register in the logs.

Location: API

Example:

--loggingTag=CompanhiaTelecom