Speech Detection
Parameters for configuring the speech segment detection.
endpointer.useToneDetectors
Description: Enables the suppression of telephone tones in the recognition.
Values: “true” or “false”. Default value: “true”.
Location: /opt/cpqd/asr/config/engine/engine.conf, API
Example:
--endpointer.useToneDetectors=true
endpointer.enabled
Description: Enables speech segment detection. If enabled, only the segment containing speech is processed and any surrounding silence is ignored. Otherwise, all the recorded audio is processed, increasing the time spent on recognition. Only when enabled will the starting and ending points of the speech be generated.
Values: “true” or “false”. Default value: “true”.
Location: /opt/cpqd/asr/config/engine/engine.conf
Example:
--endpointer.enabled=true
endpointer.headMargin
Description: Period of silence placed at the beginning of the speech segment.
Values: Integer number in milliseconds. Default value: 200.
Location: /opt/cpqd/asr/config/engine/engine.conf, API
Example:
--endpointer.headMargin=200
endpointer.tailMargin
Description: Period of silence placed at the end of the speech segment.
Values: Integer number in milliseconds. Default value: 400.
Location: /opt/cpqd/asr/config/engine/engine.conf, API
Example:
--endpointer.tailMargin=400
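As a rough illustration of how the two margins widen a detected speech segment, the sketch below pads a detected interval by the head and tail margins. The function name and the clamping to the audio bounds are assumptions made for this example; they are not taken from the engine.

```python
def apply_margins(speech_start_ms, speech_end_ms, audio_len_ms,
                  head_margin_ms=200, tail_margin_ms=400):
    """Pad a detected speech interval with head and tail margins.

    Hypothetical helper: the defaults mirror endpointer.headMargin and
    endpointer.tailMargin, but the engine's real behavior may differ.
    """
    # Assumption: the padded segment never extends past the recorded audio.
    start = max(0, speech_start_ms - head_margin_ms)
    end = min(audio_len_ms, speech_end_ms + tail_margin_ms)
    return start, end


# Speech detected from 1000 ms to 2500 ms in a 5000 ms recording.
print(apply_margins(1000, 2500, 5000))
```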
endpointer.waitEnd
Description: Duration of continuous silence after which the end of speech is assumed.
Values: Integer number in milliseconds. Default value: 1000.
Location: /opt/cpqd/asr/config/engine/engine.conf, API
Example:
--endpointer.waitEnd=1000
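The effect of this timeout can be sketched with a simple frame counter. This is only a minimal illustration, assuming the audio has already been classified frame by frame as speech or silence and that each frame lasts 10 ms; the function name and the frame size are hypothetical, not part of the engine.

```python
def find_end_of_speech(frame_is_speech, frame_ms=10, wait_end_ms=1000):
    """Return the index of the frame at which end of speech is declared.

    Returns None if the trailing silence never reaches wait_end_ms.
    Hypothetical helper: wait_end_ms mirrors endpointer.waitEnd.
    """
    silence_ms = 0
    speech_seen = False
    for index, is_speech in enumerate(frame_is_speech):
        if is_speech:
            # Any speech frame resets the silence counter.
            speech_seen = True
            silence_ms = 0
        elif speech_seen:
            # Only silence that follows speech counts toward the timeout.
            silence_ms += frame_ms
            if silence_ms >= wait_end_ms:
                return index
    return None


# 200 ms of leading silence, 500 ms of speech, 1500 ms of trailing silence.
frames = [False] * 20 + [True] * 50 + [False] * 150
print(find_end_of_speech(frames))
```

A shorter value ends recognition sooner after the speaker stops, at the risk of cutting off speakers who pause in the middle of a sentence.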
endpointer.levelMode
Description: Selects how the amplitude threshold below which the signal is treated as silence is calculated.
Values: Number (0, 1 or 2). Default value: 2.
0: Off. Ignores amplitude.
1: Automatic. Uses the mean amplitude at the beginning of the audio, over a duration of “endpointer.autoLevelLen”, added to the fixed percentage defined by “endpointer.levelThreshold”.
2: Fixed. Percentage threshold defined by “endpointer.levelThreshold”.
Location: /opt/cpqd/asr/config/engine/engine.conf, API
Example:
--endpointer.levelMode=2
endpointer.levelThreshold
Description: Percentage of the signal amplitude to be considered silence. Used only when endpointer.levelMode = 2 or endpointer.levelMode = 1. For example, with endpointer.levelMode = 2 and endpointer.levelThreshold = 10, speech is detected only when the signal is greater than 10% of the maximum amplitude. With endpointer.levelMode = 1, the mean amplitude of the initial audio segment is added to that 10% of the maximum amplitude.
Values: Integer number between 0 and 100. Default value: 5.
Location: /opt/cpqd/asr/config/engine/engine.conf, API
Example:
--endpointer.levelThreshold=5
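The interaction between endpointer.levelMode, endpointer.levelThreshold and endpointer.autoLevelLen can be summarized in a small sketch. It is an illustration under stated assumptions, not the engine's implementation: the function name is hypothetical, amplitudes are taken as absolute sample values, and the "maximum amplitude" is assumed to be the full scale of 16-bit PCM (32767).

```python
def silence_threshold(samples, sample_rate, level_mode=2, level_threshold=5,
                      auto_level_len_ms=300, max_amplitude=32767):
    """Return the amplitude below which the signal counts as silence.

    Returns None when level_mode is 0 (amplitude is ignored).
    Assumption: max_amplitude is the 16-bit PCM full scale.
    """
    if level_mode == 0:
        return None
    # Fixed part: a percentage of the maximum amplitude.
    fixed = max_amplitude * level_threshold / 100.0
    if level_mode == 2:
        return fixed
    if level_mode == 1:
        # Automatic part: mean amplitude of the first auto_level_len_ms.
        count = int(sample_rate * auto_level_len_ms / 1000)
        head = samples[:count]
        mean_amplitude = sum(abs(s) for s in head) / len(head) if head else 0.0
        return mean_amplitude + fixed
    raise ValueError("level_mode must be 0, 1 or 2")


# One second of constant low-level noise sampled at 8 kHz.
noise = [100] * 8000
print(silence_threshold(noise, 8000, level_mode=2, level_threshold=10))
print(silence_threshold(noise, 8000, level_mode=1, level_threshold=10))
```

With mode 1 the threshold rises with the background noise measured at the start of the audio, so the automatic mode assumes the recording begins with noise rather than speech.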
endpointer.autoLevelLen
Description: Length of the initial audio segment used to calculate the silence threshold. Used only when endpointer.levelMode = 1.
Values: Integer number in milliseconds. Default value: 300.
Location: /opt/cpqd/asr/config/engine/engine.conf, API
Example:
--endpointer.autoLevelLen=300
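To check a set of overrides before deploying them, the documented defaults and value ranges can be collected in one place. The sketch below is a hypothetical helper that parses lines in the `--name=value` form used in the examples above. It is not part of the product, and rejecting negative millisecond values is an assumption.

```python
# Defaults as documented in this section.
DEFAULTS = {
    "endpointer.useToneDetectors": True,
    "endpointer.enabled": True,
    "endpointer.headMargin": 200,
    "endpointer.tailMargin": 400,
    "endpointer.waitEnd": 1000,
    "endpointer.levelMode": 2,
    "endpointer.levelThreshold": 5,
    "endpointer.autoLevelLen": 300,
}


def parse_options(lines):
    """Parse '--name=value' lines and return the resulting settings."""
    config = dict(DEFAULTS)
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if not line.startswith("--") or "=" not in line:
            raise ValueError(f"malformed option: {raw!r}")
        key, value = (part.strip() for part in line[2:].split("=", 1))
        if key not in DEFAULTS:
            raise ValueError(f"unknown parameter: {key}")
        if isinstance(DEFAULTS[key], bool):
            if value not in ("true", "false"):
                raise ValueError(f"{key} must be true or false")
            config[key] = value == "true"
        else:
            number = int(value)
            if number < 0:  # assumption: negative values are invalid
                raise ValueError(f"{key} must not be negative")
            config[key] = number
    if config["endpointer.levelMode"] not in (0, 1, 2):
        raise ValueError("endpointer.levelMode must be 0, 1 or 2")
    if not 0 <= config["endpointer.levelThreshold"] <= 100:
        raise ValueError("endpointer.levelThreshold must be between 0 and 100")
    return config


settings = parse_options([
    "--endpointer.levelMode=1",
    "--endpointer.levelThreshold=10",
])
print(settings["endpointer.levelMode"], settings["endpointer.autoLevelLen"])
```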