What’s an acoustic model?

An acoustic model is a map of voice in relation to printed words used in speech recognition programs. It is one of two main files needed, along with a language model. Speech recognition software requires a microphone and sound processing program. Acoustic models can recognize variations in pronunciation and interpret multiple languages. Computational linguistics is the field that develops speech recognition technology. Acoustic models can also be used in music and psychoacoustics.

An acoustic model is essentially a map between the sounds of a voice and a set of printed words. Speech recognition programs use it to help a computer learn to recognize patterns in a person’s speech. An acoustic model is one of the two main files needed to run a speech recognition program; the other is the language model, which indicates which words and word sequences the speaker is likely to use. Acoustic models are created by comparing the sound details of a spoken audio file with the text of the spoken words.
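The way the two models work together can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the probability tables are invented numbers, and best_word is a hypothetical helper, not part of any real speech recognition product. It only shows the general idea of weighing how well a word matches the sound against how likely the word is in context.

```python
import math

# Hypothetical acoustic model output: how well each candidate word
# matches one stretch of audio, written as P(audio | word).
ACOUSTIC_LIKELIHOOD = {"two": 0.40, "too": 0.38, "to": 0.22}

# Hypothetical language model: P(word | previous word).
LANGUAGE_MODEL = {
    ("want", "to"): 0.70,
    ("want", "two"): 0.25,
    ("want", "too"): 0.05,
}


def best_word(previous_word, acoustic, language_model):
    """Pick the candidate with the highest combined log score."""

    def score(word):
        # Word pairs missing from the table get a tiny fallback probability.
        prior = language_model.get((previous_word, word), 1e-6)
        return math.log(acoustic[word]) + math.log(prior)

    return max(acoustic, key=score)


# The sound alone slightly favors "two", but after "want" the
# language model shifts the decision to "to".
print(max(ACOUSTIC_LIKELIHOOD, key=ACOUSTIC_LIKELIHOOD.get))
print(best_word("want", ACOUSTIC_LIKELIHOOD, LANGUAGE_MODEL))
```

In this toy case the three candidates sound nearly identical, so the acoustic scores are close and the language model decides the outcome. Real programs apply the same principle across whole sentences rather than single words.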

Speech recognition software is designed to recognize and transcribe or respond to words spoken by a person. Many operating systems include basic built-in speech recognition features that the user can turn on and off. These features usually let the user control the computer and type words on the screen using his or her voice.

To use speech recognition software, a user needs a microphone to transmit his or her voice to the computer, as well as a program that processes sound. While many computers have built-in microphones, an external headset microphone gives the user clearer voice sound and the freedom to move around the room while speaking. Standalone speech recognition software brands include LumenVox®, Loquendo® and Dragon®.

Most speech recognition programs include an acoustic model that allows the program to recognize variations in pronunciation. They use patterns in the sound of the speaker’s voice to identify words in speech. Many come with configuration software that helps the user create an acoustic model tuned to his or her own voice. Some advanced speech recognition programs can identify and interpret multiple languages, often from a small amount of sound information. The more advanced a speech recognition program, the more likely it is to accurately interpret a word based on its context, including where the word occurs in a sentence.

The field of study that develops speech recognition technology is called computational linguistics. Computational linguistics involves the study and design of software that is programmed to understand human language. This field often incorporates information from the study of psychology to create acoustic models that can more accurately interpret speech.

The word “acoustics” generally refers to anything that has to do with sound. While acoustic models are most often used in speech recognition, they can also be used in music. An acoustic model of a music track can identify properties such as beats per minute, musical keys, or dominant pitches in the music. This information can be used by a computer program to identify a music track, or it can be used to roughly determine the genre into which the music is likely to be classified. Acoustic models are also being used in a field of study called psychoacoustics, where researchers hope to learn how to structure music that predictably affects the brain.
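As a rough illustration of how a program might find a dominant pitch, the Python sketch below measures the strength of each frequency in a short sound and reports the strongest one. It is a minimal sketch under stated assumptions: dominant_frequency is a hypothetical helper, the input is a synthetic 440 Hz tone rather than real music, and the slow, direct Fourier sum is used only to keep the example self-contained.

```python
import cmath
import math


def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the strongest component in samples."""
    n = len(samples)
    best_bin, best_magnitude = 0, 0.0
    # Check each frequency bin below half the sample rate, skipping bin 0
    # (the constant offset of the signal).
    for k in range(1, n // 2):
        coefficient = sum(
            sample * cmath.exp(-2j * math.pi * k * i / n)
            for i, sample in enumerate(samples)
        )
        if abs(coefficient) > best_magnitude:
            best_bin, best_magnitude = k, abs(coefficient)
    return best_bin * sample_rate / n


# Assumed test signal: 0.1 seconds of a pure 440 Hz tone at 8000 samples/s.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 440 * i / SAMPLE_RATE) for i in range(800)]

print(dominant_frequency(tone, SAMPLE_RATE))
```

With 800 samples at 8000 samples per second, each bin is 10 Hz wide, so the result is only as precise as that spacing. Real music contains many overlapping pitches, so practical tools use faster transforms and more careful analysis than this.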



