What’s Lexical Density?

Lexical density measures the ratio of lexical to function words in a text. It can be used to compare spoken and written vocabularies, and academic and government texts tend to have the highest densities. However, it does not account for different word forms or an individual’s lexical knowledge. Computational linguistics uses statistical analysis to study language but struggles with complex grammar and pragmatics.

Lexical density refers to the ratio of lexical to function words in a given text or collection of texts. It is a concept used in computational linguistics and linguistic analysis. It is related to vocabulary, the set of words an individual knows, and can be used to compare a person’s spoken and written lexicons. The lexicon differs from the total vocabulary in that it does not include function words such as pronouns and particles.

The density of a speech or text is calculated by comparing the number of lexical words with the number of function words. For short sentences and short texts, the ratio can be worked out by simple counting or mental arithmetic. Broader comparisons, such as of the complete works of Charles Dickens or William Shakespeare, are made by feeding the text into a computer program, which sorts the words into function and lexical words.
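
Below is a minimal sketch of what such a program does, assuming a small hand-picked list of English function words; real tools rely on much larger lists or part-of-speech tagging, and the names used here are illustrative only.

```python
import re

# A small, hand-picked set of English function words (illustrative only; a real
# tool would use a far larger list or a part-of-speech tagger).
FUNCTION_WORDS = {
    "the", "a", "an", "and", "or", "but", "of", "to", "in", "on", "at",
    "it", "he", "she", "they", "we", "you", "i", "is", "are", "was", "were",
    "that", "this", "with", "for", "not", "his", "her", "their",
}

def lexical_density(text: str) -> float:
    """Return the share of lexical (content) words among all words in text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    lexical_words = [w for w in words if w not in FUNCTION_WORDS]
    return len(lexical_words) / len(words)
```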

Balanced lexical density is about 50 percent, meaning that half of the words in a sentence are lexical words and half are function words. A low-density text has a share of lexical words below 50 percent, and a high-density text has a share above it. Academic texts and government documents filled with jargon tend to produce the highest densities.
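
As a rough illustration of that benchmark, the sketch above can be applied to a casual sentence and a jargon-heavy one (both example sentences are invented for illustration):

```python
casual = "It was a nice day and we went to the park for a while."
dense = "Fiscal consolidation requires sustained expenditure restraint."

print(f"casual: {lexical_density(casual):.0%}")  # roughly a third lexical words
print(f"dense:  {lexical_density(dense):.0%}")   # all content words, no function words
```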

A flaw in calculating lexical density is that it does not take into account the different forms and cases of the constituent words. The statistical analysis looks only at the relationship between word types; it does not produce a study of an individual’s lexical knowledge. As a result, lexical density analysis treats forms such as “give” and “gave” as distinct items. In theory, lexical density can be applied to texts to study the frequency of certain lexical units.
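
To illustrate that limitation, the toy lemma map below (hand-built and purely hypothetical, not part of any standard tool) shows how inflected forms that a plain type count treats as separate items can be collapsed onto a single lemma:

```python
# Hypothetical, hand-built lemma map; a plain type count treats these four
# forms as four unrelated word types.
LEMMAS = {"gave": "give", "given": "give", "gives": "give", "giving": "give"}

tokens = ["give", "gave", "given", "giving"]
raw_types = set(tokens)                           # 4 distinct types
lemma_types = {LEMMAS.get(t, t) for t in tokens}  # collapses to {"give"}
print(len(raw_types), len(lemma_types))           # prints: 4 1
```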

A person’s written vocabulary can be aided by dictionaries and thesauri, which supply alternative words and clarify meanings. When speaking, a person must rely on his or her mental vocabulary alone. This means that lexical density can be used as a tool for comparing spoken and written lexicons; the lexical density of spoken language tends to be lower than that of written text.

Computational linguistics is a statistical modeling area of linguistic analysis. It was born out of the Cold War and the American desire to use computers to translate Russian text into English. This required mathematics, statistics, artificial intelligence and computer programming. The biggest problem for programmers was getting the computer to handle the complex grammar and pragmatics of a language. This difficulty gave rise to the Chinese Room argument: computers can perform literal translations of words but cannot ultimately understand languages.



