Controlled vocabulary involves using pre-approved terms to classify information, in contrast with natural language vocabulary. It improves search relevance and usability, but it usually requires human intervention to apply, and a vocabulary that is too small can cause a single query to return an unmanageably large volume of information. Examples include search engines and business filing systems.
Controlled vocabulary is a concept in computing and computer programming that involves using only previously agreed-upon or approved terms when constructing relational databases, searchable metadata, or other systems where human-readable words are used to mark up information for later retrieval. Using a controlled vocabulary to classify information is in direct contrast to using a natural language vocabulary, where there are no agreed-upon terms and instead all words used are connected by weighted relationships. In addition to the first-level words used in a controlled vocabulary, helper words can be used so that synonyms or other terms strongly associated with a first-level term trigger the use of that first-level word. The main differences measured between natural language systems and controlled vocabulary systems are the relevance of the results a query returns, the volume of information returned, and the overall usability of the system.
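The relationship between first-level words and helper words can be illustrated with a minimal Python sketch. The vocabulary contents and the names VOCABULARY, LOOKUP, and normalize are hypothetical and chosen only for illustration; a real system would likely store its vocabulary in a database or a thesaurus file.

```python
# Hypothetical controlled vocabulary: each first-level term maps to the
# helper words (synonyms or strongly associated terms) that trigger it.
VOCABULARY = {
    "car": {"automobile", "auto", "sedan"},
    "bicycle": {"bike", "cycle"},
}

# Build a reverse lookup from every accepted word to its first-level term.
LOOKUP = {}
for term, helpers in VOCABULARY.items():
    LOOKUP[term] = term
    for helper in helpers:
        LOOKUP[helper] = term


def normalize(word):
    """Return the first-level term for a word, or None if the word
    is not part of the controlled vocabulary."""
    return LOOKUP.get(word.strip().lower())


print(normalize("Automobile"))  # a helper word resolves to "car"
print(normalize("truck"))       # not in the vocabulary, so None
```

Returning None for unknown words, rather than guessing, reflects the idea that only approved terms may be used; how unknown words should actually be handled is a design decision the sketch leaves open.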
There are many instances where a collection of words or terms is used to make arbitrary, ever-changing, or disorganized information more accessible to users. Search terms within an Internet search engine, a corporate information database, and even a digital research library are all examples of applications through which information can be classified with metadata terms rather than a rigid hierarchical structure. The words used to describe an object in such situations build a sort of searchable index of the larger pool of information.
An example of the use of controlled vocabulary can be seen when considering a filing system for a business. Files must be classified so that they are easily and predictably recoverable. If a file is about cars, it might be filed under the “cars” category. If another person also has a file that deals with cars, then without a controlled vocabulary that file could be placed under a different heading, such as “automobiles”, making it difficult to find both files in one search. When the categories are controlled, all car-related files are placed under one agreed-upon heading.
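The filing scenario can be sketched in Python to show how two people labelling files about the same subject differently end up with split or unified categories. The file names, the labels, and the HEADINGS table are all hypothetical, and the sketch assumes labels are compared case-insensitively.

```python
from collections import defaultdict

# Hypothetical table of approved headings: every accepted label maps to
# the single agreed-upon heading it should be filed under.
HEADINGS = {
    "cars": "cars",
    "automobiles": "cars",
    "autos": "cars",
}


def file_documents(documents, headings=None):
    """Group (file_name, label) pairs by heading.

    Without a headings table, each label is used as typed. With one,
    each label is replaced by its agreed heading, and labels that are
    not in the table are rejected.
    """
    cabinet = defaultdict(list)
    for name, label in documents:
        key = label.strip().lower()
        if headings is not None:
            if key not in headings:
                raise ValueError(f"label not in vocabulary: {label!r}")
            key = headings[key]
        cabinet[key].append(name)
    return dict(cabinet)


documents = [
    ("fleet_report.txt", "cars"),
    ("lease_terms.txt", "automobiles"),
]

uncontrolled = file_documents(documents)
controlled = file_documents(documents, HEADINGS)

print(uncontrolled)  # two separate headings, one file under each
print(controlled)    # both files under the single "cars" heading
```

In the uncontrolled case, a search under “cars” finds only one of the two files; with the headings table applied, one lookup retrieves both.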
The advantage of using a controlled vocabulary is that the information is rigorously described in a predictable way. This means that anyone who knows the vocabulary will be able to search for information effectively and accurately. A complication with controlled vocabulary, however, is that search terms are more difficult, if not impossible, to generate automatically and usually require human intervention, which makes converting an existing database to a controlled vocabulary a labor-intensive task. If the vocabulary is not large enough, there is also the possibility that a single query will return such a large volume of information that sorting through it without another query method is impractical.