Dirty data is inaccurate electronic data caused by errors or intentional omissions. It can be corrected by updating or replacing outdated information. Some dirty data is created intentionally, and some is left uncorrected because it has no impact on business function.
Dirty data is a term used to describe any type of electronic data that is obsolete, incomplete, or otherwise inaccurate. Data of this type can be created due to data entry errors, failure to update data periodically, or even entering the same data more than once. Sometimes, erroneous data is nothing more than punctuation errors in the text of electronic documents. In other cases, dirty data can be intentionally misleading information, such as attempts to change accounting records to present a specific image to investors and others.
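As a rough illustration of two of the categories above, the following Python sketch flags incomplete and duplicate records in a small in-memory list. The field names (`name`, `email`) and the function `find_dirty_records` are hypothetical, chosen only for this example; a real database would have its own schema and rules.

```python
REQUIRED_FIELDS = ("name", "email")  # hypothetical required fields


def find_dirty_records(records):
    """Return (incomplete, duplicates) as lists of record indexes."""
    incomplete = []
    duplicates = []
    seen = set()
    for index, record in enumerate(records):
        # A record is incomplete if any required field is missing or blank.
        if any(not str(record.get(field) or "").strip() for field in REQUIRED_FIELDS):
            incomplete.append(index)
            continue
        # Compare on a normalized key so differences in case or stray
        # whitespace do not hide a repeated entry.
        key = tuple(str(record[field]).strip().lower() for field in REQUIRED_FIELDS)
        if key in seen:
            duplicates.append(index)
        else:
            seen.add(key)
    return incomplete, duplicates
```

The first occurrence of a record is treated as the original and only later copies are reported as duplicates, which is one possible policy among several.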
For the most part, the accumulation of dirty data in any type of database is unintentional. Individuals entering new information into the database may misspell words, omit punctuation that is important to understanding the intent of the text, or fail to follow the required format. In situations like this, fixing the error is a relatively simple process that requires nothing more than editing the affected text and saving the changes. Companies sometimes manage this process by proofreading the data after it has been entered and making any necessary updates.
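Some formatting inconsistencies of this kind can be reduced by normalizing values as they are entered. The sketch below is a minimal example using only the Python standard library; the helper names `normalize_name` and `normalize_phone` are hypothetical, and the rules they apply (title case for names, digits only for phone numbers) are assumptions rather than a universal standard.

```python
import re


def normalize_name(raw):
    """Collapse repeated whitespace and apply title case."""
    return " ".join(raw.split()).title()


def normalize_phone(raw):
    """Keep digits only, so differently punctuated numbers compare equal."""
    return re.sub(r"\D", "", raw)
```

Title-casing is a crude rule that mishandles some names (for example, "McDonald" becomes "Mcdonald"), so a step like this complements manual review rather than replacing it.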
Dirty data can also arise when existing records are not updated after information changes. For example, if salespeople fail to update customer files when personnel changes occur at a particular customer, those files are no longer accurate and are considered dirty. As with correcting spelling and punctuation errors, taking the time to remove outdated information and replace it with current data increases the overall usability of the database.
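One way to find records that may have gone out of date is to flag those that have not been touched for a long time. The following sketch assumes each record carries a hypothetical `id` and a `last_updated` date, and uses an arbitrary one-year threshold; age is only a proxy, since an old record can still be correct and a recent one can already be wrong.

```python
from datetime import date, timedelta


def find_stale_records(records, today, max_age_days=365):
    """Return the ids of records not updated within max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [r["id"] for r in records if r["last_updated"] < cutoff]
```

Records flagged this way are candidates for review, for example by asking the responsible salesperson to confirm the customer's current contacts.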
There are situations where dirty data is created intentionally. Businesses may choose to omit specific information from a database in order to create a particular perception about finances, such as highlighting the amount of revenue generated for a given period while choosing not to enter data on the amount of revenue actually collected for the same period. With this type of dirty data, the information presented may be accurate as far as it goes, but the record as a whole is incomplete.
With some types of dirty data, a business may decide not to put time and effort into making corrections. This is common when the bad data has no impact on the ability of the business to function properly and no potential to cause major disruption. As a result, virtually any entity that maintains some kind of database probably has at least some dirty data interspersed with its up-to-date and accurate information.