Data mining and data warehousing are often confused, but have different designs, methodologies, and purposes. Data mining uses pattern recognition to identify trends in a sample set, while data warehousing extracts and stores data for easier reporting. Data mining is used for targeted marketing and fraud detection, while data warehousing is part of business intelligence. The specifications used to define the sample set impact the relevance and accuracy of the analysis.
The terms data mining and data warehousing are often confused by both business and technical personnel. The field of data management has grown rapidly as data collection software has become widespread and the cost of computer memory has fallen. Both functions exist to provide tools and methodologies for exploring the patterns and meaning in large amounts of data.
The main differences between data mining and data warehousing are the design of the system, the methodology used, and the purpose. Data mining is the use of pattern recognition logic to identify trends within a sample data set and extrapolate those trends to the larger data pool. Data warehousing is the process of extracting and storing data to enable easier reporting.
Data mining is an umbrella term used to describe a set of business processes that derive models from data. Typically, a statistical analysis software package is used to identify specific patterns, based on the data set and the queries generated by the end user. Typical uses of data mining include creating targeted marketing programs, identifying financial fraud, and reporting unusual behavior patterns as part of a security review.
An excellent example of data mining is the process used by telephone companies to market products to existing customers. The telephone company uses data mining software to access its database of customer information. A query is written to identify customers who have subscribed to the basic phone package and Internet service in a specific time interval. Once this dataset is selected, another query is written to determine how many of these customers used free additional phone features during a trial promotion. The results of this data mining exercise reveal patterns of behavior that can guide or help refine a marketing plan to increase the use of additional phone services.
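The two queries in the telephone company example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Customer` fields, the package name "basic", the sample records, and the date range are all hypothetical and do not come from any real telephone company system.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Customer:
    # Hypothetical fields, chosen only to mirror the example in the text.
    customer_id: int
    package: str
    has_internet: bool
    signup_date: date
    used_trial_features: bool


def select_cohort(customers, start, end):
    """First query: basic-package customers with Internet service
    who subscribed within the interval [start, end]."""
    return [
        c for c in customers
        if c.package == "basic"
        and c.has_internet
        and start <= c.signup_date <= end
    ]


def trial_usage_rate(cohort):
    """Second query: share of the cohort that used the trial features."""
    if not cohort:
        return 0.0
    return sum(c.used_trial_features for c in cohort) / len(cohort)


# Made-up sample records for illustration only.
CUSTOMERS = [
    Customer(1, "basic", True, date(2023, 1, 15), True),
    Customer(2, "basic", True, date(2023, 2, 3), False),
    Customer(3, "premium", True, date(2023, 1, 20), True),
    Customer(4, "basic", False, date(2023, 2, 10), True),
    Customer(5, "basic", True, date(2023, 6, 1), True),
    Customer(6, "basic", True, date(2023, 3, 30), True),
]

cohort = select_cohort(CUSTOMERS, date(2023, 1, 1), date(2023, 3, 31))
rate = trial_usage_rate(cohort)
print(f"{len(cohort)} customers in cohort, trial usage rate {rate:.0%}")
```

In practice both steps would more likely be queries run by a data mining package against the customer database; plain Python lists are used here only to keep the sketch self-contained.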
The primary purpose of data mining is to find patterns in the data. The specifications used to define the sample set strongly affect both the relevance of the output and the accuracy of the analysis. In the example above, if the data set is limited to customers within a specific geographic area, the results and patterns will differ from those found in a larger data set. While both data mining and data warehousing work with large volumes of information, the processes used are quite different.
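The effect of the sample definition can be shown with a small sketch. The regions and usage flags below are invented purely for illustration; the point is only that the same calculation gives a different answer when the sample is restricted to one geographic area.

```python
# Each record is (region, used_additional_features). Made-up data.
RECORDS = [
    ("north", True), ("north", True), ("north", False),
    ("south", False), ("south", False), ("south", True),
    ("east", False), ("east", False),
]


def usage_rate(rows, region=None):
    """Share of customers who used the additional features.
    If region is given, only customers in that region are sampled."""
    selected = [used for r, used in rows if region is None or r == region]
    if not selected:
        return 0.0
    return sum(selected) / len(selected)


overall = usage_rate(RECORDS)            # all regions together
north_only = usage_rate(RECORDS, "north")  # sample limited to one area

print(f"all regions: {overall:.1%}, north only: {north_only:.1%}")
```

A marketing plan built on the north-only figure would overstate interest across the full customer base in this made-up data, which is why the sample specification needs to match the population the conclusions will be applied to.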
A data warehouse is a software product used to store large volumes of data and run specially designed queries and reports. Business intelligence is a growing field of study that focuses on data warehousing and related capabilities. These tools are designed to extract data and store it in a structure that improves query and reporting performance. Much of the terminology in data mining and data warehousing is the same, which adds to the confusion.
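The extract-and-store idea can be sketched with Python's built-in `sqlite3` module. This is a toy stand-in for a warehouse, not how any particular product works: the `calls` and `monthly_usage` tables, their columns, and the sample rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical operational table: one row per phone call.
conn.execute(
    "CREATE TABLE calls (customer_id INTEGER, call_date TEXT, minutes REAL)"
)
conn.executemany(
    "INSERT INTO calls VALUES (?, ?, ?)",
    [
        (1, "2023-01-05", 10.0),
        (1, "2023-01-20", 5.0),
        (2, "2023-01-11", 7.5),
        (1, "2023-02-02", 3.0),
        (2, "2023-02-15", 12.0),
    ],
)

# Extract and store: pre-aggregate the raw rows into a reporting table
# so that reports do not have to scan every individual call.
conn.execute(
    """
    CREATE TABLE monthly_usage AS
    SELECT customer_id,
           substr(call_date, 1, 7) AS month,
           SUM(minutes)            AS total_minutes,
           COUNT(*)                AS call_count
    FROM calls
    GROUP BY customer_id, substr(call_date, 1, 7)
    """
)
conn.execute("CREATE INDEX idx_monthly_usage_month ON monthly_usage (month)")

# Report: total minutes per month, read from the stored summary.
report = conn.execute(
    "SELECT month, SUM(total_minutes) FROM monthly_usage "
    "GROUP BY month ORDER BY month"
).fetchall()

for month, minutes in report:
    print(month, minutes)
```

The report query reads the small summary table rather than the raw call records, which is the kind of performance gain the storage design is meant to provide. Note that this step only organizes data for reporting; it does not look for patterns, which is the job of data mining.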