Wednesday, July 27, 2011

Data Mining

Data mining is one step in a larger process called knowledge discovery in databases (KDD), although in the business environment the two terms are used interchangeably. What data mining really does is bring together the strengths of several fields, such as statistics, artificial intelligence, computer graphics, databases and massive data processing, using databases as its main raw material. A traditional definition is: a non-trivial process of identifying valid, novel, potentially useful and understandable patterns that are hidden in the data. From our point of view, we define it as: the integration of a number of fields with the aim of identifying knowledge in databases, oriented toward decision making.

The idea of data mining is not new. Since the sixties, statisticians have used terms such as data fishing, data mining or data archaeology for the idea of finding correlations, without an a priori hypothesis, in noisy databases. In the early eighties, Rakesh Agrawal, Gio Wiederhold, Robert Blum and Gregory Piatetsky-Shapiro, among others, began to consolidate the term data mining. In the late eighties there were only a couple of companies dedicated to this technology; by 2002 there were more than 100 companies in the world offering over 300 solutions. The discussion lists on this topic include researchers from more than eighty countries. This technology has been a good meeting point between people from academia and business.

Data mining is a technology made up of several stages that integrates several fields, and it should not be confused with one large piece of software. During the development of a project of this type, different software applications are used at each stage, mainly statistical, data visualization and artificial intelligence tools. There are currently very powerful commercial data mining applications and tools that contain a wealth of utilities to facilitate the development of a project. However, they almost always end up being complemented by another tool. Data mining is the discovery stage in the KDD process: the step that consists of using specific algorithms to generate a list of patterns from the preprocessed data. Even so, the terms data mining and KDD are often used interchangeably.
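
To make the idea of a discovery stage that works on preprocessed data more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names `preprocess` and `mine_frequent_pairs`, the shopping-basket records and the `min_support` threshold are all hypothetical, and counting frequent item pairs is just one simple example of a pattern-generating algorithm, not the method of any particular tool.

```python
from collections import Counter
from itertools import combinations


def preprocess(raw_rows):
    """Preprocessing stage (hypothetical): normalize items and drop empty records."""
    cleaned = []
    for row in raw_rows:
        # Lowercase, trim whitespace, skip blank entries, remove duplicates.
        items = {item.strip().lower() for item in row if item and item.strip()}
        if items:
            cleaned.append(items)
    return cleaned


def mine_frequent_pairs(transactions, min_support):
    """Discovery stage (hypothetical): list the item pairs that occur
    together in at least min_support transactions."""
    counts = Counter()
    for items in transactions:
        # Sort so that each pair is always counted under the same key.
        counts.update(combinations(sorted(items), 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Made-up raw data with noise: mixed case, stray spaces, blanks, an empty record.
raw = [
    ["Bread", " milk "],
    ["bread", "Milk", "eggs"],
    ["milk", "", "eggs"],
    [],
]

patterns = mine_frequent_pairs(preprocess(raw), min_support=2)
print(patterns)
```

In a real project each of these two functions would typically be replaced by a dedicated tool, which is why a single piece of software rarely covers the whole process.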
