Data profiling

Data profiling refers to the largely automated process of analyzing existing data (e.g., in a database) using a range of analytical techniques. Data profiling validates the existing metadata against the actual data and identifies new metadata. In addition, it validates known data quality problems, identifies the data that cause them, and measures the information quality of the analyzed data. Data profiling does not correct quality problems in the data itself; it only corrects the associated metadata.
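As a rough illustration of what validating metadata against the actual data can look like, the following minimal sketch checks a handful of records against a declared schema. The column names, the `declared` structure, and the `validate_metadata` helper are hypothetical and chosen only for this example; they are not taken from any particular tool. In line with the description above, the function only reports findings and leaves the data unchanged.

```python
# Hypothetical declared metadata: column name -> (expected type, nullable).
declared = {
    "customer_id": (int, False),
    "email": (str, False),
    "age": (int, True),
}

# Hypothetical sample data standing in for rows read from a database table.
rows = [
    {"customer_id": 1, "email": "a@example.com", "age": 34},
    {"customer_id": 2, "email": None, "age": None},
    {"customer_id": "3", "email": "c@example.com", "age": 51},
]


def validate_metadata(rows, declared):
    """Return (row_index, column, problem) findings without modifying the data."""
    findings = []
    for index, row in enumerate(rows):
        for column, (expected_type, nullable) in declared.items():
            value = row.get(column)
            if value is None:
                # A null is only a finding if the metadata says it cannot occur.
                if not nullable:
                    findings.append((index, column, "unexpected null"))
            elif not isinstance(value, expected_type):
                problem = f"expected {expected_type.__name__}, got {type(value).__name__}"
                findings.append((index, column, problem))
    return findings


findings = validate_metadata(rows, declared)
for finding in findings:
    print(finding)
```

Each finding points to the record and column where the declared metadata and the real data disagree. Whether the metadata or the data is at fault is then a decision for whoever evaluates the result.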

The data profiling process

Data profiling analysis is an iterative process that proceeds in four sub-steps (cf. Apel et al. 2010, p. 110).

Data profiling methods

The various data profiling methods can be divided into attribute analysis, record analysis, and table analysis. Attribute analysis examines all values in a table column (= attribute) as well as the properties of a table's attributes; record analysis examines all records in a table; and table analysis examines all relationships between different tables. Many different data profiling methods exist for each of these three types of analysis.
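The three levels of analysis can be sketched with one simple check each. The example below is only an illustration under stated assumptions: the `customers` and `orders` tables, their columns, and the three helper functions are hypothetical, and each helper shows just one of the many possible methods at its level (value statistics for an attribute, duplicate detection across records, and a referential check between two tables).

```python
from collections import Counter

# Hypothetical tables, represented as lists of dictionaries.
customers = [
    {"customer_id": 1, "country": "DE"},
    {"customer_id": 2, "country": "DE"},
    {"customer_id": 2, "country": "DE"},  # exact duplicate of the previous record
    {"customer_id": 3, "country": None},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 4},  # refers to a customer that does not exist
]


def profile_attribute(rows, column):
    """Attribute analysis: basic statistics over all values of one column."""
    values = [row[column] for row in rows]
    non_null = [value for value in values if value is not None]
    return {
        "null_count": len(values) - len(non_null),
        "distinct_count": len(set(non_null)),
        "most_common": Counter(non_null).most_common(1),
    }


def find_duplicate_records(rows):
    """Record analysis: records that occur more than once in a table."""
    counts = Counter(tuple(sorted(row.items())) for row in rows)
    return [dict(key) for key, count in counts.items() if count > 1]


def find_orphans(child_rows, parent_rows, key):
    """Table analysis: child records whose key has no match in the parent table."""
    parent_keys = {row[key] for row in parent_rows}
    return [row for row in child_rows if row[key] not in parent_keys]


country_profile = profile_attribute(customers, "country")
duplicates = find_duplicate_records(customers)
orphans = find_orphans(orders, customers, "customer_id")
```

In practice such checks are usually run inside the database or with a dedicated profiling tool; plain Python is used here only to keep the sketch self-contained.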
