Attribute profiling in data warehouse
Attribute profiling examines values of individual data attributes and provides information about frequencies and distributions of their values. It helps to identify meaning and allowed values for an attribute.
Attribute profiling basics
Attribute profiling examines the values of individual data attributes and produces three categories of metadata for each attribute: basic aggregate statistics, frequent values, and value distribution.
Basic aggregate statistics include the prevailing value type; the number of null values; and the minimum, maximum, and average values, among other aggregates. The key component of the attribute profile is the list of the most frequently occurring values along with their counts and percentages. The overall distribution of values is most helpful when analyzing numeric and date/time attributes.
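To make the three categories concrete, here is a minimal Python sketch of a profile for one attribute. The function names (`profile_attribute`, `histogram`), the equal-width bucketing, and the sample `ages` data are illustrative assumptions, not part of any particular profiling tool.

```python
from collections import Counter
from statistics import mean


def histogram(numeric, buckets=4):
    """Equal-width value distribution: list of (low, high, count) tuples."""
    if not numeric:
        return []
    lo, hi = min(numeric), max(numeric)
    width = (hi - lo) / buckets or 1  # avoid zero width when all values are equal
    counts = [0] * buckets
    for v in numeric:
        # Clamp so the maximum value falls into the last bucket.
        idx = min(int((v - lo) / width), buckets - 1)
        counts[idx] += 1
    return [(lo + i * width, lo + (i + 1) * width, c) for i, c in enumerate(counts)]


def profile_attribute(values, top_n=3):
    """Profile one attribute given as a list of raw values (None means null)."""
    non_null = [v for v in values if v is not None]
    # Prevailing value type: the most common Python type among non-null values.
    type_counts = Counter(type(v).__name__ for v in non_null)
    prevailing = type_counts.most_common(1)[0][0] if type_counts else None
    numeric = [v for v in non_null
               if isinstance(v, (int, float)) and not isinstance(v, bool)]
    stats = {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "prevailing_type": prevailing,
        "min": min(numeric) if numeric else None,
        "max": max(numeric) if numeric else None,
        "avg": mean(numeric) if numeric else None,
    }
    # Frequent values as (value, count, percentage of all records).
    frequent = [(value, n, round(100.0 * n / len(values), 1))
                for value, n in Counter(non_null).most_common(top_n)]
    return {"stats": stats, "frequent": frequent, "distribution": histogram(numeric)}


# Hypothetical sample data for illustration only.
ages = [25, 30, 30, None, 41, 30, 25, 58]
profile = profile_attribute(ages)
print(profile["stats"])
print(profile["frequent"])
print(profile["distribution"])
```

A real profile would typically also handle dates and text lengths; this sketch only treats numeric values when computing minimum, maximum, average, and distribution.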
Attribute profiling basics (How to build it)
Data profiling is laborious to do by hand, but tools have emerged that make it easier. Many data profiling tools provide built-in functionality to collect comprehensive attribute profiles in a single pass. In the absence of such a tool, attribute profiles can still be gathered (though somewhat less efficiently) using various aggregating queries.
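As a rough illustration of the query-based approach, the following Python sketch runs two aggregating queries against an in-memory SQLite database. The `customer` table, its `state` column, and the sample rows are hypothetical; the same queries would need adapting to the SQL dialect of the actual warehouse.

```python
import sqlite3

# Hypothetical table and data, created in memory so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, state TEXT)")
conn.executemany(
    "INSERT INTO customer (state) VALUES (?)",
    [("NY",), ("CA",), ("NY",), (None,), ("TX",), ("NY",)],
)

# Query 1: basic aggregate statistics for the attribute.
total, nulls, distinct, lowest, highest = conn.execute(
    """
    SELECT COUNT(*),
           SUM(state IS NULL),
           COUNT(DISTINCT state),
           MIN(state),
           MAX(state)
    FROM customer
    """
).fetchone()

# Query 2: most frequent values with counts and percentages of all records.
frequent = conn.execute(
    """
    SELECT state,
           COUNT(*) AS n,
           ROUND(100.0 * COUNT(*) / (SELECT COUNT(*) FROM customer), 1) AS pct
    FROM customer
    WHERE state IS NOT NULL
    GROUP BY state
    ORDER BY n DESC, state
    LIMIT 10
    """
).fetchall()

print(total, nulls, distinct, lowest, highest)
print(frequent)
```

Each query scans the table separately, which is one reason this approach tends to be less efficient than a tool that gathers the whole profile in a single pass.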
Advanced attribute profiling techniques
On the surface, attribute profiling tools only collect profiles on simple table fields. In reality, the same tools and approaches can be used for in-depth profiling and data analysis.
One technique is to collect profiles on data expressions rather than raw fields. Another technique is to define conditions that filter out some of the records before profiling. A more advanced technique is to build a complex data stream based on a query or a stored procedure and profile its output.
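The three techniques can be sketched with one helper that profiles the first column of any query result. This is only an illustration in Python with SQLite: `profile_stream`, the `customer` and `orders` tables, and the sample rows are all assumptions made for the example, and a plain join stands in for the query or stored procedure.

```python
import sqlite3
from collections import Counter


def profile_stream(conn, query, top_n=3):
    """Profile the first column of an arbitrary query's result set."""
    values = [row[0] for row in conn.execute(query)]
    non_null = [v for v in values if v is not None]
    return {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "frequent": Counter(non_null).most_common(top_n),
    }


# Hypothetical schema and data for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        order_date TEXT,
        amount REAL
    );
    INSERT INTO customer VALUES (1, 'active'), (2, 'closed'), (3, 'active');
    INSERT INTO orders VALUES
        (1, 1, '2023-01-15', 20.0),
        (2, 1, '2023-01-20', 35.5),
        (3, 2, '2023-02-03', 12.0),
        (4, 3, '2024-02-11', 80.0),
        (5, 3, NULL, 15.0);
    """
)

# 1. Profile a data expression: the month extracted from order_date.
by_month = profile_stream(conn, "SELECT strftime('%m', order_date) FROM orders")

# 2. Profile under a condition: order year, only for orders of 15 or more.
filtered = profile_stream(
    conn, "SELECT strftime('%Y', order_date) FROM orders WHERE amount >= 15"
)

# 3. Profile a more complex data stream: customer status for each order (a join).
joined = profile_stream(
    conn,
    "SELECT c.status FROM orders o JOIN customer c ON c.id = o.customer_id",
)

print(by_month)
print(filtered)
print(joined)
```

Because the helper accepts any query, the expression, the filter, and the joined stream all reuse the same profiling logic; only the input changes.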