Profiling in a data warehousing project

by / Friday, 28 August 2015 / Published in Billet

Profiling in a data warehousing and business project can contribute to its success, and more…

Primary expectation

A good profiler analyzes the data, its structure and all related elements with one basic attitude:


All data must be analyzed; never assume that a source is accurate. Humans are not perfect and can make mistakes.


Profiling overview



Data dictionary:

It’s a collection of basic metadata about data attributes. It includes basic attribute listings, detailed descriptions and usage patterns, as well as reference information, including valid values and their meanings, default values, etc.

Data models:

Subject area models define main data subjects – categories of high level business objects whose data is stored in the database. Relational data models depict logical relationships between various entities and attributes.

Data profiling:

Data models and the data dictionary are the sources of initial knowledge about the data. Data profiling is a group of experimental techniques aimed at examining the data and understanding its actual structure and dependencies.


Data profiling is so important because actual data is often very different from what is theoretically expected. Over time, data models and dictionaries become inaccurate. Data profiling is like an X-ray showing the hidden truth. It is key to building correct data mappings and quality rules. As a rule of thumb, the more in-depth the analysis and profiling we conduct, the easier it is to design a comprehensive set of data mappings and quality rules, and the greater the chance of success in data conversion and consolidation.
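As a minimal sketch of how profiling can expose the gap between documentation and reality, the snippet below compares the values actually found in a column with the valid values declared in a data dictionary. The `status` values, the sample data and the names `declared_valid_values` and `compare_with_dictionary` are all hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical data dictionary entry: the values the documentation says are allowed.
declared_valid_values = {"ACTIVE", "INACTIVE"}

# Hypothetical sample of the values actually stored in the source system.
actual_values = ["ACTIVE", "active", "INACTIVE", None, "CLOSED", "ACTIVE"]


def compare_with_dictionary(values, valid_values):
    """Count each actual value and report those the dictionary does not list."""
    counts = Counter(values)
    unexpected = {v: n for v, n in counts.items() if v not in valid_values}
    return counts, unexpected


counts, unexpected = compare_with_dictionary(actual_values, declared_valid_values)
print("Observed values:", dict(counts))
print("Not in the dictionary:", unexpected)
```

Here the lower-case variant, the missing value and the undocumented `CLOSED` status are exactly the kind of findings that a data model or dictionary alone would not reveal.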

Difference from data cleansing

  • Data profiling shows the actual content of the data.
  • It helps the data governance committee define data cleansing rules.
  • The data cleansing rules are then implemented by the development team.
  • Conclusion: data profiling reveals data quality issues, but it does not resolve them.
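The split between the two activities can be sketched as follows. This is only an illustration under assumptions: the country values, the `rule` mapping and both function names are hypothetical. The profiling function only reports what it finds, while the cleansing function applies a rule that someone has decided on after reviewing the profile.

```python
def profile_country(values):
    """Profiling: report the distinct values found, without changing anything."""
    return sorted({v for v in values if v is not None})


def cleanse_country(value, mapping):
    """Cleansing: apply an agreed rule (trim, upper-case, then map known variants)."""
    if value is None:
        return None
    normalized = value.strip().upper()
    return mapping.get(normalized, normalized)


# Hypothetical raw values from a source system.
raw = ["CA", "ca ", "Canada", "FR", None]

# Step 1: profiling shows the content of the data.
observed = profile_country(raw)

# Step 2: a hypothetical rule defined after reviewing the profile.
rule = {"CANADA": "CA"}

# Step 3: the rule is implemented and applied by the development team.
cleaned = [cleanse_country(v, rule) for v in raw]

print(observed)
print(cleaned)
```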

All techniques

  • Data profiling is often mistakenly equated to attribute profiling. The cause of that mistake is the proliferation of efficient attribute profiling tools. However, comprehensive data profiling is a far broader exercise.
  • Techniques are:
    • Subject profiling

      examines subjects in different tables or on different systems and helps to find where the information about each subject is stored;

    • Relationship profiling

      is an exercise in identifying entity keys and relationships as well as counting occurrences for each relationship in the data model. It is necessary to validate existing relational data models or build them when none are available;

    • Attribute profiling

      examines values of individual data attributes and provides information about frequencies and distributions of their values. It helps to identify meaning and allowed values for an attribute;

    • Timeline profiling

      looks for patterns in historical data, such as temporal distribution of the data, patterns of values for different time periods, etc.;

    • State-transition model profiling

      examines lifecycle of state-dependent objects and provides actual information about the order and characteristics of states and actions. It helps build or validate state-transition models;

    • Dependency profiling

      uses various pattern recognition techniques to find hidden relationships between attribute values.
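A few of these techniques can be sketched in a handful of lines. The example below is only an illustration under stated assumptions: the `customers`, `orders` and `order_events` samples, their column names and the three function names are all hypothetical, and plain Python stands in for whatever profiling tool a project actually uses. It covers attribute profiling (value frequencies), relationship profiling (occurrence counts per parent key and orphaned child keys) and state-transition profiling (counts of observed transitions).

```python
from collections import Counter

# Hypothetical sample rows; table and column names are illustrative only.
customers = [
    {"customer_id": 1, "country": "CA"},
    {"customer_id": 2, "country": "FR"},
    {"customer_id": 3, "country": "CA"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 1},
    {"order_id": 12, "customer_id": 2},
    {"order_id": 13, "customer_id": 9},  # no matching customer
]
# Status history per order, assumed to be already sorted chronologically.
order_events = [
    (10, "NEW"), (10, "PAID"), (10, "SHIPPED"),
    (11, "NEW"), (11, "CANCELLED"),
    (12, "NEW"), (12, "SHIPPED"),  # goes from NEW straight to SHIPPED
]


def profile_attribute(rows, column):
    """Attribute profiling: frequency of each value in one column."""
    return Counter(row[column] for row in rows)


def profile_relationship(parents, children, key):
    """Relationship profiling: children per parent key, plus orphaned child keys."""
    parent_keys = {row[key] for row in parents}
    child_counts = Counter(row[key] for row in children)
    per_parent = {k: child_counts.get(k, 0) for k in parent_keys}
    orphans = sorted(k for k in child_counts if k not in parent_keys)
    return per_parent, orphans


def profile_transitions(events):
    """State-transition profiling: count each observed (from, to) pair per object."""
    transitions = Counter()
    last_state = {}
    for object_id, state in events:
        if object_id in last_state:
            transitions[(last_state[object_id], state)] += 1
        last_state[object_id] = state
    return transitions


country_freq = profile_attribute(customers, "country")
orders_per_customer, orphan_keys = profile_relationship(customers, orders, "customer_id")
transition_counts = profile_transitions(order_events)

print(dict(country_freq))
print(orders_per_customer, orphan_keys)
print(dict(transition_counts))
```

In this sample, customer 3 has no orders, key 9 has no customer, and one order moves from `NEW` directly to `SHIPPED`. Whether any of these is a defect is a question for the business; the profile only makes them visible.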

Other articles

State transition model 










David-Marc Petit

President at DWBI Expert
David-Marc PETIT is the president of DWBI Expert Inc. He has more than 20 years of experience as a business intelligence expert in companies of all sizes and sectors, on three continents. He has made it his mission to democratize business intelligence in order to optimize his clients' revenue and performance.
