
September 2, 2008

Data Quality Metrics in Data Warehousing

Filed under: Data Quality, Data Warehousing — Alena Semeshko @ 3:26 am

An expert was asked which metrics should be used for a data warehousing project.

The expert (William McKnight from Lucidity Consulting) recommended the following three as the most valuable:

# Business return on investment (ROI) - Is your project delivering bottom-line success?
# Data usage - Is your data used as intended by the users?
# Data gathering and availability - Is your data available to the extent it should be?

He also mentioned uptime, cycle end times, successful loads, and clean data levels as secondary technical metrics to pay attention to.

In short, you want to eliminate intolerable defects, as defined by the data stewards. These defects fall into 10 categories: referential integrity, uniqueness/deduplication, cardinality, subtype/supertype constructs, value domains/bounds, formatting errors, contingency conditions, calculations, correctness, and conformance to a “clean” set of values.
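
To make a few of these categories concrete, here is a minimal Python sketch of how three of them (uniqueness, referential integrity, and value bounds) might be checked. It is not taken from McKnight's answer. The table names, column names, helper functions, and the 1 to 1000 quantity range are all hypothetical, and in practice the thresholds for what counts as "intolerable" would come from your data stewards.

```python
from collections import Counter

def find_duplicate_keys(rows, key):
    """Uniqueness check: return key values that appear more than once."""
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)

def find_orphans(child_rows, fk, parent_rows, pk):
    """Referential integrity check: return child rows with no matching parent."""
    parent_keys = {row[pk] for row in parent_rows}
    return [row for row in child_rows if row[fk] not in parent_keys]

def find_out_of_bounds(rows, column, low, high):
    """Value bounds check: return rows whose value falls outside [low, high]."""
    return [row for row in rows if not (low <= row[column] <= high)]

# Hypothetical sample data standing in for warehouse tables.
customers = [{"customer_id": 1}, {"customer_id": 2}, {"customer_id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1, "quantity": 3},
    {"order_id": 11, "customer_id": 9, "quantity": 5},   # no such customer
    {"order_id": 12, "customer_id": 2, "quantity": -4},  # negative quantity
]

duplicates = find_duplicate_keys(customers, "customer_id")
orphans = find_orphans(orders, "customer_id", customers, "customer_id")
out_of_bounds = find_out_of_bounds(orders, "quantity", 1, 1000)  # assumed range

print("Duplicate customer ids:", duplicates)
print("Orphaned orders:", [row["order_id"] for row in orphans])
print("Out-of-bounds orders:", [row["order_id"] for row in out_of_bounds])
```

Counting the rows each check returns after every load would give you one way to track a "clean data level" over time, though a real warehouse would more likely run equivalent checks in SQL or a data quality tool.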
