June 3, 2008

Solving Data Quality Problems in Three Steps

Filed under: Data Quality — Alena Semeshko @ 1:32 am

Vicki P. Raeburn wrote an article titled “Talking with Your Business Partners about Data Quality,” in which she discusses what needs to be done to solve bad data problems.

Solving the bad data problem requires:

* Clearly defining the nature of the problems your business partners/customers are experiencing,
* Establishing priorities to tackle the most strategically important issues first, and
* Implementing an improvement plan with appropriate metrics and communications to your business partners/customers.

Vicki also names four dimensions of data quality that all business users should understand and keep in mind whenever they work with corporate data: timeliness, accuracy, completeness and consistency. (Awareness is a problem in itself, as most users don’t even know the problem CAN be dealt with.)

* Timeliness: Currency of data elements.
* Accuracy: Attributes of the entity (object) are correctly represented.
* Completeness: Breadth (number of entities) and depth (number of fields defined and populated).
* Consistency: Identity, definitions, hierarchies, standards and metrics are the same within and across databases.
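
To make the dimensions a bit more concrete, here is a minimal Python sketch of how three of them might be turned into simple metrics. It is not from Vicki's article: the record layout, the field names, the function names and the 90-day freshness window are all assumptions made up for illustration. Accuracy is left out because it can only be measured against a trusted reference source, in which case it could be scored much like the consistency check below.

```python
from datetime import date

# Hypothetical customer records; field names and values are invented for illustration.
records = [
    {"id": 1, "name": "Acme Corp", "country": "US", "updated": date(2008, 5, 20)},
    {"id": 2, "name": "Globex", "country": None, "updated": date(2007, 1, 15)},
    {"id": 3, "name": "", "country": "DE", "updated": date(2008, 6, 1)},
]

def completeness(rows, fields):
    """Depth of completeness: share of the listed fields that are populated."""
    total = len(rows) * len(fields)
    filled = sum(1 for r in rows for f in fields if r.get(f) not in (None, ""))
    return filled / total if total else 0.0

def timeliness(rows, as_of, max_age_days):
    """Share of rows updated within max_age_days of the as_of date."""
    fresh = sum(1 for r in rows if (as_of - r["updated"]).days <= max_age_days)
    return fresh / len(rows) if rows else 0.0

def consistency(rows, other_rows, field):
    """Share of ids present in both sources whose values for `field` agree."""
    other = {r["id"]: r[field] for r in other_rows}
    shared = [r for r in rows if r["id"] in other]
    same = sum(1 for r in shared if r[field] == other[r["id"]])
    return same / len(shared) if shared else 1.0

# A second, equally hypothetical source (say, a billing database) to compare against.
billing = [{"id": 1, "country": "US"}, {"id": 2, "country": "CA"}]

print(completeness(records, ["name", "country"]))
print(timeliness(records, as_of=date(2008, 6, 3), max_age_days=90))
print(consistency(records, billing, "country"))
```

Scores like these are only a starting point. Which fields matter and how fresh is fresh enough are exactly the priorities that have to be agreed with business partners.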

1 Comment »

  1. I would like to add a couple of points to Vicki’s steps for resolving data quality problems: (1) Expectation management should be done before the project starts, instead of waiting until customers get upset about data quality because of wrong expectations. Building the proper expectations is part of the sales effort; a fix after the damage is done can only be half a fix. (2) In most cases, data errors are a direct reflection of the domain know-how your team has learned. If the processing team is just a group of data pushers, data quality is doomed to suffer. Training after training is the best way to ensure data quality.

    Comment by Larry Yen — October 2, 2008 @ 10:15 pm
