
October 6, 2008

Data Quality - Upstream or Downstream?

Filed under: Data Quality — Alena Semeshko @ 3:36 am

I keep wondering why data quality checks still exist as a procedure performed once in a while rather than as part of the front-end process. Why do most companies start worrying about the quality of their data only when it’s already dirty and in use? Why doesn’t it occur to them that data quality needs to be thought through before the data is actually captured? Even at the early stages of data capture, data quality already plays an important role in the future of the company. It is the early stages that determine how your data turns out and whether it will pay off later on.

A recent Forrester paper titled “It’s Time To Invest In Upstream Data Quality” suggests that when companies realize short-term ROI from data cleanup immediately, it’s hard to justify front-end investments that may take years to pay off.

At the same time, Forrester says, IT budget planning committees tend to avoid the existing data quality (DQ) products that allow integrating downstream data hygiene rules into front-end processes, citing the solutions’ cost and complexity.

The result? I&KM (Information and Knowledge Management) pros quickly reach diminishing returns on data quality investments and need even more investment later on to catch up on missed opportunities such as verifying customer contact information, standardizing product data, and eliminating duplicate records.
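
To make "upstream" concrete, here is a minimal sketch of what applying hygiene rules at capture time could look like. It is only an illustration, not something taken from the Forrester paper: the field names, the `capture` function, and the deliberately loose email pattern are all assumptions.

```python
import re

# Deliberately loose shape check (an assumption for this sketch);
# it is not a full email address validator.
EMAIL_SHAPE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def normalize(record):
    """Return a copy of the record with whitespace and case standardized."""
    return {
        "name": " ".join(record.get("name", "").split()),
        "email": record.get("email", "").strip().lower(),
    }


def capture(record, seen_emails):
    """Check a record at entry time, before it is stored.

    Returns (accepted, reason). `seen_emails` is a set standing in for
    whatever lookup the real system would use to find existing records.
    """
    clean = normalize(record)
    if not EMAIL_SHAPE.match(clean["email"]):
        return False, "malformed email"
    if clean["email"] in seen_emails:
        return False, "duplicate"
    seen_emails.add(clean["email"])
    return True, "ok"
```

The point of the sketch is the placement, not the rules themselves: the same checks that a periodic cleanup would run are applied once, at the moment a record is entered, so a malformed or duplicate record never reaches the database.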

Read the paper to find out how to break this cycle, identify the optimal DQ solution downstream, and audit the source systems that cause the most significant data issues upstream.


  1. We have been advocates of upstream data quality for many years now. I would take issue with the Forrester analyst who wrote this report on one point: ROI. We work primarily in the marketing automation and CRM space, where we have found there are often people who invest substantial amounts of time manually editing “dirty” and “incomplete” data. In these cases, employing upstream data quality tools produces immediate and often substantial ROI.

    In addition, from upstream processes, we have been able to consistently demonstrate substantial reductions in duplicate records and almost complete elimination of “trash” records (i.e., the Homer Simpson and Darth Vader records that always show up). We have also witnessed increases in the immediate usability of data by auto-appending variables marketing folks use for audience selection, such as job title or department codes.

    It is interesting to note that the “upstream” processing of data is just now beginning to catch on. We have offered upstream software solutions that specifically address the needs of marketing and sales people for a number of years, but find the concept is still new to many people. The IT community has been the slowest to adopt upstream data quality tools, probably due to the fact that most IT professionals are simply used to the process being part of back end database maintenance.

    Comment by Mark Baran — March 16, 2009 @ 8:31 am

  2. Thank you, Mark. You’re right about the IT community. It can be kind of conservative sometimes. :)

    Comment by Olga Belokurskaya — March 27, 2009 @ 3:54 am
