March 7, 2008

Data cleansing…cleans data

Filed under: Data Cleansing, Data Quality — Tags: , — Alena Semeshko @ 5:42 am

As I mentioned in the previous post, data cleansing deserves a post of its own. More than one post, actually.

Well, it’s obvious that data is a key player in business decision-making. Good, clean data provides the platform for wise decisions that put the company’s profits on an upward curve.

Acquiring the right data, however, is not always as simple as it seems. There are many techniques, but their results don’t always meet expectations. That’s where data cleaning technologies come into play. Data cleaning software cleanses the initial data, making it more precise, usable and up to date. Techniques used in data cleaning include, among others:
• Merging data from different data sources
• Record matching and synchronization
• Data type and format conversion
• Data segmentation
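
To make the format conversion item a bit more concrete, here is a minimal sketch in Python. It assumes US-style 10-digit phone numbers, and the function name normalize_phone and the output format are my own choices for illustration, not part of any particular data cleaning product.

```python
import re

def normalize_phone(raw):
    """Convert a loosely formatted US phone number to "(NNN) NNN-NNNN".

    Returns None when the input does not contain exactly 10 digits
    (optionally preceded by the country code 1). Assumption: US numbers only.
    """
    digits = re.sub(r"\D", "", raw)  # drop everything that is not a digit
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # strip the leading country code
    if len(digits) != 10:
        return None
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"

print(normalize_phone("555.123.4567"))
print(normalize_phone("+1 555-123-4567"))
print(normalize_phone("12345"))
```

Values that cannot be converted come back as None rather than being guessed at, so they can be set aside for manual review.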

In this post I want to focus more on record matching and data synchronization.

An example often used here is name and address data. Name, address and phone information is the quickest to go out of date and the easiest to get wrong. Of course, there are directories and yellow pages that you can always check…but doing that by hand each time you encounter a mistake is an impermissible luxury: it takes way (I mean waaaaaaay) too much time.

That’s pretty much the reason behind data synchronization technologies. They process the data, compare it to a standard and return a valid, quality dataset with common mistakes (misspellings, wrong street types, wrong city and state names) eliminated. Apatar’s StrikeIron US Verification data quality service, for instance, is one such tool.
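
The idea of comparing data to a standard can be sketched in a few lines of Python. This is only an illustration of the general technique and says nothing about how the StrikeIron service works internally. The reference list, the function name correct_city and the 0.8 similarity cutoff are all assumptions made for the example; a real tool would compare against full postal data.

```python
import difflib

# Hypothetical reference list standing in for "the standard".
REFERENCE_CITIES = ["Boston", "Chicago", "Houston", "Phoenix", "Seattle"]

def correct_city(raw, reference=REFERENCE_CITIES, cutoff=0.8):
    """Return the closest city from the reference list, or None.

    The cutoff (0..1) is the minimum similarity ratio accepted as a match;
    0.8 is an arbitrary value chosen for this sketch.
    """
    cleaned = raw.strip().title()  # trim whitespace, normalize capitalization
    matches = difflib.get_close_matches(cleaned, reference, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(correct_city("Chicgo"))
print(correct_city(" seatle "))
print(correct_city("Springfield"))
```

A misspelling like "Chicgo" is close enough to "Chicago" to be corrected, while a name that resembles nothing in the list returns None instead of being forced onto the nearest entry.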

Employing sophisticated matching and data synchronization technology, it first inspects each address closely to ensure its validity, then updates incorrect addresses according to postal standards, cleaning customer data before it gets into CRM/ERP systems, databases, flat files, and RSS feeds. It also adds ZIP+4 data, along with details such as congressional districts and carrier routes. Data cleansing tools of this sort are indispensable in business today. They allow companies to increase productivity, improve sales strategies, and deliver better, more accurate customer service.
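
As a rough sketch of what updating an address to a standard form can look like, the Python below abbreviates a trailing street type and checks the shape of a ZIP or ZIP+4 code. The suffix table is a tiny illustrative subset that I picked for the example, and the function names are hypothetical. A real verification service also checks that the address actually exists, which this sketch does not attempt.

```python
import re

# Tiny illustrative subset of street-type abbreviations (assumption).
STREET_SUFFIXES = {
    "street": "ST", "st": "ST",
    "avenue": "AVE", "ave": "AVE", "av": "AVE",
    "road": "RD", "rd": "RD",
    "boulevard": "BLVD", "blvd": "BLVD",
}

# Five digits, optionally followed by a hyphen and four more (ZIP+4).
ZIP_PATTERN = re.compile(r"\d{5}(-\d{4})?")

def standardize_street(line):
    """Upper-case a street line and abbreviate its trailing street type."""
    words = [w.strip(".,") for w in line.split()]
    words = [w for w in words if w]  # drop tokens that were only punctuation
    if not words:
        return ""
    words[-1] = STREET_SUFFIXES.get(words[-1].lower(), words[-1])
    return " ".join(w.upper() for w in words)

def is_valid_zip(code):
    """Check the format only; this does not confirm the ZIP code exists."""
    return ZIP_PATTERN.fullmatch(code.strip()) is not None

print(standardize_street("123 Main Street"))
print(standardize_street("45 elm av."))
print(is_valid_zip("02134-1234"), is_valid_zip("2134"))
```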
