October 22, 2010

Data Integration in Four Steps

Filed under: Data Integration, ETL — Katherine Vasilega @ 1:29 am

With this post, I’d like to explore the process of designing a data integration solution. What is the sequence of the development stages? Which stage is the most challenging one? Let’s have a closer look at each stage.

The first step in data integration is identification of common elements, and this is how you can do it:

    • Identify the common entities. Once the common entity is identified, its definition should be standardized. Do Customers include permanent customers only or those who made single purchases as well?
    • Identify the common attributes. What are the attributes common to customers: first name, last name, date of purchase, anything else? Each attribute should be defined.
    • Identify the common values. The same information can be represented in different forms across multiple source systems. For example, sex can be represented as ‘M’, ‘1’, ‘male’, or something else by each source system. A common representation must be defined.
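To make the last point concrete, here is a minimal Python sketch of value standardization. The mapping table, the function name normalize_sex, the female codes, and the fallback code "U" are all assumptions made for illustration; a real project would take them from the agreed common definition.

```python
# Hypothetical mapping from source-system codes to one common representation.
# Only 'M', '1' and 'male' come from the example above; the rest is assumed.
SEX_CODES = {
    "m": "M", "1": "M", "male": "M",
    "f": "F", "2": "F", "female": "F",
}


def normalize_sex(raw):
    """Return the common code for a raw source value, or "U" if it is unknown."""
    if raw is None:
        return "U"
    # Compare case-insensitively and ignore surrounding whitespace.
    return SEX_CODES.get(str(raw).strip().lower(), "U")
```

Unknown values are mapped to an explicit "U" rather than dropped, so they can be reviewed later instead of silently disappearing.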

The second step of data integration is the appointment of a Data Steward, who will own responsibility for a particular set of data elements. The Data Steward ensures that each assigned data element:

    • Has a clear and unambiguous definition
    • Does not conflict with other data elements
    • Has enumerated value definitions if it is of code type
    • Is still being used
    • Is being used consistently in various systems
    • Has suitable documentation on appropriate usage
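Some of these checks can be partly automated. The sketch below is one possible way to do it in Python, assuming a simple in-memory catalog; the DataElement fields and the steward_issues function are hypothetical names, and a name clash stands in for the much broader question of whether two elements conflict.

```python
from dataclasses import dataclass, field


@dataclass
class DataElement:
    # All field names here are illustrative assumptions.
    name: str
    definition: str = ""
    is_code_type: bool = False
    allowed_values: dict = field(default_factory=dict)  # code -> meaning
    usage_notes: str = ""
    used_by: list = field(default_factory=list)  # names of systems using it


def steward_issues(element, catalog):
    """Return a list of problems a Data Steward should look at."""
    issues = []
    if not element.definition.strip():
        issues.append("missing definition")
    if element.is_code_type and not element.allowed_values:
        issues.append("code type without enumerated values")
    if not element.used_by:
        issues.append("not used by any system")
    if not element.usage_notes.strip():
        issues.append("missing usage documentation")
    # Simplified conflict check: another element with the same name.
    if any(e is not element and e.name.lower() == element.name.lower()
           for e in catalog):
        issues.append("name conflicts with another element")
    return issues
```

A check like this only flags candidates. Deciding whether a definition is truly unambiguous, or whether usage is consistent across systems, still needs the steward's judgment.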

The third step of data integration is to design an ETL process to integrate the data into the target. This is the most important stage of data integration, which I have discussed in the previous posts.
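As a rough illustration of what such an ETL process looks like, here is a small Python sketch that uses only the standard library. The two source systems (crm and shop), their field names, the sample rows, and the customers table are all invented for this example; a production pipeline would read from real systems and handle errors, duplicates, and incremental loads.

```python
import sqlite3

# Hypothetical source data: two systems describing customers differently.
crm_rows = [{"first": "Ann", "last": "Lee", "sex": "1"}]
shop_rows = [{"fname": "Bob", "lname": "Ray", "gender": "male"}]


def extract():
    """Yield (source_name, row) pairs from every source system."""
    yield from (("crm", row) for row in crm_rows)
    yield from (("shop", row) for row in shop_rows)


def transform(source, row):
    """Map a source row onto the common entity, attributes and values."""
    if source == "crm":
        first, last, sex = row["first"], row["last"], row["sex"]
    else:
        first, last, sex = row["fname"], row["lname"], row["gender"]
    # Assumed common representation: "M" for male, "U" for anything unmapped.
    sex = {"1": "M", "m": "M", "male": "M"}.get(str(sex).strip().lower(), "U")
    return (first.strip(), last.strip(), sex)


def load(conn, records):
    """Write the standardized records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(first_name TEXT, last_name TEXT, sex TEXT)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(conn):
    load(conn, [transform(source, row) for source, row in extract()])
```

Calling run_etl(sqlite3.connect(":memory:")) fills an in-memory target, which is convenient for trying the transformation rules before pointing the process at a real database.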

The final step of data integration is to establish a process for maintaining, reviewing, and reporting on data elements.

The key to successful data integration is a clear vision coupled with a comprehensive plan that covers each stage of the process. Data integration is not an easy task. Still, when performed properly, it will help your company lower costs, make its decision making better and more flexible, and become more successful.
