December 21, 2010

Data Migration Project Training

Filed under: Data Migration — Katherine Vasilega @ 6:20 am

Data migration projects typically require many additional tools and supporting documents to run smoothly. To deliver the project successfully, organizations need data migration training. Ideally, the people assigned to the project should be fully trained before it starts, which calls for dedicated training courses.

The training can be held in the form of a role-based game, where all parties involved in a data migration project are represented, e.g. business users, data stewards, the IT department, etc.

Data migration training can also be held in the form of classroom-style seminars. These lessons should cover the following topics:

    • Introduction to data migration: value, strategy, technologies
    • Data migration fundamentals: architecture, source and target metadata definitions, creating data maps, validating, tracing, debugging, and data assessment.
    • Data migration tool evaluation and selection
    • Demonstrations and hands-on exercises on migrating an object from the source system to the target
    • Overview of data quality within the data migration process
    • Managing data migration projects

Carrying out a data migration project can be a complex procedure. Ensure that all your training materials and education tools are tested and in place prior to the data migration project inception.

December 20, 2010

Testing Your Data Migration Solution

Filed under: Data Migration — Katherine Vasilega @ 8:32 am

Data migration solutions should be carefully tested before and after the migration process starts. A well-organized testing strategy must include the following stages to ensure data quality in the data migration process:

Testing before Migration

During this stage, you should verify the scope of the source and target systems and data with business users and the IT department. It is also useful to test the source and target system connections from the data migration platform.
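A connection test of this kind can be sketched in a few lines of Python. This is only an illustration: the in-memory SQLite databases stand in for the real source and target systems, and the `can_connect` helper and the `endpoints` names are hypothetical.

```python
import sqlite3

def can_connect(connect):
    """Return True if a connection opens and answers a trivial query."""
    try:
        conn = connect()
        try:
            conn.execute("SELECT 1").fetchone()
        finally:
            conn.close()
        return True
    except sqlite3.Error:
        return False

# Hypothetical endpoints: in-memory SQLite stands in for the real systems.
endpoints = {
    "source": lambda: sqlite3.connect(":memory:"),
    "target": lambda: sqlite3.connect(":memory:"),
}

results = {name: can_connect(connect) for name, connect in endpoints.items()}
for name, ok in results.items():
    print(f"{name}: {'reachable' if ok else 'UNREACHABLE'}")
```

With a real platform, you would replace the two lambdas with whatever connection call your database driver provides.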

Data Migration Design Review

The design review of the data migration specifications can include:

    • Source data sets and queries
    • Data mappings
    • Number of source records
    • Identification of supplementary sources
    • Data cleansing requirements

The outcome of this stage of data migration should be summarized in a list of open issues, which have to be resolved before the actual migration starts.
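Part of such a review can be automated. The sketch below assumes a very simple specification (a set of source columns, a set of target columns, a column-to-column mapping, and an expected record count) and turns every inconsistency into an entry in the list of open issues. All names and figures are invented for illustration.

```python
# Hypothetical review inputs; none of these names come from a real project.
source_columns = {"cust_id", "full_name", "phone", "fax"}
target_columns = {"customer_id", "name", "phone"}
mappings = {"cust_id": "customer_id", "full_name": "name", "fax": "fax_number"}
expected_source_rows = 1200
actual_source_rows = 1187

def review(source_columns, target_columns, mappings, expected_rows, actual_rows):
    """Collect open issues found while reviewing a migration specification."""
    issues = []
    for src, tgt in sorted(mappings.items()):
        if src not in source_columns:
            issues.append(f"mapping uses unknown source column '{src}'")
        if tgt not in target_columns:
            issues.append(f"mapping uses unknown target column '{tgt}'")
    # Source columns nobody mapped must be either mapped or explicitly dropped.
    for col in sorted(source_columns - set(mappings)):
        issues.append(f"source column '{col}' has no mapping")
    if expected_rows != actual_rows:
        issues.append(f"source row count is {actual_rows}, expected {expected_rows}")
    return issues

open_issues = review(source_columns, target_columns, mappings,
                     expected_source_rows, actual_source_rows)
for issue in open_issues:
    print("OPEN:", issue)
```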

Testing after Data Migration

Post-migration testing is typically performed in a test environment and includes:

    • Testing the migration process (number of records per unit time)
    • Comparing migrated records to records generated by the destination
    • Comparing migrated records to sources

It is recommended that you use both automated tools and skilled data migration resources for successful solution testing.
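As a rough illustration of the automated side, the following sketch measures throughput and compares migrated records to their sources by key. The `migrate` function is a stand-in that simply copies records, and the sample data is invented; a real test would read both sides from the actual systems.

```python
import time

# Hypothetical data: records keyed by a primary key.
source = {
    1: ("Ann", "ann@example.com"),
    2: ("Bob", "bob@example.com"),
    3: ("Cy", "cy@example.com"),
}

def migrate(records):
    """Stand-in for the real migration step; it simply copies the records."""
    return dict(records)

def compare(source, target):
    """Report keys that are missing, unexpected, or changed in the target."""
    missing = sorted(source.keys() - target.keys())
    unexpected = sorted(target.keys() - source.keys())
    changed = sorted(k for k in source.keys() & target.keys()
                     if source[k] != target[k])
    return {"missing": missing, "unexpected": unexpected, "changed": changed}

start = time.perf_counter()
target = migrate(source)
elapsed = time.perf_counter() - start

# Records per second; guard against a timer too coarse to measure the copy.
throughput = len(target) / elapsed if elapsed > 0 else float("inf")
report = compare(source, target)
print(f"{len(target)} records, {throughput:.0f} records/s, report: {report}")
```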

December 17, 2010

Gartner Makes Its Predictions for Data Integration

Filed under: Data Integration — Katherine Vasilega @ 2:34 am

The year 2010 is drawing to a close, and more and more predictions for 2011 are being published. It’s time to see what we have accomplished and where we are heading in the area of data integration management. The good news is that Gartner has published a report, “Predicts 2011: Master Data Management Is Important in a Tough Economy, and More Important in Growth.” Let’s see which MDM predictions for 2011 the Gartner experts make:

1. From 2009 through 2014, MDM software markets will grow at a Compound Annual Growth Rate (CAGR) of 18 percent, from $1.3 billion to $2.9 billion.

Data integration and MDM are fast-growing software markets. The growth provides a major business opportunity for software vendors that specialize in these areas. The rapid growth of the market means that skilled MDM resources are in great demand among software and service providers. As a result, end-user organizations will struggle to adequately resource their MDM programs.
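The growth figures in the first prediction can be sanity-checked with the standard compound annual growth rate formula. The sketch below assumes five compounding periods between 2009 and 2014, which is my reading of the quoted range rather than something the report excerpt states.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate implied by a start and an end value."""
    return (end_value / start_value) ** (1 / years) - 1

def project(start_value, rate, years):
    """Value reached after growing at a fixed annual rate."""
    return start_value * (1 + rate) ** years

# Assumption: 2009 through 2014 spans five yearly compounding steps.
implied_rate = cagr(1.3, 2.9, 5)
projected_2014 = project(1.3, 0.18, 5)

print(f"Implied CAGR from $1.3B to $2.9B: {implied_rate:.1%}")
print(f"$1.3B grown at 18% for five years: ${projected_2014:.2f}B")
```

Under that assumption the implied rate comes out a little over 17 percent, so the quoted 18 percent and the $2.9 billion end point agree to within rounding.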

2. By 2015, 10 percent of packaged data management implementations will be delivered as software-as-a-service in the public Cloud.

Data integration software vendors will seek to leverage Cloud computing. Once organizations gain more experience with the public and private Clouds, the early adopters will seek to gain the same benefits with a wider range of software, including packaged MDM solutions.

3. Through 2015, 66% of organizations that initiate an MDM program will struggle to demonstrate the business value of it.

When IT departments initiate data integration projects, they often struggle to get the business on board and to demonstrate the business value of these projects. MDM needs to be guided by the business strategy, and will require strong involvement of business stakeholders and managers.

Today it is not enough to throw technology at the problem of inconsistent master data. Getting the proper governance in place and establishing the right data stewardship roles and responsibilities will be vital to the success of data integration initiatives. Meanwhile, I will watch for more predictions to appear and will try to suggest the most useful pieces of data for your consideration and review.

December 15, 2010

The Role of Data Stewards in Data Integration

Filed under: Data Quality — Katherine Vasilega @ 3:52 am

The amount of data gathered from different sources can very quickly become overwhelming. For effective data integration, all this data must be maintained and managed. This is where data stewards come into play.

Data stewards do not own the data and do not have complete control over it. Their main role is to ensure that the data is accurate and meets the quality standards agreed upon by the company. They perform their duties before, during, and after data integration, which helps maintain the information in the long run.

To be effective, data stewards need to work together with the database administrators, data architects, and anyone else involved in data management and data integration in the organization. Aside from technical skills, a data steward should be able to communicate issues and ideas clearly during the data integration process.

Responsibilities of a data steward include but are not limited to:

    • Ensuring that new data does not duplicate or contradict existing data.
    • Looking for possible errors in the data structure.
    • Ensuring that the data is error-free.
    • Performing data warehousing
    • Approving the consistency of data

Data stewards are accountable for enhancing data quality, especially during data integration activities. Their primary role is to ensure that the data governance goals of the company are met.

December 14, 2010

Data Integration Tools: A Point of Convergence

Filed under: Data Integration — Katherine Vasilega @ 7:56 am

Organizations are often adding business intelligence tools, rather than subtracting them. The problem is that with each tool you add, you also increase the complexity of your IT architecture, as well as the costs of your team’s training, software licensing and maintenance. This is especially true for data integration tools: ETL (extract, transform, and load), EAI (enterprise application integration), and EDR (enterprise data replication).

You can cut the costs of your data integration initiatives if you review your data integration tools and their usage throughout your organization. ETL tools are built to move large amounts of data, while transforming it to match the business rules. EAI tools specialize in bite-sized, consumable pieces of information such as those found in operational or transaction systems. EDR tools provide a mechanism to identify changes to datasets.
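To make the ETL pattern concrete, here is a minimal extract-transform-load sketch using only the Python standard library. The CSV text, the `customer` table, and the business rules (title-case names, upper-case country codes) are all assumptions made up for the example; commercial ETL tools do far more than this.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; a real job would read from a file or a database.
RAW = "id,name,country\n1, ann lee ,us\n2,BOB ROY,de\n"

def extract(text):
    """Extract: parse the CSV text into one dictionary per row."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(row):
    """Transform: apply the assumed business rules to a single row."""
    return (int(row["id"]),
            row["name"].strip().title(),
            row["country"].strip().upper())

def load(conn, rows):
    """Load: write the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, country TEXT)")
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, [transform(r) for r in extract(RAW)])
loaded = conn.execute(
    "SELECT id, name, country FROM customer ORDER BY id").fetchall()
print(loaded)
```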

All three areas of usage are closely related to each other, which leads to the conclusion that the convergence of ETL, EAI, and EDR functionality is a good starting point for the modern data integration tool. Making a single tool available to diverse project teams for their data integration needs increases the productivity and cuts costs of data integration.

December 13, 2010

Data Integration Best Practices: Using Taxonomies

Filed under: Data Integration, Data Quality — Katherine Vasilega @ 8:20 am

Data taxonomies are tree-structured classification systems, which provide increasing refinement of classes as you go deeper into the tree. Here are some tips for working with taxonomies when building a data integration solution.

    1. If the data is rich enough, you might not need taxonomies at all, as you may be able to find what you need using a keyword search. Taxonomies are only needed when there is no other data available to assist classification.

    2. Your taxonomy is never going to go away once you have it. Nodes are only going to be added to it, not removed. So keep it as small and simple as you can, and try to minimize the addition of new nodes.

    3. You have to understand how the taxonomy is going to be used in the data integration solution. Most taxonomies are designed with human browsing in mind. On the other hand, they can be built with the intent to reduce the search space for an item when the data set is large. There may also be a need to automatically classify a data item into the taxonomy. The features that make a taxonomy easy for business users to browse are not the same ones that make it easy for electronic systems to process.

    4. If you need a taxonomy for electronic systems, try to keep it small. This makes classifiers much easier to build.

    5. Have a precise data-labeling policy: never label a data point with both a parent and a child class from the taxonomy.
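The labeling rule in tip 5 is easy to enforce automatically. The sketch below assumes the taxonomy is stored as a child-to-parent dictionary (an invented example) and flags any label set that contains a class together with one of its ancestors, which is slightly stricter than the direct parent and child case.

```python
# Hypothetical taxonomy stored as child -> parent; the root has parent None.
PARENT = {
    "product": None,
    "electronics": "product",
    "phones": "electronics",
    "laptops": "electronics",
    "furniture": "product",
}

def ancestors(node):
    """Return every class above the given node in the tree."""
    found = set()
    parent = PARENT[node]
    while parent is not None:
        found.add(parent)
        parent = PARENT[parent]
    return found

def conflicting_labels(labels):
    """Return (ancestor, descendant) pairs that appear together in labels."""
    labels = set(labels)
    return sorted((a, node) for node in labels for a in ancestors(node) & labels)

print(conflicting_labels({"phones", "laptops"}))      # sibling classes only
print(conflicting_labels({"electronics", "phones"}))  # parent and child together
```

A data point that passes this check carries labels from separate branches only, which keeps downstream classifiers from seeing redundant classes.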

Keep in mind that sometimes you will need to ingest a new data source into the existing system. This data source will have its own classification, which will not be quite compatible with the existing one. This is why you should generally avoid deep and highly refined taxonomies in your data integration solution.

December 10, 2010

Data Integration Predictions for 2011

Filed under: Data Integration — Katherine Vasilega @ 7:59 am

It’s time to make a prognosis for the upcoming year 2011. Here are some trends in data integration that are likely to develop further in 2011.

Enhanced data availability

Data is not going to be locked up in corporate warehouses anymore. Many businesses are moving to the Cloud, and so will their master data. Organizations are starting to see the benefits of sharing information and making their data more open.

Business and IT will converge more

The difference between IT staff and marketing teams gets less obvious. Business people get more and more involved in using data integration techniques in their everyday activities. Business people have to be more educated about information technologies. On the other hand, IT specialists need marketing skills to promote their projects and tools. Successful data integration initiatives are impossible without involving both IT and business users.

Data integration tools will improve further

Data integration and migration tools will become more user-friendly, as business users need to access them to manage data. Future tools will focus on workflow features, reporting, and better graphical user interfaces to provide business users with more opportunities.

In 2011, business will rely on data more than it ever did before. Today, digital data is a huge part of our lives. Whether the economy turns up or down, the data integration industry will continue to deliver sophisticated solutions that provide top-quality data.

December 9, 2010

Quality of Transformed Data in Data Integration

Filed under: Data Integration — Katherine Vasilega @ 7:41 am

Ensuring data quality after transformation is the most difficult part of data integration procedures. Data transformation algorithms often rely on theoretical data definitions and data models, rather than on actual information about data content. Since these definitions and models are often incomplete, outdated, or incorrect, the converted data looks nothing like what was expected before the data integration project started.

Every system consists of three layers: database, business rules, and user interface. As a result, what users see is not what is actually stored in the database. This is especially true for legacy systems, which are notorious for elaborate hidden business rules. Even if the data is transformed accurately, the information that comes out of the new system will be totally incorrect if you are not aware of those rules.

Moreover, the source data itself can be an issue in data integration. Inaccurate data tends to spread like a virus during the transformation process. A data cleansing initiative is typically necessary and must be performed before, rather than after, transformation.

To achieve data quality, you have to precede the transformation stage with extensive data profiling and analysis. In fact, data quality after the transformation is directly related to how much you know about the actual data. Lack of an in-depth analysis will guarantee a significant loss of data quality in data integration. In an ideal data integration project, 80 percent of the time should be spent on data analysis, and 20 percent on designing transformation rules. In practice, however, this rarely occurs. Therefore, the initial stage of the data integration process needs the full attention of your team.

December 8, 2010

Data Migration Project Planning

Filed under: Data Migration — Katherine Vasilega @ 2:17 am

When engaging in a data migration project, you need to identify the migration activities in advance. Here is a checklist of the important points to verify when getting ready for data migration.

Project estimates should be based on actual strategy. The data migration project should provide an accurate analysis of cost and resource requirements. You can perform a pre-migration impact assessment to verify the cost and likely outcome of the migration.

Identify the key project resources. You should inform your business and technical teams of their roles in your data migration project. You need an accurate set of tasks and responsibilities, so that every member of the team understands what he or she is expected to deliver and in what order.

Determine the best development process for the project. Data migration projects require agile project planning with highly focused delivery points. Ensure you have sufficient capacity in your plan to cope with the high possibility of delay.

Provide your team with training. Data migration typically requires many additional tools, so your team will need some training. You have to ensure that all your training materials and education tools are tested and in place.

Agree on security policies. The data migration project staff is expected to handle data securely. That is why you have to agree on a security policy and sign the corresponding documents.

The pre-migration stage involves close collaboration of data analysts, business users, and IT staff. Careful data migration planning ensures that the project initiation phase runs effortlessly.

December 7, 2010

Data Pruning in Data Migration

Filed under: Data Migration — Katherine Vasilega @ 2:30 am

The easiest way to save money and speed up your data migration project is to only move the data that is essential to your company. As a gardener trims away dry wood, so must the data migration expert look for opportunities to prune data that offers little value to the business.

Gaining data quality requires setting a clear strategy for data pruning in your data migration project. This is how you can approach it:

    • All data should be excluded from the target warehouse, unless you can justify a valid business or technical reason for including it in data migration.

    • Data profiling and scoping should be done as part of pre-migration activities.

    • Use advanced data quality tools combined with professional expertise to understand whether data is acceptable for migration.

    • Leverage data migration tools that do more than just move data. Matching, standardization, transformation, and cleaning are very important features.

    • Assign a value to datasets and explain to business users why duplicated records increase costs and reduce quality.

    • Appoint a data steward who is familiar with all the key business information in your company. This person will supervise the data migration project in general, and will be responsible for data pruning in particular.

The importance of accurate data pruning in data migration projects should not be underestimated. For example, it requires about two hours to migrate a single attribute of data. If you have 100,000,000 records against this attribute, it will take much longer to analyze, test, and migrate them than if you have 1,000.
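A pruning rule can be as simple as the sketch below, which drops records with no business justification and collapses duplicates after standardization. The records, the field names, and the cut-off year are hypothetical; in practice the rules would come from the data steward and business users.

```python
# Hypothetical customer records; the field names are illustrative only.
records = [
    {"email": "Ann@Example.com ", "last_order_year": 2010},
    {"email": "ann@example.com", "last_order_year": 2009},
    {"email": "old@example.com", "last_order_year": 2001},
    {"email": "bob@example.com", "last_order_year": 2008},
]

def standardize(email):
    """Standardize the matching key so that duplicates can be detected."""
    return email.strip().lower()

def prune(records, min_year):
    """Keep one record per standardized email, and only recent customers."""
    kept = {}
    for rec in records:
        if rec["last_order_year"] < min_year:
            continue  # assumed rule: inactive customers are not migrated
        key = standardize(rec["email"])
        best = kept.get(key)
        # Among duplicates, keep the record with the most recent order.
        if best is None or rec["last_order_year"] > best["last_order_year"]:
            kept[key] = {**rec, "email": key}
    return list(kept.values())

pruned = prune(records, min_year=2005)
print(f"{len(records)} records in, {len(pruned)} records to migrate")
```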