February 24, 2011

What Is The Difference Between Data Conversion and Data Migration?

Filed under: Data Migration — Katherine Vasilega @ 8:28 am

The terms data conversion and data migration are still sometimes used interchangeably on the internet. However, they do mean different things. Data conversion is the transformation of data from one format to another. It implies extracting data from the source, transforming it and loading the data to the target system based on a set of requirements.

Data migration is the process of transferring data between silos, formats, or systems. Data conversion is therefore only the first step in this complicated process. Besides data conversion, data migration includes data profiling, data cleansing, data validation, and the ongoing data quality assurance process in the target system.

Many internet resources use the two terms as synonyms. I think the reason might be that there are very few situations in which a company has to convert data without migrating it.

Possible data conversion issues

There are some data conversion issues to consider when data is transferred between different systems. Operating systems have certain alignment requirements, and ignoring them will cause program exceptions. Converting a file to another format can be tricky, because how you convert it depends on how the file was created. These are only a few examples of possible conversion issues.

There are some ways to avoid data conversion problems:

    1. Always transform objects into printable character data types, including numeric data.
    2. Devise an operating system-neutral format for an object transformed into a binary data type.
    3. Include sufficient header information in the transformed data type so that the remainder of the encoded object can be correctly interpreted independent of the operating system.
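Points 2 and 3 can be illustrated with a minimal Python sketch. The record layout here (a 4-byte magic tag, a version number, and a payload length) is hypothetical and chosen only for illustration. The fixed big-endian byte order, with no platform-specific padding, is what keeps the encoding independent of the operating system.

```python
import struct
from typing import Tuple

# Hypothetical header layout: 4-byte magic tag, 2-byte version, 4-byte payload length.
# The ">" prefix forces big-endian byte order and disables native alignment padding.
HEADER = struct.Struct(">4sHI")
MAGIC = b"DMIG"  # illustrative tag, not a real file format


def encode(value: int, label: str) -> bytes:
    """Encode an integer and a text label behind a self-describing header."""
    text = label.encode("utf-8")
    # Payload: signed 64-bit integer followed by the UTF-8 label.
    payload = struct.pack(">q", value) + text
    return HEADER.pack(MAGIC, 1, len(payload)) + payload


def decode(blob: bytes) -> Tuple[int, str]:
    """Read the header first, then interpret the payload it describes."""
    magic, version, length = HEADER.unpack_from(blob, 0)
    if magic != MAGIC or version != 1:
        raise ValueError("unrecognized header")
    payload = blob[HEADER.size:HEADER.size + length]
    (value,) = struct.unpack_from(">q", payload, 0)
    return value, payload[8:].decode("utf-8")
```

Because the header carries the payload length and a version number, a reader on any platform can tell how many bytes to interpret and which layout to expect.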

Data conversion is often the most important part of data migration. You have to be very careful during this stage to ensure data quality in your target system.

December 21, 2010

Data Migration Project Training

Filed under: Data Migration — Katherine Vasilega @ 6:20 am

Data migration projects typically require a lot of additional tools and support documents to function smoothly. To deliver the project successfully, organizations need data migration training. Ideally, the people working on the project should be fully trained in advance, which requires some training courses.

The training can be held in the form of a role-based game in which all parties involved in a data migration project are represented, e.g., business users, data stewards, and the IT department.

Data migration training can also be held in the form of classroom-style seminars. These lessons should cover the following topics:

    • Introduction to data migration: value, strategy, technologies
    • Data migration fundamentals: architecture, source and target metadata definitions, creating data maps, validating, tracing, debugging, and data assessment
    • Data migration tool evaluation and selection
    • Demonstrations and hands-on exercises on migrating an object from the source system to the target
    • Overview of data quality within the data migration process
    • Managing data migration projects

Carrying out a data migration project can be a complex procedure. Ensure that all your training materials and education tools are tested and in place prior to the data migration project inception.

December 20, 2010

Testing Your Data Migration Solution

Filed under: Data Migration — Katherine Vasilega @ 8:32 am

Data migration solutions should be carefully tested before and after the migration process starts. A well-organized testing strategy must include the following stages to ensure data quality in the data migration process:

Testing before Migration

During this stage, you should verify the scope of the source and target systems and data with business users and the IT department. It is also useful to test source and target system connections from the data migration platform.

Data Migration Design Review

The design review of the data migration specifications can include:

    • Source data sets and queries
    • Data mappings
    • Number of source records
    • Identification of supplementary sources
    • Data cleansing requirements

The outcome of this stage of data migration should be summarized in a list of open issues, which have to be resolved before the actual migration starts.

Testing after Data Migration

Post-migration testing is typically performed in a test environment and includes:

    • Testing the migration process (number of records per unit time)
    • Comparing migrated records to records generated by the destination
    • Comparing migrated records to sources
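Comparing migrated records to sources can be sketched in a few lines of Python. This is a minimal illustration, assuming records are dictionaries that share a unique key field; the function name `reconcile` and the `id` key are hypothetical.

```python
def reconcile(source_rows, migrated_rows, key="id"):
    """Compare migrated records to source records by a shared key field."""
    source = {row[key]: row for row in source_rows}
    migrated = {row[key]: row for row in migrated_rows}
    return {
        # Present in the source but never arrived in the target.
        "missing": sorted(source.keys() - migrated.keys()),
        # Present in the target but absent from the source.
        "unexpected": sorted(migrated.keys() - source.keys()),
        # Present in both, but the field values differ.
        "changed": sorted(
            k for k in source.keys() & migrated.keys() if source[k] != migrated[k]
        ),
    }
```

An empty report for all three categories suggests the test run moved every record intact. Any non-empty list points to the specific keys to investigate.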

It is recommended that you use both automated tools and skilled data migration specialists for successful solution testing.

December 8, 2010

Data Migration Project Planning

Filed under: Data Migration — Katherine Vasilega @ 2:17 am

When starting a data migration project, you need to identify the migration activities in advance. Here is a checklist of the important points to verify when getting ready for data migration.

Project estimates should be based on actual strategy. The data migration project should provide an accurate analysis of cost and resource requirements. You can perform a pre-migration impact assessment to verify the cost and likely outcome of the migration.

Identify the key project resources. You should inform your business and technical teams of their roles in your data migration project. You need to have an accurate set of tasks and responsibilities, so that every member of the team understands what they are expected to deliver and in what order.

Determine the best development process for the project. Data migration projects require agile project planning with highly focused delivery points. Ensure you have sufficient capacity in your plan to cope with the high possibility of delay.

Provide your team with training. Data migration typically requires a lot of additional tools, so your team will need some training. You have to ensure that all your training materials and education tools are tested and in place.

Agree on security policies. The data migration project staff is expected to handle data securely. That is why you have to agree on a security policy and sign the corresponding documents.

The pre-migration stage involves close collaboration of data analysts, business users, and IT staff. Careful data migration planning ensures that the project initiation phase runs effortlessly.

December 7, 2010

Data Pruning in Data Migration

Filed under: Data Migration — Katherine Vasilega @ 2:30 am

The easiest way to save money and speed up your data migration project is to only move the data that is essential to your company. As a gardener trims away dry wood, so must the data migration expert look for opportunities to prune data that offers little value to the business.

Achieving data quality requires setting a clear strategy for data pruning in your data migration project. This is how you can approach it:

    • All data should be excluded from the target warehouse, unless you can justify a valid business or technical reason for including it in data migration.

    • Data profiling and scoping should be done as part of pre-migration activities.

    • Use advanced data quality tools combined with professional expertise to understand whether data is acceptable for migration.

    • Leverage data migration tools that do more than just move data. Matching, standardization, transformation, and cleaning are very important features.

    • Assign a value to datasets and explain to business users why duplicated records increase costs and reduce quality.

    • Appoint a data steward who is familiar with all the key business information in your company. This person will supervise the data migration project in general, and will be responsible for data pruning in particular.

The importance of accurate data pruning in data migration projects should not be underestimated. For example, it requires about two hours to migrate a single attribute of data. If you have 100,000,000 records against this attribute, it will take much longer to analyze, test, and migrate them than if you have 1,000.

November 22, 2010

Data Migration: the Earlier Business Users Get Involved, the Better

Filed under: Data Migration — Katherine Vasilega @ 7:51 am

One of the biggest challenges of data migration is getting business users involved at the early stages. Data migration professionals often fail to collaborate on data with business users, and this has a great impact on the success of data migration projects. When business data experts are not engaged, the technical team has to decide what data needs to be migrated and how it should be mapped. When a technical team makes these decisions instead of a business team, the result can be rewriting the data mappings and business rules of the entire data migration solution.

What can you do to resolve these data migration issues? Here are some recommendations:

    1. Make sure that you have the correct source system. Validate that you are pulling out all the information needed without any duplicated records.
    2. Profile the source data and share that information with business users before data mapping.
    3. When the data mapping is complete, profile the target data to ensure that you have the right information and share your conclusions with the business team.
    4. Continue profiling and evaluating the quality of the data throughout the data migration process.
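A basic profile of the kind described in these steps can be sketched as follows. This is a minimal illustration, assuming the source extract is a list of dictionaries; the function name `profile` and the choice of key fields are hypothetical. It counts missing values per column and flags duplicated keys, which gives a short summary that can be shared with business users.

```python
from collections import Counter


def profile(rows, key_fields):
    """Summarize row count, missing values per column, and duplicated keys."""
    columns = sorted({name for row in rows for name in row})
    # A value counts as missing if the field is absent, None, or an empty string.
    missing = {
        name: sum(1 for row in rows if row.get(name) in (None, ""))
        for name in columns
    }
    # Records sharing the same key-field values are reported as duplicates.
    key_counts = Counter(tuple(row.get(f) for f in key_fields) for row in rows)
    duplicates = {key: n for key, n in key_counts.items() if n > 1}
    return {"rows": len(rows), "missing": missing, "duplicate_keys": duplicates}
```

Running the same function on the source before mapping and on the target after mapping gives two comparable reports.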

These data migration practices require close collaboration across the entire team, including data stewards, developers, and end users. You should start data migration with a project plan. Then you can appoint data stewards and data experts. After that, be sure to gain commitment from them for mapping, data quality review, validation, and scheduling the entire project. This will make your data migration efforts much more efficient and prevent wasted work.

November 18, 2010

Data Migration Process Defined

Filed under: Data Migration — Katherine Vasilega @ 5:05 am

Data migration is a crucial operation within any enterprise and its failure can be catastrophic. So what are the stages of the successful data migration process? Here they are:

1. Source system exploration: Although source systems may contain thousands of fields, some might not be needed in the target system. During this stage, you have to identify which data is required and where it is located. You also have to decide what data is redundant and not necessary for the migration.

2. Data assessment: Next, you have to assess the quality of the source data. If the new system fails due to inconsistent, incorrect, or duplicate data, there is very limited value in migrating data to the target system. To assess the data, use data profiling. Profiling identifies data defects at the table and column levels.

3. Data migration solution design: You have to define the technical architecture and design of the migration processes. In addition, you have to define the testing processes and determine whether there will be a one-way or bi-directional data migration, and whether you will purchase a data migration tool or build a custom one.

4. Execution: In the majority of cases, the source systems are shut down during the data migration execution. In some cases, a zero-downtime migration approach may be needed. This requires data migration software that supplements the initial load processes with additional data synchronization technology. It allows capturing changes and synchronizing the source and target data after the initial load finishes.

5. Maintenance: There have to be ongoing data quality enhancements. You will need to manage data improvements and monitor the data quality of the new system.
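The zero-downtime approach mentioned in the execution stage can be sketched as an initial load followed by a replay of captured changes. This is only an illustration under simplifying assumptions: the stores are in-memory dictionaries, and each change is a hypothetical (operation, key, record) tuple. Real synchronization tools capture changes in their own ways.

```python
def initial_load(source):
    """Copy every source record into a new target store."""
    return {key: dict(record) for key, record in source.items()}


def apply_changes(target, changes):
    """Replay changes captured on the source while the initial load ran.

    Each change is a hypothetical (operation, key, record) tuple.
    """
    for operation, key, record in changes:
        if operation == "delete":
            target.pop(key, None)
        else:
            # "insert" and "update" are both treated as upserts.
            target[key] = dict(record)
    return target
```

Treating inserts and updates as the same upsert operation keeps the replay idempotent, so applying the same batch of changes twice leaves the target in the same state.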

Successful data migration is based on getting three things right: people, process and technology. Getting any one or more of these three wrong will damage the entire project.

October 26, 2010

Data Integration and Data Migration

Filed under: Data Integration, Data Migration — Katherine Vasilega @ 5:55 am

Data integration implies combining data from different sources and providing users with a unified view of this data. Data migration, in turn, is transferring data between storage types, databases, or computer systems. Data migration is the permanent move of data from the source to the target. Data integration is the periodic movement of data from, for example, a CRM system to a data warehouse. Data integration is a wider notion, and it can include data migration.

There are differences in how you approach data integration and data migration. More importantly, however, they share common issues, two of which I will describe below.

Business involvement

Since data is utilized by a business, it greatly influences business decisions. That is why any project that requires changing the data eventually needs business involvement.

Understanding the data

Many projects fail because of incorrect assumptions and gaps in knowledge that only become obvious when a project is already running. It is crucial that the source data is clearly understood prior to data integration or migration.

These two issues are connected in many ways. To understand the data, you need business experience with it. To get the business involved, you have to be able to deliver technical information in a way that is intelligible to non-technical users.

The genuine problems with data, such as duplicate records, missing fields and records, and incorrect values, often come out only in the process of data migration or data integration. That is why it is so important to involve business users and make sure they understand the data before performing data migration or data integration. The proper approach will save you time, cost, and effort on both data migration and data integration.

October 14, 2010

Major Data Migration Mistakes

Filed under: Data Migration — Katherine Vasilega @ 6:48 am

As we know, data migration is something that has to be carefully planned. It is not a great idea to move your data into the cloud or any other single repository, until you outline all the requirements and examine how the data migration solution will work in practice. Here’s an overview of the major mistakes that companies make when performing data migration:

Considering data migration solely an IT project. If business owners, project managers, and partners don’t work out a business approach to data migration, your project is likely to fail.

Getting the easiest work done first. If you move all of your data or the parts that are easiest to move first, and decide later what objects and master types are really needed, the solution is not going to work.

Hiring developers with no experience in data migration to design your solution. Don’t assume that all software developers know how to move whole records across source and target systems. Not all of them do.

Performing data migration as a part of a bigger project. Data migration is quite a challenge by itself. Don’t think that it is done in a couple of hours.

Not thinking about data quality standards. If you don’t have time for outlining the data quality standards, then it’s better not to perform data migration at all. You don’t want to use inaccurate and outdated information, do you?

Lack of requirements. If you are not sure what problem you are solving with a data migration solution, you’d better save your time and money, until you know exactly what you need.

Buying the data migration tool first. Are you sure this tool is going to address your business needs just because somebody told you that it has worked fine for their organization? There is no data migration technology that fits every business need. You might want to take time to do some research.

If you follow this advice, you will probably find that the effort required to perform data migration is significantly greater than you thought. It really is, and not realizing this fact can lead to considerable time, money, and data losses.

September 17, 2010

Data Mapping for Data Integration

Filed under: Data Integration, Data Migration, Data Synchronization, Database Integration, ETL — Katherine Vasilega @ 6:43 am

Your data sources grow together with your business. You have ERP and CRM systems, mail clients, Web forms, and Excel documents, and it’s getting harder to tell which data is accurate. Data integration can solve this issue, but how do you transfer data from multiple sources in a nice and easy way? You’d probably need to deploy a system that automates the process of data transfer.

What is data mapping?

Data mapping is used in data integration when you need to gather information from multiple sources. Data mapping involves matching between a source and a target, e.g., two databases that contain the same data elements but call them by different names. A simple example of data mapping is moving the value from a ‘customer name’ field in one DB to a ‘customer last name’ field in another DB. To do so, your ETL tool needs to know that you want to take the value from the source field ‘customer name’, cut out the first part (name) and leave the second part (last name), and move it to the target field ‘customer last name’. In addition, the steps for performing these operations need to be recorded in the data integration process.
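The ‘customer name’ example above can be written as a small Python function. This is a sketch rather than how any particular ETL tool works. It assumes the last whitespace-separated word of the source value is the last name, which will not hold for every naming convention.

```python
def map_customer(source_record):
    """Map the source field 'customer name' to the target field 'customer last name'."""
    full_name = source_record["customer name"].strip()
    # Assumption: the last whitespace-separated token is the last name.
    last_name = full_name.split()[-1] if full_name else ""
    return {"customer last name": last_name}
```

For a source record with the customer name "Jane Smith", the function returns a target record whose ‘customer last name’ is "Smith".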

Data mapping tools

Modern ETL systems include functionality for making data maps. Commonly, these are graphical mapping tools. They enable you to draw a line from one field to another, identifying the correct connection. It’s relatively easy to do if you want to, let’s say, move your contacts from the mail client to the CRM. But what if the task is more complicated, such as moving the information received through a Web contact form (first name, last name, address, phone, email, company name) to a CRM that has different fields for all these values? It will take a lot of time to do it manually, and it will take some time for you to draw a data map in your ETL tool, unless you are an IT specialist.
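Behind a graphical tool, a data map for the Web contact form case amounts to a table of field pairs. The sketch below shows the idea in Python. The form field names come from the example above, while the CRM field names on the right are hypothetical.

```python
# Hypothetical data map: Web form field -> CRM field.
FIELD_MAP = {
    "first name": "FirstName",
    "last name": "LastName",
    "address": "MailingAddress",
    "phone": "Phone",
    "email": "Email",
    "company name": "AccountName",
}


def apply_map(form_record, field_map=FIELD_MAP):
    """Rename form fields to CRM fields, refusing any field without a mapping."""
    unmapped = sorted(set(form_record) - set(field_map))
    if unmapped:
        raise KeyError(f"no mapping for: {unmapped}")
    return {field_map[name]: value for name, value in form_record.items()}
```

Rejecting unmapped fields, rather than silently dropping them, is a design choice that surfaces gaps in the map as soon as the form changes.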

Open source ETL

Remember, if you are using an open source ETL tool, there is always a community behind it. People from all over the world create data maps for various purposes and make them available for free download. You can find great tools for complicated data integration tasks and use them for free. There is no need to draw a data map of your own; just use what has already proved to be effective. That way you can execute your data integration with minimal effort and expense.
