April 20, 2010

Understanding Data Integration in the Cloud Context

Filed under: Data Integration, Data Migration — Tags: , — Olga Belokurskaya @ 2:29 am

Today, cloud providers enable both small companies and enterprises to enjoy cost savings, scalability, and ongoing server support by moving their applications, or parts of them, to the cloud. However, data integration remains among the challenges that cloud providers and their customers face. According to a number of cloud experts (David Linthicum among others), this is because data integration in the context of the cloud has not yet been clearly understood and elaborated.

  • Sadly, data integration seems to be an afterthought for many cloud providers; they do not consider companies’ need to synchronize data in the cloud with on-premises sources.
  • Another issue is that there are still no common standards for data integration between different clouds. Companies have different business goals, and they may use services from different cloud providers. So today we speak not just about cloud-to-on-premises integration, but about cloud-to-cloud integration as well. Since a number of different cloud platforms are available today, providers are expected to consider cross-platform integration as soon as possible.

So, in the context of the cloud, data integration means the ability to integrate and manage data across all the on-premises and cloud-based systems a company uses. Cloud providers should take this into account and give their customers such an ability.

April 5, 2010

ETL Faces New Challenges

Filed under: Data Integration, ETL — Tags: , , — Olga Belokurskaya @ 3:30 am

Today, a well-established data integration process is necessary for a sound business. Business information is a valuable asset; companies’ decision-makers depend greatly on the data they receive and on its quality, value, and timeliness. As the amounts of data companies work with grow exponentially, the requirements for ETL systems get more complex. Today, ETL providers face some new challenges along with traditional data integration issues:

Scalability. ETL systems need to be able to process large volumes of data that are bound to keep growing. Moreover, today’s business reality requires getting more data in less time. So, scalable ETL is a must.

Operability. A large company’s IT system comprises multiple disparate sources of business-critical data, such as databases, CRM systems, etc. These days, an ETL tool should have connectivity to all those systems. Moreover, data integration between all data sources often requires complex transformations to make the data fit the formats each system expects.

Real-time data integration. This requirement is being heard more and more often. The need for real-time data demands that ETL systems be able to run extract-transform-load operations and gather all the data in a standard, homogeneous environment within a very short period of time.

Finally, the Cloud. As cloud offerings mature and provide beneficial solutions (especially for small and mid-sized businesses), companies choose to move parts of their applications to the cloud. Providing connectivity to cloud systems is one of today’s ETL challenges as well.

March 12, 2010

What Should a Data Migration Plan Comprise?

Filed under: Data Cleansing, Data Migration, Data Quality — Tags: , — Olga Belokurskaya @ 2:22 am

In my previous posting, I wrote about the importance of planning to avoid data migration project failure. So today, I’d like to say a few words on what a data migration plan should provide for. I mentioned pre-migration procedures and the migration process. A good data migration project starts with planning the necessary pre-migration procedures and then moves on to planning the process itself.

Why a pre-migration stage? The data can’t be migrated from an old system to a new one just as is, because the old problems will be migrated to the new system as well, making the migration useless. To make the most of the new system, a company should ensure that the data migrated there can bring value to the business and can be used by the business users. Thus, before being migrated:

  • The business value of the data should be analyzed to define what data is to be migrated.
  • Data cleansing (elimination of duplicate records, etc.) should be performed to ensure the quality of the data.
  • If needed, data transformation should also be performed to ensure that data formats meet the new system’s requirements.
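
The cleansing and transformation steps above can be sketched in a few lines of Python. This is only an illustration, not a prescribed method: the record layout, the choice of email as the duplicate key, and the legacy DD.MM.YYYY date format are all assumptions made for the example.

```python
from datetime import datetime

def normalize_key(record):
    """Build a comparison key so that trivially different duplicates match.
    Using the email as the key is an assumption made for this sketch."""
    return record["email"].strip().lower()

def cleanse(records):
    """Drop duplicate records, keeping the first occurrence of each key."""
    seen = set()
    unique = []
    for record in records:
        key = normalize_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

def transform(record):
    """Convert a hypothetical legacy date format (DD.MM.YYYY) to ISO 8601."""
    converted = dict(record)
    parsed = datetime.strptime(record["signup_date"], "%d.%m.%Y")
    converted["signup_date"] = parsed.date().isoformat()
    return converted

# Hypothetical legacy data: the first two rows describe the same contact.
legacy_records = [
    {"email": "Ann@Example.com ", "signup_date": "05.03.2010"},
    {"email": "ann@example.com", "signup_date": "05.03.2010"},
    {"email": "bob@example.com", "signup_date": "12.03.2010"},
]

ready = [transform(r) for r in cleanse(legacy_records)]
```

In a real project, the duplicate key and the target formats would come out of the business-value analysis and the new system’s requirements rather than being hard-coded like this.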

A well-elaborated process is the key to a data migration project’s success.

  • A data migration project requires creating and implementing migration policies that define the order of the process and a responsible person for each stage of the migration. When the order of the process is set, it’s easier to prevent trouble, such as a server or system crash caused by an excessive amount of data being migrated at once.
  • Testing is an important stage. One should test each portion of the migrated data to ensure that it’s accurate and in the right format. Without proper testing, the whole data migration project may fail. It’s not a good decision to migrate tons of data only to find out that it’s not in the expected format or the new system can’t read it, and thus the migrated records are useless.
  • To ensure the future success of a data migration project, each stage of the migration process should be carefully documented.
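
As a rough illustration of the points above, the following Python sketch migrates data in small portions, tests each portion before loading it, and keeps a log of what happened. The validation rules, the batch size, and the `load` callback are hypothetical; a real project would substitute its own checks and its own target system.

```python
def batches(records, size):
    """Yield the records in portions of at most `size` items."""
    for start in range(0, len(records), size):
        yield records[start:start + size]

def validate(record):
    """Return a list of problems found in a record (empty if it looks fine).
    These two checks are placeholders for real format rules."""
    problems = []
    if not record.get("id"):
        problems.append("missing id")
    if "@" not in record.get("email", ""):
        problems.append("malformed email")
    return problems

def migrate(records, load, batch_size=2):
    """Load records batch by batch, skipping any batch that fails validation.
    Returns a log of (batch number, outcome, batch length) tuples."""
    log = []
    for number, batch in enumerate(batches(records, batch_size), start=1):
        if any(validate(record) for record in batch):
            log.append((number, "rejected", len(batch)))
            continue
        load(batch)
        log.append((number, "loaded", len(batch)))
    return log

# Hypothetical source data; the last record has a malformed email.
source = [
    {"id": 1, "email": "ann@example.com"},
    {"id": 2, "email": "bob@example.com"},
    {"id": 3, "email": "not-an-email"},
]

target = []  # stands in for the new system
migration_log = migrate(source, target.extend, batch_size=2)
```

Here the first batch is loaded and the second is rejected, so the problem surfaces after a small portion of the data rather than after the whole migration, and the log doubles as simple documentation of each stage.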

So, to conclude: ensure you know what to migrate, ensure its quality, systematize the process, test, test again, and document it. This may seem rather time-consuming; in reality, however, when all the procedures and stages are planned, you get a clearer picture of the time and budget the data migration process will require.

March 9, 2010

The Role of Planning in Data Migration

Filed under: Data Migration, Data Quality, Uncategorized — Tags: , — Olga Belokurskaya @ 3:00 am

Data migration has never been an easy process, and though a variety of tools are available today, the process remains complex, and the rate of errors is high. Data migration may fail due to hardware or system failures, but those are so-called unforeseen situations. The most common reason for migration project failure is a lack of proper planning.

As a result of rushing into migration without careful planning of the time and resources needed, data migration projects experience schedule delays and require additional expenses, so budgets get overrun. That’s because multiple issues occur during the process of data migration, including copy process failures, mismatched data formats between the source and the target, server crashes due to excessive amounts of data migrated at once, etc. Coping with these issues requires time and money, so the data migration process may stall.

Proper migration planning should include a set of pre-migration procedures and a well-elaborated migration program to help address data migration complexities, meet deadlines, and avoid unexpected additional costs. I’ll touch on this in my next posting.

March 5, 2010

Database Integration: On the Importance of Data Quality Standards

It’s a sad fact, but many organizations realize the poor quality of the data in their databases only when it comes to database integration. Data quality issues are among the common reasons for data integration failure.

This neglectful attitude toward data quality stems from the fact that companies often don’t understand how much data quality impacts business processes. Thus, each data source or database a company uses may have its own rules and standards for data quality. The issues, however, emerge as soon as database integration starts in order to get a unified view of, for example, the company’s customer data.

Those issues may arise from differences in data fields, for example, or in data formats, so the same contact may be represented differently in different databases. Because of those differences, database integration can’t be performed correctly, which may lead to data duplication and many more data quality issues. In fact, as a result of integrating several databases of poor quality, a company gets one big database of poor quality. This means that the database integration was in vain: it failed to achieve its main goal of providing the company with a general view of business data, while the integration expenses were significant.

Unfortunately, data quality technology does not always allow organizations to fix poor data. So, it’s much wiser to implement company-wide standards for data quality, preventing the data quality issues associated with integrating data from heterogeneous sources, than to perform data cleansing and other data quality procedures afterward.

March 3, 2010

Data Integration Is Not About Tools, It’s About Strategy

Filed under: Data Integration, Data Migration, ETL — Tags: , — Olga Belokurskaya @ 4:13 am

Today, organizations face increasing data integration challenges. The amounts of data grow progressively, demanding new levels of data protection and making data migration even more complex.

At the same time, business demands access to real-time information and quality data to make the right business decisions. There are plenty of technologies that are able to address data integration challenges, though some of them are becoming outdated while others continue to mature. Thus, according to Forrester Research, MDM and data quality services continue to mature, while ETL (extract, transform, and load) and data replication “have reached the Equilibrium phase,” and some technologies are moving into decline.

But that doesn’t mean that a technology or tool at the top of the list today will necessarily become a successful solution for data integration challenges. Lots of organizations regard data integration as mostly a technological process, not taking into account how it impacts the organization’s long-term plans and the success (or failure) of the business. However, successful data integration is mostly about strategy. And when a strategy is defined, the choice of tools gets much easier, and the risk that a chosen technology won’t cope with the task is much lower.

February 25, 2010

What May Complicate Data Migration?

Filed under: Data Integration, Data Migration — Olga Belokurskaya @ 1:38 am

Data migration is a complex process, and it differs greatly from other IT projects. Sound approaches should be introduced when a data migration strategy is being planned. The success of data migration depends on many things, and each detail is important. But there are certain conditions that increase the complexity of the data migration process and thus demand even more attention.

One such challenging condition is moving from a current vendor to a new one, which may mean different types of applications and systems, different data formats, etc. This may demand the use of data migration tools different from those the company has used before.

Another condition that complicates data migration is moving from a physical environment to a virtualized one, such as the Cloud. Though the end result of the shift may be great, a much greater effort is needed to overcome the difficulties connected with the process (lack of interoperability between different cloud and physical platforms, security provisions, access to the data, etc.).

So, as new possibilities for storing, accessing, and working with data appear, promising a significant decrease in expenses and resources, organizations will adopt them. However, the conditions that may complicate the process of data migration should be taken into account, and companies should plan for them.

February 18, 2010

Data Migration: Challenges of Moving Data to a CRM

Filed under: Data Integration, Data Migration, Data Quality — Tags: , — Olga Belokurskaya @ 4:36 am

CRM is a great solution for effectively managing a company’s customer data. However, to ensure efficiency, get the most out of a CRM system, and avoid CRM failure, special attention should be paid to the data that is being migrated to a CRM and to how it is being migrated. There are challenges that may affect the process of data migration:

  • Migration from heterogeneous sources. Data migration from various sources to a CRM is a challenge, so it should be approached with care and strategy. Migrating the data just as is leads to CRM failure, as you simply take siloed data and move it to another system (the data still remains siloed). So, before data migration takes place, there is work that should be done to ensure the quality of the data that is targeted to a CRM.
  • Data mapping. To migrate data, one should map the source and the target. That’s clear. The problem is that, in different sources, data fields holding the same information may be named differently (for example, “Username” and “Account”), or, vice versa, identically named fields may stand for different values (“Name” may stand for a first name only, or for the full name, including first name and surname). That’s why it’s very important to map fields in a data source to the appropriate fields in a CRM application.
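
The mapping problem described above can be made concrete with a small Python sketch. Everything here is hypothetical: the two sources (“billing” and “support”), their field names, and the CRM field names are invented for illustration, and the rule that “Name” means a full name in one source and a first name in the other is simply assumed.

```python
# Per-source mapping from source field names to (hypothetical) CRM field names.
FIELD_MAP = {
    "billing": {"Username": "account_name", "Mail": "email"},
    "support": {"Account": "account_name", "Email": "email"},
}

# Assumption for this sketch: only the "support" source stores a full name
# in "Name"; "billing" stores just the first name there.
FULL_NAME_SOURCES = {"support"}

def split_full_name(value):
    """Split 'First Last' into separate first-name and surname fields."""
    first, _, last = value.strip().partition(" ")
    return {"first_name": first, "last_name": last}

def to_crm(source, record):
    """Translate one source record into CRM field names."""
    mapping = FIELD_MAP[source]
    crm = {mapping[field]: value
           for field, value in record.items() if field in mapping}
    if "Name" in record:
        if source in FULL_NAME_SOURCES:
            crm.update(split_full_name(record["Name"]))
        else:
            crm["first_name"] = record["Name"]
    return crm

billing_row = {"Username": "acme", "Mail": "ann@acme.test", "Name": "Ann"}
support_row = {"Account": "acme", "Email": "ann@acme.test", "Name": "Ann Lee"}

from_billing = to_crm("billing", billing_row)
from_support = to_crm("support", support_row)
```

Keeping the mapping in a plain table like `FIELD_MAP` makes it easy to review with the people who know each source system before any data is moved.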

It’s highly probable that before data migration starts, data will need to be transformed to make it appropriate for the new CRM system. Of course, it should also be cleaned of duplicates, incomplete records, outdated records, etc. Only when data is prepared, clean, and in the appropriate format may it be migrated to a CRM.

February 17, 2010

ETL Tools: How to Make a Choice

Filed under: Data Integration, ETL — Tags: , — Olga Belokurskaya @ 5:55 am

There is a wide variety of ETL tools available on the market, ranging from solutions with minimal functionality to tools that help solve complex tasks. There is also a choice between proprietary offerings and open source ETL tools, and between web-based and desktop solutions. Selecting an ETL tool requires some effort. When choosing an ETL tool for a particular company, a lot of things should be taken into consideration, including the data management processes currently in use, the technologies utilized, the IT staff available, etc.

Thus, to evaluate ETL tools and make a decision in favor of a certain offering, a set of questions should be answered, concerning:

  • The operating systems supported by an ETL tool.
  • The volume of data the tool is able to handle in a given period of time.
  • The data sources and data formats the ETL tool supports, etc.
  • The conditions on which maintenance and support are provided (paid, free of charge, etc.), which are also important to find out.

Besides, a company’s data integration requirements should be analyzed and compared with the functionality that different ETL tools provide before making a decision and purchasing a tool. That way, a company may avoid paying for functionality it is not going to use.

February 16, 2010

When Developing Systems Architecture Think About Data Integration

Filed under: Data Integration, Data Quality — Tags: , — Olga Belokurskaya @ 6:34 am

“The data, and integration strategies around the data, is something that most figure is there, will be there, and requires very little thinking and planning.” – This is what I’ve read today at ebizQ.

This again supports the idea that data, though being “the biggest companies’ value”, is still often neglected. This results in data integration and quality issues, producing inconsistent data and ruining the entire idea of data integration as a way to provide a clear view of enterprise data, the data that is important for business decisions.

In fact, very often when it comes to designing and developing the architecture, all the attention is focused on the technical side of the process. Thus, data integration strategies become an afterthought, making it difficult to meet business requirements. So, the message is that provisions for data integration should be made at the level of enterprise systems architecture development.
