
October 8, 2010

Data Integration in Social Networks: Is It Real?

Filed under: Data Integration, Data Quality. Posted by Katherine Vasilega @ 8:07 am

Social networking is growing rapidly. The large number of social networks has resulted in vast but diverse information about individuals. To put this data to commercial use, we need a smart solution that integrates the information available across different social networks. This is a challenging task for any data integration software; let me explain why.

First, there are no restrictions on the amount of data a user can publish in social networks. In addition, this massive amount of data is not necessarily structured, so there is no fixed set of fields to map from one network to another. Therefore, data integration across social networks can become a headache.

Second, there are plenty of privacy and security concerns in social networks. A forged identity is extremely difficult to track, and preventing one is nearly impossible. There are no means for properly monitoring unauthorized access to data in social networks. Anyone can create a profile for Bill Gates, Barack Obama, or Charlie Chaplin. Misrepresentation of information in social networks may lead to incorrect data mapping, which creates obstacles to developing a consistent, single view of data.

Data integration in social networks is now generating a lot of interest and is definitely a future trend in data integration. However, the potentially high commercial value of social networks is hard to realize in full. There are still too many privacy and consistency issues related to data integration in social networking. At the moment, they slow down the development of a comprehensive data integration solution.

Though some attempts have been made to integrate data from social networks, these solutions cannot yet be applied to commercial use.

October 5, 2010

Data Integration in Hybrid Clouds

Filed under: Data Integration. Posted by Katherine Vasilega @ 4:04 am

Hybrid Clouds are the latest buzz in the industry. This architecture allows you to utilize a combination of Private and Public Clouds, sharing the processing load between them. Unfortunately, businesses often do not treat data integration as a crucial part of a Hybrid Cloud solution, even though it can be quite a challenge.

Some IT analysts even state that Hybrid Clouds will require organizations to employ a full-time data integration specialist. That’s because Hybrid Clouds are not only about application integration, but also about infrastructure integration. A successful infrastructure integration strategy must be able to incorporate on-premises and off-premises system resources to enable consistent processes and resource management, which is a pretty tough task.

When performing data integration, you also have to keep in mind that you are extending your Private Cloud infrastructure out to Public Cloud providers. This means that you need to create a solution that allows data to be moved seamlessly between Private and Public Clouds. Apart from the usual data integration issues, such as careful planning, you also need to keep in mind the solution’s latency and the data integration support offered by the Public Clouds you use.

Latency is the delay involved in transferring data between the clouds. Keeping it low is especially crucial, since that lets the Public and Private Clouds share data without slowing one another down. Data integration support means that your Public Cloud provider should support data integration solutions. This support should include access to all major Cloud APIs and shouldn’t require an immense amount of customization.

Though data integration in Hybrid Clouds is definitely a trend today, there are still some challenges for applications and processes to seamlessly move information between Private and Public Clouds.

October 4, 2010

Data Integration Categories

Filed under: Data Integration, Data Quality. Posted by Katherine Vasilega @ 6:45 am

There are three major data categories to consider when carrying out data integration initiatives. They require a clear understanding to help find a proper data integration solution. Here is a brief description of each category.

1. Master Data. Also called reference data, master data is any information that is considered to play a key role in the business. Master data may include information about customers, products, employees, locations, inventory, suppliers, and more. Master data is stored in the Data Warehouse.

2. Operational Transaction Data. This data includes information about business activities, such as purchases, call details, claims, transactions, and so on. It is stored in the Operational Data Store and is considered low-level data with limited history that is captured in “real time” or “near real time,” as opposed to the much greater volumes of master data.

3. Decision Support Data. This data category includes historical data used in strategic and tactical analyses. Trends, patterns, data mining, and multi-dimensional analytics can then be used in Decision Support systems that are able to provide predicted outcomes from different scenarios and strategies, thus answering “what if?” questions.

All three data types require similar processes, as data must be collected, cleaned, integrated, and populated into the repository. In addition, the three forms of data share many of the same data integration technologies: ETL, hardware, software, applications.
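
As a rough illustration of the four steps named above (collect, clean, integrate, populate), here is a minimal Python sketch. The sample records, the field names, and the in-memory repository are all assumptions made for the example; a real project would use an ETL tool and a proper data store.

```python
def collect(sources):
    """Gather raw records from every source into one list."""
    return [record for source in sources for record in source]


def clean(records):
    """Trim whitespace, lowercase emails, and drop records without an email."""
    cleaned = []
    for record in records:
        email = (record.get("email") or "").strip().lower()
        if not email:
            continue  # unusable without a key to match on
        cleaned.append({"email": email, "name": (record.get("name") or "").strip()})
    return cleaned


def integrate(records):
    """Merge records that share an email; the first non-empty name wins."""
    merged = {}
    for record in records:
        existing = merged.setdefault(record["email"], dict(record))
        if not existing["name"]:
            existing["name"] = record["name"]
    return merged


def populate(repository, merged):
    """Load the integrated records into the target repository."""
    repository.update(merged)
    return repository


# Hypothetical sample data from two source systems.
crm = [{"email": " Ann@Example.com ", "name": "Ann Lee"}, {"email": "", "name": "No Email"}]
web = [{"email": "ann@example.com", "name": ""}, {"email": "bob@example.com", "name": "Bob Ray"}]

repository = populate({}, integrate(clean(collect([crm, web]))))
```

Here the two records for Ann collapse into one entry, and the record with no email is dropped during cleaning.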

Whether you create a distinct data integration solution for each data type, or a single data integration solution for all three types, you have to study what data integration vendors are offering and choose the best technology to fit your needs.

October 1, 2010

Data Integration: 3 Most Common Mistakes

Filed under: Data Integration, Data Quality, ETL. Posted by Katherine Vasilega @ 4:51 am

Implementing a data integration solution is not an easy task. Companies tend to make some common mistakes, which result in delayed data integration projects, increased costs, and reduced data quality. Today, I’d like to focus on the three most common ones.

1. Lack of a comprehensive approach

Data integration is not only about gathering requirements, determining what data is needed, creating the target databases, and then moving data. You have to develop a comprehensive data integration approach that will provide for:

• Processing complex data, such as products and customers, in relation to facts, such as business transactions
• Filtering and aggregating data
• Handling data quality
• Capturing changes and avoiding gaps in historical data.
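
One item on this list, capturing changes, can be sketched as a comparison of two snapshots keyed by record ID. Snapshot diffing is only one possible technique, chosen here for illustration; the `diff_snapshots` name and the sample data are hypothetical.

```python
def diff_snapshots(previous, current):
    """Compare two snapshots (dicts keyed by record ID) and report the changes."""
    inserted = {key: current[key] for key in current.keys() - previous.keys()}
    deleted = {key: previous[key] for key in previous.keys() - current.keys()}
    updated = {
        key: current[key]
        for key in current.keys() & previous.keys()
        if current[key] != previous[key]
    }
    return {"inserted": inserted, "updated": updated, "deleted": deleted}


# Hypothetical customer snapshots taken on two consecutive days.
yesterday = {1: {"city": "Boston"}, 2: {"city": "Austin"}, 3: {"city": "Denver"}}
today = {1: {"city": "Boston"}, 2: {"city": "Dallas"}, 4: {"city": "Miami"}}

changes = diff_snapshots(yesterday, today)
```

Storing each day's `changes` instead of overwriting the target is one way to keep history without gaps.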

2. Missing data quality requirements

You may think that data quality problems are simply data errors or inconsistencies in the transactional systems that can be easily fixed. The truth is that you have to prevent quality problems at the initial stage of a data integration process. You have to plan how to set data quality requirements, incorporate data quality metrics into your system architecture, monitor those metrics in all your data integration processes, and report on data quality.
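
To make "data quality metrics" more concrete, the sketch below computes two simple ones, completeness and duplicate rate, for a batch of records. The metric definitions and the `quality_report` helper are assumptions for illustration, not an industry standard.

```python
def quality_report(records, required_fields, key_field):
    """Return completeness and duplicate-rate metrics for a batch of records."""
    total = len(records)
    if total == 0:
        return {"completeness": 1.0, "duplicate_rate": 0.0}
    # A record is complete when every required field has a non-empty value.
    complete = sum(
        1
        for record in records
        if all(record.get(field) not in (None, "") for field in required_fields)
    )
    distinct_keys = {record.get(key_field) for record in records}
    return {
        "completeness": complete / total,
        "duplicate_rate": 1 - len(distinct_keys) / total,
    }


# Hypothetical batch: one record has an empty email, and id 2 appears twice.
batch = [
    {"id": 1, "email": "ann@example.com"},
    {"id": 2, "email": ""},
    {"id": 2, "email": "bob@example.com"},
    {"id": 3, "email": "cy@example.com"},
]
report = quality_report(batch, required_fields=["id", "email"], key_field="id")
```

Running such a report at every stage of the integration process gives you numbers to monitor and to include in data quality reporting.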

3. Using custom coding instead of ETL

While most businesses consider ETL the best practice, there are still a lot of companies that use custom coding to create countless data shadow systems. Keep in mind that custom code makes programs difficult to manage and maintain, does not offer centralized storage of programs, limits metadata capabilities, and has a longer development cycle. Debugging is also more difficult with custom code than with an ETL tool. By contrast, an ETL tool usually has a user-friendly interface, provides centralized storage of programs, and is relatively easy to customize.

By thinking through all these issues before developing and implementing a data integration solution, you can save time, money, and valuable data.

September 30, 2010

Choosing the Right Tool for Custom Data Integration Software

Apatar can be used as data integration software of choice for a variety of implementation options. However, sometimes our development team faces the challenge of developing custom data integration applications that fit the specific needs of a customer. In my previous post, I told you about the integration with MS Dynamics CRM. Today, I will tell you why the team chose to develop a custom data integration application in PHP.

First, the team considered implementing Apatar for data integration. To successfully apply this solution, a GUI Server had to be installed and configured first. The customer didn’t have a GUI Server and did not intend to deploy one. Besides, it was challenging to support the customer’s database triggers with Apatar.

The team decided to build a custom data integration solution. They had to figure out what would be the best tool for this application, considering three major languages: Java, Delphi, and PHP.

To run the application on Java, the team would have had to build a custom server and implement HTTP or a custom protocol. Had they implemented a custom protocol, some additional application features would have had to be developed as well. It appeared to be a complex and time-consuming task. Delphi had the same limitations as Java; what’s more, it is not a cross-platform tool and works in the Windows environment only.

Finally, the team voted for PHP, as it appeared to be the right tool to suit the customer’s requirements. PHP can be deployed on most Web servers and on many operating systems and platforms, and can be used with many relational database management systems. It does not have the limitations of the above-mentioned tools. Moreover, the customer’s hosting server already had a configured PHP environment, so no additional setup was required.

As a result, PHP proved to be the right choice for building a custom data integration solution in a short period of time. Deploying the solution did not require additional cost or effort to implement new features and technologies.

September 29, 2010

Data Integration Application for Microsoft Dynamics CRM

A couple of days ago, our team faced an issue when developing a custom application for data integration with Microsoft Dynamics CRM. We thought that sharing the problem and the solution would be useful for the community, so here it is.

Apatar developers chose PHP for that custom data integration application; I will explain why they chose PHP and not Java or Delphi in my next post. Meanwhile, please keep in mind that this case describes accessing Microsoft Dynamics CRM by means of PHP NuSOAP.

The Microsoft Dynamics CRM Software Development Kit includes documentation that covers a wide range of instructive and practical information. Unfortunately, it does not provide the appropriate information on accessing Microsoft Dynamics CRM MetadataService (fields, tables, and their descriptions) and CrmService (accounts, contacts, leads). To be more exact, it provides the same URL formula for accessing both metadata and data services:


where service name is either MetadataService or CrmService.

In the course of development, it became clear that this formula does not work for both services. So, the correct formulas are:

For Metadata


For CRM data


As you can see, to access MetadataService you have to add ?wsdl to the end of the URL, and to access CrmService you don’t have to add anything.
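
The rule can be captured in a small helper. The original application used PHP NuSOAP; this Python sketch illustrates only the URL rule. The `endpoint_for` name is hypothetical, and the URLs below are placeholders standing in for the actual service addresses, which are not reproduced here.

```python
def endpoint_for(service_url, service_name):
    """Apply the rule from this post: MetadataService needs ?wsdl, CrmService does not."""
    if service_name == "MetadataService":
        return service_url + "?wsdl"
    if service_name == "CrmService":
        return service_url
    raise ValueError(f"unknown service: {service_name}")


# Placeholder URLs, not real Dynamics CRM addresses.
metadata_url = endpoint_for("https://crm.example.invalid/metadata", "MetadataService")
crm_url = endpoint_for("https://crm.example.invalid/crm", "CrmService")
```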

Hope you’ll find this information useful!

September 24, 2010

Customer Data Integration Using RSS Feeds

Filed under: Data Integration, Database Integration. Posted by Katherine Vasilega @ 7:45 am

One of the ways to improve your business management is to keep your CRM users better informed. You can send out emails to your employees when a specific event happens in CRM (new contact or new lead is added), but you can go further than that and provide the same up-to-date information without emails. Database integration with RSS feeds opens up great opportunities for customer service and workflow.

Database integration with RSS lets you aggregate RSS feeds, filter them by relevant keywords, and deliver the relevant content to a specific user. You can create one generic feed or multiple feeds, so CRM users can customize the information they receive. Let’s say you have five new customers a day: you can immediately inform sales managers of new opportunities and leads, alert executives when opportunities close and new ones come in, send contacts’ details to your marketing department, etc.
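
Keyword filtering of a feed can be sketched with Python's standard XML parser. The sample feed, its item titles, and the `filter_items` helper are made up for the example; a real integration would read the feed from your CRM or a news portal.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed of CRM events.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>CRM updates</title>
    <item><title>New lead: Acme Corp</title><description>Lead added by web form</description></item>
    <item><title>Opportunity closed: Globex</title><description>Deal won</description></item>
    <item><title>New contact: Jane Doe</title><description>Contact at Acme Corp</description></item>
  </channel>
</rss>"""


def filter_items(feed_xml, keywords):
    """Return titles of RSS items whose title or description mentions any keyword."""
    root = ET.fromstring(feed_xml)
    wanted = [keyword.lower() for keyword in keywords]
    titles = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        description = item.findtext("description", default="")
        text = f"{title} {description}".lower()
        if any(keyword in text for keyword in wanted):
            titles.append(title)
    return titles


# A feed for sales managers: only leads and opportunities.
sales_feed = filter_items(SAMPLE_FEED, ["lead", "opportunity"])
```

Calling `filter_items` with a different keyword list per user group is one simple way to produce multiple customized feeds from a single source.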

You can also use CRM and RSS feed integration in another way. Database integration with news search RSS feeds from Yahoo! News, Google News, CNET, and other portals will help you gather information about your potential customers. You can immediately send this information to salespeople and give them something to talk about with prospective clients.

Database integration with RSS feeds lets you collect relevant information and use it to update your CRM system, arm your employees with up-to-date facts, and improve customer service.

September 23, 2010

Data Integration to Achieve Data Quality

With an ever-increasing amount of data coming from various sources, you are sure to face data quality issues. Should you maintain a huge database of contacts and send notifications, sales offers, and other documents to all of them? Isn’t it way too time-consuming and cost-ineffective? Wouldn’t it be smarter to check which contacts have the potential to become your customers and whether they really exist at all?

One of the ways to tackle the issue is to implement demographics-focused solutions in the data integration of your CRM system, Web site membership database, Excel documents, and other data sources. For example, data integration with the CDYNE Demographics Web service will allow you to receive relevant information about contacts from any U.S. postal address before you launch an advertising or marketing campaign. With the help of the appropriate ETL software, you can integrate this Web service into your customer database to determine contacts’ age, nationality, income, or other characteristics, such as type of residence, average income, average house value, average number of vehicles for residents in their neighborhood, etc.

You can also integrate your CRM or any other contacts database with the StrikeIron Email Verification service. This data integration solution instantly determines the validity of an email address or domain. You can check all of your contacts and send emails only to those with valid addresses.
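
Before calling an external verification service, it can help to filter out addresses that are obviously malformed. The sketch below is not the StrikeIron API; it is a simple local syntax pre-check, and the pattern, the `split_by_syntax` helper, and the sample contacts are assumptions for illustration. Passing this check does not prove that a mailbox exists.

```python
import re

# Deliberately simple pattern: one "@", no spaces, and a dot in the domain part.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def split_by_syntax(contacts):
    """Separate contacts whose email looks well-formed from those that do not."""
    valid, invalid = [], []
    for contact in contacts:
        email = (contact.get("email") or "").strip()
        (valid if EMAIL_PATTERN.match(email) else invalid).append(contact)
    return valid, invalid


# Hypothetical contact list.
contacts = [
    {"name": "Ann", "email": "ann@example.com"},
    {"name": "Bob", "email": "bob@localhost"},
    {"name": "Cy", "email": "not an email"},
    {"name": "Di", "email": None},
]
valid, invalid = split_by_syntax(contacts)
```

Only the contacts in `valid` would then be worth sending to a verification service.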

Data integration with demographics-focused solutions and address verification software improves data quality, which results in better customer service, more effective marketing and advertising, and, eventually, increased revenue.

March 12, 2010

What Should Data Migration Plan Comprise?

Filed under: Data Cleansing, Data Migration, Data Quality. Posted by Olga Belokurskaya @ 2:22 am

In my previous posting, I wrote about the importance of planning to avoid data migration project failure. So today, I’d like to say a few words about what a data migration plan should provide for. I mentioned pre-migration procedures and the process itself. A good data migration project starts with planning the necessary pre-migration procedures and then moves on to planning the process.

Why a pre-migration stage? The data can’t be migrated from an old system to a new one just as is, because the old problems will be migrated to the new system as well, thus making the data migration useless. To make the most of the new system, a company should ensure that the data migrated there can bring value to the business and can be utilized by the business users. Thus, before being migrated:

  • The business value of the data should be analyzed to define what data is to be migrated.
  • Data cleansing (elimination of duplicate records, etc.) should be performed to ensure the quality of the data.
  • If needed, data transformation should also be performed to ensure that data formats meet the new system’s requirements.
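
The cleansing and transformation steps above can be sketched as follows. The example assumes, purely for illustration, that the old system stores dates as MM/DD/YYYY while the new one requires ISO 8601, and that the email address identifies a record; the `prepare_for_migration` helper and the sample records are hypothetical.

```python
from datetime import datetime


def prepare_for_migration(records):
    """Drop duplicate records and convert dates to the target system's format."""
    seen = set()
    prepared = []
    for record in records:
        key = record["email"].strip().lower()
        if key in seen:
            continue  # duplicate of a record already kept
        seen.add(key)
        # Assumed legacy format MM/DD/YYYY, converted to YYYY-MM-DD.
        signup = datetime.strptime(record["signup"], "%m/%d/%Y").date().isoformat()
        prepared.append({"email": key, "signup": signup})
    return prepared


legacy = [
    {"email": "Ann@Example.com", "signup": "03/12/2010"},
    {"email": "ann@example.com ", "signup": "03/12/2010"},
    {"email": "bob@example.com", "signup": "01/12/2010"},
]
prepared = prepare_for_migration(legacy)
```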

A well-elaborated process is the key to a data migration project’s success.

  • A data migration project requires creating and implementing migration policies that define the order of the process and a responsible person for each stage of the migration. When the order of the process is set, it’s easier to prevent troubles such as a server or system crash caused by migrating an excessive amount of data at once.
  • Testing is an important stage. One should test each portion of the migrated data to ensure that it’s accurate and in the right format. Without proper testing, the whole data migration project may fail. It’s not a good decision to migrate tons of data only to find out that it’s not in the expected format or that the new system can’t read it, and thus the migrated records are useless.
  • To ensure the future success of the data migration project, each stage of the migration process should be carefully documented.
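
The three points above (migrating in controlled portions, testing each portion, and documenting each stage) can be combined in one small sketch. The `migrate_in_batches` helper, the validation rule, and the in-memory target are all assumptions for illustration; note that this version checks each portion before writing it rather than after.

```python
def migrate_in_batches(records, batch_size, write_batch, validate):
    """Migrate records in fixed-size portions, validating each before writing."""
    log = []
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        bad = [record for record in batch if not validate(record)]
        if bad:
            # Reject the whole portion so nothing unreadable reaches the target.
            log.append({"start": start, "status": "rejected", "bad": len(bad)})
            continue
        write_batch(batch)
        log.append({"start": start, "status": "migrated", "count": len(batch)})
    return log


# Hypothetical target system (a list) and a rule that every record needs an id.
target = []
records = [{"id": 1}, {"id": 2}, {"id": None}, {"id": 4}, {"id": 5}]
log = migrate_in_batches(records, 2, target.extend, lambda record: record["id"] is not None)
```

The returned `log` doubles as a simple record of what happened at each stage, and small batches limit the load placed on the target system at any one time.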

So, to conclude: ensure you know what to migrate, provide the quality, systematize the process, test, test again, and document it. This may seem rather time-consuming. In reality, however, when all the procedures and stages are planned, you get a clearer picture of the time and budget the data migration process will require.

January 12, 2010

Data Migration: Ensure Quality When Moving Data to the Cloud

Filed under: Data Migration, Data Quality. Posted by Olga Belokurskaya @ 5:57 am

When migrating data to the cloud, ensuring data quality is essential. Data can’t be taken to the cloud as is, so provisions for data quality should be made before the migration starts.

It’s wrong to start data migration until the data has been checked for accuracy and completeness, and duplicates have been found and removed. Otherwise, data issues will be carried into the cloud, which will make the data inconvenient to work with.

One more thing to take into account before beginning the data migration process is the provider’s ability to deliver fresh, real-time data and give constant access to it.

Normally, companies have best practices for data quality. Cloud providers also have tools for data management. When those tools and the company’s best practices are combined, data management becomes more flexible, and the company is able to control data quality once the data has been migrated to the cloud.
