April 25, 2009

Data Warehousing Pros and Cons

Filed under: Data Warehousing — Tags: , — Olga Belokurskaya @ 4:32 am

First, let’s recall what a data warehouse is and why it may be useful for a business.

In essence, it is an electronically stored repository of an organization’s data, designed to facilitate reporting and analysis. In the broader sense, a data warehouse covers more than data storage: the means to retrieve and analyze data, to extract, transform and load data, and to manage the data dictionary are also considered essential components of a data warehousing system.

Nowadays, data warehousing is a popular management technique and is frequently used as a business model. However, not every system is applicable to every business setting. So, when thinking about implementing the strategy, one should consider the pros and cons of data warehousing.

Among the major benefits of data warehousing are enhanced access to data and information and easy reporting and analysis. In addition:

  • Data retrieval is faster within data warehouses.
  • Prior to loading data into the data warehouse, inconsistencies are identified and resolved.
  • Data warehouses can work in conjunction with operational business applications, such as CRM systems, and hence enhance their value.

And here are some cons:

  • Preparation is often time consuming, because effort is needed to create a cohesive, compatible system of data collection, storage, and retrieval. Moreover, because data must be extracted, transformed and loaded into the warehouse, there is an element of latency in data warehouse data.
  • Compatibility with existing systems. The use of data warehousing technology may require a company to modify the database system already in place. This may well be the foremost concern of businesses when adopting the model, given the cost of the computer systems and software needed.
  • Security flaws that data warehousing technology may contain. If the database contains sensitive information, its use may be restricted to a limited group of people, and precautions will be required to ensure that access is not compromised. Limited data access can also affect the overall utilization of the data strategy.
  • Over their life, data warehouses can have high costs. A data warehouse is usually not static: it gets outdated and needs regular maintenance, which may be quite costly.

So, before any implementation, one should make sure that data warehousing will be a good fit for the business and be prepared to commit to the level of work required to get the system in place. However, once the data warehouse starts working, most companies are glad to have their “corporate memory.”

April 24, 2009

Cutting Costs through Data Integration Tools Rationalization

Filed under: Data Integration, Data Quality — Tags: , — Olga Belokurskaya @ 4:29 am

Gartner is on a cost-cutting bent again! According to its latest press release, rationalizing data integration tools can save companies more than $500,000 annually. Moreover, it suggests adopting a shared-services model in the longer term.

It’s a fact that companies often purchase and implement new data integration tools in a fragmented way, without considering extending investments already made in other parts of the business. This results in multiple tools from different vendors and, consequently, loads of money spent on licensing and maintenance.

Given today’s focus on cost optimization (cost cutting, to be exact) across organizations and industries, the fragmented approach to data integration should be reviewed in order to increase efficiency and reduce expenses.

Here is what Ted Friedman, a distinguished Gartner expert, says:

The first step is for IT teams focused on data integration to save money by rationalizing tools. Further, there is a greater longer-term opportunity to substantially reduce costs and increase efficiency and quality by moving to a shared-services model for the associated skills and computing infrastructure.

Gartner recommends three elements to execute in order to realize this first step:

  • Planners should rationalize across the three main categories of data integration tools: extraction, transformation and loading (ETL); data replication; and data federation, ideally arriving at a standard tool for each of these styles of data delivery. They should decide which tools to keep and which to discontinue based on the business context and requirements, rather than blindly rationalizing wherever possible based purely on cost.
  • Centralize the data integration computing infrastructure to avoid redundant servers and storage caused by deploying each tool on dedicated hardware. Many organizations can make substantial savings on computing capacity by implementing shared computing infrastructure for data integration workloads.
  • Gartner recommends that organizations centralize data integration roles and skills into a shared services team model to reduce staffing costs directly by 50 per cent or more each year.

In conclusion, Ted Friedman stresses that rationalization should not be limited to one business unit, and that CIOs and data integration teams should work together to lead the rationalization and shared-services program.

April 20, 2009

Working with CRM Consultants: Best Practices

Filed under: Data Synchronization — Tags: , — Olga Belokurskaya @ 12:30 am

CRM implementation is a challenging step for every company, and a qualified consultant is a valuable asset for such a project. However, it’s no less challenging to form successful relationships with CRM consultants and system integration partners.

Here are five best practices to help overcome the challenge:

  • Establishing requirements – One of the most important phases of preparing to implement a CRM is to set requirements for the CRM project and determine which areas of the project will require the most help from consultants. Having clear requirements from the start lets both the company and the consultant be on the same page going in.
  • Defining the relationship with the CRM consultant – By establishing relations and setting boundaries, the company makes the consultant fully aware of his/her role and the overall scope of the project. Otherwise, s/he is less likely to be helpful.
  • Selecting project team members – One of the most important things for a company is to get its best employees working with the CRM consultant.  Involving the best people from both the IT and business side of the company ensures that there will be total buy-in for the project and a large cross section of skills that can be utilized.
  • Having a well-defined project plan – Make sure that everyone involved has a unique set of responsibilities throughout the project. The plan must be as specific as possible, including the names of consultants, each consultant’s responsibilities within the CRM project, and dates for each project phase. Moreover, it is better to document each stage of the project to reduce confusion once the consultants have moved on.
  • Controlling the CRM project from start to finish – Taking control of the CRM project from the very beginning is something many organizations fail to do. However, this is just the thing that helps ensure a successful outcome.

April 15, 2009

Staging Approach To Data Integration

Filed under: Data Integration — Tags: — Olga Belokurskaya @ 7:17 am

I’ve recently come across an article by David Linthicum at ebizQ. There he elaborates on a staging approach to data integration.

According to him, not many consider using a staging area when looking at data integration. However, it’s a great solution when support is needed for more complex, higher-value data integration operations, including operations on many large data sets. A staging area makes it possible to perform complex operations on data that are normally difficult to do with direct integration approaches.

David lists the following benefits of a staging approach to data integration:

  • The ability to perform more complex operations on data, including complete transformation of semantics and the data content using any number of dimensions since, in essence, you operate on an intermediary database that you control completely.
  • The ability to leverage more coarse grained and complex data sets that may not always repeat.
  • Informational focused, supporting valuable information externalization approaches, including business intelligence.
  • More flexibility around business cycles, data processing cycles, widely dispersed systems, and hardware and network limitations, where it may not be feasible to extract all operational databases at the same time.
  • The ability to better support complex database functions, including replication, cleansing, and aggregation.
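
To make the idea of an intermediary database more tangible, here is a minimal sketch of a staged load using Python's built-in sqlite3 module. It is only an illustration under my own assumptions, not something taken from David's article: the table names (staging_orders, customer_totals), the columns, and the function run_staged_load are all hypothetical. Raw rows are landed untouched, cleansed inside the staging table, and only then aggregated into the target.

```python
import sqlite3


def run_staged_load(source_rows):
    """Land raw rows in a staging table, cleanse them there, then publish aggregates.

    All table and column names are hypothetical and exist only for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staging_orders (customer TEXT, amount TEXT)")
    conn.execute("CREATE TABLE customer_totals (customer TEXT PRIMARY KEY, total REAL)")

    # 1. Land the data as-is, so the source system is only touched once.
    conn.executemany("INSERT INTO staging_orders VALUES (?, ?)", source_rows)

    # 2. Cleanse inside the staging area: normalize names, drop unusable rows.
    conn.execute("UPDATE staging_orders SET customer = UPPER(TRIM(customer))")
    conn.execute("DELETE FROM staging_orders WHERE customer = '' OR amount IS NULL")

    # 3. Aggregate the cleansed rows and publish them to the target table.
    conn.execute(
        "INSERT INTO customer_totals "
        "SELECT customer, SUM(CAST(amount AS REAL)) "
        "FROM staging_orders GROUP BY customer"
    )
    totals = dict(conn.execute("SELECT customer, total FROM customer_totals"))
    conn.close()
    return totals


if __name__ == "__main__":
    rows = [(" acme ", "10.5"), ("ACME", "4.5"), ("", "3"), ("Globex", None), ("globex", "2")]
    print(run_staged_load(rows))
```

Because every step runs against a database you fully control, cleansing and aggregation rules can be changed or re-run without going back to the operational systems.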

Use it or not, it’s up to you. But it seems helpful, no doubt.

April 14, 2009

Open Source Tools - Good Choice For ETL

Filed under: ETL, Open Source — Tags: , — Olga Belokurskaya @ 2:14 am

According to Ted Friedman, an analyst with Stamford, Conn.-based Gartner Inc., more and more companies are starting to consider open source IT technologies as the economy continues to slide. This is done to keep costs down, which is a major driver of open source technologies of all kinds, including open source data integration software.

However, one should take into account that open source data integration tools are not entirely cost-free. You don’t have to buy licenses from open source software providers, as you certainly do with their commercial counterparts, but you still must pay for support and maintenance and for the internal manpower to run and monitor the software.

There is one more thing to take into account with regard to open source software. According to Friedman:

“We definitely are seeing an increased interest in open source tools in the data integration space. But open source solutions are not nearly as well proven in large-scale, mission-critical implementations.”

Most open source data integration software is geared to traditional extract, transform and load (ETL) methods. Data virtualization, enterprise information integration, and real-time data integration methods like change data capture are not well represented, at least at this point, in the open source community. Open source metadata management capabilities are also “lighter” than most on-premises variants.

There is a continuous growth of interest in open source ETL and other open source technologies, as the recession continues to put pressure on IT departments to do more with fewer resources.

Friedman’s recommendation for companies is to consider open source data integration software to reduce costs “when your requirements map well to the level of maturity of the open source solution,” which, as he reiterated, are mostly geared to standard ETL.

April 13, 2009

Data Quality Steps For Successful MDM Program

Filed under: Data Cleansing, Data Quality — Tags: — Olga Belokurskaya @ 4:57 am

It’s surely no secret that data quality management (DQM) and master data management (MDM) are two key factors of enterprise information management. They are interrelated: without DQM, MDM is simply a pile of stored data, and DQM cannot bring ROI to the organization without MDM. In fact, data quality management serves as a building block of an MDM hub, since accurate, high-quality data is key to the success of an MDM program.

In-depth analysis of the quality and health of data is a prerequisite of the MDM program. Here are data quality management steps suggested at Information-Management.com, which are needed to support an agile MDM program:

  1. Identify and qualify the master data and its sources. The definition of master data may be different for different business units. The first step involves identifying and qualifying master data for each business unit in the organization.
  2. Identify and define the global and local data elements. More than one system may store/generate the same master information. Additionally, there could also be a global version as well as local versions of the master data. Perform detailed analysis to understand the commonalities and differences between local, global and global-local attributes of data elements.
  3. Identify the data elements that require data cleansing and correction. At this stage, the data elements supporting the MDM hub that require data cleansing and correction have to be identified. Communication with the stakeholders is necessary so that as part of the MDM initiative, data quality will be injected into these selected data elements on an organization-wide basis.
  4. Perform data discovery and analysis. Data collected from source applications needs to be analyzed to understand the sufficiency, accuracy, consistency and redundancy issues associated with data sets. Analyze source data from both business and technical perspectives.
  5. Define the strategy for initial and incremental data quality management. A well-defined strategy should be in place to support initial and incremental data cleansing for the MDM hub. Asynchronous data cleansing using the batch processes can be adopted for initial data cleansing. Industry-standard ETL and DQM commercial off-the-shelf tools should be used for initial data cleansing. The incremental data cleansing will be supported using synchronous/real-time data cleansing.
  6. Monitor and manage the data quality of the MDM hub. Continuous data vigilance is required to maintain up-to-date and quality data in an MDM hub. Data quality needs to be analyzed on a periodic basis to identify the trends associated with the data and their impact on the organization’s MDM program.
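
As a rough illustration of the data discovery and analysis in step 4, here is a small Python sketch that measures completeness and redundancy for a set of records. It is my own simplified example, not part of the Information-Management.com article: the function name profile_records and the field names used below are hypothetical, and real profiling tools cover far more checks (accuracy, consistency across systems, and so on).

```python
from collections import Counter


def profile_records(records, key_field, required_fields):
    """Return simple completeness and redundancy measures for a list of dict records."""
    total = len(records)

    # Completeness: share of records with a non-empty value for each required field.
    completeness = {}
    for field in required_fields:
        filled = sum(1 for r in records if str(r.get(field) or "").strip())
        completeness[field] = filled / total if total else 0.0

    # Redundancy: key values that occur more than once after normalization.
    keys = Counter(str(r.get(key_field) or "").strip().lower() for r in records)
    duplicates = sorted(k for k, n in keys.items() if k and n > 1)

    return {"total": total, "completeness": completeness, "duplicate_keys": duplicates}


if __name__ == "__main__":
    customers = [
        {"id": "C1", "email": "a@x.com"},
        {"id": "c1 ", "email": ""},
        {"id": "C2", "email": None},
        {"id": "C3", "email": "d@x.com"},
    ]
    print(profile_records(customers, "id", ["id", "email"]))
```

Running the same profile on a schedule gives a simple trend line for the periodic monitoring described in step 6.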

In fact, data quality management is the foundation for an effective and successful MDM implementation. A well-defined strategy improves the success probability of an MDM program. Organizations should embark on a data discovery and analysis phase to understand the health, quality and origin of the master data.

April 7, 2009

10 Steps to Smooth Open Source Implementations

Filed under: Open Source — Tags: — Olga Belokurskaya @ 4:24 am

Now that open source has become standard fare for enterprises, it’s time for them to get smart about open source implementations. Although open source can cut costs and boost innovation, that doesn’t mean that deploying open source software within an enterprise should go without proper planning. According to Baseline, there are 10 strategies that facilitate the success of an open source implementation.

Here they are:

  • Create a governance program to know who uses open source, what for, and how the software performs.
  • Create an open source review board to evaluate in-company requests to use open source products.
  • Thoroughly test the applications.
  • Maintain separate environments for testing and production.
  • Select widely supported platforms, because open source platforms with the greatest support are normally the most reliable and mature ones.
  • Keep abreast of release changes, because open source applications are often updated, and you need to know about new features and capabilities as soon as they are released.
  • Upgrade only when needed; it’s not necessary to upgrade with every release. Focus on key requirements and update only when key requirements, such as security updates, arise.
  • Be active in communities. Open source succeeds because people are improving software all the time. Users’ active approach is a key to success.
  • Submit any revisions to open source code to the community for review so that they can be included in the mainline code base later.
  • Share successful strategies. Successful adoption of open source is based on best practices and experiences from others.

In fact, community involvement is a very important point. Enterprises can get a lot more out of open source, if they put more into it.  Instead of thousands of enterprises modifying open-source projects in isolation, contributing back code and getting involved in the relevant communities would help enterprises to coordinate and pool resources across industries.

April 2, 2009

Integrating New Systems Acquired With the Merger in a Data Warehouse

We all remember the talk about mergers and acquisitions among companies due to the present economic downturn. Integrating new systems acquired through a merger or acquisition into a data warehouse is a big challenge for those in charge. With respect to ETL integration from one system to the other, there are several things one can do to make this process easier. Here is some advice from Joe Oates, an internationally known consultant on data warehousing:

  • Get involved as early as possible. Work out an arrangement with IT management so that the data warehouse team can be involved with the planning for the merger or acquisition from the start.
  • In your requirements gathering, you should have obtained a list of the top 20 questions that each manager or involved stakeholder in the current data warehouse project would like to get from the data warehouse. Consolidate these questions into a single list and run them by the management of the new company as well as your existing management to see if they still apply and if any new questions are needed. This is something you should do periodically.
  • Prepare templates for the ETL process flow. You may not have much time to bring the new systems in. You also may have to bring on inexperienced resources to do the ETL. Having well-designed templates that handle the overall flow of the ETL, the dimension processing, fact processing, exception processing and audit trail will be invaluable. The best source for this information in a readily available book that I have seen is The Microsoft Data Warehouse Toolkit: With SQL Server 2005 and the Microsoft Business Intelligence Toolset by Ralph Kimball, et al. Regardless of the database system or ETL tool that you may use, the section on how to develop the ETL is excellent and the principles can be transferred to any ETL tool or database.
  • You will probably want to have two or more phases involved in bringing the data over from the new company. It is much easier to handle several smaller ETL projects than one gigantic ETL project. Regardless of whether you can or not, the previously mentioned templates will make your life much easier.
  • Negotiate a trial period or parallel if the acquired company already has a DW. You may have to develop special reports and/or analyses for the new company. Being able to work out the details before going into full production will help you set expectations and help ensure a smoother transition.
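
To give a feel for what an ETL process-flow template might look like, here is a deliberately small Python sketch. It is not taken from Joe Oates or from the Kimball book; the function run_etl, the row fields (customer, amount) and the in-memory dimension, facts and audit_log structures are all assumptions made for illustration. It only shows the four parts named above in one reusable flow: dimension processing, fact processing, exception processing and an audit trail.

```python
from datetime import datetime, timezone


def run_etl(source_rows, dimension, facts, audit_log, source_name):
    """Template flow: dimension processing, fact processing, exceptions, audit trail.

    `dimension` maps a normalized customer name to a surrogate key, `facts` is a
    list of fact rows, and `audit_log` collects one entry per run (all hypothetical).
    """
    rejected = []
    loaded = 0
    for row in source_rows:
        try:
            # Validate and convert the incoming values before touching any table.
            name = row["customer"].strip().upper()
            amount = float(row["amount"])
            if not name:
                raise ValueError("missing customer")
            # Dimension processing: look up or create the surrogate key.
            key = dimension.setdefault(name, len(dimension) + 1)
            # Fact processing: store the measure against the surrogate key.
            facts.append({"customer_key": key, "amount": amount})
            loaded += 1
        except (KeyError, ValueError, TypeError, AttributeError) as exc:
            # Exception processing: set the row aside instead of aborting the run.
            rejected.append({"row": row, "reason": str(exc)})

    # Audit trail: record what was processed, when, and from which source.
    audit_log.append({
        "source": source_name,
        "finished_at": datetime.now(timezone.utc).isoformat(),
        "loaded": loaded,
        "rejected": len(rejected),
    })
    return rejected


if __name__ == "__main__":
    dimension, facts, audit_log = {}, [], []
    rows = [{"customer": "acme", "amount": "10"}, {"customer": "globex", "amount": "n/a"}]
    print(run_etl(rows, dimension, facts, audit_log, "acquired_crm"))
    print(dimension, facts, audit_log)
```

The point of such a template is that each new source system from the acquired company only needs its own extraction step; the surrounding flow, the rejection handling and the audit record stay the same from phase to phase.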