Data Integration Blog

July 16, 2008

Simple Solutions to Huge Data Management Problems

Filed under: Data Cleansing, Data Integration, Data Migration, Data Quality — Alena Semeshko @ 2:42 am

Ponemon Institute surveyed 870 IT professionals and found 23 per cent of respondents admit that their data is often left unsecured and inadequately protected.

The problem usually lies in the way unstructured data is spread across the organization’s knowledge management systems, corporate applications (CRM/ERP systems), databases, files, etc., and in the lack of a clear vision of how it should be consolidated. Recent Gartner Group research supports this: as much as 80 percent of actual or potentially mission-critical enterprise information takes the form of unstructured or semi-structured data.

Integration, migration, synchronization, data cleansing… it’s all already out there, so why not make use of it?

July 11, 2008

Data Management longs for Data Quality & Solid Grounds

Filed under: Data Quality, SaaS, Salesforce — Alena Semeshko @ 2:53 am

Jill Dyché is answering questions about the recent data management trends for SearchDataManagement.com’s Ask the Expert section.

The three master data management trends she singles out are:

MDM trend No. 1: The use of data quality tools.

That is, establishing data quality standards and following through with them.

MDM trend No. 2: Using a common platform for multiple data subject areas.

Yep, the switch to a single platform that would unify your multiple applications and be convenient is taking place. To go further, this is the switch from your desktop software to on-demand applications.

MDM trend No. 3: Building solid business cases for MDM that transcend the “feeds and speeds” conversation and pitch bona-fide business value of MDM. Take case management in state government–whereby the state can track an individual despite multiple identities and addresses–thereby more quickly targeting food stamp fraud and saving millions in taxpayer dollars.

June 12, 2008

SalesForce.com Thinks About the Process

Filed under: Data Quality, Salesforce — Alena Semeshko @ 5:11 am

In the world of customer data quality problems, simply solving the problem at hand, which is exactly what most vendors are trying to do, is not the best option. The thing is, most problems arise as a result of a whole bunch of underlying issues that need to be addressed too. When the data you get from your sources is dirty, for instance, it’s not your CRM system that needs to be cleansed and checked for data quality regularly; rather, it’s the sources that need to be altered to provide better-quality data in the first place.

CRMBuyer has an insightful article by Denis Pombriant, the managing principal of the Beagle Research Group, discussing why the original issues don’t get addressed and the way some turnkey technologies try to deal with it.

“We need to think more about the process than we think about solving the point problem,” Pombriant says.

Salesforce.com, with its AppExchange and on-demand platform, seems to have the best approach.

By far, Salesforce has demonstrated this understanding best. With the AppExchange and Force.com, the company has brought forth a logical platform that supports whole business processes by bringing applications together.

June 3, 2008

Solving Data Quality Problems in Three Steps

Filed under: Data Quality — Alena Semeshko @ 1:32 am

Vicki P. Raeburn wrote an article for DMReview.com titled “Talking with Your Business Partners about Data Quality,” in which she discusses what needs to be done to solve bad data problems.

Solving the bad data problem requires:

* Clearly defining the nature of the problems your business partners/customers are experiencing,
* Establishing priorities to tackle the most strategically important issues first, and
* Implementing an improvement plan with appropriate metrics and communications to your business partners/customers.

Vicki also names four dimensions for all business users to understand (which is also a problem, as most don’t even know the problem CAN be dealt with) and keep in mind whenever working with corporate data: timeliness, accuracy, completeness and consistency.

* Timeliness: Currency of data elements.
* Accuracy: Attributes of the entity (object) are correctly represented.
* Completeness: Breadth (number of entities) and depth (number of fields defined and populated).
* Consistency: Identity, definitions, hierarchies, standards and metrics are the same within and across databases.
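
To make the four dimensions a bit more concrete, here is a minimal Python sketch that scores a handful of records against each of them. Everything in it is an assumption for illustration: the sample records, the one-year freshness window, the list of allowed country codes, and the crude e-mail format check standing in for "accuracy" (real accuracy checks need a trusted reference to compare against).

```python
from datetime import date

# Hypothetical customer records; None marks a missing value.
records = [
    {"id": 1, "email": "ann@example.com", "country": "US", "updated": date(2008, 5, 20)},
    {"id": 2, "email": None, "country": "USA", "updated": date(2006, 1, 3)},
    {"id": 3, "email": "bob@example", "country": "US", "updated": date(2008, 4, 1)},
]

ALLOWED_COUNTRIES = {"US", "CA", "GB"}  # assumed company-wide standard
AS_OF = date(2008, 6, 3)                # assumed reporting date
MAX_AGE_DAYS = 365                      # assumed freshness window


def share(flags):
    """Fraction of records for which a check passed."""
    flags = list(flags)
    return sum(flags) / len(flags)


def looks_like_email(value):
    """Crude format check (something@domain.tld), a stand-in for accuracy."""
    if value is None or value.count("@") != 1:
        return False
    local, domain = value.split("@")
    return bool(local) and "." in domain


report = {
    # Timeliness: was the record updated recently enough?
    "timeliness": share((AS_OF - r["updated"]).days <= MAX_AGE_DAYS for r in records),
    # Accuracy: does the attribute look correctly represented?
    "accuracy": share(looks_like_email(r["email"]) for r in records),
    # Completeness: is the field populated at all?
    "completeness": share(r["email"] is not None for r in records),
    # Consistency: does the value follow the shared standard?
    "consistency": share(r["country"] in ALLOWED_COUNTRIES for r in records),
}

for dimension, score in report.items():
    print(f"{dimension}: {score:.0%}")
```

Scores like these are only a starting point for the conversation with business partners; which thresholds matter is a priority call, not something the code can decide.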

May 30, 2008

What it takes to have a QUALITY data quality solution

Filed under: Data Quality — Alena Semeshko @ 4:18 am

There are really no universal criteria for selecting a data quality provider/solution; it’s rather personal for each company. Before looking at different providers, the smart thing to do is to decide on the list of priorities to look for in potential solutions. Some give priority to accuracy; others find execution time the most critical element.

Andrew J. Brooks wrote a post in his blog on datamigrationpro.com discussing his top three must-be’s for a data quality solution. Come to think of it, I wouldn’t argue with his top three: Engineered, Understood, Trusted. I’ll look at them step by step.

1) What he calls engineered, I’d rather call integrated actually.

It’s about making data quality management an integral part of your architectural design principles; it’s about culture change and cannot be solved by buying a tool.

The solution needs to be fully integrated into all the work processes and become a part of the company’s overall strategy and performance.

2) Understood.

Having accurate, complete, relevant meta data, reference data, master data – call it what you will, is one hell of an obstacle that many have thought about, and most have failed at.

So, a more organized approach than many companies have would definitely work better.

3) Business Trust.

Without business trust, no amount of data profile reports will ‘make’ the business use your data for decision making.

Trust: that never really occurred to me. Focusing on your data while leaving out networking and building trust-based relations with the business players around you, as a lot of companies tend to do, is definitely an approach that lacks wisdom.

May 19, 2008

ILM howtos

Filed under: Data Quality, Data Warehousing, SaaS, data security — Alena Semeshko @ 11:41 pm

There’s an insightful article by Mike Karp on ILM (information lifecycle management) and the six steps of implementing a successful and efficient policy on data storage, verification, classification and management. Mike identifies the following steps to follow to ensure your ILM efficiency:

Stage 1. Preliminary
1) Determine whether your company’s data is answerable to regulatory demands.
2) Determine whether your company uses its storage in an optimal manner.

Stage 2. Identifying file type, users accessing the data and key words used.
1) Make a list of regulatory requirements that may apply. Get this from your legal department or compliance office.
2) Define stakeholder needs. You must understand what users need and what they consider to be nonnegotiable.
3) Verify the data life cycles. Verify the value change for each life cycle with at least two other sources: a second source within the department that owns the data (if that is politically impossible, raise the issue through management) and someone familiar with the potential legal issues.
4) Define success criteria and get them widely accepted.

Stage 3. Classification (aligning your stakeholders’ business requirements to the IT infrastructure).
0) Identifying the business value of each type of data object, i.e. understanding three things: what kind of data you are dealing with, who will be using it and what its keywords are.
1) Create classification rules.
2) Build retention policies.
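
As a rough illustration of what classification rules and retention policies from Stage 3 might look like once written down, here is a small Python sketch. The rule predicates, the labels, the retention periods and the 180-day idle threshold are all made-up placeholders, not values from Mike Karp's article and certainly not legal guidance; the real ones come from your compliance office and stakeholders.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class FileInfo:
    name: str
    owner_dept: str
    created: date
    last_accessed: date


# Hypothetical classification rules: (label, predicate). First match wins.
RULES = [
    ("financial-record", lambda f: f.owner_dept == "finance"),
    ("contract", lambda f: "contract" in f.name.lower()),
]
DEFAULT_LABEL = "general"

# Hypothetical retention periods in years (placeholders only).
RETENTION_YEARS = {"financial-record": 7, "contract": 10, "general": 2}


def classify(f):
    """Return the label of the first rule that matches the file."""
    for label, matches in RULES:
        if matches(f):
            return label
    return DEFAULT_LABEL


def retain_until(f):
    """Approximate end of retention (365-day years, leap days ignored)."""
    return f.created + timedelta(days=365 * RETENTION_YEARS[classify(f)])


def should_demote(f, today, idle_days=180):
    """Flag files idle for longer than idle_days for a lower storage tier."""
    return (today - f.last_accessed).days > idle_days


files = [
    FileInfo("Q1-ledger.xls", "finance", date(2008, 1, 15), date(2008, 5, 1)),
    FileInfo("acme-contract.pdf", "sales", date(2007, 3, 1), date(2007, 6, 1)),
    FileInfo("notes.txt", "sales", date(2008, 2, 1), date(2008, 5, 10)),
]

today = date(2008, 5, 19)
for f in files:
    print(f.name, classify(f), retain_until(f), should_demote(f, today))
```

Keeping the rules as plain data makes it easier to get them reviewed and widely accepted before any vendor tool automates them.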

When you engage with the vendors, make sure to understand their products’ capabilities in each of the following areas:
* Ability to tag files as compliant for each required regulation.
* Data classification.
* Data deduplication.
* Disaster recovery and business continuity.
* Discovery of compliance-answerable files across Windows, Linux, Unix and any other operating systems you may have.
* Fully automated file migration based on locally set migration policies.
* Integration with backup, recovery and archiving solutions already on-site.
* Searching (both tag-based and other metadata-based).
* Security (access control, identity management and encryption).
* Security (antivirus).
* Set policies to move files to appropriate storage devices (content-addressed storage, WORM tape).
* Finding and tagging outdated, unused and unwanted files for demotion to a lower storage tier.
* Tracking access to and lineage of objects through their life cycle.

Finally, when you know your vendor, you can look for solutions to automate the needed processes and phase-in.

See full article for more details.

May 12, 2008

Enriching Customer Information

Filed under: Apatar, CDYNE, Data Integration, Data Quality — Alena Semeshko @ 9:15 pm

In one of my previous posts I briefly mentioned the possibility of integrating the data received from CDYNE Demographics web service with your customer database and thus extending your customer information. Well, this is now officially possible with CDYNE Demographics connector for the Apatar Open Source Data Integration toolset. The new connector delivers statistical data about customers and allows organizations to identify the ethnic and socio-economic makeup of their current customer base or purchased marketing lists. Aside from that, the connector can be used with any contacts database to determine the age, race, income, as well as type of residence, median income, median house value, or median number of vehicles, all without coding.

Ideal for data modeling and marketing
Whether you need to build customized marketing campaigns or determine ethnic or socio-economic information, this new data quality service from Apatar and CDYNE can be used to tweak your product offerings or advertising messages to reach your desired target market. Non-profit organizations or companies relying on donor support can use this data to match other groups or geographic areas to these traits in order to expand membership base and increase donations and support.

The CDYNE Demographics Web service can help companies better select target groups and learn more about their customers. With Apatar’s visual drag-and-drop interface, this source of useful socio-economic information can be integrated with your database or CRM system in minutes.

You can learn more over here.

May 8, 2008

Case for data quality

Filed under: Data Quality — Alena Semeshko @ 5:25 am

I wrote a lot about data cleansing and data quality as a one-time procedure and as a repeated practice. But here’s the catch: when do you usually think of and/or worry about the quality of your data? When migrating and integrating it? When using it in marketing? When building your strategy based on it? Has it ever crossed your mind that the quality of data needs to be thought through before it’s actually gathered? That’s new, and you don’t see too many companies thinking of it…yet. We don’t quite realize that even at the early stages of preparing for data gathering and while obtaining it, data quality already plays an important role in the future of your company. It is the early stages that make a difference in how your data turns out, and whether it turns out at all. If approached properly from the very beginning, your data will surely pay off when you get to sharing and maintaining it, and especially when applying it.

Cost-wise, this approach is rather efficient too. I always say that corporate data is not the thing to save on, so this might not sound quite like me =), but investing in data quality from the very beginning would save a lot when it comes to verification, cleansing and usage. Cleansing, as a matter of fact, may very well become redundant. Sweet?

April 29, 2008

Let’s talk data security

Filed under: Data Integration, Data Migration, Data Quality, data security — Alena Semeshko @ 11:49 pm

You are at the stage where you’ve already realized that your company lives and thrives on data (research, development data, customer private data, contact list, spreadsheets and tables etc.). You work so hard and do everything you can to keep your data clean and consolidated, and once you finally have the system that delivers quality at hand, you realize that your data isn’t exactly safe. Bummer! Today, when information is as valuable as it is and companies cannot afford having it stolen, lost or disclosed, information security becomes the critical element and basically the driving force in most business processes.

All potential threats can be divided into external and internal ones. External threats include unauthorized programs (such as worms, Trojan viruses, spy-programs, etc.), and there is really no universal solution that would protect your company from all types of threats; that’s why there are so many specialized tools taking care of each particular problem. These can be efficient, I’ll have to admit. However, it’s the internal threats that usually make companies most vulnerable. And two of the most probable scenarios of information security violation are 1) the deliberate theft of confidential data by authorized users (or so-called insiders) and 2) an unintentional leak that can be caused by a number of factors (lack of awareness about the company’s security policies, for instance).

When creating an information security system, developers try to extend its functionality to the maximum so that it ensures extensive protection. Even operating systems today contain security functions designed to increase the enterprise’s safety level. But this “universality” is unacceptable when speaking of valuable data. A universal security system becomes useless in corporate networks where internal threats (whether intentional or not) prevail.

A recent Forrester survey of 305 security and email professionals revealed some scary but realistic statistics:
1 in 3 companies investigated a breach of confidential data last year.
1 in 4 companies experienced an “embarrassing” leak of confidential information.
1 in 5 emails contains a legal, financial or regulatory risk.

Ways out? Again, a global approach. This article on EbizQ.net suggests Data Loss Prevention (DLP) technologies as a way of securing your most valuable asset and creating transparency by enabling companies to monitor and track the whole data flow. Transparency is good. Transparency is good everywhere actually. Come to think of it, transparency is the key to creating a healthy and productive environment. Even in data integration systems, transparency is a necessity, allowing you to see where your sensitive data is going, how it’s being transformed and saved and how secure it is during these transactions. Transparency is another global asset that needs to be integrated into the corporate system of values. You could say, of course, that transparency is just another vague notion (like total security and clean data), perfection hard to achieve, especially for the old market players with set processes. Hard, yes, but not impossible. It’s something to go for. In the end, when your transparency efforts deliver security, it’s your company that will benefit.

So, looks like get transparency equals get security.

p.s. keep in mind, like with anything that has to do with data cleansing, integration and migration, technology usually comes in more handy and much cheaper than employees’ training!

April 25, 2008

Good Customer Data is a Must-Have

Filed under: Data Cleansing, Data Integration, Data Quality, ETL — Alena Semeshko @ 12:58 am

Making the most out of your customer database and relations management solution is what every company wants. No doubt about that. Nonetheless, a huge number of CRM approaches prove insufficient and inefficient.

Here are the six aspects of CRM deployment that Richard Boardman in his recent article calls essential:

1. Poorly defined requirements
2. The availability of internal staff
3. Sign offs
4. Data. Good systems require good data, and, if the new system is to be populated with existing data, it’s important that the quality of that data is high. Many organisations are surprised at how many data sources they possess and how poor the data quality is. The cleansing of data and reconciliation of different versions of the same record in multiple data sources can be very time consuming. While there are tools that can help, this process tends to be very manual, and is not something that can be fully outsourced as it requires considerable input from the data owners.
5. User acceptance testing
6. User adoption
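
The reconciliation of different versions of the same record mentioned in item 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two sources, their field names, matching on a normalized e-mail address, and the "earlier source wins, later sources fill gaps" rule are all hypothetical choices. In practice the matching key and the survivorship rules are exactly the part that needs input from the data owners.

```python
def normalize_email(email):
    """Trim whitespace and lowercase so trivially different keys match."""
    return email.strip().lower()


# Hypothetical extracts from two systems holding the same customers.
crm = [
    {"email": "Ann@Example.com ", "name": "Ann Lee", "phone": None},
]
billing = [
    {"email": "ann@example.com", "name": "A. Lee", "phone": "555-0100"},
    {"email": "bob@example.com", "name": "Bob Roy", "phone": None},
]


def reconcile(*sources):
    """Merge records by normalized e-mail.

    Sources are processed in order: a value from an earlier source is
    kept, and later sources only fill fields that are still empty.
    """
    merged = {}
    for source in sources:
        for record in source:
            key = normalize_email(record["email"])
            target = merged.setdefault(key, {"email": key})
            for field, value in record.items():
                if field == "email":
                    continue
                if target.get(field) is None and value is not None:
                    target[field] = value
    return list(merged.values())


customers = reconcile(crm, billing)
for customer in customers:
    print(customer)
```

Here the CRM name survives for Ann while her phone number is filled in from billing; swapping the order of the sources swaps which version wins.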

I still think data is the key element in this. It’s how you approach, structure and work with your data that makes a difference in your company’s progress. I’d break number four into more precise items like
1. Well-defined data requirements
2. Customer Data Integration & Data Quality (including ETL, data cleansing and everything related to it)
3. Data management, which among other things includes following through with your requirements and cleansing procedures rather than adopting a once-in-a-lifetime/lifecycle (whatever you wanna call it) scheme.

But I agree with Richard, you still need to be “realistic about the demands these projects will place on the organisation and manage expectations accordingly. Too often CRM projects are deemed failures because they failed to meet impossibly demanding and often self-inflicted deadlines. A better review of what’s involved and a more analytical appraisal of the availability of resources to meet those demands will go a long way to ensure project success.”
