May 8, 2008

Case for data quality

Filed under: Data Quality — Alena Semeshko @ 5:25 am

I’ve written a lot about data cleansing and data quality as a one-time procedure and as a repeated practice. But here’s the catch: when do you usually think of and/or worry about the quality of your data? When migrating and integrating it? When using it in marketing? When building your strategy based on it? Has it ever crossed your mind that the quality of data needs to be thought through before it’s actually gathered? That’s new and you don’t see too many companies thinking of it…yet. We don’t quite realize that even at the early stages of preparing for data gathering and while obtaining it, data quality already plays an important role in the future of your company. It is the early stages that make a difference in how your data turns out, and whether it turns out useful at all. If approached properly from the very beginning, your data will surely pay off when you get to sharing and maintaining it, and especially when applying it.

Cost-wise, this approach is rather efficient too. I always say that corporate data is not the thing to save on, so this might not sound quite like me =), but investing in data quality from the very beginning saves a lot when it comes to verification, cleansing and usage. Cleansing, as a matter of fact, may very well become redundant. Sweet?

April 29, 2008

Let’s talk data security

Filed under: Data Integration, Data Migration, Data Quality, data security — Alena Semeshko @ 11:49 pm

You are at the stage where you’ve already realized that your company lives and thrives on data (research, development data, customer private data, contact lists, spreadsheets and tables etc.). You work so hard and do everything you can to keep your data clean and consolidated, and once you finally have a system that delivers quality at hand, you realize that your data isn’t exactly safe. Bummer! Today, when information is as valuable as it is and companies cannot afford to have it stolen, lost or disclosed, information security becomes a critical element and basically the driving force in most business processes.

All potential threats can be divided into external and internal ones. External threats include unauthorized programs (such as worms, Trojan viruses, spy-programs, etc.). There is really no universal solution that would protect your company from all types of threats, which is why there are so many specialized tools taking care of each particular problem. These can be efficient, I’ll have to admit. However, it’s the internal threats that usually make companies most vulnerable. The two most probable scenarios of information security violation are 1) the deliberate theft of confidential data by authorized users (so-called insiders) and 2) an unintentional leak that can be caused by a number of factors (lack of awareness of the company’s security policies, for instance).

When creating an information security system, developers try to extend its functionality to the maximum so that it ensures extensive protection. Even operating systems today contain security functions designed to increase the enterprise’s safety level. But this “universality” is unacceptable when speaking of valuable data. A universal security system becomes useless in corporate networks where internal threats (whether intentional or not) prevail.

A recent Forrester survey of 305 security and email professionals revealed some scary but realistic statistics:
1 in 3 companies investigated a breach of confidential data last year.
1 in 4 companies experienced an “embarrassing” leak of confidential information.
1 in 5 emails contains a legal, financial or regulatory risk.

Ways out? Again, a global approach. This article on EbizQ.net suggests Data Loss Prevention (DLP) technologies as a way of securing your most valuable asset and creating transparency by enabling companies to monitor and track the whole data flow. Transparency is good. Transparency is good everywhere actually. Come to think of it, transparency is the key to creating a healthy and productive environment. Even in data integration systems, transparency is a necessity, allowing you to see where your sensitive data is going, how it’s being transformed and saved and how secure it is during these transactions. Transparency is another global asset that needs to be integrated into the corporate system of values. You could say, of course, that transparency is just another vague notion (like total security and clean data), a perfection that is hard to achieve, especially for the old market players with set processes. Hard, yes, but not impossible. It’s something to go for. In the end, when your transparency efforts deliver security, it’s your company that will benefit.

So it looks like getting transparency equals getting security.

p.s. Keep in mind that, as with anything that has to do with data cleansing, integration and migration, technology usually comes in handier and much cheaper than employee training!

April 25, 2008

Good Customer Data is a Must-Have

Filed under: Data Cleansing, Data Integration, Data Quality, ETL — Alena Semeshko @ 12:58 am

Making the most out of your customer database and relations management solution is what every company wants. No doubt about that. Nonetheless, a huge number of CRM approaches prove insufficient and inefficient.

Here are the six aspects of CRM deployment that Richard Boardman in his recent article calls essential:

1. Poorly defined requirements
2. The availability of internal staff
3. Sign offs
4. Data: Good systems require good data, and, if the new system is to be populated with existing data, it’s important that the quality of that data is high. Many organisations are surprised at how many data sources they possess and how poor the data quality is. The cleansing of data and reconciliation of different versions of the same record in multiple data sources can be very time consuming. While there are tools that can help, this process tends to be very manual, and is not something that can be fully outsourced as it requires considerable input from the data owners.
5. User acceptance testing
6. User adoption

I still think data is the key element in this. It’s how you approach, structure and work with your data that makes a difference in your company’s progress. I’d break number four into more precise items like
1. Well-defined data requirements
2. Customer Data Integration & Data Quality (including ETL, data cleansing and everything related to it)
3. Data management, which among other things includes following through with your requirements and cleansing procedures rather than adopting a once-in-a-lifetime/lifecycle (whatever you wanna call it) scheme.

But I agree with Richard: you still need to be “realistic about the demands these projects will place on the organisation and manage expectations accordingly. Too often CRM projects are deemed failures because they failed to meet impossibly demanding and often self-inflicted deadlines. A better review of what’s involved and a more analytical appraisal of the availability of resources to meet those demands will go a long way to ensure project success.”

April 22, 2008

Data Quality At Large

Filed under: Data Cleansing, Data Quality — Alena Semeshko @ 10:35 pm

What’s data quality for you? The right customer contact information in your CRM? Think again. Data quality is more than that, much more than that. Product numbers, associated descriptions, part numbers, units of measure, medical procedure codes and patient identification numbers, telephone numbers, email addresses, commodity codes, vendor numbers and vehicle identification numbers, the list goes on.

This article in CXO describes some consequences of poor data quality:

For the CEO, whose ultimate responsibility is to increase customer retention and loyalty, the effects of poor data can have long-term, devastating consequences. For example, the inability to eliminate redundant name and address records results in additional mail-order campaign costs. Recipients of duplicate mailings are also likely to become frustrated and question the firm’s overall operating efficiency. If these redundant mailings each consistently misspell the individual’s name or address, the frustration level is likely to approach alienation or even a legal concern – especially if the recipient had previously made a request to the mailer that they be removed from the vendor’s mailing list or asked to be placed on an industry-wide, do-not-mail list. Add to this the cost of the catalogs or merchandise delivered to the wrong address and the real magnitude of the problem only just begins to surface. If a single customer is included in a company’s database multiple times, each time with a different value for the customer identifier, the company will be unable to determine the true volume of this customer’s purchases. It could even be placed in the embarrassing situation of attempting to sell the customer an item that he or she has already purchased. Poor data quality can negatively influence how a company is perceived in the marketplace and damage brand equity.

These data inefficiencies can also result in missed up-sell and cross-sell opportunities. Without a single view of the customer across the enterprise, it’s impossible to aggregate information to make decisions. This makes it impossible to distinguish between single-product and multi-product buyers, or between new and existing customers.

For the CFO – who is in charge of regulatory compliance, managing security risk and other methods of limiting exposure – poor data can result in the company facing public embarrassment, loss of credibility, significant fines and even lawsuits. A forward-thinking organization should include data quality as a part of its everyday operations. While this may not happen overnight, recent regulatory and Homeland Security initiatives such as the U.S. Department of Treasury’s Office of Foreign Assets Control (OFAC), Sarbanes-Oxley, the U.S. Patriot Act, and the Health Insurance Portability and Accountability Act (HIPAA) can quickly spur a company to establish a solid data foundation.

[…]

For the CIO, who spends his days striving to achieve peak operational efficiency, inferior data quality can lead to missed opportunities to negotiate better rates with suppliers. Large companies can have thousands, or even millions, of suppliers. Unless you have precise data on how much total business you are conducting with a single vendor across all divisions, you are likely to pay too much for their service.

So what do you do to improve? The article suggests the following:

First, conduct a Data Quality Assessment to help you recognize the severity of data quality issues.

Second, adopt a well-defined Data Governance Plan across your organization. That is, define who owns the data, who is authorized to access the data, and which specific standards should apply to the data.

Third, choose a technology to serve as the backbone for the intelligent use and preparation of relevant customer data.

Sounds short and sweet, but try following it through. Will take a while, but you won’t regret it.
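
To make the first step a bit more tangible, here is a minimal Python sketch of what a first-pass assessment could look like. It is not taken from the CXO article: the field names, the sample records and the `assess` helper are all made up for illustration, and a real assessment would cover far more rules. It simply counts missing required fields and groups records that look like duplicates once name and address are normalized.

```python
from collections import Counter

def normalize(value):
    """Lowercase and collapse whitespace so trivially different spellings compare equal."""
    return " ".join(str(value or "").lower().split())

def assess(records, required_fields, key_fields):
    """Return simple quality metrics for a list of dict records (hypothetical helper)."""
    missing = Counter()
    for record in records:
        for field in required_fields:
            if not normalize(record.get(field)):
                missing[field] += 1

    # Records sharing the same normalized key are treated as likely duplicates.
    keys = Counter(
        tuple(normalize(record.get(field)) for field in key_fields)
        for record in records
    )
    duplicate_groups = {key: count for key, count in keys.items() if count > 1}

    return {
        "total": len(records),
        "missing": dict(missing),
        "duplicate_groups": duplicate_groups,
    }

# Made-up sample data: the same customer stored twice under different identifiers.
customers = [
    {"id": "C-001", "name": "Ann Lee", "address": "1 Main St", "email": "ann@example.com"},
    {"id": "C-417", "name": "ann  lee", "address": "1 main st", "email": ""},
    {"id": "C-002", "name": "Bob Ray", "address": "9 Oak Ave", "email": "bob@example.com"},
]

report = assess(
    customers,
    required_fields=["name", "address", "email"],
    key_fields=["name", "address"],
)
print(report)
```

Matching on a normalized name and address is deliberately crude; it catches the "same customer, different identifier" case from the quote above, but it will miss misspellings, which is where dedicated matching tools come in.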

April 18, 2008

Wrong Approach to Data Quality, Right Approach to Data Quality.

Filed under: Data Quality — Alena Semeshko @ 2:24 am

As much as you hear about data quality being a determinant of your organization’s success, companies all over the world still use inaccurate and outdated data in their daily work. In the bulk of information that piles up over the months, even years, you usually can’t even identify what’s more urgent and important.

The traditional ETL approach that quite a few companies have come to use sure is helpful. But once you’ve gone through its stages, it’s important not to forget that your new cleansed data is still constantly being enriched and changed. So in next to no time a new challenge emerges as you get your cleansed data mixed with new data that isn’t necessarily as consistent and reliable as it should be. What do you do? Try implementing a unified and repeated data quality monitoring approach.

The steps you could follow while at it include:

  • Create a clear standard that your incoming data should match
  • Identify the main issues with incoming data by checking it against the created standard
  • Look for ways to solve the identified problems (as a possibility, you could create a notification system to send out alerts whenever invalid or inconsistent data is detected)

At first glance this looks like it could solve your problems. But that’s just your incoming data. Another part of the problem lies in the clean data already stored in your warehouse. Its validity isn’t everlasting, is it? Thus a few more things for your to-do list:

  • Identify the most appropriate time span for your data to be re-verified
  • Schedule your data verification system to conduct repeated checks according to the identified data validity time span
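
Put together, the two lists above could be sketched in Python roughly like this. Everything here is an assumption for the sake of illustration: the `STANDARD` rules, the 90-day validity span, and the `notify` function (which just builds a message instead of sending a real alert) are hypothetical, and your own standard will look different.

```python
import re
from datetime import date, timedelta

# Hypothetical standard: each field maps to a rule the value must satisfy.
STANDARD = {
    "email": lambda v: bool(re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v or "")),
    "phone": lambda v: bool(re.fullmatch(r"\d{10}", v or "")),
    "country": lambda v: v in {"US", "CA"},
}

REVERIFY_AFTER = timedelta(days=90)  # assumed validity time span

def check_record(record):
    """Return the list of fields that break the standard."""
    return [field for field, rule in STANDARD.items() if not rule(record.get(field))]

def notify(record, problems):
    """Stand-in for a real alert channel (email, ticket, chat message)."""
    return f"ALERT record {record.get('id')}: invalid {', '.join(problems)}"

def screen_incoming(records):
    """Split incoming records into accepted ones and alerts."""
    accepted, alerts = [], []
    for record in records:
        problems = check_record(record)
        if problems:
            alerts.append(notify(record, problems))
        else:
            accepted.append(record)
    return accepted, alerts

def due_for_reverification(stored, today):
    """Pick stored records whose last check is older than the validity span."""
    return [r for r in stored if today - r["last_verified"] >= REVERIFY_AFTER]

# Made-up incoming batch: the second record breaks two rules.
incoming = [
    {"id": 1, "email": "ann@example.com", "phone": "9195550100", "country": "US"},
    {"id": 2, "email": "not-an-email", "phone": "555", "country": "US"},
]
accepted, alerts = screen_incoming(incoming)

# Made-up warehouse records with the date they were last verified.
stored = [
    {"id": 10, "last_verified": date(2008, 1, 2)},
    {"id": 11, "last_verified": date(2008, 4, 1)},
]
stale = due_for_reverification(stored, today=date(2008, 4, 18))
```

In practice `due_for_reverification` would be called from whatever scheduler you already run (a nightly job, for instance), and the stale records would go back through the same `check_record` rules as the incoming ones.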

All in all, just keep in mind that data within any organization is a dynamic and constantly changing asset, and data quality checking should become a repeated procedure, rather than a one-time practice.

April 17, 2008

Apatar CDYNE Phone Verification Connector Released

Filed under: Apatar, CDYNE, Data Quality — Tags: — Alena Semeshko @ 4:34 am

We all know that the combination of contact data from many sources introduces myriad opportunities for error. There’s this bulk of databases with data entered by different people (and humans are prone to error, right?) at different times… and you have to trust all of it is correct and still up to date? Ouch. Checking the validity by the hit-and-miss method? Ouch. Tired of dialing phone numbers from your CRM and hearing that you’ve got the wrong number?

Well, you don’t really have to anymore. The new CDYNE Phone Verification connector for the Apatar data integration toolset can automatically verify and filter customer phone numbers for you before they enter CRM applications. And it doesn’t matter where your data comes from, whether it’s databases (such as MySQL, Microsoft SQL, Oracle), files (Microsoft Excel spreadsheets, CSV/TXT files), applications (Salesforce.com, SugarCRM), or the top Web 2.0 destinations (Flickr, Amazon S3, RSS feeds).

This service identifies the phone numbers in your list that have new area codes following a NANPA split and replaces incorrect area codes. If the area code is incorrect or missing, Phone Verification can be used to identify the error or return the corrected one to update your data.
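
To illustrate the idea (and only the idea), here is a small Python sketch of an area code check. It does not use the CDYNE service or the Apatar connector, whose interfaces are not shown here; the `SPLIT_TABLE`, the list of known area codes and the `verify_phone` function are made-up sample data and names, not real NANPA assignments.

```python
import re

# Illustrative only: (old area code, exchange) -> new area code after a split.
# A real service would look this up in current NANPA data.
SPLIT_TABLE = {
    ("919", "555"): "984",
}
KNOWN_AREA_CODES = {"919", "984", "212"}

def verify_phone(raw):
    """Return (status, number) where status is 'ok', 'corrected' or 'invalid'."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the leading country code
    if len(digits) != 10:
        return "invalid", None  # area code missing or number malformed
    area, exchange, line = digits[:3], digits[3:6], digits[6:]
    new_area = SPLIT_TABLE.get((area, exchange))
    if new_area:
        return "corrected", f"{new_area}-{exchange}-{line}"
    if area not in KNOWN_AREA_CODES:
        return "invalid", None
    return "ok", f"{area}-{exchange}-{line}"

moved = verify_phone("(919) 555-0100")      # exchange listed in the split table
unchanged = verify_phone("1 919 444 0100")  # same area code, exchange not affected
no_area = verify_phone("555-0100")          # seven digits only, no area code
```

A filter step in front of the CRM would then keep the "ok" and "corrected" numbers and route the "invalid" ones to someone who can fix them.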

William Chenoweth, VP Director of Marketing, CDYNE Corporation, says:

“This new Apatar Connector provides customers the ability to automate their every day data management duties with scheduling features and visual drag-and-drop interface. The more automated the data cleansing process, the less expensive and more consistent the end result will be for your company.”

There’s more over here.

April 6, 2008

New CEO at StrikeIron

Filed under: Apatar, Data Cleansing, Data Quality, StrikeIron — Alena Semeshko @ 9:59 pm

News from here, emphasis mine.

RESEARCH TRIANGLE PARK, N.C.–(BUSINESS WIRE)–StrikeIron, Inc., the leader in providing innovative solutions for delivering data over the Internet, today announced that David Linthicum will take the helm as company CEO. Linthicum will be responsible for continuing to drive the company’s leadership position as the frontrunner in delivering critical Web services and data, on demand, for the emerging next-generation Internet. Bob Brauer remains as president and co-founder and will continue to lead the day-to-day operations of StrikeIron.

“StrikeIron’s revenue more than doubled from Q1’07 to Q1’08 and has tremendous momentum in the industry. We’re moving beyond simply delivering data as a service and into a new era of growth and development for new innovative products,” stated Brauer. “As an industry thought leader and visionary, Dave’s addition to the StrikeIron team helps us take the appropriate steps to deliver on the promise of Service Oriented Architecture via the Web and building the foundation for Web 2.0 applications with our managed Web services platform. We are confident that under Dave’s leadership, StrikeIron is well-positioned to go to the next level.”

“The emerging Web is an exciting medium that has come of age. Web services and mashups are changing how we access and deliver information and StrikeIron has established themselves as one of the driving forces in the industry,” stated Linthicum. “I look forward to building on the success StrikeIron has already achieved to date.”

A quick reminder - as a result of a recent partnership agreement with StrikeIron, Apatar has released two connectors to StrikeIron’s data quality services: StrikeIron US Address Verification connector and StrikeIron E-mail Verification connector. These data quality services from Apatar and StrikeIron ensure the validity of your data, increase productivity, improve sales strategies, and take customer service to a new level by providing faster transaction processing and higher accuracy.

March 26, 2008

Data Quality Ups and Downs

Filed under: Data Cleansing, Data Quality — Tags: — Alena Semeshko @ 3:53 am

Everyone seems to be discussing a recent QAS data quality survey entitled ‘Contact Data: Neglected asset seeks responsible owner’ that questioned over 2,000 organizations worldwide and revealed an increasing number of businesses taking data quality issues seriously and bringing them up to the boardroom level.

“Within the past three years, the number of businesses where the responsibility of data integrity has risen to boardroom level has soared by 16 per cent, showing how important an issue accurate data has now become.”

The survey also stated that:

* the number of employees directly involved in data quality management has increased by 5% in the last year alone
* 23% of the businesses that participated in the survey claimed to use strategic data planning applications on a daily basis
* 46% have their own documented data quality strategy

These increasing numbers sure are encouraging and if the growth persists, or even speeds up a bit, we might see conceptually new, better, cleaner data emerge as an accepted standard of data quality. Now that would be nice, wouldn’t it?

However, with the survey showing 34% of respondents not validating any of their customer and prospect data, there’s still a long way to go to reach the “standard” I’m talking about.

QAS group operating officer Jonathan Hulford-Funnell says: “I find it incredible that organisations are not paying more attention to data quality. It shouldn’t be seen as a burden for middle management, it should be something that every employee in the business takes responsibility for.”

March 21, 2008

Stop Blaming IT for Dirty Data

Filed under: Data Cleansing, Data Quality — Tags: , — Alena Semeshko @ 4:19 am

IT is the easiest to blame for drawbacks and holes in your data, that’s no news. Whenever you don’t get the results and the information you need (provided your business processes are set to present you with quality data), you naturally start looking for someone or something to blame. And IT seems to be the perfect scapegoat. Little do we realize that the problems lie in the business, not in IT.

The thing is, we associate data with IT, consider it a part of IT and don’t realize the two are totally different. Gartner research VP Ted Friedman suggests a solution that should keep the blame off of IT and cause fewer data quality problems:

“Business needs to be in the driver’s seat,” Friedman said. “At the moment we feel that the focus on the topic is way way too much in the IT camp.”

To advance data quality, Friedman suggests the use of a data steward, who is responsible for benchmarking current levels of data quality and measuring the impact on the business of bad data. The data steward looks at the data transfer processes, making sure, for instance, that the data passes through as few people as possible.

Data stewards will come from a business background, but have good relations to IT, Friedman said. They will only be effective if they are held accountable for their progress, and receive bonuses for meeting quality targets.

March 20, 2008

5 things to Watch out for in Data Warehousing

Filed under: Data Cleansing, Data Integration, Data Quality, Data Warehousing — Tags: — Alena Semeshko @ 7:45 am

There’s been talk of the concept of data warehousing being misleading, failing to deliver efficient solutions at the enterprise level and frequently causing problems upon implementation. Problems like that, again, don’t come out of nowhere; there usually are good reasons behind them. In this post I’ll try to sum up a few things you should definitely try to watch out for when tackling your data warehouses:

1) First and foremost – Data Quality. When your data is dirty, outdated and/or inconsistent upon entering the warehouse, the results you are gonna get won’t be any better, really. Data Warehousing is not supposed to deal with your erroneous data, it’s not supposed to perform data cleansing. These processes need to take place BEFORE your data gets even close to the warehouse, that is, your data integration strategy needs to address the low-quality data problem.

2) Come to think of it, Data Integration is the second thing to watch out for. Do your integration tools live up to your requirements? Can your software handle the data volumes you have? Will it cope with the source systems and subject areas newly added to your warehouse? How high is the level of automation of your integration system? Can you avoid manual intervention? You gotta ask yourself all of these questions before you complain that your warehouse isn’t providing you with the quality of information you expected.

3) Next, dreaming too big. When you build sand castles you gotta realize they’ll disappear in a matter of days, even hours. You can’t have it all at the same time; you can’t have your pie and eat it too. Breaking the project into small segments, giving them enough time to deliver and having patience are the keys to a pleasant experience with your data warehousing solution. What? Did you think you could fix all the mess in your data in a matter of days? =)

4) Then, don’t go rushing into solutions. Don’t panic. Yes, warehouse projects require time and effort on your part. Yes, it’s gonna be complicated at first. But that’s not the reason to stop with one project and rush into another. Stick with your first choice, fix it, work on it. Multiple projects will waste your resources and end up as another silo aimlessly taking up your corporate resources.

5) Finally, make sure you have a scalable architecture that you can redesign according to your increasing needs. Your business grows, sometimes grows quicker than you think (the number of customers increases, they have more information, more data to be processed) and you want your solution to continue to perform on the same level and live up to your expectations.

The list goes on actually, as there are more things to watch out for… but these are the first that come to mind. =)
