Does Your Project Suffer From Data Quality Product Myopia?

Are you focusing your data quality efforts around the capabilities of your chosen data quality software or the actual needs of your organisation?

The following article explores the common ailment of "data quality product myopia" and provides some practical advice for curing the condition.

When the only tool you have is a hammer, it is tempting to treat everything as if it were a nail.

- Abraham Maslow, 1966

Most people will have heard this famous quote. Its point is simple: if we rely on just one tool, our results will be limited to what that tool can deliver.

An increasingly common situation I am witnessing in our profession is an organisation allowing the capabilities of a specific product to dictate its data quality strategy.

This article was triggered by a visit to a large financial institution that was about to embark on a major data migration. When we discussed their data quality approach, their reply was simply "we have a corporate license for [product x]".

The problem with this situation is that you automatically limit your data quality capabilities to what the software can deliver. In this particular case, the tool they had purchased was a very popular but fairly limited data profiling product. Yet the project certainly required a far broader range of data quality capabilities than data profiling and discovery alone.

This condition can be likened to a form of myopia.

Myopia is the medical term for nearsightedness. People with myopia see objects more clearly when they are close to the eye, while distant objects appear blurred or fuzzy. Close-up work may be clear, but distance vision is blurry.

- Definition of myopia

If we apply this to our profession, the focus is often concentrated on what's in front of you, i.e. the product, instead of those fuzzy, distant objects - your true data quality requirements.

So, is the answer to purchase more expensive, feature-rich software?

Actually, no.

The answer is to understand the process of data quality that will work for your project and then determine how and where technology should play a role.

Another case study springs to mind that emphasises this point. The organisation wished to consolidate many legacy systems into a single, highly automated, next-generation master data services hub. Once again, a data quality process was implemented which focused primarily on the core features of the product.

With any product there is a finite set of data quality dimensions that can be measured. Unfortunately, no product can measure all the dimensions that are critical to the success of a typical data quality initiative.

In the case study above, the organisation initially omitted one of the fundamental dimensions of data quality - accuracy - and very nearly paid the price. Halfway through the project, with the data quality work "on track" as dictated by the capabilities of the product, the company (against numerous objections) was encouraged to perform a site survey. They found that 40% of their equipment was recorded inaccurately.

From the perspective of the product methodology, everything was in fine shape, but by looking at dimensions beyond the ability of the product, far more serious issues were uncovered.
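To make the distinction concrete, here is a minimal sketch in Python of why a check that looks only at the data can pass while an accuracy check fails. Everything in it is a hypothetical illustration: the equipment records, the field names, the survey findings and the two helper functions are invented for this example and are not taken from the case study or from any particular product.

```python
# Hypothetical equipment records as held in the system being migrated.
recorded = {
    "EQ-001": {"site": "London", "model": "RX-200"},
    "EQ-002": {"site": "Leeds", "model": "RX-200"},
    "EQ-003": {"site": "Bristol", "model": "TX-10"},
    "EQ-004": {"site": "Cardiff", "model": "TX-10"},
    "EQ-005": {"site": "Glasgow", "model": "RX-300"},
}

# Hypothetical findings from a physical site survey of the same equipment.
surveyed = {
    "EQ-001": {"site": "London", "model": "RX-200"},
    "EQ-002": {"site": "Leeds", "model": "RX-300"},     # model differs
    "EQ-003": {"site": "Bristol", "model": "TX-10"},
    "EQ-004": {"site": "Swansea", "model": "TX-10"},    # site differs
    "EQ-005": {"site": "Glasgow", "model": "RX-300"},
}


def completeness(records):
    """Share of records with every field populated.

    This needs only the data itself, so a tool can compute it unaided.
    """
    if not records:
        raise ValueError("no records to assess")
    complete = sum(1 for fields in records.values() if all(fields.values()))
    return complete / len(records)


def accuracy(records, reference):
    """Share of records that agree with an external reference.

    This needs evidence from the real world, such as a site survey.
    """
    checked = [key for key in records if key in reference]
    if not checked:
        raise ValueError("no records overlap with the reference")
    matches = sum(1 for key in checked if records[key] == reference[key])
    return matches / len(checked)


print(f"Completeness: {completeness(recorded):.0%}")
print(f"Accuracy:     {accuracy(recorded, surveyed):.0%}")
```

In this made-up data every field is populated, so the completeness check reports a clean bill of health, yet two of the five records disagree with the survey. The accuracy figure cannot be derived from the stored data alone; someone has to go and look.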

There are many other dimensions we often need to measure which most products are poorly equipped to manage.

Here are some examples:

  • Data Presentation Quality - Is the GUI structured in a manner that suits the needs of the user?
  • Data Definition Quality - Is the data defined adequately? Are the business rules documented? Is the naming of data consistent across the enterprise?
  • Data Accessibility - Is the data accessible? Can all users gain the right level of information for their needs?
  • Schema Data Quality - Is the physical schema accurate? Does it follow a corporate standard? Does the schema create overloaded fields or redundant data?
  • Data Protection - Is the data secure? Are there governance controls that ensure restricted access?

In short, most products perform exceptionally well at "inside-out" data quality processing but are poorly equipped to deliver "outside-in" data quality processes.

How can we cure data quality product myopia?

Here are some practical tips to cure the condition:

  • Educate. Study a wider range of data quality methodologies than just the one provided by the data quality product vendor. This is not to say that the vendor's methodology won't be perfectly suitable, but it certainly pays to explore the alternatives. Be wary of the hammer and nail syndrome.
  • Listen. Identify exactly what data quality issues the company is facing. Speak to the users, customers, stakeholders and technicians in your company. Resist the temptation to "dive in" with data quality technology. Perform surveys and view the issues from 30,000 feet before switching the products on.
  • Focus. Understand what dimensions of data quality are critical to your organisation and its customers. Ensure that your data quality process is capable of assessing, measuring and improving ALL dimensions relevant to your needs.
  • Practice. Walk through your data quality process using nothing more than whiteboards, pen and paper. Put down the technology and understand the process in detail before you switch the software on. Understand WHY you are doing something instead of just HOW. People can often become blinded by the automated data quality technology at their disposal instead of understanding the underlying principles of what they're trying to achieve.
  • Iterate. Don't expect to get this right the first time. Most data quality practitioners accept that there is no perfect methodology or process for data quality. No two "gurus" agree on a definitive process, so take time to create one that works for your specific project, as each project will dictate your approach.
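The "Focus" tip above can be sketched as a simple coverage check: write down the dimensions your project needs, write down what the product can actually measure, and look at the gap. The Python below is only an illustration under stated assumptions. The required list reuses dimensions named in this article plus "completeness", which is an assumed extra, and the tool's capabilities are invented for an imaginary profiling product.

```python
# Dimensions the project has decided it must assess. Most come from this
# article; "completeness" is an assumed addition for illustration.
required_dimensions = {
    "accuracy",
    "completeness",
    "data presentation quality",
    "data definition quality",
    "data accessibility",
    "schema data quality",
    "data protection",
}

# Hypothetical capabilities of an imaginary profiling tool.
tool_dimensions = {"completeness", "uniqueness", "format validity"}


def coverage_gap(required, supported):
    """Return the required dimensions the tool cannot measure, sorted by name."""
    return sorted(required - supported)


gap = coverage_gap(required_dimensions, tool_dimensions)
print(f"{len(gap)} of {len(required_dimensions)} required dimensions "
      "need a process beyond the tool:")
for dimension in gap:
    print(f"  - {dimension}")
```

The same exercise works just as well on a whiteboard. The point is to start from the required list rather than from the product's feature sheet, so that anything in the gap gets a manual or alternative process instead of being quietly dropped.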

What are your views? Is this a growing issue? What advice can you offer sufferers of this condition?