Dilbert and The Data Quality Dimensions Bias


So what is a "Data Quality Dimensions Bias"?

Well, first of all, don't panic if you're about to take a data quality exam and you're flapping because you haven't studied this topic. I've made the term up, but it describes a very real problem that affects our profession.

As Dilbert points out above, it's easy to get hung up on accuracy when there are far more pressing issues with your data. Your data could be 100% accurate but still useless to the end user because the data quality dimensions they require are lacking.

There are lots of data quality dimensions, but the problem is that most data quality teams now rely heavily on technology to manage them. Dimensions such as consistency, formatting, precision and conformity can be easily validated and measured by tools, so they get the attention, while more abstract dimensions often get left out.
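To show why a dimension like conformity is so easy to hand over to technology, here is a minimal sketch in Python. The sample records, the field names and the simplified postcode pattern are all assumptions made up for illustration; they don't come from any particular tool.

```python
import re

# Hypothetical sample records; the field names are assumptions.
records = [
    {"customer_id": "C001", "postcode": "SW1A 1AA"},
    {"customer_id": "C002", "postcode": "sw1a1aa"},
    {"customer_id": "C003", "postcode": ""},
]

# Simplified UK-style postcode pattern, for illustration only.
POSTCODE_PATTERN = re.compile(r"^[A-Z]{1,2}\d[A-Z\d]? \d[A-Z]{2}$")


def conformity_rate(rows, field, pattern):
    """Return the fraction of rows whose field matches the pattern."""
    if not rows:
        return 0.0
    matches = sum(1 for row in rows if pattern.fullmatch(row.get(field, "")))
    return matches / len(rows)


# One of the three sample records conforms to the pattern.
rate = conformity_rate(records, "postcode", POSTCODE_PATTERN)
print(f"Postcode conformity: {rate:.0%}")
```

A check like this takes minutes to write and produces a tidy percentage, which is exactly why it tends to crowd out dimensions that have no regular expression.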

When Data Quality Accessibility Outranks Accuracy

In the Dilbert example above, the quality of the metadata or documentation is poor and this can have a massive impact. The best accuracy in the world is irrelevant if no-one uses your data.

If you've ever carried out a data migration and issued an amnesty for all undocumented items of data to be revealed, you'll have seen how big this problem can be. People come out of the woodwork with all kinds of spreadsheets and Access databases because they couldn't understand that multi-million pound ERP system you put in 5 years ago.

How to Cure The Data Quality Dimensions Bias

To solve this problem you need to get into the head of the information consumer.

Gather the voice of the customer and hear their concerns. Map these to data quality dimensions and figure out the best way to measure them. Don't just rely on technology.
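As a rough sketch of what that mapping might look like, the Python below tallies free-text customer comments against data quality dimensions. The phrases, the dimension labels and the sample feedback are all hypothetical, and keyword matching is only a crude first pass; someone still has to read the comments.

```python
from collections import Counter

# Hypothetical phrase-to-dimension mapping; the labels are assumptions.
CONCERN_TO_DIMENSION = {
    "can't find": "accessibility",
    "don't understand": "documentation",
    "out of date": "timeliness",
    "wrong": "accuracy",
}

# Hypothetical customer comments gathered from interviews or surveys.
feedback = [
    "I can't find the customer table",
    "I don't understand what the status codes mean",
    "The address is wrong",
    "I don't understand the field names",
]


def map_concerns(comments, mapping):
    """Count how many comments touch each data quality dimension."""
    counts = Counter()
    for comment in comments:
        text = comment.lower()
        matched = [dim for phrase, dim in mapping.items() if phrase in text]
        # Keep comments that match nothing visible rather than dropping them.
        for dim in matched or ["unmapped"]:
            counts[dim] += 1
    return counts


dimension_counts = map_concerns(feedback, CONCERN_TO_DIMENSION)
for dimension, count in dimension_counts.most_common():
    print(f"{dimension}: {count}")
```

The "unmapped" bucket matters most here: comments that fit none of your existing dimensions are the ones most likely to reveal a dimension you aren't managing yet.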

For example, KFR Services achieved quality levels beyond Six Sigma simply by encouraging their customers to spot issues, which in turn helped them broaden their data quality dimensions.

What data quality dimensions do you manage? Do you manage dimensions that data quality technology simply can't reach? Please share your views below.