How to Mature Data Quality and Data Governance by Stealth - Interview with James Phare of Data to Value

How do you create traction with data governance and data quality when your organisation is wary of these initiatives because past projects didn’t live up to expectations?

In this interview with James Phare of UK data management consultants, Data to Value, we learn how James helped a past employer in the financial sector successfully implement a major corporate initiative while delivering data quality and data governance ‘by stealth’.

The end result was a centre of excellence model for data quality and data governance that became in-demand across the organisation.

The interview provides many insights for the reader to apply in their own organisation.

James Phare is a past contributor to Data Quality Pro and recently presented at our inaugural Data Quality and Data Governance Virtual Summit.

You can contact James at Data to Value or learn more of his background on his LinkedIn page.


Dylan Jones: Can you describe what the reactive state of your organisation looked like before you started the journey towards data quality?

James Phare: The company that I worked for had grown primarily through acquisition and had achieved exponential growth over a 10-15 year period. As a result, lots of new systems had been brought in.

It was pre-financial crisis, so there wasn't really a rush to necessarily fully integrate those systems and adopt common processes so we had lots of duplicate and disparate versions of the truth for master datasets. For example, we didn't have a single product master but instead had over ten data sources, with a similar story for client data, securities data, counter party data and other key datasets.

I think the company had always been very data driven in terms of its philosophy on the trading side (it was a world leader in algorithmic trading), however in other areas of the business the attitude was more along the lines of ‘if it isn't broke, don't fix it’.

So the business was still functioning and nothing major had blown up but there was this big underlying inefficiency with the same data being maintained in different places. As a result, different versions of the truth made it very hard to do accurate central reporting for things like sales and funds under management and business performance and so on.

I think problems started to become a bit more apparent within the firm, like many other firms, as soon as the financial crisis hit in 2008. Assets under management started decreasing, there were more redemptions, share prices were falling and thus we were coming under a lot of pressure to look at cost savings.

It's fairly obvious to work out the cost of maintaining data in lots of different places as you can easily calculate the number of people involved across different departments.

But we also did a good job of working out what the less visible costs were.

So we looked at things like:

  • What is the cost of people integrating this data in an ad hoc way for reports?

  • What is the cost of people having to clean up this data and how much time are they spending doing so?

  • How much time are people having to spend on the phone talking about different issues?

  • How often are the fund managers having to phone Operations because they are having problems with trades not being settled correctly due to reference data issues and other problems?
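As a rough illustration of how the less visible costs listed above could be turned into a number, here is a minimal Python sketch. The `HiddenCost` class, the 46 working weeks and every figure in it are placeholder assumptions for illustration only; none of them come from the interview.

```python
from dataclasses import dataclass


@dataclass
class HiddenCost:
    """One hidden-cost line item. All inputs are estimates you supply."""
    activity: str
    people: int            # number of staff affected
    hours_per_week: float  # hours each person loses per week
    hourly_rate: float     # loaded hourly cost of those staff

    def annual_cost(self, working_weeks: int = 46) -> float:
        # people x hours x rate x weeks; 46 weeks is an assumed default
        return self.people * self.hours_per_week * self.hourly_rate * working_weeks


def total_annual_cost(costs):
    """Sum the annual cost of every line item."""
    return sum(c.annual_cost() for c in costs)


# Placeholder inputs only, not figures from the interview.
costs = [
    HiddenCost("Ad hoc data integration for reports", 4, 5, 50),
    HiddenCost("Manual data clean-up", 6, 3, 50),
    HiddenCost("Phone calls about data issues", 10, 1, 50),
    HiddenCost("Chasing failed trade settlements", 3, 2, 80),
]

# Print the line items, most expensive first, followed by the total.
for c in sorted(costs, key=HiddenCost.annual_cost, reverse=True):
    print(f"{c.activity:<40} {c.annual_cost():>12,.0f}")
print(f"{'Total':<40} {total_annual_cost(costs):>12,.0f}")
```

Even a crude model like this makes the conversation with sponsors more concrete, because each line can be challenged and refined separately.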

In a way, as a data function, we were quite fortunate as the financial crisis created a large driver. However, it still wasn't quite enough to get sponsorship for quickly getting a data governance framework up and running.

We still had to make it easier for the executives to support this work.

Dylan Jones: How did you make it easier for the executives to buy in to your data quality and data governance goals?

James Phare: We did it by linking some of the corporate objectives to very tangible burning platforms that could only be solved by actually getting a data quality and data governance framework up and running.

One of the core objectives of the firm was to become more efficient at marketing and distributing products to generate more sales for products that were performing well.  

A key part of that strategy was to implement a new website.

To do that we needed to get the product data into a clean shape onto the website which was a next to impossible task as it was being maintained in pretty much every department within the company to varying degrees of quality, using different data models, different definitions, different standards and different formats.

So the desire within the business to launch a new website became our burning platform.

We explained to executives that we can't launch the new website and start to generate new sales without solving the product data problem. And we can't solve the product data problem without getting some tools, trained staff and some targets with a sense of urgency up and running around actually implementing data quality and data governance frameworks.

So that's how it all started and that then led to a year-long period of getting that up and running.

Dylan Jones: What did the management persuasion process consist of?

James Phare: The two stakeholders that were heavily involved in scrutinising some of the proposals that we were making and eventually supporting the proposals were the CIO and the COO. The CFO was involved to an extent where it impacted things like funds under management reporting, business performance and MIS.

It really struck home with them when we talked about it not being possible to launch this new website without solving the data problem.

To solve the problem we needed to go up this maturity curve and they had a choice about how we went about that. At the time the company was making lots of redundancies to try and slim down and become more efficient so it wasn't really a scenario where we could start building a big team so we had to be smarter about using existing resources.

The proposal we made was to bring in some cutting edge data quality technology and train existing staff that were doing this work day to day in the latest data profiling, data mining and rule building techniques.

It was positioned as a low cost, rapid form of doing this work and initially they were very sceptical.

Dylan Jones: How did you win them over?

James Phare: One of the ways that we managed to win the sponsors over was by doing a very successful pilot. We brought the software in on a temporary basis for about 6 weeks and selected a use case where we knew the data really well.

We did a training course on the product and then lots of interactive sessions. We worked through the issues to show this doesn't need to be a long project with lots of paperwork, targets and people involved.

We demonstrated that all you need to do is give us the funding to buy the data quality tool and we'll almost take the pain away. We can involve you as much or as little as you like.

That made it a much easier decision for the sponsors to make as opposed to going down a big project route with lots of policies, stewards, formally defined owners and lots of handouts and process stamps.

We went more down the line of “just do it”.

Dylan Jones: Most senior management obviously don’t possess a data quality background. How did you educate them on the need for improved data quality? What did that actually involve at a practical level?

James Phare: It was several months of presentations, including quite high-level presentations, talking about how data quality is not just IT's problem and not just the business's problem: it's everyone's responsibility. We explained that the problem is multi-dimensional; it covers people, process, data and technology.

We also talked to them about how it's not anyone's fault and how important it is not to create a blame culture. It's developed as a problem as a result of the way the company has grown rather than specific things that people have done.

Of course, you can only get so far with presentations, particularly in an entrepreneurial environment like we had. Sooner or later you have to ‘walk the walk once you've talked the talk’ so one of the big things we did very well was not being afraid to show some of the real things that we were doing in terms of the data profiling work, the data rules and the problems that we were finding and the dashboards we were creating to senior managers.

This ability to demonstrate our progress made it a lot more tangible to them. They could see we've found a problem and we know exactly what to do with it. We know who to talk to and, as a finger-in-the-air estimate, it's going to take 1-2 weeks or maybe a month to fix.

That gave them a lot more confidence that the problem was in safe hands and was solvable because before it had been an elephant of a problem that no-one had done a particularly good job of breaking down into manageable tasks. No-one had wanted to go near it.

That was the main achievement. It was a paradigm shift to it now being lots of small, completely solvable, problems.

Dylan Jones: In terms of those dashboards, what type of information were you relaying?

James Phare: We tried to keep the dashboards very simple, tailored and easy to generate.

Our first dashboards were just PowerPoint presentations with six charts from Excel that we'd copied in, based on aggregations of individual rule results that we then tagged against data quality dimensions.
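To make the idea of aggregating individual rule results by data quality dimension concrete, here is a minimal Python sketch. The rule names, dimension tags and counts are hypothetical examples, not the rules or results from James's project.

```python
from collections import defaultdict

# Hypothetical rule results:
# (rule name, dimension tag, records checked, records passed)
rule_results = [
    ("product_code_not_null", "completeness", 1000, 980),
    ("launch_date_present", "completeness", 1000, 900),
    ("currency_is_iso_code", "validity", 1000, 950),
    ("product_code_unique", "uniqueness", 1000, 990),
]


def pass_rate_by_dimension(results):
    """Aggregate individual rule results into one pass rate per dimension."""
    checked = defaultdict(int)
    passed = defaultdict(int)
    for _rule, dimension, n_checked, n_passed in results:
        checked[dimension] += n_checked
        passed[dimension] += n_passed
    # Skip dimensions with no checked records to avoid dividing by zero.
    return {d: passed[d] / checked[d] for d in checked if checked[d]}


summary = pass_rate_by_dimension(rule_results)
for dimension, rate in sorted(summary.items()):
    print(f"{dimension:<14} {rate:6.1%}")
```

A table like this, one row per dimension, is the kind of figure that can be pasted into Excel and charted for a simple slide.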

We also used metrics that as practitioners we wouldn't necessarily say were the most revealing metrics but from an executive level they gave a lot of comfort we were making progress.

One example is we were sorting out the product data problem and migrating the data from different sources into a single source. One of the headline statistics that everyone tracked in all of the meetings was the number of fields that were being retired, not in total but from one particular IT application. This system had been a problem for a number of years and it was by no means the most significant target that we were working to but we were able to kill off a few fields each month as we got that data migrated and cleaned up and defined in a more intuitive and useful way for the business.

I think keeping things simple was a good approach to take and it worked really well for us.

Another effective tactic we did was whenever management queried any of the statistics we would bring our data quality software into the meeting and quickly navigate into the data because we'd captured very clear notes on what we'd found in those particular issues.

We would drill into the data and explain what was happening. We would perform analysis ‘on the fly’ rather than it being a more traditional approach of saying “I'd better make a note of that and go away and check and get back to you” which sometimes doesn't help your credibility and suggests you don't know your subject area as well as you should.

It was high level but very interactive.

Dylan Jones: I love this approach as you’re actively involving senior management in the valuable work of data quality management, there is this instant feedback that is always beneficial. Great tip.

You talked about data governance earlier but what was the project recognised as: a data migration, data quality or data governance project?

James Phare: Great question and that was one of the challenges.

We couldn't necessarily call these things a data quality framework or a data governance project because those were terms that had bad connotations within the firm due to previous failed efforts.

They were viewed as quite academic and things that wouldn't necessarily solve problems so it wasn't shaped as a data migration or a quality or a governance project at all but it was shaped as part of a new website build project.

We used this project as a vehicle to get lots of best practices like data quality, data governance, data modelling and data definition in place but without actually using those terms directly because we didn't want to ‘scare the horses’ so to speak.

Dylan Jones: That’s a great idea, kind of ‘data quality by stealth’.

Executives didn't want data governance because in their eyes, data governance wouldn't drive revenue for them. But a new website would add value because they could see that people would come to the website, buy the product and fuel growth.

How bad was the reputation of data quality and data governance? Was it tarnished from top to bottom or just for some senior leaders?

James Phare: I think it was more in the senior leadership forum that it wasn't viewed hugely favourably because I think they'd had lots of large projects in the past that had governance streams and heads of these different areas doing things which had the typical problems that large projects do in terms of overruns, de-scoping, failing to fully deliver on objectives, creating architectural debt and all those problems.

Due to this history within the organisation, we had to shape our project as a much more agile, leaner project where we were working on specific objectives that needed to be done in the timeframe of weeks or months but doing it in a way that was repeatable, reusable and sustainable.

We explained how the benefits of our project would still be around for a long while after the project finished.

But it was also more specific. Increasing data maturity wasn't necessarily viewed as a bad thing but when people spoke about certain specifics those things were viewed with disdain.

Dylan Jones: Can you give an example?

James Phare: Yes, one example is the company had attempted numerous times to build a conceptual data model and a data dictionary. These hadn't been particularly well received.

As many of your readers will know from experience, you can't really implement a data governance or data quality framework without understanding what the data means and what the rules are that govern whether the data is fit for purpose.

So clearly you need those kinds of artefacts and we found it's sometimes easier to get people to buy into them if you try a different approach.

For example, it can be something as simple as re-naming a concept. In our case, we renamed the dictionary to a lexicon. This wasn't a big step but it disassociated us from work that had been done before.

Another big thing we focused on was using intuitive tools.

Rather than buying a dedicated data dictionary or data modelling tools, we used things like SharePoint and Excel with the business, tools they were familiar with.

The more technical members of the team were obviously using dedicated data migration tools for example but we rendered most of our data models in Visio or PowerPoint to make them more accessible than a complex modelling tool.

You can still use a more powerful and complex tool under the hood but there's no reason to stick with terminology if it doesn't achieve the goals you're trying to achieve.

Dylan Jones: You mentioned earlier that you were creating some reusable capabilities. What were some of the things you were creating that could be reused after the project was finished?

James Phare: The web project was a really good blueprint.

Product data was the main problem. We had over ten separate sources of product data all with different data models and data quality issues. We were integrating and creating a new clean master dataset so not every significant data quality problem we had within the firm resembled that same problem set but a lot of the artefacts and techniques we used were reusable.

Things like the data quality dashboard and the method by which we created it were reused for other datasets like client data, security data, counter party data and so on.

A lot of the techniques around how we would initially approach a data set were reused.

For example, we developed a very good top-down approach for rapidly profiling a data set and getting a feel for what the characteristics and outliers were within the data set. We would start to document those and understand which Subject Matter Experts (SMEs) to talk to and exactly what to talk to them about.

So that was something that worked very well in terms of reuse and carrying over to other data sets.
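The interview does not describe the profiling steps in detail, so the following Python sketch only illustrates the general idea of a quick first pass over a data set: count blanks and distinct values per column and flag values that occur only once as candidates to raise with a subject matter expert. The function names and sample rows are assumptions made for illustration.

```python
from collections import Counter


def profile_column(values):
    """Return basic profile statistics for one column of raw values."""
    total = len(values)
    non_blank = [v for v in values if v is not None and str(v).strip() != ""]
    counts = Counter(non_blank)
    return {
        "rows": total,
        "blank": total - len(non_blank),
        "distinct": len(counts),
        "most_common": counts.most_common(3),
        # Values seen only once are candidates to review with an SME.
        "singletons": sorted((v for v, n in counts.items() if n == 1), key=str),
    }


def profile_table(rows):
    """Profile every column of a list of dict rows (columns from the first row)."""
    columns = list(rows[0].keys()) if rows else []
    return {col: profile_column([r.get(col) for r in rows]) for col in columns}


# Hypothetical sample rows, not data from the interview.
rows = [
    {"product_id": "P1", "currency": "USD"},
    {"product_id": "P2", "currency": "USD"},
    {"product_id": "P3", "currency": "usd"},
    {"product_id": "P3", "currency": ""},
]

profile = profile_table(rows)
for column, stats in profile.items():
    print(column, stats)
```

In this sample the profile surfaces a blank currency, a lower-case `usd` and a repeated `product_id`, each of which is a specific question to take to the relevant SME.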

Dylan Jones: What skill sets did you have in this 'secret' centre of excellence?

James Phare: Previously, we had two separate teams, one within IT and one in the business:

  1. The IT team focussing on architecture, modelling, integration challenges and data warehousing

  2. The business team focussing on upstream source systems, inputting the data day to day, reconciling the data, dealing with queries from the business.

One of the first things we did was co-locate those teams together to form a BAU part of the team and also a projects part of the team.

That projects part was a mix of the two teams working on things like the website project and the data requirements as part of that. That overcame physical location barriers.

Another thing was spending a lot of time knowledge-sharing, training, doing workshops, sharing some of the pain points that people were working through and sharing some best practice.

One of the things that really helped was using a common toolset.

We brought in a data quality profiling tool and business users from both areas were trained in the tool at the same time.

They could see how useful it was to help share different problems.

Dylan Jones: Can you give an example?

James Phare: Sure. One example was the data modelling people could see the data quality tool was very powerful at doing metadata discovery and helping with reverse engineering, things you need to do for designing target data models for a data migration.

But the business people could also see how they would need to populate this target model with clean data.

Another key aspect was process.

We worked out a single process for capturing issues, understanding which subject matter experts to talk to and working through those issues. That included what the escalation paths were for dealing with those issues, and also putting together realistic estimates for some of them.

Previously within the business team the default position whenever finding data quality defects was: “How can I hack this thing into the system and tick these things off my list?”

Historically, this would often involve someone manually typing things into the system but now, being co-located with the IT (more technical) team, they got more of an understanding around those particular systems to say “this system is built with this technology and designed in this particular way” so it was easier for them to script data in a particular way.

Sometimes the opposite was the case and it wasn't worth talking to IT because the architecture was such that it would be better off doing it manually.

Ultimately, maybe these people aren't going to be able to talk a common language and work towards a common goal. But more often than not when you put people in the same room they end up sharing the same objectives and similar ways of working.

Dylan Jones: What data quality fundamentals were you trying to put in?

James Phare: The main objective for the pilot project was to move to a new target source and retire the historical sources.

We did that in a very agile, iterative-phased approach.

The first thing we did was create a common master set of entities and then we populated out and migrated the clean attributes in phases. We would do maybe ten attributes a month for a particular requirement within the website. If they were developing certain pages at that point it needed certain data points so we would focus on those as a priority.

One of the big things that made a difference from a data quality perspective was moving towards a single source. It made the target process a lot easier, especially for keeping the data clean. We no longer had this hugely complex architecture with data flowing through lots of different systems and having lots of different transformations applied to it. We didn't have to get hugely complex root-cause analysis processes up and running any more.

One of the things we did at the same time, which really paid dividends, was sorting out the structured and unstructured data challenges. We created a single document repository for all of the legal documents around the funds. It was very quick to reconcile a particular entry manually against the original source. It reduced the size of the data chain from lots of different linkages down to a very small and manageable set.

Dylan Jones: How long did the project run for?

James Phare: We initially spent about 6 months doing tool selection, lining up resources and doing training. After this, it was probably a year doing the actual project itself, i.e. loading in the sources, doing the integration/migration work, data clean up and the target process roles/responsibilities and so on.

It was a fairly rapid project in the end but luckily it was a key focus from management to get this done so that made it easier to focus on getting the fundamentals and essentials delivered.

Dylan Jones: Was the project ultimately considered a success?

James Phare: Definitely. It was a success in two ways.

Firstly, it enabled the company to launch a very successful new website which got lots of very positive feedback from clients and other stakeholders.

Secondly, it positioned the team really well because they demonstrated a big success as an initial blueprint for repeating in other areas.

I think there were other areas in the business that had maybe a bit more complexity but there were also areas where the blueprint could be rolled out where the problems were similar. It created a positive environment for keeping the momentum moving forward.

That is one thing a lot of reactive organisations struggle with. Even for organisations that are doing data quality, it's hard to keep that sense of urgency moving forward. It's much easier to sit back into a more reactive frame of mind where you've got things running, you're not necessarily looking at the right things, and you deal with problems as and when they appear rather than actively going out and proactively looking for problems and building business cases and ultimately making your organisation more efficient.

Dylan Jones: What did you do to keep that momentum going?

James Phare: After that successful project it was easy to justify the existence of the team so ongoing funding wasn't a problem.

One of the things we did well was run roadshows demonstrating the tools and techniques we'd used. We were showing people how quickly we'd made progress because it was a very similar story in other datasets within the firm: people had previously viewed them as a problem that they'd tried to solve before but it was just too big, too much of a thankless task, just impossible.

But our roadshow showed it is possible to solve these issues and it can actually be very simple. You just need to make some small commitments in terms of time, resources and funding and we can take a lot of the pain away within this competency centre in terms of the technical requirements around data profiling, data migration and data mining, rule building and reverse engineering.

We demonstrated that you don't need to be fully involved in it - you just need to supply some subject matter expertise and sponsorship and we can take that on board.

The challenge was not finding problems but triaging and prioritising people that wanted our time. We had to look at which of the problem-sets support the corporate objectives to the highest degree.
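The interview does not say how the triage was scored, so the following Python sketch is only one possible way to rank incoming requests. It assumes a 1 to 5 score for alignment with corporate objectives, an effort estimate in weeks, and a rule that unsponsored requests drop to the bottom; all of these, and the sample requests, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    objective_alignment: int  # assumed scale: 1 (weak) to 5 (strong)
    effort_weeks: int         # rough estimate of the effort involved
    has_sponsor: bool         # has the area committed sponsorship and SMEs?


def priority(req):
    """Alignment per week of effort; unsponsored requests score zero."""
    if not req.has_sponsor or req.effort_weeks <= 0:
        return 0.0
    return req.objective_alignment / req.effort_weeks


def triage(requests):
    """Return requests ordered from highest to lowest priority."""
    return sorted(requests, key=priority, reverse=True)


# Hypothetical requests, not taken from the interview.
requests = [
    Request("Client data clean-up", 5, 8, True),
    Request("Counter party data reconciliation", 3, 4, True),
    Request("Securities data migration", 4, 6, False),
]

for req in triage(requests):
    print(f"{req.name:<36} {priority(req):.3f}")
```

Dividing alignment by effort favours quick wins, which fits the lean approach described in the interview, but any weighting that the sponsors can understand and challenge would serve the same purpose.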

Dylan Jones: So you had demonstrated clear success. Did you go straight from there to co-founding your practice?

James Phare: I'd been at the company for quite a long time by the point we got the framework and competency centre up and running.

We were spending a lot of time talking to other people in firms within the financial sector and sharing challenges and solutions that people were coming up with.

I soon observed that there weren't many people that seemed to be having as much success as we did. People were getting bogged down with process and governance, policies, roles and culture rather than getting on with it and letting those things naturally sort themselves out.

It was a good opportunity for me to move into the consulting arena and try to look for new exciting cases and share some of that knowledge.

Dylan Jones: What has the experience of growing your own data quality practice been like for you?

James Phare: We launched Data to Value three years ago and the journey has been great fun.

It’s hard work but fortunately with data quality, the problem doesn’t go away, it just changes shape. Priorities change but ultimately the challenges are always there.

I’ve worked on diverse projects ranging from low volume, relatively straightforward reconciliation projects, right up to the Big Data end of the spectrum.

For example, I’ve worked on data quality in derivatives trading platforms for banks that are executing a million trades a day. This gives you a great chance to test your data quality skillset.

I recommend it to anyone thinking of going down that route; particularly if you've done something innovative in the past that should be shared.

Dylan Jones: Do you still find within the financial sector that people respond more positively when you don't talk about data quality, data governance or MDM, but instead talk about the benefit of these projects? Do you find this happens in other sectors?

James Phare: I think there's a similar challenge across all sectors.

I sometimes wonder why I don't tend to talk about specific terms to do with the process, e.g. data stewards, owners, policies and principles. I think it's because (for me at least) it's always been about the outcomes rather than how you get to that place.

It's easy to talk about how things should be but unless you can make a meaningful impact to the people that are using this data, then it's easy to add unnecessary layers of complexity into the organisation.

We therefore always try to focus on outcomes and work back from that rather than start with the process and hope we end up at those outcomes.

Dylan Jones: When companies come to you requesting your help, do they ask for help with data quality?

How do they define and present their problems to you?

James Phare: Within the data profession and related professions, we often hear about this common language.

In reality, I think we're still quite far away from that.

One thing that surprised me when I moved into consultancy is how interlinked these related professions and disciplines are. You can't talk about a data quality problem without exploring the governance, modelling and definition aspects for example, but also how people term things.

For example, quite often we are approached by people saying: “we have a data governance problem”.

As we start talking to them about the burning issues they're struggling with, they say things like:

“people don't understand what the data means, they keep misusing it, the data is not fit for purpose, it’s inconsistent, board members aren't able to understand each other when they're comparing reports”.

All of these things are wrapped within the same ecosystem.

One of the mantras we've always adopted is to have multi-skilled practitioners who are trained in the fundamentals of all areas because it's hard to get to a boundary and stop.

You can’t say: “I've solved the data quality aspects for this but I'm not going to look at any of the data governance or data definition issues”. You can't really solve any of those things in isolation without addressing some of the related aspects.

Dylan Jones: That’s a great lesson to end the interview on. Work back from the desired outcomes and build a skillset that is multi-disciplinary. I think particularly for practitioners looking to move out of the role of permanent employee and into a consulting role, even running their own practice as you do at Data to Value, that’s excellent advice.

Thank you so much for the interview and all of the great insights you continue to share on Data Quality Pro and your Virtual Summit sessions. Best of luck with Data to Value in 2016.

James Phare: Not a problem, I’m always keen to share my experiences and I hope they support your members as they look to tread a similar path.


Summary of key points:

  1. Become effective at working out the obvious cost of maintaining data in lots of different places - James found it easy to assign labour costs this way
  2. Once you’ve found the obvious costs, work out a hidden costing model by looking for things like the cost of ad hoc reporting, manual clean up, support due to defects, transaction failures
  3. Find a burning platform, a data-driven initiative, that has executive focus and can be used as a blueprint for more effective data quality and data governance
  4. Lead with the business outcomes and drive data quality and data governance behind the scenes (if these terms have negative connotations with the leadership)
  5. Aim high with your sponsorship by supporting high-impact projects, James focused on getting support from the CIO, COO and CFO (to a lesser extent)
  6. Position your project as lightweight and agile, most businesses won’t have the stomach for a massive programme of improvement right out of the gate
  7. Educate executive management on the basics of data quality but in a business-focused manner, back up your educational drive with lots of presentations 
  8. Demonstrate your value continually via dashboards and examples of where improvements are made in the live data
  9. Keep dashboards simple, Excel and PowerPoint are fine to start with, don’t overload the executive sponsors
  10. Allow instant drill-down whenever management have a query, facilitate this using data quality technology that can navigate from the issue reported down into the underlying operational data
  11. Don’t be precious about naming your data quality project specifically in these terms, in James’ organisation data quality and governance were viewed as academic initiatives, he focused on the business outcome of a new web presence for selling products instead, this will help you deliver data quality by stealth
  12. When presenting your project to executives, focus on how agile and lean your project is, explain how you are working on specific objectives that can be done in the timeframe of weeks/months
  13. Emphasise that your initiative will also create repeatable, reusable and sustainable processes that will benefit the entire company
  14. Don’t be afraid to change terms and definitions if they are viewed negatively, James’ organisation had negative experiences with building a data dictionary so he changed the name to a ‘lexicon’ and found it became well received
  15. When selecting executive or business facing tools, don’t blind people with science, use simple tools like PowerPoint, Excel and SharePoint, so they become instantly familiar (you can still use the hi-tech alternatives behind the scenes)
  16. Try and co-locate your teams for maximum skill sharing of best practice techniques, this will help you establish a centre of excellence
  17. Adopt a common toolset that can help create repeatable processes that can be quickly deployed elsewhere within the organisation
  18. Learn from your successes in data quality and data governance to create a blueprint for replicating the same conditions elsewhere
  19. Find projects and situations in the organisation that closely match your last project, don’t ramp up the complexity until you have incrementally added value
  20. Create internal roadshows for demonstrating the tools and techniques, this will help you get buy-in on new projects and also find new recruits to the team
  21. Develop a triaging and prioritisation process so that you actively select the right projects to be involved with in future
  22. When creating your own practice, don’t focus on any one data discipline exclusively, demonstrate to clients how you tackle the outcome in the first instance and then build a skillset that is inter-disciplinary

About the Interviewee

James Phare

James Phare is a founding partner of Data to Value, a consultancy that specialises in applying Lean principles to Information Management.

He has over 12 years' experience of working in various data-centric roles on both the Buy and Sell sides of Financial Services. After graduating in Economics and Economic History at the University of York, James started his career at Thomson Reuters before joining Man Group in 2007 within Operations and Investment Management.

Contact James Phare: http://www.datatovalue.co.uk/contact