Note: This request was posted by a recent member of the site.
Dear Community Members/Experts,
We are trying to create a Data Profiling tool and were wondering if you could point us to resources/books/articles on algorithms/approaches for Redundancy, Attribute and Dependency Profiling. We want our solution to be accurate and fast.
I eagerly await your responses.
*** Editor's Note: Additional information from member ***
MDM is an upcoming area, and we want to get up to speed with it while it is still in its infancy. We think that all MDM and DQ initiatives should start with Data Profiling, so we decided to create a tool for it (or at least give it a shot). Most importantly, we are fascinated by MDM and DQ and hope to develop advanced expertise in this area. We have read that pattern matching, at least as it is implemented in many products on the market, simply isn't good enough. So we would really like to find algorithms/approaches that are more efficient and accurate.
Any help in this matter is greatly appreciated.