As the “cloud wars” between Amazon, Google, and Microsoft rage on, Chief Data Officers (CDOs) face their own battle: ensuring data quality while embracing cloud migration.
The cloud unlocks opportunities to improve flexibility, increase agility, and reduce costs. But it’s not a panacea for every tech pain point. Unfortunately, many organizations fall for marketing buzz and expect that once they migrate to the cloud, they’ll have the tools to easily solve data quality issues too.
In reality, data quality isn’t an outcome of cloud migration; it’s a prerequisite for its success. Despite best efforts by companies to address data quality, Gartner research shows that fewer than one-third of CDOs perceive value from their data and analytics tools. Difficulties in verifying the quality of their data may be to blame.
While moving to the cloud should be part of an overall strategy to improve data quality, it shouldn’t be the entire strategy. Today’s organizations require a more holistic approach.
Why many organizations struggle with data quality
When organizations treat data quality as an afterthought instead of an integral part of their migration strategy, their cloud migration is likely to fall short of expectations. Unfortunately, these challenges are all too common. About six in 10 IT professionals say data quality and cleanliness are the most significant challenges they face when working with data, and the result is escalating costs, increased disruptions, and delayed processes.
Low-quality data impacts the entire business. Leaders face a shortage of data-driven insights to guide decision-making and enhance operations. Even more concerning, incomplete or inaccurate information can lead organizations astray.
Low-quality customer data makes an organization especially vulnerable. Disjointed customer interactions or poorly personalized offerings — both the result of low-quality customer data — can erode trust with customers and cause them to look elsewhere.
Ultimately, it’s up to the CDO to ensure the organization has the technology, processes, and people to make more informed decisions. For many, that may involve charting a new course forward to strengthen data quality.
3 steps CDOs can take to create a more holistic data quality strategy
There is no silver bullet when it comes to improving data quality. Organizations need a holistic data strategy to understand, optimize, and monitor data in the cloud. At a minimum, that should entail:
Discovery to identify all the relevant data sources across the organization
Profiling to understand the unique characteristics and requirements of each data source
Standardization of the quality rules, permissions, and governance
Deduplication of data to ensure migration involves the most up-to-date data
Ongoing monitoring and refinement of data quality
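To make two of these steps concrete, here is a minimal Python sketch of profiling and deduplication. The record layout, the field names (customer_id, email, updated), and the helper functions are assumptions made for illustration; they do not come from any particular tool.

```python
from collections import Counter
from datetime import date

# Hypothetical customer records collected from several source systems.
records = [
    {"customer_id": "c1", "email": "ana@example.com", "updated": date(2023, 1, 5)},
    {"customer_id": "c1", "email": "ana.new@example.com", "updated": date(2023, 6, 1)},
    {"customer_id": "c2", "email": None, "updated": date(2023, 3, 2)},
]


def profile(rows):
    """Profiling (simplified): count missing values per field."""
    missing = Counter()
    for row in rows:
        for field, value in row.items():
            if value is None or value == "":
                missing[field] += 1
    return dict(missing)


def deduplicate(rows, key="customer_id", freshness="updated"):
    """Deduplication: keep only the most recently updated row per key."""
    latest = {}
    for row in rows:
        current = latest.get(row[key])
        if current is None or row[freshness] > current[freshness]:
            latest[row[key]] = row
    return list(latest.values())


missing_counts = profile(records)  # which fields have gaps, and how many
clean = deduplicate(records)       # one up-to-date row per customer
```

A real program would profile far more than missing values (formats, ranges, uniqueness) and would need an agreed rule for which record counts as the most up to date. The sketch only shows the shape of the two steps.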
This high-level approach isn’t new. But there’s still substantial room for improvement in implementing comprehensive, cohesive data quality programs in cloud-native environments. Here’s how:
Center data on the customer, not the application
Even when data is accurate, complete, and consistent, it’s often centered around applications, with each application collecting, copying, and storing its own data. The inevitable result is a massive amount of data duplication that overlaps at the application level, creating a convoluted web of data variants and relationships that are confusing to navigate.
To avoid these redundancies, shift your organization to a customer-centered data model, instead of an application-centered one. Centering data on the customer in this way eliminates redundancy and overlap in authoritative data, resulting in a more concise schema and datasets. It also ensures data is well defined while flexible enough to change over time without breaking applications.
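As a rough illustration of the difference, the Python sketch below folds two application-centered stores into a single record per customer. The application names, fields, and the build_customer_view helper are hypothetical, and the merge rule (later sources overwrite earlier ones) is a simplifying assumption rather than a recommended policy.

```python
# Hypothetical application-centered stores: each application keeps its own
# copy of the customer, so the name is duplicated across both.
billing_app = {"c1": {"name": "Ana Silva", "address": "1 Main St"}}
support_app = {"c1": {"name": "Ana Silva", "phone": "555-0100"}}


def build_customer_view(*app_stores):
    """Merge per-application records into one record per customer.

    If two stores disagree on a field, the store passed later wins.
    """
    customers = {}
    for store in app_stores:
        for customer_id, attributes in store.items():
            customers.setdefault(customer_id, {}).update(attributes)
    return customers


customers = build_customer_view(billing_app, support_app)
```

After the merge, each attribute appears once under the customer it describes, and applications read from that shared record instead of maintaining private copies.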
Empower application owners to map data schemas
Your data quality processes should strike the right balance between flexibility and consistency. You can find the sweet spot by empowering application owners to map application schemas to an organization-wide authoritative model while providing the necessary governance to ensure consistency, cohesion, and control.
The resulting structure enables application owners across the organization to keep mappings up to date. This feeds analytics on high-quality, relevant data used by each application. For the organization, data schemas become more standardized and simplified, with each entity, concept, and relationship in the organization described in a single place. Over time, mappings in the application space can be simplified or even removed.
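One way to picture this arrangement is a set of owner-maintained mappings that are checked against the authoritative model before they are used. The Python sketch below assumes a three-field authoritative model and two invented applications ("crm" and "billing"); all names are illustrative.

```python
# Hypothetical authoritative model: the organization-wide field names.
AUTHORITATIVE_FIELDS = {"customer_id", "full_name", "email"}

# Hypothetical mappings maintained by each application owner:
# application field name -> authoritative field name.
MAPPINGS = {
    "crm": {"id": "customer_id", "name": "full_name", "mail": "email"},
    "billing": {"cust_no": "customer_id", "contact_email": "email"},
}


def validate_mapping(mapping):
    """Governance check: every target must exist in the authoritative model."""
    unknown = set(mapping.values()) - AUTHORITATIVE_FIELDS
    if unknown:
        raise ValueError(f"Unknown authoritative fields: {sorted(unknown)}")


def to_authoritative(app_name, record):
    """Translate one application record into the authoritative schema.

    Fields without a mapping are dropped rather than passed through.
    """
    mapping = MAPPINGS[app_name]
    validate_mapping(mapping)
    return {mapping[f]: v for f, v in record.items() if f in mapping}


crm_row = to_authoritative(
    "crm", {"id": "c1", "name": "Ana Silva", "mail": "ana@example.com"}
)
billing_row = to_authoritative(
    "billing", {"cust_no": "c1", "contact_email": "ana@example.com", "terms": "net30"}
)
```

Application owners edit only their own entry in the mapping table, while the validation step gives the organization a single place to enforce consistency.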
Leverage digital tools to automate and simplify processes
Managing data quality processes can seem overwhelming, but the good news is that you aren’t in it alone. You have an array of digital solutions at your disposal to streamline and simplify data management. To start, automate the data flow between applications and the corresponding mapped data store; from there, synchronize with the centralized data store to feed your analytics tooling.
Open-source standards can also help, providing a universal API for reading and writing data that separates identity, applications, and data. This data infrastructure provides a structured environment to define schema mappings, manage linked data models, and exchange datasets, along with the mechanisms to protect data and ensure proper access control policies. Applying these advanced digital tools and others fortifies the foundation of your data quality strategy.
A starting point for data quality
The road to robust data quality is a long, challenging journey — and cloud migration is just one step. CDOs need a more comprehensive data roadmap to fully unlock the cloud’s potential and achieve results faster.
Rather than relying on the cloud to support your entire data strategy, refocus your efforts on core elements of data quality and the integration of advanced technology tools to support it.
You know where you want to go. It’s just a matter of starting out on the right foot.
About the Author
Emmet Townsend, VP of Engineering at Inrupt, is a technology leader with over 25 years of experience delivering software solutions across a range of industry sectors and in companies ranging from startups to large global multinationals. With a background in software engineering and architecture, Emmet works closely with engineering teams and customers to deliver enterprise-grade solutions with an emphasis on the needs of highly available, secure, real-time products and services.