Data Governance in Healthcare: Best Practices
What Is Data Governance in Healthcare?
The American Health Information Management Association (AHIMA) defines data governance in healthcare as “the overall administration, through clearly defined procedures and plans, assures the availability, integrity, security, and usability of the structured and unstructured data available to an organization”.
In other words: data governance makes healthcare information more accessible to a wider range of users through the secure implementation of data sharing and data analytics technologies.
A 2024 survey found that 51% of healthcare stakeholders strongly agreed that data analytics will be key to the success of their organization in the coming years. Yet, 47% of healthcare data, on average, is underused when making clinical and business decisions.
The biggest blockers to making more data usable include low data literacy skills among the workforce and technical challenges with data access, aggregation, and analytics. These are the exact problems healthcare data governance aims to solve.
As a collection of people, processes, and technologies, data governance aims to improve data ingestion, processing, storage, and security across the entire organization.
Key Components of Data Governance Strategy in Healthcare
Data Ownership
Data ownership is a set of principles and policies that establishes accountability for data assets within an organization. It defines the role of data owners — people with authoritative rights over specific datasets — and determines their responsibility for maintaining the accuracy, integrity, and security of data. Data ownership helps set up appropriate identity and access management (IAM) for different types of assets and maintain clear data lineage across the organization.
Data Standards
Data standards lay out principles for defining acceptable data format, structure, and quality requirements to promote uniformity and interoperability across data storage and data analytics systems. They prevent data silos from accumulating and enhance data interpretation, making it easier for business users and data scientists to access various data sources, build custom datasets, and run analytics.
Data Policies
Data policies provide a rulebook for handling data throughout its lifecycle. These guidelines outline the principles, procedures, and best practices for data creation, storage, access, usage, protection, and disposal. The goal of data policies is to strengthen data security, ensure ethical data usage, and nurture a culture of data responsibility.
Data Stewardship
Data stewardship is the strategic oversight of data governance principles within the organization. Typically spearheaded by a chief data officer (CDO), the data stewardship team continuously works on improving the data governance program and fostering deeper collaboration between technical and healthcare teams.
Benefits of Data Governance in Healthcare
To ensure the best patient outcomes, advance clinical research, and maintain regulatory compliance, healthcare organizations put data governance in focus. The advantages include:
- Higher data fidelity. Mistakes in data entry happen, but in healthcare they come at a tremendous price. An analysis of ambulatory electronic health records (EHRs) revealed that one-fifth contained errors. Approximately 20% of patients said those errors were critical because they related to their diagnosis or medication. Since data governance reduces manual entry and introduces data quality checks, you gain more reliable data for decision-making.
- Improved staff productivity. Fast access to validated data and user-friendly analytics increase your teams’ efficiency. Data platform adopters cite 30% improvements in cost savings through better resource allocation and workforce management, resulting in faster action. The Dutch hospital group Santeon achieved a 27% reduction in reoperations due to post-op complications by sharing data for key value-based metrics across clinical teams.
- Better preventive care. Preventive care improves population longevity and reduces healthcare costs. With greater access to data, healthcare teams can implement more effective strategies for addressing healthcare risks at early stages. The British NHS developed an application that uses live patient clinical data to detect patients at risk of Acute Kidney Injury (AKI). Using a combination of qualitative and quantitative methods, the app reduces diagnostic times to 14 minutes.
- Regulatory compliance. Strong data governance helps meet HIPAA and GDPR obligations around patient data privacy and security. It reduces the risks of unauthorized data collection, access, or reuse, which can lead to hefty regulatory fines.
- Improved cybersecurity. Full visibility into how your data is captured, stored, and shared is critical for building a tight security perimeter. The average cost of a data breach in healthcare is $10.93 million, up by 42% from 2020. Effective data governance, paired with proactive cybersecurity measures, mitigates these risks.
Cornerstone Practices for Effective Data Governance in Healthcare
When it comes to implementing data governance in healthcare, several challenges must be prioritized: data silos, poor system interoperability, and a missing chain of custody.
Conduct a Data Audit
Various parts of your organization use data in different ways, making it harder to maintain visibility and implement unified controls. An audit helps discover how data flows through your organization: where it originates, how it is stored, accessed, and reused.
Similar to IT asset management at large, data management starts with identifying and tagging all applications and infrastructure elements at your disposal and organizing them in a shared data catalog. Services like Microsoft Purview help automatically discover and categorize different data assets across your organization, establish lineage, and enforce your data governance procedures.
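To make the catalog idea more concrete, here is a minimal Python sketch of catalog entries with lineage tracing. It does not reflect how Microsoft Purview works internally. The `CatalogEntry` fields, the asset names, and the owners are all hypothetical and chosen purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One data asset found during the audit (all fields are illustrative)."""
    name: str
    owner: str
    storage: str
    contains_phi: bool
    upstream: list[str] = field(default_factory=list)  # names of source assets


def trace_lineage(catalog: dict[str, CatalogEntry], name: str) -> list[str]:
    """Return every asset that `name` is derived from, nearest sources first."""
    seen: list[str] = []
    queue = list(catalog[name].upstream)
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue
        seen.append(current)
        if current in catalog:
            queue.extend(catalog[current].upstream)
    return seen


# Hypothetical assets; a real audit would populate this from discovery tooling.
catalog = {
    entry.name: entry
    for entry in [
        CatalogEntry("ehr_encounters", "clinical-informatics", "on-prem SQL", True),
        CatalogEntry("lab_results", "laboratory", "on-prem SQL", True),
        CatalogEntry("patient_360", "data-platform", "cloud data lake", True,
                     upstream=["ehr_encounters", "lab_results"]),
        CatalogEntry("readmission_dashboard", "quality-team", "BI workspace", False,
                     upstream=["patient_360"]),
    ]
}

lineage = trace_lineage(catalog, "readmission_dashboard")

# Assets tagged as PHI-free that are nevertheless derived from PHI sources
# are worth a manual review during the audit.
needs_review = [
    entry.name
    for entry in catalog.values()
    if not entry.contains_phi
    and any(catalog[src].contains_phi
            for src in trace_lineage(catalog, entry.name) if src in catalog)
]
```

Here `lineage` lists the dashboard's sources from nearest to farthest, and `needs_review` flags the dashboard because it is tagged as PHI-free while being built on PHI-bearing assets.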
Establish a Compliant Data Collection Process
The general public has a positive attitude toward sharing healthcare data with their healthcare practitioners. However, attitudes shift depending on the type of data recipient: 64% will share with a primary healthcare provider, but only 35% will share with a research institution.
Ensuring explicit consent and explaining how the submitted data will be stored, used, and anonymized are also critical. So is maintaining the patients’ rights to demand data removal. Your data management infrastructure must accommodate these obligations.
Data protection solutions can automatically scan all the stored data and categorize it by sensitivity levels. For example, you can immediately identify documents containing personally identifiable protected health information (PHI) by a respective tag and auto-enforce the highest levels of protection. Extra controls can then be implemented to restrict access or sharing.
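As a rough illustration of sensitivity tagging, the Python sketch below scans text for a few PHI indicators and assigns a label. The patterns, the MRN format, and the `restricted`/`general` labels are assumptions made for this example. Commercial data protection tools rely on much broader detection logic than three regular expressions.

```python
import re

# Illustrative patterns only; real PHI detection needs far broader coverage.
PHI_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "mrn": re.compile(r"\bMRN[:\s]*\d{6,10}\b", re.IGNORECASE),  # hypothetical MRN format
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def classify(text: str) -> dict:
    """Tag a document with the PHI indicators found and a sensitivity level."""
    found = sorted(name for name, pattern in PHI_PATTERNS.items()
                   if pattern.search(text))
    level = "restricted" if found else "general"
    return {"sensitivity": level, "indicators": found}


# Hypothetical documents standing in for stored files.
documents = {
    "discharge_note.txt": "Patient MRN: 00482913, contact jane.doe@example.com",
    "cafeteria_menu.txt": "Soup of the day: lentil.",
}

tags = {name: classify(body) for name, body in documents.items()}
```

Downstream controls, such as blocking external sharing, could then key off the `sensitivity` tag rather than inspecting each document again.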
Optimize Your Data Pipelines
Internal data users (business, technical, and healthcare teams) expect easier access to pre-approved data sets, something that’s often constrained by poor system interoperability and legacy data management infrastructure. A lot of medical datasets are multi-modal and high-dimensional. Their transformation and storage require a lean architecture that combines data warehousing and data lake solutions.
Legacy data architectures often rely on extract, transform, load (ETL) tools with low data throughput capabilities, restricting access to fresh data and causing delays in processing high volumes. When data access is restricted, organizations often turn to manual reporting efforts (hand-coding) to obtain the necessary data, putting extra pressure on software engineering teams.
Investing in newer data ingestion and data transformation architectures like data fabric can dramatically reduce the cost of insights and speed up their delivery. After modernizing its legacy data infrastructure, the Memorial Sloan Kettering Cancer Center (MSK) reduced data access times from weeks to hours, while also eliminating redundant data copies that hindered compliance.
Case in point: Infopulse, together with our partner Imperio, created an effective RPA solution for the Healthcare Institution of North Iceland (HSN). The project automated 20+ business processes, including report generation and verification, data entry, migration across HSN’s systems, and patient data synchronization. Additionally, data management processes were streamlined with AI functionality that automatically recognizes, extracts, processes, and transfers all of the patient data into HSN’s dedicated database and updates it in real time.
Improve the Quality of Your Data
Better patient outcomes and faster clinical research hinge on high data fidelity. Data quality management (DQM) is a set of processes and technologies for continuously monitoring data accuracy and reliability.
To establish an effective DQM process:
- Define data quality measures (e.g., accuracy rate or duplication thresholds)
- Conduct an assessment of the current assets
- Evaluate which extra tools you need to reach the targets
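The first two steps can be sketched in a few lines of Python: define measures with target thresholds, then assess a sample of records against them. The records, field names, and threshold values below are hypothetical. Real targets would come from your own data standards.

```python
# Hypothetical sample of patient records pulled for assessment.
records = [
    {"patient_id": "P001", "dob": "1980-04-12", "blood_type": "A+"},
    {"patient_id": "P002", "dob": "", "blood_type": "O-"},
    {"patient_id": "P001", "dob": "1980-04-12", "blood_type": "A+"},  # duplicate
    {"patient_id": "P003", "dob": "1975-11-30", "blood_type": None},
]

# Hypothetical targets: minimum completeness, maximum duplicate rate.
THRESHOLDS = {"completeness": 0.95, "duplicate_rate": 0.01}


def completeness(rows, fields):
    """Share of expected field values that are actually filled in."""
    total = len(rows) * len(fields)
    filled = sum(1 for row in rows for f in fields if row.get(f) not in (None, ""))
    return filled / total if total else 1.0


def duplicate_rate(rows, key_fields):
    """Share of rows whose key fields repeat an earlier row."""
    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(row.get(f) for f in key_fields)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return duplicates / len(rows) if rows else 0.0


report = {
    "completeness": completeness(records, ["patient_id", "dob", "blood_type"]),
    "duplicate_rate": duplicate_rate(records, ["patient_id", "dob"]),
}

# Completeness must meet or exceed its target; duplicate rate must stay below it.
failed = []
if report["completeness"] < THRESHOLDS["completeness"]:
    failed.append("completeness")
if report["duplicate_rate"] > THRESHOLDS["duplicate_rate"]:
    failed.append("duplicate_rate")
```

With this sample, both measures miss their targets, which is the kind of gap the third step (choosing extra tooling) is meant to close.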
Depending on your current technology portfolio, you may look for extra data quality management tools for data profiling, which help analyze data structures and map the relationships between different assets, or for additional data cleansing solutions that standardize the formatting of ingested data and eliminate duplicates and erroneous entries. SQL Server Data Quality Services (DQS) from Microsoft, for example, allows you to build and use a knowledge base for data correction, enrichment, standardization, and de-duplication tasks.
SAS Data Quality also streamlines data preparation and enrichment for SAS analytics products. Riverside County in California used the platform to integrate health and non-health data from different sources to support its whole person care (WPC) program, now serving over 2.5 million residents.
Invest in Data Storage Security
A staggering 88% of healthcare organizations experienced an average of 40 cyber attacks over the last year. Some of these have proven to be successful, resulting in data loss and heavy regulatory scrutiny.
Unprotected data storage is the prime target for cyber-criminals as it often contains the most valuable assets — patient addresses, identity documents, payment details, etc. A comprehensive data governance strategy must emphasize data storage security, covering data warehouses (DWH), data lakes, databases, and blob storage, both in the cloud and on-premises.
Our cybersecurity team suggests implementing best practices like data masking for datasets used in analytics applications, which also helps ensure data privacy. Encryption of data at rest is also necessary to protect sensitive assets, and role-level security is recommended. Regular data backups, coupled with software-enabled disaster recovery (DR) and business continuity (BC) measures, minimize the risks of accidental (and malicious) data loss and service disruptions.
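One possible shape for data masking ahead of analytics is sketched below in Python, using only the standard library. It combines keyed hashing (pseudonymization) for identifiers with simple generalization for quasi-identifiers. The record fields, the 16-character token length, and the placeholder key are assumptions for illustration. This sketch alone should not be treated as meeting HIPAA de-identification requirements.

```python
import hashlib
import hmac

# Placeholder only: in practice the key would come from a secrets manager.
SECRET_KEY = b"replace-with-a-key-from-your-secrets-manager"


def pseudonymize(value: str) -> str:
    """Replace an identifier with a stable keyed hash so joins still work."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def mask_record(record: dict) -> dict:
    """Return a copy of the record that is safer to expose to analytics users."""
    masked = dict(record)
    masked["patient_id"] = pseudonymize(record["patient_id"])
    masked["name"] = "***"                               # drop direct identifier
    masked["postcode"] = record["postcode"][:3] + "***"  # keep coarse region only
    masked["birth_year"] = record["dob"][:4]             # keep year, drop full date
    del masked["dob"]
    return masked


# Hypothetical source record.
original = {
    "patient_id": "P001",
    "name": "Jane Doe",
    "postcode": "90210",
    "dob": "1980-04-12",
    "diagnosis": "N17.9",
}

masked = mask_record(original)
```

Because the hash is keyed and deterministic, the same patient maps to the same token across datasets, while anyone without the key cannot recompute the token from a known identifier.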
Provide Staff Training
Last, but not least, you should educate your people on the importance of following the new data governance procedures. While modern data governance solutions can automate the enforcement of numerous data quality standards and access procedures, it is your staff who will be the biggest ambassadors of the changes.
Training and upskilling may be necessary to encourage more people to use the available data analytics tools and dashboards. Co-design is another powerful way to ensure buy-in among business users. Engage end-users in the implementation of new data collection, management, and usage processes, requesting their feedback at every stage. Including feedback from the workforce during the design process increases the chances of new processes being accepted, positively impacting the entire organization.
Conclusion
Having a strong data governance framework helps ensure that the data you collect, process, and use is accurate, consistent, and reliable. By prioritizing investments in new technologies for data discovery, data ingestion, and data security, healthcare organizations can substantially improve patient outcomes, increase workforce efficiency, and reduce compliance hurdles.