1. Operating the Data Integrity Check Utility
Key Points
- Purpose: Identify MPIID inconsistencies between EMPI tables and Registry (Hub)
- Three-Table Check: Validates MPIID consistency across Registry, EMPI, and normalized tables
- Weekly Schedule: InterSystems recommends weekly execution for the first month, then monthly
- Terminal Execution: Run from Terminal in the EMPI namespace using a `##class` method call
Detailed Notes
Overview
The data integrity check utility is a critical maintenance tool for InterSystems EMPI systems. It identifies MPIID (Master Patient Index ID) inconsistencies between the three tables that store patient identity information: the Registry table (HS.Registry.Patient), the EMPI table (HSPI.Data.Patient), and the normalized linkage table (Local.Linkage.Definition.Normalized or custom location). Inconsistencies typically arise when IDUpdateNotification messages from EMPI fail to be received or processed by the Registry, although other causes are possible.
For a patient record to pass the integrity check, it must appear in all three tables with the same, non-null MPIID value. Any deviation from this condition constitutes a discrepancy that requires investigation and repair.
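The pass condition above can be written as a small predicate. This is only an illustrative Python sketch of the rule, not the utility's actual implementation. The function name and the use of `None` to stand for "record missing or MPIID null" are assumptions made for the example.

```python
from typing import Optional


def passes_integrity_check(registry: Optional[str],
                           empi: Optional[str],
                           normalized: Optional[str]) -> bool:
    """Each argument is the MPIID found for one patient record in that table.

    None (or an empty string) stands for both "record missing" and
    "MPIID is null" in this simplified model.
    """
    values = (registry, empi, normalized)
    # Every table must supply a non-null MPIID ...
    if any(v is None or v == "" for v in values):
        return False
    # ... and all three values must agree.
    return len(set(values)) == 1


print(passes_integrity_check("100042", "100042", "100042"))  # True
print(passes_integrity_check("100042", "100042", None))      # False
print(passes_integrity_check("100042", "100043", "100042"))  # False
```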
The Three Tables
In a Unified Care Record deployment using InterSystems EMPI as the MPI, patient records are stored in three critical tables:
1. HS.Registry.Patient: The Registry (Hub) table that stores the authoritative patient index for the entire UCR federation
2. HSPI.Data.Patient: The EMPI table that stores patient demographic data and linkage information
3. Local.Linkage.Definition.Normalized: The normalized linkage table (or custom table location) that stores normalized demographic values used for matching
All three tables must have consistent MPIID values for each patient. Discrepancies between these tables can result in incorrect patient matching, broken Composite View display, or data retrieval failures.
Running the MPIID Check Method
To execute the MPIID data integrity check:
1. Open a Terminal window
2. Switch to your InterSystems EMPI namespace (e.g., `HSPIDATA`)
3. Call the MPIID check method:
```
set status=##class(HSPI.Util.Checkup).MPIIDCheck(.outfile,.total)
```
The method outputs its progress to the terminal and generates a CSV file containing any discrepancies found. Both arguments are passed by reference (the leading dot): `outfile` receives the path to the output file, and `total` receives the count of discrepancies.
Understanding Output
The MPIID check method produces:
- Terminal Output: Progress messages indicating which tables are being checked and how many records have been processed
- CSV Output File: A comma-separated values file listing all discrepancies, with columns for record identifiers, table names, MPIID values, and discrepancy types
- Discrepancy Count: The total number of discrepancies found
The path returned in `outfile` typically points to the instance's default directory (often the manager directory).
Recommended Execution Schedule
InterSystems recommends the following schedule for running the data integrity check:
- First Month: Weekly execution during off-peak hours (e.g., overnight on weekends)
- After First Month: If no new discrepancies are found, reduce frequency to monthly execution
- After Major Events: Run the check after system upgrades, production incidents, or Registry failures
The check may take several hours to complete on large systems and consumes system resources, so schedule execution during periods of low system utilization.
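As a rough illustration of the schedule above, the following Python sketch computes the date of the next run. It rests on assumptions that are not in the recommendation itself: a "month" is treated as four weeks so that runs stay on the same weekday, and the weekly cadence is kept whenever the last run found new discrepancies. The function name is hypothetical.

```python
from datetime import date, timedelta


def next_run(first_run: date, last_run: date, new_discrepancies: bool) -> date:
    """Return the date of the next integrity check.

    Assumption: "first month" means the first four weeks after first_run.
    Assumption: stay on the weekly cadence while new discrepancies appear.
    """
    in_first_month = (last_run - first_run) < timedelta(days=28)
    if in_first_month or new_discrepancies:
        return last_run + timedelta(days=7)
    return last_run + timedelta(days=28)


first = date(2024, 1, 6)  # a Saturday, chosen only for the example
print(next_run(first, first, new_discrepancies=False))
print(next_run(first, date(2024, 2, 3), new_discrepancies=False))
```

A real deployment would configure this cadence in Task Manager rather than in code; the sketch only makes the decision rule explicit.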
Automated Task Scheduling
To implement ongoing data integrity monitoring, create a scheduled task using `HSPI.Util.Checkup.Task`:
1. Navigate to System Operation > Task Manager > New Task
2. Set Task Type to `HSPI.Util.Checkup.Task`
3. Configure the schedule (weekly or monthly)
4. Set the namespace to your EMPI namespace
5. Select a superuser to run the task
6. Optionally configure email alerts for discrepancies
The automated task issues an alert to the production Event Log if discrepancies are found, and can be configured to send email notifications.
---
2. Assessing Errors Identified by Data Integrity Check
Key Points
- Existence Discrepancies: Patient missing from one or two of the three tables (7 variations)
- MPIID Mismatch: MPIID values differ across tables or are null
- Output File Analysis: CSV file lists discrepancy type, affected tables, and MPIID values
- Database Pointer Validation: Ensures referential integrity between tables
Detailed Notes
Overview
The data integrity check utility identifies two primary types of discrepancies: existence discrepancies (where a patient record is missing from one or more tables) and MPIID mismatch discrepancies (where MPIID values differ across tables or are null). Assessing these errors requires understanding the expected data flow, the relationship between tables, and the severity of each discrepancy type.
The output file produced by the MPIID check method provides detailed information about each discrepancy, enabling systematic analysis and repair.
Discrepancy Types
Existence Discrepancies
An existence discrepancy occurs when a patient record is missing from one or two of the three tables. Counting the edge case in which a record is referenced but present in none of them, there are seven variations:
1. Record exists only in Registry table
2. Record exists only in EMPI table
3. Record exists only in normalized table
4. Record exists in Registry and EMPI, but not normalized table
5. Record exists in Registry and normalized, but not EMPI table
6. Record exists in EMPI and normalized, but not Registry table
7. Record does not exist in any table (edge case, detected through orphaned references)
Each variation indicates a different failure mode in the data synchronization process.
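The count of seven follows from simple combinatorics: a record can be present in any subset of the three tables except the full set, which gives 1 + 3 + 3 = 7 cases. The short Python sketch below enumerates them. The table labels are shorthand chosen for this example.

```python
from itertools import combinations

TABLES = ("Registry", "EMPI", "Normalized")


def existence_variations():
    """Return every set of tables a record can be present in, except all three."""
    variations = []
    # Sizes 0, 1 and 2; size 3 would mean the record exists everywhere.
    for size in range(len(TABLES)):
        for present in combinations(TABLES, size):
            variations.append(present)
    return variations


for present in existence_variations():
    missing = [t for t in TABLES if t not in present]
    print(f"present in {list(present) or 'no table'}; missing from {missing}")

print(len(existence_variations()))  # 7
```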
MPIID Mismatch Discrepancies
An MPIID mismatch discrepancy occurs when the MPIID values among the three tables are not all the same. This includes:
- Different MPIID values in different tables
- Null MPIID in one or more tables
- MPIID present in some tables but missing in others
MPIID mismatches prevent correct patient identity resolution and can cause failures in downstream applications.
Output File Format
The CSV output file contains the following information for each discrepancy:
- Discrepancy Type: Description of the discrepancy (existence or mismatch)
- Affected Tables: Which of the three tables contain the record
- Registry MPIID: The MPIID value in the Registry table (or null)
- EMPI MPIID: The MPIID value in the EMPI table (or null)
- Normalized MPIID: The MPIID value in the normalized table (or null)
- Record Identifiers: Facility, MRN, or other identifiers to locate the patient record
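A first pass over the output file is often just a tally by discrepancy type. The Python sketch below shows one way to do that with the standard `csv` module. The column headers and the sample rows are invented for illustration, so check the header line of a real output file and adjust `TYPE_COLUMN` accordingly.

```python
import csv
import io
from collections import Counter

# Hypothetical header name; the real file's header may differ.
TYPE_COLUMN = "DiscrepancyType"


def tally_by_type(handle) -> Counter:
    """Count rows per discrepancy type from an open CSV file or file-like object."""
    reader = csv.DictReader(handle)
    return Counter(row[TYPE_COLUMN] for row in reader)


# Invented sample data, used only to make the example runnable.
SAMPLE = """DiscrepancyType,Facility,MRN,RegistryMPIID,EMPIMPIID,NormalizedMPIID
existence,FAC1,A100,,100042,100042
mismatch,FAC1,A101,100050,100051,100051
existence,FAC2,B200,100060,,
"""

counts = tally_by_type(io.StringIO(SAMPLE))
print(counts["existence"], counts["mismatch"])  # 2 1
```

For a real file, pass `open(path, newline="")` instead of the in-memory sample, where `path` is the value returned in `outfile`.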
Severity Assessment
Not all discrepancies have equal impact:
High Severity:
- MPIID mismatch where different non-null MPIIDs exist across tables (indicates serious synchronization failure)
- Missing Registry entry for an active patient (prevents data retrieval in UCR applications)
Medium Severity:
- Null MPIID in one table (may resolve automatically when the next update occurs)
- Record exists in EMPI and normalized but not Registry (indicates failed IDUpdateNotification)
Low Severity:
- Orphaned normalized table entries with no corresponding EMPI or Registry record (stale data)
- Historical records with expected inconsistencies (patients who moved or merged)
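The severity tiers above can be turned into a triage helper. The Python sketch below is one possible reading of those rules, not an official classification. It assumes that a missing Registry entry is high severity only when the patient is active, and that the `active` and `historical` flags come from local knowledge of the record. A table absent from the dictionary means the record is missing there, and a `None` value means the MPIID is null.

```python
from typing import Dict, Optional

TABLES = ("Registry", "EMPI", "Normalized")


def assess_severity(mpiids: Dict[str, Optional[str]],
                    active: bool = True,
                    historical: bool = False) -> str:
    """Map one record's state to 'none', 'low', 'medium', 'high' or 'unranked'."""
    present = set(mpiids)
    distinct = {v for v in mpiids.values() if v is not None}
    nulls = sum(1 for v in mpiids.values() if v is None)

    # Present everywhere with one shared non-null MPIID: no discrepancy.
    if present == set(TABLES) and nulls == 0 and len(distinct) == 1:
        return "none"
    # Historical records and orphaned normalized entries are low severity.
    if historical or present == {"Normalized"}:
        return "low"
    # Conflicting non-null MPIIDs indicate a serious synchronization failure.
    if len(distinct) > 1:
        return "high"
    # Assumption: a missing Registry entry is high only for active patients.
    if "Registry" not in present:
        return "high" if active else "medium"
    if nulls == 1:
        return "medium"
    # Combinations the notes do not rank explicitly.
    return "unranked"


print(assess_severity({"Registry": "1", "EMPI": "2", "Normalized": "2"}))  # high
print(assess_severity({"Registry": "1", "EMPI": "1", "Normalized": None}))  # medium
print(assess_severity({"Normalized": "3"}))  # low
```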
Database Pointer Validation
The data integrity check also validates database pointers between tables:
- Foreign Key Integrity: Ensures that references between tables point to valid records
- Orphaned Pointers: Identifies pointers to records that no longer exist
- Circular References: Detects invalid circular reference patterns
Pointer validation errors are typically addressed by the repair method (covered in the next section).
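To make the terms "orphaned pointer" and "circular reference" concrete, the Python sketch below checks a toy pointer map. The data model is entirely hypothetical: each key is a record ID and each value is the ID it points to (or `None`). It says nothing about how the utility or the actual tables store their references.

```python
from typing import Dict, List, Optional, Set


def find_orphans(pointers: Dict[str, Optional[str]]) -> List[str]:
    """Return IDs whose pointer targets a record that does not exist."""
    return sorted(k for k, target in pointers.items()
                  if target is not None and target not in pointers)


def find_cycle_members(pointers: Dict[str, Optional[str]]) -> Set[str]:
    """Return IDs that sit on a circular chain of pointers."""
    in_cycle: Set[str] = set()
    for start in pointers:
        seen: List[str] = []
        node: Optional[str] = start
        # Follow the chain until it ends, leaves the map, or repeats a node.
        while node is not None and node in pointers and node not in seen:
            seen.append(node)
            node = pointers[node]
        if node in seen:
            in_cycle.update(seen[seen.index(node):])
    return in_cycle


# Invented example: D points at a missing record; E and F point at each other.
example = {"A": "B", "B": "C", "C": None, "D": "X", "E": "F", "F": "E"}
print(find_orphans(example))                 # ['D']
print(sorted(find_cycle_members(example)))   # ['E', 'F']
```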
Common Causes
Understanding common causes helps prioritize remediation:
- Failed IDUpdateNotification Messages: Most common cause; EMPI sends a notification to the Registry, but the Registry fails to receive or process it
- Network or Service Interruptions: Temporary outages during critical update operations
- Production Errors: Business operation failures during message processing
- Data Migration Issues: Inconsistencies introduced during system upgrades or data migrations
- Manual Database Modifications: Direct database changes bypassing normal message flow
---
3. Reporting Unresolved Data Integrity Issues
Key Points
- Repair Method: After the check, run the repair method to fix identified inconsistencies (per sample Q16)
- Rerun MPIID Check: After the repair, rerun the MPIID check to verify corrections (per sample Q16)
- Forward Output File: If issues persist, forward the output file to InterSystems WRC (per sample Q16)
- Contact WRC: Escalate ongoing discrepancies to InterSystems support for investigation
Detailed Notes
Overview
According to sample exam question Q16, after running the data integrity check utility and identifying problems in the output file, the recommended workflow is: (1) run the repair method, (2) rerun the MPIID check to verify that repairs resolved the issues, and (3) if problems persist, forward the output file to the InterSystems Worldwide Response Center (WRC) for further investigation.
This systematic approach ensures that routine discrepancies are automatically corrected while complex or persistent issues receive expert attention.
Running the Repair Method
The MPIID repair method attempts to automatically fix inconsistencies identified by the check method:
1. Open a Terminal window
2. Switch to your InterSystems EMPI namespace
3. Call the repair method:
```
set status=##class(HSPI.Util.Checkup).MPIIDRepair(.repairfile)
```
The repair method:
- Reads the output file from the most recent MPIID check
- Attempts to synchronize MPIID values across the three tables
- Corrects database pointer errors where possible
- Generates a repair log file documenting all actions taken
The repair method operates conservatively, only making changes where the correct MPIID value can be determined with confidence.
Rerunning the MPIID Check
After running the repair method, you must rerun the MPIID check to verify that discrepancies have been resolved (per sample Q16):
1. Execute the MPIID check method again:
```
set status=##class(HSPI.Util.Checkup).MPIIDCheck(.outfile,.total)
```
2. Review the new output file to determine if discrepancies remain
3. Compare the total count before and after repair to assess effectiveness
If the rerun check shows zero discrepancies, the repair was successful. If discrepancies remain, further investigation is required.
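The before/after comparison reduces to a small decision rule. The Python sketch below encodes it using the `total` values from the two check runs. The function name and the returned labels are illustrative only.

```python
def next_step(total_before: int, total_after: int) -> str:
    """Decide what to do after check, repair and recheck."""
    if total_after == 0:
        # Repair resolved everything; no escalation needed.
        return "done"
    if total_after < total_before:
        # Repair helped, but the remainder still goes to WRC.
        return "escalate-remaining"
    # Repair did not reduce the count at all.
    return "escalate-all"


print(next_step(total_before=40, total_after=0))   # done
print(next_step(total_before=40, total_after=5))   # escalate-remaining
print(next_step(total_before=40, total_after=40))  # escalate-all
```

In both escalation cases the action is the same (forward the output file to WRC); the distinction only records whether the repair had any effect, which is useful context for the case.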
Forwarding Output to InterSystems WRC
If unresolved discrepancies remain after running the repair method and rechecking (per sample Q16):
1. Locate the output file from the most recent MPIID check (path stored in `outfile` variable)
2. Open a case with the InterSystems Worldwide Response Center (WRC)
3. Attach the following to the case:
- MPIID check output file (CSV)
- Repair method output file
- Production Event Log excerpts showing any related errors
- System configuration details (production configuration, namespace settings)
4. Provide context:
- When the discrepancies were first detected
- Any recent system events (upgrades, migrations, outages)
- Whether discrepancies are increasing or stable
What to Include in WRC Report
When reporting unresolved data integrity issues to InterSystems WRC:
Required Information:
- Output Files: MPIID check output and repair method output
- System Version: InterSystems IRIS/HealthShare version and patch level
- Discrepancy Count: Total number of unresolved discrepancies
- Discrepancy Types: Breakdown of existence vs. mismatch discrepancies
- Timeline: When discrepancies first appeared and frequency of occurrence
Helpful Contextual Information:
- Recent Changes: System upgrades, configuration changes, production modifications
- Production Errors: Event Log entries related to IDUpdateNotification failures
- Message Flow: Volume and throughput of patient update messages
- Database Size: Size of the three patient tables
- Network Issues: Any known network or connectivity problems between EMPI and Registry
Ongoing Discrepancy Monitoring
If the data integrity check task is configured (as described in section 1), ongoing discrepancies will trigger alerts:
- Event Log Alerts: The automated task logs an alert when discrepancies are detected
- Email Notifications: Optional email alerts can be configured using EnsLib.EMail.AlertOperation
- Alert Content: "MPIID Check Utility has found x discrepancies. Output file is y."
If the automated task reports discrepancies consistently over multiple weeks, contact InterSystems support: that pattern suggests a systemic issue that requires root cause analysis.
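If alert text is collected from the Event Log or from email, the count and file path can be extracted and tracked over time. The Python sketch below assumes the alert matches the format quoted above exactly, including the trailing period. The three-week threshold for "multiple weeks" and the example path are arbitrary choices for illustration.

```python
import re
from typing import List, Optional, Tuple

ALERT_PATTERN = re.compile(
    r"MPIID Check Utility has found (\d+) discrepancies\. Output file is (.+)\.$"
)


def parse_alert(text: str) -> Optional[Tuple[int, str]]:
    """Return (discrepancy count, output file path), or None if the text does not match."""
    match = ALERT_PATTERN.search(text.strip())
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


def is_persistent(weekly_counts: List[int], weeks: int = 3) -> bool:
    """True if each of the last `weeks` runs reported at least one discrepancy."""
    recent = weekly_counts[-weeks:]
    return len(recent) == weeks and all(count > 0 for count in recent)


# Hypothetical alert text with an invented file path.
alert = "MPIID Check Utility has found 12 discrepancies. Output file is /tmp/mpiidcheck.csv."
print(parse_alert(alert))            # (12, '/tmp/mpiidcheck.csv')
print(is_persistent([0, 4, 6, 12]))  # True
print(is_persistent([4, 0, 12]))     # False
```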
Prevention of Future Discrepancies
After resolving existing discrepancies with WRC assistance, implement preventive measures:
- Monitor IDUpdateNotification Messages: Review production message flow to ensure notifications are successfully delivered to the Registry
- Event Log Monitoring: Set up alerts for IDUpdateNotification failures
- Network Stability: Ensure reliable network connectivity between EMPI and Registry
- Production Configuration: Verify Pool Size settings and retry logic in the production
- Regular Integrity Checks: Maintain a weekly or monthly automated integrity check schedule
When NOT to Contact WRC
Some scenarios do not require WRC escalation:
- First-time discrepancies resolved by repair: If the repair method successfully corrects all discrepancies, no escalation is needed
- Expected historical inconsistencies: Old records from data migrations or system conversions may have explainable inconsistencies
- Isolated incidents: A single occurrence with few discrepancies that are corrected by repair
Use judgment to determine whether discrepancies represent systemic issues requiring WRC assistance or routine maintenance items.
---