T43.1: Implements Mirroring - InterSystems IRIS System Administration Specialist Study Guide

1. Describe Mirror Architecture (Failover, Async, Reporting)

Key Points

Failover members: Primary and backup for automatic failover with no data loss
Async members: DR async (disaster recovery) and reporting async (data warehousing)
A mirror supports up to 16 members in total: 2 failover members plus up to 14 async members
ISCAgent and Arbiter components support automatic failover decisions
Logical data replication avoids risks of physical replication corruption

Detailed Notes

InterSystems IRIS mirroring provides high availability through logical data replication between physically independent systems. A mirror consists of failover members and optional async members.

Failover Members: A mirror requires exactly two failover members for automatic failover. At any time, one acts as primary (providing database access) and the other as backup (maintaining synchronized copies). When the primary becomes unavailable, the backup takes over automatically with no data loss. The failover members are coequal - neither is preferred as primary. Network latency between failover members significantly impacts performance, so they should be located to minimize latency.

Async Members - DR Type: Disaster recovery (DR) async members maintain asynchronous copies of the mirrored databases and can be manually promoted to failover member, for example when a failover member fails or both are lost. A DR async belongs to one mirror only, but a single mirror can include up to 14 async members. DR asyncs provide geographically dispersed disaster recovery capability.

Async Members - Reporting Type: Reporting async members maintain read-only or read-write copies for data mining and business intelligence. They cannot be promoted to failover members. A reporting async can belong to up to 10 mirrors simultaneously, functioning as an enterprise-wide data warehouse bringing together databases from separate locations.

Supporting Components: ISCAgent runs on each mirror member's host system, providing communication when normal channels are interrupted. The Arbiter provides an additional decision point for failover, preventing split-brain scenarios where both members think they should be primary.

Mirroring avoids single points of failure by maintaining independent resources on primary and backup members, and uses logical data replication to avoid risks associated with physical replication like out-of-order updates and carry-forward corruption.
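
The member limits described above can be checked mechanically when planning a topology. The following is a minimal Python sketch, not an InterSystems API: `MirrorPlan` and `validate` are hypothetical names, and the limits are taken directly from the notes in this section.

```python
from dataclasses import dataclass, field

MAX_MEMBERS = 16             # total members per mirror (from the notes above)
FAILOVER_MEMBERS = 2         # automatic failover requires exactly two
MAX_REPORTING_MIRRORS = 10   # mirrors that one reporting async may join


@dataclass
class MirrorPlan:
    """Hypothetical planning record; not an IRIS configuration object."""
    name: str
    failover: list = field(default_factory=list)
    dr_async: list = field(default_factory=list)
    reporting_async: list = field(default_factory=list)


def validate(plans):
    """Return a list of human-readable problems found in the planned mirrors."""
    problems = []
    dr_owner = {}          # DR async host -> first mirror that claimed it
    reporting_count = {}   # reporting async host -> number of mirrors joined
    for plan in plans:
        if len(plan.failover) != FAILOVER_MEMBERS:
            problems.append(
                f"{plan.name}: needs exactly {FAILOVER_MEMBERS} failover members")
        total = len(plan.failover) + len(plan.dr_async) + len(plan.reporting_async)
        if total > MAX_MEMBERS:
            problems.append(f"{plan.name}: {total} members exceeds {MAX_MEMBERS}")
        for host in plan.dr_async:
            if host in dr_owner:
                problems.append(
                    f"{host}: DR async already belongs to {dr_owner[host]}")
            dr_owner.setdefault(host, plan.name)
        for host in plan.reporting_async:
            reporting_count[host] = reporting_count.get(host, 0) + 1
    for host, count in reporting_count.items():
        if count > MAX_REPORTING_MIRRORS:
            problems.append(
                f"{host}: reporting async in {count} mirrors, "
                f"limit is {MAX_REPORTING_MIRRORS}")
    return problems
```

Calling `validate` with a list of plans returns an empty list when every mirror has two failover members, stays within 16 members, and no DR async is reused across mirrors.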

2. Configure Mirror Members

Key Points

Configure two failover members with matching mirror name
Primary and backup roles assigned during initial configuration
Configure network addresses for mirror communication channels
Virtual IP (VIP) typically used for external client connections
ECP application servers automatically redirect after failover (no VIP needed)

Detailed Notes

Configuring a mirror involves setting up the two failover members to communicate and replicate data synchronously. The configuration process establishes the mirror relationship and sets up communication channels.

Initial Setup: Create the mirror on the first failover member, specifying a mirror name. Then configure the second failover member to join the same mirror name. During initial configuration, one member is designated as primary and the other as backup, but these are temporary designations only.

Network Configuration: Failover members communicate through several channels using multiple network addresses:

Mirror connection addresses for member-to-member communication
Agent connection addresses for ISCAgent communication
Optional arbiter connection address
Virtual IP address for external client redirection (optional)

Virtual IP Address: External clients typically connect through a VIP that is always bound to the current primary. When failover occurs, the VIP moves to the new primary, automatically redirecting client connections. However, application servers in a distributed cache cluster (ECP) automatically redirect to the new primary without requiring a VIP.

Important Considerations:

The two failover members are coequal; neither is preferred as primary
Primary and backup are temporary designations that change during failover
Network latency between failover members directly impacts application performance
Members should be physically located to minimize network latency
Failover members can be configured in separate data centers for additional redundancy

The backup failover member's purpose is to be ready to take over as primary. It is NOT supported to use the backup member directly to run queries or application code - the LOCK command will fail with a PROTECT error if attempted.

3. Add Databases to a Mirror

Key Points

Databases must be journaled before adding to mirror
Add databases to mirror from primary member
Databases automatically synchronized to backup member
Mirrored databases maintain exact copies on both members
Only databases explicitly added are mirrored

Detailed Notes

Once a mirror is created, you can add databases to be mirrored between the failover members. Only databases explicitly added to the mirror configuration will be replicated.

Prerequisites: Before adding a database to a mirror, journaling must be enabled for that database. Journaling provides the complete record of all database modifications that mirroring uses for replication. Write image journaling (WIJ) is automatically enabled and protects against physical corruption during updates.

Adding Process: Databases are added to the mirror from the primary member using the Management Portal or programmatic methods. When a database is added:

1. The database is marked as mirrored on the primary
2. Journal records for that database begin replicating to the backup
3. The backup gets its own copy of the database; an existing database is typically copied from the primary (for example by backup and restore) and then activated and caught up
4. Ongoing changes are synchronously replicated

Synchronization: The mirroring process maintains exact copies of mirrored databases on both failover members. When data is modified on the primary:

1. The change is written to the primary's journal
2. Journal records are sent to the backup
3. Backup acknowledges receipt
4. Backup applies the changes to its database copy
5. Transaction commits on primary after backup acknowledgment

This synchronous replication ensures zero data loss during failover. The backup always has the most recent committed transactions.
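
The ordering of journal write, acknowledgment, and commit can be illustrated with a small in-memory model. This is a sketch only: the `Primary` and `Backup` classes are hypothetical, nothing here talks to IRIS, and raising an error when no acknowledgment arrives is a simplification rather than a description of what a real primary does.

```python
class Backup:
    """Hypothetical stand-in for the backup failover member."""

    def __init__(self, reachable=True):
        self.reachable = reachable
        self.journal = []     # journal records received from the primary
        self.applied = 0      # how many received records have been dejournaled
        self.database = {}    # the backup's copy of the mirrored data

    def receive(self, record):
        """Store a journal record and acknowledge receipt (True = acknowledged)."""
        if not self.reachable:
            return False
        self.journal.append(record)
        return True

    def dejournal(self):
        """Apply records received since the last call to the database copy."""
        for key, value in self.journal[self.applied:]:
            self.database[key] = value
        self.applied = len(self.journal)


class Primary:
    """Hypothetical stand-in for the primary failover member."""

    def __init__(self, backup):
        self.backup = backup
        self.journal = []
        self.database = {}

    def commit(self, key, value):
        record = (key, value)
        self.journal.append(record)            # write to the primary's journal
        if not self.backup.receive(record):    # send to backup, wait for the ack
            # Simplification: the sketch just fails the commit here.
            raise RuntimeError("no acknowledgment from backup")
        self.database[key] = value             # commit only after the ack
```

Note that the acknowledgment covers receipt of the journal record, not its application: in the model, `backup.database` stays empty until `dejournal()` runs, yet the record is already safe on the backup, which is what makes a takeover without data loss possible.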

Database Access: On the primary, mirrored databases are read-write. On the backup, they are read-only - any attempt to modify data will generate a PROTECT error. After failover, the new primary's databases become read-write and the new backup's become read-only.

Namespace Mapping: Applications access data through namespaces, which map to databases. The namespace configuration should be consistent across both mirror members to ensure applications work correctly after failover.
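
One way to check that consistency is to export the namespace-to-database mapping from each member and compare the two. The sketch below assumes the mappings have already been collected into plain dictionaries by some means not shown here; `mapping_differences` and the namespace and database names in the usage note are hypothetical, and a real namespace has separate globals and routines databases that this simplified model folds into one value.

```python
def mapping_differences(member_a, member_b):
    """Compare two {namespace: database} dicts and return the mismatches.

    Each mismatch is a tuple (namespace, database_on_a, database_on_b);
    None means the namespace is missing on that member.
    """
    differences = []
    for namespace in sorted(set(member_a) | set(member_b)):
        on_a = member_a.get(namespace)
        on_b = member_b.get(namespace)
        if on_a != on_b:
            differences.append((namespace, on_a, on_b))
    return differences
```

For example, comparing `{"APP": "APPDATA", "REPORTS": "RPTDATA"}` with `{"APP": "APPDATA"}` yields one entry showing that `REPORTS` is missing on the second member, which would break applications using that namespace after a failover.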

4. Monitor Mirror Status

Key Points

Mirror Monitor provides real-time status of all mirror members
Shows current primary/backup roles and member health
Displays dejournaling lag and database synchronization status
Monitor agent connectivity and communication channels
Track failover history and mirror events

Detailed Notes

Monitoring mirror status is essential for ensuring high availability and identifying potential issues before they cause outages. InterSystems IRIS provides comprehensive monitoring through the Mirror Monitor and related tools.

Mirror Monitor: Accessed through the Management Portal (System Operation > Mirror Monitor), the Mirror Monitor displays:

Current primary and backup member identification
Member status (Primary, Backup, Connected, Disconnected, etc.)
Database synchronization status
Dejournaling lag (how far behind backup is in applying journal records)
Agent connectivity status
Recent failover events

Health Indicators:

Green/Normal: Member is functioning correctly, databases synchronized
Yellow/Warning: Minor issues like dejournaling lag or temporary disconnection
Red/Error: Serious issues like member down, agent disconnected, or major sync lag

Dejournaling Status: The backup continuously applies (dejournals) journal records from the primary. Monitor displays:

Current dejournaling position
Lag time or journal file count behind primary
Rate of dejournaling progress
Any dejournaling errors or issues
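
For integration with an external monitoring system, the health indicators and dejournaling lag can be reduced to a single status per member. The sketch below is an assumption-laden illustration: `MemberSample`, `classify`, and both thresholds are hypothetical, and how the sample values are obtained from the instance is not shown.

```python
from dataclasses import dataclass

# Hypothetical thresholds; pick values that match your own recovery targets.
WARN_LAG_SECONDS = 5
ERROR_LAG_SECONDS = 60


@dataclass
class MemberSample:
    """One observation of a mirror member, gathered by means not shown here."""
    name: str
    member_up: bool
    agent_reachable: bool
    dejournal_lag_seconds: float


def classify(sample):
    """Map a sample to the Green / Yellow / Red indicators described above."""
    if not sample.member_up or not sample.agent_reachable:
        return "Red"      # member down or agent disconnected
    if sample.dejournal_lag_seconds >= ERROR_LAG_SECONDS:
        return "Red"      # major synchronization lag
    if sample.dejournal_lag_seconds >= WARN_LAG_SECONDS:
        return "Yellow"   # minor dejournaling lag
    return "Green"
```

Keeping the thresholds as named constants makes it easy to tune them against the failover recovery time you are willing to accept.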

Connection Monitoring: Track the status of communication channels:

Mirror connection between failover members
ISCAgent connections
Arbiter connection (if configured)
Application server (ECP) connections to mirror

Event History: The monitor maintains a history of mirror events including:

Failover occurrences (automatic and manual)
Member state changes
Connection/disconnection events
Configuration changes
Error conditions

Proactive Monitoring: Regular monitoring allows you to:

Detect network issues affecting mirror communication
Identify dejournaling lag that could impact failover recovery time
Verify successful database synchronization
Confirm both members are healthy and ready for failover
Track failover frequency and causes

You can also monitor mirror status programmatically using the methods of the %SYSTEM.Mirror class (invoked as $SYSTEM.Mirror) and the SYS.Mirror class in the %SYS namespace, which allows integration with enterprise monitoring systems.

5. Perform Manual and Automatic Failover

Key Points

Automatic failover: triggered by primary failure, no data loss
Manual/planned failover: for maintenance, testing, or upgrades
Failover typically completes in seconds
Application servers (ECP) automatically reconnect to new primary
VIP address moves to new primary for client redirection

Detailed Notes

Failover is the process where the backup member becomes the new primary when the current primary becomes unavailable. Failover can be automatic (triggered by system detection) or manual (administrator-initiated).

Automatic Failover Process:

1. Primary becomes unavailable (crash, network failure, shutdown)
2. Backup detects loss of communication with primary
3. Backup contacts ISCAgent on primary's host to verify primary is truly down
4. If arbiter configured, backup contacts arbiter for confirmation
5. Backup promotes itself to primary role
6. Virtual IP (if used) moves to new primary
7. Databases on new primary become read-write
8. Application servers (ECP) automatically reconnect to new primary
9. External clients reconnect via VIP or connection redirect
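
The role, database-mode, and VIP changes in this process can be modeled as a small state transition. This is a toy simulation under stated assumptions: `Member` and `Mirror` are hypothetical classes, the verification steps (agent and arbiter checks) are deliberately left out, and the role and mode strings are labels chosen for the sketch.

```python
class Member:
    """Hypothetical failover member; starts out as a read-only backup."""

    def __init__(self, name):
        self.name = name
        self.role = "backup"
        self.db_mode = "read-only"


class Mirror:
    """Toy model of two failover members and an optional virtual IP."""

    def __init__(self, primary, backup, vip=None):
        self.members = {primary.name: primary, backup.name: backup}
        self.vip = vip
        self.vip_owner = None
        self._make_primary(primary)

    def _make_primary(self, member):
        member.role = "primary"
        member.db_mode = "read-write"   # databases become read-write
        if self.vip is not None:
            self.vip_owner = member.name  # the VIP follows the primary role

    def fail_over(self, failed_name):
        """The primary has failed; promote the surviving member."""
        failed = self.members[failed_name]
        if failed.role != "primary":
            raise ValueError(f"{failed_name} is not the primary")
        survivor = next(m for m in self.members.values() if m is not failed)
        failed.role = "down"
        failed.db_mode = "unavailable"
        self._make_primary(survivor)
        return survivor

    def rejoin(self, name):
        """A recovered member comes back as backup with read-only databases."""
        member = self.members[name]
        member.role = "backup"
        member.db_mode = "read-only"
```

After `fail_over("A")` on a mirror built from members `A` and `B`, member `B` holds the primary role, its databases are read-write, and the VIP (if one was given) is owned by `B`; `rejoin("A")` then brings `A` back as the backup.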

Automatic Failover Triggers:

Primary InterSystems IRIS instance crashes or stops
Primary host system failure
Network interruption between primary and backup
Primary becomes unresponsive to health checks

Manual Failover (Planned): Manual failover is initiated by an administrator for:

Planned maintenance on primary member
Operating system or hardware upgrades
InterSystems IRIS version upgrades
Testing failover procedures
Rebalancing load after recovering from unplanned failover

To perform manual failover:

1. Access Mirror Monitor on intended new primary
2. Select "Become Primary" action
3. Confirm the failover operation
4. System performs controlled role swap
5. Former primary becomes new backup
6. Databases and connections transition smoothly

Failover Duration: Typical automatic failover completes in seconds to tens of seconds, depending on:

Network latency between members
Dejournaling lag on backup
Number and size of mirrored databases
Application server reconnection time

Failover and Applications:

Applications using ECP: automatic reconnection to new primary, minimal disruption
Applications using VIP: automatic redirection when VIP moves
Applications with hardcoded connections: may require manual reconnection
Transaction in progress during failover: rolled back, application must retry
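
Because an in-flight transaction is rolled back, application code usually wraps each unit of work in a retry loop. The following sketch shows the general pattern only: `ConnectionLost` and `run_with_retry` are hypothetical names, and the actual exception raised on a dropped connection depends on the client driver in use.

```python
import time


class ConnectionLost(Exception):
    """Hypothetical error raised when the connection to the primary drops."""


def run_with_retry(transaction, attempts=5, delay=0.5, sleep=time.sleep):
    """Run transaction(), retrying with a growing pause after connection loss.

    transaction must be safe to run again from the start, because the
    interrupted attempt was rolled back rather than partially applied.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return transaction()
        except ConnectionLost as error:
            last_error = error
            if attempt < attempts:
                sleep(delay * attempt)   # wait longer before each retry
    raise last_error
```

The `sleep` parameter exists so the pause can be replaced in tests. In production the defaults give the new primary a few seconds to take over before the final attempt.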

Zero Data Loss: Because mirroring uses synchronous replication, the backup acknowledges receipt of journal records before the primary commits transactions. This guarantees that all committed transactions are on the backup at failover time, ensuring zero data loss.

Post-Failover: After failover:

Monitor the new primary for normal operation
Investigate cause of failover if it was unplanned
When failed member recovers, it automatically rejoins as backup
Consider manual failover back to original primary if desired

Important: If a problem is detected on the primary and the backup is available, failover occurs immediately even if the primary problem might resolve on its own. This ensures maximum availability but means transient issues can trigger failover.

6. Understand Arbiter Role and Configuration

Key Points

Arbiter provides tie-breaking vote to prevent split-brain scenarios
Lightweight component, minimal system requirements
Should be on separate system from both failover members
Contacted by backup during failover decision process
Prevents both members from simultaneously acting as primary

Detailed Notes

The arbiter is an optional but recommended component that provides an additional decision point during failover. It prevents "split-brain" scenarios where both failover members might simultaneously believe they should be primary due to network issues.

Arbiter Purpose: When the backup loses contact with the primary, it must determine whether:

The primary has actually failed (should become primary)
Network issues prevent communication but primary is still running (should NOT become primary)

The arbiter provides a third perspective. If the backup can contact the arbiter but not the primary, this indicates the primary likely failed rather than just a network partition.

Split-Brain Prevention: Without an arbiter, network issues between failover members could cause both to act as primary simultaneously, leading to:

Divergent database copies
Data conflicts
Complex recovery procedures
Potential data loss

The arbiter prevents this by serving as a witness - the member that can contact the arbiter is allowed to be primary.

Arbiter Requirements:

Lightweight process with minimal CPU/memory requirements
Should run on a different host from both failover members
Ideally in a different physical location/network segment from members
Requires reliable network connectivity to both failover members
Available as Docker container or standard installation

Arbiter Installation: The arbiter can be:

Installed as part of ISCAgent on any system
Run as a Docker container (intersystems/arbiter image)
Deployed on virtual or physical hardware
Hosted on a third system in the mirror infrastructure

Arbiter Configuration:

1. Install arbiter on chosen host system
2. Configure arbiter IP address and port
3. Add arbiter configuration to both failover members
4. Failover members automatically contact arbiter during failover decisions
5. No ongoing management needed - arbiter operates autonomously

Arbiter Communication: During failover:

1. Backup loses contact with primary
2. Backup attempts to contact ISCAgent on primary host
3. Backup contacts arbiter
4. If backup can reach arbiter but not primary/agent, backup becomes primary
5. If backup cannot reach arbiter, it remains backup (primary likely still active)
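
The decision the backup makes in these steps can be written as a small pure function. This is a simplified reading of the steps listed here, not the actual mirroring algorithm: `backup_decision` is a hypothetical name, and treating an agent report that the primary is still running as a reason to remain backup is an assumption of the sketch.

```python
def backup_decision(primary_reachable, agent_reports_primary_down, arbiter_reachable):
    """Decide what the backup does after checking its three contacts.

    agent_reports_primary_down is True or False when the ISCAgent on the
    primary's host answered, and None when the agent could not be reached.
    """
    if primary_reachable:
        return "remain backup"        # nothing has failed
    if agent_reports_primary_down is True:
        return "become primary"       # agent confirmed the primary is down
    if agent_reports_primary_down is False:
        return "remain backup"        # assumption: primary is still running
    # Neither the primary nor its agent answered: the arbiter breaks the tie.
    if arbiter_reachable:
        return "become primary"       # primary's host has likely failed
    return "remain backup"            # backup itself may be the isolated one
```

The last branch is the split-brain guard: a backup that can reach neither the primary, its agent, nor the arbiter cannot tell a failed primary from its own network isolation, so it does not take over.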

Arbiter Recommendations:

Always use an arbiter in production mirror configurations
Place arbiter on independent network segment if possible
Monitor arbiter connectivity as part of mirror monitoring
Consider arbiter when planning network architecture
Test failover scenarios with arbiter disconnected to understand behavior

Without Arbiter: If no arbiter is configured, failover decisions rely on ISCAgent alone. This is acceptable for testing but not recommended for production due to increased risk of split-brain scenarios.

Exam Preparation Summary

Critical Concepts to Master:

Mirror Member Types: Failover (2 max, auto-failover), DR Async (manual promotion), Reporting Async (read-only or read-write, multi-mirror, cannot be promoted)
Failover Members: Exactly 2 required, coequal (no preferred primary), zero data loss
Mirror Limits: Up to 16 total members (2 failover + 14 async)
ISCAgent: Runs on each member's host, enables communication when normal channels fail
Arbiter: Third-party decision point for failover, prevents split-brain scenarios
Virtual IP (VIP): External client connection point, moves with primary role
ECP + Mirroring: App servers auto-redirect after failover (no VIP needed)

Common Exam Scenarios:

Identifying when to use failover vs DR async vs reporting async members
Understanding automatic failover requirements (ISCAgent, Arbiter)
Configuring mirror member network addresses and VIP
Troubleshooting mirror synchronization and failover issues
Planning mirror topology for disaster recovery
Understanding split-brain prevention with Arbiter
Promoting DR async to failover member

Hands-On Practice Recommendations:

Configure a two-node failover mirror
Add DR async and reporting async members
Install and configure ISCAgent and Arbiter
Test automatic failover scenarios
Monitor mirror status and synchronization
Practice manual failover and DR async promotion
Configure VIP for client connections
