1. Backup Strategies (Online, External, Concurrent)
Key Points
- Online backup: full or incremental while system is running
- External backup: operating system level backup tools
- Concurrent backup: backup while database changes occur
- Write Image Journal (WIJ) protects against physical corruption
- Journaling enables point-in-time recovery
Detailed Notes
InterSystems IRIS provides multiple backup strategies to suit different requirements for availability, recovery time objectives (RTO), and recovery point objectives (RPO).
Online Backup: Online backup is performed while InterSystems IRIS is running and users are active. This is the most common backup type for production systems requiring 24/7 availability.
Full Online Backup:
- Backs up all databases completely
- Creates complete copy suitable for restore
- Provides baseline for incremental backups
- Longer duration and larger storage requirement
- Should be performed regularly (e.g., weekly)
Incremental Online Backup:
- Backs up only database blocks changed since last backup
- Much faster than full backup
- Smaller storage requirement
- Requires last full backup plus all incremental backups for restore
- Can be performed more frequently (e.g., daily or hourly)
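The restore requirement above (last full backup plus every later incremental) can be expressed as a small selection rule. The following is a minimal sketch in Python, outside IRIS; the `BackupRecord` structure and its field names are hypothetical and only stand in for whatever backup history you keep.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class BackupRecord:
    """One entry in a backup history (hypothetical structure for illustration)."""
    taken_at: datetime
    kind: str  # "full" or "incremental"
    path: str


def restore_chain(history: List[BackupRecord]) -> List[BackupRecord]:
    """Return the backups needed for a restore, oldest first.

    The chain is the most recent full backup followed by every
    incremental backup taken after it.
    """
    ordered = sorted(history, key=lambda b: b.taken_at)
    last_full = None
    for index, backup in enumerate(ordered):
        if backup.kind == "full":
            last_full = index
    if last_full is None:
        raise ValueError("no full backup in history; incrementals alone cannot be restored")
    # Everything after the last full backup is, by construction, incremental.
    return ordered[last_full:]
```

If any incremental in the returned chain is missing or unreadable, the restore can only proceed up to the backup before it, which is one reason to take full backups regularly.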
External Backup: External backup uses operating system or SAN-level tools to back up database files rather than InterSystems IRIS backup utilities.
Requirements for external backup:
- Database writes must be frozen, or the instance shut down, while files are copied
- Use `##class(Backup.General).ExternalFreeze()` to suspend the write daemon
- Database files remain consistent during backup
- External tools (tar, rsync, SAN snapshot, etc.) copy files
- Resume normal operation after backup completes
Advantages:
- Leverage existing OS backup infrastructure
- May be faster for very large databases
- Can use SAN snapshots for instant backups
- Integration with enterprise backup systems
Disadvantages:
- More complex to configure correctly
- Requires careful coordination with IRIS
- Must ensure all related files are backed up
- May not back up journal files in sync with databases
Concurrent Backup: Concurrent backup allows database updates to continue during backup with minimal performance impact. InterSystems IRIS online backups work this way by default.
How it works:
- Backup tracks which database blocks have been modified
- The first pass copies all in-use blocks to the backup
- Blocks modified while a pass is running are copied again in a later pass
- Writes are suspended only briefly during the final pass, which guarantees a consistent backup even with ongoing changes
- The WIJ ensures physical integrity
Write Image Journal (WIJ) Protection: The WIJ is critical for backup integrity. It prevents corruption if system crashes during backup:
- Updates written to WIJ before database
- If crash occurs during backup, WIJ recovery restores consistency
- Ensures backup is structurally sound
- Protects against incomplete block writes
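The write-ahead idea behind the WIJ can be illustrated with a toy model. This is a simplified sketch in Python, not a description of IRIS internals: the `BlockStore` class, its dictionaries, and the `crash_after` parameter are all hypothetical and exist only to show why recording block images first makes an interrupted write cycle recoverable.

```python
class BlockStore:
    """Toy model of a database file protected by a write image journal."""

    def __init__(self, blocks=None):
        self.blocks = dict(blocks or {})  # block images in the database file
        self.wij = {}                     # block images pending in the WIJ

    def write_cycle(self, updates, crash_after=None):
        """Write a batch of block images; return False if the cycle was interrupted."""
        # Phase 1: record every new block image in the WIJ first.
        self.wij = dict(updates)
        # Phase 2: copy the images into the database file.
        for count, (block_id, image) in enumerate(updates.items()):
            if crash_after is not None and count >= crash_after:
                return False  # simulated crash: the WIJ still holds all images
            self.blocks[block_id] = image
        # Cycle completed, so the WIJ contents are no longer needed.
        self.wij = {}
        return True

    def recover(self):
        """Reapply any images left in the WIJ by an interrupted cycle."""
        self.blocks.update(self.wij)
        self.wij = {}
```

After a simulated crash the database holds a mix of old and new blocks; `recover()` reapplies the pending images so the batch is complete, which is the property the bullets above describe.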
Journaling for Logical Integrity: While WIJ protects physical structure, journaling protects logical database integrity:
- Records all database modifications
- Enables recovery beyond last backup
- Supports point-in-time recovery
- Required for roll-forward recovery after restore
Backup Strategy Considerations:
- Frequency: Balance between backup overhead and acceptable data loss
- Retention: How many backups to keep and for how long
- Storage: Local disk, network storage, tape, or cloud
- Verification: Regular restore tests to verify backup integrity
- Documentation: Clear procedures for backup and restore operations
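The retention consideration can be made concrete as a pruning rule. A minimal sketch in Python, assuming a "generation" means one full backup plus the incrementals that depend on it; the function name and the `(taken_at, kind)` tuple layout are hypothetical.

```python
from datetime import datetime


def backups_to_prune(history, keep_generations):
    """Return the (taken_at, kind) entries that are older than the retained generations.

    history: iterable of (datetime, "full" | "incremental") tuples.
    keep_generations: how many of the most recent full backups to retain.
    """
    if keep_generations < 1:
        raise ValueError("keep at least one generation")
    ordered = sorted(history)
    full_times = [taken_at for taken_at, kind in ordered if kind == "full"]
    if len(full_times) <= keep_generations:
        return []
    # Oldest full backup that must be retained; anything before it can go.
    cutoff = full_times[-keep_generations]
    return [entry for entry in ordered if entry[0] < cutoff]
```

Pruning by generation rather than by age alone avoids deleting a full backup while incrementals that depend on it are still being kept.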
Best Practices:
1. Perform regular full backups (weekly or monthly)
2. Supplement with frequent incremental backups (daily or hourly)
3. Keep multiple backup generations
4. Store backups on separate physical storage from databases
5. Test restore procedures regularly
6. Monitor backup completion and verify success
7. Maintain journal files synchronized with backups
8. Document backup schedule and retention policy
2. Configure and Run Backups
Key Points
- Configure backup settings via Management Portal or programmatically
- Specify backup type, location, and options
- Schedule automated backups using task manager
- Monitor backup progress and completion
- Verify backup success and maintain backup log
Detailed Notes
InterSystems IRIS provides tools for configuring, scheduling, and executing backups through the Management Portal, command-line utilities, and programmatic interfaces.
Backup Configuration via Management Portal: Navigate to: System Operation > Backup
Configure settings:
- Backup Type: Full or incremental
- Backup Device: Local directory, network path, or tape device
- Databases to Backup: Select specific databases or all databases
- Options: Include switch journal, verify backup, compress backup
- Scheduling: One-time or recurring schedule
Full Backup Configuration: Full backup settings:
- Select all databases or specific subset
- Specify backup directory with sufficient space
- Choose whether to switch journal file after backup
- Enable verification to check backup integrity
- Consider compression to reduce storage requirements
Incremental Backup Configuration: Incremental backup requires:
- Previous full backup as baseline
- Same backup directory as full backup
- Consistent database list between full and incremental
- Regular full backups to reset incremental chain
Backup Location: Choose backup destination carefully:
- Local Disk: Fast but shares risk with database storage
- Network Storage: Separates backup from database server
- SAN Storage: High performance, enterprise features
- Tape: Long-term archival, slower restore
- Cloud Storage: Off-site protection, network dependent
Running Manual Backup: To execute immediate backup:
1. Open Management Portal
2. Navigate to System Operation > Backup
3. Select backup type and options
4. Click "Backup Now"
5. Monitor progress in Backup Status
6. Verify completion in backup log
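Programmatic Backup: Execute a backup from ObjectScript by calling the `BACKUP^DBACK` entry point in the %SYS namespace. The argument values below (backup type "F" for full, description, output file, log file, and mode) are illustrative; check the utility's documentation for the full argument list:
```objectscript
Set ok = $$BACKUP^DBACK("","F","Full backup","/backup/location/full.cbk","Y","/backup/location/full.log","QUIET","N","Y")
```
Or interactively from the Terminal, using the backup menu utility:
```
Do ^BACKUP
```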
Scheduling Automated Backups: Use Task Manager for regular automated backups:
1. Navigate to System Operation > Task Manager
2. Create new task
3. Select task type: "Backup"
4. Configure backup parameters
5. Set schedule (daily, weekly, specific times)
6. Save and enable task
Backup Schedule Example:
- Full Backup: Sunday 2:00 AM weekly
- Incremental Backup: Monday-Saturday 2:00 AM daily
- Journal Archive: Every 4 hours
- Backup Verification: After each backup
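The example schedule can be written as a pair of decision functions, which is a handy way to sanity-check a schedule before entering it in Task Manager. A minimal sketch in Python; the function names are hypothetical, and anchoring the four-hour journal archive at midnight is an assumption the schedule above does not state.

```python
from datetime import datetime
from typing import Optional

SUNDAY = 6  # datetime.weekday(): Monday is 0, Sunday is 6


def scheduled_backup(moment: datetime) -> Optional[str]:
    """Return "full", "incremental", or None for the 2:00 AM schedule above."""
    if (moment.hour, moment.minute) != (2, 0):
        return None
    return "full" if moment.weekday() == SUNDAY else "incremental"


def journal_archive_due(moment: datetime) -> bool:
    """True on the hour every 4 hours, counted from midnight (assumed anchor)."""
    return moment.minute == 0 and moment.hour % 4 == 0
```

With this schedule a restore needs at most one full backup and six incrementals, and the journals to apply cover less than 24 hours.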
Backup Options:
Switch Journal:
- Creates new journal file after backup
- Recommended for each backup
- Synchronizes journal state with backup
- Simplifies recovery procedures
Verify Backup:
- Tests backup file integrity after creation
- Attempts to read all backed-up data
- Identifies corruption early
- Adds time to backup process but highly recommended
Compress Backup:
- Reduces backup file size
- Saves storage space
- Increases backup time slightly
- Increases restore time slightly
- Usually worth the tradeoff
Include Non-Database Files:
- Backup of .cpf configuration file
- Backup of journal files
- Backup of routine files
- Ensures complete recovery capability
Monitoring Backup Progress: During backup execution:
- Management Portal shows real-time progress
- Console displays status messages
- Backup log records each step
- Estimated completion time displayed
- Current database being backed up shown
Backup Status Indicators:
- Running: Backup in progress
- Completed: Successful completion
- Failed: Error occurred, check messages log
- Incomplete: Interrupted or partial backup
Backup Verification: After backup completes:
1. Check backup log for completion message
2. Verify backup files exist at specified location
3. Check file sizes are reasonable
4. Review any warning or error messages
5. Test restore process periodically
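The file-level checks in that list (the backup file exists and its size is plausible) are easy to automate from the operating system side. A minimal sketch in Python; `check_backup_file` and the `min_bytes` threshold are hypothetical, and choosing a sensible threshold (for example, a fraction of the previous backup's size) is left to the operator.

```python
from pathlib import Path
from typing import List


def check_backup_file(path: str, min_bytes: int) -> List[str]:
    """Return a list of problems found with a backup file; empty means it passed."""
    problems = []
    backup = Path(path)
    if not backup.is_file():
        problems.append(f"missing backup file: {path}")
    else:
        size = backup.stat().st_size
        if size < min_bytes:
            problems.append(
                f"backup file smaller than expected: {size} bytes < {min_bytes} bytes"
            )
    return problems
```

A check like this only confirms that a file of plausible size was produced; it does not replace reviewing the backup log or running periodic test restores.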
Backup Log: The backup log (backup.log) contains:
- Backup start and end times
- Databases included in backup
- Backup file names and sizes
- Any warnings or errors encountered
- Success or failure status
Best Practices for Running Backups:
1. Schedule backups during low-usage periods
2. Monitor first few scheduled backups to verify success
3. Maintain backup log history
4. Alert on backup failures
5. Verify sufficient storage space before backup
6. Document backup procedures and schedules
7. Test restore process regularly
8. Keep journal files aligned with backups
Troubleshooting Backup Issues: Common problems:
- Insufficient disk space: Monitor storage capacity
- Permission errors: Verify write access to backup location
- Network interruptions: Use local storage or reliable network
- Database locked: Ensure no exclusive operations running
- Interrupted backup: Check system resources and stability
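Two of these problems, insufficient disk space and permission errors, can be caught before the backup starts. A minimal pre-flight sketch in Python using only the standard library; `preflight` and `required_bytes` are hypothetical names, and the space estimate itself has to come from the operator (for example, the size of the last full backup plus headroom).

```python
import os
import shutil
from typing import List


def preflight(backup_dir: str, required_bytes: int) -> List[str]:
    """Return a list of reasons the backup destination is not ready; empty means OK."""
    if not os.path.isdir(backup_dir):
        return [f"backup directory does not exist: {backup_dir}"]
    problems = []
    # Checks the current OS user; the IRIS service account is what matters in practice.
    if not os.access(backup_dir, os.W_OK):
        problems.append(f"no write permission on {backup_dir}")
    free = shutil.disk_usage(backup_dir).free
    if free < required_bytes:
        problems.append(f"insufficient space: {free} bytes free, {required_bytes} required")
    return problems
```

Run the check as the same operating-system account that IRIS uses to write backups, since write permission is evaluated for the calling user.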
3. Perform Database Recovery
Key Points
- Restore database from most recent backup
- WIJ recovery ensures structural integrity after restore
- Roll forward from journal files to recover recent transactions
- Verify database integrity after recovery
- Document recovery procedures and test regularly
Detailed Notes
Database recovery restores databases from backup and applies journal records to recover transactions made after the backup. The recovery process ensures both structural and logical database integrity.
Recovery Scenarios:
- Hardware failure: Disk crash, server failure
- Data corruption: Database structure corrupted
- Logical errors: Application bug corrupted data
- Accidental deletion: User or admin error
- Disaster recovery: Site-wide failure
Recovery Prerequisites: Before beginning recovery:
- Identify most recent valid backup
- Locate all journal files since backup
- Ensure sufficient disk space for restore
- Document current database state if possible
- Notify users of system downtime
Basic Recovery Process:
1. Stop InterSystems IRIS (if running)
2. Restore database files from backup
3. Start InterSystems IRIS
4. System automatically runs WIJ recovery
5. Apply journal files to roll forward
6. Verify database integrity
7. Resume normal operations
Detailed Recovery Steps:
Step 1: Assess Damage:
- Determine which databases are affected
- Identify the cause of failure
- Determine recovery point objective (how much data loss is acceptable)
- Locate required backup and journal files
Step 2: Stop System: Shut down InterSystems IRIS cleanly if possible:
```
iris stop IRIS
```
If the system won't stop cleanly, you may need to force a shutdown or use OS kill commands.
Step 3: Restore from Backup: Via Management Portal (after restart):
1. Navigate to System Operation > Backup
2. Select "Restore"
3. Choose backup to restore from
4. Select databases to restore
5. Confirm and execute restore
Via Command Line (the restore utility, run in the %SYS namespace):
```
Do ^DBREST
```
Step 4: WIJ Recovery: When InterSystems IRIS starts after restore, it automatically:
- Detects that databases were restored
- Runs WIJ recovery to ensure structural integrity
- Completes any interrupted database updates
- Ensures database blocks are consistent
- Logs recovery actions to console
WIJ recovery guarantees the restored database has:
- Consistent block structure
- No partial writes
- Valid internal pointers
- Sound physical integrity
Step 5: Journal Roll Forward (covered in next section): Apply journal files to recover transactions after backup point.
Step 6: Verify Integrity: After recovery:
- Run database integrity checker
- Verify critical application data
- Test application functionality
- Check database statistics
- Review error logs for issues
Integrity Check Commands:
```objectscript
Do ^INTEGRIT  // Check database structure
```
Step 7: Resume Operations: Once verified:
- Enable user access
- Resume normal operations
- Monitor system closely initially
- Document recovery actions taken
Recovery in Mirrored Environment: If recovering a mirror member:
- May need to temporarily remove from mirror
- Restore database from backup
- Catch up to current primary via journaling
- Rejoin mirror when synchronized
- Or restore from mirror partner instead of backup
Parallel Recovery: InterSystems IRIS supports parallel recovery:
- Multiple databases restored simultaneously
- Speeds recovery of large systems
- Configure via backup/restore settings
- Requires sufficient I/O capacity
Recovery Time Considerations: Factors affecting recovery duration:
- Database size: Larger databases take longer to restore
- Backup location: Local faster than network or tape
- Disk I/O speed: SSD faster than spinning disk
- Journal volume: More journals = longer roll forward
- System resources: More CPU/RAM speeds recovery
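These factors combine into a rough back-of-the-envelope estimate: restore time is approximately backup size divided by restore throughput, plus journal volume divided by journal replay rate. A minimal sketch in Python; the function and parameter names are hypothetical, the model ignores startup, verification, and operator time, and the rates must be measured on your own hardware during a recovery drill.

```python
def estimate_recovery_seconds(backup_mb, restore_mb_per_s, journal_mb, replay_mb_per_s):
    """Rough recovery duration: restore phase plus journal roll-forward phase."""
    if restore_mb_per_s <= 0 or replay_mb_per_s <= 0:
        raise ValueError("throughput rates must be positive")
    restore_seconds = backup_mb / restore_mb_per_s
    replay_seconds = journal_mb / replay_mb_per_s
    return restore_seconds + replay_seconds


# Example: 500,000 MB backup at 200 MB/s plus 20,000 MB of journals at 50 MB/s.
print(estimate_recovery_seconds(500_000, 200, 20_000, 50))  # prints 2900.0
```

Comparing the estimate against the RTO shows which lever matters: here the restore phase dominates, so faster backup storage helps more than reducing journal volume.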
Minimizing Recovery Time:
- Perform regular incremental backups (less journal to apply)
- Keep backups on fast local storage
- Use high-performance storage for databases
- Practice recovery procedures to optimize process
- Consider parallel recovery for large systems
- Keep journal files organized and accessible
Testing Recovery Procedures: Regularly test recovery process:
- Schedule periodic recovery drills
- Restore to test system, not production
- Verify all steps in recovery procedure
- Time the recovery process
- Update documentation based on testing
- Train staff on recovery procedures
Common Recovery Issues:
- Missing journal files: Incomplete recovery, data loss
- Backup too old: Large volume of journals to apply
- Insufficient disk space: Cannot complete restore
- Permission errors: Cannot write restored files
- WIJ recovery fails: May indicate deeper corruption
4. Use Journal Restore for Point-in-Time Recovery
Key Points
- Journals record all database modifications for recovery
- Roll forward applies journal entries after backup restore
- Point-in-time recovery to specific date/time
- Selective recovery of specific databases or globals
- Essential for minimizing data loss after failures
Detailed Notes
Journal restore (roll forward) applies journal records to databases restored from backup, recovering transactions that occurred after the backup was taken. This minimizes data loss and enables point-in-time recovery.
Understanding Journaling: Journaling is a feature enabled per database that records all modifications:
- Every SET, KILL, and transaction
- Complete record of database changes
- Sequential log of all modifications
- Time-stamped entries for point-in-time recovery
- Critical for data integrity and recovery
Journal File Structure:
- Journal files are named by date and a sequence number that increments with each new file (e.g., 20240115.001, 20240115.002)
- New journal file created when current fills or manually switched
- Journal directory should be on separate disk from databases
- Regular archival of journal files recommended
- Keep journals synchronized with backups
Roll Forward Recovery Process:
Step 1: Restore from Backup: As described in previous section, restore databases from most recent backup.
Step 2: Identify Required Journals: Determine which journal files are needed:
- Start: Journal file active at backup time
- End: Most recent journal file (for full recovery) or specific point in time
- Ensure all journals in sequence are available
- Missing journals create gap in recovery
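Selecting the required journals and spotting gaps can be scripted against a directory listing. A minimal sketch in Python, assuming the default IRIS journal naming of `yyyymmdd.nnn` with no custom prefix; the function names are hypothetical. The gap check only finds holes between the lowest and highest sequence number present for a given day, so it cannot tell whether the last file of a day is missing.

```python
import re

_JOURNAL_NAME = re.compile(r"^(\d{8})\.(\d{3,})$")


def journal_key(name):
    """Split 'yyyymmdd.nnn' into a sortable (date, sequence) pair."""
    match = _JOURNAL_NAME.match(name)
    if match is None:
        raise ValueError(f"not a journal file name: {name}")
    return match.group(1), int(match.group(2))


def journals_needed(available, first, last):
    """Return the available journal files from first to last inclusive, in replay order."""
    start, end = journal_key(first), journal_key(last)
    keyed = sorted((journal_key(name), name) for name in available)
    return [name for key, name in keyed if start <= key <= end]


def same_day_gaps(names):
    """Return (date, missing_sequence) pairs for holes within one day's numbering."""
    by_date = {}
    for name in names:
        date, sequence = journal_key(name)
        by_date.setdefault(date, []).append(sequence)
    gaps = []
    for date in sorted(by_date):
        present = sorted(by_date[date])
        for expected in range(present[0], present[-1] + 1):
            if expected not in present:
                gaps.append((date, expected))
    return gaps
```

Running `same_day_gaps` on the output of `journals_needed` before starting a restore gives an early warning that the roll forward would stop short.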
Step 3: Execute Journal Restore:
Via Management Portal:
1. Navigate to System Operation > Journal
2. Select "Restore Journals"
3. Specify journal directory
4. Select start and end journal files or timestamps
5. Choose databases to recover
6. Start restore process
Via Terminal Command:
```objectscript
Do ^JRNRESTO  // Journal restore utility
```
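Via Programmatic Method (a sketch, assuming the `Journal.Restore` class in the %SYS namespace; set the journal file range and database selection on the object before calling `Run()`, as described in the class reference):
```objectscript
Set restore = ##class(Journal.Restore).%New()
Set status = restore.Run()
```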
Journal Restore Parameters:
- Journal directory: Location of journal files
- Start time/file: Beginning of restore period
- End time/file: End of restore period (or latest)
- Databases: Which databases to recover (or all)
- Filter: Optional filtering by globals or routines
- Options: Verbose logging, test mode, error handling
Point-in-Time Recovery: Recover to specific date and time:
- Specify exact timestamp for recovery endpoint
- Useful for recovering from logical errors (bad data entered)
- Excludes transactions after specified time
- Can recover to just before known problem occurred
Example scenario:
- Backup taken at midnight
- Bad data entered at 2:30 PM
- Restore from midnight backup
- Roll forward journals until 2:29 PM
- Database recovered to state just before error
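The scenario above amounts to replaying journal records in time order and stopping at a cutoff. A minimal model in Python, not the IRIS restore utility: the record layout `(timestamp, operation, key, value)` and the function name are hypothetical, and `KILL` here removes a single key, whereas a real KILL also removes descendant nodes.

```python
from datetime import datetime


def roll_forward(database, journal, stop_before):
    """Apply journal records with timestamp < stop_before to a copy of database."""
    restored = dict(database)
    for timestamp, operation, key, value in sorted(journal, key=lambda r: r[0]):
        if timestamp >= stop_before:
            break  # everything from the cutoff onward is excluded
        if operation == "SET":
            restored[key] = value
        elif operation == "KILL":
            restored.pop(key, None)
        else:
            raise ValueError(f"unknown journal operation: {operation}")
    return restored


midnight_backup = {"^Order(1)": "open"}
journal = [
    (datetime(2024, 1, 15, 9, 0), "SET", "^Order(2)", "open"),
    (datetime(2024, 1, 15, 14, 30), "SET", "^Order(1)", "corrupted"),
    (datetime(2024, 1, 15, 15, 0), "KILL", "^Order(2)", None),
]
recovered = roll_forward(midnight_backup, journal, datetime(2024, 1, 15, 14, 30))
print(recovered)  # prints {'^Order(1)': 'open', '^Order(2)': 'open'}
```

Note the cost of this approach: valid work entered after the cutoff (the 3:00 PM change in the example) is also excluded and must be re-entered or recovered separately.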
Selective Recovery: Recover specific databases or globals:
- Don't have to recover entire system
- Specify individual databases in journal restore
- Filter by global names to recover specific data
- Useful when only subset of data is affected
Recovery Monitoring: During journal restore:
- Progress displayed showing journal file being processed
- Timestamp of current journal entry
- Number of records processed
- Estimated completion time
- Any errors or warnings
Journaling Best Practices:
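Enable Journaling: All production databases should have journaling enabled. Journal state is a property of the database itself, so a sketch using `SYS.Database` in the %SYS namespace, where `directory` is the database directory, looks like this:
```objectscript
Set db = ##class(SYS.Database).%OpenId(directory)
Set db.GlobalJournalState = 3  // 3 = journaling enabled, 2 = disabled
Set status = db.%Save()
```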
Journal File Management:
- Place journal files on separate physical disk from databases
- Configure adequate journal directory size
- Set up journal file purging to prevent disk full
- Archive old journal files to backup storage
- Synchronize journal archival with backup schedule
Journal Switching: Switch to new journal file:
- After each backup
- At defined intervals (e.g., every 4 hours)
- Before major operations
- Manually when needed
Switch journal via:
```objectscript
Do ^JRNSWTCH
```
Journal Restore Verification: After journal restore completes:
1. Check restore log for completion message
2. Verify expected number of journal entries processed
3. Test critical application functionality
4. Compare database statistics to expected values
5. Run integrity check
6. Verify specific data that should be recovered
Incremental Restore: Journal restore supports incremental recovery:
- Restore from full backup
- Apply incremental backup
- Apply journals since incremental backup
- Faster than applying all journals from full backup
Parallel Journal Restore: For large systems:
- Multiple databases can be recovered in parallel
- Speeds recovery process
- Configure based on system resources
- Monitor system load during parallel restore
Recovery Time Objective (RTO): Factors affecting journal restore duration:
- Volume of journal files to process
- Database size
- Disk I/O performance
- System CPU and memory
- Filtering complexity
Minimizing Recovery Time:
- More frequent incremental backups reduce journal volume
- Keep journal files on fast storage
- Use journal file compression
- Maintain organized journal archive
- Practice recovery to optimize procedure
Transaction Consistency: Journal restore maintains transaction integrity:
- Complete transactions are restored fully
- Incomplete transactions are rolled back
- ACID properties preserved
- Database remains consistent throughout recovery
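The effect described above (complete transactions restored, incomplete ones rolled back) can be shown with a small replay model. This is a conceptual sketch in Python, not the IRIS algorithm: the record shapes `("TSTART", tx)`, `("SET", tx, key, value)`, and `("TCOMMIT", tx)` are hypothetical, and `tx` is `None` for updates made outside any transaction.

```python
def replay_committed(database, journal):
    """Apply SET records, skipping those that belong to uncommitted transactions."""
    committed = {record[1] for record in journal if record[0] == "TCOMMIT"}
    restored = dict(database)
    for operation, tx, *rest in journal:
        if operation != "SET":
            continue  # TSTART and TCOMMIT carry no data of their own
        if tx is None or tx in committed:
            key, value = rest
            restored[key] = value
    return restored


journal = [
    ("TSTART", 1),
    ("SET", 1, "^Account(1)", 90),
    ("SET", None, "^Audit(1)", "login"),
    ("TSTART", 2),
    ("SET", 2, "^Account(3)", 500),
    ("SET", 1, "^Account(2)", 110),
    ("TCOMMIT", 1),
    # Transaction 2 never committed: the journal ends before its TCOMMIT.
]
print(replay_committed({}, journal))
# prints {'^Account(1)': 90, '^Audit(1)': 'login', '^Account(2)': 110}
```

Transaction 1's two updates appear together or not at all, which is the atomicity guarantee; transaction 2 leaves no trace because its commit record never reached the journal.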
Advanced Journal Features:
Journal Filters:
- Recover only specific globals: `^Patient*, ^Order*`
- Exclude specific globals from recovery
- Time-based filtering
- User-based filtering (who made changes)
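The include and exclude filters above behave like wildcard matching on global names. A minimal model in Python using standard-library `fnmatch` patterns; the record layout (a dict with a `"global"` field) and the function name are hypothetical, and the actual filter syntax accepted by the IRIS journal restore utility should be taken from its documentation rather than from this sketch.

```python
from fnmatch import fnmatchcase


def select_records(journal, include, exclude=()):
    """Keep records whose global name matches an include pattern and no exclude pattern."""
    selected = []
    for record in journal:
        name = record["global"]
        if not any(fnmatchcase(name, pattern) for pattern in include):
            continue
        if any(fnmatchcase(name, pattern) for pattern in exclude):
            continue
        selected.append(record)
    return selected


journal = [
    {"global": "^Patient", "value": "A"},
    {"global": "^PatientIndex", "value": "B"},
    {"global": "^Order", "value": "C"},
    {"global": "^Audit", "value": "D"},
]
kept = select_records(journal, include=["^Patient*", "^Order*"], exclude=["^PatientIndex"])
print([record["global"] for record in kept])  # prints ['^Patient', '^Order']
```

Filtering speeds up a targeted recovery, but it can break cross-global consistency if related globals (such as an index and its data) are not restored together.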
Test Mode:
- Run journal restore in test mode
- Validates journal files without modifying databases
- Reports what would be recovered
- Useful for planning recovery strategy
Error Handling:
- Configure behavior on journal errors
- Stop on error vs. continue with warnings
- Log all errors for review
- Preserve partial recovery if error occurs
Common Issues:
Missing Journal Files:
- Gap in journal sequence prevents complete recovery
- May need to accept data loss for gap period
- Emphasizes importance of journal backup/archival
Journal Corruption:
- Rare but possible
- May require recovery from earlier point
- WIJ recovery protects against most corruption
- Regular verification of journal files recommended
Insufficient Disk Space:
- Journal restore needs space for processing
- Ensure adequate free space before starting
- Monitor space during long-running restores
5. Use FREEZE and THAW APIs for External Snapshots
Key Points
- ExternalFreeze() pauses physical database writes while allowing user activity to continue
- ExternalThaw() resumes normal database write operations
- Enables consistent snapshots with SAN or VM tools
- Backup.General class provides ExternalFreeze() and ExternalThaw() methods
- Critical for enterprise backup integration
Detailed Notes
The FREEZE and THAW APIs enable integration with external backup tools (SAN snapshots, VM snapshots, enterprise backup software) by pausing physical database writes while allowing user processes to continue operating. This creates a consistent point-in-time for snapshot creation.
Understanding External Backup with FREEZE/THAW: External backup uses operating system or storage-level tools rather than InterSystems IRIS backup utilities. For external backups to be consistent, database writes must be frozen during the snapshot operation. The FREEZE/THAW APIs provide this capability.
Backup.General Class Methods: The `Backup.General` class provides the primary methods for external backup integration:
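ExternalFreeze() (all arguments are optional; confirm the order against the class reference for your version):
```objectscript
Set status = ##class(Backup.General).ExternalFreeze(logfile, text, switchJournal, timeout, hibernate, verbose, .freezeTimeout)
```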
- Pauses physical writes to databases
- User processes continue updating data in memory (cached writes)
- Default timeout: 600 seconds (10 minutes)
- Automatically switches journal file if specified
- Returns status indicating success or failure
- System queues database updates during freeze
ExternalThaw():
```objectscript
Set status = ##class(Backup.General).ExternalThaw()
```
- Resumes database write operations
- Queued updates written to disk
- Normal write daemon operation restored
- Must be called after snapshot completes
- Critical to call promptly to prevent memory exhaustion
ExternalSetHistory():
```objectscript
Set status = ##class(Backup.General).ExternalSetHistory(logfile, description)
```
- Records backup completion in history
- Can trigger journal file purging
- Documents when external backup completed
- Maintains backup audit trail
Command-Line Wrappers: For convenience, command-line wrapper methods exist:
- `CommandLineFreeze()` - Handles cluster membership detection before freezing
- `CommandLineThaw()` - Handles cluster membership detection before thawing
These wrappers are useful when scripting external backups.
Freeze/Thaw Workflow:
1. Call `ExternalFreeze()` with appropriate timeout
2. Verify freeze was successful (check return status)
3. Trigger external snapshot (SAN, VM, or OS-level)
4. Wait for snapshot to complete
5. Call `ExternalThaw()` immediately after snapshot
6. Call `ExternalSetHistory()` to record backup
7. Verify snapshot integrity if possible
Example Script (run in the %SYS namespace; the argument order shown should be confirmed against the `Backup.General` class reference for your version):
```objectscript
; External backup script
Set logfile = "/backup/freeze.log"
Set text = "External snapshot backup"
Set journal = 1     ; Switch journal file at freeze
Set timeout = 600   ; Freeze timeout in seconds (10 minutes)

; Freeze database writes
; Arguments: LogFile, Text, SwitchJournalFile, TimeOut, Hibernate, Verbose, .ExternalFreezeTimeOut
Set status = ##class(Backup.General).ExternalFreeze(logfile, text, journal, , , , .timeout)
If $$$ISERR(status) {
    Write "Freeze failed: ", $System.Status.GetErrorText(status), !
    Quit
}

; At this point, the external snapshot tool should run
Write "Databases frozen - run snapshot now", !
; ... snapshot occurs ...

; Thaw database writes
Set status = ##class(Backup.General).ExternalThaw()
If $$$ISERR(status) {
    Write "Thaw failed: ", $System.Status.GetErrorText(status), !
}

; Record backup history
Set status = ##class(Backup.General).ExternalSetHistory(logfile, text)
```
Terminal Command Alternative: For backup without programming:
```
Do ^BACKUP
```
This utility provides an interactive menu for online backup and restore operations. Freeze and thaw for external snapshots are invoked through the `Backup.General` methods described above.
SAN Snapshot Integration: When using SAN-level snapshots:
- Coordinate with storage administrator
- Ensure all volumes containing databases are snapshotted simultaneously
- Include journal file volumes in snapshot
- Verify snapshot consistency after creation
- Test restore from snapshot periodically
VM Snapshot Integration: For virtual machine snapshots:
- Freeze databases before VM snapshot
- Thaw after snapshot completes
- Consider quiescing guest OS as well
- Be aware of snapshot chains and management
- Test VM snapshot restores regularly
Timeout Considerations: The freeze timeout must be long enough for snapshot to complete:
- Default 600 seconds (10 minutes)
- Increase for large or slow storage systems
- Monitor for timeout warnings
- If freeze times out, writes resume automatically
- Failed freeze may result in inconsistent snapshot
Memory Implications During Freeze: While frozen:
- Database updates accumulate in memory
- Global buffers fill with pending writes
- Extended freeze can exhaust buffer space
- System may slow or stall if buffers full
- Keep freeze duration as short as possible
Concurrent External Backup: For systems that can't tolerate any freeze:
- Concurrent external backup mode available
- Uses WIJ to track changes during backup
- More complex but allows continuous operation
- See `GCDI_backup_methods_ext_concurrent` documentation
Best Practices:
1. Test freeze/thaw procedures before production use
2. Monitor freeze duration and buffer utilization
3. Set appropriate timeout for your environment
4. Always call Thaw() after freeze, even if snapshot fails
5. Document freeze/thaw procedures for operations staff
6. Include freeze/thaw in disaster recovery documentation
7. Validate snapshots through periodic restore testing
8. Consider concurrent backup for systems requiring 24/7 operation
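Practice 4 (always thaw, even if the snapshot fails) maps naturally onto a try/finally structure in whatever language drives the backup. A minimal orchestration sketch in Python; the four callables are hypothetical placeholders that a real script would implement by invoking the IRIS freeze, thaw, and history methods and the storage vendor's snapshot tool.

```python
def snapshot_with_freeze(freeze, take_snapshot, thaw, record_history):
    """Run freeze, snapshot, thaw, and history in order, thawing on any snapshot failure.

    freeze and thaw return True on success; take_snapshot raises on failure.
    """
    if not freeze():
        raise RuntimeError("freeze failed; snapshot not attempted")
    try:
        take_snapshot()
    finally:
        # Runs whether or not the snapshot raised, so writes are never left frozen.
        if not thaw():
            raise RuntimeError("thaw failed; instance may still be frozen")
    # Reached only when both the snapshot and the thaw succeeded.
    record_history()
```

Recording history only after a successful snapshot and thaw keeps the backup history from listing snapshots that cannot be used, which matters if history entries drive journal purging.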
Troubleshooting:
- Freeze timeout: Increase timeout or investigate slow snapshot
- Freeze fails: Check system resources, disk space, permissions
- Thaw fails: Rare but critical - may need restart
- Inconsistent snapshot: Verify all volumes captured simultaneously
- Memory pressure during freeze: Reduce freeze duration or add buffers
Exam Preparation Summary
Critical Concepts to Master:
- Online Backup: Full or incremental while system running; most common for 24/7 systems
- External Backup: OS/SAN-level tools; requires backup mode or shutdown
- Concurrent Backup: Database changes continue during backup
- WIJ (Write Image Journal): Protects against physical corruption during writes
- Journaling: Enables point-in-time recovery; critical for disaster recovery
- Full vs Incremental: Full provides baseline; incremental backs up changed blocks only
- Backup Mode: Freezes write daemon for consistent external backup
Common Exam Scenarios:
- Choosing appropriate backup strategy for given requirements
- Understanding full vs incremental backup tradeoffs
- Configuring external backup with ExternalFreeze() and ExternalThaw()
- Performing point-in-time recovery using journals
- Understanding WIJ role in database integrity
- Planning backup schedules for RTO/RPO requirements
- Troubleshooting backup and recovery failures
Hands-On Practice Recommendations:
- Perform full and incremental online backups
- Configure external backup using OS tools
- Practice database restore from backup
- Perform point-in-time recovery using journals
- Monitor backup progress and verify completion
- Test recovery procedures in non-production environment
- Configure Task Manager for automated backup schedules