1. Understands the role of bitmap extent indexes
Key Points
- Bitmap Extent Index: Specialized index automatically created for each table to track row existence
- COUNT(*) Optimization: Dramatically improves performance of counting total rows in a table
- Automatic Creation: DDL-created tables automatically define bitmap extent index (%%DDLBEIndex)
- One Per Table: Maximum of one bitmap extent index allowed per table
- System-Assigned RowID: Requires tables using system-assigned RowID with positive integer values
Detailed Notes
What is a Bitmap Extent Index?
Bitmap extent indexes are specialized indexes that InterSystems IRIS uses to optimize table-level operations, particularly COUNT(*) queries. When you create a table using CREATE TABLE, InterSystems IRIS automatically defines a bitmap extent index with the SQL MapName %%DDLBEIndex. This index maintains a compressed bitmap representation of which rows exist in the table, allowing the database to rapidly determine row counts without scanning the entire table.
How it Works
The bitmap extent index tracks the table extent itself - the complete set of rows that exist in the table. The index uses bit positions to represent row IDs, with each bit indicating whether a row exists (1) or not (0). Unlike regular indexes that track specific column values, the bitmap extent index tracks existence only.
Performance Benefits
The bitmap extent index is particularly valuable for large tables where counting rows would otherwise require a full table scan. COUNT(*) queries can be resolved entirely from the index without accessing the table data.
Constraints and Limitations
- One per table: Maximum of one bitmap extent index allowed per table
- SQLCODE -400: Attempting to create multiple bitmap extent indexes results in this error
- Not created automatically when:
- Table is a global temporary table
- Table defines an explicit IDKEY index
- Table contains an IDENTITY column with MINVAL other than 1
- System configuration option DDLDefineBitmapExtent is disabled
RowID Requirements
The bitmap extent index works only with:
- Tables using system-assigned RowID values with positive integers
- Tables using a primary key IDKEY based on %Integer with MINVAL > 0
- Tables using a primary key IDKEY based on %Numeric with SCALE = 0 and MINVAL > 0
This restriction ensures the bitmap can efficiently map bit positions to row identifiers.
Documentation References
2. Determines appropriate index use cases
Key Points
- Performance vs. Maintenance: Indexes improve query speed but slow INSERT, UPDATE, DELETE operations
- Selectivity Matters: Index columns with high cardinality (many distinct values) for best results
- Query Patterns: Analyze WHERE clauses, JOIN conditions, and ORDER BY requirements
- Index Overhead: Each index requires storage space and maintenance during data modifications
- Covering Indexes: WITH DATA clause allows queries resolved entirely from index
Detailed Notes
Balancing Performance and Maintenance
Determining appropriate index use cases requires balancing query performance benefits against maintenance costs. Indexes dramatically improve query performance by providing fast lookup paths to data, but every index adds overhead to INSERT, UPDATE, and DELETE operations because InterSystems IRIS must maintain each index when data changes.
Selectivity Principles
The most effective indexes are created on columns with high selectivity - meaning columns that have many distinct values relative to the total number of rows.
- High selectivity examples: Employee ID, Social Security Number, Order Number
- Low selectivity examples: Gender (M/F), Status codes with few values
- Recommendation: Use standard indexes for high selectivity; consider bitmap indexes for low selectivity
Identifying Index Candidates
Analyze your application's query patterns to identify index candidates:
- WHERE clauses: Columns frequently used in predicates are prime candidates
- JOIN operations: Both the joining column and referenced column benefit from indexes
- ORDER BY/GROUP BY: Indexes can eliminate expensive sort operations
- Avoid over-indexing: Creating indexes on every column wastes resources and degrades write performance
Composite and Covering Indexes
Consider composite indexes when queries frequently filter on multiple columns together:
- Column order matters: Place the most selective column first, followed by less selective columns
- WITH DATA clause: Creates covering indexes that include additional column data
- Covering queries: Can be satisfied entirely from the index without accessing the master data map
- Tradeoff: Increases index size and maintenance overhead
Documentation References
3. Differentiates between index types (standard, bitmap, bitslice, columnar)
Key Points
- Standard Index (Type=index): General-purpose B-tree index for high-cardinality columns
- Bitmap Index (Type=bitmap): Optimized for low-cardinality columns with few distinct values
- Bitslice Index (Type=bitslice): Specialized for fast aggregate calculations on numeric data
- Columnar Index (Type=columnar): Enables fast filtering and aggregation on columnar-stored data
- Unique Constraint: Only standard indexes support UNIQUE keyword
Detailed Notes
Standard Index (Type=index)
Standard indexes use a traditional B-tree structure suitable for columns with many distinct values.
- Best for: Primary keys, foreign keys, high-cardinality columns
- Supports: Both unique and non-unique constraints
- Operations: Efficient lookup, range scanning, and sorting
- Default choice: Most indexing scenarios
Bitmap Index (Type=bitmap)
Bitmap indexes excel when indexing columns with a small number of distinct values (low cardinality).
- Best for: Gender, status codes, categorical data
- How it works: Compressed bit strings where bit position = row ID, bit value = presence/absence
- Efficient for: AND/OR operations in WHERE clauses
- Limitations:
- Should only be used for < 10,000 distinct values
- Requires tables with system-assigned RowID using positive integers
- Cannot specify UNIQUE constraints
Bitslice Index (Type=bitslice)
Bitslice indexes are highly specialized for fast aggregate calculations on numeric data.
- Best for: SUM operations, range conditions on numeric data
- How it works: Represents each numeric value as a binary bit string
- Limitations:
- Significantly slower INSERT, UPDATE, DELETE performance
- Not used by query optimizer for WHERE clause filtering
- Cannot specify UNIQUE constraints
Columnar Index (Type=columnar)
Columnar indexes enable very fast queries on columnar-stored data.
- Best for: Analytics workloads (OLAP), filtering and aggregation
- Ideal scenario: Frequently querying specific columns across many rows
- Not ideal for: Transactional workloads (OLTP) accessing complete rows
Documentation References
4. Distinguishes between primary keys, unique constraints, and ID keys
Key Points
- Primary Key: User-defined unique identifier for table rows, optional but recommended
- ID Key (IDKEY): System-level unique index used for object identity and references
- Unique Constraint: Enforces uniqueness without designating the primary key
- RowID Column: System-generated integer serving as default unique identifier
- Configuration Impact: DDLPKeyNotIDKey setting determines if primary key becomes IDKEY
Detailed Notes
RowID Column
The RowID column is an automatically generated column (default name "ID") containing unique, non-modifiable positive integers assigned by the system to each record.
- Always exists: Provides a guaranteed unique identifier even without a primary key
- Immutable: Values cannot be modified after creation
- System-assigned: Automatically assigned by InterSystems IRIS
Primary Key
A primary key is a user-defined constraint that designates one or more columns as the table's primary record identifier.
- Requirements: Must contain unique values, cannot accept NULL
- Optional but recommended: Provides meaningful business-level identification
- Examples: Employee ID, Order Number, SSN
DDLPKeyNotIDKey Configuration
The relationship between primary keys and ID keys depends on the system configuration setting DDLPKeyNotIDKey:
- DDLPKeyNotIDKey=1 (default): Primary key is NOT the IDKEY
- Creates separate index for uniqueness
- Primary key values CAN be modified
- DDLPKeyNotIDKey=0: Primary key becomes the IDKEY
- More efficient data access
- Primary key values CANNOT be modified after creation
- Note: If IDENTITY column exists, primary key can never be IDKEY regardless of setting
Unique Constraints
Unique constraints enforce uniqueness without designating the primary key.
- Multiple allowed: Table can have multiple unique constraints but only one primary key
- Creates index: To enforce uniqueness
- Sharded tables: Can significantly impact performance unless shard key is subset of unique key
IDKEY
The IDKEY is the system-level unique index used internally for object identity.
- Required: Every persistent class must have exactly one IDKEY
- Used for: Object identity, parent-child relationships, object references
- Fallback logic: If no primary key and RowID hidden, uses IDENTITY column or creates internal IDKEY
Documentation References
5. Defines foreign key relationship properties
Key Points
- Foreign Key Definition: Column referencing unique values in another table
- Referenced Column Requirements: Must be unique (primary key, RowID, IDENTITY, or unique constraint)
- Constraint Naming: Required CONSTRAINT identifier for ALTER TABLE operations
- Multiple Columns: Support for composite foreign keys with matching column order
- Default Reference: Omitting referenced column defaults to primary key, then IDENTITY, then RowID
Detailed Notes
What Foreign Keys Do
Foreign keys establish referential relationships between tables by designating a column (or combination of columns) in one table that references unique values in another table.
- Purpose: Create logical links between tables
- Data integrity: Ensures referenced values actually exist
- Prevents: Orphaned records
Definition Requirements
When defining a foreign key:
- CONSTRAINT name required: Needed for future ALTER TABLE operations
- Column names: Can differ between foreign key and referenced columns
- Data types: Must be compatible
- Referenced column must be unique: Primary key, unique constraint, RowID, or IDENTITY
Composite Foreign Keys
Foreign keys support multiple columns by specifying comma-separated lists:
- Must match in number and order
- Example: `FOREIGN KEY (CustomerNum, SalesPersonNum) REFERENCES Customers (CustID, SalespID)`
- CustomerNum → CustID
- SalesPersonNum → SalespID
Default Reference Hierarchy
If you omit the referenced column, the foreign key defaults to: 1. Primary key column (if defined) 2. IDENTITY column (if no primary key) 3. RowID column (if neither exists)
Note: RowID can only be referenced if defined as public (SqlRowIdPrivate=0 or %PUBLICROWID keyword)
Reference Property Conversion
InterSystems IRIS can convert single-column foreign keys to reference properties if:
- Foreign key references parent table's IDKEY
- Table contains no data
- Property is not already a reference property
- Data types match exactly
- Foreign key field is not an IDENTITY field
Sharded Table Considerations
- Supported: Foreign keys work for any combination of sharded/unsharded tables
- Restriction: Only NO ACTION referential action supported
- Privilege required: REFERENCES on referenced table or columns
Documentation References
6. Understands referential action behavior for updates and deletes
Key Points
- ON DELETE/ON UPDATE: Clauses defining actions when referenced data changes
- NO ACTION (default): Prevents delete/update if foreign key references exist
- CASCADE: Propagates delete/update to all referencing rows
- SET NULL: Sets foreign key columns to NULL when referenced row changes
- SET DEFAULT: Sets foreign key columns to their default values
- Sharded Restrictions: Only NO ACTION supported for sharded tables
Detailed Notes
Purpose of Referential Actions
Referential actions define how InterSystems IRIS maintains data consistency when records in referenced tables are modified or deleted. The ON DELETE and ON UPDATE clauses specify what action to take when changing or removing a row referenced by foreign key constraints.
NO ACTION (Default)
NO ACTION is the default for both ON DELETE and ON UPDATE.
- Behavior: Prevents deletion/modification if foreign key references exist
- Result: Operation fails with error, preserving referential integrity
- Exception: Does not apply to self-referential foreign keys
CASCADE
CASCADE propagates changes through the relationship.
- ON DELETE CASCADE: Automatically deletes all referencing rows
- ON UPDATE CASCADE: Automatically updates foreign key values to match new primary key
- Warning: Deleting one row can cascade to many related rows across tables
SET NULL
SET NULL breaks the relationship by nullifying foreign key columns.
- Requirement: Foreign key columns must allow NULL (no NOT NULL constraint)
- Behavior: Maintains child rows but removes parent relationship
- Use case: When the relationship is optional
SET DEFAULT
SET DEFAULT reassigns orphaned records to a default parent.
- Behavior: Sets foreign key columns to their default values
- No default defined: Sets to NULL
- Requirement: A row matching the default value must exist in referenced table
- Use case: Reassigning to "Unknown" or "Unassigned" parent
Important Warnings
- No contradictory foreign keys: Don't define two FKs on same columns with different actions
- Example: One with ON DELETE CASCADE, another with ON DELETE SET NULL
- IRIS doesn't prevent this at definition, but errors occur at runtime
- Sharded tables: Only NO ACTION supported
- CASCADE, SET NULL, SET DEFAULT result in SQLCODE -400 error
Documentation References
7. Evaluates tradeoffs when adding indexes
Key Points
- Read Performance Gain: Indexes can reduce query time from seconds to milliseconds
- Write Performance Cost: Each index adds overhead to INSERT, UPDATE, DELETE operations
- Storage Overhead: Indexes consume additional disk space
- Maintenance Burden: Index corruption risks require rebuilding; multiple indexes multiply complexity
- Optimal Balance: Index only columns frequently used in WHERE, JOIN, ORDER BY clauses
Detailed Notes
Query Performance Benefits
Adding indexes dramatically accelerates query operations by providing direct access paths to data.
- Full table scan: Seconds or minutes
- Index lookup: Milliseconds
- Benefit: Orders of magnitude improvement for selective queries
Write Performance Costs
Each data modification operation must update the base table AND every index.
- INSERT: Must add entries to all applicable indexes
- UPDATE: Must update index entries for modified indexed columns
- DELETE: Must remove entries from all indexes
- Impact: Multiple indexes significantly degrade write performance in transaction-heavy applications
Storage Overhead
Each index requires disk space to store the index structure and data.
- Standard indexes: Store indexed column values and row pointers
- WITH DATA clause: Store additional column data (larger but enables covering queries)
- Bitmap indexes: Use compressed bit strings (efficient but can be substantial for large tables)
Maintenance Complexity
Indexes introduce operational requirements:
- Index corruption/inefficiency: May require rebuilding using BUILD INDEX or %BuildIndices()
- Collation changes: Require recompiling classes and rebuilding all indexes
- Initial creation: Building index data can be time-consuming for large tables
- Multiple indexes: Multiply all maintenance requirements
Optimal Indexing Strategy
Balance tradeoffs by indexing selectively:
- DO index: Columns frequently used in WHERE, JOIN, ORDER BY
- DON'T index: Every column "just in case" - wastes resources
- Low selectivity columns: Use bitmap indexes instead of standard indexes
- Monitor and iterate: Add indexes based on actual usage patterns, not speculation
- Use query plans: Identify queries that would benefit from new indexes
Documentation References
8. Identifies composite index use cases
Key Points
- Definition: Single index spanning multiple columns in specified order
- Column Order Matters: Most selective column should be first
- Left-to-Right Usage: Query must use leftmost columns for index utilization
- GROUP BY Optimization: Composite indexes improve performance for grouped queries
- WITH DATA Enhancement: Include additional columns for covering index capability
Detailed Notes
What are Composite Indexes?
Composite indexes (also called multi-column or concatenated indexes) are indexes defined on two or more columns.
- Syntax: `CREATE INDEX idx ON Table (col1, col2, col3)`
- Key point: Column order is significant and affects performance
Left-to-Right Rule
The query optimizer follows the "left-to-right" rule for composite index utilization.
For index (A, B, C):
- CAN use efficiently: Queries filtering on (A), (A,B), or (A,B,C)
- CANNOT use efficiently: Queries filtering only on (B), (C), or (B,C)
Design principle: Put most selective (highest cardinality) column first, then less selective columns.
Multi-Condition Queries
Composite indexes excel for queries with multiple AND filter conditions.
- Example query: `WHERE State='CA' AND City='San Francisco'`
- Index benefit: Index on (State, City) provides direct access to matching rows
- Without index: Query uses single-column index on State, then filters for City
GROUP BY Optimization
GROUP BY operations benefit when columns match the index order.
- Index (State, City): Allows efficient grouping by State alone or by State+City
- Benefit: Pre-sorted data eliminates expensive sort operations
Covering Indexes with WITH DATA
The WITH DATA clause includes additional columns in the index structure.
- Syntax: `CREATE INDEX idx ON Orders (CustomerID, OrderDate) WITH DATA (OrderTotal)`
- Benefit: Queries selecting CustomerID, OrderDate, and OrderTotal are satisfied entirely from index
- Impact: Dramatically reduces I/O by avoiding base table access
- Tradeoff: Increases index size and maintenance overhead
- Use for: Frequently-executed queries on stable data
When to Create Composite Indexes
Consider composite indexes when:
- Multiple columns are frequently queried together
- Queries combine multiple filter conditions with AND
- GROUP BY or ORDER BY operates on multiple columns
- Covering index opportunities exist for critical queries
Warning: Avoid excessive composite indexes - they consume storage and slow writes.
Documentation References
Exam Preparation Summary
Critical Concepts to Master:
- Bitmap Extent Indexes: Know they optimize COUNT(*), are automatically created for DDL tables, and only one exists per table
- Index Type Selection: Understand when to use standard (high cardinality), bitmap (low cardinality), bitslice (aggregates), or columnar (analytics) indexes
- Primary Key vs. IDKEY: Distinguish between user-defined primary keys and system-level ID keys; understand DDLPKeyNotIDKey configuration impact
- Foreign Key Configuration: Master composite foreign keys, default reference rules, and reference property conversion
- Referential Actions: Know NO ACTION (default), CASCADE (propagate), SET NULL, and SET DEFAULT behaviors
- Index Tradeoffs: Balance query performance gains against write performance costs and storage overhead
- Composite Index Design: Apply left-to-right rule, selectivity ordering, and covering index opportunities
- Sharded Table Restrictions: Remember bitmap index requirements, unique constraint impacts, and NO ACTION-only referential actions
Common Exam Scenarios:
- Selecting appropriate index type based on column cardinality and query patterns
- Designing composite indexes with optimal column ordering
- Configuring foreign key relationships with appropriate referential actions
- Evaluating whether to create bitmap extent index for specific table structures
- Distinguishing when primary key should or should not be the IDKEY
- Analyzing index tradeoff decisions for mixed OLTP/OLAP workloads
- Troubleshooting foreign key reference errors and default reference behavior
- Identifying covering index opportunities using WITH DATA clause
Hands-On Practice Recommendations:
- Create tables with different index types and compare query performance
- Build composite indexes with various column orders and test query optimizer usage
- Define foreign keys with different referential actions and test cascade behavior
- Create bitmap extent indexes manually and verify COUNT(*) performance improvement
- Configure primary keys with and without IDKEY designation, observe modification restrictions
- Design schemas with multiple related tables using various foreign key patterns
- Practice adding indexes to existing tables and measuring write performance impact
- Use Management Portal Catalog Details to inspect index definitions and constraints