T1.1: Designs a SQL Schema - InterSystems IRIS SQL Specialist Study Guide

1. Understands the role of bitmap extent indexes

Key Points

Bitmap Extent Index: Specialized index automatically created for each table to track row existence
COUNT(*) Optimization: Dramatically improves performance of counting total rows in a table
Automatic Creation: DDL-created tables automatically define bitmap extent index (%%DDLBEIndex)
One Per Table: Maximum of one bitmap extent index allowed per table
System-Assigned RowID: Requires tables using system-assigned RowID with positive integer values

Detailed Notes

What is a Bitmap Extent Index?

Bitmap extent indexes are specialized indexes that InterSystems IRIS uses to optimize table-level operations, particularly COUNT(*) queries. When you create a table using CREATE TABLE, InterSystems IRIS automatically defines a bitmap extent index with the SQL MapName %%DDLBEIndex. This index maintains a compressed bitmap representation of which rows exist in the table, allowing the database to rapidly determine row counts without scanning the entire table.

How it Works

The bitmap extent index tracks the table extent itself - the complete set of rows that exist in the table. The index uses bit positions to represent row IDs, with each bit indicating whether a row exists (1) or not (0). Unlike regular indexes that track specific column values, the bitmap extent index tracks existence only.

Performance Benefits

The bitmap extent index is particularly valuable for large tables where counting rows would otherwise require a full table scan. COUNT(*) queries can be resolved entirely from the index without accessing the table data.

Constraints and Limitations

One per table: Maximum of one bitmap extent index allowed per table
SQLCODE -400: Attempting to create multiple bitmap extent indexes results in this error
Not created automatically when:
Table is a global temporary table
Table defines an explicit IDKEY index
Table contains an IDENTITY column with MINVAL other than 1
System configuration option DDLDefineBitmapExtent is disabled

RowID Requirements

The bitmap extent index works only with:

Tables using system-assigned RowID values with positive integers
Tables using a primary key IDKEY based on %Integer with MINVAL > 0
Tables using a primary key IDKEY based on %Numeric with SCALE = 0 and MINVAL > 0

This restriction ensures the bitmap can efficiently map bit positions to row identifiers.

Documentation References

2. Determines appropriate index use cases

Key Points

Performance vs. Maintenance: Indexes improve query speed but slow INSERT, UPDATE, DELETE operations
Selectivity Matters: Index columns with high cardinality (many distinct values) for best results
Query Patterns: Analyze WHERE clauses, JOIN conditions, and ORDER BY requirements
Index Overhead: Each index requires storage space and maintenance during data modifications
Covering Indexes: WITH DATA clause allows queries resolved entirely from index

Detailed Notes

Balancing Performance and Maintenance

Determining appropriate index use cases requires balancing query performance benefits against maintenance costs. Indexes dramatically improve query performance by providing fast lookup paths to data, but every index adds overhead to INSERT, UPDATE, and DELETE operations because InterSystems IRIS must maintain each index when data changes.

Selectivity Principles

The most effective indexes are created on columns with high selectivity - meaning columns that have many distinct values relative to the total number of rows.

High selectivity examples: Employee ID, Social Security Number, Order Number
Low selectivity examples: Gender (M/F), Status codes with few values
Recommendation: Use standard indexes for high selectivity; consider bitmap indexes for low selectivity

Identifying Index Candidates

Analyze your application's query patterns to identify index candidates:

WHERE clauses: Columns frequently used in predicates are prime candidates
JOIN operations: Both the joining column and referenced column benefit from indexes
ORDER BY/GROUP BY: Indexes can eliminate expensive sort operations
Avoid over-indexing: Creating indexes on every column wastes resources and degrades write performance

Composite and Covering Indexes

Consider composite indexes when queries frequently filter on multiple columns together:

Column order matters: Place the most selective column first, followed by less selective columns
WITH DATA clause: Creates covering indexes that include additional column data
Covering queries: Can be satisfied entirely from the index without accessing the master data map
Tradeoff: Increases index size and maintenance overhead

Documentation References

3. Differentiates between index types (standard, bitmap, bitslice, columnar)

Key Points

Standard Index (Type=index): General-purpose B-tree index for high-cardinality columns
Bitmap Index (Type=bitmap): Optimized for low-cardinality columns with few distinct values
Bitslice Index (Type=bitslice): Specialized for fast aggregate calculations on numeric data
Columnar Index (Type=columnar): Enables fast filtering and aggregation on columnar-stored data
Unique Constraint: Only standard indexes support UNIQUE keyword

Detailed Notes

Standard Index (Type=index)

Standard indexes use a traditional B-tree structure suitable for columns with many distinct values.

Best for: Primary keys, foreign keys, high-cardinality columns
Supports: Both unique and non-unique constraints
Operations: Efficient lookup, range scanning, and sorting
Default choice: Most indexing scenarios

Bitmap Index (Type=bitmap)

Bitmap indexes excel when indexing columns with a small number of distinct values (low cardinality).

Best for: Gender, status codes, categorical data
How it works: Compressed bit strings where bit position = row ID, bit value = presence/absence
Efficient for: AND/OR operations in WHERE clauses
Limitations:
Should only be used for < 10,000 distinct values
Requires tables with system-assigned RowID using positive integers
Cannot specify UNIQUE constraints

Bitslice Index (Type=bitslice)

Bitslice indexes are highly specialized for fast aggregate calculations on numeric data.

Best for: SUM operations, range conditions on numeric data
How it works: Represents each numeric value as a binary bit string
Limitations:
Significantly slower INSERT, UPDATE, DELETE performance
Not used by query optimizer for WHERE clause filtering
Cannot specify UNIQUE constraints

Columnar Index (Type=columnar)

Columnar indexes enable very fast queries on columnar-stored data.

Best for: Analytics workloads (OLAP), filtering and aggregation
Ideal scenario: Frequently querying specific columns across many rows
Not ideal for: Transactional workloads (OLTP) accessing complete rows

Documentation References

4. Distinguishes between primary keys, unique constraints, and ID keys

Key Points

Primary Key: User-defined unique identifier for table rows, optional but recommended
ID Key (IDKEY): System-level unique index used for object identity and references
Unique Constraint: Enforces uniqueness without designating the primary key
RowID Column: System-generated integer serving as default unique identifier
Configuration Impact: DDLPKeyNotIDKey setting determines if primary key becomes IDKEY

Detailed Notes

RowID Column

The RowID column is an automatically generated column (default name "ID") containing unique, non-modifiable positive integers assigned by the system to each record.

Always exists: Provides a guaranteed unique identifier even without a primary key
Immutable: Values cannot be modified after creation
System-assigned: Automatically assigned by InterSystems IRIS

Primary Key

A primary key is a user-defined constraint that designates one or more columns as the table's primary record identifier.

Requirements: Must contain unique values, cannot accept NULL
Optional but recommended: Provides meaningful business-level identification
Examples: Employee ID, Order Number, SSN

DDLPKeyNotIDKey Configuration

The relationship between primary keys and ID keys depends on the system configuration setting DDLPKeyNotIDKey:

DDLPKeyNotIDKey=1 (default): Primary key is NOT the IDKEY
Creates separate index for uniqueness
Primary key values CAN be modified
DDLPKeyNotIDKey=0: Primary key becomes the IDKEY
More efficient data access
Primary key values CANNOT be modified after creation
Note: If IDENTITY column exists, primary key can never be IDKEY regardless of setting

Unique Constraints

Unique constraints enforce uniqueness without designating the primary key.

Multiple allowed: Table can have multiple unique constraints but only one primary key
Creates index: To enforce uniqueness
Sharded tables: Can significantly impact performance unless shard key is subset of unique key

IDKEY

The IDKEY is the system-level unique index used internally for object identity.

Required: Every persistent class must have exactly one IDKEY
Used for: Object identity, parent-child relationships, object references
Fallback logic: If no primary key and RowID hidden, uses IDENTITY column or creates internal IDKEY

Documentation References

5. Defines foreign key relationship properties

Key Points

Foreign Key Definition: Column referencing unique values in another table
Referenced Column Requirements: Must be unique (primary key, RowID, IDENTITY, or unique constraint)
Constraint Naming: Required CONSTRAINT identifier for ALTER TABLE operations
Multiple Columns: Support for composite foreign keys with matching column order
Default Reference: Omitting referenced column defaults to primary key, then IDENTITY, then RowID

Detailed Notes

What Foreign Keys Do

Foreign keys establish referential relationships between tables by designating a column (or combination of columns) in one table that references unique values in another table.

Purpose: Create logical links between tables
Data integrity: Ensures referenced values actually exist
Prevents: Orphaned records

Definition Requirements

When defining a foreign key:

CONSTRAINT name required: Needed for future ALTER TABLE operations
Column names: Can differ between foreign key and referenced columns
Data types: Must be compatible
Referenced column must be unique: Primary key, unique constraint, RowID, or IDENTITY

Composite Foreign Keys

Foreign keys support multiple columns by specifying comma-separated lists:

Must match in number and order
Example: `FOREIGN KEY (CustomerNum, SalesPersonNum) REFERENCES Customers (CustID, SalespID)`
CustomerNum → CustID
SalesPersonNum → SalespID

Default Reference Hierarchy

If you omit the referenced column, the foreign key defaults to: 1. Primary key column (if defined) 2. IDENTITY column (if no primary key) 3. RowID column (if neither exists)

Note: RowID can only be referenced if defined as public (SqlRowIdPrivate=0 or %PUBLICROWID keyword)

Reference Property Conversion

InterSystems IRIS can convert single-column foreign keys to reference properties if:

Foreign key references parent table's IDKEY
Table contains no data
Property is not already a reference property
Data types match exactly
Foreign key field is not an IDENTITY field

Sharded Table Considerations

Supported: Foreign keys work for any combination of sharded/unsharded tables
Restriction: Only NO ACTION referential action supported
Privilege required: REFERENCES on referenced table or columns

Documentation References

6. Understands referential action behavior for updates and deletes

Key Points

ON DELETE/ON UPDATE: Clauses defining actions when referenced data changes
NO ACTION (default): Prevents delete/update if foreign key references exist
CASCADE: Propagates delete/update to all referencing rows
SET NULL: Sets foreign key columns to NULL when referenced row changes
SET DEFAULT: Sets foreign key columns to their default values
Sharded Restrictions: Only NO ACTION supported for sharded tables

Detailed Notes

Purpose of Referential Actions

Referential actions define how InterSystems IRIS maintains data consistency when records in referenced tables are modified or deleted. The ON DELETE and ON UPDATE clauses specify what action to take when changing or removing a row referenced by foreign key constraints.

NO ACTION (Default)

NO ACTION is the default for both ON DELETE and ON UPDATE.

Behavior: Prevents deletion/modification if foreign key references exist
Result: Operation fails with error, preserving referential integrity
Exception: Does not apply to self-referential foreign keys

CASCADE

CASCADE propagates changes through the relationship.

ON DELETE CASCADE: Automatically deletes all referencing rows
ON UPDATE CASCADE: Automatically updates foreign key values to match new primary key
Warning: Deleting one row can cascade to many related rows across tables

SET NULL

SET NULL breaks the relationship by nullifying foreign key columns.

Requirement: Foreign key columns must allow NULL (no NOT NULL constraint)
Behavior: Maintains child rows but removes parent relationship
Use case: When the relationship is optional

SET DEFAULT

SET DEFAULT reassigns orphaned records to a default parent.

Behavior: Sets foreign key columns to their default values
No default defined: Sets to NULL
Requirement: A row matching the default value must exist in referenced table
Use case: Reassigning to "Unknown" or "Unassigned" parent

Important Warnings

No contradictory foreign keys: Don't define two FKs on same columns with different actions
Example: One with ON DELETE CASCADE, another with ON DELETE SET NULL
IRIS doesn't prevent this at definition, but errors occur at runtime
Sharded tables: Only NO ACTION supported
CASCADE, SET NULL, SET DEFAULT result in SQLCODE -400 error

Documentation References

7. Evaluates tradeoffs when adding indexes

Key Points

Read Performance Gain: Indexes can reduce query time from seconds to milliseconds
Write Performance Cost: Each index adds overhead to INSERT, UPDATE, DELETE operations
Storage Overhead: Indexes consume additional disk space
Maintenance Burden: Index corruption risks require rebuilding; multiple indexes multiply complexity
Optimal Balance: Index only columns frequently used in WHERE, JOIN, ORDER BY clauses

Detailed Notes

Query Performance Benefits

Adding indexes dramatically accelerates query operations by providing direct access paths to data.

Full table scan: Seconds or minutes
Index lookup: Milliseconds
Benefit: Orders of magnitude improvement for selective queries

Write Performance Costs

Each data modification operation must update the base table AND every index.

INSERT: Must add entries to all applicable indexes
UPDATE: Must update index entries for modified indexed columns
DELETE: Must remove entries from all indexes
Impact: Multiple indexes significantly degrade write performance in transaction-heavy applications

Storage Overhead

Each index requires disk space to store the index structure and data.

Standard indexes: Store indexed column values and row pointers
WITH DATA clause: Store additional column data (larger but enables covering queries)
Bitmap indexes: Use compressed bit strings (efficient but can be substantial for large tables)

Maintenance Complexity

Indexes introduce operational requirements:

Index corruption/inefficiency: May require rebuilding using BUILD INDEX or %BuildIndices()
Collation changes: Require recompiling classes and rebuilding all indexes
Initial creation: Building index data can be time-consuming for large tables
Multiple indexes: Multiply all maintenance requirements

Optimal Indexing Strategy

Balance tradeoffs by indexing selectively:

DO index: Columns frequently used in WHERE, JOIN, ORDER BY
DON'T index: Every column "just in case" - wastes resources
Low selectivity columns: Use bitmap indexes instead of standard indexes
Monitor and iterate: Add indexes based on actual usage patterns, not speculation
Use query plans: Identify queries that would benefit from new indexes

Documentation References

8. Identifies composite index use cases

Key Points

Definition: Single index spanning multiple columns in specified order
Column Order Matters: Most selective column should be first
Left-to-Right Usage: Query must use leftmost columns for index utilization
GROUP BY Optimization: Composite indexes improve performance for grouped queries
WITH DATA Enhancement: Include additional columns for covering index capability

Detailed Notes

What are Composite Indexes?

Composite indexes (also called multi-column or concatenated indexes) are indexes defined on two or more columns.

Syntax: `CREATE INDEX idx ON Table (col1, col2, col3)`
Key point: Column order is significant and affects performance

Left-to-Right Rule

The query optimizer follows the "left-to-right" rule for composite index utilization.

For index (A, B, C):

CAN use efficiently: Queries filtering on (A), (A,B), or (A,B,C)
CANNOT use efficiently: Queries filtering only on (B), (C), or (B,C)

Design principle: Put most selective (highest cardinality) column first, then less selective columns.

Multi-Condition Queries

Composite indexes excel for queries with multiple AND filter conditions.

Example query: `WHERE State='CA' AND City='San Francisco'`
Index benefit: Index on (State, City) provides direct access to matching rows
Without index: Query uses single-column index on State, then filters for City

GROUP BY Optimization

GROUP BY operations benefit when columns match the index order.

Index (State, City): Allows efficient grouping by State alone or by State+City
Benefit: Pre-sorted data eliminates expensive sort operations

Covering Indexes with WITH DATA

The WITH DATA clause includes additional columns in the index structure.

Syntax: `CREATE INDEX idx ON Orders (CustomerID, OrderDate) WITH DATA (OrderTotal)`
Benefit: Queries selecting CustomerID, OrderDate, and OrderTotal are satisfied entirely from index
Impact: Dramatically reduces I/O by avoiding base table access
Tradeoff: Increases index size and maintenance overhead
Use for: Frequently-executed queries on stable data

When to Create Composite Indexes

Consider composite indexes when:

Multiple columns are frequently queried together
Queries combine multiple filter conditions with AND
GROUP BY or ORDER BY operates on multiple columns
Covering index opportunities exist for critical queries

Warning: Avoid excessive composite indexes - they consume storage and slow writes.

Documentation References

Exam Preparation Summary

Critical Concepts to Master:

Bitmap Extent Indexes: Know they optimize COUNT(*), are automatically created for DDL tables, and only one exists per table
Index Type Selection: Understand when to use standard (high cardinality), bitmap (low cardinality), bitslice (aggregates), or columnar (analytics) indexes
Primary Key vs. IDKEY: Distinguish between user-defined primary keys and system-level ID keys; understand DDLPKeyNotIDKey configuration impact
Foreign Key Configuration: Master composite foreign keys, default reference rules, and reference property conversion
Referential Actions: Know NO ACTION (default), CASCADE (propagate), SET NULL, and SET DEFAULT behaviors
Index Tradeoffs: Balance query performance gains against write performance costs and storage overhead
Composite Index Design: Apply left-to-right rule, selectivity ordering, and covering index opportunities
Sharded Table Restrictions: Remember bitmap index requirements, unique constraint impacts, and NO ACTION-only referential actions

Common Exam Scenarios:

Selecting appropriate index type based on column cardinality and query patterns
Designing composite indexes with optimal column ordering
Configuring foreign key relationships with appropriate referential actions
Evaluating whether to create bitmap extent index for specific table structures
Distinguishing when primary key should or should not be the IDKEY
Analyzing index tradeoff decisions for mixed OLTP/OLAP workloads
Troubleshooting foreign key reference errors and default reference behavior
Identifying covering index opportunities using WITH DATA clause

Hands-On Practice Recommendations:

Create tables with different index types and compare query performance
Build composite indexes with various column orders and test query optimizer usage
Define foreign keys with different referential actions and test cascade behavior
Create bitmap extent indexes manually and verify COUNT(*) performance improvement
Configure primary keys with and without IDKEY designation, observe modification restrictions
Design schemas with multiple related tables using various foreign key patterns
Practice adding indexes to existing tables and measuring write performance impact
Use Management Portal Catalog Details to inspect index definitions and constraints

1. Understands the role of bitmap extent indexes Report Issue

Key Points

Detailed Notes

What is a Bitmap Extent Index?

How it Works

Performance Benefits

Constraints and Limitations

RowID Requirements

Documentation References

2. Determines appropriate index use cases Report Issue

Key Points

Detailed Notes

Balancing Performance and Maintenance

Selectivity Principles

Identifying Index Candidates

Composite and Covering Indexes

Documentation References

3. Differentiates between index types (standard, bitmap, bitslice, columnar) Report Issue

Key Points

Detailed Notes

Standard Index (Type=index)

Bitmap Index (Type=bitmap)

Bitslice Index (Type=bitslice)

Columnar Index (Type=columnar)

Documentation References

4. Distinguishes between primary keys, unique constraints, and ID keys Report Issue

Key Points

Detailed Notes

RowID Column

Primary Key

DDLPKeyNotIDKey Configuration

Unique Constraints

IDKEY

Documentation References

5. Defines foreign key relationship properties Report Issue

Key Points

Detailed Notes

What Foreign Keys Do

Definition Requirements

Composite Foreign Keys

Default Reference Hierarchy

Reference Property Conversion

Sharded Table Considerations

Documentation References

6. Understands referential action behavior for updates and deletes Report Issue

Key Points

Detailed Notes

Purpose of Referential Actions

NO ACTION (Default)

CASCADE

SET NULL

SET DEFAULT

Important Warnings

Documentation References

7. Evaluates tradeoffs when adding indexes Report Issue

Key Points

Detailed Notes

Query Performance Benefits

Write Performance Costs

Storage Overhead

Maintenance Complexity

Optimal Indexing Strategy

Documentation References

8. Identifies composite index use cases Report Issue

Key Points

Detailed Notes

What are Composite Indexes?

Left-to-Right Rule

Multi-Condition Queries

GROUP BY Optimization

Covering Indexes with WITH DATA

When to Create Composite Indexes

Documentation References

Exam Preparation Summary

Critical Concepts to Master:

Common Exam Scenarios:

Hands-On Practice Recommendations:

Report an Issue

1. Understands the role of bitmap extent indexes

2. Determines appropriate index use cases

3. Differentiates between index types (standard, bitmap, bitslice, columnar)

4. Distinguishes between primary keys, unique constraints, and ID keys

5. Defines foreign key relationship properties

6. Understands referential action behavior for updates and deletes

7. Evaluates tradeoffs when adding indexes

8. Identifies composite index use cases