What is the difference between clustered and non-clustered indexes?
Answer
Clustered indexes physically reorder and store table data based on the index key, while non-clustered indexes create a separate structure that points to the actual data rows. Each table can have only one clustered index but multiple non-clustered indexes.
Key Differences
Aspect | Clustered Index | Non-Clustered Index |
---|---|---|
Data Storage | Data pages stored in order of index key | Separate index structure points to data |
Quantity per Table | Only one | Multiple allowed |
Data Retrieval | Direct access to data | Lookup to find data location |
Storage Space | No additional space for index | Additional space required |
Insert Performance | Slower (may require page splits) | Faster |
Range Queries | Very fast | Fast but requires key lookup |
Clustered Index
Characteristics
- Physical ordering: Table data is physically stored in the order of the clustered index key
- Leaf level contains data: The leaf pages of the index contain the actual table data
- Automatic creation: Usually created automatically on primary key
Example: Clustered Index Creation
-- SQL Server: Create clustered index
CREATE CLUSTERED INDEX IX_Orders_OrderDate
ON Orders (OrderDate);
-- MySQL: Primary key automatically creates clustered index
CREATE TABLE Orders (
OrderID INT PRIMARY KEY, -- Clustered index on OrderID
CustomerID INT,
OrderDate DATE,
OrderTotal DECIMAL(10,2)
);
-- PostgreSQL: No explicit clustered indexes, but similar with CLUSTER
CREATE INDEX IX_Orders_OrderDate ON Orders (OrderDate);
CLUSTER Orders USING IX_Orders_OrderDate;
How Clustered Index Works
-- Table with clustered index on OrderDate
-- Physical storage order:
-- Page 1: Orders from 2024-01-01 to 2024-01-15
-- Page 2: Orders from 2024-01-16 to 2024-01-31
-- Page 3: Orders from 2024-02-01 to 2024-02-15
-- Range query benefits from clustered index
SELECT * FROM Orders
WHERE OrderDate BETWEEN '2024-01-10' AND '2024-01-20';
-- Reads only pages 1 and 2, very efficient
Clustered Index Structure
-- Conceptual structure of clustered index
/*
Root Level: [Jan] [Feb] [Mar] [Apr]
| | | |
Intermediate: [1-15][16-31] [1-14][15-28] [1-15][16-31] [1-15][16-30]
| | | | | | | |
Leaf Level: [Data][Data] [Data][Data] [Data][Data] [Data][Data]
(Actual table rows stored here)
*/
-- Query execution
SELECT * FROM Orders WHERE OrderDate = '2024-01-15';
-- 1. Start at root: Navigate to Jan
-- 2. Go to intermediate: Navigate to 1-15 range
-- 3. Read leaf page: Get actual data rows
Non-Clustered Index
Characteristics
- Separate structure: Index is stored separately from table data
- Pointers to data: Leaf level contains pointers (row locators) to actual data
- Multiple allowed: Can have many non-clustered indexes per table
- Additional storage: Requires extra storage space
Example: Non-Clustered Index Creation
-- Create non-clustered index
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
ON Orders (CustomerID);
-- MySQL: All secondary indexes are non-clustered
CREATE INDEX IX_Orders_CustomerID ON Orders (CustomerID);
-- Composite non-clustered index
CREATE INDEX IX_Orders_Customer_Date
ON Orders (CustomerID, OrderDate);
How Non-Clustered Index Works
-- Non-clustered index structure
/*
Index Pages (IX_Orders_CustomerID):
CustomerID | Row Locator
101 | Page 5, Slot 3
102 | Page 2, Slot 7
103 | Page 8, Slot 1
104 | Page 5, Slot 9
Table Pages (Heap or Clustered):
Page 2: [OrderID: 1002, CustomerID: 102, ...]
Page 5: [OrderID: 1001, CustomerID: 101, ...] [OrderID: 1004, CustomerID: 104, ...]
Page 8: [OrderID: 1003, CustomerID: 103, ...]
*/
-- Query using non-clustered index
SELECT * FROM Orders WHERE CustomerID = 102;
-- 1. Search non-clustered index for CustomerID = 102
-- 2. Find row locator: Page 2, Slot 7
-- 3. Go to Page 2, Slot 7 to get actual data (Key Lookup)
Performance Comparison
Range Queries
-- Clustered index on OrderDate
SELECT * FROM Orders
WHERE OrderDate BETWEEN '2024-01-01' AND '2024-01-31';
-- Very fast: Data is physically ordered by OrderDate
-- Reads contiguous pages
-- Non-clustered index on OrderDate
SELECT * FROM Orders
WHERE OrderDate BETWEEN '2024-01-01' AND '2024-01-31';
-- Slower: Must perform key lookups for each matching row
-- Random I/O to get actual data
Point Lookups
-- Clustered index lookup
SELECT * FROM Orders WHERE OrderID = 1001; -- OrderID is clustered key
-- Fast: Direct access to data page
-- Non-clustered index lookup
SELECT * FROM Orders WHERE CustomerID = 102; -- CustomerID has non-clustered index
-- Good: Index seek + key lookup
-- Still efficient for single row
Insert Performance
-- Insert with clustered index on OrderDate
INSERT INTO Orders (OrderID, CustomerID, OrderDate, OrderTotal)
VALUES (2001, 105, '2024-01-15', 250.00);
-- May be slow: If inserted in middle of date range,
-- may cause page splits and data movement
-- Insert with non-clustered indexes
INSERT INTO Orders (OrderID, CustomerID, OrderDate, OrderTotal)
VALUES (2001, 105, '2024-06-15', 250.00);
-- Faster for table data, but must update all non-clustered indexes
Practical Examples
E-commerce Order System
-- Orders table design
CREATE TABLE Orders (
OrderID INT IDENTITY(1,1) PRIMARY KEY, -- Clustered index
CustomerID INT,
OrderDate DATETIME,
OrderTotal DECIMAL(10,2),
Status VARCHAR(20)
);
-- Non-clustered indexes for common queries
CREATE INDEX IX_Orders_CustomerID ON Orders (CustomerID);
CREATE INDEX IX_Orders_OrderDate ON Orders (OrderDate);
CREATE INDEX IX_Orders_Status ON Orders (Status);
CREATE INDEX IX_Orders_Customer_Date ON Orders (CustomerID, OrderDate);
-- Query patterns and index usage:
-- 1. Customer order history (uses IX_Orders_CustomerID)
SELECT * FROM Orders WHERE CustomerID = 12345;
-- 2. Daily orders report (uses IX_Orders_OrderDate)
SELECT COUNT(*), SUM(OrderTotal)
FROM Orders
WHERE OrderDate >= '2024-01-01' AND OrderDate < '2024-01-02';
-- 3. Customer orders in date range (uses IX_Orders_Customer_Date)
SELECT * FROM Orders
WHERE CustomerID = 12345
AND OrderDate >= '2024-01-01';
Employee Management System
-- Employees table with clustered index on EmployeeID
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY, -- Clustered index
FirstName VARCHAR(50),
LastName VARCHAR(50),
Email VARCHAR(100),
DepartmentID INT,
HireDate DATE,
Salary DECIMAL(10,2)
);
-- Non-clustered indexes for different access patterns
CREATE INDEX IX_Employees_Email ON Employees (Email); -- Login queries
CREATE INDEX IX_Employees_Department ON Employees (DepartmentID); -- Department reports
CREATE INDEX IX_Employees_LastName_FirstName ON Employees (LastName, FirstName); -- Name searches
CREATE INDEX IX_Employees_HireDate ON Employees (HireDate); -- Seniority queries
-- Covering index for salary reports
CREATE INDEX IX_Employees_Dept_Salary
ON Employees (DepartmentID)
INCLUDE (FirstName, LastName, Salary);
Index Design Strategies
Choosing Clustered Index Key
-- Good clustered index candidates:
-- 1. Primary key (if sequential)
CREATE TABLE Orders (
OrderID INT IDENTITY(1,1) PRIMARY KEY -- Sequential, good for clustered
);
-- 2. Frequently used in range queries
CREATE CLUSTERED INDEX IX_Sales_Date ON Sales (SaleDate); -- Date ranges common
-- 3. Narrow, unique, static
CREATE CLUSTERED INDEX IX_Products_SKU ON Products (ProductSKU); -- Narrow key
-- Poor clustered index candidates:
-- 1. Wide keys
CREATE CLUSTERED INDEX IX_Bad_Wide ON Orders (CustomerName, ProductName, OrderDate); -- Too wide
-- 2. Frequently updated
CREATE CLUSTERED INDEX IX_Bad_Updated ON Products (LastModified); -- Changes often
-- 3. Random values
CREATE CLUSTERED INDEX IX_Bad_Random ON Orders (OrderGUID); -- Random, causes page splits
Covering Indexes
-- Non-clustered covering index
CREATE INDEX IX_Orders_Customer_Covering
ON Orders (CustomerID)
INCLUDE (OrderDate, OrderTotal, Status);
-- Query satisfied entirely by index (no key lookup needed)
SELECT OrderDate, OrderTotal, Status
FROM Orders
WHERE CustomerID = 12345;
-- Index contains all needed columns
Database-Specific Implementations
SQL Server
-- Explicit clustered index creation
CREATE CLUSTERED INDEX IX_Orders_Date ON Orders (OrderDate);
-- Non-clustered with included columns
CREATE NONCLUSTERED INDEX IX_Orders_Customer
ON Orders (CustomerID)
INCLUDE (OrderTotal, Status);
-- Filtered index
CREATE INDEX IX_Orders_Active
ON Orders (CustomerID)
WHERE Status = 'ACTIVE';
MySQL (InnoDB)
-- Primary key is always clustered
CREATE TABLE Orders (
OrderID INT AUTO_INCREMENT PRIMARY KEY, -- Clustered
CustomerID INT,
OrderDate DATE
);
-- All other indexes are non-clustered (secondary indexes)
CREATE INDEX IX_Customer ON Orders (CustomerID);
-- InnoDB secondary indexes contain primary key value
-- No separate row locator needed
PostgreSQL
-- No true clustered indexes, but CLUSTER command
CREATE INDEX IX_Orders_Date ON Orders (OrderDate);
CLUSTER Orders USING IX_Orders_Date; -- Physically reorders table once
-- Regular indexes (similar to non-clustered)
CREATE INDEX IX_Orders_Customer ON Orders (CustomerID);
-- Partial indexes
CREATE INDEX IX_Orders_Active ON Orders (CustomerID)
WHERE Status = 'ACTIVE';
Performance Monitoring
Index Usage Analysis
-- SQL Server: Check index usage
SELECT
i.name as index_name,
s.user_seeks,
s.user_scans,
s.user_lookups,
s.user_updates
FROM sys.indexes i
JOIN sys.dm_db_index_usage_stats s ON i.object_id = s.object_id AND i.index_id = s.index_id
WHERE OBJECT_NAME(i.object_id) = 'Orders';
-- MySQL: Check index usage
SELECT
TABLE_NAME,
INDEX_NAME,
SEQ_IN_INDEX,
COLUMN_NAME,
CARDINALITY
FROM information_schema.STATISTICS
WHERE TABLE_NAME = 'Orders';
Index Fragmentation
-- SQL Server: Check fragmentation
SELECT
i.name,
ps.avg_fragmentation_in_percent,
ps.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('Orders'), NULL, NULL, 'DETAILED') ps
JOIN sys.indexes i ON ps.object_id = i.object_id AND ps.index_id = i.index_id;
-- Rebuild fragmented indexes
ALTER INDEX IX_Orders_Date ON Orders REBUILD;
Best Practices
1. Clustered Index Design
-- Choose narrow, unique, static keys
CREATE CLUSTERED INDEX IX_Orders_ID ON Orders (OrderID); -- Good
-- Avoid wide or frequently updated keys
-- Bad: CREATE CLUSTERED INDEX IX_Bad ON Orders (CustomerName, ProductDescription);
2. Non-Clustered Index Strategy
-- Create indexes based on query patterns
-- Frequently filtered columns
CREATE INDEX IX_Orders_Status ON Orders (Status);
-- Composite indexes for multi-column filters
CREATE INDEX IX_Orders_Customer_Date ON Orders (CustomerID, OrderDate);
-- Covering indexes for frequently selected columns
CREATE INDEX IX_Orders_Summary
ON Orders (CustomerID)
INCLUDE (OrderDate, OrderTotal);
3. Maintenance
-- Regular index maintenance
-- Rebuild fragmented indexes
ALTER INDEX ALL ON Orders REBUILD;
-- Update statistics
UPDATE STATISTICS Orders;
-- Remove unused indexes
DROP INDEX IX_Unused ON Orders;
Interview Tips
- Understand that clustered indexes physically order data while non-clustered indexes are separate structures
- Know the “one clustered, many non-clustered” rule
- Be able to explain key lookup operations in non-clustered indexes
- Understand when each type provides better performance
- Know how to choose appropriate clustered index keys
- Be familiar with covering indexes and included columns
- Understand the trade-offs between query performance and maintenance overhead
Test Your Knowledge
Take a quick quiz to test your understanding of this topic.