TheThunderclap
Backend Databases PostgreSQL Performance

Database Indexing Deep Dive

B-trees, hash indexes, partial indexes and query planning — everything you need to stop your queries from crawling.

D

Deepa Nair

Database Engineer

📅 25 January 2025
⏱ 10 min read

Introduction

A slow query is almost always a missing or mis-configured index. Understanding how indexes work internally allows you to choose the right one for each access pattern — and avoid the traps that make indexes hurt more than they help.

B-Tree Indexes

The default PostgreSQL index type. B-trees are balanced tree structures that support equality, range, LIKE 'prefix%', ORDER BY, and MIN/MAX efficiently. They are NOT good for full-text search or low-cardinality columns (like boolean fields).

btree.sql
sql
                                            -- Single column index
CREATE INDEX idx_users_email ON users(email);

-- Composite index (order matters!)
-- This index supports: WHERE created_at = ? AND status = ?
-- And also:          WHERE created_at = ?
-- But NOT:           WHERE status = ?
CREATE INDEX idx_orders_date_status ON orders(created_at, status);

-- Use EXPLAIN ANALYZE to see "hl-keyword">if your index is used
EXPLAIN ANALYZE
  SELECT * FROM users WHERE email = 'hello@thunderclap.com';

-- Output you want to see:
-- Index Scan using idx_users_email on users
-- (instead "hl-keyword">of: Seq Scan on users)
                                        

Partial Indexes

A partial index indexes only the rows matching a WHERE clause. This is far more efficient when you only query a subset of data — e.g., unprocessed jobs in a queue.

partial.sql
sql
                                            -- Index only pending jobs (the ones you actually query)
CREATE INDEX idx_jobs_pending
  ON jobs(created_at)
  WHERE status = 'pending';

-- This query will use the partial index
SELECT * FROM jobs
WHERE status = 'pending'
ORDER BY created_at ASC
LIMIT 10;

-- Much smaller index than indexing all rows!
-- Trade-off: queries on non-pending rows won't use this index
                                        

Query Planning & Statistics

PostgreSQL's query planner decides whether to use an index based on statistics it collects about your data. Stale statistics can cause the planner to choose a slow sequential scan even when a usable index exists.

analyze.sql
sql
                                            -- Force statistics update
ANALYZE users;

-- Check index usage stats
SELECT
  indexrelname AS index_name,
  idx_scan     AS times_used,
  idx_tup_read AS tuples_read
FROM pg_stat_user_indexes
WHERE relname = 'users'
ORDER BY idx_scan DESC;

-- Find missing indexes (queries doing sequential scans on large tables)
SELECT
  schemaname,
  tablename,
  seq_scan,
  seq_tup_read,
  idx_scan,
  n_live_tup
FROM pg_stat_user_tables
WHERE seq_scan > idx_scan
  AND n_live_tup > 10000
ORDER BY seq_tup_read DESC;
                                        

💬 Comments

0 comments

Leave a comment

0/1000

Comments are moderated. Be respectful. ✌️

📚 Related Articles