System Design

A database index is a data structure that speeds up data retrieval on a database table. It works like the index in a book: instead of reading every page, you look up a term and jump straight to the right place. In the same way, an index lets the database locate matching rows without examining every row in the table. Here are the key points about database indexes:

Purpose

  • Faster Data Retrieval: Indexes are primarily used to speed up data retrieval operations like SELECT queries. Without indexes, a query would typically require scanning the entire table to find matching rows, which can be slow for large tables.
  • Enforcing Constraints: Indexes can also be used to enforce uniqueness constraints (e.g., ensuring that values in a particular column are unique) and to speed up the enforcement of foreign key relationships between tables.
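
The effect on retrieval can be observed by asking a database how it plans to run a query before and after an index exists. The following is a minimal sketch using SQLite through Python's built-in sqlite3 module; the users table, the idx_users_email index, and the plan helper are hypothetical names chosen for illustration, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

def plan(conn, sql, params=()):
    """Return the detail text of each step in SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

query = "SELECT * FROM users WHERE email = ?"
args = ("user42@example.com",)

# Without an index on email, SQLite has to scan the whole table.
plan_before = plan(conn, query, args)

# With an index on email, it can search the index instead.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = plan(conn, query, args)

print("before:", plan_before)
print("after: ", plan_after)
```

The first plan should describe a scan of users, while the second should mention idx_users_email. The query returns the same row in both cases; only the access path changes.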

Structure

  • B-Tree Structure: Most database systems use a B-Tree (a self-balancing tree, often in its B+ Tree variant) as the default index structure. Because the tree stays balanced and keeps its keys in sorted order, searches, insertions, and deletions take time logarithmic in the number of entries, and range scans are efficient.
  • Other Types: While B-Tree indexes are the most common, some databases support other index types, such as hash indexes (fast for exact-match lookups, but unable to serve range queries or sorted output) and bitmap indexes (suited to low-cardinality data such as gender or status flags).
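
The trade-off between an ordered index and a hash index can be sketched in plain Python. This is not a real B-Tree: a sorted list with binary search stands in for the ordered structure, and a dict stands in for the hash index. The data and the function names are made up for illustration.

```python
from bisect import bisect_left, bisect_right

# Hypothetical data: (age, row_id) pairs for a small table.
rows = [(31, 1), (25, 2), (42, 3), (25, 4), (37, 5)]

# Ordered index: entries kept sorted by key, as a B-Tree keeps its keys.
ordered_index = sorted(rows)

# Hash index: key -> list of row ids, with no ordering between keys.
hash_index = {}
for age, row_id in rows:
    hash_index.setdefault(age, []).append(row_id)

def range_lookup(low, high):
    """Row ids with low <= age <= high, located by binary search."""
    start = bisect_left(ordered_index, (low, float("-inf")))
    end = bisect_right(ordered_index, (high, float("inf")))
    return [row_id for _, row_id in ordered_index[start:end]]

def exact_lookup(age):
    """Row ids with exactly this age, located by a single hash probe."""
    return hash_index.get(age, [])

print(range_lookup(25, 37))
print(exact_lookup(25))
```

The ordered structure answers the range query by finding two positions and reading everything between them. The hash structure answers exact matches directly, but would have to probe every possible key to answer the same range query.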

Indexed Columns

  • Indexes can be created on one or more columns of a table; these are called the indexed columns. An index on several columns is a composite index, and the order in which its columns are listed matters: it generally helps queries that filter on the leading column(s), but not queries that filter only on a later column.
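
Why column order matters can be checked with a small experiment. The sketch below assumes SQLite (via Python's sqlite3 module) and a hypothetical people table with a composite index on (last_name, first_name); other database systems may plan these queries differently.

```python
import sqlite3

def plan(conn, sql, params=()):
    """Return the detail text of each step in SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people ("
    "id INTEGER PRIMARY KEY, last_name TEXT, first_name TEXT, city TEXT)"
)
# Composite index: last_name is the leading column.
conn.execute("CREATE INDEX idx_people_name ON people (last_name, first_name)")

# Filters on both columns: the index applies.
by_both = plan(
    conn,
    "SELECT * FROM people WHERE last_name = ? AND first_name = ?",
    ("Ng", "Ada"),
)
# Filters on the leading column only: the index still applies.
by_leading = plan(conn, "SELECT * FROM people WHERE last_name = ?", ("Ng",))
# Filters on the trailing column only: the index is not used here.
by_trailing = plan(conn, "SELECT * FROM people WHERE first_name = ?", ("Ada",))

for label, text in [("both", by_both), ("leading", by_leading), ("trailing", by_trailing)]:
    print(f"{label:9s}{text}")
```

If queries that filter only on first_name are common, a separate index on first_name, or a composite index with the columns in the other order, would be the usual remedy.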

Index Creation

  • Indexes are created explicitly by a database administrator or developer, or automatically by the database system in response to constraints (e.g., primary keys and unique constraints).
  • Every index takes additional storage space and slows down write operations (INSERT, UPDATE, DELETE), because each change to the table must also be applied to its indexes. Indexes should therefore be created judiciously.
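
Both ways of creating an index can be seen side by side in SQLite. In this sketch the accounts table and idx_accounts_name are hypothetical; the sqlite_autoindex_ prefix is how SQLite names the indexes it creates for constraints, and other systems use their own naming.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        code  TEXT PRIMARY KEY,  -- constraint: SQLite creates an index for it
        email TEXT UNIQUE,       -- constraint: SQLite creates an index for it
        name  TEXT
    )
""")
# Explicitly created index.
conn.execute("CREATE INDEX idx_accounts_name ON accounts (name)")

# PRAGMA index_list returns one row per index; column 1 is the index name.
index_names = sorted(row[1] for row in conn.execute("PRAGMA index_list('accounts')"))
automatic = [n for n in index_names if n.startswith("sqlite_autoindex_")]
explicit = [n for n in index_names if not n.startswith("sqlite_autoindex_")]

print("automatic:", automatic)
print("explicit: ", explicit)
```

Every one of these indexes, automatic or explicit, is updated on each write to accounts, which is where the write overhead comes from.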

Query Optimisation

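  • When you run a query with a WHERE clause, the database optimiser checks whether an index exists on the filtered columns. If one does, and the optimiser estimates that using it is cheaper than scanning the table, it uses the index to locate the matching rows directly, which can significantly reduce retrieval time.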
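
The optimiser's choice depends on which columns the WHERE clause touches. The following sketch, again assuming SQLite and a hypothetical orders table with a single index on customer_id, compares the plans chosen for three different filters.

```python
import sqlite3

def plan(conn, sql, params=()):
    """Return the detail text of each step in SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL, note TEXT)"
)
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# Equality filter on the indexed column.
equality = plan(conn, "SELECT * FROM orders WHERE customer_id = ?", (7,))
# Range filter on the indexed column.
in_range = plan(
    conn, "SELECT * FROM orders WHERE customer_id BETWEEN ? AND ?", (5, 9)
)
# Filter on a column that has no index.
unindexed = plan(conn, "SELECT * FROM orders WHERE note = ?", ("gift",))

print("equality: ", equality)
print("range:    ", in_range)
print("unindexed:", unindexed)
```

The first two plans should reference idx_orders_customer, while the third falls back to scanning the table because no index covers note.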

Primary and Unique Indexes

  • A primary index enforces the uniqueness of values in the primary key column or set of columns and provides a fast lookup by primary key.
  • Unique indexes also ensure that values in a column (or columns) are unique, but on columns other than the primary key. A table has at most one primary key but can have several unique indexes.
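
A short sketch of both kinds of enforcement, assuming SQLite and a hypothetical members table. The try_insert helper is illustrative only; it returns the error message when the database rejects a row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("CREATE UNIQUE INDEX idx_members_email ON members (email)")

def try_insert(member_id, email):
    """Insert a row; return None on success or the error text on a violation."""
    try:
        conn.execute(
            "INSERT INTO members (id, email) VALUES (?, ?)", (member_id, email)
        )
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)

first = try_insert(1, "a@example.com")      # accepted
dup_email = try_insert(2, "a@example.com")  # rejected by the unique index
dup_id = try_insert(1, "b@example.com")     # rejected by the primary key

row_count = conn.execute("SELECT COUNT(*) FROM members").fetchone()[0]
print(first, dup_email, dup_id, row_count, sep="\n")
```

Only the first row is stored. The database checks the index on every insert, so the application does not need to query for duplicates first.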

Clustered and Non-clustered Indexes

  • In some database systems, such as SQL Server, a clustered index determines the order in which the table's rows are stored. Because the rows can be stored in only one order, each table can have only one clustered index.
  • Non-clustered indexes are stored separately from the table data: each entry holds a copy of the indexed column(s) and a pointer to the actual row. A table can have multiple non-clustered indexes.
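
The difference can be modelled in a few lines of plain Python. This is a conceptual sketch only, not how any particular database lays out its pages: a sorted list of rows plays the role of the clustered index, and a separate sorted list of (email, id) pairs plays the role of a non-clustered index whose "pointer" is the row's key. All names and data are hypothetical.

```python
from bisect import bisect_left

# "Clustered": the rows themselves are stored in primary-key order.
clustered_rows = sorted([
    (3, "cy@example.com", "Cy"),
    (1, "bo@example.com", "Bo"),
    (2, "al@example.com", "Al"),
])
clustered_keys = [row[0] for row in clustered_rows]

# "Non-clustered": a separate sorted structure holding a copy of the
# indexed column plus a locator (here, the row's key) for the full row.
email_index = sorted((row[1], row[0]) for row in clustered_rows)
email_keys = [entry[0] for entry in email_index]

def find_by_id(row_id):
    """One lookup: the clustered structure holds the row itself."""
    pos = bisect_left(clustered_keys, row_id)
    if pos < len(clustered_keys) and clustered_keys[pos] == row_id:
        return clustered_rows[pos]
    return None

def find_by_email(email):
    """Two lookups: find the locator in the index, then fetch the row."""
    pos = bisect_left(email_keys, email)
    if pos < len(email_keys) and email_keys[pos] == email:
        return find_by_id(email_index[pos][1])
    return None

print(find_by_id(2))
print(find_by_email("cy@example.com"))
```

The extra hop in find_by_email is the cost of a non-clustered index; the benefit is that any number of such structures can exist alongside the single row ordering.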

Index Maintenance

  • Indexes need to be maintained as data in the table changes. The database updates them automatically on every write, but over time inserting, updating, or deleting rows can lead to index fragmentation, which can impact query performance. Periodic maintenance (rebuilding or reorganising indexes) helps keep them efficient.
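
The commands for rebuilding an index differ between systems. As one small illustration, SQLite provides a REINDEX statement that drops and recreates an index from the table's current contents. The sketch below uses a hypothetical events table; it shows the mechanics of a rebuild and is not a measurement of fragmentation.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, kind TEXT)")
conn.execute("CREATE INDEX idx_events_kind ON events (kind)")

# Heavy churn: insert many rows, then delete most of them.
conn.executemany(
    "INSERT INTO events (kind) VALUES (?)",
    [(f"kind{i % 50}",) for i in range(5000)],
)
conn.execute("DELETE FROM events WHERE id % 10 != 0")

# Rebuild the index from scratch, then verify the database is consistent.
conn.execute("REINDEX idx_events_kind")
check = conn.execute("PRAGMA integrity_check").fetchone()[0]
remaining = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]

print(check, remaining)
```

How often a rebuild is worthwhile depends on the database system and the workload, so the system's own documentation is the best guide.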

In summary, database indexes are essential for optimising query performance and ensuring data integrity. However, they should be used thoughtfully, as creating too many indexes or indexing the wrong columns increases storage requirements, slows down writes, and adds maintenance overhead. Proper indexing is a critical aspect of database design and query optimisation.
