How Tables and Indexes are Stored on Disk in Databases

How Tables and Indexes are Stored on Disk in Databases

?????????? ?????????????? ???? ????????

?????????-???????????????? ?????????????? (???????? ???? ?????????????????? ????????????):

???-> Rows are stored sequentially in no particular order (Heap storage).

???-> Clustered indexes store rows in the order of the index key (e.g., primary key).

???-> Efficient for range queries and ordered scans when using clustered indexes.

???-> Heap storage is faster for full-table scans but slower for column-specific queries.

???????????????-???????????????? ??????????????:

???-> Data is stored column by column instead of row by row.

???-> Ideal for analytical queries that involve aggregations over large datasets.

???-> Slower for write-heavy workloads due to the need to update multiple column files.


???????????-?????????? ??????????????:

???-> Data is divided into fixed-size blocks called pages (typically 4KB, 8KB, or 16KB).

???-> Pages are grouped into extents (contiguous groups of pages) for efficient I/O.

???-> Pages can be cached in memory (buffer pool) for faster access.


?????????? ?????????????? ???? ????????


?????-???????? ??????????????:

???-> B-trees are balanced tree structures with keys and pointers to child nodes or data rows.

???-> Leaf nodes contain pointers to actual rows or their locations on disk.

???-> Efficient for range queries, equality searches, and ordered scans.

???-> Requires additional storage and maintenance overhead for insertions and deletions.

??????????? ??????????????:

???-> Hash indexes use a hash function to map keys to specific locations in a hash table.

???-> Extremely fast for exact match queries (equality searches).

???-> Not suitable for range queries or ordered scans due to lack of ordering.


??????????????? ??????????????:

???-> Bitmap indexes use bit arrays to represent the presence or absence of values in a column.

???-> Compact and efficient for columns with low cardinality (few distinct values).

???-> Expensive to update and not suitable for high-cardinality columns.


????????????????????? ??????????????:

???-> Secondary indexes are created on non-primary key columns.

???-> Store indexed column values along with pointers to actual rows or the clustered index key.

???-> Improve query performance for non-primary key columns but add overhead for updates.

???????? ???????????????????????? ???? ????????

??????????? ?????????? ?????? ????????????????:

???-> Tables and indexes are stored in data files, which are divided into segments.

???-> Each segment corresponds to a table or index and is further divided into extents and pages.

?????????????????????????:

???-> Some databases use tablespaces to group related data files.

???-> A tablespace is a logical container for segments and maps to one or more physical data files.

????????? ?????????? (??????????-?????????? ?????????????? - ??????):

???-> Log files ensure durability and recovery by recording changes before they are applied to data files.

???-> In case of a crash, the database replays the log to restore consistency.

??????????????? ???????? (???????????? ??????????):

???-> Frequently accessed pages are stored in a buffer pool (memory cache) to reduce disk I/O.

???-> If data is not in the buffer pool, it is fetched from disk and loaded into memory.

?????? ???????????????????????????? ?????? ??????????????

????????????????????????? ??????????-????????:

???-> Row-oriented storage is better for transactional workloads, while column-oriented storage excels in analytical workloads.

???-> Indexes improve query performance but add overhead for write operations.

????????????????? ????????????????????:

???-> Page-based storage ensures efficient use of disk space and memory.

???-> Compression techniques may be used to reduce the size of data and indexes.

??????????????????????? ?????? ????????????????:

???-> Write-ahead logging ensures data integrity and recovery in case of system failures.

???-> Regular backups and checkpoints help maintain data consistency.

要查看或添加评论,请登录

Jairaj Sahgal的更多文章

社区洞察

其他会员也浏览了