Sunday, October 28, 2007

B-tree MyISAM index structure

Interesting references still pending to investigate are:

* http://forge.mysql.com/wiki/MySQL_Internals_MyISAM (section "The .MYI Key Values ")

* http://lists.mysql.com/internals/18131

* "Pro MySQL" by Michael Kruckenberg, Jay Pipes, Chad Russell, page 160:

After the header section, the MyISAM index blocks compose the remainder of the .MYI file. The index blocks are 1KB on-disk pages of data, representing the B-tree leaf and non-leaf nodes. A single index block contains only key values pertaining to one index in the table. The header section (detailed in the previous section) contains information about how the MyISAM storage engine should find the root node of each index by supplying the offset for the root node’s index block in the keydef elements.

Each index block contains the following:

A single 2-byte block header. The first bit of the 16 bits in the header indicates whether the block is a leaf node (0 for leaf; 1 for non-leaf). The remaining 15 bits contain the total length of bytes used in the block (non free space).

Following the header, index keys and record identifiers are laid out in a balanced organization (the B-tree format). With each key is stored the key value (of a length equal to the data type of the indexed field) and a 4-byte record pointer.

The remainder of the index block is junk bytes (filler bytes), which become used as the B-tree index “fills out” with inserts. This is the “fill factor” for MyISAM B-tree index pages, and typically represents between 65% and 80% of the data used within the index block under normal operations, to allow for split-free growth along with the insertions.

No comments: