B-Trees, (a,b)-Trees

San Diego State University

A tree T is a B-tree of degree t if

a) All leaves of T have the same depth

b) For all internal nodes v of T except the root we have:

- t <= c(v) <= 2t

c) The root of T satisfies 2 <= c(v) <= 2t

c(v) = number of children of node v
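The three conditions can be checked mechanically. A minimal sketch, assuming a hypothetical `Node` class that stores only a list of children (the names `Node`, `leaf_depths`, and `is_b_tree` are illustrative, not from the lecture):

```python
class Node:
    """Hypothetical minimal tree node: only the child list matters here."""
    def __init__(self, children=None):
        self.children = children or []   # empty list means leaf


def leaf_depths(v, depth=0):
    """Return the set of depths at which leaves occur below v."""
    if not v.children:
        return {depth}
    depths = set()
    for child in v.children:
        depths |= leaf_depths(child, depth + 1)
    return depths


def is_b_tree(root, t):
    """Check conditions a), b), c) of the definition for degree t."""
    if len(leaf_depths(root)) != 1:                 # a) leaves share one depth
        return False
    if root.children and not (2 <= len(root.children) <= 2 * t):   # c) root
        return False
    stack = list(root.children)
    while stack:
        v = stack.pop()
        if v.children:                              # internal, non-root node
            if not (t <= len(v.children) <= 2 * t): # b) t <= c(v) <= 2t
                return False
            stack.extend(v.children)
    return True
```

A root with two internal children, each with two leaves, passes for t = 2; making one child a leaf breaks condition a).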

For all internal nodes v of T except the root we have:

- t <= c(v) <= 2t

All internal nodes of T except the root have between

- t-1 and 2t-1 keys

For all internal nodes v of T except the root we have:

- t+1 <= c(v) <= 2t + 1

For all internal nodes v of T except the root we have:

- t/2 <= c(v) <= t

For all internal nodes v of T except the root we have:

- ⌈t/2⌉ <= c(v) <= t

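Theorem. Let T be a B-tree of degree t >= 2 in which every node except the root holds at least t-1 keys. If T has n >= 1 keys and height h, then h <= log_{t}( (n+1)/2 ).

proof.

The root holds at least one key and every other node at least t-1 keys, so there are at least 2t^{i-1} nodes at depth i >= 1. Hence

$$n \ge 1 + (t-1) \sum_{i=1}^{h} 2t^{i-1} = 2t^{h} - 1$$

$$t^{h} \le \frac{n+1}{2}$$

take log of both sides.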

| t | N | # of Levels |
|---|---|---|
| 256 | 33,000,000 | 4 |
| 256 | 8,550,000,000 | 5 |
| 128 | 4,100,000 | 4 |
| 128 | 530,000,000 | 5 |
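The table entries can be approximated with a short calculation. This sketch assumes N is the minimum number of keys in a tree with the given number of levels, with one key in the root and t-1 keys in every other node, so N = 2t^h - 1 where h = levels - 1. The function name `min_keys` is illustrative.

```python
def min_keys(t, levels):
    """Minimum number of keys in a B-tree of degree t with the given levels.

    Assumes the root holds one key and all other nodes hold t-1 keys,
    which gives 2*t**h - 1 keys for height h = levels - 1.
    """
    h = levels - 1
    return 2 * t**h - 1


for t, levels in [(256, 4), (256, 5), (128, 4), (128, 5)]:
    print(t, levels, f"{min_keys(t, levels):,}")
```

The exact values come out slightly above the rounded figures in the table.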

- A node in T has t-1 <= K <= 2t-1 keys in sorted order.
- Worst case:
  - K = t-1 for all nodes
  - searching for X not in the tree
- Given a node, W, in T, how much work does it take to find the subtree of W that would contain X?
- Using binary search it takes O( lg K ) = O( lg( t-1 ) ) = O( lg t ) comparisons
- Since the height of the tree is in the worst case O( log_{t} n ), the total amount of work is: O( lg t * log_{t} n ) = O( lg n )
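The search described above can be sketched as follows, assuming a hypothetical `BTreeNode` class with a sorted `keys` list and a `children` list that is empty for leaves. Binary search within a node uses the standard library's `bisect_left`.

```python
from bisect import bisect_left


class BTreeNode:
    """Hypothetical node: sorted keys plus children (empty list for a leaf)."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = children or []


def search(node, x):
    """Return (node, index) where x is stored, or None if x is absent."""
    while True:
        i = bisect_left(node.keys, x)        # binary search inside the node
        if i < len(node.keys) and node.keys[i] == x:
            return node, i
        if not node.children:                # reached a leaf without finding x
            return None
        node = node.children[i]              # subtree of W that would contain x
```

Each loop iteration costs one binary search, and the loop runs once per level, which is where the product of the two bounds comes from.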

A full node is one that contains 2t-1 keys

1. Find the leaf that should contain X

2. If the path from the root to the leaf contains a full node, then split the node when you first search it.

3. Insert X into the proper leaf
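The three steps above can be sketched as a single top-down pass. This is a minimal illustration, assuming a hypothetical `BTreeNode` class, distinct keys, and nodes that hold between t-1 and 2t-1 keys; the helper names `split_child`, `insert`, and `in_order` are not from the lecture.

```python
from bisect import bisect_left


class BTreeNode:
    """Hypothetical node: sorted keys plus children (empty list for a leaf)."""
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []


def split_child(parent, i, t):
    """Split the full child parent.children[i] (2t-1 keys) around its median."""
    child = parent.children[i]
    right = BTreeNode(child.keys[t:], child.children[t:])
    median = child.keys[t - 1]
    child.keys = child.keys[:t - 1]
    child.children = child.children[:t]
    parent.keys.insert(i, median)            # median moves up into the parent
    parent.children.insert(i + 1, right)


def insert(root, x, t):
    """Insert x (assumed not present) and return the possibly new root."""
    if len(root.keys) == 2 * t - 1:          # full root: tree grows one level
        root = BTreeNode(children=[root])
        split_child(root, 0, t)
    node = root
    while node.children:                     # step 1: walk toward the leaf
        i = bisect_left(node.keys, x)
        if len(node.children[i].keys) == 2 * t - 1:
            split_child(node, i, t)          # step 2: split on first visit
            if x > node.keys[i]:
                i += 1
        node = node.children[i]
    node.keys.insert(bisect_left(node.keys, x), x)   # step 3: leaf insert
    return root


def in_order(node):
    """Return all keys in sorted order (used only to inspect the result)."""
    if not node.children:
        return list(node.keys)
    out = []
    for i, k in enumerate(node.keys):
        out += in_order(node.children[i]) + [k]
    return out + in_order(node.children[-1])
```

Splitting full nodes on the way down means the parent of a node being split is never full, so no second pass back up the tree is needed.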

A minimal node is one that contains t-1 keys and is not the root

On the search path from the root to the node containing X, if you come across a minimal node, add a key to it.

Case 3. Searching node W that does not contain X. Let c be the child of W that would contain X.

Case 3a. If c has t-1 keys and a sibling has t or more keys, steal a key from the sibling

Case 2a. X is in internal node W. If the child y of W that precedes X in W has at least t keys, replace X with its predecessor, taken from the subtree rooted at y

Case 1. X is in node W, a leaf. By case 3, W has at least t keys. Remove X from W
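Case 3a is usually implemented as a rotation through the parent: the separator key in W moves down into c, and the sibling's nearest key moves up to replace it. A minimal sketch under that assumption, with a hypothetical `BTreeNode` class and helper name `steal_from_sibling`:

```python
class BTreeNode:
    """Hypothetical node: sorted keys plus children (empty list for a leaf)."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = children or []


def steal_from_sibling(w, i, t):
    """Give w.children[i] one more key from an adjacent sibling with >= t keys.

    Returns True if a key was moved, False if neither sibling can spare one.
    """
    c = w.children[i]
    if i > 0 and len(w.children[i - 1].keys) >= t:
        left = w.children[i - 1]
        c.keys.insert(0, w.keys[i - 1])      # separator moves down into c
        w.keys[i - 1] = left.keys.pop()      # left sibling's largest moves up
        if left.children:                    # move the matching subtree too
            c.children.insert(0, left.children.pop())
        return True
    if i + 1 < len(w.children) and len(w.children[i + 1].keys) >= t:
        right = w.children[i + 1]
        c.keys.append(w.keys[i])             # separator moves down into c
        w.keys[i] = right.keys.pop(0)        # right sibling's smallest moves up
        if right.children:
            c.children.append(right.children.pop(0))
        return True
    return False
```

When the function returns False, both neighbours are minimal, and the remaining part of case 3 (not shown here) applies.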

Theorem. A Red-Black tree is a B-Tree with degree 2

proof:

Must show:

1. If a node is red, then both its children are black
2. Every simple path from a node to a descendant leaf contains the same number of black nodes

Data is stored in leaves. Internal nodes are used to index into leaves.

We will assume items of interest are stored in the leaves, but this is not required

A leaf contains one key

Internal nodes contain keys used to find leaves

Let a and b be integers with a >= 2 and 2a-1 <= b. A tree T is an (a, b)-tree if

a) All leaves of T have the same depth

b) All internal nodes v of T except the root satisfy a <= c(v) <= b

c) The root of T satisfies 2 <= c(v) <= b

c(v) = number of children of node v

Let T be an ( a, b )-tree with n leaves and height h. Then:

a) 2a^{h-1} <= n <= b^{h}

b) lg( n ) / lg( b ) <= h <= 1 + lg( n/2 ) / lg( a )
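A small numeric check of the height bounds in b). The function name `height_bounds` is illustrative; the bounds are real numbers, and the actual height is an integer between them.

```python
from math import log2


def height_bounds(n, a, b):
    """Return (lower, upper) bounds on the height of an (a, b)-tree with n leaves."""
    lower = log2(n) / log2(b)
    upper = 1 + log2(n / 2) / log2(a)
    return lower, upper


# Example: a (2, 4)-tree with 1024 leaves.
low, high = height_bounds(1024, 2, 4)
print(low, high)
```

For large a the two bounds are close together, which is why trees with high fan-out are so shallow.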

- SP = total number of node splittings
- F = total number of node fusings
- SH = total number of node sharings

then:

This is not true when b = 2a - 1, which is the case for certain definitions of B-trees!

Assume b = 2a

Assume it costs C_{1} + C_{2}a to bring a node into main memory

Assume it costs K_{1} + K_{2}a to search a node

Total search time in an (a, b)-tree will be bounded by

- Time to search one node * number of levels
- ( K_{1} + K_{2}a + C_{1} + C_{2}a ) lg( n ) / lg( a )

This is minimal when

- a ( ln( a ) - 1 ) = ( K_{1} + C_{1} ) / ( K_{2} + C_{2} )

For the cost constants considered, this gives a ~ 100
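The optimal a has no closed form, but it is easy to find numerically. Writing the per-level cost as (A + B a) / ln( a ) with A = K_{1} + C_{1} and B = K_{2} + C_{2}, setting the derivative to zero gives a ( ln( a ) - 1 ) = A / B. The sketch below solves that equation by bisection; the function name `optimal_a` and the search interval are assumptions, and the ratio used in the example is chosen only so that the answer is 100.

```python
from math import log


def optimal_a(ratio, lo=3.0, hi=1e9):
    """Solve a * (ln a - 1) = ratio for a by bisection.

    The left-hand side is increasing for a > 1, so bisection applies.
    Assumes ratio lies between the values at lo and hi.
    """
    def g(a):
        return a * (log(a) - 1)

    for _ in range(200):
        mid = (lo + hi) / 2
        if g(mid) < ratio:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# A ratio of about 360.5 corresponds to a = 100.
ratio = 100 * (log(100) - 1)
print(ratio, optimal_a(ratio))
```

Because the left-hand side grows only slightly faster than linearly, the optimal a is fairly insensitive to small changes in the cost constants.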

Let leaves be level 0 ( just for this slide )

Parents of leaves be level 1, ...

- SP_{h} = total number of node splittings at height h
- F_{h} = total number of node fusings at height h
- SH_{h} = total number of node sharings at height h

then:

- SP_{h} + SH_{h} + F_{h} <= 2( c + 2 ) n / ( c + 1 )^{h}

A = access by a separate processor

=node locked as processor changes node

AB = access blocked by locked node

A-sort (next slides) uses the fact that the action is near the leaves

Let x[1], x[2], ..., x[n] be a sequence to be sorted

Let f[k] = | { x[j] : j > k and x[j] < x[k] } |

Let F = f[1] + f[2] + ... + f[n]

F is the number of inversions of x[1], x[2], ..., x[n]

1 2 7 3 4 5 9 6 8 has 6 inversions

1) 0 <= F <= N*(N-1)/2 for a list of N items

2) Let F = number of inversions of a list A of n items. Insertion sort takes

O( n + F ) operations to sort A
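Both facts can be checked directly. The sketch below counts inversions by brute force and counts the element shifts made by a textbook insertion sort; the function names are illustrative. Each shift removes exactly one inversion, so the two counts agree.

```python
def count_inversions(x):
    """Count pairs (k, j) with j > k and x[j] < x[k] (brute force, O(n^2))."""
    n = len(x)
    return sum(1 for k in range(n) for j in range(k + 1, n) if x[j] < x[k])


def insertion_sort_shifts(x):
    """Insertion-sort a copy of x; return (sorted list, number of shifts)."""
    a = list(x)
    shifts = 0
    for k in range(1, len(a)):
        item = a[k]
        j = k - 1
        while j >= 0 and a[j] > item:    # each shift removes one inversion
            a[j + 1] = a[j]
            j -= 1
            shifts += 1
        a[j + 1] = item
    return a, shifts


example = [1, 2, 7, 3, 4, 5, 9, 6, 8]
print(count_inversions(example))          # 6, as on the slide
print(insertion_sort_shifts(example))
```

A reversed list of N items has N(N-1)/2 inversions, which is the worst case for insertion sort.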

Sort x[1], x[2], ..., x[n] by inserting into a ( a, b )-Tree

Insert x[1], then x[2], then x[3], ... into the tree

When inserting x[k] need to find the proper location for x[k]

Don't start the search at the root

Start the search at the "right most" internal node

This process is called A-sort
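The search strategy can be illustrated without building an (a, b)-tree. The sketch below is a stand-in, not A-sort itself: it keeps the sorted prefix in a Python list and finds the insertion point by galloping left from the right end, so the number of probes for x[k] grows with the logarithm of how far x[k] lands from the end. The list insert still costs O(n), which an (a, b)-tree avoids. The function name `a_sort_sketch` is illustrative.

```python
from bisect import bisect_right


def a_sort_sketch(xs):
    """Sort xs by repeated insertion, searching from the right end."""
    out = []                          # sorted prefix; stands in for the leaves
    for x in xs:
        n = len(out)
        step = 1
        # Gallop left from the right end: probe positions n-1, n-2, n-4, ...
        while step <= n and out[n - step] > x:
            step *= 2
        lo = max(0, n - step)         # everything left of lo is <= x
        hi = n - step // 2            # everything from hi onward is > x
        pos = bisect_right(out, x, lo, hi)
        out.insert(pos, x)            # O(n) shift here, unlike in a tree
    return out


print(a_sort_sketch([1, 2, 7, 3, 4, 5, 9, 6, 8]))
```

For a nearly sorted input most elements land at or near the right end, so most searches finish after one or two probes.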

Theorem[3] A sequence of n elements with F inversions can be sorted using A-sort in:

- O( n + n lg( F / n ) )

Theorem[4] A-sort is better than quicksort for lists with number of inversions F <= 0.02N