7.4 Binary search tree
As shown in Figure 7-16, a binary search tree satisfies the following conditions.
- For the root node, the value of every node in the left subtree \(<\) the value of the root node \(<\) the value of every node in the right subtree.
- The left and right subtrees of any node are also binary search trees, i.e., they satisfy the first condition as well.
Figure 7-16 Binary search tree
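The two conditions above can be checked recursively by passing down the range of values a subtree is allowed to contain. The following is a minimal sketch; the `TreeNode` class and the `is_bst` helper are illustrative assumptions, not part of the chapter's own code.

```python
from __future__ import annotations


class TreeNode:
    """Binary tree node (hypothetical minimal definition for this sketch)"""

    def __init__(self, val: int):
        self.val: int = val
        self.left: TreeNode | None = None
        self.right: TreeNode | None = None


def is_bst(
    node: TreeNode | None,
    low: float = float("-inf"),
    high: float = float("inf"),
) -> bool:
    """Return True if every value in the subtree lies strictly between low and high"""
    if node is None:
        return True
    if not (low < node.val < high):
        return False
    # Values in the left subtree must stay below node.val,
    # values in the right subtree must stay above it
    return is_bst(node.left, low, node.val) and is_bst(node.right, node.val, high)


# A small valid tree: 4 is the root, 2 and 6 are its children, 3 is under 2
root = TreeNode(4)
root.left, root.right = TreeNode(2), TreeNode(6)
root.left.right = TreeNode(3)
valid = is_bst(root)

# Break the property: 5 now sits inside the left subtree of 4
root.left.right.val = 5
invalid = is_bst(root)
```

Note that the second tree still satisfies "left child < parent < right child" at every node; it fails only because the condition applies to whole subtrees.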
7.4.1 Operations on a binary search tree
We encapsulate the binary search tree as a class BinarySearchTree and declare a member variable root pointing to the tree's root node.
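As a rough sketch of that encapsulation, the class only needs to hold a reference to the root. The `TreeNode` definition and the `get_root` accessor are assumptions made for illustration; the Python listings in this section store the root as `_root`, and the sketch follows that naming.

```python
from __future__ import annotations


class TreeNode:
    """Binary tree node (hypothetical minimal definition for this sketch)"""

    def __init__(self, val: int):
        self.val: int = val
        self.left: TreeNode | None = None
        self.right: TreeNode | None = None


class BinarySearchTree:
    """Binary search tree wrapper holding only the root reference"""

    def __init__(self):
        # An empty tree has no root node
        self._root: TreeNode | None = None

    def get_root(self) -> TreeNode | None:
        """Return the root node, or None if the tree is empty"""
        return self._root


tree = BinarySearchTree()
```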
1. Searching for a node
Given a target node value num, we can search according to the properties of the binary search tree. As shown in Figure 7-17, we declare a node cur, start from the tree's root node root, and loop, comparing the node value cur.val with num in each iteration.
- If cur.val < num, the target node is in cur's right subtree, so execute cur = cur.right.
- If cur.val > num, the target node is in cur's left subtree, so execute cur = cur.left.
- If cur.val = num, the target node has been found, so exit the loop and return the node.
Figure 7-17 Example of searching for a node in a binary search tree
The search operation in a binary search tree works on the same principle as the binary search algorithm: each round discards one subtree of the current node, which is about half of the remaining nodes when the tree is balanced. The number of loop iterations is bounded by the height of the binary tree. When the binary tree is balanced, the search uses \(O(\log n)\) time. The example code is as follows:
def search(self, num: int) -> TreeNode | None:
    """Search node"""
    cur = self._root
    # Loop find, break after passing leaf nodes
    while cur is not None:
        # Target node is in cur's right subtree
        if cur.val < num:
            cur = cur.right
        # Target node is in cur's left subtree
        elif cur.val > num:
            cur = cur.left
        # Found target node, break loop
        else:
            break
    return cur
/* Search node */
TreeNode *search(int num) {
    TreeNode *cur = root;
    // Loop find, break after passing leaf nodes
    while (cur != nullptr) {
        // Target node is in cur's right subtree
        if (cur->val < num)
            cur = cur->right;
        // Target node is in cur's left subtree
        else if (cur->val > num)
            cur = cur->left;
        // Found target node, break loop
        else
            break;
    }
    // Return target node
    return cur;
}
/* Search node */
TreeNode search(int num) {
    TreeNode cur = root;
    // Loop find, break after passing leaf nodes
    while (cur != null) {
        // Target node is in cur's right subtree
        if (cur.val < num)
            cur = cur.right;
        // Target node is in cur's left subtree
        else if (cur.val > num)
            cur = cur.left;
        // Found target node, break loop
        else
            break;
    }
    // Return target node
    return cur;
}
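To make the "bounded by the height" claim concrete, the sketch below runs the same loop on a hand-built tree and counts how many nodes are visited. The standalone function `search_with_steps`, the `TreeNode` class, and the sample values are assumptions chosen for illustration.

```python
from __future__ import annotations


class TreeNode:
    """Binary tree node (hypothetical minimal definition for this sketch)"""

    def __init__(self, val: int):
        self.val: int = val
        self.left: TreeNode | None = None
        self.right: TreeNode | None = None


def search_with_steps(root: TreeNode | None, num: int) -> tuple[TreeNode | None, int]:
    """Search for num and also return the number of nodes visited"""
    cur, steps = root, 0
    while cur is not None:
        steps += 1
        if cur.val < num:
            cur = cur.right  # target can only be in the right subtree
        elif cur.val > num:
            cur = cur.left  # target can only be in the left subtree
        else:
            break  # found the target
    return cur, steps


# Balanced tree with 7 nodes and 3 levels:
#         8
#      4     12
#     2 6  10  14
root = TreeNode(8)
root.left, root.right = TreeNode(4), TreeNode(12)
root.left.left, root.left.right = TreeNode(2), TreeNode(6)
root.right.left, root.right.right = TreeNode(10), TreeNode(14)

node, steps = search_with_steps(root, 6)          # visits 8, 4, 6
missing, miss_steps = search_with_steps(root, 7)  # visits 8, 4, 6, then falls off the tree
```

In both calls the loop visits three nodes, one per level, even though the tree holds seven values.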
2. Inserting a node
Given an element num to be inserted, to maintain the binary search tree property "left subtree < root node < right subtree," the insertion operation proceeds as shown in Figure 7-18.
- Find the insertion position: Similar to the search operation, start from the root node and loop downward according to the size relationship between the current node value and num, until the leaf node is passed (traversed to None), then exit the loop.
- Insert the node at this position: Initialize a node with value num and place it where None was.
Figure 7-18 Inserting a node into a binary search tree
In the code implementation, note the following two points.
- The binary search tree does not allow duplicate nodes; otherwise, its definition would be violated. Therefore, if the value to be inserted already exists in the tree, the insertion is skipped and the function returns directly.
- To perform the insertion, we use the node pre to save the node from the previous loop iteration. This way, when cur reaches None, pre holds its parent node, which lets us complete the insertion.
def insert(self, num: int):
    """Insert node"""
    # If tree is empty, initialize root node
    if self._root is None:
        self._root = TreeNode(num)
        return
    # Loop find, break after passing leaf nodes
    cur, pre = self._root, None
    while cur is not None:
        # Found duplicate node, thus return
        if cur.val == num:
            return
        pre = cur
        # Insertion position is in cur's right subtree
        if cur.val < num:
            cur = cur.right
        # Insertion position is in cur's left subtree
        else:
            cur = cur.left
    # Insert node
    node = TreeNode(num)
    if pre.val < num:
        pre.right = node
    else:
        pre.left = node
/* Insert node */
void insert(int num) {
    // If tree is empty, initialize root node
    if (root == nullptr) {
        root = new TreeNode(num);
        return;
    }
    TreeNode *cur = root, *pre = nullptr;
    // Loop find, break after passing leaf nodes
    while (cur != nullptr) {
        // Found duplicate node, thus return
        if (cur->val == num)
            return;
        pre = cur;
        // Insertion position is in cur's right subtree
        if (cur->val < num)
            cur = cur->right;
        // Insertion position is in cur's left subtree
        else
            cur = cur->left;
    }
    // Insert node
    TreeNode *node = new TreeNode(num);
    if (pre->val < num)
        pre->right = node;
    else
        pre->left = node;
}
/* Insert node */
void insert(int num) {
    // If tree is empty, initialize root node
    if (root == null) {
        root = new TreeNode(num);
        return;
    }
    TreeNode cur = root, pre = null;
    // Loop find, break after passing leaf nodes
    while (cur != null) {
        // Found duplicate node, thus return
        if (cur.val == num)
            return;
        pre = cur;
        // Insertion position is in cur's right subtree
        if (cur.val < num)
            cur = cur.right;
        // Insertion position is in cur's left subtree
        else
            cur = cur.left;
    }
    // Insert node
    TreeNode node = new TreeNode(num);
    if (pre.val < num)
        pre.right = node;
    else
        pre.left = node;
}
Similar to searching for a node, inserting a node uses \(O(\log n)\) time when the tree is balanced.
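The duplicate rule is easy to observe with a small self-contained variant of the insertion routine. In this sketch, the `bool` return value and the `size` counter are additions made purely for illustration; the listings above return nothing.

```python
from __future__ import annotations


class TreeNode:
    """Binary tree node (hypothetical minimal definition for this sketch)"""

    def __init__(self, val: int):
        self.val: int = val
        self.left: TreeNode | None = None
        self.right: TreeNode | None = None


class BinarySearchTree:
    def __init__(self):
        self._root: TreeNode | None = None
        self.size = 0  # assumption: node counter added for this sketch

    def insert(self, num: int) -> bool:
        """Insert num; return False if it was already present"""
        if self._root is None:
            self._root = TreeNode(num)
            self.size = 1
            return True
        cur, pre = self._root, None
        while cur is not None:
            if cur.val == num:
                return False  # duplicate, leave the tree unchanged
            pre = cur  # remember the parent before moving down
            cur = cur.right if cur.val < num else cur.left
        node = TreeNode(num)
        if pre.val < num:
            pre.right = node
        else:
            pre.left = node
        self.size += 1
        return True


tree = BinarySearchTree()
results = [tree.insert(x) for x in [4, 2, 6, 2, 5]]
```

The second attempt to insert 2 is rejected, so the tree ends up with four nodes: 4 at the root, 2 on its left, 6 on its right, and 5 as the left child of 6.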
3. Removing a node
First, find the target node in the binary tree, then remove it. Similar to inserting a node, we need to ensure that after the removal operation is completed, the property of the binary search tree "left subtree < root node < right subtree" is still satisfied. Therefore, based on the number of child nodes of the target node, we divide it into three cases: 0, 1, and 2, and perform the corresponding node removal operations.
As shown in Figure 7-19, when the degree of the node to be removed is \(0\), it means the node is a leaf node and can be directly removed.
Figure 7-19 Removing a node in a binary search tree (degree 0)
As shown in Figure 7-20, when the degree of the node to be removed is \(1\), replacing the node to be removed with its child node is sufficient.
Figure 7-20 Removing a node in a binary search tree (degree 1)
When the degree of the node to be removed is \(2\), we cannot remove it directly, but need to use a node to replace it. To maintain the property of the binary search tree "left subtree \(<\) root node \(<\) right subtree," this node can be either the smallest node of the right subtree or the largest node of the left subtree.
Assuming we choose the smallest node of the right subtree (the next node in in-order traversal), then the removal operation proceeds as shown in Figure 7-21.
- Find the next node in the in-order traversal sequence of the node to be removed, denoted as tmp.
- Replace the value of the node to be removed with tmp's value, and recursively remove the node tmp from the tree.
Figure 7-21 Removing a node in a binary search tree (degree 2)
The operation of removing a node also uses \(O(\log n)\) time, where finding the node to be removed requires \(O(\log n)\) time, and obtaining the in-order traversal successor node requires \(O(\log n)\) time. Example code is as follows:
def remove(self, num: int):
    """Remove node"""
    # If tree is empty, return
    if self._root is None:
        return
    # Loop find, break after passing leaf nodes
    cur, pre = self._root, None
    while cur is not None:
        # Found node to be removed, break loop
        if cur.val == num:
            break
        pre = cur
        # Node to be removed is in cur's right subtree
        if cur.val < num:
            cur = cur.right
        # Node to be removed is in cur's left subtree
        else:
            cur = cur.left
    # If no node to be removed, return
    if cur is None:
        return
    # Number of child nodes = 0 or 1
    if cur.left is None or cur.right is None:
        # When the number of child nodes = 0/1, child = None/that child node
        child = cur.left or cur.right
        # Remove node cur
        if cur != self._root:
            if pre.left == cur:
                pre.left = child
            else:
                pre.right = child
        else:
            # If the removed node is the root, reassign the root
            self._root = child
    # Number of child nodes = 2
    else:
        # Get the next node in in-order traversal of cur
        tmp: TreeNode = cur.right
        while tmp.left is not None:
            tmp = tmp.left
        # Recursively remove node tmp
        self.remove(tmp.val)
        # Replace cur's value with tmp's value
        cur.val = tmp.val
/* Remove node */
void remove(int num) {
    // If tree is empty, return
    if (root == nullptr)
        return;
    TreeNode *cur = root, *pre = nullptr;
    // Loop find, break after passing leaf nodes
    while (cur != nullptr) {
        // Found node to be removed, break loop
        if (cur->val == num)
            break;
        pre = cur;
        // Node to be removed is in cur's right subtree
        if (cur->val < num)
            cur = cur->right;
        // Node to be removed is in cur's left subtree
        else
            cur = cur->left;
    }
    // If no node to be removed, return
    if (cur == nullptr)
        return;
    // Number of child nodes = 0 or 1
    if (cur->left == nullptr || cur->right == nullptr) {
        // When the number of child nodes = 0 / 1, child = nullptr / that child node
        TreeNode *child = cur->left != nullptr ? cur->left : cur->right;
        // Remove node cur
        if (cur != root) {
            if (pre->left == cur)
                pre->left = child;
            else
                pre->right = child;
        } else {
            // If the removed node is the root, reassign the root
            root = child;
        }
        // Free memory
        delete cur;
    }
    // Number of child nodes = 2
    else {
        // Get the next node in in-order traversal of cur
        TreeNode *tmp = cur->right;
        while (tmp->left != nullptr) {
            tmp = tmp->left;
        }
        int tmpVal = tmp->val;
        // Recursively remove node tmp
        remove(tmp->val);
        // Replace cur with tmp
        cur->val = tmpVal;
    }
}
/* Remove node */
void remove(int num) {
    // If tree is empty, return
    if (root == null)
        return;
    TreeNode cur = root, pre = null;
    // Loop find, break after passing leaf nodes
    while (cur != null) {
        // Found node to be removed, break loop
        if (cur.val == num)
            break;
        pre = cur;
        // Node to be removed is in cur's right subtree
        if (cur.val < num)
            cur = cur.right;
        // Node to be removed is in cur's left subtree
        else
            cur = cur.left;
    }
    // If no node to be removed, return
    if (cur == null)
        return;
    // Number of child nodes = 0 or 1
    if (cur.left == null || cur.right == null) {
        // When the number of child nodes = 0/1, child = null/that child node
        TreeNode child = cur.left != null ? cur.left : cur.right;
        // Remove node cur
        if (cur != root) {
            if (pre.left == cur)
                pre.left = child;
            else
                pre.right = child;
        } else {
            // If the removed node is the root, reassign the root
            root = child;
        }
    }
    // Number of child nodes = 2
    else {
        // Get the next node in in-order traversal of cur
        TreeNode tmp = cur.right;
        while (tmp.left != null) {
            tmp = tmp.left;
        }
        // Recursively remove node tmp
        remove(tmp.val);
        // Replace cur with tmp
        cur.val = tmp.val;
    }
}
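The three cases can also be written recursively, with each call returning the new root of the subtree it was given. This is an alternative formulation sketched for illustration, not the iterative method used in the listings above; the helper names `insert`, `remove`, and `inorder` and the sample values are assumptions.

```python
from __future__ import annotations


class TreeNode:
    """Binary tree node (hypothetical minimal definition for this sketch)"""

    def __init__(self, val: int):
        self.val: int = val
        self.left: TreeNode | None = None
        self.right: TreeNode | None = None


def insert(node: TreeNode | None, num: int) -> TreeNode:
    """Insert num into the subtree rooted at node and return the subtree root"""
    if node is None:
        return TreeNode(num)
    if num < node.val:
        node.left = insert(node.left, num)
    elif num > node.val:
        node.right = insert(node.right, num)
    return node  # duplicates are ignored


def remove(node: TreeNode | None, num: int) -> TreeNode | None:
    """Remove num from the subtree rooted at node and return the new subtree root"""
    if node is None:
        return None  # value not found, nothing to do
    if num < node.val:
        node.left = remove(node.left, num)
    elif num > node.val:
        node.right = remove(node.right, num)
    else:
        # Degree 0 or 1: replace the node with its only child (or None)
        if node.left is None:
            return node.right
        if node.right is None:
            return node.left
        # Degree 2: copy the in-order successor's value, then remove the successor
        tmp = node.right
        while tmp.left is not None:
            tmp = tmp.left
        node.val = tmp.val
        node.right = remove(node.right, tmp.val)
    return node


def inorder(node: TreeNode | None) -> list[int]:
    """Return the values of the subtree in in-order sequence"""
    if node is None:
        return []
    return inorder(node.left) + [node.val] + inorder(node.right)


root = None
for x in [8, 4, 12, 2, 6, 10, 14]:
    root = insert(root, x)

root = remove(root, 2)   # degree 0: leaf
after_leaf = inorder(root)
root = remove(root, 4)   # degree 1: replaced by its child 6
after_one_child = inorder(root)
root = remove(root, 8)   # degree 2: root takes the value of its successor 10
after_two_children = inorder(root)
```

After every removal the in-order sequence is still ascending, which is a convenient way to confirm that the binary search tree property has been preserved.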
4. In-order traversal is ordered
As shown in Figure 7-22, the in-order traversal of a binary tree follows the traversal order of "left \(\rightarrow\) root \(\rightarrow\) right," and a binary search tree satisfies the size relationship of "left child node \(<\) root node \(<\) right child node."
This means that when performing in-order traversal in a binary search tree, the next smallest node will always be traversed first, thus leading to an important property: The sequence of in-order traversal in a binary search tree is ascending.
Using the ascending property of in-order traversal, obtaining ordered data in a binary search tree requires only \(O(n)\) time, without the need for additional sorting operations, which is very efficient.
Figure 7-22 In-order traversal sequence of a binary search tree
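The ascending property can be verified with a short sketch: insert values in arbitrary order, then collect them with an in-order traversal. The helper names and the sample data are assumptions for illustration.

```python
from __future__ import annotations


class TreeNode:
    """Binary tree node (hypothetical minimal definition for this sketch)"""

    def __init__(self, val: int):
        self.val: int = val
        self.left: TreeNode | None = None
        self.right: TreeNode | None = None


def insert(root: TreeNode | None, num: int) -> TreeNode:
    """Iteratively insert num and return the (possibly new) root"""
    if root is None:
        return TreeNode(num)
    cur, pre = root, None
    while cur is not None:
        if cur.val == num:
            return root  # duplicate, nothing to insert
        pre = cur
        cur = cur.right if cur.val < num else cur.left
    if pre.val < num:
        pre.right = TreeNode(num)
    else:
        pre.left = TreeNode(num)
    return root


def inorder(node: TreeNode | None, res: list[int]) -> None:
    """Visit left subtree, then the node, then the right subtree"""
    if node is None:
        return
    inorder(node.left, res)
    res.append(node.val)
    inorder(node.right, res)


data = [5, 3, 8, 1, 4, 7, 9]
root = None
for x in data:
    root = insert(root, x)

res: list[int] = []
inorder(root, res)  # each node is visited exactly once, so this takes O(n) time
```

Here `res` ends up as `[1, 3, 4, 5, 7, 8, 9]`, the same result as sorting `data`, without any separate sorting step.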
7.4.2 Efficiency of binary search trees
Given a set of data, we consider using an array or a binary search tree for storage. Observing Table 7-2, the operations on a binary search tree all have logarithmic time complexity, which is stable and efficient. Arrays are more efficient than binary search trees only in scenarios involving frequent additions and infrequent searches or removals.
Table 7-2 Efficiency comparison between arrays and search trees
| | Unsorted array | Binary search tree |
|---|---|---|
| Search element | \(O(n)\) | \(O(\log n)\) |
| Insert element | \(O(1)\) | \(O(\log n)\) |
| Remove element | \(O(n)\) | \(O(\log n)\) |
Ideally, the binary search tree is "balanced," so that any node can be found within \(\log n\) loop iterations.
However, if we continuously insert and remove nodes in a binary search tree, it may degenerate into a linked list as shown in Figure 7-23, where the time complexity of various operations also degrades to \(O(n)\).
Figure 7-23 Degradation of a binary search tree
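One simple way to provoke this degeneration is to insert keys that are already sorted: every new key becomes the right child of the previous one. The sketch below compares the resulting height with that of a tree built from the same keys in a more favorable order. The helper names are assumptions, and height is counted here as the number of edges on the longest root-to-leaf path.

```python
from __future__ import annotations

from typing import Iterable


class TreeNode:
    """Binary tree node (hypothetical minimal definition for this sketch)"""

    def __init__(self, val: int):
        self.val: int = val
        self.left: TreeNode | None = None
        self.right: TreeNode | None = None


def insert(node: TreeNode | None, num: int) -> TreeNode:
    """Insert num into the subtree rooted at node and return the subtree root"""
    if node is None:
        return TreeNode(num)
    if num < node.val:
        node.left = insert(node.left, num)
    elif num > node.val:
        node.right = insert(node.right, num)
    return node


def build(keys: Iterable[int]) -> TreeNode | None:
    """Build a tree by inserting the keys in the given order"""
    root = None
    for key in keys:
        root = insert(root, key)
    return root


def height(node: TreeNode | None) -> int:
    """Number of edges on the longest path from node down to a leaf (-1 for an empty tree)"""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


# Sorted input: each key goes to the right of the previous one, forming a chain
degenerate_height = height(build(range(1, 8)))

# Same 7 keys, inserted so that each subtree's middle value comes first
balanced_height = height(build([4, 2, 6, 1, 3, 5, 7]))
```

With seven keys, the sorted insertion order produces a height of 6, while the second order produces a height of 2, so a search may need to visit up to seven nodes in the first tree but at most three in the second.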
7.4.3 Common applications of binary search trees
- Used as multi-level indexes in systems to implement efficient search, insertion, and removal operations.
- Serves as the underlying data structure for certain search algorithms.
- Used to store data streams to maintain their ordered state.