I l@ve RuBoard |
17.4 Binary Search TreesBinary trees are a data structure that impose an order on inserted nodes: items less than a node are stored in its left subtree, and items greater than a node are inserted in the right. At the bottom, the subtrees are empty. Because of this structure, binary trees naturally support quick, recursive traversals -- at least ideally, every time you follow a link to a subtree, you divide the search space in half.[3]
Binary trees are named for the implied branch-like structure of their subtree links. Typically, their nodes are implemented as a triple of values: (LeftSubtree, NodeValue, RightSubtree). Beyond that, there is fairly wide latitude in the tree implementation. Here we'll use a class-based approach:
Rather than coding distinct search functions, binary trees are constructed with "smart" objects (class instances) that know how to handle insert/lookup and printing requests and pass them to subtree objects. In fact, this is another example of the OOP composition relationship in action: tree nodes embed other tree nodes and pass search requests off to the embedded subtrees. A single empty class instance is shared by all empty subtrees in a binary tree, and inserts replace an EmptyNode with a BinaryNode at the bottom (see Example 17-13). Example 17-13. PP2E\Dstruct\Classics\btree.pyclass BinaryTree: def __init__(self): self.tree = EmptyNode( ) def __repr__(self): return `self.tree` def lookup(self, value): return self.tree.lookup(value) def insert(self, value): self.tree = self.tree.insert(value) class EmptyNode: def __repr__(self): return '*' def lookup(self, value): # fail at the bottom return 0 def insert(self, value): return BinaryNode(self, value, self) # add new node at bottom class BinaryNode: def __init__(self, left, value, right): self.data, self.left, self.right = value, left, right def lookup(self, value): if self.data == value: return 1 elif self.data > value: return self.left.lookup(value) # look in left else: return self.right.lookup(value) # look in right def insert(self, value): if self.data > value: self.left = self.left.insert(value) # grow in left elif self.data < value: self.right = self.right.insert(value) # grow in right return self def __repr__(self): return '( %s, %s, %s )' % (`self.left`, `self.data`, `self.right`) As usual, BinaryTree can contain objects of any type that support the expected interface protocol -- here, > and < comparisons. This includes class instances with the __cmp__ method. Let's experiment with this module's interfaces. The following code stuffs five integers into a new tree, and then searches for values 0...9: C:\...\PP2E\Dstruct\Classics>python >>> from btree import BinaryTree >>> x = BinaryTree( ) >>> for i in [3,1,9,2,7]: x.insert(i) ... >>> for i in range(10): print (i, x.lookup(i)), ... (0, 0) (1, 1) (2, 1) (3, 1) (4, 0) (5, 0) (6, 0) (7, 1) (8, 0) (9, 1) To watch this tree grow, add a print statement after each insert. Tree nodes print themselves as triples, and empty nodes print as *. The result reflects tree nesting: >>> y = BinaryTree( ) >>> y * >>> for i in [3,1,9,2,7]: ... y.insert(i); print y ... ( *, 3, * ) ( ( *, 1, * ), 3, * ) ( ( *, 1, * ), 3, ( *, 9, * ) ) ( ( *, 1, ( *, 2, * ) ), 3, ( *, 9, * ) ) ( ( *, 1, ( *, 2, * ) ), 3, ( ( *, 7, * ), 9, * ) ) At the end of this chapter, we'll see another way to visualize trees in a GUI (which means you're invited to flip ahead now). Node values in this tree object can be any comparable Python object; for instances, here is a tree of strings: >>> z = BinaryTree( ) >>> for c in 'badce': z.insert(c) ... >>> z ( ( *, 'a', * ), 'b', ( ( *, 'c', * ), 'd', ( *, 'e', * ) ) ) >>> z = BinaryTree( ) >>> for c in 'abcde': z.insert(c) ... >>> z ( *, 'a', ( *, 'b', ( *, 'c', ( *, 'd', ( *, 'e', * ) ) ) ) ) Notice the last result here: if items inserted into a binary tree are already ordered, then you wind up with a linear structure, and you lose the search-space partitioning magic of binary trees (the tree grows in right branches only). This is a worst-case scenario, and binary trees generally do a good job of dividing up values in practice. But if you are interested in pursuing this topic further, see a data structures text for tree-balancing techniques that automatically keep the tree as dense as possible. Also note that to keep the code simple, these trees store a value only, and lookups return a 1 or (true or false). In practice, you sometimes may want to store both a key and an associated value (or even more) at each tree node. Example 17-14 shows what such a tree object looks like, for any prospective lumberjacks in the audience. Example 17-14. PP2E\Dstruct\Classics\btree-keys.pyclass KeyedBinaryTree: def __init__(self): self.tree = EmptyNode( ) def __repr__(self): return `self.tree` def lookup(self, key): return self.tree.lookup(key) def insert(self, key, val): self.tree = self.tree.insert(key, val) class EmptyNode: def __repr__(self): return '*' def lookup(self, key): # fail at the bottom return None def insert(self, key, val): return BinaryNode(self, key, val, self) # add node at bottom class BinaryNode: def __init__(self, left, key, val, right): self.key, self.val = key, val self.left, self.right = left, right def lookup(self, key): if self.key == key: return self.val elif self.key > key: return self.left.lookup(key) # look in left else: return self.right.lookup(key) # look in right def insert(self, key, val): if self.key == key: self.val = val elif self.key > key: self.left = self.left.insert(key, val) # grow in left elif self.key < key: self.right = self.right.insert(key, val) # grow in right return self def __repr__(self): return ('( %s, %s=%s, %s )' % (`self.left`, `self.key`, `self.val`, `self.right`)) if __name__ == '__main__': t = KeyedBinaryTree( ) for (key, val) in [('bbb', 1), ('aaa', 2), ('ccc', 3)]: t.insert(key, val) print t print t.lookup('aaa'), t.lookup('ccc') t.insert('ddd', 4) t.insert('aaa', 5) # changes key's value print t And here is this script's self-test code at work; nodes simply have more content this time around: C:\...\PP2E\Dstruct\Classics>python btree-keys.py ( ( *, 'aaa'=2, * ), 'bbb'=1, ( *, 'ccc'=3, * ) ) 2 3 ( ( *, 'aaa'=5, * ), 'bbb'=1, ( *, 'ccc'=3, ( *, 'ddd'=4, * ) ) ) |
I l@ve RuBoard |