| I l@ve RuBoard |
|
17.4 Binary Search TreesBinary trees are a data structure that impose an order on inserted nodes: items less than a node are stored in its left subtree, and items greater than a node are inserted in the right. At the bottom, the subtrees are empty. Because of this structure, binary trees naturally support quick, recursive traversals -- at least ideally, every time you follow a link to a subtree, you divide the search space in half.[3]
Binary trees are named for the implied branch-like structure of their subtree links. Typically, their nodes are implemented as a triple of values: (LeftSubtree, NodeValue, RightSubtree). Beyond that, there is fairly wide latitude in the tree implementation. Here we'll use a class-based approach:
Rather than coding distinct search functions, binary trees are constructed with "smart" objects (class instances) that know how to handle insert/lookup and printing requests and pass them to subtree objects. In fact, this is another example of the OOP composition relationship in action: tree nodes embed other tree nodes and pass search requests off to the embedded subtrees. A single empty class instance is shared by all empty subtrees in a binary tree, and inserts replace an EmptyNode with a BinaryNode at the bottom (see Example 17-13). Example 17-13. PP2E\Dstruct\Classics\btree.pyclass BinaryTree:
def __init__(self): self.tree = EmptyNode( )
def __repr__(self): return `self.tree`
def lookup(self, value): return self.tree.lookup(value)
def insert(self, value): self.tree = self.tree.insert(value)
class EmptyNode:
def __repr__(self):
return '*'
def lookup(self, value): # fail at the bottom
return 0
def insert(self, value):
return BinaryNode(self, value, self) # add new node at bottom
class BinaryNode:
def __init__(self, left, value, right):
self.data, self.left, self.right = value, left, right
def lookup(self, value):
if self.data == value:
return 1
elif self.data > value:
return self.left.lookup(value) # look in left
else:
return self.right.lookup(value) # look in right
def insert(self, value):
if self.data > value:
self.left = self.left.insert(value) # grow in left
elif self.data < value:
self.right = self.right.insert(value) # grow in right
return self
def __repr__(self):
return '( %s, %s, %s )' % (`self.left`, `self.data`, `self.right`)
As usual, BinaryTree can contain objects of any type that support the expected interface protocol -- here, > and < comparisons. This includes class instances with the __cmp__ method. Let's experiment with this module's interfaces. The following code stuffs five integers into a new tree, and then searches for values 0...9: C:\...\PP2E\Dstruct\Classics>python >>> from btree import BinaryTree >>> x = BinaryTree( ) >>> for i in [3,1,9,2,7]: x.insert(i) ... >>> for i in range(10): print (i, x.lookup(i)), ... (0, 0) (1, 1) (2, 1) (3, 1) (4, 0) (5, 0) (6, 0) (7, 1) (8, 0) (9, 1) To watch this tree grow, add a print statement after each insert. Tree nodes print themselves as triples, and empty nodes print as *. The result reflects tree nesting: >>> y = BinaryTree( ) >>> y * >>> for i in [3,1,9,2,7]: ... y.insert(i); print y ... ( *, 3, * ) ( ( *, 1, * ), 3, * ) ( ( *, 1, * ), 3, ( *, 9, * ) ) ( ( *, 1, ( *, 2, * ) ), 3, ( *, 9, * ) ) ( ( *, 1, ( *, 2, * ) ), 3, ( ( *, 7, * ), 9, * ) ) At the end of this chapter, we'll see another way to visualize trees in a GUI (which means you're invited to flip ahead now). Node values in this tree object can be any comparable Python object; for instances, here is a tree of strings: >>> z = BinaryTree( ) >>> for c in 'badce': z.insert(c) ... >>> z ( ( *, 'a', * ), 'b', ( ( *, 'c', * ), 'd', ( *, 'e', * ) ) ) >>> z = BinaryTree( ) >>> for c in 'abcde': z.insert(c) ... >>> z ( *, 'a', ( *, 'b', ( *, 'c', ( *, 'd', ( *, 'e', * ) ) ) ) ) Notice the last result here: if items inserted into a binary tree are already ordered, then you wind up with a linear structure, and you lose the search-space partitioning magic of binary trees (the tree grows in right branches only). This is a worst-case scenario, and binary trees generally do a good job of dividing up values in practice. But if you are interested in pursuing this topic further, see a data structures text for tree-balancing techniques that automatically keep the tree as dense as possible. Also note that to keep the code simple, these trees store a value only, and lookups return a 1 or (true or false). In practice, you sometimes may want to store both a key and an associated value (or even more) at each tree node. Example 17-14 shows what such a tree object looks like, for any prospective lumberjacks in the audience. Example 17-14. PP2E\Dstruct\Classics\btree-keys.pyclass KeyedBinaryTree:
def __init__(self): self.tree = EmptyNode( )
def __repr__(self): return `self.tree`
def lookup(self, key): return self.tree.lookup(key)
def insert(self, key, val): self.tree = self.tree.insert(key, val)
class EmptyNode:
def __repr__(self):
return '*'
def lookup(self, key): # fail at the bottom
return None
def insert(self, key, val):
return BinaryNode(self, key, val, self) # add node at bottom
class BinaryNode:
def __init__(self, left, key, val, right):
self.key, self.val = key, val
self.left, self.right = left, right
def lookup(self, key):
if self.key == key:
return self.val
elif self.key > key:
return self.left.lookup(key) # look in left
else:
return self.right.lookup(key) # look in right
def insert(self, key, val):
if self.key == key:
self.val = val
elif self.key > key:
self.left = self.left.insert(key, val) # grow in left
elif self.key < key:
self.right = self.right.insert(key, val) # grow in right
return self
def __repr__(self):
return ('( %s, %s=%s, %s )' %
(`self.left`, `self.key`, `self.val`, `self.right`))
if __name__ == '__main__':
t = KeyedBinaryTree( )
for (key, val) in [('bbb', 1), ('aaa', 2), ('ccc', 3)]:
t.insert(key, val)
print t
print t.lookup('aaa'), t.lookup('ccc')
t.insert('ddd', 4)
t.insert('aaa', 5) # changes key's value
print t
And here is this script's self-test code at work; nodes simply have more content this time around: C:\...\PP2E\Dstruct\Classics>python btree-keys.py ( ( *, 'aaa'=2, * ), 'bbb'=1, ( *, 'ccc'=3, * ) ) 2 3 ( ( *, 'aaa'=5, * ), 'bbb'=1, ( *, 'ccc'=3, ( *, 'ddd'=4, * ) ) ) |
| I l@ve RuBoard |
|