17.9 Data Structures Versus Python Built-ins
Now that I've shown you all these complicated algorithms, I
need to also tell you that at least in some cases, they may not be an
optimal approach. Built-in types like lists and dictionaries are
often a simpler and more efficient way to represent data. For
instance:
- Binary trees
-
These may be useful in many
applications, but Python dictionaries already provide a highly
optimized, C-coded, search table tool. Indexing a dictionary by key
is likely to be faster than searching a Python-coded tree structure:
>>> x = {}
>>> for i in [3,1,9,2,7]: x[i] = None # insert
>>> for i in range(10): print (i, x.has_key(i)), # lookup
(0, 0) (1, 1) (2, 1) (3, 1) (4, 0) (5, 0) (6, 0) (7, 1) (8, 0) (9, 1)
Because dictionaries are built in to the language, they are always
available, and will usually be faster than Python-based data
structure implementations.
- Graph algorithms
-
These
serve many purposes, but a purely Python-coded implementation of a
very large graph might be less efficient than you want in some
applications. Graph programs tend to require peak performance; using
dictionaries instead of class instances to represent graphs may boost
performance some, but using linked-in compiled extensions may as
well.
- Sorting algorithms
-
These
are an important part of many programs too, but Python's
built-in list sort method is so fast that you
would be hard pressed to beat it in Python in most scenarios. In
fact, it's generally better to convert sequences to lists first
just so you can use the
built-in:
temp = list(sequence)
temp.sort( )
...use items in temp...
For custom sorts, simply pass in a comparison function of your own:
>>> L = [{'n':3}, {'n':20}, {'n':0}, {'n':9}]
>>> L.sort( lambda x, y: cmp(x['n'], y['n']) )
>>> L
[{'n': 0}, {'n': 3}, {'n': 9}, {'n': 20}]
- Reversal algorithms
-
These are generally superfluous by the same token -- because
Python lists provide a fast reverse method, you
may be better off converting a non-list to a list first, just so that
you can run the built-in list method.
Don't misunderstand: sometimes you really do need objects that
add functionality to built-in types, or do something more custom. The
set classes we met, for instance, add tools not directly supported by
Python today, and the tuple-tree stack implementation was actually
faster than one based upon built-in lists for common usage patterns.
Permutations are something you need to add on your own too.
Moreover, class encapsulations make it possible to change and extend
object internals without impacting the rest of your system. They also
support reuse much better than built-in types -- types are not
classes today, and cannot be specialized directly without wrapper
class logic.
Yet because Python comes with a set of built-in, flexible, and
optimized datatypes, data structure implementations are often not as
important in Python as they are in lesser-equipped languages such as
C or C++. Before you code that new datatype, be sure to ask yourself
if a built-in type or call might be more in line with the Python way
of thinking.
|