3.3 Threads

Threads are another way to start activities running at the same time. They are sometimes called "lightweight processes," and they run in parallel like forked processes, but all run within the same single process. For applications that can benefit from parallel processing, threads offer big advantages for programmers.
Using threads is surprisingly easy in Python. In fact, when a program is started it is already running a thread -- usually called the "main thread" of the process. To start new, independent threads of execution within a process, we either use the Python thread module to run a function call in a spawned thread, or the Python threading module to manage threads with high-level objects. Both modules also provide tools for synchronizing access to shared objects with locks.

3.3.1 The thread Module

Since the basic thread module is a bit simpler than the more advanced threading module covered later in this section, let's look at some of its interfaces first. This module provides a portable interface to whatever threading system is available on your platform: its interfaces work the same on Windows, Solaris, SGI, and any system with an installed "pthreads" POSIX threads implementation (including Linux). Python scripts that use the Python thread module work on all of these platforms without changing their source code.

Let's start off by experimenting with a script that demonstrates the main thread interfaces. The script in Example 3-5 spawns threads until you reply with a "q" at the console; it's similar in spirit to (and a bit simpler than) the script in Example 3-1, but goes parallel with threads, not forks.

Example 3-5. PP2E\System\Threads\thread1.py

    # spawn threads until you type 'q'

    import thread

    def child(tid):
        print 'Hello from thread', tid

    def parent():
        i = 0
        while 1:
            i = i+1
            thread.start_new(child, (i,))
            if raw_input() == 'q': break

    parent()

There are really only two thread-specific lines in this script: the import of the thread module, and the thread creation call. To start a thread, we simply call the thread.start_new function, no matter what platform we're programming on.[3] This call takes a function object and an arguments tuple, and starts a new thread to execute a call to the passed function with the passed arguments.
It's almost like the built-in apply function (and like apply, also accepts an optional keyword arguments dictionary), but in this case, the function call begins running in parallel with the rest of the program.
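For readers on Python 3, where the thread module was renamed _thread, the call pattern just described can be sketched roughly as follows. This is only an illustration, not one of the book's examples: the results list, the greeting keyword, and the polling loop are assumptions added so that the main thread can observe what the children did.

```python
import _thread   # Python 3 name for the low-level thread module
import time

results = []     # hypothetical shared list: children report here

def child(tid, greeting="Hello"):
    # runs in a spawned thread; any return value would be ignored
    results.append("%s from thread %s" % (greeting, tid))

# a function object plus an arguments tuple
_thread.start_new_thread(child, (1,))
# optional third argument: a keyword arguments dictionary
_thread.start_new_thread(child, (2,), {"greeting": "Hi"})

# crude wait so the main thread does not exit first (5 second cap)
deadline = time.time() + 5
while len(results) < 2 and time.time() < deadline:
    time.sleep(0.01)

print(sorted(results))
```

The results are sorted before printing because the two children may finish in either order.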
Operationally speaking, the thread.start_new call itself returns immediately with no useful value, and the thread it spawns silently exits when the function being run returns (the return value of the threaded function call is simply ignored). Moreover, if a function run in a thread raises an uncaught exception, a stack trace is printed and the thread exits, but the rest of the program continues.

In practice, though, it's almost trivial to use threads in a Python script. Let's run this program to launch a few threads; it can be run on both Linux and Windows this time, because threads are more portable than process forks:

    C:\...\PP2E\System\Threads>python thread1.py
    Hello from thread 1
    Hello from thread 2
    Hello from thread 3
    Hello from thread 4
    q

Each message here is printed from a new thread, which exits almost as soon as it is started. To really understand the power of threads running in parallel, we have to do something more long-lived in our threads. The good news is that threads are both easy and fun to play with in Python. Let's mutate the fork-count program of the prior section to use threads. The script in Example 3-6 starts 10 copies of its counter running in parallel threads.

Example 3-6. PP2E\System\Threads\thread-count.py

    ##################################################
    # thread basics: start 10 copies of a function
    # running in parallel; uses time.sleep so that
    # main thread doesn't die too early--this kills
    # all other threads on both Windows and Linux;
    # stdout shared: thread outputs may be intermixed
    ##################################################

    import thread, time

    def counter(myId, count):                  # this function runs in threads
        for i in range(count):
            #time.sleep(1)
            print '[%s] => %s' % (myId, i)

    for i in range(10):                        # spawn 10 threads
        thread.start_new(counter, (i, 3))      # each thread loops 3 times

    time.sleep(4)
    print 'Main thread exiting.'               # don't exit too early

Each parallel copy of the counter function simply counts from zero up to two here.
When run on Windows, all 10 threads run at the same time, so their output is intermixed on the standard output stream:

    C:\...\PP2E\System\Threads>python thread-count.py
    ...some lines deleted...
    [5] => 0
    [6] => 0
    [7] => 0
    [8] => 0
    [9] => 0
    [3] => 1
    [4] => 1
    [1] => 0
    [5] => 1
    [6] => 1
    [7] => 1
    [8] => 1
    [9] => 1
    [3] => 2
    [4] => 2
    [1] => 1
    [5] => 2
    [6] => 2
    [7] => 2
    [8] => 2
    [9] => 2
    [1] => 2
    Main thread exiting.

In fact, these threads' output is mixed arbitrarily, at least on Windows -- it may even be in a different order each time you run this script. Because all 10 threads run as independent entities, the exact ordering of their overlap in time depends on nearly random system state at large at the time they are run.

If you care to make this output a bit more coherent, uncomment (that is, remove the # before) the time.sleep(1) call in the counter function and rerun the script. If you do, each of the 10 threads now pauses for one second before printing its current count value. Because of the pause, all threads check in at the same time with the same count; you'll actually have a one-second delay before each batch of 10 output lines appears:

    C:\...\PP2E\System\Threads>python thread-count.py
    ...some lines deleted...
    [7] => 0
    [6] => 0
    pause...
    [0] => 1
    [1] => 1
    [2] => 1
    [3] => 1
    [5] => 1
    [7] => 1
    [8] => 1
    [9] => 1
    [4] => 1
    [6] => 1
    pause...
    [0] => 2
    [1] => 2
    [2] => 2
    [3] => 2
    [5] => 2
    [9] => 2
    [7] => 2
    [6] => 2
    [8] => 2
    [4] => 2
    Main thread exiting.

Even with the sleep synchronization active, though, there's no telling in what order the threads will print their current count. It's random on purpose -- the whole point of starting threads is to get work done independently, in parallel.

Notice that this script sleeps for four seconds at the end. It turns out that, at least on my Windows and Linux installs, the main thread should not exit while any spawned threads are still running; if it does, all spawned threads are immediately terminated.
Without the sleep here, the spawned threads would die almost immediately after they are started. This may seem ad hoc, but isn't required on all platforms, and programs are usually structured such that the main thread naturally lives as long as the threads it starts. For instance, a user interface may start an FTP download running in a thread, but the download lives a much shorter life than the user interface itself. Later in this section, we'll see different ways to avoid this sleep with global flags, and will also meet a "join" utility in a different module that lets us wait for spawned threads to finish explicitly.

3.3.1.1 Synchronizing access to global objects

One of the nice things about threads is that they automatically come with a cross-task communications mechanism: shared global memory. For instance, because every thread runs in the same process, if one Python thread changes a global variable, the change can be seen by every other thread in the process, main or child. This serves as a simple way for a program's threads to pass information back and forth to each other -- exit flags, result objects, event indicators, and so on.

The downside to this scheme is that our threads must sometimes be careful to avoid changing global objects at the same time -- if two threads change an object at once, it's not impossible that one of the two changes will be lost (or worse, will corrupt the state of the shared object completely). The extent to which this becomes an issue varies per application, and is sometimes a nonissue altogether.

But even things that aren't obviously at risk may be at risk. Files and streams, for example, are shared by all threads in a program; if multiple threads write to one stream at the same time, the stream might wind up with interleaved, garbled data.
Here's an example: if you edit Example 3-6, comment out the sleep call in counter, and increase the per-thread count parameter from 3 to 100, you might occasionally see the same strange results on Windows that I did:

    C:\...\PP2E\System\Threads\>python thread-count.py | more ...more deleted... [5] => 14 [7] => 14 [9] => 14 [3] => 15 [5] => 15 [7] => 15 [9] => 15 [3] => 16 [5] => 16 [7] => 16 [9] => 16 [3] => 17 [5] => 17 [7] => 17 [9] => 17 ...more deleted...

Because all 10 threads are trying to write to stdout at the same time, once in a while the output of more than one thread winds up on the same line. Such an oddity in this script isn't exactly going to crash the Mars Lander, but it's indicative of the sorts of clashes in time that can occur when our programs go parallel. To be robust, thread programs need to control access to shared global items like this so that only one thread uses them at a time.[4]
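One standard way to enforce that one-at-a-time rule is a lock. As a minimal Python 3 sketch (not one of the book's examples), the following guards updates to a shared global counter; the names counter, counter_lock, and add_many are hypothetical, and the higher-level threading module is used here only for brevity.

```python
import threading

counter = 0                        # shared global object
counter_lock = threading.Lock()    # guards every update to counter

def add_many(times):
    global counter
    for _ in range(times):
        with counter_lock:         # acquire, update, then release
            counter += 1

workers = [threading.Thread(target=add_many, args=(10000,))
           for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()                       # wait for all four threads to finish

print(counter)                     # 40000: no increments are lost
```

Because every increment happens while the lock is held, the four threads cannot interleave partway through an update, so the final total is always 4 x 10000.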
Luckily, Python's thread module comes with its own easy-to-use tools for synchronizing access to shared objects among threads. These tools are based on the concept of a lock -- to change a shared object, threads acquire a lock, make their changes, and then release the lock for other threads to grab. Lock objects are allocated and processed with simple and portable calls in the thread module, and are automatically mapped to thread locking mechanisms on the underlying platform.

For instance, in Example 3-7, a lock object created by thread.allocate_lock is acquired and released by each thread around the print statement that writes to the shared standard output stream.

Example 3-7. PP2E\System\Threads\thread-count-mutex.py

    ##################################################
    # synchronize access to stdout: because it is
    # shared global, thread outputs may be intermixed
    ##################################################

    import thread, time

    def counter(myId, count):
        for i in range(count):
            mutex.acquire()
            #time.sleep(1)
            print '[%s] => %s' % (myId, i)
            mutex.release()

    mutex = thread.allocate_lock()
    for i in range(10):
        thread.start_new_thread(counter, (i, 3))

    time.sleep(6)
    print 'Main thread exiting.'

Python guarantees that only one thread can acquire a lock at any given time; all other threads that request the lock are blocked until a release call makes it available for acquisition. The net effect of the additional lock calls in this script is that no two threads will ever execute a print statement at the same point in time -- the lock ensures mutually exclusive access to the stdout stream. Hence, the output of this script is the same as the original thread-count.py, except that standard output text is never munged by overlapping prints.

Incidentally, uncommenting the time.sleep call in this version's counter function makes each output line show up one second apart.
Because the sleep occurs while a thread holds the lock, all other threads are blocked while the lock holder sleeps. One thread grabs the lock, sleeps one second and prints; another thread grabs, sleeps, and prints, and so on. Given 10 threads counting up to 3, the program as a whole takes 30 seconds (10 x 3) to finish, with one line appearing per second. Of course, that assumes that the main thread sleeps at least that long too; to see how to remove this assumption, we need to move on to the next section.

3.3.1.2 Waiting for spawned thread exits

Thread module locks are surprisingly useful. They can form the basis of higher-level synchronization paradigms (e.g., semaphores), and can be used as general thread communication devices.[5] For example, Example 3-8 uses a global list of locks to know when all child threads have finished.
Example 3-8. PP2E\System\Threads\thread-count-wait1.py

    ##################################################
    # uses mutexes to know when threads are done
    # in parent/main thread, instead of time.sleep;
    # lock stdout to avoid multiple prints on 1 line;
    ##################################################

    import thread

    def counter(myId, count):
        for i in range(count):
            stdoutmutex.acquire()
            print '[%s] => %s' % (myId, i)
            stdoutmutex.release()
        exitmutexes[myId].acquire()        # signal main thread

    stdoutmutex = thread.allocate_lock()
    exitmutexes = []
    for i in range(10):
        exitmutexes.append(thread.allocate_lock())
        thread.start_new(counter, (i, 100))

    for mutex in exitmutexes:
        while not mutex.locked(): pass     # busy-wait until child signals
    print 'Main thread exiting.'

A lock's locked method can be used to check its state. To make this work, the main thread makes one lock per child, and tacks them onto a global exitmutexes list (remember, the threaded function shares global scope with the main thread). On exit, each thread acquires its lock on the list, and the main thread simply watches for all locks to be acquired. This is much more accurate than naively sleeping while child threads run, in hopes that all will have exited after the sleep.

But wait -- it gets even simpler: since threads share global memory anyhow, we can achieve the same effect with a simple global list of integers, not locks. In Example 3-9, the module's namespace (scope) is shared by top-level code and the threaded function as before -- the name exitmutexes refers to the same list object in the main thread and all threads it spawns. Because of that, changes made in a thread are still noticed in the main thread without resorting to extra locks.

Example 3-9. PP2E\System\Threads\thread-count-wait2.py

    ####################################################
    # uses simple shared global data (not mutexes) to
    # know when threads are done in parent/main thread;
    ####################################################

    import thread
    stdoutmutex = thread.allocate_lock()
    exitmutexes = [0] * 10

    def counter(myId, count):
        for i in range(count):
            stdoutmutex.acquire()
            print '[%s] => %s' % (myId, i)
            stdoutmutex.release()
        exitmutexes[myId] = 1              # signal main thread

    for i in range(10):
        thread.start_new(counter, (i, 100))

    while 0 in exitmutexes: pass           # busy-wait until all flags are set
    print 'Main thread exiting.'

The main threads of both of the last two scripts fall into busy-wait loops at the end, which might become significant performance drains in tight applications. If so, simply add a time.sleep call in the wait loops to insert a pause between end tests and free up the CPU for other tasks. Even threads must be good citizens.

Both of the last two counting thread scripts produce roughly the same output as the original thread-count.py -- albeit without stdout corruption, and with different random ordering of output lines. The main difference is that the main thread exits immediately after (and no sooner than!) the spawned child threads:

    C:\...\PP2E\System\Threads>python thread-count-wait2.py
    ...more deleted...
    [2] => 98
    [6] => 97
    [0] => 99
    [7] => 97
    [3] => 98
    [8] => 97
    [9] => 97
    [1] => 99
    [4] => 98
    [5] => 98
    [2] => 99
    [6] => 98
    [7] => 98
    [3] => 99
    [8] => 98
    [9] => 98
    [4] => 99
    [5] => 99
    [6] => 99
    [7] => 99
    [8] => 99
    [9] => 99
    Main thread exiting.

Of course, threads are for much more than counting. We'll put shared global data like this to more practical use in a later chapter, to serve as completion signals from child processing threads transferring data over a network, to a main thread controlling a Tkinter GUI user interface display (see Section 11.4 in Chapter 11).
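The suggestion above about adding a time.sleep call to the wait loop might look roughly like this in Python 3 (where the thread module is named _thread). This is a sketch under assumptions rather than one of the book's scripts: the names NUM_THREADS, exit_flags, and polls are hypothetical, and the 0.05-second pause is an arbitrary choice.

```python
import _thread   # Python 3 name for the low-level thread module
import time

NUM_THREADS = 5
exit_flags = [0] * NUM_THREADS     # one completion flag per child

def counter(my_id, count):
    total = 0
    for i in range(count):
        total += i                 # stand-in for real work
    exit_flags[my_id] = 1          # signal the main thread

for i in range(NUM_THREADS):
    _thread.start_new_thread(counter, (i, 1000))

polls = 0
while 0 in exit_flags:             # same end test as the busy-wait loop
    polls += 1
    time.sleep(0.05)               # pause between tests to free the CPU

print("Main thread exiting after", polls, "polls")
```

The trade-off is latency: the main thread may notice the last child's exit up to one pause interval late, in exchange for not spinning at full speed while it waits.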
3.3.2 The threading Module

The standard Python library comes with two thread modules -- thread, the basic lower-level interface illustrated thus far, and threading, a higher-level interface based on objects. The threading module internally uses the thread module to implement objects that represent threads and common synchronization tools. It is loosely based on a subset of the Java language's threading model, but differs in ways that only Java programmers would notice.[6] Example 3-10 morphs our counting threads example one last time to demonstrate this new module's interfaces.
Example 3-10. PP2E\System\Threads\thread-classes.py

    #######################################################
    # uses higher-level Java-like threading module object
    # join method (not mutexes or shared global vars) to
    # know when threads are done in parent/main thread;
    # see library manual for more details on threading;
    #######################################################

    import threading

    class mythread(threading.Thread):          # subclass Thread object
        def __init__(self, myId, count):
            self.myId  = myId
            self.count = count
            threading.Thread.__init__(self)
        def run(self):                         # run provides thread logic
            for i in range(self.count):        # still synch stdout access
                stdoutmutex.acquire()
                print '[%s] => %s' % (self.myId, i)
                stdoutmutex.release()

    stdoutmutex = threading.Lock()             # same as thread.allocate_lock()
    threads = []
    for i in range(10):
        thread = mythread(i, 100)              # make/start 10 threads
        thread.start()                         # start run method in a thread
        threads.append(thread)

    for thread in threads:
        thread.join()                          # wait for thread exits
    print 'Main thread exiting.'

The output of this script is the same as that shown for its ancestors earlier (again, randomly distributed). Using the threading module is largely a matter of specializing classes. Threads in this module are implemented with a Thread object -- a Python class which we customize per application by providing a run method that defines the thread's action. For example, this script subclasses Thread with its own mythread class; mythread's run method is what will be executed by the Thread framework when we make a mythread and call its start method.

In other words, this script simply provides methods expected by the Thread framework. The advantage of going this more coding-intensive route is that we get a set of additional thread-related tools from the framework "for free."
The Thread.join method used near the end of this script, for instance, waits until the thread exits (by default); we can use this method to prevent the main thread from exiting too early, rather than the time.sleep calls and global locks and variables we relied on in earlier threading examples.

The example script also uses threading.Lock to synchronize stream access (though this name is just a synonym for thread.allocate_lock in the current implementation). Besides Thread and Lock, the threading module also includes higher-level objects for synchronizing access to shared items (e.g., Semaphore, Condition, Event), and more; see the library manual for details. For more examples of threads and forks in general, see the following section and the examples in Part III.
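Subclassing is not the only way to use Thread: the class also accepts a target function and an args tuple directly. The following Python 3 sketch (not one of the book's examples) reruns the counting exercise that way and waits with join; the names stdout_lock, lines, and counter are hypothetical, and output is collected in a list instead of printed so that the result is easy to check.

```python
import threading

stdout_lock = threading.Lock()     # plays the role of stdoutmutex
lines = []                         # collected output, guarded by the lock

def counter(my_id, count):
    for i in range(count):
        with stdout_lock:          # released automatically on block exit
            lines.append("[%s] => %s" % (my_id, i))

threads = [threading.Thread(target=counter, args=(i, 3))
           for i in range(10)]
for t in threads:
    t.start()                      # begin running counter in a new thread
for t in threads:
    t.join()                       # wait for each thread to exit

print(len(lines))                  # 30: 10 threads x 3 lines each
```

Whether to subclass or pass a target is largely a matter of taste; the target form tends to be shorter when the thread's action is a single function with no per-thread state.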