What Python threads are good for
Because of the sometimes much-maligned Global Interpreter Lock, pure Python code itself can't run simultaneously on multiple CPUs. So what should you use Python threads for?
The real use for Python threads is turning synchronous functions in
extension modules into asynchronous things that don't delay your main
program. Often these functions have no asynchronous equivalents (unlike
network IO), so you either use threads or have your main program
delayed. This works for sufficiently compute-intensive functions
(provided the extension releases the GIL while it runs) as well as for
functions, like socket.gethostbyname, that have to wait on outside
things.
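As a rough illustration of the idea, here is a minimal sketch of pushing one blocking call into a helper thread. The call_in_thread() helper and its (ok, value) result convention are made up for this example; they are not from any library.

```python
import queue
import socket
import threading


def call_in_thread(func, *args):
    """Run func(*args) in a helper thread.

    Returns a queue that receives exactly one (ok, value) tuple once the
    call finishes; value is the result, or the exception if ok is False.
    (Hypothetical helper, invented for this illustration.)
    """
    done = queue.Queue(maxsize=1)

    def runner():
        try:
            done.put((True, func(*args)))
        except Exception as exc:
            # Hand the failure back to the caller instead of losing it.
            done.put((False, exc))

    threading.Thread(target=runner, daemon=True).start()
    return done


if __name__ == "__main__":
    pending = call_in_thread(socket.gethostbyname, "localhost")
    # The main program is free to do other work here.
    ok, value = pending.get()  # block only once the answer is needed
    print(ok, value)
```

The main program only blocks at the point where it actually needs the result; until then the lookup proceeds in the background.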
Python threads are not a good way to do asynchronous network IO,
because they're inefficient overkill; use either select() or poll() from
the select module instead (along with non-blocking sockets and so on).
If you need a canned solution for this, consider Twisted, or asyncore
and asynchat from the standard library.
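For comparison, a minimal sketch of the select() approach with non-blocking sockets might look like this. It uses socket.socketpair() as a stand-in for real network connections, and the read_ready() helper is a name invented for the example.

```python
import select
import socket


def read_ready(socks, timeout=1.0):
    """Wait up to timeout seconds, then read from every readable socket.

    Returns a dict mapping each readable socket to the bytes received.
    (Hypothetical helper, invented for this illustration.)
    """
    readable, _, _ = select.select(socks, [], [], timeout)
    return {s: s.recv(4096) for s in readable}


if __name__ == "__main__":
    # A connected pair of sockets stands in for a real network peer.
    a, b = socket.socketpair()
    a.setblocking(False)
    b.setblocking(False)
    a.sendall(b"ping")
    for sock, data in read_ready([b]).items():
        print(sock.fileno(), data)
```

One thread can watch any number of sockets this way, which is why threads are overkill for this job.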
Note that threads are the only way to make gethostbyname() and
gethostbyaddr() asynchronous, because they don't necessarily just do
DNS lookups. Exactly what data sources they consult and how is highly
system dependent; you really need to just be calling the platform C
library routines. This cuts both ways; if you want just DNS lookups,
do just DNS lookups via something like dnspython.
My thread-using Python programs wind up being built around completion
queues and thread pools; they hand off work to auxiliary threads and
then wait for things to finish. (Sometimes in conjunction with network
IO; see here for how I mix work completion notification and select()
et al.)
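The general shape of the completion queue and thread pool pattern can be sketched as follows. This is only an illustration under my own assumptions, not the code from any of my programs; the ThreadPool class, its method names, and the (tag, ok, value) result format are all invented here.

```python
import queue
import threading


class ThreadPool:
    """A tiny worker pool with a completion queue (illustrative only)."""

    def __init__(self, nworkers=4):
        self.work = queue.Queue()  # (tag, func, args) waiting to be run
        self.done = queue.Queue()  # (tag, ok, value) for finished work
        for _ in range(nworkers):
            threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self):
        while True:
            tag, func, args = self.work.get()
            try:
                result = (tag, True, func(*args))
            except Exception as exc:
                # Report failures through the same completion queue.
                result = (tag, False, exc)
            self.done.put(result)

    def submit(self, tag, func, *args):
        """Hand off func(*args) to the pool; tag identifies the result."""
        self.work.put((tag, func, args))

    def wait(self, timeout=None):
        """Block until some piece of work finishes; return its result."""
        return self.done.get(timeout=timeout)


if __name__ == "__main__":
    pool = ThreadPool(nworkers=2)
    for n in range(5):
        pool.submit(n, pow, n, 2)
    for _ in range(5):
        print(pool.wait())  # results arrive in completion order
```

Results come back in whatever order the work finishes, so each one carries a tag that lets the main program match it to the request.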
(Someday I will have a general 'thread pool' module that I'm happy with. I probably need to write more thread-using programs first.)