Some notes on abusing the pexpect Python module
What you are theoretically supposed to use pexpect for is to have your program automatically interact with interactive programs. When they produce certain sorts of output, you recognize it and take action; when you see prompts, you can automatically answer them. Pexpect is often used this way to automate things that expect to be operated manually by a real person. This is not what I'm using pexpect for. What I'm using it for is to start a program in what it thinks is an interactive environment, capture its output if all goes well, and if things go wrong allow a human operator to step in and interact with the program (all the while still capturing the output). This means that I'm ignoring almost all of pexpect's functionality and abusing parts of the rest in ways that it was probably not designed for.
Before I start, I need to throw in a disclaimer. There are multiple versions of pexpect out there; my impression is that development stalled for a while and then picked up recently. As I write this, the pexpect documentation talks about 4.0.1, but what I've used is no later than 3.1. Pexpect 4 may fix some of the issues I'm going to grumble about.
Supposing that my case is what you want to do, you start out by spawning a command:
    import pexpect
    child = pexpect.spawn(YOURCOMMAND, args=args, timeout=None)
It's important to set a timeout of None as the starting timeout. If you want to have a timeout at all, for example to detect that the remote end has gone silent, you want to control it on a call-by-call basis.
Now you want to collect output from the child command:
    res = []
    while not child.closed and child.isalive():
        try:
            r = child.read_nonblocking(size=16*1024, timeout=YOURTIMEOUT)
            res.append(r)
        except pexpect.EOF:
            # expected, just stop
            break
        except pexpect.TIMEOUT:
            # do whatever you want to recover
            return recover_child(child, res)
You might as well set size to something large here. Although the documentation doesn't tell you this, it is just the maximum amount of data your read can ever return; it doesn't block until that much data is available. My principle is 'if the command generates a lot of output, let's read it in big blocks'.
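The "size is only an upper bound" behaviour is the same as a plain os.read() on a file descriptor, which is a reasonable mental model for it. Here is a minimal sketch using an ordinary pipe rather than pexpect; the helper name read_up_to is made up for illustration, and that pexpect behaves this way internally is an assumption based on the behaviour described above.

```python
import os

def read_up_to(fd, size=16 * 1024):
    # os.read() returns as soon as *any* data is available;
    # `size` only caps how much a single call can hand back.
    return os.read(fd, size)

r, w = os.pipe()
os.write(w, b"hello")      # only 5 bytes are waiting
chunk = read_up_to(r)      # asks for up to 16 KB, does not wait for 16 KB
os.close(r)
os.close(w)
print(chunk)               # b'hello'
```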
We're not done once pexpect has raised an EOF. We need to do some cleanup to make sure that the child's exit status is available:
    # Some of this is probably superstition
    if not child.closed and child.isalive():
        child.wait()
    return (res, child.status)
Pexpect 3.1's documentation is not entirely clear on what you have to check when in order to see if the child is alive or not. Note that .isalive() has the (useful) side effect of harvesting the child's exit status if the child is not alive. It's helpfully not valid to call .wait() on a dead child, at least in 3.1, so you have to check carefully first.
As pexpect documents, it splits the actual OS process exit status into child.exitstatus and child.signalstatus (and various things return one or the other). The whole status is available as child.status, but you may find one or the other variant more useful (for example if you're really only interested in 'did the command exit with status 0 or did something go boom').
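If you keep the whole status around, you can take it apart later with the standard os module. This is a small sketch that assumes child.status is the raw wait status as os.waitpid() would return it (which is how it is described above); the function name describe_status is hypothetical.

```python
import os

def describe_status(status):
    """Split a raw wait status into (kind, value)."""
    if os.WIFEXITED(status):
        # Normal exit: value is the exit code (0 means success).
        return ("exited", os.WEXITSTATUS(status))
    if os.WIFSIGNALED(status):
        # Killed by a signal: value is the signal number.
        return ("signaled", os.WTERMSIG(status))
    return ("unknown", status)

# On Unix, os.system() also returns a raw wait status, which makes
# it a convenient stand-in for child.status here.
kind, value = describe_status(os.system("exit 3"))
```

The 'did it exit with status 0' check then becomes a comparison against ("exited", 0).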
Allowing the user to interact with the child is somewhat more involved. Fundamentally we call child.interact() repeatedly, but there are a bunch of things that you need to do around this.
    def talkto(child):
        # Set up to log interactive output
        res = []
        def save_output(data):
            if data:
                res.append(data)
            return data

        while not child.closed and child.isalive():
            try:
                child.interact(output_filter=save_output)
            except OSError as e:
                # Usually an EOF from the command.
                # Complain somehow.
                break
            # If the child is alive here, the user has
            # typed a ^] to escape from interact().
            # What happens next is up to you.
Yes, you read that right. Uniquely, pexpect's child.interact() does not raise pexpect.EOF on EOF from the child; instead it generally passes through an underlying OSError that it got (my notes don't say what that OSError usually is). In general, if you get an OSError here you have to assume that the session is dead, although pexpect doesn't necessarily know it yet.
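If you want to tell 'ordinary EOF' apart from stranger failures when you complain, one option is to look at the errno. The sketch below assumes Linux behaviour, where reading from the master side of a pty after the child has gone away typically fails with EIO; that is an assumption about the platform, not something the notes above confirm, and the helper name is invented for illustration.

```python
import errno

def looks_like_pty_eof(exc):
    """Guess whether an exception from interact() is just the child going away.

    Assumption: on Linux a closed pty usually shows up as OSError with EIO.
    Anything else is treated as a real error.
    """
    return isinstance(exc, OSError) and exc.errno == errno.EIO
```

Inside the except OSError clause you could call this on the caught exception to decide how loudly to complain. Either way the session should still be treated as dead.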
Usefully, child.interact() sets things up so that control characters and so on that the user types are normally passed through directly to the child process instead of affecting your Python program. This means that under normal circumstances, if you type eg ^C your Python code won't get hit with a SIGINT; it'll go through to the child program and the child program will do whatever it does in reaction.
What you do if the user chooses to use ^] to exit from child.interact() is up to you. Note that you can allow them to resume the interaction; just go back through your loop to call child.interact() again.
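One way to structure the 'resume or abandon' decision is to hand the loop a callback that asks the user what they want. This is only a sketch: interact_until_done and should_resume are hypothetical names, and the child is anything with the closed, isalive() and interact() members used above.

```python
def interact_until_done(child, should_resume, output_filter=None):
    """Call child.interact() until the child dies or the user gives up.

    Returns True if the child was still alive when we stopped, which
    means the caller still has cleanup to do.
    """
    while not child.closed and child.isalive():
        try:
            child.interact(output_filter=output_filter)
        except OSError:
            # Assume the session is dead, as discussed above.
            return False
        # Getting here with a live child means the user escaped from
        # interact(); ask whether to go around the loop again.
        if child.isalive() and not should_resume():
            return True
    return False
```

Here should_resume might print a short prompt and read a y/n answer from the terminal.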
If you allow the user to abandon the child and exit your talkto() function (you probably want to), you need to do some more cleanup of the child:
    # after interact() returns, try to
    # read anything left over, then close the child.
    try:
        r = child.read_nonblocking(size=128*1024, timeout=0)
        res.append(r)
    except (pexpect.EOF, pexpect.TIMEOUT, OSError):
        pass
    child.close(force=True)
Calling read_nonblocking with timeout=0 means what you think it does; it's a non-blocking read of whatever (final) data is available right now, with no waiting for anything more to come in from the child.
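In plain file descriptor terms, a zero timeout amounts to polling with select() and only reading if something is already there. The sketch below shows that idea with the standard library on an ordinary pipe; it is not pexpect's code, read_now is a made-up name, and it returns an empty result where pexpect would raise pexpect.TIMEOUT.

```python
import os
import select

def read_now(fd, size=128 * 1024):
    """Return whatever is readable on fd right now, or b'' if nothing is."""
    # A timeout of 0 makes select() a pure poll: it never waits.
    ready, _, _ = select.select([fd], [], [], 0)
    if not ready:
        return b""
    return os.read(fd, size)

r, w = os.pipe()
before = read_now(r)          # nothing written yet, returns immediately
os.write(w, b"left over")
after = read_now(r)           # picks up the pending data
os.close(r)
os.close(w)
```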
At least in pexpect 3.1, you basically should call child.close() with force=True or you will get a pexpect error if the child stays alive, which it may. Setting force winds up hitting the child with a SIGKILL if nothing else seems to work, which is relatively sure.
(Although the documentation doesn't mention it, if the child is alive it always gets sent SIGHUP and then SIGINT first. Well, this happens in older versions of pexpect; the 4.0.1 code is a bit different and I haven't dug through it.)
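To make the escalation concrete, here is a sketch of the same 'polite signals first, SIGKILL last' pattern written against subprocess.Popen. It is an illustration of the idea only, not pexpect's implementation; the function name and the grace interval are assumptions.

```python
import signal
import subprocess

def stop_with_escalation(proc, grace=0.5):
    """Send SIGHUP, then SIGINT, then SIGKILL until proc exits.

    Returns the Popen return code (a negative signal number if a
    signal killed it), or None if it somehow survived everything.
    """
    for sig in (signal.SIGHUP, signal.SIGINT, signal.SIGKILL):
        if proc.poll() is not None:
            break                      # already gone, nothing more to do
        proc.send_signal(sig)
        try:
            proc.wait(timeout=grace)   # give it a moment to die
        except subprocess.TimeoutExpired:
            pass                       # still alive, try the next signal
    return proc.poll()
```

A well-behaved command normally goes away on the first signal; only something that ignores both SIGHUP and SIGINT ever reaches the SIGKILL.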
Possibly there is a better Python module for this sort of interaction in general. If so, it is too late for me; I've already written all of this code and I hope to not have to touch it again before we have to port it to Python 3 (if ever).
(My impression is that you should try to use pexpect 4 if you can, as the code has been overhauled and the documentation at least somewhat improved.)