2007-01-26
First impressions of pyOpenSSL
pyOpenSSL is a high-level Python wrapper around a subset of the OpenSSL library (to quote it). It was pulled onto my machines recently as part of a Fedora update, and since I'm currently interested in OpenSSL-related things I decided to try it out.
After writing some basic stuff, I have to say that I like it. It's not entirely documented (and for some things you really want to read the OpenSSL manpages too), but it works fine. It's also nicely simple to use; my test program to connect to a server and extract its SSL certificate is only 40 lines.
Here are the gotchas and other things I know about so far:
- despite what the Connection object documentation says about the .connect() method, it just connects the underlying socket. You must then call .do_handshake() to start SSL on the new connection.
- end-of-connection raises ZeroReturnError instead of having the connection's .recv() method return a zero-sized result. I imagine this makes sense if you understand the OpenSSL library.
- I'm not entirely clear what exactly .shutdown() does; it seems to only tell the other side that you're not going to send more stuff. (I probably need to write a server in pyOpenSSL to really understand it.)
- there seems to be no way to find out the expiry date of an X509 certificate, although X509 objects have a function to tell you if they've expired.
- The X509 .digest() function's single argument is a string that is the name of the hash you want to use. The available hashes on my Fedora Core 6 machine seem to be md2, md5, sha, sha1, dss1, and ripemd aka ripemd160. However, the exact list is in the depths of the OpenSSL library, so yours may differ; see the EVP_DigestInit manpage.
- the string values of X509Name objects are almost but not quite the usual '/C=...' form; they have Python repr-style type bits glued on the ends.
- some of the in-Python help text is misleading or wrong; trust the PostScript documentation instead, which is pretty good (if you like minimalism).
These are pretty much minor concerns, though; the only one that
required me to do some serious delving was the .digest() issue.
(And I'm coming into this completely ignorant of the underlying
OpenSSL library routines, which doesn't help. Advanced usage of
pyOpenSSL, like callbacks, clearly requires familiarity with
OpenSSL itself.)
Sidebar: a simple example client
Here's an example of how you'd use the client side of pyOpenSSL (without error checking and the like):
import socket
from OpenSSL import SSL

def ding(host, port, msg):
    s = socket.socket()
    # TLSv1 chosen more or less arbitrarily
    cx = SSL.Context(SSL.TLSv1_METHOD)
    cn = SSL.Connection(cx, s)
    cn.connect((host, port))
    cn.do_handshake()
    cn.sendall(msg)
    tl = []
    try:
        while 1:
            r = cn.recv(8192)
            tl.append(r)
    except SSL.ZeroReturnError:
        pass
    cn.shutdown()
    cn.close()
    return "".join(tl)
As far as I can tell from strace, the .recv() calls are blocking and
this won't spin waiting for network IO.
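For contrast with the ZeroReturnError handling above, here is a minimal sketch of the same read loop on a plain (non-SSL) socket, where end-of-connection shows up as an empty result from .recv() rather than as an exception. The names read_until_eof and demo are made up for illustration, the code is written for Python 3, and it uses socket.socketpair() so that it runs without any network access.

```python
import socket

def read_until_eof(sock, bufsize=8192):
    """Collect everything from a plain socket until the peer closes it."""
    chunks = []
    while True:
        data = sock.recv(bufsize)
        if not data:
            # An empty result is how a plain socket reports end-of-connection;
            # pyOpenSSL raises SSL.ZeroReturnError at this point instead.
            break
        chunks.append(data)
    return b"".join(chunks)

def demo():
    # socketpair() gives two connected sockets without touching the network.
    a, b = socket.socketpair()
    try:
        a.sendall(b"hello, ")
        a.sendall(b"world")
        a.close()  # closing the sending side is what produces EOF on b
        return read_until_eof(b)
    finally:
        a.close()  # harmless if already closed
        b.close()
```

The two loops have the same shape; only the way the end of the stream is signalled differs.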
2007-01-17
A grump about the socket module's SSL support
It is nice that Python's socket module has simple SSL support (although it has some limitations). My grump is that it doesn't give you any good way of checking the identity of the server's certificate, which is especially annoying as the SSL code doesn't do any certificate verification.
(This matters to me because I have recently become quite interested in being able to verify machines by checking that they have a specific SSL certificate.)
What SSL objects have is .issuer() and .server(), which
give you the text form of the 'distinguished name' for the
certificate authority (if any) and the server certificate. In
practice, these are useless for reliably identifying a specific
server (in part because there are significant ambiguities in the
text versions of distinguished names, see eg this bug report).
What you actually need is information about the server certificate
itself. The best thing would be a full copy of the server certificate
as a binary object (since then I can just do whatever I want with it,
including comparing it to my existing copy), but I'd be reasonably happy
with a hash or other signature of the server's certificate. (And OpenSSL
already has functions that will give you the certificate; I believe it
would take two OpenSSL calls to pull the certificate out as a memory
blob, namely SSL_get_peer_certificate followed by an appropriate
i2d_X509 invocation.)
But I suppose that I shouldn't be too surprised. Almost nothing seems to offer an option to accept only a specific server certificate; at best you can insist that the certificate you get is signed by a specific CA.
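As a rough sketch of what 'accept only a specific server certificate' could look like: this assumes a later Python 3 whose standard library has a separate ssl module (not the socket module SSL support discussed here), where getpeercert(binary_form=True) returns the DER-encoded certificate. The function names and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
import hmac
import socket
import ssl

def cert_fingerprint(der_bytes):
    """Return the SHA-256 hex digest of a DER-encoded certificate."""
    return hashlib.sha256(der_bytes).hexdigest()

def matches_pinned(der_bytes, pinned_hex):
    """True if the certificate's fingerprint equals the expected one.

    pinned_hex may be upper or lower case and may contain ':' separators.
    """
    expected = pinned_hex.replace(":", "").lower()
    return hmac.compare_digest(cert_fingerprint(der_bytes), expected)

def fetch_server_cert(host, port=443, timeout=10.0):
    """Fetch the server's certificate as DER bytes, without CA verification."""
    ctx = ssl.create_default_context()
    # Assumption: the caller checks identity against a pinned fingerprint
    # afterwards, so CA and hostname checks are deliberately turned off.
    # check_hostname has to be disabled before verify_mode can be CERT_NONE.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.getpeercert(binary_form=True)
```

A caller would do something like matches_pinned(fetch_server_cert(host), known_fingerprint) and refuse to talk to the server if it returns False; comparing the full DER blob against a stored copy would work equally well.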
2007-01-14
Wrapping exceptions versus propagating them untouched
I have somewhat recently written a program that makes heavy use of Python's xmlrpclib module. While the xmlrpclib module is very nice, the whole experience has given me some strong views on how it does exception handling.
The basic problem with xmlrpclib's error handling is that it doesn't capture exceptions from stuff that it calls, so they leak out to you. Fortunately I don't think xmlrpclib calls anything that can raise errors except the socket module, but I'm not sure (so my program may someday detonate with an uncaught error).
I suspect that there is a big debate over whether you should wrap exceptions from stuff you call or just pass them through, but having gone through the experience of dealing with xmlrpclib I have to come down solidly on the side of 'wrap exceptions'; it is simply much easier to deal with. It also means that I am less exposed to the internal details of how your module is implemented. As it is, I will have to modify my program if a future version of xmlrpclib starts calling something else that can also raise exceptions.
So one way to put it is that by not capturing exceptions from stuff your module calls, you make what your module calls part of your module's interface, because users of your module have to be aware of it in order to use your module safely. (And if some of those modules behave the same way, what they call is implicitly added to your module's interface and so on.)
History has shown us that large, vulnerable-to-change interfaces are pretty much a bad idea; you want small, stable interfaces instead. Yes, wrapping up other people's exceptions in your own exceptions inevitably loses some information, but it's almost always going to be worth it.
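To make the 'wrap exceptions' side concrete, here is a minimal sketch of a module that lets only one exception type of its own escape. It is not how xmlrpclib works; RemoteCallError and call_remote are hypothetical names, and the code uses Python 3's 'raise ... from' form, which keeps the original exception attached and so limits how much information the wrapping loses.

```python
import socket

class RemoteCallError(Exception):
    """The one exception type that callers of this (hypothetical) module catch."""

def call_remote(func, *args):
    """Run func(*args), converting low-level failures into RemoteCallError."""
    try:
        return func(*args)
    except OSError as exc:
        # In Python 3, socket.error is an alias of OSError, so this covers
        # socket-level failures. 'from exc' stores the original exception
        # as __cause__ for anyone who wants the details.
        raise RemoteCallError("remote call failed: %s" % exc) from exc
```

Callers then write a single 'except RemoteCallError' clause, and it stays correct even if the module later starts calling something else internally, provided the module adds that something's errors to its own except clause.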
(Maybe this is conventional wisdom, but since the xmlrpclib module doesn't wrap exceptions I suspect that it is not as widespread as I would like.)