Wandering Thoughts archives

2007-01-26

First impressions of pyOpenSSL

pyOpenSSL is a high-level Python wrapper around a subset of the OpenSSL library (to quote it). It was pulled onto my machines recently as part of a Fedora update, and since I'm currently interested in OpenSSL-related things I decided to try it out.

After writing some basic stuff, I have to say that I like it. It's not entirely documented (and for some things you really want to read the OpenSSL manpages too), but it works fine. It's also nicely simple to use; my test program to connect to a server and extract its SSL certificate is only 40 lines.

Here are the gotchas I know about so far:

  • despite what the Connection object documentation says about the .connect() method, it just connects the underlying socket. You must then call .do_handshake() to start SSL on the new connection.

  • end-of-connection raises ZeroReturnError instead of having the connection's .recv() method return a zero-sized result. I imagine this makes sense if you understand the OpenSSL library.

  • I'm not entirely clear what exactly .shutdown() does; it seems to only tell the other side that you're not going to send more stuff. (I probably need to write a server in pyOpenSSL to really understand it.)

  • there seems to be no way to find out the expiry date of an X509 certificate, although X509 objects have a function to tell you if they've expired.

  • the X509 .digest() function's single argument is a string that is the name of the hash you want to use. The available hashes on my Fedora Core 6 machine seem to be md2, md5, sha, sha1, dss1, and ripemd aka ripemd160. However, the exact list is in the depths of the OpenSSL library, so yours may differ; see the EVP_DigestInit manpage.

  • the string values of X509Name objects are almost but not quite the usual '/C=...' form; they have Python repr-style type bits glued on the ends.

  • some of the in-Python help text is misleading or wrong; trust the PostScript documentation instead, which is pretty good (if you like minimalism).

These are pretty much minor concerns, though; the only one that required me to do some serious delving was the .digest() issue. (And I'm coming into this completely ignorant of the underlying OpenSSL library routines, which doesn't help. Advanced usage of pyOpenSSL, like callbacks, clearly requires familiarity with OpenSSL itself.)

Sidebar: a simple example client

Here's an example of how you'd use the client side of pyOpenSSL (without error checking and the like):

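import socket
from OpenSSL import SSL

def ding(host, port, msg):
  s = socket.socket()
  # TLSv1 chosen more or less arbitrarily
  cx = SSL.Context(SSL.TLSv1_METHOD)
  cn = SSL.Connection(cx, s)

  # .connect() only connects the underlying socket;
  # .do_handshake() is what actually starts SSL.
  cn.connect((host, port))
  cn.do_handshake()

  cn.sendall(msg)
  tl = []
  try:
    # End-of-connection is signalled by ZeroReturnError,
    # not by a zero-sized .recv() result.
    while True:
      r = cn.recv(8192)
      tl.append(r)
  except SSL.ZeroReturnError:
    pass

  cn.shutdown()
  cn.close()
  # b"" so the join also works where .recv() returns bytes
  return b"".join(tl)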

As far as I can tell from strace, the .recv() calls are blocking and this won't spin waiting for network IO.
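The read loop in the sidebar can be pulled out into a small helper that treats the end-of-connection exception as ordinary termination. This is only a sketch: read_all, FakeConn and ConnectionClosed are made-up names, and the fake connection merely stands in for a pyOpenSSL Connection so that the helper can be tried without a network or pyOpenSSL itself.

```python
class ConnectionClosed(Exception):
    """Hypothetical stand-in for SSL.ZeroReturnError."""


class FakeConn(object):
    """Hypothetical connection that raises at end-of-stream, the way
    the post describes pyOpenSSL's Connection behaving."""

    def __init__(self, chunks):
        self._chunks = list(chunks)

    def recv(self, bufsize):
        if not self._chunks:
            raise ConnectionClosed()
        return self._chunks.pop(0)


def read_all(conn, eof_exc, bufsize=8192):
    """Collect everything conn.recv() returns until eof_exc is raised.

    A zero-sized result (the plain-socket convention) also ends the loop,
    so the same helper works with either style of connection.
    """
    parts = []
    try:
        while True:
            data = conn.recv(bufsize)
            if not data:
                break
            parts.append(data)
    except eof_exc:
        pass
    return b"".join(parts)
```

With a real pyOpenSSL connection the call would presumably be read_all(cn, SSL.ZeroReturnError), after the handshake has been done.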

PyOpenSSLComments written at 23:13:16

2007-01-17

A grump about the socket module's SSL support

It is nice that Python's socket module has simple SSL support (although it has some limitations). My grump is that it doesn't give you any good way of checking the identity of the server's certificate, which is especially annoying as the SSL code doesn't do any certificate verification.

(This matters to me because I have recently become quite interested in being able to verify machines by checking that they have a specific SSL certificate.)

What SSL objects do have is .issuer() and .server(), which give you the text form of the 'distinguished name' for the certificate authority (if any) and the server certificate. In practice, these are useless for reliably identifying a specific server (in part because there are significant ambiguities in the text versions of distinguished names; see e.g. this bug report).

What you actually need is information about the server certificate itself. The best thing would be a full copy of the server certificate as a binary object (since then I can just do whatever I want with it, including comparing it to my existing copy), but I'd be reasonably happy with a hash or other signature of the server's certificate. (And OpenSSL already has functions that will give you the certificate; I believe it would take two OpenSSL calls to pull the certificate out as a memory blob, namely SSL_get_peer_certificate followed by an appropriate i2d_X509 invocation.)
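As a rough illustration of the 'accept only this specific certificate' check, here is a minimal sketch that compares a hash of the server certificate's binary (DER) form against a stored value. It assumes the separate ssl module that later Python versions gained, not the socket module's SSL objects discussed here; in that module, getpeercert(binary_form=True) returns the DER blob even when CA verification is off. The function names (cert_fingerprint, matches_pinned, fetch_server_cert_der) and the choice of sha256 are mine, for illustration only.

```python
import hashlib
import hmac
import socket
import ssl


def cert_fingerprint(der_bytes, hash_name="sha256"):
    """Return the hex digest of a DER-encoded certificate."""
    return hashlib.new(hash_name, der_bytes).hexdigest()


def matches_pinned(der_bytes, pinned_hex, hash_name="sha256"):
    """True if the certificate's fingerprint equals the stored one."""
    actual = cert_fingerprint(der_bytes, hash_name)
    # Accept 'AA:BB:...' style fingerprints as well as plain hex.
    expected = pinned_hex.replace(":", "").lower()
    return hmac.compare_digest(actual, expected)


def fetch_server_cert_der(host, port, timeout=10):
    """Connect and return the server's certificate as DER bytes.

    CA verification is deliberately turned off here: the point of the
    sketch is that the caller decides trust by comparing fingerprints.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False  # must be cleared before CERT_NONE
    ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert(binary_form=True)
```

Usage would be along the lines of matches_pinned(fetch_server_cert_der(host, 443), stored_fingerprint), where stored_fingerprint is whatever was recorded from a known-good copy of the certificate.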

But I suppose that I shouldn't be too surprised. Almost nothing seems to offer an option to accept only a specific server certificate; at best you can insist that the certificate you get is signed by a specific CA.

SocketSSLGrump written at 14:01:28

2007-01-14

Wrapping exceptions versus propagating them untouched

I have somewhat recently written a program that makes heavy use of Python's xmlrpclib module. While the xmlrpclib module is very nice, the whole experience has given me some strong views on how it does exception handling.

The basic problem with xmlrpclib's error handling is that it doesn't capture exceptions from stuff that it calls, so they leak out to you. Fortunately I don't think xmlrpclib calls anything that can raise errors except the socket module, but I'm not sure (so my program may someday detonate with an uncaught error).

I suspect that there is a big debate over whether you should wrap exceptions from stuff you call or just pass them through, but having gone through the experience of dealing with xmlrpclib I have to come down solidly on the side of 'wrap exceptions'; it is simply much easier to deal with. It also means that I am less exposed to the internal details of how your module is implemented. As it is, I will have to modify my program if a future version of xmlrpclib starts calling something else that can also raise exceptions.

So one way to put it is that by not capturing exceptions from stuff your module calls, you make what your module calls part of your module's interface, because users of your module have to be aware of it in order to use your module safely. (And if some of those modules behave the same way, what they call is implicitly added to your module's interface and so on.)

History has shown us that large, vulnerable-to-change interfaces are pretty much a bad idea; you want small stable interfaces instead. Yes, wrapping up other people's exceptions in your own exceptions inevitably loses some information, but it's almost always going to be worth it.

(Maybe this is conventional wisdom, but since the xmlrpclib module doesn't wrap exceptions I suspect that it is not as widespread as I would like.)
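To make the 'wrap exceptions' approach concrete, here is a minimal sketch of a module that promises callers a single exception type. Everything in it is hypothetical: RPCError, call_remote and _low_level_call are invented names, and _low_level_call just stands in for whatever the module uses internally (the socket module, say). It is written for Python 3, where 'raise ... from' keeps the original exception attached as __cause__, which recovers some of the information that wrapping would otherwise lose.

```python
import socket


class RPCError(Exception):
    """Hypothetical: the one exception type this module promises to raise."""


def _low_level_call(fail):
    """Stand-in for something the module calls internally."""
    if fail:
        raise socket.error("connection refused")
    return "ok"


def call_remote(fail=False):
    """Public entry point: internal errors never leak out unwrapped."""
    try:
        return _low_level_call(fail)
    except (socket.error, ValueError) as e:
        # Callers only ever need to catch RPCError; the original
        # exception stays reachable as the __cause__ attribute.
        raise RPCError("remote call failed: %s" % (e,)) from e
```

A caller then writes a single 'except RPCError' clause, and that clause does not have to change if a later version of the module starts calling something else internally, provided the module adds the new exception type to its own except list.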

WrappingExceptions written at 22:56:32



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.