Wandering Thoughts archives

2008-09-30

Using Python to find out what cipher an SSL server is using

I have a new-found interest in finding out what ciphers various SSL servers around here are using in some easy and convenient way. Doing this in Perl is easy (there's an example here), but I prefer Python. I'd normally use pyOpenSSL, my favorite Python OpenSSL module, but unfortunately it doesn't (currently) have an interface to the necessary 'get the connection's cipher' OpenSSL routine, and there's no visible substitute for it.

(While the .get_cipher_list() method of Connection objects looks tempting, I discovered through some experimentation that it doesn't return anything like the list of ciphers that are common between the server and the client.)

However, all is not lost; it turns out that the commonly available M2Crypto module does have enough functionality to do this. In fact, it's pretty easy, and so here is sample code to do it (omitting error checking, as usual):

from M2Crypto import SSL
def printcipher(host, port):
    ctx = SSL.Context('tlsv1')
    s = SSL.Connection(ctx)

    # No, really, all I care is that you successfully
    # negotiated SSL, especially as some of your checks
    # are broken.
    s.postConnectionCheck = None
    s.connect((host, port))

    # The space at the end is [sic]
    if s.get_state() == "SSLOK ":
        c = s.get_cipher()
        cp = c.name()
        print("Cipher %s, %d bits" % (cp, len(c)))
        if cp.startswith("DHE-") or cp.startswith("EDH-"):
            print("forward secrecy: yes")
        else:
            print("forward secrecy: probably not")
    s.close()

This code works for protocols that start SSL/TLS immediately on connection, such as https or imaps. Extension to protocols that have a plaintext conversation and then start TLS (such as ESMTP with STARTTLS) is left as an exercise for the sufficiently interested.
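One way to approach the STARTTLS case is sketched below, under some loud assumptions: it uses Python 3's standard smtplib and ssl modules instead of M2Crypto, the function names are invented for illustration, and treating ECDHE- names and TLS 1.3 connections as forward secret is an addition that is not in the code above. Certificate checks are switched off in the same spirit as the postConnectionCheck line.

```python
import smtplib
import ssl


def forward_secrecy(cipher_name, protocol):
    """Guess forward secrecy from the negotiated cipher name and protocol.

    DHE-/EDH- are the prefixes checked in the M2Crypto code; ECDHE- and
    the TLS 1.3 case are assumptions added for this sketch.
    """
    if protocol == "TLSv1.3":
        # TLS 1.3 suite names (TLS_AES_...) do not name the key exchange;
        # a fresh TLS 1.3 handshake uses an ephemeral (EC)DHE exchange.
        return True
    return cipher_name.startswith(("DHE-", "EDH-", "ECDHE-"))


def smtp_starttls_cipher(host, port=25):
    """Return (name, protocol, bits) as negotiated after STARTTLS."""
    # Deliberately skip certificate checks; all we want is the cipher.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE

    conn = smtplib.SMTP(host, port, timeout=30)
    try:
        conn.ehlo()
        conn.starttls(context=ctx)
        # After starttls(), conn.sock is an ssl.SSLSocket.
        return conn.sock.cipher()
    finally:
        conn.close()


def printcipher_starttls(host, port=25):
    name, protocol, bits = smtp_starttls_cipher(host, port)
    print("Cipher %s (%s), %d bits" % (name, protocol, bits))
    if forward_secrecy(name, protocol):
        print("forward secrecy: yes")
    else:
        print("forward secrecy: probably not")
```

Nothing here touches the network until printcipher_starttls() is called with a real mail server's host name.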

I believe but am not sure that there is no point in asking for anything except TLS v1; the OpenSSL ciphers(1) manpage suggests that SSLv3 and TLSv1 have the same set of ciphers. (Presumably this is TLS v1.0.)

(Note that M2Crypto is pretty under-documented; reading the source is not so much recommended as required. Fortunately it comes with a large set of examples.)

FindingSSLCipher written at 00:41:06

2008-09-09

The problem with unit testing programs

A commentator on my entry on why your main program should be importable mentioned that a third reason is to be able to unit test your program. I agree with the sentiment in general, but in specific I've never been able to get unit testing of my main program code to work very well.

While I can unit test most of the low-level code involved in a program, I've found that much of the high level code, the stuff that goes in the main program, has two issues. First, this code really wants to interact with the real world, unless it is horribly contorted to add yet another level of indirection. (This isn't surprising, as interacting with the real world is really the purpose of the top level of code; everything else manipulates data and objects, and the top level code does something useful with all that work.)

Second, when the program produces output (that should be tested to make sure that it is producing the right output) I find that I fiddle with the output formats a lot in order to get things to look right. Unfortunately this generally destroys unit tests that 'know' what the correct output looks like, and it rapidly becomes too much of a pain to maintain them; instead of being helpful, the unit tests have become an extra bureaucratic step (and obstacle) in tuning the program.

(I am probably unusually picky about how the output of my programs looks, and I also find that I can't tell for sure until I actually run the program and see it live.)

ProgramUnitTestProblem written at 00:54:13

2008-09-08

How to get as much of your program byte-compiled as possible

The CPython interpreter doesn't run Python code directly, but first compiles it to bytecode. Because these parsing and compiling steps are somewhat time-consuming, CPython optimizes things by saving the compiled bytecodes into .pyc files (or .pyo files if you used the -O switch to python), and trying to use them when possible. This speed increase doesn't necessarily matter all that much for a program that runs a long time, but it does matter for typical utility programs, which only run for a short time and may get executed a lot. (This is especially so if they have a lot of code that is only used occasionally, so that Python has a lot of code to compile that it will never actually run.)

One of the less well known things about CPython (at least on Unix) is that it only loads bytecode files when you import modules. If you just run a file of Python code with 'python file.py' (which is equivalent to a script that starts with '#!/usr/bin/python'), Python does not load file.pyc even if it exists, and it certainly won't create it. So if you have a big program full of Python code, you're paying the cost to byte-compile it each time you run it.

The way around this is obvious: put all of your program's code in a module, and then have your 'program' just be a tiny bit of Python that imports the module and then calls your main entry point. (Of course, this may then expose you to search path issues.)
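A small self-contained sketch of the effect, assuming a current Python 3 (where the saved bytecode lands in a __pycache__ directory rather than next to the source). The module name myprog and its main() are hypothetical stand-ins; the sketch writes the module into a temporary directory, imports it the way a tiny launcher would, and then checks that bytecode was saved.

```python
import importlib
import importlib.util
import os
import sys
import tempfile

# Hypothetical module standing in for "all of your program's code".
MODULE_SOURCE = (
    "def main(args):\n"
    "    return 'ran with %d argument(s)' % len(args)\n"
)

workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "myprog.py")
with open(source_path, "w") as f:
    f.write(MODULE_SOURCE)

# Make sure bytecode writing is on, even if PYTHONDONTWRITEBYTECODE is set.
sys.dont_write_bytecode = False

# This part is the whole of the tiny "program": import the module, call main().
sys.path.insert(0, workdir)
importlib.invalidate_caches()
myprog = importlib.import_module("myprog")
result = myprog.main(["a", "b"])

# The import left compiled bytecode behind for the next run to load.
cached_path = importlib.util.cache_from_source(source_path)
print(result)
print("bytecode cached:", os.path.exists(cached_path))
```

In a real launcher script only the import and the call to main(sys.argv[1:]) would remain; everything else above is scaffolding for the demonstration.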

The one other thing to watch out for is that the user running the Python program may not have enough permissions on the directory with the Python modules to write .pyc files there. If this is the case, you may need to 'pre-compile' the modules by running the script once as a user with enough permissions (or just manually import'ing them in an interpreter).
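Pre-compiling can also be done without running the program at all; the standard library's compileall module exists for this. The sketch below builds a throwaway directory with a hypothetical myprog.py so that it can run anywhere. In practice you would point compile_dir() at the real module directory, as a user who can write there.

```python
import compileall
import importlib.util
import os
import tempfile

# Hypothetical module directory, created here so the sketch is self-contained.
moddir = tempfile.mkdtemp()
source_path = os.path.join(moddir, "myprog.py")
with open(source_path, "w") as f:
    f.write("def main(args):\n    return 0\n")

# Byte-compile every .py file under moddir; quiet=1 suppresses the
# per-file listing but still reports errors.
ok = compileall.compile_dir(moddir, quiet=1)

compiled_path = importlib.util.cache_from_source(source_path)
print("compiled cleanly:", bool(ok))
print("bytecode file exists:", os.path.exists(compiled_path))
```

The command-line equivalent is 'python -m compileall DIR', which is convenient to run once at install time.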

ByteCompiledPrograms written at 00:18:07

2008-09-07

Why your main program should be importable

When I first started coding in Python, I didn't know what I was doing. So I structured my Python programs the way I would write Bourne shell scripts or Perl programs, writing functions as necessary and useful but otherwise putting all of the logic and code in the program's file outside of functions (in what I now call 'module scope').

This is a perfectly rational structure for Python programs, and even works; my programs ran fine and were perfectly functional. But it was also a bad mistake, as I slowly discovered later; what you really want to do is put all of your code in functions (and then start one with magic).

The problem that makes it a mistake is that a program written this way cannot be imported as if it was just another Python module; if you try, the program's code immediately starts running and explosive things start happening. There are at least two reasons why this is unfortunate:

  • various useful tools like pychecker rely on importing your code in order to pick through it. This is arguably a mistake on pychecker's part and they should be using a more robust mechanism, but it's how they work right now, so if you want to use them (and pychecker is usually quite useful) you have to live with it.

    (Discovering pychecker and trying to use it on my programs was how I began to realize the mistake I'd made.)

  • being able to import your main program gives you a handy method of testing bits of it from an interactive interpreter.

    To make this really work you need to code your program so that it calls sys.exit() as little as possible. If a function runs into a fatal error it should not do the usual 'call die() with an error message' thing; instead, it should raise an exception. Only the very top of the program should catch those exceptions and wind up calling sys.exit().

    (And if you don't like phase tracking, catching and wrapping exceptions can give you a nice method to add context to the error message that you'll wind up reporting.)

I'm sure that this is strongly suggested somewhere in the Python documentation and the smart people were aware of it from the start, but I missed it (to my regret with those early programs).

Oh yes, the magic you need to make your top level function start running when your program is actually run (instead of being imported) is:

if __name__ == "__main__":
    ... run code here ...

At the module scope, __name__ is normally the name of your module (well, the name it is being imported by). When Python is running your code because it has been directly handed to the interpreter, Python sets the name to "__main__" instead.
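The difference is easy to see for yourself. This is a small sketch in Python 3, using a hypothetical one-line module called namedemo that does nothing but record what __name__ was; runpy.run_path() with run_name="__main__" stands in for handing the file directly to the interpreter.

```python
import importlib
import os
import runpy
import sys
import tempfile

# A hypothetical one-line module that just records what __name__ was.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "namedemo.py")
with open(path, "w") as f:
    f.write("seen_name = __name__\n")

# Case 1: imported as a module.
sys.path.insert(0, workdir)
importlib.invalidate_caches()
imported = importlib.import_module("namedemo")
sys.path.remove(workdir)

# Case 2: run as a program. run_path() returns the resulting globals.
as_program = runpy.run_path(path, run_name="__main__")

print("imported:", imported.seen_name)            # imported: namedemo
print("run directly:", as_program["seen_name"])   # run directly: __main__
```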

Sidebar: my current program structure

The program structure that I have wound up adopting for my own programs looks something like this:

import sys
def process(...):
    .....

def main(args):
    .....
    try:
        process(...)
    except EnvironmentError as e:
        die("OS problem: "+str(e))
    except MyError as e:
        die(str(e))
    ....

if __name__ == "__main__":
    main(sys.argv[1:])

The main() function parses the arguments, loads configuration files, and so on, and then calls process() with whatever arguments are appropriate for the program; process() actually starts to do work. To put it one way, main() does all the stuff that only has to be done when the program is being run as an actual program.
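To make the skeleton concrete, here is one way it might be filled in, written for Python 3. Everything specific is invented for illustration: MyError and die() are hypothetical stand-ins for whatever your program defines, and counting lines is just a placeholder job for process(). The point it tries to show is that process() raises exceptions (wrapping the low-level error to add context) and only main() ever exits.

```python
import sys


class MyError(Exception):
    """Hypothetical program-specific error."""


def die(msg):
    # Hypothetical helper: report the problem and exit with a failure status.
    sys.stderr.write("prog: %s\n" % msg)
    sys.exit(1)


def process(paths):
    """Count lines in the given files, raising MyError instead of exiting."""
    total = 0
    for path in paths:
        try:
            with open(path) as f:
                total += sum(1 for _ in f)
        except EnvironmentError as e:
            # Wrap the low-level error to add context for the final message.
            raise MyError("cannot count lines in %s: %s" % (path, e))
    return total


def main(args):
    # Only this level turns errors into an exit status.
    try:
        print("%d lines" % process(args))
    except MyError as e:
        die(str(e))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Because process() never calls sys.exit(), you can import the file in an interactive interpreter and call process() on a bad path; you get a MyError to look at and the interpreter session survives.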

ImportableMain written at 01:02:07



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.