2015-12-09
Goroutines, network IO streams, and the resulting synchronization problem
These days, I use my Go take on a netcat-like program for all of my casual network
(TCP) service testing. One of the things that makes it convenient
is that it supports TLS, so I can just do things like 'call tls
imap.cs imaps' instead of having to dabble in yet another netcat-like
program. However, this TLS support has an important and unfortunate
limit right now, which is that it only supports protocols where you
do TLS from the start. A certain number of protocols support a
during-connection upgrade to TLS, generally via some form of what
gets called STARTTLS;
the one most relevant to me is SMTP.
I would like to support STARTTLS in call, but this exposes a structural issue. Call is a netcopy program and these are naturally asynchronous in that copying from standard input to the network is (and often should be) completely decoupled from copying from the network to standard output. Call implements this in the straightforward Go way by using two goroutines, one for each copy. This works quite well normally (and is the only real way to support arbitrary protocols), but the moment you add STARTTLS you have a problem.
The problem is that STARTTLS intrinsically requires the input and
the output to synchronize (and really you need them to merge). The
moment you send STARTTLS, the previously independent to-network and
from-network streams must be synchronized and yoked together as TLS
exchanges messages with the server at the other end of the connection.
In a program based around poll() or select(), this is easy; you
can trivially shift from asynchronous IO streams back to synchronous
ones while you set up TLS, and then go back to asynchronous streams
over the TLS connection. But in a goroutine based system, it's far
from clear how to achieve this (especially efficiently), for two
reasons.
The first problem is simply establishing synchronization between the sending and receiving sides in some reasonably elegant way. One approach is some sort of explicit variable for 'starting TLS' that is written by the sending side and checked at every IO by the receiving side. In theory another approach is to switch to a model with more goroutines, with one for every IO source and sink and a central manager of some sort. The explicit variable seems not very Go-y, while the central manager raises the issue of how to keep slow IO output to either the network or stdout from blocking the other direction.
The second problem is that mere synchronization is not enough,
because there is no way to wake up on incoming network IO without
consuming some of it. This matters because when we send our STARTTLS,
we want to immediately switch network input handling from its
normal mode to 'bring up TLS' mode, without reading any of the
pending network input. If we cannot switch right away, we will wake
up having .Read()
some amount of the STARTTLS response, which
must then be handled somehow.
(Hopefully all STARTTLS-enabled protocols have a server response
in plaintext that comes before any TLS protocol messages. If the
first thing we may get back from the network in response to our
STARTTLS is part of the server TLS handshake, I'd have to somehow
relay that initial network data to tls.Client()
as (fake) network
input. The potential headaches there are big.)
I don't have any answers here and I don't know what the best solution is; I just have my usual annoyance that Go forces asynchronous (network) IO into a purely goroutine based architecture.
(The clear pragmatic answer is to not even try to support STARTTLS in call and to leave that to other tools, at least until I wind up with a burning need for it. But that kind of annoys me.)
A surprise about cryptographic signatures
I won't say that I know a lot about cryptography, but I know a certain amount. What this really means is that every so often I get the opportunity to be really surprised about something in cryptography. The most recent incident came about from reading Andrew Ayer's "Duplicate Signature Key Selection Attack in Let's Encrypt". I had to read the article more than once before I really understood the problem, but here is the cryptographic thing that really startled me, boiled down:
A digital signature is not necessarily tied to a message.
As Ayer puts it:
Digital signatures guarantee that a message came from a particular private key. They do not guarantee that a signature came from a particular private key, [...]
By extension (and Ayer mentions this later), a signature does not uniquely identify a message; many pairs of messages and keys may result in the same signature. The specific vulnerability that Ayer exploited is that in RSA, if you have a message and a signature, you can quite easily generate a private key that produces the given signature for the message. The original Let's Encrypt protocol was vulnerable to this issue because it had you basically publish your signature of their validation message to you. Since this signature was on its own, an attacker could arrange a situation where it was also a valid signature for a different message signed with the attacker's key.
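The underlying idea can be seen at toy scale. The sketch below is emphatically not Ayer's construction: it uses tiny textbook RSA numbers with no hashing or padding, and it finds a second key by brute force, which is only feasible because the numbers are so small. All of the function names are invented for the illustration. What it shows is just that, given a message and a signature, a different valid-looking RSA public key can exist under which the same signature verifies for the same message.

```go
package main

import "fmt"

// modPow returns base^exp mod n by square-and-multiply. Every value in
// this toy stays far below 2^32, so the uint64 products cannot overflow.
func modPow(base, exp, n uint64) uint64 {
	result := uint64(1) % n
	base %= n
	for exp > 0 {
		if exp&1 == 1 {
			result = result * base % n
		}
		base = base * base % n
		exp >>= 1
	}
	return result
}

// verify is textbook RSA verification: no hashing, no padding.
func verify(msg, sig, e, n uint64) bool {
	return sig < n && modPow(sig, e, n) == msg
}

func gcd(a, b uint64) uint64 {
	for b != 0 {
		a, b = b, a%b
	}
	return a
}

func isPrime(n uint64) bool {
	if n < 2 {
		return false
	}
	for d := uint64(2); d*d <= n; d++ {
		if n%d == 0 {
			return false
		}
	}
	return true
}

// findOtherKey brute-forces a toy public key (n = p*q, e), with p and q
// distinct odd primes below limit and e invertible mod lcm(p-1, q-1),
// under which sig is a valid signature for msg.
func findOtherKey(msg, sig, limit uint64) (uint64, uint64, bool) {
	for p := uint64(3); p < limit; p++ {
		if !isPrime(p) {
			continue
		}
		for q := p + 1; q < limit; q++ {
			n := p * q
			if !isPrime(q) || n <= sig || n <= msg {
				continue
			}
			lambda := (p - 1) / gcd(p-1, q-1) * (q - 1)
			for e := uint64(3); e < lambda; e += 2 {
				if gcd(e, lambda) == 1 && verify(msg, sig, e, n) {
					return n, e, true
				}
			}
		}
	}
	return 0, 0, false
}

func main() {
	// The victim's toy key: n = 61*53, e = 17, d = 2753.
	const n, e, d = 3233, 17, 2753
	msg := uint64(65)
	sig := modPow(msg, d, n) // textbook signing with the private exponent
	fmt.Println("victim key verifies:", verify(msg, sig, e, n))

	n2, e2, ok := findOtherKey(msg, sig, 100)
	fmt.Println("found another key:", ok, "n =", n2, "e =", e2)
	if ok {
		fmt.Println("other key verifies:", verify(msg, sig, e2, n2))
	}
}
```

With real key sizes a brute-force search like this is hopeless, which is why the details in Ayer's article matter.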
(The article is well worth reading in full, just to absorb the details of both how this works in RSA and how the specific attack worked against Let's Encrypt's original protocol.)
Until I read this article, I would not have expected this result at all. Had I been in a situation where it mattered, I wouldn't even have thought about the assumptions I was making about how a message, a signature, and a private key were connected; I probably would have just assumed that a signature was inextricably tied to both the message and the private key. Nope. Not so at all.
The direct lesson I take away from this is that anything involving a signature floating around on its own is dangerous, and if I ever design any sort of validation protocol I should avoid it. The indirect lesson is yet another useful reminder that I do not know enough about cryptography to be designing anything involving cryptography. If I try to do this in any non-toy context, the things I don't even know I don't know will probably eat me for breakfast without breaking a sweat.