What fast SSH bulk transfer speed (probably) looks like in mid-2022

June 19, 2022

A number of years ago I wrote about what influences SSH's bulk transfer speeds, and in 2009 I wrote what turned out to be an incomplete entry on how fast various ssh ciphers were on the hardware of the time. Today, for reasons outside the scope of this entry, I'm interested in the sort of best case performance we can get on good modern hardware, partly because we actually have some good modern hardware for once. Specifically, we have two powerful dual-socket systems, one with AMD Epyc 7453s and one with Intel Xeon Gold 6348s.

To take our actual physical network out of the picture (since this is absolute best case performance), I ran my test suite against the system itself (although over nominal TCP by ssh'ing to its own hostname, not localhost). Both servers have more than enough CPUs and memory that this is not at all a strain for them. Both servers are running Ubuntu 22.04, where the default SSH cipher is chacha20-poly1305@openssh.com, with no separate MAC (it's implicit in the cipher).
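As an illustration of this sort of test (not the actual test suite, which isn't shown here), here is a minimal Python sketch that pushes zero bytes through ssh to a remote 'cat >/dev/null' and reports the rate. The function names, the 2 GiB default transfer size, and the choice to count 1 Mbyte as 10^6 bytes are all assumptions made for the example. The timing includes connection setup and authentication, so short runs will understate the steady-state speed.

```python
import subprocess
import time


def build_ssh_command(host, cipher=None, mac=None):
    """Build the ssh argv for a run that discards its input on the far side.

    -c and -m are the standard OpenSSH client options for choosing the
    cipher and the MAC; leaving them out uses the client's defaults.
    """
    cmd = ["ssh", "-o", "BatchMode=yes", "-o", "Compression=no"]
    if cipher:
        cmd += ["-c", cipher]
    if mac:
        cmd += ["-m", mac]
    cmd += [host, "cat >/dev/null"]
    return cmd


def mbytes_per_sec(nbytes, seconds):
    """Convert a byte count and duration to Mbytes/sec (assuming 10**6 bytes)."""
    return nbytes / seconds / 1_000_000


def time_transfer(host, cipher=None, mac=None,
                  total_bytes=2 * 1024**3, chunk=1024 * 1024):
    """Send total_bytes of zeros through ssh and return the observed rate.

    Hypothetical helper: the timing covers the whole ssh run, including
    connection setup, not just the bulk data phase.
    """
    block = bytes(chunk)  # a chunk of zero bytes, reused for every write
    start = time.monotonic()
    proc = subprocess.Popen(build_ssh_command(host, cipher, mac),
                            stdin=subprocess.PIPE)
    sent = 0
    while sent < total_bytes:
        proc.stdin.write(block)
        sent += chunk
    proc.stdin.close()
    returncode = proc.wait()
    elapsed = time.monotonic() - start
    if returncode != 0:
        raise RuntimeError(f"ssh exited with status {returncode}")
    return mbytes_per_sec(sent, elapsed)
```

Calling something like time_transfer("myhost", cipher="aes128-gcm@openssh.com") for each cipher of interest would give a rough comparison; 'ssh -Q cipher' and 'ssh -Q mac' list the names that the local client supports. Zeros are used because compression is turned off, so the content of the data shouldn't matter.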

On the AMD Epyc 7453 server, the default SSH cipher choice ran at about 448 Mbytes/sec. A wide variety of AES ciphers (both -ctr and -gcm versions) and MACs pushed the speed to over 600 Mbytes/sec and sometimes over 700 Mbytes/sec, although I don't think there's any one option that stands out as a clear winner.

On the Intel Xeon Gold 6348 server, the default SSH cipher choice ran at about 250 Mbytes/sec. Using aes128-gcm could reliably push the speed over 300 Mbytes/sec (with various MACs). Using aes256-gcm seemed slightly worse.

I happen to have some not entirely comparable results from machines with another Intel CPU, the Pentium D1508, on tests that were run over an essentially dedicated 10G network segment between two such servers. Here the default performance was only about 150 Mbytes/sec, but aes128-gcm could reliably be pushed to 370 Mbytes/sec or better, and aes256-gcm did almost as well.

(These Pentium D1508 machines are currently busy running backups, so I can't run a same-host test on them for an apples to apples comparison.)
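For a sense of scale on the 10G numbers, here is a small sketch of the arithmetic, assuming that 'Mbytes' means 10^6 bytes (the figures above don't say) and ignoring Ethernet and TCP overhead. The function name is made up for the example.

```python
def fraction_of_link(mbytes_per_sec, link_gbits=10):
    """Fraction of a link's raw capacity used by a given transfer rate.

    Assumes decimal units: 1 Gbit/sec = 1000 Mbits/sec = 125 Mbytes/sec.
    Protocol overhead is ignored, so the real ceiling is somewhat lower.
    """
    link_mbytes_per_sec = link_gbits * 1000 / 8
    return mbytes_per_sec / link_mbytes_per_sec


for label, rate in [("default cipher", 150), ("aes128-gcm", 370)]:
    print(f"{label}: {fraction_of_link(rate):.1%} of a 10G link")
```

Under those assumptions, a 10G link tops out at 1250 Mbytes/sec, so 150 Mbytes/sec is 12% of it and 370 Mbytes/sec is just under 30%. Neither figure is anywhere near the raw capacity of the link.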

What this says to me is that SSH speed testing is not trivial and has non-obvious results that I don't (currently) understand. If we care about SSH speed in some context, we need to test it in exactly that context; we shouldn't assume that results from other servers or other network setups will generalize.

Comments on this page:

Did you consider using https://www.psc.edu/hpn-ssh-home/ ? This may help on high bandwidth/latency links as regular SSH suffers from buffer bloat. Alternatively, I had good experience with stunnel + mbuffer to saturate a 2ms latency/1 Gbps link where plain SSH failed to do so.

By cks at 2022-06-20 11:02:06:

We haven't looked at HPN-SSH or things like that so far, because for our needs the situation is okay as it is. The one bulk SSH transfer we do regularly has performance that's restricted by other things (and probably always will be). If those limits go away, we can get good enough performance with native OpenSSH and the right cipher choices, and it's simpler to manage and maintain over time.

It seems that the HPN-SSH software is available as a PPA for (focal and impish?) now:

Newer releases seem to have made two changes that make parallel installation cleaner: commands are prefixed with "hpn" (ssh->hpnssh, scp->hpnscp, sshd->hpnsshd), and the default port for hpnsshd is 2222 out of the box, so it won't interfere with the default sshd (run 'hpnssh -p2222 user@host').

One (recent?) enhancement is the use of AES-NI instructions for the AES-CTR cipher.

Probably worth checking out for those that need to push things further and SSH does end up being the bottleneck.

Written on 19 June 2022.
Last modified: Sun Jun 19 21:46:57 2022
