2015-04-04
An important note if you want to totally stop an IKE IPSec connection
Suppose, hypothetically, that you think your IPSec GRE tunnel may be contributing to some weird connection
problem you're having. In order to get it out
of the picture, you want to shut it down (which will still leave
you able to reach things). There
are three ways you can do this: you can use 'ipsec whack --terminate'
to ask your local pluto to shut down this specific IKE connection
(which you've engineered to stop the GRE tunnel), you can shut your
local pluto down entirely with 'systemctl stop pluto' (or
equivalent), or you can stop pluto on both ends.
I will skip to the punchline: if you have no *protoport set (so
that you're doing IPSec on all traffic just because you might as
well), you need to shut pluto down on both ends. Merely shutting
down the IKE IPSec stuff for your GRE tunnel (and taking down the
tunnel itself) will leave the overall IPSec security policy intact,
and this policy specifically instructs the kernel to drop any
non-IPSec packets between your left and right IPs. Only shutting
down pluto itself will get rid of the security policy, and you
need to get rid of it on both ends, so you need to shut down pluto
on both.
(If pluto is handling more than one connection for you on one of
the ends, you're going to need to do something more complicated.
My situation is usefully simple here.)
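If you want to check whether the security policy is still lingering on one end, one approach is to look for kernel policies that cover your two IPs. What follows is only a minimal sketch in Python. It assumes the usual layout of 'ip xfrm policy' output (a 'src ... dst ...' selector line followed by an indented 'dir ...' line), and the function names are made up for illustration.

```python
import ipaddress
import subprocess


def parse_policies(text):
    """Parse 'ip xfrm policy' style text into (src_net, dst_net, direction) tuples.

    Assumes each policy starts with an unindented 'src NET dst NET' line and
    that the direction appears as 'dir in|out|fwd' on an indented line after it.
    """
    policies = []
    current = None
    for line in text.splitlines():
        words = line.split()
        if not words:
            continue
        is_selector = (not line[0].isspace() and len(words) >= 4
                       and words[0] == "src" and words[2] == "dst")
        if is_selector:
            current = [ipaddress.ip_network(words[1], strict=False),
                       ipaddress.ip_network(words[3], strict=False),
                       None]
            policies.append(current)
        elif current is not None and "dir" in words[:-1]:
            current[2] = words[words.index("dir") + 1]
    return [tuple(p) for p in policies]


def policies_between(text, left, right):
    """Return the policies whose selectors cover traffic between two IPs."""
    a = ipaddress.ip_address(left)
    b = ipaddress.ip_address(right)
    hits = []
    for src, dst, direction in parse_policies(text):
        if direction is None:
            # No 'dir' line seen (for example per-socket policies); skip these.
            continue
        if (a in src and b in dst) or (b in src and a in dst):
            hits.append((str(src), str(dst), direction))
    return hits


def current_policy_text():
    """Run 'ip xfrm policy' and return its output (probably needs root)."""
    result = subprocess.run(["ip", "xfrm", "policy"], check=True,
                            capture_output=True, text=True)
    return result.stdout
```

Calling policies_between(current_policy_text(), left_ip, right_ip) on each end should come back empty once the policy is really gone. If it still lists entries on the end where pluto is running, that would fit the situation described here.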
If you shut down pluto on only one end and then keep trying to
test things, you can get into very puzzling and head-scratching
problems. For instance, if you try to make a connection from the
shut-down side to the side with pluto still running, tcpdump
on both ends will tell you that SYN packets are being sent and
arriving at their destination but are getting totally ignored despite
there being no firewall rules and so on that would do this.
(If you have a selective *protoport set, any traffic that would
normally be protected by IPSec will be affected by this because the
security policy says 'drop any of this traffic that is not protected
with IPSec'.)
PS: your current IPSec security policies can be examined with
'setkey -DP'. There's probably some way to get a counter of how
many packets have been dropped for violating IPSec security policies,
but I don't know what it is (maybe it's hiding somewhere in 'ip xfrm',
which has low-level details of this stuff, although
/proc/net/xfrm_stat doesn't seem to be it).
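If you want to check for yourself whether anything in /proc/net/xfrm_stat moves while packets are being dropped, one option is to snapshot it before and after a test and compare. This is a rough sketch in Python; the helper names are hypothetical, it assumes the file is a simple list of 'name value' lines, and as far as I know the file only exists on kernels built with xfrm statistics enabled. Whether any of these counters actually reflect policy drops is exactly the open question above.

```python
def parse_xfrm_stat(text):
    """Turn 'Name   value' lines into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


def changed_counters(before, after):
    """Return {name: delta} for every counter that differs between snapshots."""
    return {name: after[name] - before.get(name, 0)
            for name in after
            if after[name] != before.get(name, 0)}


def read_xfrm_stat(path="/proc/net/xfrm_stat"):
    """Read and parse the live counters from the given path."""
    with open(path) as f:
        return parse_xfrm_stat(f.read())
```

The idea is to call read_xfrm_stat() once, try the connection that gets ignored, call it again, and pass both results to changed_counters(). An empty result would suggest the drops are being counted somewhere else, if anywhere.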
A weird new IKE IPSec problem that I just had on Fedora 21's latest kernel
Back when I first wrote up my IKE configuration for my point to point GRE tunnel, I restricted the IKE IPSec configuration so that it would only apply IPSec to the GRE traffic with:
    conn cksgre
        [...]
        leftprotoport=gre
        rightprotoport=gre
        [...]
I only did this restriction out of caution and matching my old
manual configuration. A while later I decided that it was a little
silly; although I basically didn't do any unencrypted traffic to
the special GRE touchdown IP address I use at the work end, I might
as well fully protect the traffic since it was basically free. So
I took the *protoport restrictions out, slightly increasing my
security, and things worked fine for quite some time.
Today this change quietly blew up in my face. The symptoms were that often (although not always) a TCP connection specifically between my home machine and the GRE touchdown IP would stall after it transferred some number of bytes (it's possible that the transfer direction mattered but I haven't tested extensively). Once I narrowed down what was going on from the initial problems I saw, reproduction was pretty consistent: if I did 'ssh -v touchdown-IP' from home I could see it stall during key exchange.
I don't know what's going on here, but it seems specific to running the latest Fedora 21 kernel on both ends; I updated my work machine to kernel 3.19.3-200.fc21 a couple of days ago and did not have this problem, but I updated my home machine to 3.19.3-200.fc21 a few hours ago and started seeing this almost immediately (although it took some time and frustration to diagnose just what the problem was).
(I thought I had some evidence from tcpdump output but in retrospect I'm not sure it meant what I thought it meant.)
(I had problems years ago with MTU collapse in the face of recursive GRE tunnel routing, but that was apparently fixed back in 2012 and anyways this is kind of the inverse of that problem, since this is TCP connections flowing outside my GRE tunnel. Still, it feels like a related issue. I did not try various ways of looking at connection MTUs and so on; by the time I realized this was related to IPSec instead of other potential problems it was late enough that I just wanted the whole thing fixed.)