"Invalid KEX record length" during SPTPS key regeneration and related issues

Guus Sliepen guus at tinc-vpn.org
Sat May 16 20:36:56 CEST 2015


On Sat, May 16, 2015 at 04:53:33PM +0100, Etienne Dechamps wrote:

> I believe there is a design flaw in the way SPTPS key regeneration
> works, because upon reception of the KEX message the other nodes will
> send both KEX and SIG messages at the same time. However, the node
> expects SIG to arrive after KEX. Therefore, there is an implicit
> assumption that messages won't arrive out of order. tinc makes no such
> guarantee, even over TCP metaconnections, because there is no
> guarantee the two messages will travel along the same path (consider
> the case where there is a change in the graph while the KEX and SIG
> messages are traveling). In fact, messages can even be lost if a node
> responsible for forwarding them crashed before being able to do so.

You are right. The main issue with the SPTPS datagram protocol is that
it actually doesn't handle any packet loss or reordering during
authentication and key regeneration. I will add this, so it will be able
to run completely over UDP.
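
As a rough illustration of what tolerating reordering could look like, here is
a minimal sketch in which a SIG that arrives before its KEX is parked instead
of being rejected. All names (rekey_state, rekey_receive, REC_KEX, REC_SIG,
MAX_REC) are hypothetical and do not correspond to the actual definitions in
sptps.c; key storage and signature verification are only marked by comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record types and state; not the real sptps.c definitions. */
enum rec_type { REC_KEX, REC_SIG };

#define MAX_REC 256

struct rekey_state {
    bool have_kex;                  /* peer's KEX has been processed */
    bool have_sig;                  /* peer's SIG has been processed */
    bool sig_pending;               /* a SIG arrived early and is parked */
    unsigned char pending[MAX_REC]; /* copy of the parked SIG record */
    size_t pending_len;
};

/* Feed one received record. Returns true once both KEX and SIG have been
   processed, regardless of the order in which they arrived. */
static bool rekey_receive(struct rekey_state *s, enum rec_type type,
                          const unsigned char *data, size_t len) {
    assert(s != NULL);
    if (len > MAX_REC)
        return false;               /* oversized record: drop it */

    switch (type) {
    case REC_KEX:
        if (s->have_kex)
            break;                  /* duplicate, e.g. a retransmission */
        s->have_kex = true;         /* real code would store the peer's key */
        if (s->sig_pending) {
            /* real code would verify s->pending against the new key here */
            s->sig_pending = false;
            s->have_sig = true;
        }
        break;

    case REC_SIG:
        if (s->have_sig)
            break;                  /* duplicate */
        if (!s->have_kex) {
            /* SIG overtook KEX: park it until the KEX shows up */
            memcpy(s->pending, data, len);
            s->pending_len = len;
            s->sig_pending = true;
        } else {
            s->have_sig = true;     /* real code would verify it here */
        }
        break;
    }
    return s->have_kex && s->have_sig;
}
```

This only covers reordering and duplicates. Lost records would still need
retransmission or a timeout.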

One reason it is currently using TCP is that for UDP packet reception,
tinc requires an established session so it can verify the source of the
packets, so the initial handshake has to be done via the metaprotocol.
And then I got lazy.

> This is not so much of an issue for initial SPTPS negotiation because
> the handshake is restarted after a 10-second timeout, but there is no
> such timeout for key regeneration,

Indeed, such a timeout should be added.
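
A minimal sketch of such a timeout, assuming the same 10-second value that the
initial handshake uses. The names (rekey_timer, rekey_begin, rekey_done,
rekey_expired, REKEY_TIMEOUT) are made up for illustration and are not part of
tinc.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Assumed value, mirroring the 10-second initial handshake timeout. */
#define REKEY_TIMEOUT 10

/* Hypothetical per-session bookkeeping for one key regeneration. */
struct rekey_timer {
    bool in_progress;  /* a key regeneration has been started */
    time_t started;    /* when it was started */
};

/* Call when the local KEX for a regeneration is sent. */
static void rekey_begin(struct rekey_timer *t, time_t now) {
    assert(t != NULL);
    t->in_progress = true;
    t->started = now;
}

/* Call when the regeneration has completed successfully. */
static void rekey_done(struct rekey_timer *t) {
    assert(t != NULL);
    t->in_progress = false;
}

/* Returns true if a regeneration has stalled and should be restarted. */
static bool rekey_expired(const struct rekey_timer *t, time_t now) {
    return t->in_progress && now - t->started >= REKEY_TIMEOUT;
}
```

A periodic handler could call rekey_expired() for each session and restart the
exchange when it returns true.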

> The legacy protocol doesn't have that problem because KEY_CHANGED is a
> broadcast message - meaning it can't really get lost.

Actually, it can get lost just as well, although it is very unlikely that
a broadcast message gets lost, and even less likely that this happens
right when a KEY_CHANGED message is sent.

> I believe there is yet another, more benign issue with key
> regeneration as well: during the short window of time where it
> happens, the tunnel is unusable, and packets get lost. Key
> regeneration takes 2 round trips over the network, which can easily
> result in 300+ ms outages on high latency links.

Yes, I will fix that. If the secondary KEX is done properly over UDP,
then it is quite trivial to have it switch over to a new key without
losing packets if there is no reordering going on.

> With these issues in mind, I wonder if it's really worth trying to
> patch the current key regeneration protocol - maybe we should simply
> come up with a new one. How about simply terminating the current SPTPS
> channel and creating a new one? That would remove the need for a key
> regeneration protocol altogether, since it's just creating and
> terminating SPTPS channels. In fact, req_key_ext_h(REQ_KEY) is already
> smart enough to restart SPTPS if there's already a channel.
> Furthermore, if we allow the old and new channels to overlap for a
> short period of time, we can prevent packet loss during regeneration.

I think this just shifts the complexity from sptps.c to net_packet.c. So
I'd rather fix SPTPS.

-- 
Met vriendelijke groet / with kind regards,
     Guus Sliepen <guus at tinc-vpn.org>