PMTU Discovery

Guus Sliepen guus at tinc-vpn.org
Thu Jan 6 23:34:57 CET 2011


On Thu, Jan 06, 2011 at 10:33:28PM +0100, Daniel Schall wrote:

> > > I suggest to implement a check, if the compressed data really is smaller
> > > than the uncompressed one and decide, what to send over the wire.
> 
> > Although that could be done I don't think it is worth the effort for the
> > few packets that might benefit from it.
> 
> I don't get, why you do this the hard way with so much uncertainties.  PMTU
> Discovery could be done easier by sending data packets of arbitrary size
> WITHOUT compressing them.  That would enable the nodes to check the real data
> size that fits through the channel.

It doesn't matter, the code checks the size of the payload before decompressing
it, so it sees exactly how many bytes you can stuff in the actual UDP packets,
whether those bytes contain compressed data or not. For tinc 1.0.x, the probes
need to use compression to stay backwards compatible.

> Although PMTU Discovery works with compressed packets and some "magic" to
> allow badly compressible packets as well, I think just sending uncompressed
> packets would be a lot easier and more reliable.

You would still need to use the "magic" for badly compressible packets even if
you send uncompressed MTU probes.

> First, the method "send_udppacket" checks, if compression is enabled and if
> it really reduces the size of the packet.
> If not, the packet is sent uncompressed. A flag in vpn_packet_t is set
> accordingly to allow the remote node to check whether decompression is
> needed or not.
[...]
> --- a/src/net.h
> +++ b/src/net.h
> @@ -79,11 +79,16 @@
>  typedef struct vpn_packet_t {
[...]
> +	union {
> +		struct {
> +			uint32_t local:1;		/* is this packet sent to a local interface? */
> +			uint32_t pmtud:1;		/* is this packet used for PMTU discovery? */
> +			uint32_t pmtur:1;		/* is this packet used for PMTU reply? */
> +			uint32_t compress:1;	/* is this packet compressed? */
> +			uint32_t unused:28;		/* unused bits */
> +		} flags;
> +		uint32_t flag_bits;			/* to easily zero out and access flag bits */
> +	};

The problem with bitfields is that it is undefined how they are laid out in
memory, so you cannot use them when you send them over the wire. If you add a
field to a protocol, it is best to use an uint32_t or similar type and #define
bit masks to set/reset individual bits by name. Even then you need to convert
the new field to network byte order with htonl() before sending the packet, and
do the reverse when receiving it.

> Second, the method "send_mtu_probe" now sends 3 packets between minmtu and
> maxmtu, with compression disabled. ("send_udppacket" takes this into
> account.)
[...]
> -	for(i = 0; i < 3; i++) {
> +	const int nbPackets = 3;
> +	for(i = 0; i < nbPackets; i++) {
>  		if(n->maxmtu <= n->minmtu)
>  			len = n->maxmtu;
>  		else
> -			len = n->minmtu + 1 + rand() % (n->maxmtu - n->minmtu);
> +			len = n->minmtu + 1 + (n->maxmtu + 1 - n->minmtu) / (double) nbPackets * i;

There is a reason I used rand(): suppose minmtu = 0 and maxmtu = 1514, some
firewall is blocking ICMP packets, and the actual path MTU is 505. Your scheme
sends packets with sizes 1, 506 and 1011. The last two packets are dropped, but
since tinc doesn't know about it it doesn't lower maxmtu. Minmtu is now set to
1, and the next round of packets have sizes 2, 506, 1011. You can calculate
yourself where this is going: tinc will stop after 30 probes and think the PMTU
is 30, which is not correct.

You can not assume anything when a probe is not returned, this could be because
the probe was larger than the PMTU, but also because the packet was dropped due
to temporary congestion. If you could, then your method would work fine.

> The method "mtu_probe_h" receives the MTU probes and adjusts the min/max mtu
> of the nodes accordingly. Afterwards, it replies the MTU packet to the
> sender. That enables the sender to get the MTU correctly as well.

That also is unfortunately the wrong thing to do. The PMTU does not need to be
the same in both directions.

> By the way, I also removed the method "usleep" from your code, since my
> version of mingw has the method already built in. Therefore, there is no
> need to define/implement the method on your own.
> Maybe you need to include that method conditionally, depending if there
> already exists a built-in "usleep" method.

Yes, I'll add an autoconf check for usleep.

> In case you don't like my coding style / submission style, please feel free
> to incorporate the changes to your repository your own way.

The coding style is OK, the git-email commit is also fine. But you should not
combine unrelated changes in one commit. If you accidentily did, you can undo
the last commit with "git reset HEAD^" (this undoes the commit but doesn't
throw away the changes you made to the files), and then commit the individual
bits and pieces again. Also, in most cases you should just remove lines instead
of uncommenting them, you can find them back in the git history anyway.

> If you could provide some git branches for external developers, that'd be
> really great!

Yes, I'll do that, just have to figure out what the best way is. Does anyone on
the list have any experience setting up a public git repository with private
branches?

> In this commit, I also finished rewriting the logging part. Each log level
> has now a flag and you can combine log levels you want to log by ORing the
> according flags.

Hm, I don't see that in the commit, only what you write below:

> CTRL+C enables DEBUG_SCARY_THINGS similar to the old behavior.

-- 
Met vriendelijke groet / with kind regards,
     Guus Sliepen <guus at tinc-vpn.org>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://www.tinc-vpn.org/pipermail/tinc-devel/attachments/20110106/7c7d1a99/attachment.pgp>


More information about the tinc-devel mailing list