subnet flooded with lots of ADD_EDGE request

Guus Sliepen guus at tinc-vpn.org
Tue Dec 18 17:05:37 CET 2018


On Tue, Dec 11, 2018 at 02:36:18PM +0800, Amit Lianson wrote:

>   We're suffering from sporadic network blockage (read: unable to ping
> other nodes) with 1.1-pre17.  Before upgrading to the 1.1-pre release,
> the same network blockage also manifested itself in a pure 1.0.33
> network.
> 
>   The log shows a lot of "Got ADD_EDGE from nodeX
> (192.168.0.1 port 655) which does not match existing entry" messages, and it
> turns out that the mismatches were caused by different weights received
> by add_edge_h().
> 
>   This network consists of ~4 hub nodes and 50+ leaf nodes.  Sample
> hub config:
[...]

Could you send me the output of "tincctl -n <netname> dump graph"? That
would help me try to reproduce the problem. Also, when the issue occurs,
could you run "tincctl -n <netname> log 5 >logfile" on the node that
gives those "Got ADD_EDGE which does not match existing entry" messages,
let it run for a few seconds before stopping the logging, and send me
the resulting logfile?

>   Back in the days of pure 1.0.33 nodes, if the network suddenly
> failed (users would see tincd CPU usage go to 50%+ and be unable to get
> ping responses from the other nodes), we could simply shut down the hub
> nodes, wait for a few minutes and then restart the hub nodes to get the
> network back to normal; however, the 1.1-pre release seems to autoconnect
> to non-hub hosts based on the information found in /etc/tinc/hosts, which
> means that the hub-restarting trick won't work.  Additionally, apart
> from high CPU usage, the 1.1-pre tincd also starts hogging memory until
> the Linux OOM killer kills the process (a memory leak, perhaps?).

You can disable the autoconnect feature by adding "AutoConnect = no" to
tinc.conf, but unfortunately you'd have to do that on all nodes, and it
doesn't solve the actual problem. If it's hogging memory, that
definitely points to a memory leak.
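
Since the setting has to be applied on every node, a small helper may
save some typing. The following is only a rough sketch in Python, not
part of tinc: the function name disable_autoconnect is made up for
illustration, and it assumes tinc.conf uses plain "Variable = Value"
lines and that variable names are matched case-insensitively.

```python
def disable_autoconnect(conf_text: str) -> str:
    """Return tinc.conf text with exactly one 'AutoConnect = no' line.

    Hypothetical helper: assumes 'Variable = Value' lines and
    case-insensitive variable names. Commented-out lines are left alone.
    """
    out = []
    found = False
    for line in conf_text.splitlines():
        # The variable name is whatever precedes the first '='.
        key = line.split("=", 1)[0].strip().lower()
        if key == "autoconnect":
            if not found:
                # Replace the first occurrence in place.
                out.append("AutoConnect = no")
                found = True
            # Any later duplicate is dropped.
            continue
        out.append(line)
    if not found:
        # No existing setting: append one at the end.
        out.append("AutoConnect = no")
    return "\n".join(out) + "\n"
```

You would read /etc/tinc/<netname>/tinc.conf, pass its contents through
the function, write the result back, and then restart tincd on that node.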

>    Given that many of our leaf nodes are behind NAT, so there's no
> direct connection to them except the tinc tunnel, I'm wondering if
> there's any way to bring the network back to work without shutting
> down all nodes?  Moreover, is there any better way to pinpoint the
> offending nodes that introduced this symptom?

I hope the output from the "log 5" command will shed some more light on
the issue, as it will show which nodes the offending ADD_EDGE messages
belong to.
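
As a first pass over such a logfile, something like the Python sketch
below could tally the mismatch warnings per sending node. It assumes the
message format quoted above ("Got ADD_EDGE from nodeX (192.168.0.1 port
655) which does not match existing entry"); the function name
count_mismatches is made up for illustration. Note that the name in that
message is the peer the request arrived from, which is not necessarily
the node the edge belongs to, so treat the counts as a hint only.

```python
import re
from collections import Counter

# Pattern based on the warning text quoted in this thread; the exact
# format in other tinc versions is an assumption.
MISMATCH = re.compile(
    r"Got ADD_EDGE from (\S+) \(([^)]*)\) "
    r"which does not match existing entry"
)

def count_mismatches(lines):
    """Count ADD_EDGE mismatch warnings per sending node name."""
    counts = Counter()
    for line in lines:
        m = MISMATCH.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts
```

Passing an open logfile to count_mismatches() and printing the result of
.most_common() lists the peers that relay the most mismatching requests
first.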

-- 
Met vriendelijke groet / with kind regards,
     Guus Sliepen <guus at tinc-vpn.org>