subnet flooded with lots of ADD_EDGE requests

Vitaly Gorodetsky vgorodetsky at augury.com
Tue Dec 11 15:34:19 CET 2018


We have a similar network topology of 6 hub nodes and 30+ leaf nodes.
We are also suffering from periodic network blockage and very high network
usage of ~300 MB per node per day (50 MB of upload and 250 MB of download).
This high traffic occurs even while the network is idle.
Our nodes run tinc 1.0.34.
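
To see what the nodes are actually exchanging while the network is idle,
we are going to make tincd dump its view of the graph to syslog. This is
only a sketch, assuming the standard tinc 1.0 signal handling and a
single tincd instance per host:

  # dump the list of meta connections to syslog
  kill -USR1 $(pidof tincd)
  # dump all known nodes, edges (including weights) and subnets to syslog
  kill -USR2 $(pidof tincd)
  # then inspect the output (the log path depends on the syslog setup)
  grep tinc /var/log/syslog | tail -n 200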

On Tue, Dec 11, 2018 at 12:52 PM Amit Lianson <lthmod at gmail.com> wrote:

> Hello,
>   We're suffering from sporadic network blockage (read: unable to ping
> other nodes) with 1.1-pre17.  Before upgrading to the 1.1-pre release,
> the same network blockage also manifested itself in a pure 1.0.33
> network.
>
>   The log shows a lot of "Got ADD_EDGE from nodeX
> (192.168.0.1 port 655) which does not match existing entry" messages, and
> it turns out that the mismatches were caused by different weights received
> by add_edge_h().
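>
>   A rough way to see which nodes dominate those messages is to count
> them per sender; this is only a sketch, since the exact log location
> depends on how tincd logs on your systems:
>
>   grep "Got ADD_EDGE from" /var/log/syslog \
>     | sed -e 's/.*Got ADD_EDGE from //' -e 's/ (.*//' \
>     | sort | uniq -c | sort -rn | head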
>
>   This network consists of ~4 hub nodes and 50+ leaf nodes.  Sample
> hub config:
>   Name = hub1
>   ConnectTo = hub2
>   ConnectTo = hub3
>   ConnectTo = hub4
>
>   Leaf looks like:
>    Name = node1
>    ConnectTo = hub1
>    ConnectTo = hub2
>    ConnectTo = hub3
>    ConnectTo = hub4
>
>   Back in the days of pure 1.0.33 nodes, if the network suddenly
> failed (users would see tincd CPU usage go above 50% and get no ping
> responses from the other nodes), we could simply shut down the hub nodes,
> wait for a few minutes and then restart the hub nodes to get the
> network back to normal; however, the 1.1-pre release seems to autoconnect
> to non-hub hosts based on the information found in /etc/tinc/hosts, which
> means that the hub-restarting trick won't work.  Additionally, apart
> from high CPU usage, 1.1-pre tincd also starts hogging memory until
> the Linux OOM killer terminates the process (a memory leak, perhaps?).
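>
>   If it is the autoconnect behaviour that defeats the hub-restart
> trick, I suppose it could be disabled on the 1.1 nodes, so that they
> only use their explicit ConnectTo lines (untested on our side, and
> assuming the documented AutoConnect option):
>
>    # /etc/tinc/<netname>/tinc.conf
>    AutoConnect = no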
>
>    Given that many of our leaf nodes are behind NAT, so there is no
> direct connection to them except the tinc tunnel, I'm wondering whether
> there is any way to bring the network back to a working state without
> shutting down all nodes.  Moreover, is there any better way to pinpoint
> the offending nodes that introduced this symptom?
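>
>    So far the only idea I have is to compare what each hub currently
> believes the graph looks like; on the 1.1-pre nodes the control
> connection can show the edge list with weights (assuming a netname of
> "myvpn"):
>
>    tinc -n myvpn dump edges
>    tinc -n myvpn dump graph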
>
> Thanks,
> A.


-- 
*Vitaly Gorodetsky*
Infrastructure Lead

Mobile: +972-52-6420530
vgorodetsky at augury.com

39 Haatzmaut St., 1st Floor,
Haifa, 3303320, Israel