SegFault when using TunnelServer=yes

Anton Avramov lukav at lukav.com
Fri Jun 19 18:22:36 CEST 2020


Hi all,

I have a network of about 800 nodes. It is a mix of tinc 1.0 and 
1.1 nodes and has been expanding gradually for several years.

The problem is that at some point it seems the daemon cannot handle the 
processing of new connections and edges.

There are 3 major nodes in the system, and every other node initially 
makes a connection to one of them.

Now, after a lot of debugging, I've restricted all nodes to connect to 
only one node, and I use iptables to admit new connections gradually. The 
last limit was 5 per minute.

I've started monitoring how the edges grow on the main node, and I see 
that although I've limited the connections on the other 2 major nodes, 
at some point there are rapid spikes in the edges when a new connection 
is established.
So my guess is that the other nodes hold a previous state of the edges, 
and when they try to push it, that overwhelms the main nodes.

So I've decided to set TunnelServer=yes on the major nodes so they don't 
propagate the connections to the other nodes.

However, I get a segfault soon after startup on each node where I enable 
that option.

I've built from the latest code, and here is a trace of such a run (this 
is not from a "major" node, but the effect is the same):

Got ANS_KEY from Backbone (164.138.216.106 port 655): 16 Office Lukav_Beast 52201D7CFDC2C7E1FD7871A36E651B7AC24A52B4ED892CD953397F6BA859AB22D5D4CB235B9CF85910B6BDE91A34C85E 427 672 4 0 94.155.19.130 13935
Using reflexive UDP address from Office: 94.155.19.130 port 13935
UDP address of Office set to 94.155.19.130 port 13935
Got REQ_KEY from Backbone (164.138.216.106 port 655): 15 Office Lukav_Beast

Program received signal SIGSEGV, Segmentation fault.
0x000055555556de41 in send_ans_key (to=to@entry=0x555555851060) at protocol_key.c:382
382        return send_request(to->nexthop->connection, "%d %s %s %s %d %d %d %d", ANS_KEY,
(gdb) bt
#0  0x000055555556de41 in send_ans_key (to=to@entry=0x555555851060) at protocol_key.c:382
#1  0x000055555556e169 in req_key_h (c=0x555555851be0, request=0x555555854bb7 "15 Office Lukav_Beast") at protocol_key.c:304
#2  0x000055555556a083 in receive_request (c=c@entry=0x555555851be0, request=0x555555854bb7 "15 Office Lukav_Beast") at protocol.c:146
#3  0x000055555555e993 in receive_meta (c=c@entry=0x555555851be0) at meta.c:333
#4  0x00005555555603f9 in handle_meta_connection_data (c=c@entry=0x555555851be0) at net.c:304
#5  0x00005555555678c2 in handle_meta_io (data=0x555555851be0, flags=<optimized out>) at net_socket.c:520
#6  0x000055555555c60a in event_loop () at event.c:359
#7  0x00005555555607f2 in main_loop () at net.c:510
#8  0x0000555555559208 in main (argc=6, argv=<optimized out>) at tincd.c:558
(gdb) bt full
#0  0x000055555556de41 in send_ans_key (to=to@entry=0x555555851060) at protocol_key.c:382
         keylen = <optimized out>
         key = 
"527E64B1DB47F2F527ADF7F609498FFCB4807AEC3CD49697D3D8D870619BC537E1B7C403875D81FC608A8F6E00D06063\000\306\377\377\377\177\000\000\331\334VUUU", 
'\000' <repeats 11 times>, 
"*ֲ\322\316\000\305\000\000\000\000\000\000\000\000\340\033\205UUU\000\000\001\000\000\000\000\000\000\000P\316\377\377\377\177\000\000\267K\205UUU\000\000`\020\205UUU\000\000@\306\377\377\377\177\000\000i\341VUUU\000\000\000\000\000\000\377\177\000\000\000\000\000\000\000\000\000\000"...
#1  0x000055555556e169 in req_key_h (c=0x555555851be0, request=0x555555854bb7 "15 Office Lukav_Beast") at protocol_key.c:304
         from_name = "Office\000\061\071.130", '\000' <repeats 1003 times>...
         to_name = "Lukav_Beast", '\000' <repeats 366 times>...
         from = 0x555555851060
         to = <optimized out>
         reqno = 0
#2  0x000055555556a083 in receive_request (c=c@entry=0x555555851be0, request=0x555555854bb7 "15 Office Lukav_Beast") at protocol.c:146
         reqno = <optimized out>
#3  0x000055555555e993 in receive_meta (c=c@entry=0x555555851be0) at meta.c:333
         result = <optimized out>
         request = <optimized out>
         inlen = 0
         inbuf = 
"a\354\357\063J\363{\346d\177\271\371;+\212\371zFDt\271\061\370\ao\373\326\035\255=Α\254\257:\245\322ү\vƦ\205\035\336?1\234\372\001\004\063\323\t\004-\b8\367\f\201\342\304g\332\361jL76C\340-\t\006\210\214\314,C\352)ͺa\314\fAe\260\226\313\337\360|\256\236\263\344\205\061\207\303\t<\016\351\360\222\343[\317o\377\065<Ή?b(\267\321\356\360\242p$\314`\325ʆ\001|\036\204'\\\205i\314W\356#N4\000q\320\300\344\071\060\236w\016\306[\323X]\237\321\347\177\313KU\367ޚ\b}\307\374\367\032c\036\332:\307\367\265o\307Ƒ\212J\006NJ3!\305q\367\255\263\246\200i\035\327͌\001"...
         bufp = 0x7fffffffd6f0 
"a\354\357\063J\363{\346d\177\271\371;+\212\371zFDt\271\061\370\ao\373\326\035\255=Α\254\257:\245\322ү\vƦ\205\035\336?1\234\372\001\004\063\323\t\004-\b8\367\f\201\342\304g\332\361jL76C\340-\t\006\210\214\314,C\352)ͺa\314\fAe\260\226\313\337\360|\256\236\263\344\205\061\207\303\t<\016\351\360\222\343[\317o\377\065<Ή?b(\267\321\356\360\242p$\314`\325ʆ\001|\036\204'\\\205i\314W\356#N4"
         endp = <optimized out>
#4  0x00005555555603f9 in handle_meta_connection_data (c=c@entry=0x555555851be0) at net.c:304
No locals.
#5  0x00005555555678c2 in handle_meta_io (data=0x555555851be0, flags=<optimized out>) at net_socket.c:520
         c = 0x555555851be0
         socket_error = <optimized out>
         len = <optimized out>
#6  0x000055555555c60a in event_loop () at event.c:359
         node = 0x555555797dd8 <signalio+24>
         next = 0x555555797dd8 <signalio+24>
         io = 0x555555851d90
         tv = <optimized out>
         fds = <optimized out>
         curgen = 7
         diff = {tv_sec = 0, tv_usec = 512516}
         n = <optimized out>
         readable = {fds_bits = {256, 0 <repeats 15 times>}}
         writable = {fds_bits = {0 <repeats 16 times>}}
#7  0x00005555555607f2 in main_loop () at net.c:510
         sighup = {signum = 1, cb = 0x555555560480 <sighup_handler>, data = 0x7fffffffe1a0, node = {next = 0x7fffffffe2a8, prev = 0x0,
             parent = 0x7fffffffe2a8, left = 0x0, right = 0x0, data = 0x7fffffffe1a0}}
         sigterm = {signum = 15, cb = 0x55555555f900 <sigterm_handler>, data = 0x7fffffffe1f0, node = {next = 0x0, prev = 0x7fffffffe2f8,
             parent = 0x7fffffffe2f8, left = 0x0, right = 0x0, data = 0x7fffffffe1f0}}
         sigquit = {signum = 3, cb = 0x55555555f900 <sigterm_handler>, data = 0x7fffffffe240, node = {next = 0x7fffffffe2f8,
             prev = 0x7fffffffe2a8, parent = 0x7fffffffe2f8, left = 0x7fffffffe2a8, right = 0x0, data = 0x7fffffffe240}}
         sigint = {signum = 2, cb = 0x55555555f900 <sigterm_handler>, data = 0x7fffffffe290, node = {next = 0x7fffffffe258,
             prev = 0x7fffffffe1b8, parent = 0x7fffffffe258, left = 0x7fffffffe1b8, right = 0x0, data = 0x7fffffffe290}}
         sigalrm = {signum = 14, cb = 0x5555555605b0 <sigalrm_handler>, data = 0x7fffffffe2e0, node = {next = 0x7fffffffe208,
             prev = 0x7fffffffe258, parent = 0x0, left = 0x7fffffffe258, right = 0x7fffffffe208, data = 0x7fffffffe2e0}}
#8  0x0000555555559208 in main (argc=6, argv=<optimized out>) at tincd.c:558
         umbstr = <optimized out>
         priority = 0x0
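
One possible reading of the trace (an assumption, not a confirmed 
diagnosis): frame #0 faults while evaluating to->nexthop->connection, and 
the "to" pointer matches the "from" node looked up in frame #1, so "to" 
itself looks valid and to->nexthop may be NULL for a node with no known 
route. The sketch below only illustrates that failure mode and a 
defensive check. The structures and both function names 
(reply_connection, send_ans_key_guarded) are simplified, hypothetical 
stand-ins, not tinc's actual definitions or fix.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, heavily simplified stand-ins for tinc's structures.
   Only the fields touched by the crashing expression are modelled. */
typedef struct connection_t {
    const char *name;
} connection_t;

typedef struct node_t {
    const char *name;
    struct node_t *nexthop;    /* assumed NULL when no route is known */
    connection_t *connection;  /* meta connection, if directly connected */
} node_t;

/* Hypothetical helper: returns the connection a reply would go out on,
   or NULL if there is no usable next hop. */
static connection_t *reply_connection(const node_t *to) {
    if (to == NULL || to->nexthop == NULL) {
        return NULL;
    }
    return to->nexthop->connection;
}

/* Guarded variant of the send: reports failure instead of
   dereferencing a NULL next hop. */
static bool send_ans_key_guarded(const node_t *to) {
    connection_t *c = reply_connection(to);
    if (c == NULL) {
        fprintf(stderr, "no next hop for %s, dropping ANS_KEY\n",
                to ? to->name : "(null)");
        return false;
    }
    printf("would send ANS_KEY for %s via %s\n", to->name, c->name);
    return true;
}
```

With a node whose nexthop is NULL, send_ans_key_guarded returns false 
instead of crashing. Printing to->nexthop in gdb at frame #0 would 
confirm or rule out this reading.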


Any help is much appreciated, since my network is unusable at the moment.



