Segfaults on connection loss

Tuomas Silen tuomas at silen.fi
Mon Jul 14 16:05:21 CEST 2014


Hi again,

I was able to get some additional debug log data of the issue also, there seems to be quite a 
ADD_EDGE/DEL_EDGE storm going on when it happens:

#0  edge_del (e=0x1930bf0) at edge.c:93
93                      e->reverse->reverse = NULL;

(gdb) bt
#0  edge_del (e=0x1930bf0) at edge.c:93
#1  0x0000000000408a65 in terminate_connection (report=false, c=0x18c5620) at net.c:188
#2  terminate_connection (c=0x18c5620, report=false) at net.c:168
#3  0x00000000004090d3 in check_network_activity (writeset=0x7fffd59438b0, readset=0x7fffd5943830) 
at net.c:342
#4  main_loop () at net.c:434
#5  0x000000000040478a in main (argc=<optimized out>, argv=<optimized out>) at tincd.c:656

(gdb) p c.name
$3 = 0x188a6d0 "fe_ap_se01"
(gdb) p c.status
$4 = {pinged = 0, active = 0, connecting = 0, unused_termreq = 0, remove = 1, timeout = 0, 
encryptout = 0, decryptin = 0, mst = 0, unused = 0}
(gdb) p c.edge.reverse
$6 = (struct edge_t *) 0x50
(gdb) p *c.edge.reverse
Cannot access memory at address 0x50

Log:
...
2014-07-14T11:40:44.74929 Forwarding DEL_EDGE from su01 (x.x.x.99 port 655): 13 4e37971b fe_ap_se01 be07
2014-07-14T11:40:44.74929 Sending 36 bytes of metadata to be04 (x.x.x.166 port 655)
2014-07-14T11:40:44.74930 Sending 36 bytes of metadata to be06 (x.x.x.177 port 655)
2014-07-14T11:40:44.74930 Sending 36 bytes of metadata to be05 (x.x.x.229 port 655)
2014-07-14T11:40:44.74930 Sending 36 bytes of metadata to be02 (x.x.x.181 port 655)
2014-07-14T11:40:44.74930 Sending 36 bytes of metadata to fe_eu_west01 (x.x.x.114 port 655)
2014-07-14T11:40:44.74931 Sending 36 bytes of metadata to fe_eu_west02 (x.x.x.115 port 655)
2014-07-14T11:40:44.74931 Sending 36 bytes of metadata to fe_us_da01 (x.x.x.21 port 655)
2014-07-14T11:40:44.74931 Sending 36 bytes of metadata to fe_us_ea01 (x.x.x.203 port 655)
2014-07-14T11:40:44.74932 Sending 36 bytes of metadata to fe_us_ea02 (x.x.x.137 port 655)
2014-07-14T11:40:44.74932 Sending 36 bytes of metadata to fe_us_ea03 (x.x.x.239 port 655)
2014-07-14T11:40:44.74932 Sending 36 bytes of metadata to fe04 (x.x.x.34 port 41551)
2014-07-14T11:40:44.74932 Sending 36 bytes of metadata to su02 (x.x.x.98 port 655)
2014-07-14T11:40:44.74932 Sending 36 bytes of metadata to su03 (x.x.x.202 port 656)
2014-07-14T11:40:44.74933 Sending 36 bytes of metadata to be03 (x.x.x.50 port 655)
2014-07-14T11:40:44.74933 Sending 36 bytes of metadata to fe03 (x.x.x.177 port 56896)
2014-07-14T11:40:44.74933 Sending 36 bytes of metadata to fe05 (x.x.x.198 port 35178)
2014-07-14T11:40:44.74933 Sending 36 bytes of metadata to be07 (x.x.x.4 port 655)
2014-07-14T11:40:44.74934 Sending 36 bytes of metadata to fe01 (x.x.x.164 port 655)
2014-07-14T11:40:44.74934 Sending 36 bytes of metadata to be01 (x:x:x:4424::1 port 40768)
2014-07-14T11:40:44.74934 Sending 36 bytes of metadata to fe_ap_se02 (x.x.x.82 port 34249)
2014-07-14T11:40:44.74934 Sending 36 bytes of metadata to be08 (x:x:x:b6::2 port 50485)
2014-07-14T11:40:44.74935 Got ADD_EDGE from fe03 (x.x.x.177 port 56896): 12 79d9bdfe fe_ap_se02 fe02 
x.x.x.146 655 c 849
2014-07-14T11:40:44.74935 Got ADD_EDGE from fe03 (x.x.x.177 port 56896): 12 1706c355 fe01 fe_ap_se02 
x:x:x:d74::2 655 c 3245
2014-07-14T11:40:44.74935 Got ADD_EDGE from fe03 (x.x.x.177 port 56896): 12 7e3dd1e2 fe_ap_se02 be05 
x.x.x.229 655 c 859
2014-07-14T11:40:44.74935 Got ADD_EDGE from fe03 (x.x.x.177 port 56896): 12 4e8c51cb fe_ap_se02 be03 
x.x.x.50 655 c 857
2014-07-14T11:40:44.74936 Got ADD_EDGE from fe03 (x.x.x.177 port 56896): 12 730c5871 fe_ap_se02 fe01 
x:x:x:54a3::2 655 c 3245
2014-07-14T11:40:44.74936 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 123b6762 fe_eu_west02 
fe_ap_se01
2014-07-14T11:40:44.74936 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 76d19a1d fe_eu_west01 
fe_ap_se01
2014-07-14T11:40:44.74936 Got ADD_EDGE from fe03 (x.x.x.177 port 56896): 12 354ef41b fe_ap_se02 be08 
x.x.x.182 655 c 882
2014-07-14T11:40:44.74937 Got ADD_EDGE from fe03 (x.x.x.177 port 56896): 12 6730c99a fe_ap_se02 be06 
x.x.x.177 655 c 882
2014-07-14T11:40:44.74937 Got ADD_EDGE from fe03 (x.x.x.177 port 56896): 12 7c5d290e fe_ap_se02 be07 
x.x.x.4 655 c 892
2014-07-14T11:40:44.74938 Got ADD_EDGE from fe03 (x.x.x.177 port 56896): 12 176b2ecc fe_ap_se02 fe04 
x.x.x.34 655 c 892
2014-07-14T11:40:44.74938 Got ADD_EDGE from fe03 (x.x.x.177 port 56896): 12 2a5deba5 fe_ap_se02 su01 
x.x.x.99 655 c 894
2014-07-14T11:40:44.74938 Got ADD_EDGE from fe03 (x.x.x.177 port 56896): 12 4b8c6f57 fe_ap_se02 fe01 
x.x.x.164 655 c 894
2014-07-14T11:40:44.74938 Got ADD_EDGE from fe03 (x.x.x.177 port 56896): 12 3b230564 fe_eu_west01 
fe_ap_se01 x.x.x.118 655 c 484
2014-07-14T11:40:44.74939 Got ADD_EDGE from fe03 (x.x.x.177 port 56896): 12 7fb94445 fe_eu_west02 
fe_ap_se01 x.x.x.118 655 c 486
2014-07-14T11:40:44.74939 Got ADD_EDGE from fe03 (x.x.x.177 port 56896): 12 3fe0e8ba fe_ap_se02 su02 
x.x.x.98 655 c 898
2014-07-14T11:40:44.74939 Got ADD_EDGE from fe03 (x.x.x.177 port 56896): 12 6d3dad8a fe_ap_se02 be01 
x.x.x.83 655 c 894
2014-07-14T11:40:44.74939 Got ADD_EDGE from fe03 (x.x.x.177 port 56896): 12 1e42f6e4 fe_ap_se02 be02 
x.x.x.181 655 c 896
2014-07-14T11:40:44.74940 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 1a3a80fc be06 fe_ap_se01
2014-07-14T11:40:44.74940 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 3903b0cb be05 fe_ap_se01
2014-07-14T11:40:44.74940 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 4d5ad531 fe04 fe_ap_se01
2014-07-14T11:40:44.74941 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 3d43707b su02 fe_ap_se01
2014-07-14T11:40:44.74941 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 213b50d1 be02 fe_ap_se01
2014-07-14T11:40:44.74941 Got ADD_EDGE from fe03 (x.x.x.177 port 56896): 12 4da56662 fe_ap_se02 
fe_ap_se01 x.x.x.118 655 c 19
2014-07-14T11:40:44.74941 Got ADD_EDGE from fe03 (x.x.x.177 port 56896): 12 4b37aee7 fe_ap_se01 
fe_ap_se02 x.x.x.82 655 c 19
2014-07-14T11:40:44.74942 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 1084d927 be07 fe_ap_se01
2014-07-14T11:40:44.74942 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 30a264aa fe_ap_se01 be06
2014-07-14T11:40:44.74942 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 11407212 fe_ap_se01 be05
2014-07-14T11:40:44.74942 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 46ab6a05 fe_ap_se01 be08
2014-07-14T11:40:44.74943 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 55d1f17b fe_ap_se01 be02
2014-07-14T11:40:44.74943 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 3cba43b0 fe_ap_se01 
fe_eu_west01
2014-07-14T11:40:44.74943 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 5dd7f23b fe_ap_se01 
fe_eu_west02
2014-07-14T11:40:44.74943 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 474ec98b fe03 fe_ap_se01
2014-07-14T11:40:44.74944 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 378ecf0c fe_ap_se01 
fe_us_da01
2014-07-14T11:40:44.74944 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 79c70f58 fe_ap_se01 
fe_us_ea01
2014-07-14T11:40:44.74944 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 633497ac fe_ap_se01 
fe_us_ea02
2014-07-14T11:40:44.74944 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 27ab13d9 fe_ap_se01 
fe_us_ea03
2014-07-14T11:40:44.74945 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 2d9fd736 fe_ap_se01 fe04
2014-07-14T11:40:44.74945 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 7de0ad83 fe_ap_se01 su02
2014-07-14T11:40:44.74945 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 34b3c326 fe_ap_se01 be01
2014-07-14T11:40:44.74946 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 539499c4 fe_ap_se01 su03
2014-07-14T11:40:44.74946 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 2e33b5b2 fe_ap_se01 be03
2014-07-14T11:40:44.74946 Got DEL_EDGE from fe03 (x.x.x.177 port 56896): 13 7af672e1 fe_ap_se01 fe03
2014-07-14T11:40:44.74947 Metadata socket read error for fe05 (x.x.x.198 port 35178): Connection 
reset by peer
2014-07-14T11:40:44.74947 Closing connection with fe05 (x.x.x.198 port 35178)
2014-07-14T11:40:44.74947 Sending DEL_EDGE to everyone (BROADCAST): 13 255515e9 fe02 fe05
2014-07-14T11:40:44.74947 Sending 29 bytes of metadata to be04 (x.x.x.166 port 655)
2014-07-14T11:40:44.74947 Sending 29 bytes of metadata to be06 (x.x.x.177 port 655)
2014-07-14T11:40:44.74948 Sending 29 bytes of metadata to be05 (x.x.x.229 port 655)
2014-07-14T11:40:44.74948 Sending 29 bytes of metadata to be02 (x.x.x.181 port 655)
2014-07-14T11:40:44.74948 Sending 29 bytes of metadata to fe_eu_west01 (x.x.x.114 port 655)
2014-07-14T11:40:44.74948 Sending 29 bytes of metadata to fe_eu_west02 (x.x.x.115 port 655)
2014-07-14T11:40:44.74949 Sending 29 bytes of metadata to fe_us_da01 (x.x.x.21 port 655)
2014-07-14T11:40:44.74949 Sending 29 bytes of metadata to fe_us_ea01 (x.x.x.203 port 655)
2014-07-14T11:40:44.74949 Sending 29 bytes of metadata to fe_us_ea02 (x.x.x.137 port 655)
2014-07-14T11:40:44.74950 Sending 29 bytes of metadata to fe_us_ea03 (x.x.x.239 port 655)
2014-07-14T11:40:44.74950 Sending 29 bytes of metadata to fe04 (x.x.x.34 port 41551)
2014-07-14T11:40:44.74950 Sending 29 bytes of metadata to su02 (x.x.x.98 port 655)
2014-07-14T11:40:44.74950 Sending 29 bytes of metadata to su03 (x.x.x.202 port 656)
2014-07-14T11:40:44.74950 Sending 29 bytes of metadata to be03 (x.x.x.50 port 655)
2014-07-14T11:40:44.74951 Sending 29 bytes of metadata to su01 (x.x.x.99 port 655)
2014-07-14T11:40:44.74951 Sending 29 bytes of metadata to fe03 (x.x.x.177 port 56896)
2014-07-14T11:40:44.74951 Sending 29 bytes of metadata to be07 (x.x.x.4 port 655)
2014-07-14T11:40:44.74951 Sending 29 bytes of metadata to fe01 (x.x.x.164 port 655)
2014-07-14T11:40:44.74952 Sending 29 bytes of metadata to be01 (x:x:x:4424::1 port 40768)
2014-07-14T11:40:44.74952 Sending 29 bytes of metadata to fe_ap_se02 (x.x.x.82 port 34249)
2014-07-14T11:40:44.74952 Sending 29 bytes of metadata to be08 (x:x:x:b6::2 port 50485)
2014-07-14T11:40:44.74952 Could not set up a meta connection to fe05
2014-07-14T11:40:44.74952 Trying to re-establish outgoing connection in 5 seconds
2014-07-14T11:40:44.74953 Got ADD_EDGE from be07 (x.x.x.4 port 655): 12 79d9bdfe fe_ap_se02 fe02 
x.x.x.146 655 c 849
2014-07-14T11:40:44.74953 Got ADD_EDGE from be07 (x.x.x.4 port 655): 12 1706c355 fe01 fe_ap_se02 
x:x:x:d74::2 655 c 3245
2014-07-14T11:40:44.74953 Got ADD_EDGE from be07 (x.x.x.4 port 655): 12 7e3dd1e2 fe_ap_se02 be05 
x.x.x.229 655 c 859
2014-07-14T11:40:44.74954 Got ADD_EDGE from be07 (x.x.x.4 port 655): 12 4e8c51cb fe_ap_se02 be03 
x.x.x.50 655 c 857
2014-07-14T11:40:44.74954 Got ADD_EDGE from be07 (x.x.x.4 port 655): 12 730c5871 fe_ap_se02 fe01 
x:x:x:54a3::2 655 c 3245
2014-07-14T11:40:44.74954 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 123b6762 fe_eu_west02 fe_ap_se01
2014-07-14T11:40:44.74954 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 76d19a1d fe_eu_west01 fe_ap_se01
2014-07-14T11:40:44.74955 Got ADD_EDGE from be07 (x.x.x.4 port 655): 12 354ef41b fe_ap_se02 be08 
x.x.x.182 655 c 882
2014-07-14T11:40:44.74955 Got ADD_EDGE from be07 (x.x.x.4 port 655): 12 6730c99a fe_ap_se02 be06 
x.x.x.177 655 c 882
2014-07-14T11:40:44.74955 Got ADD_EDGE from be07 (x.x.x.4 port 655): 12 7c5d290e fe_ap_se02 be07 
x.x.x.4 655 c 892
2014-07-14T11:40:44.74956 Got ADD_EDGE from be07 (x.x.x.4 port 655): 12 176b2ecc fe_ap_se02 fe04 
x.x.x.34 655 c 892
2014-07-14T11:40:44.74956 Got ADD_EDGE from be07 (x.x.x.4 port 655): 12 2a5deba5 fe_ap_se02 su01 
x.x.x.99 655 c 894
2014-07-14T11:40:44.74956 Got ADD_EDGE from be07 (x.x.x.4 port 655): 12 4b8c6f57 fe_ap_se02 fe01 
x.x.x.164 655 c 894
2014-07-14T11:40:44.74957 Got ADD_EDGE from be07 (x.x.x.4 port 655): 12 3b230564 fe_eu_west01 
fe_ap_se01 x.x.x.118 655 c 484
2014-07-14T11:40:44.74957 Got ADD_EDGE from be07 (x.x.x.4 port 655): 12 2ad32681 fe_eu_west02 
fe_ap_se01 x.x.x.118 655 c 486
2014-07-14T11:40:44.74957 Got ADD_EDGE from be07 (x.x.x.4 port 655): 12 3fe0e8ba fe_ap_se02 su02 
x.x.x.98 655 c 898
2014-07-14T11:40:44.74958 Got ADD_EDGE from be07 (x.x.x.4 port 655): 12 6d3dad8a fe_ap_se02 be01 
x.x.x.83 655 c 894
2014-07-14T11:40:44.74958 Got ADD_EDGE from be07 (x.x.x.4 port 655): 12 1e42f6e4 fe_ap_se02 be02 
x.x.x.181 655 c 896
2014-07-14T11:40:44.74958 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 1a3a80fc be06 fe_ap_se01
2014-07-14T11:40:44.74959 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 1084d927 be07 fe_ap_se01
2014-07-14T11:40:44.74959 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 3d43707b su02 fe_ap_se01
2014-07-14T11:40:44.74959 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 3903b0cb be05 fe_ap_se01
2014-07-14T11:40:44.74959 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 4d5ad531 fe04 fe_ap_se01
2014-07-14T11:40:44.74959 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 213b50d1 be02 fe_ap_se01
2014-07-14T11:40:44.74960 Got ADD_EDGE from be07 (x.x.x.4 port 655): 12 4da56662 fe_ap_se02 
fe_ap_se01 x.x.x.118 655 c 19
2014-07-14T11:40:44.74960 Got ADD_EDGE from be07 (x.x.x.4 port 655): 12 4b37aee7 fe_ap_se01 
fe_ap_se02 x.x.x.82 655 c 19
2014-07-14T11:40:44.74960 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 30a264aa fe_ap_se01 be06
2014-07-14T11:40:44.74961 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 11407212 fe_ap_se01 be05
2014-07-14T11:40:44.74961 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 46ab6a05 fe_ap_se01 be08
2014-07-14T11:40:44.74961 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 55d1f17b fe_ap_se01 be02
2014-07-14T11:40:44.74961 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 3cba43b0 fe_ap_se01 fe_eu_west01
2014-07-14T11:40:44.74962 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 5dd7f23b fe_ap_se01 fe_eu_west02
2014-07-14T11:40:44.74962 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 474ec98b fe03 fe_ap_se01
2014-07-14T11:40:44.74962 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 378ecf0c fe_ap_se01 fe_us_da01
2014-07-14T11:40:44.74962 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 79c70f58 fe_ap_se01 fe_us_ea01
2014-07-14T11:40:44.74963 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 633497ac fe_ap_se01 fe_us_ea02
2014-07-14T11:40:44.74963 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 27ab13d9 fe_ap_se01 fe_us_ea03
2014-07-14T11:40:44.74963 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 2d9fd736 fe_ap_se01 fe04
2014-07-14T11:40:44.74963 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 7de0ad83 fe_ap_se01 su02
2014-07-14T11:40:44.74964 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 34b3c326 fe_ap_se01 be01
2014-07-14T11:40:44.74964 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 539499c4 fe_ap_se01 su03
2014-07-14T11:40:44.74964 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 2e33b5b2 fe_ap_se01 be03
2014-07-14T11:40:44.74965 Got DEL_EDGE from be07 (x.x.x.4 port 655): 13 7af672e1 fe_ap_se01 fe03
2014-07-14T11:40:44.74965 Got ADD_EDGE from fe01 (x.x.x.164 port 655): 12 79d9bdfe fe_ap_se02 fe02 
x.x.x.146 655 c 849
2014-07-14T11:40:44.74965 Got ADD_EDGE from fe01 (x.x.x.164 port 655): 12 1706c355 fe01 fe_ap_se02 
x:x:x:d74::2 655 c 3245
2014-07-14T11:40:44.74965 Got ADD_EDGE from fe01 (x.x.x.164 port 655): 12 7e3dd1e2 fe_ap_se02 be05 
x.x.x.229 655 c 859
2014-07-14T11:40:44.74966 Got ADD_EDGE from fe01 (x.x.x.164 port 655): 12 4e8c51cb fe_ap_se02 be03 
x.x.x.50 655 c 857
2014-07-14T11:40:44.74966 Got ADD_EDGE from fe01 (x.x.x.164 port 655): 12 730c5871 fe_ap_se02 fe01 
x:x:x:54a3::2 655 c 3245
2014-07-14T11:40:44.74966 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 123b6762 fe_eu_west02 
fe_ap_se01
2014-07-14T11:40:44.74967 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 76d19a1d fe_eu_west01 
fe_ap_se01
2014-07-14T11:40:44.74967 Got ADD_EDGE from fe01 (x.x.x.164 port 655): 12 354ef41b fe_ap_se02 be08 
x.x.x.182 655 c 882
2014-07-14T11:40:44.74967 Got ADD_EDGE from fe01 (x.x.x.164 port 655): 12 6730c99a fe_ap_se02 be06 
x.x.x.177 655 c 882
2014-07-14T11:40:44.74968 Got ADD_EDGE from fe01 (x.x.x.164 port 655): 12 7c5d290e fe_ap_se02 be07 
x.x.x.4 655 c 892
2014-07-14T11:40:44.74968 Got ADD_EDGE from fe01 (x.x.x.164 port 655): 12 176b2ecc fe_ap_se02 fe04 
x.x.x.34 655 c 892
2014-07-14T11:40:44.74968 Got ADD_EDGE from fe01 (x.x.x.164 port 655): 12 2a5deba5 fe_ap_se02 su01 
x.x.x.99 655 c 894
2014-07-14T11:40:44.74968 Got ADD_EDGE from fe01 (x.x.x.164 port 655): 12 4b8c6f57 fe_ap_se02 fe01 
x.x.x.164 655 c 894
2014-07-14T11:40:44.74969 Got ADD_EDGE from fe01 (x.x.x.164 port 655): 12 3ed96212 fe_eu_west01 
fe_ap_se01 x.x.x.118 655 c 484
2014-07-14T11:40:44.74969 Got ADD_EDGE from fe01 (x.x.x.164 port 655): 12 2ad32681 fe_eu_west02 
fe_ap_se01 x.x.x.118 655 c 486
2014-07-14T11:40:44.74969 Got ADD_EDGE from fe01 (x.x.x.164 port 655): 12 3fe0e8ba fe_ap_se02 su02 
x.x.x.98 655 c 898
2014-07-14T11:40:44.74970 Got ADD_EDGE from fe01 (x.x.x.164 port 655): 12 6d3dad8a fe_ap_se02 be01 
x.x.x.83 655 c 894
2014-07-14T11:40:44.74970 Got ADD_EDGE from fe01 (x.x.x.164 port 655): 12 1e42f6e4 fe_ap_se02 be02 
x.x.x.181 655 c 896
2014-07-14T11:40:44.74970 Got PING from fe01 (x.x.x.164 port 655): 8
2014-07-14T11:40:44.74970 Sending PONG to fe01 (x.x.x.164 port 655): 9
2014-07-14T11:40:44.74971 Sending 2 bytes of metadata to fe01 (x.x.x.164 port 655)
2014-07-14T11:40:44.74971 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 1a3a80fc be06 fe_ap_se01
2014-07-14T11:40:44.74971 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 4d5ad531 fe04 fe_ap_se01
2014-07-14T11:40:44.74971 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 3d43707b su02 fe_ap_se01
2014-07-14T11:40:44.74972 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 3903b0cb be05 fe_ap_se01
2014-07-14T11:40:44.74972 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 213b50d1 be02 fe_ap_se01
2014-07-14T11:40:44.74972 Got ADD_EDGE from fe01 (x.x.x.164 port 655): 12 4da56662 fe_ap_se02 
fe_ap_se01 x.x.x.118 655 c 19
2014-07-14T11:40:44.74972 Got ADD_EDGE from fe01 (x.x.x.164 port 655): 12 4b37aee7 fe_ap_se01 
fe_ap_se02 x.x.x.82 655 c 19
2014-07-14T11:40:44.74973 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 1084d927 be07 fe_ap_se01
2014-07-14T11:40:44.74973 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 30a264aa fe_ap_se01 be06
2014-07-14T11:40:44.74973 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 11407212 fe_ap_se01 be05
2014-07-14T11:40:44.74974 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 46ab6a05 fe_ap_se01 be08
2014-07-14T11:40:44.74974 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 55d1f17b fe_ap_se01 be02
2014-07-14T11:40:44.74974 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 3cba43b0 fe_ap_se01 
fe_eu_west01
2014-07-14T11:40:44.74974 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 5dd7f23b fe_ap_se01 
fe_eu_west02
2014-07-14T11:40:44.74975 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 474ec98b fe03 fe_ap_se01
2014-07-14T11:40:44.74975 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 378ecf0c fe_ap_se01 fe_us_da01
2014-07-14T11:40:44.74975 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 79c70f58 fe_ap_se01 fe_us_ea01
2014-07-14T11:40:44.74976 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 633497ac fe_ap_se01 fe_us_ea02
2014-07-14T11:40:44.74976 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 27ab13d9 fe_ap_se01 fe_us_ea03
2014-07-14T11:40:44.74976 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 2d9fd736 fe_ap_se01 fe04
2014-07-14T11:40:44.74977 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 7de0ad83 fe_ap_se01 su02
2014-07-14T11:40:44.74977 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 34b3c326 fe_ap_se01 be01
2014-07-14T11:40:44.74977 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 539499c4 fe_ap_se01 su03
2014-07-14T11:40:44.74977 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 2e33b5b2 fe_ap_se01 be03
2014-07-14T11:40:44.74978 Got DEL_EDGE from fe01 (x.x.x.164 port 655): 13 7af672e1 fe_ap_se01 fe03
2014-07-14T11:40:44.74978 Got ADD_EDGE from be01 (x:x:x:4424::1 port 40768): 12 79d9bdfe fe_ap_se02 
fe02 x.x.x.146 655 c 849
2014-07-14T11:40:44.74978 Got ADD_EDGE from be01 (x:x:x:4424::1 port 40768): 12 1706c355 fe01 
fe_ap_se02 x:x:x:d74::2 655 c 3245
2014-07-14T11:40:44.74978 Got ADD_EDGE from be01 (x:x:x:4424::1 port 40768): 12 7e3dd1e2 fe_ap_se02 
be05 x.x.x.229 655 c 859
2014-07-14T11:40:44.74979 Got ADD_EDGE from be01 (x:x:x:4424::1 port 40768): 12 4e8c51cb fe_ap_se02 
be03 x.x.x.50 655 c 857
2014-07-14T11:40:44.74979 Got ADD_EDGE from be01 (x:x:x:4424::1 port 40768): 12 730c5871 fe_ap_se02 
fe01 x:x:x:54a3::2 655 c 3245
2014-07-14T11:40:44.74979 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 123b6762 
fe_eu_west02 fe_ap_se01
2014-07-14T11:40:44.74980 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 76d19a1d 
fe_eu_west01 fe_ap_se01
2014-07-14T11:40:44.74980 Got ADD_EDGE from be01 (x:x:x:4424::1 port 40768): 12 354ef41b fe_ap_se02 
be08 x.x.x.182 655 c 882
2014-07-14T11:40:44.74980 Got ADD_EDGE from be01 (x:x:x:4424::1 port 40768): 12 6730c99a fe_ap_se02 
be06 x.x.x.177 655 c 882
2014-07-14T11:40:44.74980 Got ADD_EDGE from be01 (x:x:x:4424::1 port 40768): 12 7c5d290e fe_ap_se02 
be07 x.x.x.4 655 c 892
2014-07-14T11:40:44.74981 Got ADD_EDGE from be01 (x:x:x:4424::1 port 40768): 12 176b2ecc fe_ap_se02 
fe04 x.x.x.34 655 c 892
2014-07-14T11:40:44.74981 Got ADD_EDGE from be01 (x:x:x:4424::1 port 40768): 12 2a5deba5 fe_ap_se02 
su01 x.x.x.99 655 c 894
2014-07-14T11:40:44.74981 Got ADD_EDGE from be01 (x:x:x:4424::1 port 40768): 12 4b8c6f57 fe_ap_se02 
fe01 x.x.x.164 655 c 894
2014-07-14T11:40:44.74982 Got ADD_EDGE from be01 (x:x:x:4424::1 port 40768): 12 3b230564 
fe_eu_west01 fe_ap_se01 x.x.x.118 655 c 484
2014-07-14T11:40:44.74982 Got ADD_EDGE from be01 (x:x:x:4424::1 port 40768): 12 7fb94445 
fe_eu_west02 fe_ap_se01 x.x.x.118 655 c 486
2014-07-14T11:40:44.74982 Got ADD_EDGE from be01 (x:x:x:4424::1 port 40768): 12 3fe0e8ba fe_ap_se02 
su02 x.x.x.98 655 c 898
2014-07-14T11:40:44.74982 Got ADD_EDGE from be01 (x:x:x:4424::1 port 40768): 12 6d3dad8a fe_ap_se02 
be01 x.x.x.83 655 c 894
2014-07-14T11:40:44.74983 Got ADD_EDGE from be01 (x:x:x:4424::1 port 40768): 12 1e42f6e4 fe_ap_se02 
be02 x.x.x.181 655 c 896
2014-07-14T11:40:44.74983 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 1a3a80fc be06 fe_ap_se01
2014-07-14T11:40:44.74983 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 3903b0cb be05 fe_ap_se01
2014-07-14T11:40:44.74984 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 4d5ad531 fe04 fe_ap_se01
2014-07-14T11:40:44.74984 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 3d43707b su02 fe_ap_se01
2014-07-14T11:40:44.74984 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 213b50d1 be02 fe_ap_se01
2014-07-14T11:40:44.74985 Got ADD_EDGE from be01 (x:x:x:4424::1 port 40768): 12 4da56662 fe_ap_se02 
fe_ap_se01 x.x.x.118 655 c 19
2014-07-14T11:40:44.74985 Got ADD_EDGE from be01 (x:x:x:4424::1 port 40768): 12 4b37aee7 fe_ap_se01 
fe_ap_se02 x.x.x.82 655 c 19
2014-07-14T11:40:44.74985 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 1084d927 be07 fe_ap_se01
2014-07-14T11:40:44.74985 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 30a264aa fe_ap_se01 be06
2014-07-14T11:40:44.74986 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 11407212 fe_ap_se01 be05
2014-07-14T11:40:44.74986 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 46ab6a05 fe_ap_se01 be08
2014-07-14T11:40:44.74986 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 55d1f17b fe_ap_se01 be02
2014-07-14T11:40:44.74987 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 3cba43b0 fe_ap_se01 
fe_eu_west01
2014-07-14T11:40:44.74987 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 5dd7f23b fe_ap_se01 
fe_eu_west02
2014-07-14T11:40:44.74987 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 474ec98b fe03 fe_ap_se01
2014-07-14T11:40:44.74987 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 3e3e1577 be01 fe_ap_se01
2014-07-14T11:40:44.74988 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 378ecf0c fe_ap_se01 
fe_us_da01
2014-07-14T11:40:44.74988 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 79c70f58 fe_ap_se01 
fe_us_ea01
2014-07-14T11:40:44.74988 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 633497ac fe_ap_se01 
fe_us_ea02
2014-07-14T11:40:44.74988 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 27ab13d9 fe_ap_se01 
fe_us_ea03
2014-07-14T11:40:44.74989 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 2d9fd736 fe_ap_se01 fe04
2014-07-14T11:40:44.74989 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 7de0ad83 fe_ap_se01 su02
2014-07-14T11:40:44.74989 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 34b3c326 fe_ap_se01 be01
2014-07-14T11:40:44.74990 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 539499c4 fe_ap_se01 su03
2014-07-14T11:40:44.74990 Got DEL_EDGE from be01 (x:x:x:4424::1 port 40768): 13 2e33b5b2 fe_ap_se01 be03
2014-07-14T11:40:44.74990 Metadata socket read error for fe_ap_se01 (x:x:x:d73::2 port 655): 
Connection reset by peer *Segfault on fe_ap_se01*
2014-07-14T11:40:44.74990 Closing connection with fe_ap_se01 (x:x:x:d73::2 port 655)
*Segfault (fe02)*

So, possibly it happens when the edge is deleted while receiving metadata from the node.
Hope this helps.

Best regards,
-Tuomas

On 14/07/14 12:20, Tuomas Silen wrote:
> Hi all,
>
> Just wanted to update my status on this. It has recently happened a few times (with around 20 nodes)
> that a network issue with one or two nodes has resulted in the entire vpn network going down as the
> nodes keep reconnecting fast and then segfault (and restart) and end up in a state where all of the
> tincds keep segfaulting and restarting all the time and the vpn never comes up again. A segfault on
> one node causes a segfault on another node basically. At that point the only way to get it back up
> again has been to prevent enough tincds from restarting after the segfault and slowly take them back
> up again after the rest of the vpn has settled.
>
> When that happens, the logs are full of "Flushing x bytes to y would block", unfortunately the
> logging level has been 2, so there's not much additional information available.
>
> We also have both Ipv4 and IPv6 addresses for the nodes if it's relevant.
>
> Note that this is still on tinc 1.0.23, not sure if there's any relevant changes in 1.0.24, but I'll
> try to upgrade to it anyway.
>
> This is what's in the logs:
> 2014-07-14T04:56:54.79824 Flushing 585820 bytes to be06 (10.9.10.60 port 655) would block
> 2014-07-14T04:56:54.84673 Flushing 598125 bytes to be06 (10.9.10.60 port 655) would block
> 2014-07-14T04:56:54.91778 Flushing 636047 bytes to be06 (10.9.10.60 port 655) would block
> 2014-07-14T04:56:55.02065 be06 (10.9.10.60 port 655) could not flush for 3 seconds (732141 bytes
> remaining)
> 2014-07-14T04:56:55.02066 Closing connection with be06 (10.9.10.60 port 655)
> 2014-07-14T04:56:55.02094 Trying to connect to be06 (fc00:9:10::60 port 655)
> 2014-07-14T04:56:55.03630 Connected to be06 (fc00:9:10::60 port 655)
> ...
> 2014-07-14T04:56:56.14377 Trying to connect to fe03 (fc00:9:11::2 port 655)
> 2014-07-14T04:56:56.16306 Flushing 138968 bytes to fe01 (10.9.11.146 port 52052) would block
> 2014-07-14T04:56:56.16708 Flushing 177911 bytes to be03 (fc00:9:10::30 port 655) would block
> 2014-07-14T04:56:56.18829 Flushing 476742 bytes to fe06 (10.9.12.203 port 44588) would block
> 2014-07-14T04:56:56.20198 Flushing 188423 bytes to be03 (fc00:9:10::30 port 655) would block
> 2014-07-14T04:56:56.20940 Flushing 153377 bytes to fe01 (10.9.11.146 port 52052) would block
> 2014-07-14T04:56:56.22510 Metadata socket read error for be06 (fc00:9:10::60 port 655): Connection
> reset by peer <- *Segfault on be06*
> 2014-07-14T04:56:56.22512 Closing connection with be06 (fc00:9:10::60 port 655)
> *Segfault on this node*
>
> The gdb for that one looks like this:
>
> Program terminated with signal 11, Segmentation fault.
> #0  edge_del (e=0xa2ef90) at edge.c:93
> 93                      e->reverse->reverse = NULL;
> (gdb) bt
> #0  edge_del (e=0xa2ef90) at edge.c:93
> #1  0x0000000000408a65 in terminate_connection (report=false, c=0x921030) at net.c:188
> #2  terminate_connection (c=0x921030, report=false) at net.c:168
> #3  0x00000000004090d3 in check_network_activity (writeset=0x7fff443e1f00, readset=0x7fff443e1e80)
> at net.c:342
> #4  main_loop () at net.c:434
> #5  0x000000000040478a in main (argc=<optimized out>, argv=<optimized out>) at tincd.c:656
>
> *e = {from = 0xa72f80, to = 0x53c36317, address = {...}, in = {...}, in6 = {...}, options =
> 1646294117, weight = 2036688745, connection = 0x36372e3434312065, reverse = 0x3636312e3635312e}
> e.connection = (struct connection_t *) 0x36372e3434312065
> (gdb) p *e.connection
> Cannot access memory at address 0x36372e3434312065
>
> e.reverse = (struct edge_t *) 0x3636312e3635312e
> (gdb) p *e.reverse
> Cannot access memory at address 0x3636312e3635316e
>
> c.name = "be06"
> c.status = {pinged = 0, active = 0, connecting = 0, unused_termreq = 0, remove = 1, timeout = 0,
> encryptout = 0, decryptin = 0, mst = 0, unused = 0}
> c.last_ping_time = 1405313815
> c.last_flushed_time = 1405313815
> c.node.name = "be06"
>
> (gdb) p *c.edge.to
> Cannot access memory at address 0x53c36317
>
> c.edge.from = {name = 0x3665363836203231 <Address 0x3665363836203231 out of bounds>, options =
> 543373622, sock = 1700750694, address = {...}, hostname = 0x20323a3a33613435 <Address
> 0x20323a3a33613435 out of bounds>, status = {unused_active = 0, validkey = 1, unused_waitingforkey =
> 1, visited = 0, reachable = 1, indirect = 1, unused = 8443092},  nexthop = 0x3939206320353536,
> prevedge = 0x37, via = 0xe8aa4b5274392b04, subnet_tree = 0x21, edge_tree = 0xc51670, connection =
> 0x53c36317, sent_seqno = 9796560, received_seqno = 0, farfuture = 33, mtu = 25367, minmtu = 21443,
> maxmtu = 0, mtuprobes = 9796560, mtuevent = 0x41 }
>
> (gdb) p *c.edge.from.nexthop
> Cannot access memory at address 0x3939206320353536
>
> (gdb) p *c.edge.from.connection
> Cannot access memory at address 0x53c36317
>
>
> By the way, in tinc.conf the ping interval and timeout are pretty low:
> PingInterval = 3
> PingTimeout = 2
>
> Best regards,
>
> -Tuomas
>
>
> On 27/06/14 08:32, zorun wrote:
>> On Thu, Jun 26, 2014 at 08:33:01PM +0200, Guus Sliepen wrote:
>>> On Wed, Jun 25, 2014 at 08:40:08AM +0200, zorun wrote:
>>>
>>>> I have been able to trigger this segfault reliably, with Tinc 1.0.24:
>>>>
>>>> 0/ setup your local Tinc node to connect to a remote dual-stacked Tinc
>>>>     node (that is, the remote node has both a A and a AAAA record)
>>>>
>>>> 1/ run Tinc in debug mode, "tincd -n NETNAME -D -d 3"
>>>>
>>>> 2/ wait until Tinc establishes a connection with the remote server
>>>>     (lots of "Got PACKET from REMOTE (XX.XX.XX.XX port 656)" messages)
>>>>
>>>> 3/ apply an insane amount of delay on your physical interface: "tc
>>>>     qdisc del dev eth0 root netem delay 10s"
>>>>
>>>> 4/ wait for Tinc to timeout on a ping.  It will then try to reconnect,
>>>>     and crash:
>>>
>>> I tried to reproduce it, but I do not get a crash. Running it in
>>> valgrind doesn't produce any errors either. One difference I observe
>>> with your log:
>>>
>>>> REMOTE (XX.XX.XX.XX port 656) didn't respond to PING in 5 seconds
>>>> Closing connection with REMOTE (XX.XX.XX.XX port 656)
>>>> Sending DEL_EDGE to everyone (BROADCAST)
>>>> UDP address of REMOTE cleared
>>>> UDP address of OTHER_SERVER1 cleared
>>>> UDP address of OTHER_SERVER2 cleared
>>>> UDP address of OTHER_SERVER3 cleared
>>>> UDP address of OTHER_SERVER4 cleared
>>>> Sending DEL_EDGE to everyone (BROADCAST)
>>>> Trying to connect to REMOTE (2001:db8::1 port 656)
>>>> Connected to REMOTE (2001:db8::1 port 656)
>>>> Sending ID to REMOTE (2001:db8::1 port 656)
>>>> Timeout from REMOTE (2001:db8::1 port 656) during authentication
>>>
>>> I wonder how you get to this point with the tc qdisc command, because if
>>> I do that then it never gets to the point where it makes a connection to
>>> the other side, because the SYN packets are also delayed 10 seconds. So
>>> I get:
>>>
>>> Trying to connect to REMOTE (2606:2800:220:6d:26bf:1447:1097:aa7 port 655)
>>> Connected to REMOTE (2606:2800:220:6d:26bf:1447:1097:aa7 port 655)
>>> Sending ID to REMOTE (2606:2800:220:6d:26bf:1447:1097:aa7 port 655)
>>> Timeout from REMOTE (2606:2800:220:6d:26bf:1447:1097:aa7 port 655) during authentication
>>
>> Isn't that the same output?  Didn't you mean that you *don't* see the
>> "Connected to REMOTE" part?
>>
>> But you are right, I forgot to mention I was using a SOCKS5 proxy on
>> localhost, sorry (I have even removed that from the log, it seems :/).
>> It makes sense: the TCP connection is established immediately with the
>> proxy on localhost, since it is not affected by the delaying qdisc.
>>
>> You can try this:
>>
>> - ssh -D 4242 -N remote-host
>> - "Proxy = socks5 localhost 4242" in tinc.conf
>>
>> Or even a simple SSH tunnel, the result is the same.  Assuming your
>> ssh client will listen on ::1 and your OS doesn't use IPV6_V6ONLY by
>> default:
>>
>> - "ssh -N -L 6565:localhost:655 remote-host"
>> - hardcode the remote address to both "::1 6565 and "127.0.0.1 6565"
>>    in the host file
>>
>>> Another difference is that you have 4 other nodes connected to the VPN,
>>> what happens if those are not running when you do this test?
>>
>> I can test that, yes.
>> _______________________________________________
>> tinc-devel mailing list
>> tinc-devel at tinc-vpn.org
>> http://www.tinc-vpn.org/cgi-bin/mailman/listinfo/tinc-devel
>>
> _______________________________________________
> tinc-devel mailing list
> tinc-devel at tinc-vpn.org
> http://www.tinc-vpn.org/cgi-bin/mailman/listinfo/tinc-devel


More information about the tinc-devel mailing list