[PATCH] Eternal flush, memory leaks

Scott Lamb slamb at slamb.org
Tue Feb 13 10:02:55 CET 2007


Using tincd 1.0.7, if I send a SIGALRM to tincd when a host is  
unresolvable, it gets stuck in a nasty loop:

Feb 12 19:33:02 rosalyn tinc.slamb.org[2925]: Got ALRM signal
Feb 12 19:33:02 rosalyn tinc.slamb.org[2925]: Trying to connect to calvin (216.136.66.56 port 655)
Feb 12 19:33:02 rosalyn tinc.slamb.org[2925]: Error looking up slamb-linux.dyn.slamb.org port 4500: Name or service not known
Feb 12 19:33:02 rosalyn tinc.slamb.org[2925]: Could not set up a meta connection to slamb_linux
Feb 12 19:33:02 rosalyn tinc.slamb.org[2925]: Trying to re-establish outgoing connection in 15 seconds
Feb 12 19:33:03 rosalyn tinc.slamb.org[2925]: Flushing event queue
Feb 12 19:33:03 rosalyn tinc.slamb.org[2925]: Error looking up slamb-linux.dyn.slamb.org port 4500: Name or service not known
Feb 12 19:33:03 rosalyn tinc.slamb.org[2925]: Could not set up a meta connection to slamb_linux
Feb 12 19:33:03 rosalyn tinc.slamb.org[2925]: Trying to re-establish outgoing connection in 20 seconds
Feb 12 19:33:03 rosalyn tinc.slamb.org[2925]: Error looking up slamb-linux.dyn.slamb.org port 4500: Name or service not known
Feb 12 19:33:03 rosalyn tinc.slamb.org[2925]: Could not set up a meta connection to slamb_linux
Feb 12 19:33:03 rosalyn tinc.slamb.org[2925]: Trying to re-establish outgoing connection in 25 seconds
Feb 12 19:33:03 rosalyn tinc.slamb.org[2925]: Error looking up slamb-linux.dyn.slamb.org port 4500: Name or service not known
Feb 12 19:33:03 rosalyn tinc.slamb.org[2925]: Could not set up a meta connection to slamb_linux
Feb 12 19:33:03 rosalyn tinc.slamb.org[2925]: Trying to re-establish outgoing connection in 30 seconds
Feb 12 19:33:03 rosalyn tinc.slamb.org[2925]: Error looking up slamb-linux.dyn.slamb.org port 4500: Name or service not known
Feb 12 19:33:03 rosalyn tinc.slamb.org[2925]: Could not set up a meta connection to slamb_linux
...

While it is stuck in this loop, tincd keeps consuming memory until the
kernel's out-of-memory killer kills it. If I take the unresolvable
address out of the configuration, it works fine.

Looks like the problem is this:

             logger(LOG_INFO, _("Flushing event queue"));

             while(event_tree->head) {
                 event = event_tree->head->data;
                 event->handler(event->data);
                 event_del(event);
             }

There's initially a setup_outgoing_connection() event there. It calls
do_outgoing_connection(), which on resolve failure calls
retry_outgoing(), which adds another setup_outgoing_connection()
event. Events are added as fast as they are taken away, so the queue
never becomes empty and the flush never terminates. The memory growth
appears to come from connections, which are apparently only removed in
build_fdset() and terminate_connection(); neither gets called in this
tight loop.

I've attached a patch (tinc-1.0.7-flushfix.patch) that only flushes
events that already exist when the flush starts. I've also attached a
patch (tinc-1.0.7-leaks.patch) that fixes a couple of minor memory
leaks I spotted in "valgrind --tool=memcheck" output while looking for
this problem.
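
To illustrate the idea of flushing only the events that already exist,
here is a minimal, self-contained sketch. It is not the contents of the
attached patch and does not use tinc's real data structures: the
event_queue_t FIFO, event_add(), event_pop(), failing_retry() and
flush_existing_events() are all hypothetical stand-ins (tinc keeps its
events in a tree). The point is only that the flush takes a count of
pending events up front and stops after that many, even if a handler
re-queues itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for tinc's event queue: a simple FIFO list. */
typedef struct event {
    void (*handler)(void *data);
    void *data;
    struct event *next;
} event_t;

typedef struct {
    event_t *head;
    event_t *tail;
    size_t count;
} event_queue_t;

static event_queue_t queue;   /* zero-initialized: empty queue */
static int handler_calls;     /* counts handler invocations */

/* Append a new event to the end of the queue. */
static void event_add(void (*handler)(void *), void *data) {
    event_t *e = malloc(sizeof *e);
    if (!e)
        abort();
    e->handler = handler;
    e->data = data;
    e->next = NULL;
    if (queue.tail)
        queue.tail->next = e;
    else
        queue.head = e;
    queue.tail = e;
    queue.count++;
}

/* Detach and return the oldest event, or NULL if the queue is empty. */
static event_t *event_pop(void) {
    event_t *e = queue.head;
    if (!e)
        return NULL;
    queue.head = e->next;
    if (!queue.head)
        queue.tail = NULL;
    queue.count--;
    return e;
}

/* Mimics a retry handler that always fails and reschedules itself. */
static void failing_retry(void *data) {
    handler_calls++;
    event_add(failing_retry, data);
}

/* Run only the events that were queued when the flush started.
 * Events added by handlers during the flush stay queued for later. */
static size_t flush_existing_events(void) {
    size_t pending = queue.count;   /* snapshot taken before any handler runs */
    size_t flushed = 0;

    for (size_t i = 0; i < pending; i++) {
        event_t *e = event_pop();
        assert(e != NULL);          /* count and list stay in sync */
        e->handler(e->data);
        free(e);
        flushed++;
    }
    return flushed;
}
```

With two failing_retry events queued, one call to
flush_existing_events() runs exactly two handlers and returns, leaving
the two rescheduled events in the queue for their normal timeout.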

Cheers,
Scott

-- 
Scott Lamb <http://www.slamb.org/>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: tinc-1.0.7-flushfix.patch
Type: application/octet-stream
Size: 1782 bytes
Desc: not available
Url : http://brouwer.uvt.nl/pipermail/tinc/attachments/20070213/3a5dbac0/tinc-1.0.7-flushfix.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tinc-1.0.7-leaks.patch
Type: application/octet-stream
Size: 1460 bytes
Desc: not available
Url : http://brouwer.uvt.nl/pipermail/tinc/attachments/20070213/3a5dbac0/tinc-1.0.7-leaks.obj

