Large sites

Mike C smith.not.western at gmail.com
Mon Feb 25 14:03:02 CET 2013


On Fri, Feb 22, 2013 at 4:27 PM, Guus Sliepen <guus at tinc-vpn.org> wrote:
> On Fri, Feb 22, 2013 at 02:58:09PM +0000, Mike C wrote:
> Tinc requires roughly 250 bytes of memory for each node in the VPN. So if you
> have 2000 sites, then it will use 500 kilobytes. So this is not really an issue
> unless you run it on devices with very little memory. Tinc makes connections
> on demand, and uses connectionless UDP for most of them, so there is not much
> overhead there, except for the hub nodes.
>
> The largest overhead is that tinc daemons need to exchange information with
> each other. This happens when a tinc daemon just starts and connects to the
> hub, from which it has to learn information about the other nodes. Also, each
> time a node joins or leaves the VPN this will be broadcast to all other nodes.
> There is roughly 100 bytes of information that needs to be exchanged per node;
> so in a 2000-node network a new node that connects to the hub will receive 200
> kilobytes from the hub, and 100 bytes are broadcast to all other nodes to inform
> them of the new node that just connected. You can calculate the load based on
> how many nodes you expect to join/leave the VPN every second.

Thanks, this is useful information. I can see nodes coming and going
frequently due to the unreliable nature of their links (DSL, cable,
3G), so the 100-byte broadcast per join/leave might be something I'll
need to keep an eye on: if the VPN churned at, say, one join or leave
per second, that would be roughly 100 bytes/s arriving at every node,
or about 200 kB/s in aggregate across a 2000-node network. Looking
through the docs, would setting TunnelServer = yes remove the need for
this broadcast? I presume it forces all data to go through known,
pre-defined tinc daemons, so there is no need to make every node aware
of every other node.
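
For reference, this is roughly what I had in mind on each spoke; the
names and netname below are just placeholders, and I may well be
misreading what TunnelServer actually does:

    # /etc/tinc/myvpn/tinc.conf on a remote site (names are placeholders)
    Name = site0001
    ConnectTo = hub1
    TunnelServer = yes   # only talk to nodes we have host config files for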

>> I have read elsewhere on this list that tincd isn't multi-threaded and
>> to get the most out of a multi-core server you should split the VPN
>> into smaller VPNs. Is this still the case and if so, are there any
>> reasons that would prevent it being made multi-threaded?
>
> The reason it is not multi-threaded is because that makes the code more
> complex, especially because it would require locking for many data structures.
> Also, the CPU (assuming it is a decent one) becomes the bottleneck only if you
> have a network faster than 100 Mbit/s. But indeed, if you have that, and you
> have a hub-and-spoke model anyway, then you can run multiple daemons on the
> hub.

Point taken.

A single instance keeps the configuration simpler; I guess the only
real pain I'm likely to see is if I add CPUs to the hubs later and
have to redistribute the nodes evenly across multiple daemons.
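
If it ever comes to that, my understanding is it would just mean one
netname per daemon on the hub, each with its own config directory and
its own set of spokes, e.g. (netnames made up, default config paths
assumed):

    # two independent tinc daemons on the hub, one per core
    tincd -n vpn-a   # reads /etc/tinc/vpn-a/tinc.conf
    tincd -n vpn-b   # reads /etc/tinc/vpn-b/tinc.conf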

>> In my case, the majority of the traffic will ultimately reach 1
>> location/datacentre. So hub-and-spoke model. There's no need for
>> meshing between sites, except for maybe between the hub(s) themselves.
>> IPSec doesn't work so well, given problems with NAT (even with NAT-T)
>> - which is where tinc comes in. The hub itself is unfortunately NAT'd,
>> and so are most of the remote sites, so I am trying to think of
>> alternative approaches. Thinking tinc could be used as an intermediary
>> between the dc and the remote sites. E.g.
>>
>> Datacentre <--> intermediary tinc server on non-NAT public IP <--> remote sites.
>>
>> In fact I was thinking of running multiple intermediary tinc servers,
>> to provide some form of redundancy if one failed (using the Subnet
>> #weight setting).
>
> Using multiple intermediary servers is a good idea. However, you don't have to
> assign Subnets to them at all, they can just be there to help the datacentre
> and remote sites punch holes through their NATs.

That's good to know, appreciate the feedback.
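
For the archives, the sort of setup I'm now leaning towards looks like
this; relay1 and relay2 are made-up names for the intermediaries on
public IPs, and their host files carry no Subnet lines at all:

    # tinc.conf on the datacentre node and on each remote site
    Name = site0001          # Name = datacentre on the DC node
    ConnectTo = relay1
    ConnectTo = relay2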

Cheers,

Mike

