how to pick cipher for AES-NI enabled AMD GX-412TC SOC tincd at 100% CPU

Jelle de Jong jelledejong at powercraft.nl
Sat May 9 12:25:07 CEST 2020


Hello everybody,

I would still love to know how I can optimize my tinc setup so that it 
goes faster instead of sitting at 100% CPU load for only 10 MB/s...
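
As a rough sanity check on the numbers quoted below, here is a minimal 
Python sketch (not part of tinc or OpenSSL) that parses the summary rows 
printed by openssl speed and compares them with the throughput seen 
through the tunnel. The function names, the choice of the 1024-byte 
column as the closest one to PMTU = 1400, and the reading of "MB/s" as 
10^6 bytes per second are assumptions made for illustration only.

```python
# Figures copied from the quoted messages below; nothing is measured anew.
OPENSSL_SPEED_ROWS = """\
aes-128-cbc      74164.26k   140205.23k   192356.95k   212688.55k   218611.71k   219436.37k
aes-256-cbc      65718.76k   112713.19k   143891.71k   155110.40k   158629.89k   158569.81k
"""

# Block sizes in bytes, in the order of the openssl speed header row.
BLOCK_SIZES = (16, 64, 256, 1024, 8192, 16384)

# Assumption: "10 MB/s" in the messages means 10 * 10**6 bytes per second.
OBSERVED_TUNNEL_RATE = 10e6


def parse_speed_line(line):
    """Return (cipher_name, {block_size: bytes_per_second}) for one row."""
    name, *columns = line.split()
    # openssl reports "1000s of bytes per second", hence the trailing 'k'.
    rates = [float(col.rstrip("k")) * 1000 for col in columns]
    return name, dict(zip(BLOCK_SIZES, rates))


def headroom(cipher_rate, tunnel_rate):
    """How many times faster the raw cipher is than the tunnel."""
    return cipher_rate / tunnel_rate


if __name__ == "__main__":
    for row in OPENSSL_SPEED_ROWS.splitlines():
        cipher, rates = parse_speed_line(row)
        # 1024 bytes is the benchmark size closest to (and below) PMTU 1400.
        factor = headroom(rates[1024], OBSERVED_TUNNEL_RATE)
        print(f"{cipher}: {rates[1024] / 1e6:.1f} MB/s raw, "
              f"{factor:.1f}x the observed tunnel rate")
```

With these figures the 1024-byte aes-128-cbc rate comes out at roughly 
21 times the 10 MB/s seen through the tunnel. Together with the 
Cipher = none result, that may suggest the cipher itself is not the main 
cost. Note that openssl speed only measures raw encryption on one core, 
so this is an upper bound and not a prediction of tunnel throughput.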

Kind regards,

Jelle de Jong

On 2020-04-04 21:33, Jelle de Jong wrote:
> Hello everybody,
> 
> Thank you Fufu Fang for your quick reply:
> 
> With tinc version 1.0.35 and the options below, I get about 10 MB/s 
> at 100% CPU load:
> 
> PMTU = 1400
> PMTUDiscovery = yes
> #Cipher = none
> Cipher = chacha20-poly1305
> Digest = blake2b512
> 
> I tried Cipher = none as well and also got 10 MB/s, with 100% CPU on 
> one thread while the other three available threads sat idle.
> 
> With tinc_1.1~pre17-1.1_amd64.deb and libssl1.1:amd64 1.1.1d-0+deb10u2 I 
> get the following error:
> 
> Apr 04 19:03:19 officelink01 tincd[522]: Error while decrypting: error:060A7094:digital envelope routines:EVP_EncryptUpdate:invalid operation
> 
> Installation steps:
> wget http://ftp.nl.debian.org/debian/pool/main/t/tinc/tinc_1.1~pre17-1.1_amd64.deb
> dpkg -i tinc_1.1~pre17-1.1_amd64.deb
> apt-get -f install
> 
> Any speed improvement ideas?
> 
> Kind regards,
> 
> Jelle
> 
> On 2020-04-04 20:02, Jelle de Jong wrote:
>> Hello everybody,
>>
>> First, a big thanks for tinc-vpn; I am still using it alongside 
>> WireGuard and OpenVPN.
>>
>> I have a setup where the tinc Debian appliance is at 100% CPU load 
>> while doing about 7.5 MB/s.
>>
>> Compression = 9
>> PMTU = 1400
>> PMTUDiscovery = yes
>> Cipher = aes-128-cbc
>>
>> How can I pick the cipher that is fastest for my CPU and does not 
>> create a CPU bottleneck at 100%?
>>
>> Kind regards,
>>
>> Jelle de Jong
>>
>> root@officelink01:~# lscpu
>> Architecture:        x86_64
>> CPU op-mode(s):      32-bit, 64-bit
>> Byte Order:          Little Endian
>> Address sizes:       40 bits physical, 48 bits virtual
>> CPU(s):              4
>> On-line CPU(s) list: 0-3
>> Thread(s) per core:  1
>> Core(s) per socket:  4
>> Socket(s):           1
>> NUMA node(s):        1
>> Vendor ID:           AuthenticAMD
>> CPU family:          22
>> Model:               48
>> Model name:          AMD GX-412TC SOC
>> Stepping:            1
>> CPU MHz:             775.729
>> CPU max MHz:         1000.0000
>> CPU min MHz:         600.0000
>> BogoMIPS:            1996.08
>> Virtualization:      AMD-V
>> L1d cache:           32K
>> L1i cache:           32K
>> L2 cache:            2048K
>> NUMA node0 CPU(s):   0-3
>> Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr 
>> pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext 
>> fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good acc_power nopl 
>> nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 
>> cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c lahf_lm cmp_legacy 
>> svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs 
>> skinit wdt topoext perfctr_nb bpext ptsc perfctr_llc cpb hw_pstate 
>> ssbd vmmcall bmi1 xsaveopt arat npt lbrv svm_lock nrip_save tsc_scale 
>> flushbyasid decodeassists pausefilter pfthreshold overflow_recov
>>
>> root@officelink01:~# openssl help
>> Standard commands
>> asn1parse         ca                ciphers           cms
>> crl               crl2pkcs7         dgst              dhparam
>> dsa               dsaparam          ec                ecparam
>> enc               engine            errstr            gendsa
>> genpkey           genrsa            help              list
>> nseq              ocsp              passwd            pkcs12
>> pkcs7             pkcs8             pkey              pkeyparam
>> pkeyutl           prime             rand              rehash
>> req               rsa               rsautl            s_client
>> s_server          s_time            sess_id           smime
>> speed             spkac             srp               storeutl
>> ts                verify            version           x509
>>
>> Message Digest commands (see the `dgst' command for more details)
>> blake2b512        blake2s256        gost              md4
>> md5               rmd160            sha1              sha224
>> sha256            sha3-224          sha3-256          sha3-384
>> sha3-512          sha384            sha512            sha512-224
>> sha512-256        shake128          shake256          sm3
>>
>> Cipher commands (see the `enc' command for more details)
>> aes-128-cbc       aes-128-ecb       aes-192-cbc       aes-192-ecb
>> aes-256-cbc       aes-256-ecb       aria-128-cbc      aria-128-cfb
>> aria-128-cfb1     aria-128-cfb8     aria-128-ctr      aria-128-ecb
>> aria-128-ofb      aria-192-cbc      aria-192-cfb      aria-192-cfb1
>> aria-192-cfb8     aria-192-ctr      aria-192-ecb      aria-192-ofb
>> aria-256-cbc      aria-256-cfb      aria-256-cfb1     aria-256-cfb8
>> aria-256-ctr      aria-256-ecb      aria-256-ofb      base64
>> bf                bf-cbc            bf-cfb            bf-ecb
>> bf-ofb            camellia-128-cbc  camellia-128-ecb  camellia-192-cbc
>> camellia-192-ecb  camellia-256-cbc  camellia-256-ecb  cast
>> cast-cbc          cast5-cbc         cast5-cfb         cast5-ecb
>> cast5-ofb         des               des-cbc           des-cfb
>> des-ecb           des-ede           des-ede-cbc       des-ede-cfb
>> des-ede-ofb       des-ede3          des-ede3-cbc      des-ede3-cfb
>> des-ede3-ofb      des-ofb           des3              desx
>> rc2               rc2-40-cbc        rc2-64-cbc        rc2-cbc
>> rc2-cfb           rc2-ecb           rc2-ofb           rc4
>> rc4-40            seed              seed-cbc          seed-cfb
>> seed-ecb          seed-ofb          sm4-cbc           sm4-cfb
>> sm4-ctr           sm4-ecb           sm4-ofb
>>
>> root@officelink01:~# openssl speed -elapsed -evp aes-128-cbc
>> You have chosen to measure elapsed time instead of user CPU time.
>> Doing aes-128-cbc for 3s on 16 size blocks: 13905799 aes-128-cbc's in 3.00s
>> Doing aes-128-cbc for 3s on 64 size blocks: 6572120 aes-128-cbc's in 3.00s
>> Doing aes-128-cbc for 3s on 256 size blocks: 2254183 aes-128-cbc's in 3.00s
>> Doing aes-128-cbc for 3s on 1024 size blocks: 623111 aes-128-cbc's in 3.00s
>> Doing aes-128-cbc for 3s on 8192 size blocks: 80058 aes-128-cbc's in 3.00s
>> Doing aes-128-cbc for 3s on 16384 size blocks: 40180 aes-128-cbc's in 3.00s
>> OpenSSL 1.1.1d  10 Sep 2019
>> built on: Sat Oct 12 19:56:43 2019 UTC
>> options:bn(64,64) rc4(8x,int) des(int) aes(partial) blowfish(ptr)
>> compiler: gcc -fPIC -pthread -m64 -Wa,--noexecstack -Wall 
>> -Wa,--noexecstack -g -O2 
>> -fdebug-prefix-map=/build/openssl-YwazYa/openssl-1.1.1d=. 
>> -fstack-protector-strong -Wformat -Werror=format-security 
>> -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ 
>> -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 
>> -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM 
>> -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM 
>> -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DNDEBUG 
>> -Wdate-time -D_FORTIFY_SOURCE=2
>> The 'numbers' are in 1000s of bytes per second processed.
>> type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
>> aes-128-cbc      74164.26k   140205.23k   192356.95k   212688.55k   218611.71k   219436.37k
>> root@officelink01:~# openssl speed -elapsed -evp aes-256-cbc
>> You have chosen to measure elapsed time instead of user CPU time.
>> Doing aes-256-cbc for 3s on 16 size blocks: 12322268 aes-256-cbc's in 3.00s
>> Doing aes-256-cbc for 3s on 64 size blocks: 5283431 aes-256-cbc's in 3.00s
>> Doing aes-256-cbc for 3s on 256 size blocks: 1686231 aes-256-cbc's in 3.00s
>> Doing aes-256-cbc for 3s on 1024 size blocks: 454425 aes-256-cbc's in 3.00s
>> Doing aes-256-cbc for 3s on 8192 size blocks: 58092 aes-256-cbc's in 3.00s
>> Doing aes-256-cbc for 3s on 16384 size blocks: 29035 aes-256-cbc's in 3.00s
>> OpenSSL 1.1.1d  10 Sep 2019
>> built on: Sat Oct 12 19:56:43 2019 UTC
>> options:bn(64,64) rc4(8x,int) des(int) aes(partial) blowfish(ptr)
>> compiler: gcc -fPIC -pthread -m64 -Wa,--noexecstack -Wall 
>> -Wa,--noexecstack -g -O2 
>> -fdebug-prefix-map=/build/openssl-YwazYa/openssl-1.1.1d=. 
>> -fstack-protector-strong -Wformat -Werror=format-security 
>> -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ 
>> -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 
>> -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM 
>> -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM 
>> -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DNDEBUG 
>> -Wdate-time -D_FORTIFY_SOURCE=2
>> The 'numbers' are in 1000s of bytes per second processed.
>> type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
>> aes-256-cbc      65718.76k   112713.19k   143891.71k   155110.40k   158629.89k   158569.81k
>> root@officelink01:~#
>> _______________________________________________
>> tinc mailing list
>> tinc at tinc-vpn.org
>> https://www.tinc-vpn.org/cgi-bin/mailman/listinfo/tinc

