low performance in network not in localhost #57
Comments
Hi, When you say "tested locally", do you mean through a Unix socket or through a network TCP socket on localhost? If your test is on the localhost network interface, the performance loss is indeed weird ... You can use for instance a Regards,
Hello @mortezaataiy, you have not provided enough information for us to understand where the performance issues you are observing come from. If you provide more details, we might be able to help.
TCP socket on localhost: TCP socket on network: Caml-crush uses default configs. Instead of: DSS is built on .NET Core and uses Net.Pkcs11Interop.X509Store and System.Security.Cryptography. Does the proxy handle calls in parallel or serially? I will try to check it with Thank you both @calderonth and @rb-anssi 🌹
Without caml-crush, how does DSS handle parallel requests with the original PKCS#11 library? caml-crush is based on Netplex, which should spawn a new process for each new PKCS#11 session, so for multiple PKCS#11 sessions parallelism should not be an issue. It might be helpful to provide more information on what kind of PKCS#11 requests the REST API / DSS generates. Finally, to check whether network and/or parallelism are to blame, an insightful test would be to measure the time taken by 1000 sequential requests in both scenarios (localhost and through the network). Regards,
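The sequential-versus-parallel test suggested above could be sketched roughly as follows. This is only an illustration: `fake_sign` is a hypothetical stand-in that sleeps for 1 ms, and it would have to be replaced by a real signature call (through the DSS REST API or a PKCS#11 binding) to produce meaningful numbers.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def time_sequential(op, n):
    """Run op() n times one after another; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(n):
        op()
    return time.perf_counter() - start


def time_parallel(op, n, workers):
    """Run op() n times across a thread pool; return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for every call and re-raises any worker exception.
        list(pool.map(lambda _: op(), range(n)))
    return time.perf_counter() - start


def fake_sign():
    # Hypothetical placeholder for one signature request; not a real PKCS#11 call.
    time.sleep(0.001)


if __name__ == "__main__":
    n = 1000
    print(f"sequential:  {time_sequential(fake_sign, n):.2f}s")
    print(f"parallel x4: {time_parallel(fake_sign, n, 4):.2f}s")
```

Running the same script once against localhost and once across the network, with an identical `op`, separates the cost of the network from the cost of parallelism.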
Hi @rb-anssi, .NET Core uses multiple threads by default and I didn't change that. Netplex has some config options. I tried changing them, but nothing changed. I improved my network and its ping is now 1 ms, but caml-crush still spends 7 seconds on 1000 signs.
Log for one sign:
So
Hi, Thanks for the figures and the feedback. If I get them right, processing with caml-crush goes from 4s to 7s in parallel between local and remote, while it goes from 1.5s to 4s without caml-crush (the 4s is taken from your first post). It seems to me that we have roughly the same multiplicative scaling factor between these scenarios, i.e. a factor of around 2 between caml-crush and the native implementation, which seems in line with the workload the proxy handles. Can you please confirm that all the PKCS#11 sessions are opened and handled in different processes / TCP connections by Regards,
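For reference, the ratios behind the "factor of around 2" estimate can be worked out from the four figures quoted above. The snippet below is just that arithmetic; the dictionary layout is an illustrative choice, and the numbers are the ones reported in this thread, not new measurements.

```python
# Seconds per 1000 signatures, as quoted in this thread.
timings = {
    "native": {"local": 1.5, "remote": 4.0},
    "caml-crush": {"local": 4.0, "remote": 7.0},
}

# Overhead of the proxy relative to the native library, per scenario.
proxy_overhead = {
    where: timings["caml-crush"][where] / timings["native"][where]
    for where in ("local", "remote")
}

# Slowdown when moving from localhost to the network, per implementation.
network_slowdown = {
    impl: t["remote"] / t["local"] for impl, t in timings.items()
}

for where, factor in proxy_overhead.items():
    print(f"{where}: caml-crush / native = {factor:.2f}")
for impl, factor in network_slowdown.items():
    print(f"{impl}: remote / local = {factor:.2f}")
```

The proxy overhead comes out at about 2.67 locally and 1.75 remotely, so "around 2" is a fair summary of both.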
The number of active DSS threads is 4 by default (equal to the number of CPU cores). Each request is handled in a separate thread. I increased the number of active threads, but performance decreased. I tried to call many sign functions in parallel without DSS. I call
How did you test the performance of Caml-crush? It is mentioned in this PDF: https://eprint.iacr.org/2015/063.pdf Are these limitations related to this discussion? Also, about ping time: the DSS instances can't be close to the HSM, so ping can be more than 100 ms. Thanks
When using pkcs11-tool in a loop, the library would get loaded for each
loop cycle, creating new connections to the server (costly).
However, observing 1s for a single signature operation using pkcs11-tool, I
suspect you have network issues/latency impacting the overall performance.
You should collect network measurements of latency, jitter and packet loss
between your client/server hosts to ensure you have a good quality link.
When done, you could share those results with us.
Equally, you could test caml-crush in a VM/container setup to simulate a
network setup that has no network loss to validate the usual performance
with a software token (such as SoftHSM).
Hope this helps.
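As a rough way to collect such measurements without extra tooling, the sketch below times TCP connection setup to the proxy and summarizes the samples. It is an approximation under stated assumptions: connect time stands in for round-trip latency, the population standard deviation stands in for jitter, and failed connection attempts are only a crude proxy for packet loss. The host and port in the usage comment are hypothetical placeholders.

```python
import socket
import statistics
import time


def connect_rtts(host, port, samples=20, timeout=2.0):
    """Time TCP connection setup to host:port; None marks a failed attempt."""
    results = []
    for _ in range(samples):
        start = time.perf_counter()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results.append(time.perf_counter() - start)
        except OSError:
            # Covers refused connections, timeouts and unreachable hosts.
            results.append(None)
    return results


def summarize(rtts):
    """Return mean latency, jitter (population std dev) and failure ratio."""
    ok = [r for r in rtts if r is not None]
    return {
        "mean_ms": 1000 * statistics.mean(ok) if ok else None,
        "jitter_ms": 1000 * statistics.pstdev(ok) if ok else None,
        "failed": (len(rtts) - len(ok)) / len(rtts) if rtts else None,
    }


# Usage (hypothetical host and port, replace with the real proxy address):
# print(summarize(connect_rtts("pkcs11-proxy.example", 4444)))
```

Dedicated tools such as `ping` or `mtr` give more accurate loss figures; this only offers a quick first look from the client host.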
…On Sun, 24 Apr 2022, 03:01 morteza ataiy, ***@***.***> wrote:
The number of our active DSS threads is 4 by default (equal to the number
of CPU cores). Each request handled in a separate thread. I increase the
number of active threads but the performance decreases.
So yeah. All sessions are opened and handled in different processes.
I tried to call many sign functions in parallel without DSS. I call pkcs11-tool
--sign command but it was so slow (about 1 sign per seconds)
It also has some errors:
Error RPC with C_SetupArch
Unsupported architecture error EXITING
caml-crush: C_SetupArch: failed detecting architecture
error: PKCS11 function C_Initialize failed: rv = CKR_DEVICE_ERROR (0x30)
How did you test the performance of Caml-crush? It is mentioned in this
PDF: https://eprint.iacr.org/2015/063.pdf
Is this LIMITATIONS related to this discussion?
https://github.com/caml-pkcs11/caml-crush/blob/master/ISSUES.md#handling-synchronization-ocaml-client-library-
https://github.com/caml-pkcs11/caml-crush/blob/master/ISSUES.md#handling-synchronization-c-client-library
Also about ping time: DSSs can't be near to HSM so ping can be more than
100ms.
I'm coming to the conclusion that I should stop using caml-crush :(
Thanks
So why does it work well with DSS? :) I'm grateful
I'd be very interested to know how you measured Caml-crush performance.
Hello @mortezaataiy, I now believe what you're observing is totally expected. If you have a high-latency network, this will have a significant impact when using the Caml-Crush client/server (as with any other client/server that handles some sort of synchronous RPCs). As for the other scenario you describe as performing better, I believe it is not an appropriate comparison. I have performed a performance comparison between a client/server on the same host (TCP) and a client/server in different Docker containers (same LAN). I can confirm that the performance difference between TCP/localhost and TCP/LAN is negligible. It does sound like, if you can't improve your RTT/latency and/or tweak the TCP settings to better cope with it, Caml-Crush might not be a good fit here.
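To illustrate why latency dominates with synchronous RPCs, here is a deliberately simple back-of-the-envelope model. It assumes each operation needs a fixed number of blocking round trips and that workers run fully in parallel. The parameter values, in particular `rpcs_per_op` and `service_s`, are illustrative assumptions rather than measured Caml-Crush figures, and the model ignores connection setup and TCP effects.

```python
import math


def estimate_seconds(n_ops, rpcs_per_op, rtt_s, service_s, workers):
    """Estimate wall time when every RPC blocks for one round trip plus service time."""
    per_op = rpcs_per_op * (rtt_s + service_s)
    # Each worker handles its share of operations strictly one after another.
    return math.ceil(n_ops / workers) * per_op


if __name__ == "__main__":
    # Assumed values: 2 round trips per signature, 1 ms server-side work, 4 workers.
    for rtt_ms in (0.1, 1, 100):
        total = estimate_seconds(1000, 2, rtt_ms / 1000, 0.001, 4)
        print(f"RTT {rtt_ms} ms -> about {total:.1f} s for 1000 signatures")
```

Under these assumptions a 100 ms round trip alone pushes 1000 signatures to roughly 50 seconds, regardless of how fast the proxy or the token is.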
Hello
I am testing caml-crush performance to support a very high request rate (for example, 1000 requests per second).
Using this proxy locally works well for me: 1000 requests complete in 4 seconds. Over the network it does not.
I tested on my network with the REST API and 1000 requests finish in 9 seconds. This increases to 18 seconds when using caml-crush as a proxy.
What changes when I use caml-crush on localhost vs. over the network?
I tried disabling all filter features and also changing the "netplex.service.workload_manager" settings, but neither changed the performance.
Can anyone help me, please?