Environment info

Actual behavior

When running a warp GET test against one node with noobaa configured with 32 forks, and monitoring the number of open file descriptors (FDs) of the main process, we can see that in some runs the count grows very high (for example, in one run I saw 171,910). In any case, after the run finishes (or if we kill the process), the count returns to the initial value of 56. In the past we limited this number to 65,536 and later increased it (see PR NC | NSFS | Panic Printings Added + Try-Catch memory_monitor + Change Default Event Logs + Increase LimitNOFILE #8518). However, there are also runs where this number does not change at all and stays at 56 for the whole run.
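For reference, a quick way to compare the current FD usage of the main process against its effective limit (a minimal sketch, not part of the original report; <main PID> is the same PID used in the reproduction steps below, and the "noobaa.service" unit name is an assumption - adjust if your deployment uses a different name):

# Compare current FD usage of the main process with its effective limit
grep -i 'open files' /proc/<main PID>/limits        # soft/hard "Max open files" limits
ls /proc/<main PID>/fd | wc -l                      # current number of open FDs
systemctl show noobaa.service -p LimitNOFILE        # NOFILE limit configured on the service unit (assumed unit name)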
Expected behavior
We might want to investigate and understand the source of that behavior.
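One way to start narrowing down the source is to break the main process's open FDs down by target type (a minimal sketch, not part of the original report; <main PID> as in the steps below):

# Group the main process's open FDs by target type (socket, anon_inode, pipe, file paths, ...)
ls -l /proc/<main PID>/fd | awk 'NR>1 {print $NF}' | cut -d: -f1 | sort | uniq -c | sort -rn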
Steps to reproduce

Copied from issue #8471 (see this comment):

From a node in the cluster that runs noobaa:

Change the config.json at /ibm/fs1/cessharedroot/ces/s3-config/config.json (e.g. vi /ibm/fs1/cessharedroot/ces/s3-config/config.json) and add the values "UV_THREADPOOL_SIZE": 16 and "ENDPOINT_FORKS": 32 (see the sketch below the note).
Note: without this change, the run fails with a timeout error during warp's preparing step.
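For reference, a minimal sketch of the two added entries (assuming config.json is a flat JSON object; all other existing keys are omitted here):

{
  "UV_THREADPOOL_SIZE": 16,
  "ENDPOINT_FORKS": 32
}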
Run mms3 config change DEBUGLEVEL="all" to trigger the restart.
Run the script run_counter_fd.sh as ./run_counter_fd.sh <main PID> (where <main PID> is the main process ID reported by system status noobaa):
#!/bin/bash
echo "$0 $1"
# monitor the open file descriptor count of the given PID every 10 seconds
while true; do
    ls -al "/proc/$1/fd" | wc -l
    date
    sleep 10
done
Note: you can also use lsof -c noobaa and try to analyze its output (we saw many cases of TCP sockets in state CLOSE_WAIT).
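For example, a rough way to group the noobaa TCP sockets reported by lsof by connection state (a sketch, assuming lsof is available on the node):

# Count TCP sockets of noobaa processes grouped by state (ESTABLISHED, CLOSE_WAIT, ...)
lsof -n -P -c noobaa -a -i TCP | awk 'NR>1 {print $NF}' | sort | uniq -c | sort -rn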
From a client node:
5. Create an account: noobaa-cli account add --name warp-shira --new_buckets_path /ibm/fs1/teams/ --uid 1001 --gid 1001 --fs_backend GPFS
6. Create the alias for the account (based on the existing account): alias s3-u1='AWS_ACCESS_KEY_ID=<> AWS_SECRET_ACCESS_KEY=<> aws --no-verify-ssl --endpoint <ip-address-node>'
Check the connection (by trying to list the buckets of the account): s3-u1 s3 ls; echo $?
7. Run the warp command: cd warp; ./warp get --host=<ip-address-node> --access-key="<>" --secret-key="<>" --obj.size=1k --concurrent=1000 --duration=30m --bucket=bsw-01 --insecure --tls (I ran it with 1 host)
More information - Screenshots / Logs / Other output
Attached are 2 partial runs (they are partial due to a corrupted connection on my side - not related to the issue):
open_fds_01.txt
open_fds_02.txt
Hi,
@dannyzaken @guymguym - in case you want to add something to the discussion: if you think it is worth continuing the investigation, suggestions for things we can test, to improve our knowledge of internal Node.js behavior, etc., are welcome.
@romayalon @nadavMiz - if you think I missed any detail, I can either edit the description or add a comment.