NC | NSFS | Number of FDs of NooBaa Main Process #8542

Open
shirady opened this issue Nov 20, 2024 · 1 comment
Labels
Non Containerized Non containerized

Comments


shirady commented Nov 20, 2024

Environment info

  • NooBaa Version: current master, 5.18.0
  • Platform: NC, noobaa deployment on CES

Actual behavior

  1. When running a warp GET test against a single node running NooBaa with 32 forks, and monitoring the number of open file descriptors (FDs) of the main process, we can see that in some runs the number grows very high (for example, I saw 171,910 during one run). After the run finishes (or if we kill the process), the number returns to its initial value of 56. In the past we limited this number to 65,536, and we have since increased the limit (see PR NC | NSFS | Panic Printings Added + Try-Catch memory_monitor + Change Default Event Logs + Increase LimitNOFILE #8518). In other runs, however, the number does not change and stays at 56 for the whole run.
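
Because the observed count exceeded the old 65,536 limit, it can help to record the effective open-files limits of the main process next to the FD count. The following is a minimal sketch, assuming a Linux `/proc` filesystem; the function name `nofile_limits` is hypothetical and not part of NooBaa.

```shell
#!/bin/bash
# Hypothetical helper (not part of NooBaa): print the soft and hard
# "Max open files" limits of a process by reading /proc/<pid>/limits.
nofile_limits() {
  local pid="$1"
  # Line format: "Max open files   <soft>   <hard>   files"
  awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }' "/proc/${pid}/limits"
}

# Example: nofile_limits <main PID>
```

Comparing these values with the FD count shows how close the main process gets to its limit during a run.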

Expected behavior

  1. We might want to investigate and understand the source of that behavior.

Steps to reproduce

Copied from issue #8471 (see this comment):

From a node in the cluster that runs noobaa:

  1. Edit the config.json at /ibm/fs1/cessharedroot/ces/s3-config/config.json (for example with vi) and add the values "UV_THREADPOOL_SIZE": 16 and "ENDPOINT_FORKS": 32.
    Note: without this change, the run fails with a timeout error during the preparing step of warp.
  2. Run mms3 config change DEBUGLEVEL="all" so that the service restarts.
  3. Run the script run_counter_fd.sh as ./run_counter_fd.sh <main PID> (where <main PID> is the main process ID shown by systemctl status noobaa):
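#!/bin/bash

# Usage: ./run_counter_fd.sh <main PID>
echo "$0" "$1"
# Monitor the open file descriptor count every 10 seconds.
# Note: `ls -al` also prints the "total", "." and ".." lines,
# so each reported value is the FD count plus 3.
while true; do
  ls -al "/proc/$1/fd" | wc -l
  date
  sleep 10
done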
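
To narrow down the source of the growth, the count can be broken down by FD type. The following is a sketch only, assuming Linux `/proc` and permission to read the target process's `fd` directory; the function name `fd_breakdown` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: classify the open FDs of a process by link target
# and print a count per category, largest first.
fd_breakdown() {
  local pid="$1" fd target
  for fd in "/proc/${pid}/fd"/*; do
    # readlink fails if the FD was closed in the meantime; skip those
    target="$(readlink "$fd" 2>/dev/null)" || continue
    case "$target" in
      socket:*)     echo socket ;;
      pipe:*)       echo pipe ;;
      anon_inode:*) echo anon_inode ;;
      /*)           echo file ;;
      *)            echo other ;;
    esac
  done | sort | uniq -c | sort -rn
}

# Example: fd_breakdown <main PID>
```

If sockets dominate during a run with a high count, that points at connection handling more than at file access.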

Note: you can also use `lsof -c noobaa` and try to analyze its output (we saw many TCP sockets in state `CLOSE_WAIT`).
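
One way to analyze the `lsof` output is to count TCP sockets per connection state. A minimal sketch, assuming `lsof` prints the state in parentheses as the last field of each TCP line; the function name `tcp_state_summary` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: count TCP sockets per connection state
# from lsof output read on stdin.
tcp_state_summary() {
  # The state is the last field, in parentheses, e.g. "(ESTABLISHED)";
  # lines without such a field (for example the header) are ignored.
  awk '$NF ~ /^\(.*\)$/ { gsub(/[()]/, "", $NF); count[$NF]++ }
       END { for (s in count) print count[s], s }' | sort -rn
}

# Example (needs permission to inspect the process):
# lsof -c noobaa -a -i TCP -n -P | tcp_state_summary
```

Running this at each sampling interval, alongside the FD count, would show which state accumulates as the count grows.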

From a client node:
5. Create an account: noobaa-cli account add --name warp-shira --new_buckets_path /ibm/fs1/teams/ --uid 1001 --gid 1001 --fs_backend GPFS
6. Create an alias for the account (based on the existing account):
alias s3-u1='AWS_ACCESS_KEY_ID=<> AWS_SECRET_ACCESS_KEY=<> aws --no-verify-ssl --endpoint <ip-address-node>'
Check the connection by listing the buckets of the account: s3-u1 s3 ls; echo $?
7. Run the warp command: cd warp;
./warp get --host=<ip-address-node> --access-key="<>" --secret-key="<>" --obj.size=1k --concurrent=1000 --duration=30m --bucket=bsw-01 --insecure --tls (I ran it with one host)

More information - Screenshots / Logs / Other output

Attached are two partial runs (they are partial because of a broken connection on my side, which is unrelated to the issue):

  1. Without any change to the FD number.
  2. Increasing number of FDs.
    open_fds_02.txt
    open_fds_01.txt
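
To compare runs quickly, a log can be reduced to its sample count, minimum, and maximum. This is a sketch under the assumption that the attached files are the raw output of the monitoring script (count lines containing only digits, interleaved with date lines); the function name `fd_log_summary` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: summarize a log produced by the monitoring loop.
# Only lines that consist of a single number are treated as samples;
# date lines and the echoed command line are ignored.
fd_log_summary() {
  awk '/^[[:space:]]*[0-9]+[[:space:]]*$/ {
         n++
         if (n == 1 || $1 < min) min = $1
         if (n == 1 || $1 > max) max = $1
       }
       END { if (n) print "samples=" n, "min=" min, "max=" max }' "$1"
}

# Example: fd_log_summary open_fds_01.txt
```

A run where the count never changes would report identical `min` and `max` values.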
@shirady shirady added the NS-FS label Nov 20, 2024

shirady commented Nov 20, 2024

Hi,
In case you want to add something to the discussion:
@dannyzaken @guymguym, if you think it is worth continuing the investigation, do you have suggestions for things we can test to improve our understanding of internal Node.js behavior, etc.?

@romayalon @nadavMiz, please let me know if you think I missed any detail (I can either edit the description or add a comment).

@shirady shirady changed the title NC | NSFS | Number of FDs of The main Process NC | NSFS | Number of FDs of NooBaa main Process Nov 20, 2024
@shirady shirady changed the title NC | NSFS | Number of FDs of NooBaa main Process NC | NSFS | Number of FDs of NooBaa Main Process Nov 20, 2024
@nimrod-becker nimrod-becker added Non Containerized Non containerized and removed NS-FS labels Dec 16, 2024