Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Settings timeout error with grpc-js client #13175

Closed
Tanavya opened this issue Aug 8, 2024 · 5 comments
Closed

Settings timeout error with grpc-js client #13175

Tanavya opened this issue Aug 8, 2024 · 5 comments
Assignees
Labels
bug Something isn't working

Comments

@Tanavya
Copy link

Tanavya commented Aug 8, 2024

What version of Bun is running?

1.1.22-canary.96+df33f2b2a

What platform is your computer?

Linux 6.6.12-linuxkit aarch64 unknown

What steps can reproduce the bug?

This is somewhat hard to reproduce since it is intermittent and the implementation is part of a bigger system, but basically I the setup is as follows. I have a gRPC client implemented using Bun+TS in a server running with Express, and a gRPC server implemented in Python. I make a curl request to the Express server, that then triggers a gRPC client request to the gRPC server. Its a somewhat long running operation, like at least 5 seconds, but every 1 in 15 times the gRPC call fails with an erroor with status code 14 or UNAVAILABLE (https://grpc.github.io/grpc/core/md_doc_statuscodes.html, and message "settings timeout".
These are the logs when running with GRPC_VERBOSITY=DEBUG GRPC_TRACE=all. The grpc client being used is grpc-js.

2024-08-08 15:09:09 D 2024-08-08T19:09:09.632Z | v1.10.10 260 | resolving_call | [60] Created child [62]
2024-08-08 15:09:09 D 2024-08-08T19:09:09.632Z | v1.10.10 260 | retrying_call | [62] start called
2024-08-08 15:09:09 D 2024-08-08T19:09:09.632Z | v1.10.10 260 | channel | (1) dns:abi-op-service:5005 createLoadBalancingCall [63] method="/optimizer.v1.OptimizerService/GetFinancialPlan"
2024-08-08 15:09:09 D 2024-08-08T19:09:09.632Z | v1.10.10 260 | retrying_call | [62] Created child call [63] for attempt 1
2024-08-08 15:09:09 D 2024-08-08T19:09:09.632Z | v1.10.10 260 | load_balancing_call | [63] start called
2024-08-08 15:09:09 D 2024-08-08T19:09:09.633Z | v1.10.10 260 | load_balancing_call | [63] Pick called
2024-08-08 15:09:09 D 2024-08-08T19:09:09.633Z | v1.10.10 260 | load_balancing_call | [63] Pick result: COMPLETE subchannel: (1) 192.168.160.4:5005 status: undefined undefined
2024-08-08 15:09:09 D 2024-08-08T19:09:09.633Z | v1.10.10 260 | retrying_call | [62] startRead called
2024-08-08 15:09:09 D 2024-08-08T19:09:09.633Z | v1.10.10 260 | load_balancing_call | [63] startRead called
2024-08-08 15:09:09 D 2024-08-08T19:09:09.633Z | v1.10.10 260 | channel | (1) dns:abi-op-service:5005 createRetryingCall [64] method="/optimizer.v1.OptimizerService/GetNaiveFinancialPlan"
2024-08-08 15:09:09 D 2024-08-08T19:09:09.633Z | v1.10.10 260 | resolving_call | [61] Created child [64]
2024-08-08 15:09:09 D 2024-08-08T19:09:09.633Z | v1.10.10 260 | retrying_call | [64] start called
2024-08-08 15:09:09 D 2024-08-08T19:09:09.633Z | v1.10.10 260 | channel | (1) dns:abi-op-service:5005 createLoadBalancingCall [65] method="/optimizer.v1.OptimizerService/GetNaiveFinancialPlan"
2024-08-08 15:09:09 D 2024-08-08T19:09:09.633Z | v1.10.10 260 | retrying_call | [64] Created child call [65] for attempt 1
2024-08-08 15:09:09 D 2024-08-08T19:09:09.633Z | v1.10.10 260 | load_balancing_call | [65] start called
2024-08-08 15:09:09 D 2024-08-08T19:09:09.633Z | v1.10.10 260 | load_balancing_call | [65] Pick called
2024-08-08 15:09:09 D 2024-08-08T19:09:09.633Z | v1.10.10 260 | load_balancing_call | [65] Pick result: COMPLETE subchannel: (1) 192.168.160.4:5005 status: undefined undefined
2024-08-08 15:09:09 D 2024-08-08T19:09:09.633Z | v1.10.10 260 | retrying_call | [64] startRead called
2024-08-08 15:09:09 D 2024-08-08T19:09:09.633Z | v1.10.10 260 | load_balancing_call | [65] startRead called
2024-08-08 15:09:09 D 2024-08-08T19:09:09.634Z | v1.10.10 260 | transport_flowctrl | (7) 192.168.160.4:5005 local window size: 65535 remote window size: 65535
2024-08-08 15:09:09 D 2024-08-08T19:09:09.634Z | v1.10.10 260 | transport_internals | (7) 192.168.160.4:5005 session.closed=false session.destroyed=false session.socket.destroyed=false
2024-08-08 15:09:09 D 2024-08-08T19:09:09.634Z | v1.10.10 260 | load_balancing_call | [63] Created child call [66]
2024-08-08 15:09:09 D 2024-08-08T19:09:09.634Z | v1.10.10 260 | transport_flowctrl | (7) 192.168.160.4:5005 local window size: 65535 remote window size: 65535
2024-08-08 15:09:09 D 2024-08-08T19:09:09.634Z | v1.10.10 260 | transport_internals | (7) 192.168.160.4:5005 session.closed=false session.destroyed=false session.socket.destroyed=false
2024-08-08 15:09:09 D 2024-08-08T19:09:09.634Z | v1.10.10 260 | load_balancing_call | [65] Created child call [67]
2024-08-08 15:09:09 D 2024-08-08T19:09:09.634Z | v1.10.10 260 | retrying_call | [62] write() called with message of length 2623
2024-08-08 15:09:09 D 2024-08-08T19:09:09.634Z | v1.10.10 260 | load_balancing_call | [63] write() called with message of length 2623
2024-08-08 15:09:09 D 2024-08-08T19:09:09.634Z | v1.10.10 260 | subchannel_call | [66] write() called with message of length 2623
2024-08-08 15:09:09 D 2024-08-08T19:09:09.634Z | v1.10.10 260 | subchannel_call | [66] sending data chunk of length 2623
2024-08-08 15:09:09 D 2024-08-08T19:09:09.634Z | v1.10.10 260 | retrying_call | [62] halfClose called
2024-08-08 15:09:09 D 2024-08-08T19:09:09.634Z | v1.10.10 260 | retrying_call | [64] write() called with message of length 2623
2024-08-08 15:09:09 D 2024-08-08T19:09:09.634Z | v1.10.10 260 | load_balancing_call | [65] write() called with message of length 2623
2024-08-08 15:09:09 D 2024-08-08T19:09:09.634Z | v1.10.10 260 | subchannel_call | [67] write() called with message of length 2623
2024-08-08 15:09:09 D 2024-08-08T19:09:09.634Z | v1.10.10 260 | subchannel_call | [67] sending data chunk of length 2623
2024-08-08 15:09:09 D 2024-08-08T19:09:09.634Z | v1.10.10 260 | retrying_call | [64] halfClose called
2024-08-08 15:09:09 D 2024-08-08T19:09:09.635Z | v1.10.10 260 | load_balancing_call | [63] halfClose called
2024-08-08 15:09:09 D 2024-08-08T19:09:09.635Z | v1.10.10 260 | subchannel_call | [66] end() called
2024-08-08 15:09:09 D 2024-08-08T19:09:09.635Z | v1.10.10 260 | subchannel_call | [66] calling end() on HTTP/2 stream
2024-08-08 15:09:09 D 2024-08-08T19:09:09.635Z | v1.10.10 260 | load_balancing_call | [65] halfClose called
2024-08-08 15:09:09 D 2024-08-08T19:09:09.635Z | v1.10.10 260 | subchannel_call | [67] end() called
2024-08-08 15:09:09 D 2024-08-08T19:09:09.635Z | v1.10.10 260 | subchannel_call | [67] calling end() on HTTP/2 stream
2024-08-08 15:09:12 D 2024-08-08T19:09:12.069Z | v1.10.10 260 | transport | (7) 192.168.160.4:5005 connection closed by GOAWAY with code 4 and data settings_timeout
2024-08-08 15:09:12 D 2024-08-08T19:09:12.069Z | v1.10.10 260 | subchannel | (1) 192.168.160.4:5005 READY -> IDLE
2024-08-08 15:09:12 D 2024-08-08T19:09:12.069Z | v1.10.10 260 | subchannel_refcount | (1) 192.168.160.4:5005 refcount 2 -> 1
2024-08-08 15:09:12 D 2024-08-08T19:09:12.069Z | v1.10.10 260 | pick_first | READY -> IDLE
2024-08-08 15:09:12 D 2024-08-08T19:09:12.069Z | v1.10.10 260 | resolving_load_balancer | dns:abi-op-service:5005 READY -> IDLE
2024-08-08 15:09:12 D 2024-08-08T19:09:12.069Z | v1.10.10 260 | connectivity_state | (1) dns:abi-op-service:5005 READY -> IDLE
2024-08-08 15:09:12 D 2024-08-08T19:09:12.069Z | v1.10.10 260 | dns_resolver | Looking up DNS hostname abi-op-service
2024-08-08 15:09:12 D 2024-08-08T19:09:12.069Z | v1.10.10 260 | resolving_load_balancer | dns:abi-op-service:5005 IDLE -> CONNECTING
2024-08-08 15:09:12 D 2024-08-08T19:09:12.069Z | v1.10.10 260 | connectivity_state | (1) dns:abi-op-service:5005 IDLE -> CONNECTING
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | subchannel_call | [66] Received server headers:
2024-08-08 15:09:12             :status: 200
2024-08-08 15:09:12             content-type: application/grpc
2024-08-08 15:09:12             grpc-status: 14
2024-08-08 15:09:12             grpc-message: settings timeout
2024-08-08 15:09:12 
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | subchannel_call | [66] Received server trailers:
2024-08-08 15:09:12             :status: 200
2024-08-08 15:09:12             content-type: application/grpc
2024-08-08 15:09:12             grpc-status: 14
2024-08-08 15:09:12             grpc-message: settings timeout
2024-08-08 15:09:12 
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | subchannel_call | [66] received status code 14 from server
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | subchannel_call | [66] received status details string "settings timeout" from server
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | subchannel_call | [66] ended with status: code=14 details="settings timeout"
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | subchannel_call | [66] Node error event: message=write after end code=ERR_STREAM_WRITE_AFTER_END errno=Unknown system error undefined syscall=undefined
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | subchannel_call | [67] Received server headers:
2024-08-08 15:09:12             :status: 200
2024-08-08 15:09:12             content-type: application/grpc
2024-08-08 15:09:12             grpc-status: 14
2024-08-08 15:09:12             grpc-message: settings timeout
2024-08-08 15:09:12 
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | subchannel_call | [67] Received server trailers:
2024-08-08 15:09:12             :status: 200
2024-08-08 15:09:12             content-type: application/grpc
2024-08-08 15:09:12             grpc-status: 14
2024-08-08 15:09:12             grpc-message: settings timeout
2024-08-08 15:09:12 
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | subchannel_call | [67] received status code 14 from server
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | subchannel_call | [67] received status details string "settings timeout" from server
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | subchannel_call | [67] ended with status: code=14 details="settings timeout"
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | subchannel_call | [67] Node error event: message=write after end code=ERR_STREAM_WRITE_AFTER_END errno=Unknown system error undefined syscall=undefined
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | transport | (7) 192.168.160.4:5005 session closed
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | subchannel_call | [66] Node error event: message=Session closed with error code 4 code=ERR_HTTP2_SESSION_ERROR errno=EINTR syscall=undefined
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | subchannel_call | [67] Node error event: message=Session closed with error code 4 code=ERR_HTTP2_SESSION_ERROR errno=EINTR syscall=undefined
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | load_balancing_call | [63] Received status
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | load_balancing_call | [63] ended with status: code=14 details="settings timeout" start time=2024-08-08T19:09:09.632Z
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | retrying_call | [62] Received status from child [63]
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | retrying_call | [62] state=TRANSPARENT_ONLY handling status with progress PROCESSED from child [63] in state ACTIVE
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | retrying_call | [62] ended with status: code=14 details="settings timeout" start time=2024-08-08T19:09:09.632Z
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | subchannel_call | [66] HTTP/2 stream closed with code 0
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | load_balancing_call | [65] Received status
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | load_balancing_call | [65] ended with status: code=14 details="settings timeout" start time=2024-08-08T19:09:09.633Z
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | retrying_call | [64] Received status from child [65]
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | retrying_call | [64] state=TRANSPARENT_ONLY handling status with progress PROCESSED from child [65] in state ACTIVE
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | retrying_call | [64] ended with status: code=14 details="settings timeout" start time=2024-08-08T19:09:09.633Z
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | subchannel_call | [67] HTTP/2 stream closed with code 0
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | subchannel_call | [66] HTTP/2 stream closed with code 0
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | subchannel_call | [67] HTTP/2 stream closed with code 0
2024-08-08 15:09:12 D 2024-08-08T19:09:12.070Z | v1.10.10 260 | resolving_call | [60] Received status
2024-08-08 15:09:12 D 2024-08-08T19:09:12.071Z | v1.10.10 260 | resolving_call | [60] ended with status: code=14 details="settings timeout"
2024-08-08 15:09:12 D 2024-08-08T19:09:12.071Z | v1.10.10 260 | resolving_call | [61] Received status
2024-08-08 15:09:12 D 2024-08-08T19:09:12.071Z | v1.10.10 260 | resolving_call | [61] ended with status: code=14 details="settings timeout"

If I replace bun run with ts-node then I never get this "settings timeout" error.

What is the expected behavior?

No response

What do you see instead?

You can see the logs shared. I think the "ERR_HTTP2_SESSION_ERROR" is concerning. Also, this is the actual error caught in the application:

message: "Settings timeout"
error: {
2024-08-08 15:33:14   "name": "RpcError",
2024-08-08 15:33:14   "code": "UNAVAILABLE",
2024-08-08 15:33:14   "meta": {
2024-08-08 15:33:14     "content-type": "application/grpc"
2024-08-08 15:33:14   },
2024-08-08 15:33:14   "methodName": "GetFinancialPlan",
2024-08-08 15:33:14   "serviceName": "optimizer.v1.OptimizerService"
2024-08-08 15:33:14 }

Additional information

No response

@Tanavya Tanavya added bug Something isn't working needs triage labels Aug 8, 2024
@Tanavya
Copy link
Author

Tanavya commented Aug 8, 2024

Having to move away from Bun and to Node because of this issue :(

@cirospaciari
Copy link
Collaborator

This should be resolved with #14286 can you add a repro or confirm if its resolved?

@cirospaciari cirospaciari added the needs repro Needs an example to reproduce label Oct 16, 2024
Copy link
Contributor

Hello @Tanavya. Please provide a minimal reproduction using a GitHub repository, Replit, CodeSandbox, or provide a bulleted list of commands to run that reproduce this issue. Issues marked with needs repro will be closed if they have no activity within 3 days.

@Tanavya
Copy link
Author

Tanavya commented Oct 16, 2024

Hi, it appears the issue has been resolved in the latest canary build. Earlier any grpc server function taking more than 60 seconds would guarantee to fail with settings timeout when using bun but work with tsx, now it always works. Also if the function took less than 60 seconds then it would fail intermittently.

If its still important, here is the minimal reproduction.
https://codesandbox.io/p/github/Tanavya/grpc-bun-issue/master?import=true
To reproduce:
buf generate protos
docker compose up --build=True -d
cd grpc-client
bun install
bun run index.ts OR bunx tsx index.ts

You can close the ticket or let me know if you need to confirm anything.

@cirospaciari
Copy link
Collaborator

Hi, it appears the issue has been resolved in the latest canary build. Earlier any grpc server function taking more than 60 seconds would guarantee to fail with settings timeout when using bun but work with tsx, now it always works. Also if the function took less than 60 seconds then it would fail intermittently.

If its still important, here is the minimal reproduction. https://codesandbox.io/p/github/Tanavya/grpc-bun-issue/master?import=true To reproduce: buf generate protos docker compose up --build=True -d cd grpc-client bun install bun run index.ts OR bunx tsx index.ts

You can close the ticket or let me know if you need to confirm anything.

Thank you for the minimum reproduction and for the feedback!

@cirospaciari cirospaciari removed needs repro Needs an example to reproduce needs triage labels Oct 16, 2024
@cirospaciari cirospaciari self-assigned this Oct 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants