Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"Disconnected from call" websocket connection failures #1844

Open
dbkr opened this issue Oct 30, 2023 · 4 comments
Open

"Disconnected from call" websocket connection failures #1844

dbkr opened this issue Oct 30, 2023 · 4 comments
Labels
T-Defect Something isn't working: bugs, crashes, hangs, vulnerabilities, or other reported problems

Comments

@dbkr
Copy link
Member

dbkr commented Oct 30, 2023

Steps to reproduce

Unknown!

We are seeing both a lot of rageshakes of people getting the 'disconnected from call" screen and occasionally seeing this in our usage where we'll suddenly be kicked to the disconnected for no reason.

This happened to me yesterday and the failure mode was: enter call, other person was in the call but was a grey screen. My mute button was showing I was unmuted but the mute indicator on my local video showed muted. Clicking the mute button did nothing, then after being confused for ~10 seconds, it changed to the disconnected from call screen.

On the livekit server, this manifested as, INFO livekit service/rtcservice.go:374 exit ws read loop for closed connection {"room": "***", "roomID": "RM_VMPzm5H6xPjb", "participant": "@blush-brilliant-octopus:call.ems.host:XEQOOZWPJC", "pID": "PA_FPtvGSXW8SWM", "remote": false, "connID": "CO_Hn2rKNwoLB5R", "wsError": "websocket: close 1005 (no status)"}

...which suggests a TCP level disconnection of the websocket, I think, if no status was provided.

The client logs are not very helpful, with the client just unpublishing tracks and changing the connection status to disconnected after a while logging nothing.

Why is the websocket connection dying, and why isn't livekit trying to re-establish it?

Outcome

What did you expect?

What happened instead?

Operating system

No response

Browser information

No response

URL for webapp

No response

Will you send logs?

Yes

@dbkr dbkr added the T-Defect Something isn't working: bugs, crashes, hangs, vulnerabilities, or other reported problems label Oct 30, 2023
@dbkr
Copy link
Member Author

dbkr commented Nov 1, 2023

Logs from the client when this happens:

2023-10-31T16:00:43.628Z 1 secondary PC state changed failed 
2023-10-31T16:00:43.628Z 3 secondary peerconnection disconnected 
2023-10-31T16:00:43.628Z 1 reconnecting in 0ms 
2023-10-31T16:00:43.628Z 2 resuming signal connection, attempt 0 
2023-10-31T16:00:43.628Z 1 room event connectionStateChanged {"event":"connectionStateChanged","args":["reconnecting"]}
2023-10-31T16:00:43.629Z 1 room event reconnecting {"event":"reconnecting","args":[]}
2023-10-31T16:00:43.629Z 1 clearing ping interval 
2023-10-31T16:00:43.629Z 1 connecting to wss://sfu.call.element.io/rtc?access_token=xxxxx&reconnect=1&sid=PA_kNYPRD8aLqMk&auto_subscribe=1&sdk=js&version=1.14.0&protocol=9&adaptive_stream=1&reconnect_reason=2 
2023-10-31T16:00:43.664Z 1 clearing ping interval 
2023-10-31T16:00:43.785Z 1 clearing ping interval 
2023-10-31T16:00:43.785Z 1 start ping interval 
2023-10-31T16:00:43.785Z 3 reconnect disconnected 
2023-10-31T16:00:43.785Z 1 reconnecting in 300ms 
2023-10-31T16:00:43.786Z 1 received server offer {"RTCSdpType":"offer","signalingState":"stable"}
2023-10-31T16:00:43.787Z 1 setting munged local description 
2023-10-31T16:00:43.788Z 1 sending answer 
2023-10-31T16:00:44.086Z 2 resuming signal connection, attempt 1 
2023-10-31T16:00:44.086Z 1 clearing ping interval 
2023-10-31T16:00:44.086Z 1 connecting to wss://sfu.call.element.io/rtc?access_token=xxxxx&reconnect=1&sid=PA_kNYPRD8aLqMk&auto_subscribe=1&sdk=js&version=1.14.0&protocol=9&adaptive_stream=1 
2023-10-31T16:00:44.118Z 1 clearing ping interval 
2023-10-31T16:00:44.249Z 1 clearing ping interval 
2023-10-31T16:00:44.249Z 1 start ping interval 
2023-10-31T16:00:44.249Z 3 reconnect disconnected 
2023-10-31T16:00:44.249Z 1 reconnecting in 1851.1063972577504ms 
2023-10-31T16:00:44.249Z 1 received server offer {"RTCSdpType":"offer","signalingState":"stable"}
2023-10-31T16:00:44.251Z 1 setting munged local description 
2023-10-31T16:00:44.252Z 1 sending answer 
2023-10-31T16:00:46.101Z 2 resuming signal connection, attempt 2 
2023-10-31T16:00:46.101Z 1 clearing ping interval 
2023-10-31T16:00:46.101Z 1 connecting to wss://sfu.call.element.io/rtc?access_token=xxxxx&reconnect=1&sid=PA_kNYPRD8aLqMk&auto_subscribe=1&sdk=js&version=1.14.0&protocol=9&adaptive_stream=1 
2023-10-31T16:00:46.143Z 1 clearing ping interval 
2023-10-31T16:00:46.276Z 1 clearing ping interval 
2023-10-31T16:00:46.277Z 1 start ping interval 
2023-10-31T16:00:46.277Z 3 reconnect disconnected 
2023-10-31T16:00:46.277Z 1 reconnecting in 2939.2955978060813ms 
2023-10-31T16:00:46.277Z 1 received server offer {"RTCSdpType":"offer","signalingState":"stable"}
2023-10-31T16:00:46.279Z 1 setting munged local description 
2023-10-31T16:00:46.280Z 1 sending answer 
2023-10-31T16:00:48.790Z 3 websocket closed {"ev":{"isTrusted":true}}
2023-10-31T16:00:48.790Z 1 clearing ping interval 
2023-10-31T16:00:48.790Z 1 websocket connection closed:  
2023-10-31T16:00:48.790Z 3 signal disconnected 
2023-10-31T16:00:48.790Z 1 reconnecting in 3371.406768287034ms 
2023-10-31T16:00:52.162Z 2 resuming signal connection, attempt 3 
2023-10-31T16:00:52.162Z 1 clearing ping interval 
2023-10-31T16:00:52.162Z 1 connecting to wss://sfu.call.element.io/rtc?access_token=xxxxx&reconnect=1&sid=PA_kNYPRD8aLqMk&auto_subscribe=1&sdk=js&version=1.14.0&protocol=9&adaptive_stream=1&reconnect_reason=1 
2023-10-31T16:00:52.280Z 1 clearing ping interval 
2023-10-31T16:00:52.280Z 1 start ping interval 
2023-10-31T16:00:52.281Z 3 reconnect disconnected 
2023-10-31T16:00:52.281Z 1 reconnecting in 5394.441162569518ms 
2023-10-31T16:00:52.281Z 1 received server offer {"RTCSdpType":"offer","signalingState":"stable"}
2023-10-31T16:00:52.282Z 1 setting munged local description 
2023-10-31T16:00:52.283Z 1 primary PC state changed disconnected 
2023-10-31T16:00:52.283Z 1 sending answer 
2023-10-31T16:00:52.562Z 1 primary PC state changed connected 
2023-10-31T16:00:56.217Z 1 FetchHttpApi: <-- GET https://call.ems.host/_matrix/client/v3/sync?filter=xxx&timeout=xxx&since=xxx [30102ms 200]
2023-10-31T16:00:56.218Z 1 FetchHttpApi: --> GET https://call.ems.host/_matrix/client/v3/sync?filter=xxx&timeout=xxx&since=xxx
2023-10-31T16:00:57.676Z 2 resuming signal connection, attempt 4 
2023-10-31T16:00:57.676Z 1 clearing ping interval 
2023-10-31T16:00:57.676Z 1 connecting to wss://sfu.call.element.io/rtc?access_token=xxxxx&reconnect=1&sid=PA_kNYPRD8aLqMk&auto_subscribe=1&sdk=js&version=1.14.0&protocol=9&adaptive_stream=1 
2023-10-31T16:00:57.710Z 1 clearing ping interval 
2023-10-31T16:00:57.830Z 1 clearing ping interval 
2023-10-31T16:00:57.830Z 1 start ping interval 
2023-10-31T16:00:57.830Z 3 reconnect disconnected 
2023-10-31T16:00:57.830Z 1 reconnecting in 7050.309513807013ms 
2023-10-31T16:00:57.831Z 1 received server offer {"RTCSdpType":"offer","signalingState":"stable"}
2023-10-31T16:00:57.832Z 1 setting munged local description 
2023-10-31T16:00:57.832Z 1 primary PC state changed disconnected 
2023-10-31T16:00:57.832Z 1 sending answer 
2023-10-31T16:00:58.082Z 1 primary PC state changed connected 
2023-10-31T16:01:04.881Z 2 resuming signal connection, attempt 5 
2023-10-31T16:01:04.881Z 1 clearing ping interval 
2023-10-31T16:01:04.881Z 1 connecting to wss://sfu.call.element.io/rtc?access_token=xxxxx&reconnect=1&sid=PA_kNYPRD8aLqMk&auto_subscribe=1&sdk=js&version=1.14.0&protocol=9&adaptive_stream=1 
2023-10-31T16:01:04.917Z 1 clearing ping interval 
2023-10-31T16:01:05.049Z 1 clearing ping interval 
2023-10-31T16:01:05.049Z 1 start ping interval 
2023-10-31T16:01:05.050Z 3 reconnect disconnected 
2023-10-31T16:01:05.050Z 1 reconnecting in 7159.723250281866ms 
2023-10-31T16:01:05.050Z 1 received server offer {"RTCSdpType":"offer","signalingState":"stable"}
2023-10-31T16:01:05.052Z 1 setting munged local description 
2023-10-31T16:01:05.052Z 1 primary PC state changed disconnected 
2023-10-31T16:01:05.052Z 1 sending answer 
2023-10-31T16:01:05.247Z 1 primary PC state changed connected 
2023-10-31T16:01:09.566Z 3 websocket closed {"ev":{"isTrusted":true}}
2023-10-31T16:01:09.566Z 1 clearing ping interval 
2023-10-31T16:01:09.566Z 1 websocket connection closed:  
2023-10-31T16:01:09.566Z 3 signal disconnected 
2023-10-31T16:01:09.566Z 1 reconnecting in 7541.0662509999165ms 
2023-10-31T16:01:17.116Z 2 resuming signal connection, attempt 6 
2023-10-31T16:01:17.116Z 1 clearing ping interval 
2023-10-31T16:01:17.116Z 1 connecting to wss://sfu.call.element.io/rtc?access_token=xxxxx&reconnect=1&sid=PA_kNYPRD8aLqMk&auto_subscribe=1&sdk=js&version=1.14.0&protocol=9&adaptive_stream=1&reconnect_reason=1 
2023-10-31T16:01:17.240Z 1 clearing ping interval 
2023-10-31T16:01:17.240Z 1 start ping interval 
2023-10-31T16:01:17.240Z 3 reconnect disconnected 
2023-10-31T16:01:17.240Z 1 reconnecting in 7655.799510661192ms 
2023-10-31T16:01:17.240Z 1 received server offer {"RTCSdpType":"offer","signalingState":"stable"}
2023-10-31T16:01:17.242Z 1 setting munged local description 
2023-10-31T16:01:17.242Z 1 primary PC state changed disconnected 
2023-10-31T16:01:17.242Z 1 sending answer 
2023-10-31T16:01:17.469Z 1 primary PC state changed connected 
2023-10-31T16:01:24.902Z 2 resuming signal connection, attempt 7 
2023-10-31T16:01:24.903Z 1 clearing ping interval 
2023-10-31T16:01:24.903Z 1 connecting to wss://sfu.call.element.io/rtc?access_token=xxxxx&reconnect=1&sid=PA_kNYPRD8aLqMk&auto_subscribe=1&sdk=js&version=1.14.0&protocol=9&adaptive_stream=1 
2023-10-31T16:01:24.938Z 1 clearing ping interval 
2023-10-31T16:01:25.069Z 1 clearing ping interval 
2023-10-31T16:01:25.069Z 1 start ping interval 
2023-10-31T16:01:25.069Z 3 reconnect disconnected 
2023-10-31T16:01:25.069Z 1 reconnecting in 7769.884933000982ms 
2023-10-31T16:01:25.069Z 1 received server offer {"RTCSdpType":"offer","signalingState":"stable"}
2023-10-31T16:01:25.071Z 1 setting munged local description 
2023-10-31T16:01:25.071Z 1 primary PC state changed disconnected 
2023-10-31T16:01:25.071Z 1 sending answer 
2023-10-31T16:01:25.316Z 1 primary PC state changed connected 
2023-10-31T16:01:26.320Z 1 FetchHttpApi: <-- GET https://call.ems.host/_matrix/client/v3/sync?filter=xxx&timeout=xxx&since=xxx [30102ms 200]
2023-10-31T16:01:26.320Z 1 FetchHttpApi: --> GET https://call.ems.host/_matrix/client/v3/sync?filter=xxx&timeout=xxx&since=xxx
2023-10-31T16:01:26.875Z 1 room event connectionQualityChanged {"event":"connectionQualityChanged","args":["poor",{"_events":{"localTrackPublished":"<$ object-pruned $>","trackMuted":"<$ object-pruned $>","trackUnmuted":"<$ object-pruned $>","participantPermissionsChanged":"<$ object-pruned $>","trackPublished":"<$ object-pruned $>","trackUnpublished":"<$ object-pruned $>","localTrackUnpublished":"<$ object-pruned $>","mediaDevicesError":"<$ object-pruned $>","trackSubscriptionStatusChanged":"<$ object-pruned $>"},"_eventsCount":12,"_maxListeners":100,"audioLevel":0,"isSpeaking":false,"_connectionQuality":"poor","sid":"PA_kNYPRD8aLqMk","identity":"@salmon-necessary-smelt:call.ems.host:LZQCUPJFZM","name":"","metadata":"","audioTracks":{},"videoTracks":{},"tracks":{},"pendingPublishing":{},"pendingPublishPromises":{},"participantTrackPermissions":[],"allParticipantsAllowedToSubscribe":true,"encryptionType":1,"engine":{"_events":"<$ object-pruned $>","_eventsCount":19,"options":"<$ object-pruned $>","rtcConfig":"<$ object-pruned $>","peerConnectionTimeout":15000,"fullReconnectOnNext":false,"subscriberPrimary":true,"pcState":1,"_isClosed":false,"pendingTrackResolvers":"<$ object-pruned $>","hasPublished":true,"reconnectAttempts":8,"reconnectStart":1698768043628,"attemptingReconnect":false,"joinAttempts":1,"maxJoinAttempts":1,"shouldFailNext":false,"client":"<$ object-pruned $>","reconnectPolicy":"<$ object-pruned $>","closingLock":"<$ object-pruned $>","dataProcessLock":"<$ object-pruned $>","dcBufferStatus":"<$ object-pruned $>","url":"wss://sfu.call.element.io","token":"xxxx","signalOpts":"<$ object-pruned $>","latestJoinResponse":"<$ object-pruned $>","participantSid":"PA_kNYPRD8aLqMk","publisher":"<$ object-pruned $>","subscriber":"<$ object-pruned $>","primaryPC":"<$ object-pruned $>","lossyDC":"<$ object-pruned $>","reliableDC":"<$ object-pruned $>","reliableDCSub":"<$ object-pruned $>","lossyDCSub":"<$ object-pruned $>","reconnectTimeout":141},"roomOptions":"<$ cycle-trimmed $>","activeDeviceMap":{},"audioContext":{},"permissions":{"canSubscribe":true,"canPublish":true,"canPublishData":true,"canPublishSources":"<$ object-pruned $>","hidden":false,"recorder":false,"canUpdateMetadata":false},"participantInfo":{"sid":"PA_kNYPRD8aLqMk","identity":"@salmon-necessary-smelt:call.ems.host:LZQCUPJFZM","state":"ACTIVE","tracks":"<$ object-pruned $>","metadata":"","joinedAt":"1698768026","name":"","version":5,"permission":"<$ object-pruned $>","region":"","isPublisher":true},"reconnectFuture":{"promise":"<$ object-pruned $>"}}]}
2023-10-31T16:01:32.849Z 2 resuming signal connection, attempt 8 
2023-10-31T16:01:32.849Z 1 clearing ping interval 
2023-10-31T16:01:32.849Z 1 connecting to wss://sfu.call.element.io/rtc?access_token=xxxxx&reconnect=1&sid=PA_kNYPRD8aLqMk&auto_subscribe=1&sdk=js&version=1.14.0&protocol=9&adaptive_stream=1 
2023-10-31T16:01:32.890Z 1 clearing ping interval 
2023-10-31T16:01:33.067Z 1 clearing ping interval 
2023-10-31T16:01:33.067Z 1 start ping interval 
2023-10-31T16:01:33.068Z 3 reconnect disconnected 
2023-10-31T16:01:33.068Z 1 reconnecting in 7396.546486505859ms 
2023-10-31T16:01:33.068Z 1 received server offer {"RTCSdpType":"offer","signalingState":"stable"}
2023-10-31T16:01:33.071Z 1 setting munged local description 
2023-10-31T16:01:33.072Z 1 primary PC state changed disconnected 
2023-10-31T16:01:33.072Z 1 sending answer 
2023-10-31T16:01:33.358Z 1 primary PC state changed connected 
2023-10-31T16:01:40.471Z 2 resuming signal connection, attempt 9 
2023-10-31T16:01:40.471Z 1 clearing ping interval 
2023-10-31T16:01:40.471Z 1 connecting to wss://sfu.call.element.io/rtc?access_token=xxxxx&reconnect=1&sid=PA_kNYPRD8aLqMk&auto_subscribe=1&sdk=js&version=1.14.0&protocol=9&adaptive_stream=1 
2023-10-31T16:01:40.508Z 1 clearing ping interval 
2023-10-31T16:01:40.638Z 1 clearing ping interval 
2023-10-31T16:01:40.638Z 1 start ping interval 
2023-10-31T16:01:40.639Z 3 reconnect disconnected 
2023-10-31T16:01:40.639Z 3 could not recover connection after 10 attempts, 57011ms. giving up 

So the websocket connection is failing within ~200ms: almost like the connection is being refused, except the server is saying:

2023-10-31T16:01:24.769Z	INFO	livekit	service/rtcservice.go:374	exit ws read loop for closed connection	{"room": "!WUugyKdzBpQdhXAxkk:call.ems.host", "roomID": "RM_KpxFCCrXMzSg", "participant": "@salmon-necessary-smelt:call.ems.host:LZQCUPJFZM", "pID": "PA_kNYPRD8aLqMk", "remote": false, "connID": "CO_752cHFUQUz35", "wsError": "websocket: close 1005 (no status)"}

...which implies the client is closing the connection? Or at least that it's certainly establishing a connection that's then getting closed again each time for some reason.

@dbkr
Copy link
Member Author

dbkr commented Nov 2, 2023

livekit/client-sdk-js#921 should fix this!

@youdontneedtoknow22
Copy link

I just had the same problem (Windows 11, Edge) and a friend (iPhone, Safari). I don't know if it's after the fix has been merged to the version online (should've been merged into main 7 hours ago)

@Half-Shot
Copy link
Member

Not sure what the next steps are here, potentially better errors ( @robintown ?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-Defect Something isn't working: bugs, crashes, hangs, vulnerabilities, or other reported problems
Projects
None yet
Development

No branches or pull requests

3 participants