Handle connection resets in the status server more gracefully
Connection resets caused by network instability can lead to the status
server missing a test status, to an asyncio error like

2024-03-08 03:08:44,053 asyncio base_events      L1744 ERROR| Task exception was never retrieved
future: <Task finished name='Task-2038' coro=<StatusServer.cb() done, defined at /usr/lib/python3.10/site-packages/avocado/core/status/server.py:51> exception=ConnectionResetError(104, 'Connection reset by peer')>
Traceback (most recent call last):
  File "/usr/lib/python3.10/site-packages/avocado/core/status/server.py", line 53, in cb
    raw_message = await reader.readline()
  File "/usr/lib64/python3.10/asyncio/streams.py", line 525, in readline
    line = await self.readuntil(sep)
  File "/usr/lib64/python3.10/asyncio/streams.py", line 617, in readuntil
    await self._wait_for_data('readuntil')
  File "/usr/lib64/python3.10/asyncio/streams.py", line 502, in _wait_for_data
    await self._waiter
  File "/usr/lib64/python3.10/asyncio/selector_events.py", line 854, in _read_ready__data_received
    data = self._sock.recv(self.max_size)
ConnectionResetError: [Errno 104] Connection reset by peer

and, worse yet, to test tasks hanging indefinitely without the job
ever completing properly. This was mostly observed with LXC and
remote spawner isolation, where the isolated task process completes
but the task on the side of the task machine remains unfinished.

Signed-off-by: Plamen Dimitrov <[email protected]>
pevogam committed Mar 14, 2024
1 parent 18bebb9 commit 673324f
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion avocado/core/status/server.py
@@ -50,7 +50,10 @@ def close(self):
 
     async def cb(self, reader, _):
         while True:
-            raw_message = await reader.readline()
+            try:
+                raw_message = await reader.readline()
+            except ConnectionResetError:
+                continue
             if not raw_message:
                 return
             self._repo.process_raw_message(raw_message)
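
The control flow of the patched callback can be tried in isolation with a minimal sketch like the one below. It is only an illustration and not avocado code: `FakeReader`, `run_demo` and the message payloads are hypothetical, and the fake reader assumes that reads after a reset can succeed again, which a real `asyncio.StreamReader` may not do.

```python
import asyncio


class FakeReader:
    """Hypothetical stand-in for a stream reader (not part of avocado)."""

    def __init__(self, events):
        # Each event is either bytes to return or an exception to raise.
        self._events = list(events)

    async def readline(self):
        if not self._events:
            return b""  # empty bytes signals end of stream
        event = self._events.pop(0)
        if isinstance(event, Exception):
            raise event
        return event


async def cb(reader, process_raw_message):
    """Same loop shape as the patched StatusServer.cb()."""
    while True:
        try:
            raw_message = await reader.readline()
        except ConnectionResetError:
            # Skip the failed read instead of letting the task die.
            continue
        if not raw_message:
            return
        process_raw_message(raw_message)


def run_demo():
    received = []
    reader = FakeReader(
        [
            b'{"status": "started"}\n',  # made-up payload
            ConnectionResetError(104, "Connection reset by peer"),
            b'{"status": "finished"}\n',  # made-up payload
        ]
    )
    asyncio.run(cb(reader, received.append))
    return received


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the reset no longer ends the callback with an unretrieved task exception, and the messages on either side of it are still processed.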
