I'm running the medperf tutorials in WSL and see strange behaviour: the client and server start to fail randomly. At first I thought it was an internal medperf issue, so I'm documenting the details here.
While following the tutorials (https://docs.medperf.org/getting_started/benchmark_owner_demo/ and others), I use a local medperf server. While running some (random) commands, usually heavy ones that require a lot of I/O, I get the following error:
Client side:
Traceback (most recent call last):
File "/home/vukw/anaconda3/envs/env39_medperf/bin/mlcube", line 5, in <module>
from mlcube.__main__ import cli
File "/home/vukw/anaconda3/envs/env39_medperf/lib/python3.9/site-packages/mlcube/__main__.py", line 66, in <module>
default=os.getcwd(),
FileNotFoundError: [Errno 2] No such file or directory
Interestingly, it affects not only the client side but also the server side (which is running in an independent bash terminal):
Traceback (most recent call last):
File "/home/vukw/anaconda3/envs/env39_medperf/lib/python3.9/site-packages/django/db/backends/base/base.py", line 219, in ensure_connection
self.connect()
File "/home/vukw/anaconda3/envs/env39_medperf/lib/python3.9/site-packages/django/utils/asyncio.py", line 33, in inner
return func(*args, **kwargs)
File "/home/vukw/anaconda3/envs/env39_medperf/lib/python3.9/site-packages/django/db/backends/base/base.py", line 200, in connect
self.connection = self.get_new_connection(conn_params)
File "/home/vukw/anaconda3/envs/env39_medperf/lib/python3.9/site-packages/django/utils/asyncio.py", line 33, in inner
return func(*args, **kwargs)
File "/home/vukw/anaconda3/envs/env39_medperf/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 209, in get_new_connection
conn = Database.connect(**conn_params)
sqlite3.OperationalError: unable to open database file
Restarting the server doesn't help either:
$ sh setup-dev-server.sh
realpath: cert.crt: No such file or directory
realpath: cert.key: No such file or directory
1
1
0
CERT FILE must not be empty
Moreover, it's not just medperf that is broken, but pip as well:
$ pip list
The folder you are executing pip from can no longer be found.
Workarounds and solutions.
Workarounds
First of all, rerunning the server and client in a new bash terminal fixes the issue for a while. Still, after a few commands the error is raised again.
cd . also helps like magic. It looks like it resets the working directory path, but again only for a while.
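To illustrate why cd . can help, here is a minimal Python sketch of the same failure class. It is only an analogy: deleting and recreating a temporary directory stands in for the mounted storage disappearing and coming back, which is an assumption about what happens on the WSL mount, not something confirmed here. The function names (cwd_is_valid, demo) are hypothetical.

```python
import os
import shutil
import tempfile


def cwd_is_valid():
    """Return True if the process's current working directory still resolves."""
    try:
        os.getcwd()
        return True
    except FileNotFoundError:
        # Same exception as in the mlcube traceback (Errno 2).
        return False


def demo():
    """Simulate a working directory vanishing and then being re-entered by path."""
    original = os.getcwd()
    base = tempfile.mkdtemp()
    workdir = os.path.join(base, "repo")
    os.mkdir(workdir)
    try:
        os.chdir(workdir)
        before = cwd_is_valid()

        # Stand-in (assumption) for the storage going away under the process.
        os.rmdir(workdir)
        during = cwd_is_valid()

        # The path exists again, but the process still holds the stale directory.
        os.mkdir(workdir)
        # Re-entering by path plays the role of `cd .` in the shell.
        os.chdir(workdir)
        after = cwd_is_valid()
        return before, during, after
    finally:
        os.chdir(original)
        shutil.rmtree(base, ignore_errors=True)


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is that the path string stays the same while the directory the process is attached to becomes stale, so re-entering the directory by its path gives a working directory again until the next time the storage is invalidated.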
Solution debugging
Together with @hasan7n, we found that such behavior is sometimes seen on external encrypted storage: stackoverflow discussion. In my case I checked out the repo in the Windows environment, so all the files are located under /mnt/c/Users/vykuk/repos/mlc/medperf, which is actually an external, encrypted drive. We also found a WSL issue describing similar behavior and a workaround, but with no mention of drive encryption. So it looks like the WSL drive mount is (in my case) the main problem: external drives can sometimes be locked and unlocked, and that breaks the working directory for every script running on that storage.
Solution
Thus, a reasonable solution (which also helped in my case) is to move the whole medperf repository from the Windows host mounted drive /mnt/c/.... to the internal WSL filesystem. Moving the whole repo folder to /home/medperf removes the issue.
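As a rough sketch, a script could warn when it is started from a Windows-mounted drive. The helper below is hypothetical and not part of medperf; it assumes the default WSL automount layout, where Windows drives appear as /mnt/<single drive letter>, and will not recognize a customized automount root.

```python
import re
from pathlib import PurePosixPath

# Assumption: default WSL automount root, e.g. /mnt/c, /mnt/d.
_DRIVE_MOUNT = re.compile(r"^/mnt/[a-z](/|$)")


def is_on_windows_mount(path):
    """Heuristically detect whether a path lives on a WSL-mounted Windows drive."""
    normalized = str(PurePosixPath(path))
    return bool(_DRIVE_MOUNT.match(normalized))


if __name__ == "__main__":
    import os

    if is_on_windows_mount(os.getcwd()):
        print("Warning: running from a Windows-mounted drive; "
              "consider moving the repo into the WSL filesystem.")
```

For example, is_on_windows_mount("/mnt/c/Users/vykuk/repos/mlc/medperf") is true, while is_on_windows_mount("/home/medperf") is false.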
Future explorations
I still don't know exactly why the mounted storage gets locked, which conditions lead to it, or who is responsible (the Windows host or Ubuntu itself). Also, I haven't encountered this issue with other projects located on the mounted drive; medperf is the first one that reproduces this behavior. Finally, the nature of the issue makes it extremely hard to find a way to reproduce it reliably. The same commands can pass successfully one time and fail with an error the next.
We can expect the same issue to arise on other systems and combinations when the medperf repo is located on external storage.
Environment
uname -r: 5.15.90.1-microsoft-standard-WSL2
lsb_release -a: Ubuntu 22.04.1 LTS