Work around for rsync error 23 (part of #1587) + new "experimental" snapshot log filter #1621

Merged
aryoda merged 28 commits into bit-team:dev from issue/1587_rsync_error_23
Jan 29, 2024


aryoda
Contributor

@aryoda aryoda commented Jan 26, 2024

  • Workaround: Relax the handling of rsync exit code 23: it is now ignored instead of being treated as an error (part of Master issue for "error 23" (rsync returns with exit code 23: Partial transfer due to errors) #1587)

  • Feature: Exclude 'SingletonLock' and 'SingletonCookie' (Discord) and 'lock' (Mozilla Firefox) files by default (part of Rsync error 23 #1555)

  • Feature (experimental): Add a new snapshot log filter "rsync transfer failures (experimental)" to make such failures easier to find (they are normally not shown as "error").

    This feature is experimental because it relies on hard-coded error message strings from the rsync source code, so it may miss some rsync messages or show false positives.

    The filter is meant only to support humans, not to automatically recognize errors when taking a snapshot!


    The filter recognizes snapshot log entries that would otherwise require an intensive manual search through long logs
    because they are hidden in [I]nfo log entries, e.g.:

    [I] Take snapshot (rsync: symlink has no referent: "/home/user/Documents/dead-link
    [I] Schnappschuss erstellen (rsync: IO error encountered -- skipping file deletion)
    [I] Schnappschuss erstellen (rsync: rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1333) [sender=3.2.3])
    [I] Take snapshot (rsync: rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1333) [sender=3.2.3])
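
A minimal sketch of how such a string-based filter could work, assuming a small hand-picked pattern list. The names `TRANSFER_FAILURE_PATTERNS` and `filter_transfer_failures` and the sample log lines are hypothetical and are not taken from the PR's code; the patterns are derived only from the example entries above.

```python
import re

# Hypothetical pattern list based on the example log entries above.
# The filter added in this PR may use different or additional strings.
TRANSFER_FAILURE_PATTERNS = [
    r"symlink has no referent",
    r"IO error encountered",
    r"rsync error: .* \(code \d+\)",
]

_FAILURE_RE = re.compile("|".join(f"(?:{p})" for p in TRANSFER_FAILURE_PATTERNS))


def filter_transfer_failures(log_lines):
    """Return only the log lines that look like rsync transfer failures."""
    return [line for line in log_lines if _FAILURE_RE.search(line)]


# Made-up sample log; the second entry uses an invented path.
sample_log = [
    "[I] Take snapshot (rsync: sending incremental file list)",
    '[I] Take snapshot (rsync: symlink has no referent: "/tmp/example/dead-link")',
    "[I] Take snapshot (rsync: IO error encountered -- skipping file deletion)",
    "[I] Take snapshot (rsync: rsync error: some files/attrs were not transferred "
    "(see previous errors) (code 23) at main.c(1333) [sender=3.2.3])",
]

for line in filter_transfer_failures(sample_log):
    print(line)
```

Because the match is on the rsync message itself and not on the translated prefix, such a filter would also catch localized entries like the "Schnappschuss erstellen" lines above. It shares the stated limitation: any rsync message not in the list is missed.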
    

Changelog is updated too...

* rsync exit code 23 is now in the ignore list
* Add experimental "rsync transfer failures" filter to snapshot log view
* Exclude 'SingletonLock' and 'SingletonCookie' (Discord) and 'lock' (Mozilla Firefox) files by default (part of bit-team#1555)
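
The "ignore list" idea from the first changelog entry can be sketched as follows. This is only an illustration: `IGNORED_RSYNC_EXIT_CODES` and `classify_rsync_exit` are hypothetical names, and the set contains only code 23 because that is the only code this PR confirms.

```python
# Hypothetical ignore list. Only exit code 23 is confirmed by this PR;
# the real list may contain further codes.
IGNORED_RSYNC_EXIT_CODES = frozenset({23})


def classify_rsync_exit(returncode, ignored=IGNORED_RSYNC_EXIT_CODES):
    """Map an rsync exit code to 'ok', 'ignored' or 'error'."""
    if returncode == 0:
        return "ok"
    if returncode in ignored:
        # Partial transfer: the snapshot is kept, the details stay in the log.
        return "ignored"
    return "error"


for code in (0, 23, 12):
    print(code, classify_rsync_exit(code))
```

Ignoring the code does not hide the underlying messages; they remain in the snapshot log, which is why the experimental filter is useful alongside this change.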
@buhtz
Member

buhtz commented Jan 26, 2024

The TravisCI job for ppc64le on Python 3.8 is failing. This should not happen. I can't see in the logs why this happens.

This was also the case with PR #1614, but we did not notice it because we are used to TravisCI currently not working. There is another job, "ppc64le Python 3.12" (the latest Python!), which does not work because TravisCI still has no Python interpreter of that version installed for this architecture. They are working on it.

First of all, within this PR the travis.yaml file should be modified so that the known problematic job "ppc64le Python 3.12" is inactive.

Why "ppc64le Python 3.8" is failing needs further investigation but I assume this is somehow because of PR #1614.

@aryoda
Contributor Author

aryoda commented Jan 26, 2024

First of all, within this PR the travis.yaml file should be modified so that the known problematic job "ppc64le Python 3.12" is inactive.

Why "ppc64le Python 3.8" is failing needs further investigation but I assume this is somehow because of PR #1614.

I will look into this and fix it

Edit: Ah, I just saw you are already fixing this, so I will wait (meanwhile trying to understand the impact of #1614 on TravisCI for ppc64le)

@buhtz
Member

buhtz commented Jan 26, 2024

Mhm... I still have no idea and can not see the error message in the TravisCI output.

I removed pytest on my own machine to make sure that only the "unittest" package is used. Even then, configure && make unittest-v did not reproduce the problem on my Debian 12.
Then I compared the generated Makefiles between TravisCI and my Debian 12.

On Debian:

/usr/bin/python3 -m unittest -v -b test/test_applicationinstance.py

On Travis

coverage run -p -m unittest -v -b test/test_applicationinstance.py

This makes sense and does not seem wrong.

But I do think that, because of #1614, "/usr/bin/python3" is now used instead of the "python3" of previous versions.
But I don't see how this could cause problems.
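
One way to see whether the two spellings can differ on a given machine is to resolve them explicitly. The sketch below is an assumption-laden illustration, not a diagnosis: the thread never establishes the root cause, and `resolve_interpreter` and `running_in_virtualenv` are hypothetical helper names. It relies only on the fact that a bare command name is looked up via PATH (which an activated virtual environment changes), while an absolute path is taken as-is.

```python
import shutil
import sys


def resolve_interpreter(command):
    """Resolve a command name or path; return None if it is not executable.

    A bare name such as "python3" is searched on PATH, an absolute path
    such as "/usr/bin/python3" is only checked for existence.
    """
    return shutil.which(command)


def running_in_virtualenv():
    """True if the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


for candidate in ("python3", "/usr/bin/python3"):
    print(f"{candidate!r} resolves to {resolve_interpreter(candidate)!r}")

print("running interpreter:", sys.executable)
print("inside a virtual environment:", running_in_virtualenv())
```

If the first two lines print different paths, tests started through the absolute path would not see packages that were installed only into the environment found via PATH.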

I am stopping here for this night and going to bed. I have no further ideas yet.

@aryoda
Contributor Author

aryoda commented Jan 27, 2024

But I do think that, because of #1614, "/usr/bin/python3" is now used instead of the "python3" of previous versions.
But I don't see how this could cause problems.

I have injected a relative path again in common and qt via ./configure --python=python3 and it works again.

I have not yet tested it with an absolute path again (configure was not called for qt - perhaps this was the root problem).

It looked like pytest did work in qt but after_success did not work (coverage combine + coveralls).

Really strange...

I can do more tests on Saturday night...

@buhtz
Member

buhtz commented Jan 27, 2024

I tried to run the "coverage ..." commands on my local machine. There is no error.
One major difference from the Ubuntu on TravisCI could be that Travis does not use Ubuntu's own Python but installs a specific one from a non-Ubuntu source.

OK, I asked the community about how to make "make" more verbose: r/pythoncoding and debianforum.de.

I also asked the "coverage.py" maintainer and community about how coverage decides which Python interpreter it uses: Python discuss.

@buhtz
Member

buhtz commented Jan 27, 2024

The output of make -d gives some information.

coverage run -p -m unittest -v -b test/test_applicationinstance.py

Putting child 0x8e31851a7f0 (unittest-v) PID 4469 on the chain.
Live child 0x8e31851a7f0 (unittest-v) PID 4469 
WARNING: 'import keyring' failed with: ModuleNotFoundError("No module named 'keyring'")
test_autoExit_other_running_process (test.test_applicationinstance.TestApplicationInstance) ... ok

Keyring is missing. I assume that "coverage.py" uses a Python interpreter different from the one used in this environment. Working on it ...

@buhtz
Member

buhtz commented Jan 27, 2024

💥
Our "bit-team" account on TravisCI has no credits left. I contacted the support.

I have an idea about the problem but I am not sure.

There are ModuleNotFoundError messages about "keyring" and "packaging". I am not sure how our last binary-path fix could trigger this. It could be that these packages are not installed inside the virtual environment that is used.
Via "coverage debug sys" I was able to verify that "coverage" uses the Python binary that is installed inside the virtual environment.

Maybe it will help to install the missing packages explicitly via "pip".
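
Before installing anything, a quick check run with the same interpreter as the tests can show which of the two modules that interpreter actually sees. This is a minimal sketch; `missing_modules` is a hypothetical helper, and the module names are the two mentioned above.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("interpreter:", sys.executable)
print("prefix:     ", sys.prefix)
print("missing:    ", missing_modules(["keyring", "packaging"]))
```

Running this once via "python3" and once via "/usr/bin/python3" would show whether the two interpreters differ in the packages available to them.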

I also changed the "coverage" call to "{$PYTHON} -m coverage" as an experiment. But to my knowledge it doesn't matter: coverage is used only on Travis, and by default it always uses the virtual environment's own Python interpreter. So this modification shouldn't change anything and could be undone.

@buhtz
Member

buhtz commented Jan 27, 2024

Credits are back. It seems that people at Travis also work on the weekend.

@aryoda
Contributor Author

aryoda commented Jan 27, 2024

I have an idea about the problem but I am not sure.

Really weird..

I'd say we should use the working ./configure --python=python3 for now, since the goal is not to test the infrastructure (which may have a problem) but BiT.

@buhtz
Member

buhtz commented Jan 27, 2024

I'd say we should use the working ./configure --python=python3 for now, since the goal is not to test the infrastructure (which may have a problem) but BiT.

Yes, good point. It works, and it doesn't matter that I don't know why it works. 😆

@aryoda
Contributor Author

aryoda commented Jan 27, 2024

I'd say we should use the working ./configure --python=python3 for now, since the goal is not to test the infrastructure (which may have a problem) but BiT.

Yes, good point. It works, and it doesn't matter that I don't know why it works. 😆

I also really want to understand what's going wrong 😄 Since it only happens on one platform (ppc64le) I suspect a packaging issue on this specific Travis CI platform.

But my point is:

  • I could merge this PR now since it works and shall be contained in the upcoming release (unless you want to do more testing on Travis CI with this PR - which is totally OK for me - I think the release can wait a few more days)

@aryoda aryoda merged commit dd46462 into bit-team:dev Jan 29, 2024
1 check passed
@aryoda aryoda deleted the issue/1587_rsync_error_23 branch January 29, 2024 21:56