Updating Gravity after restoring a v5 Teleport into a v6 instance crashes FTL #2035
Comments
Sorry to hear you are experiencing an issue and many thanks for the very good ticket on this. Unfortunately, I am unable to reproduce this following your particular description, so it seems to depend on the content of the particular file you are trying to restore. As you can reproduce this reliably, can I ask you to try to reproduce this again with the debugger being attached to As you said, you're running in an
services:
  pihole:
    ...
    cap_add:
      ...
      - SYS_PTRACE
    ....
    security_opt:
      - seccomp:unconfined
and then follow the description on the v6 documentation draft: The final step (output of the |
Thanks DL6ER for your prompt reply! I prepared the debugging information. Here it is:
The log above was generated with GDB running directly on the host while the process was running inside the container. Please let me know if this is helpful. I noticed the following warning
But I could not find a way to install gdb inside the container (neither apt nor yum is available :/ ). If there is any other information I can provide to help, don't hesitate to ask! |
Somehow, this doesn't really fit together. It can be either a bad thing (double-free) or actually a lot worse (mangling memory on a low level), both rather hard to debug. We'd need many step-by-step ping-pong messages until we can get this resolved. I'm also afraid we may need to utilize memcheck64 to monitor memory access violations to exclude this as a possible cause. So the better alternative seems to be for you to send me your particular file, with which I should then be able to trigger the exact same crash here. If there is stuff in there you aren't comfortable sharing with us, you can surely modify it, but please test again whether the modified version still crashes FTL before sending. The best way is to send it via e-mail to my username here .at. pi-hole.net |
Hi @DL6ER, I just sent you the Teleport pack by email. I ran some more tests and found some more information that may be useful for you:
1 - Restarting from scratch, as I thought it was related to the lists, I tried importing just the lists in the Teleport restore dialog and then triggering gravity. Instead of failing with FTL crashing like before, it still failed but more gracefully, apparently related to the DB. Dashboard log:
Container log:
FTL kept running fine so I was able to keep browsing the dashboard
2 - Restarting from scratch, if I restore something unrelated to gravity in the Teleport restore dialog, for example just "Clients" and nothing else, I still get the same FOREIGN KEY failure described in (1) above
3 - Restarting from scratch, if I don't restore anything and simply try to update gravity on a brand new setup, I don't get any failure in the dashboard log, but looking at the container log I saw the following line
Not sure if this is expected; I guess not, so I decided to report it here
4 - Restarting from scratch, after some trial and error I could trace the FTL crash down to the "Configuration" restore in Teleport. If I restore just that and nothing else, updating gravity causes the FTL crash originally reported in this issue |
I have not been able to confirm/reproduce the crash and need some more details like your particular |
Even though I am not able to reproduce it, I might have found what is causing this looking again at your first backtrace. Actually, I have fixed this a few days ago but now got more confident that this might be the real fix we need here. Could you please build a local docker image with my proposed fix so you can verify it works for you? Following the instructions
git clone https://github.com/pi-hole/docker-pi-hole
cd docker-pi-hole
git checkout development-v6 # NOTE: This step is only needed until V6 is released
./build.sh -f fix/config_crash2
and then using the generated image should be enough. |
Hi DL6ER, there is no |
I just tried running a test on a Based on latest |
Please follow these instructions to attach the debugger to the container Thanks for your continued support to get this fixed. Once we know exactly where it happens, we should also learn what the required boundary conditions are, and it should become obvious why I failed to reproduce the bug myself. |
I cannot follow the debugger instructions because there is no package manager available inside the docker image:
could you please advise how to install gdb in this setup? Thank you! |
Here is the gdb log. First, gdb signal breakpoints stopped a few times while importing teleport. I had to
Second, following is the log while updating gravity
I hope this helps! Let me know if you need more info and thanks for the support! |
Thank you for the backtrace. This confirms it is a double-free corruption issue - I thought I had already fixed it in #2043 but there seems to be another one. More to come later. |
I just had some time to look at your output and just realized that there isn't actually any backtrace. When you see a real crash indicated by signal
the next step would have to be Could you maybe try this once again for us? Either way, I also prepared a new FTL branch https://github.com/pi-hole/docker-pi-hole/tree/development?tab=readme-ov-file#usage In this case, it'd be
which will create a new docker image with tag |
I built the docker image from I used the same fix/free branch to get the backtrace you requested:
First, a backtrace from the SIG33 received while importing teleport:
And finally the backtrace for update gravity SIGSEGV:
I hope this helps. I also have easy access to test under Thanks! |
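The repeated stops on SIG33 and the missing backtrace discussed above could be avoided by scripting the gdb session. What follows is only a minimal sketch, not the procedure from the Pi-hole documentation: the helper name `write_gdb_commands` and the file path are hypothetical, and it assumes gdb is run from the host against the `pihole-FTL` process as was done earlier in this thread. It writes a gdb command file that passes SIG33 through without stopping, continues until a real crash, and then prints the backtrace.

```shell
#!/bin/sh
# Hypothetical helper: create a gdb command file at the path given in $1.
write_gdb_commands() {
    cat > "$1" <<'EOF'
handle SIG33 nostop noprint pass
continue
backtrace
EOF
}

# Possible usage (assumption: gdb on the host can see the container's process):
#   write_gdb_commands /tmp/ftl.gdb
#   sudo gdb -p "$(pidof pihole-FTL)" -batch -x /tmp/ftl.gdb
```

With `-batch`, gdb runs the listed commands and exits, so `continue` blocks until the process stops on a signal such as SIGSEGV and `backtrace` is printed right after.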
Okay ... two things: Please build a new container on the latest version of If it does not work, please try again with FTL from branch |
I built
After finishing the import, I tried updating gravity. I got kicked to the login screen just like before when FTL was crashing; however, according to the logs, there was no crash this time (still, the update process was interrupted right at the beginning, exactly at the point where a crash would happen). I could confirm that FTL did not crash because I could log back in right away (as opposed to login becoming defunct when FTL crashed). I assume this was your graceful handling in action? Once I logged back in, I triggered another gravity update. This second run did finish (naturally taking longer, since I have 36 lists), but I don't believe we can consider it successful. All lists were processed successfully except one; here are some relevant browser logs:
After this, the dashboard started to report the number of blocked domains as "-2". Apparently, this also broke the whole gravity db. Trying to update gravity again would always fail with the following browser log:
There was no recovery beyond that. I deleted the container and restarted from scratch and I was able to reproduce the errors. I was going to try to get you some backtrace using
Once you fix it, I can get you the detailed logs you want. Thanks! |
That's all there was? If so, it looks fine.
Yeah, importing the password from the Teleporter archive is expected to cause a termination of all web sessions as the password might have changed.
No, I think this means the issue was actually fixed. Whatever is causing the following issue seems to be something new - or rather - something we have not been able to see before.
Could you share the involved logs?
I don't think we will need this. @PromoFaux Seems we need some error handling in the custom-docker-image sh script. On non-existing branch, it stored the error 404 page as the FTL binary into the container which then obviously cannot start. |
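One possible form of the error handling mentioned above is to check that the downloaded file really is a binary before installing it. This is just a sketch under assumptions: the function name `is_elf` is made up here, and how the actual script downloads FTL is not shown in this thread. It tests whether a file starts with the ELF magic bytes, which an HTML 404 page would not.

```shell
#!/bin/sh
# Hypothetical check: succeed only if the file in $1 starts with the
# ELF magic bytes 0x7f 'E' 'L' 'F'.
is_elf() {
    # Read the first four bytes as hex and strip spaces and newlines.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    [ "$magic" = "7f454c46" ]
}

# Possible usage inside a build script (the path is a placeholder):
#   is_elf /path/to/downloaded/pihole-FTL || { echo "not a binary" >&2; exit 1; }
```

If the download is done with curl, its `--fail` option is another way to stop early, since it makes curl return a non-zero exit status on HTTP errors instead of saving the error page.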
Most of it looks fine, I was just confused because apparently FTL is finishing with exit code 22 instead of 0. Is it expected? If you want I can share the full log but I don't see anything suspicious other than this
This would be fine if it happened during Teleport import, however, this is happening during gravity update
Sure. Here are the logs until I get kicked back to the login screen:
And here are the logs after I log back in and try updating gravity again:
And for
Would you like me to create a new issue? |
Yes, code 22 means an internally triggered restart. I made this clearer in 1f264eb
Would be interesting to see the log in between from
Yeah, it's probably better as this issue ticket here is already complex enough to risk losing one or the other bits. Thanks for your continued support helping make Pi-hole better for us all! |
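For anyone reading container logs later, the exit code 22 discussed above can be told apart from a crash with a small helper. This is only an illustrative sketch: the function name `describe_exit` is hypothetical, and the only fact taken from this thread is that code 22 means an internally triggered restart.

```shell
#!/bin/sh
# Hypothetical helper: turn an FTL exit code ($1) into a short description.
describe_exit() {
    case "$1" in
        0)  echo "clean shutdown" ;;
        22) echo "internally triggered restart" ;;
        *)  echo "unexpected exit (code $1)" ;;
    esac
}

# Possible usage after a process has finished:
#   some_command; describe_exit "$?"
```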
Versions
Core
Version is 9564a6e (Latest: null)
Branch is development-v6
Hash is 9564a6e9 (Latest: 4972cc6f)
Web
Version is a6807d1 (Latest: null)
Branch is development-v6
Hash is a6807d1a (Latest: 046b5629)
FTL
Version is vDev-0c36f47 (Latest: null)
Branch is development-v6
Hash is 0c36f47 (Latest: 0c36f47)
Platform
Expected behavior
Actual behavior / bug
Steps to reproduce
Steps to reproduce the behavior:
Debug Token
Additional context