[Bug]: Can't upgrade because of "Segmentation fault" (and server is not running) #2257

Open
5 of 8 tasks
TechupBusiness opened this issue Jul 11, 2024 · 5 comments
Labels
0. Needs triage bug needs info Additional info needed to triage

Comments

TechupBusiness commented Jul 11, 2024

⚠️ This issue respects the following points: ⚠️

Bug description

I upgraded to the latest minor version and then to the next major version, but the upgrade fails with "Segmentation fault". The nginx proxy also can't find any FPM result to serve (and shows 502 Bad Gateway).

I have no idea what's wrong. The APCu config is correct. Redis also seems to be reachable.

./occ upgrade -vvv
Nextcloud or one of the apps require upgrade - only a limited number of commands are available
You may use your browser or the occ upgrade command to do the upgrade
2024-07-11T20:42:49+00:00 Setting log level to debug
2024-07-11T20:42:49+00:00 Repair step: Repair MySQL collation
2024-07-11T20:42:49+00:00 Repair info: All tables already have the correct collation -> nothing to do
2024-07-11T20:42:49+00:00 Repair step: Repair SQLite autoincrement
2024-07-11T20:42:49+00:00 Repair step: Copy data from accounts table when migrating from ownCloud
2024-07-11T20:42:50+00:00 Repair step: Drop account terms table when migrating from ownCloud
2024-07-11T20:42:50+00:00 Updating database schema
2024-07-11T20:42:50+00:00 Updated database
2024-07-11T20:42:50+00:00 Updating <lookup_server_connector> ...
Segmentation fault

Steps to reproduce

Expected behavior

No error (Segmentation fault) and served content via fpm.

Installation method

Community Docker image

Nextcloud Server version

27

Operating system

Other

PHP engine version

PHP 8.2

Web server

Nginx

Database engine version

MariaDB

Is this bug present after an update or on a fresh install?

Upgraded to a MAJOR version (ex. 28 to 29)

Are you using the Nextcloud Server Encryption module?

None

What user-backends are you using?

  • Default user-backend (database)
  • LDAP/ Active Directory
  • SSO - SAML
  • Other

Configuration report

{
    "system": {
        "memcache.local": "\\OC\\Memcache\\APCu",
        "apps_paths": [
            {
                "path": "\/var\/www\/html\/apps",
                "url": "\/apps",
                "writable": false
            },
            {
                "path": "\/var\/www\/html\/custom_apps",
                "url": "\/custom_apps",
                "writable": true
            }
        ],
        "instanceid": "***REMOVED SENSITIVE VALUE***",
        "passwordsalt": "***REMOVED SENSITIVE VALUE***",
        "secret": "***REMOVED SENSITIVE VALUE***",
        "trusted_domains": [
            "domain.tld"
        ],
        "datadirectory": "***REMOVED SENSITIVE VALUE***",
        "dbtype": "mysql",
        "version": "26.0.13.1",
        "overwrite.cli.url": "https:\/\/domain.tld",
        "dbname": "***REMOVED SENSITIVE VALUE***",
        "dbhost": "***REMOVED SENSITIVE VALUE***",
        "dbport": "",
        "dbtableprefix": "oc_",
        "mysql.utf8mb4": true,
        "dbuser": "***REMOVED SENSITIVE VALUE***",
        "dbpassword": "***REMOVED SENSITIVE VALUE***",
        "installed": true,
        "maintenance": true,
        "mail_smtpmode": "smtp",
        "mail_smtphost": "***REMOVED SENSITIVE VALUE***",
        "mail_sendmailmode": "smtp",
        "mail_smtpport": "587",
        "mail_from_address": "***REMOVED SENSITIVE VALUE***",
        "mail_domain": "***REMOVED SENSITIVE VALUE***",
        "trusted_proxies": "***REMOVED SENSITIVE VALUE***",
        "loglevel": 0,
        "memcache.distributed": "\\OC\\Memcache\\Redis",
        "memcache.locking": "\\OC\\Memcache\\Redis",
        "redis": {
            "host": "***REMOVED SENSITIVE VALUE***",
            "password": "***REMOVED SENSITIVE VALUE***",
            "port": 6379
        }
    }
}

List of activated Apps

Enabled:
  - cloud_federation_api: 1.9.0
  - dav: 1.25.0
  - federatedfilesharing: 1.16.0
  - files: 1.21.1
  - lookup_server_connector: 1.14.0
  - oauth2: 1.14.2
  - provisioning_api: 1.16.0
  - settings: 1.8.0
  - theming: 2.1.1
  - twofactor_backupcodes: 1.15.0
  - viewer: 1.10.0
  - workflowengine: 2.8.0

Nextcloud Signing status

Not possible, server is not starting and shows "502 Bad Gateway"

Nextcloud Logs

Doesn't look relevant to the error:

{"reqId":"PKSrZtXcZ5c12GAKyncb","level":1,"time":"2024-07-11T20:37:05+00:00","remoteAddr":"","user":"--","app":"updater","method":"","url":"--","message":"\\OC\\Updater::setDebugLogLevel: Set log level to debug","userAgent":"--","version":"26.0.13.1","data":{"app":"updater"}}
{"reqId":"PKSrZtXcZ5c12GAKyncb","level":0,"time":"2024-07-11T20:37:05+00:00","remoteAddr":"","user":"--","app":"core","method":"","url":"--","message":"starting upgrade from 26.0.13.1 to 27.1.11.3","userAgent":"--","version":"26.0.13.1","data":{"app":"core"}}
{"reqId":"PKSrZtXcZ5c12GAKyncb","level":1,"time":"2024-07-11T20:37:05+00:00","remoteAddr":"","user":"--","app":"updater","method":"","url":"--","message":"OC\\Repair\\Events\\RepairStepEvent: Repair step: Repair MySQL collation","userAgent":"--","version":"26.0.13.1","data":{"app":"updater"}}
{"reqId":"PKSrZtXcZ5c12GAKyncb","level":1,"time":"2024-07-11T20:37:05+00:00","remoteAddr":"","user":"--","app":"updater","method":"","url":"--","message":"OC\\Repair\\Events\\RepairInfoEvent: Repair info: All tables already have the correct collation -> nothing to do","userAgent":"--","version":"26.0.13.1","data":{"app":"updater"}}
{"reqId":"PKSrZtXcZ5c12GAKyncb","level":1,"time":"2024-07-11T20:37:05+00:00","remoteAddr":"","user":"--","app":"updater","method":"","url":"--","message":"OC\\Repair\\Events\\RepairStepEvent: Repair step: Repair SQLite autoincrement","userAgent":"--","version":"26.0.13.1","data":{"app":"updater"}}
{"reqId":"PKSrZtXcZ5c12GAKyncb","level":1,"time":"2024-07-11T20:37:05+00:00","remoteAddr":"","user":"--","app":"updater","method":"","url":"--","message":"OC\\Repair\\Events\\RepairStepEvent: Repair step: Copy data from accounts table when migrating from ownCloud","userAgent":"--","version":"26.0.13.1","data":{"app":"updater"}}
{"reqId":"PKSrZtXcZ5c12GAKyncb","level":1,"time":"2024-07-11T20:37:06+00:00","remoteAddr":"","user":"--","app":"updater","method":"","url":"--","message":"OC\\Repair\\Events\\RepairStepEvent: Repair step: Drop account terms table when migrating from ownCloud","userAgent":"--","version":"26.0.13.1","data":{"app":"updater"}}
{"reqId":"PKSrZtXcZ5c12GAKyncb","level":1,"time":"2024-07-11T20:37:06+00:00","remoteAddr":"","user":"--","app":"updater","method":"","url":"--","message":"\\OC\\Updater::dbUpgradeBefore: Updating database schema","userAgent":"--","version":"26.0.13.1","data":{"app":"updater"}}
{"reqId":"PKSrZtXcZ5c12GAKyncb","level":1,"time":"2024-07-11T20:37:06+00:00","remoteAddr":"","user":"--","app":"updater","method":"","url":"--","message":"\\OC\\Updater::dbUpgrade: Updated database","userAgent":"--","version":"26.0.13.1","data":{"app":"updater"}}
{"reqId":"PKSrZtXcZ5c12GAKyncb","level":1,"time":"2024-07-11T20:37:06+00:00","remoteAddr":"","user":"--","app":"updater","method":"","url":"--","message":"\\OC\\Updater::appUpgradeStarted: Updating <lookup_server_connector> ...","userAgent":"--","version":"26.0.13.1","data":{"app":"updater"}}
{"reqId":"3MRtPxQFz6w5KeqO4wGV","level":0,"time":"2024-07-11T20:40:00+00:00","remoteAddr":"","user":"--","app":"cron","method":"","url":"--","message":"Update required, skipping cron","userAgent":"--","version":"26.0.13.1","data":{"app":"cron"}}
{"reqId":"6JcJFhHrVdptY2egjHfZ","level":0,"time":"2024-07-11T20:40:00+00:00","remoteAddr":"","user":"--","app":"cron","method":"","url":"--","message":"Update required, skipping cron","userAgent":"--","version":"26.0.13.1","data":{"app":"cron"}}
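When the process dies with a segfault, nothing more is written to the log, so the last entry of the upgrade request is the best hint about where it stopped. A minimal sketch for pulling that out of a JSON-lines log like the one above, assuming each line is one JSON object with `reqId` and `message` keys (the helper name `last_message_per_request` is hypothetical, and the sample lines are shortened copies of the entries above):

```python
import json


def last_message_per_request(lines):
    """Return {reqId: last logged message}, skipping lines that are not JSON."""
    last = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            # e.g. a bare "Segmentation fault" line mixed into the output
            continue
        if isinstance(entry, dict) and "reqId" in entry:
            last[entry["reqId"]] = entry.get("message", "")
    return last


# Shortened sample based on the log above (most fields omitted for brevity).
sample = [
    r'{"reqId":"PKSrZtXcZ5c12GAKyncb","app":"core","message":"starting upgrade from 26.0.13.1 to 27.1.11.3"}',
    r'{"reqId":"PKSrZtXcZ5c12GAKyncb","app":"updater","message":"\\OC\\Updater::appUpgradeStarted: Updating <lookup_server_connector> ..."}',
    r'{"reqId":"3MRtPxQFz6w5KeqO4wGV","app":"cron","message":"Update required, skipping cron"}',
    "Segmentation fault",
]

last = last_message_per_request(sample)
for req_id, message in last.items():
    print(req_id, "->", message)
```

For the upgrade request this points at the `lookup_server_connector` step, which matches the console output. It only shows where the run stopped, not why.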

Additional info

./occ status
Nextcloud or one of the apps require upgrade - only a limited number of commands are available
You may use your browser or the occ upgrade command to do the upgrade

  • installed: true
  • version: 27.1.11.3
  • versionstring: 27.1.11
  • edition:
  • maintenance: true
  • needsDbUpgrade: true
  • productname: Nextcloud
  • extendedSupport: false
@joshtrichards
Member

joshtrichards commented Jul 11, 2024

> "version": "26.0.13.1",

Which image, precisely, are you using?

Please share your Docker Compose file and precisely how you upgraded from whatever version you started this latest upgrade attempt with.

@joshtrichards joshtrichards added the needs info Additional info needed to triage label Jul 11, 2024
@joshtrichards joshtrichards transferred this issue from nextcloud/server Jul 11, 2024
@TechupBusiness
Author

> "version": "26.0.13.1",
>
> Which image, precisely, are you using?
>
> Please share your Docker Compose file and precisely how you upgraded from whatever version you started this latest upgrade attempt with.

You can find the files here:

I changed the image (via configuration) from 26-fpm-alpine to 27-fpm-alpine.

Upgrade is described here: https://github.com/TechupBusiness/simple-docker-multi-project/blob/master/applications/system-services/main/nextcloud/Readme.md

It would be helpful if there were a log file that shows the real reason for this error.

@joshtrichards
Member

You're using a highly customized setup/image. This does not happen in a standard install or using a standard image. I can't even begin to decipher what is going on there without digging into whatever abstractions and customization you've done (and that's not realistic through this channel).

The most useful output would have been the Docker container output from the very first start-up of your app container when attempting to upgrade. The upgrade is handled (in the unmodified image at least) within the entrypoint. There is no need to run occ upgrade independently (and if you need to, it's a sign something has already gone wrong).
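For context, an upgrade step of this kind typically compares the version recorded for the existing data with the version shipped in the image before it starts. The following is a rough sketch of such a check, not the entrypoint's actual code; the function names `parse_version` and `upgrade_problem` are hypothetical, and the one-major-at-a-time rule is taken from Nextcloud's general upgrade guidance:

```python
def parse_version(version):
    """Turn a dotted version such as '26.0.13.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def upgrade_problem(installed, image):
    """Return None if a direct upgrade looks allowed, else a short reason."""
    inst, img = parse_version(installed), parse_version(image)
    if inst > img:
        # The data is newer than the code in the image.
        return "downgrade"
    if img[0] > inst[0] + 1:
        # Only one major version step at a time is supported.
        return "skips a major version"
    return None


# Versions taken from the report in this thread.
print(upgrade_problem("26.0.13.1", "27.1.11.3"))  # None: a single major step
print(upgrade_problem("25.0.13.2", "27.1.11.3"))  # skips a major version
```

By this measure the 26 to 27 step reported here is a permitted upgrade path, so the version jump itself is unlikely to explain the crash.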

> It would be helpful if there were a log file that shows the real reason for this error.

Well, it's a low-level error that is unlikely to be coming from Nextcloud itself. You'll likely have to troubleshoot what is causing it in your local environment. A million things can cause segfaults.

I will say that the custom GID stuff you're doing may not be compatible with assumptions made within the stock image's entrypoint.sh. The ownership of newer versions of Server that get deployed there (via rsync) may not match your customized Dockerfile's assumptions.
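One way to test the ownership theory is to list files under the web root whose owner differs from the user the PHP-FPM workers run as. A minimal sketch, assuming the expected UID/GID are looked up inside the container first (for example with `id www-data`); the helper name `mismatched_owners` is hypothetical, and the demo below runs against a throwaway directory rather than a real install:

```python
import os
import tempfile


def mismatched_owners(root, uid, gid):
    """Return paths below root whose owner UID or GID differs from (uid, gid)."""
    mismatches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            info = os.lstat(path)  # lstat: report symlinks themselves
            if info.st_uid != uid or info.st_gid != gid:
                mismatches.append(path)
    return mismatches


# Demo on a temporary directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as demo_root:
    demo_file = os.path.join(demo_root, "version.php")
    open(demo_file, "w").close()
    owner = os.lstat(demo_file)
    # Expected owner matches: nothing is reported.
    clean = mismatched_owners(demo_root, owner.st_uid, owner.st_gid)
    # Pretend a different UID is expected: the file is reported.
    flagged = mismatched_owners(demo_root, owner.st_uid + 1, owner.st_gid)

print(clean, flagged)
```

In the container this would be pointed at `/var/www/html` (the path from the configuration report) with the UID/GID of the user that runs PHP-FPM. A non-empty result after the entrypoint's rsync step would support the ownership explanation; an empty one would suggest looking elsewhere.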

If I was in your position I'd probably eliminate all the variables I possibly can by going to a stock image then working from there incrementally until the problem comes up again. And if it comes up in the stock image, then I'd look at what's notable about the host (OS, Docker platform) to try to find a way to make it possible for someone else to reproduce the same behavior.

@TechupBusiness
Author

TechupBusiness commented Jul 13, 2024

> You're using a highly customized setup/image. This does not happen in a standard install or using a standard image. I can't even begin to decipher what is going on there without digging into whatever abstractions and customization you've done (and that's not realistic through this channel).

Erm, you are aware that the only difference from the default image is that I explicitly set the user/group? What's highly customized here? I don't have custom entrypoints, and there is nothing else "custom" about this image. So it's a pretty standard image.

> The most useful output would have been the Docker container output from the very first start-up of your app container when attempting to upgrade. The upgrade is handled (in the unmodified image at least) within the entrypoint. There is no need to run occ upgrade independently (and if you need to, it's a sign something has already gone wrong).

Yes, I know, but it didn't work: same error. That's why I tried it manually.

The log looked like:

app_1             | Configuring Redis as session handler
app_1             | Initializing nextcloud 26.0.13.1 ...
app_1             | Upgrading nextcloud from 25.0.13.2 ...
app_1             | => Searching for scripts (*.sh) to run, located in the folder: /docker-entrypoint-hooks.d/pre-upgrade
app_1             | Nextcloud or one of the apps require upgrade - only a limited number of commands are available
app_1             | You may use your browser or the occ upgrade command to do the upgrade
app_1             | Setting log level to debug
app_1             | Turned on maintenance mode
app_1             | Updating database schema
app_1             | Updated database
app_1             | Updating <lookup_server_connector> ...
app_1             | Segmentation fault
app_1             | Configuring Redis as session handler
app_1             | => Searching for scripts (*.sh) to run, located in the folder: /docker-entrypoint-hooks.d/before-starting
app_1             | [11-Jul-2024 20:28:23] NOTICE: fpm is running, pid 1
app_1             | [11-Jul-2024 20:28:23] NOTICE: ready to handle connections

> It would be helpful if there were a log file that shows the real reason for this error.
>
> Well, it's a low-level error that is unlikely to be coming from Nextcloud itself. You'll likely have to troubleshoot what is causing it in your local environment. A million things can cause segfaults.

Yeah, but something must trigger this error... it seems quite high-level to me. Other users report that it was related to outdated extensions etc.

> I will say that the custom GID stuff you're doing may not be compatible with assumptions made within the stock image's entrypoint.sh. The ownership of newer versions of Server that get deployed there (via rsync) may not match your customized Dockerfile's assumptions.

Do you know what the proper ownership is? Because I didn't change anything else.

> If I was in your position I'd probably eliminate all the variables I possibly can by going to a stock image then working from there incrementally until the problem comes up again. And if it comes up in the stock image, then I'd look at what's notable about the host (OS, Docker platform) to try to find a way to make it possible for someone else to reproduce the same behavior.

I'm using the stock image except for setting the ownership (which is needed because I mount existing files too). Linux filesystem stuff. I'm now considering a fresh install because I don't use Nextcloud for anything other than accessing files via the app or sharing via the web (I don't need the existing active shares). But this error may not even be related to such things, therefore... I guess I need to know about proper file ownership :)

@tzerber
Contributor

tzerber commented Jul 15, 2024

I remember having a similarly weird issue while testing some of the images. The safest way to restart is to run docker compose down followed by docker compose up -d. I have no idea where that version 25 came from, but it worked. If that does not help, start all the containers you want to keep and run docker system prune. This also helps, but keep in mind that it deletes all stopped containers and anything else not in use at the moment of execution (edit: volumes included).
