Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prometheus Plugin Update prevents Jenkins from startup #647

Closed
Alphayeeeet opened this issue Mar 26, 2024 · 19 comments · Fixed by #650
Closed

Prometheus Plugin Update prevents Jenkins from startup #647

Alphayeeeet opened this issue Mar 26, 2024 · 19 comments · Fixed by #650

Comments

@Alphayeeeet
Copy link

Jenkins and plugins versions report

Jenkins: 2.440.2
OS: Linux - 4.18.0-513.18.1.el8_9.x86_64
Java: 11.0.22 - Red Hat, Inc. (OpenJDK 64-Bit Server VM)

analysis-model-api:12.1.0
ansicolor:1.0.4
ant:497.v94e7d9fffa_b_9
antisamy-markup-formatter:162.v0e6ec0fcfcf6
apache-httpcomponents-client-4-api:4.5.14-208.v438351942757
asm-api:9.6-3.v2e1fa_b_338cd7
authentication-tokens:1.53.v1c90fd9191a_b_
badge:1.9.1
basic-branch-build-strategies:81.v05e333931c7d
bootstrap5-api:5.3.3-1
bouncycastle-api:2.30.1.77-225.v26ea_c9455fd9
branch-api:2.1152.v6f101e97dd77
build-timeout:1.32
build-user-vars-plugin:1.9
caffeine-api:3.1.8-133.v17b_1ff2e0599
checks-api:2.0.2
cloudbees-bitbucket-branch-source:880.vcf4056c5a_71f
cloudbees-folder:6.901.vb_4c7a_da_75da_3
cobertura:1.17
code-coverage-api:4.99.0
command-launcher:107.v773860566e2e
commons-lang3-api:3.13.0-62.v7d18e55f51e2
commons-text-api:1.11.0-95.v22a_d30ee5d36
conditional-buildstep:1.4.3
config-file-provider:968.ve1ca_eb_913f8c
content-replace:1.8.2
copyartifact:722.v0662a_9b_e22a_c
coverage:1.13.0
credentials:1337.v60b_d7b_c7b_c9f
credentials-binding:657.v2b_19db_7d6e6d
custom-tools-plugin:0.8
data-tables-api:2.0.2-1
dependency-track:4.3.1
display-url-api:2.200.vb_9327d658781
docker-commons:439.va_3cb_0a_6a_fb_29
docker-workflow:572.v950f58993843
durable-task:550.v0930093c4b_a_6
echarts-api:5.5.0-1
email-ext:2.105
exclusive-execution:0.8
extended-choice-parameter:381.v360a_25ea_017c
extended-read-permission:53.v6499940139e5
extra-tool-installers:139.v723fee51b_7f2
favorite:2.208.v91d65b_7792a_c
file-operations:214.v2e7dc7f25757
file-parameters:316.va_83a_1221db_a_7
font-awesome-api:6.5.1-3
forensics-api:2.4.0
generic-webhook-trigger:2.1.0
git:5.2.1
git-client:4.7.0
git-parameter:0.9.19
git-server:114.v068a_c7cc2574
gitlab-plugin:1.8.0
gradle:2.10
groovy:457.v99900cb_85593
groovy-postbuild:228.vcdb_cf7265066
gson-api:2.10.1-15.v0d99f670e0a_7
handy-uri-templates-2-api:2.1.8-30.v7e777411b_148
hidden-parameter:237.v4b_df26c7a_f0e
htmlpublisher:1.33
http_request:1.18
instance-identity:185.v303dc7c645f9
ionicons-api:56.v1b_1c8c49374e
jackson2-api:2.17.0-379.v02de8ec9f64c
jakarta-activation-api:2.1.3-1
jakarta-mail-api:2.1.3-1
javadoc:243.vb_b_503b_b_45537
javax-activation-api:1.2.0-6
javax-mail-api:1.6.2-9
jaxb:2.3.9-1
jdk-tool:73.vddf737284550
jersey2-api:2.42-147.va_28a_44603b_d5
jjwt-api:0.11.5-77.v646c772fddb_0
job-dsl:1.87
joda-time-api:2.12.7-29.v5a_b_e3a_82269a_
jquery:1.12.4-1
jquery3-api:3.7.1-2
jsch:0.2.16-86.v42e010d9484b_
json-api:20240303-41.v94e11e6de726
json-path-api:2.9.0-58.v62e3e85b_a_655
junit:1259.v65ffcef24a_88
ldap:719.vcb_d039b_77d0d
mailer:472.vf7c289a_4b_420
mapdb-api:1.0.9-28.vf251ce40855d
matrix-auth:3.2.2
matrix-project:822.824.v14451b_c0fd42
maven-plugin:3.23
metrics:4.2.21-449.v6960d7c54c69
mina-sshd-api-common:2.12.0-90.v9f7fb_9fa_3d3b_
mina-sshd-api-core:2.12.0-90.v9f7fb_9fa_3d3b_
msbuild:1.30
nodejs:1.6.1
okhttp-api:4.11.0-172.vda_da_1feeb_c6e
openshift-client:1.1.0.424.v829cb_ccf8798
pam-auth:1.10
pipeline-build-step:540.vb_e8849e1a_b_d8
pipeline-graph-analysis:216.vfd8b_ece330ca_
pipeline-groovy-lib:704.vc58b_8890a_384
pipeline-input-step:491.vb_07d21da_1a_fb_
pipeline-milestone-step:111.v449306f708b_7
pipeline-model-api:2.2184.v0b_358b_953e69
pipeline-model-definition:2.2184.v0b_358b_953e69
pipeline-model-extensions:2.2184.v0b_358b_953e69
pipeline-rest-api:2.34
pipeline-stage-step:305.ve96d0205c1c6
pipeline-stage-tags-metadata:2.2184.v0b_358b_953e69
pipeline-stage-view:2.34
pipeline-utility-steps:2.16.2
plain-credentials:179.vc5cb_98f6db_38
plugin-util-api:4.1.0
powershell:2.1
prism-api:1.29.0-13
prometheus:2.5.1 --> Upgrade to 2.5.2 not possible needed to rollback to backup
publish-over:0.22
publish-over-ssh:1.25
pubsub-light:1.18
purge-build-queue-plugin:88.v23b_97b_f2c7a_d
purge-job-history:1.6
rebuild:330.v645b_7df10e2a_
resource-disposer:0.23
role-strategy:713.vb_3837801b_8cc
run-condition:1.7
scm-api:689.v237b_6d3a_ef7f
script-security:1326.vdb_c154de8669
slave-setup:1.16
snakeyaml-api:2.2-111.vc6598e30cc65
sonar:2.17.2
sse-gateway:1.26
ssh-agent:346.vda_a_c4f2c8e50
ssh-credentials:326.v7fcb_a_ef6194b_
ssh-slaves:2.948.vb_8050d697fec
sshd:3.322.v159e91f6a_550
strict-crumb-issuer:2.1.1
structs:337.v1b_04ea_4df7c8
throttle-concurrents:2.14
timestamper:1.26
token-macro:400.v35420b_922dcb_
trilead-api:2.142.v748523a_76693
uno-choice:2.8.1
variant:60.v7290fc0eb_b_cd
warnings-ng:11.2.2
workflow-aggregator:596.v8c21c963d92d
workflow-api:1291.v51fd2a_625da_7
workflow-basic-steps:1049.v257a_e6b_30fb_d
workflow-cps:3889.v937e0b_3412d3
workflow-durable-task-step:1331.vc8c2fed35334
workflow-job:1400.v7fd111b_ec82f
workflow-multibranch:773.vc4fe1378f1d5
workflow-scm-step:427.v4ca_6512e7df1
workflow-step-api:657.v03b_e8115821b_
workflow-support:881.v7663695646cf
ws-cleanup:0.45

What Operating System are you using (both controller, and any agents involved in the problem)?

Controller Node is Running RHEL 8

Reproduction steps

Updating Prometheus Plugin to 2.5.2
Jenkins version is 2.440.2

Expected Results

Jenkins should startup normally.

Actual Results

Jenkins was stuck in startup process and throws errors in log.

Anything else?

Log Messages attached:

2024-03-26 14:08:26.920+0000 [id=43] WARNING o.j.p.prometheus.JobCollector#lambda$collect$0: Caught error when processing job [JOB] error: java.lang.IllegalStateException: Jenkins.instance is missing. Read the documentation of Jenkins.getInstanceOrNull to see what you are doing wrong. at jenkins.model.Jenkins.get(Jenkins.java:819) at org.jenkinsci.plugins.prometheus.config.PrometheusConfiguration.get(PrometheusConfiguration.java:79) at org.jenkinsci.plugins.prometheus.JobCollector.appendJobMetrics(JobCollector.java:215) at org.jenkinsci.plugins.prometheus.JobCollector.lambda$collect$0(JobCollector.java:162) at org.jenkinsci.plugins.prometheus.util.Jobs.forEachJob(Jobs.java:19) at org.jenkinsci.plugins.prometheus.JobCollector.collect(JobCollector.java:155) at io.prometheus.client.CollectorRegistry.collectorNames(CollectorRegistry.java:113) at io.prometheus.client.CollectorRegistry.register(CollectorRegistry.java:50) at org.jenkinsci.plugins.prometheus.service.DefaultPrometheusMetrics.registerCollectors(DefaultPrometheusMetrics.java:46) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:109) at hudson.init.TaskMethodFinder$TaskImpl.run(TaskMethodFinder.java:185) at org.jvnet.hudson.reactor.Reactor.runTask(Reactor.java:305) at jenkins.model.Jenkins$5.runTask(Jenkins.java:1170) at org.jvnet.hudson.reactor.Reactor$2.run(Reactor.java:221) at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:120) at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:829) 2024-03-26 14:08:26.910+0000 [id=23] SEVERE h.i.i.InstallUncaughtExceptionHandler$DefaultUncaughtExceptionHandler#uncaughtException: A thread (Jenkins initialization thread/23) died unexpectedly due to an uncaught exception. This may leave your server corrupted and usually indicates a software bug. java.lang.ClassNotFoundException: hudson.util.HudsonFailedToLoad at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:476) at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:594) at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:527) at org.eclipse.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:511) at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:527) Caused: java.lang.NoClassDefFoundError: hudson/util/HudsonFailedToLoad at hudson.WebAppMain$3.run(WebAppMain.java:276)

Are you interested in contributing a fix?

No response

@Waschndolos
Copy link

Sounds stange. I didn't have any issues in my systems (multiple platforms). I'll check if I can provide a quick fix

@Waschndolos
Copy link

I've just updated 2 of our production Jenkins instances to 2.5.2 in the company without any issues. Are you sure your Jenkins instance is healthy?

@jonesbusy
Copy link

jonesbusy commented Mar 27, 2024

Hi,

I also have issue with new version.

Caused by: java.lang.IllegalArgumentException: Failed to register Collector of type JenkinsStatusCollector: default_jenkins_version_info is already in use by another Collector of type JenkinsStatusCollector

Full stacktrace is

2024-03-27T05:54:02.816Z [ERROR] [Thread=pool-199-thread-3] - jenkins.InitReactorRunner # Failed DefaultPrometheusMetrics.registerCollectors 
java.lang.Error: java.lang.reflect.InvocationTargetException
        at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:115)
        at hudson.init.TaskMethodFinder$TaskImpl.run(TaskMethodFinder.java:185)
        at org.jvnet.hudson.reactor.Reactor.runTask(Reactor.java:305)
        at jenkins.model.Jenkins$5.runTask(Jenkins.java:1170)
        at org.jvnet.hudson.reactor.Reactor$2.run(Reactor.java:221)
        at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:120)
        at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.lang.reflect.InvocationTargetException: null
        at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:118)
        at java.base/java.lang.reflect.Method.invoke(Method.java:580)
        at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:109)
        ... 9 common frames omitted
Caused by: java.lang.IllegalArgumentException: Failed to register Collector of type JenkinsStatusCollector: default_jenkins_version_info is already in use by another Collector of type JenkinsStatusCollector
        at io.prometheus.client.CollectorRegistry.register(CollectorRegistry.java:57)
        at org.jenkinsci.plugins.prometheus.service.DefaultPrometheusMetrics.registerCollectors(DefaultPrometheusMetrics.java:47)
        at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
        ... 11 common frames omitted

Not sure if it's same issue or not.

But started to appear with 2.5.2.

Instance is healthy and latest LTS 2.440.2. All plugins are up-to-date

@Waschndolos
Copy link

But have you been able to update and does the 2.5.2 version produce results?

That's kind of strange. I have multiple Jenkins instances. Windows, Docker... also colleagues tried that and nobody had any issues yet. Are we using the @Initializer(after = InitMilestone.JOB_LOADED) incorrect?

@Waschndolos
Copy link

Since I cannot reproduce the issue I've implemented a "shot in the dark" in PR #648. @jonesbusy @Alphayeeeet Could anybody of you test the .hpi file which comes out of https://ci.jenkins.io/job/Plugins/job/prometheus-plugin/job/PR-648/ ?

@jonesbusy
Copy link

Sure thanks, I will test the incrementals tomorrow.

I would be curious if the jenkins bom would be able to discover such issue

@nilbek
Copy link

nilbek commented Mar 28, 2024

Yesterday it happened to me too. I manually remove recently updated plugins one by one and after deleting prometheus, it could start normally.

Jenkins 2.440.2
Instance is healthy and latest LTS 2.440.2. All plugins are up-to-date.
Platform: Linux rpm
Controller Node is Running RHEL 8

More details: Actually I was not upgrading "Prometheus Plugin" to "2.5.2". I done that earlier. When "Prometheus Plugin, 2.5.2" already installed, I was trying to upgrade other plugins. After upgrading them and restart, it hangs and timeouts start of the jenkins.

Maybe helpful logs:

...
Mar 27 10:20:39 jenkins[1557315]: 2024-03-27 07:20:39.442+0000 [id=427311]        INFO        hudson.util.Retrier#start: Attempt #1 to do the action check updates server
...
Mar 27 10:20:58 jenkins[1557315]: 2024-03-27 07:20:58.263+0000 [id=427311]        INFO        hudson.util.Retrier#start: Performed the action check updates server successfully at the attempt #1
...
Mar 27 10:23:41 jenkins[1557315]: 2024-03-27 07:23:41.732+0000 [id=427311]        INFO        hudson.PluginManager#install: Starting installation of a batch of 3 plugins plus their dependencies
Mar 27 10:23:41 jenkins[1557315]: 2024-03-27 07:23:41.734+0000 [id=439073]        INFO        h.model.UpdateCenter$DownloadJob#run: Starting the installation of github-branch-source on behalf of USER
Mar 27 10:23:41 jenkins[1557315]: 2024-03-27 07:23:41.734+0000 [id=439073]        INFO        h.m.UpdateCenter$UpdateCenterConfiguration#download: Downloading github-branch-source
Mar 27 10:23:43 jenkins[1557315]: 2024-03-27 07:23:43.071+0000 [id=439073]        INFO        h.model.UpdateCenter$DownloadJob#run: Starting the installation of junit on behalf of USER
Mar 27 10:23:43 jenkins[1557315]: 2024-03-27 07:23:43.071+0000 [id=439073]        INFO        h.m.UpdateCenter$UpdateCenterConfiguration#download: Downloading junit
Mar 27 10:23:44 jenkins[1557315]: 2024-03-27 07:23:44.648+0000 [id=439073]        INFO        h.model.UpdateCenter$DownloadJob#run: Starting the installation of mina-sshd-api-core on behalf of USER
Mar 27 10:23:44 jenkins[1557315]: 2024-03-27 07:23:44.648+0000 [id=439073]        INFO        h.m.UpdateCenter$UpdateCenterConfiguration#download: Downloading mina-sshd-api-core
Mar 27 10:23:54 jenkins[1557315]: 2024-03-27 07:23:54.463+0000 [id=417202]        INFO        hudson.model.UpdateCenter#doSafeRestart: Scheduling Jenkins reboot
Mar 27 10:23:54 jenkins[1557315]: 2024-03-27 07:23:54.482+0000 [id=439102]        INFO        hudson.lifecycle.Lifecycle#onStatusUpdate: Restart in 10 seconds
Mar 27 10:24:04 jenkins[1557315]: 2024-03-27 07:24:04.486+0000 [id=439102]        INFO        hudson.lifecycle.Lifecycle#onStop: Stopping Jenkins as requested by USER
Mar 27 10:24:04 jenkins[1557315]: 2024-03-27 07:24:04.488+0000 [id=439102]        INFO        hudson.lifecycle.Lifecycle#onStatusUpdate: Stopping Jenkins
Mar 27 10:24:04 jenkins[1557315]: 2024-03-27 07:24:04.527+0000 [id=439102]        INFO        jenkins.model.Jenkins$16#onAttained: Started termination
Mar 27 10:24:04 jenkins[1557315]: 2024-03-27 07:24:04.589+0000 [id=439102]        INFO        c.zaxxer.hikari.HikariDataSource#close: HikariPool-1 - Shutdown initiated...
Mar 27 10:24:04 jenkins[1557315]: 2024-03-27 07:24:04.591+0000 [id=439102]        INFO        c.zaxxer.hikari.HikariDataSource#close: HikariPool-1 - Shutdown completed.
Mar 27 10:24:04 jenkins[1557315]: 2024-03-27 07:24:04.607+0000 [id=439102]        INFO        h.p.b.global.Lifecycle#shutdown: Shutdown complete - Global TimeOut ScheduledExecutorService had 0 tasks pending
Mar 27 10:24:04 jenkins[1557315]: 2024-03-27 07:24:04.611+0000 [id=439102]        INFO        jenkins.model.Jenkins$16#onAttained: Completed termination
Mar 27 10:24:04 jenkins[1557315]: 2024-03-27 07:24:04.611+0000 [id=439102]        INFO        jenkins.model.Jenkins#_cleanUpDisconnectComputers: Starting node disconnection
...
Mar 27 10:24:04 jenkins[1557315]: 2024-03-27 07:24:04.715+0000 [id=439102]        INFO        jenkins.model.Jenkins#_cleanUpShutdownPluginManager: Stopping plugin manager
...
Mar 27 10:24:05 jenkins[1557315]: 2024-03-27 07:24:05.020+0000 [id=439102]        INFO        jenkins.model.Jenkins#_cleanUpPersistQueue: Persisting build queue
Mar 27 10:24:05 jenkins[1557315]: 2024-03-27 07:24:05.024+0000 [id=439102]        INFO        jenkins.model.Jenkins#_cleanUpAwaitDisconnects: Waiting for node disconnection completion
...
Mar 27 10:24:05 jenkins[1557315]: 2024-03-27 07:24:05.033+0000 [id=439102]        INFO        hudson.lifecycle.Lifecycle#onStatusUpdate: Jenkins stopped
Mar 27 10:24:05 jenkins[1557315]: 2024-03-27 07:24:05.036+0000 [id=44]        INFO        winstone.Logger#logInternal: JVM is terminating. Shutting down Jetty
Mar 27 10:24:05 jenkins[1557315]: 2024-03-27 07:24:05.038+0000 [id=44]        INFO        org.eclipse.jetty.server.Server#doStop: Stopped Server@61009542{STOPPING}[10.0.20,sto=0]
Mar 27 10:24:05 jenkins[1557315]: 2024-03-27 07:24:05.043+0000 [id=439049]        WARNING        hudson.model.RunMap#retrieve: could not load /var/lib/jenkins/jobs/dev/jobs/JOB/builds/356
Mar 27 10:24:05 jenkins[1557315]: java.lang.IllegalStateException: Expected 1 instance of jenkins.model.TransientActionFactory$Cache but got 0
Mar 27 10:24:05 jenkins[1557315]:         at hudson.ExtensionList.lookupSingleton(ExtensionList.java:457)
Mar 27 10:24:05 jenkins[1557315]:         at jenkins.model.TransientActionFactory.factoriesFor(TransientActionFactory.java:127)
Mar 27 10:24:05 jenkins[1557315]:         at hudson.model.Actionable.getAllActions(Actionable.java:100)
Mar 27 10:24:05 jenkins[1557315]:         at hudson.model.Run.onLoad(Run.java:386)
Mar 27 10:24:05 jenkins[1557315]:         at org.jenkinsci.plugins.workflow.job.WorkflowRun.onLoad(WorkflowRun.java:566)
Mar 27 10:24:05 jenkins[1557315]:         at hudson.model.RunMap.retrieve(RunMap.java:233)
Mar 27 10:24:05 jenkins[1557315]:         at hudson.model.RunMap.retrieve(RunMap.java:61)
Mar 27 10:24:05 jenkins[1557315]:         at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:663)
Mar 27 10:24:05 jenkins[1557315]:         at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:645)
Mar 27 10:24:05 jenkins[1557315]:         at jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:543)
Mar 27 10:24:05 jenkins[1557315]:         at jenkins.model.lazy.AbstractLazyLoadRunMap.search(AbstractLazyLoadRunMap.java:502)
Mar 27 10:24:05 jenkins[1557315]:         at jenkins.model.lazy.LazyBuildMixIn$RunMixIn.getPreviousBuild(LazyBuildMixIn.java:406)
Mar 27 10:24:05 jenkins[1557315]:         at org.jenkinsci.plugins.workflow.job.WorkflowRun.getPreviousBuild(WorkflowRun.java:272)
Mar 27 10:24:05 jenkins[1557315]:         at org.jenkinsci.plugins.workflow.job.WorkflowRun.getPreviousBuild(WorkflowRun.java:136)
Mar 27 10:24:05 jenkins[1557315]:         at org.jenkinsci.plugins.prometheus.JobCollector.appendJobMetrics(JobCollector.java:270)
Mar 27 10:24:05 jenkins[1557315]:         at org.jenkinsci.plugins.prometheus.JobCollector.lambda$collect$0(JobCollector.java:184)
Mar 27 10:24:05 jenkins[1557315]:         at org.jenkinsci.plugins.prometheus.util.Jobs.forEachJob(Jobs.java:19)
Mar 27 10:24:05 jenkins[1557315]:         at org.jenkinsci.plugins.prometheus.JobCollector.collect(JobCollector.java:177)
Mar 27 10:24:05 jenkins[1557315]:         at io.prometheus.client.Collector.collect(Collector.java:45)
Mar 27 10:24:05 jenkins[1557315]:         at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.findNextElement(CollectorRegistry.java:204)
Mar 27 10:24:05 jenkins[1557315]:         at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:219)
Mar 27 10:24:05 jenkins[1557315]:         at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:152)
Mar 27 10:24:05 jenkins[1557315]:         at java.base/java.util.Enumeration$1.next(Enumeration.java:123)
Mar 27 10:24:05 jenkins[1557315]:         at org.jenkinsci.plugins.prometheus.config.disabledmetrics.FilteredMetricEnumeration.filterList(FilteredMetricEnumeration.java:18)
Mar 27 10:24:05 jenkins[1557315]:         at org.jenkinsci.plugins.prometheus.config.disabledmetrics.FilteredMetricEnumeration.<init>(FilteredMetricEnumeration.java:12)
Mar 27 10:24:05 jenkins[1557315]:         at org.jenkinsci.plugins.prometheus.service.DefaultPrometheusMetrics.collectMetrics(DefaultPrometheusMetrics.java:53)
Mar 27 10:24:05 jenkins[1557315]:         at org.jenkinsci.plugins.prometheus.service.PrometheusAsyncWorker.execute(PrometheusAsyncWorker.java:41)
Mar 27 10:24:05 jenkins[1557315]:         at hudson.model.AsyncPeriodicWork.lambda$doRun$0(AsyncPeriodicWork.java:102)
Mar 27 10:24:05 jenkins[1557315]:         at java.base/java.lang.Thread.run(Thread.java:829)
...
(same log for different jobs)
...
Mar 27 10:24:05 jenkins[1557315]: 2024-03-27 07:24:05.147+0000 [id=44]        INFO        o.e.j.s.handler.ContextHandler#doStop: Stopped w.@1e886a5b{Jenkins v2.440.2,/,null,STOPPED}{/var/cache/jenkins/war}
Mar 27 10:24:05 jenkins[1557315]: 2024-03-27 07:24:05.156+0000 [id=44]        INFO        winstone.Logger#logInternal: Jetty shutdown successfully
Mar 27 10:24:10 systemd[1]: jenkins.service: Main process exited, code=exited, status=5/NOTINSTALLED
Mar 27 10:24:10 systemd[1]: jenkins.service: Failed with result 'exit-code'.
Mar 27 10:24:11 systemd[1]: jenkins.service: Service RestartSec=100ms expired, scheduling restart.
Mar 27 10:24:11 systemd[1]: jenkins.service: Scheduled restart job, restart counter is at 3.
Mar 27 10:24:11 systemd[1]: Stopped Jenkins Continuous Integration Server.
Mar 27 10:24:11 systemd[1]: Starting Jenkins Continuous Integration Server...
...
(Only high severity logs hereafter)
...
Mar 27 10:25:41 systemd[1]: jenkins.service: start operation timed out. Terminating.
Mar 27 10:25:46 systemd[1]: jenkins.service: Failed with result 'timeout'.
Mar 27 10:25:46 systemd[1]: Failed to start Jenkins Continuous Integration Server.
Mar 27 10:27:17 systemd[1]: jenkins.service: start operation timed out. Terminating.
Mar 27 10:27:23 systemd[1]: jenkins.service: Failed with result 'timeout'.
Mar 27 10:27:23 systemd[1]: Failed to start Jenkins Continuous Integration Server.
Mar 27 10:28:54 systemd[1]: jenkins.service: start operation timed out. Terminating.
Mar 27 10:28:59 systemd[1]: jenkins.service: Failed with result 'timeout'.
Mar 27 10:28:59 systemd[1]: Failed to start Jenkins Continuous Integration Server.
...

@nilbek
Copy link

nilbek commented Mar 28, 2024

After investigating logs, i remembered more details and edited my comment. I hope it will help.

@jonesbusy
Copy link

Yes my feeling is also similar that is related with restart or reload. When starting Jenkins inside a container (lets say from a fresh start) it works. But when restarting a controller or upgrading plugin and then restart it doesn't work. I think it has do to with init phase, plugin reload or even class path ?

It could explain why test are not discovering such issue.

Perhaps it can be reproduced using RealJenkinsRule 🤔

@Waschndolos
Copy link

Waschndolos commented Mar 28, 2024

I think the root cause was the introduction of #637 that @Initializer annotation. Maybe I just revert that - although it does make sense to me to use the @Initializer annotation but it seems to produce a weired behavior

@jonesbusy
Copy link

I never used Initializer on any of my plugin so I'm so sure about the behavior. Perhaps worth to ask on the dev list.

Thanks in any case!

@Waschndolos
Copy link

Guess I'll revert things. Takes some time - currently low on time. Hopefully done at the weekend

@Waschndolos
Copy link

@jonesbusy I've created a new PR at #650. I've tested it explicitly on a Windows hosted machine with update and let Jenkins restart over the plugins page. Didn't find any issues. Could you maybe check if it also works on your instances?

@Waschndolos
Copy link

@Alphayeeeet @jonesbusy @nilbek . I've reverted the commit which might have caused the issue and published 2.5.3. Hopefully this release works for you all. Sorry for the trouble!

@jonesbusy
Copy link

Hi,

Thanks, no problem it can happen. I will test the 2.5.3 next week after holidays.

Best regards,

@dominik-horb-umg
Copy link

We were seeing the same issue described here, updating to 2.5.3 fixed the problem.

@jonesbusy
Copy link

I also just tested the 2.5.3 this morning. This is also fixed on my side. Thanks again for the fix

@jonesbusy
Copy link

This issue is back after upgrading from 773.v3b_62d8178eec to 778.ve1c932a_ff24f

[05:40:14.113+02:00] - Caused by: java.lang.reflect.InvocationTargetException
[05:40:14.113+02:00] - 	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
[05:40:14.113+02:00] - 	at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:109)
[05:40:14.113+02:00] - 	... 9 more
[05:40:14.113+02:00] - Caused by: java.lang.IllegalArgumentException: Failed to register Collector of type JenkinsStatusCollector: default_jenkins_version_info is already in use by another Collector of type JenkinsStatusCollector
[05:40:14.113+02:00] - 	at io.prometheus.client.CollectorRegistry.register(CollectorRegistry.java:57)
[05:40:14.113+02:00] - 	at org.jenkinsci.plugins.prometheus.service.DefaultPrometheusMetrics.registerCollector(DefaultPrometheusMetrics.java:54)
[05:40:14.113+02:00] - 	at org.jenkinsci.plugins.prometheus.service.DefaultPrometheusMetrics.registerCollectors(DefaultPrometheusMetrics.java:62)
[05:40:14.113+02:00] - 	... 11 more
[05:40:14.113+02:00] - 

I will open a new issue to track it

@jonesbusy
Copy link

New issue #683

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment