In some cases, `zinniad` gets stuck, and it takes several minutes until it responds to the SIGTERM signal that Station Core sends after detecting that Spark is stuck.
In the case for which the logs are shown below:

- At 2024-12-10T13:31:39Z, Spark enters a 60-second sleep.
- After ~5 minutes, Station Core detects inactivity and kills Spark.
- At 2024-12-10T13:47:45Z, Spark sends an HTTP request to check the current round. The request fails with a "connection reset" error.
- At 2024-12-10T13:47:45Z, Spark enters another 60-second sleep.
- At that point, the Zinnia main loop ends, the process exits, and Station Core detects the exit (via signal).
- At 2024-12-10T13:47:45Z, Station Core starts Spark/Zinnia again.
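The "detects inactivity after 5 minutes" step above presumably works as a watchdog timer that is reset on every activity event from the module. A minimal sketch (the class name, API, and timeout handling are illustrative assumptions, not Station Core's actual implementation):

```javascript
// Sketch of an inactivity watchdog: calls `onStuck` when no activity has been
// recorded for `timeoutMs`. Names and structure are hypothetical.
class InactivityWatchdog {
  constructor(timeoutMs, onStuck) {
    this.timeoutMs = timeoutMs;
    this.onStuck = onStuck;
    this.timer = null;
  }

  // Call this on every log line / activity event emitted by the module.
  recordActivity() {
    clearTimeout(this.timer);
    this.timer = setTimeout(this.onStuck, this.timeoutMs);
    this.timer.unref?.(); // Node only: don't keep the event loop alive
  }

  stop() {
    clearTimeout(this.timer);
  }
}
```

Note that a watchdog like this cannot distinguish "Spark is sleeping for 60 seconds by design" from "Spark is stuck", which matches the behavior seen in the logs: a legitimate sleep plus a slow network call can push the gap between activity events past the 5-minute threshold.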
Logs:
```
[2024-12-10T13:31:39Z INFO module:spark/main] Measurement submitted (id: [redacted])
{"type":"jobs-completed","total":[redacted],"rewardsScheduledForAddress":"[redacted]"}
[2024-12-10T13:31:39Z INFO module:spark/main] Sleeping for 60 seconds before starting the next task...
{"type":"activity:error","module":"Zinnia","message":"Spark has been inactive for 5 minutes, restarting..."}
{"type":"activity:error","module":"spark/main","message":"SPARK failed reporting retrieval"}
[2024-12-10T13:47:45Z INFO module:spark/main]
[2024-12-10T13:47:45Z INFO module:spark/main] Checking the current SPARK round...
[2024-12-10T13:47:45Z ERROR module:spark/main] Error: error sending request for url (https://api.filspark.com/rounds/current): connection error: connection reset
    at async mainFetch (ext:deno_fetch/26_fetch.js:277:12)
    at async fetch (ext:deno_fetch/26_fetch.js:504:7)
    at async Tasker.#updateCurrentRound (file:///Users/redacted/Library/Caches/app.filstation.desktop/sources/spark/lib/tasker.js:50:15)
    at async Tasker.next (file:///Users/redacted/Library/Caches/app.filstation.desktop/sources/spark/lib/tasker.js:44:5)
    at async Spark.getRetrieval (file:///Users/redacted/Library/Caches/app.filstation.desktop/sources/spark/lib/spark.js:40:23)
    at async Spark.nextRetrieval (file:///Users/redacted/Library/Caches/app.filstation.desktop/sources/spark/lib/spark.js:189:23)
    at async Spark.run (file:///Users/redacted/Library/Caches/app.filstation.desktop/sources/spark/lib/spark.js:208:9)
    at async file:///Users/redacted/Library/Caches/app.filstation.desktop/sources/spark/main.js:4:1
[2024-12-10T13:47:45Z INFO module:spark/main] Sleeping for 60 seconds before starting the next task...
{"type":"activity:error","module":"Zinnia","message":"Spark crashed via signal SIGTERM"}
Zinnia main loop ended
[2024-12-10T13:47:45Z INFO zinniad] Starting zinniad with config CliArgs { wallet_address: "[redacted]", station_id: "[redacted]", state_root: "/Users/redacted/Library/Application Support/app.filstation.desktop/modules/zinnia", cache_root: "/Users/redacted/Library/Caches/app.filstation.desktop/modules/zinnia", files: ["spark/main.js"] }
[2024-12-10T13:47:45Z INFO lassie] Starting Lassie Daemon
[2024-12-10T13:47:45Z INFO lassie] Lassie Daemon is listening on port 54326
{"type":"activity:info","module":"spark","message":"Spark started"}
```
- Add more logs to understand where exactly Spark spent those 5 minutes.
- Add timestamps to the log lines that print activities.
- Change Station Core to use SIGKILL instead of SIGTERM and kill Zinnia immediately, the hard way. (Maybe use SIGKILL only when we detect that the process got stuck.) This is fine because Zinnia has not yet implemented a graceful shutdown.
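The SIGKILL idea above could be implemented as an escalation: send SIGTERM first, then force-kill if the process has not exited within a grace period. A minimal Node.js sketch (the function name, grace period, and child-process shape are assumptions, not Station Core's actual code):

```javascript
// Sketch: terminate a possibly-stuck child process, escalating from SIGTERM
// to SIGKILL if it does not exit within `gracePeriodMs`. Illustrative only.
function killWithEscalation(child, gracePeriodMs = 10_000) {
  return new Promise((resolve) => {
    const timer = setTimeout(() => {
      // Graceful shutdown did not complete in time; force-kill the hard way.
      child.kill('SIGKILL');
    }, gracePeriodMs);

    child.once('exit', (code, signal) => {
      clearTimeout(timer);
      resolve(signal ?? code);
    });

    // Give the child a chance to shut down gracefully first.
    child.kill('SIGTERM');
  });
}
```

With Zinnia's current behavior (no graceful shutdown handler), the escalation would effectively always reach SIGKILL for a stuck process; the grace period only matters once graceful shutdown is implemented.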