docker, core, editoast: add mode single-worker for all infra #9166

Open
wants to merge 6 commits into
base: dev
from

Conversation

bougue-pe
Contributor

@bougue-pe bougue-pe commented Oct 3, 2024

Also:

  • editoast: remove http core client
  • cleanup unavailable envvar OSRD_BACKEND_URL

🔍 Please review by commit. One more developer should also test it (integration + e2e + regular use).
This work can probably be improved (doc, script, etc.); any suggestion is welcome.

Fix: #8599

@bougue-pe bougue-pe requested review from a team as code owners October 3, 2024 08:09
@github-actions github-actions bot added area:core Work on Core Service area:editoast Work on Editoast Service labels Oct 3, 2024
@bougue-pe bougue-pe requested a review from Erashin October 3, 2024 08:10
@codecov-commenter

codecov-commenter commented Oct 3, 2024

Codecov Report

Attention: Patch coverage is 3.84615% with 25 lines in your changes missing coverage. Please review.

Project coverage is 39.14%. Comparing base (e4eef48) to head (d60f674).
Report is 9 commits behind head on dev.

Files with missing lines Patch % Lines
...re/src/main/java/fr/sncf/osrd/cli/WorkerCommand.kt 0.00% 12 Missing ⚠️
editoast/src/core/mq_client.rs 0.00% 10 Missing ⚠️
editoast/src/client/mod.rs 0.00% 2 Missing ⚠️
editoast/src/core/mod.rs 0.00% 1 Missing ⚠️

Additional details and impacted files
@@             Coverage Diff              @@
##                dev    #9166      +/-   ##
============================================
+ Coverage     39.02%   39.14%   +0.11%     
  Complexity     2245     2245              
============================================
  Files          1289     1289              
  Lines         97319    97218     -101     
  Branches       3280     3280              
============================================
+ Hits          37981    38054      +73     
+ Misses        57399    57225     -174     
  Partials       1939     1939              
Flag Coverage Δ
core 74.92% <0.00%> (-0.02%) ⬇️
editoast 72.46% <7.14%> (+0.02%) ⬆️
front 10.33% <ø> (+0.20%) ⬆️
gateway 2.50% <ø> (+0.30%) ⬆️
osrdyne 3.52% <ø> (ø)
railjson_generator 87.49% <ø> (ø)
tests 86.71% <ø> (ø)


@flomonster
Contributor

I have not reviewed it yet, but I have a small question.

This feature seems useful for debugging purposes (see #8599). I believe we're talking about core? If so, I don't see why we should handle this feature using docker compose.

Contributor

@eckter eckter left a comment

LGTM for core with one minor comment.

Not tested yet, I'll probably come back to it later.

Comment on lines 51 to 52
WORKER_ID = System.getenv("WORKER_ID") ?: (if (ALL_INFRA) "all_infra_worker" else null)
WORKER_KEY = System.getenv("WORKER_KEY") ?: (if (ALL_INFRA) "all" else null)
Contributor

This sets the priority as "env variable > all infra > null". I'd prefer if we used "all infra > env variable > null". It's quite minor but that way we could change a single flag to enable the "all infra" mode, instead of having to also unset the worker key/id.

Suggested change
- WORKER_ID = System.getenv("WORKER_ID") ?: (if (ALL_INFRA) "all_infra_worker" else null)
- WORKER_KEY = System.getenv("WORKER_KEY") ?: (if (ALL_INFRA) "all" else null)
+ WORKER_ID = if (ALL_INFRA) "all_infra_worker" else System.getenv("WORKER_ID")
+ WORKER_KEY = if (ALL_INFRA) "all" else System.getenv("WORKER_KEY")
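
To make the difference between the two priority orders concrete, here is a minimal sketch. It is written in Rust rather than in the Kotlin of WorkerCommand.kt, and it takes the environment value as a parameter instead of reading it, so the function names and signatures are assumptions for illustration only, not code from the PR.

```rust
/// Worker id used when the "all infra" mode is on (value taken from the diff above).
const ALL_INFRA_WORKER_ID: &str = "all_infra_worker";

/// Order in the original patch: env variable > all infra > none.
/// Hypothetical helper: `env_value` stands for the content of WORKER_ID.
fn worker_id_env_first(env_value: Option<&str>, all_infra: bool) -> Option<String> {
    env_value
        .map(str::to_string)
        // Only fall back to the fixed id when the variable is unset.
        .or_else(|| all_infra.then(|| ALL_INFRA_WORKER_ID.to_string()))
}

/// Order suggested in the review: all infra > env variable > none.
fn worker_id_all_infra_first(env_value: Option<&str>, all_infra: bool) -> Option<String> {
    if all_infra {
        // The flag alone decides; a leftover WORKER_ID is ignored.
        Some(ALL_INFRA_WORKER_ID.to_string())
    } else {
        env_value.map(str::to_string)
    }
}
```

The two functions only disagree when the flag is set and the variable is also set: the first keeps the variable, the second ignores it, which is what allows switching modes by changing a single flag.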

Contributor Author

I favored safety for regular use, but I'm fine with changing it.
Reactions are welcome to settle on a "definitive" choice; for now the plan is to go with @eckter's suggestion.

Contributor Author

@bougue-pe bougue-pe Oct 3, 2024

Ah, I remember: I also wanted to leave the worker_key name configurable, which is not consistent with the "all infra" naming here.

The "all" names describe the intended usage more than what the mode actually does or permits: no pre-load, and work attached to a single grouping key, no matter which infra(s) are used.
I'm not sure renaming is useful now or in the future, so I went with your suggestion in 302e4ef after the upvotes.

@bougue-pe
Contributor Author

I have not reviewed it yet, but I have a small question.

This feature seems useful for debugging purposes (see #8599). I believe we're talking about core? If so, I don't see why we should handle this feature using docker compose.

The compose file here is more of a helper/documentation to plug everything together, as editoast also needs some specific parameters.
I also tried to put the params/envvars in the best place (IMO 😅) so that they can be shared when reusing core in another compose file.

Contributor

@younesschrifi younesschrifi left a comment

LGTM! I haven't tested it yet. I'm waiting for your rebase to do so, and then I'll approve the PR 👍🏽

Contributor

@woshilapin woshilapin left a comment

Looks good, although, as discussed, it means we won't really exercise the full complexity of osrdyne locally, which might not help detect possible bugs in it. That said, I'm all for making the developer experience smoother.

⚠️ Not tested, since I still haven't found the time to deal with osrdyne, which depends on a Docker daemon that I don't have (I use podman, which is daemonless).

docker/docker-compose.host.yml
Comment on lines 87 to 54
pub async fn new_mq(
    uri: String,
    worker_pool_identifier: String,
    timeout: u64,
    single_worker: bool,
) -> Result<Self> {
Contributor

Would it make sense for this function to take an mq_client::Options directly as its input argument? I'm just triggered by the increasing number of arguments.

Contributor Author

I'm not convinced of the benefits: at the only call site, we would just embed the mq_client::Options creation, revealing more of the internals (and adding a little more code/imports).
I'm not strictly opposed to it, but not so fond of the idea here.

Contributor

I'll be helpful to settle the debate: I'm fine with both :p

Contributor

Is revealing the internals really a problem? It would make calling the function more explicit. Compare these, for example:

new_mq("foobar", "yolo", 42, false);

As a reader of the code, I would need to go look at the function to understand what each of these parameters means (which might not be trivial on GitHub, so I may need to pull the code to benefit from rust-analyzer).

Compared to this:

new_mq(mq_client::Options {
  uri: "foobar",
  worker_pool_identifier: "yolo",
  timeout: 42,
  single_worker: false
});

Sure, it's more verbose, but being more verbose and more explicit is not necessarily bad if you consider that code is read more often than it's written. In some languages, you can name the parameters when calling, but sadly Rust doesn't allow that (there has been discussion about it).

⚠️ This comment is long because I wanted to be explicit about why I tend to be averse to functions with a lot of arguments. For functions with only a few arguments, the name of the function is usually enough to understand what the 1 or 2 arguments are, without naming them. That said, 4 is not yet a lot of arguments, just enough to trigger me. So I'm fine with keeping it as it is; feel free to resolve whatever you decide.
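
For readers who want to see the options-struct variant end to end, here is a small self-contained sketch. It is not the editoast code: the `Options` fields mirror the parameters quoted in this thread, but `CoreClient`, the synchronous `new_mq`, the empty-URI check, and the placeholder values are all assumptions made for illustration.

```rust
/// Hypothetical options struct; the field names mirror the parameters quoted
/// above, but the real `mq_client::Options` in editoast may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct Options {
    pub uri: String,
    pub worker_pool_identifier: String,
    pub timeout: u64,
    pub single_worker: bool,
}

/// Stand-in for the client: it only stores its options and opens no connection.
#[derive(Debug)]
pub struct CoreClient {
    options: Options,
}

impl CoreClient {
    /// Simplified, synchronous stand-in for `new_mq` (the real one is async).
    pub fn new_mq(options: Options) -> Result<Self, String> {
        // Illustrative validation only, not taken from the PR.
        if options.uri.is_empty() {
            return Err("uri must not be empty".to_string());
        }
        Ok(Self { options })
    }

    pub fn options(&self) -> &Options {
        &self.options
    }
}

/// Call site: every value is labelled by its field name (placeholder values).
pub fn example_client() -> Result<CoreClient, String> {
    CoreClient::new_mq(Options {
        uri: "amqp://localhost:5672".to_string(),
        worker_pool_identifier: "core".to_string(),
        timeout: 180,
        single_worker: true,
    })
}
```

The trade-off discussed in the thread shows up directly: the call site is longer and must know about `Options`, but a reader no longer has to open the function to learn what `180` or `true` stand for.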

Contributor Author

@bougue-pe bougue-pe Oct 11, 2024

Contributor

@leovalais leovalais left a comment

Lgtm, not tested

@ElysaSrc
Member

ElysaSrc commented Oct 7, 2024

I do not have anything to say about the PR per se, but I'm curious as to why we would want to make this change the default behavior?

What's the rationale behind this?

@bougue-pe
Contributor Author

bougue-pe commented Oct 7, 2024

@ElysaSrc there is a bit more info in the related issue #8599 but you may already have had a look and I'm not sure about your question.

The main purpose is to ease debugging of core (mainly by not having to spawn a new core and track the id of the infra, which can be painful, like for integration tests).
The solution provided is OK for the business logic but hijacks part of the ops/architecture side, so it is limited in that regard (though a bit less than switching back to the web API).

The rationale for changing the noopdyne compose file is that using dyne as a noop is (almost?) only for that use case, so we might as well add the "changes" to the editoast and core configuration along the way to ease/document that case.
So the use case (which I didn't test completely, as I realized a few minutes ago) would be:

  1. compose up the whole stack
  2. down the component(s) that I want to debug/replace locally
  3. spawn locally those components
  4. <do actual test/use>

I hope that answers your question. It may be worth adding a bit of doc and description (I will try to after more testing, but any suggestion is welcome).

@ElysaSrc
Member

ElysaSrc commented Oct 7, 2024

Thanks for the detailed response!

Contributor

@Erashin Erashin left a comment

Tested for core debug purposes. Works as intended. Good job :)

Contributor

@Khoyo Khoyo left a comment

Great work!

Should we also rename noopdyne as single-worker? It wasn't a great name, and I mostly named it that way because it wasn't a single-worker mode ^^'

WORKER_ID = System.getenv("WORKER_ID")
WORKER_KEY = System.getenv("WORKER_KEY")
ALL_INFRA = !System.getenv("ALL_INFRA").isNullOrEmpty()
WORKER_ID = if (ALL_INFRA) "all_infra_worker" else System.getenv("WORKER_ID")
Contributor

pretty sure this will conflict, could you rebase?

Contributor Author

It did; I ended up refactoring a bit with a getBooleanEnvvar function: 52ffad3#diff-7f5b2e605ea8cd19fd586f8d299a533f45c5419298dc3b16aca328645dee196aR75

Comment on lines 69 to 78
environment:
# Should match the reference, see. ./docker/osrdyne.yml
CORE_EDITOAST_URL: "http://osrd-editoast"
JAVA_TOOL_OPTIONS: "-javaagent:/app/opentelemetry-javaagent.jar"
CORE_MONITOR_TYPE: "opentelemetry"
OTEL_EXPORTER_OTLP_TRACES_PROTOCOL: "grpc"
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT: "http://jaeger:4317"
OTEL_METRICS_EXPORTER: "none"
OTEL_LOGS_EXPORTER: "none"
Contributor

I think directly providing the reference in ./docker/osrdyne.yml is a better idea... Or at least make it clear that those are just for documentation, and that ./docker/osrdyne.yml is not the reference, but the values actually used.

Contributor Author

Also cleanup unavailable envvar OSRD_BACKEND_URL

Signed-off-by: Pierre-Etienne Bougué <[email protected]>
Signed-off-by: Pierre-Etienne Bougué <[email protected]>
@bougue-pe bougue-pe requested a review from a team as a code owner October 11, 2024 19:32
@github-actions github-actions bot added the area:front Work on Standard OSRD Interface modules label Oct 11, 2024
@bougue-pe
Contributor Author

bougue-pe commented Oct 11, 2024

Renamed in 2329db4 & d60f674:

  • osrd-compose.sh to host-compose.sh
  • docker-compose.noopdyne.yml to docker-compose.single-worker.yml

Also added a dedicated single-worker-compose.sh script to wrap this up with a bit more doc.

@bougue-pe bougue-pe requested a review from Khoyo October 11, 2024 19:40
Labels
area:core Work on Core Service area:editoast Work on Editoast Service area:front Work on Standard OSRD Interface modules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Core debug: allow one core to handle multiple infra
10 participants