Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor agent followup #700

Merged
merged 5 commits into from
Sep 4, 2024

Conversation

diconico07
Copy link
Contributor

@diconico07 diconico07 commented Jul 31, 2024

What this PR does / why we need it:
This PR adds some documentation for the agent's discovery handler as asked in #684 .

It also does a lot of cleanup related to the new slot reclaiming system: as the new slot reclaiming process doesn't rely on crictl nor the containerd socket, we can remove those from the agent's definition, it also removes the need for specifying the kubernetes distribution during installation.

Special notes for your reviewer:
Documentation PR: project-akri/akri-docs#113

If applicable:

  • this PR has an associated PR with documentation in akri-docs
  • this PR contains unit tests
  • added code adheres to standard Rust formatting (cargo fmt)
  • code builds properly (cargo build)
  • code is free of common mistakes (cargo clippy)
  • all Akri tests succeed (cargo test)
  • inline documentation builds (cargo doc)
  • all commits pass the DCO bot check by being signed off -- see the failing DCO check for instructions on how to retroactively sign commits

Copy link
Contributor

@kate-goldenring kate-goldenring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very exciting to see the crictl bits removed 🎉 ! We will need to update our docs and close some security issues after this.

I had some suggestions and questions on the diagrams

agent/Cargo.toml Outdated Show resolved Hide resolved
break on Discovery Handler connection error
Registration endpoint -x DiscoveryHandlerRegistry: close endpoint
DiscoveryHandlerRegistry ->> DiscoveryHandlerRegistry: Remove endpoint from registered handlers list
note over DiscoveryHandlerRequest,Discovery Handler: The DiscoveryHandlerRequest request will handle termination by itself
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it means that we do not propagate the fact the discovery handler is not reachable as every active request will notice that by themselves and handle this.

sequenceDiagram
Configuration Controller -) DiscoveryHandlerRegistry: new_request()
alt a Handler exists for the Request
DiscoveryHandlerRegistry ->> DiscoveryHandlerRequest: Creates with filtered list of endpoints
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"endpoints" is a little overloaded here. Is this DH endpoints to device filtering that is passed to the DH as discovery properties?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at the DHRequestImpl structure which defines endpoints as discovered devices but i dont think that is what this diagram is referring to by "endpoints"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, there is a bit of type passing here, the DHRequestImpl defines an endpoint as a receiver for discovered devices, that is the output stream from a discovery handler's endpoint connection.
So endpoint for DHRequestImpl is an open connection to a discovery handler, and in here to create that we pass the list of discovery handlers endpoints that have the right name (hence filtered list)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we reword this, then, to clarify what filtered means. For example "Creates a discover connection with each DH for a Configuration"

DiscoveryHandlerRegistry ->> DiscoveryHandlerRequest: Creates with filtered list of endpoints
DiscoveryHandlerRegistry ->> DiscoveryHandlerRegistry: Add request to tracked request list
loop over DiscoveryHandlerEndpoints with this name
DiscoveryHandlerRequest ->>+ Kubernetes API: Solve discovery properties
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same point as my previous comment about reqording this step

DiscoveryHandlerRequest -) Configuration Controller: Requests reconciliation of Configuration linked to Request
note over Configuration Controller: The following is Configuration Controller behavior
activate Configuration Controller
Configuration Controller ->> Configuration Controller: Reconcile Configuration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does Reconcile Configuration mean? Is this where we are making sure that the appropriate set of Akri Instances exists?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just what the configuration controller do, so ensuring there is a request for the configuration and that the appropriate set of Instances exists.
At first I didn't want to put it in that diagram as it is outside of the discovery handler manager, but did it in the end to highlight the get_request() and get_instances() use.

@@ -67,12 +67,6 @@ spec:
- name: DEBUG_ECHO_INSTANCES_SHARED
value: {{ .Values.debugEcho.configuration.shared | quote }}
{{- end }}
- name: HOST_CRICTL_PATH
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woohooo 🎉

@diconico07 diconico07 force-pushed the refactor-agent-followup branch from 1aff5b6 to 51855d1 Compare September 4, 2024 12:32
diconico07 and others added 4 commits September 4, 2024 14:34
New slot reconciliation system doesn't need crictl anymore,
removing everything related to that

Signed-off-by: Nicolas Belouin <[email protected]>
Signed-off-by: Kate Goldenring <[email protected]>
Signed-off-by: Nicolas Belouin <[email protected]>

Co-authored-by: Kate Goldenring <[email protected]>
Signed-off-by: Nicolas Belouin <[email protected]>
@diconico07 diconico07 force-pushed the refactor-agent-followup branch from 51855d1 to d8e50ad Compare September 4, 2024 12:35
@diconico07 diconico07 merged commit 7c8242a into project-akri:main Sep 4, 2024
33 checks passed
@diconico07 diconico07 deleted the refactor-agent-followup branch September 4, 2024 13:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants