Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add deployment of nvidia nim #803

Draft
wants to merge 3 commits into
base: mlp-base-dev
Choose a base branch
from

Conversation

laurentgrangeau
Copy link

Adding deployment of NVIDIA NIMs on the ML platform

@laurentgrangeau laurentgrangeau marked this pull request as draft September 9, 2024 16:30
@laurentgrangeau laurentgrangeau marked this pull request as ready for review September 9, 2024 17:02
@laurentgrangeau laurentgrangeau marked this pull request as draft September 9, 2024 17:03
@arueth
Copy link
Collaborator

arueth commented Sep 9, 2024

For the repo, we try and follow the Terraform on Google Cloud: Best practices for general style and structure, could you take a look at it and update the PR. Mainly around the Adopt a naming convention and using underscore to delimit multiple words.

Although not required, it is good practice to break out standard attributes and block attributes as well as alphabetize attributes in resources and data sources to make merges and updates from multiple people cleaner.

Also please ensure that you run terraform fmt on the code as this will be enforce when merging to main.

Copy link
Collaborator

@arueth arueth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also see my prior comment.

@@ -0,0 +1,3 @@
output "inference-url" {
value = ""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this isn't going to be populated with a value, can you remove it?


resource "kubernetes_storage_class" "name" {
metadata {
name = "hyperdisk-ml"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is HDML required or could another StorageClass also be used?

@arueth
Copy link
Collaborator

arueth commented Sep 10, 2024

Could you also include a versions.tf that includes the pinned provider versions.

@arueth
Copy link
Collaborator

arueth commented Sep 10, 2024

You will also need to add the license header to your .tf files.

@roberthbailey
Copy link
Collaborator

How different is this from the nim guide that we merged into https://github.com/GoogleCloudPlatform/ai-on-gke/tree/main/tutorials-and-examples/nvidia-nim?

@arueth
Copy link
Collaborator

arueth commented Sep 11, 2024

Looks to be similar, except done with Terraform. We are consuming this in the ml-platform, so it will be referenced internally primarily.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants