Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat[frontend]: implement artifact-repositories configmap support #11354

Merged
merged 2 commits into from
Nov 20, 2024

Conversation

droctothorpe
Copy link
Contributor

@droctothorpe droctothorpe commented Nov 5, 2024

Description of your changes:

Argo Workflows supports the ability to specify an artifact repository for archived logs uniquely for each namespace rather than just globally as documented here.

This PR adds support for this functionality to the KFP UI. If a workflow runs and...

  1. The corresponding pods are deleted
  2. The corresponding workflow is deleted
  3. There is an artifact-repositories configmap in the target namespace that provides a unique, namespaced keyFormat

...this PR ensures that logs are still accessible in the UI. This functionality is disabled by default to avoid unnecessary network calls (for end users who don't leverage namespace-specific artifact-repositories). It can be toggled on through configs.ts or an environment variable.

Co-authored with @quinnovator 🌟 .

Fixes: #11339

Checklist:

Signed-off-by: droctothorpe <[email protected]>
Co-authored-by: quinnovator <[email protected]>
@github-actions github-actions bot added ci-passed All CI tests on a pull request have passed and removed ci-passed All CI tests on a pull request have passed labels Nov 5, 2024
@github-actions github-actions bot added the ci-passed All CI tests on a pull request have passed label Nov 5, 2024
@HumairAK HumairAK self-assigned this Nov 6, 2024
@droctothorpe
Copy link
Contributor Author

droctothorpe commented Nov 6, 2024

@HumairAK to recreate the edge case:

  1. Deploy KFP to a local K8s cluster.
  2. Set ARGO_ARCHIVE_LOGS=true in configs.ts.
  3. cd into the frontend directory.
  4. Run NAMESPACE=kubeflow npm run start:proxy-and-server to start the UI on your host machine.
  5. Create the following configmap in the kubeflow namespace:
metadata:
  annotations:
    workflows.argoproj.io/default-artifact-repository: artifact-repositories
  name: artifact-repositories
  namespace: kubeflow
data:
  artifact-repositories: |-
    archiveLogs: true
    s3:
      accessKeySecret:
        key: accesskey
        name: mlpipeline-minio-artifact
      bucket: mlpipeline
      endpoint: minio-service.kubeflow:9000
      insecure: true
      keyFormat: foo
      secretKeySecret:
        key: secretkey
        name: mlpipeline-minio-artifact
kind: ConfigMap
  1. Run a simple pipeline.
  2. Once it completes, delete the AWF workflow custom resource associated with the run.
  3. Try to access the logs of the run in the UI running on your host machine. On master, this operation will fail. On this branch, it will succeed.
  4. The next few steps are optional. Port-forward the minio pod.
  5. Navigate to the minio UI.
  6. Login (username: minio, password: minio123).
  7. Note how the logs are stored in mlpipeline/foo rather than the default keyFormat specified in configs.ts. Argo grants precedence to the keyFormat specified in the artifact-repositories configmap over the keyFormat specified in workflow-controller-configmap.

@droctothorpe
Copy link
Contributor Author

droctothorpe commented Nov 8, 2024

I updated the instructions to reproduce to edge case to make manual validation a lot easier by running the UI on your host machine.

@HumairAK
Copy link
Collaborator

HumairAK commented Nov 13, 2024

Hey @droctothorpe /@quinnovator, thanks for this. I'm trying this out now and encountering some hiccups. I suspect it might be my environment/setup, but I want to confirm with you first.

  1. When buildling the image and trying this out with a pod deployment of the UI, I'm not seeing this working, what I see:

image

the UI yaml is as posted below:

deployment.yaml
kind: Deployment
apiVersion: apps/v1
metadata:
  annotations:
    deployment.kubernetes.io/revision: '3'
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"labels":{"app":"ml-pipeline-ui","application-crd-id":"kubeflow-pipelines"},"name":"ml-pipeline-ui","namespace":"kubeflow"},"spec":{"selector":{"matchLabels":{"app":"ml-pipeline-ui","application-crd-id":"kubeflow-pipelines"}},"template":{"metadata":{"annotations":{"cluster-autoscaler.kubernetes.io/safe-to-evict":"true"},"labels":{"app":"ml-pipeline-ui","application-crd-id":"kubeflow-pipelines"}},"spec":{"containers":[{"env":[{"name":"DISABLE_GKE_METADATA","value":"true"},{"name":"VIEWER_TENSORBOARD_POD_TEMPLATE_SPEC_PATH","value":"/etc/config/viewer-pod-template.json"},{"name":"MINIO_NAMESPACE","valueFrom":{"fieldRef":{"fieldPath":"metadata.namespace"}}},{"name":"MINIO_ACCESS_KEY","valueFrom":{"secretKeyRef":{"key":"accesskey","name":"mlpipeline-minio-artifact"}}},{"name":"MINIO_SECRET_KEY","valueFrom":{"secretKeyRef":{"key":"secretkey","name":"mlpipeline-minio-artifact"}}},{"name":"ALLOW_CUSTOM_VISUALIZATIONS","value":"true"},{"name":"FRONTEND_SERVER_NAMESPACE","valueFrom":{"fieldRef":{"fieldPath":"metadata.namespace"}}}],"image":"gcr.io/ml-pipeline/frontend:2.3.0","imagePullPolicy":"IfNotPresent","livenessProbe":{"exec":{"command":["wget","-q","-S","-O","-","http://localhost:3000/apis/v1beta1/healthz"]},"initialDelaySeconds":3,"periodSeconds":5,"timeoutSeconds":2},"name":"ml-pipeline-ui","ports":[{"containerPort":3000}],"readinessProbe":{"exec":{"command":["wget","-q","-S","-O","-","http://localhost:3000/apis/v1beta1/healthz"]},"initialDelaySeconds":3,"periodSeconds":5,"timeoutSeconds":2},"resources":{"requests":{"cpu":"10m","memory":"70Mi"}},"volumeMounts":[{"mountPath":"/etc/config","name":"config-volume","readOnly":true}]}],"serviceAccountName":"ml-pipeline-ui","volumes":[{"configMap":{"name":"ml-pipeline-ui-configmap"},"name":"config-volume"}]}}}}
  resourceVersion: '627852'
  name: ml-pipeline-ui
  uid: 4d454d73-05f6-4b01-b744-8a4c9aec05ec
  creationTimestamp: '2024-11-12T22:37:37Z'
  generation: 8
  namespace: kubeflow
  labels:
    app: ml-pipeline-ui
    application-crd-id: kubeflow-pipelines
spec:
  replicas: 0
  selector:
    matchLabels:
      app: ml-pipeline-ui
      application-crd-id: kubeflow-pipelines
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: ml-pipeline-ui
        application-crd-id: kubeflow-pipelines
      annotations:
        cluster-autoscaler.kubernetes.io/safe-to-evict: 'true'
    spec:
      restartPolicy: Always
      serviceAccountName: ml-pipeline-ui
      schedulerName: default-scheduler
      terminationGracePeriodSeconds: 30
      securityContext: {}
      containers:
        - resources:
            requests:
              cpu: 10m
              memory: 70Mi
          readinessProbe:
            exec:
              command:
                - wget
                - '-q'
                - '-S'
                - '-O'
                - '-'
                - 'http://localhost:3000/apis/v1beta1/healthz'
            initialDelaySeconds: 3
            timeoutSeconds: 2
            periodSeconds: 5
            successThreshold: 1
            failureThreshold: 3
          terminationMessagePath: /dev/termination-log
          name: ml-pipeline-ui
          livenessProbe:
            exec:
              command:
                - wget
                - '-q'
                - '-S'
                - '-O'
                - '-'
                - 'http://localhost:3000/apis/v1beta1/healthz'
            initialDelaySeconds: 3
            timeoutSeconds: 2
            periodSeconds: 5
            successThreshold: 1
            failureThreshold: 3
          env:
            - name: DISABLE_GKE_METADATA
              value: 'true'
            - name: VIEWER_TENSORBOARD_POD_TEMPLATE_SPEC_PATH
              value: /etc/config/viewer-pod-template.json
            - name: MINIO_NAMESPACE
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
            - name: MINIO_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: mlpipeline-minio-artifact
                  key: accesskey
            - name: MINIO_SECRET_KEY
              valueFrom:
                secretKeyRef:
                  name: mlpipeline-minio-artifact
                  key: secretkey
            - name: ALLOW_CUSTOM_VISUALIZATIONS
              value: 'true'
            - name: FRONTEND_SERVER_NAMESPACE
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
            - name: ARGO_ARTIFACT_REPOSITORIES_LOOKUP
              value: 'true'
            - name: ARGO_ARCHIVE_LOGS
              value: 'true'
          ports:
            - containerPort: 3000
              protocol: TCP
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - name: config-volume
              readOnly: true
              mountPath: /etc/config
          terminationMessagePolicy: File
          image: 'quay.io/hukhan/ds-pipelines-frontend:11354'
      serviceAccount: ml-pipeline-ui
      volumes:
        - name: config-volume
          configMap:
            name: ml-pipeline-ui-configmap
            defaultMode: 420
      dnsPolicy: ClusterFirst
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%
      maxSurge: 25%
  revisionHistoryLimit: 10
  progressDeadlineSeconds: 600
status:
  observedGeneration: 8
  conditions:
    - type: Progressing
      status: 'True'
      lastUpdateTime: '2024-11-13T04:36:07Z'
      lastTransitionTime: '2024-11-12T22:37:37Z'
      reason: NewReplicaSetAvailable
      message: ReplicaSet "ml-pipeline-ui-5c5479dc67" has successfully progressed.
    - type: Available
      status: 'True'
      lastUpdateTime: '2024-11-13T22:54:29Z'
      lastTransitionTime: '2024-11-13T22:54:29Z'
      reason: MinimumReplicasAvailable
      message: Deployment has minimum availability.

And the artifact repository configmap:

kind: ConfigMap
apiVersion: v1
metadata:
  name: artifact-repositories
  namespace: kubeflow
  # edit: forgot to add this
  annotations:
    workflows.argoproj.io/default-artifact-repository: artifact-repositories
data:
  artifact-repositories: |-
    archiveLogs: true
    s3:
      accessKeySecret:
        key: accesskey
        name: mlpipeline-minio-artifact
      bucket: mlpipeline
      endpoint: minio-service.kubeflow:9000
      insecure: true
      keyFormat: custom_location/{{workflow.name}}/{{workflow.creationTimestamp.Y}}/{{workflow.creationTimestamp.m}}/{{workflow.creationTimestamp.d}}/{{pod.name}}
      secretKeySecret:
        key: secretkey
        name: mlpipeline-minio-artifact

As you can see I have set ARGO_ARTIFACT_REPOSITORIES_LOOKUP and ARGO_ARCHIVE_LOGS to true. I can also confirm it's got the PR changes by viewing the UI pod container logs, the UI config has:

{
argo: {
archiveArtifactory: 'minio',
archiveBucketName: 'mlpipeline',
archiveLogs: true,
keyFormat: 'artifacts/{{workflow.name}}/{{workflow.creationTimestamp.Y}}/{{workflow.creationTimestamp.m}}/{{workflow.creationTimestamp.d}}/{{pod.name}}',
artifactRepositoriesLookup: true      

^^^^^ PR change
  1. When trying this locally, I see some strange behavior where one of the task nodes is fetching the logs from object store, but not the other task. In some scenarios both fail to fetch it. My suspicion is there is some weird timeout behavior that's faulty with connecting to object store on my side, but wondering if you encounter this behavior at all:

logs-not-pulled

The pipeline used:

pipeline.yaml
# PIPELINE DEFINITION
# Name: tutorial-data-passing-2
# Inputs:
#    message: str [Default: 'message']
components:
  comp-preprocess:
    executorLabel: exec-preprocess
    inputDefinitions:
      parameters:
        message:
          parameterType: STRING
    outputDefinitions:
      artifacts:
        output_dataset_one:
          artifactType:
            schemaTitle: system.Dataset
            schemaVersion: 0.0.1
        output_dataset_two_path:
          artifactType:
            schemaTitle: system.Dataset
            schemaVersion: 0.0.1
      parameters:
        output_bool_parameter_path:
          parameterType: BOOLEAN
        output_dict_parameter_path:
          parameterType: STRUCT
        output_list_parameter_path:
          parameterType: LIST
        output_parameter_path:
          parameterType: STRING
  comp-train:
    executorLabel: exec-train
    inputDefinitions:
      artifacts:
        dataset_one_path:
          artifactType:
            schemaTitle: system.Dataset
            schemaVersion: 0.0.1
        dataset_two:
          artifactType:
            schemaTitle: system.Dataset
            schemaVersion: 0.0.1
      parameters:
        input_bool:
          parameterType: BOOLEAN
        input_dict:
          parameterType: STRUCT
        input_list:
          parameterType: LIST
        message:
          parameterType: STRING
        num_steps:
          defaultValue: 100.0
          isOptional: true
          parameterType: NUMBER_INTEGER
    outputDefinitions:
      artifacts:
        model:
          artifactType:
            schemaTitle: system.Model
            schemaVersion: 0.0.1
deploymentSpec:
  executors:
    exec-preprocess:
      container:
        args:
        - --executor_input
        - '{{$}}'
        - --function_to_execute
        - preprocess
        command:
        - sh
        - -c
        - "\nif ! [ -x \"$(command -v pip)\" ]; then\n    python3 -m ensurepip ||\
          \ python3 -m ensurepip --user || apt-get install python3-pip\nfi\n\nPIP_DISABLE_PIP_VERSION_CHECK=1\
          \ python3 -m pip install --quiet --no-warn-script-location 'kfp==2.3.0'\
          \ '--no-deps' 'typing-extensions>=3.7.4,<5; python_version<\"3.9\"' && \"\
          $0\" \"$@\"\n"
        - sh
        - -ec
        - 'program_path=$(mktemp -d)


          printf "%s" "$0" > "$program_path/ephemeral_component.py"

          _KFP_RUNTIME=true python3 -m kfp.dsl.executor_main                         --component_module_path                         "$program_path/ephemeral_component.py"                         "$@"

          '
        - "\nimport kfp\nfrom kfp import dsl\nfrom kfp.dsl import *\nfrom typing import\
          \ *\n\ndef preprocess(\n    # An input parameter of type string.\n    message:\
          \ str,\n    # Use Output[T] to get a metadata-rich handle to the output\
          \ artifact\n    # of type `Dataset`.\n    output_dataset_one: Output[Dataset],\n\
          \    # A locally accessible filepath for another output artifact of type\n\
          \    # `Dataset`.\n    output_dataset_two_path: OutputPath('Dataset'),\n\
          \    # A locally accessible filepath for an output parameter of type string.\n\
          \    output_parameter_path: OutputPath(str),\n    # A locally accessible\
          \ filepath for an output parameter of type bool.\n    output_bool_parameter_path:\
          \ OutputPath(bool),\n    # A locally accessible filepath for an output parameter\
          \ of type dict.\n    output_dict_parameter_path: OutputPath(Dict[str, int]),\n\
          \    # A locally accessible filepath for an output parameter of type list.\n\
          \    output_list_parameter_path: OutputPath(List[str]),\n):\n    \"\"\"\
          Dummy preprocessing step.\"\"\"\n\n    # Use Dataset.path to access a local\
          \ file path for writing.\n    # One can also use Dataset.uri to access the\
          \ actual URI file path.\n    with open(output_dataset_one.path, 'w') as\
          \ f:\n        f.write(message)\n\n    # OutputPath is used to just pass\
          \ the local file path of the output artifact\n    # to the function.\n \
          \   with open(output_dataset_two_path, 'w') as f:\n        f.write(message)\n\
          \n    with open(output_parameter_path, 'w') as f:\n        f.write(message)\n\
          \n    with open(output_bool_parameter_path, 'w') as f:\n        f.write(\n\
          \            str(True))  # use either `str()` or `json.dumps()` for bool\
          \ values.\n\n    import json\n    with open(output_dict_parameter_path,\
          \ 'w') as f:\n        f.write(json.dumps({'A': 1, 'B': 2}))\n\n    with\
          \ open(output_list_parameter_path, 'w') as f:\n        f.write(json.dumps(['a',\
          \ 'b', 'c']))\n\n"
        image: quay.io/opendatahub/ds-pipelines-ci-executor-image:v1.0
    exec-train:
      container:
        args:
        - --executor_input
        - '{{$}}'
        - --function_to_execute
        - train
        command:
        - sh
        - -c
        - "\nif ! [ -x \"$(command -v pip)\" ]; then\n    python3 -m ensurepip ||\
          \ python3 -m ensurepip --user || apt-get install python3-pip\nfi\n\nPIP_DISABLE_PIP_VERSION_CHECK=1\
          \ python3 -m pip install --quiet --no-warn-script-location 'kfp==2.3.0'\
          \ '--no-deps' 'typing-extensions>=3.7.4,<5; python_version<\"3.9\"' && \"\
          $0\" \"$@\"\n"
        - sh
        - -ec
        - 'program_path=$(mktemp -d)


          printf "%s" "$0" > "$program_path/ephemeral_component.py"

          _KFP_RUNTIME=true python3 -m kfp.dsl.executor_main                         --component_module_path                         "$program_path/ephemeral_component.py"                         "$@"

          '
        - "\nimport kfp\nfrom kfp import dsl\nfrom kfp.dsl import *\nfrom typing import\
          \ *\n\ndef train(\n    # Use InputPath to get a locally accessible path\
          \ for the input artifact\n    # of type `Dataset`.\n    dataset_one_path:\
          \ InputPath('Dataset'),\n    # Use Input[T] to get a metadata-rich handle\
          \ to the input artifact\n    # of type `Dataset`.\n    dataset_two: Input[Dataset],\n\
          \    # An input parameter of type string.\n    message: str,\n    # Use\
          \ Output[T] to get a metadata-rich handle to the output artifact\n    #\
          \ of type `Model`.\n    model: Output[Model],\n    # An input parameter\
          \ of type bool.\n    input_bool: bool,\n    # An input parameter of type\
          \ dict.\n    input_dict: Dict[str, int],\n    # An input parameter of type\
          \ List[str].\n    input_list: List[str],\n    # An input parameter of type\
          \ int with a default value.\n    num_steps: int = 100,\n):\n    \"\"\"Dummy\
          \ Training step.\"\"\"\n    with open(dataset_one_path, 'r') as input_file:\n\
          \        dataset_one_contents = input_file.read()\n\n    with open(dataset_two.path,\
          \ 'r') as input_file:\n        dataset_two_contents = input_file.read()\n\
          \    print(\"test3\")\n    line = (f'dataset_one_contents: {dataset_one_contents}\
          \ || '\n            f'dataset_two_contents: {dataset_two_contents} || '\n\
          \            f'message: {message} || '\n            f'input_bool: {input_bool},\
          \ type {type(input_bool)} || '\n            f'input_dict: {input_dict},\
          \ type {type(input_dict)} || '\n            f'input_list: {input_list},\
          \ type {type(input_list)} \\n')\n\n    with open(model.path, 'w') as output_file:\n\
          \        for i in range(num_steps):\n            output_file.write('Step\
          \ {}\\n{}\\n=====\\n'.format(i, line))\n\n    # model is an instance of\
          \ Model artifact, which has a .metadata dictionary\n    # to store arbitrary\
          \ metadata for the output artifact.\n    model.metadata['accuracy'] = 0.9\n\
          \n"
        image: quay.io/opendatahub/ds-pipelines-ci-executor-image:v1.0
pipelineInfo:
  name: tutorial-data-passing-2
root:
  dag:
    tasks:
      preprocess:
        cachingOptions:
          enableCache: true
        componentRef:
          name: comp-preprocess
        inputs:
          parameters:
            message:
              componentInputParameter: message
        taskInfo:
          name: preprocess
      train:
        cachingOptions:
          enableCache: true
        componentRef:
          name: comp-train
        dependentTasks:
        - preprocess
        inputs:
          artifacts:
            dataset_one_path:
              taskOutputArtifact:
                outputArtifactKey: output_dataset_one
                producerTask: preprocess
            dataset_two:
              taskOutputArtifact:
                outputArtifactKey: output_dataset_two_path
                producerTask: preprocess
          parameters:
            input_bool:
              taskOutputParameter:
                outputParameterKey: output_bool_parameter_path
                producerTask: preprocess
            input_dict:
              taskOutputParameter:
                outputParameterKey: output_dict_parameter_path
                producerTask: preprocess
            input_list:
              taskOutputParameter:
                outputParameterKey: output_list_parameter_path
                producerTask: preprocess
            message:
              taskOutputParameter:
                outputParameterKey: output_parameter_path
                producerTask: preprocess
        taskInfo:
          name: train
  inputDefinitions:
    parameters:
      message:
        defaultValue: message
        isOptional: true
        parameterType: STRING
schemaVersion: 2.1.0
sdkVersion: kfp-2.3.0

So question for you:

  1. Did you have this working with a built image and running in a pod? If so, I suspect I must have something configured incorrectly then.
  2. Does the above pipeline work fine for you when you juggle between either task's pod logs (after deleting the underlying WF) ? (or any other pipeline with more than one task node with logs)

@droctothorpe
Copy link
Contributor Author

droctothorpe commented Nov 14, 2024

Thank you so much for taking the time to validate this, @HumairAK, and for being so thorough with documenting the issue that you encountered.

I'm going to try a couple of things to see if I can repro your results:

  1. Run the UI on host (using NAMESPACE=kubeflow npm run start:proxy-and-server), validate that I can toggle between multiple tasks.
  2. Run the UI on host, use your pipeline.
  3. Same as above but also use your configmap.
  4. Same as above but build and deploy the UI image instead of running it on the host.

Results:

  1. Run the UI on host, validate that I can toggle between multiple tasks.
    Worked.
image image
  1. Run the UI on host, use your pipeline.
    Worked.
image image
  1. Same as above but also use your configmap.

I think I see the problem. I did a comparison of the two configmaps:

image

Your configmap is missing the workflows.argoproj.io/default-artifact-repository: artifact-repositories annotation. From the AWF documentation:
image

Can you run the same test but make sure to include that annotation in your configmap?

Thank you! 🙏

@HumairAK
Copy link
Collaborator

Ah yeah I actually did include the annotation, when trying to prune the configmap of runtime information for the github post, I accidentally left it out.

I see that in your configmap your path is set to keyFormat: foo which makes it seem like it will just overwrite the main.log for each task, but in your (1) and (2), the logs are different. So I'm wondering if you cleaned up the underlying Workflows for those tests? With the given configmap I would have expected the logs view to show the same logs.

@droctothorpe
Copy link
Contributor Author

Confirming that the workflows were deleted.

image

I see that in your configmap your path is set to keyFormat: foo which makes it seem like it will just overwrite the main.log for each task, but in your (1) and (2), the logs are different. With the given configmap I would have expected the logs view to show the same logs.

I actually had the keyFormat field in my configmap set to foo/artifacts/{{workflow.name}}/{{workflow.creationTimestamp.Y}}/{{workflow.creationTimestamp.m}}/{{workflow.creationTimestamp.d}}/{{pod.name}} from an earlier test where I wanted to confirm that the string replacement was happening properly. That's why the logs were different.

I changed keyFormat to a static string ("foo") to see what happens. Confirming that the logs are in fact identical for differing components in that scenario.

I'm still unable to repro the failure scenario. I'll try to build and deploy the image instead of running it on the host next.

@HumairAK
Copy link
Collaborator

HumairAK commented Nov 16, 2024

AHA found the reason:

k8serr:Could not get configMap artifact-repositories in namespace kubeflow: configmaps "artifact-repositories" is forbidden: User "system:serviceaccount:kubeflow:ml-pipeline-ui" cannot get resource "configmaps" in API group "" in the namespace "kubeflow", additional info: [object Object]

we'll need to update the mlpipeline ui role for this

and we should probably surface this error so it's easier to find next time

Comment on lines 174 to 178
const [configMap] = await getConfigMap('artifact-repositories', namespace);
if (configMap === undefined) {
// If there is no artifact-repositories configmap, return undefined. The
// caller will just use keyFormat as specified in configs.ts.
return undefined;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's surface the k8serr from getConfigmap, and report that to console so it can help with troubleshooting if this part fails for whatever reasons (e.g. rbac is insufficient)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, @HumairAK! I pushed up a fix. Here's some manual validation evidence as well:

Workflow deleted.
image

Logs still available:
image

Configmap deleted:
image

Logs not available. Error surfacing in UI pod:
image

Is surfacing the error in the pod sufficient? We want to continue execution in case the default path specified by ARGO_KEYFORMAT in configs.ts works.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The failing CI check seems unrelated to the code.

Signed-off-by: droctothorpe <[email protected]>
@HumairAK
Copy link
Collaborator

/lgtm
/approve

let me retry running the ci

@google-oss-prow google-oss-prow bot added the lgtm label Nov 20, 2024
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: HumairAK

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow bot merged commit 467f30c into kubeflow:master Nov 20, 2024
15 of 16 checks passed
@HumairAK
Copy link
Collaborator

@droctothorpe are you guys willing to provide some supplementing docs for this functionality? I'm thinking https://www.kubeflow.org/docs/components/pipelines/operator-guides/ here makes the most sense

@droctothorpe
Copy link
Contributor Author

Thanks for the review and merge, @HumairAK! Happy to add some supplemental docs. Would you prefer for it to be in a standalone page or appended to https://www.kubeflow.org/docs/components/pipelines/operator-guides/configure-object-store/?

@HumairAK
Copy link
Collaborator

I think we can create a standalone page on logging. Especially since we're planning future work on that front, there will probably be more to add later. WDYT?

@droctothorpe droctothorpe deleted the configmap branch November 21, 2024 16:20
@droctothorpe
Copy link
Contributor Author

I didn't forget about this, @HumairAK! We just really slow down in Dec, heh. My only reservation wrt tackling it now is if we turn logs into first class artifacts, which I plan to start on very soon, the logging documentation will no longer be necessary / relevant.

@HumairAK
Copy link
Collaborator

HumairAK commented Jan 16, 2025

hrmm good point
I don't imagine we'd remove this capability once we switch log artifact, as that would be api breaking, correct? In which case, maybe providing docs for the user to pick/choose what they prefer maybe beneficial, wdyt? or would that add to confusion? (I don't have a strong opinion here)

@droctothorpe
Copy link
Contributor Author

Unless someone is calling the backend for the frontend directly, it's hard to imagine a scenario in which an end user would be impacted by this. Maybe it's just lack of imagination on my part though, heh. Users do the darndest things. Once logs are proper artifacts, a user would have to be out of their mind to not use that pattern. This sequential guess work on the UIs part is terrible, inconsistent, and brittle.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved ci-passed All CI tests on a pull request have passed lgtm size/L
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[frontend] The UI does not account for the artifact-repositories configmap
2 participants