Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Core feature] Pyflyte 'task with different structure' should show delta #4762

Closed
2 tasks done
katrogan opened this issue Jan 23, 2024 · 9 comments
Closed
2 tasks done
Assignees
Labels
enhancement New feature or request flytekit FlyteKit Python related issue good first issue Good for newcomers

Comments

@katrogan
Copy link
Contributor

katrogan commented Jan 23, 2024

Motivation: Why do you think this is important?

It's difficult as a user when I try to register Flyte inventory and get back

details: task with different structure already exists with id resource_type:TASK project:"system" domain:"production" name:"task-name" version:"1234" 

but I can't tell what's changed in my local version of the task.

Since we already store the registered task template and have the current serialized version of the task template we're trying to register, it would be super slick if flytekit (pyflyte) could compute what parameters in the template changed across registration and highlight that to the end user. (same for workflows and launch plans)

Goal: What should the final outcome look like, ideally?

Should be straightforward to identify what has changed across registration versions when the back-end blocks a re-registration attempt

Describe alternatives you've considered

Current error response code user experience

Propose: Link/Inline OR Additional context

Much like we use json marshalling to compute the hash, we could just json serialization to compute the delta here.

No response

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@katrogan katrogan added enhancement New feature or request untriaged This issues has not yet been looked at by the Maintainers labels Jan 23, 2024
@katrogan
Copy link
Contributor Author

cc @mhotan

Copy link

dosubot bot commented Jan 23, 2024

🤖

Thank you for your feature request. I understand the need for a more detailed error message when there's a conflict in task structure during registration. This would indeed improve the user experience by making it easier to identify what has changed across registration versions.

Currently, when there is a conflict in task structure during registration, the Flyte codebase generates an error message using the CompileError struct and its associated methods in the errors package. The error message could be something like MismatchingInterfaces and is generated by calling a specific function like NewMismatchingInterfacesErr("", ""). You can find more details about this in the compiler_error_test.go file.

The task registration process involves creating a TaskTemplate message as defined in the tasks.proto file. This template includes various fields like id, type, metadata, interface, custom, target, task_type_version, security_context, extended_resources, and config that define the task.

The comparison of task structures during the registration process is not explicitly handled in the Flyte codebase based on the provided context. However, the serialization of task templates for the purpose of comparison during the registration process is handled using the Protocol Buffers (protobuf) library. The protobuf library provides methods like XXX_Marshal, XXX_Unmarshal, XXX_Merge, XXX_Size, and XXX_DiscardUnknown for serializing and deserializing the data. You can find more details about this in the tasks.pb.go file.

To implement your feature request, we could enhance the error handling to include a comparison of the current and previous versions of the task template. This could be done by serializing both versions using the protobuf library and then comparing the serialized data. If there are any differences, these could be highlighted in the error message. This would require changes to the CompileError struct and its associated methods to include the comparison results in the error message.

I hope this provides some insight into how your feature request could be implemented. If you have any further questions or need more clarification, please let me know.

Sources


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

@eapolinario eapolinario added good first issue Good for newcomers flytekit FlyteKit Python related issue and removed untriaged This issues has not yet been looked at by the Maintainers labels Jan 25, 2024
@hampusrosvall
Copy link
Contributor

hampusrosvall commented Feb 16, 2024

You can assign this to me, pointers on how to get started is very much appreciated!

@austin362667
Copy link
Contributor

austin362667 commented Feb 20, 2024

Hi, @katrogan

Wanna making sure of few things before starting a PR. Thank you~

1. diff() on what?

compute what parameters in the template changed across registration

By parameters, did you mean like the task / workflow / launch plan source code itself (different code structure but with the same id) OR in container.args dict from registration request bad rpc response e.g.,

Input Request:

{ 
  "id": { 
    "resourceType": "TASK", 
    "project": "flytesnacks",
    "domain": "development",
    "name": "workflow.say_hello",
    "version": "1234"
  },
  "spec": {
    "template": {
      "id": {
        "resourceType": "TASK",
        "project": "flytesnacks",
        "domain": "development",
        "name": "workflow.say_hello",
        "version": "1234"
      },
      "type": "python-task",
      "metadata": {
        "runtime": {
          "type": "FLYTE_SDK",
          "version": "1.10.7",
          "flavor": "python"
        },
        "retries": {}
      },
      "interface": {
        "inputs": {},
        "outputs": {
          "variables": {
            "o0": {
              "type": {
                "simple": "STRING"
              },
              "description": "o0"
            }
          }
        }
      },
      "container": {
        "image": "basics:v1",
        "args": [
          "pyflyte-fast-execute",
          "--additional-distribution",
          "s3://my-s3-bucket/flytesnacks/development/M4GSLZXCTPK46HIL57X6NW6SBM======/fast229fe274ee6e0cb90581d25bcf670186.tar.gz",
          "--dest-dir",
          "/root",
          "--",
          "pyflyte-execute",
          "--inputs",
          "{{.input}}",
          "--output-prefix",
          "{{.outputPrefix}}",
          "--raw-output-data-prefix",
          "{{.rawOutputDataPrefix}}",
          "--checkpoint-path",
          "{{.checkpointOutputPrefix}}",
          "--prev-checkpoint",
          "{{.prevCheckpointPrefix}}",
          "--resolver",
          "flytekit.core.python_auto_container.default_task_resolver",
          "--",
          "task-module",
          "workflow",
          "task-name",
          "say_hello"
        ],
        "resources": {}
      }
    },
    "description": {
      "longDescription": {
        "format": "DESCRIPTION_FORMAT_RST"
      }
    }
  }
}

2. Where to diff()?

Should we handle delta computation in backend flyteadmin (language-agnostic) or in flytekit (python)?

Some tradeoff to consider:

  • proto file breaking change?
  • sdk generalizability?

cc @Future-Outlier

@katrogan
Copy link
Contributor Author

I think it'd be hard to diff the source code although that would be spectacular, but the diff check performed on the server doesn't even evaluate the source code so in this case it's not super relevant

Ideally we should diff across the task spec (like you dumped!) since that's used to compute the digest that ultimately determines whether the structure has changes across (re-)registration

  1. Great question, I like the idea of having a language agnostic implementation! We can use something like
    WorkflowErrorExistsDifferentStructure exists_different_structure = 1;
    and extend the error message to include structured data about what in the spec has changed

For example, this where we initialize the error message in flyteadmin code:

func NewWorkflowExistsDifferentStructureError(ctx context.Context, request *admin.WorkflowCreateRequest) FlyteAdminError {
errorMsg := "workflow with different structure already exists"
statusErr, transformationErr := NewFlyteAdminError(codes.InvalidArgument, errorMsg).WithDetails(&admin.CreateWorkflowFailureReason{
Reason: &admin.CreateWorkflowFailureReason_ExistsDifferentStructure{
ExistsDifferentStructure: &admin.WorkflowErrorExistsDifferentStructure{
Id: request.Id,
},
},
})
if transformationErr != nil {
logger.Panicf(ctx, "Failed to wrap grpc status in type 'Error': %v", transformationErr)
return NewFlyteAdminErrorf(codes.InvalidArgument, errorMsg)
}
return statusErr
}

@austin362667
Copy link
Contributor

austin362667 commented Feb 21, 2024

2. extend the error message to include structured data about what in the spec has changed

What if we just show diff as string, in order to prevent modify too much task proto buff?

@katrogan
Copy link
Contributor Author

@austin362667 can we close this out?

@austin362667
Copy link
Contributor

Certainly! Thank you, Katrina.

@katrogan
Copy link
Contributor Author

thank you @austin362667 for all your work in improving this!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request flytekit FlyteKit Python related issue good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

5 participants