Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stepfunctions-tasks: CallAwsServiceCrossRegion doesn't work with WAIT_FOR_TASK_TOKEN #32746

Open
1 task
shatgupt opened this issue Jan 6, 2025 · 5 comments · May be fixed by #32807
Open
1 task

stepfunctions-tasks: CallAwsServiceCrossRegion doesn't work with WAIT_FOR_TASK_TOKEN #32746

shatgupt opened this issue Jan 6, 2025 · 5 comments · May be fixed by #32807
Labels
@aws-cdk/aws-stepfunctions-tasks bug This issue is a bug. effort/small Small work item – less than a day of effort p2

Comments

@shatgupt
Copy link

shatgupt commented Jan 6, 2025

Describe the bug

CallAwsServiceCrossRegion task doesn't generate the correct Resource required for WAIT_FOR_TASK_TOKEN to work. Because of this, SFN doesn't recognize the task as WAIT_FOR_TASK_TOKEN and thus $$.Task.Token is not available in the task parameter to be passed to the call. See Reproduction Steps below.

Regression Issue

  • Select this option if this issue appears to be a regression.

Last Known Working CDK Version

No response

Expected Behavior

CallAwsServiceCrossRegion should work with IntegrationPattern.WAIT_FOR_TASK_TOKEN so that the StateMachine can pass task token to another service in different account and region and wait on task completion after making a cross region call.

The ASL generated from this task should be similar to the one generated by CallAwsService.

"InvokeWrapper": {
      "Retry": [
        {
          "ErrorEquals": [
            "Lambda.ClientExecutionTimeoutException",
            "Lambda.ServiceException",
            "Lambda.AWSLambdaException",
            "Lambda.SdkClientException"
          ],
          "IntervalSeconds": 2,
          "MaxAttempts": 6,
          "BackoffRate": 2
        }
      ],
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
      "Parameters": {
        "FunctionName":"arn:aws:lambda:us-east-1:xxxx:function:xxxx-CrossRegionAwsSdkxxxx-qxxxw",
        "Payload":{
          "region.$": "$.wrapperStateMachineRegion",
          "action": "startExecution",
          "service": "stepfunctions",
          "parameters": {
            "StateMachineArn.$": "$.wrapperStateMachineArn",
            "Input": {
              "type": 1,
              "value": {
                "taskToken.$": "$$.Task.Token",
                "payload.$": "$.payload"
              }
            }
          }
        }
      }
    }

I understand that SendTaskSuccess can only be called with the token using the current account principal.

Current Behavior

This task

new CallAwsServiceCrossRegion(this, 'InvokeWrapper', {
      integrationPattern: IntegrationPattern.WAIT_FOR_TASK_TOKEN,
      service: 'stepfunctions',
      action: 'startExecution',
      region: JsonPath.stringAt('$.wrapperStateMachineRegion'),
      parameters: {
        StateMachineArn: JsonPath.stringAt('$.wrapperStateMachineArn'),
        Input: TaskInput.fromObject({
          taskToken: JsonPath.taskToken,
          payload: JsonPath.stringAt('$.payload'),
        }),
      },
      iamResources: ['*'],
    });

generates following ASL

"InvokeWrapper": {
      "End": true,
      "Retry": [
        {
          "ErrorEquals": [
            "Lambda.ClientExecutionTimeoutException",
            "Lambda.ServiceException",
            "Lambda.AWSLambdaException",
            "Lambda.SdkClientException"
          ],
          "IntervalSeconds": 2,
          "MaxAttempts": 6,
          "BackoffRate": 2
        }
      ],
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:xxxx:function:xxxx-CrossRegionAwsSdkxxxx-qxxxw",
      "Parameters": {
        "region.$": "$.wrapperStateMachineRegion",
        "action": "startExecution",
        "service": "stepfunctions",
        "parameters": {
          "StateMachineArn.$": "$.wrapperStateMachineArn",
          "Input": {
            "type": 1,
            "value": {
              "taskToken.$": "$$.Task.Token",
              "payload.$": "$.payload"
            }
          }
        }
      }
    }

which throws following error during execution

An error occurred while executing the state 'InvokeWrapper' (entered at the event id #2). The JSONPath '$$.Task.Token' specified for the field 'taskToken.$' could not be found in the input

Reproduction Steps

See code snippets in the Current Behavior.

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.173.4

Framework Version

No response

Node.js Version

18

OS

Mac

Language

TypeScript

Language Version

No response

Other information

No response

@shatgupt shatgupt added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jan 6, 2025
@ashishdhingra ashishdhingra self-assigned this Jan 6, 2025
@ashishdhingra ashishdhingra added p2 needs-reproduction This issue needs reproduction. and removed needs-triage This issue or PR still needs to be triaged. labels Jan 6, 2025
@ashishdhingra
Copy link
Contributor

@shatgupt Good afternoon. Thanks for opening the issue. Although unrelated to issue reproduction from CDK code perspective, at Wait for a Callback with Task Token, it's mentioned as a note that You must pass task tokens from principals within the same AWS account. The tokens won't work if you send them from principals in a different AWS account.. So, I'm unsure if Task token would work in cross-account scenario. However, refer Cross-Account waitForTaskToken in AWS Step Functions for possible workaround.

Thanks,
Ashish

@ashishdhingra ashishdhingra added the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Jan 6, 2025
@pahud
Copy link
Contributor

pahud commented Jan 7, 2025

Looking at the _renderTask() of the CallAwsServiceCrossRegion class:

The fix should be in how we construct the base resource ARN:

export class CallAwsServiceCrossRegion extends TaskStateBase {
  // ... existing code ...

  protected _renderTask(): any {
    // Construct the proper service integration ARN
    const serviceIntegrationArn = `arn:aws:states:::${this.service}:${this.action}`;
    
    // If using WAIT_FOR_TASK_TOKEN, the service integration ARN will automatically 
    // get .waitForTaskToken appended by the Step Functions service
    
    return {
      Resource: serviceIntegrationArn,
      Parameters: {
        region: this.region,
        parameters: this.parameters,
      },
    };
  }
}

Looking at

const resourceArnSuffix: Record<IntegrationPattern, string> = {
[IntegrationPattern.REQUEST_RESPONSE]: '',
[IntegrationPattern.RUN_JOB]: '.sync',
[IntegrationPattern.WAIT_FOR_TASK_TOKEN]: '.waitForTaskToken',
};

This shows how the resource ARN suffixes are defined for different integration patterns:

  • REQUEST_RESPONSE: ''
  • RUN_JOB: '.sync'
  • WAIT_FOR_TASK_TOKEN: '.waitForTaskToken'

And the integrationResourceArn function that uses these suffixes.

The issue is that CallAwsServiceCrossRegion is likely not using the integrationResourceArn utility function to construct its resource ARN, which would automatically append the correct suffix based on the integration pattern.

The key difference is that instead of using the Lambda function ARN, we should be using the proper service integration ARN format. The Step Functions service will handle appending .waitForTaskToken based on the integration pattern.

This aligns better with how other service integration tasks work in the CDK.
I am requesting the team for a second look.

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Jan 7, 2025
@shatgupt
Copy link
Author

shatgupt commented Jan 7, 2025

@ashishdhingra Thank you for mentioning that. I was aware of it and was expecting it to be pointed out that is why already mentioned it my report.

I understand that SendTaskSuccess can only be called with the token using the current account principal.

I have solved it using a similar workaround as your reference but with an SNS topic.

@shatgupt
Copy link
Author

shatgupt commented Jan 7, 2025

This probably could be its own issue but I don't think CallAwsServiceCrossRegion task would work correctly with Credentials as I don't see any assume role being done in the Lambda code for it. For context, assume role would be required to start execution of a cross-account, cross-region StateMachine since SFN doesn't have resource policies.

I am currently solving both the task token and assume role issues by creating a Lambda myself.

@ashishdhingra ashishdhingra added effort/small Small work item – less than a day of effort and removed needs-reproduction This issue needs reproduction. labels Jan 7, 2025
@ashishdhingra ashishdhingra removed their assignment Jan 7, 2025
@tmokmss
Copy link
Contributor

tmokmss commented Jan 8, 2025

We should have followed this pattern 🤦

protected _renderTask(): any {
if (this.props.payloadResponseOnly) {
return {
Resource: this.props.lambdaFunction.functionArn,
...this.props.payload && { Parameters: sfn.FieldUtils.renderObject(this.props.payload.value) },
};
} else {
return {
Resource: integrationResourceArn('lambda', 'invoke', this.integrationPattern),
Parameters: sfn.FieldUtils.renderObject({
FunctionName: this.props.lambdaFunction.functionArn,
Payload: this.props.payload ? this.props.payload.value : sfn.TaskInput.fromJsonPathAt('$').value,
InvocationType: this.props.invocationType,
ClientContext: this.props.clientContext,
Qualifier: this.props.qualifier,
}),
};
}
}
}

To fix this without breaking changes, I guess we can introduce payloadResponseOnly property (true by default), and then fix the _renderTask method as above. Additionally we throw an error if payloadResponseOnly == true && integrationPattern != REQUEST_RESPONSE to prevent misconfiguration. Users who needs other integration patterns can set payloadResponseOnly: false.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@aws-cdk/aws-stepfunctions-tasks bug This issue is a bug. effort/small Small work item – less than a day of effort p2
Projects
None yet
4 participants