Signed header using on host:port instead of host only #2965

Closed
aimran-adroll opened this issue Jun 7, 2023 · 8 comments

@aimran-adroll

Describe the bug

At our company, we use a zero-trust solution that puts most services (including HTTPS) behind a non-standard local port (e.g. 7443 instead of 443).

Consequently, this bit breaks the signing:

https://github.com/boto/botocore/blame/5ba1dc100324723b56ff6953350d8218a85b63bf/botocore/auth.py#L83-L85

If I am not mistaken, the SDK should only use the host part (e.g. 127.0.0.1). But as you can see on line 85, the host is modified if my port does not match the default_ports entry for the scheme.

To wit: unless I am sorely mistaken, the canonical header should contain just the host, not host:port. And the SDK should not confuse the endpoint (used for making requests) with what goes into the AWS SigV4 headers.

To make matters worse, it looks like the Golang SDK copy-pasted the same bits of code.

Expected Behavior

The canonical header should only contain the host:

# empty_string_hash, amzdate and credentials are assumed to be defined earlier
endpoint = '127.0.0.1:7443'  # endpoint is not the same thing as host for signing purposes
host = '127.0.0.1'
canonical_headers = "\n".join(
    f'{k}:{v}'
    for k, v in {
        'host': host,
        'x-amz-content-sha256': empty_string_hash,
        'x-amz-date': amzdate,
        'x-amz-security-token': credentials.token,
    }.items()
)

Current Behavior

Getting

"message":"The request signature we calculated does not match the signature you provided. ...

Reproduction Steps

Repeat the AWS example at https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-query-string-auth.html with host and with host:port to see how the signature gets messed up.

Possible Solution

No response

Additional Information/Context

No response

SDK version used

botocore==1.29.110, boto3==1.26.97

Environment details (OS name and version, etc.)

py3.9, mac

aimran-adroll added the bug and needs-triage labels Jun 7, 2023
@nateprewitt
Contributor

Hi @aimran-adroll, thanks for reaching out about this. We do intentionally include the port if it's provided and not implicit with the scheme (http -> 80 and https -> 443). The HTTP RFC for a Host header is defined as (emphasis mine):

The "Host" header field in a request provides the host and port information from the target URI, enabling the origin server to distinguish among resources while servicing requests for multiple host names.

- RFC 9110 § 7.2

You won't typically encounter this scenario with a production AWS service though. Could you provide some more details about your use case and where you found information on the exclusion of the port in signing? Thanks!
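The rule described above (keep the port unless it is the default for the scheme) can be sketched as follows. This is a minimal illustration modelled on the three botocore lines quoted later in this thread, not botocore's actual implementation; `host_header_for` and `DEFAULT_PORTS` are hypothetical names, and IPv6 literals are not handled.

```python
from urllib.parse import urlsplit

# Assumption: the usual scheme defaults. The names here are illustrative only.
DEFAULT_PORTS = {"http": 80, "https": 443}


def host_header_for(url: str) -> str:
    """Return the Host value for url, keeping the port only when it is non-default."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Append the port only if it was given explicitly and differs from the scheme default.
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(parts.scheme):
        host = f"{host}:{parts.port}"
    return host


print(host_header_for("https://localhost:7443/_search"))  # localhost:7443
print(host_header_for("https://localhost:443/_search"))   # localhost
print(host_header_for("https://localhost/_search"))       # localhost
```

With a local port mapping such as localhost:7443, this rule yields a Host value that includes the port, which is the value that ends up in the signed headers.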

nateprewitt added the investigating and response-requested labels and removed the bug and needs-triage labels Jun 7, 2023
@aimran-adroll
Author

We use Banyan for managing our corporate network. TL;DR: something sits in the middle to mediate access between the user (aka my laptop) and internal resources (AWS services). Consequently, instead of directly hitting https://awsendpoint.vpc.aws.com, we typically have an app that maps awsendpoint to localhost:PORT.

In any case, here is an example where the requests_aws4auth library does the right thing but the AWS SDKs (using botocore underneath) fail. I am using OpenSearch as an example, but I am pretty sure the same issue will occur for any other service using botocore with a similar setup.

import boto3
from opensearchpy import OpenSearch, AWSV4SignerAuth, RequestsHttpConnection
from requests_aws4auth import AWS4Auth


credentials = boto3.Session().get_credentials()

# THIS WORKS (does not use botocore)
auth = AWS4Auth(credentials.access_key, credentials.secret_key,
                "us-west-2", "es", session_token=credentials.token)

# THIS DOES NOT
# (don't be thrown off by AWSV4SignerAuth; it merely uses botocore)
auth = AWSV4SignerAuth(credentials, "us-west-2")

# Everything else remains the same
c = OpenSearch(
    hosts=[{"host": "localhost", "port": 7443}],
    http_auth=auth,  # <-- switch between the two auth objects above
    connection_class=RequestsHttpConnection,
    use_ssl=True,
    verify_certs=False,
    ssl_show_warn=False,
)

After spending an absurd amount of time, I am fairly convinced it's the use of host:port in the canonical signature that's messing things up. I verified it by crafting the signature by hand and issuing raw requests.
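To illustrate why the host value matters, here is a minimal, standard-library-only sketch of a SigV4 header signature. It assumes a GET request with an empty body and only three signed headers; `sigv4_signature`, the dummy secret key, and the fixed timestamp are hypothetical and exist only for this example. It shows that if the signer and the verifier disagree on the host value (localhost:7443 versus localhost), the signatures cannot match.

```python
import hashlib
import hmac


def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def sigv4_signature(host, *, secret_key, amz_date, region, service,
                    method="GET", path="/", query=""):
    """Compute a SigV4 signature for an empty-body request (illustrative only)."""
    payload_hash = hashlib.sha256(b"").hexdigest()
    headers = {
        "host": host,
        "x-amz-content-sha256": payload_hash,
        "x-amz-date": amz_date,
    }
    names = sorted(headers)
    # Each canonical header line ends with a newline; names are lowercase and sorted.
    canonical_headers = "".join(f"{name}:{headers[name]}\n" for name in names)
    signed_headers = ";".join(names)
    canonical_request = "\n".join(
        [method, path, query, canonical_headers, signed_headers, payload_hash]
    )

    date_stamp = amz_date[:8]
    scope = f"{date_stamp}/{region}/{service}/aws4_request"
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256",
        amz_date,
        scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])

    # Derive the signing key: date -> region -> service -> "aws4_request".
    key = _hmac(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    for part in (region, service, "aws4_request"):
        key = _hmac(key, part)
    return hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


# Dummy values, for illustration only.
common = dict(secret_key="EXAMPLE-SECRET-KEY", amz_date="20230607T000000Z",
              region="us-west-2", service="es")
sig_with_port = sigv4_signature("localhost:7443", **common)
sig_without_port = sigv4_signature("localhost", **common)
print(sig_with_port == sig_without_port)  # False
```

The host string is part of the canonical request, so any difference in it changes the hash that gets signed, and the service rejects the request with the "signature we calculated does not match" error.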

github-actions bot removed the response-requested label Jun 7, 2023
@nateprewitt
Contributor

nateprewitt commented Jun 7, 2023

Thanks for the info, @aimran-adroll. To quickly clarify, the Host header you're sending in your raw request to OpenSearch is localhost, or is this being transformed somewhere else? Can you provide the exact values being passed into Boto3 and the Host header you're expecting?

Taking a look at requests_aws4auth it looks like they just added tests specifically to ensure the port is preserved on the Host header. It seems there's some confusion around the expected behavior and this may be an issue specific to OpenSearch. I'm not sure we can modify this behavior without risking breakages to other use cases.

There is also a similar discussion around OpenSearch in the originating issue for that PR. Their specific case is due to setting the endpoint_url to a different domain than what the request is actually hitting. I'm not sure we can easily handle these implicit network mappings in the SDKs.

If you can provide the info requested above, it'll be helpful while we investigate internally. Thanks!

nateprewitt added the response-requested label Jun 7, 2023
@aimran-adroll
Author

Thanks @nateprewitt for sticking with this. It's a touch difficult to explain, but here goes. I repeated the above (the two ways of generating auth). I then used the debugger to drop down to the following places.

In both cases, I am passing host="localhost:7443".

case 1: requests_aws4auth

https://github.com/tedder/requests-aws4auth/blob/9e9e7bf25ad1962cd8dd77064216ee4cab8ca520/requests_aws4auth/aws4auth.py#L415

This is right after it has generated the canonical headers. I inspected the values and host is indeed localhost (sans port)

[Screenshot of debugger output]

case 2: using opensearchpy.AWSV4SignerAuth

I dropped down to the following place in botocore

return header_map

[Screenshot of debugger output]

You can see that host is localhost:7443


Since case 2 fails to sign the request, it's reasonable to assume that the host in botocore is not set correctly at the signing phase in this particular scenario.

@aimran-adroll
Author

And just to be annoying 😅, I commented out the following lines in botocore, and 💥: case 2 works too.

if url_parts.port is not None:
    if url_parts.port != default_ports.get(url_parts.scheme):
        host = '%s:%d' % (host, url_parts.port)

@nateprewitt
Contributor

Ok interesting, so this may work, but is 100% not intended functionality. The Host header is the destination server, not where the request is originating. This is to distinguish which service you're addressing for co-habitated applications on the same server.

It looks like OpenSearch, and potentially other services, are interpreting "localhost" as the service host. I'm actually surprised they are accepting this to begin with, but it may be an unintentional omission.

@aimran-adroll
Author

This is to distinguish which service you're addressing for co-habitated applications on the same server.

That certainly makes sense. I am not sure there is anything to be done here. I guess I will keep using requests_aws4auth, since it's unintentionally using the URI host for the host header (at least until they "fix" it).

Thanks again

github-actions bot removed the response-requested label Jun 7, 2023
tim-finnigan removed the investigating label Jun 24, 2024
tim-finnigan closed this as not planned Jun 24, 2024

This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.