GET /bundles/all can be tricked into returning tombstoned bundles #2641

Open
hannes-ucsc opened this issue Dec 1, 2019 · 0 comments

Request the first page of a prefix that contains tombstoned bundles:

$ http 'https://dss.data.humancellatlas.org/v1/bundles/all?replica=aws&prefix=3f&per_page=500'
HTTP/1.1 206 Partial Content
Access-Control-Allow-Headers: Authorization,Content-Type,X-Amz-Date,X-Amz-Security-Token,X-Api-Key
Access-Control-Allow-Origin: *
Connection: keep-alive
Content-Length: 43776
Content-Type: application/json
Date: Sun, 01 Dec 2019 01:33:31 GMT
Link: <https://dss.data.humancellatlas.org/v1/bundles/all?per_page=500&prefix=3f&replica=aws&search_after=bundles%2F3fcf3988-4e63-4f2f-8e3d-b213b7fde5de.2019-10-09T170735.487612Z&token=1nD54ZWaI21uceAAYvxlxkkUtuTU1tsqOCyqY7vBZuFmmeZlcrCEvXSNcEPpbdiOncyrG1t0Gt%2B5QZziyXEuLnn3%2FnrHoqcDFMNmMFCEmIcutoHau1HwNj0hQc0ZxHW45rSWrTr9uLe8%3D>; rel='next'
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
X-AWS-REQUEST-ID: 81926d5e-b649-48ee-aca6-205add9e9d50
X-Amzn-Trace-Id: Root=1-5de31869-41d61718aed0d4e829a975f8;Sampled=0
X-OpenAPI-Paginated-Content-Key: bundles
X-OpenAPI-Pagination: true
x-amz-apigw-id: EADAcHP7IAMFjaQ=
x-amzn-RequestId: 9b58f5c3-f6ee-4380-a466-ae71e1b23b3e

{
    "bundles": [
        {
            "uuid": "3f001679-4b89-4f2c-b762-9db0f7cba89e",
            "version": "2019-05-17T160538.781000Z"
        },
        ... 498 more bundles ...
        {
            "uuid": "3fcf3988-4e63-4f2f-8e3d-b213b7fde5de",
            "version": "2019-10-09T170735.487612Z"
        }
    ],
    "dss_api": "https://dss.data.humancellatlas.org",
    "event_timestamp": "2019-12-01T013329.207505Z",
    "has_more": true,
    "link": "https://dss.data.humancellatlas.org/v1/bundles/all?per_page=500&prefix=3f&replica=aws&search_after=bundles%2F3fcf3988-4e63-4f2f-8e3d-b213b7fde5de.2019-10-09T170735.487612Z&token=1nD54ZWaI21uceAAYvxlxkkUtuTU1tsqOCyqY7vBZuFmmeZlcrCEvXSNcEPpbdiOncyrG1t0Gt%2B5QZziyXEuLnn3%2FnrHoqcDFMNmMFCEmIcutoHau1HwNj0hQc0ZxHW45rSWrTr9uLe8%3D",
    "object": "list",
    "page_count": 500,
    "per_page": 500,
    "search_after": "bundles/3fcf3988-4e63-4f2f-8e3d-b213b7fde5de.2019-10-09T170735.487612Z",
    "search_prefix": "bundles/3f",
    "token": "1nD54ZWaI21uceAAYvxlxkkUtuTU1tsqOCyqY7vBZuFmmeZlcrCEvXSNcEPpbdiOncyrG1t0Gt+5QZziyXEuLnn3/nrHoqcDFMNmMFCEmIcutoHau1HwNj0hQc0ZxHW45rSWrTr9uLe8="
}

Take the token returned in the response and append it to the original request URL. Note that we are intentionally ignoring the search_after response element and the next link, and instead crafting the next URL manually, against the documented recommendation.

$ http 'https://dss.data.humancellatlas.org/v1/bundles/all?replica=aws&prefix=3f&per_page=500&token=1nD54ZWaI21uceAAYvxlxkkUtuTU1tsqOCyqY7vBZuFmmeZlcrCEvXSNcEPpbdiOncyrG1t0Gt%2B5QZziyXEuLnn3%2FnrHoqcDFMNmMFCEmIcutoHau1HwNj0hQc0ZxHW45rSWrTr9uLe8%3D'
HTTP/1.1 200 OK
Access-Control-Allow-Headers: Authorization,Content-Type,X-Amz-Date,X-Amz-Security-Token,X-Api-Key
Access-Control-Allow-Origin: *
Connection: keep-alive
Content-Length: 13876
Content-Type: application/json
Date: Sun, 01 Dec 2019 01:34:20 GMT
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
X-AWS-REQUEST-ID: b0c2317f-9167-4483-930f-e35c60ea5554
X-Amzn-Trace-Id: Root=1-5de3189c-297253779dc76b4143c9eb70;Sampled=0
X-OpenAPI-Paginated-Content-Key: bundles
X-OpenAPI-Pagination: false
x-amz-apigw-id: EADIbH_RIAMFw4A=
x-amzn-RequestId: 41f29cb1-961c-4919-9196-8794bc26a727

{
    "bundles": [
        {
            "uuid": "3fb8b36e-0d8d-444b-922b-1dc877237c9a",
            "version": "2018-12-06T040402.751201Z"
        },
        ... more bundles ...
        {
            "uuid": "3fffa7af-634f-457b-bc5a-2c39ae7ee774",
            "version": "2019-08-01T200146.992368Z"
        }
    ],
    "dss_api": "https://dss.data.humancellatlas.org",
    "event_timestamp": "2019-12-01T013420.282467Z",
    "has_more": false,
    "object": "list",
    "page_count": 159,
    "per_page": 500,
    "search_prefix": "bundles/3f"
}
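The manually crafted request above can be reproduced programmatically. The following is only a minimal sketch: `craft_next_url` is a hypothetical helper, not part of any DSS client, and it assumes the query parameters shown in the transcripts. It builds the follow-up URL from the token alone and deliberately leaves out search_after.

```python
from urllib.parse import urlencode

# Endpoint taken from the transcripts in this issue.
DSS_BUNDLES_ALL = "https://dss.data.humancellatlas.org/v1/bundles/all"


def craft_next_url(token, replica="aws", prefix="3f", per_page=500):
    """Build a follow-up URL that carries `token` but omits `search_after`.

    Hypothetical helper for reproducing the issue; a cooperative client
    would follow the `link` element or the Link header instead.
    """
    params = {
        "replica": replica,
        "prefix": prefix,
        "per_page": per_page,
        "token": token,
    }
    # urlencode percent-encodes '+', '/' and '=' in the token.
    return DSS_BUNDLES_ALL + "?" + urlencode(params)
```

Passing the `token` value from the first response should yield the same URL that was requested by hand above.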

Note the first bundle returned in the response: 3fb8b36e-0d8d-444b-922b-1dc877237c9a, version 2018-12-06T040402.751201Z. Requesting that bundle version from the DSS yields a 404.

$ http 'https://dss.data.humancellatlas.org/v1/bundles/3fb8b36e-0d8d-444b-922b-1dc877237c9a?version=2018-12-06T040402.751201Z&replica=aws'
HTTP/1.1 404 Not Found
Access-Control-Allow-Headers: Authorization,Content-Type,X-Amz-Date,X-Amz-Security-Token,X-Api-Key
Access-Control-Allow-Origin: *
Connection: keep-alive
Content-Length: 405
Content-Type: application/problem+json
Date: Sun, 01 Dec 2019 01:37:22 GMT
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
X-AWS-REQUEST-ID: 3e13202c-3c5b-4c47-8ad4-5b895b273f20
X-Amzn-Trace-Id: Root=1-5de31951-e3cc28b8afece138c0003328;Sampled=0
x-amz-apigw-id: EADk9F0CIAMFdRA=
x-amzn-RequestId: 4abb633f-69f5-4916-83ea-b198eaf3d2ca

{
    "code": "not_found",
    "stacktrace": "Traceback (most recent call last):\n  File \"/var/task/chalicelib/dss/error.py\", line 72, in wrapper\n    return func(*args, **kwargs)\n  File \"/var/task/chalicelib/dss/api/bundles/__init__.py\", line 56, in get\n    raise DSSException(404, \"not_found\", \"Cannot find bundle!\")\ndss.error.DSSException\n",
    "status": 404,
    "title": "Cannot find bundle!"
}

And indeed, that bundle version is tombstoned:

$ aws s3 ls s3://org-hca-dss-prod/bundles/3fb8b36e-0d8d-444b-922b-1dc877237c9a.2018-12-06T040402.751201Z
2018-12-05 20:04:04       7298 3fb8b36e-0d8d-444b-922b-1dc877237c9a.2018-12-06T040402.751201Z
2019-06-25 12:59:46        204 3fb8b36e-0d8d-444b-922b-1dc877237c9a.2018-12-06T040402.751201Z.dead
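The listing suggests that a tombstoned bundle version can be recognized by a sibling key with a `.dead` suffix. As a rough sketch of the filtering that the paging code path appears to skip, assuming only that naming convention (`live_bundle_keys` is a hypothetical name, not a DSS function):

```python
def live_bundle_keys(keys):
    """Return the keys that are not shadowed by a '<key>.dead' marker.

    Assumption: a tombstone for a bundle version is stored under the
    version's key plus a '.dead' suffix, as in the listing above.
    """
    key_set = set(keys)
    return [
        key
        for key in keys
        # Skip the markers themselves and any key that has a marker.
        if not key.endswith(".dead") and key + ".dead" not in key_set
    ]
```

Applied to the two keys in the listing above, this returns an empty list, which matches the 404 from the bundle endpoint.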

We stumbled upon this by accident while researching the cause of DataBiosphere/azul#1488.

Granted, this is a rather uncooperative usage pattern of the DSS REST API, but no matter how uncooperative or even malicious the client is, the DSS should not let itself be tricked into revealing tombstoned bundles.

I think an easy fix would be to require that the search_after and token parameters are both present or, better, to combine them into a single request parameter. I see no reason why search_after couldn't be stuffed into the token. Also consider the case of a malicious client that passes both parameters but stuffs manipulated values into them. I have the feeling that I could trick the DSS into all sorts of interesting behavior by manipulating the search_after and token values. Also note that base64-encoding the token does not protect it from being reverse engineered.
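To illustrate the single-parameter idea, here is a minimal sketch of one possible approach, not the DSS implementation: search_after and the underlying continuation token are packed into one value and authenticated with an HMAC, so a client can neither drop one half nor alter either. The function names, the JSON layout and the server-side `secret` are all assumptions made for this example.

```python
import base64
import hashlib
import hmac
import json

_SIG_LEN = hashlib.sha256().digest_size  # 32 bytes


def encode_page_token(secret: bytes, search_after: str, continuation: str) -> str:
    """Pack both paging values into one signed, URL-safe token."""
    payload = json.dumps(
        {"search_after": search_after, "continuation": continuation},
        sort_keys=True,
    ).encode()
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(signature + payload).decode()


def decode_page_token(secret: bytes, token: str) -> dict:
    """Verify the signature and return the paging values.

    Raises ValueError if the token is malformed or was tampered with.
    """
    try:
        raw = base64.urlsafe_b64decode(token)
    except ValueError as e:  # binascii.Error is a subclass of ValueError
        raise ValueError("malformed page token") from e
    signature, payload = raw[:_SIG_LEN], raw[_SIG_LEN:]
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("page token failed integrity check")
    return json.loads(payload)
```

The signature only provides integrity. The payload remains readable to anyone who base64-decodes the token, so it would still need encryption if its contents were considered sensitive.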
