Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Batch blob requests fail when multipart boundary contains '=' #2413

Open
teamnimbus opened this issue Jun 18, 2024 · 4 comments
Open

Batch blob requests fail when multipart boundary contains '=' #2413

teamnimbus opened this issue Jun 18, 2024 · 4 comments
Assignees
Labels
blob-storage bug Something isn't working
Milestone

Comments

@teamnimbus
Copy link

Which service(blob, file, queue, table) does this issue concern?

Blob

Which version of the Azurite was used?

3.30.0

Where do you get Azurite? (npm, DockerHub, NuGet, Visual Studio Code Extension)

DockerHub

What's the Node.js version?

v20.10.0

What problem was encountered?

When parsing the Content-Type header of a multipart request, the boundary parameter is not correctly identified when it includes the = character.
Since the Python SDK uses the = character in boundaries, this breaks all operations for which the SDK generates a batch request -- for example, blob deletion.
Note that using the = character as part of the boundary is allowed by RFC-1341.

Steps to reproduce the issue?

Reproducing the issue is a little convoluted since we need to use production-style urls to avoid hitting issue #1809.

Start an Azurite container:

docker run -d --rm -p 10000:10000 --name azurite -e AZURITE_ACCOUNTS=account1:AAAA mcr.microsoft.com/azure-storage/azurite:3.30.0 azurite --blobHost 0.0.0.0 --debug /tmp/azurite-debug.log

Reproducer:

docker run -it --rm --add-host=account1.blob.localhost=127.0.0.1 --net=host python:3.9-bullseye python3 -m asyncio

# The following happens in the Python shell of the container
import sys, subprocess
subprocess.run([sys.executable, '-m', 'pip', 'install', 'azure-storage-blob==12.12.0', 'aiohttp'])

from azure.storage.blob.aio import BlobServiceClient
connection_string = "DefaultEndpointsProtocol=http;AccountName=account1;AccountKey=AAAA;BlobEndpoint=http://account1.blob.localhost:10000;"
blob_service_client = BlobServiceClient.from_connection_string(connection_string)
container_client = await blob_service_client.create_container('testcontainer')
await container_client.upload_blob(name='foo', data='')
await container_client.delete_blobs('foo')

The delete_blobs call fails with:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 446, in result
    return self.__get_result()
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "<console>", line 1, in <module>
  File "/usr/local/lib/python3.9/site-packages/azure/core/tracing/decorator_async.py", line 94, in wrapper_use_tracer
    return await func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/azure/storage/blob/aio/_container_client_async.py", line 1107, in delete_blobs
    return await self._batch_send(*reqs, **options)
  File "/usr/local/lib/python3.9/site-packages/azure/storage/blob/_shared/base_client_async.py", line 167, in _batch_send
    process_storage_error(error)
  File "/usr/local/lib/python3.9/site-packages/azure/storage/blob/_shared/response_handlers.py", line 181, in process_storage_error
    exec("raise error from None")   # pylint: disable=exec-used # nosec
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.9/site-packages/azure/storage/blob/_shared/base_client_async.py", line 163, in _batch_send
    raise error
azure.storage.blob._shared.response_handlers.PartialBatchErrorException: There is a partial failure in the batch operation.
ErrorCode:None
Content: --
Content-Type: application/http

HTTP/1.1 400 One of the request inputs is not valid.
x-ms-error-code: InvalidInput
x-ms-request-id: 215d36a6-96e0-47f1-a57e-9af371a0d7b8
content-type: application/xml

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Error>
  <Code>InvalidInput</Code>
  <Message>One of the request inputs is not valid.
RequestId:215d36a6-96e0-47f1-a57e-9af371a0d7b8
Time:2024-06-18T08:53:06.200Z</Message>
</Error>

while the debug log contains:

...
2024-06-18T08:53:06.196Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: BlobStorageContextMiddleware: RequestMethod=POST RequestURL=http://account1.blob.localhost/testcontainer?restype=container&comp=batch RequestHeaders:{"host":"account1.blob.localhost:10000","x-ms-version":"2021-06-08","content-length":"391","content-type":"multipart/mixed; boundary================5306085128869334238==","x-ms-date":"Tue, 18 Jun 2024 08:53:06 GMT","x-ms-client-request-id":"2dde134a-2d50-11ef-9f0f-00155dd86730","user-agent":"azsdk-python-storage-blob/12.12.0 Python/3.9.19 (Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.31)","authorization":"SharedKey account1:4mORpeBuY9gNV0YyuAzmmK+xK/2Ryo6NkdYX3BXF7dg=","accept":"*/*","accept-encoding":"gzip, deflate"} ClientIP=10.10.0.1 Protocol=http HTTPVersion=1.1
...
2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 verbose: DeserializerMiddleware: Start deserializing...
2024-06-18T08:53:06.198Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: HandlerMiddleware: DeserializedParameters={"options":{"requestId":"2dde134a-2d50-11ef-9f0f-00155dd86730"},"restype":"container","comp":"batch","contentLength":391,"multipartContentType":"multipart/mixed; boundary================5306085128869334238==","version":"2021-06-08","body":"ReadableStream"}
2024-06-18T08:53:06.200Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 error: BlobBatchHandler: One of the request inputs is not valid.
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: Received a MiddlewareError, fill error information to HTTP response
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: ErrorName=StorageError ErrorMessage=One of the request inputs is not valid.  ErrorHTTPStatusCode=400 ErrorHTTPStatusMessage=One of the request inputs is not valid. ErrorHTTPHeaders={"x-ms-error-code":"InvalidInput","x-ms-request-id":"215d36a6-96e0-47f1-a57e-9af371a0d7b8"} ErrorHTTPBody="<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n<Error>\n  <Code>InvalidInput</Code>\n  <Message>One of the request inputs is not valid.\nRequestId:215d36a6-96e0-47f1-a57e-9af371a0d7b8\nTime:2024-06-18T08:53:06.200Z</Message>\n</Error>" ErrorStack="StorageError: One of the request inputs is not valid.\n    at BlobBatchHandler.submitBatch (/opt/azurite/dist/src/blob/handlers/BlobBatchHandler.js:282:25)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async ContainerHandler.submitBatch (/opt/azurite/dist/src/blob/handlers/ContainerHandler.js:232:36)"
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: Set HTTP code: 400
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: Set HTTP status message: One of the request inputs is not valid.
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: Set HTTP Header: x-ms-error-code=InvalidInput
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: Set HTTP Header: x-ms-request-id=215d36a6-96e0-47f1-a57e-9af371a0d7b8
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: Set content type: application/xml
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: Set HTTP body: "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n<Error>\n  <Code>InvalidInput</Code>\n  <Message>One of the request inputs is not valid.\nRequestId:215d36a6-96e0-47f1-a57e-9af371a0d7b8\nTime:2024-06-18T08:53:06.200Z</Message>\n</Error>"
2024-06-18T08:53:06.202Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 verbose: SerializerMiddleware: Start serializing...
2024-06-18T08:53:06.202Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: Serializer: Start returning stream body.
2024-06-18T08:53:06.204Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: EndMiddleware: End response. TotalTimeInMS=8 StatusCode=202 StatusMessage=Accepted Headers={"server":"Azurite-Blob/3.30.0","content-type":"multipart/mixed; boundary=","x-ms-request-id":"215d36a6-96e0-47f1-a57e-9af371a0d7b8","x-ms-version":"2024-05-04"}
...
Full debug log
2024-06-18T08:51:59.691Z         info: Azurite Blob service is starting on 0.0.0.0:10000
2024-06-18T08:51:59.693Z         info: AccountDataStore:init() Refresh accounts from environment variable AZURITE_ACCOUNTS with value *****
2024-06-18T08:51:59.705Z         info: BlobGCManager:start() Starting BlobGCManager. Set status to Initializing.
2024-06-18T08:51:59.706Z         info: BlobGCManager:start() Trigger mark and sweep loop. Set status to Running.
2024-06-18T08:51:59.706Z         info: BlobGCManager:markSweepLoop() Start next mark and sweep.
2024-06-18T08:51:59.706Z         info: BlobGCManager:markSweep() Get all extents.
2024-06-18T08:51:59.707Z         info: BlobGCManager:start() BlobGCManager successfully started.
2024-06-18T08:51:59.709Z         info: BlobGCManager:markSweep() Got 0 extents.
2024-06-18T08:51:59.710Z         info: BlobGCManager:markSweep() Get referred extents.
2024-06-18T08:51:59.710Z         info: BlobGCManager:markSweep() Got referred extents, unreferenced extents count is 0.
2024-06-18T08:51:59.710Z         info: BlobGCManager:markSweepLoop() Mark and sweep finished, taken 4ms.
2024-06-18T08:51:59.711Z         info: BlobGCManager:markSweepLoop() Sleep for 600000ms.
2024-06-18T08:51:59.712Z         info: Azurite Blob service successfully listens on http://0.0.0.0:10000
2024-06-18T08:51:59.713Z         info: Azurite Queue service is starting on 127.0.0.1:10001
2024-06-18T08:51:59.713Z         info: AccountDataStore:init() Refresh accounts from environment variable AZURITE_ACCOUNTS with value *****
2024-06-18T08:51:59.718Z         info: QueueGCManager:start() Starting QueueGCManager, set status to Initializing
2024-06-18T08:51:59.719Z         info: QueueGCManager:start() Trigger mark and sweep loop, set status to Running.
2024-06-18T08:51:59.719Z         info: QueueGCManager:markSweepLoop() Start new mark and sweep.
2024-06-18T08:51:59.719Z         info: QueueGCManger:markSweep() Get all extents.
2024-06-18T08:51:59.719Z         info: QueueGCManager:start() QueueGCManager successfully started.
2024-06-18T08:51:59.719Z         info: QueueGCManager:marksweep() Get 0 extents.
2024-06-18T08:51:59.720Z         info: QueueGCManager:markSweep() Get referred extents, then remove from allExtents.
2024-06-18T08:51:59.720Z         info: QueueGCManager:markSweep() Got referred extents, unreferenced extents count is 0.
2024-06-18T08:51:59.720Z         info: QueueGCManager:markSweepLoop() Mark and sweep finished, take 1ms.
2024-06-18T08:51:59.720Z         info: QueueGCManager:markSweepLoop() Sleep for 60000
2024-06-18T08:51:59.720Z         info: Azurite Queue service successfully listens on http://127.0.0.1:10001
2024-06-18T08:51:59.720Z         info: Azurite Table service is starting on 127.0.0.1:10002
2024-06-18T08:51:59.721Z         info: AccountDataStore:init() Refresh accounts from environment variable AZURITE_ACCOUNTS with value *****
2024-06-18T08:51:59.723Z         info: Azurite Table service successfully listens on http://127.0.0.1:10002
2024-06-18T08:52:25.866Z 8abe45a0-5e85-4641-9c40-19576e62007f info: BlobStorageContextMiddleware: RequestMethod=PUT RequestURL=http://account1.blob.localhost/testcontainer?restype=container RequestHeaders:{"host":"account1.blob.localhost:10000","x-ms-version":"2021-06-08","accept":"application/xml","x-ms-date":"Tue, 18 Jun 2024 08:52:25 GMT","x-ms-client-request-id":"15d100a0-2d50-11ef-9f0f-00155dd86730","user-agent":"azsdk-python-storage-blob/12.12.0 Python/3.9.19 (Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.31)","authorization":"SharedKey account1:pEw9gywuJlaZ4mVNbsk3kwoHNiMMLBdHOIkGSLKuDmM=","accept-encoding":"gzip, deflate","content-length":"0"} ClientIP=10.10.0.1 Protocol=http HTTPVersion=1.1
2024-06-18T08:52:25.866Z 8abe45a0-5e85-4641-9c40-19576e62007f info: BlobStorageContextMiddleware: Account=account1 Container=testcontainer Blob=
2024-06-18T08:52:25.867Z 8abe45a0-5e85-4641-9c40-19576e62007f verbose: DispatchMiddleware: Dispatching request...
2024-06-18T08:52:25.869Z 8abe45a0-5e85-4641-9c40-19576e62007f info: DispatchMiddleware: Operation=Container_Create
2024-06-18T08:52:25.870Z 8abe45a0-5e85-4641-9c40-19576e62007f verbose: AuthenticationMiddlewareFactory:createAuthenticationMiddleware() Validating authentications.
2024-06-18T08:52:25.871Z 8abe45a0-5e85-4641-9c40-19576e62007f info: PublicAccessAuthenticator:validate() Start validation against public access.
2024-06-18T08:52:25.871Z 8abe45a0-5e85-4641-9c40-19576e62007f debug: PublicAccessAuthenticator:validate() Getting account properties...
2024-06-18T08:52:25.871Z 8abe45a0-5e85-4641-9c40-19576e62007f debug: PublicAccessAuthenticator:validate() Retrieved account name from context: account1, container: testcontainer, blob:
2024-06-18T08:52:25.875Z 8abe45a0-5e85-4641-9c40-19576e62007f debug: PublicAccessAuthenticator:validate() Skip public access authentication. Cannot get public access type for container testcontainer
2024-06-18T08:52:25.876Z 8abe45a0-5e85-4641-9c40-19576e62007f info: BlobSharedKeyAuthenticator:validate() Start validation against account shared key authentication.
2024-06-18T08:52:25.889Z 8abe45a0-5e85-4641-9c40-19576e62007f info: BlobSharedKeyAuthenticator:validate() [STRING TO SIGN]:"PUT\n\n\n\n\n\n\n\n\n\n\n\nx-ms-client-request-id:15d100a0-2d50-11ef-9f0f-00155dd86730\nx-ms-date:Tue, 18 Jun 2024 08:52:25 GMT\nx-ms-version:2021-06-08\n/account1/testcontainer\nrestype:container"
2024-06-18T08:52:25.890Z 8abe45a0-5e85-4641-9c40-19576e62007f info: BlobSharedKeyAuthenticator:validate() Calculated authentication header based on key1: SharedKey account1:pEw9gywuJlaZ4mVNbsk3kwoHNiMMLBdHOIkGSLKuDmM=
2024-06-18T08:52:25.890Z 8abe45a0-5e85-4641-9c40-19576e62007f info: BlobSharedKeyAuthenticator:validate() Signature 1 matched.
2024-06-18T08:52:25.891Z 8abe45a0-5e85-4641-9c40-19576e62007f verbose: DeserializerMiddleware: Start deserializing...
2024-06-18T08:52:25.893Z 8abe45a0-5e85-4641-9c40-19576e62007f info: HandlerMiddleware: DeserializedParameters={"options":{"metadata":{},"requestId":"15d100a0-2d50-11ef-9f0f-00155dd86730","containerCpkScopeInfo":{}},"restype":"container","version":"2021-06-08"}
2024-06-18T08:52:25.894Z 8abe45a0-5e85-4641-9c40-19576e62007f verbose: SerializerMiddleware: Start serializing...
2024-06-18T08:52:25.895Z 8abe45a0-5e85-4641-9c40-19576e62007f info: EndMiddleware: End response. TotalTimeInMS=29 StatusCode=201 StatusMessage=undefined Headers={"server":"Azurite-Blob/3.30.0","etag":"\"0x245C74B74FA7E40\"","last-modified":"Tue, 18 Jun 2024 08:52:25 GMT","x-ms-client-request-id":"15d100a0-2d50-11ef-9f0f-00155dd86730","x-ms-request-id":"8abe45a0-5e85-4641-9c40-19576e62007f","x-ms-version":"2024-05-04"}
2024-06-18T08:52:51.047Z 2fa88f93-99ce-422e-92d5-3f61edd76765 info: BlobStorageContextMiddleware: RequestMethod=PUT RequestURL=http://account1.blob.localhost/testcontainer/foo RequestHeaders:{"host":"account1.blob.localhost:10000","x-ms-blob-type":"BlockBlob","content-length":"0","if-none-match":"*","x-ms-version":"2021-06-08","content-type":"application/octet-stream","accept":"application/xml","x-ms-date":"Tue, 18 Jun 2024 08:52:51 GMT","x-ms-client-request-id":"24d6288c-2d50-11ef-9f0f-00155dd86730","user-agent":"azsdk-python-storage-blob/12.12.0 Python/3.9.19 (Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.31)","authorization":"SharedKey account1:+jYccn4bBMDo4SgeoAbukzSHJ53VvzigJjiPpiEv7O8=","accept-encoding":"gzip, deflate"} ClientIP=10.10.0.1 Protocol=http HTTPVersion=1.1
2024-06-18T08:52:51.047Z 2fa88f93-99ce-422e-92d5-3f61edd76765 info: BlobStorageContextMiddleware: Account=account1 Container=testcontainer Blob=foo
2024-06-18T08:52:51.047Z 2fa88f93-99ce-422e-92d5-3f61edd76765 verbose: DispatchMiddleware: Dispatching request...
2024-06-18T08:52:51.049Z 2fa88f93-99ce-422e-92d5-3f61edd76765 info: DispatchMiddleware: Operation=BlockBlob_Upload
2024-06-18T08:52:51.050Z 2fa88f93-99ce-422e-92d5-3f61edd76765 verbose: AuthenticationMiddlewareFactory:createAuthenticationMiddleware() Validating authentications.
2024-06-18T08:52:51.050Z 2fa88f93-99ce-422e-92d5-3f61edd76765 info: PublicAccessAuthenticator:validate() Start validation against public access.
2024-06-18T08:52:51.050Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: PublicAccessAuthenticator:validate() Getting account properties...
2024-06-18T08:52:51.050Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: PublicAccessAuthenticator:validate() Retrieved account name from context: account1, container: testcontainer, blob: foo
2024-06-18T08:52:51.053Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: PublicAccessAuthenticator:validate() Skip public access authentication. Cannot get public access type for container testcontainer
2024-06-18T08:52:51.053Z 2fa88f93-99ce-422e-92d5-3f61edd76765 info: BlobSharedKeyAuthenticator:validate() Start validation against account shared key authentication.
2024-06-18T08:52:51.054Z 2fa88f93-99ce-422e-92d5-3f61edd76765 info: BlobSharedKeyAuthenticator:validate() [STRING TO SIGN]:"PUT\n\n\n\n\napplication/octet-stream\n\n\n\n*\n\n\nx-ms-blob-type:BlockBlob\nx-ms-client-request-id:24d6288c-2d50-11ef-9f0f-00155dd86730\nx-ms-date:Tue, 18 Jun 2024 08:52:51 GMT\nx-ms-version:2021-06-08\n/account1/testcontainer/foo"
2024-06-18T08:52:51.054Z 2fa88f93-99ce-422e-92d5-3f61edd76765 info: BlobSharedKeyAuthenticator:validate() Calculated authentication header based on key1: SharedKey account1:+jYccn4bBMDo4SgeoAbukzSHJ53VvzigJjiPpiEv7O8=
2024-06-18T08:52:51.054Z 2fa88f93-99ce-422e-92d5-3f61edd76765 info: BlobSharedKeyAuthenticator:validate() Signature 1 matched.
2024-06-18T08:52:51.054Z 2fa88f93-99ce-422e-92d5-3f61edd76765 verbose: DeserializerMiddleware: Start deserializing...
2024-06-18T08:52:51.054Z 2fa88f93-99ce-422e-92d5-3f61edd76765 info: HandlerMiddleware: DeserializedParameters={"options":{"metadata":{},"requestId":"24d6288c-2d50-11ef-9f0f-00155dd86730","blobHTTPHeaders":{},"leaseAccessConditions":{},"cpkInfo":{},"cpkScopeInfo":{},"modifiedAccessConditions":{"ifNoneMatch":"*"}},"contentLength":0,"version":"2021-06-08","blobType":"BlockBlob","body":"ReadableStream"}
2024-06-18T08:52:51.056Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: OperationQueue.operate() Schedule incoming job 881910da-c7f7-4835-877f-3b299d8e0ace
2024-06-18T08:52:51.056Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: OperationQueue:execute() Current runningConcurrency:0 maxConcurrency:50 operations.length:1
2024-06-18T08:52:51.056Z 2fa88f93-99ce-422e-92d5-3f61edd76765 info: FSExtentStore:appendExtent() Select extent from idle location for extent append operation. LocationId:1 extentId:84267ce0-9fb1-418f-b311-7311dececc33 offset:0 MAX_EXTENT_SIZE:67108864
2024-06-18T08:52:51.057Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:appendExtent() Get fd:undefined for extent:84267ce0-9fb1-418f-b311-7311dececc33 from cache.
2024-06-18T08:52:51.057Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:appendExtent() Open file:/opt/azurite/__blobstorage__/84267ce0-9fb1-418f-b311-7311dececc33 for extent:84267ce0-9fb1-418f-b311-7311dececc33, get new fd:24
2024-06-18T08:52:51.059Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:appendExtent() Created write stream for fd:24
2024-06-18T08:52:51.059Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:appendExtent() Start writing to extent 84267ce0-9fb1-418f-b311-7311dececc33
2024-06-18T08:52:51.060Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:streamPipe() Start piping data to write stream
2024-06-18T08:52:51.060Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:streamPipe() Readable stream triggers close event, 0 bytes piped
2024-06-18T08:52:51.060Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:streamPipe() Invoke write stream end()
2024-06-18T08:52:51.061Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:streamPipe() Writable stream triggers finish event, after 0 bytes piped. Flush data to fd:24.
2024-06-18T08:52:51.061Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:streamPipe() Readable stream triggers close event, 0 bytes piped
2024-06-18T08:52:51.064Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:streamPipe() Flush data to fd:24 successfully. Resolve streamPipe().
2024-06-18T08:52:51.064Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:appendExtent() Write finish, start updating extent metadata. extent:{"id":"84267ce0-9fb1-418f-b311-7311dececc33","locationId":"Default","path":"84267ce0-9fb1-418f-b311-7311dececc33","size":0,"lastModifiedInMS":1718700771064}
2024-06-18T08:52:51.065Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:appendExtent() Update extent metadata done. Resolve()
2024-06-18T08:52:51.065Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: OperationQueue.operate() Job 881910da-c7f7-4835-877f-3b299d8e0ace completes callback, resolve.
2024-06-18T08:52:51.066Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: OperationQueue:execute() Current runningConcurrency:0 maxConcurrency:50 operations.length:0
2024-06-18T08:52:51.066Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: OperationQueue:execute() return. Operation.length === 0
2024-06-18T08:52:51.068Z 2fa88f93-99ce-422e-92d5-3f61edd76765 verbose: SerializerMiddleware: Start serializing...
2024-06-18T08:52:51.069Z 2fa88f93-99ce-422e-92d5-3f61edd76765 info: EndMiddleware: End response. TotalTimeInMS=22 StatusCode=201 StatusMessage=undefined Headers={"server":"Azurite-Blob/3.30.0","etag":"\"0x1E85E3D6248BBC0\"","last-modified":"Tue, 18 Jun 2024 08:52:51 GMT","content-md5":"1B2M2Y8AsgTpgAmY7PhCfg==","x-ms-client-request-id":"24d6288c-2d50-11ef-9f0f-00155dd86730","x-ms-request-id":"2fa88f93-99ce-422e-92d5-3f61edd76765","x-ms-version":"2024-05-04","date":"Tue, 18 Jun 2024 08:52:51 GMT","x-ms-request-server-encrypted":"true"}
2024-06-18T08:52:59.698Z         info: AccountDataStore:init() Refresh accounts from environment variable AZURITE_ACCOUNTS with value *****
2024-06-18T08:52:59.713Z         info: AccountDataStore:init() Refresh accounts from environment variable AZURITE_ACCOUNTS with value *****
2024-06-18T08:52:59.721Z         info: QueueGCManager:markSweepLoop() Start new mark and sweep.
2024-06-18T08:52:59.721Z         info: QueueGCManger:markSweep() Get all extents.
2024-06-18T08:52:59.722Z         info: QueueGCManager:marksweep() Get 0 extents.
2024-06-18T08:52:59.722Z         info: QueueGCManager:markSweep() Get referred extents, then remove from allExtents.
2024-06-18T08:52:59.722Z         info: QueueGCManager:markSweep() Got referred extents, unreferenced extents count is 0.
2024-06-18T08:52:59.722Z         info: QueueGCManager:markSweepLoop() Mark and sweep finished, take 1ms.
2024-06-18T08:52:59.722Z         info: QueueGCManager:markSweepLoop() Sleep for 60000
2024-06-18T08:52:59.723Z         info: AccountDataStore:init() Refresh accounts from environment variable AZURITE_ACCOUNTS with value *****
2024-06-18T08:53:06.196Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: BlobStorageContextMiddleware: RequestMethod=POST RequestURL=http://account1.blob.localhost/testcontainer?restype=container&comp=batch RequestHeaders:{"host":"account1.blob.localhost:10000","x-ms-version":"2021-06-08","content-length":"391","content-type":"multipart/mixed; boundary================5306085128869334238==","x-ms-date":"Tue, 18 Jun 2024 08:53:06 GMT","x-ms-client-request-id":"2dde134a-2d50-11ef-9f0f-00155dd86730","user-agent":"azsdk-python-storage-blob/12.12.0 Python/3.9.19 (Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.31)","authorization":"SharedKey account1:4mORpeBuY9gNV0YyuAzmmK+xK/2Ryo6NkdYX3BXF7dg=","accept":"*/*","accept-encoding":"gzip, deflate"} ClientIP=10.10.0.1 Protocol=http HTTPVersion=1.1
2024-06-18T08:53:06.196Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: BlobStorageContextMiddleware: Account=account1 Container=testcontainer Blob=
2024-06-18T08:53:06.196Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 verbose: DispatchMiddleware: Dispatching request...
2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: DispatchMiddleware: Operation=Container_SubmitBatch
2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 verbose: AuthenticationMiddlewareFactory:createAuthenticationMiddleware() Validating authentications.
2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: PublicAccessAuthenticator:validate() Start validation against public access.
2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 debug: PublicAccessAuthenticator:validate() Getting account properties...
2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 debug: PublicAccessAuthenticator:validate() Retrieved account name from context: account1, container: testcontainer, blob:
2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 debug: PublicAccessAuthenticator:validate() Skip public access authentication. Cannot get public access type for container testcontainer
2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: BlobSharedKeyAuthenticator:validate() Start validation against account shared key authentication.
2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: BlobSharedKeyAuthenticator:validate() [STRING TO SIGN]:"POST\n\n\n391\n\nmultipart/mixed; boundary================5306085128869334238==\n\n\n\n\n\n\nx-ms-client-request-id:2dde134a-2d50-11ef-9f0f-00155dd86730\nx-ms-date:Tue, 18 Jun 2024 08:53:06 GMT\nx-ms-version:2021-06-08\n/account1/testcontainer\ncomp:batch\nrestype:container"
2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: BlobSharedKeyAuthenticator:validate() Calculated authentication header based on key1: SharedKey account1:4mORpeBuY9gNV0YyuAzmmK+xK/2Ryo6NkdYX3BXF7dg=
2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: BlobSharedKeyAuthenticator:validate() Signature 1 matched.
2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 verbose: DeserializerMiddleware: Start deserializing...
2024-06-18T08:53:06.198Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: HandlerMiddleware: DeserializedParameters={"options":{"requestId":"2dde134a-2d50-11ef-9f0f-00155dd86730"},"restype":"container","comp":"batch","contentLength":391,"multipartContentType":"multipart/mixed; boundary================5306085128869334238==","version":"2021-06-08","body":"ReadableStream"}
2024-06-18T08:53:06.200Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 error: BlobBatchHandler: One of the request inputs is not valid.
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: Received a MiddlewareError, fill error information to HTTP response
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: ErrorName=StorageError ErrorMessage=One of the request inputs is not valid.  ErrorHTTPStatusCode=400 ErrorHTTPStatusMessage=One of the request inputs is not valid. ErrorHTTPHeaders={"x-ms-error-code":"InvalidInput","x-ms-request-id":"215d36a6-96e0-47f1-a57e-9af371a0d7b8"} ErrorHTTPBody="<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n<Error>\n  <Code>InvalidInput</Code>\n  <Message>One of the request inputs is not valid.\nRequestId:215d36a6-96e0-47f1-a57e-9af371a0d7b8\nTime:2024-06-18T08:53:06.200Z</Message>\n</Error>" ErrorStack="StorageError: One of the request inputs is not valid.\n    at BlobBatchHandler.submitBatch (/opt/azurite/dist/src/blob/handlers/BlobBatchHandler.js:282:25)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async ContainerHandler.submitBatch (/opt/azurite/dist/src/blob/handlers/ContainerHandler.js:232:36)"
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: Set HTTP code: 400
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: Set HTTP status message: One of the request inputs is not valid.
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: Set HTTP Header: x-ms-error-code=InvalidInput
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: Set HTTP Header: x-ms-request-id=215d36a6-96e0-47f1-a57e-9af371a0d7b8
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: Set content type: application/xml
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: Set HTTP body: "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n<Error>\n  <Code>InvalidInput</Code>\n  <Message>One of the request inputs is not valid.\nRequestId:215d36a6-96e0-47f1-a57e-9af371a0d7b8\nTime:2024-06-18T08:53:06.200Z</Message>\n</Error>"
2024-06-18T08:53:06.202Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 verbose: SerializerMiddleware: Start serializing...
2024-06-18T08:53:06.202Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: Serializer: Start returning stream body.
2024-06-18T08:53:06.204Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: EndMiddleware: End response. TotalTimeInMS=8 StatusCode=202 StatusMessage=Accepted Headers={"server":"Azurite-Blob/3.30.0","content-type":"multipart/mixed; boundary=","x-ms-request-id":"215d36a6-96e0-47f1-a57e-9af371a0d7b8","x-ms-version":"2024-05-04"}

Note: we used azure-storage-blob 12.12.0, which is somewhat old, but the same behavior is observed with the latest version (12.20.0).

Have you found a mitigation/solution?

These commands replace the azurite container with a patched one:

docker stop azurite
cat <<EOF | docker build --tag azurite-patched -
FROM mcr.microsoft.com/azure-storage/azurite:3.30.0
RUN sed -i 's/getHeader("content-type").split("=")\[1\]/getHeader("content-type").match(\/boundary=(.*)\/)[1].trim()/g' /opt/azurite/dist/src/blob/handlers/ContainerHandler.js
RUN sed -i 's/getHeader("content-type").split("=")\[1\]/getHeader("content-type").match(\/boundary=(.*)\/)[1].trim()/g' /opt/azurite/dist/src/blob/handlers/ServiceHandler.js
EOF
docker run -d --rm -p 10000:10000 --name azurite -e AZURITE_ACCOUNTS=account1:AAAA azurite-patched azurite --blobHost 0.0.0.0 --debug /tmp/azurite-debug.log

This container makes the "Reproducer" above work, the blob gets deleted without errors.

The patch replaces getHeader("content-type").split("=")[1] with getHeader("content-type").match(/boundary=(.*)/)[1].trim() in the submitBatch functions of both ContainerHandler.js and ServiceHandler.js.

In the source code the faulty lines are:

@blueww blueww added question Further information is requested blob-storage labels Jun 19, 2024
@blueww
Copy link
Member

blueww commented Jun 19, 2024

@EmmaZhu

Would you please help to look at the blob batch related issue?

@teamnimbus
Copy link
Author

@EmmaZhu, @blueww: we see the issue is labeled "question" but we weren't asked anything. If you need further info feel free to ask.

@blueww
Copy link
Member

blueww commented Jun 27, 2024

@teamnimbus
We will change it to "Bug" label after confirm this is an Azurite bug.

@EmmaZhu EmmaZhu added bug Something isn't working and removed question Further information is requested labels Jul 2, 2024
@EmmaZhu
Copy link
Member

EmmaZhu commented Jul 3, 2024

#2421

@EmmaZhu EmmaZhu added this to the Nov 12 2024 milestone Sep 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
blob-storage bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants