
@cap-js/hana corrupts large zip files (reading the zip file gives "Corrupted zip or invalid signature") #788

Closed
nishnp opened this issue Aug 27, 2024 · 3 comments · Fixed by #846
Labels: bug (Something isn't working), hana
nishnp commented Aug 27, 2024

I am working on a multitenant CAP application; the dependencies are as follows:

```
@cap-js/asyncapi 1.0.2
@cap-js/audit-logging 0.8.0
@cap-js/cds-types 0.2.0
@cap-js/hana 1.1.1
@cap-js/openapi 1.0.5
@cap-js/telemetry 0.2.3
@sap/cds 8.1.1
@sap/cds-common-conten 2.1.0
@sap/cds-compiler 5.1.2
@sap/cds-dk 8.1.2
@sap/cds-dk (global) 7.9.4
@sap/cds-fiori 1.2.4
@sap/cds-foss 5.0.1
@sap/cds-indexer 1.0.14
@sap/cds-mtxs 2.0.6
@sap/cds-odata-v2-adap 1.9.21
@sap/cds-shim 0.6.6
@sap/eslint-plugin-cds 3.0.4
Node.js v20.15.0
home /Users/payann/Documents/ace/ace/node_modules/@sap/cds
```

In the file @cap-js/hana/lib/drivers/hana-client.js, the streamBlob function allocates a binary buffer of 65536 bytes (1 << 16). If the file is larger than that, the first 65536 bytes are read, and then the next read overwrites the first few bytes, which corrupts the file signature.

For now I was able to work around it with a patch like this:

```diff
-async function* streamBlob(rs, rowIndex = -1, columnIndex, binaryBuffer = Buffer.allocUnsafe(1 << 16)) {
+async function* streamBlob(rs, rowIndex = -1, columnIndex, binaryBuffer = Buffer.allocUnsafe(1 << 25)) {
```
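The failure mode described above can be sketched in isolation. This is a simplified illustration, not the actual driver code; `readChunksShared` and `demo` are hypothetical names. An async generator that yields views into one shared buffer lets each subsequent read overwrite chunks the consumer has already collected:

```javascript
// Simplified sketch (hypothetical names): yielding views into one shared
// buffer means later reads overwrite chunks the consumer already holds.
async function* readChunksShared(parts, buffer = Buffer.alloc(4)) {
  for (const part of parts) {
    buffer.write(part, 0);                 // reuse the same backing memory
    yield buffer.subarray(0, part.length); // a view, not a copy
  }
}

async function demo() {
  const chunks = [];
  for await (const c of readChunksShared(['abcd', 'wxyz'])) chunks.push(c);
  // Both entries are views over the same 4 bytes, which now hold the
  // second read, so the first chunk's "signature" bytes are gone.
  return Buffer.concat(chunks).toString(); // 'wxyzwxyz', not 'abcdwxyz'
}
```

Enlarging the buffer, as in the patch above, only pushes the problem out to files larger than the new buffer size; it does not remove the aliasing itself.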

nishnp added the bug label on Aug 27, 2024

mathewzile commented Sep 6, 2024

I am also facing this issue using @sap/hana-client. A document stream retrieved via SELECT and converted to a Blob ends up with errors when passed to LangChain's document loaders.
This does not happen with the hdb driver (though that driver has an unrelated issue regarding streams and draft activation).

EDIT: this is resolved in my scenario, so it might be an issue unrelated to the OP's. It was caused by the buffer for each data chunk being mutated, resulting in the same data chunk repeated in my final array:
[screenshot: every chunk in the final array holds identical bytes]

I fixed it by creating a copy of the buffer on each read:
[screenshot: the fix, copying the buffer before collecting it]

The original code works with the hdb driver:
[screenshot: the original, copy-free code path]
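The copy-on-read fix described above can be sketched like this (hypothetical names again, not the actual patch): `Buffer.from(view)` allocates fresh memory for every yielded chunk, so later reads into the shared buffer cannot mutate chunks the consumer already holds.

```javascript
// Sketch of the copy-on-read fix (hypothetical names, not the actual patch):
// Buffer.from(view) copies the bytes, so each yielded chunk owns its memory.
async function* readChunksCopied(parts, buffer = Buffer.alloc(4)) {
  for (const part of parts) {
    buffer.write(part, 0);
    yield Buffer.from(buffer.subarray(0, part.length)); // independent copy
  }
}

async function demoCopied() {
  const chunks = [];
  for await (const c of readChunksCopied(['abcd', 'wxyz'])) chunks.push(c);
  return Buffer.concat(chunks).toString(); // 'abcdwxyz' — every chunk intact
}
```

The trade-off is one extra allocation and copy per chunk, which is the usual price of handing buffers to consumers whose lifetime you do not control.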

patricebender (Member) commented

@BobdenOs could you have a look?

BobdenOs (Contributor) commented

@mathewzile Thanks for your additional insights into the issue. I just wanted to point out the official Node.js implementation of this function, used like require('node:stream/consumers').blob.

I am fairly certain the official implementation would not have had the same issue you were facing. I am not sure whether you can put the MIME type onto the Blob afterwards, but I would generally recommend using the standard implementation where possible. If that is not possible, open a PR to Node; they might accept it.
