feat: optional checksum algorithm for upload #13849
Conversation
packages/storage/src/providers/s3/apis/uploadData/putObjectJob.ts
const optionsHash = (
  await calculateContentCRC32(JSON.stringify(uploadDataOptions))
).checksum;
Thank you for making this change 👍 @wuuxigh, it does fix a sharp edge in the current implementation.
On the other hand, we are going to add an option to disable this upload caching feature completely, partly because of this sharp edge and partly because there are others. We plan a more thorough refactor of how we cache uploads.
For now I think this change is OK to merge, even though it will not be used by StorageBrowser.
cc @jimblanc
Ack, when that other change goes in we'll probably need to go through and make optionsHash optional on interfaces.
const finalCrc32 =
  checksumAlgorithm === 'crc-32'
    ? await getCombinedCrc32(data, size)
    : undefined;
Sanity check: do we have a special error message for client-side hashing errors?
No new error messages were added in this PR.
/**
 * Indicates the algorithm used to create the checksum for the object.
 * This checksum can be used as a data integrity check to verify that the data received is the same data that was originally sent.
 * @default undefined
I think the description is a bit verbose, since "integrity check" already means verifying that the data received is the same data that was originally sent.
- /**
-  * Indicates the algorithm used to create the checksum for the object.
-  * This checksum can be used as a data integrity check to verify that the data received is the same data that was originally sent.
-  * @default undefined
+ /**
+  * Indicates the algorithm used to create the checksum for the object.
+  * The checksum created is for data integrity check by service-side.
+  * By default this checksum is not created.
+  * @default undefined
  new Uint8Array((hexString.match(/\w{2}/g)! ?? []).map(h => parseInt(h, 16)))
    .buffer;

- const hexToBase64 = (hexString: string) =>
+ export const hexToBase64 = (hexString: string) =>
Ditto, move this to its own file.
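For illustration, a standalone hex-to-base64 helper could look roughly like the sketch below. This is not the PR's implementation: `hexToBytes`, `bytesToBase64`, and `BASE64_ALPHABET` are hypothetical names, the base64 encoder is written by hand only to keep the example dependency-free, and the strict input validation is an assumption (the excerpt above matches `\w{2}`, which also accepts non-hex characters).

```typescript
// Hypothetical standalone module; names are illustrative, not taken from the PR.
const BASE64_ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Parse a hex string into bytes, rejecting odd lengths and non-hex characters.
function hexToBytes(hexString: string): Uint8Array {
  if (hexString.length % 2 !== 0 || /[^0-9a-fA-F]/.test(hexString)) {
    throw new Error(`Invalid hex string: ${hexString}`);
  }
  const bytes = new Uint8Array(hexString.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(hexString.slice(i * 2, i * 2 + 2), 16);
  }
  return bytes;
}

// Encode bytes as standard base64 with '=' padding.
function bytesToBase64(bytes: Uint8Array): string {
  let out = '';
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i];
    const b1 = i + 1 < bytes.length ? bytes[i + 1] : 0;
    const b2 = i + 2 < bytes.length ? bytes[i + 2] : 0;
    const triple = (b0 << 16) | (b1 << 8) | b2;
    out += BASE64_ALPHABET.charAt((triple >> 18) & 63);
    out += BASE64_ALPHABET.charAt((triple >> 12) & 63);
    out += i + 1 < bytes.length ? BASE64_ALPHABET.charAt((triple >> 6) & 63) : '=';
    out += i + 2 < bytes.length ? BASE64_ALPHABET.charAt(triple & 63) : '=';
  }
  return out;
}

const hexToBase64 = (hexString: string): string =>
  bytesToBase64(hexToBytes(hexString));
```

Keeping the conversion in its own file, as the review asks, would let both the single-part and multipart upload paths import it without pulling in unrelated helpers.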
@@ -126,7 +128,7 @@ export const getUploadsCacheKey = ({
     levelStr = accessLevel === 'guest' ? 'public' : accessLevel;
   }

- const baseId = `${size}_${resolvedContentType}_${bucket}_${levelStr}_${key}`;
+ const baseId = `${optionsHash}_${size}_${resolvedContentType}_${bucket}_${levelStr}_${key}`;
Sanity check: What will happen here for any ongoing uploads when this change is released? Is the change backwards compatible?
It will not be backwards compatible, and in-progress uploads will need to start from scratch.
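A minimal sketch of why cached uploads stop matching, assuming the cache is a simple key-value lookup. `buildCacheKey`, `CacheKeyInput`, the sample values, and the in-memory `Map` are all hypothetical stand-ins for illustration; only the key layout follows the diff above.

```typescript
// Hypothetical input shape; field names are illustrative.
interface CacheKeyInput {
  size: number;
  contentType: string;
  bucket: string;
  accessLevel: string;
  key: string;
  optionsHash?: string;
}

// Mirrors the key layout in the diff: the options hash, when present, is prefixed.
function buildCacheKey(input: CacheKeyInput): string {
  const { size, contentType, bucket, accessLevel, key, optionsHash } = input;
  const base = `${size}_${contentType}_${bucket}_${accessLevel}_${key}`;
  return optionsHash === undefined ? base : `${optionsHash}_${base}`;
}

// Simulated cache entry written by a client running the old key format.
const cache = new Map<string, string>();
const file: CacheKeyInput = {
  size: 1024,
  contentType: 'text/plain',
  bucket: 'my-bucket',
  accessLevel: 'public',
  key: 'notes.txt',
};
cache.set(buildCacheKey(file), 'upload-id-123');

// The same file after the change: the prefixed key no longer finds the entry,
// so the upload cannot be resumed and starts over.
const newKey = buildCacheKey({ ...file, optionsHash: 'a1b2c3d4' });
const resumable = cache.has(newKey);
```

Here `resumable` is `false` because no entry was ever written under the prefixed key, which is the behaviour described in the reply above.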
/**
 * Indicates the algorithm used to create the checksum for the object.
 * This checksum can be used as a data integrity check to verify that the data received is the same data that was originally sent.
 * @default undefined
To build on what Allan suggested, I'd make it even less verbose:
- /**
-  * Indicates the algorithm used to create the checksum for the object.
-  * This checksum can be used as a data integrity check to verify that the data received is the same data that was originally sent.
-  * @default undefined
+ /**
+  * The algorithm used to compute a checksum for the object. Used to verify that the data received by S3
+  * matches what was originally sent. Disabled by default.
+  *
+  * @default undefined
    offset += crc32Hash.length;
  }

  return `${(await calculateContentCRC32(combinedArray.buffer)).checksum}-${crc32List.length}`;
Probably a stupid question, but why is the length appended to the hash?
The length (the number of part checksums) is appended to indicate that this is a combined checksum, which is the format S3 expects.
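As a rough sketch of the idea, a combined checksum can be built by computing a CRC-32 per part, concatenating those per-part checksums, hashing the concatenation, and appending the part count. The code below is not the PR's `getCombinedCrc32`: `crc32`, `crcToBytes`, and `combinedCrc32` are hypothetical names, the per-part checksums are assumed to be concatenated as raw big-endian bytes, and the result is left in hex for readability (the PR converts to base64 elsewhere).

```typescript
// Table-driven CRC-32 using the reflected polynomial 0xEDB88320.
const CRC_TABLE = new Uint32Array(256);
for (let n = 0; n < 256; n++) {
  let c = n;
  for (let k = 0; k < 8; k++) {
    c = c & 1 ? 0xedb88320 ^ (c >>> 1) : c >>> 1;
  }
  CRC_TABLE[n] = c >>> 0;
}

function crc32(bytes: Uint8Array): number {
  let crc = 0xffffffff;
  for (let i = 0; i < bytes.length; i++) {
    crc = CRC_TABLE[(crc ^ bytes[i]) & 0xff] ^ (crc >>> 8);
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// Serialize a 32-bit checksum as 4 big-endian bytes (assumed layout).
function crcToBytes(crc: number): Uint8Array {
  return Uint8Array.of(
    (crc >>> 24) & 0xff,
    (crc >>> 16) & 0xff,
    (crc >>> 8) & 0xff,
    crc & 0xff,
  );
}

// Checksum of the concatenated part checksums, suffixed with "-<part count>".
function combinedCrc32(parts: Uint8Array[]): string {
  const combined = new Uint8Array(parts.length * 4);
  parts.forEach((part, index) => {
    combined.set(crcToBytes(crc32(part)), index * 4);
  });
  const hex = ('00000000' + crc32(combined).toString(16)).slice(-8);
  return `${hex}-${parts.length}`;
}
```

The `-N` suffix is what distinguishes this value from a CRC-32 computed over the whole object: the two are different numbers, so the receiver needs to know which one it was given.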
Looks good to me!
Looks good to me! Thank you for adding this.
Merged commit 02cb08a into aws-amplify:storage-browser/integrity
…)" This reverts commit 02cb08a.
Description of changes
- checksumAlgorithm option for uploadData API to enable CRC-32 checksum calculation.
- UploadDataOptions CRC-32 hash to avoid collision.
Issue #, if available
Description of how you validated changes
- checksumAlgorithm option for CRC-32.
Checklist
- yarn test passes
Checklist for repo maintainers
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.