Add configurable download method to object_store for enhanced usability #6837

Draft
wants to merge 7 commits into
base: main

@midnattsol midnattsol commented Dec 4, 2024

Which issue does this PR close?

Closes #5277

Rationale for this change

The current `get` method in `ObjectStore` leaves multipart downloads and large file transfers to the caller, which is complex and error-prone. This PR proposes a new `download` method that streamlines the process by:

  • Automatically managing concurrency for large file downloads.
  • Supporting chunk buffering to optimize memory usage.
  • Allowing configurable retry mechanisms to handle transient errors.

This change simplifies the API for users and aligns with the feature request outlined in issue #5277.
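
To illustrate the manual work that a chunked download involves, here is a minimal sketch of splitting an object into byte ranges. The helper name `chunk_ranges` and the `u64` offsets are assumptions for illustration; the PR's actual chunking logic is not shown on this page.

```rust
use std::ops::Range;

/// Split an object of `total_len` bytes into consecutive half-open byte
/// ranges of at most `chunk_size` bytes each.
/// Hypothetical helper for illustration, not code from this PR.
fn chunk_ranges(total_len: u64, chunk_size: u64) -> Vec<Range<u64>> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    let mut ranges = Vec::new();
    let mut start = 0u64;
    while start < total_len {
        // The last chunk may be shorter than `chunk_size`.
        let end = start.saturating_add(chunk_size).min(total_len);
        ranges.push(start..end);
        start = end;
    }
    ranges
}
```

For a 10-byte object and a chunk size of 4, this yields the ranges `0..4`, `4..8`, and `8..10`. Each range could then be fetched with a ranged read such as `ObjectStore::get_range`, whose exact integer type depends on the crate version.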

What changes are included in this PR?

  • Added a new `download` method in the `object_store::local` module.
  • Introduced the `TransferOptions` struct for user-configurable transfer options, such as concurrency, buffer size, and retry limits.
  • Implemented helper functions:
    • `download_chunk`: Downloads individual chunks of data.
    • `write_multi_chunks`: Writes downloaded chunks to a local file.
  • Added cancellation support for graceful handling of aborted downloads.
  • Updated documentation to reflect the new API.
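
The division of labour between the chunk downloader and the writer can be sketched with standard-library threads and a bounded channel. This is only an approximation under stated assumptions: the PR's code is not shown here and presumably runs asynchronously against an `ObjectStore`, whereas this sketch reads from an in-memory buffer and writes into a `Vec<u8>`. The signatures, the `write_chunk` stand-in, and the `chunk_size` parameter are hypothetical.

```rust
use std::ops::Range;
use std::sync::mpsc::sync_channel;
use std::sync::{Arc, Mutex};
use std::thread;

/// Stand-in for a ranged read from the object store (hypothetical signature).
fn download_chunk(source: &[u8], range: Range<usize>) -> Vec<u8> {
    source[range].to_vec()
}

/// Stand-in for the writer: place a chunk at its offset in the destination.
/// Chunks may arrive out of order, so each one carries its own offset.
fn write_chunk(dest: &mut [u8], offset: usize, chunk: &[u8]) {
    dest[offset..offset + chunk.len()].copy_from_slice(chunk);
}

/// Copy `source` into a new buffer using `concurrent_tasks` worker threads and
/// a queue of at most `buffer_capacity` chunks between workers and writer.
fn download(
    source: Arc<Vec<u8>>,
    chunk_size: usize,
    concurrent_tasks: usize,
    buffer_capacity: usize,
) -> Vec<u8> {
    assert!(chunk_size > 0 && concurrent_tasks > 0);
    let total = source.len();
    // Work queue of byte ranges shared by all workers.
    let ranges: Vec<Range<usize>> = (0..total)
        .step_by(chunk_size)
        .map(|start| start..(start + chunk_size).min(total))
        .collect();
    let queue = Arc::new(Mutex::new(ranges));
    // Bounded channel: workers block when the writer falls behind,
    // which caps the number of chunks held in memory.
    let (tx, rx) = sync_channel::<(usize, Vec<u8>)>(buffer_capacity);

    let workers: Vec<_> = (0..concurrent_tasks)
        .map(|_| {
            let (queue, tx, source) = (Arc::clone(&queue), tx.clone(), Arc::clone(&source));
            thread::spawn(move || loop {
                // Take the next range; the lock is released before downloading.
                let next = queue.lock().unwrap().pop();
                let Some(range) = next else { break };
                let offset = range.start;
                let chunk = download_chunk(&source, range);
                if tx.send((offset, chunk)).is_err() {
                    break; // the writer has gone away
                }
            })
        })
        .collect();
    drop(tx); // the receive loop ends once every worker has finished

    let mut dest = vec![0u8; total];
    for (offset, chunk) in rx {
        write_chunk(&mut dest, offset, &chunk);
    }
    for worker in workers {
        worker.join().unwrap();
    }
    dest
}
```

The bounded channel is the design choice worth noting: it lets downloads run ahead of the writer by a fixed number of chunks, so memory use stays proportional to the buffer capacity rather than to the object size.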

Are there any user-facing changes?

Yes:

  • Users can now call the `download` method to simplify file downloads from the object store.
  • Transfer behavior can be configured using the new `TransferOptions` struct.
  • Documentation has been updated to cover usage and configuration of the `download` method.
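
As a rough picture of what such configuration might look like, the sketch below defines an options struct and a retry helper that reads from it. The field names `concurrent_tasks` and `buffer_capacity` appear in this PR's commit messages; `max_retries`, the field types, the default values, and the `with_retries` helper are assumptions made for illustration only.

```rust
/// Hypothetical shape of the options struct. Only `concurrent_tasks` and
/// `buffer_capacity` are named in the PR; everything else is assumed.
#[derive(Debug, Clone, PartialEq)]
pub struct TransferOptions {
    /// Number of chunks downloaded in parallel.
    pub concurrent_tasks: usize,
    /// Maximum number of downloaded chunks waiting to be written.
    pub buffer_capacity: usize,
    /// How many times a failed operation is retried before giving up.
    pub max_retries: usize,
}

impl Default for TransferOptions {
    fn default() -> Self {
        // Arbitrary illustrative defaults, not the PR's values.
        Self { concurrent_tasks: 4, buffer_capacity: 8, max_retries: 3 }
    }
}

/// Run `op` until it succeeds or `max_retries` retries have been used.
/// Returns the last error if every attempt fails.
pub fn with_retries<T, E>(
    opts: &TransferOptions,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(err) if attempt >= opts.max_retries => return Err(err),
            Err(_) => attempt += 1,
        }
    }
}
```

A caller could override a single setting with struct update syntax, for example `TransferOptions { max_retries: 5, ..Default::default() }`. With this helper, an operation runs at most `max_retries + 1` times.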

midnattsol and others added 4 commits December 3, 2024 12:56
- Implemented `download_chunk` and `write_multi_chunks` to handle concurrent downloads and efficient file writing.
- Added the main `download` function to coordinate multi-chunk downloads and writes.
- Introduced support for cancellation, enabling early termination of both downloads and writes.
- Added a configuration struct to allow customization of concurrency, buffer size, and retry limits.

These functions provide an efficient way to handle downloads from an `ObjectStore` while ensuring robust error handling and cancellation support.
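
The cancellation mechanism used by this commit is not visible on this page. As a minimal sketch of the general idea, assuming a shared `AtomicBool` flag as a stand-in, a transfer loop can check for cancellation at every chunk boundary. The `Outcome` type and `transfer_chunks` function are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Outcome of a cancellable transfer (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Outcome {
    Completed { chunks: usize },
    Cancelled { chunks_done: usize },
}

/// Process chunks one by one, checking the shared flag before each chunk so
/// that a cancellation request stops the work at the next chunk boundary.
fn transfer_chunks(
    chunks: &[Vec<u8>],
    cancelled: &AtomicBool,
    mut on_chunk: impl FnMut(&[u8]),
) -> Outcome {
    for (done, chunk) in chunks.iter().enumerate() {
        if cancelled.load(Ordering::Relaxed) {
            return Outcome::Cancelled { chunks_done: done };
        }
        on_chunk(chunk.as_slice());
    }
    Outcome::Completed { chunks: chunks.len() }
}
```

Because the flag is only read between chunks, a chunk that has already started is always finished, which keeps the partially written output aligned to chunk boundaries.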
…patibility with uploads. This change enhances clarity by using a more generic and versatile name for transfer-related operations.

Changes made:
- Updated the name in the source code.
- Adjusted related documentation and comments.

No functional changes were introduced beyond the refactor.
@github-actions github-actions bot added the object-store Object Store Interface label Dec 4, 2024
- Renamed struct TransferConfig to TransferOptions to improve code clarity and consistency.
- Renamed fields in `TransferConfig`: `max_concurrent_chunks` to `concurrent_tasks`, and `chunk_queue_size` to `buffer_capacity` for clarity.
- Updated references to old variables throughout the codebase.

Rationale:
- Improved readability and maintainability.
- The old names were ambiguous or inconsistent with naming conventions.

No functional changes were made; this is purely a refactor for naming consistency.
…SendToChannel` formerly `SendDataError`

- Renamed several error variants for better clarity and consistency.
- Updated the `source` of one error to properly propagate the original cause.
- Removed redundant documentation as it's already implicit in the code.
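
Propagating the original cause through an error's `source` can be sketched with the standard `std::error::Error` trait. Only the variant name `SendToChannel` comes from the commit message above; the enum name `TransferError`, the wrapped `SendError<Vec<u8>>` type, the `send_chunk` helper, and the use of a standard-library channel are assumptions for illustration.

```rust
use std::error::Error;
use std::fmt;
use std::sync::mpsc::{SendError, SyncSender};

/// Hypothetical error enum for this sketch.
#[derive(Debug)]
enum TransferError {
    /// Sending a downloaded chunk to the writer failed.
    SendToChannel { source: SendError<Vec<u8>> },
}

impl fmt::Display for TransferError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            TransferError::SendToChannel { .. } => {
                write!(f, "failed to send chunk to the writer")
            }
        }
    }
}

impl Error for TransferError {
    // Returning the inner error keeps the original cause reachable for callers.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            TransferError::SendToChannel { source } => Some(source),
        }
    }
}

/// Try to send a chunk and wrap any failure without discarding its cause.
fn send_chunk(chunk: Vec<u8>, tx: &SyncSender<Vec<u8>>) -> Result<(), TransferError> {
    tx.send(chunk)
        .map_err(|source| TransferError::SendToChannel { source })
}
```

Callers that walk the error chain with `Error::source` can then see both the high-level message and the underlying channel failure.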
Labels
object-store Object Store Interface
Development

Successfully merging this pull request may close these issues.

Object_store: put_file and get_file methods
1 participant