Support for DALI, pytorch, and tensorflow reader #80

Open
hariharan-devarajan opened this issue Jul 29, 2023 · 0 comments · May be fixed by #81
Labels
enhancement New feature or request

hariharan-devarajan commented Jul 29, 2023

All of these data loaders provide their own internal reading functions. I will use this issue to describe some of them and how they could be integrated into dlio_benchmark.

DALI data loader

Examples: npz, tfrecord

Suggestion: define an input pipeline that a) reads files, b) extracts samples, and c) resizes them. DaliReader will have an init, read, and finalize API.
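
To make the init, read, and finalize lifecycle concrete, here is a minimal plain-Python sketch of the three pipeline steps. It does not use DALI at all: `BaseReader`, `InMemoryReader`, and `resize` are hypothetical names invented for illustration, "files" are keys into an in-memory dict, and the resize is a simple nearest-neighbour stand-in for a real resize operator.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator, List, Sequence

# A sample is a 2-D grid of values, standing in for an image or array.
Sample = List[List[int]]


class BaseReader(ABC):
    """Hypothetical interface: every reader exposes init, read, and finalize."""

    @abstractmethod
    def init(self, files: Sequence[str]) -> None:
        """Prepare the reader for the given files."""

    @abstractmethod
    def read(self) -> Iterator[Sample]:
        """Yield processed samples one at a time."""

    @abstractmethod
    def finalize(self) -> None:
        """Release any state held by the reader."""


def resize(sample: Sample, height: int, width: int) -> Sample:
    """Nearest-neighbour resize, a stand-in for a real resize operator."""
    src_h, src_w = len(sample), len(sample[0])
    return [
        [sample[r * src_h // height][c * src_w // width] for c in range(width)]
        for r in range(height)
    ]


class InMemoryReader(BaseReader):
    """Toy reader whose 'files' are keys into a dict (assumption, not real I/O)."""

    def __init__(self, storage: Dict[str, List[Sample]], height: int, width: int) -> None:
        self._storage = storage
        self._height = height
        self._width = width
        self._files: List[str] = []

    def init(self, files: Sequence[str]) -> None:
        missing = [name for name in files if name not in self._storage]
        if missing:
            raise FileNotFoundError(f"unknown files: {missing}")
        self._files = list(files)

    def read(self) -> Iterator[Sample]:
        for name in self._files:                 # a) read a file
            for sample in self._storage[name]:   # b) extract its samples
                yield resize(sample, self._height, self._width)  # c) resize

    def finalize(self) -> None:
        self._files = []
```

A real DaliReader would presumably build the same three steps out of DALI operators inside a DALI pipeline; only the lifecycle shape is meant to carry over from this sketch.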

TensorFlow

Examples: csv (experimental) and tfrecord

Suggestion: return a tf.data.Dataset that includes reading, extracting samples, and resizing. TensorflowReader will have an init, read, and finalize API.

PyTorch

The recommended way to use PyTorch is to define custom data loaders, although it also provides some image loading of its own.

Suggested Changes

  • I will create separate enums for the TensorflowReaders and DaliReaders we support. Their numeric values will match those in our ReaderType for compatibility.
  • Rename our data loaders to DLIO_PYTORCH, DLIO_TENSORFLOW, and DLIO_DALI, as these are our own implementations.
  • Similarly, rename our data readers to DLIO_CSV and so on.
  • The new data loaders will be called NATIVE_TENSORFLOW and NATIVE_PYTORCH.
  • For validation, our current loaders work with our DLIOReaderType. If the user selects NATIVE_TENSORFLOW, the reader will be validated against TensorflowReaderType, and similarly for DALI.
  • The base classes for these readers will be different as well. We will have four base classes: DaliBaseReader, DLIOBaseReader, PyTorchBaseReader, and TensorflowBaseReader.
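
The enum and validation scheme in the list above could look roughly like the sketch below. The enum names DLIOReaderType and TensorflowReaderType and the loader names come from this issue; everything else is an assumption for illustration: the numeric values, the set of members in each enum, the `DataLoaderType` enum, and the `validate_reader` helper are hypothetical and are not taken from the dlio_benchmark code. DALI and NATIVE_PYTORCH are left out of the mapping because the issue does not name the exact loader/enum pairing for them.

```python
from enum import Enum
from typing import Dict, Type


class DLIOReaderType(Enum):
    # Hypothetical values; the real ReaderType numbers may differ.
    CSV = 1
    NPZ = 2
    TFRECORD = 3


class TensorflowReaderType(Enum):
    # Shared formats reuse the DLIOReaderType numbers for compatibility.
    CSV = 1
    TFRECORD = 3


class DataLoaderType(Enum):
    DLIO_PYTORCH = "dlio_pytorch"
    DLIO_TENSORFLOW = "dlio_tensorflow"
    DLIO_DALI = "dlio_dali"
    NATIVE_TENSORFLOW = "native_tensorflow"
    NATIVE_PYTORCH = "native_pytorch"


# Which reader enum each data loader is validated against.
READER_ENUM_FOR_LOADER: Dict[DataLoaderType, Type[Enum]] = {
    DataLoaderType.DLIO_PYTORCH: DLIOReaderType,
    DataLoaderType.DLIO_TENSORFLOW: DLIOReaderType,
    DataLoaderType.DLIO_DALI: DLIOReaderType,
    DataLoaderType.NATIVE_TENSORFLOW: TensorflowReaderType,
}


def validate_reader(loader: DataLoaderType, reader_name: str) -> Enum:
    """Return the reader enum member for this loader, or raise ValueError."""
    reader_enum = READER_ENUM_FOR_LOADER.get(loader)
    if reader_enum is None:
        raise ValueError(f"no reader enum is defined for {loader.name}")
    try:
        return reader_enum[reader_name.upper()]
    except KeyError:
        supported = ", ".join(member.name for member in reader_enum)
        raise ValueError(
            f"{loader.name} does not support reader '{reader_name}' "
            f"(supported: {supported})"
        ) from None
```

Keeping the numeric values aligned means a configuration that names a shared format such as tfrecord resolves to the same number whichever enum it is validated against, while an unsupported combination fails early with a clear error.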
@hariharan-devarajan hariharan-devarajan self-assigned this Jul 29, 2023
@hariharan-devarajan hariharan-devarajan added the enhancement New feature or request label Jul 29, 2023