Support setup for torch DataPipes #16603

Closed
awaelchli opened this issue Feb 1, 2023 · 4 comments
Labels
data handling (Generic data-related topic), fabric (lightning.fabric.Fabric), feature (Is an improvement or enhancement)


awaelchli commented Feb 1, 2023

Description & Motivation

When working with torchdata's DataPipes in distributed settings, the user can use DataLoader2 to run the pipe with a specific reading service that applies sharding of the data pipe correctly. For example:

# multiprocessing in a single node
from torchdata.dataloader2 import DataLoader2
from torchdata.dataloader2 import PrototypeMultiProcessingReadingService

service = PrototypeMultiProcessingReadingService(num_workers=5)
dataloader = DataLoader2(datapipe, reading_service=service)

# distributed/multi-gpu/multi-node
from torchdata.dataloader2 import DistributedReadingService

service = DistributedReadingService()
dataloader = DataLoader2(datapipe, reading_service=service)

However, this does not mix well with Lightning as the user would have to change the reading service when switching from one strategy to another.

Pitch

Similar to what we do with the injection of the DistributedSampler for the regular DataLoader, add the reading service for the datapipe automatically for the user.

from lightning.fabric import Fabric
from torchdata.dataloader2 import DataLoader2

fabric = Fabric(...)

dataloader = DataLoader2(datapipe)

# proposed: sets up the reading service for distributed training
dataloader = fabric.setup_dataloaders(dataloader)

Alternatives

No response

Additional context

If you are interested in learning more about torchdata, here is a good YouTube video by PyTorch that introduces the main concepts and values.

cc @Borda @justusschock @awaelchli @carmocca

awaelchli added the needs triage, data handling, fabric, and feature labels and removed the needs triage label Feb 1, 2023
awaelchli added this to the 2.0 milestone Feb 1, 2023
carmocca modified the milestones: 2.0, future Feb 23, 2023

NivekT commented Mar 27, 2023

I am not super familiar with Lightning, but what happens if users do:

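from torchdata.dataloader2 import (
    DataLoader2,
    DistributedReadingService,
    MultiProcessingReadingService,
    SequentialReadingService,
)

mp_rs = MultiProcessingReadingService(num_workers=2)
dist_rs = DistributedReadingService()
rs = SequentialReadingService(dist_rs, mp_rs)  # Execute both distributed and multiprocessing
dataloader = DataLoader2(datapipe, reading_service=rs)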

This would normally work with standalone torchdata (the bottom example here).

Would it still work if dataloader (as defined above) is passed into Lightning?


awaelchli commented Mar 28, 2023

In Lightning Trainer yes, because it supports arbitrary iterables. But we don't have any correctness tests for DataLoader2 specifically, so no guarantees. In Fabric, the setup_dataloaders() method does not support DataLoader2 atm.

tensorcopy commented

Is there any update or plan to support DataLoader2? Thanks!


awaelchli commented Sep 10, 2023

The future of torchdata is very unclear. Development has paused; here is the official statement: pytorch/data#1196

I suggest we wait until we know about future plans. For now, users can have datapipes/dataloader2 with Lightning by configuring them manually.
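For reference, here is a minimal sketch of what configuring this manually could look like. It is not an API of Lightning or torchdata: `reading_service_plan` and `build_dataloader2` are hypothetical helper names, and the sketch assumes the torchdata 0.6-style class names used earlier in this thread (`DistributedReadingService`, `MultiProcessingReadingService`, `SequentialReadingService`). The idea is to pick the reading service from the world size and worker count, so the choice follows the strategy rather than being hard-coded.

```python
from typing import List


def reading_service_plan(world_size: int, num_workers: int) -> List[str]:
    """Return the names of the reading services to chain, in order.

    Hypothetical helper for illustration; not part of Lightning or torchdata.
    """
    plan: List[str] = []
    if world_size > 1:
        # More than one process: shard the datapipe across ranks first.
        plan.append("DistributedReadingService")
    if num_workers > 0:
        # Then fan out to worker processes within each rank.
        plan.append("MultiProcessingReadingService")
    return plan


def build_dataloader2(datapipe, world_size: int, num_workers: int):
    """Build a DataLoader2 whose reading service follows the plan above.

    torchdata is imported lazily so the planning logic can be used
    (and tested) without torchdata installed.
    """
    import torchdata.dataloader2 as dl2  # assumes torchdata 0.6-style names

    services = []
    for name in reading_service_plan(world_size, num_workers):
        cls = getattr(dl2, name)
        if name == "MultiProcessingReadingService":
            services.append(cls(num_workers=num_workers))
        else:
            services.append(cls())

    if not services:
        # Single process, no workers: no reading service needed.
        return dl2.DataLoader2(datapipe)
    if len(services) == 1:
        return dl2.DataLoader2(datapipe, reading_service=services[0])
    # Both apply: chain them, distributed first, as in the example above.
    return dl2.DataLoader2(
        datapipe, reading_service=dl2.SequentialReadingService(*services)
    )
```

With Fabric, the world size could be taken from `fabric.world_size`, for example `build_dataloader2(datapipe, fabric.world_size, num_workers=2)`. Since `setup_dataloaders()` does not support DataLoader2, the resulting loader would be iterated directly rather than passed through Fabric, and there are no correctness guarantees for this combination.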

awaelchli closed this as not planned Sep 10, 2023
carmocca removed this from the future milestone Sep 28, 2023

4 participants