From 254fba7862555ccd34f654f3fc72763586692365 Mon Sep 17 00:00:00 2001
From: Milo Cress
Date: Mon, 4 Nov 2024 13:39:23 -0500
Subject: [PATCH 1/4] Update faqs_and_tips.md

---
 docs/source/getting_started/faqs_and_tips.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/source/getting_started/faqs_and_tips.md b/docs/source/getting_started/faqs_and_tips.md
index bea912cf8..fe07a4340 100644
--- a/docs/source/getting_started/faqs_and_tips.md
+++ b/docs/source/getting_started/faqs_and_tips.md
@@ -58,6 +58,8 @@ The `epoch_size` attribute of StreamingDataset is the number of samples per epoc
 ### What's the difference between `StreamingDataset` vs. datasets vs. streams?
 `StreamingDataset` is the dataset class. It can take in multiple streams, which are just data sources. It combines these streams into a single dataset. `StreamingDataset` does not *stream* data, as continuous bytes; instead, it downloads shard files to enable a continuous flow of samples into the training job. `StreamingDataset` is an `IterableDataset` as opposed to a map-style dataset -- samples are retrieved as needed.
 
+### I wrapped my streaming dataloader with HuggingFace's `accelerate` dataloader wrapper and my run is hanging, what should I do?
+When using HF `accelerate` with `streaming` for training, do not wrap the dataloader as this will cause the run to fail.
 
 ## 🤓 Helpful Tips

From d5949126442f01c26328e31779cbd5f831ff3206 Mon Sep 17 00:00:00 2001
From: Saaketh Narayan
Date: Mon, 4 Nov 2024 13:54:14 -0500
Subject: [PATCH 2/4] Update docs/source/getting_started/faqs_and_tips.md

---
 docs/source/getting_started/faqs_and_tips.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/getting_started/faqs_and_tips.md b/docs/source/getting_started/faqs_and_tips.md
index fe07a4340..ded395964 100644
--- a/docs/source/getting_started/faqs_and_tips.md
+++ b/docs/source/getting_started/faqs_and_tips.md
@@ -59,7 +59,7 @@ The `epoch_size` attribute of StreamingDataset is the number of samples per epoc
 `StreamingDataset` is the dataset class. It can take in multiple streams, which are just data sources. It combines these streams into a single dataset. `StreamingDataset` does not *stream* data, as continuous bytes; instead, it downloads shard files to enable a continuous flow of samples into the training job. `StreamingDataset` is an `IterableDataset` as opposed to a map-style dataset -- samples are retrieved as needed.
 
 ### I wrapped my streaming dataloader with HuggingFace's `accelerate` dataloader wrapper and my run is hanging, what should I do?
-When using HF `accelerate` with `streaming` for training, do not wrap the dataloader as this will cause the run to fail.
+When using HF Accelerate with Streaming for training, do not wrap the DataLoader as this may cause hangs during training. StreamingDataset ready for distributed training out of the box and does not need the wrapping that HF Accelerate provides.
 
 ## 🤓 Helpful Tips

From 4e6d32d2c616668ee263e569a21d6b02edc643f5 Mon Sep 17 00:00:00 2001
From: Saaketh Narayan
Date: Mon, 4 Nov 2024 13:54:19 -0500
Subject: [PATCH 3/4] Update docs/source/getting_started/faqs_and_tips.md

---
 docs/source/getting_started/faqs_and_tips.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/getting_started/faqs_and_tips.md b/docs/source/getting_started/faqs_and_tips.md
index ded395964..6b5747e08 100644
--- a/docs/source/getting_started/faqs_and_tips.md
+++ b/docs/source/getting_started/faqs_and_tips.md
@@ -58,7 +58,7 @@ The `epoch_size` attribute of StreamingDataset is the number of samples per epoc
 ### What's the difference between `StreamingDataset` vs. datasets vs. streams?
 `StreamingDataset` is the dataset class. It can take in multiple streams, which are just data sources. It combines these streams into a single dataset. `StreamingDataset` does not *stream* data, as continuous bytes; instead, it downloads shard files to enable a continuous flow of samples into the training job. `StreamingDataset` is an `IterableDataset` as opposed to a map-style dataset -- samples are retrieved as needed.
 
-### I wrapped my streaming dataloader with HuggingFace's `accelerate` dataloader wrapper and my run is hanging, what should I do?
+### Should I wrap the Streaming DataLoader with HuggingFace Accelerate's DataLoader wrapper when training?
 When using HF Accelerate with Streaming for training, do not wrap the DataLoader as this may cause hangs during training. StreamingDataset ready for distributed training out of the box and does not need the wrapping that HF Accelerate provides.
 
 ## 🤓 Helpful Tips

From 27104847fd1d761b0dea53b369c00dea705ce55e Mon Sep 17 00:00:00 2001
From: Milo Cress
Date: Mon, 4 Nov 2024 13:58:48 -0500
Subject: [PATCH 4/4] typo

---
 docs/source/getting_started/faqs_and_tips.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/getting_started/faqs_and_tips.md b/docs/source/getting_started/faqs_and_tips.md
index 6b5747e08..b9ae058fb 100644
--- a/docs/source/getting_started/faqs_and_tips.md
+++ b/docs/source/getting_started/faqs_and_tips.md
@@ -59,7 +59,7 @@ The `epoch_size` attribute of StreamingDataset is the number of samples per epoc
 `StreamingDataset` is the dataset class. It can take in multiple streams, which are just data sources. It combines these streams into a single dataset. `StreamingDataset` does not *stream* data, as continuous bytes; instead, it downloads shard files to enable a continuous flow of samples into the training job. `StreamingDataset` is an `IterableDataset` as opposed to a map-style dataset -- samples are retrieved as needed.
 
 ### Should I wrap the Streaming DataLoader with HuggingFace Accelerate's DataLoader wrapper when training?
-When using HF Accelerate with Streaming for training, do not wrap the DataLoader as this may cause hangs during training. StreamingDataset ready for distributed training out of the box and does not need the wrapping that HF Accelerate provides.
+When using HF Accelerate with Streaming for training, do not wrap the DataLoader as this may cause hangs during training. StreamingDataset is ready for distributed training out of the box and does not need the wrapping that HF Accelerate provides.
 
 ## 🤓 Helpful Tips
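
For readers hitting this issue, here is a minimal sketch of the pattern the final FAQ entry recommends: build the streaming DataLoader as usual and pass only the model and optimizer through `accelerator.prepare()`, leaving the DataLoader unwrapped. The bucket path, cache directory, stand-in model, and the `x`/`y` sample fields below are illustrative assumptions, not part of the patch.

```python
# Sketch only: the hang is avoided by NOT passing the DataLoader through
# accelerator.prepare(). StreamingDataset already shards samples across
# ranks and workers itself, so it needs no DistributedSampler and no wrapping.
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader
from accelerate import Accelerator
from streaming import StreamingDataset

accelerator = Accelerator()

# Remote and local paths are hypothetical; point these at a real MDS dataset.
dataset = StreamingDataset(
    remote='s3://my-bucket/my-dataset',
    local='/tmp/my-dataset-cache',
    batch_size=32,
)
dataloader = DataLoader(dataset, batch_size=32)

model = torch.nn.Linear(16, 1)  # stand-in model for illustration
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Prepare ONLY the model and optimizer -- the DataLoader stays unwrapped.
model, optimizer = accelerator.prepare(model, optimizer)

for batch in dataloader:
    # Since the DataLoader is unwrapped, move tensors to the device manually.
    # 'x' and 'y' are hypothetical fields written when the dataset was created.
    x = batch['x'].to(accelerator.device)
    y = batch['y'].to(accelerator.device)
    loss = F.mse_loss(model(x), y)
    accelerator.backward(loss)
    optimizer.step()
    optimizer.zero_grad()
```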