Clarify how the dataloader works with HF dataset (#3910)
adam-narozniak authored Jul 25, 2024
1 parent 7f4d2df commit fb744a2
Showing 1 changed file with 30 additions and 0 deletions.
30 changes: 30 additions & 0 deletions datasets/doc/source/tutorial-quickstart.ipynb
@@ -439,6 +439,36 @@
"dataloader = DataLoader(partition_torch, batch_size=64)"
]
},
{
"cell_type": "markdown",
"id": "b93678a5",
"metadata": {},
"source": "A `DataLoader` created this way does not yield a `tuple` when you iterate over it. Each batch is a `dict` whose keys are the column names and whose values are the batched features. The cell below shows an example."
},
{
"cell_type": "code",
"execution_count": null,
"id": "5edd3ce2",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Return type when iterating over a dataloader: <class 'dict'>\n",
"torch.Size([64, 3, 32, 32])\n",
"torch.Size([64])\n"
]
}
],
"source": [
"for batch in dataloader:\n",
"    print(f\"Return type when iterating over a dataloader: {type(batch)}\")\n",
"    print(batch[\"img\"].shape)\n",
"    print(batch[\"label\"].shape)\n",
"    break"
]
},
{
"cell_type": "markdown",
"id": "71531613",