Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make torchdata default imports lazy #1243

Closed
wants to merge 10 commits into from
Closed

make torchdata default imports lazy #1243

wants to merge 10 commits into from

Conversation

andrewkho
Copy link
Contributor

@andrewkho andrewkho commented Apr 23, 2024

Summary: Push all imports in torchdata/init.py to be lazy by default so they don't interfere with new development (eg stateful_dataloader)

Test Plan:
New unit test, CI for the rest

Please read through our contribution guide prior to
creating your pull request.

  • Note that there is a section on requirements related to adding a new DataPipe.

Fixes #{issue number}

Changes

Makes imports lazy, should be no-op for downstream users

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 23, 2024
@andrewkho
Copy link
Contributor Author

Not sure why CI is skipped

@facebook-github-bot
Copy link
Contributor

@andrewkho has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@andrewkho andrewkho added the enhancement New feature or request label Apr 23, 2024
@andrewkho
Copy link
Contributor Author

CI failing for MacOS during test-documentation phase for S3 connector with some CPP Stacktrace:

################################################################################
Stack trace:
################################################################################
1   libarrow.1600.dylib                 0x0000000124089856 aws_fatal_assert + 70
2   libarrow.1600.dylib                 0x0000000124088[507](https://github.com/pytorch/data/actions/runs/8808984396/job/24179039858?pr=1243#step:11:508) aws_mem_acquire + 55
3   libarrow.1600.dylib                 0x000000012409c612 aws_string_new_from_cursor + 50
4   libarrow.1600.dylib                 0x0000000124095ec8 aws_json_value_get_from_object + 40
5   libarrow.1600.dylib                 0x0000000124080e10 aws_endpoints_ruleset_new_from_string + 96
6   libarrow.1600.dylib                 0x00000001240147c0 _ZN3Aws3Crt9Endpoints10RuleEngineC2ERK15aws_byte_cursorS5_P13aws_allocator + 48
7   libarrow.1600.dylib                 0x0000000123daefcc _ZN3Aws8Endpoint23DefaultEndpointProviderINS_2S321S3ClientConfigurationENS2_8Endpoint19S3BuiltInParametersENS4_25S3ClientContextParametersEEC2EPKcm + 124
8   _torchdata.so                       0x00000001272081ca _ZN3Aws2S38S3ClientC2ERKNS_6Client19ClientConfigurationENS2_15AWSAuthV4Signer20PayloadSigningPolicyEbNS0_34US_EAST_1_REGIONAL_ENDPOINT_OPTIONE + 618
9   _torchdata.so                       0x00000001271e491f _ZN9torchdata9S3Handler18InitializeS3ClientEv + 63
10  _torchdata.so                       0x00000001271e42af _ZN9torchdata9S3HandlerC2ElNSt3__112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEE + 1631
11  _torchdata.so                       0x00000001271e21e2 _ZNO8pybind116detail15argument_loaderIJRNS0_16value_and_holderElRKNSt3__112basic_stringIcNS4_11char_traitsIcEENS4_9allocatorIcEEEEEE9call_implIvRZNS0_8initimpl11constructorIJKlSC_EE7executeINS_6class_IN9torchdata9S3HandlerEJEEEJELi0EEEvRT_DpRKT0_EUlS3_lSC_E_JLm0ELm1ELm2EENS0_9void_typeEEESO_OT0_NS4_16integer_sequenceImJXspT1_EEEEOT2_ + 66
12  _torchdata.so                       0x00000001271e203f _ZZN8pybind1112cpp_function10initializeIZNS_6detail8initimpl11constructorIJKlRKNSt3__112basic_stringIcNS6_11char_traitsIcEENS6_9allocatorIcEEEEEE7executeINS_6class_IN9torchdata9S3HandlerEJEEEJELi0EEEvRT_DpRKT0_EUlRNS2_16value_and_holderElSE_E_vJSS_lSE_EJNS_4nameENS_9is_methodENS_7siblingENS2_24is_new_style_constructorEEEEvOSL_PFT0_DpT1_EDpRKT2_ENKUlRNS2_13function_callEE_clES19_ + 127
13  _torchdata.so                       0x00000001271d6e33 _ZN8pybind1112cpp_function10dispatcherEP7_objectS2_S2_ + 4723
14  libpython3.8.dylib                  0x000000010977e4f5 cfunction_call_varargs + 69
15  libpython3.8.dylib                  0x000000010977dafc _PyObject_MakeTpCall + 188

@andrewkho
Copy link
Contributor Author

The S3FileSaver seems to pass on MacOS when reverting the change, I'll try it locally on my laptop to see if I can repro

@andrewkho
Copy link
Contributor Author

I see, like needed to fix this import _extension which will make the s3 binary available

@facebook-github-bot
Copy link
Contributor

@andrewkho has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@andrewkho has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@andrewkho
Copy link
Contributor Author

Test is currently failing on macos for the multiprocess iterdatapipe test, when ctx == "fork". A similar test with non-datapipes is also skipped for macOS so going to follow that here too

@@ -1638,6 +1638,7 @@ def test_multiprocessing_contexts(self):
list(self._get_data_loader(ds_cls(counting_ds_n), multiprocessing_context=ctx, **dl_common_args)),
)

@unittest.skipIf(IS_MACOS, "Not working on macos")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See test_multiprocessing_contexts() above on line 1614, similar non-datapipe test is skipped for macOS so going to follow that here too

Copy link
Contributor Author

@andrewkho andrewkho Apr 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative for this test to pass is to skip ctx == "fork" or locally import torchdata.datapipes, even though there is absolutely no dependency on it, guessing it's something to do with https://www.wefearchange.org/2018/11/forkmacos.rst.html

@facebook-github-bot
Copy link
Contributor

@andrewkho has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@andrewkho merged this pull request in db1eae3.

@andrewkho andrewkho deleted the lazy-imports branch April 25, 2024 18:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. enhancement New feature or request Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants