-
Notifications
You must be signed in to change notification settings - Fork 313
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/simple python3 migration #53
base: master
Are you sure you want to change the base?
Conversation
* allowing for a range of dependency versions, locked by api level
…ckend * install a viable tensorflow backend by running the command `pip install -e .[tensorflow_backend]`
was opening writing as bytes when content to be written was `str`
* encountering pickle load error: `_pickle.UnpicklingError: the STRING opcode argument must be quoted`
with python3 the values being given to `check_ascii` are now `str` instead of `bytes` using `word.encode('ascii')` to validate the given word can be ASCII encoded
c2f6829
to
fc0911b
Compare
fixing docstring
obtaining a `_pickle.UnpicklingError: the STRING opcode argument must be quoted` on loading this file.
was opening writing as `bytes` when content to be written was `str` in python3
* having analyze result tests `test_smoke_analyze_results` and `test_smoke_analyze_all_results` execute after the `test_smoke_finetune_dataset`
Do all the tests run successfully on your system? I get some tests failing that are not related to TF/Keras at all and so should have same behavior across all systems. test_tokenizer
|
I see in your above traceback your path starts with |
I've added you as a collaborator to my fork. Let me know if you need anything else to push to my repo. Though, it might be best if you forked my branch and add the changes there, then later PR them back into this branch. |
I changed the line separator from CRLF to LF for |
Does nose have any documentation on this behavior for Mac? Do you get any tracebacks from executing these failing Nosetests? I currently, do not have a good setup for testing python on Mac. Are you using a python3 version lower than 3.5? |
I'm running with Python 3.7, but perhaps nosetests is not picking up my Supporting only 3.5+ seems good. |
It was an issue with my conda environment and nosetests. If others have a similar issue, we should perhaps migrate to pytest, but that'll be in a separate PR. |
If I try |
I can't fork your branch as I already have a repo named DeepMoji :) I've pushed commits that updates the repo to use |
I've seen a few comments about the Tensorflow keras being slower. Can you check that the slow tests run in a reasonable time on your system? |
Hi, can I ask what is the status of this? Is there anything I can help with? I was in the way of migrating the codebase to py3/tf2 on my own but then I saw this PR. Let me know if you need help! |
Thanks @anbasile. This PR should be ready, I just haven't had time to benchmark it properly across Windows/Mac/Linux before merging it. It would be amazing if you could run the entire test suite using this PR on a CPU and/or GPU setup and respond here with how long running the test suite took! |
I have the tests running. I'll report here as soon as they are done. |
Here I am again: I've ran the tests overnight on the following config:
I got a few OOMs. |
Furthermore, I've found that you need to change one operation in the attention layer (
|
Thanks, that's great to know! Okay, I'll change that. Can you describe what the 4 errors were? If you could try to fix them, then that would be amazing! |
@anbasile I've added a test for checking save/loading of the model using the SavedModel format that you're interested in. However, I can't get the loading to work. It would be amazing if you could take a look at it! Steps I've taken to make the test work in case it's helpful:
|
After a lot of debugging, I've managed to load the model (using TF nightly) by disabling the mask in the embedding layer, so I guess the issue is in how the mask is handled either in the attention layer or it's a compatibility issue with the LSTM layers. I'll look it into it and come back, unless you have already an idea for how to solve this. UPDATE: the issue is due to VarianceScaling. Setting
|
Awesome, thanks! The initializer shouldn't have a big impact on this model so we should be able to change it w/o issues. |
@anbasile I've added you as a contributor to my fork if needed. Glad to see the testing and bug-fixing continuing with this PR. Hopefully, in May I should be able to start helping again with this this PR. |
Hey @nklapste, any updates on this? What in particular doesn't work? |
Hello, I sadly cannot continue my work with this PR within the foreseeable future. I would recommend having this be passed on to another. Most of the issues found are noted in the above comments. One other issue I found was in loading the dataset stored at |
I might be able to work on this to finish the PR. |
@anbasile if its any assistance I have another branch nklapste:feature/Python3Window10 were I was able to use it in a Ubuntu 20.04 python 3.8 environment. However I was only using the pre-trained model/weights and using the model as-is (aka: I was not attempting/testing any training). That branch might be a better start. |
Cherry picks of #52 to have a reduced changeset and be easier to review.
Features:
tensorflow==2.0.0
compatibilityexamples/
andscripts/
scriptsscripts/download_weights.py
now uses the system agnosticurllib.request
and uses a sha256sum for downloaded content validationKnown Issues:
some changes regarding updates to keras and tensorflow might have changed behavior with using the prebuilt
deepmoji_weights.hdf5
modeltest
test_finetuning.test_encode_texts
is raising aAssertionError
as itsavg_across_sentances
is[-0.024 0.027 -0.046 -0.002 -0. ]
instead of test's expected[-0.023, 0.021, -0.037, -0.001, -0.005]
examples/create_twitter_vocab.py
fails to execute due to it missing the file../../twitterdata/tweets.2016-09-01
data/PsychExp/raw.pickle
fails to be unpickled inscripts/convert_all_datasets.py
and inscripts/calculate_coverages.py