Feature/simple python3 migration #53

nklapste · 2020-01-20T20:16:11Z

Cherry picks of #52 to have a reduced changeset and be easier to review.

Features:

Python3 compatibility
Windows 10 compatibility
tensorflow==2.0.0 compatibility
Smoke tests for examples/ and scripts/ scripts
scripts/download_weights.py now uses the system agnostic urllib.request and uses a sha256sum for downloaded content validation

Known Issues:

some changes regarding updates to keras and tensorflow might have changed behavior with using the prebuilt deepmoji_weights.hdf5 model

test test_finetuning.test_encode_texts is raising a AssertionError as its avg_across_sentances is [-0.024 0.027 -0.046 -0.002 -0. ] instead of test's expected [-0.023, 0.021, -0.037, -0.001, -0.005]
examples/create_twitter_vocab.py fails to execute due to it missing the file ../../twitterdata/tweets.2016-09-01
data/PsychExp/raw.pickle fails to be unpickled in scripts/convert_all_datasets.py and in scripts/calculate_coverages.py

* allowing for a range of dependency versions, locked by api level

…ckend * install a viable tensorflow backend by running the command `pip install -e .[tensorflow_backend]`

was opening writing as bytes when content to be written was `str`

…thon 3

* encountering pickle load error: `_pickle.UnpicklingError: the STRING opcode argument must be quoted`

with python3 the values being given to `check_ascii` are now `str` instead of `bytes` using `word.encode('ascii')` to validate the given word can be ASCII encoded

fixing docstring

obtaining a `_pickle.UnpicklingError: the STRING opcode argument must be quoted` on loading this file.

was opening writing as `bytes` when content to be written was `str` in python3

* having analyze result tests `test_smoke_analyze_results` and `test_smoke_analyze_all_results` execute after the `test_smoke_finetune_dataset`

bfelbo · 2020-02-02T21:56:28Z

Do all the tests run successfully on your system?

I get some tests failing that are not related to TF/Keras at all and so should have same behavior across all systems.

test_tokenizer

======================================================================
FAIL: Tokenizing emojis/emoticons/decorations.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/Users/bfelbo/code/deepmoji_python3_pr/DeepMoji/tests/test_tokenizer.py", line 112, in test_emojis
    test_base(TESTS_EMOJIS)
  File "/Users/bfelbo/code/deepmoji_python3_pr/DeepMoji/tests/test_tokenizer.py", line 165, in test_base
    .format(test, expected, actual)
AssertionError: Tokenization of 'i \U0001f496 you to the moon and back' failed, expected: ['i', '\\U0001f496', 'you', 'to', 'the', 'moon', 'and', 'back'], actual: ['i', '\\', 'U', '0001', 'f', '496', 'you', 'to', 'the', 'moon', 'and', 'back']

======================================================================
FAIL: Tokenizing currencies.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/Users/bfelbo/code/deepmoji_python3_pr/DeepMoji/tests/test_tokenizer.py", line 142, in test_currencies
    test_base(TESTS_CURRENCIES)
  File "/Users/bfelbo/code/deepmoji_python3_pr/DeepMoji/tests/test_tokenizer.py", line 165, in test_base
    .format(test, expected, actual)
AssertionError: Tokenization of '10,00\u20ac' failed, expected: ['10', ',', '00', '\\u20ac'], actual: ['10', ',', '00', '\\', 'u', '20', 'ac']

nklapste · 2020-02-05T19:34:25Z

I see in your above traceback your path starts with /usr/local/lib/python2.7/ are you running these tests with python2.7? I would assume this would cause errors do to migrating to python3 and the syntax for escaping various strings/Unicode has changed. Could you try running the tests with a python3.5+ environment?

nklapste · 2020-02-05T19:35:55Z

I had a lot of issues getting the updated code to work, but after importing tensorflow.python.keras, turning off TF eager mode and fixing various issues due to breaking changes with Keras, it now works 🎉

test_finetuning.test_encode_texts now returns the same encoded representations. I'll clean up the changes and push them to this PR. Can you give me push permission to your fork?

I've added you as a collaborator to my fork. Let me know if you need anything else to push to my repo. Though, it might be best if you forked my branch and add the changes there, then later PR them back into this branch.

nklapste · 2020-02-05T19:40:30Z

How come data/Olympic/raw.pickle was changed? Did you repickle all the benchmark datasets using Python3?

I changed the line separator from CRLF to LF for data/Olympic/raw.pickle, otherwise it failed to depickle in a Windows environment. Have you observed any changes/failures within your environment from this change?

nklapste · 2020-02-07T16:46:48Z

The smoke tests are great, but at least on Mac, you cannot use subprocess.run() with nosetests. Nosetests simply remove that function to prevent messing with the testing setup so we'll have to do the smoke tests some other way. Can you look into that?

Does nose have any documentation on this behavior for Mac?

Do you get any tracebacks from executing these failing Nosetests? I currently, do not have a good setup for testing python on Mac.

Are you using a python3 version lower than 3.5? subprocess.run was added in python3.5. If we want older python3 compatibility we could possibly use subprocess.call.

bfelbo · 2020-02-07T17:58:19Z

I'm running with Python 3.7, but perhaps nosetests is not picking up my conda activate deepmoji. I'll take a look on the weekend :)

Supporting only 3.5+ seems good.

bfelbo · 2020-02-10T03:24:17Z

It was an issue with my conda environment and nosetests. If others have a similar issue, we should perhaps migrate to pytest, but that'll be in a separate PR.

bfelbo · 2020-02-10T03:35:57Z

The extremely long smoke tests should be flagged with the @attr("slow").

However, within my environment running the recommended nosetests command from the README.md:
nosetests -v -a '!slow'
Resulted in no tests being run.

However, running:
nosetests -v -a "!slow"
Ran the tests, omitting the extremely slow tests! Would it be possible for you to validate this same command works in your environment as well?

If I try "!slow" I get bash: !slow: event not found. Can you try nosetests -v -a 'slow=False'? That works on Mac in my environment.

bfelbo · 2020-02-10T06:13:07Z

I had a lot of issues getting the updated code to work, but after importing tensorflow.python.keras, turning off TF eager mode and fixing various issues due to breaking changes with Keras, it now works 🎉
test_finetuning.test_encode_texts now returns the same encoded representations. I'll clean up the changes and push them to this PR. Can you give me push permission to your fork?

I've added you as a collaborator to my fork. Let me know if you need anything else to push to my repo. Though, it might be best if you forked my branch and add the changes there, then later PR them back into this branch.

I can't fork your branch as I already have a repo named DeepMoji :) I've pushed commits that updates the repo to use tensorflow.python.keras rather than plain keras. All the non-slow tests succeed for me (I'll run the slow ones overnight). Can you check that the tests succeed in your environment?

bfelbo · 2020-02-10T18:00:02Z

I've seen a few comments about the Tensorflow keras being slower. Can you check that the slow tests run in a reasonable time on your system?

anbasile · 2020-04-07T11:42:07Z

Hi, can I ask what is the status of this? Is there anything I can help with? I was in the way of migrating the codebase to py3/tf2 on my own but then I saw this PR. Let me know if you need help!

bfelbo · 2020-04-07T16:11:09Z

Thanks @anbasile. This PR should be ready, I just haven't had time to benchmark it properly across Windows/Mac/Linux before merging it.

It would be amazing if you could run the entire test suite using this PR on a CPU and/or GPU setup and respond here with how long running the test suite took!

anbasile · 2020-04-08T15:58:06Z

I have the tests running. I'll report here as soon as they are done.

anbasile · 2020-04-09T07:58:03Z

Here I am again: I've ran the tests overnight on the following config:

Ubuntu 18.04,
4x 1080 Ti,
AMD Threadripper
128G RAM

Ran 45 tests in 7360.620s

FAILED (errors=4)

I got a few OOMs.

anbasile · 2020-04-09T11:22:53Z

Furthermore, I've found that you need to change one operation in the attention layer (attlayer.py), otherwise the model can't be serialized using the SavedModel format:

    def call(self, x, mask=None):
        # computes a probability distribution over the timesteps
        # uses 'max trick' for numerical stability
        # reshape is done to avoid issue with Tensorflow
        # and 1-dimensional weights
        logits = K.dot(x, self.W)
        x_shape = K.shape(x)
        logits = K.reshape(logits, (x_shape[0], x_shape[1]))
        ai = K.exp(logits - K.max(logits, axis=-1, keepdims=True))

vs.
    def call(self, x, mask=None):
        [...]
        logits = tf.tensordot(x, self.W, axes=1)
        [...]

bfelbo · 2020-04-09T17:11:42Z

Thanks, that's great to know! Okay, I'll change that.

Can you describe what the 4 errors were? If you could try to fix them, then that would be amazing!

bfelbo · 2020-04-11T22:58:30Z

@anbasile I've added a test for checking save/loading of the model using the SavedModel format that you're interested in. However, I can't get the loading to work. It would be amazing if you could take a look at it!

Steps I've taken to make the test work in case it's helpful:

Fix the error you mentioned by using tensordot
Fix You tried to call count_params on z_input, but the layer isn't built. You can build it manually via: `z_input.build(batch_input_shape tensorflow/tensorflow#35353 by upgrading to TF 2.1
Fix ValueError: Could not find matching function to call loaded from the SavedModel. tensorflow/tensorflow#37339 by upgrading to TF nightly
Stuck on the following error, which has been mentioned TypeError: int() argument must be a string, a bytes-like object or a number, not 'Tensor' tensorflow/tensorflow#16951, but is unclear how to fix for this model

    fan_in, fan_out = _compute_fans(scale_shape)
  File "/usr/local/anaconda3/envs/deepmoji/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py", line 1425, in _compute_fans
    return int(fan_in), int(fan_out)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'

anbasile · 2020-04-15T09:41:44Z

After a lot of debugging, I've managed to load the model (using TF nightly) by disabling the mask in the embedding layer, so I guess the issue is in how the mask is handled either in the attention layer or it's a compatibility issue with the LSTM layers. I'll look it into it and come back, unless you have already an idea for how to solve this.

UPDATE: the issue is due to VarianceScaling.

Setting he_uniform (or any other initializer which doesn't inherit from VarianceScaling) as kernel_initializer in both LSTM layers fixes the issue, but I have to check if this changes the predictions of the current saved model.

    [...]
    # ordering of the way the merge is done is important for consistency with the pretrained model
    lstm_0_output = Bidirectional(LSTM(512, return_sequences=True, kernel_initializer=tf.keras.initializers.get('he_uniform'), name="bi_lstm_0"))(x)
    lstm_1_output = Bidirectional(LSTM(512, return_sequences=True, kernel_initializer=tf.keras.initializers.get('he_uniform'), name="bi_lstm_1"))(lstm_0_output)
    x = concatenate([lstm_1_output, lstm_0_output, x])
    [...]

bfelbo · 2020-04-15T17:31:00Z

Awesome, thanks! The initializer shouldn't have a big impact on this model so we should be able to change it w/o issues.

nklapste · 2020-04-20T03:05:31Z

@anbasile I've added you as a contributor to my fork if needed. Glad to see the testing and bug-fixing continuing with this PR.

Hopefully, in May I should be able to start helping again with this this PR.

john-heyer · 2020-07-28T01:50:00Z

Hey @nklapste, any updates on this? What in particular doesn't work?

nklapste · 2021-03-22T19:58:13Z

Hey @nklapste, any updates on this? What in particular doesn't work?

Hello, I sadly cannot continue my work with this PR within the foreseeable future. I would recommend having this be passed on to another. Most of the issues found are noted in the above comments.

One other issue I found was in loading the dataset stored at '../data/PsychExp/raw.pickle' though I do not know the root reasons for this failure.

anbasile · 2021-03-23T18:42:35Z

Hey @nklapste, any updates on this? What in particular doesn't work?

Hello, I sadly cannot continue my work with this PR within the foreseeable future. I would recommend having this be passed on to another. Most of the issues found are noted in the above comments.

One other issue I found was in loading the dataset stored at '../data/PsychExp/raw.pickle' though I do not know the root reasons for this failure.

I might be able to work on this to finish the PR.

nklapste · 2021-04-09T02:15:19Z

@anbasile if its any assistance I have another branch nklapste:feature/Python3Window10 were I was able to use it in a Ubuntu 20.04 python 3.8 environment. However I was only using the pre-trained model/weights and using the model as-is (aka: I was not attempting/testing any training). That branch might be a better start.

nklapste added 12 commits January 20, 2020 12:03

updating for python3 and windows compatibility

6546fa6

locking down requirements

899587c

* allowing for a range of dependency versions, locked by api level

adding extras_requires option for installing a viable tensorflow ba…

96beec0

…ckend * install a viable tensorflow backend by running the command `pip install -e .[tensorflow_backend]`

adding notes on how to install a recommended tensorflow backend

e4cb4da

fixing csv write error

63d1cfc

was opening writing as bytes when content to be written was `str`

updating convert_all_datasets.py to opertate with windows 10 and py…

c8ceac2

…thon 3

disabling loading of PsychExp dataset

3e2feba

* encountering pickle load error: `_pickle.UnpicklingError: the STRING opcode argument must be quoted`

coverting to LF from CRLF

a628d8a

updating download_weights.py to work in windows10, python3

c108abc

fixing imports of example_helper to be used for local invocation

079d1f8

fixup on check_ascii

28306cf

with python3 the values being given to `check_ascii` are now `str` instead of `bytes` using `word.encode('ascii')` to validate the given word can be ASCII encoded

removing comment

abca333

nklapste mentioned this pull request Jan 20, 2020

Feature/Python3 & Windows10 compatibility #52

Closed

minimizing diffs

fc0911b

nklapste force-pushed the feature/SimplePython3Migration branch from c2f6829 to fc0911b Compare January 20, 2020 20:35

nklapste added 13 commits January 20, 2020 13:51

adding smoke tests for scripts in examples/

16bdfbd

adding smoke tests for scripts/

47646ca

fixing docstring

fixing loading of pickle files

177379c

disabling loading of data/PsychExp/raw.pickle

9047d5b

obtaining a `_pickle.UnpicklingError: the STRING opcode argument must be quoted` on loading this file.

fixing csv write error

dbf55f8

was opening writing as `bytes` when content to be written was `str` in python3

removing unneeded parenthesis

4b21a73

fixing ascii check for incoming words in shorten_word

6da04b9

fixing path for subprocess call

0dc13fc

adding TODO note on failure in test_smoke_create_twitter_vocab

5ff04f2

adding nose slow attribute to slow smoke tests

81c075c

reordering tests

9caaff3

* having analyze result tests `test_smoke_analyze_results` and `test_smoke_analyze_all_results` execute after the `test_smoke_finetune_dataset`

improving test for test_smoke_download_weights

c2ab3c1

fixing import of test_helper

8b8fab1

nklapste changed the title ~~WIP: Feature/simple python3 migration~~ Feature/simple python3 migration Jan 21, 2020

nklapste requested a review from bfelbo January 21, 2020 17:09

bfelbo added 5 commits February 9, 2020 22:06

Mark tests as slow as they require another slow test

22cfee7

Clarify that twitter dataset is not provided

357dd35

Conver to tf.keras w/o eager mode

dc9cbde

Update README on how to not run slow tests

bcdb035

Bump version number as breaks support with Theano

269cb43

Add test for saving/loading model

a327694

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature/simple python3 migration #53

Feature/simple python3 migration #53

nklapste commented Jan 20, 2020 •

edited

Loading

bfelbo commented Feb 2, 2020

nklapste commented Feb 5, 2020

nklapste commented Feb 5, 2020 •

edited

Loading

nklapste commented Feb 5, 2020 •

edited

Loading

nklapste commented Feb 7, 2020 •

edited

Loading

bfelbo commented Feb 7, 2020

bfelbo commented Feb 10, 2020

bfelbo commented Feb 10, 2020

bfelbo commented Feb 10, 2020

bfelbo commented Feb 10, 2020

anbasile commented Apr 7, 2020

bfelbo commented Apr 7, 2020

anbasile commented Apr 8, 2020

anbasile commented Apr 9, 2020

anbasile commented Apr 9, 2020

bfelbo commented Apr 9, 2020

bfelbo commented Apr 11, 2020 •

edited

Loading

anbasile commented Apr 15, 2020 •

edited

Loading

bfelbo commented Apr 15, 2020

nklapste commented Apr 20, 2020

john-heyer commented Jul 28, 2020

nklapste commented Mar 22, 2021 •

edited

Loading

anbasile commented Mar 23, 2021

nklapste commented Apr 9, 2021

Feature/simple python3 migration #53

Are you sure you want to change the base?

Feature/simple python3 migration #53

Conversation

nklapste commented Jan 20, 2020 • edited Loading

bfelbo commented Feb 2, 2020

nklapste commented Feb 5, 2020

nklapste commented Feb 5, 2020 • edited Loading

nklapste commented Feb 5, 2020 • edited Loading

nklapste commented Feb 7, 2020 • edited Loading

bfelbo commented Feb 7, 2020

bfelbo commented Feb 10, 2020

bfelbo commented Feb 10, 2020

bfelbo commented Feb 10, 2020

bfelbo commented Feb 10, 2020

anbasile commented Apr 7, 2020

bfelbo commented Apr 7, 2020

anbasile commented Apr 8, 2020

anbasile commented Apr 9, 2020

anbasile commented Apr 9, 2020

bfelbo commented Apr 9, 2020

bfelbo commented Apr 11, 2020 • edited Loading

anbasile commented Apr 15, 2020 • edited Loading

bfelbo commented Apr 15, 2020

nklapste commented Apr 20, 2020

john-heyer commented Jul 28, 2020

nklapste commented Mar 22, 2021 • edited Loading

anbasile commented Mar 23, 2021

nklapste commented Apr 9, 2021

nklapste commented Jan 20, 2020 •

edited

Loading

nklapste commented Feb 5, 2020 •

edited

Loading

nklapste commented Feb 5, 2020 •

edited

Loading

nklapste commented Feb 7, 2020 •

edited

Loading

bfelbo commented Apr 11, 2020 •

edited

Loading

anbasile commented Apr 15, 2020 •

edited

Loading

nklapste commented Mar 22, 2021 •

edited

Loading