ValueError: Input 0 of layer sequential is incompatible with the layer #85

domeniconappo · 2020-08-20T11:57:28Z

Hi,
We updated mordecai to 2.1.0, along with these dependencies:
tensorflow to 2.3.0
spacy to 2.3.2
keras to 2.4.3

Geocoding is now much slower, and we see many errors like the following printed to the console:

ValueError: Input 0 of layer sequential is incompatible with the layer: expected axis -1 of input shape to have value 12 but received input with shape [None, 0]

It's not clear how this affects the geocoding results, but processing is definitely much slower: our queues keep building up with documents waiting to be geoparsed.

Can you help? Is it a problem with the dependency versions?

Thank you in advance and for your great work!

ahalterman · 2020-08-21T15:14:31Z

Huh, that's frustrating. I really didn't change that much beyond bumping the versions, so I'm not sure where the slowdown is coming from. Do you have a document that produces the ValueError that you can share?

marcusvrlopes · 2020-08-25T18:58:22Z

I just started using mordecai last week, but I hit the same problem described by @domeniconappo. After a lot of tests (changing versions, trying CUDA, etc.), nothing changed. Then I tried running it in a Jupyter notebook. I don't know why, but the analysis became a lot faster. The only library version in the notebook environment that differs from @domeniconappo's setup and from my own old script is tensorflow (1.14.0, installed by conda).
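
To compare the two environments, something like the following could be run in both the script and the notebook. It is only a sketch: `installed_versions` is a hypothetical helper name, the package list is taken from the versions mentioned in this thread, and `importlib.metadata` assumes Python 3.8 or newer.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Package is not installed in this environment.
            found[name] = None
    return found


# Packages named in this thread; run in each environment and compare the output.
print(installed_versions(["mordecai", "tensorflow", "spacy", "keras"]))
```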

vupadhyaya19 · 2021-05-27T19:10:27Z

Hi @ahalterman, I am getting the same issue while using the package. In my case it occurs because some irrelevant terms are identified as geo terms. Looking through the code, I found that at line 731 of geoparse.py, where we call:
prediction = self.country_model.predict(i['matrix']).transpose()[0]
the matrix generated for the word is empty, with shape (1, 0).
Could we filter out entries with an empty matrix right after they are built at line 722 of geoparse.py?
feat = self.make_country_matrix(loc)
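
A minimal sketch of such a filter, assuming each entry is a dict with 'matrix' and 'word' keys like the examples in this comment, and that the model expects 12 features as the error message says. `has_features` and `filter_features` are hypothetical helper names, not part of mordecai, and the "Berlin" entry is made-up illustration data.

```python
import numpy as np

EXPECTED_FEATURES = 12  # taken from the ValueError message in this thread


def has_features(feat, expected=EXPECTED_FEATURES):
    """Return True if feat['matrix'] is 2-D, non-empty, and has the expected width."""
    matrix = np.asarray(feat["matrix"])
    return matrix.ndim == 2 and matrix.shape[0] > 0 and matrix.shape[1] == expected


def filter_features(feats):
    """Split entries into those safe to pass to predict() and the words skipped."""
    usable = [f for f in feats if has_features(f)]
    skipped = [f["word"] for f in feats if not has_features(f)]
    return usable, skipped


feats = [
    # Same shape as the failing entries reported in this comment.
    {"labels": [], "matrix": np.empty((1, 0)), "word": "organomercury"},
    # Hypothetical well-formed entry with 12 feature columns.
    {"labels": ["DEU"], "matrix": np.zeros((1, 12)), "word": "Berlin"},
]
usable, skipped = filter_features(feats)
```

Only the entries in `usable` would then be passed to `self.country_model.predict`, while the words in `skipped` could be logged for inspection.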

Examples of the identified geo terms that cause the issue:

{'labels': [], 'matrix': matrix([], shape=(1, 0), dtype=float64), 'word': 'organomercury'}

{'labels': [], 'matrix': matrix([], shape=(1, 0), dtype=float64), 'word': 'orangeiron'}

{'labels': [], 'matrix': matrix([], shape=(1, 0), dtype=float64), 'word': 'redoxygen'}

{'labels': [], 'matrix': matrix([], shape=(1, 0), dtype=float64), 'word': 'FeC10(HgCl)10'}

[{'text': 'organomercury', 'label': '', 'word': 'organomercury', 'spans': [{'start': 900, 'end': 913}], 'features': {'maj_vote': '', 'word_vec': '', 'first_back': '', 'most_alt': '', 'most_pop': '', 'ct_mention': '', 'ctm_count1': 0, 'ct_mention2': '', 'ctm_count2': 0, 'wv_confid': '0', 'class_mention': '', 'code_mention': ''}}, {'text': 'Pbca', 'label': '', 'word': 'Pbca', 'spans': [{'start': 4644, 'end': 4648}], 'features': {'maj_vote': '', 'word_vec': '', 'first_back': 'POL', 'most_alt': 'CHN', 'most_pop': 'MEX', 'ct_mention': '', 'ctm_count1': 0, 'ct_mention2': '', 'ctm_count2': 0, 'wv_confid': '0', 'class_mention': '', 'code_mention': ''}}, {'text': 'orangeiron', 'label': '', 'word': 'orangeiron', 'spans': [{'start': 6157, 'end': 6167}], 'features': {'maj_vote': '', 'word_vec': '', 'first_back': '', 'most_alt': '', 'most_pop': '', 'ct_mention': '', 'ctm_count1': 0, 'ct_mention2': '', 'ctm_count2': 0, 'wv_confid': '0', 'class_mention': '', 'code_mention': ''}}, {'text': 'redoxygen', 'label': '', 'word': 'redoxygen', 'spans': [{'start': 6184, 'end': 6193}], 'features': {'maj_vote': '', 'word_vec': '', 'first_back': '', 'most_alt': '', 'most_pop': '', 'ct_mention': '', 'ctm_count1': 0, 'ct_mention2': '', 'ctm_count2': 0, 'wv_confid': '0', 'class_mention': '', 'code_mention': ''}}, {'text': 'metallocene moiety', 'label': '', 'word': 'metallocene moiety', 'spans': [{'start': 6935, 'end': 6953}], 'features': {'maj_vote': '', 'word_vec': 'GNQ', 'first_back': '', 'most_alt': '', 'most_pop': '', 'ct_mention': '', 'ctm_count1': 0, 'ct_mention2': '', 'ctm_count2': 0, 'wv_confid': 4.130288124084473, 'class_mention': '', 'code_mention': ''}}, {'text': '3.447(1)Å (Figure1C', 'label': '', 'word': '3.447(1)Å (Figure1C', 'spans': [{'start': 7585, 'end': 7604}], 'features': {'maj_vote': '', 'word_vec': 'TUR', 'first_back': '', 'most_alt': '', 'most_pop': '', 'ct_mention': '', 'ctm_count1': 0, 'ct_mention2': '', 'ctm_count2': 0, 'wv_confid': 1.3494553565979004, 'class_mention': 
'', 'code_mention': ''}}, {'text': 'FeC10(HgCl)10', 'label': '', 'word': 'FeC10(HgCl)10', 'spans': [{'start': 12695, 'end': 12708}], 'features': {'maj_vote': '', 'word_vec': '', 'first_back': '', 'most_alt': '', 'most_pop': '', 'ct_mention': '', 'ctm_count1': 0, 'ct_mention2': '', 'ctm_count2': 0, 'wv_confid': '0', 'class_mention': '', 'code_mention': ''}}, {'text': 'Deutsche Forschungsgemeinschaft', 'label': '', 'word': 'Deutsche Forschungsgemeinschaft', 'spans': [{'start': 13577, 'end': 13608}], 'features': {'maj_vote': '', 'word_vec': 'DEU', 'first_back': '', 'most_alt': '', 'most_pop': '', 'ct_mention': '', 'ctm_count1': 0, 'ct_mention2': '', 'ctm_count2': 0, 'wv_confid': 10.370280265808105, 'class_mention': '', 'code_mention': ''}}, {'text': 'ZEDAT/FU Berlin', 'label': '', 'word': 'ZEDAT/FU Berlin', 'spans': [{'start': 13713, 'end': 13728}], 'features': {'maj_vote': '', 'word_vec': 'DEU', 'first_back': '', 'most_alt': '', 'most_pop': '', 'ct_mention': '', 'ctm_count1': 0, 'ct_mention2': '', 'ctm_count2': 0, 'wv_confid': 11.895607948303223, 'class_mention': '', 'code_mention': ''}}]

vupadhyaya19 · 2021-06-01T12:24:11Z

Hi @ahalterman, I made the changes in geoparse.py and the issue no longer occurs. Let me know if the code changes below can be committed and pushed.
geoparse.txt

ahalterman · 2021-06-12T23:21:37Z

@vupadhyaya19: can you open a pull request with your changes?

I'm hoping to make v3 public in July and that should resolve the issue because it switches from TF to pytorch, but I'd like to leave this version in a usable form for people who might stick with it.

luizavladislavna · 2021-08-12T17:17:15Z

Hi @ahalterman!
First of all, thank you for your work!

It looks like I have the same issue described above, so:

Could you please give us an update on v3? Is there any chance you will share it with the community?

ahalterman added the v3 label Mar 8, 2021
