ValueError: Input 0 of layer sequential is incompatible with the layer #85

Open
domeniconappo opened this issue Aug 20, 2020 · 6 comments

@domeniconappo

Hi,
I updated mordecai to 2.1.0, along with these dependencies:
tensorflow to 2.3.0
spacy to 2.3.2
keras to 2.4.3

Our geocoding is now much slower, and we've started to see many errors like the following printed to the console:

ValueError: Input 0 of layer sequential is incompatible with the layer: expected axis -1 of input shape to have value 12 but received input with shape [None, 0]

It's not clear how this affects the geocoding results, but processing is certainly much slower: our queues keep building up with documents waiting to be geoparsed.

Can you help? Could it be a problem with the dependency versions?

Thank you in advance and for your great work!

@ahalterman
Member

Huh, that's frustrating. I really didn't change that much beyond bumping the versions, so I'm not sure where the slowdown is coming from. Do you have a document that produces the ValueError that you can share?
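
One way to find a shareable document is to run the batch one text at a time and record which inputs trigger the error. The sketch below is only an illustration: `find_failing_docs` and `geoparse_fn` are hypothetical names, and `geoparse_fn` stands for whatever callable you use to geoparse a single text. Because the errors here are reported as being printed to the console, the sketch also looks for the marker string in captured Python-level stdout/stderr, which will not catch output written by native code or by log handlers that were set up earlier.

```python
import contextlib
import io
from typing import Callable, Iterable, List, Tuple


def find_failing_docs(
    docs: Iterable[str],
    geoparse_fn: Callable[[str], object],  # hypothetical: your own geoparsing callable
    marker: str = "ValueError",
) -> List[Tuple[int, str, str]]:
    """Return (index, text, message) for each document that raises or prints the error."""
    failures = []
    for idx, text in enumerate(docs):
        buf = io.StringIO()
        try:
            # Capture anything the call prints so swallowed errors are still visible.
            with contextlib.redirect_stdout(buf), contextlib.redirect_stderr(buf):
                geoparse_fn(text)
        except ValueError as exc:
            # The error propagated to the caller.
            failures.append((idx, text, str(exc)))
            continue
        captured = buf.getvalue()
        if marker in captured:
            # The error was caught inside the call and only printed.
            failures.append((idx, text, captured.strip()))
    return failures
```

The returned indices point at the documents worth attaching to the issue.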

@marcusvrlopes

I just started using mordecai last week, but I ran into the same problem described by @domeniconappo. After many tests (changing versions, trying CUDA, and so on), nothing changed. Then I tried running it in a Jupyter notebook. I don't know why, but the analysis became a lot faster. The only library version in the notebook environment that differs from @domeniconappo's setup and from my own old script is tensorflow (1.14.0, installed by conda).
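
When comparing two environments like this, it can help to print the installed versions from inside each one (script and notebook) rather than relying on memory. A minimal sketch using only the standard library (Python 3.8+); `installed_versions` is a hypothetical helper, and the distribution names in the example call are assumptions that may differ in your setup (for instance a GPU build of tensorflow).

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to match your environment.
    for name, version in installed_versions(
        ["mordecai", "tensorflow", "spacy", "keras"]
    ).items():
        print(f"{name}: {version or 'not installed'}")
```

Running the same snippet in both environments gives two lists that can be compared line by line.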

@ahalterman added the v3 label Mar 8, 2021
@vupadhyaya19

vupadhyaya19 commented May 27, 2021

Hi @ahalterman, I am getting the same issue while using the package. In my case it occurs because some irrelevant terms are identified as geo terms. Looking through the code, I found that in geoparse.py, line 731, when we call:
prediction = self.country_model.predict(i['matrix']).transpose()[0]
the matrix generated for the word is empty, with shape (1, 0).
Could we filter out entries with an empty matrix at the point where the features are built (geoparse.py, line 722)?
feat = self.make_country_matrix(loc)

Examples of the identified geo terms that cause the issue:

{'labels': [], 'matrix': matrix([], shape=(1, 0), dtype=float64), 'word': 'organomercury'}

{'labels': [], 'matrix': matrix([], shape=(1, 0), dtype=float64), 'word': 'orangeiron'}

{'labels': [], 'matrix': matrix([], shape=(1, 0), dtype=float64), 'word': 'redoxygen'}

{'labels': [], 'matrix': matrix([], shape=(1, 0), dtype=float64), 'word': 'FeC10(HgCl)10'}
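
The filtering idea can be sketched roughly as follows. This is not the change attached later in the thread, just a minimal illustration under assumptions: `has_features` and `split_by_features` are hypothetical names, and each entry is assumed to be a dict with a `'matrix'` value that exposes a NumPy-style `.shape`, as in the examples above.

```python
from typing import Any, Dict, List, Tuple


def has_features(entry: Dict[str, Any]) -> bool:
    """Return True when entry['matrix'] exists and contains at least one element."""
    matrix = entry.get("matrix")
    if matrix is None:
        return False
    shape = getattr(matrix, "shape", None)
    if shape is None:
        # Fall back to len() for plain sequences without a shape attribute.
        return len(matrix) > 0
    # A shape such as (1, 0) has a zero-length axis, so the matrix is empty.
    return all(dim > 0 for dim in shape)


def split_by_features(
    entries: List[Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Separate entries that can be passed to the model from those that cannot."""
    usable, skipped = [], []
    for entry in entries:
        (usable if has_features(entry) else skipped).append(entry)
    return usable, skipped
```

With entries like the ones listed above, the `(1, 0)` matrices would end up in `skipped` and never reach the `predict` call, which is where the shape error is raised.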

[{'text': 'organomercury', 'label': '', 'word': 'organomercury', 'spans': [{'start': 900, 'end': 913}], 'features': {'maj_vote': '', 'word_vec': '', 'first_back': '', 'most_alt': '', 'most_pop': '', 'ct_mention': '', 'ctm_count1': 0, 'ct_mention2': '', 'ctm_count2': 0, 'wv_confid': '0', 'class_mention': '', 'code_mention': ''}},
 {'text': 'Pbca', 'label': '', 'word': 'Pbca', 'spans': [{'start': 4644, 'end': 4648}], 'features': {'maj_vote': '', 'word_vec': '', 'first_back': 'POL', 'most_alt': 'CHN', 'most_pop': 'MEX', 'ct_mention': '', 'ctm_count1': 0, 'ct_mention2': '', 'ctm_count2': 0, 'wv_confid': '0', 'class_mention': '', 'code_mention': ''}},
 {'text': 'orangeiron', 'label': '', 'word': 'orangeiron', 'spans': [{'start': 6157, 'end': 6167}], 'features': {'maj_vote': '', 'word_vec': '', 'first_back': '', 'most_alt': '', 'most_pop': '', 'ct_mention': '', 'ctm_count1': 0, 'ct_mention2': '', 'ctm_count2': 0, 'wv_confid': '0', 'class_mention': '', 'code_mention': ''}},
 {'text': 'redoxygen', 'label': '', 'word': 'redoxygen', 'spans': [{'start': 6184, 'end': 6193}], 'features': {'maj_vote': '', 'word_vec': '', 'first_back': '', 'most_alt': '', 'most_pop': '', 'ct_mention': '', 'ctm_count1': 0, 'ct_mention2': '', 'ctm_count2': 0, 'wv_confid': '0', 'class_mention': '', 'code_mention': ''}},
 {'text': 'metallocene moiety', 'label': '', 'word': 'metallocene moiety', 'spans': [{'start': 6935, 'end': 6953}], 'features': {'maj_vote': '', 'word_vec': 'GNQ', 'first_back': '', 'most_alt': '', 'most_pop': '', 'ct_mention': '', 'ctm_count1': 0, 'ct_mention2': '', 'ctm_count2': 0, 'wv_confid': 4.130288124084473, 'class_mention': '', 'code_mention': ''}},
 {'text': '3.447(1)Å (Figure1C', 'label': '', 'word': '3.447(1)Å (Figure1C', 'spans': [{'start': 7585, 'end': 7604}], 'features': {'maj_vote': '', 'word_vec': 'TUR', 'first_back': '', 'most_alt': '', 'most_pop': '', 'ct_mention': '', 'ctm_count1': 0, 'ct_mention2': '', 'ctm_count2': 0, 'wv_confid': 1.3494553565979004, 'class_mention': '', 'code_mention': ''}},
 {'text': 'FeC10(HgCl)10', 'label': '', 'word': 'FeC10(HgCl)10', 'spans': [{'start': 12695, 'end': 12708}], 'features': {'maj_vote': '', 'word_vec': '', 'first_back': '', 'most_alt': '', 'most_pop': '', 'ct_mention': '', 'ctm_count1': 0, 'ct_mention2': '', 'ctm_count2': 0, 'wv_confid': '0', 'class_mention': '', 'code_mention': ''}},
 {'text': 'Deutsche Forschungsgemeinschaft', 'label': '', 'word': 'Deutsche Forschungsgemeinschaft', 'spans': [{'start': 13577, 'end': 13608}], 'features': {'maj_vote': '', 'word_vec': 'DEU', 'first_back': '', 'most_alt': '', 'most_pop': '', 'ct_mention': '', 'ctm_count1': 0, 'ct_mention2': '', 'ctm_count2': 0, 'wv_confid': 10.370280265808105, 'class_mention': '', 'code_mention': ''}},
 {'text': 'ZEDAT/FU Berlin', 'label': '', 'word': 'ZEDAT/FU Berlin', 'spans': [{'start': 13713, 'end': 13728}], 'features': {'maj_vote': '', 'word_vec': 'DEU', 'first_back': '', 'most_alt': '', 'most_pop': '', 'ct_mention': '', 'ctm_count1': 0, 'ct_mention2': '', 'ctm_count2': 0, 'wv_confid': 11.895607948303223, 'class_mention': '', 'code_mention': ''}}]

@vupadhyaya19

Hi @ahalterman, I made the changes in geoparse.py and the issue no longer occurs. Let me know if the code changes below can be committed and pushed.
geoparse.txt

@ahalterman
Member

@vupadhyaya19: can you open a pull request with your changes?

I'm hoping to make v3 public in July and that should resolve the issue because it switches from TF to pytorch, but I'd like to leave this version in a usable form for people who might stick with it.

@luizavladislavna

Hi, @ahalterman!
First of all, thank you for your work!

It looks like I have the same issue described above, so:

  • Could you please give us an update on v3? Is there any chance you will share it with the community?
