Problems parsing numbers: JSON forbids NaN and infinities #256

Closed
marcnause opened this issue Feb 8, 2016 · 6 comments

Comments

@marcnause
Member

My JSON parser (GSON) throws exceptions like this every now and then:

JSON forbids NaN and infinities: Infinity at line 569 column 51 path $.statuses[11].classifier_language_probability

As far as I know the parser is right: Infinity and NaN are not legal numbers in JSON. I have only found Infinity so far, and no NaN as far as I can tell.

Here is part of my JSON (I searched for "emoji"):

{ "timestamp" : "2016-02-08T13:50:11.887Z", "created_at" : "2016-02-08T13:46:40.000Z", "screen_name" : "_____haarupp", "text" : "ちょうどあと2週間でTOKYO<img class=\"Emoji Emoji--forText\" src=\"https://abs.twimg.com/emoji/v2/72x72/1f495.png\" draggable=\"false\" alt=\"💕\" title=\"Zwei Herzen\" aria-label=\"Emoji: Zwei Herzen\"><img class=\"Emoji Emoji--forText\" src=\"https://abs.twimg.com/emoji/v2/72x72/2708.png\" draggable=\"false\" alt=\"✈️\" title=\"Flugzeug\" aria-label=\"Emoji: Flugzeug\"><img class=\"Emoji Emoji--forText\" src=\"https://abs.twimg.com/emoji/v2/72x72/1f4ad.png\" draggable=\"false\" alt=\"💭\" title=\"Gedankenblase\" aria-label=\"Emoji: Gedankenblase\"><img class=\"Emoji Emoji--forText\" src=\"https://abs.twimg.com/emoji/v2/72x72/1f3b6.png\" draggable=\"false\" alt=\"🎶\" title=\"Mehrere Musiknoten\" aria-label=\"Emoji: Mehrere Musiknoten\">だだだいすちなみなさまと卒業旅行です<img class=\"Emoji Emoji--forText\" src=\"https://abs.twimg.com/emoji/v2/72x72/1f493.png\" draggable=\"false\" alt=\"💓\" title=\"Schlagendes Herz\" aria-label=\"Emoji: Schlagendes Herz\">神奈川も楽しみ〜無計画だけど<img class=\"Emoji Emoji--forText\" src=\"https://abs.twimg.com/emoji/v2/72x72/1f606.png\" draggable=\"false\" alt=\"😆\" title=\"Lächelndes Gesicht mit geöffnetem Mund und fest verschlossenen Augen\" aria-label=\"Emoji: Lächelndes Gesicht mit geöffnetem Mund und fest verschlossenen Augen\"><img class=\"Emoji Emoji--forText\" src=\"https://abs.twimg.com/emoji/v2/72x72/1f64c.png\" draggable=\"false\" alt=\"🙌\" title=\"Person mit im Jubel nach oben gehobenen Händen\" aria-label=\"Emoji: Person mit im Jubel nach oben gehobenen Händen\">", "link" : "https://twitter.com/_____haarupp/status/696691618858971137", "id_str" : "696691618858971137", "source_type" : "TWITTER", "provider_type" : "SCRAPED", "retweet_count" : 0, "favourites_count" : 0, "images" : [ ], "images_count" : 0, "audio" : [ ], "audio_count" : 0, "videos" : [ ], "videos_count" : 0, "place_name" : "Minato-ku, Tokyo", "place_id" : 
"594fa6c6bc5b5ba9", "place_context" : "FROM", "location_point" : [ 139.69171127290076, 35.68949890122258 ], "location_radius" : 0, "location_mark" : [ 139.77954268483185, 35.726840780611376 ], "location_source" : "PLACE", "hosts" : [ "abs.twimg.com" ], "hosts_count" : 1, "links" : [ "https://abs.twimg.com/emoji/v2/72x72/1f495.png\"", "https://abs.twimg.com/emoji/v2/72x72/2708.png\"", "https://abs.twimg.com/emoji/v2/72x72/1f4ad.png\"", "https://abs.twimg.com/emoji/v2/72x72/1f3b6.png\"", "https://abs.twimg.com/emoji/v2/72x72/1f493.png\"", "https://abs.twimg.com/emoji/v2/72x72/1f606.png\"", "https://abs.twimg.com/emoji/v2/72x72/1f64c.png\"" ], "links_count" : 7, "mentions" : [ ], "mentions_count" : 0, "hashtags" : [ ], "hashtags_count" : 0, "classifier_profanity" : "sex", "classifier_profanity_probability" : "Infinity", "classifier_emotion" : "joy", "classifier_emotion_probability" : 3.2575725E-5, "classifier_language" : "german", "classifier_language_probability" : 3480540.2, "without_l_len" : 1086, "without_lu_len" : 1086, "without_luh_len" : 1086, "user" : { "screen_name" : "_____haarupp", "user_id" : "2985012950", "name" : "とうま はるか", "profile_image_url_https" : "https://pbs.twimg.com/profile_images/692957698988642304/VcrE2cDi_bigger.jpg", "appearance_first" : "2016-02-08T13:47:23.099Z", "appearance_latest" : "2016-02-08T13:47:23.099Z" } }
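
For readers unfamiliar with the error: the JSON grammar has no token for NaN or Infinity, while Java's float and double types can produce both. The sketch below uses only the Java standard library, not Gson, and the class and method names are made up for illustration. It shows how such values can arise and how a strict writer would refuse them.

```java
public class NonFiniteDemo {

    /** Returns true if the value can be written as a JSON number. */
    static boolean isJsonNumber(double value) {
        return !Double.isNaN(value) && !Double.isInfinite(value);
    }

    /**
     * Formats a double as a JSON number and rejects non-finite values,
     * mirroring the behaviour of a strict parser or writer.
     * (Hypothetical helper, not part of Gson or loklak.)
     */
    static String toJsonNumber(double value) {
        if (!isJsonNumber(value)) {
            throw new IllegalArgumentException("JSON forbids NaN and infinities: " + value);
        }
        return Double.toString(value);
    }

    public static void main(String[] args) {
        float zero = 0.0f;
        float infinite = 1.0f / zero;   // float division by zero does not throw
        float undefined = zero / zero;  // 0/0 is NaN

        System.out.println(isJsonNumber(infinite));
        System.out.println(isJsonNumber(undefined));
        System.out.println(toJsonNumber(3480540.2));
    }
}
```

A value that fails `isJsonNumber` has to be dropped, replaced, or encoded as a string before the document is written; otherwise a strict parser on the receiving side rejects it.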

@smokingwheels
Contributor

Not sure if this is related in any way, but my Loklak server crashes and logs the user interface off when I hit it constantly with searches.
This is a fresh install of Debian 8.2 on SSD or HDD, and it also happens on another one with Java 8.
There is also a lot of disk I/O when doing searches.

It does not happen as often when I load Loklak into a RAM drive.
There is a process in "iotop", java -Xmn-klokserver, that seems to be swapping to disk all the time.
So there appears to be a stack problem (see here). Maybe increasing it would help.

This is the last bit of the log:
2016-02-17 20:53:08.113:INFO::Thread-4463: /api/search.json scraping with query: https://pbs.twimg.com/profile_images/687507264744247296/FZWFq302_bigger.jpg
java.io.IOException: client connection to http://loklak.org/api/search.json?q=from%3ARise_n_Shinee%2Bjat&timezoneOffset=-480&maximumRecords=100&source=cache&minified=true&timeout=2000 fail: 0: your (xxx.xxx.xxx.xxx) request frequency is too high
at org.loklak.http.ClientConnection.init(ClientConnection.java:203)
at org.loklak.http.ClientConnection.<init>(ClientConnection.java:148)
at org.loklak.http.ClientConnection.download(ClientConnection.java:280)
at org.loklak.api.client.SearchClient.search(SearchClient.java:44)
at org.loklak.data.DAO.searchOnOtherPeers(DAO.java:1112)
at org.loklak.data.DAO.searchBackend(DAO.java:1096)
at org.loklak.api.server.SearchServlet$3.run(SearchServlet.java:145)
java.io.IOException: empty content from http://loklak.org
at org.loklak.api.client.SearchClient.search(SearchClient.java:45)
at org.loklak.data.DAO.searchOnOtherPeers(DAO.java:1112)
at org.loklak.data.DAO.searchBackend(DAO.java:1096)
at org.loklak.api.server.SearchServlet$3.run(SearchServlet.java:145)
2016-02-17 20:53:09.858:INFO::Thread-4462: searchOnOtherPeers: no IO to scraping target: empty content from http://loklak.org

@Orbiter
Member

Orbiter commented Mar 7, 2016

Hi Marc, can you fix the issue? I don't know where to look for this...

@marcnause
Member Author

Can't promise anything, but I can take a look.

@marcnause
Member Author

The problem seems to be in the Bayes classifier: ptnplanet/Java-Naive-Bayes-Classifier#3
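
This thread does not spell out the root cause inside the classifier. Purely as an illustration of one common way a count-based probability can become NaN or Infinity, here is a sketch of a ratio with a zero denominator and a guarded variant. The class `SafeRatio` and its methods are hypothetical and are not taken from Java-Naive-Bayes-Classifier.

```java
public class SafeRatio {

    /** Unguarded ratio: yields Infinity (or NaN for 0/0) when total is 0. */
    static float unguardedRatio(int count, int total) {
        return (float) count / (float) total;
    }

    /**
     * Guarded ratio: returns 0 when there are no observations at all,
     * so the result is always finite for non-negative counts.
     */
    static float ratio(int count, int total) {
        if (total == 0) {
            return 0.0f;
        }
        return (float) count / (float) total;
    }

    public static void main(String[] args) {
        System.out.println(unguardedRatio(3, 0)); // non-finite
        System.out.println(ratio(3, 0));          // finite fallback
        System.out.println(ratio(1, 4));
    }
}
```

Whether returning 0 is the right fallback depends on the classifier; a smoothed estimate would be another option. The point is only that the check happens before the division.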

@marcnause
Member Author

Unfortunately I don't know too much about Bayes classifiers. The only thing I can offer is this:

loklak_server.patch.txt

Unfortunately this patch only addresses the symptoms; it does not fix the original problem.
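
The attached patch is not reproduced here. A symptom-level guard of the general kind described could look like the following sketch, which drops non-finite numeric fields from a record before it is serialized. The class name, the method name, and the choice to drop rather than replace the field are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NonFiniteFieldFilter {

    /**
     * Returns a copy of the given fields without numeric values that are
     * NaN or infinite. Non-numeric values are copied unchanged.
     */
    static Map<String, Object> dropNonFinite(Map<String, Object> fields) {
        Map<String, Object> cleaned = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : fields.entrySet()) {
            Object value = entry.getValue();
            if (value instanceof Number
                    && !Double.isFinite(((Number) value).doubleValue())) {
                continue; // skip values a strict JSON parser would reject
            }
            cleaned.put(entry.getKey(), value);
        }
        return cleaned;
    }

    public static void main(String[] args) {
        Map<String, Object> status = new LinkedHashMap<>();
        status.put("classifier_language", "german");
        status.put("classifier_language_probability", Float.POSITIVE_INFINITY);
        status.put("classifier_emotion_probability", 3.2575725E-5);
        System.out.println(dropNonFinite(status).keySet());
    }
}
```

A filter like this keeps the output parseable, but the classifier still computes a meaningless value upstream, which is why it counts as treating the symptom.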

Orbiter added a commit that referenced this issue Mar 8, 2016
#256 (comment)
which addresses a problem in the Bayesian Classifier source code as
discussed in
ptnplanet/Java-Naive-Bayes-Classifier#3
@Orbiter
Member

Orbiter commented Mar 8, 2016

fixed in 65ede16

Orbiter closed this as completed Mar 8, 2016