Hey Marcus, I tried Wikiforia and so far so good (OK, I had a little trouble figuring out that the easiest way to respect the external dependencies is to switch to the target directory and run from there).
Then I downloaded the Bavarian Wikipedia dump:
cd target
wget https://dumps.wikimedia.org/barwiki/20151002/barwiki-20151002-pages-articles-multistream-index.txt.bz2
wget https://dumps.wikimedia.org/barwiki/20151002/barwiki-20151002-pages-articles-multistream.xml.bz2
When I now run
`java -jar wikiforia-1.2.1.jar -pages barwiki-20151002-pages-articles-multistream.xml.bz2 -output res.xml`
I receive the following output:
[2015-10-14 15:14:55.728 | main | INFO | se.lth.cs.nlp.wikiforia.App] Wikiforia v1.2.1 by Marcus Klang
Exception in thread "main" java.lang.NullPointerException
at se.lth.cs.nlp.mediawiki.parser.MultistreamBzip2XmlDumpParser.toString(MultistreamBzip2XmlDumpParser.java:480)
at se.lth.cs.nlp.wikiforia.Pipeline.run(Pipeline.java:73)
at se.lth.cs.nlp.wikiforia.App.convert(App.java:239)
at se.lth.cs.nlp.wikiforia.App.main(App.java:413)
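Looking at line 480 of MultistreamBzip2XmlDumpParser.java (under wikiforia/src/main/java/se/lth/cs/nlp/mediawiki/parser/, at commit 5672123), I see that some class field must not have been initialized, but I didn't debug any further.
`ls` shows me that the file res.xml was created, so I assume that argument passing works and that some other class field is not set correctly.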
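To illustrate the kind of failure I suspect, here is a minimal sketch of how a `toString()` that dereferences an unset field throws a `NullPointerException`, next to a null-safe variant. This is not the Wikiforia code: the classes `UnsafeParser` and `SafeParser` and the field `optionalFile` are hypothetical names made up for this example, and I have not confirmed which field is actually null at line 480.

```java
import java.io.File;

public class ToStringNpeSketch {

    // Hypothetical stand-in for a parser whose toString() assumes every field is set.
    static class UnsafeParser {
        private final File pages;
        private final File optionalFile; // assumption: may be null if never initialized

        UnsafeParser(File pages, File optionalFile) {
            this.pages = pages;
            this.optionalFile = optionalFile;
        }

        @Override
        public String toString() {
            // Throws NullPointerException when optionalFile is null.
            return "pages=" + pages.getName() + ", optional=" + optionalFile.getName();
        }
    }

    // Same hypothetical class, but toString() tolerates the missing field.
    static class SafeParser {
        private final File pages;
        private final File optionalFile;

        SafeParser(File pages, File optionalFile) {
            this.pages = pages;
            this.optionalFile = optionalFile;
        }

        @Override
        public String toString() {
            String optional = (optionalFile == null) ? "<none>" : optionalFile.getName();
            return "pages=" + pages.getName() + ", optional=" + optional;
        }
    }

    public static void main(String[] args) {
        File pages = new File("barwiki-20151002-pages-articles-multistream.xml.bz2");
        try {
            System.out.println(new UnsafeParser(pages, null));
        } catch (NullPointerException e) {
            System.out.println("NullPointerException thrown from toString()");
        }
        System.out.println(new SafeParser(pages, null));
    }
}
```

If the real cause is similar, a null check like the one in `SafeParser` would at least let the run get past the logging call, though it would not explain why the field is unset in the first place.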
Did I do something wrong? Does the tool simply not work with the Bavarian Wikipedia? Comparing git hashes, I found this in the git log:
commit 04e80b46ecc1bb487419fb9f831258be78413f07
Author: Marcus Klang <[email protected]>
Date: Tue Mar 24 11:08:08 2015 +0100
* Added French, German and Spanish configurations
which made me wonder whether my dump could be the reason. Thanks for your help!
I am not particularly interested in the Bavarian Wikipedia, but I wanted to test the tool with a small dataset (:
Best, Rene