Unable to load pretrained pipeline for ContextSpellCheckerModel; serialVersionUID error #7340
Comments
Hi @geowynn, could you please share how you start SparkSession in your Cloudera cluster? Just double-checking here: it could well be the model itself that needs another copy for Spark 2.4, but before that, I want to be sure you are using the correct Spark NLP Maven package.
Hi @maziyarpanahi, are you referring to this line in the Steps to Reproduce section? I'm using Separately I've tried passing it directly but the issue persists.
Thanks for the prompt response!
Yes, that is the line, thanks @geowynn. I can confirm it's not your setup and it's actually the model. We will fix this model, re-upload it for Spark 2.4, and I'll keep you updated.
Hello @geowynn, have you tried using the Kryo serializer?
Hi @albertoandreottiATgmail, what do you mean by the Kryo serializer? Do you have any examples? I'm not too familiar with it.
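For anyone else wondering the same thing: Kryo is an alternative serializer that Spark can use instead of the default Java serialization, and it is switched on through the `spark.serializer` setting when the session is created. Below is a minimal sketch, assuming PySpark is installed. The app name and the buffer size are illustrative values rather than anything confirmed in this thread, and whether this helps with the serialVersionUID error reported here is untested.

```python
# Settings that switch Spark from Java serialization to Kryo.
KRYO_CONF = {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    # Assumed value: a generous buffer, since NLP models can be large.
    "spark.kryoserializer.buffer.max": "2000M",
}


def build_session(app_name="spellcheck-kryo", extra_conf=None):
    """Create a SparkSession with Kryo enabled (the app name is hypothetical)."""
    # Imported lazily so the settings above can be inspected without pyspark.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    # Apply the Kryo settings first, then any caller-supplied overrides.
    for key, value in {**KRYO_CONF, **(extra_conf or {})}.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Calling `build_session()` in place of the usual session setup should be enough to try it out; any other settings the cluster needs can be passed through `extra_conf`.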
We have fixed the English version of the
I am trying to use the ContextSpellCheckerModel in my NLP pipeline and I am facing a Py4J Java exception. My runtime environment is Spark 3.2.1 and sparknlp 4.0.0. This notebook is running in a Databricks environment.
spellChecker = ContextSpellCheckerModel.load("dbfs:path/to/pretrained/model").setInputCols("tokenized").setOutputCol("checked")
Any help is appreciated.
Could you please create a new issue? We need all the info in the issue template, especially the exact path to the pretrained model (a link to the actual model on Models Hub). Thank you.
I'm new to Spark NLP and I'm facing this error when adding the pretrained ContextSpellCheckerModel to the pipeline.
Description
Also, I'm working on my project in CDSW (Cloudera Data Science Workbench). I've referred to the links here, but I'm not entirely sure how to point to the correct jar in the cloud.
References:
#2562
#5984
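On pointing Spark at the right jar: one common route is Spark's `spark.jars.packages` setting with a Maven coordinate that matches the cluster's Spark version. The sketch below shows the idea. The artifact names in the table are quoted from memory of the Spark NLP 3.4.x installation notes and should be treated as assumptions to verify, and `pick_package` is a hypothetical helper, not part of Spark NLP.

```python
# Assumed Maven artifact names for Spark NLP 3.4.x, keyed by Spark major.minor.
# Verify these against the Spark NLP installation docs before relying on them.
ARTIFACTS = {
    "2.3": "spark-nlp-spark23_2.11",
    "2.4": "spark-nlp-spark24_2.11",
    "3.0": "spark-nlp_2.12",
    "3.1": "spark-nlp_2.12",
    "3.2": "spark-nlp-spark32_2.12",
}


def pick_package(spark_version, sparknlp_version):
    """Build a Maven coordinate for the given Spark and Spark NLP versions."""
    # "2.4.0-cdh6.3.4" -> "2.4": only the major.minor part selects the artifact.
    major_minor = ".".join(spark_version.split(".")[:2])
    artifact = ARTIFACTS.get(major_minor)
    if artifact is None:
        raise ValueError(f"No known artifact for Spark {spark_version}")
    return f"com.johnsnowlabs.nlp:{artifact}:{sparknlp_version}"


# Versions taken from the environment section of this issue.
package = pick_package("2.4.0-cdh6.3.4", "3.4.2")
print(package)
```

The resulting string would then be passed as `.config("spark.jars.packages", package)` when building the SparkSession, provided the cluster can reach a Maven repository.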
Expected Behavior
Pretrained model for ContextSpellCheckerModel should run without serialVersionUID InvalidClassException error.
Current Behavior
Code for pipeline:
spellChecker = ContextSpellCheckerModel.pretrained("spellcheck_dl").setInputCols("tokenized").setOutputCol("checked")
Summarised error:
spellcheck_dl download started this may take some time.
Approximate size to download 112.2 MB
[ | ]
Download done! Loading the resource.
22/03/17 07:52:54 052 ERROR Executor: Exception in task 0.0 in stage 90.0 (TID 892)
java.io.InvalidClassException: com.johnsnowlabs.nlp.annotators.spell.context.parser.MainVocab; local class incompatible: stream classdesc serialVersionUID = 2150722227907329010, local class serialVersionUID = 7050539942427507052
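Reading the exception: the `MainVocab` object stored in the downloaded model was written by a class whose `serialVersionUID` (the "stream" value) differs from the one in the locally loaded Spark NLP jar (the "local" value), so Java refuses to deserialize it. A small sketch for pulling those details out of such a log line follows. `parse_invalid_class` is a hypothetical helper, and the regex assumes the message format shown in the summarised error.

```python
import re

# Matches the "local class incompatible" form of java.io.InvalidClassException.
PATTERN = re.compile(
    r"java\.io\.InvalidClassException: (?P<cls>[\w.$]+); "
    r"local class incompatible: "
    r"stream classdesc serialVersionUID = (?P<stream>-?\d+), "
    r"local class serialVersionUID = (?P<local>-?\d+)"
)


def parse_invalid_class(line):
    """Return the class name and both UIDs, or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return {
        "class": match["cls"],
        "stream_uid": int(match["stream"]),  # version the model was saved with
        "local_uid": int(match["local"]),    # version found in the local jar
    }


# The error line reported in this issue.
line = (
    "java.io.InvalidClassException: "
    "com.johnsnowlabs.nlp.annotators.spell.context.parser.MainVocab; "
    "local class incompatible: stream classdesc serialVersionUID = "
    "2150722227907329010, local class serialVersionUID = 7050539942427507052"
)
info = parse_invalid_class(line)
print(info)
```

Two different UIDs for the same class name indicate that the saved model and the installed library were built from different versions of that class.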
Full error message:
Possible Solution
Steps to Reproduce
Context
Unable to make use of contextual spell checker in pipeline
Your Environment
Spark NLP version (sparknlp.version()): '3.4.2'
Apache Spark version (spark.version): '2.4.0-cdh6.3.4'
Java version (java -version):
java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
Setup and installation (Pypi, Conda, Maven, etc.): pip install spark-nlp
Operating System and version: Cloudera Data Science Workbench Python 3
Link to your project (if any):