Pre-submission inquiry for {kgrams}: Classical k-gram Language Models #452
Comments
Thank you for our first statistical package pre-submission, @vgherard! I believe this clearly falls in scope and look forward to a full submission once you have incorporated the
Thanks @noamross, great :) I will begin looking to the
Please ping me and @mpadge here with any questions, we know we are working out the kinks in the new system and are eager to help with the process to make it better!
Thanks @noamross (@mpadge), I've filed an issue at ropensci-review-tools/autotest#49
Hello @vgherard! We're going back to some in-progress submissions that got stuck in an ambiguous state. Sorry that we haven't reached out in a while. I just wanted to see if ropensci peer review is something you were still interested in pursuing.
Dear @noamross thanks for checking in and sorry for the long silence, I totally forgot about this process being open. Sadly, right now I'm too short of time for a relatively demanding submission like this... Apart from this, over time I became a bit unsatisfied with certain aspects of this package, which I'd at least try to improve before submitting. I'll close this, with the hope to come back to it in a not too far future :-) Thanks!
@vgherard Any updates on the status of your package? We'd still be very interested in receiving a full submission 👍
Dears, thanks for keeping in touch. I had a look at the requirements I would need to cover in order to submit The output of These are in general quick things, but with a package of the dimension of It's understood that when I say "too much" I refer only to my individual case - I think the work you're doing by putting up this review process is awesome. For next package ideas I will definitely consider implementing ropensci standard from the onset!
Thanks @vgherard, I definitely understand. It's a shame, but you are probably right that it wouldn't be a trivial amount of work to prepare it. Thanks for considering, and for the kind words, and we look forward to future submissions at any time.
Submitting Author: Valerio Gherardi (@vgherard)
Repository: https://github.com/vgherard/kgrams
Submission type: Pre-submission
Scope
Please indicate which category or categories from our package fit policies or statistical package categories this package falls under. (Please check an appropriate box below):
Data Lifecycle Packages
Statistical Packages
Explain how and why the package falls under these categories (briefly, 1-2 sentences). Please note any areas you are unsure of:
This package implements classical k-gram language model algorithms, including utilities for training, evaluation and text prediction. Language models are a cornerstone of Natural Language Processing applications, and the conceptual simplicity of k-gram models makes them a good baseline as well as a useful teaching tool.
k-gram models are a simple form of machine learning applied to text data; as such, Machine Learning is the most appropriate of the categories above. I would be inclined to call this an "unsupervised" learning problem, since the target function being learned (the language's probability distribution over sentences) is not explicit in the training data, though I have never seen this particular qualification in the literature.
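To make the idea concrete, here is a minimal sketch of a bigram (k = 2) model covering the training, prediction and evaluation steps mentioned above. It is written in Python for illustration only and is not the kgrams API: the function names, the `<s>`/`</s>` boundary tokens, whitespace tokenization and the choice of add-k smoothing are all assumptions made for this example.

```python
import math
from collections import Counter

# Hypothetical sentence-boundary tokens, chosen for this sketch only.
BOS, EOS = "<s>", "</s>"


def train_bigram(sentences):
    """Count contexts and bigrams over whitespace-tokenized sentences."""
    context_counts, bigram_counts, vocab = Counter(), Counter(), {EOS}
    for sentence in sentences:
        tokens = [BOS] + sentence.split() + [EOS]
        vocab.update(tokens[1:])  # BOS is never predicted, so it stays out
        for prev, word in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return context_counts, bigram_counts, vocab


def prob(word, prev, model, k=1.0):
    """Add-k smoothed estimate of P(word | prev)."""
    context_counts, bigram_counts, vocab = model
    return (bigram_counts[(prev, word)] + k) / (context_counts[prev] + k * len(vocab))


def perplexity(sentences, model, k=1.0):
    """Per-token perplexity of the model on the given sentences."""
    log_prob, n_tokens = 0.0, 0
    for sentence in sentences:
        tokens = [BOS] + sentence.split() + [EOS]
        for prev, word in zip(tokens, tokens[1:]):
            log_prob += math.log(prob(word, prev, model, k))
            n_tokens += 1
    return math.exp(-log_prob / n_tokens)


model = train_bigram(["a b", "a c"])
print(prob("a", BOS, model))       # probability that a sentence starts with "a"
print(perplexity(["a b"], model))  # lower means a better fit to the text
```

Note that this sketch makes no provision for out-of-vocabulary words: the smoothed probabilities sum to one only over words seen during training.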
Not yet (NB: this is a presubmission inquiry).
The package can be useful to students and researchers for small-scale experiments in Natural Language Processing. It may also help when building more complex language models, by providing a quick baseline.
I am not aware of any R package with the same purpose and functionality as kgrams. The CRAN package ngram overlaps partially in scope, in that it provides k-gram tokenization algorithms and random text generation, but it offers no support for language model algorithms.
Not applicable.