Remove text analysis pilot #337
Please also update the changelog if this is approved by others. 🙏 Thank you!
If I may voice an opinion here: even though this category has seen very little "action", I do think it is very important, as it represents one of the only entry points for the entire domain of "digital humanities", which itself represents a large portion of most universities around the world. Interest in, and funding for, this area is increasing enormously, and text analysis software is a very rapidly developing area. It may be that rOpenSci currently has little direct expertise in the area, but outright removal of this category would effectively exclude a very large portion of most academic communities from even considering rOpenSci as potentially relevant. Conversely, if expertise is slowly cultivated in this area, even merely passively allowing the category to continue to exist may open up the organisation to a whole new field. Disclaimer: I am also biased, because I hope one day to submit one of my own packages that only falls within this category and no other, and which is also definitely not statistical.
I second Mark's take on this.
What is the main cause of that? Do we find the packages rarely generalize well beyond a very specific problem/corpus (which might indicate it's not a good category), or is there a lack of editors/reviewers with expertise in this space (which perhaps could be addressed with recruiting)?
Good question @emilyriederer. Looking back, basically all the packages were those submitted by Lincoln Mullen before we consolidated our current scope, and then there was the wordVectors package, whose author never followed up on review feedback. We've gotten two submissions in the past week in this category, both of which I have trouble admitting. Both implement machine-learning algorithms in NLP:
I find it hard to see how we would accept these in our usual system, and I think it makes sense to refer them to statistical peer review. That said, one option is to clarify the scope to specify that under the non-statistical scope, text packages should still be data processing/data lifecycle management packages, rather than ML. Some potential language:
@noamross thanks for pinging me on this. I was actually led to tentatively try a presubmission, based on what I read in the rOpenSci guidelines. If you believe my package does not fit the scope of reviewed material, I will try to resubmit as soon as the Statistical Software section is consolidated. Bests,
Thanks for your response @vgherard. We're just doing a little re-arranging for this scope and getting our statistical package submission templates in order. I'll follow up when we've resolved these, I expect next week.
I've added this text back in. Can I get a thumbs up/down from @ropensci/editors?
If accepted, I will also update the categories in the software-review templates.
@vgherard We've updated this, as well as our submission templates. Could you re-submit your pre-submission inquiry under the new template, which now includes the statistical software project? https://github.com/ropensci/software-review/issues/new?assignees=&labels=&template=B-submit-a-presubmission-inquiry.md
Thanks for informing me @noamross, will do ASAP. Valerio
@ropensci/editors
For some time we had a pilot text-analysis category in our Aims and Scope. This originated when we had a text specialist, Lincoln Mullen, on our editorial board, and some collaborations with a text analysis working group. There have been very few submissions in this category and most of them have had challenges getting through review.
Given the lack of uptake, the fact that these packages would fall under the scope of our statistical packages peer review, and the fact that we never really established a firm set of criteria for such packages, I think we should remove this category from our Aims and Scope.