Closes #67 - Add Monero #516
Tested via data loading.
@napsternxg this passes all the unit tests and loads fine, but I noticed that if I do the following:
I find these all empty. Is this intended?
@napsternxg I also made a small change at the end of the file (near the main call).
Hi @hakunanatasha, thanks. Let me have a look at this. I will address it by early next week.
Hi @hakunanatasha, I checked the entities and they are present. When a doc contains no entities, we see an empty list.
    from datasets import load_dataset
    data = load_dataset("biodatasets/monero/monero.py", name="monero_bigbio_kb")
    data["train"]["entities"][-5:]
Will output
This means that only the second-to-last of the last 5 docs has any entities. I also added a fix for the entity offsets.
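A quick way to confirm that empty lists only mean "no entities in this doc" is to count which docs have a non-empty entity list. The following is a minimal sketch that runs without downloading anything: `docs_with_entities` is a hypothetical helper, and `sample` is made-up stand-in data shaped like the slice discussed above, not real Monero output.

```python
from typing import Dict, List, Sequence


def docs_with_entities(entities_column: Sequence[List[Dict]]) -> List[int]:
    """Return the indices of documents whose entity list is non-empty."""
    return [i for i, ents in enumerate(entities_column) if ents]


# Hypothetical stand-in for data["train"]["entities"][-5:]: five documents,
# only the second-to-last of which has an entity. All field values are made up.
sample = [
    [],
    [],
    [],
    [{"id": "e1", "type": "T", "text": ["x"], "offsets": [[0, 1]]}],
    [],
]

hits = docs_with_entities(sample)
print(hits)
print(f"{len(hits)} of {len(sample)} documents have entities")
```

With the loaded dataset, the same helper could be called as `docs_with_entities(data["train"]["entities"])` to check the whole split rather than only the last five docs.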
@phlobo I revised this dataset. Please have a look at it.
Fixes #67 - Add Monero
If the following information is NOT present in the issue, please populate:
Checkbox
- biodatasets/my_dataset/my_dataset.py (please use only lowercase and underscore for dataset naming).
- _CITATION, _DATASETNAME, _DESCRIPTION, _HOMEPAGE, _LICENSE, _URLs, _SUPPORTED_TASKS, _SOURCE_VERSION, and _BIGBIO_VERSION variables.
- _info(), _split_generators() and _generate_examples() in dataloader script.
- BUILDER_CONFIGS class attribute is a list with at least one BigBioConfig for the source schema and one for a bigbio schema.
- datasets.load_dataset function.
- python -m tests.test_bigbio biodatasets/my_dataset/my_dataset.py.
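The BUILDER_CONFIGS requirement in the checklist can be illustrated with a small self-contained check. This is only a sketch under stated assumptions: `StandInConfig` is a hypothetical stand-in for `BigBioConfig` (the real class is not imported here), the schema strings "source" and "bigbio_kb" are assumed values, and the config name "monero_source" is an assumed naming choice; only "monero_bigbio_kb" appears in the conversation above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StandInConfig:
    """Hypothetical stand-in for BigBioConfig, carrying only two fields."""
    name: str
    schema: str


def has_required_schemas(configs: List[StandInConfig]) -> bool:
    """True if there is at least one source config and one bigbio config."""
    schemas = {c.schema for c in configs}
    has_source = "source" in schemas  # assumed schema label
    has_bigbio = any(s.startswith("bigbio_") for s in schemas)  # assumed prefix
    return has_source and has_bigbio


# Assumed config list for illustration; only "monero_bigbio_kb" is from the PR.
BUILDER_CONFIGS = [
    StandInConfig(name="monero_source", schema="source"),
    StandInConfig(name="monero_bigbio_kb", schema="bigbio_kb"),
]

print(has_required_schemas(BUILDER_CONFIGS))
```

A list holding only the bigbio config would fail this check, which is the situation the checklist item is meant to rule out.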