Merge branch 'scribe-org:main' into AK-Latin
KesharwaniArpita authored Oct 23, 2024
2 parents 35a1c18 + 399efe2 commit febc71e
Showing 174 changed files with 3,555 additions and 2,206 deletions.
46 changes: 46 additions & 0 deletions .github/workflows/check_query_forms.yaml
@@ -0,0 +1,46 @@
name: Check Query Forms
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
    types: [opened, reopened, synchronize]

jobs:
  format_check:
    strategy:
      fail-fast: false
      matrix:
        os:
          - ubuntu-latest
        python-version:
          - "3.9"

    runs-on: ${{ matrix.os }}

    name: Run Check Query Forms

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}

      - name: Add project root to PYTHONPATH
        run: echo "PYTHONPATH=$(pwd)/src" >> $GITHUB_ENV

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
      - name: Run check_query_forms.py
        working-directory: ./src/scribe_data/check
        run: python check_query_forms.py

      - name: Post-run status
        if: failure()
        run: echo "Project SPARQL query forms check failed. Please fix the reported errors."
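
The final step only runs when an earlier step fails, so the check script has to report problems through a non-zero exit status. The contents of `check_query_forms.py` are not shown in this diff; the following is a minimal sketch of that exit-code pattern, where `find_problems`, `exit_code_for`, and the "must contain SELECT" rule are hypothetical stand-ins rather than the project's actual checks.

```python
def find_problems(query_text: str) -> list[str]:
    """Return human-readable problems found in one query."""
    problems = []
    # Hypothetical rule, for illustration only: a query must contain SELECT.
    if "SELECT" not in query_text.upper():
        problems.append("query has no SELECT clause")
    return problems


def exit_code_for(queries: dict[str, str]) -> int:
    """Print every problem and return 0 if all queries pass, 1 otherwise."""
    failed = False
    for name, text in sorted(queries.items()):
        for problem in find_problems(text):
            print(f"{name}: {problem}")
            failed = True
    return 1 if failed else 0


# Hypothetical file names and query bodies used as sample input.
sample_queries = {
    "query_nouns.sparql": "SELECT ?lemma WHERE { }",
    "query_broken.sparql": "ASK { }",
}
status = exit_code_for(sample_queries)
print(f"exit status: {status}")
```

A script built this way would end with `sys.exit(exit_code_for(queries))`; the non-zero status is what marks the step as failed and lets the `if: failure()` step run.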
6 changes: 3 additions & 3 deletions CHANGELOG.md
@@ -16,9 +16,9 @@ Emojis for the following are chosen based on [gitmoji](https://gitmoji.dev/).

- Scribe-Data is now a fully functional CLI.
- Querying Wikidata lexicographical data can be done via the `--query` command ([#159](https://github.com/scribe-org/Scribe-Data/issues/159)).
-- The output type of queries can be in JSON, CSV, TSV and SQLite, with conversions output types also being possible ([#145](https://github.com/scribe-org/Scribe-Data/issues/145), [#146](https://github.com/scribe-org/Scribe-Data/issues/146))
-- Output paths can be set for query results ([#144](https://github.com/scribe-org/Scribe-Data/issues/144)).
-- The version of the CLI can be printed to the command line and the CLI can further be used to upgrade itself ([#186](https://github.com/scribe-org/Scribe-Data/issues/186), [#157 ](https://github.com/scribe-org/Scribe-Data/issues/157)).
+- The output type of queries can be in JSON, CSV, TSV and SQLite, with conversions output types also being possible ([#145](https://github.com/scribe-org/Scribe-Data/issues/145), [#146](https://github.com/scribe-org/Scribe-Data/issues/146))
+- Output paths can be set for query results ([#144](https://github.com/scribe-org/Scribe-Data/issues/144)).
+- The version of the CLI can be printed to the command line and the CLI can further be used to upgrade itself ([#186](https://github.com/scribe-org/Scribe-Data/issues/186), [#157 ](https://github.com/scribe-org/Scribe-Data/issues/157)).
- Total Wikidata lexemes for languages and data types can be derived with the `--total` command ([#147](https://github.com/scribe-org/Scribe-Data/issues/147)).
- Commands can be used via an interactive mode with the `--interactive` command ([#158](https://github.com/scribe-org/Scribe-Data/issues/158)).
- Articles are removed from machine translations so they're more directly useful in Scribe applications ([#96](https://github.com/scribe-org/Scribe-Data/issues/96)).
2 changes: 1 addition & 1 deletion docs/source/_static/CONTRIBUTING.rst
@@ -16,7 +16,7 @@ Contents
- `First steps as a contributor <#first-steps-as-a-contributor>`__
- `Learning the tech stack <#learning-the-tech-stack>`__
- `Development environment <#development-environment>`__
-- `Issues and projects <#issues-projects>`__
+- `Issues and projects <#issues-and-projects>`__
- `Bug reports <#bug-reports>`__
- `Feature requests <#feature-requests>`__
- `Pull requests <#pull-requests>`__
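
The corrected target follows the usual way a section title becomes an anchor: lowercase the title and join its words with hyphens, so "Issues and projects" becomes `issues-and-projects`. A simplified sketch of that rule, assuming plain ASCII titles (`heading_to_anchor` is a hypothetical helper, and the id generation used by the real documentation toolchain handles more cases):

```python
import re


def heading_to_anchor(title: str) -> str:
    """Approximate a section anchor: lowercase, non-alphanumerics to hyphens."""
    slug = title.strip().lower()
    # Collapse every run of characters outside a-z and 0-9 into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", slug)
    # Drop hyphens left over at either end.
    return slug.strip("-")


print(heading_to_anchor("Issues and projects"))  # issues-and-projects
```

The old link target, `issues-projects`, drops the word "and", so it would not match an anchor derived this way.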
3 changes: 3 additions & 0 deletions docs/source/conf.py
@@ -40,8 +40,11 @@
"numpydoc",
"sphinx.ext.viewcode",
"sphinx.ext.imgmath",
+"nbsphinx",
]

+nbsphinx_allow_errors = True
+nbsphinx_execute = "never"
numpydoc_show_inherited_class_members = False
numpydoc_show_class_members = False

4 changes: 2 additions & 2 deletions docs/source/notes.rst
@@ -1,9 +1,9 @@
-.. mdinclude:: _static/CONTRIBUTING.rst
+.. include:: _static/CONTRIBUTING.rst

License
=======

.. literalinclude:: ../../LICENSE.txt
:language: text

-.. mdinclude:: ../../CHANGELOG.md
+.. include:: ../../CHANGELOG.md
17 changes: 12 additions & 5 deletions docs/source/scribe_data/cli.rst
@@ -67,7 +67,10 @@ Example output:
adverbs
emoji-keywords
nouns
personal-pronouns
postpositions
prepositions
proper-nouns
verbs
-----------------------------------
@@ -94,7 +97,10 @@ Example output:
adverbs
emoji-keywords
nouns
personal-pronouns
postpositions
prepositions
proper-nouns
verbs
-----------------------------------
@@ -115,7 +121,10 @@ Example output:
adverbs
emoji-keywords
nouns
personal-pronouns
postpositions
prepositions
proper-nouns
verbs
-----------------------------------
@@ -137,6 +146,7 @@ Options:
- ``-dt, --data-type DATA_TYPE``: The data type(s) to get.
- ``-od, --output-dir OUTPUT_DIR``: The output directory path for results.
- ``-ot, --output-type {json,csv,tsv}``: The output file type.
+- ``-ope, --outputs-per-entry OUTPUTS_PER_ENTRY``: How many outputs should be generated per data entry.
- ``-o, --overwrite``: Whether to overwrite existing files (default: False).
- ``-a, --all ALL``: Get all languages and data types.
- ``-i, --interactive``: Run in interactive mode.
@@ -257,7 +267,7 @@ Examples:
.. code-block:: text
-$scribe-data total -lang English -dt nouns
+$scribe-data total -lang English -dt nouns # verbs, adjectives, etc
Language: English
Data type: nouns
Total number of lexemes: 12345
@@ -278,7 +288,4 @@ Options:

- ``-f, --file FILE``: The file to convert to a new type.
- ``-ko, --keep-original``: Whether to keep the file to be converted (default: True).
-- ``-json, --to-json TO_JSON``: Convert the file to JSON format.
-- ``-csv, --to-csv TO_CSV``: Convert the file to CSV format.
-- ``-tsv, --to-tsv TO_TSV``: Convert the file to TSV format.
-- ``-sqlite, --to-sqlite TO_SQLITE``: Convert the file to SQLite format.
+- ``-ot, --output-type {json,csv,tsv,sqlite}``: The output file type.
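
This hunk replaces four separate `--to-*` flags with a single `--output-type` option restricted to a fixed set of values. How Scribe-Data wires this up is not shown in the diff; the following is a minimal `argparse` sketch of the same idea, where the parser name `convert-sketch`, the helper `build_convert_parser`, and the file name `nouns.json` are hypothetical.

```python
import argparse

# The allowed values are taken from the documented option above.
OUTPUT_TYPES = ["json", "csv", "tsv", "sqlite"]


def build_convert_parser() -> argparse.ArgumentParser:
    """Build a small parser with one choice-restricted output option."""
    parser = argparse.ArgumentParser(prog="convert-sketch")
    parser.add_argument("-f", "--file", help="The file to convert to a new type.")
    parser.add_argument(
        "-ot",
        "--output-type",
        choices=OUTPUT_TYPES,
        help="The output file type.",
    )
    return parser


args = build_convert_parser().parse_args(["-f", "nouns.json", "-ot", "csv"])
print(args.file, args.output_type)  # nouns.json csv
```

With `choices`, the parser itself rejects an unsupported value, so supporting another format means extending one list rather than adding another flag.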
3 changes: 1 addition & 2 deletions docs/source/scribe_data/wikidata/query_profanity.rst
@@ -24,8 +24,7 @@ Queries all profane words from a given language to be removed from autosuggest o
}.
FILTER EXISTS {?sense wdt:P6191 ?filter.}.
}
}
ORDER BY
lcase(?lemma)
7 changes: 4 additions & 3 deletions docs/source/scribe_data/wikipedia/gen_autosuggestions.rst
@@ -3,9 +3,10 @@ gen_autosuggestions.ipynb

`View code on Github <https://github.com/scribe-org/Scribe-Data/tree/main/src/scribe_data/wikipedia/gen_autosuggestions.ipynb>`_

-Scribe Autosuggest Generation
------------------------------
-
This notebook is used to run the functions found in Scribe-Data to extract, clean and load autosuggestion files into Scribe apps.

+.. toctree::
+
+notebook.ipynb
+
Use the :code:`View code on GitHub` link above to view the notebook and explore the process!