script for smart tribune FAQ data #1673

Closed
@@ -31,4 +31,8 @@ repos:
- repo: https://github.com/hadialqattan/pycln
rev: v2.3.0
hooks:
- id: pycln
- id: pycln
- repo: https://github.com/pypa/pip-audit
rev: v2.7.0
hooks:
- id: pip-audit
@@ -9,7 +9,11 @@ A collection of tools to:

Install tools using Poetry from package directory base:

`poetry install`
`poetry install --no-root`

Install the pre-commit hooks to format code before each commit:

`pre-commit install`

Then run the scripts by passing them to a Python interpreter (>= 3.9):

@@ -49,6 +53,42 @@ Turns a Smart Tribune CSV export file into a ready-to-index CSV file (one 'title
| Some title | http://example.com | This is example text. |
| ... | ... | ... |


### smarttribune_consumer.py

```
Imports Smart Tribune data and formats it for indexing in OpenSearch.

Usage:
  smarttribune_consumer.py [-v] <knowledge_base> <base_url> <output_csv> [options]

Arguments:
  knowledge_base  name of the target knowledge base, e.g. "name1 | name2 | name3"
  base_url        the base URL to prefix to every FAQ entry's query parameter to
                  create a full URL
  output_csv      path to the output, ready-to-index CSV file

Options:
  --tag_title=<value>
  -h --help       Show this screen
  --version       Show version
  -v              Verbose output for debugging (without this option, script will
                  be silent but for errors)

Imports Smart Tribune data through the API and formats it into a ready-to-index
CSV file (one 'title'|'url'|'text' line per filtered entry).
```
Set your APIKEY and APISECRET in a .env file.
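
As an illustration only, the credentials could be read along these lines. This is a minimal standard-library sketch, not the script's actual loading code: the helper names `load_env_file` and `get_credentials` are hypothetical, and the real script may well rely on a dedicated dotenv library instead. Only the variable names `APIKEY` and `APISECRET` come from this README.

```python
import os
from pathlib import Path


def load_env_file(path):
    """Parse simple KEY=VALUE lines from a .env file into a dict (hypothetical helper)."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_credentials(path=".env"):
    """Return (APIKEY, APISECRET), preferring variables already set in the environment."""
    file_values = load_env_file(path) if Path(path).exists() else {}
    key = os.environ.get("APIKEY") or file_values.get("APIKEY")
    secret = os.environ.get("APISECRET") or file_values.get("APISECRET")
    if not key or not secret:
        raise RuntimeError("APIKEY and APISECRET must be set in the environment or in .env")
    return key, secret
```

Failing early with a clear error when either value is missing avoids sending unauthenticated requests to the API.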

Imports data from the Smart Tribune API and returns a ready-to-index CSV file (one 'title'|'url'|'text' line per filtered entry):


| Title | URL | Text |
| ------------ | -------------------- | ----------------------- |
| Some title | http://example.com | This is example text. |
| ... | ... | ... |
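
The output shape in the table above can be sketched as follows. This is a hedged example, not the script's implementation: the function name `write_entries` and the entry keys `title`, `query`, and `text` are assumptions made for illustration, as is the choice of a comma-delimited file with a header row. The only behavior taken from the usage text is that `base_url` is prefixed to each entry's query parameter to build the full URL.

```python
import csv
import io


def write_entries(entries, base_url, out):
    """Write one title/url/text row per entry to an open text stream (hypothetical helper)."""
    writer = csv.DictWriter(out, fieldnames=["title", "url", "text"])
    writer.writeheader()
    for entry in entries:
        writer.writerow(
            {
                "title": entry["title"],
                # The full URL is the base URL followed by the entry's query parameter.
                "url": base_url + entry["query"],
                "text": entry["text"],
            }
        )


# Usage with an in-memory buffer; a real run would open <output_csv> with newline="".
buffer = io.StringIO()
write_entries(
    [{"title": "Some title", "query": "some-title", "text": "This is example text."}],
    "http://example.com/?q=",
    buffer,
)
```

Using `csv.DictWriter` keeps the column order explicit and lets the `csv` module handle quoting if a title or text contains the delimiter.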


#### webscraper.py

```