diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index 8ecc71c8..dade5a9e 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -34,4 +34,4 @@ repos:
      entry: pytest src/tests
      language: system
      pass_filenames: false
-     always_run: true
+     exclude: ".*.md"
diff --git a/alembic/README.md b/alembic/README.md
index 386d5533..26124e62 100644
--- a/alembic/README.md
+++ b/alembic/README.md
@@ -15,7 +15,7 @@ docker build -f alembic/Dockerfile . -t aiod-migration
With the sqlserver container running, you can migrate to the latest schema with:
```commandline
-docker run -v $(pwd)/alembic:/alembic:ro -v $(pwd)/src:/app -it --network aiod_default aiod-migration
+docker run -v $(pwd)/alembic:/alembic:ro -v $(pwd)/src:/app -it --network aiod-rest-api_default aiod-migration
```
Make sure that the specified `--network` is the docker network that has the `sqlserver` container.
The alembic directory is mounted to ensure the latest migrations are available,
diff --git a/docs/Contributing.md b/docs/Contributing.md
new file mode 100644
index 00000000..854139a3
--- /dev/null
+++ b/docs/Contributing.md
@@ -0,0 +1 @@
+# Contributing
diff --git a/docs/Hosting.md b/docs/Hosting.md
new file mode 100644
index 00000000..3c54a7d0
--- /dev/null
+++ b/docs/Hosting.md
@@ -0,0 +1,149 @@
+# Hosting the Metadata Catalogue
This page has information on how to host your own metadata catalogue.
If you plan to locally develop the REST API, please follow the installation procedure in ["Contributing"](../contributing) instead.

## Prerequisites
The platform is tested on Linux, but should also work on Windows and MacOS.
Additionally, it needs [Docker](https://docs.docker.com/get-docker/) and
[Docker Compose](https://docs.docker.com/compose/install/) (version 2.21.0 or higher).

## Installation
Starting the metadata catalogue is as simple as spinning up the docker containers through docker compose.
+This means that other than the prerequisites, no installation steps are necessary.
However, we do need to fetch the files of the latest release of the repository:

=== "CLI (git)"
    ```commandline
    git clone https://github.com/aiondemand/AIOD-rest-api.git
    ```

=== "UI (browser)"

    * Navigate to the project page [aiondemand/AIOD-rest-api](https://github.com/aiondemand/AIOD-rest-api).
    * Click the green `<> Code` button and download the `ZIP` file.
    * Find the downloaded file on disk, and extract the content.

## Starting the Metadata Catalogue
From the root of the project directory (i.e., the directory with the `docker-compose.yaml` file), run:

=== "Shorthand"
    We provide the following script as a convenience.
    This is especially useful when running with a non-default or development configuration,
    more on that later.
    ```commandline
    ./scripts/up.sh
    ```
=== "Docker Compose"
    ```commandline
    docker compose up -d
    ```

This will start a number of services running within one docker network:

 * Database: a [MySQL](https://dev.mysql.com) database that contains the metadata.
 * Keycloak: an authentication service that provides login functionality.
 * Metadata Catalogue REST API: the REST API serving the metadata catalogue.
 * Elasticsearch: indexes metadata catalogue data for faster keyword searches.
 * Logstash: loads data into Elasticsearch.
 * Deletion: takes care of cleaning up deleted data.
 * nginx: redirects network traffic within the docker network.
 * es_logstash_setup: generates scripts for Logstash and creates Elasticsearch indices.

[//]: # (TODO: Make list items link to dedicated pages.)
These services are described in more detail in their dedicated pages.
After the previous command has executed successfully, you can navigate to [localhost](http://localhost)
and see the REST API documentation. This should look similar to the [api.aiod.eu](https://api.aiod.eu) page,
but is connected to your local database and services.
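To verify that everything came up correctly, you can inspect the running services. The commands below are a sketch; the exact service names depend on the `docker-compose.yaml` of your release (`sqlserver` is used here as an example):

```commandline
# List the services in the compose project together with their status and ports.
docker compose ps

# Follow the logs of a single service, e.g. the database container:
docker compose logs --follow sqlserver
```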
+
### Starting Connector Services
To start connector services that automatically index data from external platforms into the metadata catalogue,
you must specify their docker-compose profiles (as defined in the `docker-compose.yaml` file).
For example, you can use the following commands when starting the connectors for OpenML and Zenodo.

=== "Shorthand"
    ```commandline
    ./scripts/up.sh openml zenodo-datasets
    ```
=== "Docker Compose"
    ```commandline
    docker compose --profile openml --profile zenodo-datasets up -d
    ```

The full list of connector profiles is:

- `openml`: indexes datasets and models from OpenML.
- `zenodo-datasets`: indexes datasets from Zenodo.
- `huggingface-datasets`: indexes datasets from Hugging Face.
- `examples`: fills the database with some example data. Do not use in production.

[//]: # (TODO: Link to docs or consolidate in dedicated page.)

## Configuration
There are two main places to configure the metadata catalogue services:
environment variables configured in `.env` files, and REST API configuration in a `.toml` file.
The default files are `./.env` and `./src/config.default.toml`, shown below.

If you want to use non-default values, we strongly encourage you not to overwrite the contents of these files.
Instead, you can create `./override.env` and `./config.override.toml` files to override them.
When using the `./scripts/up.sh` script to launch your services, these overrides are automatically taken into account.

=== "`./src/config.default.toml`"
    ```toml
    --8<-- "./src/config.default.toml"
    ```

=== "`./.env`"
    ```.env
    --8<-- ".env"
    ```

Overwriting these files directly will likely complicate updating to newer releases due to merge conflicts.

## Updating to New Releases

[//]: # (TODO: Publish to docker hub and have the default docker-compose.yaml pull from docker hub instead.)
+
First, stop the running services:
```commandline
./scripts/down.sh
```
Then get the new release:
```commandline
git fetch origin
git checkout vX.Y.Z
```
A new release might come with a database migration.
If that is the case, follow the instructions in ["Database Schema Migration"](#database-schema-migration) below.
The database schema migration must be performed before resuming operations.

Then run the startup commands again (either `up.sh` or `docker compose`).

### Database Schema Migration

We use [Alembic](https://alembic.sqlalchemy.org/en/latest/tutorial.html#running-our-first-migration) to automate database schema migrations
(e.g., adding a table, altering a column, and so on).
Please refer to the Alembic documentation for more information.
The commands below assume that the root directory of the project is your current working directory.

!!! warning

    Database migrations may be irreversible. Always make sure there is a backup of the old database.

Build the database schema migration docker image with:
```commandline
docker build -f alembic/Dockerfile . -t aiod-migration
```

With the sqlserver container running, you can migrate to the latest schema with:

```commandline
docker run -v $(pwd)/alembic:/alembic:ro -v $(pwd)/src:/app -it --network aiod-rest-api_default aiod-migration
```

The default entrypoint of the container upgrades the database to the latest schema.

Make sure that the specified `--network` is the docker network that has the `sqlserver` container.
The alembic directory is mounted to ensure the latest migrations are available,
and the src directory is mounted so the migration scripts can use the classes and variables defined in the project.

[//]: # (TODO: Write documentation for when some of the migrations are not applicable. E.g., when a database was created in a new release.)
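Putting the steps above together, a typical upgrade that includes a schema migration looks as follows. This is a sketch: `vX.Y.Z` stands for the actual release tag, and the `sqlserver` service name is assumed from the default `docker-compose.yaml`.

```commandline
./scripts/down.sh                        # stop the running services
git fetch origin
git checkout vX.Y.Z                      # check out the new release
docker build -f alembic/Dockerfile . -t aiod-migration
docker compose up -d sqlserver           # the migration needs the database container running
docker run -v $(pwd)/alembic:/alembic:ro -v $(pwd)/src:/app -it \
    --network aiod-rest-api_default aiod-migration
./scripts/up.sh                          # resume normal operation
```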
diff --git a/README.md b/docs/README.md
similarity index 97%
rename from README.md
rename to docs/README.md
index 431ca1ab..098482ba 100644
--- a/README.md
+++ b/docs/README.md
@@ -209,7 +209,7 @@
Checkin is strict - as it should be. On our development keycloak, any redirection is
accepted, so that it works on localhost or wherever you deploy.
This should never be the case for a production instance.

-See [authentication README](authentication/README.md) for more information.
+See [authentication README](developer/auth.md) for more information.

### Creating the Database

@@ -243,14 +243,14 @@
start-up work (e.g., populating the database).

#### Database Structure

-The Python classes that define the database tables are found in [src/database/model/](src/database/model/).
+The Python classes that define the database tables are found in [src/database/model/](../src/database/model/).
The structure is based on the
-[metadata schema](https://docs.google.com/spreadsheets/d/1n2DdSmzyljvTFzQzTLMAmuo3IVNx8yposdPLItBta68/edit?usp=sharing).
+[metadata schema](https://github.com/aiondemand/metadata-schema).

## Adding resources

-See [src/README.md](src/README.md).
+See [src/README.md](developer/code.md).

## Backups and Restoration

@@ -313,5 +313,5 @@
To create a new release,
  - Check which services currently work (before the update). It's a sanity check in case a service _doesn't_ work later.
  - Update the code on the server by checking out the release
  - Merge configurations as necessary
-  - Make sure the latest database migrations are applied: see ["Schema Migrations"](alembic/readme.md#update-the-database)
+  - Make sure the latest database migrations are applied: see ["Schema Migrations"](developer/migration.md#update-the-database)
9. Notify everyone (e.g., in the API channel in Slack).
diff --git a/docs/Using.md b/docs/Using.md
new file mode 100644
index 00000000..bf9c2d8a
--- /dev/null
+++ b/docs/Using.md
@@ -0,0 +1,112 @@
+# Using the REST API

The REST API allows you to retrieve, update, or remove asset metadata in the metadata catalogue.
The assets are indexed from many different platforms, such as educational resources from [AIDA](https://www.i-aida.org),
datasets from [HuggingFace](https://huggingface.co), models from [OpenML](https://openml.org), and many more.

The REST API is available at [`https://api.aiod.eu`](https://api.aiod.eu) and documentation on endpoints
is available on complementary [Swagger](https://api.aiod.eu/docs) and [ReDoc](https://api.aiod.eu/redoc) pages.

To use the REST API, simply make HTTP requests to the different endpoints.
Generally, these are `GET` requests when retrieving data, `PUT` requests when modifying data, `POST` requests when adding data, and `DELETE` requests when deleting data.
Here are some examples of how to list datasets in different environments:

=== "Python (requests)"

    This example uses the [`requests`](https://requests.readthedocs.io/en/latest/) library to list datasets.

    ``` python
    import requests
    response = requests.get("https://api.aiod.eu/datasets/v1?schema=aiod&offset=0&limit=10")
    print(response.json())
    ```

=== "CLI (curl)"

    This example uses [curl](https://curl.se/) to retrieve data from the command line.

    ``` commandline
    curl -X 'GET' \
      'https://api.aiod.eu/datasets/v1?schema=aiod&offset=0&limit=10' \
      -H 'accept: application/json'
    ```

Additionally, we also provide an [`aiondemand` package](https://aiondemand.github.io/aiondemand/) for Python
to simplify access to the REST API. You can see an example of that below, and we refer to its dedicated
documentation pages for full installation and usage instructions.
+
```python
import aiod
aiod.datasets.get_list()
```


## Exploring REST API Endpoints
By navigating to the [Swagger documentation](https://api.aiod.eu/docs), you can find information and examples on how to access the different endpoints.

### Retrieving Information
For example, if we navigate to the [`GET /datasets/v1`](https://api.aiod.eu/docs#/datasets/List_datasets_datasets_v1_get)
endpoint and expand the documentation by clicking on the down chevron (`v`), we can see the different query parameters
and can execute a call directly on the API:

![The Swagger documentation allows you to directly query the REST API from your browser.](media/swagger.webp)

Click the `Try it out` button to modify the parameter values, and then click the `Execute` button to make the request directly from the documentation page.
Under the responses you will also see an example of how to make the request through the command line using `curl`, e.g.:

```bash
curl -X 'GET' \
  'https://api.aiod.eu/datasets/v1?schema=aiod&offset=0&limit=10' \
  -H 'accept: application/json'
```

Below the example, you will find a section `Server Response` which displays the actual response from the service (if you clicked `Execute`).
Normally, this should look similar to the image below: an [HTTP status code](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status)
and data (in JSON).

![After executing a query, Swagger shows the JSON response.](media/response.webp)

Below the actual server response is a `Responses` section which lists information about the possible responses, including,
for example, different error codes.

### Modifying Information

!!! tip

    When exploring these endpoints, prefer to connect to the test server instead to avoid editing production data.
    You can find the test API at [https://aiod-dev.i3a.es](https://aiod-dev.i3a.es).

The `POST` and `PUT` endpoints allow the addition or modification of assets on the platform.
+
You can explore them in a similar way as the `GET` endpoints, with two important differences.

The first is that they require authentication.
To authenticate within the Swagger pages, navigate to the top of the page and click `Authorize`.
Scroll to `OpenIdConnect (OAuth2, authorization_code with PKCE)` and click `Authorize` to be taken to
the Keycloak login page. Log in with your preferred identity provider through `EGI Check-in`.

The second important difference is that you provide data through a JSON body instead of individual parameters.
The documentation page will prepopulate example data to help you know what information to provide under
the `Example Value` tab of the `Request Body` section. To know what values are accepted, you can click the
`Schema` tab instead.

![The "schema" tab in Swagger shows allowed types](media/post.webp)


### Alternative Documentation (ReDoc)
The [ReDoc documentation](https://api.aiod.eu/redoc) provides similar functionality to the Swagger documentation.
The main difference is that the Swagger page allows you to run queries against the REST API, whereas the ReDoc documentation does not.
However, some people prefer the organisation of ReDoc,
especially with respect to the documentation of the expected responses and the schema documentation.

## REST API using curl
The Swagger documentation gives examples of how to use `curl` for the various endpoints.
To see examples, simply expand the endpoint's documentation, click `Try it out`, fill in any parameters, and click `Execute`.
The query will be executed, and a matching `curl` command will be generated alongside the result.
+
For example, listing the first 10 datasets:

```bash
curl -X 'GET' \
  'https://api.aiod.eu/datasets/v1?schema=aiod&offset=0&limit=10' \
  -H 'accept: application/json'
```
\ No newline at end of file
diff --git a/docs/developer/auth.md b/docs/developer/auth.md
new file mode 100644
index 00000000..094860d6
--- /dev/null
+++ b/docs/developer/auth.md
@@ -0,0 +1,3 @@
+# Authentication

--8<-- "./authentication/README.md"
\ No newline at end of file
diff --git a/docs/developer/code.md b/docs/developer/code.md
new file mode 100644
index 00000000..e33c82c6
--- /dev/null
+++ b/docs/developer/code.md
@@ -0,0 +1,3 @@
+# Code/Architecture

--8<-- "./src/README.md"
diff --git a/docs/developer/migration.md b/docs/developer/migration.md
new file mode 100644
index 00000000..f40ce0a5
--- /dev/null
+++ b/docs/developer/migration.md
@@ -0,0 +1,3 @@
+# Database Schema Migrations

--8<-- "./alembic/README.md"
\ No newline at end of file
diff --git a/docs/developer/scripts.md b/docs/developer/scripts.md
new file mode 100644
index 00000000..8648b070
--- /dev/null
+++ b/docs/developer/scripts.md
@@ -0,0 +1,3 @@
+# Scripts

--8<-- "scripts/README.md"
\ No newline at end of file
diff --git a/media/AIoD_Metadata_Model.drawio b/docs/media/AIoD_Metadata_Model.drawio
similarity index 100%
rename from media/AIoD_Metadata_Model.drawio
rename to docs/media/AIoD_Metadata_Model.drawio
diff --git a/media/AIoD_Metadata_Model.drawio.png b/docs/media/AIoD_Metadata_Model.drawio.png
similarity index 100%
rename from media/AIoD_Metadata_Model.drawio.png
rename to docs/media/AIoD_Metadata_Model.drawio.png
diff --git a/media/GetDatasetUML.drawio b/docs/media/GetDatasetUML.drawio
similarity index 100%
rename from media/GetDatasetUML.drawio
rename to docs/media/GetDatasetUML.drawio
diff --git a/media/GetDatasetUML.png b/docs/media/GetDatasetUML.png
similarity index 100%
rename from media/GetDatasetUML.png
rename to docs/media/GetDatasetUML.png
diff --git a/docs/media/post.webp
b/docs/media/post.webp
new file mode 100644
index 00000000..fdc373e5
Binary files /dev/null and b/docs/media/post.webp differ
diff --git a/docs/media/response.webp b/docs/media/response.webp
new file mode 100644
index 00000000..f0a6fdff
Binary files /dev/null and b/docs/media/response.webp differ
diff --git a/docs/media/swagger.webp b/docs/media/swagger.webp
new file mode 100644
index 00000000..9c1a8715
Binary files /dev/null and b/docs/media/swagger.webp differ
diff --git a/mkdocs.yaml b/mkdocs.yaml
new file mode 100644
index 00000000..57bc4bcc
--- /dev/null
+++ b/mkdocs.yaml
@@ -0,0 +1,25 @@
+site_name: AI-on-Demand REST API
site_url: https://api.aiod.eu/docs
theme:
  name: material
  features:
    - content.code.copy

nav:
  - Using the API: Using.md
  - Hosting the API: Hosting.md
  - 'Developer Resources': README.md
  - 'Unorganized Docs':
    - 'Code Advice': developer/code.md
    - 'Keycloak': developer/auth.md
    - 'DB Schema Migration': developer/migration.md
    - 'Scripts': developer/scripts.md

markdown_extensions:
  - pymdownx.snippets:
      check_paths: true
  - admonition
  - pymdownx.details
  - pymdownx.superfences
  - pymdownx.tabbed:
      alternate_style: true
\ No newline at end of file
diff --git a/pyproject.toml b/pyproject.toml
index f1f917c9..f693d8b3 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -14,6 +14,7 @@ authors = [
    { name = "Taniya Das", email = "t.das@tue.nl" }
]
dependencies = [
+    "mkdocs-material",
    "urllib3== 2.1.0",
    "bibtexparser==1.4.1",
    "huggingface_hub==0.23.4",