Web services for OpenSpending, responsible for:
- user authentication, identity and access control
- package upload, management, and status
- package search of os-package-registry
- upload to the S3 datastore
os-conductor uses the Flask web framework.
Clone the repo, install dependencies from PyPI, and run the server. See the docs for more information.
The os-types Node utility is used to perform fiscal modelling for the processed datapackage. To install it, use npm:
$ npm install -g os-types
With a running ElasticSearch server available on localhost:9200:
$ pip install tox # install tox
$ tox
OS Conductor requires environment variables to be set, either in the local environment or in a .env file in the root directory.
# Required settings
# Base URL for the application, e.g. 'http://localhost' or 'https://openspending.org'
OS_BASE_URL=
# Address for the postgres instance, e.g. postgresql://postgres@db/postgres
OS_CONDUCTOR_ENGINE=
# Address for ElasticSearch instance
OS_ELASTICSEARCH_ADDRESS=
# OAuth credentials. See the OAuth Credentials section below for details.
OS_CONDUCTOR_SECRETS_0=
OS_CONDUCTOR_SECRETS_1=
OS_CONDUCTOR_SECRETS_2=
OS_CONDUCTOR_SECRETS_3=
# AWS S3 credentials
OS_ACCESS_KEY_ID=
OS_SECRET_ACCESS_KEY=
OS_S3_HOSTNAME=
OS_STORAGE_BUCKET_NAME=
# Optional settings
# Address for memcached server, e.g. http://cache:11211
OS_CONDUCTOR_CACHE=
# Address for the redis os-api-cache server, e.g. redis
OS_API_CACHE=
# If this env var exists, the entrypoint script will check whether ElasticSearch is healthy before allowing os-conductor to start.
OS_CHECK_ES_HEALTHY=
# If using the fake-s3 docker container for development, openspending/fakes3, add these settings:
USE_FAKE_S3=True
OS_S3_PORT=4567
# Comma-separated list of colon-separated API key and user id pairs, to enable secret-based authentication.
# API keys are strong secrets we make up for each client.
# Each userid is the user's id as seen in local storage or in URLs after the user logs in via Google OAuth.
OS_CLIENT_API_KEYS=...apikey...:...userid...,...apikey2...:...userid2...
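To make the OS_CLIENT_API_KEYS format concrete, here is a minimal Python sketch that splits such a value into an API-key-to-userid mapping. It is an illustration only, not os-conductor's own parser: the function name parse_client_api_keys and the sample keys are hypothetical, and the sketch assumes that API keys never contain a colon or a comma.

```python
def parse_client_api_keys(raw):
    """Parse 'key1:user1,key2:user2' into a {api_key: user_id} dict."""
    mapping = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate an empty value or a trailing comma
        # Split on the first colon only (assumption: keys contain no colon).
        api_key, sep, user_id = pair.partition(":")
        if not sep or not api_key or not user_id:
            raise ValueError(f"malformed api-key entry: {pair!r}")
        mapping[api_key] = user_id
    return mapping


# Hypothetical sample value, in the same shape as OS_CLIENT_API_KEYS.
keys = parse_client_api_keys("key-one:user-1,key-two:user-2")
```

Rejecting malformed entries early makes a mistyped value visible at startup instead of surfacing later as a failed authentication.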
OS Conductor needs credentials for authentication and authorization tasks. Credential values are set in the OS_CONDUCTOR_SECRETS_<n> env vars. We provide a Python script to help generate these values in docker/secrets/generate-secrets/to_env_vars.py.
- Create a Google OAuth credential and retain the Client ID and Secret Key.
- Paste the Client ID and Secret Key values into the google.key and google.secret.key files respectively within the generate-secrets directory.
- Run the Python script:
$ cd docker/secrets/generate-secrets
$ python ./to_env_vars.py
- Copy the generated env var key/values into your local environment or the .env file in the root directory.
Various admin tools are available in the /tools directory. Some tools require dependencies to be installed from /tools/requirements.txt.
Remove a named package (or packages) from the ElasticSearch index, and hence from searches and discovery within OpenSpending. Removing a package from the index won't remove it from the AWS datastore.
Conductor currently ships with the following blueprints and their API endpoints.
/datastore/authorize
Method: POST
Query Parameters:
- jwt - permission token (received from /user/authorize)
Headers:
- Auth-Token - permission token (can be used instead of the jwt query parameter)
Body:
JSON content with the following structure:
{
"metadata": {
"owner": "<user-id-of-uploader>",
"name": "<data-set-unique-id>"
},
"filedata": {
"<relative-path-to-file-in-package-1>": {
"length": 1234, // length in bytes of data
"md5": "<md5-hash-of-the-data>",
"type": "<content-type-of-the-data>",
"name": "<file-name>"
},
"<relative-path-to-file-in-package-2>": {
"length": 4321,
"md5": "<md5-hash-of-the-data>",
"type": "<content-type-of-the-data>",
"name": "<file-name>"
}
...
}
}
owner must match the userid that is in the authentication token.
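As a rough illustration of the /datastore/authorize request body, the following Python sketch assembles the metadata and filedata structure from in-memory file contents. The helper names describe_file and build_authorize_body are hypothetical. Two details are assumptions, not something this document states: that the md5 value is a hex digest (the service may expect another encoding, such as base64), and that name is the last component of the relative path.

```python
import hashlib
import posixpath


def describe_file(relative_path, data, content_type):
    """Build one 'filedata' entry for a file given as bytes."""
    return {
        "length": len(data),  # length in bytes of the data
        # ASSUMPTION: hex digest; the expected md5 encoding is not specified here.
        "md5": hashlib.md5(data).hexdigest(),
        "type": content_type,
        # ASSUMPTION: the file name is the last component of the relative path.
        "name": posixpath.basename(relative_path),
    }


def build_authorize_body(owner, dataset_name, files):
    """Assemble the JSON body; files maps relative path -> (bytes, content type)."""
    return {
        "metadata": {"owner": owner, "name": dataset_name},
        "filedata": {
            path: describe_file(path, data, content_type)
            for path, (data, content_type) in files.items()
        },
    }


# Hypothetical usage with a tiny in-memory file.
body = build_authorize_body(
    "user-1", "my-dataset", {"data/a.csv": (b"abc", "text/csv")}
)
```

The resulting dictionary can be serialized with json.dumps and sent as the POST body; owner should be the userid carried by the token.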
/datastore/info
Method: GET
Query Parameters:
- jwt - permission token (received from /user/authorize)
Headers:
- Auth-Token - permission token (can be used instead of the jwt query parameter)
Returns:
JSON content with the following structure:
{
"prefixes": [
"https://datastore.openspending.org/123456789",
...
]
}
prefixes is the list of possible prefixes for an uploaded file for this user.
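The datastore endpoints accept the permission token either as the jwt query parameter or in the Auth-Token header. A minimal sketch of the header variant for /datastore/info, using only the Python standard library, might look like this. The function name datastore_info_request is hypothetical, and the sketch assumes the blueprint is mounted directly under the base URL, which may differ in a given deployment.

```python
import urllib.request


def datastore_info_request(base_url, permission_token):
    """Prepare (but do not send) a GET request for /datastore/info."""
    return urllib.request.Request(
        # ASSUMPTION: the endpoint lives directly under the base URL.
        f"{base_url.rstrip('/')}/datastore/info",
        headers={"Auth-Token": permission_token},
        method="GET",
    )


# Hypothetical usage; pass the result to urllib.request.urlopen() to send it.
request = datastore_info_request("http://localhost", "<permission-token>")
```

Sending the token in a header keeps it out of URLs, which tend to end up in server logs and browser history.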
/package/upload
Method: POST
Query Parameters:
- jwt - permission token (received from /user/authorize)
- datapackage - URL of the Fiscal DataPackage to load
/package/status
Method: GET
Query Parameters:
- datapackage - URL of the Fiscal DataPackage being loaded
Returns:
{
"status": "<status-code>",
"progress": 123,
"error": "<error-message-if-applicable>"
}
status-code: one of the following:
- queued: Waiting in queue for an available processor
- initializing: Getting ready to load the package
- loading-datapackage: Reading the Fiscal Data Package
- validating-datapackage: Validating Data Package correctness
- loading-resource: Loading Resource data
- deleting-table: Clearing previous rows for this dataset from the database
- creating-table: Preparing space for rows in the database
- loading-data-ready: Starting to load rows to database
- loading-data: Loading data into the database
- creating-babbage-model: Converting the Data Package into an API model
- saving-metadata: Saving package metadata
- done: Done
- fail: Failed
progress: number of records loaded so far
Will return an HTTP 404 if the package is not being loaded right now.
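A client will typically poll /package/status until the load reaches done or fail. The sketch below shows one way to do that with the Python standard library. The helper names are hypothetical, the endpoint is assumed to be mounted directly under the base URL, and the 404 case is mapped to None to reflect the "not being loaded right now" behaviour described above.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

# The two status codes after which no further progress is reported.
TERMINAL_STATUSES = {"done", "fail"}


def status_url(base_url, datapackage_url):
    """Build the /package/status URL for a given datapackage URL."""
    query = urllib.parse.urlencode({"datapackage": datapackage_url})
    # ASSUMPTION: the endpoint lives directly under the base URL.
    return f"{base_url.rstrip('/')}/package/status?{query}"


def is_finished(status_response):
    """True once the reported status is 'done' or 'fail'."""
    return status_response.get("status") in TERMINAL_STATUSES


def fetch_status(base_url, datapackage_url):
    """Return the parsed status JSON, or None if the server answers 404."""
    try:
        with urllib.request.urlopen(status_url(base_url, datapackage_url)) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None  # the package is not being loaded right now
        raise
```

A polling loop would call fetch_status at an interval of its choosing and stop when the result is None or is_finished returns True.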
/package/publish
Method: POST
Query Parameters:
- jwt - permission token (received from /user/authorize)
- id - Unique identifier of the datapackage to modify
- publish - Publishing status, one of:
  - true: force publish
  - false: force private
  - toggle: toggle the state
Returns:
{
"success": true,
"published": true // or false
}
/search/package
Method: GET
Query Parameters:
- jwt - authentication token (received from /user/check)
- q - match-all query string
- package.title - filter by package title
- package.author - filter by package author
- package.description - filter by package description
- package.regionCode - filter by package region code
- package.countryCode - filter by package country code
- package.packageCode - filter by package code
- size - number of results to return
All values for all parameters (except jwt) should be passed as JSON values.
Returns:
All packages that match the filter.
If an authentication token is provided, private packages belonging to the authenticated user are also included. Otherwise, only public packages are returned.
[
{
"id": "<package-unique-id>",
"model": { ... }, // Babbage model
"package": { .... }, // Original FDP
"origin_url": "<url-to-the-datapackage.json>"
}
]
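The requirement that /search/package parameter values be passed as JSON means that a string filter is sent with its surrounding quotes, while a number is sent bare. A minimal Python sketch of building such a query string follows; the function name search_query and the sample filter values are hypothetical.

```python
import json
import urllib.parse


def search_query(filters, jwt=None):
    """Encode search filters as JSON values in a URL query string."""
    # Every filter value is JSON-encoded first, then URL-encoded.
    params = {name: json.dumps(value) for name, value in filters.items()}
    if jwt is not None:
        params["jwt"] = jwt  # the token is passed as-is, not as JSON
    return urllib.parse.urlencode(params)


# Hypothetical usage: a free-text query limited to ten results.
query = search_query({"q": "budget", "size": 10})
```

Here the q value is sent as the JSON string "budget", quotes included and percent-encoded, while size stays a plain number.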
/user/check
Method: GET
Query Parameters:
- jwt - authentication token
- next - URL to redirect to when authentication is finished
Returns:
If authenticated:
{
"authenticated": true,
"profile": {
"id": "<user-id>",
"name": "<user-name>",
"email": "<user-email>",
"avatar_url": "<url-for-user's-profile-photo>",
"idhash": "<unique-id-of-the-user>",
"username": "<user-selected-id>" // If user has a username
}
}
If not:
{
"authenticated": false,
"providers": {
"google": {
"url": "<url-for-logging-in-with-the-Google-provider>"
}
}
}
When the authentication flow is finished, the caller will be redirected to the next URL with an extra query parameter, jwt, which contains the authentication token. The caller should cache this token for further interactions with the API.
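The redirect step of the /user/check flow can be sketched in Python as two small helpers: one builds the check URL with a next parameter, the other pulls the jwt parameter out of the URL the caller lands on afterwards. The function names and example URLs are hypothetical, and the endpoint is assumed to be mounted directly under the base URL.

```python
import urllib.parse


def check_url(base_url, next_url, jwt=None):
    """Build the /user/check URL, optionally with a cached token."""
    params = {"next": next_url}
    if jwt is not None:
        params["jwt"] = jwt
    # ASSUMPTION: the endpoint lives directly under the base URL.
    return f"{base_url.rstrip('/')}/user/check?{urllib.parse.urlencode(params)}"


def extract_jwt(redirected_url):
    """Return the jwt query parameter of the redirect target, or None."""
    query = urllib.parse.urlsplit(redirected_url).query
    values = urllib.parse.parse_qs(query).get("jwt")
    return values[0] if values else None


# Hypothetical usage with a made-up callback URL.
token = extract_jwt("https://app.example/callback?jwt=aaa.bbb.ccc")
```

The extracted token is the value to cache and pass as jwt on later calls such as /user/authorize.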
/user/authorize
Method: GET
Query Parameters:
- jwt - user token (received from /user/check)
- service - the relevant service (e.g. os.datastore)
Returns:
{
"token": "<token-for-the-relevant-service>",
"userid": "<unique-id-of-the-user>",
"permissions": {
"permission-x": true,
"permission-y": false
},
"service": "<relevant-service>"
}
Note: as yet, the permissions property is still returned empty. Real permissions will be implemented soon.
/user/update
Method: POST
Query Parameters:
- jwt - authentication token (received from /user/check)
- username - A new username for the user profile (this action is only allowed once)
Returns:
{
"success": true,
"error": "<error-message-if-applicable>"
}
Note: trying to update other user profile fields like email will fail silently and return
{
"success": true
}
/user/public-key
Method: GET
Returns:
The conductor's public key in PEM format.
Can be used by services to validate that the permission token is authentic.
/user/lib
Method: GET
Returns:
Authentication JavaScript library with Angular 1.x binding.