TB-397 remove jembi superset dependency #269

Merged: 17 commits, Mar 19, 2024
Commits
68fa993
Move custom superset dependencies from jembi/superset to the superset…
arran-standish Feb 23, 2024
db0bc86
Update the superset connection string to follow the new clickhouse-co…
arran-standish Feb 23, 2024
db0de1d
Make build-image tag to the version in config.yaml if no tag passed i…
arran-standish Feb 23, 2024
7249741
Make superset only deploy to a specific node in cluster (volume data …
arran-standish Feb 23, 2024
1d09490
Parameterise superset image
arran-standish Feb 23, 2024
d9b953b
Merge branch 'main' of https://github.com/jembi/platform into tb-397-…
arran-standish Feb 28, 2024
f58664f
Add postgres to replace superset default sqlite metastore
arran-standish Feb 29, 2024
7f752c1
Remvoe unnecessary volume definitions as they prevent superset versio…
arran-standish Feb 29, 2024
7edc1e4
Add details on how to upgrade superset + rollback a superset upgrade
arran-standish Feb 29, 2024
d92e504
Parameterise superset feature flags
arran-standish Mar 1, 2024
38974cd
Merge branch 'main' into tb-397-remove-jembi-superset-dependency
arran-standish Mar 1, 2024
2abac6b
Merge https://github.com/jembi/platform into tb-397-remove-jembi-supe…
arran-standish Mar 1, 2024
187dac2
Simplify image tag determination + remove yq dependency
arran-standish Mar 6, 2024
0d0e164
Use postgres image instead, since we dont replicated data
arran-standish Mar 6, 2024
e97172c
Merge branch 'tb-397-remove-jembi-superset-dependency' of https://git…
arran-standish Mar 6, 2024
87ceb50
Fix postgres cluster compose name
arran-standish Mar 12, 2024
6887cc2
Pin superset to current latest version
arran-standish Mar 12, 2024
12 changes: 12 additions & 0 deletions build-image.sh
@@ -1,3 +1,15 @@
#!/bin/bash
TAG_NAME=${1:-latest}

# No tag was passed in, so try to use the tag in config.yaml if present
if [ -z "$1" ]; then
  # we grep out 'image: jembi/platform:2.x', cut it on ':' and take the third field
  # this will always be the image tag or an empty string
  ImageTag=$(grep 'image:' "${PWD}/config.yaml" | cut -d : -f 3)
  # only overwrite TAG_NAME if we have a tag present, and it's not just the base image name
  if [ -n "$ImageTag" ]; then
    TAG_NAME=${ImageTag}
  fi
fi

docker build -t jembi/platform:"$TAG_NAME" .
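The tag lookup above can be mirrored in Python to make its edge cases easier to see. This is only an illustrative sketch: `tag_from_config` is a hypothetical helper, it looks at the first matching line only, and the `strip()` call is an addition that `cut` itself does not perform.

```python
def tag_from_config(config_text: str, default: str = "latest") -> str:
    """Mimic `grep 'image:' config.yaml | cut -d : -f 3` for the first match."""
    for line in config_text.splitlines():
        if "image:" in line:
            fields = line.split(":")
            # cut -f 3 selects the third colon-separated field, or nothing if absent
            tag = fields[2].strip() if len(fields) > 2 else ""
            # An empty tag leaves the default in place, as the script does
            return tag or default
    return default


print(tag_from_config("image: jembi/platform:2.1.0"))
print(tag_from_config("image: jembi/platform"))
```

One consequence of selecting the third field rather than the last one: an image reference that includes a registry port (for example `localhost:5000/jembi/platform:2.1.0`) would yield `5000/jembi/platform` instead of the tag.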
5 changes: 5 additions & 0 deletions dashboard-visualiser-superset/config/requirements-local.txt
@@ -0,0 +1,5 @@
psycopg2>=2.9.9
clickhouse-connect>=0.7.0
flask-oidc>=1.3.0
itsdangerous>=2.0.1
flask_openid>=1.3.0
19 changes: 15 additions & 4 deletions dashboard-visualiser-superset/config/superset_config.py
@@ -22,16 +22,27 @@
 #

 # A list of available feature flags is available at https://github.com/apache/superset/blob/master/RESOURCES/FEATURE_FLAGS.md
-# FEATURE_FLAGS = {
-# }
+# The environment variable SUPERSET_ENABLED_FEATURE_FLAGS (e.g. "DASHBOARD_RBAC,ENABLE_TEMPLATE_PROCESSING") is a comma-separated list of flags to enable.
+# It allows flags to be enabled or disabled without having to redeploy the superset package.
+import os
+
+FLAGS = os.getenv('SUPERSET_ENABLED_FEATURE_FLAGS')
+FEATURE_FLAGS = { key: True for key in FLAGS.split(',') if key != '' }

 # Variables for use in Superset with Jinja templating
 # JINJA_CONTEXT_ADDONS = {
 # }

+# --------------------------- POSTGRESQL METASTORE ----------------------------

-# ---------------------------KEYCLOACK ----------------------------
-import os
+METASTORE_USERNAME = os.getenv('SUPERSET_POSTGRESQL_USERNAME')
+METASTORE_PASSWORD = os.getenv('SUPERSET_POSTGRESQL_PASSWORD')
+METASTORE_DATABASE = os.getenv('SUPERSET_POSTGRESQL_DATABASE')
+METASTORE_URL = os.getenv('SUPERSET_POSTGRESQL_URL')
+SQLALCHEMY_DATABASE_URI = f'postgresql://{METASTORE_USERNAME}:{METASTORE_PASSWORD}@{METASTORE_URL}/{METASTORE_DATABASE}'
+
+
+# --------------------------- KEYCLOACK ----------------------------

 KC_SUPERSET_SSO_ENABLED = os.getenv('KC_SUPERSET_SSO_ENABLED')
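
The feature-flag parsing in this hunk could also be written so that it tolerates an unset variable and stray whitespace around flag names. The following is a minimal sketch under that assumption; `enabled_feature_flags` is a hypothetical helper name, not something Superset provides.

```python
import os


def enabled_feature_flags(raw):
    """Turn a comma-separated string into a {flag: True} dict, ignoring blanks."""
    # `raw or ""` covers the case where the environment variable is not set at all
    names = (name.strip() for name in (raw or "").split(","))
    return {name: True for name in names if name}


FEATURE_FLAGS = enabled_feature_flags(os.getenv("SUPERSET_ENABLED_FEATURE_FLAGS"))
```

With this shape, a value such as `"DASHBOARD_RBAC, ENABLE_TEMPLATE_PROCESSING"` enables both flags, and a missing variable simply yields an empty dict.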

@@ -0,0 +1,8 @@
version: '3.9'

services:
  postgres-metastore:
    deploy:
      placement:
        constraints:
          - "node.labels.name==node-2"
24 changes: 24 additions & 0 deletions dashboard-visualiser-superset/docker-compose.postgres.yml
@@ -0,0 +1,24 @@
version: "3.9"

services:
  postgres-metastore:
    image: postgres:16.2
    environment:
      POSTGRES_USER: ${SUPERSET_POSTGRESQL_USERNAME}
      POSTGRES_PASSWORD: ${SUPERSET_POSTGRESQL_PASSWORD}
      POSTGRES_DB: ${SUPERSET_POSTGRESQL_DATABASE}
    volumes:
      - "superset-postgres-data:/var/lib/postgresql/data"
    deploy:
      replicas: 1
      resources:
        limits:
          memory: ${SUPERSET_POSTGRES_MEMORY_LIMIT}
    networks:
      default:

volumes:
  superset-postgres-data:

networks:
  default:
20 changes: 14 additions & 6 deletions dashboard-visualiser-superset/docker-compose.yml
@@ -2,7 +2,7 @@ version: '3.9'

 services:
   dashboard-visualiser-superset:
-    image: jembi/superset:latest
+    image: ${SUPERSET_IMAGE}
     environment:
       KC_SUPERSET_SSO_ENABLED: ${KC_SUPERSET_SSO_ENABLED}
       KC_SUPERSET_CLIENT_ID: ${KC_SUPERSET_CLIENT_ID}
@@ -13,11 +13,14 @@ services:
       SUPERSET_SECRET_KEY: ${SUPERSET_SECRET_KEY}
       AUTH_USER_REGISTRATION_ROLE: ${AUTH_USER_REGISTRATION_ROLE}
       SUPERSET_SERVER_ROOT_URL: ${KC_SUPERSET_ROOT_URL}
+      SUPERSET_POSTGRESQL_USERNAME: ${SUPERSET_POSTGRESQL_USERNAME}
+      SUPERSET_POSTGRESQL_PASSWORD: ${SUPERSET_POSTGRESQL_PASSWORD}
+      SUPERSET_POSTGRESQL_DATABASE: ${SUPERSET_POSTGRESQL_DATABASE}
+      SUPERSET_POSTGRESQL_URL: ${SUPERSET_POSTGRESQL_URL}
+      SUPERSET_ENABLED_FEATURE_FLAGS: ${SUPERSET_ENABLED_FEATURE_FLAGS}
     volumes:
       - superset_home:/app/superset_home
-      - superset:/app/superset
-      - superset-frontend:/app/superset-frontend
-    command: sh -c "superset fab create-admin \ --username ${SUPERSET_USERNAME} \ --firstname ${SUPERSET_FIRSTNAME} \ --lastname ${SUPERSET_LASTNAME} \ --email ${SUPERSET_EMAIL} \ --password ${SUPERSET_PASSWORD} && superset db upgrade && superset init && cd /usr/bin && ./run-server.sh"
+    command: sh -c "pip install --no-cache-dir -r "/app/docker/requirements-local.txt" && superset fab create-admin \ --username ${SUPERSET_USERNAME} \ --firstname ${SUPERSET_FIRSTNAME} \ --lastname ${SUPERSET_LASTNAME} \ --email ${SUPERSET_EMAIL} \ --password ${SUPERSET_PASSWORD} && superset db upgrade && superset init && cd /usr/bin && ./run-server.sh"
     configs:
       - source: superset_config.py
         target: /app/pythonpath/superset_config.py
@@ -27,6 +30,8 @@ services:
         target: /app/pythonpath/client_secret.json
       - source: keycloack_security_manager.py
         target: /app/pythonpath/keycloack_security_manager.py
+      - source: requirements-local.txt
+        target: /app/docker/requirements-local.txt
     networks:
       clickhouse:
       keycloak:
@@ -49,11 +54,14 @@ configs:
     name: keycloack_security_manager.py-${keycloack_security_manager_py_DIGEST:?err}
     labels:
       name: superset
+  requirements-local.txt:
+    file: ./config/requirements-local.txt
+    name: requirements-local.txt-${requirements_local_txt_DIGEST:?err}
+    labels:
+      name: superset

 volumes:
   superset_home:
-  superset:
-  superset-frontend:

 networks:
   clickhouse:
Binary file not shown.
9 changes: 8 additions & 1 deletion dashboard-visualiser-superset/package-metadata.json
@@ -6,6 +6,8 @@
   "version": "0.0.1",
   "dependencies": ["analytics-datastore-clickhouse"],
   "environmentVariables": {
+    "SUPERSET_IMAGE": "apache/superset:3.1.1",
+    "SUPERSET_ENABLED_FEATURE_FLAGS": "DASHBOARD_RBAC",
     "SUPERSET_USERNAME": "admin",
     "SUPERSET_FIRSTNAME": "SUPERSET",
     "SUPERSET_LASTNAME": "ADMIN",
@@ -21,6 +23,11 @@
     "KC_REALM_NAME": "platform-realm",
     "KC_FRONTEND_URL": "http://localhost:9088",
     "KC_API_URL": "http://identity-access-manager-keycloak:8080",
-    "AUTH_USER_REGISTRATION_ROLE": "Admin"
+    "AUTH_USER_REGISTRATION_ROLE": "Admin",
+    "SUPERSET_POSTGRES_MEMORY_LIMIT": "1G",
+    "SUPERSET_POSTGRESQL_USERNAME": "admin",
+    "SUPERSET_POSTGRESQL_PASSWORD": "admin",
+    "SUPERSET_POSTGRESQL_DATABASE": "superset",
+    "SUPERSET_POSTGRESQL_URL": "postgres-metastore:5432"
   }
 }
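
The `SUPERSET_POSTGRESQL_*` defaults above are the pieces that `superset_config.py` joins into `SQLALCHEMY_DATABASE_URI`. The sketch below shows that assembly with one extra precaution: if a deployment overrides the password with one containing characters such as `@`, `:` or `/`, the credentials probably need URL-encoding before they go into the URI. `metastore_uri` is a hypothetical helper for illustration, and the encoding step is an assumption about how an overridden password might look, not part of this PR.

```python
from urllib.parse import quote_plus


def metastore_uri(username: str, password: str, host: str, database: str) -> str:
    """Build a postgresql:// URI, URL-encoding the credentials."""
    # quote_plus leaves plain alphanumeric values such as "admin" unchanged
    return f"postgresql://{quote_plus(username)}:{quote_plus(password)}@{host}/{database}"


# With the package defaults the result matches the plain f-string form
print(metastore_uri("admin", "admin", "postgres-metastore:5432", "superset"))
# → postgresql://admin:admin@postgres-metastore:5432/superset
```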
7 changes: 7 additions & 0 deletions dashboard-visualiser-superset/swarm.sh
@@ -33,6 +33,7 @@ function import_sources() {

function initialize_package() {
  local superset_dev_compose_filename=""
  local superset_postgres_cluster_compose_filename=""

  if [[ "${MODE}" == "dev" ]]; then
    log info "Running package in DEV mode"
@@ -41,10 +42,16 @@ function initialize_package() {
    log info "Running package in PROD mode"
  fi

  if [[ "${CLUSTERED_MODE}" == "true" ]]; then
    superset_postgres_cluster_compose_filename="docker-compose.postgres.cluster.yml"
  fi

  # Replace env vars
  envsubst <"${COMPOSE_FILE_PATH}/config/client_secret_env.json" >"${COMPOSE_FILE_PATH}/config/client_secret.json"

  (
    docker::deploy_service $STACK "${COMPOSE_FILE_PATH}" "docker-compose.postgres.yml" "$superset_postgres_cluster_compose_filename"

    docker::deploy_service $STACK "${COMPOSE_FILE_PATH}" "docker-compose.yml" "$superset_dev_compose_filename"
  ) || {
    log error "Failed to deploy package"
31 changes: 31 additions & 0 deletions documentation/packages/dashboard-visualiser-superset/README.md
@@ -6,3 +6,34 @@ description: >-

# Dashboard Visualiser - Superset

## Version upgrade process (with rollback capability)
By default, if you update the image that the superset service uses to a later version, the container automatically runs a database migration when it is scheduled, and the version of superset is upgraded. The problem is that if the newer version has an issue you cannot roll back the upgrade: the migrated database causes the older version to throw an error, and the container will no longer start.
As such, it is recommended to create a postgres dump of the superset postgres database before attempting to upgrade superset's version.
1. Exec into the postgres container as the root user (otherwise you will get write permission issues)
```bash
docker exec -u root -it superset_postgres-metastore-1.container-id-here bash
```
2. Run the pg_dump command on the superset database. The database name is stored in `SUPERSET_POSTGRESQL_DATABASE` and defaults to `superset`; the user is stored in `SUPERSET_POSTGRESQL_USERNAME` and defaults to `admin`.
```bash
pg_dump superset -c -U admin > superset_backup.sql
```
3. Copy the dumped SQL script out of the container
```bash
docker cp superset_postgres-metastore-1.container-id-here:/superset_backup.sql /path/to/save/to/superset_backup.sql
```
4. Update the superset version, either through a platform deploy or directly on the server with a docker command: `docker service update superset_dashboard-visualiser-superset --image apache/superset:tag`

### Rolling back an upgrade
If something goes wrong, you will also need to roll back the database changes, i.e. run the superset_backup.sql script that was created before upgrading the superset version.
1. Copy the superset_backup.sql script into the container
```bash
docker cp /path/to/save/to/superset_backup.sql superset_postgres-metastore-1.container-id-here:/superset_backup.sql
```
2. Exec into the postgres container
```bash
docker exec -it superset_postgres-metastore-1.container-id-here bash
```
3. Run the SQL script (`-d superset` is the database name stored in `SUPERSET_POSTGRESQL_DATABASE`)
```bash
cat superset_backup.sql | psql -U admin -d superset
```