Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: enable ssh tunneling using env variables for superset #60

Closed
wants to merge 2 commits into from

Conversation

njuguna-n
Copy link
Contributor

#56

Enable SSH tunneling for superset

Copy link
Contributor

@lorerod lorerod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @njuguna-n, thank you for these changes.
Please add steps on the README file to explain setting up SSH tunneling.
I also left some comments inline.

}

ENABLE_PROXY_FIX = True
SECRET_KEY = "YOUR_OWN_RANDOM_GENERATED_STRING"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there another way to set this secret key? Maybe to also use an environment variable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that should also be set with an env variable. I will make the change.

@@ -26,3 +26,4 @@ COUCHDB_SECURE=false
# superset: required environment variables for 'gamma', 'prod' and 'local'
SUPERSET_PASSWORD=password
[email protected]
SUPERSET_SSH_TUNNELING=false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
SUPERSET_SSH_TUNNELING=false
SUPERSET_SSH_TUNNELING=False

With this setting: SUPERSET_SSH_TUNNELING=false in my .env file DBT container exits with errors.
It works when I change this line to SUPERSET_SSH_TUNNELING=False
Please add this detail to the README.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching that. I will make the update.

@mrjones-plip
Copy link
Contributor

mrjones-plip commented Dec 5, 2023

Hey all! Amazing timing on this PR! I was just working on another superset deployment, and didn't know that there was built in support for SSH tunnels. Thank you for the hard work and PR here <3

What about more standard docker compose based deployment of Superset? There might be some history I'm missing which could make me appreciate why we're chose to use a bespoke compose solution for CHT Sync's Superset. I'm for sure coming into this late in the game, so there may be some real good reasons we're here!

For example, I notice our compose file is limited as compared to the Superset's production one (aka docker-compose-non-dev.yml). I believe this is forcing us to compile a local image which statically includes the superset_config.py file. If we use Superset's we get a bind mounted version which might avoid compiling. The SECRET_KEY is set in their example production .env file, which we're not leveraging either.

I'm concerned about the lack of a caching layer, like Redis, which Superset's production deployments include. When I worked on the Covid MoH Mali Superset dashboard, there was a point when we realized the system was so slow because there was no cache - likely it was falling back to filesystem cache or, worse, no cache (I failed to find Superset's default 🤷 ).

@mrjones-plip
Copy link
Contributor

Oh - I do see that production install flow is "quick, login with admin / admin and change you're password" which is kind of a loser move for an internet exposed instance. Fix it in our docs?

@njuguna-n
Copy link
Contributor Author

@mrjones-plip I was created a ticket today specifically to handle how we set up superset especially for production so your comment has come at a handy time as well 😄. The initial idea of adding the superset image the way we did was to limit the number of Docker containers required to get cht-sync up and running, we already have 8 containers that are used in the dev environment. This approach definitely needs to change for production as you rightly pointed out

@mrjones-plip
Copy link
Contributor

oh - that totally makes sense @njuguna-n to have a simplified dev environment - thanks! I suspect we can follow Superset's lead and have dev and prod compose files an environment files. Lemme know if you need a hand.

Also - I just tested with vanilla 2.1.2 docker compose production install it looks like maybe SSH Tunnels are supported with no extra config? I think maybe due to a change in the default. Here's my test in the GUI ( but didn't actual test a live tunnel):

superset.tunnel.mp4

@njuguna-n
Copy link
Contributor Author

Is the SHH Tunnel option still there if you click on the SQLAlchemy URI string option?

@njuguna-n
Copy link
Contributor Author

njuguna-n commented Dec 5, 2023

I suspect we can follow Superset's lead and have dev and prod compose files an environment files.

Yes, it makes sense to follow do that. Do you think we still need a cache for superset on the dev instance?

@mrjones-plip
Copy link
Contributor

Do you think we still need a cache for superset on the dev instance?

No - should be fine with without. KISS for dev , most def!

Is the SHH Tunnel option still there if you click on the SQLAlchemy URI string option?

Hmm - maybe? I don't know how that feature works, but here's what I see:

alchemy.mp4

@njuguna-n
Copy link
Contributor Author

There should be a toggle right above the Test Connection button. Maybe the default does not work. I am using this branch which has version 2.1.2 as well.

Screen.Recording.2023-12-06.at.03.09.03.mov

@mrjones-plip
Copy link
Contributor

@njuguna-n - we got the tunnel working, but for sure had to use the superset_config_docker.py trick (even in Superset 3.x). Here's the steps we took, and if you don't do that, you see this:

image

Still no testing of the SQLAlchemy URI string option - but happy to jump on a call and test as I have SSH and Postgres credentials that work!

@njuguna-n njuguna-n requested a review from lorerod January 4, 2024 14:19
@njuguna-n
Copy link
Contributor Author

@mrjones-plip I pushed a commit to change how the superset container is created. Please have a look.

@mrjones-plip
Copy link
Contributor

@njuguna-n - wow! Some really nice work here at improving CHT Sync to not build custom images, use version pinned upstream images of all the important services and use docker compose inheritance to chain together multiple configs. Excellent!

I trust the teams testing on this, so I haven't dived into running the code, but I did notice one possible error here in the readme in prod where you have an optional step 2 of :

docker-compose -f docker-compose.postgres.yml -f docker-compose.yml up postgres

Two changes I think are needed:

  1. add a trailing -d to make it run in the background like the rest of the up calls
  2. remove f docker-compose.yml as this will start logstash and dbt which we start in the npm calls.

Copy link
Contributor

@lorerod lorerod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please update the e2e test to successfully connect to the superset instance?

@njuguna-n
Copy link
Contributor Author

Closing this PR since we have decided not to include Superset in this repo for now. More context on this in slack.

@njuguna-n njuguna-n closed this Jan 16, 2024
@njuguna-n njuguna-n deleted the 56-enable-ssh-tunneling branch January 16, 2024 13:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants