Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for PostgREST #7

Open
moshthepitt opened this issue May 28, 2020 · 12 comments
Open

Add support for PostgREST #7

moshthepitt opened this issue May 28, 2020 · 12 comments
Assignees

Comments

@moshthepitt
Copy link
Contributor

It would be useful to interact with the database directly. Especially for instances when the web applications that rely on this server need to read and write from Canopy databases. Currently this is done through Superset using our Superset Connector.

Why is using Superset this way bad?

  • Superset is not meant to be used as an API to the database
  • Setting up Superset to be used this way introduces an intermediate manual step where Superset slices need to be created for each piece of data that needs to be accessed by the web application
  • The calls through Superset are slow
  • SupersetConnector's usage of OAuth is not standard, and is hacky

Using PostgREST would mean that we cut out the middle-man (Superset) and communicate directly with the database.

Why PostgREST

  • It gives us simple direct access to the database with no additional overheads
  • It is a small "sharp" tool that does only one thing very well, and gives us what we need: a way to access the database
  • It is a transparent API to the database and does not impose its own opinions, etc. The database is the source of truth for everything

Working with PostgREST for front-end applications

postgrest

Let's explain the diagram from left to right

Front-end web application

The front end web application will generate PostgREST API requests and send them to the express-server

Express Server

The express server acts as a proxy between the front-end application and PostgREST. The goal is to make it such that the front end application does not need to authenticate against or even "know" about how PostgREST is set up internally.

Therefore, the express server receives requests from the front end app, sends them to PostgREST, receives the results from PostgREST and then sends them back to the front end application. This is in accordance with the decisions made here.

Doing it this way will also mean that we don't expose PostgREST to the world. Access to it can (and should) be limited to only the express server.

PostgREST

PostgREST will, of course, provide a REST API for communicating with the database.

Postgres Database

This represents the PostgreSQL database(s) that we we want to communicate with.

Potential problems with this approach

  • This only works for Postgres. But then again, we only use Postgres (so far).
@moshthepitt moshthepitt self-assigned this May 28, 2020
@moshthepitt
Copy link
Contributor Author

@moshthepitt
Copy link
Contributor Author

For extra clarity: PostgREST itself will be a tool "owned" by the Canopy team, as way to conveniently access Canopy databases.

@moshthepitt
Copy link
Contributor Author

This, or something like this, is needed to solve onaio/reveal-frontend#838 and onaio/reveal-frontend#837

@mberg
Copy link
Member

mberg commented May 28, 2020

I like this idea a lot. I really want to get away from the way we are using superset.

If we assume this is the architure for opensrp web. Do we want to add in the node server or is this something we would want to consider adding to OpenSRP server?

@mberg
Copy link
Member

mberg commented May 28, 2020

Does PGRest provide a good endpoint for data exports?

@alepietrobon
Copy link

alepietrobon commented May 28, 2020

This is good. I'm as supportive as anyone in reducing our hacky superset use. :)
Just wanted to confirm if we are sure REST apis are the way to go here. I'm no expert but i know in my previous company we did a ton of work to transition to graphql so i wanted to confirm if that was considered and if we've evaluated the tradeoffs (https://www.howtographql.com/basics/1-graphql-is-the-better-rest/)

@gstuder-ona
Copy link

+10 on replacing the superset slice connector - it'll also push us toward using postgres views vs custom superset queries as the backends of our superset charts.

This, or something like this, is needed to solve onaio/reveal-frontend#838 and onaio/reveal-frontend#837

I'm 100% in favor of postgrest, but it worries me at first glance to bypass our current ingestion tools (NiFi) via this API - it seems more in line to have a NiFi endpoint so we don't tightly couple the reporting schema with the web ui, push data warehouse processing (maybe joining with public geo data, etc.) into the web application itself, etc.

@moshthepitt
Copy link
Contributor Author

If we assume this is the architure for opensrp web. Do we want to add in the node server or is this something we would want to consider adding to OpenSRP server?

In my view ultimately/eventually for OpenSRP we want to get rid of the node server completely. OpenSRP server can supply all that we need.

@moshthepitt
Copy link
Contributor Author

Does PGRest provide a good endpoint for data exports?

@mberg we can get the results of a query out as something like CSV - see: http://postgrest.org/en/v7.0.0/api.html?highlight=stream#response-format

@moshthepitt
Copy link
Contributor Author

but it worries me at first glance to bypass our current ingestion tools (NiFi) via this API - it seems more in line to have a NiFi endpoint so we don't tightly couple the reporting schema with the web ui, push data warehouse processing (maybe joining with public geo data, etc.) into the web application itself, etc.

@gstuder-ona Agree. I am currently also looking for an OpenSRP server solution to onaio/reveal-frontend#838 and onaio/reveal-frontend#837 coz I think that the data more naturally belongs there, instead of living only in the Canopy database.

@denniswambua
Copy link

The new design is to replace the express server with a reverse proxy.

warehouse api

Authentication is provided by JWT tokens and postgrest verifies the signature with the given key.
A robust authentication layer can be added with openresty and openidc on the reverse proxy(nginx).

Data access is managed using postgres roles and privileges and even have row level privileges https://www.postgresql.org/docs/9.5/ddl-rowsecurity.html

@denniswambua
Copy link

For graphql, we have a strong contender https://github.com/hasura/graphql-engine. It is open source, well documented, huge community. In my opinion hasura is more advance with many features we can use.
Kind similar architecture with postgrest.
auth-jwt-overview1

peterMuriuki pushed a commit that referenced this issue Feb 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants