This is a simple docker-compose project to demonstrate how to query a MongoDB database with Apache Superset through the MongoDB BI Connector.
I would like to thanks @smarzola for his sqlalchemy-mongobi python package which adds a dialect for sqlalchemy disabling transactions not supported with the BI connector.
This docker compose project is made up of these containers :
poc_ssm_mongo
: a mongo 5 single instancepoc_ssm_mongo_express
: a mongo express UI to browse the data in the instance at http://localhost:8081/poc_ssm_mongo_bi
: a mongo bi connector connected to the mongo instancepoc_ssm_mongo_data
: a container loading a sample dataset in the mongo instancepoc_ssm_mysql_client
: a container with a mysql cli client to demonstrate connection through the mongo bi connectorpoc_ssm_sqlalchemy
: a container with a simple sqlalchemy script to demonstrate connection through the mongo bi connectorpoc_ssm_superset
: a container with a superset instance to demonstrate connection through the mongo bi connector
Starts the project
docker-compose up
Wait for everything to be ready and for the poc_ssm_mongo_data
to exit with a 0 status code.
Note : look at the logs of the superset instance before connecting to wait for all the init scripts in the entrypoint to finish before the gunicorn web server is started
Then you can try the different connection mode.
$ docker-compose exec poc_ssm_mysql_client bash
# mysql -h poc_ssm_mongo_bi -P 3307 -u 'root?source=admin' --default-auth=mongosql_auth -proot
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 457
Server version: 5.7.12 mongosqld v2.14.3 mongosqld v2.14.3
Copyright (c) 2000, 2021, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| INFORMATION_SCHEMA |
| mysql |
| samples |
+--------------------+
3 rows in set (0.00 sec)
mysql> use samples;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> show tables;
+---------------------------------------+
| Tables_in_samples |
+---------------------------------------+
| companies |
| companies_acquisitions |
| companies_competitions |
| companies_external_links |
| companies_funding_rounds |
| companies_funding_rounds_investments |
| companies_image_available_sizes |
| companies_investments |
| companies_milestones |
| companies_offices |
| companies_partners |
| companies_products |
| companies_providerships |
| companies_relationships |
| companies_screenshots |
| companies_screenshots_available_sizes |
| companies_video_embeds |
+---------------------------------------+
17 rows in set (0.00 sec)
mysql> select * from companies limit 1\G
*************************** 1. row ***************************
_id: 52cdef7c4bab8bd675297d8b
name: AdventNet
1 row in set (0.00 sec)
mysql>
$ docker-compose exec poc_ssm_sqlalchemy bash
# python3 query.py
/usr/local/lib/python3.5/dist-packages/sqlalchemy/engine/default.py:410: SAWarning: Exception attempting to detect unicode returns: ProgrammingError('(MySQLdb._exceptions.ProgrammingError) (1064, "parse sql \'SELECT CAST(\'test collated returns\' AS CHAR CHARACTER SET utf8mb4) COLLATE utf8mb4_bin AS anon_1\' error: unexpected CHARACTER at position 55 near CHARACTER")',)
"detect unicode returns: %r" % de
------------
[
{
"name": "AdventNet",
"_id": "52cdef7c4bab8bd675297d8b"
},
{
"name": "Wetpaint",
"_id": "52cdef7c4bab8bd675297d8a"
}
]
- 01 - Setup the database and datasets
- 02 - Create a simple bar chart
- 03 - SQLlab and virtual table
- 04 - Row level security
- 05 - Table chart with time filter
You can look at the different Dockerfile and entrypoint.sh scripts to see the details. However there are a few things I did in this POC and I want to point out.
-
Mongo BI connector :
-
As the database is in auth mode, the bi connector needs to be too (
--auth
) and needs to be secured with TLS--sslMode=allowSSL
and--sslPEMKeyFile=/path/to/an/autosigned/pem/cert/and/key
-
schema refresh has been enabled and is automatic but you can build your own schema
-
-
Superset :
-
in file
/app/superset/sql_lab.py
around line432
, there is aconn.commit()
instruction that needs to be commented in order to be able to use the sql lab. -
as the mysqlclient sends the password as cleartext (over tls), you need to add the
LIBMYSQL_ENABLE_CLEARTEXT_PLUGIN=1
environment variable
-