-
-
Notifications
You must be signed in to change notification settings - Fork 490
ElasticSearch SQL Plugin Overview
A simple web-based front-end that provides SQL-based access for querying ElasticSearch. This could be used by people who are not so familiar with the basic ElasticSearch query syntax and interface when wanting to build custom queries in the Geonetwork catalogue.
- (+/-) It's relatively easy to install but uses additional technologies (node), which adds to the complexity of a catalogue- there's more to maintain and keep up to date, more to ensure starts up when Geonetwork starts up, and so on.
- (-) You need to open up another port, which may be a limiting factor for security-conscious organisations- if it's open to the outside world it may need penetration testing.
- (-) It seems to be possible to create or update indices, although possibly not to delete in the version we would need to use with Geonetwork 3.4.2. This could be dangerous!
- (?) What would be the impact on the rest of the catalogue of running resource-heavy queries in this interface?
- (-) The front-end is quite limited, so there is no option to save queries for later use, or even a history. There's an issue in the repository for this but it has had no response.
- (+) The project does seem to be under active development, with commits in the last few days (as of June 4th 2018)
- This guide assumes you have ElasticSearch installed, configured and running within Geonetwork. See the following pages for guidelines on this: https://geonetwork-opensource.org/manuals/3.4.x/en/maintainer-guide/statistics/index.html
Ensure you install the version that corresponds to the version of ElasticSearch used in Geonetwork. For Geonetwork 3.4.2 this is version 5.6.1
https://github.com/NLPchina/elasticsearch-sql
-
Install the sql front-end as a plugin in ElasticSearch, by running the following command in your ElasticSearch/bin directory using the command-line:
./bin/elasticsearch-plugin install https://github.com/NLPchina/elasticsearch-sql/releases/download/5.6.1.0/elasticsearch-sql-5.6.1.0.zip
-
Download the web front-end from https://github.com/NLPchina/elasticsearch-sql/releases/download/5.4.1.0/es-sql-site-standalone.zip and unzip it to the plugin folder that has been created by the above step. Note that it doesn't seem to matter that the version of the web front-end does not match the version of the plugin!
-
Run the command in the command-line to install the web front-end:
cd site-server
npm install express --save
- You will need to change the default port that the web front-end runs as by editing the file
sql/site-server/site-configuration.json
as the default is to use the port that Geonetwork runs on. - You will need to edit your ElasticSearch configuration to allow cross-site scripting by editing the
elasticsearch.yml
config file in a text editor. - Add the following lines at the bottom of the file and restart ElasticSearch:
http.cors.enabled: true
http.cors.allow-origin: "*"
http.cors.allow-headers: X-Requested-With,X-Auth-Token,Content-Type,Content-Length
-
Run the web front-end using by running the following command in the command-line (while in the site-server directory):
node node-server.js
-
Test the plugin is working by visiting the following URL: http://localhost:9200/_sql?sql=select * from records limit 5 (records is one of the Geonetwork indices. This command selects the first 5 documents, just to check it's working)
-
Visit the web front-end. If running on your local computer this will be on http://localhost:8088 (or whatever port you chose in site-configuration.json)
-
Edit the URL in the top-right corner of the page to point at the URL that you are running ElasticSearch on, which is generally http://localhost:9200
There are two ways of using the SQL Plugin:
- Using the REST API, eg using Curl or a URL in a browser. This method is accessible via the URL for ElasticSearch, generally port 9200. See the testing URL above. Using this method will return the results in json format.
- Using the web interface, accessible at the port you configured in site-configuration.json above.
There are some meta-commands that you can use to explore the index, which are documented at https://github.com/NLPchina/elasticsearch-sql/wiki/Show-Commands. These are a wrapper around the "mapping" request in ElasticSearch, see https://www.elastic.co/guide/en/elasticsearch/reference/5.6/mapping.html
Some useful examples:
-
show *
: shows you all the indices in your cluster -
show records
: shows you the mappings for the "records" index -
show records/_default_
: shows you the mappings for the "_default_" type in the "records" index
You can run an "Explain" on your query, and also save the results as a comma-separated text file (csv).
To utilise ElasticSearch "scan and scroll", to efficiently retrieve large resultsets, use the following "hint" in your SQL:
/*! USE_SCROLL*/
eg:
SELECT /*! USE_SCROLL*/ * FROM records
The yellow "step forward" and "step to end" buttons are then enabled, otherwise they do nothing.
If you have some comments, start a discussion, raise an issue or use one of our other communication channels to talk to us.