elasticsearch-reindex
A CLI tool for transferring Elasticsearch indexes between different servers.
Install the package using pip:
pip install elasticsearch-reindex
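To confirm the package is installed, you can check it with pip:
pip show elasticsearch-reindex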
Ensure the source Elasticsearch host is whitelisted on the destination host by editing the Elasticsearch configuration file on the destination server:
/etc/elasticsearch/elasticsearch.yml
Add the following line to the file:
reindex.remote.whitelist: <es-source-host>:<es-source-port>
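For example, if the source node is reachable at 192.168.4.106:9201 (an illustrative address and port), the entry would be:
reindex.remote.whitelist: 192.168.4.106:9201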
Use the CLI to migrate data between Elasticsearch instances:
elasticsearch_reindex \
--source_host http(s)://es-source-host:es-source-port \
--source_http_auth username:password \
--dest_host http(s)://es-dest-host:es-dest-port \
--dest_http_auth username:password \
--check_interval 5 \
--concurrent_tasks 3 \
-i test_index_1 -i test_index_2
Also, there is a command alias elasticsearch-reindex:
elasticsearch-reindex ...
Required fields:
- source_host - Elasticsearch endpoint from which data will be extracted.
- dest_host - Elasticsearch endpoint to which data will be transferred.
Optional fields:
- source_http_auth - HTTP Basic authentication (username:password) for the source host.
- dest_http_auth - HTTP Basic authentication (username:password) for the destination host.
- check_interval - Time period (in seconds) between task success status checks. Default value: 10 (seconds).
- concurrent_tasks - How many parallel tasks Elasticsearch will process. Default value: 1 (sync mode).
- indexes - List of user ES indexes to migrate instead of all source indexes.
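For example, a minimal invocation with only the required fields migrates all source indexes using the default settings (the hosts below are illustrative):
elasticsearch_reindex \
--source_host http://localhost:9201 \
--dest_host http://localhost:9202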
The package can also be used programmatically:
from elasticsearch_reindex import ReindexManager
def main() -> None:
"""
Example reindex function.
"""
dict_config = {
"source_host": "http://localhost:9201",
"dest_host": "http://localhost:9202",
"check_interval": 20,
"concurrent_tasks": 5,
}
reindex_manager = ReindexManager.from_dict(data=dict_config)
reindex_manager.start_reindex()
if __name__ == "__main__":
main()
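In this example the reindex task status is polled every 20 seconds and Elasticsearch processes up to five reindex tasks in parallel, per the check_interval and concurrent_tasks options described above.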
With custom user indexes and optional HTTP Basic authentication:
from elasticsearch_reindex import ReindexManager
def main() -> None:
"""
Example reindex function with custom indexes and HTTP Basic authentication.
"""
dict_config = {
"source_host": "http://localhost:9201",
# If the source host requires authentication
# "source_http_auth": "tmp-source-user:tmp-source-PASSWD.220718",
"dest_host": "http://localhost:9202",
# If the destination host requires authentication
# "dest_http_auth": "tmp-reindex-user:tmp--PASSWD.220718",
"check_interval": 20,
"concurrent_tasks": 5,
"indexes": ["es-index-1", "es-index-2", "es-index-n"],
}
reindex_manager = ReindexManager.from_dict(data=dict_config)
reindex_manager.start_reindex()
if __name__ == "__main__":
main()
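The connection settings do not have to be hard-coded. Here is a minimal sketch that builds the config from the LOCAL_IP, ES_SOURCE_PORT and ES_DEST_PORT environment variables described further down; it assumes ReindexManager.from_dict falls back to the documented defaults when the optional fields are omitted:
import os

from elasticsearch_reindex import ReindexManager


def main() -> None:
    """
    Build the reindex config from environment variables instead of
    hard-coded hosts (assumes LOCAL_IP, ES_SOURCE_PORT and
    ES_DEST_PORT are exported, e.g. via the .env file below).
    """
    local_ip = os.environ["LOCAL_IP"]
    dict_config = {
        "source_host": f"http://{local_ip}:{os.environ['ES_SOURCE_PORT']}",
        "dest_host": f"http://{local_ip}:{os.environ['ES_DEST_PORT']}",
    }
    reindex_manager = ReindexManager.from_dict(data=dict_config)
    reindex_manager.start_reindex()


if __name__ == "__main__":
    main()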
Set up and activate a Python 3 virtual environment:
make ve
To install Git hooks:
make install_hooks
Create a .env file and fill in the data:
cp .env.example .env
Export env variables:
export $(xargs < .env)
Variable for enabling test mode:
- ENV - set it to test to activate testing mode.
Elasticsearch Docker settings:
- ES_SOURCE_PORT - Source Elasticsearch port.
- ES_DEST_PORT - Destination Elasticsearch port.
- ES_VERSION - Elasticsearch version.
- LOCAL_IP - Address of your local host machine in the LAN, like 192.168.4.106.
  - MacOS - find it in the output of: ifconfig
  - Linux - find it in the output of: ip r
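An illustrative .env could look like the following (the ports and Elasticsearch version here are assumptions; use values matching your docker-compose setup and LAN address):
ENV=test
ES_SOURCE_PORT=9201
ES_DEST_PORT=9202
ES_VERSION=7.17.0
LOCAL_IP=192.168.4.106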
Start Elasticsearch nodes using Docker Compose:
docker-compose up -d
Verify Elasticsearch nodes are running:
- Source Elasticsearch:
curl -X GET $LOCAL_IP:$ES_SOURCE_PORT
- Destination Elasticsearch:
curl -X GET $LOCAL_IP:$ES_DEST_PORT
Export the PYTHONPATH env variable:
export PYTHONPATH="."
To run tests with pytest, use:
make test
To run tests with pytest and a coverage report, use:
make test-cov