feature/elastic search pull request #181

Merged 95 commits on Nov 28, 2023
Changes from 71 commits
Commits
1c9bbae
First approach to use the elasticsearch service
Aug 14, 2023
eb73469
elasticsearch query example
AdrianAlcolea Aug 14, 2023
8bd8115
elasticsearch query complete example
AdrianAlcolea Aug 14, 2023
2e7934b
Elasticsearch setup configured
AdrianAlcolea Aug 16, 2023
5ba4a7c
Remove .DS_Store and add it to dockerignore
AdrianAlcolea Aug 16, 2023
fcdc19c
Merge branch 'develop' of https://github.com/aiondemand/AIOD-rest-api…
josvandervelde Aug 17, 2023
22e8577
Merge branch 'develop' of https://github.com/aiondemand/AIOD-rest-api…
josvandervelde Aug 23, 2023
6d665bb
Created an elastic search endpoint. It isn't working yet
josvandervelde Aug 23, 2023
1123335
Just to test the new database
AdrianAlcolea Aug 23, 2023
80b9c3c
To rebase develop
AdrianAlcolea Aug 23, 2023
30d6eda
Working but not up-to-date with develop branch
AdrianAlcolea Aug 23, 2023
28e58fc
Service working. Need to get up-to-date with develop
AdrianAlcolea Aug 23, 2023
a93fa89
Resolved merge conflicts with develop
josvandervelde Aug 24, 2023
f2aa463
Some bugfixes for the connectors
josvandervelde Aug 24, 2023
f2893f4
Deleting docker images in clean script
josvandervelde Aug 24, 2023
3d17c8c
Fixed issues with authentication
josvandervelde Aug 24, 2023
575faaf
Publication search seems to work. TODO: test cases and other resources
josvandervelde Aug 24, 2023
afdb897
platform and platform_identifier changed from aiod_entry to each inst…
AdrianAlcolea Aug 24, 2023
895a8d2
Created testcase for publication search
josvandervelde Aug 24, 2023
4a2e0f9
Made ElasticSearch router generic, implemented it for dataset
josvandervelde Aug 24, 2023
d6ce8e9
Logstash configured for dataset, experiment, ml_model, publication, a…
AdrianAlcolea Aug 24, 2023
5fb737b
Logstash configured for dataset, experiment, ml_model, publication, a…
AdrianAlcolea Aug 24, 2023
0275144
Logstash waits until fill-db-with-examples ends
AdrianAlcolea Aug 24, 2023
ccc1ffd
Resolved merge conflicts with develop
josvandervelde Aug 24, 2023
5de434c
take src from develop
AdrianAlcolea Aug 30, 2023
5ff4498
Copied entire develop branch
AdrianAlcolea Aug 30, 2023
cae4321
Logstash configuration readapted to new names
AdrianAlcolea Aug 30, 2023
80aef7d
Logstash configuration readapted to new names
AdrianAlcolea Aug 30, 2023
ca2f44c
Merged with develop
AdrianAlcolea Aug 30, 2023
69ffa34
added ai4experiments to platform names
AdrianAlcolea Aug 31, 2023
ba49bbb
Copied initial search routers to start creating them
AdrianAlcolea Sep 5, 2023
ce08b0a
Examples of ml_model, dataset and experiment used to insert ai4experi…
AdrianAlcolea Sep 6, 2023
84686ea
Descriptions of the ai4experiment data improved
AdrianAlcolea Sep 6, 2023
7802496
platform added to mappings
AdrianAlcolea Sep 13, 2023
fdee2b7
elasticsearch query example completed
AdrianAlcolea Sep 13, 2023
2c3e9d7
First version of search service working
AdrianAlcolea Sep 13, 2023
0280c26
Search router tests implemented
AdrianAlcolea Sep 14, 2023
106d8c0
Merge with develop
AdrianAlcolea Sep 27, 2023
a0cc086
Search fields selection added
AdrianAlcolea Sep 28, 2023
97b361c
Added search for event, news, organisation and project
AdrianAlcolea Oct 10, 2023
e69620c
Added routers for event, news, organisation and project
AdrianAlcolea Oct 17, 2023
511b779
merged with develop
AdrianAlcolea Oct 17, 2023
ee4858c
merged with develop
AdrianAlcolea Oct 23, 2023
3e5d5f4
Logstash names changed
AdrianAlcolea Oct 23, 2023
2146a3d
added logstash_config.py, just for having it there
AdrianAlcolea Oct 26, 2023
f964457
Pagination changed to actual pages
AdrianAlcolea Oct 26, 2023
0fe2fe1
Pagination changed to actual pages
AdrianAlcolea Oct 26, 2023
464c270
Application areas added to elasticsearch results
AdrianAlcolea Oct 31, 2023
8289131
merged with develop
AdrianAlcolea Nov 6, 2023
85ba925
First version with deletion
AdrianAlcolea Nov 8, 2023
cbe2a82
merged with develop
AdrianAlcolea Nov 8, 2023
0865424
Prepared to be merged with develop
AdrianAlcolea Nov 9, 2023
518036c
merged with develop
AdrianAlcolea Nov 9, 2023
92038e7
pull request modifications
AdrianAlcolea Nov 10, 2023
9b1fd52
pull request modifications
AdrianAlcolea Nov 11, 2023
f1df999
merged with develop
AdrianAlcolea Nov 11, 2023
74ac4f0
pull request modifications
AdrianAlcolea Nov 11, 2023
7c20576
Combined search with sql queries in process
AdrianAlcolea Nov 13, 2023
88150dd
merged with develop
AdrianAlcolea Nov 13, 2023
0aeaf25
Search functionality combined with optional SQL statement to retrieve …
AdrianAlcolea Nov 13, 2023
05c6814
Elasticsearch and logstash configuration integrated in src
AdrianAlcolea Nov 15, 2023
a96e9f6
merged with develop
AdrianAlcolea Nov 15, 2023
92b05cf
Search router tests actualised
AdrianAlcolea Nov 15, 2023
22d8df2
Search router tests actualised
AdrianAlcolea Nov 15, 2023
03868c3
Search router tests actualised
AdrianAlcolea Nov 15, 2023
e9f14be
pre-commit passed
AdrianAlcolea Nov 20, 2023
4821f67
All test passed and working. Not merged with develop
AdrianAlcolea Nov 20, 2023
8c65397
huggingface connector test to its original state
AdrianAlcolea Nov 20, 2023
120f97a
back to commented huggingface connector
AdrianAlcolea Nov 20, 2023
3e5c446
Fixing unittests by making sure Elasticsearch instance can also be cr…
josvandervelde Nov 20, 2023
9bdd974
Merge branch 'develop' of https://github.com/aiondemand/AIOD-rest-api…
josvandervelde Nov 20, 2023
3daef84
clean logstash configuration
AdrianAlcolea Nov 20, 2023
7e0861a
clean logstash configuration
AdrianAlcolea Nov 20, 2023
649325d
clean logstash configuration
AdrianAlcolea Nov 20, 2023
f4739e9
clean logstash configuration
AdrianAlcolea Nov 20, 2023
985949d
clean logstash configuration
AdrianAlcolea Nov 20, 2023
ae2ac9e
clean logstash configuration
AdrianAlcolea Nov 20, 2023
cc8c22f
logstash config files generated with jinja2
AdrianAlcolea Nov 24, 2023
b73c011
logstash config files generated with jinja2
AdrianAlcolea Nov 24, 2023
6ae492d
Logstash config files generated with jinja2. All test passed, but not…
AdrianAlcolea Nov 24, 2023
eaf3451
Resolved merge conflicts with develop
josvandervelde Nov 24, 2023
ca5c137
Second round of pull request comments
AdrianAlcolea Nov 25, 2023
4f17532
Second round of pull request comments
AdrianAlcolea Nov 25, 2023
b60c9aa
Second round of pull request comments
AdrianAlcolea Nov 25, 2023
7c2106f
Second round of pull request comments
AdrianAlcolea Nov 25, 2023
0e13852
Second round of pull request comments
AdrianAlcolea Nov 25, 2023
453b1cd
Created data/elasticsearch/.gitkeep to make sure it exists with the r…
josvandervelde Nov 27, 2023
6767136
Deleted autogenerated file logstash/config/logstash.yml
josvandervelde Nov 27, 2023
defc100
cleanup
josvandervelde Nov 27, 2023
5f79f0d
Making sure docker compose up works even if generated files do not ex…
josvandervelde Nov 27, 2023
58b2514
Make sure data folders are always created with correct permissions (t…
josvandervelde Nov 27, 2023
0d6bd3b
Added default logstash configuration
josvandervelde Nov 27, 2023
82b049a
Fixed docker compose
josvandervelde Nov 27, 2023
85c76b5
Using FastAPI input validation
josvandervelde Nov 27, 2023
3adc54b
Made status nullable, so that we can return an empty status in the se…
josvandervelde Nov 27, 2023
1 change: 1 addition & 0 deletions .dockerignore
@@ -1,3 +1,4 @@
scripts
venv
data
**.DS_Store
10 changes: 10 additions & 0 deletions .env
@@ -10,3 +10,13 @@ KEYCLOAK_ADMIN_PASSWORD=password
KEYCLOAK_CLIENT_SECRET="QJiOGn09eCEfnqAmcPP2l4vMU8grlmVQ"
REDIRECT_URIS=http://${HOSTNAME}/docs/oauth2-redirect
POST_LOGOUT_REDIRECT_URIS=http://${HOSTNAME}/aiod-auth/realms/aiod/protocol/openid-connect/logout

#ELASTICSEARCH
ES_USER=elastic
ES_PASSWORD=changeme
ES_DISCOVERY_TYPE=single-node
ES_ROLE="edit_aiod_resources"
ES_JAVA_OPTS="-Xmx256m -Xms256m"

#LOGSTASH
LS_JAVA_OPTS="-Xmx256m -Xms256m"
1 change: 1 addition & 0 deletions .gitignore
@@ -113,6 +113,7 @@ venv/
ENV/
env.bak/
venv.bak/
**.DS_Store

# Spyder project settings
.spyderproject
5 changes: 5 additions & 0 deletions README.md
@@ -58,6 +58,11 @@ For development:
- Additional 'mysqlclient' dependencies. Please have a look at [their installation instructions]
(https://github.com/PyMySQL/mysqlclient#install).

## Production environment

For production environments, Elasticsearch recommends -Xms4G and -Xmx8G for the JVM settings.\
These parameters can be defined in the .env file.
See the [Logstash JVM settings guide](https://www.elastic.co/guide/en/logstash/current/jvm-settings.html).
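
Reviewer note: the JVM flags live in `ES_JAVA_OPTS` and `LS_JAVA_OPTS` in `.env`, so a small check can catch a mistyped value before the containers start. The following is a minimal sketch, not part of this PR; the function names `parse_heap_opts` and `heap_is_fixed` are hypothetical. It assumes the usual JVM size suffixes (k, m, g, in multiples of 1024) and the common advice to set the initial and maximum heap to the same value.

```python
import re

# Multipliers for JVM size suffixes (binary units, case-insensitive).
_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}

# Matches the initial (-Xms) and maximum (-Xmx) heap flags only.
_HEAP_FLAG = re.compile(r"-(Xms|Xmx)(\d+)([kKmMgG]?)\b")


def parse_heap_opts(java_opts: str) -> dict[str, int]:
    """Return the heap sizes, in bytes, found in a JVM options string."""
    sizes = {}
    for flag, number, unit in _HEAP_FLAG.findall(java_opts):
        sizes[flag] = int(number) * _UNITS[unit.lower()]
    return sizes


def heap_is_fixed(java_opts: str) -> bool:
    """True when both -Xms and -Xmx are present and equal."""
    sizes = parse_heap_opts(java_opts)
    return "Xms" in sizes and sizes.get("Xms") == sizes.get("Xmx")
```

With the development defaults from `.env`, `heap_is_fixed("-Xmx256m -Xms256m")` holds, while a string such as `"-Xms4G -Xmx8G"` would be reported as a resizable heap.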

## Installation

10 changes: 5 additions & 5 deletions connectors/fill-examples.sh
@@ -16,10 +16,6 @@ python3 connectors/synchronization.py \
-c connectors.example.example.ExampleEducationalResourceConnector \
-w /opt/connectors/data/example/educational_resource

python3 connectors/synchronization.py \
-c connectors.example.example.ExampleEventConnector \
-w /opt/connectors/data/example/event

python3 connectors/synchronization.py \
-c connectors.example.example.ExampleExperimentConnector \
-w /opt/connectors/data/example/experiment
@@ -40,6 +36,10 @@ python3 connectors/synchronization.py \
-c connectors.example.example.ExamplePersonConnector \
-w /opt/connectors/data/example/person

python3 connectors/synchronization.py \
-c connectors.example.example.ExampleEventConnector \
-w /opt/connectors/data/example/event

python3 connectors/synchronization.py \
-c connectors.example.example.ExampleProjectConnector \
-w /opt/connectors/data/example/project
@@ -92,4 +92,4 @@ python3 connectors/synchronization.py \

python3 connectors/synchronization.py \
-c connectors.example.enum.EnumConnectorStatus \
-w /opt/connectors/data/enum/status
-w /opt/connectors/data/enum/status
82 changes: 80 additions & 2 deletions docker-compose.yaml
@@ -45,8 +45,7 @@ services:
    depends_on:
      app:
        condition: service_healthy



  deletion:
    build:
      context: deletion
@@ -167,3 +166,82 @@ services:
    depends_on:
      app:
        condition: service_healthy

  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.8.2
    container_name: elasticsearch
    env_file: .env
    environment:
      - ES_JAVA_OPTS=$ES_JAVA_OPTS
      - ELASTIC_USER=$ES_USER
      - ELASTIC_PASSWORD=$ES_PASSWORD
      - discovery.type=$ES_DISCOVERY_TYPE
    ports:
      - 9200:9200
      - 9300:9300
    volumes:
      - type: bind
        source: ./es/elasticsearch.yml
        target: /usr/share/elasticsearch/config/elasticsearch.yml
        read_only: true
      - ./data/elasticsearch:/usr/share/elasticsearch/data
    healthcheck:
      test: ["CMD-SHELL", "curl -u $ES_USER:$ES_PASSWORD --silent --fail localhost:9200/_cluster/health || exit 1"]
      interval: 5s
      timeout: 30s
      retries: 30

  es_logstash_setup:
    image: ai4eu_server
    container_name: es_logstash_setup
    env_file: .env
    environment:
      - MYSQL_ROOT_PASSWORD=$MYSQL_ROOT_PASSWORD
      - ES_USER=$ES_USER
      - ES_PASSWORD=$ES_PASSWORD
    volumes:
      - ./src:/app
      - ./logstash:/logstash
    command: >
      /bin/bash -c "python setup/logstash/generate_logstash_config_files.py &&
                    python setup/elasticsearch/generate_elasticsearch_indices.py"
    restart: "no"
    depends_on:
      elasticsearch:
        condition: service_healthy

  logstash:
    build:
      context: logstash/
      dockerfile: Dockerfile
    container_name: logstash
    env_file: .env
    environment:
      - LS_JAVA_OPTS=$LS_JAVA_OPTS
    ports:
      - 5044:5044
      - 5000:5000/tcp
      - 5000:5000/udp
      - 9600:9600
    volumes:
      - type: bind
        source: ./logstash/config/logstash.yml
        target: /usr/share/logstash/config/logstash.yml
        read_only: true
      - type: bind
        source: ./logstash/pipeline/pipelines.yml
        target: /usr/share/logstash/config/pipelines.yml
        read_only: true
      - type: bind
        source: ./logstash/pipeline/conf
        target: /usr/share/logstash/pipeline
        read_only: true
      - type: bind
        source: ./logstash/pipeline/sql
        target: /usr/share/logstash/sql
        read_only: true
    depends_on:
      fill-db-with-examples:
        condition: service_completed_successfully
      es_logstash_setup:
        condition: service_completed_successfully
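
Reviewer note: the `healthcheck` on the `elasticsearch` service only checks that `_cluster/health` answers with a successful HTTP status. A client that wants to wait for the cluster from Python could apply a slightly stricter rule and also read the `status` field of the response. The following is a minimal sketch using only the standard library; the function names are hypothetical, and treating `yellow` as usable is an assumption that suits a single-node development setup.

```python
import base64
import json
import urllib.request


def build_health_request(base_url: str, user: str, password: str) -> urllib.request.Request:
    """Build a GET request for the cluster health endpoint with HTTP basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/_cluster/health",
        headers={"Authorization": f"Basic {token}"},
    )


def status_is_usable(body: bytes) -> bool:
    """True when the health response reports a green or yellow cluster."""
    try:
        status = json.loads(body).get("status")
    except (ValueError, AttributeError):
        # Not JSON, or JSON that is not an object.
        return False
    return status in {"green", "yellow"}


def cluster_is_healthy(base_url: str, user: str, password: str, timeout: float = 5.0) -> bool:
    """Query the cluster once; any connection or HTTP error counts as unhealthy."""
    request = build_health_request(base_url, user, password)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return status_is_usable(response.read())
    except OSError:  # URLError and HTTPError are OSError subclasses
        return False
```

Called as `cluster_is_healthy("http://localhost:9200", "elastic", "changeme")`, this returns `False` while the container is still starting, so it can be polled in a retry loop.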
13 changes: 13 additions & 0 deletions es/elasticsearch.yml
@@ -0,0 +1,13 @@
---
## Default Elasticsearch configuration from Elasticsearch base image.
## https://github.com/elastic/elasticsearch/blob/master/distribution/docker/src/docker/config/elasticsearch.yml
#
cluster.name: "docker-cluster"
network.host: 0.0.0.0

## X-Pack settings
## see https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-xpack.html
#
xpack.license.self_generated.type: basic
xpack.security.enabled: true
xpack.monitoring.collection.enabled: true
13 changes: 13 additions & 0 deletions logstash/Dockerfile
@@ -0,0 +1,13 @@
# https://www.docker.elastic.co/
FROM docker.elastic.co/logstash/logstash:8.11.0

# Download MySQL JDBC driver to connect Logstash to MySQL
RUN curl -Lo "mysql-connector-j-8.2.0.tar.gz" "https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-j-8.2.0.tar.gz" \
&& tar -xf "mysql-connector-j-8.2.0.tar.gz" "mysql-connector-j-8.2.0/mysql-connector-j-8.2.0.jar" \
&& mv "mysql-connector-j-8.2.0/mysql-connector-j-8.2.0.jar" "mysql-connector-j.jar" \
&& rm -r "mysql-connector-j-8.2.0" "mysql-connector-j-8.2.0.tar.gz"

ENTRYPOINT ["/usr/local/bin/docker-entrypoint"]

# Add your logstash plugins setup here
# Example: RUN logstash-plugin install logstash-filter-json
8 changes: 8 additions & 0 deletions logstash/config/logstash.yml
@@ -0,0 +1,8 @@
# This file has been generated by `generate_logstash_config.py`
# file, placed in `src/setup/logstash`
# -------------------------------------------------------------
http.host: "0.0.0.0"
xpack.monitoring.elasticsearch.hosts: [ "http://elasticsearch:9200" ]
xpack.monitoring.enabled: true
xpack.monitoring.elasticsearch.username: elastic
xpack.monitoring.elasticsearch.password: changeme
201 changes: 201 additions & 0 deletions logstash/pipeline/conf/init_table.conf
@@ -0,0 +1,201 @@
# This file has been generated by `generate_logstash_config.py`
# file, placed in `src/setup/logstash`
# -------------------------------------------------------------
input {
jdbc {
jdbc_driver_library => "/usr/share/logstash/mysql-connector-j.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
jdbc_connection_string => "jdbc:mysql://sqlserver:3306/aiod"
jdbc_user => "root"
jdbc_password => "ok"
clean_run => true
record_last_run => false
statement_filepath => "/usr/share/logstash/sql/init_dataset.sql"
type => "dataset"
}
jdbc {
jdbc_driver_library => "/usr/share/logstash/mysql-connector-j.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
jdbc_connection_string => "jdbc:mysql://sqlserver:3306/aiod"
jdbc_user => "root"
jdbc_password => "ok"
clean_run => true
record_last_run => false
statement_filepath => "/usr/share/logstash/sql/init_event.sql"
type => "event"
}
jdbc {
jdbc_driver_library => "/usr/share/logstash/mysql-connector-j.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
jdbc_connection_string => "jdbc:mysql://sqlserver:3306/aiod"
jdbc_user => "root"
jdbc_password => "ok"
clean_run => true
record_last_run => false
statement_filepath => "/usr/share/logstash/sql/init_experiment.sql"
type => "experiment"
}
jdbc {
jdbc_driver_library => "/usr/share/logstash/mysql-connector-j.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
jdbc_connection_string => "jdbc:mysql://sqlserver:3306/aiod"
jdbc_user => "root"
jdbc_password => "ok"
clean_run => true
record_last_run => false
statement_filepath => "/usr/share/logstash/sql/init_ml_model.sql"
type => "ml_model"
}
jdbc {
jdbc_driver_library => "/usr/share/logstash/mysql-connector-j.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
jdbc_connection_string => "jdbc:mysql://sqlserver:3306/aiod"
jdbc_user => "root"
jdbc_password => "ok"
clean_run => true
record_last_run => false
statement_filepath => "/usr/share/logstash/sql/init_news.sql"
type => "news"
}
jdbc {
jdbc_driver_library => "/usr/share/logstash/mysql-connector-j.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
jdbc_connection_string => "jdbc:mysql://sqlserver:3306/aiod"
jdbc_user => "root"
jdbc_password => "ok"
clean_run => true
record_last_run => false
statement_filepath => "/usr/share/logstash/sql/init_organisation.sql"
type => "organisation"
}
jdbc {
jdbc_driver_library => "/usr/share/logstash/mysql-connector-j.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
jdbc_connection_string => "jdbc:mysql://sqlserver:3306/aiod"
jdbc_user => "root"
jdbc_password => "ok"
clean_run => true
record_last_run => false
statement_filepath => "/usr/share/logstash/sql/init_project.sql"
type => "project"
}
jdbc {
jdbc_driver_library => "/usr/share/logstash/mysql-connector-j.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
jdbc_connection_string => "jdbc:mysql://sqlserver:3306/aiod"
jdbc_user => "root"
jdbc_password => "ok"
clean_run => true
record_last_run => false
statement_filepath => "/usr/share/logstash/sql/init_publication.sql"
type => "publication"
}
jdbc {
jdbc_driver_library => "/usr/share/logstash/mysql-connector-j.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
jdbc_connection_string => "jdbc:mysql://sqlserver:3306/aiod"
jdbc_user => "root"
jdbc_password => "ok"
clean_run => true
record_last_run => false
statement_filepath => "/usr/share/logstash/sql/init_service.sql"
type => "service"
}
}
filter {
mutate {
remove_field => ["@version", "@timestamp"]
}
}
output {
if [type] == "dataset" {
elasticsearch {
hosts => "elasticsearch:9200"
user => "elastic"
password => "changeme"
ecs_compatibility => disabled
index => "dataset"
document_id => "dataset_%{identifier}"
}
}
if [type] == "event" {
elasticsearch {
hosts => "elasticsearch:9200"
user => "elastic"
password => "changeme"
ecs_compatibility => disabled
index => "event"
document_id => "event_%{identifier}"
}
}
if [type] == "experiment" {
elasticsearch {
hosts => "elasticsearch:9200"
user => "elastic"
password => "changeme"
ecs_compatibility => disabled
index => "experiment"
document_id => "experiment_%{identifier}"
}
}
if [type] == "ml_model" {
elasticsearch {
hosts => "elasticsearch:9200"
user => "elastic"
password => "changeme"
ecs_compatibility => disabled
index => "ml_model"
document_id => "ml_model_%{identifier}"
}
}
if [type] == "news" {
elasticsearch {
hosts => "elasticsearch:9200"
user => "elastic"
password => "changeme"
ecs_compatibility => disabled
index => "news"
document_id => "news_%{identifier}"
}
}
if [type] == "organisation" {
elasticsearch {
hosts => "elasticsearch:9200"
user => "elastic"
password => "changeme"
ecs_compatibility => disabled
index => "organisation"
document_id => "organisation_%{identifier}"
}
}
if [type] == "project" {
elasticsearch {
hosts => "elasticsearch:9200"
user => "elastic"
password => "changeme"
ecs_compatibility => disabled
index => "project"
document_id => "project_%{identifier}"
}
}
if [type] == "publication" {
elasticsearch {
hosts => "elasticsearch:9200"
user => "elastic"
password => "changeme"
ecs_compatibility => disabled
index => "publication"
document_id => "publication_%{identifier}"
}
}
if [type] == "service" {
elasticsearch {
hosts => "elasticsearch:9200"
user => "elastic"
password => "changeme"
ecs_compatibility => disabled
index => "service"
document_id => "service_%{identifier}"
}
}
}
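
Reviewer note: the nine `jdbc` inputs and nine `elasticsearch` outputs above differ only in the entity name, which is why the file is generated rather than written by hand. The commits mention jinja2 for this. The sketch below is not the generator from this PR: it swaps in `string.Template` from the standard library so that it runs without extra dependencies, and the names `render_pipeline`, `JDBC_INPUT` and `ES_OUTPUT` are hypothetical. The credentials are passed in as arguments instead of being hard-coded.

```python
from string import Template

# Entity types taken from the generated pipeline above.
ENTITIES = [
    "dataset", "event", "experiment", "ml_model", "news",
    "organisation", "project", "publication", "service",
]

# One jdbc input per entity; values mirror the generated file.
JDBC_INPUT = Template("""\
  jdbc {
    jdbc_driver_library => "/usr/share/logstash/mysql-connector-j.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://sqlserver:3306/aiod"
    jdbc_user => "$db_user"
    jdbc_password => "$db_password"
    clean_run => true
    record_last_run => false
    statement_filepath => "/usr/share/logstash/sql/init_${entity}.sql"
    type => "$entity"
  }
""")

# One conditional elasticsearch output per entity.
# "%{identifier}" is Logstash field syntax and is left untouched by Template.
ES_OUTPUT = Template("""\
  if [type] == "$entity" {
    elasticsearch {
      hosts => "elasticsearch:9200"
      user => "$es_user"
      password => "$es_password"
      ecs_compatibility => disabled
      index => "$entity"
      document_id => "${entity}_%{identifier}"
    }
  }
""")

FILTER = """\
filter {
  mutate {
    remove_field => ["@version", "@timestamp"]
  }
}
"""


def render_pipeline(entities, db_user, db_password, es_user, es_password) -> str:
    """Render a full pipeline config with one input and one output per entity."""
    inputs = "".join(
        JDBC_INPUT.substitute(entity=e, db_user=db_user, db_password=db_password)
        for e in entities
    )
    outputs = "".join(
        ES_OUTPUT.substitute(entity=e, es_user=es_user, es_password=es_password)
        for e in entities
    )
    return f"input {{\n{inputs}}}\n{FILTER}output {{\n{outputs}}}\n"
```

Adding a new searchable entity then means appending one name to `ENTITIES` and providing the matching `init_<entity>.sql` file, rather than copying two blocks by hand.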