
[8.15] Add changelog for fallback to ILM if DSL not present (backport #13918) #13979

Merged
mergify[bot] merged 1 commit into 8.15 from mergify/bp/8.15/pr-13918 on Sep 3, 2024

Conversation

mergify[bot]
Contributor

@mergify mergify bot commented Sep 3, 2024

Motivation/summary

APM Server switched from Index Lifecycle Management (ILM) to Data Stream Lifecycle (DSL) in v8.15.0. The switch happened when we moved from the APM integration package (which used ILM) to the apm-data plugin in Elasticsearch (which uses DSL) for managing APM data streams. As a result of the switch, any data streams created before it would become unmanaged, because an existing data stream is never updated with the DSL lifecycle; this has to be done manually by using the PUT API.
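
For reference, a minimal sketch of that manual step (the data stream name and retention value are illustrative; credentials match the test setup below):

    curl -u admin:changeme -X PUT "http://localhost:9200/_data_stream/traces-apm-default/_lifecycle" \
      -H 'Content-Type: application/json' \
      -d '{"data_retention": "30d"}'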

Checklist

How to test these changes

  1. Create a stack (Elasticsearch, Kibana, APM Server) on version 8.14.3 with data persistence enabled for Elasticsearch. We use 8.14.3 because it is the latest available version that uses the APM integration package and thus configures ILM policies.

    Example docker-compose.yaml
    version: '3.9'
    x-logging: &default-logging
      driver: "json-file"
      options:
        max-size: "1g"
    services:
      elasticsearch:
        image: docker.elastic.co/elasticsearch/elasticsearch:8.14.3
        ports:
          - 9200:9200
        healthcheck:
          test: ["CMD-SHELL", "curl -s http://localhost:9200/_cluster/health?wait_for_status=yellow&timeout=500ms"]
          retries: 300
          interval: 1s
        environment:
          - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
          - "network.host=0.0.0.0"
          - "transport.host=127.0.0.1"
          - "http.host=0.0.0.0"
          - "cluster.routing.allocation.disk.threshold_enabled=false"
          - "discovery.type=single-node"
          - "xpack.security.authc.anonymous.roles=remote_monitoring_collector"
          - "xpack.security.authc.realms.file.file1.order=0"
          - "xpack.security.authc.realms.native.native1.order=1"
          - "xpack.security.enabled=true"
          - "xpack.license.self_generated.type=trial"
          - "xpack.security.authc.token.enabled=true"
          - "xpack.security.authc.api_key.enabled=true"
          - "logger.org.elasticsearch=${ES_LOG_LEVEL:-error}"
          - "action.destructive_requires_name=false"
        volumes:
          - "./testing/docker/elasticsearch/roles.yml:/usr/share/elasticsearch/config/roles.yml"
          - "./testing/docker/elasticsearch/users:/usr/share/elasticsearch/config/users"
          - "./testing/docker/elasticsearch/users_roles:/usr/share/elasticsearch/config/users_roles"
          - "./testing/docker/elasticsearch/ingest-geoip:/usr/share/elasticsearch/config/ingest-geoip"
          - "/Users/lahsivjar/Projects/elastic/tmp/esdata2:/usr/share/elasticsearch/data"
        logging: *default-logging
    
      kibana:
        image: docker.elastic.co/kibana/kibana:8.14.3
        ports:
          - 5601:5601
        healthcheck:
          test: ["CMD-SHELL", "curl -s http://localhost:5601/api/status | grep -q 'All services are available'"]
          retries: 300
          interval: 1s
        environment:
          ELASTICSEARCH_HOSTS: '["http://elasticsearch:9200"]'
          ELASTICSEARCH_USERNAME: "${KIBANA_ES_USER:-kibana_system_user}"
          ELASTICSEARCH_PASSWORD: "${KIBANA_ES_PASS:-changeme}"
          XPACK_FLEET_AGENTS_ELASTICSEARCH_HOSTS: '["http://elasticsearch:9200"]'
        depends_on:
          elasticsearch: { condition: service_healthy }
        volumes:
          - "./testing/docker/kibana/kibana.yml:/usr/share/kibana/config/kibana.yml"
        logging: *default-logging
    
      apm-server:
        image: docker.elastic.co/apm/apm-server:8.14.3
        ports:
          - 8200:8200
        healthcheck:
          test: ["CMD-SHELL", "bash -c 'echo -n > /dev/tcp/127.0.0.1/8200'"]
          retries: 300
          interval: 1s
        depends_on:
          elasticsearch: { condition: service_healthy }
        volumes:
          - "./testing/docker/apm-server/apm-server.yml:/usr/share/apm-server/apm-server.yml"
        logging: *default-logging
    NOTE: The config files used in the example docker-compose are [available here](https://github.com/elastic/apm-server/tree/main/testing/docker). The `apm-server.yml` file used in the docker-compose can be a simple config file:
    apm-server:
      host: "0.0.0.0:8200"
    output.elasticsearch:
      hosts: ["elasticsearch:9200"]
      username: "admin"
      password: "changeme"
    logging.level: info
    logging.to_stderr: true
  2. Install the APM integration in the cluster.
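    One way to do this without the Kibana UI is Fleet's package install API (a sketch; it uses the superuser credentials from the test users files, and assumes the `apm` package version tracks the 8.14.3 stack):
    curl -u admin:changeme -X POST "http://localhost:5601/api/fleet/epm/packages/apm/8.14.3" \
      -H 'kbn-xsrf: true'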

  3. Send some data, for example by using apmsoak: `go run ./cmd/apmsoak run --file cmd/apmsoak/scenarios.yml --scenario apm-server --server-url http://localhost:8200`

  4. Assert that the APM indices created are managed by ILM, for example by running `GET /_data_stream/traces-apm-default` to check the trace indices (see the sketch below).
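    A sketch of the check (the `managed_by` field comes from the get data stream API; exact response wording may vary by version):
    curl -s -u admin:changeme "http://localhost:9200/_data_stream/traces-apm-default"
    # Each entry under "indices" should report
    # "managed_by": "Index Lifecycle Management",
    # and "ilm_policy" should name the APM ILM policy.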

  5. Build an Elasticsearch Docker image using the branch in this PR: `./gradlew buildAarch64DockerImage`

  6. Update the versions used in the stack created in step 1 to `8.16.0-SNAPSHOT`; for ES, use the Docker image built in step 5.

  7. Send some more data as we did in step 3

  8. Assert that all the APM indices are still managed by ILM

  9. Roll over the data stream (see the sketch below).
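    A sketch using the rollover API on the traces data stream:
    curl -u admin:changeme -X POST "http://localhost:9200/traces-apm-default/_rollover"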

  10. Assert that all the APM indices, including the one created using rollover in step 9, are still managed by ILM

Also, test that the setup works by itself, i.e. if a cluster is created from scratch using the latest version (with the changes in this PR), it works as expected and the APM indices created in this case are managed by DSL (Data Stream Lifecycle). A quick check is sketched below.
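
A minimal sketch of that check (same request as step 4; the expected `managed_by` value is taken from the data stream lifecycle docs and may differ slightly between versions):

    curl -s -u admin:changeme "http://localhost:9200/_data_stream/traces-apm-default"
    # Each backing index should now report
    # "managed_by": "Data stream lifecycle" instead of ILM.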

NOTE: Any indices created while APM is on version 8.15.0 against a data stream created before 8.15.0 (i.e. with ILM) will remain unmanaged even after this fix. To fix them, we would need to update those indices explicitly, OR use the PUT API on the data stream to set DSL.
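
To see which manager each backing index ends up with after such a manual update, the two explain APIs can be compared (a sketch; the backing-index pattern is illustrative):

    # ILM's view of the backing indices
    curl -s -u admin:changeme "http://localhost:9200/.ds-traces-apm-default-*/_ilm/explain"
    # DSL's view of the same indices
    curl -s -u admin:changeme "http://localhost:9200/.ds-traces-apm-default-*/_lifecycle/explain"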

Related issues


This is an automatic backport of pull request #13918 done by [Mergify](https://mergify.com).

@mergify mergify bot requested a review from a team as a code owner September 3, 2024 13:43
@mergify mergify bot added the backport label Sep 3, 2024
@lahsivjar lahsivjar enabled auto-merge (squash) September 3, 2024 13:44
@mergify mergify bot merged commit d36412d into 8.15 Sep 3, 2024
9 checks passed
@mergify mergify bot deleted the mergify/bp/8.15/pr-13918 branch September 3, 2024 14:06