Gatling data retention #249

Closed
1 task done
wayneseymour opened this issue Apr 6, 2022 · 11 comments

@wayneseymour
Member

wayneseymour commented Apr 6, 2022

We store three kinds of Gatling data:

  • user data: gatling-users
  • stats data: gatling-stats
  • metric data: gatling-data

For metric data, we wish to retain data as follows:

  • Rollover after 30 days or 50GB

  • Delete after 1 year

  • After "Gatling index too large" (#250) is resolved,
    we have to remove the index template that was used:
    GET _index_template/template_with_gatling_data_mappings_1shard_0replicas

@wayneseymour
Member Author

So far we've used the following requests on the "practice" cluster:

# Test policy: roll over after 2 minutes, delete after 4 minutes
PUT _ilm/policy/gatling-metrics-ilm-2
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "2m",
            "max_primary_shard_size": "50gb"
          },
          "set_priority": {
            "priority": 100
          }
        },
        "min_age": "0ms"
      },
      "delete": {
        "min_age": "4m",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
# This ingest pipeline already existed. We added a processor
# that sets the @timestamp field, which is required now that
# we are going to use a data stream instead of just an
# index alias.
PUT _ingest/pipeline/my_pipeline
{
  "description": "Blanks responseHeaders and responseBody unless status is 'KO'; copies timestamp to @timestamp",
  "processors": [
    {
      "set": {
        "field": "responseHeaders",
        "value": "",
        "if": "ctx.status != 'KO'"
      }
    },
    {
      "set": {
        "field": "responseBody",
        "value": "",
        "if": "ctx.status != 'KO'"
      }
    },
    {
      "set": {
        "field": "@timestamp",
        "copy_from": "timestamp"
      }
    }
  ]
}
# Add the index template, which enables the data stream.
# Note that it also sets the ingest pipeline as the default pipeline.
PUT _index_template/gatling-data-index-template-2
{
  "template": {
    "settings": {
      "index.lifecycle.name": "gatling-metrics-ilm-2",
      "index.default_pipeline": "my_pipeline"
    }
  },
  "index_patterns": ["gatling-data*"],
  "data_stream": { },
  "priority": 500,
  "_meta": {
    "description": "Template for gatling time series data"
  }
}

Then we posted docs and waited to confirm that ingestion into the data stream worked and that the rollover behaved as we'd hoped.
The following result also shows that the ingest pipeline added the @timestamp field that the data stream requires.

POST gatling-data/_doc
{
  "timestamp": "2019-04-05T09:55:25.418Z",
  "message" : "pls add the at-timestamp"
}
# Match all query to see the result
GET gatling-data/_search
# Result
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : ".ds-gatling-data-2022.04.07-000228",
        "_type" : "_doc",
        "_id" : "aKVyA4ABynv0SaSS0Ql5",
        "_score" : 1.0,
        "_source" : {
          "responseHeaders" : "",
          "responseBody" : "",
          "@timestamp" : "2019-04-05T09:55:25.418Z",
          "message" : "pls add the at-timestamp",
          "timestamp" : "2019-04-05T09:55:25.418Z"
        }
      }
    ]
  }
}
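
The result above can be reproduced offline with a small sketch of what the three `set` processors do. This is only an illustration in plain Python, not the Elasticsearch implementation; the function name `apply_gatling_pipeline` is made up for this example, and it assumes a missing `status` behaves like any non-'KO' value (which matches the result above, where the test doc had no `status` field).

```python
import copy


def apply_gatling_pipeline(doc):
    """Hypothetical offline stand-in for the three `set` processors."""
    out = copy.deepcopy(doc)
    # Processors 1 and 2: blank the response fields unless the request failed.
    # A missing status counts as "not KO", mirroring `ctx.status != 'KO'`.
    if out.get("status") != "KO":
        out["responseHeaders"] = ""
        out["responseBody"] = ""
    # Processor 3: copy `timestamp` into `@timestamp`, which the data stream needs.
    out["@timestamp"] = out["timestamp"]
    return out


test_doc = {
    "timestamp": "2019-04-05T09:55:25.418Z",
    "message": "pls add the at-timestamp",
}
result = apply_gatling_pipeline(test_doc)
print(result)
```

The printed document has the same five fields as the `_source` in the search result: the original `timestamp` and `message`, the two blanked response fields, and the copied `@timestamp`.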

wayneseymour self-assigned this Apr 7, 2022
@wayneseymour
Member Author

Added a new pipeline to Kibana Stats Prod:

PUT _ingest/pipeline/gatling-data-pipeline
{
  "description": "Drops responseHeaders and responseBody if ctx.status is not 'KO', sets @timestamp from timestamp.",
  "processors": [
    {
      "set": {
        "field": "responseHeaders",
        "value": "",
        "if": "ctx.status != 'KO'"
      }
    },
    {
      "set": {
        "field": "responseBody",
        "value": "",
        "if": "ctx.status != 'KO'"
      }
    },
    {
      "set": {
        "field": "@timestamp",
        "copy_from": "timestamp"
      }
    }
  ]
}

@wayneseymour
Member Author

Added the ILM policy:

PUT _ilm/policy/gatling-data-30days-or-50GB-delete-after-6months
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "30d",
            "max_primary_shard_size": "50gb"
          },
          "set_priority": {
            "priority": 100
          }
        },
        "min_age": "0ms"
      },
      "delete": {
        "min_age": "180d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
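
A rough worked example of what this policy means for one backing index, assuming the index rolls over on age (30 days) rather than on size. Once an index has rolled over, ILM measures the delete phase's `min_age` from the rollover time, not from index creation, so the two durations add up. The creation date below is only an illustrative value, and the function name is made up for this sketch; ILM also checks conditions periodically, so real times will be slightly later.

```python
from datetime import datetime, timedelta

ROLLOVER_MAX_AGE = timedelta(days=30)   # hot phase: rollover.max_age
DELETE_MIN_AGE = timedelta(days=180)    # delete phase: min_age


def approx_delete_time(index_created):
    """Approximate deletion time for an index that rolls over on age alone."""
    rolled_over_at = index_created + ROLLOVER_MAX_AGE
    # After a rollover, min_age counts from the rollover time.
    return rolled_over_at + DELETE_MIN_AGE


created = datetime(2022, 4, 21)  # illustrative creation date
deleted = approx_delete_time(created)
print(deleted.date(), (deleted - created).days)
```

With these numbers the index is removed about 210 days after it was created. If the primary shard reaches 50gb first, the rollover happens earlier and the deletion moves forward by the same amount.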

@wayneseymour
Member Author

wayneseymour commented Apr 11, 2022

Added the component template:

PUT _component_template/gatling-settings-template
{
  "template": {
    "settings": {
      "index": {
        "lifecycle": {
          "name": "gatling-data-30days-or-50GB-delete-after-6months"
        },
        "default_pipeline": "gatling-data-pipeline"
      }
    },
    "mappings": {
      "properties": {
        "CI_BUILD_ID": {
          "type": "integer",
          "coerce": true
        },
        "CI_BUILD_URL": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "CI_RUN_URL": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "baseUrl": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "branch": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "buildHash": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "buildNumber": {
          "type": "long"
        },
        "deploymentId": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "esBuildDate": {
          "type": "date"
        },
        "esBuildHash": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "esLuceneVersion": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "esUrl": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "esVersion": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "isCloudDeployment": {
          "type": "boolean"
        },
        "isSnapshotBuild": {
          "type": "boolean"
        },
        "kibanaBranch": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "maxUsers": {
          "type": "long"
        },
        "message": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "method": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "name": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "requestBody": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "requestHeaders": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "requestSendStartTime": {
          "type": "date"
        },
        "requestTime": {
          "type": "long"
        },
        "responseBody": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "responseHeaders": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "responseReceiveEndTime": {
          "type": "date"
        },
        "responseStatus": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "scenario": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "status": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "timestamp": {
          "type": "date"
        },
        "url": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "userId": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "version": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
  }
}

@wayneseymour
Member Author

Added the index template, with a data stream enabled:

PUT _index_template/gatling-data-index-template
{
  "index_patterns": ["gatling-data*"],
  "data_stream": { },
  "composed_of": [ "gatling-settings-template"],
  "priority": 500,
  "_meta": {
    "description": "Template for gatling time series data"
  }
}

@wayneseymour
Member Author

Posted one doc:

POST gatling-data/_doc
{
  "timestamp": "2019-04-05T09:55:25.418Z",
  "message" : "pls add the at-timestamp"
}

@wayneseymour
Member Author

Dropping gatling-data-index-template until #250 is resolved.

@wayneseymour
Member Author

Posted another doc:

GET gatling-data/_mapping

POST gatling-data/_doc
{
  "timestamp": "2022-04-21T12:38:00.000Z",
  "message" : "test doc from tre"
}
# id R3jqS4ABkeOZogzxwR20

Result:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : ".ds-gatling-data-2022.04.21-000001",
        "_type" : "_doc",
        "_id" : "R3jqS4ABkeOZogzxwR20",
        "_score" : 1.0,
        "_source" : {
          "responseHeaders" : "",
          "responseBody" : "",
          "@timestamp" : "2022-04-21T12:38:00.000Z",
          "message" : "test doc from tre",
          "timestamp" : "2022-04-21T12:38:00.000Z"
        }
      }
    ]
  }
}
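
The `_index` value in this result, compared with the one in the earlier practice-cluster result, gives a quick check on rollover. Backing indices of a data stream are named `.ds-<data-stream>-<yyyy.MM.dd>-<generation>`, and the generation goes up by one on each rollover. The sketch below pulls those parts out of the two names seen in this thread; the helper name `parse_backing_index` is made up for this example.

```python
import re

# .ds-<data-stream>-<yyyy.MM.dd>-<generation>
BACKING_INDEX = re.compile(
    r"^\.ds-(?P<stream>.+)-(?P<date>\d{4}\.\d{2}\.\d{2})-(?P<generation>\d+)$"
)


def parse_backing_index(name):
    """Split a backing index name into (data stream, creation date, generation)."""
    match = BACKING_INDEX.match(name)
    if match is None:
        raise ValueError(f"not a data stream backing index: {name!r}")
    return match.group("stream"), match.group("date"), int(match.group("generation"))


earlier = parse_backing_index(".ds-gatling-data-2022.04.07-000228")
latest = parse_backing_index(".ds-gatling-data-2022.04.21-000001")
print(earlier)
print(latest)
```

Generation 228 on the practice cluster is consistent with the 2-minute test policy having rolled the stream over many times, while generation 1 here means this data stream is still on its first backing index.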

@wayneseymour
Member Author

Summary

Added and/or double-checked:

ingest pipeline

  • GET _ingest/pipeline/gatling-data-pipeline

ilm policy

  • GET _ilm/policy/gatling-data-30days-or-50GB-delete-after-6months

component template

  • GET _component_template/gatling-settings-template

index template with data stream

  • GET _index_template/gatling-data-index-template

Also removed the index template used for fixing up gatling-data-2021-11:

  • _index_template/template_with_gatling_data_mappings_1shard_0replicas

Posted some docs to the gatling-data data stream:

POST gatling-data/_doc
{
  "timestamp": "2022-04-21T12:38:00.000Z",
  "message" : "test doc from tre"
}
# id R3jqS4ABkeOZogzxwR20
POST gatling-data/_doc
{
  "timestamp": "2022-04-21T12:43:00.000Z",
  "message" : "another test doc from tre"
}
# id qXjvS4ABkeOZogzxazBZ
POST gatling-data/_doc
{
  "timestamp": "2022-04-21T14:10:00.000Z",
  "message" : "another test doc from tre with more fields",
  "buildNumber": 777,
  "isSnapshotBuild": false
}
# id 9Hk_TIABkeOZogzxVn4i

@wayneseymour
Member Author

@LeeDr @marius-dr @dmlemeshko
Added some rudimentary documentation here: Gatling Data Retention

@marius-dr
Member

Added some clarification to the documentation regarding the pipeline.
