CI AWS account: write scripts to clean used resources #26711

Closed
mtojek opened this issue Jul 5, 2021 · 14 comments
Labels
Stalled Team:Cloud-Monitoring Label for the Cloud Monitoring team

Comments

@mtojek
Contributor

mtojek commented Jul 5, 2021

The idea of this issue is to add scripts that remove old resources left over from tests. We can't always trust Terraform to remove all resources: the process running "tf" or the entire CI machine may go down, and these resources will then stay forever.

Possible solutions:

  • Lambda function which periodically cleans old resources.
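
To make the Lambda idea concrete, here is a minimal sketch in Python. It assumes a dedicated CI account where every old instance is safe to remove, uses boto3's `describe_instances` and `terminate_instances`, and handles EC2 only. The 24-hour `MAX_AGE` and the optional `ec2` argument are illustrative choices, not anything already in the pipeline.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold: anything older than this is treated as a leftover.
MAX_AGE = timedelta(hours=24)


def find_expired_instances(ec2, now, max_age=MAX_AGE):
    """Return IDs of running or stopped instances launched more than max_age ago."""
    expired = []
    # Pagination is omitted for brevity; a real script would use a paginator.
    response = ec2.describe_instances(
        Filters=[{"Name": "instance-state-name", "Values": ["running", "stopped"]}]
    )
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            if now - instance["LaunchTime"] > max_age:
                expired.append(instance["InstanceId"])
    return expired


def handler(event, context, ec2=None):
    """Lambda entry point, meant to be invoked on a schedule."""
    if ec2 is None:
        import boto3  # imported lazily so the helper can be exercised without AWS

        ec2 = boto3.client("ec2")
    expired = find_expired_instances(ec2, datetime.now(timezone.utc))
    if expired:
        ec2.terminate_instances(InstanceIds=expired)
    return {"terminated": expired}
```

Passing the client in as an argument keeps the age check testable with a stub. Other services would need their own listing and deletion calls.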

Most likely we'll face the same problem in elastic/integrations.

cc @jsoriano @kaiyan-sheng

@mtojek mtojek added the Team:Integrations Label for the Integrations team label Jul 5, 2021
@elasticmachine
Collaborator

Pinging @elastic/integrations (Team:Integrations)

@mtojek
Contributor Author

mtojek commented Jul 5, 2021

There are a few players in the game:

cloud nuke - https://github.com/gruntwork-io/cloud-nuke
auto cleanup - https://github.com/servian/aws-auto-cleanup
awsweeper - https://github.com/jckuester/awsweeper

@jsoriano
Member

jsoriano commented Jul 5, 2021

Terraform state is archived by the Jenkins pipeline. This could be used to discover resources that were created but not destroyed, though it would mean looking through all the jobs that may create these scenarios, and it won't work for removed jobs.
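
As a rough illustration of the state-based approach, the sketch below lists what is still recorded in one archived state file. It assumes the file follows the JSON layout of Terraform state version 4 (a top-level `resources` list whose entries carry `mode`, `type`, `name` and `instances`). The function name `leftover_resources` is hypothetical.

```python
import json


def leftover_resources(state_text):
    """List (type, name, id) for managed resources still recorded in a state file."""
    state = json.loads(state_text)
    found = []
    for resource in state.get("resources", []):
        # Data sources are lookups, not resources that Terraform created.
        if resource.get("mode") != "managed":
            continue
        for instance in resource.get("instances", []):
            attributes = instance.get("attributes") or {}
            found.append((resource["type"], resource["name"], attributes.get("id")))
    return found
```

A state file archived after a clean destroy should yield an empty list, so any entry returned here is a candidate for manual or scripted cleanup.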

@mtojek
Contributor Author

mtojek commented Jul 5, 2021

Yeah, that's actually the reason why I personally prefer to simplify the logic and just depend on the timestamp (old enough? nuke it please).

I assume we need it for EC2 instances, DynamoDB tables, SQS queues, and SNS topics. Is there anything else? Do we also create other resources?

@kaiyan-sheng
Contributor

Good point!

> I assume we need it for EC2 instances, DynamoDB databases, SQS queues, SNS topics. Is there anything else? Do we create also other resources?

S3 buckets also?

@kuisathaverat
Contributor

The easy way is to tag everything created from the CI, then nuke everything with those tags daily. If we add the tag CI and another like created-DAY_OF_YEAR, we can safely nuke resources a day after they are created.
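
A minimal sketch of that tagging rule in Python follows. It assumes tags are handled as a plain set of strings and that a day number larger than today's means the resource was created last year, since the tag carries no year. The helper names and the one-day default are illustrative.

```python
import calendar
from datetime import date

CI_TAG = "CI"
CREATED_PREFIX = "created-"


def creation_tag(day):
    """Build the tag for a resource created on the given date."""
    return f"{CREATED_PREFIX}{day.timetuple().tm_yday}"


def age_in_days(tags, today):
    """Return the age implied by the created-N tag, or None if the tag is missing."""
    for tag in tags:
        suffix = tag[len(CREATED_PREFIX):]
        if tag.startswith(CREATED_PREFIX) and suffix.isdigit():
            created = int(suffix)
            current = today.timetuple().tm_yday
            if created <= current:
                return current - created
            # The tag has no year, so a larger day number is read as last year.
            days_last_year = 366 if calendar.isleap(today.year - 1) else 365
            return current + days_last_year - created
    return None


def should_nuke(tags, today, min_age_days=1):
    """Only CI-tagged resources that are at least min_age_days old are eligible."""
    age = age_in_days(tags, today)
    return CI_TAG in tags and age is not None and age >= min_age_days
```

Resources without the CI tag or without a parsable creation tag are never selected, which keeps the rule conservative.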

@botelastic

botelastic bot commented Jul 19, 2022

Hi!
We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:.
Thank you for your contribution!

@botelastic botelastic bot added the Stalled label Jul 19, 2022
@mtojek
Contributor Author

mtojek commented Jul 19, 2022

👍

@botelastic botelastic bot removed the Stalled label Jul 19, 2022
@mtojek mtojek added Team:Cloud-Monitoring Label for the Cloud Monitoring team and removed Team:Integrations Label for the Integrations team labels Jul 19, 2022
@kuisathaverat
Contributor

@v1v is this on your radar?

@v1v
Member

v1v commented Jul 19, 2022

IIRC, all the bits and pieces regarding the tagging/labelling were done with:

There is some automation in place to delete all the leftovers. @amannocci, can you confirm whether the automation is enabled to delete those resources when needed?

string(name: 'awsRegion', defaultValue: 'eu-central-1', description: 'Default AWS region to use for testing.')
is the current AWS region

@amannocci
Contributor

Currently, only EC2 instances are handled by cloud-reaper.
AFAIK we will need to add support for the S3, SNS, and SQS services.
It should be easy for S3 and a bit less obvious for SNS and SQS.
Should we add those items in an iteration? @v1v

@v1v
Member

v1v commented Jul 19, 2022

> Should we add those items in an iteration? @v1v

Would you mind raising an issue in our project, so we can prioritise it?

@botelastic

botelastic bot commented Jul 19, 2023

Hi!
We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:.
Thank you for your contribution!

@botelastic botelastic bot added the Stalled label Jul 19, 2023
@amannocci
Contributor

This issue was addressed with internal tooling.


7 participants