Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added enable flag for chef and set it to true by default #20

Open
wants to merge 14 commits into
base: master
Choose a base branch
from
68 changes: 44 additions & 24 deletions bin/aws_cleaner.rb
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,7 @@ def logger(config)
logger
end

def webhook(id, instance_id)
def webhook(instance_id)
if @config[:webhooks]
@config[:webhooks].each do |hook, hook_config|
if AwsCleaner::Webhooks.fire_webhook(hook_config, @config, instance_id)
Expand All @@ -72,31 +72,26 @@ def webhook(id, instance_id)
@logger.info("Failed to run webhook #{hook}")
end
end
AwsCleaner.new.delete_message(id, @config)
end
end

def chef(id, instance_id, chef_node)
if chef_node
if AwsCleaner::Chef.remove_from_chef(chef_node, @chef_client, @config)
@logger.info("Removed #{chef_node} from Chef")
AwsCleaner.new.delete_message(id, @config)
end
def chef(instance_id, chef_node)
return unless @config[:chef][:enable]

if AwsCleaner::Chef.remove_from_chef(chef_node, @chef_client, @config)
@logger.info("Removed #{chef_node} from Chef")
else
@logger.info("Instance #{instance_id} does not exist in Chef, deleting message")
AwsCleaner.new.delete_message(id, @config)
@logger.info("Instance #{instance_id} does not exist in Chef")
end
end

def sensu(id, instance_id, chef_node)
def sensu(instance_id, sensu_node)
return unless @config[:sensu][:enable]
if AwsCleaner::Sensu.in_sensu?(chef_node, @config)
if AwsCleaner::Sensu.remove_from_sensu(chef_node, @config)
@logger.info("Removed #{chef_node} from Sensu")
else
@logger.info("Instance #{instance_id} does not exist in Sensu, deleting message")
end
AwsCleaner.new.delete_message(id, @config)

if AwsCleaner::Sensu.remove_from_sensu(sensu_node, @config)
@logger.info("Removed #{sensu_node} from Sensu")
else
@logger.info("Instance #{instance_id} does not exist in Sensu")
end
end

Expand All @@ -115,12 +110,22 @@ def closelog(message)
@sqs_client = AwsCleaner::SQS.client(@config)
@chef_client = AwsCleaner::Chef.client(@config)

# to provide backwards compatibility as this key did not exist previously
# to provide backwards compatibility as these keys did not exist previously
@config[:chef][:enable] = if @config[:chef][:enable].nil?
true
else
@config[:chef][:enable]
end
@config[:sensu][:enable] = if @config[:sensu][:enable].nil?
true
else
@config[:sensu][:enable]
end
@config[:lookup] = if @config[:lookup].nil?
'chef'
else
@config[:lookup]
end

# main loop
loop do
Expand All @@ -147,14 +152,29 @@ def closelog(message)
instance_id = AwsCleaner.new.process_message(body)

if instance_id
chef_node = AwsCleaner::Chef.get_chef_node_name(instance_id, @config)
webhook(id, instance_id)
chef(id, instance_id, chef_node)
sensu(id, instance_id, chef_node)
node = case @config[:lookup]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This option name seems too ambiguous to me, short variables are good but maybe making it a bit more verbose will do some good here. maybe something along the lines of lookup_type?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have updated it to lookup_type. That is a bit more clear.

when 'sensu'
AwsCleaner::Sensu.get_sensu_node_name(instance_id, @config)
else
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this not be when 'chef'?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No this should be when chef, since the variable set to "chef" plus any invalid value would default to use chef.

AwsCleaner::Chef.get_chef_node_name(instance_id, @config)
end

if node
@logger.info("Instance #{instance_id} found, node name is #{node}.")

webhook(instance_id)
chef(instance_id, node)
sensu(instance_id, node)

@logger.info("All done with #{node}, deleting message.")
else
@logger.info("Instance #{instance_id} not found, deleting message.")
end
else
@logger.info('Message not relevant, deleting')
AwsCleaner.new.delete_message(id, @config)
end

AwsCleaner.new.delete_message(id, @config)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about this and discussing with a colleague, right now the message is deleted when it is not deemed relevant like when the state is not "terminated", however if you have another process looking at the same SQS queue for "running" events this could cause messages to get lost. Should this be moved to before the else statement so it looks like this? That way the message is only deleted once processed and not when it is ignored, the other side affect is that if you have nothing else listening you will have extra messages queuing up that will never be processed.

        end

        AwsCleaner.new.delete_message(id, @config)
      else
        @logger.info('Message not relevant, ignoring...')
      end```

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the expectation has been that this queue is only used for aws-cleaner I would not want to mix this and say application queues there is no real cost to using a seperate queue. I'd like to hear more of a real world justification before wanting to go down this route.

end

sleep(5)
Expand Down
2 changes: 2 additions & 0 deletions config.yml.sample
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
:lookup: chef
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this should be quoted.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added that.

:aws:
# only region is required when using iam which is recommended
:access_key_id: 'AWS Access Key'
Expand All @@ -12,6 +13,7 @@
:url: 'https://chef.example.com/organizations/example'
:client: 'somebody'
:key: '/path/to/client.pem'
:enable: true
:hipchat:
:api_token: 'Hipchat API token'
:room: 'Notifications'
Expand Down
35 changes: 33 additions & 2 deletions lib/aws-cleaner.rb
Original file line number Diff line number Diff line change
Expand Up @@ -30,15 +30,15 @@ def self.client(config)
# call the Chef API to get the node name of the instance
def self.get_chef_node_name(instance_id, config)
chef = client(config)
results = chef.search.query(:node, "ec2_instance_id:#{instance_id} OR chef_provisioning_reference_server_id:#{instance_id}")
results = chef.search.query(:node, "ec2_instance_id:#{instance_id[2..-1]} OR chef_provisioning_reference_server_id:#{instance_id[2..-1]}")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a chef server currently to test this against can you show me the output of this or explain this change?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This removes the "i-" part from the instance ID in case you don't have that. I do the same on the sensu side since we don't have that part in our client names.

I tested it on a quick chef install and it seemed to work but don't have a full chef install to test against. I'm ok omitting this change.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to address this separately and will be starting a new job next week so I will have a chef and sensu server to test with.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that should be removed since it's going to break lookups for the instance. Ohai populates that attribute with the actual instance-id and it will always include i-.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reverted that, I figured since it was the "search" api that was being called it would also search partial strings it seemed to work on my quick test with chef but since I don't use it I'm fine with reverting it.

return false if results.rows.empty?
results.rows.first['name']
end

# call the Chef API to get the FQDN of the instance
def self.get_chef_fqdn(instance_id, config)
chef = client(config)
results = chef.search.query(:node, "ec2_instance_id:#{instance_id} OR chef_provisioning_reference_server_id:#{instance_id}")
results = chef.search.query(:node, "ec2_instance_id:#{instance_id[2..-1]} OR chef_provisioning_reference_server_id:#{instance_id[2..-1]}")
return false if results.rows.empty?
results.rows.first['automatic']['fqdn']
end
Expand All @@ -58,6 +58,37 @@ def self.remove_from_chef(node_name, chef, config)
end

module Sensu
# call the Chef API to get the node name of the instance
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update comment to Sensu.

def self.get_sensu_node_name(instance_id, config)
offset = 0
returned_clients = 100

while returned_clients == 100
response = RestClient::Request.execute(
url: "#{config[:sensu][:url]}/clients?limit=100&offset=#{offset}",
method: :get,
timeout: 5,
open_timeout: 5
)

clients = JSON.parse(response.body)
client = clients.find { |child| child['name'].include?(instance_id[2..-1]) }
client_name = if client.nil?
false
else
client['name']
end

break if client_name

returned_clients = clients.count
offset += 100
end

return false unless client_name
client_name
end

# check if the node exists in Sensu
def self.in_sensu?(node_name, config)
RestClient::Request.execute(
Expand Down