In this article I’ll introduce a feature that some customers have frequently asked for. The request stems from the fact that automation (whether via Tornado, ad-hoc scripts, etc.) generates a lot of services “on the fly”, for example as shown in my previous blog post. These services are created, updated and eventually closed, either manually or automatically. Once the problem is resolved, the monitoring system no longer needs them, so they should be cleaned up from NetEye automatically.
My approach is based on a bash script I named check_service_delete_icingcli_by_hard_state.sh.
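This script performs two actions: it queries the API for all services that match a predefined filter, and it then removes the qualifying services via the Icinga CLI.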
I’ve put some code snippets detailing these at the bottom of this article.
The script is inserted into the /neteye/shared/monitoring/plugins/ directory and is then configured via a command/service with the following parameters:
Once the command and service have been created, the service is added to a target host. It has to be properly configured by setting these 3 fields:
In the following example the script deletes all services that have the custom variable “created_by” set to apic_fault_check and whose hard state has been OK for more than one day. All other services are ignored.
In the NetEye Dashboard, the Service provides basic information like this:
And here are the code snippets I promised you:
1) Call the Icinga 2 API (port 5665, authenticated as the director API user) to retrieve all services that match a predefined filter:
# Query the Icinga 2 API and store "name,last_hard_state_change" as CSV
curl -k -s -u director:@PASSWORD@ -H 'Accept: application/json' \
  -X GET 'https://@FQDN@:5665/v1/objects/services' \
  -d "$(generate_post_data)" \
  | jq -r '.results[] | [.name, .attrs.last_hard_state_change] | @csv' \
  >"/$DIR/${var}_delete.tmp"
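The generate_post_data helper used in the call above is not shown in this article. A minimal sketch of what it might look like follows, assuming the filter from the example earlier (custom variable “created_by” set to apic_fault_check, last hard state OK). The variable names CUSTOM_VAR_NAME and CUSTOM_VAR_VALUE are hypothetical, and the real script may build its filter differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of generate_post_data; the variable names and the
# exact filter expression are assumptions, not the original script's code.
CUSTOM_VAR_NAME="created_by"         # assumed: custom variable to match
CUSTOM_VAR_VALUE="apic_fault_check"  # assumed: value from the example

generate_post_data() {
  # Emit the JSON request body: select only services whose custom variable
  # matches and whose last hard state is OK (0), and return only the
  # timestamp of the last hard state change.
  cat <<EOF
{
  "filter": "service.vars.${CUSTOM_VAR_NAME} == wanted && service.last_hard_state == 0",
  "filter_vars": { "wanted": "${CUSTOM_VAR_VALUE}" },
  "attrs": [ "last_hard_state_change" ]
}
EOF
}

generate_post_data
```

Passing the value through filter_vars rather than embedding it in the filter string avoids having to escape quotes inside the JSON body.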
2) Remove services via the Icinga CLI:
# Delete the service only if its last hard state change is older than the threshold
if (($diff > $d2s)); then
  icingacli director service delete "$servicename" --host "$hostname" >/dev/null
  if [ $? -eq 0 ]; then
    echo "<strong>DELETED Successfully</strong>: Host $hostname - Service $servicename - Last_Check: $diffg days ago"
    echo ""
  else
    echo "ERROR during DELETE Host $hostname - Service $servicename - Last_Check: $diffg days ago"
    echo ""
  fi
else
  echo "Service not deleted: Host: $hostname - Service: $servicename - Last_Check: $diffg days ago"
  echo ""
fi
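The snippet above relies on $hostname, $servicename, $diff, $d2s and $diffg, whose computation is not shown in this article. The sketch below is one possible way to derive them from a CSV line such as the ones jq writes in step 1. The function names age_in_seconds and check_line are hypothetical, and the function only prints what it would do instead of calling icingacli.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: variable names mirror the snippet above, but how the
# original script computes them is an assumption.

# Seconds elapsed since a timestamp. Icinga reports epoch seconds with a
# fractional part, which is truncated here.
age_in_seconds() {
  local now="$1" last_change="$2"
  echo $(( now - ${last_change%.*} ))
}

# Process one CSV line of the form: "host!service",1700000000.5
check_line() {
  local line="$1" days="$2" now="$3"
  local name ts hostname servicename d2s diff diffg
  name="${line%,*}"          # everything before the last comma (still quoted)
  name=${name//\"/}          # strip the double quotes added by @csv
  ts="${line##*,}"           # last field: last_hard_state_change
  hostname="${name%%!*}"     # Icinga service object names are "host!service"
  servicename="${name#*!}"
  d2s=$(( days * 86400 ))    # threshold converted from days to seconds
  diff="$(age_in_seconds "$now" "$ts")"
  diffg=$(( diff / 86400 ))  # age in whole days, for the message
  if (( diff > d2s )); then
    echo "would delete: $hostname / $servicename ($diffg days)"
  else
    echo "would keep: $hostname / $servicename ($diffg days)"
  fi
}

check_line '"web01!apic fault 42",1700000000.5' 1 1700200000
```

In a real run the third argument would be the current time, for example "$(date +%s)". A dry-run mode like this can be handy for checking the filter and threshold before any service is actually removed.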