The upcoming NetEye 3.5 release introduces the Shut Down Management module, which lets you configure automatic shutdown procedures in a data center.
Here is a simple example to show what this feature can do and why it is needed. If the power supply in the data center fails, the UPS takes over, usually with about half an hour of autonomy. All servers therefore have to be shut down before power is lost completely. In this case a “Business Process” can be configured in NetEye to execute the desired check logic (for example, it checks whether the UPS has kicked in). In the Shut Down Management module the user can then specify that, if the check fails and a defined tolerance time (30 minutes) has elapsed, the automatic shutdown procedure starts and stops all servers in time, in a defined order and logic. The tolerance time should be calculated as the total UPS battery autonomy minus the total time required to complete the shutdown procedure of all hosts.
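The tolerance calculation described above can be sketched in a few lines of Python. This is only an illustration and not part of NetEye: the function name and the per-host durations are hypothetical, and it assumes the hosts are shut down one after another, so their durations add up.

```python
from datetime import timedelta
from typing import Iterable


def tolerance_period(ups_autonomy: timedelta,
                     shutdown_durations: Iterable[timedelta]) -> timedelta:
    """Return UPS autonomy minus the total time needed to shut down all hosts.

    Assumes a sequential shutdown, so the individual durations are summed.
    """
    total_shutdown = sum(shutdown_durations, timedelta())
    tolerance = ups_autonomy - total_shutdown
    if tolerance < timedelta():
        # The shutdown alone takes longer than the batteries last.
        raise ValueError("shutdown takes longer than the UPS autonomy")
    return tolerance


# Hypothetical figures: 30 minutes of autonomy, three hosts to stop.
autonomy = timedelta(minutes=30)
durations = [timedelta(minutes=5), timedelta(minutes=3), timedelta(minutes=2)]
print(tolerance_period(autonomy, durations))  # 0:20:00
```

With these example figures, 20 minutes remain as tolerance. In practice it is sensible to keep an additional safety margin, since battery autonomy tends to decrease over time.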
Let’s see how to set up a shutdown management procedure.
Create a new shutdown management procedure
When configuring a new Shutdown Procedure definition, the following main settings are available:
Require Explicit User Confirm: the automatic shutdown can require manual confirmation by the user before the process starts
Tolerance Period before shut down: the time that must elapse between the failure of the check and the start of the shutdown procedure
Configure the Nagios check for the shutdown procedure
The Business Process also has to be chosen: its status is the condition that determines whether the shutdown procedure is invoked.
Add or remove the hosts that need to be included in or excluded from the shutdown procedure
Finally, a shutdown definition consists of a list of hosts to be shut down and a clear shutdown sequence. This helps ensure that the hosts are shut down in the right order, avoiding data corruption and system inconsistencies. Built-in shutdown commands are available and can be assigned to the hosts. These commands can interact directly with the system or invoke the shutdown process via a remote NetEye agent.
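To make the idea of an ordered shutdown sequence concrete, here is a minimal Python sketch. It does not reflect how the module is implemented: the `ShutdownStep` class, the host names and the command names are all hypothetical, and the actual execution of a command is left to a caller-supplied function.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class ShutdownStep:
    host: str      # hypothetical host name
    order: int     # lower numbers are shut down first
    command: str   # hypothetical name of the assigned shutdown command


def run_sequence(steps: Iterable[ShutdownStep],
                 execute: Callable[[ShutdownStep], None]) -> List[str]:
    """Run the steps in ascending order and return the hosts in the order processed."""
    processed = []
    for step in sorted(steps, key=lambda s: s.order):
        execute(step)  # e.g. a direct call or a request to a remote agent
        processed.append(step.host)
    return processed


# Example: application servers go down before the database and the storage.
steps = [
    ShutdownStep("db01", order=2, command="shutdown_linux"),
    ShutdownStep("app01", order=1, command="shutdown_linux"),
    ShutdownStep("san01", order=3, command="shutdown_via_agent"),
]
print(run_sequence(steps, execute=lambda s: print(f"{s.host}: {s.command}")))
```

Shutting down the hosts that depend on others first (applications before databases, databases before storage) is what protects against the data corruption mentioned above.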
Starting the shutdown procedure
Once the Nagios monitoring status becomes Critical with a HARD state (confirmed critical), the shutdown management checks the status and makes sure:
that the tolerance time for the Business Service to recover has elapsed without any recovery on the Nagios side
that the Nagios check is still being executed and its results are fresh
that the user confirms the start of the shutdown procedure, if the settings require it
All activities, including those of the shutdown itself, can be reviewed in the logs collected by the module.
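The conditions listed above can be read as a single decision function. The following Python sketch is an interpretation of that logic, not the module's actual code: all names are hypothetical, and the `max_check_age` threshold used to decide whether a result is "fresh" is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class CheckStatus:
    state: str                # e.g. "OK", "WARNING", "CRITICAL"
    state_type: str           # "SOFT" or "HARD"
    critical_since: datetime  # when the check entered the critical state
    last_check: datetime      # when the check was last executed


def should_start_shutdown(status: CheckStatus, now: datetime,
                          tolerance: timedelta, max_check_age: timedelta,
                          confirm_required: bool, user_confirmed: bool) -> bool:
    """Return True only if every precondition for the shutdown holds."""
    if status.state != "CRITICAL" or status.state_type != "HARD":
        return False  # not a confirmed critical state
    if now - status.critical_since < tolerance:
        return False  # tolerance period has not elapsed yet
    if now - status.last_check > max_check_age:
        return False  # result is stale, the check may no longer be running
    if confirm_required and not user_confirmed:
        return False  # still waiting for the manual confirmation
    return True


now = datetime(2015, 1, 1, 12, 0)
status = CheckStatus("CRITICAL", "HARD",
                     critical_since=now - timedelta(minutes=25),
                     last_check=now - timedelta(minutes=1))
print(should_start_shutdown(status, now, timedelta(minutes=20),
                            timedelta(minutes=5), True, False))  # False
```

In this example the tolerance has elapsed and the result is fresh, but the function still returns False because the required user confirmation is missing.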
Author
Patrick Zambelli
After graduating in Applied Computer Science at the Free University of Bolzano, I decided to start my professional career outside the province. With a bit of good timing and good luck I joined the booming IT department of Geox in the shoe district of Montebelluna, where I learned how a large IT infrastructure has to grow and adapt to quickly changing requirements. During this experience I also had the opportunity to travel the world while setting up the company's various production and retail areas. After joining Würth Phoenix, I started developing our monitoring solution NetEye. Today, in my position as Consultant and Project Manager, I work continuously on implementing our solutions to meet the expectations of our enterprise customers.