Since NetEye 4.11, the NetEye upgrade procedure has been automated with the neteye upgrade command. This command performs several checks to ensure that:
Then, if the checks are successful, the command updates the NetEye repository definitions to the newer version, installs the new RPMs, and upgrades NetEye’s yum groups.
Suppose Bob needs to upgrade his NetEye 4.17 cluster to the newer NetEye 4.18. To do so, he types in the following command:
[root@neteye-cluster1 ~]# neteye upgrade
This command triggers the execution of a set of Ansible tasks (collected in a playbook) that perform all the operations described above.
Logically, the Ansible tasks that are performed during an upgrade can be divided into the following groups:
When dealing with Ansible playbooks, it may be useful or even necessary to run only a subset of the tasks in a playbook instead of the whole playbook. Parts of a playbook can be marked with a tag, an attribute that can be set on an Ansible structure (a play, a role, or a task) so that specific tasks can be executed or skipped.
We can list the tags available in the neteye upgrade playbook with the following command:
[root@neteye-cluster1 ~]# neteye upgrade --list-tags
playbook: /usr/share/neteye/scripts/upgrade/playbooks/upgrade.yml
play #1 (localhost): localhost TAGS: [check_prerequisites]
TASK TAGS: [check_prerequisites]
play #2 (all): all TAGS: [check_version]
TASK TAGS: [check_version]
play #3 (all): all TAGS: [check_health]
TASK TAGS: [check_health]
play #4 (all): all TAGS: [check_update]
TASK TAGS: [check_update]
play #5 (nodes,!voting_nodes,!es_nodes): nodes,!voting_nodes,!es_nodes TAGS: [check_cluster]
TASK TAGS: [check_cluster]
play #6 (all): all TAGS: [check_upgrade]
TASK TAGS: [check_upgrade]
play #7 (all): all TAGS: [upgrade]
TASK TAGS: [upgrade]
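If the tag names are needed in a script, the listing above can be reduced to a plain list. The following is only a minimal sketch that assumes the output format shown above; the helper name list_task_tags is hypothetical, and the sample text is an abridged copy of the listing.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of NetEye): print the unique tag names found
# on the "TASK TAGS: [...]" lines of `--list-tags` output read from stdin,
# in order of first appearance.
list_task_tags() {
  grep 'TASK TAGS:' \
    | sed -e 's/.*TASK TAGS: \[\(.*\)\].*/\1/' \
    | tr ',' '\n' \
    | sed -e 's/^ *//' -e 's/ *$//' \
    | awk 'NF && !seen[$0]++'
}

# Sample input copied (abridged) from the listing above.
sample_output='play #1 (localhost): localhost TAGS: [check_prerequisites]
TASK TAGS: [check_prerequisites]
play #7 (all): all TAGS: [upgrade]
TASK TAGS: [upgrade]'

printf '%s\n' "$sample_output" | list_task_tags
```

On a real system the function would read from the command itself, for example neteye upgrade --list-tags | list_task_tags.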
As a wrapper for the ansible-playbook command, the neteye upgrade command accepts all the flags and switches that can normally be passed to that executable. This means that Bob can check the installed NetEye version via neteye upgrade without having to run the entire set of tasks.
To explicitly execute only the tasks that are marked by a specific tag, like check_version, Bob can run:
[root@neteye-cluster1 ~]# neteye upgrade --tags check_version
PLAY [localhost]
PLAY [all]
TASK [NetEye | retrieve NetEye version from daemon]
ok: [neteye-cluster3.neteyelocal]
ok: [neteye-cluster1.neteyelocal]
ok: [neteye-cluster2.neteyelocal]
TASK [NetEye | get current NetEye version]
ok: [neteye-cluster1.neteyelocal]
ok: [neteye-cluster3.neteyelocal]
ok: [neteye-cluster2.neteyelocal]
TASK [NetEye | get current NetEye minor version]
ok: [neteye-cluster2.neteyelocal]
ok: [neteye-cluster1.neteyelocal]
ok: [neteye-cluster3.neteyelocal]
TASK [NetEye | get next NetEye minor version]
ok: [neteye-cluster1.neteyelocal]
ok: [neteye-cluster2.neteyelocal]
ok: [neteye-cluster3.neteyelocal]
TASK [NetEye | get current NetEye major version]
ok: [neteye-cluster1.neteyelocal]
ok: [neteye-cluster2.neteyelocal]
ok: [neteye-cluster3.neteyelocal]
TASK [NetEye | get next NetEye major version]
ok: [neteye-cluster1.neteyelocal]
ok: [neteye-cluster2.neteyelocal]
ok: [neteye-cluster3.neteyelocal]
TASK [NetEye | retrieve installation status from daemon]
ok: [neteye-cluster2.neteyelocal]
ok: [neteye-cluster3.neteyelocal]
ok: [neteye-cluster1.neteyelocal]
TASK [NetEye | verify if installation status allows upgrading]
skipping: [neteye-cluster1.neteyelocal]
skipping: [neteye-cluster2.neteyelocal]
skipping: [neteye-cluster3.neteyelocal]
PLAY [all]
PLAY [all]
PLAY [nodes,!voting_nodes,!es_nodes]
PLAY [all]
PLAY [all]
PLAY RECAP ****************************************************************************************************************************************************************************************************************************
neteye-cluster1.neteyelocal : ok=7 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
neteye-cluster2.neteyelocal : ok=7 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
neteye-cluster3.neteyelocal : ok=7 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
As you can see from the output, Bob only executed the tasks marked with the tag check_version, skipping all others.
By increasing the verbosity of the command, Bob can also see which version is actually installed.
[root@neteye-cluster1 ~]# neteye upgrade --tags check_version -v
Using /etc/ansible/ansible.cfg as config file
....
TASK [NetEye | get current NetEye version]
ok: [neteye-cluster1.neteyelocal] => {
"ansible_facts": {
"neteye_version": "4.18"
},
"changed": false
}
ok: [neteye-cluster2.neteyelocal] => {
"ansible_facts": {
"neteye_version": "4.18"
},
"changed": false
}
ok: [neteye-cluster3.neteyelocal] => {
"ansible_facts": {
"neteye_version": "4.18"
},
"changed": false
}
It’s also possible to skip a specific group of tasks, for example because Bob doesn’t need to check his cluster, or because he already knows that his NetEye installation is fully updated. The tasks tagged with check_update can be skipped with the following command:
[root@neteye-cluster1 ~]# neteye upgrade --skip-tags check_update
Technically speaking, any tag can be skipped, but in the case of the neteye upgrade command, some task groups strictly depend on others. For example, the upgrade task group needs to know whether the installation can be upgraded, so it depends on the check_upgrade task group. The neteye upgrade command will fail if this dependency is broken at execution time.
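To catch a broken dependency before launching anything, a tag selection can be checked up front. The sketch below is an illustration only: the function names are hypothetical, it covers just the one dependency described above (upgrade needs check_upgrade), and it assumes the tag lists are comma-separated without spaces.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if tag $1 would run, given a --tags value ($2,
# empty meaning "all tags") and a --skip-tags value ($3, may be empty).
tag_selected() {
  local tag=$1 tags=$2 skip=$3
  case ",$skip," in *",$tag,"*) return 1 ;; esac
  [ -z "$tags" ] && return 0
  case ",$tags," in *",$tag,"*) return 0 ;; esac
  return 1
}

# Hypothetical pre-flight guard: succeed only if check_upgrade is kept
# whenever the upgrade task group would run.
#   $1: --tags value, $2: --skip-tags value
selection_is_consistent() {
  local tags=$1 skip=$2
  if tag_selected upgrade "$tags" "$skip"; then
    tag_selected check_upgrade "$tags" "$skip"
  fi
}

# Example: skipping check_upgrade while upgrade still runs is rejected.
selection_is_consistent "" "check_upgrade" || echo "upgrade requires check_upgrade"
```

With this guard, selections such as --tags check_version or --skip-tags check_update pass, while --tags upgrade on its own is rejected.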
Remember that the failure of a task terminates the automatic upgrade procedure, and the upgrade command will keep failing until all errors are manually fixed or the dependencies between tasks are satisfied.
As a NetEye user, I would like to be able to execute the neteye upgrade command again after a failure during the actual upgrade of the RPMs due to network or other issues.
Imagine the following scenario: Bob launched the neteye upgrade command to upgrade from NetEye 4.17 to NetEye 4.18, but he experienced an error during the upgrade of the RPMs because of a network problem. The upgrade command has a retry mechanism in place to guard against failures of the Ansible yum module during the upgrade, but sometimes the retries are not enough to avoid errors. The outcome of this kind of error is a system that is in between two versions: not yet fully upgraded, but no longer at version 4.17.
This unhealthy condition can be resolved in two ways. The first is to use yum to conclude the upgrade manually. However, this requires the user to run several commands, with the risk of causing further errors. The second is to reuse the neteye upgrade command.
In general, the neteye upgrade command does not support being run a second time after a failure during the RPM upgrade phase. This limitation, however, can be worked around by using the ansible-playbook switches --tags and --extra-vars. We’ve already seen the first switch above. The second switch, --extra-vars, allows us to pass variables to the neteye upgrade command, either to override default values or to initialize variables that are not defined.
Bob can now run the neteye upgrade command as follows:
[root@neteye-cluster1 ~]# neteye upgrade --tags check_upgrade,upgrade --extra-vars "neteye_version=4.17 neteye_next_version=4.18"
As you can see, Bob is limiting the execution to the task groups check_upgrade and upgrade, to only perform upgrade-related tasks and to avoid checking for conditions that cannot be satisfied in his current in-between-versions situation. Furthermore, Bob explicitly passes the variables neteye_version and neteye_next_version, to tell the command that the next version is NetEye 4.18, but that the system is still at 4.17. This command allows Bob to execute the upgrade again, without needing to manually type additional yum commands.
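Since the two version numbers are easy to mistype, the command line can be assembled by a small helper. This is only a sketch: the function name retry_upgrade_command is hypothetical, and the helper prints the command rather than running it, so that it can be reviewed first. The tags and variable names are the ones used in the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the command line for re-running the upgrade
# phase. $1 is the version the upgrade started from, $2 the target version.
retry_upgrade_command() {
  local current=$1 next=$2 re='^[0-9]+\.[0-9]+$'
  # Refuse anything that does not look like MAJOR.MINOR.
  if ! [[ $current =~ $re && $next =~ $re ]]; then
    echo "usage: retry_upgrade_command MAJOR.MINOR MAJOR.MINOR" >&2
    return 1
  fi
  printf 'neteye upgrade --tags check_upgrade,upgrade --extra-vars "neteye_version=%s neteye_next_version=%s"\n' \
    "$current" "$next"
}

# Bob's case: the system started at 4.17 and is being upgraded to 4.18.
retry_upgrade_command 4.17 4.18
```

Printing instead of executing is a deliberate choice here: in an in-between-versions state it is safer to inspect the exact command before launching it.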
As mentioned above, the underlying logic of the neteye upgrade command is written in Ansible, so Ansible command line options can also be passed to the command. For example, to increase the verbosity of the output of the upgrade command, Bob can type:
[root@neteye-cluster1 ~]# neteye upgrade -v
Increasing the number of v’s further increases the verbosity level (e.g., -vvv).
Correct execution of the neteye upgrade command will produce the following output:
PLAY RECAP ****************************************************************************************************************************************************************************************************************************
localhost : ok=1 changed=0 unreachable=0 failed=0
neteye-cluster1 : ok=14 changed=4 unreachable=0 failed=0
neteye-cluster2 : ok=10 changed=2 unreachable=0 failed=0
neteye-cluster3 : ok=10 changed=2 unreachable=0 failed=0
We can see a summary of the tasks executed, divided by host and by task outcome. For example, for the node neteye-cluster1 we have 14 tasks whose outcome was ok, 4 with changed, and 0 with unreachable or failed.
To be able to proceed with the upgrade, be sure that you don’t see any unreachable or failed items in the PLAY RECAP section of the command output!
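When the output is saved to a file or piped, this check can be automated. The following is a minimal sketch, assuming the recap format shown above; the function name recap_is_clean is hypothetical and not part of NetEye or Ansible.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read Ansible output on stdin and return non-zero if
# any host in the PLAY RECAP reports unreachable or failed tasks, or if no
# PLAY RECAP section is found at all.
recap_is_clean() {
  awk '
    /^PLAY RECAP/ { in_recap = 1; next }
    in_recap {
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^(unreachable|failed)=[0-9]+$/) {
          split($i, kv, "=")
          if (kv[2] + 0 > 0) { bad = 1; print "problem on " $1 ": " $i }
        }
      }
    }
    END { if (bad || !in_recap) exit 1 }
  '
}

# Sample recap with one failed task (illustrative data, not real output).
sample_recap='PLAY RECAP ****
neteye-cluster1 : ok=14 changed=4 unreachable=0 failed=0
neteye-cluster2 : ok=3 changed=0 unreachable=0 failed=1'

printf '%s\n' "$sample_recap" | recap_is_clean || echo "do not proceed"
```

A missing recap is treated as a failure on purpose, so that truncated output is never mistaken for a clean run.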
While executing the neteye upgrade command, Bob notices the following warnings:
PLAY [all] **
[WARNING]: Could not match supplied host pattern, ignoring: voting_nodes
[WARNING]: Could not match supplied host pattern, ignoring: es_nodes
A NetEye cluster can be composed of different types of nodes: regular nodes that are members of the Red Hat cluster, voting-only nodes, and elastic-only nodes, which are part of an Elasticsearch cluster but not of the Red Hat cluster.
By default, Ansible does not know which machines are part of Bob’s cluster. Static lists of hosts cannot be used, since they would not reflect the actual topology of the system and would require manual intervention whenever that topology changes.
To overcome this problem, we use a dynamic inventory script, which calculates the topology of Bob’s cluster at run-time. In the example above, Bob’s cluster is made of regular nodes only, and the command is just saying that it could not find any hosts labeled as a voting-only or elastic-only node, therefore those host patterns will be ignored. In conclusion, the warnings can be safely ignored by Bob.
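To make the idea concrete, here is a heavily simplified sketch of a dynamic inventory script. It is not NetEye's actual script: the host list is hard-coded for illustration, whereas a real script would discover it at run time. What it does follow is the general Ansible convention that an inventory script called with --list prints a JSON document of groups on standard output.

```shell
#!/usr/bin/env bash
# Hypothetical, simplified dynamic inventory (NOT NetEye's real script).
inventory_json() {
  # Regular cluster nodes; hard-coded here only for illustration.
  local nodes=("neteye-cluster1.neteyelocal" "neteye-cluster2.neteyelocal" "neteye-cluster3.neteyelocal")
  local hosts="" n
  for n in "${nodes[@]}"; do
    hosts+="${hosts:+, }\"$n\""
  done
  # No voting_nodes or es_nodes group is emitted because this cluster has
  # none, which is why those host patterns cannot be matched.
  printf '{"nodes": {"hosts": [%s]}, "_meta": {"hostvars": {}}}\n' "$hosts"
}

# Ansible invokes inventory scripts with --list (or --host <name>).
case "${1:-}" in
  --list) inventory_json ;;
  --host) echo '{}' ;;
esac
```

With an inventory like this one, a play targeting nodes,!voting_nodes,!es_nodes still runs on all three regular nodes; the two exclusions simply have nothing to exclude.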