24. 10. 2024 Reinhold Trocker Log Management, Log-SIEM

Categories of documents – create more namespaces within an agent’s environment

In the ever-evolving landscape of IT monitoring and management, the ability to efficiently handle multi-dimensional namespaces is crucial. Within NetEye, Log-SIEM (Elastic) covers a single namespace dimension out of the box, through the namespace of a data stream. This blog post deals with multi-dimensional namespaces and shows how they can be managed with NetEye’s Log-SIEM solution.

Understanding Multidimensional Namespaces
- Potential challenges in managing multidimensional namespaces
An example two-dimensional Namespace
Three simple solutions
Conclusion
These Solutions are Engineered by Humans

Understanding Multidimensional Namespaces

A namespace in IT is a container that holds a set of identifiers, ensuring that all names within it are unique. When we talk about multi-dimensional namespaces, we refer to namespaces that span multiple dimensions, such as time, location, and context. These are particularly common in large-scale IT environments where resources and services are distributed across various regions and time zones.

Potential challenges in managing multidimensional namespaces

Complexity: Managing namespaces across multiple dimensions can be complex
Scalability: As the number of dimensions increases, the scalability of the namespace management system becomes a critical concern
Consistency: Ensuring consistency across different dimensions and avoiding conflicts is a significant challenge

An example two-dimensional Namespace

Let’s assume that within the collected logs in your SIEM system, you classify the servers (document sources) into the following two dimensions:

Customer/tenant (customer A, customer B, customer C)
Environment (development, test, production)

We are used to creating Elastic Agent policies that use the namespace functionality for one of these dimensions, either:

For each customer/tenant
For each environment

Three simple solutions

Solution 1: A naming convention approach to two-dimensional namespaces

Continuing with the example above, we assume there are only 3 different tenants and 3 environments. You could use the namespace functionality of Elastic to map the two dimensions by just combining two strings as in the following table:

Elastic namespace (data_stream.namespace)	Development	Test	Production
customerA	customer_a.dev	customer_a.test	customer_a.prod
customerB	customer_b.dev	customer_b.test	customer_b.prod
customerC	customer_c.dev	customer_c.test	customer_c.prod

naming convention approach

Here you can see that there are 3×3=9 namespaces, and correspondingly 9 Elastic Agent Policies that you’ll have to create and maintain.
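As a minimal sketch, the combinations in the table can be enumerated with a small shell function. The function name list_namespaces is only illustrative, and the tenant and environment labels are the ones used in the table above.

```shell
#!/bin/sh
# Print one combined namespace per tenant/environment pair.
# Arguments: two space-separated lists (tenants, then environments).
list_namespaces() {
  for tenant in $1; do
    for environment in $2; do
      printf '%s.%s\n' "$tenant" "$environment"
    done
  done
}

# Three tenants and three environments give nine namespaces,
# and therefore nine Elastic Agent policies to maintain.
list_namespaces "customer_a customer_b customer_c" "dev test prod"
```

Every extra tenant adds one line per environment to this output, which is where the maintenance effort of this approach comes from.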

Pros:

For access rights (authorization), you can continue to assign indices to different user groups (e.g., the tenant customerA would get access to all logs-*-customer_a.*)
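To illustrate which data streams such a pattern covers, here is a small sketch that uses a shell glob as a stand-in for the index pattern. The glob only approximates how Elasticsearch evaluates wildcards, and the data stream names and the function name tenant_a_can_read are made up for this example.

```shell
#!/bin/sh
# Succeed when a data stream name falls under the pattern of tenant customer_a.
# The case glob mirrors the index pattern logs-*-customer_a.* from the text.
tenant_a_can_read() {
  case "$1" in
    logs-*-customer_a.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical data stream names following logs-<dataset>-<namespace>.
for stream in \
  logs-system.syslog-customer_a.dev \
  logs-system.syslog-customer_a.prod \
  logs-system.syslog-customer_b.prod
do
  if tenant_a_can_read "$stream"; then
    echo "$stream: visible to customer A"
  else
    echo "$stream: hidden from customer A"
  fi
done
```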

Cons:

With growing numbers this quickly becomes a scalability problem for manual policy management: if you acquire 10 more customers, you end up with 13×3=39 namespaces and just as many Elastic Agent policies.

Imagine that the dev and test environments in the above example produce only very few logs. In that case two thirds of the indices in our Elastic infrastructure would be nearly empty, yet each of them still consumes part of the limited Elastic resources.

Solution 2: Multidimensional namespaces with environment variables

In the example above, the two categories/dimensions can be seen as two properties of a SIEM document. Setting one field per (additional) category in the SIEM document is the easiest way to save this information.

The question is: how can we manage the additional category information without diminishing the Elastic namespace functionality?

In our example you’d probably use the Elastic namespace functionality for the customer/tenant information like this:

Elastic namespace (data_stream.namespace)	Any environment
customerA	customer_a
customerB	customer_b
customerC	customer_c

The missing environment information then has to be “injected” in some other way.

Curious how this works? Please see the next paragraph 😉

Solution 3: Practical implementation

The missing environment information can be “injected” using environment variables on the operating system (server) which is running an Elastic Agent instance. Such servers usually only reside in one environment, in other words: a server is either a development server or a test server or a production server.

The following explains how to set the environment variable EA_ENVIRONMENT and how to add its content to incoming documents. It uses the add-custom-fields feature (Elastic 8.15+) to set the document field service.environment to the value of EA_ENVIRONMENT.

Set Windows environment variable on Elastic Agent of log source

On each log source computer on which Elastic Agent is running, set the environment variable in one of two places:

In the system environment variables
Or in the specific Elastic Agent service:

Store the new category information in a new multi-string value (REG_MULTI_SZ) named “Environment” within the registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Elastic Agent, with one NAME=value entry per line (for example EA_ENVIRONMENT=test).

Add Linux environment variable on Elastic Agent of log source

A tgz-deployed Elastic Agent on Linux usually reads its environment variables from /etc/sysconfig/elastic-agent. You can use the following command (run as root) to add an entry to that file:

# Create /etc/sysconfig if it is missing, then append the variable definition
mkdir -p /etc/sysconfig && echo 'EA_ENVIRONMENT=test' >> /etc/sysconfig/elastic-agent
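The command above appends a new line on every run, so running it twice leaves two definitions in the file. The following is a minimal sketch of a repeatable variant. The helper name set_agent_env is hypothetical, and the sketch assumes the file contains plain KEY=VALUE lines.

```shell
#!/bin/sh
# Set KEY=VALUE in an environment file, replacing an existing KEY line
# instead of appending a duplicate.
set_agent_env() {
  env_file=$1
  key=$2
  value=$3
  mkdir -p "$(dirname "$env_file")"
  touch "$env_file"
  # Keep every line that does not define KEY, then add the new definition.
  grep -v "^${key}=" "$env_file" > "${env_file}.tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "${env_file}.tmp"
  mv "${env_file}.tmp" "$env_file"
}

# Example call (run as root on the log source):
# set_agent_env /etc/sysconfig/elastic-agent EA_ENVIRONMENT test
```

Because the file is rewritten through a temporary copy, its ownership and permissions may need checking afterwards. The Elastic Agent service will most likely have to be restarted before it sees the new value.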

Configure the Elastic Agent policy

For every (one-dimensional) Elastic Agent policy and for every piece of category information that you stored on the agent side, use the add-custom-fields feature to set an additional field with the stored content:

Choose your Elastic Agent policy
Go to the Settings tab
Click on the “Add another field” button
Add the field name you want to find in your document
Add the desired (previously set) environment variable (in this example ${env.EA_ENVIRONMENT|''})
To enable the env provider, open the output that the agent policy uses as “Output for integrations” and, in the section “Advanced YAML configuration”, set:

providers:
  env:
    enabled: true

Example Result

Notes

This solution relies on the fact that every Elastic Agent instance belongs to exactly one category of the additional dimension (in this example “environment”).

If you want additional dimensions, just repeat two steps for each of them: add an environment variable on the agent side, and add a matching field with “Add another field” in the Elastic Agent policies.

The |'' part of the variable reference only makes sure that an unset environment variable is interpreted as an empty string; see alternative variables and constants.
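The fallback behaves much like a default value in shell parameter expansion. The sketch below is only an analogy in shell and does not show how Elastic Agent resolves variables internally. The function name resolve_environment is made up for illustration.

```shell
#!/bin/sh
# Shell analogy only: ${EA_ENVIRONMENT:-} plays the role of
# ${env.EA_ENVIRONMENT|''} in the agent policy.
resolve_environment() {
  printf '%s' "${EA_ENVIRONMENT:-}"
}

# Variable not set: the reference resolves to an empty string instead of failing.
unset EA_ENVIRONMENT
echo "unset: '$(resolve_environment)'"

# Variable set: the reference resolves to its content.
EA_ENVIRONMENT=test
echo "set:   '$(resolve_environment)'"
```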

If you have roles that should only have access to a subset of the new namespace, please refer to Document level security
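As a rough sketch of what such a restriction might look like, the following function prints a role body that combines an index pattern for one tenant namespace with a document-level query on service.environment. The function name role_body, the role scope and the variables ES_URL and ES_USER are assumptions for illustration, and the term query presumes that service.environment is mapped as a keyword field.

```shell
#!/bin/sh
# Print a role body that limits read access to one environment of one tenant.
# Arguments: tenant namespace, environment value (both illustrative).
role_body() {
  namespace=$1
  environment=$2
  cat <<EOF
{
  "indices": [
    {
      "names": ["logs-*-${namespace}"],
      "privileges": ["read"],
      "query": {"term": {"service.environment": "${environment}"}}
    }
  ]
}
EOF
}

role_body customer_a production

# The body could then be sent to the Elasticsearch role API, for example:
# role_body customer_a production | curl -u "$ES_USER" \
#   -H 'Content-Type: application/json' \
#   -X PUT "$ES_URL/_security/role/customer_a_production" --data-binary @-
```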

Conclusion

Multi-dimensional namespaces can be handled easily by setting environment variables on the hosts where Elastic Agent is running, and by referencing those variables in the Elastic Agent policy that the agent uses.

These Solutions are Engineered by Humans

Did you find this article interesting? Does it match your skill set? Our customers often present us with problems that need customized solutions. In fact, we’re currently hiring for roles just like this and others here at Würth Phoenix.