As you have surely read from the release notes of NetEye 4.27, we have integrated ClickHouse to be able to use the historical flows and alerts feature of ntopng.
Directly from the official website:
ClickHouse is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP). ClickHouse’s performance exceeds all other column-oriented database management systems. It processes billions of rows and tens of gigabytes of data per server per second.
ClickHouse runs in NetEye as a local service connected with other nodes in the NetEye cluster to ensure a high availability of service.
With the NetEye 4.27 upgrade, ClickHouse is preconfigured with one shard, and all data is replicated on each standard NetEye node. The voting-only node in this case does not hold data but only votes when needed to maintain the quorum.
In /neteye/local/clickhouse-server/
you can find the configuration, logs, and data related to clickhouse-server. Should it become necessary for a particular use case to make changes to the configuration, we recommend creating a new file in /neteye/local/clickhouse-server/conf/config.d
. Files in this path are loaded in lexicographic order.
The easiest way to connect from the CLI to ClickHouse is to use the clickhouse-client
command, which is already preconfigured to use a TLS connection:
[root@node01 ~]# clickhouse-client ClickHouse client version 22.9.2.7 (official build). Connecting to clickhouse-server.neteyelocal:9440 as user default. Connected to ClickHouse server version 22.9.2 revision 54460. node02 :)
We’ve now successfully connected to clickhouse-server.neteyelocal
, where an instance of Nginx is listening in load balancer mode, which then forwards us to clickhouse-server on node02.
Consider the example of a customization that needs to insert data into ClickHouse. The first thing to do is create a dedicated database.
As I mentioned above, NetEye provides a ClickHouse HA cluster:
node02 :) show clusters SHOW CLUSTERS Query id: 46cf26e5-61ac-4d96-854e-91950263df43 ┌─cluster─────────────────────────────────────────┐ │ neteye_ha_cluster │ └─────────────────────────────────────────────────┘ 1 rows in set. Elapsed: 0.002 sec.
so you will need to create the database within this cluster
node01 :) CREATE DATABASE mycustomization ON CLUSTER neteye_ha_cluster; CREATE DATABASE mycustomization ON CLUSTER neteye_ha_cluster Query id: f2a6d412-62dc-48b3-94ef-35c8962a77ed ┌─host─────────────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐ │ neteye02.neteyelocal │ 9440 │ 0 │ │ 1 │ 0 │ │ neteye01.neteyelocal │ 9440 │ 0 │ │ 0 │ 0 │ └──────────────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘ 2 rows in set. Elapsed: 0.114 sec.
as you can see from the output of the command, the query was executed on both standard NetEye nodes.
Now it’s time to create a new table:
node01 :) use mycustomization; USE mycustomization Query id: 65acaa86-07d6-4878-880f-da7494f40aae Ok. 0 rows in set. Elapsed: 0.002 sec. CREATE TABLE mylogs ON CLUSTER neteye_ha_cluster ( `date` DateTime, `user` String, `message` String ) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/mycustomization/mylogs', '{replica}') PARTITION BY toYYYYMMDD(date) ORDER BY date Query id: 25d929d8-139d-4943-9932-03377b1b3484 ┌─host─────────────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐ │ neteye02.neteyelocal │ 9440 │ 0 │ │ 1 │ 0 │ │ neteye01.neteyelocal │ 9440 │ 0 │ │ 0 │ 0 │ └──────────────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘ 2 rows in set. Elapsed: 0.115 sec.
When creating a table you have to pay attention to two things: create it within the cluster via the syntax ON CLUSTER neteye_ha_cluster
and the type of engine. To provide data redundancy on each node, the engine type must be ReplicatedMergeTree and must have the following syntax:ReplicatedMergeTree('/clickhouse/tables/{shard}/<DB_NAME>/<TABLE_NAME>', '{replica}')
The table is ready and you can proceed to insert some initial data:
INSERT INTO mylogs (date, user, message) VALUES ('2022-11-13 00:00:01', 'root', 'User logged in'); Query id: c1507766-4edc-4163-97ff-60ee8a012233 Ok. 1 row in set. Elapsed: 0.016 sec.
And retrieve the inserted data:
SELECT * FROM mylogs Query id: 0c67b927-0a89-4027-ae8e-e40c7cc7245b ┌────────────────date─┬─user─┬─message────────┐ │ 2022-11-13 00:00:01 │ root │ User logged in │ └─────────────────────┴──────┴────────────────┘ 1 rows in set. Elapsed: 0.002 sec.
Thanks to the ReplicatedMergeTree
engine type, when fetching via a select you don’t need to worry about which node they were inserted into since they are present on both cluster nodes.
In this short article, we looked at the easiest way to take advantage of the high availability ClickHouse configuration present in NetEye. For advanced configurations and special use cases I recommend the official documentation as a starting point.
Did you find this article interesting? Does it match your skill set? Programming is at the heart of how we develop customized solutions. In fact, we’re currently hiring for roles just like this and others here at Würth Phoenix.