Skip to content

This page shows you how to use ClickHouse as an analytical store for Tenzir pipelines: write normalized security telemetry to ClickHouse with to_clickhouse and read tables or SQL query results back into Tenzir with from_clickhouse.

ClickHouse is an open-source analytical database built for fast SQL over large event volumes. Tenzir sits in front of ClickHouse as a programmable security data pipeline: it collects telemetry, parses source formats, maps events to OCSF, enriches or reduces volume, and writes structured tables that analysts and detection systems can query.

With from_clickhouse, ClickHouse can also feed later pipeline stages. Use this path to backfill historical events, export query results, materialize detection outputs, or route subsets to another tool without adding another ingestion path.

Databaseclickhouse-cppTableclickhouse-cppclickhouse-cppclickhouse-cpp

Use the path that matches the role ClickHouse plays in your deployment:

GoalClickHouse roleTenzir building blocks
Build a security data lakeDestination for OCSF-normalized telemetryocsf::cast, to_clickhouse
Keep a schema you manage in ClickHouseExisting MergeTree table with explicit types, TTLs, projections, or materialized viewsto_clickhouse with mode="append"
Query retained telemetrySource table or SQL query resultfrom_clickhouse
Run lake-backed hunts or detectionsSQL engine for filtering, grouping, and sorting large event setsfrom_clickhouse, where, publish
Export or fan out query resultsQuery result source for downstream tools or object storagefrom_clickhouse, output operators such as to_s3 or to_http
Inspect available tables and schemasSHOW, DESCRIBE, or system.* metadata queriesfrom_clickhouse with sql=...

Tenzir connects to ClickHouse through the native ClickHouse TCP protocol using the official clickhouse-cpp client library. The operators don’t use ClickHouse’s HTTP interface or a local IPC mechanism.

Use either a ClickHouse URI or explicit connection arguments:

from_clickhouse uri="clickhouse://default:secret@localhost:9000/security",
table="events",
tls=false

You can also pass the connection details as separate arguments:

from_clickhouse table="security.events",
host="localhost",
port=9000,
user="default",
password=secret("CLICKHOUSE_PASSWORD"),
tls=false

Use the same connection arguments with to_clickhouse. If a URI selects a database and table is unqualified, Tenzir uses that database. In create modes, to_clickhouse also creates the selected database if it doesn’t exist.

If you need a local or self-managed ClickHouse deployment, start with the ClickHouse OSS quick start. Tenzir connects to a self-managed server the same way, using the native host and port you configure in from_clickhouse and to_clickhouse.

These examples assume that ClickHouse runs on the same host as Tenzir and allows non-TLS connections, which is why the pipelines use tls=false. For ClickHouse Cloud, use your service endpoint and credentials, and keep TLS enabled.

Use this path when ClickHouse is your security data lake. Tenzir can create tables from incoming events, and ocsf::cast keeps the ClickHouse schema stable by casting events to the selected OCSF class, encoding variant fields, and filling missing fields with typed nulls.

from_file "ocsf_network_activity.json"
ocsf::cast encode_variants=true, null_fill=true
to_clickhouse table=f"ocsf.{class_name.replace(" ","_")}",
primary=time,
tls=false

When creating a table, the to_clickhouse operator uses the first event to determine the schema. Make sure the first event doesn’t contain untyped nulls or empty records. ocsf::cast helps because it gives expected OCSF fields explicit types before the table is created.

After the data lands, query it directly in ClickHouse:

SELECT
dst_endpoint.ip,
count() AS events,
median(traffic.bytes_in) AS median_bytes_in
FROM ocsf.Network_Activity
WHERE time > now() - INTERVAL 1 DAY
GROUP BY dst_endpoint.ip
ORDER BY events DESC
LIMIT 20;

Create the ClickHouse table first when you need explicit types, table engines, TTL policies, projections, materialized views, or partitioning that should stay under ClickHouse control.

  1. Create the table in ClickHouse:

    CREATE DATABASE IF NOT EXISTS security;
    CREATE TABLE security.alerts (
    time DateTime64(9),
    rule_name String,
    severity Int64,
    src_ip Nullable(IPv6),
    message String
    ) ENGINE = MergeTree()
    ORDER BY (time, rule_name);
  2. Ingest data from Tenzir:

    from {
    time: 2026-06-01T12:00:00Z,
    rule_name: "failed-login-burst",
    severity: 5,
    src_ip: 10.0.1.12,
    message: "More than 20 failed logons from one source in 5 minutes",
    }
    to_clickhouse table="security.alerts", mode="append", tls=false

    mode="append" makes the pipeline fail if the table doesn’t already exist, so ClickHouse remains the source of truth for the schema.

Use table mode when you want Tenzir to read a ClickHouse table as structured events. Filters after from_clickhouse can be pushed into ClickHouse in table mode, which reduces the amount of data transferred into the pipeline.

from_clickhouse table="ocsf.Network_Activity", tls=false
where time > now() - 1d and severity_id >= 3
publish "clickhouse-network-activity"

Use SQL mode when ClickHouse should project, aggregate, sort, or otherwise shape the result before Tenzir receives it:

from_clickhouse sql="SELECT time, host, severity, message FROM security.events WHERE severity >= 3 ORDER BY time DESC",
tls=false
publish "clickhouse-findings"

Use SQL mode for ClickHouse metadata queries:

from_clickhouse sql="SHOW TABLES FROM ocsf", tls=false

To inspect columns for a specific table, run DESCRIBE TABLE:

from_clickhouse sql="DESCRIBE TABLE ocsf.Network_Activity", tls=false

Because from_clickhouse turns query results into regular Tenzir events, you can route them to any supported destination. For example, export a seven-day slice to S3 as Parquet:

from_clickhouse sql="SELECT * FROM ocsf.Network_Activity WHERE time >= now() - INTERVAL 7 DAY",
tls=false
to_s3 "s3://security-exports/clickhouse/network_activity_{uuid}.parquet" {
write_parquet
}

Last updated: