This guide shows you how to create integration tests with the `tenzir-test`
framework. You'll set up a standalone repository, write test scenarios, and
record reference output to verify that your pipelines work as expected.
## Prerequisites

- Python 3.12 or newer.
- `uv` installed locally.
- A working installation of Tenzir. The harness automatically detects
  `tenzir` and `tenzir-node` using this precedence:
  1. `TENZIR_BINARY`/`TENZIR_NODE_BINARY` environment variables
  2. Local binary on `PATH`
  3. Fallback to `uvx tenzir`/`uvx --from tenzir tenzir-node` when `uv` is installed

Most users need no configuration because the harness uses `uvx` to fetch
Tenzir on demand.
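
If you need to pin specific binaries, for example a local development build,
set the environment variables from the precedence list above before invoking
the harness. The paths below are placeholders:

```sh
# Point the harness at a local development build (paths are placeholders).
export TENZIR_BINARY=/opt/tenzir/bin/tenzir
export TENZIR_NODE_BINARY=/opt/tenzir/bin/tenzir-node
```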
## Step 1: Scaffold a project

Create a clean directory that holds nothing but integration tests and their
shared assets. The harness treats this directory as the project root.

```sh
mkdir demo
cd demo
```

## Step 2: Check the harness

Run the harness through `uvx` to make sure the tooling works without setting
up a virtual environment. `uvx` downloads and caches the latest release when
needed.

```sh
uvx tenzir-test --help
```

If the command succeeds, you're ready to add tests.
## Step 3: Add shared data

Populate `inputs/` with artifacts that tests will read. The example below
stores a short NDJSON dataset that models a few alerts.

```json
{"id": 1, "severity": 5, "message": "Disk usage above 90%"}
{"id": 2, "severity": 2, "message": "Routine backup completed"}
{"id": 3, "severity": 7, "message": "Authentication failure on admin"}
```

Save the snippet as `inputs/alerts.ndjson`.
## Step 4: Author a pipeline test

Create your first scenario under `tests/`. The harness discovers tests
recursively, so you can organize them by feature or risk level. Here, you
create `tests/high-severity.tql`:

```tql
from_file f"{env("TENZIR_INPUTS")}/alerts.ndjson"
where severity >= 5
project id, message
sort id
```

The harness exposes the shared `inputs/` directory through the
`TENZIR_INPUTS` environment variable, which is how the pipeline locates the
dataset. It also injects a unique scratch directory into `TENZIR_TMP_DIR`
while each test executes. Use the scratch directory for transient files you
do not want under version control; pass `--keep` when you run `tenzir-test`
if you need to inspect the generated artifacts afterwards.
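
For example, to run a single test and keep its scratch directory around for
inspection:

```sh
# --keep preserves the per-test scratch directory (TENZIR_TMP_DIR).
uvx tenzir-test --keep tests/high-severity.tql
```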
### Stream raw output while iterating

During early iterations you may want to inspect command output before you
record reference artifacts. Enable passthrough mode via `--passthrough`
(`-p`) to pipe the `tenzir` process output directly to your terminal while
the harness still provisions fixtures and environment variables:

```sh
uvx tenzir-test --passthrough tests/high-severity.tql
```

The harness enforces the exit code but skips comparisons, letting you decide
when to capture the baseline with `--update`.
## Step 5: Capture the reference output

Run the harness once in update mode to execute the pipeline and write the
expected output next to the test.

```sh
uvx tenzir-test --update
```

The command produces `tests/high-severity.txt` with the captured stdout:

```json
{"id":1,"message":"Disk usage above 90%"}
{"id":3,"message":"Authentication failure on admin"}
```

Review the reference file, adjust the pipeline if needed, and rerun
`--update` until you are satisfied with the results. Commit the `.tql` test
and `.txt` baseline together so future runs can compare against known-good
output.
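
If the project lives in Git, commit both files in one go:

```sh
git add tests/high-severity.tql tests/high-severity.txt
git commit -m "Add high-severity alert test with baseline"
```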
## Step 6: Rerun the tests

After you check in the reference output, execute the tests without
`--update`. The harness verifies that the actual output matches the baseline.

```sh
uvx tenzir-test
```

When the output diverges, the harness prints a diff and returns a non-zero
exit code. By default (quiet mode) the harness only shows failures, which
keeps large test runs readable. Add `--verbose` (`-v`) to see passing and
skipped tests as they complete. Use `--debug` to see comparison targets
alongside the usual harness diagnostics; debug mode automatically enables
verbose output so you see all test results. For CI-only visibility you can
set `TENZIR_TEST_DEBUG=1`. Add `--summary` together with `--verbose` when you
also want the tabular breakdown and failure tree at the end.
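
Combining the flags mentioned above, a chatty local run looks like this:

```sh
# Show every test result as it completes and end with the tabular summary.
uvx tenzir-test --verbose --summary
```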
### Retry flaky tests (sparingly)

If a scenario fails intermittently, add a `retry` entry to its frontmatter so
the harness reruns it before flagging a failure. The value is the total
attempt budget:

```yaml
---
retry: 3
---
```

With `retry: 3`, the test runs up to three times. Intermediate attempts stay
quiet; the final result line includes `attempts=3/3` (or the actual number on
a success). Use this as a guardrail while you investigate the underlying
flake and keep the budget small to avoid masking issues.
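
In a test file, the frontmatter sits above the pipeline. A sketch with a
hypothetical flaky scenario:

```tql
---
retry: 3
---

// Hypothetical scenario that occasionally races against an external resource.
from_file f"{env("TENZIR_INPUTS")}/alerts.ndjson"
where severity >= 5
```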
### Handle non-deterministic output

Some tests produce output in an unpredictable order, for example when
querying hash-based aggregations or parallel operations. Instead of retrying
until the order happens to match, use pre-compare transforms to normalize
both sides before comparison:

```yaml
---
pre-compare: sort
---
```

The `sort` transform sorts output lines lexicographically before comparing
them against the baseline. This lets you capture the raw output with
`--update` while still verifying correctness regardless of order.

```bash
#!/usr/bin/env bash
# pre-compare: sort

# Output order depends on filesystem enumeration
ls /tmp/*.log 2>/dev/null || echo "no logs"
```

Transforms only affect comparison. The baseline file stores the original
untransformed output, keeping it human-readable. See pre-compare transforms
in the reference documentation for the full list of available transforms.
## Run multiple projects together

Large organisations often split tests across several repositories but still
want an aggregated run. List additional project directories after `--root`
and add `--all-projects` to execute the root alongside its satellites under a
single invocation. Those positional paths form the selection; here it only
names the satellite project:

```sh
uvx tenzir-test --root example-project --all-projects ../example-satellite
```

The root project (`example-project` above) supplies the shared fixtures and
runners. Satellites inherit those definitions, can register their own
helpers, and run their tests in isolation. Because the selection only listed
the satellite, `--all-projects` keeps the root in scope. The CLI prints a
compact summary showing how many tests each project contributes and which
runners are involved. Add `--verbose` to see individual test results as they
complete, and combine it with `--summary` for the tabular breakdown and
detailed failure listing after each project.
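
A layout like the following (directory names are illustrative) matches the
invocation above:

```
example-project/        # root: supplies shared fixtures and runners
├── fixtures/
├── runners/
└── tests/
example-satellite/      # satellite: inherits the root's definitions
└── tests/
```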
## Load packages explicitly

When your tests need packages, point `--package-dirs` at the package
directories (or a parent that contains them). The flag is repeatable and
supports comma-separated lists:

```sh
uvx tenzir-test --package-dirs example-library
```

Here `example-library` contains multiple packages, so the harness loads them
all and makes sibling packages visible for cross-imports. You can also
declare package directories in a directory `test.yaml` via `package-dirs:`;
those entries merge with `--package-dirs`.
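
A sketch of such a directory-level declaration, assuming list-valued entries
and a placeholder relative path:

```yaml
# Entries merge with any --package-dirs flags passed on the command line.
package-dirs:
  - ../example-library
```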
## Step 7: Introduce a fixture

Fixtures let you bootstrap external resources and expose their configuration
through environment variables. Add a simple node-driven test to exercise a
running Tenzir node.

Create `tests/node/ping.tql` with the following contents:

```tql
---
fixtures: [node]
timeout: 10
---

// Get the version from the running node.
remote {
  version
}
```

Because the test needs a node to run, include the built-in `node` fixture and
give it a reasonable timeout. The fixture starts `tenzir-node`, injects
connection details into the environment, and tears the process down after the
run. Capture the baseline via `--update` just like before.

The fixture launches `tenzir-node` from the directory that owns the test
file, so a `tenzir-node.yaml` placed next to the scenario can refer to files
with relative paths (for example `../inputs/alerts.ndjson`).
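
The resulting layout looks like this (the configuration file is optional):

```
tests/node/
├── ping.tql           # the fixture-backed test
├── ping.txt           # baseline captured via --update
└── tenzir-node.yaml   # optional node config, resolved relative to this directory
```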
### Reuse fixtures with suites

When several tests should share the same fixture lifecycle, promote their
directory to a suite. Add `suite:` to the directory's `test.yaml` and keep
the fixture selection alongside the other defaults:

```yaml
suite: smoke-http
fixtures: [http]
timeout: 45
retry: 2
```

Key behaviour:

- Suites are directory-scoped. Once a `test.yaml` declares `suite`, every
  test in that directory and its subdirectories joins automatically. Move the
  scenarios that should remain independent into a sibling directory.
- Suites run sequentially on a single worker. The harness activates the
  shared fixtures once, executes members in lexicographic order of their
  relative paths, and tears the fixtures down afterwards. Other suites (and
  standalone tests) still run in parallel when `--jobs` allows it.
- Per-test frontmatter cannot introduce `suite`, and suite members may not
  define their own `fixtures` or `retry`. Keep those policies in the
  directory defaults so every member agrees on the shared lifecycle. Outside
  a suite, frontmatter can still set `fixtures`, `retry`, or `timeout` as
  before.
- Tests can override other keys (for example `inputs:` or additional
  metadata) on a per-file basis when necessary.

Run the `http` directory that defines the suite when you iterate on it:

```sh
uvx tenzir-test tests/http
```

Selecting a single file inside that suite fails fast with a descriptive
error, which keeps the fixture lifecycle predictable and prevents partial
runs from leaving shared state behind.
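
A suite directory might look like this (file names are illustrative; members
execute in lexicographic order of their relative paths):

```
tests/http/
├── test.yaml        # declares suite: smoke-http and the shared defaults
├── 01-start.tql
├── 02-request.tql
└── 03-teardown.tql
```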
### Drive fixtures manually

When you switch to the Python runner you can drive fixtures manually. The
controller API makes it easy to start, stop, or even crash the same node
fixture inside a single test:

```python
# runner: python
# fixtures: [node]

import signal

# Context-manager style: `with` automatically calls `start()` and `stop()` on
# the fixture.
with acquire_fixture("node") as node:
    tenzir = Executor.from_env(node.env)
    tenzir.run("remote { version }")  # talk to the running node

# Without the context manager, you need to call `start()` and `stop()`
# manually.
node.start()
Executor.from_env(node.env).run("version")
node.stop()
```

This imperative style complements the declarative `fixtures: [node]` flow and
is especially useful for fault-injection scenarios. The harness preloads
helpers like `acquire_fixture`, `Executor`, and `fixtures()`, so Python-mode
tests can call them directly.
When you restart the same controller, the node keeps using the state and
cache directories it created during the first `start()`. Those paths
(exported via `TENZIR_NODE_STATE_DIRECTORY` and
`TENZIR_NODE_CACHE_DIRECTORY`) live inside the test's scratch directory by
default and are cleaned up automatically when the controller goes out of
scope. Acquire a fresh controller when you need a brand new workspace.
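
A minimal sketch of that restart behaviour, reusing the `node` controller
from the example above and assuming `node.env` exposes the exported paths:

```python
node.start()
state_dir = node.env["TENZIR_NODE_STATE_DIRECTORY"]  # lives in the scratch dir
node.stop()

node.start()  # a restart reuses the same state and cache directories
assert node.env["TENZIR_NODE_STATE_DIRECTORY"] == state_dir
node.stop()
```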
## Step 8: Organize defaults with test.yaml

As suites grow, you can extract shared configuration into directory-level
defaults. Place a `tests/node/test.yaml` file with convenient settings:

```yaml
fixtures: [node]
timeout: 120
# Optional: reuse datasets that live in tests/data/ instead of the project root.
inputs: ../data
```

The harness merges this mapping into every test under `tests/node/`. Relative
paths resolve against the directory that owns the YAML file, so
`inputs: ../data` points at `tests/data/`. Individual files still override
keys in their frontmatter when necessary.
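
For example, an individual scenario could raise just its timeout while
inheriting the rest of the directory defaults (sketch):

```tql
---
timeout: 300
---

// Still uses fixtures: [node] from tests/node/test.yaml.
remote {
  version
}
```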
## Step 9: Automate runs

Once the suite passes locally, integrate it into your CI pipeline. Configure
the job to install Python 3.12, install `tenzir-test`, provision or download
the required Tenzir binaries, and execute `uvx tenzir-test --root .`. For
reproducible results, keep your datasets small and deterministic, and prefer
fixtures that wipe state between runs.
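
A minimal sketch of such a job for GitHub Actions (a hypothetical workflow;
adapt it to your CI system):

```yaml
name: integration-tests
on: [push]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Installs uv/uvx, which in turn fetches tenzir-test and Tenzir on demand.
      - uses: astral-sh/setup-uv@v5
        with:
          python-version: "3.12"
      - run: uvx tenzir-test --root .
```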
## Next steps

You now have a project that owns its inputs, tests, fixtures, and baselines.
From here you can:

- Add custom runners under `runners/` when you need specialized logic around
  `tenzir` invocations.
- Build Python fixtures that publish or verify data through the helper APIs
  in `tenzir_test.fixtures`.
- Explore coverage collection by passing `--coverage` to the harness.

Refer back to the test framework reference whenever you need deeper details
about runners, fixtures, or configuration knobs.