When I look at my traces, or when I’m frustrated by the quantity of metric events, I want to know how that data got to me. That’s why the first thing I add to any OpenTelemetry Collector configuration is: attribution.
This came up in the End User Discussion Group yesterday; two people said they do this too.
Here’s a processor configuration to add an attribute to every trace span, saying what collector it went through:
```yaml
transform/labelme:
  trace_statements:
    - context: resource
      statements:
        - set(attributes["collector.collector"], "${COLLECTOR_NAME}")
```
Then I add that to the pipeline configuration:
```yaml
traces:
  receivers: [otlp]
  processors: [transform/labelme, batch]
  exporters: [otlp]
```
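Assembled into a complete collector configuration, the two pieces fit together roughly like this (the OTLP endpoint and receiver choices here are illustrative, not from the original):

```yaml
receivers:
  otlp:
    protocols:
      grpc:

processors:
  batch:
  transform/labelme:
    trace_statements:
      - context: resource
        statements:
          - set(attributes["collector.collector"], "${COLLECTOR_NAME}")

exporters:
  otlp:
    endpoint: collector.example.com:4317  # hypothetical endpoint

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [transform/labelme, batch]
      exporters: [otlp]
```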
Next, I often want to know which receiver a metric came from; I can’t graph a metric properly until I know how it was collected. To add attribution for the receiver, I have to split the configuration into multiple pipelines, so that each one can have its own transform processor.
For an example, check this processor configuration at otelbin.io. Also available as a gist.
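The shape of that configuration is one metrics pipeline per receiver, each with its own transform processor. A sketch of the pipelines section, with illustrative receiver and processor names:

```yaml
service:
  pipelines:
    metrics/kubeletstats:
      receivers: [kubeletstats]
      processors: [transform/kubeletstats, batch]
      exporters: [otlp]
    metrics/prometheus:
      receivers: [prometheus]
      processors: [transform/prometheus, batch]
      exporters: [otlp]
```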
Each processor definition looks like this:
```yaml
transform/kubeletstats:
  metric_statements:
    - context: resource
      statements:
        - set(attributes["meta.signal_type"], "metrics") where attributes["meta.signal_type"] == nil
        - set(attributes["collector.receiver"], "kubeletstats")
        - set(attributes["collector.collector"], "daemonset-opentelemetry-collector") where attributes["collector.collector"] == nil
```
Here, I define three attributes: the collector, the receiver, and the signal type. Why the signal type? Because traces automatically get “trace” in this field and logs get “log,” and I want metrics to be consistent with them. The collector lets me make it so.
Creating a pipeline per receiver means repetition: when you add a metrics processor, you have to add it in three places (in this example). Dan Nelson uses Helm templates to generate their collector configuration, abstracting over this duplication.
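One way to do that kind of templating, as a minimal sketch: a Helm template that stamps out one transform processor per receiver. The receiver list and processor bodies here are illustrative, not Dan’s actual chart:

```yaml
# templates/collector-config.yaml (fragment)
processors:
  {{- range $receiver := list "kubeletstats" "prometheus" "hostmetrics" }}
  transform/{{ $receiver }}:
    metric_statements:
      - context: resource
        statements:
          - set(attributes["collector.receiver"], "{{ $receiver }}")
  {{- end }}
```

The same range over receivers can also generate the matching `metrics/<receiver>` pipeline entries, so adding a receiver becomes a one-line change to the list instead of edits in three places.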