[Task]: Make the go.d Prometheus collector profile-based (curated templates + selection) #22637

@ilyam8

Problem / root cause

The generic collector auto-generates charts from raw metric names, so known exporters get uncurated dashboards, and there is no way to ship curated, exporter-specific chart templates. The
collector currently opts out of taxonomy (collector/prometheus/taxonomy.yaml is a taxonomy_optout) because its contexts are dynamic. Curated profiles introduce stable contexts, which
removes the reason for the opt-out.

Clean end state

Nested profile config (azure_monitor style):

profiles:
  mode: auto              # auto (default) | exact | combined
  mode_exact:
    entries:
      - name: apache_exporter
  mode_combined:
    entries:
      - name: jvm_micrometer
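
As a minimal sketch, the nested shape above could be modelled and validated in Go roughly as follows. The type names, field names, and the rules for empty or duplicate entry names are assumptions for illustration, not the collector's actual code; only the three mode values and the default of auto come from the config above.

```go
package main

import "fmt"

// ProfileEntry is one selected profile. It is an object rather than a bare
// string so that per-profile knobs can be added later (hypothetical type).
type ProfileEntry struct {
	Name string
}

// ProfileEntries wraps the `entries` list of a mode section (hypothetical type).
type ProfileEntries struct {
	Entries []ProfileEntry
}

// ProfilesConfig mirrors the nested `profiles` block (hypothetical type).
type ProfilesConfig struct {
	Mode         string
	ModeExact    ProfileEntries
	ModeCombined ProfileEntries
}

const (
	ModeAuto     = "auto"
	ModeExact    = "exact"
	ModeCombined = "combined"
)

// Validate defaults an empty mode to "auto", rejects unknown modes, and
// checks entry names. Rejecting empty and duplicate names is an assumption.
func (c *ProfilesConfig) Validate() error {
	if c.Mode == "" {
		c.Mode = ModeAuto
	}
	switch c.Mode {
	case ModeAuto, ModeExact, ModeCombined:
	default:
		return fmt.Errorf("profiles.mode: unknown value %q", c.Mode)
	}
	if err := validateEntries("mode_exact", c.ModeExact.Entries); err != nil {
		return err
	}
	return validateEntries("mode_combined", c.ModeCombined.Entries)
}

func validateEntries(section string, entries []ProfileEntry) error {
	seen := make(map[string]bool, len(entries))
	for i, e := range entries {
		if e.Name == "" {
			return fmt.Errorf("profiles.%s.entries[%d]: name is required", section, i)
		}
		if seen[e.Name] {
			return fmt.Errorf("profiles.%s.entries[%d]: duplicate name %q", section, i, e.Name)
		}
		seen[e.Name] = true
	}
	return nil
}

func main() {
	cfg := ProfilesConfig{
		ModeExact: ProfileEntries{Entries: []ProfileEntry{{Name: "apache_exporter"}}},
	}
	fmt.Println(cfg.Validate(), cfg.Mode)
}
```

What each mode actually selects is deliberately left out, since that behaviour is defined by the PRs rather than by this sketch.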

Each item in entries is an object ({name, …}), so per-profile job knobs can be added later. A profile's match rules, templates, and relabeling live in the profile YAML files, not in the job config. Delivered as 2 PRs:

  • PR7: catalog + match templates + first stock profile + docs. Adds the promprofiles catalog (stock/user, override-by-name, strict YAML) and the nested config. Selected profiles merge into the
    runtime ChartTemplateYAML() (profile charts get prometheus.<...> contexts via the compiler), with autogen fallback; selection completes by Check(). One first stock profile
    (template-only, no profile-local relabel) ships end-to-end. Profile metrics use per-integration metadata overrides, never the shared base metrics.scopes:[] (inherited by 160
    integration pages). The taxonomy uses a non-overlapping ownership model, and taxonomy_optout is converted.
  • PR8: profile-local relabeling + ownership/overlap gates. Adds profile metrics_relabeling (reuses the relabel engine) with curated strictness (no dynamic target / __name__ / le /
    quantile mutation, no labelkeep); ownership + overlap detection; HELP remap; post-relabel typed-family validation; check-strict / collect-lenient policy.
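
The stock/user, override-by-name behaviour of the catalog can be sketched as a two-layer merge. The Profile type, the BuildCatalog function, and the rule that duplicates within one layer are an error are all assumptions made for illustration; the promprofiles package may look quite different.

```go
package main

import (
	"fmt"
	"sort"
)

// Profile is a hypothetical, trimmed-down catalog entry.
type Profile struct {
	Name   string
	Source string // "stock" or "user"
}

// BuildCatalog merges the stock layer and then the user layer, so a user
// profile replaces a stock profile with the same name. Duplicate names inside
// one layer are rejected (an assumption, in the spirit of strict loading).
func BuildCatalog(stock, user []Profile) (map[string]Profile, error) {
	catalog := make(map[string]Profile, len(stock)+len(user))
	for _, layer := range [][]Profile{stock, user} {
		seen := make(map[string]bool, len(layer))
		for _, p := range layer {
			if p.Name == "" {
				return nil, fmt.Errorf("profile without a name in %q layer", p.Source)
			}
			if seen[p.Name] {
				return nil, fmt.Errorf("duplicate profile %q in %q layer", p.Name, p.Source)
			}
			seen[p.Name] = true
			catalog[p.Name] = p // later layer wins
		}
	}
	return catalog, nil
}

// Names returns the catalog's profile names in sorted order.
func Names(catalog map[string]Profile) []string {
	names := make([]string, 0, len(catalog))
	for name := range catalog {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	stock := []Profile{{Name: "apache_exporter", Source: "stock"}}
	user := []Profile{
		{Name: "apache_exporter", Source: "user"},
		{Name: "jvm_micrometer", Source: "user"},
	}
	catalog, err := BuildCatalog(stock, user)
	if err != nil {
		panic(err)
	}
	for _, name := range Names(catalog) {
		fmt.Println(name, catalog[name].Source)
	}
}
```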

Acceptance criteria

  • PR7: auto/exact/combined tested; generic jobs need no profile; template/chart-ID collisions detected at Check(); config fixtures gain the nested profile fields; the first stock
    profile validated end-to-end by tests; consistency in-PR (config_schema.json + per-integration metadata.yaml + taxonomy.yaml + stock conf + regenerated docs);
    check_collector_taxonomy.py passes; not inert.
  • PR8: Check() fails on observed overlap or malformed curated output; runtime conflicts deterministic (drop+log); curated rules cannot do unsafe mutations; tests cover overlap,
    typed-family split, HELP remap, curated-rule rejection. Depends on the relabeling issue.
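
The "curated rules cannot do unsafe mutations" criterion could be enforced with a small per-rule check along these lines. This is a sketch only: RelabelRule and ValidateCuratedRule are hypothetical names, and treating a "dynamic target" as a target label containing a capture reference such as ${1} is an assumption about what the issue means.

```go
package main

import (
	"fmt"
	"strings"
)

// RelabelRule is a hypothetical, minimal view of one metrics_relabeling rule.
type RelabelRule struct {
	Action      string // e.g. "replace", "labeldrop", "labelkeep"
	TargetLabel string // label written by the rule, if any
}

// reservedTargets lists labels that a curated rule must not write to.
var reservedTargets = map[string]bool{
	"__name__": true,
	"le":       true,
	"quantile": true,
}

// ValidateCuratedRule rejects rules that would make the chart surface
// non-enumerable or break typed metric families.
func ValidateCuratedRule(r RelabelRule) error {
	if r.Action == "labelkeep" {
		return fmt.Errorf("action %q is not allowed in curated profiles", r.Action)
	}
	// Assumption: a "dynamic target" is a target label built from capture
	// references, which show up as "$" in the target string.
	if strings.Contains(r.TargetLabel, "$") {
		return fmt.Errorf("dynamic target label %q is not allowed", r.TargetLabel)
	}
	if reservedTargets[r.TargetLabel] {
		return fmt.Errorf("target label %q must not be mutated", r.TargetLabel)
	}
	return nil
}

func main() {
	rules := []RelabelRule{
		{Action: "replace", TargetLabel: "vhost"},
		{Action: "replace", TargetLabel: "__name__"},
		{Action: "labelkeep"},
	}
	for _, r := range rules {
		fmt.Printf("%+v -> %v\n", r, ValidateCuratedRule(r))
	}
}
```

A real gate would also need to cover rules that remove le or quantile (for example via labeldrop), which this sketch does not attempt.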

Category

follow-up from prior work

Scope boundaries

  • IN: profile catalog + first stock profile (PR7), profile-local relabeling (PR8), the taxonomy/metadata model, the nested config shape.
  • OUT: a large stock-profile library; native histogram/summary heatmap rendering (separate enrichment).
  • Contingent: metrix flatten/collision robustness, only if relabel/profiles prove it necessary (separate gated framework PR).
  • Depends on: Issue 2 (V2 migration); PR8 also depends on Issue 3 (relabeling).

Validation

Catalog/selection unit tests; check_collector_taxonomy.py; real-node run with the stock profile; consistency/CI.

Risks / compatibility

Metadata inheritance hazard: the shared base metrics.scopes:[] is inherited by 160 pages, so profile metrics must use per-integration overrides. Taxonomy ownership must be non-overlapping
(overlap is fatal; check_collector_taxonomy.py is CI-enforced). Curated templates need enumerable chart surfaces, which unrestricted dynamic relabeling cannot guarantee; hence the curated strictness rules.
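
To illustrate the non-overlapping ownership requirement, the check below reports every context claimed by more than one owner. It is a sketch under assumptions: FindOverlaps is a hypothetical helper, the context names are made up, and it does not reflect how check_collector_taxonomy.py is implemented.

```go
package main

import (
	"fmt"
	"sort"
)

// FindOverlaps takes owner -> claimed contexts and returns every context that
// is claimed by more than one distinct owner, mapped to the sorted owner names.
func FindOverlaps(owned map[string][]string) map[string][]string {
	claims := make(map[string]map[string]bool)
	for owner, contexts := range owned {
		for _, ctx := range contexts {
			if claims[ctx] == nil {
				claims[ctx] = make(map[string]bool)
			}
			claims[ctx][owner] = true
		}
	}
	overlaps := make(map[string][]string)
	for ctx, owners := range claims {
		if len(owners) < 2 {
			continue
		}
		names := make([]string, 0, len(owners))
		for owner := range owners {
			names = append(names, owner)
		}
		sort.Strings(names) // deterministic output for logs and tests
		overlaps[ctx] = names
	}
	return overlaps
}

func main() {
	// Hypothetical owners and context names, for illustration only.
	owned := map[string][]string{
		"apache_exporter": {"prometheus.apache.requests"},
		"other_profile":   {"prometheus.apache.requests", "prometheus.other.uptime"},
	}
	for ctx, owners := range FindOverlaps(owned) {
		fmt.Println("overlap:", ctx, owners)
	}
}
```

Since overlap is fatal, a caller would treat any non-empty result as a hard failure.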
