VictoriaMetrics: Add data deduplication from HA Prometheus pair based on --query.replica-label arg, similar to Thanos Query

Hi,

VM looks great, and we'd like to replace our Thanos setup (for 6 K8s clusters) with it. We understand how to label series with different labels to separate cluster data, but I still struggle with data deduplication. To drastically reduce infrastructure costs, we run multiple Prometheus instances on preemptible hosts. With Thanos Query's dedup capabilities this works fine: we don't need to take any extra action to get each metric once from all Prometheus instances for a given cluster. In addition, dedup handles partial responses for us.

Is there any chance dedup can be included in VM in the foreseeable future, or is this feature irrelevant for this case?

23 Answers

βœ”οΈAccepted Answer

Just one thing to add:
If you're using prometheus-operator in HA mode, you have to set prometheus.prometheusSpec.replicaExternalLabelNameClear: true for VictoriaMetrics' deduplication to work. Otherwise, the operator adds an external label prometheus_replica (e.g. prometheus_replica="prometheus-kps-prometheus-0") to each replica, so the requirement "Note that these Prometheus instances must have identical external_labels section in their configs, so they write data to the same time series." (Deduplication in VM) is not satisfied.
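To illustrate why the extra label matters, here is a minimal sketch in Python. It assumes a simplified model in which a time series is identified by its metric name plus its full label set; series_key and the cluster/job labels are hypothetical names used only for illustration, not part of VictoriaMetrics or the operator.

```python
def series_key(name, labels):
    """Identity of a time series in this model: metric name plus sorted labels."""
    return (name, tuple(sorted(labels.items())))


# Labels both replicas share (hypothetical values).
shared = {"cluster": "prod", "job": "node"}

# Operator default: each replica attaches its own prometheus_replica label,
# so the two replicas write to two different series.
with_replica_label = {
    series_key("up", {**shared, "prometheus_replica": f"prometheus-kps-prometheus-{i}"})
    for i in range(2)
}

# With replicaExternalLabelNameClear: true the label is gone,
# so both replicas write to the same series.
cleared = {series_key("up", dict(shared)) for _ in range(2)}

print(len(with_replica_label))  # 2
print(len(cleared))  # 1
```

Only in the second case do the samples from both replicas land in one series, which is the precondition quoted from the docs above.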

///

And for those who were a bit confused like me while configuring deduplication in a VM cluster: dedup.minScrapeInterval has to be set in two places, vmselect (=1ms; for deduplication at the query level) and vmstorage (something lower than your scrape interval, maybe 5s; for deduplication at the storage level). The documentation is not very clear about that.
I hope I'm not giving wrong advice here 😅
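The effect of the interval can be sketched roughly as follows. This is a simplified model, not VictoriaMetrics' actual implementation: it assumes that one sample (the one with the largest timestamp) is kept per interval-sized bucket of a single series. The deduplicate function and the sample data are hypothetical.

```python
def deduplicate(samples, min_scrape_interval_ms):
    """Keep one (timestamp_ms, value) sample per interval bucket.

    Within a bucket the sample with the largest timestamp survives;
    on equal timestamps, the one with the largest value.
    """
    if min_scrape_interval_ms <= 0:
        return sorted(samples)  # deduplication disabled
    kept = {}
    for ts, value in sorted(samples):
        # Later (larger) samples overwrite earlier ones in the same bucket.
        kept[ts // min_scrape_interval_ms] = (ts, value)
    return [kept[bucket] for bucket in sorted(kept)]


# Two replicas scraping the same target every 15s, offset by 2s.
replica_a = [(0, 1.0), (15000, 2.0), (30000, 3.0)]
replica_b = [(2000, 1.0), (17000, 2.0), (32000, 3.0)]

print(deduplicate(replica_a + replica_b, 5000))
# [(2000, 1.0), (17000, 2.0), (32000, 3.0)]
print(len(deduplicate(replica_a + replica_b, 1)))  # 6
print(deduplicate(replica_a + replica_a, 1) == replica_a)  # True
```

In this model a 5s interval collapses the two replicas into one sample per scrape, while a 1ms interval only removes samples with identical timestamps.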

Other Answers:

Thanos deduplication requires setting a distinct label value in the external_labels config for each replica in a Prometheus HA pair (see these docs). This label must be passed to the --query.replica-label arg when starting the Query component in order to enable proper deduplication.

It is easy to create similar deduplication in VictoriaMetrics using the following steps:

  • start multiple VictoriaMetrics instances (or clusters) in different datacenters (availability zones)
  • configure replica1 from each Prometheus HA pair to write data to the first VictoriaMetrics, and replica2 to write data to the second VictoriaMetrics. Replicas should have identical labels in the external_labels section. There is no need to set distinct label values for each replica as in the Thanos case.
  • start Promxy in front of VictoriaMetrics instances.
  • send queries from Grafana to Promxy. It should handle data deduplication and merging.
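The merging step in front of the two backends can be pictured with the following sketch. It is an illustration of the idea only and does not reproduce Promxy's actual algorithm; merge_backends, the step size, and the sample data are assumptions. It takes one sample per step from the first backend and fills gaps from the second.

```python
def merge_backends(primary, secondary, step_ms):
    """Merge (timestamp_ms, value) samples of one series from two backends.

    Assumes at most one sample per step in each input. When both backends
    have a sample for a step, the primary one is used.
    """
    merged = {}
    for ts, value in secondary:
        merged[ts // step_ms] = (ts, value)
    for ts, value in primary:
        merged[ts // step_ms] = (ts, value)  # primary overrides secondary
    return [merged[slot] for slot in sorted(merged)]


# The first backend misses the 30s sample (e.g. replica1 was preempted).
vm_a = [(0, 1.0), (15000, 2.0), (45000, 4.0)]
vm_b = [(1000, 1.0), (16000, 2.0), (31000, 3.0), (46000, 4.0)]

print(merge_backends(vm_a, vm_b, 15000))
# [(0, 1.0), (15000, 2.0), (31000, 3.0), (45000, 4.0)]
```

The gap in the first backend is filled from the second one, so the query result stays complete when one replica is down.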

Read more about high availability setups here.

What about adding an option for customizing the replica label, like -dedup.replicaLabel=prometheus_replica, when deduplicating metrics in vmselect?

It could be useful in the following cases:

  • scrape_interval values differ between the Prometheus configs.
  • "duplicated" metrics are saved in vmstorage with the replica label and queried by this label.
