Prometheus + Grafana Best Practices for IT Monitoring and Alerting (Part 6)

Mondo Technology Updated on 2024-01-31

Hello, I'm Xiaofei. The basic setup and data collection of Prometheus were explained earlier. Today we will cover the details of obtaining data, optimizing queries, and optimizing data display.

When hosts are picked up through file-based automatic discovery, their data receives some simple processing and labeling, as shown in the figure below:

308 hosts report up, which shows that the targets were obtained through file-based automatic discovery. The data and its format can then be inspected with PromQL.

Because the data has to be shown on a large screen, it needs to be drawn in Grafana dashboards. All 308 hosts are Linux nodes, so node exporter can be installed directly on each host. With Grafana already installed, open the Grafana interface and pick a template from the official template library that fits your needs; here I use template ID 16098.

Open the Grafana interface:

1. Configure the data source

Fill in your Prometheus data source, that is, the address and port of the Prometheus service.

Then enter the template ID and click Load to import the dashboard, as shown in the figure below.

The figure above shows the data rendered by the Grafana template. The Grafana query variables are complete, and the data has already been reprocessed.

Before processing, an instance value is displayed as IP plus port, for example: 192.168.10.1:9100

Query the data with PromQL: node_uname_info

The query result includes a host reported as localhost whose instance is 172.17.40.25:9100, and more than 300 hosts show the built-in instance label in this IP:port form. In Grafana, the query variables are built on instance, so every value carries the port. That looks untidy; I only want the IP address displayed in Grafana.
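To check which instance values still carry a port, the same query can also be sent to the Prometheus HTTP API instead of the web UI. The following is a minimal sketch, not the author's tooling: the server address `http://localhost:9090` is an assumption, and the sample response is hand-written in the shape of an instant-query result rather than real output.

```python
import json
from urllib.parse import urlencode

PROMETHEUS_URL = "http://localhost:9090"  # assumption: replace with your server


def build_query_url(expr, base_url=PROMETHEUS_URL):
    """Return the instant-query URL for a PromQL expression."""
    return f"{base_url}/api/v1/query?{urlencode({'query': expr})}"


def instances_with_port(response_body):
    """Collect instance label values that still end in ':<port>'."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError("query failed")
    found = []
    for series in payload["data"]["result"]:
        instance = series["metric"].get("instance", "")
        _host, sep, port = instance.rpartition(":")
        if sep and port.isdigit():
            found.append(instance)
    return found


# Hand-written sample shaped like an instant-query response (not real output).
SAMPLE = json.dumps({
    "status": "success",
    "data": {"resultType": "vector", "result": [
        {"metric": {"__name__": "node_uname_info",
                    "instance": "172.17.40.25:9100"},
         "value": [1700000000, "1"]},
    ]},
})

print(build_query_url("node_uname_info"))
print(instances_with_port(SAMPLE))
```

Fetching the URL (for example with `urllib.request.urlopen`) is left out so the sketch runs without a live server.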

To do that, the label has to be rewritten, using the relabel_configs section of Prometheus.

relabel_configs:
  - source_labels:
      - "__address__"
    regex: "(.*):9100"
    target_label: "instance"
    action: replace
    replacement: "$1"

Add the rewrite rule above to the corresponding job in the Prometheus configuration file, for example:

- job_name: "vmware-host"
  metrics_path: /metrics
  scheme: http
  scrape_interval: 5s
  file_sd_configs:
    - files:
        - /root/monitor/prometheus/targets/node-*.yml
      refresh_interval: 2m
  relabel_configs:
    - source_labels:
        - "__address__"
      regex: "(.*):9100"
      target_label: "instance"
      action: replace
      replacement: "$1"

After the rewrite, as shown in the figure, every instance label value has port 9100 removed and shows only the IP address.
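The effect of this rewrite on a single address can be reproduced outside Prometheus. The sketch below only imitates one `action: replace` rule in Python; it is not Prometheus code. It assumes the intended regex is `(.*):9100`, and it relies on the fact that Prometheus anchors relabel regexes at both ends. Prometheus uses RE2 syntax, which behaves the same as Python's `re` for this simple pattern. The function name `relabel_replace` is hypothetical.

```python
import re


def relabel_replace(source_value, regex, replacement, current_target):
    """Imitate one `action: replace` rule on a single source label value."""
    # Relabel regexes are fully anchored, so use fullmatch, not search.
    match = re.fullmatch(regex, source_value)
    if match is None:
        return current_target  # no match: the target label stays unchanged
    # Translate "$1"-style references into Python's "\g<1>" form.
    py_replacement = re.sub(r"\$(\d+)", r"\\g<\1>", replacement)
    return match.expand(py_replacement)


# __address__ of a node exporter target, as it appears before relabeling.
print(relabel_replace("172.17.40.25:9100", r"(.*):9100", "$1", ""))
# A target on another port does not match, so its label is left alone.
print(relabel_replace("172.17.40.25:9090", r"(.*):9100", "$1", "172.17.40.25:9090"))
```

Because the pattern is tied to port 9100, targets on other ports keep their full address; a more general pattern would be needed to strip any port.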

For other label operations, search online; the main thing to study is relabel_configs.

Grafana's variable queries must follow the syntax of the selected data source. If the data source is Prometheus, they must follow PromQL syntax. Look up the details yourself, because the templates in the official library are all shared by other users.

Prometheus can perform data operations on the TSDB through its HTTP API. Here are several ways to delete data.

Enable the HTTP admin API:

The TSDB admin API of Prometheus is disabled by default; the startup parameter --web.enable-admin-api must be added to enable it.

The startup parameters are as follows. Example: --web.enable-admin-api

[Unit]
Description=Prometheus Server
Wants=network-online.target
After=network.target

[Service]
Type=***
User=root
ExecStart=/root/monitor/prometheus/current/prometheus \
    --config.file=/root/monitor/prometheus/conf/prometheus.yml \
    --web.listen-address=:9090 \
    --storage.tsdb.path=/root/monitor/prometheus/data/ \
    --storage.tsdb.retention=90d \
    --web.enable-lifecycle \
    --web.enable-admin-api
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure

[Install]
WantedBy=multi-user.target

After you enable the TSDB admin API, you can use the following APIs to delete metrics data:

Delete Data Interface:

curl -X PUT -g ''

Delete by time:

curl -X PUT -g 'up&start=2022-11-07T00:00:00.000Z'

This interface has the following three URL query parameters:

match[]=: series selector naming the metrics to delete
start=: start timestamp
end=: end timestamp

For more API operations, search online.
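How the three parameters combine into one request URL can be sketched as follows. This is an illustration under stated assumptions, not the author's exact command: it assumes the `/api/v1/admin/tsdb/delete_series` endpoint described in the Prometheus HTTP API documentation and a server at the hypothetical address `http://localhost:9090`. The helper name `build_delete_url` is made up for the example, and the request itself is not sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: endpoint path as documented for the Prometheus TSDB admin API.
DELETE_PATH = "/api/v1/admin/tsdb/delete_series"


def build_delete_url(base_url, selectors, start=None, end=None):
    """Build a delete_series URL from series selectors and optional bounds."""
    # match[] may be repeated, once per series selector.
    params = [("match[]", selector) for selector in selectors]
    if start is not None:
        params.append(("start", start))
    if end is not None:
        params.append(("end", end))
    # urlencode percent-encodes the brackets and colons for us.
    return f"{base_url}{DELETE_PATH}?{urlencode(params)}"


url = build_delete_url(
    "http://localhost:9090",  # assumption: replace with your server
    ["up"],
    start="2022-11-07T00:00:00.000Z",
)
print(url)
# Decode the query string again to confirm the values round-trip.
print(parse_qs(urlsplit(url).query))
```

Timestamps are written with an uppercase `T` and `Z`, as in RFC 3339. Since deletion is destructive, it may be worth printing and reviewing the URL before sending it with a PUT request.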
