Prometheus + Grafana Best Practices for IT Monitoring and Alerting (8)

Mondo Technology Updated on 2024-01-31

Hello, I'm Xiaofei. Today I'm sharing best practices for monitoring Sangfor AD devices, along with some notes on the Grafana side of the monitoring.

Consult Sangfor after-sales service to get the OID reference table for your version of the AD equipment. Here I demonstrate monitoring of a Sangfor AD1000-G642 device running system version 7.0.8r4.

Both Sangfor after-sales service and the Sangfor community can supply this information.

From the Sangfor OID information, pick out the objects you want to monitor. Here I give a basic demonstration:

First, download the MIB file from the AD management page, as shown in the following figure:

As shown in the figure above, after downloading the MIB library, enable SNMP v2c, add the community name, and add the IP addresses that are allowed access. Note that you must allow the IP address of the host running your SNMP exporter; otherwise snmpwalk will not be able to collect the AD information.
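
Before moving on, it can help to confirm from the exporter host that the AD answers SNMP v2c queries. The sketch below only assembles a net-snmp `snmpwalk` command line. The community name `public` is a placeholder assumption, `192.168.2.1` is the AD address used later in this article, and the OID is the standard MIB-2 `sysName.0` rather than a Sangfor-specific object. `build_snmpwalk_cmd` and `run_snmpwalk` are illustrative helper names.

```python
import shlex
import subprocess

SYS_NAME_OID = "1.3.6.1.2.1.1.5.0"  # standard MIB-2 sysName.0


def build_snmpwalk_cmd(host, community, oid=SYS_NAME_OID):
    """Return the argv list for an SNMP v2c walk of one OID."""
    return ["snmpwalk", "-v2c", "-c", community, host, oid]


def run_snmpwalk(host, community, oid=SYS_NAME_OID, timeout=10):
    """Run snmpwalk (net-snmp must be installed) and return its stdout."""
    cmd = build_snmpwalk_cmd(host, community, oid)
    result = subprocess.run(cmd, capture_output=True, text=True,
                            timeout=timeout, check=True)
    return result.stdout


# Placeholder values: replace with your own AD address and community name.
cmd = build_snmpwalk_cmd("192.168.2.1", "public")
print(shlex.join(cmd))
```

Calling `run_snmpwalk("192.168.2.1", "<your community>")` on the exporter host should return the device name if the access list on the AD includes that host; a timeout usually points back at the allowed-IP setting.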

When everything is ready, log in to the host running the SNMP exporter collector. Here I use the SNMP collector host that I built earlier, which runs a default installation of CentOS 7.9. The host IP is 172.17.40.54. Log in over SSH and upload the MIB file of ** to the corresponding directory.

For more information on how to deploy an SNMP collector, see the previous article:

Edit the generator configuration file, generator.yml:

    # Sangfor network device information capture
    sangfor:
      walk:
        - adsysname              # Device name
        - adcpucostrate          # AD CPU usage
        - admemcostrate          # AD memory usage
        - sfintcputemp           # Sangfor device temperature
        - sfdisktemp             # Sangfor AD disk temperature
        - addiskcostrate         # AD disk usage
        - sfdevicestatus         # AD disk status
        - sffanstate
        - sffanspeed             # AD fan speed
        - sfpowerstate           # AD power status
        - adconns                # AD system concurrent connections
        - adnewconns             # AD system new connections
        - advshealthstatus       # Health status of the virtual service
        - advshealthnodecnt      # Number of healthy nodes of a virtual service
        - aduplinkthroughput     # Upstream traffic, all links
        - addownlinkthroughput   # Downstream traffic, all links
        - aduptime               # AD device running time
        - addevicepattern        # AD running mode; a single node is 3
        - adstandbystate         # AD active/standby status
        - adlinkname             # AD link name
        - adlinktype             # AD link type
        - adlinkifname           # Network port referenced by the AD link
        - adlinkstatus           # Link status
        - adlinkbitin            # Link upstream traffic
        - adlinkbitout           # Link downstream traffic
        - adlinknumber           # Number of device links
      max_repetitions: 25
      retries: 3
      timeout: 5s
      version: 2
      auth:
        community: zzpbptyciz1tm
      lookups:
        - source_indexes: [linkindex]
          lookup: adlinktype
        - source_indexes: [linkindex]
          lookup: adlinkifname
        - source_indexes: [linkindex]
          lookup: adlinkname
      overrides:
        adsysname:
          type: DisplayString
        adlinkname:
          type: DisplayString
          ignore: true
        adlinkifname:
          type: DisplayString
          ignore: true
        sfcputemp:
          type: DisplayString
        adlinktype:
          type: DisplayString
          ignore: true
        advshealthstatus:
          type: DisplayString
        sffanstate:
          type: DisplayString
        sfpowerstate:
          type: DisplayString

The object names are shown here in lowercase; in generator.yml they must be written exactly as they are spelled in the Sangfor MIB file, capitalization included. The above module can be placed alongside the Huawei module and the Dell iDRAC module from the previous article to generate the snmp.yml file.
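
A small check like the following can catch names that appear under `overrides` or `lookups` but not under `walk`, which is often just a typo. It is a sketch over a plain dictionary holding an abbreviated copy of the module, not a parser for generator.yml, and `unwalked_names` is a hypothetical helper, not part of snmp_exporter.

```python
def unwalked_names(module):
    """Return override/lookup names that are missing from the walk list."""
    walked = set(module["walk"])
    referenced = list(module.get("overrides", {}))
    referenced += [entry["lookup"] for entry in module.get("lookups", [])]
    return sorted(name for name in set(referenced) if name not in walked)


# Abbreviated copy of the sangfor module (only a few objects shown).
sangfor = {
    "walk": ["adsysname", "sfintcputemp", "adlinkname", "adlinktype"],
    "lookups": [
        {"source_indexes": ["linkindex"], "lookup": "adlinktype"},
        {"source_indexes": ["linkindex"], "lookup": "adlinkname"},
    ],
    "overrides": {
        "adsysname": {"type": "DisplayString"},
        "sfcputemp": {"type": "DisplayString"},
    },
}

print(unwalked_names(sangfor))
```

With this abbreviated input the check reports `sfcputemp`, because the walk list names the temperature object `sfintcputemp`. Whether that difference matters depends on the MIB, so treat the result as a prompt to look, not as an error.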

Before running the generator, set the temporary environment variable that holds the path of the MIB files:

    [root@snmp generator]# ./generator generate
    [root@snmp generator]# mv snmp.yml ../
    [root@snmp generator]# systemctl restart snmp-exporter.service

After these steps, the SNMP collector collects data according to the new configuration file. Open the web page of the SNMP collector and run a test collection to check whether the data is collected normally.

Click Submit; the data is collected normally.
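
The same test can be run without the web page, since the form submits to the exporter's `/snmp` endpoint with `target` and `module` parameters. The sketch below builds that URL from values used in this article (exporter `172.17.40.54:9116`, AD address `192.168.2.1`, module `sangfor`); `exporter_url` and `probe` are illustrative helper names.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def exporter_url(exporter, target, module):
    """Build the /snmp probe URL for one target and one module."""
    query = urlencode({"target": target, "module": module})
    return f"http://{exporter}/snmp?{query}"


def probe(exporter, target, module, timeout=15):
    """Fetch the probe result as text (needs network access to the exporter)."""
    with urlopen(exporter_url(exporter, target, module), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


url = exporter_url("172.17.40.54:9116", "192.168.2.1", "sangfor")
print(url)
```

Calling `probe("172.17.40.54:9116", "192.168.2.1", "sangfor")` from a host that can reach the exporter should return the metrics as plain text, one sample per line.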

Now that data collection for the AD is working, the next step is to have Prometheus scrape the collector and store the data in its TSDB.

Once collection works, the data has to be pulled into Prometheus so that it is stored in a structured form for later querying and graphing. To do that, edit the prometheus.yml file and configure a job for Sangfor AD.

In the prometheus.yml file, add the following configuration under scrape_configs:

    # Collect Sangfor AD information
    - job_name: "sangfor"
      scrape_interval: 15s
      scrape_timeout: 10s
      file_sd_configs:
        - files:
            - /root/monitor/prometheus/targets/sangfor-*.yml
          refresh_interval: 2m
      metrics_path: /snmp
      relabel_configs:
        - source_labels: ["__address__"]
          target_label: __param_target
        - source_labels: ["__param_target"]
          target_label: instance
        - target_label: __address__
          replacement: 172.17.40.54:9116  # SNMP Exporter service IP address
        - source_labels: ["mib"]          # Get the name of the MIB module from the custom target label
          target_label: __param_module

The sangfor-device.yml file is as follows. Note that sangfor-device.yml must be placed in the directory referenced by files, /root/monitor/prometheus/targets:

    - labels:
        mib: sangfor
        brand: sangfor
        hostname: hz-zbnl-ad
        model: ad1000 g642
      targets:
        - 192.168.2.1
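
To see what the four relabel rules of the sangfor job do to this target, the following sketch walks through them with a plain dictionary. It is a simplified model for illustration, not Prometheus's relabeling engine, and `apply_relabeling` is a hypothetical helper name.

```python
def apply_relabeling(address, target_labels, exporter="172.17.40.54:9116"):
    """Mimic the four relabel rules of the sangfor job for one target."""
    labels = {"__address__": address, **target_labels}
    # Rule 1: the device address becomes the ?target= URL parameter.
    labels["__param_target"] = labels["__address__"]
    # Rule 2: the same value is kept as the instance label.
    labels["instance"] = labels["__param_target"]
    # Rule 3: the scrape itself is sent to the SNMP exporter.
    labels["__address__"] = exporter
    # Rule 4: the custom mib label selects the exporter module.
    labels["__param_module"] = labels["mib"]
    return labels


result = apply_relabeling(
    "192.168.2.1",
    {"mib": "sangfor", "brand": "sangfor",
     "hostname": "hz-zbnl-ad", "model": "ad1000 g642"},
)
for name in sorted(result):
    print(f"{name} = {result[name]}")
```

The net effect is that Prometheus requests `/snmp?target=192.168.2.1&module=sangfor` from the exporter at 172.17.40.54:9116, while the stored series keep `instance="192.168.2.1"` along with the brand, hostname, and model labels.
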
At this point the configuration is complete. Reload the configuration file (the HTTP reload endpoint is only available when Prometheus is started with --web.enable-lifecycle):

    curl -X POST localhost:9090/-/reload
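
The reload can also be scripted. The sketch below builds the same POST request with the Python standard library; `reload_request` and `reload_prometheus` are illustrative names, and the address `localhost:9090` is taken from the command above.

```python
from urllib.request import Request, urlopen


def reload_request(prometheus="localhost:9090"):
    """Build the POST request for the Prometheus reload endpoint."""
    return Request(f"http://{prometheus}/-/reload", data=b"", method="POST")


def reload_prometheus(prometheus="localhost:9090", timeout=10):
    """Send the reload request and return the HTTP status code."""
    with urlopen(reload_request(prometheus), timeout=timeout) as resp:
        return resp.status


req = reload_request()
print(req.get_method(), req.full_url)
```

Calling `reload_prometheus()` on the Prometheus host should return 200 on success. If the lifecycle API is not enabled, `urlopen` raises an `HTTPError` instead.
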
Open the Prometheus web UI and verify the targets and data:

Prometheus discovers the target normally; click through to the data to verify it.

Opening a metric at random, the number of concurrent connections on the AD can be queried normally in Prometheus, as shown in the figure above.
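
The same check can be made against the Prometheus HTTP API. The sketch below builds an instant-query URL and parses the JSON reply into a dictionary keyed by instance. The metric name `adConns` is an assumption (the exporter names metrics after the MIB objects, so use the name shown in your own web UI), and the sample payload is made up to illustrate the response shape.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def query_url(prometheus, expr):
    """Build the instant-query URL for a PromQL expression."""
    return f"http://{prometheus}/api/v1/query?{urlencode({'query': expr})}"


def parse_instant_vector(payload):
    """Return {instance: value} from an instant-query JSON response."""
    body = json.loads(payload)
    if body.get("status") != "success":
        raise ValueError(f"query failed: {body.get('error', 'unknown error')}")
    return {
        sample["metric"].get("instance", ""): float(sample["value"][1])
        for sample in body["data"]["result"]
    }


def instant_query(prometheus, expr, timeout=10):
    """Run the query against a live Prometheus (needs network access)."""
    with urlopen(query_url(prometheus, expr), timeout=timeout) as resp:
        return parse_instant_vector(resp.read().decode("utf-8"))


# Made-up sample reply, in the shape the API uses for an instant vector.
sample = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [{
            "metric": {"__name__": "adConns", "instance": "192.168.2.1",
                       "job": "sangfor"},
            "value": [1706659200, "1234"],
        }],
    },
})
print(parse_instant_vector(sample))
```

Against a running server, `instant_query("localhost:9090", 'adConns{job="sangfor"}')` would return the current value for each AD instance.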

Now that data storage is solved, the next step is graphing.

Graphing requires a basic understanding of PromQL and some proficiency in building Grafana visualizations. I will post a simple template that I wrote myself; it will do for now.

I will share the Grafana template on Grafana later; if you need it, you can ask me separately.

If you found this useful, follow me, or follow the WeChat account *** network Xiaofei.
