Accurately Capturing Abnormal Moments Starts with Writing the Event Title and Content

Mondo Technology Updated on 2024-01-29

Before discussing how to write event notification content when configuring a monitor, it helps to clarify the following logic:

After a monitor's detection rule takes effect, it aggregates the system's business data and retains the results as events. These event records can be understood as carriers of the abnormal signals emitted by the monitor's detection object, and the event titles and content discussed in this article are part of these records. If stakeholders consider an anomaly urgent and need to assess the risk and respond promptly, these event records can be sent out as alarms.

During this delivery process, the three aggregation modes in the alarm configuration (non-aggregation, rule aggregation, and intelligent aggregation) process the event title and content accordingly, and the result becomes the alarm notification that stakeholders receive (as shown in the image below).

Back to the original question: when we use a monitor to detect all kinds of data, we expect that whenever an abnormal event occurs, the people notified can obtain the detailed context of the anomaly as quickly as possible. This requires monitor creators to understand how to define and edit the title and content of the event to be notified when configuring the monitor.

Template variables are one of the core elements of editing titles and content. The following are the template variables supported by Observation Cloud to help render dynamic text.

An event title should be concise: it should state the main point in one sentence. That way, anyone who receives an event notification can get a general idea of what the event is about from a first glance at the title. For example:

There is an exception in the status of the members in the consul cluster.

In addition to text-only titles like the one above, we can also insert template variables into titles. For example:

Host {{ host }} has less than 10% of available memory

The trace error rate for the service is too high, with an error rate of {{ … }}%.

When writing the event content, we can rely on the template syntax. Next, we will use a few common scenarios to show the actual effect of editing event content notifications.

Suppose the monitor's by setting is configured with region and host. Based on the template variables in the table above, we can write a basic event content list.

Event Title:

Monitor {{ … }} found {{ … }} to be faulty

Event Content:

Region: {{ region }}
Host: {{ host }}
Level: {{ … }}
Detection Value: {{ … }}
Monitor: {{ … }} (Alarm Policy: {{ … }})
Then, after an error event is generated, the rendered event output is as follows:

Output Event Title:

Monitor Monitor 001 found a fault

Output Event Content:

Region: hangzhou
Host: web-001
Level: error
Detection Value: 9012345
Monitor: Monitor 001 (Alarm Policy: Team 001)
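The substitution step in this scenario can be illustrated with a minimal Python sketch. It is not the platform's renderer: it assumes a double-brace placeholder syntax, and apart from region and host every variable name below (status, result, monitor_name, policy_name) is hypothetical and chosen only for illustration.

```python
import re

# Matches {{ name }} placeholders; the double-brace syntax is an assumption.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, variables: dict) -> str:
    """Replace each {{ name }} with its value; unknown names are left untouched."""
    def substitute(match):
        name = match.group(1)
        return str(variables[name]) if name in variables else match.group(0)
    return PLACEHOLDER.sub(substitute, template)


# region and host come from the monitor's grouping; the other keys are hypothetical.
event = {
    "region": "hangzhou",
    "host": "web-001",
    "status": "error",
    "result": 9012345,
    "monitor_name": "Monitor 001",
    "policy_name": "Team 001",
}

content_template = "\n".join([
    "Region: {{ region }}",
    "Host: {{ host }}",
    "Level: {{ status }}",
    "Detection Value: {{ result }}",
    "Monitor: {{ monitor_name }} (Alarm Policy: {{ policy_name }})",
])

rendered = render(content_template, event)
print(rendered)
```

Leaving unknown placeholders untouched, rather than raising an error, makes a misspelled variable name visible in the rendered notification.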
Besides directly displaying field values in the event, as in Scenario 1, we can also use template functions to process the field values further. Template functions can be used to optimize the output of the event notification content and consolidate the necessary information.

A template function is combined with a template variable in the following form:

In Observation Cloud, the list of template functions available to us is as follows:

Referring again to Scenario 1, this time using template functions, the event title and content are written as follows:

Event Title:

Monitor {{ … }} found {{ … }} to be faulty

Event Content:

Detection object: {{ … }}
Detection time: {{ … }}
Fault level: {{ … }}
Detection value: {{ … }}
Then, after the error event is generated, the rendered event output is as follows:

Output Event Title:

Monitor My monitor found that region:hangzhou, host:web-001 is faulty

Output Event Content:

Detection object: region:hangzhou, host:web-001
Detection time: 2022-01-01 01:23:45
Fault level: important
Detection value: 901235
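The idea of passing a field value through a template function before it is printed can be sketched in Python as follows. This is an illustration under stated assumptions, not the platform's implementation: the pipe form inside double braces and the function names to_datetime, to_fixed and to_status_human are hypothetical stand-ins, and the variable names are invented for the example.

```python
import re
from datetime import datetime, timezone


def to_datetime(timestamp: int) -> str:
    """Format a Unix timestamp (seconds) as a readable UTC date and time."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


def to_fixed(value: float, digits: int = 2) -> str:
    """Format a number with a fixed count of decimal places."""
    return f"{value:.{digits}f}"


def to_status_human(status: str) -> str:
    """Map a raw level to a readable label; the mapping is assumed."""
    labels = {"critical": "urgent", "error": "important", "warning": "warning"}
    return labels.get(status, status)


# Hypothetical function registry; the real list is product-specific.
FUNCTIONS = {
    "to_datetime": to_datetime,
    "to_fixed": to_fixed,
    "to_status_human": to_status_human,
}

# Matches {{ name }} and {{ name | function }}; the syntax is an assumption.
EXPRESSION = re.compile(r"\{\{\s*(\w+)\s*(?:\|\s*(\w+)\s*)?\}\}")


def render(template: str, variables: dict) -> str:
    """Substitute each variable, applying the named function when one is given."""
    def substitute(match):
        value = variables[match.group(1)]
        if match.group(2):
            value = FUNCTIONS[match.group(2)](value)
        return str(value)
    return EXPRESSION.sub(substitute, template)


event = {"timestamp": 1641000225, "status": "error", "result": 90.12345}
print(render("Detection time: {{ timestamp | to_datetime }}", event))
print(render("Fault level: {{ status | to_status_human }}", event))
print(render("Detection value: {{ result | to_fixed }}", event))
```

Keeping the functions in a lookup table means the template only names a function, while the formatting rules stay in one place.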
On the existing Observation Cloud configuration page, if we want to describe exceptions of different levels in the same event content box, we can use the template branching syntax (if/else) to achieve it.

The syntax is roughly as follows:

{% if … %}
Urgent issue, please deal with it immediately!
{% elif … %}
Important issue, please deal with it.
{% elif … %}
Possible problem, deal with it when you have time.
{% elif … %}
Data interruption, please deal with it immediately!
{% else %}
No problem!
{% endif %}
Here's an example effect:

{% if … %}
Level: {{ … }}
Host: {{ host }}
Content: Elasticsearch JVM heap memory usage is {{ … }}%
Suggestion: JVM garbage collection can no longer keep up with the rate at which garbage is generated. Please check the business status promptly.
{% else %}
Level: {{ … }}
Host: {{ host }}
Content: Elasticsearch JVM heap memory alarm has been restored
{% endif %}
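The same branching logic can be expressed in ordinary Python as a rough sketch. The level strings used here (critical, error, warning, nodata, ok) are assumptions made for the example, as are the function names; only the message texts follow the article.

```python
def advice_for(status: str) -> str:
    """Pick one message per level; the level names are assumed, not confirmed."""
    if status == "critical":
        return "Urgent issue, please deal with it immediately!"
    elif status == "error":
        return "Important issue, please deal with it."
    elif status == "warning":
        return "Possible problem, deal with it when you have time."
    elif status == "nodata":
        return "Data interruption, please deal with it immediately!"
    else:
        return "No problem!"


def build_content(status: str, host: str, heap_usage_percent: float) -> str:
    """Build alarm text for abnormal levels and recovery text otherwise."""
    lines = [f"Level: {status}", f"Host: {host}"]
    if status != "ok":
        lines.append(
            f"Content: Elasticsearch JVM heap memory usage is {heap_usage_percent}%"
        )
        lines.append("Suggestion: please check the business status promptly.")
    else:
        lines.append("Content: Elasticsearch JVM heap memory alarm has been restored")
    return "\n".join(lines)


print(advice_for("critical"))
print(build_content("error", "web-001", 93.5))
print(build_content("ok", "web-001", 41.0))
```

Putting the abnormal and recovery wording in one place mirrors what a single event content box with an if/else branch achieves.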
In another situation, event recipients want to see not only the content related to the events generated by the monitor but also the results of additional data queries, whether or not that data is related to the monitor's configured rules. Here, template variables alone are not enough for rendering. We can handle this with the embedded DQL query function.

This function runs a DQL statement over the current detection time range (that is, between check_start_time and end_time). Normally, the first row of data returned by the query can be used as a template variable in the template, as follows:

A field: {{ … }}
Suppose we need to query the data whose host field is "my_server" and assign the first row of the result to the dql_data variable:

The template is written as:

{% set dql_data = DQL("…") %}
host os: {{ dql_data.os }}

From then on, expressions such as {{ dql_data.os }} can be used in the template to output specific fields from the query result.

Sometimes the DQL statement to be executed needs to have parameters passed in.

Suppose the monitor's by condition is set to region and host, and the event content is written as follows:

{% set dql_data = DQL("…", region, host) %}
Host info:
ip: {{ … }}
os: {{ dql_data.os }}

The event itself contains only the region and host template variables, which label different data; it does not carry further information such as the IP address or operating system. With inline DQL, region and host can be passed as query parameters to fetch the corresponding data, and the fields of the query result can then be used to output the related information.
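The flow of passing event fields into a query and using the first returned row can be sketched in Python. This is only an illustration: the in-memory HOST_OBJECTS list stands in for the queried data, and query_first, render_host_info and the sample rows are hypothetical, not part of any real DQL API.

```python
from typing import Optional

# Hypothetical stand-in for the data a query would search.
HOST_OBJECTS = [
    {"region": "hangzhou", "host": "web-001", "ip": "10.0.0.1", "os": "linux"},
    {"region": "hangzhou", "host": "web-002", "ip": "10.0.0.2", "os": "linux"},
    {"region": "shanghai", "host": "web-001", "ip": "10.1.0.1", "os": "windows"},
]


def query_first(conditions: dict) -> Optional[dict]:
    """Return the first row matching every condition, or None if nothing matches."""
    for row in HOST_OBJECTS:
        if all(row.get(key) == value for key, value in conditions.items()):
            return row
    return None


def render_host_info(event: dict) -> str:
    """Use the event's region and host as query parameters, then print extra fields."""
    dql_data = query_first({"region": event["region"], "host": event["host"]})
    if dql_data is None:
        return "Host info: not found"
    return "\n".join([
        "Host info:",
        f"ip: {dql_data['ip']}",
        f"os: {dql_data['os']}",
    ])


print(render_host_info({"region": "hangzhou", "host": "web-001"}))
```

Handling the empty result explicitly matters because a query scoped to the detection time range may return no rows at all.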

Whichever of the above scenarios applies, the final goal is the same: to use the template variables provided by Observation Cloud to customize the event and accurately capture its context at the abnormal moment, so that the relevant recipients can react in time.
