How OpenTelemetry Collector scales observability

Mondo Science Updated on 2024-01-30

Two presentations at KubeCon + CloudNativeCon 2023 showcased a variety of tools and services in the observability space.

Translated from "How The OpenTelemetry Collector Scales Observability" by B. Cameron Gain, founder and principal analyst at Revecom Media. His obsession with computers began in the early '80s, when he hacked a Space Invaders console at a local arcade and played games all day for 25 cents.

You can do without the OpenTelemetry Collector, part of the open source OpenTelemetry observability framework, but you probably don't want to, especially when deploying and monitoring large-scale applications. If you run multiple applications or microservices, you will likely want to use the Collector, not least for security reasons.

This becomes increasingly apparent as OpenTelemetry expands in scope and gains wide acceptance as a unified interface to the observability tools and components of your choice, and as vendors work to support the OpenTelemetry standard.

The OpenTelemetry Collector is observability pipeline middleware that ingests, processes, and exports data at scale. Evan Bradley, a senior software engineer at Dynatrace, and Tyler Helmuth, a senior software engineer at Honeycomb.io and open source expert, explained this at KubeCon + CloudNativeCon last month in a talk titled "OTTL Me Why Transforming Telemetry in the OpenTelemetry Collector."

"So why might you want to use the Collector? Well, there are a lot of reasons, but the first important one is that you can process at the edge. Processing at the edge allows you to distribute this work across multiple machines, which can help increase data throughput in the pipeline," Bradley said. "You can run the Collector at the edge or anywhere else in the pipeline, because it can be deployed anywhere: containerized, virtualized, or even as a service. In addition, you can process the data near its origin or farther downstream, for example at a critical point in the pipeline such as the entry point of a secure network boundary."

The Collector is fast and versatile, and designed to adapt well to different use cases, Bradley said. "It's designed with high throughput and low latency in mind, so it won't slow down your pipeline. In addition, it has low CPU, memory, and disk space requirements."

The OpenTelemetry Collector collects data sent to it from one or more sources. In addition to receiving data, it also exports data to endpoints, such as Grafana dashboards for visualization.

Through its configuration, the Collector can be set up to collect specific types of logs, traces, and metrics for observability.
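As a rough illustration, here is a minimal sketch of such a configuration, wiring the standard OTLP receiver, the batch processor, and the OTLP/HTTP exporter into pipelines for traces, metrics, and logs. The backend endpoint is a placeholder assumption, not something taken from the talks.

```yaml
receivers:
  otlp:                      # accepts OTLP data from instrumented applications
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch: {}                  # groups telemetry into batches before export

exporters:
  otlphttp:
    endpoint: https://backend.example.com:4318   # placeholder observability backend

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp]
```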

At first, you can choose not to use it, especially if you have a single monitored application that collects its metrics, logs, traces, and so on and transmits them directly to an observability platform, whether through OpenTelemetry or otherwise.

However, when monitoring multiple applications or microservices, this approach becomes challenging. Without the OpenTelemetry Collector, you would need to configure telemetry export separately for each application and each backend, which quickly becomes cumbersome.

Instead, the OpenTelemetry Collector acts as a single endpoint for all of your microservices: every application and microservice sends its telemetry to the one unified point the Collector provides.
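In Kubernetes, for example, each instrumented service can simply point its OTLP exporter at that shared Collector address. The Deployment, image, and Collector service name below are hypothetical; only the OTEL_EXPORTER_OTLP_ENDPOINT variable is a standard OpenTelemetry SDK setting.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout-service          # hypothetical microservice
spec:
  replicas: 1
  selector:
    matchLabels:
      app: checkout
  template:
    metadata:
      labels:
        app: checkout
    spec:
      containers:
        - name: checkout
          image: example.com/checkout:1.0        # placeholder image
          env:
            - name: OTEL_EXPORTER_OTLP_ENDPOINT  # all services share this one endpoint
              value: "http://otel-collector.observability.svc.cluster.local:4317"
```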

With the Collector in place, you can view and manage telemetry from all of them collectively, in a unified view on a platform such as Grafana. Grafana offers some alternatives that work without the OpenTelemetry Collector, but the Collector significantly simplifies the process.

Depending on the situation, a Collector can be customized by selecting only the components you need, Bradley said. "In cases where the existing options don't do what you need, all Collector components are written using the same core APIs, allowing you to add your own components alongside the built-in ones to accomplish the task," Bradley said.

Data flows through the Collector's pipeline, which is made up of individual components, each handling a specific task, Bradley said. The Collector has five categories of components, but his presentation covered receivers, processors, and exporters. The diagram above illustrates an example pipeline in which data enters the Collector on the left, passes through the pipeline, and is emitted on the right, Bradley said.

Through the OpenTelemetry Transformation Language (OTTL), the Collector's filter and transform processors can be used to control which kinds of telemetry it keeps and sends on. In his presentation, Helmuth showed how OTTL supports filtering, including when it makes sense to reduce intake by dropping events classified as completed, since they are considered unnecessary.
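Filtering is one use of OTTL; the same language also drives the transform processor, which modifies telemetry in flight. A minimal sketch, assuming a made-up attribute name purely for illustration:

```yaml
processors:
  transform:
    error_mode: ignore
    log_statements:
      - context: log
        statements:
          # hypothetical example: tag every log record passing through this Collector
          - set(attributes["pipeline.stage"], "edge-collector")
```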

In the diagram above, decisions about which data to discard are implemented with the filter processor, which operates on OTTL conditions. These conditions inspect the underlying telemetry without changing it. The filter processor uses the OTTL conditions to select the data to be discarded: when a condition is met, the processor removes the data, Helmuth said.
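A sketch of what such a filter processor configuration could look like for the "completed events" case Helmuth described; the attribute name and value in the condition are assumptions for illustration, not taken from his slides.

```yaml
processors:
  filter/drop_completed:
    error_mode: ignore        # skip records a condition cannot be evaluated against
    logs:
      log_record:
        # hypothetical condition: drop Kubernetes events whose reason marks them as completed
        - 'attributes["k8s.event.reason"] == "Completed"'
```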

In the case of the Kubernetes objects receiver, it emits Kubernetes events in the form of logs, where the event data lives in a nested map in the log body.

Any bodies that are not arranged in the expected structure (i.e., those that do not resemble Kubernetes events) are discarded, Helmuth explained. In the top box of the image above, the body is a map containing an object key with a nested map, so the condition is not met and the data is preserved. In the second box, by contrast, the body is a string, which does not fit the expected map structure, so it is dropped, Helmuth said.
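A minimal sketch of how that structural check could be expressed with the filter processor and OTTL's IsMap converter, assuming the events arrive as log records from the k8sobjects receiver; this is an illustrative reconstruction, not the exact configuration shown in the talk.

```yaml
processors:
  filter/non_k8s_events:
    error_mode: ignore
    logs:
      log_record:
        # drop any log record whose body is not a map (e.g., a plain string),
        # since a Kubernetes event is expected to arrive as a nested map in the body
        - 'not IsMap(body)'
```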

There are different alternatives for telemetry data collection, and the OpenTelemetry Collector falls into the category of observability agents. Observability agents such as the OpenTelemetry Collector, Fluent Bit, and Vector "exhibit a high degree of robustness and perform a variety of tasks to achieve their remarkable results," said Braydon Kains, a software developer at Google, in his KubeCon + CloudNativeCon talk on observability agent performance. At the end of the presentation, he was asked which collector is the best. Kains described the Google Cloud Ops Agent as a fusion of the two: behind the scenes, it uses Fluent Bit for log collection and OpenTelemetry for metric and trace collection, he said.

His team maintains a configuration layer that generates the configurations for the underlying OpenTelemetry and Fluent Bit instances. These configurations include recommended optimizations tailored for users who run primarily on ordinary virtual machines, so that metrics can be collected efficiently with OpenTelemetry, he said.

"There are a lot of knobs, and it can be difficult for a newcomer to keep track of all of them," Kains said. "We took on the responsibility of keeping track of these knobs and trying to come up with settings that are optimal in most common situations."
