Learn how the ELK real-time logging platform works and practice building and using it.
When troubleshooting problems in production, querying logs is always an indispensable step. Nowadays most systems use a microservice architecture, and the logs are scattered across different machines, which makes querying them extremely difficult. As the saying goes, to do a good job you must first sharpen your tools: a unified real-time log analysis platform is exactly the timely help we need and will certainly improve the efficiency of our online troubleshooting. This article walks you through building and using ELK, an open-source real-time log analysis platform.
ELK is an open-source real-time log analysis platform that consists of three parts: Elasticsearch, Logstash, and Kibana.
Primarily used to collect server logs, Logstash is an open-source data collection engine with real-time pipeline capabilities. Logstash dynamically unifies data from disparate data sources and normalizes the data to the destination of your choice.
The process by which Logstash collects data consists of three main parts:
Input: Data (including but not limited to logs) is often stored in different forms and formats across different systems, and Logstash lets you collect data from a wide variety of sources (files, syslog, MySQL, message middleware, and so on).
Filters: Parse and transform the data in real time, identify named fields to build structure, and convert them into a common format.
Output: Elasticsearch is not the only storage option; Logstash offers many output choices.
Elasticsearch (ES) is a distributed, RESTful search and data analytics engine with the following features:
Queries: Lets you perform and combine many types of searches (structured, unstructured, geographic, metric) and adjust how you search as you go.
Analytics: Elasticsearch aggregations let you see the big picture and explore trends and patterns in your data.
Speed: Very fast; queries over hundreds of millions of records return in milliseconds.
Scalability: Runs anywhere from a laptop to hundreds or thousands of servers hosting petabytes of data.
Resiliency: Designed from the ground up to run in a distributed environment.
Flexibility: Supports many use cases; numeric, textual, geographic, structured, and unstructured data are all welcome.
Kibana makes massive amounts of data easy to understand. Its simple, browser-based interface lets you quickly create and share dynamic dashboards to track data changes in Elasticsearch in real time. It is also easy to set up: you can install Kibana and start exploring your Elasticsearch index data in minutes, with no additional infrastructure required.
The three components above are described in detail in the article "ELK Protocol Stack Introduction and Architecture" and are not repeated here. In ELK, the approximate workflow of the three components is shown in the following figure: Logstash collects logs from each service and stores them in Elasticsearch, and Kibana then queries the logs from Elasticsearch and displays them to end users.
Figure 1. Approximate workflow of ELK
Usually our services are deployed on different servers, so how to collect log information from multiple servers is a key point. The solution provided in this article is shown in the following diagram:
Figure 2. The ELK implementation provided in this article
As shown in the figure above, the entire ELK setup works as follows:
A Logstash instance is deployed on each microservice machine (each service that generates logs) in the shipper role; it is responsible for collecting data from the log files that the service writes on that machine and pushing the messages to the Redis message queue.
Another server deploys a Logstash instance in the indexer role; it is responsible for reading data from the Redis message queue, parsing and processing it in the Logstash pipeline, and outputting it to the Elasticsearch cluster for storage.
The Elasticsearch primary and secondary nodes synchronize data between themselves.
Kibana is deployed on a server to read log data from Elasticsearch and display it on a web page.
With this diagram, I believe you have a general idea of the workflow of the ELK platform we are going to build and of the components required. Let's start building it together.
This section describes how to build the ELK logging platform: Logstash (in the indexer role), Elasticsearch, and Kibana. To complete this section, you need the following:
An Ubuntu machine or virtual machine. As a tutorial, this article omits the setup of an Elasticsearch cluster and installs Logstash (indexer), Elasticsearch, and Kibana on the same machine.
The JDK installed on Ubuntu. Note that Logstash requires JDK 1.7 or above; for details on installing the JDK on Ubuntu, please refer to the article "Installing JDK 1.8 on Ubuntu."
The Logstash, Elasticsearch, and Kibana installation packages, which you can find on this page.
Unpack the Logstash installation package:
tar -xzvf logstash-7.3.0.tar.gz
As a simple test, go to the extracted directory and start a pipeline that reads input from the console and writes it back to the console:
cd logstash-7.3.0
elk@elk:~/elk/logstash-7.3.0$ bin/logstash -e 'input { stdin { } } output { stdout { } }'
Seeing the following logs means that logstash started successfully.
Figure 3. Logstash startup success logs
Enter hello logstash in the console; if you see output like the following, the Logstash installation is successful.
Listing 1. Verify that Logstash started successfully
hello logstash
Unzip the installation package:
tar -xzvf elasticsearch-7.3.0-linux-x86_64.tar.gz
Start Elasticsearch:
cd elasticsearch-7.3.0/bin
./elasticsearch
There are two issues I encountered in the process of launching Elasticsearch, and I will list them here for easy troubleshooting.
Problem 1: If the memory on your machine is less than the value set by Elasticsearch, the error shown in the following figure is reported. The solution is to modify the heap-related configuration in the elasticsearch-7.3.0/config/jvm.options file to suit your machine's memory size; if the error is still reported after the change, reconnect to the server and try again.
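For example, on a machine with little memory the heap settings might be lowered to something like the following (512 MB is only illustrative; pick values that fit your machine):

# elasticsearch-7.3.0/config/jvm.options
-Xms512m
-Xmx512m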
Figure 4. Elasticsearch startup error caused by insufficient memory
Problem 2: If you start Elasticsearch as the root user, it reports the error shown in the figure below. The solution, of course, is to create a new user and start Elasticsearch as that user; there are many guides online for adding a new user, so I won't repeat them here.
Figure 5. Error when starting Elasticsearch as the root user
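For reference, a minimal way to create such a user and start Elasticsearch with it (the user name elk is only an example):

sudo adduser elk                             # create a dedicated user
sudo chown -R elk:elk elasticsearch-7.3.0    # give it ownership of the extracted directory
su - elk                                     # switch to that user and start Elasticsearch as before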
After startup succeeds, open another session window and execute the curl http://localhost:9200 command; if the following result appears, the Elasticsearch installation is successful.
Listing 2. Check whether Elasticsearch started successfully
elk@elk:~$ curl http://localhost:9200
{
  ...
  "tagline" : "You Know, for Search"
}
Unzip the installation package:
tar -xzvf kibana-7.3.0-linux-x86_64.tar.gz
Modify the configuration file config/kibana.yml, which specifies the Elasticsearch connection information.
Listing 3. Kibana configuration information
# Elasticsearch host address
elasticsearch.hosts: "http://ip:9200"
# Allow remote access to the server
server.host: "0.0.0.0"
# Elasticsearch username (this is actually the user I used to start Elasticsearch on the server)
elasticsearch.username: "es"
# Elasticsearch authentication password (this is actually the password I used to start Elasticsearch on the server)
elasticsearch.password: "es"
Start Kibana:
cd kibana-7.3.0-linux-x86_64/bin
./kibana
Access http://ip:5601 in your browser. If the following screen appears, Kibana has been installed successfully.
Figure 6. Kibana launch success screen
Now that the ELK logging platform is installed, let's look at how to use it through concrete examples. The following sections show how to hand Spring Boot logs and NGINX logs over to ELK for analysis.
First we need to create a Spring Boot project. I previously wrote an article about using AOP to handle Spring Boot web logs uniformly; the Spring Boot project used in this article is based on that one, and its source code can be obtained here. The project's resources directory contains the spring-logback.xml configuration file.
Listing 4. Logback configuration of the Spring Boot project
<!-- logback for demo mobile -->
...
<pattern>%d [%thread] %-5level %logger ${appName} - %msg%n</pattern>
...
Much of the configuration above is omitted; you can get the complete file from the source code. In this configuration we define an appender named rolling_file that writes the log file in the specified format. The pattern tag configures the concrete log format; through it we output the time, the thread, the log level, the logger (usually the full path of the class that prints the log), and the name of the service.
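To give a concrete picture of its shape, a rolling_file appender along the following lines would produce entries like the ones shown later in this article; the rolling policy and the ${appName} property (an assumed name for the property that carries the service name) are illustrative, and the real file is in the source code.

<appender name="rolling_file" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <!-- the log file that the shipper-role Logstash will read later -->
    <file>/log/sb-log.log</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
        <fileNamePattern>/log/sb-log.%d{yyyy-MM-dd}.log</fileNamePattern>
    </rollingPolicy>
    <encoder>
        <!-- time [thread] level logger serviceName - message -->
        <pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level %logger{50} ${appName} - %msg%n</pattern>
    </encoder>
</appender>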
Package the project and deploy it to an Ubuntu server.
Listing 5. Package and deploy the Spring Boot project
# Packaging command
mvn package -Dmaven.test.skip=true
# Deployment command
java -jar sb-elk-start-0.0.1-SNAPSHOT.jar
View the log file. In the logback configuration file I store the logs in the /log/sb-log.log file; execute the more /log/sb-log.log command, and if the following result appears, the deployment was successful.
Figure 7. Spring Boot log file
After the Spring Boot project is deployed successfully, we also need to install and configure a shipper-role Logstash on the machine where the project is deployed. The installation process of Logstash was already covered in the ELK platform setup section, so it is not repeated here. Once the installation is complete, we need to write a Logstash configuration file that collects the logs from the log file and outputs them to the Redis message channel, as shown in the shipper configuration below.
Listing 6. Logstash configuration for the shipper role
input { file { ... } }
output { redis { ... } }
The Logstash configuration in fact corresponds to the three parts of the Logstash pipeline mentioned earlier: input, filter, and output. We don't need a filter here, so it is omitted. The data source used by the input in the configuration above is of the file type, and you only need to configure the path of the local log file to be collected. The output describes where the data is sent; in this case it is configured to output to Redis.
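To make this concrete, a minimal sketch of such a shipper configuration is shown below; the Redis address and the key name are placeholders, and the actual file can be obtained together with the source code.

input {
    file {
        # path of the log file produced by the Spring Boot service above
        path => ["/log/sb-log.log"]
    }
}
output {
    redis {
        host      => "redis-ip"     # Redis address (placeholder)
        port      => 6379
        data_type => "channel"      # publish/subscribe mode, explained below
        key       => "sb-logback"   # channel name (placeholder); the indexer must use the same key
    }
}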
The redis data_type option accepts two values: channel and list. channel uses Redis's publish/subscribe communication pattern, while list uses Redis's list data structure as a queue; both can be used for ordered, asynchronous message passing between systems. The advantage of channel over list is that it decouples publishers from subscribers. For example, suppose an indexer is continuously reading records from Redis and we now want to add a second indexer. With list, one record is picked up by the first indexer and the next record by the second, so the two indexers compete and neither reads the complete log. channel avoids this problem, which is why channel is used both in the shipper configuration file and in the indexer configuration file described below.
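To see the difference at the Redis level, here is a quick illustration with redis-cli (the key names are arbitrary):

# list: records are queued, and each record is handed to only one consumer
LPUSH logstash-list "log line 1"         # producer pushes a record
RPOP  logstash-list                      # consumer A receives "log line 1"
RPOP  logstash-list                      # consumer B gets (nil); the record is already gone

# channel: every subscriber receives every published message
SUBSCRIBE logstash-channel               # run in both consumer A and consumer B
PUBLISH   logstash-channel "log line 1"  # both subscribers receive the message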
After configuring the shipper-role Logstash, we also need to configure the indexer-role Logstash so that it receives log data from Redis, processes it with filters, and stores it in Elasticsearch, as shown below.
Listing 7. Logstash configuration for the indexer role
input { redis { ... } }
filter { grok { ... } }
output { elasticsearch { ... } }
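Filled in with placeholder values, such an indexer configuration might look roughly like the following; the Redis key, the Elasticsearch address, and the grok field names are illustrative, and the real file is available with the source code.

input {
    redis {
        host      => "redis-ip"     # the Redis instance the shipper writes to (placeholder)
        port      => 6379
        data_type => "channel"
        key       => "sb-logback"   # same channel name used by the shipper (placeholder)
    }
}
filter {
    grok {
        # parse time, thread, level, logger, service name, and cost time out of each entry
        match => { "message" => "%{TIMESTAMP_ISO8601:time} \[%{NOTSPACE:threadName}\] %{NOTSPACE:level} %{NOTSPACE:class} %{NOTSPACE:serviceName} - .*=%{NUMBER:costTime}ms.*" }
    }
}
output {
    elasticsearch {
        hosts => ["es-ip:9200"]     # Elasticsearch address (placeholder)
        index => "logback"          # the index that will be added in Kibana later
    }
}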
Unlike the shipper, we define a filter in the indexer pipeline, and that is exactly where the logs are parsed into structured data. Here is a logback log entry I captured:
Listing 8. A log entry output by the Spring Boot project
2019-08-11 18:01:31.602 [http-nio-8080-exec-2] info c.i.s.aop.WeblogaspAspect SB-ELK - Interface Log POST Request Test Interface End Call: Time Consumption=11ms, result=BaseResponse
In the filter, we use the grok plugin to parse the time, thread name, logger, service name, and interface time-consumption fields out of the log above. How does grok work? The message field is the Logstash field that holds the collected data, and match => {...} means the log content is matched against the given pattern. Grok parses the data through regular expressions; TIMESTAMP_ISO8601, NOTSPACE, and the like seen above are patterns built into grok, and the full list of built-in grok patterns can be viewed here. We can use the Grok Debugger to test the correctness of parsing rules, which avoids repeatedly verifying them in the real environment.
After the above steps, we have completed the construction of the entire ELK platform and the integration of the Spring Boot project. Let's follow the steps below and see the effect.
1. Start Elasticsearch; the startup command was given in the ELK platform setup section and is not repeated here (starting Kibana works the same way).
2. Start the indexer-role Logstash. Go to the Logstash extraction directory and run the following command:
bin/logstash -f indexer-logstash.conf
3. Start Kibana.
4. Start Logstash for the shipper role.
Go to the Logstash extraction directory and run the following command:
bin/logstash -f shipper-logstash.conf
5. Call the Spring Boot interface, and the data should have been written to ES at this time.
6. Access http://ip:5601 in your browser to open the Kibana web interface, and add the logback index as shown in the following figure.
Figure 8. Add an Elasticsearch index in Kibana
7. Go to the Discover page and select the logback index; you can then see the log data, as shown in the following figure.
Figure 9. Viewing logs in ELK
I believe that by following the above steps you have successfully built your own ELK real-time logging platform and connected logback logs to it. In real scenarios, however, there is almost never only one type of log, so let's connect the NGINX logs as well on top of the steps above. Of course, the prerequisite is that NGINX is installed on the server; the installation process is well documented online, so I won't repeat it here. The NGINX logs look like the following (NGINX access logging is enabled by default and writes to the /var/log/nginx/access.log file).
Listing 9. NGINX access log
192.168.142.1 - - [17/Aug/2019:21:31:43 +0800] "GET /weblog/get-test?name=elk HTTP/1.1" 200 3 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36"
Again, we need to write a grok parsing rule for this log, along the following lines (the field names are illustrative):
Listing 10. Grok parsing rule for the NGINX access log
% \"% %http/%" % "%" "%"
The key point now is that the indexer-role Logstash needs to support both types of input, filter, and output. How? First, a type is assigned to each input, and the data then passes through different filters and outputs depending on that type, as shown below (for space reasons the full configuration file is not shown here; you can get it here).
Listing 11. The indexer-role Logstash configuration supporting both types of log input
input { redis { ... } }
filter { if [type] == "nginx" { ... } }
output { if [type] == "nginx" { ... } }
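Roughly, the routing looks like the sketch below; the type value is attached to each event by the shipper's inputs (for example, type => "nginx" on the NGINX file input), and the index names and Elasticsearch address here are placeholders.

filter {
    if [type] == "nginx" {
        grok { match => { "message" => "..." } }   # the NGINX rule from Listing 10 (abbreviated)
    } else {
        grok { match => { "message" => "..." } }   # the logback rule shown earlier (abbreviated)
    }
}
output {
    if [type] == "nginx" {
        elasticsearch { hosts => ["es-ip:9200"] index => "nginx" }
    } else {
        elasticsearch { hosts => ["es-ip:9200"] index => "logback" }
    }
}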
My NGINX is deployed on the same machine as the Spring Boot project, so I also need to modify the shipper-role Logstash configuration to support both types of log input and output; the contents of that configuration file can be obtained here. After the above configuration is complete, we can start the ELK platform, the shipper-role Logstash, NGINX, and the Spring Boot project according to the steps in the previous section, then add the nginx index in Kibana and view the Spring Boot and NGINX logs at the same time, as shown in the following figure.
Figure 10. Viewing NGINX logs in ELK
In the steps above, ELK was started by executing the startup commands of the three components one by one, and each runs in the foreground, which means that if we close the session window, the component stops and the entire ELK platform becomes unusable. That is not practical in a real deployment, so what remains is to make ELK run in the background. As recommended in the book Logstash Best Practices, we will use Supervisor to manage starting and stopping ELK. First we need to install Supervisor; on Ubuntu simply execute apt-get install supervisor. After the installation succeeds, we also need to configure the three ELK components in Supervisor's configuration file (which defaults to /etc/supervisor/supervisord.conf).
Listing 12. Start ELK in the background
[program:elasticsearch]
environment=JAVA_HOME="/usr/java/jdk1.8.0_221/"
directory=/home/elk/elk/elasticsearch
user=elk
command=/home/elk/elk/elasticsearch/bin/elasticsearch

[program:logstash]
environment=JAVA_HOME="/usr/java/jdk1.8.0_221/"
directory=/home/elk/elk/logstash
user=elk
command=/home/elk/elk/logstash/bin/logstash -f /home/elk/elk/logstash/indexer-logstash.conf

[program:kibana]
environment=LS_HEAP_SIZE=5000m
directory=/home/elk/elk/kibana
user=elk
command=/home/elk/elk/kibana/bin/kibana
After the preceding configuration is complete, execute sudo supervisorctl reload and the entire ELK stack is started; by default the programs also start automatically with the system. Of course, we can also use sudo supervisorctl start/stop [program_name] to manage individual applications.
In this tutorial, we first learned what ELK is and then built an ELK log analysis platform step by step, connecting both logback and NGINX logs to it. You can find the source code and the Logstash configuration files on GitHub.