As back-end engineers, our instinct is to reach for the usual combo: expose an API, add tiered caching, scale out machines, and tune the thread pools. But since the data here is updated so rarely that the updates can be counted on two hands, we went with the safest option: generate static files up front and let the CDN absorb the traffic.
The architecture process looks something like this:
After each data update, a fresh round of files is regenerated; refreshing the CDN then triggers a flood of back-to-origin requests, and in the worst case the application servers have to hold 90,000 QPS.
We had 40 4-core machines across two data centers serving a 25KB data file, and at 50,000 QPS the CPU was already at 90%.
That clearly doesn't meet the needs of the business. So what to do? Let's try the brute-force approach first: just add more machines.
At this point the QA colleagues reported that the stress-test data was wrong: the largest file in the latest round would actually be 125KB, which made matters worse.
So the number of machines was doubled to 80, yet the server-side CPU was still the bottleneck and QPS would not go up.
So what exactly was consuming the CPU? The overall architecture could hardly be any simpler.
At this point we noticed that nginx had gzip compression turned on to save network bandwidth. Could this be the culprit?
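The gzip-related part of the server block looked roughly like the sketch below; only the compression level of 6 is confirmed by the tuning step that follows, and the other directives are typical settings shown for illustration.

server {
    # dynamic compression: every response is gzipped on the fly, costing CPU per request
    gzip               on;
    gzip_comp_level    6;                              # the level we started from
    gzip_min_length    1k;                             # illustrative
    gzip_types         application/json text/plain;    # illustrative
    ...
}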
To test this hypothesis, we lowered the gzip compression level in nginx from 6 to 2 to reduce the amount of CPU spent on compression:

gzip_comp_level 2;

In this round of load testing the CPU still filled up quickly, but QPS just about reached 90,000, which confirmed that gzip was indeed what was consuming the CPU.
As a well-known web server, nginx is famous for its performance and concurrency; something had to be wrong for a plain static data file to push the application servers this hard.
Having established that gzip was eating the CPU, we dug into the relevant material and made some progress.
Static files such as HTML, CSS, and JS contain a great deal of repetition: spaces, tags, and so on. A repeated section can be encoded as a distance plus a length pointing back to its earlier occurrence, which drastically reduces the number of bytes and therefore the bandwidth. This is the basic principle behind gzip's lossless compression.
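As a quick illustration of how much those back-references save (a minimal sketch using Java's GZIPOutputStream; the sample payload is made up), compressing a highly repetitive string shrinks it dramatically:

import java.io.ByteArrayOutputStream;
import java.util.zip.GZIPOutputStream;

public class GzipRatioDemo {
    public static void main(String[] args) throws Exception {
        // An "HTML-like" payload: the same tag repeated 1,000 times.
        byte[] original = "<div class=\"item\">hello</div>\n".repeat(1000).getBytes("UTF-8");

        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(compressed)) {
            gzip.write(original);   // repeated runs become (distance, length) back-references
        }

        System.out.printf("original: %d bytes, gzipped: %d bytes%n",
                original.length, compressed.size());
    }
}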
gzip is an end-to-end compression scheme: compression is completed on the server side, and the compressed file travels unchanged until it reaches the client. Isn't that a good theoretical basis for compressing the file once, ahead of time, instead of on every request?
There are two types of gzip compression in nginx: dynamic compression and static compression.
Dynamic compression: when nginx returns a response to the client, it spends its own CPU to compress the content in real time, ensuring the client receives a gzipped file.
This is handled by ngx_http_gzip_module, which is compiled in by default; see its documentation for details.
Static compression: nginx returns a pre-compressed .gz file directly to the client and no longer compresses in real time; if the .gz file does not exist, the corresponding original file is served instead.
This is handled by ngx_http_gzip_static_module, which is not built by default and has to be compiled in separately (--with-http_gzip_static_module); see its documentation for details.
If gzip_static always is enabled and a client does not support gzip, the gunzip module can additionally be installed on the server to decompress the file for that client; we did not need this here.
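Put together, the static-compression path needs little more than one directive inside the location that serves the files. In the sketch below the location and root are hypothetical, and the gunzip line is only relevant in combination with gzip_static always.

location /static/ {
    root          /export/data;    # hypothetical; foo.json.gz is expected next to foo.json
    gzip_static   on;              # serve the pre-compressed .gz when it exists
    # gunzip      on;              # only needed with "gzip_static always" for non-gzip clients
}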
A quick check showed that the nginx bundled with JDOS already has ngx_http_gzip_static_module compiled in, saving us the trouble of recompiling.
Next, an extra .gz file is generated for each data file locally via GZIPOutputStream, and static compression is switched on in the nginx configuration:
gzip_static on;
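The pre-compression step itself is only a few lines of Java. A minimal sketch (file names and paths are hypothetical, not the project's actual code):

import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPOutputStream;

public class PreCompress {
    // Writes data.json.gz next to data.json so that nginx's gzip_static can pick it up.
    static void gzipAlongside(Path source) throws IOException {
        Path target = source.resolveSibling(source.getFileName() + ".gz");
        try (InputStream in = Files.newInputStream(source);
             GZIPOutputStream out = new GZIPOutputStream(Files.newOutputStream(target))) {
            in.transferTo(out);   // stream copy; GZIPOutputStream does the compressing
        }
    }

    public static void main(String[] args) throws IOException {
        gzipAlongside(Path.of("/export/data/data.json"));   // hypothetical path
    }
}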
Facing 90,000 QPS, the 40 machines now used only about 7% CPU.
As the load generators kept pushing, the application servers' CPU climbed only slowly; we stopped when the network egress rate reached 89MB/s, for fear of affecting other containers on the same hosts, and by then QPS had already hit 270,000.
QPS went from 50,000 to 270,000, more than 5x, while CPU dropped from 90% to 7%, more than 10x lower, so throughput per unit of CPU improved by more than 50x overall.
After this round of analysis and hands-on testing, static compression appears to hold an overwhelming advantage. So which scenarios suit dynamic compression and which suit static compression? After some research, we drew the following conclusions.
Purely static files that never change are suited to static compression: gzip them ahead of time rather than wasting CPU and bandwidth compressing them on every request. Dynamic compression suits dynamic scenarios, such as data an API returns to the front end: the content changes from response to response, so nginx has to compress it on the fly to save server bandwidth.

As back-end engineers, nginx is an old acquaintance we see every day without really looking at. In daily work it is mostly a reverse proxy: configure a matching proxy rule, check a header setting, and move on. This time it was serving static resources directly, and the series of optimizations made while tuning deepened our basic understanding of gzip's dynamic and static compression. It may look trivial to the old hands, but for us it was a rare opportunity to broaden our skills.
In our work so far we have focused on business architecture design and development, and performance optimization had settled into a kind of mental inertia: for large data volumes and long transactions, reduce loop counts, batch the work, raise concurrency, add caches, and make whatever can be asynchronous asynchronous. The bottleneck usually shows up at the I/O layer, since disks are slow, so cutting the number of round trips to the database usually does the trick and little else becomes a problem. This time was a bit different: the CPU was being hammered because of heavy computation over the data, and under high-concurrency requests any link in the chain can turn into a performance problem.
Author: Jingdong Retail Yan Chuang.
Source: JD Cloud Developer Community. Please credit the source when reposting.