How to fix HTTP 502 errors from an nginx + php-fpm service

10-30-2023

As one of our web projects expanded to more cities, the number of visits grew and so did the pressure on the database. We provide interfaces to downstream businesses, and recently they have been reporting a large number of requests that return 502.

502 Bad Gateway generally means an error in the upstream (here, PHP). For PHP, the common causes of a 502 are that script execution exceeds the configured timeout, or that the timeout is set so high that PHP processes are tied up for a long time and no idle worker is left to accept new requests.

In our project the cause was a PHP execution time limit that was too short. In that case we can raise the limit appropriately and get rid of the 502s first; optimizing the underlying code will take more time.

Two settings control PHP execution time: max_execution_time in php.ini and request_terminate_timeout in the php-fpm configuration. request_terminate_timeout can override max_execution_time, so if you don't want to change the global php.ini, changing the php-fpm configuration is enough.

Let me analyze in detail why a PHP script that runs past the configured time makes nginx return 502.

First, set up an environment that reproduces the problem:

nginx and php-fpm each start only one worker, which makes tracing easier.

The request_terminate_timeout of php-fpm is set to 3s.

Test script test.php

<?php sleep(20); echo 'ok';

Here we go:

When the browser visited www.v.com/test.php, the response arrived after 3s as expected ... but it was a 404?! What?

Not a good start. A quick look at nginx's configuration file is in order.

This location block redirects to a friendly page when a 5xx error occurs, but I don't have a 50x.html file under /usr/share/nginx/html, so I got a 404. That would distort the diagnosis, so just comment it out. Visit again, wait 3s, and the 'normal' 502 page finally appears.

With the environment ready, let's follow the usual routine for troubleshooting web problems and look at the error logs first:

nginx:

The errors are all recv() failed (104: Connection reset by peer).

recv failed because the connection was reset. But why was the connection reset? Did the two sides fall out?

Next, the php-fpm error log:

(note php_admin_value in php-fpm.

Each request generates two warnings and one notice:

Warning: script execution timed out and was terminated.

Warning: the child process received SIGTERM and exited.

Notice: a new child process was started (because I set pm.min_spare_servers = 1).

So when a php-fpm worker exceeds its time limit, not only is the script terminated, the worker process itself exits too. That would explain nginx's "connection reset" error: the worker exited while its connection to nginx was still open, and in TCP a side whose connection is torn down abruptly sends an RST to its peer.
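To see how an abrupt close on one side surfaces as "connection reset by peer" on the other, here is a minimal Python sketch. It is not how php-fpm works internally: the server thread only stands in for the killed worker, and it forces the reset with SO_LINGER set to a zero timeout, an assumption chosen so that the RST is sent deterministically. The names abortive_server and request_and_wait are made up for this illustration.

```python
import socket
import struct
import threading


def abortive_server(listener):
    """Stand-in for the killed worker: read the request, then abort the connection."""
    conn, _ = listener.accept()
    conn.recv(1024)  # consume the request, as a worker would
    # Assumption for the demo: l_onoff=1, l_linger=0 makes close() send RST, not FIN.
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    conn.close()


def request_and_wait():
    """Stand-in for nginx: send a request and wait for a reply that never comes."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # any free loopback port
    listener.listen(1)
    worker = threading.Thread(target=abortive_server, args=(listener,))
    worker.start()
    client = socket.create_connection(listener.getsockname())
    try:
        client.sendall(b"request")
        try:
            client.recv(1024)
            return "closed cleanly"
        except ConnectionResetError:  # ECONNRESET, errno 104 on Linux
            return "connection reset by peer"
    finally:
        client.close()
        worker.join()
        listener.close()


result = request_and_wait()
print(result)
```

The client side plays the role of nginx: it has sent its request and is blocked in recv when the reset arrives, which is the situation the nginx error log describes.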

From the logs we already know that the PHP script timed out and the worker child process exited, which led nginx to report "connection reset by peer". Now let's look at what PHP and nginx actually do, using strace:

php:

1. Accept a connection from nginx (socket, bind and listen were all done in the master). The nginx-side port is 47039. The worker accepts on fd 0, i.e. standard input, which is what the FastCGI protocol stipulates for the listening socket. The connected descriptor returned by accept is 3.

2. Read the data sent by nginx from fd3, in FastCGI format. The largest read returned 856 bytes. Why were there five reads?

FastCGI records are 8-byte aligned, and each consists of a header and a body. nginx first sends a begin-request record carrying the request id, version, type and so on (8-byte header, 8-byte body), then a params record carrying the GET parameters and environment variables (8-byte header, variable-length body), and finally a params record with a header but no body, which marks the end of the parameters (8-byte header). So the first three reads fetch the header and body of the begin-request record and the header of the params record, the fourth read fetches the actual parameter data, and the last read fetches the header of the final params record. These five reads add up to 8+8+8+856+8 = 888 bytes. nginx writes 896 bytes (see the nginx trace below); the remaining 8 bytes are most likely the header of the empty stdin record that ends the request. For POST requests, stdin records carrying the request body are sent as well.

3. Sleep for 20s, i.e. the sleep(20) in the PHP script. The process is then terminated, so nothing follows, and strace exits as well.
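The record layout described above can be checked with a short Python sketch that builds a FastCGI GET request and adds up the sizes. The helper names (record, build_get_request, split_records) are hypothetical, and the 856-byte params body is filler rather than real name-value pairs. Note that the header and body sizes named in the text (8+8+8+856+8) add up to 888; the sketch also appends an empty stdin record, which is an assumption about where the remaining 8 of the 896 bytes come from.

```python
import struct

FCGI_VERSION_1 = 1
FCGI_BEGIN_REQUEST = 1
FCGI_PARAMS = 4
FCGI_STDIN = 5
FCGI_RESPONDER = 1

# Header fields: version, type, requestId, contentLength, paddingLength, reserved
HEADER = struct.Struct("!BBHHBx")


def record(rtype, request_id, body=b""):
    """Build one record: 8-byte header, body, then padding up to a multiple of 8."""
    padding = -len(body) % 8
    header = HEADER.pack(FCGI_VERSION_1, rtype, request_id, len(body), padding)
    return header + body + b"\0" * padding


def build_get_request(params_body, request_id=1):
    """Records for a body-less request: begin, params, empty params, empty stdin."""
    begin_body = struct.pack("!HB5x", FCGI_RESPONDER, 0)  # role, flags, reserved
    return b"".join([
        record(FCGI_BEGIN_REQUEST, request_id, begin_body),
        record(FCGI_PARAMS, request_id, params_body),
        record(FCGI_PARAMS, request_id),  # empty params: end of parameters
        record(FCGI_STDIN, request_id),   # empty stdin: end of body (assumed here)
    ])


def split_records(data):
    """Walk the byte stream and return (type, total record size) for each record."""
    records = []
    offset = 0
    while offset < len(data):
        _, rtype, _, length, padding = HEADER.unpack_from(data, offset)
        size = HEADER.size + length + padding
        records.append((rtype, size))
        offset += size
    return records


request = build_get_request(b"x" * 856)  # filler standing in for the real params
records = split_records(request)
print(records, len(request))
```

With an 856-byte params body the four records come to 16 + 864 + 8 + 8 bytes, which matches the 896 bytes seen on the nginx side.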

nginx:

1. Accept the connection from the browser. The browser-side port is 56434, its IP is 192.168.1.105, and the fd of the established connection is 3.

2. Receive the HTTP request from fd3.

3. Create a socket, fd21, for the connection to PHP.

4. Connect fd21. It connects to port 9000 on the local machine, so nginx and php-fpm talk over an IP socket here; when both are deployed on the same machine, a unix domain socket is worth considering.

5. Write the request to fd21 in FastCGI format. The write length is 896, which corresponds to the length received by PHP above.

6. recvfrom on fd21 returns ECONNRESET (connection reset by peer).

7. Write the error message to fd9, so fd9 is presumably the file descriptor of nginx's error log.

8. Close fd21.

9. Write 502 Bad Gateway to fd3; this is the response returned to the browser.

10. Write an access log entry to fd8, so fd8 is presumably the file descriptor of nginx's access log.

Checking the files nginx has open verifies the inference about the access log and error log: they are indeed fd8 and fd9, both opened in write mode.
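One way to check which file a descriptor points at, and in which mode it was opened, is to read /proc on Linux. The sketch below is only an illustration of that idea, not the command used in this article: describe_fd is a hypothetical helper, and the demo inspects a temporary file opened by the script itself rather than a real nginx process.

```python
import os
import tempfile

ACCESS_MODES = {
    os.O_RDONLY: "read-only",
    os.O_WRONLY: "write-only",
    os.O_RDWR: "read-write",
}


def describe_fd(pid, fd):
    """Return (target path, access mode) for an open descriptor, via Linux /proc."""
    target = os.readlink(f"/proc/{pid}/fd/{fd}")
    mode = "unknown"
    with open(f"/proc/{pid}/fdinfo/{fd}") as info:
        for line in info:
            if line.startswith("flags:"):
                flags = int(line.split()[1], 8)  # the flags field is octal
                mode = ACCESS_MODES[flags & os.O_ACCMODE]
    return target, mode


# Demo on this process: open a stand-in "log" file for writing and inspect it.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "error.log")
    with open(path, "w") as log:
        target, mode = describe_fd(os.getpid(), log.fileno())

print(target, mode)
```

For a running nginx worker you would pass its pid and the descriptor numbers seen in strace (8 and 9 here), which requires permission to read that process's /proc entries.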

Next, let's look at the network packets exchanged during the whole process:

Capture the packets with tcpdump; they are easier to read in Wireshark.

Because I only want to see the communication between nginx and PHP, and I know the nginx-side port is 47039, I can filter out the relevant packets with tcp.srcport==47039.

This shows the exchange between nginx and php-fpm: 47039->9000 completes the three-way handshake, then sends data to 9000; 9000 replies with an ACK, and 3s later 9000 replies with an RST. Exactly as expected.

Notes:

SYN and FIN each consume one sequence number.

A pure ACK or an RST does not consume a sequence number (packets 28 and 29 have the same seq and ack numbers).

Each byte of data consumes one sequence number (packet 29 sends 896 bytes; the seq of packet 29 is 4219146879 and the ack of packet 30 is 4219147775, a difference of exactly 896).

An RST does not need a reply.
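The sequence-number rules in these notes can be written as a few lines of Python arithmetic. The function name next_seq is made up for this sketch; the numbers are the ones quoted from the capture above.

```python
SEQ_MODULUS = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def next_seq(seq, payload_len, syn=False, fin=False):
    """Sequence number the peer should acknowledge after this segment.

    Each payload byte consumes one number; SYN and FIN consume one each.
    A pure ACK or an RST carries no payload and consumes nothing.
    """
    return (seq + payload_len + int(syn) + int(fin)) % SEQ_MODULUS


# Packet 29 carries 896 bytes starting at seq 4219146879.
expected_ack = next_seq(4219146879, 896)
print(expected_ack)
```

The result equals the ack number of packet 30, 4219147775.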
