Bug #286: lighttpd crashes under highload - Lighttpd - lighty labs

Bug #286

closed

lighttpd crashes under highload

Added by Anonymous over 18 years ago. Updated over 17 years ago.

Status:

Fixed

Priority:

Urgent

Category:

core

Description

We traced this down to a performance issue in lighttpd: under high load, lighttpd sporadically crashes.
valgrind log is here: http://www.thecenter.at/lighttpd.1025.txt

-- sl

Updated by jan over 18 years ago

Status changed from New to Assigned

Please verify whether the problem persists with 1.4.5.

Updated by Anonymous over 18 years ago

It is still an issue, but it is not as bad as before. Compare for yourself: with 1.4.4 I had a dozen crashes a day; with 1.4.5 I have "only" a couple.

-- sl

Updated by Anonymous over 18 years ago

I have seen similar behavior during a DoS condition. There was no core dump (though core dumps are enabled). lighttpd seemed to 'stop'. The php-cgi processes kept running until I sent a killall -TERM php-cgi. I did not need to send KILL, so however lighttpd stopped, it did not do so in an entirely orderly manner.

Trying

server.max-connections = 1024
server.max-fds = 3072

to see if max-connections protects against this problem. Well, hopefully the DoS won't recur ;)
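For anyone trying the same two settings, here is a rough Python sketch for sanity-checking them against the operating system limit. It is not taken from lighttpd. It assumes, as a rule of thumb only, that a proxied or FastCGI connection can need more than one descriptor (client socket plus backend socket). The function name and the fds_per_connection factor are made up for illustration, and a server started as root may be able to raise the limit that this script sees.

```python
import resource


def check_fd_budget(max_connections, max_fds, fds_per_connection=2):
    """Return a list of warnings for a max-connections / max-fds pair.

    fds_per_connection is an assumed rule of thumb, not a lighttpd constant.
    """
    warnings = []

    # Hard limit: the ceiling an unprivileged process may raise its soft limit to.
    _soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY and max_fds > hard:
        warnings.append(
            f"max-fds {max_fds} exceeds the hard RLIMIT_NOFILE of {hard}"
        )

    # Each connection may hold several descriptors at once.
    needed = max_connections * fds_per_connection
    if needed > max_fds:
        warnings.append(
            f"{max_connections} connections may need {needed} fds, "
            f"but max-fds is {max_fds}"
        )
    return warnings


if __name__ == "__main__":
    # Values from the comment above.
    for warning in check_fd_budget(1024, 3072):
        print(warning)
```

With the values above, 1024 connections at two descriptors each stay below 3072, so only the RLIMIT_NOFILE check can produce a warning.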

Hope this extra information is useful.

Have a great weekend!

-- richardgreen1965

Updated by Anonymous over 18 years ago

I can confirm this too. I'm evaluating 1.4.7 and it unexpectedly crashes after 10 minutes or so of high load. My environment is Debian 3.1 (sarge) with the stock 2.6.8 (-686-smp) kernel package.

I set it up to use mod_proxy exclusively, distributing load to several (11) backend servers. The server did not serve any "regular" file requests itself. At an output rate of more than 150 Mbps and 1800 rps the process quietly exits all of a sudden. When I started lighttpd with the -D flag to see if anything was printed to stderr, I didn't see anything there either when it crashed again. However, I noticed that it did exit with an "aborted" exit code.
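To make the "aborted" observation easier to capture when running the server under a supervisor script, something like the following Python sketch could be used. It relies on the documented behavior of subprocess on POSIX, where a child killed by signal N reports a return code of -N. The helper names are made up, and the demo child is a stand-in that simply calls abort(); it is not lighttpd.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a human-readable description."""
    if returncode < 0:
        # On POSIX, a negative return code means "terminated by signal -returncode".
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


def run_and_describe(argv):
    """Run a command in the foreground and describe how it ended."""
    proc = subprocess.run(argv)
    return describe_exit(proc.returncode)


if __name__ == "__main__":
    # Stand-in child that aborts itself, the way a failed assert() would.
    print(run_and_describe([sys.executable, "-c", "import os; os.abort()"]))
```

A process that dies from a failed assertion normally ends via SIGABRT, so a "killed by SIGABRT" result would match the "aborted" exit seen here.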

I switched off both the rrdtool- and accesslog-modules and could exclude them from suspicion.

I will try a more recent kernel revision later on, but my gut feeling is that the problem is indeed in lighttpd.

-- conny

Updated by jan over 18 years ago

Can you generate an strace for me? The wiki explains how to report a bug.

Updated by Anonymous over 18 years ago

I'll try to make one. The problem is that under high load strace itself becomes a performance bottleneck, which limits the requests/sec rate and apparently also the chance of the crash occurring...

-- conny

Updated by Anonymous over 18 years ago

Here are my first results:


11:37:14.450805 accept(5, {sa_family=AF_INET, sin_port=htons(2315), sin_addr=inet_addr("[xxxxxxxxxxxxx]")}, [16]) = 42
11:37:14.450900 fcntl64(42, F_SETFD, FD_CLOEXEC) = 0
11:37:14.450941 fcntl64(42, F_SETFL, O_RDWR|O_NONBLOCK) = 0
11:37:14.450980 ioctl(42, FIONREAD, [7935]) = 0
11:37:14.451026 read(42, "POST /[xxxxxxxxxxx]\r\n[xxxxxxxxxxxxx]"..., 7935) = 7935
11:37:14.452304 ioctl(42, FIONREAD, [0]) = 0
11:37:14.452361 read(42, 0x886ec38, 4159) = -1 EAGAIN (Resource temporarily unavailable)
11:37:14.452440 write(2, "lighttpd: connections.c:962: connection_handle_read_state: Assertion `c->mem->used' failed.\n", 92) = 92
11:37:14.452580 rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
11:37:14.452664 gettid()                = 2539
11:37:14.452703 tgkill(2539, 2539, SIGABRT) = 0
11:37:14.452740 --- SIGABRT (Aborted) @ 0 (0) ---

A connection is accepted from a client and a POST request is read. Then we ask to read an additional 0 bytes from ...?
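The pattern in the trace (a full read, then a second read that returns EAGAIN) can be reproduced in isolation. The Python sketch below is not lighttpd's code and makes no claim about where the actual bug is. It only illustrates that on a non-blocking socket an empty read attempt is an ordinary outcome, so a read handler has to tolerate "no new data" instead of assuming its buffer is non-empty. The function name is made up.

```python
import socket


def drain_nonblocking(sock, bufsize=4096):
    """Read until the socket would block.

    Returns (data, peer_closed). data may be empty: that is the EAGAIN case.
    """
    chunks = []
    while True:
        try:
            chunk = sock.recv(bufsize)
        except BlockingIOError:
            # EAGAIN / EWOULDBLOCK: nothing more to read right now.
            return b"".join(chunks), False
        if chunk == b"":
            # Orderly shutdown by the peer.
            return b"".join(chunks), True
        chunks.append(chunk)


if __name__ == "__main__":
    client, server = socket.socketpair()
    server.setblocking(False)
    client.sendall(b"POST / HTTP/1.0\r\n\r\n")
    print(drain_nonblocking(server))  # the request bytes, peer still open
    print(drain_nonblocking(server))  # empty data, peer still open (EAGAIN)
    client.close()
    server.close()
```

The second call corresponds to the last read() in the trace: no bytes arrive, yet the connection is still valid.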

-- conny

Updated by Anonymous over 18 years ago

Status changed from Fixed to Need Feedback
Resolution deleted (~~fixed~~)

Wonderful! That patch fixed the problem..._in most cases_! I can still make it crash however (though it seems even less common now).


lighttpd: connections.c:962: connection_handle_read_state: Assertion `c->mem->used' failed.

I have not had time to make a new strace run yet. It looks like a variant of the same problem, no? Perhaps certain chunk sequences can still slip through the cleanup?

-- conny

Updated by Anonymous over 18 years ago

I reproduced the crash with strace attached again. It's exactly the same order of calls as last time (see above).

-- conny

Updated by Anonymous over 18 years ago

...but that was with 1.4.7+patch. I have not seen this after I upgraded to the 1.4.8 release. (On the other hand I also switched to slightly faster hardware.)

Let's close it and reopen it if someone can reproduce the crash with 1.4.8.

-- conny

Updated by Anonymous over 18 years ago

Status changed from Need Feedback to Fixed
Resolution set to fixed

I can now confirm that this issue never appeared again after the 1.4.8 release.

-- conny
