Bug #604
EINTR not checked, rrdtool-read: failed Interrupted system call (stopped updating rrd)
| Status: | New | Start: | |
| Priority: | Normal | Due date: | |
| Assigned to: | jan | % Done: | 0% |
| Category: | core | | |
| Target version: | - | | |
| Pending: | No | Resolution: | |
Description
Machine environment: almost no traffic/requests, full CPU usage, busy disk I/O.
I don't have an strace from the moment it stopped.
2006-03-27 17:23:26: (src/log.c.75) server started
2006-03-27 18:22:00: (src/mod_rrdtool.c.398) rrdtool-read: failed Interrupted system call
2006-03-27 18:22:00: (src/server.c.1085) one of the triggers failed
2006-03-27 21:26:39: (src/log.c.135) server stopped (manually)
2006-03-27 21:26:42: (src/log.c.75) server started
2006-03-28 19:19:00: (src/mod_rrdtool.c.398) rrdtool-read: failed Interrupted system call
2006-03-28 19:19:00: (src/server.c.1085) one of the triggers failed
# strace -p `pidof rrdtool`
Process 11484 attached - interrupt to quit
read(0, <unfinished ...>
Process 11484 detached (CTRL+C)
# strace -p `pidof lighttpd`
Process 11480 attached - interrupt to quit
time(NULL) = 1143634917
epoll_wait(8, {}, 10231, 1000) = 0
time(NULL) = 1143634918
epoll_wait(8, {}, 10231, 1000) = 0
time(NULL) = 1143634919
epoll_wait(8, {}, 10231, 1000) = 0
time(NULL) = 1143634920
epoll_wait(8, <unfinished ...>
Process 11480 detached (CTRL+C)
I guess the read() in mod_rrdtool.c takes more than 1 second because rrdtool needs some time to write its update to disk while disk I/O is already heavily loaded, and another lighttpd trigger/alarm then interrupts the read() in progress.
History
04/25/2006 02:53 AM - moo
I'm sure the code fails to check the value (r) returned by read()/write() properly and wrongly concludes that "rrdtool is quitting with an error" when the call was merely interrupted (EINTR). Simply check for r == -1 with errno == EINTR and handle it properly: at the very least, don't set p->rrdtool_running = 0. I have no clue how to make the best patch.
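The check suggested in the comment above could look roughly like the following sketch. It is not a patch against mod_rrdtool.c: the helper names read_retry_eintr and write_retry_eintr are hypothetical, and the sketch assumes the plugin talks to rrdtool over plain blocking file descriptors. The only point it illustrates is that a return value of -1 with errno == EINTR means "interrupted, try again", not "rrdtool died".

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of lighttpd): read() that restarts the
 * call when a signal interrupted it before any data was transferred. */
static ssize_t read_retry_eintr(int fd, void *buf, size_t count) {
    ssize_t r;
    do {
        r = read(fd, buf, count);
    } while (r == -1 && errno == EINTR); /* interrupted: not a real failure */
    return r; /* >0: bytes read, 0: EOF, -1: genuine error (see errno) */
}

/* Hypothetical helper: same idea for write(). */
static ssize_t write_retry_eintr(int fd, const void *buf, size_t count) {
    ssize_t r;
    do {
        r = write(fd, buf, count);
    } while (r == -1 && errno == EINTR); /* interrupted: retry */
    return r; /* may still be a short write; callers should check */
}
```

With helpers like these, the caller would only treat rrdtool as gone (for example by clearing p->rrdtool_running) when the result is a genuine error or EOF, never on an interrupted call. Whether retrying in place or deferring to the next trigger is the better fix is left to the maintainers.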