I have a custom logging process that is reading from STDIN and sending the data out via TCP to a scribed logging server. STDIN is in my case an access log that is attached to Apache httpd 2.2 like this in httpd.conf: CustomLog "|/usr/local/bin/serelog" default
My serelog process sometimes goes into uninterruptable sleep under FreeBSD 9.0 and does not return from it. It works reliably under other operating systems though, including FreeBSD 8, Linux 2.6 and Linux 3.1.
How can I find out what could be the reason for the uninterruptable sleep?
The overall structure is like this: httpd --[PIPE]--> serelog --[TCP-CONNECTION]--> scribed
Until now I did the following analysis:
- Using ps: stat is "D" and wchan is "-". So there is apparently no syscall, which doesn't make too much sense to me, as the process in uninterruptable sleep and should be in kernel land.
- As the process is in state "D", the process does not react to kill -9 as expected.
- Attaching truss to serelog externally from a shell: As long as truss is attached, serelog runs smoothly. Shortly (Seconds) after detaching truss from serelog, serelog goes into "D" state.
- When attaching truss to serelog AFTER it has entered "D" state, truss prints nothing
- In "D" state, lsof shows that the incoming PIPE is full. This is exected, as in "D" state the process "sleeps" and cannot read any longer. The outgoing TCP-CONNECTION is empty.
- If I kill the "surrounding" Apache httpd server, the serelog process eventually terminates after (e.g.) 40 minutes.
- Checking what others report in forums about the uninterruptable problem was not successful: In my setup there is no NFS. And as it is a server there is also no user interaction with CD drives or pluggable hardware.
So I am now stuck with a process that is uninterruptable, is apparently not in a syscall, and works reliably when traced. The only good thing is that I am able to reproduce the behavior in a few seconds or minutes when I send a lot of HTTP requests via JMeter loadtest (5 threads in JMeter).
Any tips on debugging, kernel parameter tuning are appreciated.
Greetings