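I have deployed a Java service application on Linux using Apache jsvc. It is a multithreaded application, but when the daemon is stopped, the JVM is killed before all threads have finished.
It works like this:
- WorkerLauncher implements the Daemon interface and starts the service.
- The WorkerPool class starts a parent thread and the worker (child) threads, which are represented by the Worker class.
- When WorkerLauncher.stop() is called, the WorkerPool thread is interrupted; it catches the InterruptedException and, in that handler, interrupts the child threads as well.
- When a child thread is interrupted, it performs a calculation before stopping. This is where it goes wrong: my guess is that the calculation has not finished by the time the thread is killed (I am not sure about this).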
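WorkerLauncher

    import org.apache.commons.daemon.Daemon;
    import org.apache.commons.daemon.DaemonContext;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class WorkerLauncher implements Daemon {
        private static final Logger log = LoggerFactory.getLogger("com.worker");
        private WorkerPool pool;

        @Override
        public void init(DaemonContext context) {
        }

        @Override
        public void start() {
            log.info("Starting worker pool...");
            if (pool == null) pool = new WorkerPool();
            pool.start();
            log.info("Worker pool started");
        }

        // Only interrupts the pool thread; it does not wait for the workers to finish.
        @Override
        public void stop() {
            pool.stop();
        }

        @Override
        public void destroy() {
            pool = null;
        }
    }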
WorkerPool

    import java.util.ArrayList;
    import java.util.List;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class WorkerPool implements Runnable {
        private static final Logger log = LoggerFactory.getLogger("com.worker");
        private Thread thread = null;
        private List<Thread> workers = new ArrayList<Thread>();

        public WorkerPool() {
            // A single worker is enough to reproduce the problem.
            for (int i = 0; i < 1; i++) workers.add(new Thread(new Worker("Worker " + i), "Worker " + i));
            this.thread = new Thread(this, "Worker Main");
        }

        @Override
        public void run() {
            for (Thread thread : workers) {
                thread.start();
            }
            while (isRunning()) {
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    // stop() interrupted this thread: pass the interrupt on to every worker.
                    log.info("Worker pool stopping");
                    for (Thread worker : workers) {
                        worker.interrupt();
                    }
                    break;
                }
            }
        }

        public boolean isRunning() {
            return !thread.isInterrupted();
        }

        public void stop() {
            thread.interrupt();
        }

        public void start() {
            thread.start();
        }
    }
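Worker

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class Worker implements Runnable {
        private static final Logger log = LoggerFactory.getLogger("com.worker");
        private String name;

        public Worker(String name) {
            this.name = name;
        }

        // Runs after the worker has been interrupted: finish the long calculation, then log.
        private void stop() {
            longCalculations();
            // "{]" is a typo for "{}"; left as-is to match the log output.
            log.info("Worker {] stopped", name);
        }

        @Override
        public void run() {
            try {
                try {
                    // Idle until interrupted.
                    while (true) {
                        Thread.sleep(1000);
                    }
                } catch (InterruptedException e) {
                    log.info("InterruptedException");
                }
            } finally {
                stop();
            }
        }

        // Stands in for the slow shutdown work of the real application.
        private void longCalculations() {
            for (int i = 0; i < 99999; i++) {
                for (int j = 0; j < 9999; j++) {
                    Math.round(i + j * 0.999);
                }
            }
        }
    }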
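This is only a stripped-down sample of the real application, but it reproduces the same problem. I tested commons-daemon versions 1.1 and 1.2. If the long calculation is removed or takes less time, everything works fine. What am I missing here? Any ideas?
Here is what the log output looks like: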
07:55:10.790 ( main) INFO Starting worker pool...
07:55:10.791 ( main) INFO Worker pool started
07:55:34.056 (Worker Main) INFO Worker pool stopping
07:55:34.057 ( Worker 0) INFO InterruptedException
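Note that the "Worker {] stopped" message is missing. In the real application, the third-party processes launched by the workers are also still running: they show up in ps -A even though jsvc no longer appears in the process list.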
Edit
I modified the stop() method:

    private void stop() {
        log.info("Worker {] being stopped", name);
        longCalculations();
        log.info("Worker {] stopped", name);
    }

Log output:
06:18:55.762 ( main) INFO Starting worker pool...
06:18:55.764 ( main) INFO Worker pool started
06:19:08.614 (Worker Main) INFO Worker pool stopping
06:19:08.615 ( Worker 0) INFO InterruptedException
06:19:08.616 ( Worker 0) INFO Worker {] being stopped
I start and stop the service with jsvc:
Start
./jsvc -cwd . -cp commons-daemon-1.1.0.jar:multithread-test.jar -outfile /tmp/worker.out -errfile /tmp/worker.err -pidfile /var/run/worker.pid com.worker.WorkerLauncher
Stop
./jsvc -cwd . -cp commons-daemon-1.1.0.jar:multithread-test.jar -outfile /tmp/worker.out -errfile /tmp/worker.err -pidfile /var/run/worker.pid -stop com.worker.WorkerLauncher
Important!
I forgot to mention that this application works as expected on Windows using Apache procrun. The problem only occurs when it is launched on Linux with jsvc.
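Another edit
If, after calling pool.stop() in WorkerLauncher.stop(), I wait for the threads to finish (by checking whether any worker thread isAlive), everything works.
WorkerLauncher.stop():

    @Override
    public void stop() {
        pool.stop();
        // Block until every worker thread has finished its shutdown work.
        while (pool.isAnyGuardianRunning()) {
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                break;
            }
        }
    }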
In WorkerPool:

    public boolean isAnyGuardianRunning() {
        for (Thread thread : workers) {
            if (thread.isAlive()) return true;
        }
        return false;
    }
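For reference, I think the same wait could also be written with Thread.join and a timeout instead of polling isAlive. This is only a minimal, self-contained sketch: JoinDemo, stopAndWait, the 200 ms cleanup and the 5-second timeout are hypothetical names and values for illustration, not part of the real application, and it does not involve jsvc at all.

```java
import java.util.ArrayList;
import java.util.List;

public class JoinDemo {
    // Hypothetical helper: interrupts every worker, then waits for all of them,
    // spending at most timeoutMillis in total. Returns true if all have finished.
    static boolean stopAndWait(List<Thread> workers, long timeoutMillis) throws InterruptedException {
        for (Thread worker : workers) {
            worker.interrupt();
        }
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        for (Thread worker : workers) {
            long remaining = (deadline - System.nanoTime()) / 1_000_000L;
            if (remaining <= 0) break; // join(0) would wait forever, so stop here
            worker.join(remaining);
        }
        for (Thread worker : workers) {
            if (worker.isAlive()) return false;
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        List<Thread> workers = new ArrayList<>();
        Thread worker = new Thread(() -> {
            try {
                while (true) Thread.sleep(1000); // idle until interrupted
            } catch (InterruptedException e) {
                // fall through to the cleanup below
            } finally {
                try {
                    Thread.sleep(200); // stands in for the long calculation
                } catch (InterruptedException ignored) {
                }
            }
        }, "Worker 0");
        worker.start();
        workers.add(worker);

        boolean finished = stopAndWait(workers, 5000);
        System.out.println("All workers finished: " + finished);
    }
}
```

Calling something like stopAndWait from WorkerLauncher.stop() would keep stop() from returning while a worker is still in its finally block, which is the same effect as the isAlive loop above, only with an upper bound on the wait.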
But I still don't know why this happens. Any ideas?