7

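I have a system running embedded Linux, and it is critical that it runs continuously. Basically, it is a process that communicates with sensors and relays the data to a database and a web client.

If a crash occurs, how do I restart the application automatically?

Also, there are several threads doing polling (e.g. sockets and UART communication). How do I ensure that none of the threads hang or exit unexpectedly? Is there an easy-to-use, thread-friendly watchdog?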

4

4 Answers

7

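You can seamlessly restart your process when it dies by using fork and waitpid, as described in another answer. It does not cost any significant resources, because the operating system shares the memory pages between parent and child (copy-on-write).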
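A minimal sketch of that fork/waitpid loop, written in Python only because its `os` module exposes the same POSIX calls; the question does not say which language the application uses. The names `supervise`, `run_child`, `max_restarts` and `delay` are made up for illustration.

```python
import os
import time


def supervise(run_child, max_restarts=3, delay=0.1):
    """Run run_child() in a forked child and restart it until it exits cleanly.

    Returns the number of restarts that were needed. max_restarts and delay
    are illustrative limits, not something the answer prescribes.
    """
    restarts = 0
    while True:
        pid = os.fork()
        if pid == 0:
            # Child: do the real work, then exit without running the
            # parent's cleanup handlers.
            try:
                code = run_child()
            except BaseException:
                code = 1
            os._exit(code or 0)

        # Parent: block until this particular child terminates.
        _, status = os.waitpid(pid, 0)
        if os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0:
            return restarts  # clean exit, nothing more to do

        # Non-zero exit or death by signal (e.g. a segfault): restart.
        if restarts >= max_restarts:
            raise RuntimeError("child keeps failing, giving up")
        restarts += 1
        time.sleep(delay)  # avoid a tight restart loop
```

The parent does nothing except sleep in `waitpid`, so it is unlikely to crash itself. A real deployment would probably drop the restart limit or replace it with a back-off.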

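That leaves only the problem of detecting a hung process. You can use any of the solutions pointed out by Michael Aaron Safyan for this, but an even simpler solution is to call the alarm system call repeatedly and let the signal terminate the process (set up sigaction accordingly). As long as you keep calling alarm (i.e. as long as your program is making progress), the process keeps running. Once you stop, the signal fires.
That way no extra program is needed, and only portable POSIX facilities are used.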
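Roughly, the alarm-based watchdog could look like the sketch below. It is Python rather than C only for brevity; `signal.alarm` wraps the same system call. `WATCHDOG_SECONDS`, `poll_once` and the function names are assumptions for illustration, and the signal setup has to run in the main thread.

```python
import signal

WATCHDOG_SECONDS = 5  # hypothetical timeout; pick one longer than a normal cycle


def start_watchdog():
    # Make sure SIGALRM has its default action, which terminates the process.
    # (A C program would use sigaction() for the same purpose.)
    signal.signal(signal.SIGALRM, signal.SIG_DFL)


def feed_watchdog(seconds=WATCHDOG_SECONDS):
    # Each call replaces the pending alarm, so the signal only fires if the
    # program stops calling this function for `seconds` seconds.
    signal.alarm(seconds)


def main_loop(poll_once, iterations, seconds=WATCHDOG_SECONDS):
    start_watchdog()
    for _ in range(iterations):
        feed_watchdog(seconds)
        poll_once()      # hypothetical unit of work (read a sensor, send data)
    signal.alarm(0)      # cancel the pending alarm on a clean shutdown
```

Note that the alarm is per process, not per thread. In a multithreaded program the code that re-arms it should first confirm that every polling thread has made progress, otherwise one healthy thread can keep feeding the watchdog while another is stuck.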

answered 2011-09-11T22:28:32.643
6

The gist of it is:

  1. You need to detect if the program is still running and not hung.
  2. You need to (re)start the program if the program is not running or is hung.

There are a number of different ways to do #1, but two that come to mind are:

  1. Listening on a UNIX domain socket, to handle status requests. An external application can then inquire as to whether the application is still ok. If it gets no response within some timeout period, then it can be assumed that the application being queried has deadlocked or is dead.

  2. Periodically touching a file with a preselected path. An external application can look at the timestamp of the file, and if it is stale, it can assume that the application is dead or deadlocked.
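The second option (the heartbeat file) might be sketched like this. The function names and the `max_age_seconds` threshold are assumptions, and the path is whatever both sides agree on.

```python
import os
import time


def touch_heartbeat(path):
    """Application side: create the file if needed and set its mtime to now."""
    with open(path, "a"):
        pass
    os.utime(path, None)


def is_stale(path, max_age_seconds, now=None):
    """Monitor side: True if the heartbeat is missing or older than allowed."""
    now = time.time() if now is None else now
    try:
        age = now - os.stat(path).st_mtime
    except FileNotFoundError:
        return True  # never written: treat the application as dead
    return age > max_age_seconds
```

The application would call `touch_heartbeat` once per main-loop iteration, and the external monitor would call `is_stale` on a timer with a threshold comfortably larger than one iteration.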

With respect to #2, the typical approach is to kill the previous PID and launch a new process with fork+exec. You might also consider turning your application from one that runs "continuously" into one that runs once, and then use "cron" or some other application to rerun that single-run application continuously.
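A rough sketch of the kill-then-fork+exec step, assuming the monitor is the parent of the process it restarts. `spawn` and `kill_and_restart` are hypothetical helper names, and SIGKILL is used on the assumption that a hung process will not respond to anything gentler.

```python
import os
import signal


def spawn(argv):
    """fork + exec: start argv as a new process and return its PID."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)  # does not return on success
        finally:
            os._exit(127)             # only reached if exec failed
    return pid


def kill_and_restart(old_pid, argv):
    """Forcefully stop a hung child, reap it, then launch a fresh copy."""
    try:
        os.kill(old_pid, signal.SIGKILL)
        os.waitpid(old_pid, 0)        # reap it so no zombie is left behind
    except (ProcessLookupError, ChildProcessError):
        pass                          # already gone, or not our child
    return spawn(argv)
```

The monitor would call `kill_and_restart` when its liveness check (status socket or heartbeat file) times out, and keep the returned PID for the next round.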

Unfortunately, watchdog timers and getting out of deadlock are non-trivial issues. I don't know of any generic way to do it, and the few that I've seen are pretty ugly and not 100% bug-free. However, ThreadSanitizer (tsan) can help detect potential deadlock scenarios and other threading issues by instrumenting the program and checking it at run time.

answered 2011-09-11T06:07:23.817
1

You can create a cron job that uses start-stop-daemon to check from time to time whether the process is running.

answered 2012-06-06T18:40:31.460
1

Run your application with this script:

#!/bin/bash

# Rerun the program until it exits successfully (exit status 0).
while ! /path/to/program
do
    echo "restarting"   # Non-zero exit status (e.g. a crash): start it again.
done

You can also put this script in /etc/init.d/ so that it is started as a daemon.

answered 2017-04-21T11:39:09.450