linux - What is the best way to build own system metric collector agent

Question

I have an idea to build my own metric collection agent for Linux systems, with various customised features and controls. I would like to know the best practice for collecting metrics continuously from a Linux system.

Is it best to use an infinite while loop with a sleep inside for the required data-collection interval? Or is there a better method for repeated data collection that does not waste much system memory?
If I want to collect multiple metrics, such as CPU, memory, and disk utilisation, what is the best way to execute all the commands in parallel? Is it a good approach to use & to leave them in the background, collect all the process IDs, and verify that they have all completed? Or is there a better way to do this?

Thanks in Advance.

score 0 · Accepted Answer

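Low computation consumption

Use the C programming language, or write in assembly. In general, the answer is: the lower-level the better, and the less you do the better. I assume the C programming language in the answer below.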

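Is it best to use an infinite while loop with a sleep inside for the required data-collection interval?

Use an operating-system-specific interface for running an action periodically: timer_create(). Calling nanosleep() in a loop would require computing the time difference to stay accurate, which requires getting the current time, which is costly. Relying on the kernel is better.

In the signal handler, do nothing except set a single volatile sig_atomic_t flag. In the main loop, wait in pause() for events to arrive asynchronously.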
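
As a minimal sketch of the timer-plus-flag pattern described above (not a complete agent): the names `start_periodic_timer`, `wait_for_ticks`, `on_tick`, and `tick_flag`, and the choice of SIGALRM and CLOCK_MONOTONIC, are assumptions made for illustration. Only `timer_create()`, the `sig_atomic_t` flag, and `pause()` come from the answer itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Set by the handler, cleared by the main loop. */
static volatile sig_atomic_t tick_flag = 0;

void on_tick(int signo) {
    (void)signo;
    tick_flag = 1;  /* the only work done in signal context */
}

/* Arms a periodic timer that raises SIGALRM every interval_ms
 * milliseconds (interval_ms must be > 0). Returns 0 on success. */
int start_periodic_timer(timer_t *timer, long interval_ms) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) == -1) return -1;

    struct sigevent sev;
    memset(&sev, 0, sizeof sev);
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo = SIGALRM;
    if (timer_create(CLOCK_MONOTONIC, &sev, timer) == -1) return -1;

    struct itimerspec its;
    its.it_value.tv_sec = interval_ms / 1000;
    its.it_value.tv_nsec = (interval_ms % 1000) * 1000000L;
    its.it_interval = its.it_value;  /* repeat at the same period */
    return timer_settime(*timer, 0, &its, NULL);
}

/* Sleeps in pause() until `ticks` timer expirations have been observed. */
int wait_for_ticks(int ticks) {
    int seen = 0;
    while (seen < ticks) {
        pause();            /* returns after a signal handler has run */
        if (tick_flag) {
            tick_flag = 0;
            seen++;         /* a real agent would sample its metrics here */
        }
    }
    return seen;
}
```

A caller would arm the timer once and then loop on `wait_for_ticks(1)`. If a signal lands between the flag check and `pause()`, that tick is only noticed on the next expiration; masking the signal, as discussed further down, avoids this. On glibc older than 2.34 the program has to be linked with `-lrt`.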

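What is the best way to execute all commands in parallel?

To minimise "computation consumption", do not call fork() and do not create threads. If the events are asynchronous, use a single thread with one big loop around poll() that waits for all of them. This approach quickly turns into spaghetti code, so take special care to structure and modularise your code properly.

open() all the /proc/** and /sys/** interfaces you need to monitor once. Then, whenever data needs to be sent, lseek() each one back to the start and read it again.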
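
The open-once, rewind-and-reread idea might look like the following sketch. `open_metric` and `read_snapshot` are hypothetical helper names, and the fixed-size caller-supplied buffer is an assumption; the answer itself only prescribes `open()`, `lseek()`, and a fresh read.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Opens a /proc or /sys file once; the descriptor is kept for the
 * lifetime of the agent. Returns the descriptor or -1 on error. */
int open_metric(const char *path) {
    return open(path, O_RDONLY | O_CLOEXEC);
}

/* Rewinds an already-open file and reads a fresh snapshot into buf.
 * Returns the number of bytes read (buf is NUL-terminated) or -1 on
 * error. Content longer than cap - 1 bytes is truncated. */
ssize_t read_snapshot(int fd, char *buf, size_t cap) {
    if (cap == 0) return -1;
    if (lseek(fd, 0, SEEK_SET) == (off_t)-1) return -1;

    size_t total = 0;
    while (total < cap - 1) {
        ssize_t n = read(fd, buf + total, cap - 1 - total);
        if (n < 0) {
            if (errno == EINTR) continue;  /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0) break;                 /* end of the generated content */
        total += (size_t)n;
    }
    buf[total] = '\0';
    return (ssize_t)total;
}
```

Each metric keeps its own descriptor and buffer, so sampling costs one `lseek()` and a `read()` or two per metric, with no `open()`/`close()` pair per interval.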

So overall, in very rough pseudocode:

volatile sig_atomic_t flag = 0;

void timer_callback(int signo) {
    flag = 1; // only set the flag; all real work happens in main()
}

int main() {
    metrics_read = 0; // keep count of asynchronous reads

    timer_create();
    foreach(metric) {
        int usage = open("/proc/stat", O_RDONLY | O_NONBLOCK); // for example
    }

    while (1) {
        r = pselect(...);
        if (FD_ISSET(socket_to_send_data)) {
            // take action, e.g. if the socket was closed
        }
        if (FD_ISSET(usage)) {
            parse(usage_buffer); // parse events as they come
            metrics_read++;
        }
        // FD_ISSET etc. for each metric

        if (r == -1 && errno == EINTR && flag) { // woken by the timer signal
            flag = 0;
            foreach(metric) {
                lseek(usage, 0, SEEK_SET); // rewind to the start of the file
                read(usage, usage_buffer); // non-blocking, each read with its own buffer, to let the kernel do the job
            }
        }

        if (metrics_read == ALL_METRICS_CNT) {
            send_metrics(); // also asynchronous: `socket()` and `write()` with O_NONBLOCK
            metrics_read = 0;
        }
    }
}
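
The `parse(usage_buffer)` step is left abstract in the pseudocode. For the `/proc/stat` example it could be sketched as below. The names `cpu_sample`, `parse_cpu_line`, and `cpu_util_percent` are hypothetical, and counting iowait as idle time is one common convention, assumed here rather than taken from the answer.

```c
#include <stdio.h>

/* Cumulative jiffies taken from the aggregate "cpu" line of /proc/stat. */
struct cpu_sample {
    unsigned long long busy;
    unsigned long long total;
};

/* Parses the first line of a /proc/stat snapshot (the aggregate "cpu"
 * line, not the per-core "cpuN" lines). Returns 0 on success, -1 if
 * the line does not contain at least the first four counters. */
int parse_cpu_line(const char *text, struct cpu_sample *out) {
    unsigned long long user, nice, sys, idle;
    unsigned long long iowait = 0, irq = 0, softirq = 0, steal = 0;
    int n = sscanf(text, "cpu %llu %llu %llu %llu %llu %llu %llu %llu",
                   &user, &nice, &sys, &idle,
                   &iowait, &irq, &softirq, &steal);
    if (n < 4) return -1;

    unsigned long long idle_all = idle + iowait;  /* assumption: iowait is idle */
    out->total = user + nice + sys + irq + softirq + steal + idle_all;
    out->busy = out->total - idle_all;
    return 0;
}

/* Utilisation in percent between two samples; 0.0 if no time elapsed. */
double cpu_util_percent(const struct cpu_sample *prev,
                        const struct cpu_sample *cur) {
    unsigned long long dt = cur->total - prev->total;
    if (dt == 0) return 0.0;
    return 100.0 * (double)(cur->busy - prev->busy) / (double)dt;
}
```

Because the counters are cumulative since boot, a utilisation figure needs two samples: the agent keeps the previous `cpu_sample` and computes the difference on every tick.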

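Do not write any logs. Logging causes I/O operations, which is "computation consumption". Do not output anything. In addition, pselect needs some special handling: the signal has to be masked so that the flag is "guaranteed" to always be noticed on time.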
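
The masking mentioned above could be sketched as follows: the timer signal stays blocked during normal execution and is unblocked only while `pselect()` sleeps, so a tick cannot slip in between checking the flag and going to sleep. `setup_tick_signal`, `wait_tick_or_io`, `on_alarm`, and `tick_pending` are hypothetical names, and SIGALRM is an assumed choice of signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <time.h>

static volatile sig_atomic_t tick_pending = 0;

void on_alarm(int signo) {
    (void)signo;
    tick_pending = 1;
}

/* Installs the handler and blocks SIGALRM for normal execution.
 * On success *wait_mask is the previous mask with SIGALRM removed,
 * i.e. the mask pselect() should apply while it sleeps. */
int setup_tick_signal(sigset_t *wait_mask) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) == -1) return -1;

    sigset_t block;
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &block, wait_mask) == -1) return -1;
    return sigdelset(wait_mask, SIGALRM);
}

/* Waits for readable descriptors or for a tick.
 * Returns 1 if a tick arrived (readfds must then be rebuilt),
 * 0 if descriptors are ready or the timeout expired,
 * -1 on any other error, including EINTR from an unrelated signal. */
int wait_tick_or_io(int nfds, fd_set *readfds,
                    const struct timespec *timeout,
                    const sigset_t *wait_mask) {
    int r = pselect(nfds, readfds, NULL, NULL, timeout, wait_mask);
    if (r == -1 && errno == EINTR && tick_pending) {
        tick_pending = 0;
        return 1;
    }
    return r == -1 ? -1 : 0;
}
```

A signal that arrives while the process is busy parsing simply stays pending and interrupts the next `pselect()` call immediately, so the tick is handled at the next pass through the loop rather than lost.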

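Is it a good approach to use & to leave them in the background, collect all the process IDs, and verify that they have all completed?

Definitely not: fork() is a very "computation consuming" function, and spawning processes is very costly. It is best not to leave anything in the "background" and to do everything in a single-threaded, single process.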

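Or is there a better way to do this?

The lowest "computation consumption" would of course come from writing a kernel module that does the job. Then you can manage kernel resources specifically to achieve the lowest possible "computation consumption", while keeping your system compatible with Linux.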
