8

I have written a program that captures and displays video from three video cards. For every frame I spawn a thread that compresses the frame to JPEG and then puts it in a queue for writing to disk. I also have other threads that read these files and decode them. Usually this works fine. It's a pretty CPU-intensive program, using about 70-80 percent of all six CPU cores. But after a while the encoding suddenly slows down, the program can't handle the video fast enough, and it starts dropping frames. If I check the CPU utilization, I can see that one core (usually core 5) is not doing much anymore.

When this happens, it doesn't matter if I quit and restart my program. CPU 5 will still have a low utilization and the program starts dropping frames immediately. Deleting all saved video doesn't have any effect either. Restarting the computer is the only thing that helps. Oh, and if I set the affinity of my program to use all but the semi-idling core, it works until the same happens to another core. Here is my setup:

  • AMD X6 1055T (Cool & Quiet OFF)
  • GA-790FX-UD5 motherboard
  • 4 GB RAM, unganged, 1333 MHz
  • Blackmagic Decklink DUO capture cards (x2)
  • Linux - Ubuntu x64 10.10 with kernel 2.6.32.29

My app uses:

  • libjpeg-turbo
  • POSIX threads
  • DeckLink API
  • Qt
  • Written in C/C++
  • All libraries linked dynamically

It seems to me that this is some kind of problem with the way Linux schedules threads on the cores. Or is there some way my program can mess things up so badly that restarting the program doesn't help?

Thank you for reading, any and all input is welcome. I'm stuck :)

4

2 Answers

4

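First of all, make sure it's not your program. Maybe you are running into a convoluted concurrency bug, even though that's not very likely given your program's architecture and the fact that rebooting helps. I've found that a good approach is usually post-mortem debugging: compile with debugging symbols, kill the program with -SEGV when it starts behaving abnormally, and examine the core dump with gdb.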
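
For `kill -SEGV` to leave anything to inspect, the process has to be allowed to write a core file. Here is a minimal sketch of one way to arrange that from inside the program, assuming the hard limit permits it. `enable_core_dumps` is a hypothetical helper name, not part of any library; `getrlimit` and `setrlimit` are the standard POSIX calls.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: raise the soft core-file size limit to the hard
 * limit so that a fatal signal can produce a core dump.
 * Returns 0 on success, -1 on failure. */
int enable_core_dumps(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0) {
        perror("getrlimit");
        return -1;
    }
    rl.rlim_cur = rl.rlim_max;   /* the soft limit may always be raised to the hard limit */
    if (setrlimit(RLIMIT_CORE, &rl) != 0) {
        perror("setrlimit");
        return -1;
    }
    return 0;
}
```

Call it early in `main` (or run `ulimit -c unlimited` in the shell before starting the program), build with `-g`, and once the slowdown appears send `kill -SEGV <pid>`. Then open the dump with `gdb <your-binary> <core-file>` and run `thread apply all bt` to see what every thread was doing. Where the core file ends up depends on `/proc/sys/kernel/core_pattern`; on Ubuntu it may be handed to Apport instead of being written to the working directory.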

answered 2011-10-02T13:11:50.677

2

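I would try choosing the core round-robin when spawning a new frame-processing thread and pinning the thread to that core. Keep statistics on how long each thread takes to run. If this is actually a bug in the Linux scheduler, your threads will take roughly the same time to run on any core. If the core is actually busy with something else, the threads pinned to that core will get less CPU time.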
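
A rough sketch of that experiment, under some assumptions: `fake_encode_frame` is a hypothetical stand-in for the real JPEG compression, the names `frame_worker`, `run_round_robin`, and `print_stats` are made up for illustration, and the timing is wall-clock time from `CLOCK_MONOTONIC`. `pthread_setaffinity_np` is a GNU extension, which is why `_GNU_SOURCE` is defined first.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define MAX_CORES 64

/* Per-core statistics, guarded by stats_lock. */
static double total_seconds[MAX_CORES];
static long jobs_done[MAX_CORES];
static pthread_mutex_t stats_lock = PTHREAD_MUTEX_INITIALIZER;

static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Hypothetical stand-in for compressing one frame: a fixed CPU-bound loop. */
static void fake_encode_frame(void)
{
    volatile unsigned long sink = 0;
    for (unsigned long i = 0; i < 5000000UL; i++)
        sink += i;
}

static void *frame_worker(void *arg)
{
    int core = (int)(intptr_t)arg;
    assert(core >= 0 && core < MAX_CORES);

    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    /* Pin this thread to one core; report and carry on if that fails. */
    if (pthread_setaffinity_np(pthread_self(), sizeof(set), &set) != 0)
        fprintf(stderr, "could not pin thread to core %d\n", core);

    double start = now_seconds();
    fake_encode_frame();
    double elapsed = now_seconds() - start;

    pthread_mutex_lock(&stats_lock);
    total_seconds[core] += elapsed;
    jobs_done[core]++;
    pthread_mutex_unlock(&stats_lock);
    return NULL;
}

/* Run `frames` jobs in batches, one thread per core, cores chosen
 * round-robin. Returns the number of cores used, or -1 on error. */
int run_round_robin(int frames)
{
    long ncores = sysconf(_SC_NPROCESSORS_ONLN);
    if (ncores < 1)
        return -1;
    if (ncores > MAX_CORES)
        ncores = MAX_CORES;

    for (int base = 0; base < frames; base += (int)ncores) {
        pthread_t tids[MAX_CORES];
        int started = 0, failed = 0;
        for (int c = 0; c < ncores && base + c < frames; c++) {
            if (pthread_create(&tids[started], NULL, frame_worker,
                               (void *)(intptr_t)c) != 0) {
                failed = 1;
                break;
            }
            started++;
        }
        for (int j = 0; j < started; j++)
            pthread_join(tids[j], NULL);
        if (failed)
            return -1;
    }
    return (int)ncores;
}

/* Print the average wall-clock time per job for each core that ran any. */
void print_stats(int ncores)
{
    for (int c = 0; c < ncores; c++)
        if (jobs_done[c] > 0)
            printf("core %d: %ld jobs, avg %.3f ms\n", c, jobs_done[c],
                   1e3 * total_seconds[c] / (double)jobs_done[c]);
}
```

Compile with `-pthread`, then call `run_round_robin(600)` followed by `print_stats(...)` on its return value, once while the machine is healthy and once after the slowdown has started. A core whose average is clearly worse than the others is the one to look at. Reading `CLOCK_THREAD_CPUTIME_ID` in the worker as well would show whether the extra wall-clock time is spent waiting for the CPU or actually executing more slowly.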

answered 2011-10-02T18:24:58.853