
I have a problem with scalability and processing, and I would like the opinion of the Stack Overflow community.

I have XML data coming down a socket and I want to process it. For each XML line received, processing can include writing to a text file, opening a socket to another server, and running various database queries, all of which take time.

At the minute my solution involves the following threads:

Thread 1: Accepts incoming sockets and spawns a child thread to handle each one (there will only be a couple of incoming sockets from clients). When an XML line comes through (the ReadLine() method on StreamReader), I put the line into a Queue, which is accessible via a static method on a class. This static method contains locking logic to ensure that the program is thread-safe (I could of course use ConcurrentQueue for this instead of manual locking).

Threads 2-5: Constantly take XML lines from the queue and process them one at a time (database queries, file writes, etc.).

This method seems to be working, but I was curious whether there is a better way of doing things, because this seems very crude. If I move the processing that threads 2-5 do into thread 1, performance becomes extremely slow, which I expected, so I created my worker threads (2-5).

I appreciate I could replace threads 2-5 with a thread pool, but the thread pool would still be reading from the same Queue of XML lines, so I wondered whether there is a more efficient way of processing these events than using the Queue.


1 Answer


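A queue [1] is the right approach. But I would definitely move from manual thread control to the thread pool (so I don't need to do thread management) and let it manage the number of threads. [2]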
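As a rough illustration of that arrangement, here is a minimal sketch in which thread-pool tasks drain a shared queue. It assumes `BlockingCollection<string>` (which wraps a `ConcurrentQueue<string>` by default) as the queue, and `ProcessLine` is a hypothetical stand-in for the real database and file work; neither comes from the question's code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Bounded so a fast socket reader cannot outrun the workers indefinitely.
var lines = new BlockingCollection<string>(boundedCapacity: 1000);
int processed = 0;

// Hypothetical stand-in for the real work (database queries, file writes, ...).
void ProcessLine(string xml) => Interlocked.Increment(ref processed);

// Four consumers scheduled on the thread pool instead of hand-managed threads.
Task[] workers = Enumerable.Range(0, 4)
    .Select(_ => Task.Run(() =>
    {
        // Blocks while the queue is empty; ends once CompleteAdding() has been
        // called and the queue has been drained.
        foreach (string xml in lines.GetConsumingEnumerable())
            ProcessLine(xml);
    }))
    .ToArray();

// Producer side: in the real program this would be the socket thread calling ReadLine().
for (int i = 0; i < 10; i++)
    lines.Add($"<event id=\"{i}\" />");

lines.CompleteAdding();   // signal that no more lines will arrive
Task.WaitAll(workers);
Console.WriteLine($"Processed {processed} lines");
```

The worker count of four only mirrors threads 2-5 in the question; the right number depends on how much of the work is IO-bound and is worth measuring.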

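Ultimately, a single computer (however expensive) can only do so much processing. At some point memory size, CPU-memory bandwidth, storage IO, network IO, ... will saturate. At that point, using an external queuing system (MSMQ, WebSphere*MQ, Rabbit-MQ, ...), with each task being a separate message, allows many workers across many computers to process the data (the "competing consumers" pattern).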


[1] I would move to ConcurrentQueue right away: getting locking right is hard, and the more of it you don't have to do yourself, the better.
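A minimal sketch of what that swap might look like, assuming the static helper currently wraps a `Queue<string>` in a `lock`. The variable names and sample XML are illustrative only.

```csharp
using System;
using System.Collections.Concurrent;

// Takes the place of a Queue<string> guarded by lock(...) in a static helper.
var queue = new ConcurrentQueue<string>();

// Producer (socket thread): no explicit lock is needed around Enqueue.
queue.Enqueue("<event id=\"1\" />");
queue.Enqueue("<event id=\"2\" />");

// Consumer (worker thread): TryDequeue returns false when the queue is empty
// instead of throwing, so there is no separate "check count, then dequeue" race.
int handled = 0;
while (queue.TryDequeue(out var xml))
{
    Console.WriteLine($"Handling {xml}");
    handled++;
}
```

Note that `ConcurrentQueue<T>` never blocks, so a worker that finds it empty has to poll or wait on some other signal.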

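[2] At some point you may find you need more control than the thread pool provides; that is the time to switch to a custom thread pool. But prototype and test: quite possibly your implementation will actually be worse; see paragraph 2.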

answered 2013-09-05T07:54:20.803