java - Design question: is this doable only with producer/consumer?

Question

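I'm trying to improve the performance of indexing my files with Lucene. To do that, I created a worker, "LuceneWorker", that does the job.

Given the code below, the "concurrent" execution becomes very slow. I think I know why: the list of futures grows until there is barely any memory left to run another LuceneWorker task.

Q: Is there a way to limit the number of "workers" that go into the executor? In other words, if there are 'n' futures, stop submitting and let the documents be indexed first?

My intuition is that I should build a consumer/producer with an ArrayBlockingQueue, but I'd like to know whether I'm right before I redesign it.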

        ExecutorService executor = Executors.newFixedThreadPool(cores);
        List<Future<List<Document>>> futures = new ArrayList<Future<List<Document>>>(3);

        // Submit one LuceneWorker per valid file; every Future is kept until it is read below.
        for (File file : files)
        {
            if (isFileIndexingOK(file))
            {
                System.out.println(file.getName());
                Future<List<Document>> future = executor.submit(new LuceneWorker(file, indexSearcher));
                futures.add(future);
            }
            else
            {
                System.out.println("NOT A VALID FILE FOR INDEXING: " + file.getName());
            }
        }

        // Only after everything has been submitted are the results written to the index.
        for (Future<List<Document>> future : futures)
        {
            try
            {
                List<Document> docs = future.get();

                for (Document doc : docs)
                {
                    writer.addDocument(doc);
                }
            }
            catch (Exception exp)
            {
                //exp code comes here.
            }
        }

score 1 · Accepted Answer

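If you want to limit the number of waiting jobs, use a ThreadPoolExecutor with a bounded queue, such as an ArrayBlockingQueue. Also roll your own RejectedExecutionHandler so that the submitting thread waits for capacity in the queue. You can't use the convenience methods in Executors for this, because newFixedThreadPool uses an unbounded LinkedBlockingQueue.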
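To make that concrete, here is a minimal sketch of such a pool. The class name BoundedIndexingPool, the handler BlockWhenFull, the pool and queue sizes, and the counter standing in for the real LuceneWorker are all made up for illustration; only the java.util.concurrent calls are the real API.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedIndexingPool {

    /** Hypothetical handler: makes the submitting thread wait until the queue has room. */
    static final class BlockWhenFull implements RejectedExecutionHandler {
        @Override
        public void rejectedExecution(Runnable task, ThreadPoolExecutor pool) {
            if (pool.isShutdown()) {
                throw new RejectedExecutionException("pool is shut down");
            }
            try {
                // put() blocks while the bounded queue is full.
                pool.getQueue().put(task);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RejectedExecutionException("interrupted while waiting for capacity", e);
            }
        }
    }

    /** Fixed-size pool whose work queue holds at most queueCapacity pending tasks. */
    static ThreadPoolExecutor newBoundedPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(queueCapacity),
                new BlockWhenFull());
    }

    /** Submits taskCount trivial tasks (stand-ins for LuceneWorker) and returns how many ran. */
    static int runDemo(int taskCount) {
        ThreadPoolExecutor pool = newBoundedPool(2, 4);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            // Blocks in BlockWhenFull once 2 tasks are running and 4 are queued.
            pool.execute(() -> completed.incrementAndGet());
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("completed tasks: " + runDemo(50));
    }
}
```

Two things to keep in mind with this approach. A task that the handler puts straight into the queue skips the executor's usual shutdown re-check, so shutdown() should be called from the submitting thread only after all submissions are done, as in the sketch. And the bounded queue only limits pending tasks: results held by completed futures still take memory until they are read, so with the code from the question it probably makes sense to call writer.addDocument inside the task instead of returning the documents.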

score 1 · Accepted Answer

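Depending on the typical input size and the complexity of the LuceneWorker class, I can imagine solving this problem, at least partially, with the Fork/Join framework. When using JDK 8's CountedCompleter implementation (included in jsr166y), I/O operations wouldn't cause any problems.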
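For reference, a rough sketch of what the CountedCompleter split pattern looks like. It only shows the divide-and-complete structure; the class names ForkJoinIndexingSketch and IndexRange are invented, and adding a file name to a queue is a placeholder for whatever LuceneWorker actually does per file.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountedCompleter;
import java.util.concurrent.ForkJoinPool;

public class ForkJoinIndexingSketch {

    /** Hypothetical task: handles files[lo, hi) by splitting the range in halves. */
    static final class IndexRange extends CountedCompleter<Void> {
        private final List<String> files;
        private final int lo;
        private final int hi;
        private final Queue<String> indexed;

        IndexRange(CountedCompleter<?> parent, List<String> files,
                   int lo, int hi, Queue<String> indexed) {
            super(parent);
            this.files = files;
            this.lo = lo;
            this.hi = hi;
            this.indexed = indexed;
        }

        @Override
        public void compute() {
            if (hi - lo >= 2) {
                int mid = (lo + hi) >>> 1;
                // The pending count must be set before the children are forked.
                setPendingCount(2);
                new IndexRange(this, files, mid, hi, indexed).fork();
                new IndexRange(this, files, lo, mid, indexed).fork();
            } else if (hi > lo) {
                // Placeholder for the real per-file indexing work.
                indexed.add(files.get(lo));
            }
            // Completes this task once all forked children have completed.
            tryComplete();
        }
    }

    /** Runs the sketch on a small dedicated pool and returns how many files were handled. */
    static int indexAll(List<String> files) {
        Queue<String> indexed = new ConcurrentLinkedQueue<>();
        ForkJoinPool pool = new ForkJoinPool(2);
        try {
            pool.invoke(new IndexRange(null, files, 0, files.size(), indexed));
        } finally {
            pool.shutdown();
        }
        return indexed.size();
    }

    public static void main(String[] args) {
        System.out.println(indexAll(Arrays.asList("a.txt", "b.txt", "c.txt")));
    }
}
```

Note that this does not bound memory by itself. It just replaces the explicit list of futures with a task tree whose parents complete when their children do.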
