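I have a single-threaded directory scanner. For each file it finds, I need to read the file's attributes and insert them into a database.

I have two questions about improving performance:

  1. How can I scan with multiple threads? (The target is the SD card of an Android phone.)
  2. How can I optimize bulk inserts into the database?

Here is the code: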

void scan() {
    File file = new File("/mnt/sdcard");
    fun(file);
}

void fun(File file) {
    if (!file.exists()) {
        return;
    }
    if (!file.isDirectory()) {
        // read attribute information and insert to db
        return;
    } else {
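        File[] arr = file.listFiles();
        // listFiles() returns null on an I/O error or an unreadable directory
        if (arr == null) {
            return;
        }
        for (int i = 0; i < arr.length; i++) {
            fun(arr[i]);
        }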
    }
}

3 Answers

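I don't think multithreading will help here. Scanning a directory is I/O-bound: even with several threads, they all end up waiting for the same I/O operations to complete, so at any given moment effectively only one thread is scanning.

It would help only if the I/O on your directories can run in parallel, for example when they are spread across multiple disks.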
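
One way to check whether the scan really is I/O-bound on a given device is to time the plain single-threaded walk first and use that number as a baseline for any threaded variant. The following is only a minimal sketch: the class name ScanTimer and the method countFiles are made up for illustration, and the default path is taken from the question.

```java
import java.io.File;

class ScanTimer {

    // Recursively counts non-directory entries under root; returns 0 if root is missing.
    public static long countFiles(File root) {
        if (!root.exists()) {
            return 0;
        }
        if (!root.isDirectory()) {
            return 1;
        }
        File[] children = root.listFiles();
        if (children == null) {   // unreadable directory or I/O error
            return 0;
        }
        long total = 0;
        for (File child : children) {
            total += countFiles(child);
        }
        return total;
    }

    public static void main(String[] args) {
        // Default path comes from the question; pass another directory as the first argument.
        File root = new File(args.length > 0 ? args[0] : "/mnt/sdcard");
        long start = System.nanoTime();
        long files = countFiles(root);
        long elapsedMs = (System.nanoTime() - start) / 1000000L;
        System.out.println(files + " files in " + elapsedMs + " ms");
    }
}
```

Run it twice: the second run is usually served largely from the file system cache, so the gap between the two runs hints at how much of the time is spent waiting on the storage device.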

answered 2013-03-04T03:17:29.470

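Yes, multithreading can improve performance, for instance when one thread is doing disk I/O while another is doing network I/O. I'll write a small example.

Here is the example (best read before bed :)). Construct the class with ReadThenAll(5) and scan() will start 5 threads that explore folders and subfolders. Have fun!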

package foo;

import java.io.File;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class ReadThenAll {

    // subfolders to explore
    private final Queue<File> exploreList = new ConcurrentLinkedQueue<File>();

    private long counter = 0;

    public void count() {
        counter++;
    }

    public static void main(String[] args) {

        ReadThenAll me = new ReadThenAll(5);
        me.scan("/tmp");

    }

    int[] threads;

    public ReadThenAll(int numberOfThreads) {
        threads = new int[numberOfThreads];

        for (int i = 0; i < threads.length; i++) {
            threads[i] = -1;
        }
    }

    void scan(String fileName) {

        final long start = System.currentTimeMillis();

        // add the first one to the list
        File file = new File(fileName);
        exploreList.add(file);

        for (int i = 0; i < threads.length; i++) {
            FileExplorer explorer = new FileExplorer(i, this);
            Thread t = new Thread(explorer);
            t.start();
        }

        Thread waitToFinish = new Thread(new Runnable() {

            @Override
            public void run() {

                boolean working = true;
                while (working) {
                    working = false;

                    for (int i = 0; i < threads.length; i++) {
                        if (threads[i] == -1) {
                            working = true;
                            break;
                        }
                    }

                    try {
                        Thread.sleep(2);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }

                long elapsed = System.currentTimeMillis() - start;
                System.out.println("total time (ms) : " + elapsed);

            }
        });

        waitToFinish.start();
    }

    public void done(int id, int counter) {
        threads[id] = counter;
    }

    class FileExplorer implements Runnable {

        public int counter = 0;
        public ReadThenAll owner;
        private int id;

        public FileExplorer(int id, ReadThenAll owner) {
            this.id = id;
            this.owner = owner;
        }

        @Override
        public void run() {
            while (!owner.exploreList.isEmpty()) {

                // get the first from the list
                try {
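                    // poll() returns null instead of throwing if another
                    // thread emptied the queue after the isEmpty() check
                    File file = (File) owner.exploreList.poll();

                    if (file != null && file.exists()) {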

                        if (!file.isDirectory()) {
                            doThemagic(file);
                        } else {

                            // add the files to the queue
                            File[] arr = file.listFiles();
                            if (arr != null) {
                                for (int i = 0; i < arr.length; i++) {
                                    owner.exploreList.add(arr[i]);
                                }
                            }
                        }
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                    // silent kill :)
                }

                try {
                    Thread.sleep(1);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }

            owner.done(id, counter);
            System.out.println("total of files : " + counter);
        }

        private void doThemagic(File file) {
            System.out.println(file.toString());
            counter++;
        }
    }

}
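
One caveat with the listing above: each worker leaves its loop as soon as it sees an empty queue, so threads that start while the queue is momentarily empty (for example right after the first thread takes the root folder) exit early and the scan can end up mostly single-threaded. A possible alternative, shown here only as a sketch, is to submit each directory as a task to a fixed thread pool and count outstanding tasks, so that completion is detected only when no directory is queued or in progress. The class ParallelScanner and its methods are hypothetical names, and this swaps the hand-rolled queue for java.util.concurrent's ExecutorService.

```java
import java.io.File;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Single-use scanner: create a new instance for every scan.
class ParallelScanner {

    private final ExecutorService pool;
    private final AtomicInteger pending = new AtomicInteger(); // tasks queued or running
    private final AtomicLong fileCount = new AtomicLong();
    private final CountDownLatch finished = new CountDownLatch(1);

    public ParallelScanner(int numberOfThreads) {
        pool = Executors.newFixedThreadPool(numberOfThreads);
    }

    // Scans root and blocks until every directory has been processed.
    public long scan(File root) {
        submit(root);
        try {
            finished.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // give up waiting, keep the flag
        } finally {
            pool.shutdown();
        }
        return fileCount.get();
    }

    private void submit(final File entry) {
        pending.incrementAndGet(); // count the task before it can possibly finish
        pool.execute(new Runnable() {
            @Override
            public void run() {
                try {
                    explore(entry);
                } finally {
                    if (pending.decrementAndGet() == 0) {
                        finished.countDown(); // nothing queued, nothing running
                    }
                }
            }
        });
    }

    private void explore(File entry) {
        if (!entry.isDirectory()) {
            if (entry.exists()) {
                handleFile(entry);
            }
            return;
        }
        File[] children = entry.listFiles();
        if (children == null) { // unreadable directory or I/O error
            return;
        }
        for (File child : children) {
            if (child.isDirectory()) {
                submit(child); // each subdirectory becomes its own task
            } else {
                handleFile(child);
            }
        }
    }

    // Placeholder for the per-file work (reading attributes and so on).
    private void handleFile(File file) {
        fileCount.incrementAndGet();
    }

    public static void main(String[] args) {
        // "/tmp" mirrors the example above; pass another directory as the first argument.
        File root = new File(args.length > 0 ? args[0] : "/tmp");
        long files = new ParallelScanner(5).scan(root);
        System.out.println("total of files : " + files);
    }
}
```

Because a child task is counted before its parent task finishes, the pending counter can only reach zero once the whole tree has been visited. The sketch does not guard against symbolic-link cycles.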
answered 2013-03-04T03:45:16.557

You can follow a design along these lines:

1 - Create a queue that supports multiple readers and a single writer.
2 - Get the number of CPUs on the system where the program will run, because you cannot run more threads than that truly simultaneously.
3 - I/O is always blocking: if two threads write to the disk, their writes are serialized, unless you have multiple physical storage devices that can be accessed independently.
4 - With the queue from step 1, you can write to the queue and read from it at the same time.
5 - Database operations are blocking too: your thread has to wait until it gets a response from the DB server. Rather than blocking the thread, consider asynchronous processing with a callback mechanism.
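
The queue idea in the steps above can be sketched roughly as follows. This is one possible arrangement, not necessarily the exact one described: scanner threads put rows on a BlockingQueue, and a single writer thread drains it and hands rows to the database in batches, so the scanners never wait on the database. BatchWriter, BatchSink and insertAll are hypothetical names; on Android, an implementation of insertAll would presumably wrap each batch in one SQLiteDatabase transaction (beginTransaction, setTransactionSuccessful, endTransaction), which is the usual way to speed up bulk inserts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class BatchWriter implements Runnable {

    // Hypothetical sink; a real one would insert the rows in a single transaction.
    interface BatchSink {
        void insertAll(List<String> rows);
    }

    // Unique end-of-input marker, compared by identity so no real row can match it.
    private static final String POISON = new String("<end-of-scan>");

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<String>();
    private final BatchSink sink;
    private final int batchSize;

    public BatchWriter(BatchSink sink, int batchSize) {
        this.sink = sink;
        this.batchSize = batchSize;
    }

    // Called by scanner threads for every file they find; never blocks.
    public void submit(String row) {
        queue.add(row);
    }

    // Called once, after all scanner threads have finished.
    public void finish() {
        queue.add(POISON);
    }

    @Override
    public void run() {
        List<String> batch = new ArrayList<String>(batchSize);
        try {
            while (true) {
                String row = queue.take(); // blocks until a row or the marker arrives
                if (row == POISON) {
                    break;
                }
                batch.add(row);
                if (batch.size() >= batchSize) {
                    sink.insertAll(new ArrayList<String>(batch));
                    batch.clear();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        if (!batch.isEmpty()) {
            sink.insertAll(new ArrayList<String>(batch)); // flush the remainder
        }
    }
}
```

A typical use would be to start one thread with new Thread(writer).start(), call writer.submit(...) from the scanning code, and call writer.finish() when the scan is done. The batch size is a tuning knob: larger batches mean fewer transactions but more rows held in memory.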
answered 2013-03-04T04:11:11.073