6

我已经(在 Java 中)实现了插入排序、合并排序、修改合并排序和快速排序:

ModifiedMergeSort 有一个用于元素“绑定”的变量。当要排序的元素小于或等于“绑定”时,则使用插入排序对它们进行排序。

为什么版本 1 比版本 3、4 和 5 更好?

第 2 版和第 6 版的结果是否现实?

这是我的结果(以毫秒为单位):

Version 1 - Insertion Sort: Run-Times over 50 test runs
Input Size      Best-Case       Worst-Case      Average-Case
N = 10000       14              19              14.96
N = 20000       59              60              59.3
N = 40000       234             277             243.1

Version 2 - Merge Sort: Run-Times over 50 test runs
Input Size      Best-Case       Worst-Case      Average-Case
N = 10000       1               15              1.78
N = 20000       3               8               3.4
N = 40000       6               9               6.7

Version 3 - Merge Sort and Insertion Sort on 15 elements: Run-Times over 50 test runs
Input Size      Best-Case       Worst-Case      Average-Case
N = 10000       41              52              45.42
N = 20000       170             176             170.56
N = 40000       712             823             728.08

Version 4 - Merge Sort and Insertion Sort on 30 elements: Run-Times over 50 test runs
Input Size      Best-Case       Worst-Case      Average-Case
N = 10000       27              33              28.04
N = 20000       113             119             114.36
N = 40000       436             497             444.2

Version 5 - Merge Sort and Insertion Sort on 45 elements: Run-Times over 50 test runs
Input Size      Best-Case       Worst-Case      Average-Case
N = 10000       20              21              20.68
N = 20000       79              82              79.7
N = 40000       321             383             325.64

Version 6 - Quick Sort: Run-Times over 50 test runs
Input Size      Best-Case       Worst-Case      Average-Case
N = 10000       1               9               1.18
N = 20000       2               3               2.06
N = 40000       4               5               4.26

这是我的代码:

package com.testing;

import com.sorting.InsertionSort;
import com.sorting.MergeSort;
import com.sorting.ModifiedMergeSort;
import com.sorting.RandomizedQuickSort;

/**
 *
 * @author mih1406
 */
public class Main {
    private static final int R = 50; // # of tests
    private static int N = 0; // Input size
    private static int[] array; // Tests array
    private static int[] temp; // Tests array

    private static long InsertionSort_best = -1;
    private static long InsertionSort_worst = -1;
    private static double InsertionSort_average = -1.0;

    private static long MergeSort_best = -1;
    private static long MergeSort_worst = -1;
    private static double MergeSort_average = -1.0;

    private static long ModifiedMergeSort_V3_best = -1;
    private static long ModifiedMergeSort_V3_worst = -1;
    private static double ModifiedMergeSort_V3_average = -1.0;

    private static long ModifiedMergeSort_V4_best = -1;
    private static long ModifiedMergeSort_V4_worst = -1;
    private static double ModifiedMergeSort_V4_average = -1.0;

    private static long ModifiedMergeSort_V5_best = -1;
    private static long ModifiedMergeSort_V5_worst = -1;
    private static double ModifiedMergeSort_V5_average = -1.0;

    private static long RandomizedQuickSort_best = -1;
    private static long RandomizedQuickSort_worst = -1;
    private static double RandomizedQuickSort_average = -1.0;


    public static void main(String args[]) {
        StringBuilder InsertionSort_text = new StringBuilder(
                "Version 1 - Insertion Sort: Run-Times over 50 test runs\n");

        StringBuilder MergeSort_text = new StringBuilder(
                "Version 2 - Merge Sort: Run-Times over 50 test runs\n");

        StringBuilder ModifiedMergeSort_V3_text = new StringBuilder(
                "Version 3 - Merge Sort and Insertion Sort on 15 elements: Run-Times over 50 test runs\n");

        StringBuilder ModifiedMergeSort_V4_text = new StringBuilder(
                "Version 4 - Merge Sort and Insertion Sort on 30 elements: Run-Times over 50 test runs\n");

        StringBuilder ModifiedMergeSort_V5_text = new StringBuilder(
                "Version 5 - Merge Sort and Insertion Sort on 45 elements: Run-Times over 50 test runs\n");

        StringBuilder RandomizedQuickSort_text = new StringBuilder(
                "Version 6 - Quick Sort: Run-Times over 50 test runs\n");

        InsertionSort_text.append("Input Size\t\t"
                + "Best-Case\t\tWorst-Case\t\tAverage-Case\n");

        MergeSort_text.append("Input Size\t\t"
                + "Best-Case\t\tWorst-Case\t\tAverage-Case\n");

        ModifiedMergeSort_V3_text.append("Input Size\t\t"
                + "Best-Case\t\tWorst-Case\t\tAverage-Case\n");

        ModifiedMergeSort_V4_text.append("Input Size\t\t"
                + "Best-Case\t\tWorst-Case\t\tAverage-Case\n");

        ModifiedMergeSort_V5_text.append("Input Size\t\t"
                + "Best-Case\t\tWorst-Case\t\tAverage-Case\n");

        RandomizedQuickSort_text.append("Input Size\t\t"
                + "Best-Case\t\tWorst-Case\t\tAverage-Case\n");

        // Case N = 10000
        N = 10000;
        fillArray();
        testing_InsertionSort();
        testing_MergeSort();
        testing_ModifiedMergeSort(15);
        testing_ModifiedMergeSort(30);
        testing_ModifiedMergeSort(45);
        testing_RandomizedQuickSort();

        InsertionSort_text.append("N = " + N + "\t\t" + InsertionSort_best
                + "\t\t\t" + InsertionSort_worst + "\t\t\t"
                + InsertionSort_average + "\n");

        MergeSort_text.append("N = " + N + "\t\t" + MergeSort_best
                + "\t\t\t" + MergeSort_worst + "\t\t\t"
                + MergeSort_average + "\n");

        ModifiedMergeSort_V3_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V3_best
                + "\t\t\t" + ModifiedMergeSort_V3_worst + "\t\t\t"
                + ModifiedMergeSort_V3_average + "\n");

        ModifiedMergeSort_V4_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V4_best
                + "\t\t\t" + ModifiedMergeSort_V4_worst + "\t\t\t"
                + ModifiedMergeSort_V4_average + "\n");

        ModifiedMergeSort_V5_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V5_best
                + "\t\t\t" + ModifiedMergeSort_V5_worst + "\t\t\t"
                + ModifiedMergeSort_V5_average + "\n");

        RandomizedQuickSort_text.append("N = " + N + "\t\t" + RandomizedQuickSort_best
                + "\t\t\t" + RandomizedQuickSort_worst + "\t\t\t"
                + RandomizedQuickSort_average + "\n");

        // Case N = 20000
        N = 20000;
        fillArray();
        testing_InsertionSort();
        testing_MergeSort();
        testing_ModifiedMergeSort(15);
        testing_ModifiedMergeSort(30);
        testing_ModifiedMergeSort(45);
        testing_RandomizedQuickSort();

        InsertionSort_text.append("N = " + N + "\t\t" + InsertionSort_best
                + "\t\t\t" + InsertionSort_worst + "\t\t\t"
                + InsertionSort_average + "\n");

        MergeSort_text.append("N = " + N + "\t\t" + MergeSort_best
                + "\t\t\t" + MergeSort_worst + "\t\t\t"
                + MergeSort_average + "\n");

        ModifiedMergeSort_V3_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V3_best
                + "\t\t\t" + ModifiedMergeSort_V3_worst + "\t\t\t"
                + ModifiedMergeSort_V3_average + "\n");

        ModifiedMergeSort_V4_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V4_best
                + "\t\t\t" + ModifiedMergeSort_V4_worst + "\t\t\t"
                + ModifiedMergeSort_V4_average + "\n");

        ModifiedMergeSort_V5_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V5_best
                + "\t\t\t" + ModifiedMergeSort_V5_worst + "\t\t\t"
                + ModifiedMergeSort_V5_average + "\n");

        RandomizedQuickSort_text.append("N = " + N + "\t\t" + RandomizedQuickSort_best
                + "\t\t\t" + RandomizedQuickSort_worst + "\t\t\t"
                + RandomizedQuickSort_average + "\n");

        // Case N = 40000
        N = 40000;
        fillArray();
        testing_InsertionSort();
        testing_MergeSort();
        testing_ModifiedMergeSort(15);
        testing_ModifiedMergeSort(30);
        testing_ModifiedMergeSort(45);
        testing_RandomizedQuickSort();

        InsertionSort_text.append("N = " + N + "\t\t" + InsertionSort_best
                + "\t\t\t" + InsertionSort_worst + "\t\t\t"
                + InsertionSort_average + "\n");

        MergeSort_text.append("N = " + N + "\t\t" + MergeSort_best
                + "\t\t\t" + MergeSort_worst + "\t\t\t"
                + MergeSort_average + "\n");

        ModifiedMergeSort_V3_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V3_best
                + "\t\t\t" + ModifiedMergeSort_V3_worst + "\t\t\t"
                + ModifiedMergeSort_V3_average + "\n");

        ModifiedMergeSort_V4_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V4_best
                + "\t\t\t" + ModifiedMergeSort_V4_worst + "\t\t\t"
                + ModifiedMergeSort_V4_average + "\n");

        ModifiedMergeSort_V5_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V5_best
                + "\t\t\t" + ModifiedMergeSort_V5_worst + "\t\t\t"
                + ModifiedMergeSort_V5_average + "\n");

        RandomizedQuickSort_text.append("N = " + N + "\t\t" + RandomizedQuickSort_best
                + "\t\t\t" + RandomizedQuickSort_worst + "\t\t\t"
                + RandomizedQuickSort_average + "\n");

        System.out.println(InsertionSort_text);
        System.out.println(MergeSort_text);
        System.out.println(ModifiedMergeSort_V3_text);
        System.out.println(ModifiedMergeSort_V4_text);
        System.out.println(ModifiedMergeSort_V5_text);
        System.out.println(RandomizedQuickSort_text);

    }

    private static void fillArray() {
        // (re)creating the array
        array = new int[N];

        // filling the array with random numbers
        // using for-loop and the given random generator instruction
        for(int i = 0; i < array.length; i++) {
            array[i] = (int)(1 + Math.random() * (N-0+1));
        }
    }

    private static void testing_InsertionSort() {
        // run-times arrays
        long [] run_times = new long[R];

        // start/finish times
        long start;
        long finish;
        for(int i = 0; i < R; i++) {
            copyArray();
            // recording start time
            start = System.currentTimeMillis();

            // testing the algorithm
            InsertionSort.sort(temp);

            // recording finish time
            finish = System.currentTimeMillis();

            run_times[i] = finish-start;
        }

        InsertionSort_best = findMin(run_times);
        InsertionSort_worst = findMax(run_times);
        InsertionSort_average = findAverage(run_times);
    }

    private static void testing_MergeSort() {
        // run-times arrays
        long [] run_times = new long[R];

        // start/finish times
        long start;
        long finish;
        for(int i = 0; i < R; i++) {
            copyArray();
            // recording start time
            start = System.currentTimeMillis();

            // testing the algorithm
            MergeSort.sort(temp);

            // recording finish time
            finish = System.currentTimeMillis();

            run_times[i] = finish-start;
        }

        MergeSort_best = findMin(run_times);
        MergeSort_worst = findMax(run_times);
        MergeSort_average = findAverage(run_times);
    }

    private static void testing_ModifiedMergeSort(int bound) {
        // run-times arrays
        long [] run_times = new long[R];

        // setting the bound
        ModifiedMergeSort.bound = bound;

        // start/finish times
        long start;
        long finish;
        for(int i = 0; i < R; i++) {
            copyArray();
            // recording start time
            start = System.currentTimeMillis();

            // testing the algorithm
            ModifiedMergeSort.sort(temp);

            // recording finish time
            finish = System.currentTimeMillis();

            run_times[i] = finish-start;
        }

        if(bound == 15) {
            ModifiedMergeSort_V3_best = findMin(run_times);
            ModifiedMergeSort_V3_worst = findMax(run_times);
            ModifiedMergeSort_V3_average = findAverage(run_times);
        }

        if(bound == 30) {
            ModifiedMergeSort_V4_best = findMin(run_times);
            ModifiedMergeSort_V4_worst = findMax(run_times);
            ModifiedMergeSort_V4_average = findAverage(run_times);
        }

        if(bound == 45) {
            ModifiedMergeSort_V5_best = findMin(run_times);
            ModifiedMergeSort_V5_worst = findMax(run_times);
            ModifiedMergeSort_V5_average = findAverage(run_times);
        }
    }

    private static void testing_RandomizedQuickSort() {
        // run-times arrays
        long [] run_times = new long[R];

        // start/finish times
        long start;
        long finish;
        for(int i = 0; i < R; i++) {
            copyArray();
            // recording start time
            start = System.currentTimeMillis();

            // testing the algorithm
            RandomizedQuickSort.sort(temp);

            // recording finish time
            finish = System.currentTimeMillis();

            run_times[i] = finish-start;
        }

        RandomizedQuickSort_best = findMin(run_times);
        RandomizedQuickSort_worst = findMax(run_times);
        RandomizedQuickSort_average = findAverage(run_times);
    }

    private static long findMax(long[] times) {
        long max = times[0];

        for(int i = 1; i < times.length; i++) {
            if(max < times[i]) {
                max = times[i];
            }
        }

        return max;
    }

    private static long findMin(long[] times) {
        long min = times[0];

        for(int i = 1; i < times.length; i++) {
            if(min > times[i]) {
                min = times[i];
            }
        }

        return min;
    }

    private static double findAverage(long[] times) {
        long sum = 0;
        double avg;

        for(int i = 0; i < times.length; i++) {
            sum += times[i];
        }

        avg = (double)sum/(double)times.length;

        return avg;
    }

    private static void copyArray() {
        temp = new int[N];
        System.arraycopy(array, 0, temp, 0, array.length);
    }
}
4

2 回答 2

3

您当前采用的方法似乎存在一些系统性错误。我将说明您面临的一些最明显的实验问题,即使它们可能不是直接导致您的实验结果的罪魁祸首。

JVM编译

正如我之前在评论中所说,JVM 默认会以解释模式运行您的代码。这意味着在您的方法中找到的每条字节码指令都将在现场被解释,然后运行。

这种方法的优点是它允许您的应用程序具有比在每次运行启动时编译为本机代码的 Java 程序更快的启动时间。

不利的一面是,虽然没有影响启动性能,但您会得到一个比其他情况下性能更慢的程序。

为了改善这两个问题,JVM 团队进行了权衡。最初,您的程序将被解释,但 JVM 将收集有关哪些方法(或方法的一部分)被密集使用的信息,并且只会编译那些似乎被大量使用的方法。编译时,您会受到很小的性能影响,但代码会比以前快得多。

在进行测量时,您必须考虑到这一事实。

标准方法是“预热 JVM”,也就是说,运行你的算法一段时间以确保 JVM 完成它需要做的所有编译。重要的是,让 JVM 升温的代码与您要进行基准测试的代码相同,否则在您对代码进行基准测试时可能会有一些编译。

测量时间

测量时间时,应使用System.nanoTime()代替System.currentTimeMillis。细节我就不说了,可以看这里

你也应该小心。以下代码块乍一看可能是等效的,但大多数时候会产生不同的结果:

totalDuration = 0;
for (i = 0; i < 1000; ++i)
    startMeasure = now();
    algorithm();
    endMeasure = now();
    duration = endMeasure - startMeasure;
    totalDuration += duration;
}

//...

TRIALS_COUNT = 1000;
startMeasure = now();
for (i = 0; i < TRIALS_COUNT; ++i)
    algorithm();
}
endMeasure = now();
 duration = endMeasure - startMeasure / TRIALS_COUNT;

为什么?因为特别是对于更快algorithm()的实现,它们越快,就越难获得准确的结果。

大输入值

渐近符号有助于理解不同的算法将如何升级n. 将它们应用于小的输入值通常是荒谬的,因为在那个量级上,您通常想要计算操作的精确数量及其相关成本(类似于 Jakub 所做的事情)。大多数情况下,大 O 符号只对大 O 有意义。Big O 将告诉您该算法如何适用于难以忍受的输入值大小,而不是它对于小数字的表现。典型的例子是 QuickSort,它对于大数组来说是王道,但是对于大小为 4 或 5 的数组,它通常比选择或插入排序要慢。不过,您的输入大小似乎足够大。

最后一点

正如 Chang 之前所说,Jakub 所做的数学练习是完全错误的,不应考虑在内。

于 2013-03-10T19:54:42.833 回答
-1

Do the computations of the complexity by yourself. I assume 10000 samples for the following calculations:

Insertion sort: O(n^2), 10 000*10 000 = 100 000 000.

Merge sort: O(nlogn), 10 000*log10 000 = 140 000.

Merge with insertion (15): 15 is between 9 (arrays of sizes 20) and 10 (arrays of sizes 10) 2^10 insertion sorts (of size 10), then 2^10 * 10 000 merges: 1 024 * 10*10 (insertions) + 1 024 * 10 000 (merges) = 10 342 400

Merge with insertion (30): 30 is between 8 (arrays of sizes 40) and 9 (arrays of sizes 20) 2^9 insertion sorts (of size 20), then 2^9 * 10 000 merges: 512 * 20*20 (insertions) + 512 * 10 000 (merges) = 5 324 800

Merge with insertion (45): 45 is between 7 (arrays of sizes 80) and 8 (arrays of sizes 40) 2^8 insertion sorts (of size 40), then 2^8 * 10 000 merges: 256 * 40*40 (insertions) + * 10 000 (merges) = 2 969 600

Quicksort: while worst-case quicksort takes O(n^2), the worst case must be specially crafted to hit that limit. Mostly, using radomly generated algorithm, on average it is O(nlogn): 10 000*log10 000 = 140 000.

Measuring sorting algorithm performace can became quite a pain because you need to measure efficiently, on as wide range of input data as possible.

The figures you see on insertion sort can be largerly biased by the input numbers. If you are using only 0s and 1s in the array, and the array is randomly generated, then you actually have much easier problem for the algorithm to solve. For the given case, on average half of the array is already sorted, and you don't need to compare 0s and 1s with each other. The problem is reduced to transporting all 0s to the left, which on average takes only (log(n/2))!+n time. For 10 000, the actual time is 5 000!+10 000 = 133 888.

于 2013-03-10T00:51:25.233 回答