5

I have 2 arrays which are not sorted. Would it be faster to sort them individually and then merge them? Or would it be faster to just concatenate the arrays first and sort the combined huge array?

4

5 回答 5

7

Assuming that concatenation is done in O(1), merging takes O(n) and sorting O(n log n), you have the choice between:

  • sort and merge: O(n log n) + O(n) = O(n log n)
  • concatenate and sort: O(1) + O((2n) log (2n)) = O(n log n)

therefore, asymptotically both options are equivalent.

Of course, the whole discussion is moot anyway if you use MergeSort.

于 2013-04-10T11:35:10.713 回答
4

Clearly, big-O isn't really saying anything in this problem. Assuming the algorithm you are using is quicksort. It has a average running time of:

enter image description here

So now, if sort then merge we get:

f1 = 1.39n * log(n) * 2 + 2n

merge then sort:

f2 = n + 1.39 * 2n * log(2n)

The difference is

f2 - f1 = -n + 2.78n > 0

In the general case, if a sorting algorithm has complexity

C = k * nlog(n)

then since k should be normally bigger than 1, and isn't likely to be anywhere near 0.5, sort then merge will be faster if you are assuming the merge costs at most 2n.

于 2013-04-10T13:42:58.300 回答
0

I think that'll depend on the sorting algorithm and the size of your data.

But a wild guess is that merging and then sorting the whole lot is preferable. Because the merge in that case is just appending.

While in the other case you'll need to apply a sorting merge.

于 2013-04-10T11:32:24.257 回答
0

When it is guaranteed that all entries in the 2nd array are larger than all in the 1st array, then you can concatenate the arrays after sorting each one. Every sorting algorithm has a complexity which is worse than linear, so when you can break down a sorting task into subsets which can be sorted individually, you should do it.

But when the entries need to be sorted again after merging the arrays, sorting each array beforehand is unlikely to make this any faster.

When you want to know it exactly, create a large set of test data and measure the performance yourself.

于 2013-04-10T11:34:21.667 回答
0

That depends on the technique you are using.

Sort first and then merge would give you better results on a modern multi-processor architecture where you can run sorting algorithms on both arrays in parallel threads something around O(nlogn) (but with much lesser constant) and then merge them in O(n) time.

于 2013-04-10T11:35:04.450 回答