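I have a vector of integers. The vector's size is around 2k, and every number in it is in the range [0, 2M], with a high probability of being 0.
Since it is a sparse vector, I wonder whether there is a better way to sort it than the usual algorithms. Which sorting algorithm suits this case best?
Thanks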
This answer might be a bit too obvious...
Since most entries are zero, why not do a preliminary exchange so that all the zeros are at one end of the vector and the non-zero elements are at the other?
Start from both ends of the vector. From one end, search for the first non-zero element; from the other end, search for the first zero element. Swap them and continue until the two search positions meet. The vector is now partitioned at the meeting point: one part contains only zeros and the other only non-zero elements. Then sort just the non-zero part, starting from the meeting point. There should be very few items that actually need sorting.
When sorting a few dozen elements or so, the actual sorting algorithm used doesn't make much difference from a performance point of view (for half a dozen elements or so, bubble sort is hard to beat!).
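The partition-then-sort idea above can be sketched in C++ as follows. This is only an illustration: the function name `sort_mostly_zero` is made up for this example, and it uses `std::partition` in place of a hand-written two-ended exchange, which gives the same zero/non-zero split. It assumes the values are non-negative, as in the question, so the zeros belong at the front.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: moves all zeros to the front, then sorts only
// the non-zero tail. Assumes non-negative values, so that zeros come
// first in ascending order.
void sort_mostly_zero(std::vector<int>& v) {
    // After std::partition, every element for which the predicate is
    // true (the zeros) precedes every element for which it is false.
    auto first_nonzero = std::partition(v.begin(), v.end(),
                                        [](int x) { return x == 0; });
    // Only the (hopefully short) non-zero range needs real sorting.
    std::sort(first_nonzero, v.end());
}
```

The partition step is linear in the vector's size, so the total cost is dominated by sorting the non-zero elements only.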
If you have a vector with 2000 elements, don't worry too much about how to sort it... it is very small!
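That said, if you have a vector of n integers, each between 0 and M, and M is small, you can sort it in O(n) time using counting sort (more precisely O(n + M), hence the requirement that M be small).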
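For reference, a minimal counting sort might look like the sketch below. The function name `counting_sort` and the `max_value` parameter (playing the role of M) are choices made for this example, and the code assumes every input lies in [0, max_value].

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counting sort for integers in [0, max_value]. Runs in O(n + max_value)
// time and needs O(max_value) extra memory.
void counting_sort(std::vector<int>& v, int max_value) {
    std::vector<std::size_t> count(static_cast<std::size_t>(max_value) + 1, 0);
    for (int x : v) {
        assert(x >= 0 && x <= max_value);  // precondition of this sketch
        ++count[static_cast<std::size_t>(x)];
    }
    // Write each value back as many times as it was seen, in ascending order.
    std::size_t out = 0;
    for (std::size_t value = 0; value < count.size(); ++value) {
        for (std::size_t k = 0; k < count[value]; ++k) {
            v[out++] = static_cast<int>(value);
        }
    }
}
```

Note that with about 2k elements and values up to 2M, as in the question, the count array would be roughly a thousand times larger than the input, so this approach is a poor fit there.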
If the vector holds n real numbers in some known range, and the numbers are uniformly distributed, you can sort them in O(n) expected time using bucket sort.
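A bucket sort along these lines could be sketched as follows. The name `bucket_sort` and the choice of one bucket per element are assumptions made for this example, and the inputs are assumed to lie in the half-open range [lo, hi) with lo < hi.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Bucket sort for real numbers in [lo, hi). The expected running time is
// O(n) only if the inputs are roughly uniformly distributed.
void bucket_sort(std::vector<double>& v, double lo, double hi) {
    const std::size_t n = v.size();
    if (n < 2) return;
    std::vector<std::vector<double>> buckets(n);
    for (double x : v) {
        // Map x to a bucket index; the clamp guards against rounding
        // pushing a value near hi into a nonexistent bucket n.
        std::size_t i = static_cast<std::size_t>((x - lo) / (hi - lo) * n);
        buckets[std::min(i, n - 1)].push_back(x);
    }
    std::size_t out = 0;
    for (auto& b : buckets) {
        std::sort(b.begin(), b.end());  // each bucket is expected to be tiny
        for (double x : b) v[out++] = x;
    }
}
```

If the data is heavily skewed (for example, mostly zeros), most elements land in the same bucket and the advantage over a plain `std::sort` disappears.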
You're describing a regular dense vector that happens to have lots of 0 elements. A sparse vector only stores the nonzero elements, and if an element is not stored then it is assumed to be 0.
To sort a sparse vector just sort it normally. 2000 is already small, but if you genuinely use a sparse structure and "there is a high possibility [an element is] 0" then that number will be much smaller.
An example of a sparse structure is vector< pair<int, double> >, where pair.first is the index and pair.second is the value.
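To make the sparse representation concrete, here is a small sketch that builds such a structure from a dense vector and sorts the stored entries by value. The alias `SparseVector` and the helpers `to_sparse` and `sort_by_value` are hypothetical names introduced only for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

using SparseVector = std::vector<std::pair<int, double>>;  // (index, value)

// Collects the non-zero entries of a dense vector as (index, value) pairs.
SparseVector to_sparse(const std::vector<double>& dense) {
    SparseVector sparse;
    for (std::size_t i = 0; i < dense.size(); ++i) {
        if (dense[i] != 0.0) {
            sparse.emplace_back(static_cast<int>(i), dense[i]);
        }
    }
    return sparse;
}

// Sorts the stored entries by value; the implicit zeros are never touched.
void sort_by_value(SparseVector& sparse) {
    std::sort(sparse.begin(), sparse.end(),
              [](const std::pair<int, double>& a,
                 const std::pair<int, double>& b) {
                  return a.second < b.second;
              });
}
```

Because each pair keeps its original index, the position of every value in the dense vector can still be recovered after sorting.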
The best that comes to mind is radix sort, but that's harder to implement than 3-way quicksort. 3-way quicksort is a good fit here because it groups all elements equal to the pivot in a single partitioning pass, so with many duplicates its running time improves from O(n*log(n)) toward O(n). I also think there is an implementation in almost every programming language.
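For illustration, a 3-way quicksort can be sketched as below. The name `quicksort3` and the fixed middle-element pivot are assumptions made for this example; production implementations usually choose pivots more carefully.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sorts the half-open range v[lo, hi) using three-way partitioning.
void quicksort3(std::vector<int>& v, std::size_t lo, std::size_t hi) {
    if (hi - lo < 2) return;
    const int pivot = v[lo + (hi - lo) / 2];
    // Invariant: v[lo, lt) < pivot, v[lt, i) == pivot, v[gt, hi) > pivot.
    std::size_t lt = lo, i = lo, gt = hi;
    while (i < gt) {
        if (v[i] < pivot) {
            std::swap(v[lt++], v[i++]);
        } else if (v[i] > pivot) {
            std::swap(v[i], v[--gt]);
        } else {
            ++i;
        }
    }
    // Elements equal to the pivot are already in place and are skipped.
    quicksort3(v, lo, lt);
    quicksort3(v, gt, hi);
}

// Convenience overload that sorts the whole vector.
void quicksort3(std::vector<int>& v) { quicksort3(v, 0, v.size()); }
```

When 0 is picked as a pivot, all zeros are settled in that one pass, and only the non-zero elements are recursed on.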