nth_element (C++) / Finding the first n largest elements in an array

a partial sorting algorithm that rearranges elements in [first, last) with O(N) worst running time. - nth_element / SO

nth_element (C++)

  • The element pointed at by nth is changed to whatever element would occur in that position if [first, last) were sorted.
  • All of the elements before this new nth element are less than or equal to the elements after the new nth element.

On average, the implementations of C++ are based on introspective selection which has O(N) worst running time.

Median

Using the nth_element, we can specify the Nth to be the middle, which will be the definition of the median number (in the sorted array).

vector<int> data = { 5, 4, 2, 3, 9, 8, 6, 7, 0, 1 };
nth_element(begin(data), begin(data) + data.size() / 2, end(data));
cout << "Median is " << (data[data.size()/2]);

Using a priority queue

Applicable to an unbounded input stream Total complexity: O(N log k) where N is the total number of elements in the array and k the number element to keep.

Notes:

  • For k < N ( N > 100k, k=100), using a priority queue, was benched to be far more efficient than sorting and taking first k elements when looking for max.
  • For smaller array std::nth_element or even sort were faster (nth_element being always faster than sort).

Algorithm:

  • Unconditionnally insert into the queue the k first elements
  • For each remaining element x, insert x if it is greater than the least element of the queue (O(log k) operation), and remove the least element (O(log k)).
  • When done, the priority queue contains k elements, which are the k largest elements of the original array.

see also

std::vector<unsigned char> container;
container.reserve(1024);
std::priority_queue<unsigned char, std::vector<unsigned char>> pq (
    std::less<unsigned char>(), std::move(container));

see also

Written on December 23, 2020, Last update on June 9, 2023
c++ sort algorithm complexity median