Algorithms for Modern Hardware

More practical ways to speed up a program than by going from O(nlog n) to O(nlog log n). - Algorithmica / HN

Among the cool things that we will speed up:

  • 2x faster GCD (compared to std::gcd)
  • 8-15x faster binary search (compared to std::lower_bound)
  • 5-10x faster segment trees (compared to Fenwick trees)
  • 5x faster hash tables (compared to std::unordered_map)
  • 2x faster popcount (compared to repeatedly calling popcnt)
  • 2x faster parsing series of integers (compared to scanf)
  • ?x faster sorting (compared to std::sort)
  • 2x faster sum (compared to std::accumulate)
  • 2-3x faster prefix sum (compared to naive implementation)
  • 10x faster argmin (compared to naive implementation)
  • 10x faster array searching (compared to std::find)
  • 100x faster matrix multiplication (compared to “for-for-for”)
  • optimal word-size integer factorization (~0.4ms per 60-bit integer)
  • optimal Karatsuba Algorithm
  • optimal FFT
Written on March 8, 2022, Last update on March 9, 2022
book algorithm fastware