Algorithms for Modern Hardware
More practical ways to speed up a program than by going from O(nlog n) to O(nlog log n). - Algorithmica / HN
Among the cool things that we will speed up:
- 2x faster GCD (compared to std::gcd)
- 8-15x faster binary search (compared to std::lower_bound)
- 5-10x faster segment trees (compared to Fenwick trees)
- 5x faster hash tables (compared to std::unordered_map)
- 2x faster popcount (compared to repeatedly calling popcnt)
- 2x faster parsing series of integers (compared to scanf)
- ?x faster sorting (compared to std::sort)
- 2x faster sum (compared to std::accumulate)
- 2-3x faster prefix sum (compared to naive implementation)
- 10x faster argmin (compared to naive implementation)
- 10x faster array searching (compared to std::find)
- 100x faster matrix multiplication (compared to “for-for-for”)
- optimal word-size integer factorization (~0.4ms per 60-bit integer)
- optimal Karatsuba Algorithm
- optimal FFT
Written on March 8, 2022, Last update on March 9, 2022
book
algorithm
fastware