Fast Search in Hamming Space with Multi-Index Hashing
Build multiple hash tables on substrings of binary codes to enable exact K-nearest-neighbor search in Hamming space. The algorithm is straightforward to implement, storage-efficient, and has sub-linear runtime behavior for uniformly distributed codes. - Fast Search in Hamming Space with Multi-Index Hashing / SO
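The idea can be sketched in a few lines of Python. This is a minimal illustration, not the paper's reference implementation: the class name `MultiIndexHash`, its parameters (`bits`, `m`), and the simple "grow the radius until k hits are found" loop are assumptions made for this example. It relies on the pigeonhole argument: if two codes differ in at most r bits and each code is split into m disjoint substrings, at least one pair of substrings differs in at most floor(r / m) bits. Candidates are collected from the substring tables and then verified against the full code.

```python
from collections import defaultdict
from itertools import combinations


class MultiIndexHash:
    """Illustrative multi-index hash over integer-encoded binary codes."""

    def __init__(self, codes, bits=32, m=4):
        assert bits % m == 0, "this sketch assumes equal-length substrings"
        self.codes = list(codes)
        self.bits = bits
        self.m = m
        self.sub_bits = bits // m
        self.mask = (1 << self.sub_bits) - 1
        # One hash table per substring position: substring value -> code indices.
        self.tables = [defaultdict(list) for _ in range(m)]
        for idx, code in enumerate(self.codes):
            for j in range(m):
                self.tables[j][self._sub(code, j)].append(idx)

    def _sub(self, code, j):
        # Extract the j-th substring (counted from the least significant bits).
        return (code >> (j * self.sub_bits)) & self.mask

    def _ball(self, value, radius):
        # Enumerate every substring within Hamming distance `radius` of `value`.
        for d in range(radius + 1):
            for positions in combinations(range(self.sub_bits), d):
                flipped = value
                for p in positions:
                    flipped ^= 1 << p
                yield flipped

    def radius_search(self, query, r):
        """Return sorted (distance, index) pairs for all codes within distance r."""
        sub_r = r // self.m  # pigeonhole bound on the per-substring radius
        candidates = set()
        for j in range(self.m):
            for key in self._ball(self._sub(query, j), sub_r):
                candidates.update(self.tables[j].get(key, ()))
        hits = []
        for idx in candidates:
            # Verify each candidate against the full code.
            dist = bin(self.codes[idx] ^ query).count("1")
            if dist <= r:
                hits.append((dist, idx))
        return sorted(hits)

    def knn(self, query, k):
        """Exact k nearest neighbors: grow r until at least k codes lie within r."""
        k = min(k, len(self.codes))
        for r in range(self.bits + 1):
            hits = self.radius_search(query, r)
            if len(hits) >= k:
                return hits[:k]
        return []


if __name__ == "__main__":
    index = MultiIndexHash([0x0000FFFF, 0x0000FFFE, 0xFFFF0000], bits=32, m=4)
    print(index.knn(0x0000FFFF, 2))
```

Restarting the search from scratch at every radius keeps the sketch short but repeats work; an implementation tuned for speed would reuse the candidates already gathered at smaller radii.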
See also
- Exact binary vector search for RAG in 100 lines of Julia - RAG + full scan & hamming distance
- Improved Hamming Distance Search using Variable Length Hashing
- Detecting Near-Duplicates for Web Crawling
- Similarity Estimation Techniques from Rounding Algorithms
- Finding Near-Duplicate Web Pages: A Large-Scale Evaluation of Algorithms
- MyScale vs. PostgreSQL & OpenSearch: An Exploration into Integrated Vector Databases
Written on September 3, 2019. Last updated on May 19, 2024.
hash
hamming
distance
nearest-neighbor
vector
search
LLM