Perceptual Hash (phash)

A perceptual hash is a fingerprint of a multimedia file derived from various features from its content. Unlike cryptographic hash functions which rely on the avalanche effect of small changes in input leading to drastic changes in the output, perceptual hashes are “close” to one another if the features are similar. - phash.org
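One common way to make "close" concrete is to count the bits in which two hashes differ (the Hamming distance). The sketch below assumes each hash is stored as a Python integer. The names `hamming_distance` and `is_similar`, and the threshold of 10 bits, are illustrative choices and not part of any particular library.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two integer hashes differ."""
    return bin(a ^ b).count("1")


def is_similar(a: int, b: int, threshold: int = 10) -> bool:
    """Treat two hashes as 'close' when few bits differ.

    The threshold is an assumed value; a suitable cut-off depends on the
    hash algorithm and the image collection.
    """
    return hamming_distance(a, b) <= threshold
```

A distance of 0 means the fingerprints are identical. For a 64-bit hash, a distance around 32 is what two unrelated images would be expected to produce.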

Perceptual image hashes

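The Average Hash implementation is the simplest and the fastest one: it shrinks the image to a small grayscale grid (typically 8x8) and sets each bit according to whether the pixel is brighter than the average of all pixels. It is also the least accurate and tends to generate false positives. Two other implementations are Difference Hash (or dHash) and pHash.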
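As a rough illustration, Average Hash can be sketched in a few lines. This version assumes the image has already been converted to grayscale and downscaled to an 8x8 grid (for example with an image library), so `pixels` is simply a list of 8 rows of 8 brightness values. The function name `average_hash` is hypothetical.

```python
from statistics import mean


def average_hash(pixels):
    """Return a 64-bit integer hash for an 8x8 grid of grayscale values.

    Assumes `pixels` is a list of 8 rows with 8 numbers each, already
    downscaled and converted to grayscale by the caller.
    """
    flat = [p for row in pixels for p in row]
    avg = mean(flat)
    bits = 0
    for p in flat:
        # One bit per pixel: 1 if brighter than the average, else 0.
        bits = (bits << 1) | (1 if p > avg else 0)
    return bits
```

Because every bit depends on the same global average, images that share a coarse light/dark layout collapse to similar hashes, which is consistent with the false positives mentioned above.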

Difference Hash follows the same steps as the Average Hash, but generates the fingerprint based on whether each pixel is brighter than its right-hand neighbor, instead of comparing against a single average value. Compared to Average Hash it generates fewer false positives, which makes it a good default implementation.
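A minimal sketch of the left/right comparison, under the same assumption that the image is already grayscale and downscaled. Here the grid is assumed to be 9 pixels wide and 8 high, so that each row yields 8 comparisons and the result is 64 bits. The name `difference_hash` is hypothetical.

```python
def difference_hash(pixels):
    """Return a 64-bit integer hash for a grid of 8 rows x 9 grayscale values.

    Assumes the caller has already downscaled the image to 9x8 and
    converted it to grayscale.
    """
    bits = 0
    for row in pixels:
        # Compare each pixel with its right-hand neighbor.
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits
```

Since only the relative brightness of neighbors matters, adding the same amount of brightness to every pixel leaves the hash unchanged.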

pHash is quite different from the other two, and does more work to increase the accuracy. It resizes the image to 32x32, gets the luma (brightness) value of each pixel and applies a discrete cosine transform (DCT) to the resulting matrix. It then takes the top-left 8x8 coefficients, which represent the lowest frequencies in the picture, and calculates the hash by comparing each coefficient to the median value. Because of its complexity it is also the slowest one.
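The steps above can be sketched in plain Python. This is an illustration of the idea and not the pHash library's actual code: it assumes the caller supplies a 32x32 matrix of luma values (resizing is left to an image library), uses an unnormalized DCT-II, and the names `dct_1d`, `dct_2d` and `phash` are hypothetical.

```python
import math
from statistics import median


def dct_1d(values):
    """Unnormalized DCT-II of a sequence of numbers."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, v in enumerate(values))
        for k in range(n)
    ]


def dct_2d(matrix):
    """2D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in matrix]
    cols = [dct_1d(col) for col in zip(*rows)]
    # `cols` is indexed [x][y]; transpose back to [y][x].
    return [list(r) for r in zip(*cols)]


def phash(luma, low=8):
    """Return a 64-bit integer hash for a 32x32 matrix of luma values.

    Assumes `luma` was already resized to 32x32 by the caller.
    """
    coeffs = dct_2d(luma)
    # Keep the top-left low x low block: the lowest frequencies.
    block = [coeffs[y][x] for y in range(low) for x in range(low)]
    med = median(block)
    bits = 0
    for c in block:
        bits = (bits << 1) | (1 if c > med else 0)
    return bits
```

Comparing against the median rather than a fixed value means roughly half of the bits are set for a typical image, and a uniform brightness shift, which mainly affects the first (DC) coefficient, tends not to change the hash.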

  • toy/pHash - Ruby interface to pHash, as well as a fixed fork
    • supports audio / video / image / text
  • vmchale/phash - a Haskell library to detect (potential) duplicate images. It also contains a command-line tool.

HashImage

perceptual-dct-hash

ImageMatch / github

see also

Written on September 3, 2019, Last update on March 22, 2023
hash opencv