So far, pHash is capable of computing hashes for audio, video and image files. You start by computing hash values for the media files. The following are the function prototypes for the three methods:
int ph_dct_imagehash(const char *file, ulong64 &hash);
ulong64* ph_dct_videohash(const char *file, int &Length);
uint32_t* ph_audiohash(float *buf, int N, int sr, int &nbFrames);
The image hash is returned in the hash parameter. Audio hashes are returned as an array of uint32_t values, with the nbFrames parameter indicating the buffer length. Video hashes are returned as an array of uint64_t values, with the Length parameter indicating the number of elements in the array. You must free the memory returned by ph_audiohash() and ph_dct_videohash() with free() when you are finished using it.
For the audio hash, you must read the data into the buffer first. You do so with this function:
float* ph_readaudio(const char *filename, int sr, int channels, int &N);
(It is recommended you use sr as 8000 and just one channel.) N will be the length of your returned buffer. You must free the memory returned by ph_readaudio() with free() when you are finished using it (usually after a call to ph_audiohash()).
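Since several of these functions hand back memory that you must release with free(), it can help to let a smart pointer do the freeing. The following is only a sketch: FreeDeleter, MallocBuffer, make_sample_buffer() and sum_samples() are names made up for this illustration, and make_sample_buffer() is a stand-in for a pHash call that returns a malloc()-style buffer, not part of the library.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>

// Deleter that releases malloc()-style memory with free().
struct FreeDeleter {
    void operator()(void *p) const { std::free(p); }
};

// Owning pointer for a buffer that must be released with free().
template <typename T>
using MallocBuffer = std::unique_ptr<T[], FreeDeleter>;

// Hypothetical stand-in for a call such as ph_readaudio(): it returns a
// malloc()-allocated buffer and reports the buffer length through n.
float *make_sample_buffer(int &n) {
    n = 4;
    float *buf = static_cast<float *>(std::malloc(n * sizeof(float)));
    assert(buf != nullptr);
    for (int i = 0; i < n; ++i) buf[i] = 0.5f * static_cast<float>(i);
    return buf;
}

// Sums the samples. The buffer is freed automatically when `samples`
// goes out of scope, so no explicit free() call is needed.
float sum_samples() {
    int n = 0;
    MallocBuffer<float> samples(make_sample_buffer(n));
    float total = 0.0f;
    for (int i = 0; i < n; ++i) total += samples[i];
    return total;
}
```

The same wrapper type could hold any of the returned buffers, as long as the library really allocated them in a way that free() can release, as stated above.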
There are also image hash functions that use the radial hash projections method rather than the discrete cosine transform (DCT), but their results have not been shown to be as good as those of the DCT method.
Once you have the hashes for two files, you can compare them. The functions you use to compare two files are as follows:
int ph_hamming_distance(ulong64 hasha, ulong64 hashb);
double* ph_audio_distance_ber(uint32_t *hasha, int Na, uint32_t *hashb,
int Nb, float threshold, int block_size, int &Nc);
The Hamming distance function can be used for both video and image hashes. For audio distance, the threshold should be around 0.30 (0.25 to 0.35) and the block_size should be 256. The block_size is just the number of uint32_t values to compare at a time in computing the bit error rate (BER) between two hashes. The function returns a buffer of doubles of length Nc, which is a confidence vector: it gives a confidence rating that indicates the similarity of the two hashes at various positions of alignment. The maximum of this confidence vector should be a fairly good indication of similarity, a value of 0.5 being the threshold. You must free the memory returned by ph_audio_distance_ber() with free() when you are finished using it.
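To make the distance measures above concrete, here is a small self-contained sketch. It is not the library's implementation: hamming_distance64(), bit_error_rate() and is_similar() are hypothetical helper names, the bit error rate is computed at a single fixed alignment only, and reading "0.5 being the threshold" as "strictly greater than 0.5" is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of differing bits between two 64-bit hashes.
int hamming_distance64(std::uint64_t a, std::uint64_t b) {
    std::uint64_t x = a ^ b;  // set bits mark the positions that differ
    int count = 0;
    while (x != 0) {
        x &= x - 1;           // clear the lowest set bit
        ++count;
    }
    return count;
}

// Bit error rate between two equal-length runs of 32-bit frames:
// the number of differing bits divided by the number of bits compared.
double bit_error_rate(const std::vector<std::uint32_t> &a,
                      const std::vector<std::uint32_t> &b) {
    assert(a.size() == b.size() && !a.empty());
    std::size_t differing = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        differing += static_cast<std::size_t>(hamming_distance64(a[i], b[i]));
    }
    return static_cast<double>(differing) /
           (32.0 * static_cast<double>(a.size()));
}

// Decision rule: treat the hashes as similar when the maximum of the
// confidence vector is above the threshold (assumed strictly greater).
bool is_similar(const std::vector<double> &confidence, double threshold = 0.5) {
    assert(!confidence.empty());
    return *std::max_element(confidence.begin(), confidence.end()) > threshold;
}
```

For example, two 64-bit hashes that differ only in their lowest eight bits are at Hamming distance 8, and a confidence vector of {0.1, 0.7, 0.3} would count as similar under this rule.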
If you want to use the radial hashing method for images, the function for getting the hash is here:
int ph_image_digest(const char *file, double sigma, double gamma, Digest &dig, int N);
Use values sigma=1.0 and gamma=1.0 for now. N indicates the number of lines to project through the center for 0 to 180 degrees orientation. Use 180. Be sure to declare a digest before calling the function, like this:
Digest dig;
ph_image_digest(filename, 1.0, 1.0, dig, 180);
The function returns -1 if it fails. Most functions in the pHash library follow this convention.
To compare two radial hashes, a peak of cross correlation is determined between two hashes:
int ph_crosscorr(Digest &x, Digest &y, double &pcc, double
threshold=0.90);
The peak of cross correlation between the two vectors is returned in the pcc parameter.
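If the idea of a peak of cross correlation is unfamiliar, the sketch below shows one common way to compute it for two equal-length vectors. It is an illustration under assumptions, not the code behind ph_crosscorr(): peak_cross_correlation() is a hypothetical name, the inputs are plain vectors of doubles rather than Digest structs, and the correlation is normalized and circular.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Peak of the normalized circular cross-correlation between x and y.
// The result lies in [-1, 1]; it is 1 when y is a circular shift of x.
// Returns 0 if either vector is constant (zero variance).
double peak_cross_correlation(const std::vector<double> &x,
                              const std::vector<double> &y) {
    assert(x.size() == y.size() && !x.empty());
    const std::size_t n = x.size();

    double mean_x = 0.0, mean_y = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        mean_x += x[i];
        mean_y += y[i];
    }
    mean_x /= static_cast<double>(n);
    mean_y /= static_cast<double>(n);

    double var_x = 0.0, var_y = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        var_x += (x[i] - mean_x) * (x[i] - mean_x);
        var_y += (y[i] - mean_y) * (y[i] - mean_y);
    }
    const double denom = std::sqrt(var_x * var_y);
    if (denom == 0.0) return 0.0;

    // Try every circular shift of y and keep the largest correlation.
    double peak = -1.0;
    for (std::size_t shift = 0; shift < n; ++shift) {
        double num = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            num += (x[i] - mean_x) * (y[(i + shift) % n] - mean_y);
        }
        peak = std::max(peak, num / denom);
    }
    return peak;
}
```

Trying every shift is what makes the measure tolerant of rotation: a radial projection of a rotated image is roughly a shifted copy of the original projection, so the peak stays high.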
Using the MVP functions for hash storage is fairly straightforward. The basic idea is to build, add to and query the database, and there are three functions in the API to do just that:
MVPRetCode ph_save_mvptree(MVPFile *m, DP **points, int nbpoints);
MVPRetCode ph_add_mvptree(MVPFile *m, DP *new_dp);
MVPRetCode ph_query_mvptree(MVPFile *m, DP *query, int knearest, float radius, DP **results, int *count);
The functions are all documented in the pHash.h header file. MVPRetCode is simply an enumerated type that indicates an error condition: zero indicates success, and nonzero values indicate an error. A datapoint, or DP, is just a pHash structure that holds a file name and hash value.
Before calling these functions, declare an MVPFile and set its tree parameters:
MVPFile mfile;
mfile.branchfactor = 2;
mfile.pathlength = 5;
mfile.leafcapacity = 25;
Or, you can just use the void ph_init_mvpfile(MVPFile *m) function to initialize the fields to those values. You only need to set these three members yourself if you wish to experiment with other values. You will also need to set these fields:
mfile.hash_type = (HashType)type;
mfile.hashdist = funcCB;
Obviously, the values here will depend on what hash you are using and what distance function you want for the callback. The callback must follow this form:
float hashdist(DP *dpA, DP *dpB);
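As an illustration of what such a callback might look like for 64-bit image hashes, here is a sketch that returns the Hamming distance as a float. SketchDP is a hypothetical, cut-down stand-in for the real DP struct (see pHash.h for the actual definition), and sketch_hashdist() and distance_between() are names invented for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for the pHash DP struct. The real
// definition lives in pHash.h and may have different or additional fields.
struct SketchDP {
    const char *id;             // file name
    void *hash;                 // pointer to the hash data
    std::uint32_t hash_length;  // number of hash elements
};

// Distance callback for 64-bit image hashes: the Hamming distance between
// the first hash element of each datapoint, returned as a float.
float sketch_hashdist(SketchDP *dpA, SketchDP *dpB) {
    assert(dpA && dpB && dpA->hash && dpB->hash);
    assert(dpA->hash_length >= 1 && dpB->hash_length >= 1);
    const std::uint64_t a = *static_cast<std::uint64_t *>(dpA->hash);
    const std::uint64_t b = *static_cast<std::uint64_t *>(dpB->hash);
    std::uint64_t diff = a ^ b;  // set bits mark the positions that differ
    int bits = 0;
    while (diff != 0) {
        diff &= diff - 1;        // clear the lowest set bit
        ++bits;
    }
    return static_cast<float>(bits);
}

// Convenience wrapper that exercises the callback on two raw hashes.
float distance_between(std::uint64_t hashA, std::uint64_t hashB) {
    SketchDP a{"a.jpg", &hashA, 1};
    SketchDP b{"b.jpg", &hashB, 1};
    return sketch_hashdist(&a, &b);
}
```

With the real library you would write the same function against DP instead of SketchDP and assign it to mfile.hashdist.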
In order to build the db, you will need an array of datapoints. The pHash function char** ph_readfilenames(const char* dirname, int &N) will read the files from a given directory and give you the list of filenames. The reference parameter N will tell you how many files there are. From there you can loop through the files, create a hash for each file, and store the hash and filename in a new datapoint struct. Use DP *dp = ph_malloc_datapoint(mfile.pathlength, mfile.hash_type) to get a pointer to a new DP struct. Be sure to assign the filename, hash and hash_length to the respective fields in the DP.
The pointers to the newly created dp's can be stored in an array of pointers to the datapoints. This is important, because when points are sorted into the file format, the actual datapoints are never reassigned, just the pointers to those original datapoints.
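The loop described above, and the point about only the pointers being rearranged, can be sketched without the library as follows. Everything here is illustrative: SketchPoint stands in for DP, placeholder_hash() (an FNV-1a hash of the file name) stands in for a real call such as ph_dct_imagehash(), and the remaining function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the pHash DP struct.
struct SketchPoint {
    std::string id;      // file name
    std::uint64_t hash;  // hash value
};

// Placeholder hash (FNV-1a over the file name). In real code this is
// where a call such as ph_dct_imagehash() would go.
std::uint64_t placeholder_hash(const std::string &name) {
    std::uint64_t h = 14695981039346656037ULL;
    for (unsigned char c : name) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    return h;
}

// Creates one datapoint per file name and returns the array of pointers.
std::vector<SketchPoint *> build_points(const std::vector<std::string> &files) {
    std::vector<SketchPoint *> points;
    points.reserve(files.size());
    for (const std::string &f : files) {
        points.push_back(new SketchPoint{f, placeholder_hash(f)});
    }
    return points;
}

// Reorders only the pointers; the datapoints themselves stay in place.
void sort_by_hash(std::vector<SketchPoint *> &points) {
    std::sort(points.begin(), points.end(),
              [](const SketchPoint *a, const SketchPoint *b) {
                  return a->hash < b->hash;
              });
}

// Releases every datapoint and empties the pointer array.
void free_points(std::vector<SketchPoint *> &points) {
    for (SketchPoint *p : points) delete p;
    points.clear();
}

// Returns true if every original address is still present after sorting.
bool addresses_survive_sort(const std::vector<std::string> &files) {
    std::vector<SketchPoint *> points = build_points(files);
    const std::vector<SketchPoint *> before = points;
    sort_by_hash(points);
    bool ok = points.size() == before.size();
    for (SketchPoint *p : before) {
        ok = ok && std::find(points.begin(), points.end(), p) != points.end();
    }
    free_points(points);
    return ok;
}
```

Because only the pointer array is reordered, any pointer you kept to an individual datapoint remains valid until you free that datapoint.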
In the examples directory in the download, you will find the following files: build_mvptree.cpp, add_mvptree.cpp and query_mvptree.cpp. These should demonstrate how to do each operation for the DCT image hash. You will need a directory with at least 28 images, i.e. more images than it takes to fill a leaf node of the tree structure, or three more than mfile.leafcapacity. You will also find three similarly named files for the audio hash function.