摘要
Background:
In the existing supervised hashing methods for images ,an input image is usually encoded by a vector of hand-crafted visual features.
e.g. Such hand-crafted feature vectors do not necessarily preserve the accurate semantic similarities of images pairs,which may often degrade the performance of hashing function learning.(人工提取的特征无法保证图片对之间的语义正确性,也会降低哈希函数学习的性能)
In this paper:
We propose a supervised hashing method for image retrieval, in which we automatically learn a good image representation tailored to hashing as well as a set of hash functions.The proposed method has two stages. In the first stage, given the pairwise similarity matrix S over training images, we propose a scalable coordinate descent method to decompose S into a product of HHT where H is a matrix with each of its rows being the approximate hash code associated to a training image. In the second stage, we propose to simultaneously learn a good feature representation for the input images as well as a set of hash functions, via a deep convolutional network tailored to the learned hash codes in H and optionally the discrete class labels of the images.
Introduction
The learning-based hashing methods can be divided into three main streams.
<a>Unsupervised methonds,in which only unlabeled data is used to learn hash functions.(无监督)
<b>The other two streams are semi-supervised and supervised methods.(半监督和监督)
Key question:
In learning-based hashing for images is how to encode images into a useful feature representation so as to enhance the hashing performance.
Ideally,one would like to automatically learn such a fecture representation that sufficiently preserves the semantic similarities for images during the hash learning process.
e.g. Without using hand-crafted visual features, Semantic Hashing (Salakhutdinov and Hinton 2007) is a hashing method which automatically constructs binary-code feature representation for images by a multi-layer auto-encoder, with the raw pixels of images being directly used as input.
Semantic hashing imposes