A computer-implemented method for training classifiers to tag images, comprising: receiving a set of input data including images and tags; partitioning the set of input data into first and second clusters of data based on similarity of the images, wherein each cluster includes a set of images similar to one another and the corresponding image tags; determining that a size of the first cluster exceeds a predefined threshold; partitioning the images and corresponding image tags of the first cluster into third and fourth clusters of data, wherein the third and fourth clusters each have a size that is less than the threshold; and training a classifier that predicts tags for an untagged image using the second, third, and fourth clusters of data. Also described are a computer-readable medium comprising computer-useable instructions that, when used by computing devices, cause the computing devices to perform operations for reducing user tagging biases in image tagging by determining a similarity of image tag providers, and a computerized system for improving tag prediction performance for rare tags using cluster-sensitive hashing distance.
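The partition-and-split steps of the method can be illustrated with a minimal sketch. It is not the claimed implementation, and the abstract does not specify a clustering algorithm or classifier. The sketch assumes each image is already represented by a numeric feature vector, uses a small 2-means loop as a stand-in similarity measure, and keeps splitting any cluster whose size still exceeds the threshold. The names `bisect`, `split_until_small`, and `predict_tags` are hypothetical, and `predict_tags` is a simple nearest-centroid vote standing in for the trained classifier.

```python
from collections import Counter
from typing import List, Sequence, Tuple

# Assumption: an item is (image feature vector, list of tags).
Item = Tuple[Sequence[float], List[str]]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(items: List[Item]) -> List[float]:
    """Component-wise mean of the feature vectors in a cluster."""
    n = len(items)
    return [sum(col) / n for col in zip(*(f for f, _ in items))]


def bisect(items: List[Item], iters: int = 10) -> Tuple[List[Item], List[Item]]:
    """Split items into two groups with a small 2-means loop."""
    c0 = list(items[0][0])
    # Seed the second center with the item farthest from the first.
    c1 = list(max(items, key=lambda it: sq_dist(it[0], c0))[0])
    left: List[Item] = []
    right: List[Item] = []
    for _ in range(iters):
        left, right = [], []
        for it in items:
            near_c0 = sq_dist(it[0], c0) <= sq_dist(it[0], c1)
            (left if near_c0 else right).append(it)
        if not left or not right:
            break
        c0, c1 = centroid(left), centroid(right)
    if not left or not right:
        # Degenerate case (e.g. identical vectors): split by position
        # so that both halves are strictly smaller than the input.
        mid = len(items) // 2
        left, right = items[:mid], items[mid:]
    return left, right


def split_until_small(items: List[Item], threshold: int) -> List[List[Item]]:
    """Partition items, re-splitting any cluster larger than threshold."""
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    if len(items) <= threshold:
        return [items]
    left, right = bisect(items)
    return split_until_small(left, threshold) + split_until_small(right, threshold)


def predict_tags(clusters: List[List[Item]], feature: Sequence[float], k: int = 1) -> List[str]:
    """Stand-in classifier: return the k most common tags of the nearest cluster."""
    nearest = min(clusters, key=lambda c: sq_dist(centroid(c), feature))
    counts = Counter(tag for _, tags in nearest for tag in tags)
    return [tag for tag, _ in counts.most_common(k)]


# Made-up toy data for illustration only.
data: List[Item] = [
    ([0.0, 0.0], ["cat"]), ([0.1, 0.0], ["cat"]), ([0.0, 0.1], ["kitten"]),
    ([5.0, 5.0], ["car"]), ([5.1, 5.0], ["car"]), ([9.0, 0.0], ["boat"]),
]
clusters = split_until_small(data, threshold=2)
print([len(c) for c in clusters])
print(predict_tags(clusters, [5.0, 5.0]))
```

Capping cluster size in this way keeps any single group of near-duplicate images from dominating training, which is one plausible reading of why the method splits oversized clusters before the classifier is trained.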