Image Representation using Attribute-Graphs
Abstract
In a digital world of Flickr, Picasa and Google Images, developing a semantic image represen-tation has become a vital problem. Image processing and computer vision researchers to date, have used several di erent representations for images. They vary from low level features such as SIFT, HOG, GIST etc. to high level concepts such as objects and people.
When asked to describe an object or a scene, people usually resort to mid-level features such as size, appearance, feel, use, behaviour etc. Such descriptions are commonly referred to as the attributes of the object or scene. These human understandable, machine detectable attributes have recently become a popular feature category for image representation for various vision tasks. In addition to image and object characteristics, object interactions and back-ground/context information and the actions taking place in the scene form an important part of an image description. It is therefore, essential, to develop an image representation which can e ectively describe various image components and their interactions.
Towards this end, we propose a novel image representation, termed Attribute-Graph. An Attribute-Graph is an undirected graph, incorporating both local and global image character-istics. The graph nodes characterise objects as well as the overall scene context using mid-level semantic attributes, while the edges capture the object topology and the actions being per-formed.
We demonstrate the e ectiveness of Attribute-Graphs by applying them to the problem of image ranking. Since an image retrieval system should rank images in a way which is compatible with visual similarity as perceived by humans, it is intuitive that we work in a human understandable feature space. Most content based image retrieval algorithms treat images as a set of low level features or try to de ne them in terms of the associated text. Such a representation fails to capture the semantics of the image. This, more often than not, results in retrieved images which are semantically dissimilar to the query. Ranking using the proposed attribute-graph representation alleviates this problem.
We benchmark the performance of our ranking algorithm on the rPascal and rImageNet datasets, which we have created in order to evaluate the ranking performance on complex
queries containing multiple objects. Our experimental evaluation shows that modelling images as Attribute-Graphs results in improved ranking performance over existing techniques.