University of Florida :: Department of Computer and Information Science and Engineering (CISE)

 CISE News & Events

CISE PhD students and professors receive best scientific paper award at ICPR

February 5th, 2009

Anand Rangarajan, Karthik Gurumoorthy, Ajit Rajwade, Arunava Banerjee

UF CISE PhD students Karthik Gurumoorthy and Ajit Rajwade, along with their supervisors Anand Rangarajan and Arunava Banerjee, received the best scientific paper award in the signal processing and representation track at the International Conference on Pattern Recognition (ICPR), held in Tampa from 8th to 12th December 2008. Their paper is entitled "Beyond SVD: Learning Matrix Orthonormal Bases for Compact Image Representation". ICPR is one of the most popular conferences in the field of computer vision and pattern recognition.

The ICPR paper can be accessed here: http://www.cise.ufl.edu/~anand/pdf/icpr2008_FinalSubmission_BeyondSVD.pdf

In their paper, they develop a new method to represent grayscale images compactly and apply the technique to image compression. Existing image compression algorithms typically encode images, or small patches from images, as a linear combination of codewords called bases. Once these bases are known, one need only store a few coefficients of this linear combination to represent the entire image, thereby saving storage space. Standards such as JPEG are widely used for image compression. Algorithms that are part of the JPEG standard typically encode images as combinations of a set of "universal bases" obtained from the Discrete Cosine Transform (DCT), a standard tool in signal processing.
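The idea of storing only a few coefficients over fixed DCT bases can be illustrated with a minimal sketch. This is not the JPEG algorithm itself (JPEG quantizes and entropy-codes its coefficients); it simply keeps the t largest DCT coefficients of a square patch and reconstructs from them. The function names, the patch values, and the parameter t are assumptions made for illustration.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (each row is one basis vector)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def keep_largest(coeffs, t):
    """Return a copy of coeffs with all but the t largest-magnitude entries zeroed."""
    n_rows, n_cols = len(coeffs), len(coeffs[0])
    order = sorted(((i, j) for i in range(n_rows) for j in range(n_cols)),
                   key=lambda ij: abs(coeffs[ij[0]][ij[1]]), reverse=True)
    kept = [[0.0] * n_cols for _ in range(n_rows)]
    for i, j in order[:t]:
        kept[i][j] = coeffs[i][j]
    return kept

def compress_patch(patch, t):
    """Keep t DCT coefficients of a square patch; return (coefficients, reconstruction)."""
    d = dct_matrix(len(patch))
    coeffs = matmul(matmul(d, patch), transpose(d))   # C = D P D^T
    sparse = keep_largest(coeffs, t)
    recon = matmul(matmul(transpose(d), sparse), d)   # P ~ D^T C D
    return sparse, recon

def squared_error(a, b):
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Hypothetical smooth 8x8 patch (a brightness ramp), used only as sample data.
patch = [[100.0 + 10 * i + 5 * j for j in range(8)] for i in range(8)]
for t in (4, 16, 64):
    _, recon = compress_patch(patch, t)
    print(t, squared_error(patch, recon))
```

Because the DCT bases are orthonormal, the squared reconstruction error equals the sum of squares of the discarded coefficients, so keeping more coefficients never increases the error.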

Their paper replaces these DCT bases with other bases that are tuned to a particular type of dataset by means of a machine learning algorithm. This gives the bases the flexibility to represent the properties of that dataset effectively, as opposed to being fully general. The main contribution of the paper lies in using a matrix-based representation for images or image patches, which treats the image or patch as a genuine 2D signal. Although images are 2D signals, this representation has rarely been used in the machine learning community for image representation. Using matrix orthonormal bases learnt from a training set of image patches, any unseen patch can be represented by a sparse projection onto these bases. The sparser the projection, the greater the error incurred in reconstructing the patch. The allowed error is a user-controlled parameter that determines the quality of the compression. This use of sparse projections onto matrix orthonormal bases is the novel feature of their algorithm.
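The trade-off between sparsity and reconstruction error can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the orthonormal pair (U, V) is taken as given rather than learnt, a normalized 4x4 Hadamard matrix stands in for both bases, and the greedy thresholding rule, function names, and sample patch are all hypothetical. The patch P is projected as S = U^T P V, and the smallest coefficients of S are zeroed as long as the squared error stays within a user-chosen budget.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def sparse_projection(patch, u, v, max_sq_error):
    """Project patch onto orthonormal bases (u, v) and zero small coefficients.

    Returns (s, sq_error): the sparsified projection matrix and the squared
    Frobenius error that reconstructing from it will incur.
    """
    s = matmul(matmul(transpose(u), patch), v)        # S = U^T P V
    # Visit coefficients from smallest to largest magnitude.
    order = sorted(((i, j) for i in range(len(s)) for j in range(len(s[0]))),
                   key=lambda ij: abs(s[ij[0]][ij[1]]))
    sq_error = 0.0
    for i, j in order:
        cost = s[i][j] ** 2
        if sq_error + cost > max_sq_error:
            break                                     # error budget exhausted
        sq_error += cost
        s[i][j] = 0.0
    return s, sq_error

def reconstruct(s, u, v):
    return matmul(matmul(u, s), transpose(v))         # P ~ U S V^T

def count_nonzero(s):
    return sum(1 for row in s for x in row if x != 0.0)

# Stand-in orthonormal basis (normalized Hadamard matrix); a learnt basis
# would take its place in practice.
H = [[0.5,  0.5,  0.5,  0.5],
     [0.5, -0.5,  0.5, -0.5],
     [0.5,  0.5, -0.5, -0.5],
     [0.5, -0.5, -0.5,  0.5]]

# Hypothetical 4x4 grayscale patch, used only as sample data.
patch = [[10.0, 12.0, 11.0, 13.0],
         [20.0, 22.0, 21.0, 23.0],
         [30.0, 31.0, 33.0, 32.0],
         [40.0, 42.0, 41.0, 44.0]]

for budget in (0.0, 25.0, 400.0):
    s, err = sparse_projection(patch, H, H, budget)
    print(budget, count_nonzero(s), err)
```

Since U and V are orthonormal, the squared error of the reconstruction equals the sum of squares of the zeroed coefficients, which is why the budget can be enforced directly on the projection matrix. A larger budget yields a sparser projection and a coarser reconstruction.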

Their algorithm was trained and tested on large, well-known facial image databases, and its performance was found to be comparable to that of the JPEG standard, as well as to some other existing machine-learning-based algorithms. Beyond compressing single images, the algorithm extends easily to the compression of entire image databases. The same algorithm has also been extended to color images using higher-order matrix algebra.
