Research

Research Interests

  • Computer Vision

  • Machine Learning

  • Medical Image Analysis and Image Processing

  • Computational Geometry

  • Information Theory

Recent Projects

(Details will be added soon for: )

Regularized Boosting;

Dictionary learning for classification;

Metric learning;

DTI estimation.

Total Bregman Divergence Clustering Shape Retrieval

Shape database search is ubiquitous in the world. Shape data in many domains is having an explosive growth and usually requires search of whole shape databases to retrieve the best matches with accuracy and efficiency. An accessible shape representation and accurate dissimilarity measure is very important.

 

Total Bregman Divergence

The total Bregman divergence (TBD) delta associated with a real valued strictly convex and differentiable function f defined on a convex set X between points x,yin X is defined as,

 delta_{f}(x,y)=frac{f(x)-f(y)-langle x-y,nabla f(y)rangle}{sqrt{1+|nabla f(y)|^2}},

langle cdot,cdotrangle is inner product, and |nabla f(y)|^2 = langle nabla f(y), nabla f(y) rangle generally.

Fig. In the left figure, d_f(x,y) (dotted red line) is BD, delta_f(x,y) (bold green line) is tBD, and the two arrows indicate the coordinate system.

Total Bregman Divergence ell_1-norm center  —  t-center

Given a set E={x_i}_{i=1}^n, we can obtain the ell_1-norm t-center x^* of E by solving the following minimization problem

 x^*=argmin_xdelta_f^1(x,E)=argmin_xsum_{i=1}^ndelta_f(x,x_i)

and get

 nabla f(x^*)=left(sum_{i=1}^nw_inabla f(x_i)right)left/left(sum_{i=1}^nw_iright)right., quad  w_i=left(1+|nabla f(x_i)|^2right)^{-1/2}.

t-center is statistically robust to outliers.

Shape Representation using Gaussian Mixture Model (GMM)

A shape is represented as a GMM. A GMM is depicted as

  p(x) = sum_i^m a_imathcal{N}(x;mu_i,Sigma_i), quad 0leq a_i leq 1, quad sum_i^n a_i = 1,

where m is the number of components; mathcal{N}(x;mu_i,Sigma_i) is the Gaussian density function with mean mu_i, variance Sigma_i, and weight a_i in the mixture model.

GaussianMixture 

Fig. Left to right: original shapes; aligned boundaries (using affine allignment); GMM with 10 components, the dot inside each circle is the mean of the corresponding Gaussian density function; 3D view of the mixture of Gaussians.

ktree 

First Clustering and then Retrieval

Fig. tBD clustering algorithm is applied to separate the shapes into different clusters, and the clustering results are stored using a k-tree. The inner nodes of the tree are clustering centers, while the leaves are the GMMs for all shapes. This hierachical tree structure enables efficient retrieval.

References

  • Meizhu Liu, Baba C. Vemuri, Shun-Ichi Amari and Frank Nielsen, “Shape Retrieval Using Hierarchical Total Bregman Soft Clustering”, Transactions on Pattern Analysis and Machine Intelligence (TPAMI’12), to appear, 2012. [PDF]

  • Meizhu Liu, Baba C. Vemuri, Shun-Ichi Amari and Frank Nielsen, Total Bregman Divergence and its Applications to Shape Retrieval, IEEE Conference on Computer Vision and Pattern Recognition (CVPR’10), pp. 3463-3468, 2010. [PDF]

Diffusion Tensor Imaging (DTI) Segmentation

Order two SPD tensors can be seen as covariance matrices of zero mean Gaussian densities. The dissimilarity between two tensors is measures using the total Kullback-Leibler (tKL) divergence between their corresponding Gaussian densities.

 tKL(P,Q)=frac{int p log frac{p}{q}dx}{sqrt{1 + int (1 + log q)^2qdx}}=frac{log (det(P^{-1}Q))+tr(Q^{-1}P)-n}{2sqrt{c + frac{(log (det Q))^2}{4} -frac{n(1+log 2pi)}{2}log(det Q)}},

where c =frac{3n}{4}+frac{n^2log 2pi}{2}+frac{(nlog 2pi)^2}{4}. When an SL(n) transformation is applied on x, i.e., xmapsto Ax, then Pmapsto A'PA and Qmapsto A'QA. It is easy to see that

 tKL(P,Q) = tKL(A'PA,A'QA), quad forall Ain SL(n),

which means that tKL between SPD tensors is invariant under the group action, when the group member belongs to SL(n).

Given an SPD tensor set {Q_i}_{i=1}^m, its t-center P^* can be obtained and has an explicit form

 P^* =(sum_i w_i Q_i^{-1})^{-1}, w_i =frac{mu_i}{sum_j mu_j}, mu_i = left({2sqrt{c + frac{(log (det Q_i))^2}{4}-frac{n(1+log 2pi)}{2}log(det Q_i)}}right)^{-1}.
seg2D 

Piecewise constant segmentation

 E(C,T_1,T_2) = int_R delta (T_0(x),T_1)dx + int_{R^c} delta (T_0(x),T_2)dx + beta |C|,

where delta is the tKL divergence specificially. T_0 is the noisy diffusion tensor image. T_1 is the t-center of DTI for the region R inside the curve C and T_2 is the t-center of the DTI for the region R^c outside C.

Fig. From left to right are initialization, intermediate step and final segmentation.

seg3D 

Piecewise smooth segmentation

 E(C,T)=alpha int_R delta (T_0(x),T_R(x))dx+int_{R^c}delta (T_0(x),T_{R^c}(x))dx +int_{Omega-C}p(T)(x)dx+beta |C|,

where alpha and beta are control parameters, and T(x) is an approximation to T_0(x), which can be discontinuous only along C.

Fig. (a) A 2D slice of the corresponding evolving surface, from left to right are initialization, intermediate steps and final segmentation. (b) The 3D view of the segmentation result.

References

  • Baba C. Vemuri, Meizhu Liu, Shun-Ichi Amari and Frank Nielsen, Total Bregman Divergence and its Applications to DTI Analysis, IEEE Transactions on Medical Imaging (TMI’10), Vol. 30, No. 2, pp. 475-483, 2011. [PDF]

Boosting and Classification

Boosting is a well known machine learning technique used to improve the performance of weak classifiers. The goal of boosting is trying to maximize the margin between different classes. Given the input mathcal{X}timesmathcal{Y}, where mathcal{X} is the training sample set and mathcal{Y} is the label set, the goal is to learn a function bar{H}colon mathcal{X}mapstomathcal{Y}. In the binary classification problem, mathcal{X} is a set of feature vectors, and mathcal{Y}={-1,1}. The training samples are {(x_n,y_n)}_{n=1}^N, x_nin mathcal{X} is the feature vector, and y_nin {1,-1} is the class label for x_n. Given the training samples, the task of boosting is to learn the strong classifier H(x)={bf sign}(sum_{t=1}^T w_t h_t(x)) to approximate bar{H}.

The LPBoost formulation to maximize the hard margin at iteration t is given by,

 max_{mathbf{w},rho} rho label{eqn:hmPrimal}, mbox{ s.t. } sum_{i=1}^t u_n^iomega_igeq rho,,n=1,cdots, N, ,mathbf{w}inDelta_t, u_n^{t}={bf sign}(h_t(x_n)y_n).

rho is the minimum hard margin between the two classes. Delta_t is the t-simplex and mathbf{w}inDelta_t implies that sum_{j=1}^tw_j=1 and w_jgeq 0, for j=1,cdots,t.

LPBoost to maximize the soft margin can be expressed by the following function,

 max_{mathbf{w},rho, mathbf{zeta}},rho-Dsum_{n=1}^Nzeta_n, mbox{s.t. }sum_{i=1}^t u_n^iw_igeq rho-zeta_n,,,n=1,cdots, N,  mathbf{w}inDelta_t,,mathbf{zeta}geq mathbf{0},,

where, mathbf{bf zeta} is the slack variable vector, and D is the constant factor which penalizes the slack variables. Its dual problem is

 min_{mathbf{d}inDelta_N,mathbf{d}leq Dmathbf{1}}, max_{i=1,cdots,t}mathbf{u}^icdotpmathbf{d}.
classification 

The tBD regularized LPBoost (tBRLPBoost) is

 min_{mathbf{d}inDelta_N,mathbf{d}leq Dmathbf{1}},(max_{i=1,cdots,t}mathbf{u}^icdotpmathbf{d} +lambdadelta(mathbf{d},mathbf{d}^0))

where, mathbf{d}^0 is the initialized distribution and delta is the tKL specifically. Using tKL to regularize LPBoost, a constant iteration upper bound will be achieved.

Fig. Classification accuracy vs. number of iterations. Comparision between tBRLPBoost and entropy regularized LPBoost (ELPBoost) on the training and testing sets of the (a) pima, (b) spambase, (c) iris and (d) spectf datasets (which belong to the UCI machine learning repository).

References

  • Meizhu Liu, Baba C Vemuri, Robust and Efficient Regularized Boosting Using Total Bregman Divergence, IEEE Proceedings of the 24th Conference on Computer Vision and Pattern Recognition (CVPR’11), to appear, 2011.

  • Meizhu Liu and Baba C. Vemuri, RBOOST: Riemannian Distance based Regularized Boosting, IEEE International Symposium on Biomedical Imaging (ISBI’11), to appear, 2011. [PDF]