## Departmental Report : REP-2012-541

Report ID: | REP-2012-541 |

Title: | A Fast Algorithm for Learning Weighted Ensemble of Roles |

Authors: | Abdullah Almutairi University of Florida 352-226-4228 Abdullah Almutairi University of Florida 352-226-4228 Abdullah Almutairi University of Florida 352-226-4228 Abdullah Almutairi University of Florida 352-226-4228 Sanjay Ranka Sanjay Ranka Sanjay Ranka Sanjay Ranka Manas Somaiya Manas Somaiya Manas Somaiya Manas Somaiya |

Abstract: | The POWER (PrObabilistic Weighted Ensemble of Roles) modelcite{power} is a Bayesian mixture model where a single data point is generated by multiple components with varying degrees of influence. This is unlike standard mixture models where each data point is generated by a single mixture component. The POWER model allows for capturing various hidden and complexly overlapping components and roles of entities that contribute to generating the data points. However, the POWER model suffers from a very slow learning time. The highest complexity time of the parameters' learning steps is O(n.k.d^2), with n being the number of data points, k the number of components of the model and d the number of data attributes. In this paper, we propose an approximation to the POWER model weight parameter learning algorithm that reduces the computational time significantly. The overall complexity of the new algorithm is O(n.k.d.p) in practice (where p is an approximation parameter much smaller than d). This allows the model learning time to be linear in the number of attributes of the data and provides a significant speedup over the original algorithm. We demonstrate the accuracy of our approximation technique using synthetic and real datasets. We also provide experimental results of the approximate POWER model on a dataset of NIPS papers and a dataset of wireless web access patterns and show that the model learnt are similar. An implementation of the approximate POWER model on the dataset of NIPS papers is about 27 times faster than the original version. |

Supporting Files: | REP-2012-541.pdf (262 KB) |

Posted: | February 26, 2012 |