Alan Smith

Kun Li

PhD Candidate


University of Florida

PhD Candidate, Computer Engineering, Aug 2009 - May 2015 (expected)

GPA: 3.64

Xidian University

B.Sc. College of Computer Science, Sep 2005 - Jul 2009

GPA: 88/100, major GPA: 90/100, Rank 6/485

Excellent Undergraduate of Shaanxi Province, Highest Honor


  • UDA-GIST: An In-database Framework to Unify Data-Parallel and State-Parallel Analytics. Kun Li, Daisy Zhe Wang, Alin Dobra, Christopher Dudley. VLDB 2015
  • GPText: Greenplum Parallel Statistical Text Analysis Framework. Kun Li, Christan Grant, Daisy Zhe Wang, Sunny Khatri, George Chitouras. SIGMOD DanaC 2013
  • The MADlib Analytics Library or MAD Skills, the SQL. Joseph M. Hellerstein, Christopher Ré, Florian Schoppmann, Daisy Zhe Wang, Eugene Fratkin, Aleks Gorajek, Kee Siong Ng, Caleb Welton, Xixuan Feng, Kun Li and Arun Kumar. VLDB 2012.
  • Efficient In-Database Analytics with Graphical Models. Daisy Zhe Wang, Yang Chen, Christan Grant and Kun Li. IEEE Bulletin 2014.
  • MADden: Query-Driven Statistical Text Analytics. Christan Grant, Jordan Gumbs, Kun Li, Daisy Zhe Wang, George Chitouras. CIKM 2012
  • Automatic Knowledge Base Construction using Probabilistic Extraction, Deductive Reasoning, and Human Feedback. Daisy Zhe Wang, Yang Chen, Sean Goldberg, Christan Grant, and Kun Li. Proceedings of NAACL-HLT, 2012
  • Reactive Programming Optimizations in Pervasive Computing. Chao Chen, Yi Xu, Kun Li and A. Helal. SAINT 2010. (Best Paper Award).
  • Work Experience

    Software Engineer Intern at Google Inc.

    May 2013 - August 2013

    Mountain View, CA

    Software Engineer Intern at Google Inc.

    May 2011 - August 2011

    Mountain View, CA


  • 2008 National Encouragement Scholarship
  • 2007 First Class Scholarship of Xidian University
  • 2008 China Computer Scholarship
  • 2007 First Class Scholarship of Tian Jin Guang Dian Group
  • 2006 National Scholarship (1%)
  • 2007 First Class Award of China Undergraduate Mathematical Contest in Modeling (0.1%)
  • 2006 Second Class Award of Shaanxi Province Undergraduate Advanced Mathematics Competition
  • 2009 Excellent Undergraduate of Shaanxi Province
  • [2006, 2007, 2008] Excellent Student of University
  • 2006 Excellent Student of Freshmen Department
  • Key Skills

    • Languages: Java, C/C++, Python, SQL, MATLAB, HTML, CSS
    • OS: Ubuntu, CentOS, Windows
    • Frameworks and Toolkits: GWT, AppEngine, Hadoop, MADlib, Spark, GraphLab, RPC, NLTK, Scikit, WordNet, UMLS
    • Database: Postgres, Greenplum, Oracle
    • IDE: Eclipse, NetBeans, pgAdmin
    • Others: Git, Vim, LaTeX, MS Office

    Graduate Courses

    • Analysis of Algorithms
    • Computer Networks
    • Embedded System
    • Machine Learning
    • Advanced Topics on Computer Networks
    • Mobile Computing
    • Large-scale Advanced Data Analysis
    • Computer Architecture
    • Stochastic Network Optimization
    • Distributed Operating System
    • Programming Language Principles
    • Advanced Data Structures
    • Database Management System
    • Software Engineering
    • Spatial Database

    Research Assistant Experience

    Undergraduate Research

    Project: P2P Model Applied to Traffic Database, sponsored by the China Undergraduate Innovation Plan, 9/2007 to 10/2008. Wrote a P2P protocol implementation of about 10,000 lines of code. Advisor: Dr. Fan.

    RA in the Undergraduate Mathematical Modeling Lab, 7/2007 to 9/2007. Advisor: Dr. Zhu.

    Graduate Research

    RA on the Pervasive Computing and Parallel Computing projects, Spring 2010 to Spring 2011.

    RA on the Large-scale In-DB Statistical Inference project, Fall 2011 to present. Advisor: Dr. Daisy Wang.

    Teaching Assistant Experience

  • Fall 2009: Advanced Data Structures
  • {Summer 09, Fall 09}: Computer Organization
  • {Spring 10, Fall 10, Spring 12}: Distributed Operating System
  • {Fall 12, Fall 13}: Database Management System
  • Spring 2009: Software Engineering
  • {Fall 14}: Discrete Mathematics
  • Open Source Projects


    MADlib: an open-source library for scalable in-database analytics. It provides data-parallel implementations of mathematical, statistical, and machine-learning methods for structured and unstructured data.

    Contribution: A linear-chain conditional random field learning and inference module for NLP.


    Designed from scratch to allow efficient use of modern architectures for large analytical queries. DataPath makes full use of multiple cores, large amounts of memory, and many disks.

    Contribution: A scalable in-database inference framework.