Computer and Information Science and Engineering
University of Florida, Gainesville, FL 32611
Office: E457 CSE Building
Phone: (352) 284 - 6721
Ph.D., Computer Engineering, University of Florida (Fall 2012 - )
B.S., Computer Science, Nanjing University, China (2008 - 2012)
GPA 3.70/4.00, rank 10/160
Software Engineering Intern, Google Photos Team, Google, Fall 2016
- Implemented an efficient concept mining system that tags photos with concepts automatically, avoiding time-consuming manual concept design
- Learned thousands of meaningful and useful concepts from millions of photos using topic modelling algorithms and MapReduce
Research and Projects
Knowledge Base Completion
- Implemented a search-based question answering system that fills in missing information for entities in knowledge bases, improving recall
- Designed a candidate answer ranking model based on classification and type checking
Multimodal Ensemble Fusion [paper][blog]
- Designed a probabilistic ensemble fusion model to combine results from both images and text
- Multimodal Information Retrieval (Java), December 2015 [paper][code]
- Multimodal Word Sense Disambiguation (C++, Python), December 2015 [paper][code]
Scalable Systems for Information Extraction and Image Retrieval in Big Data
- ScaDIR: Large-scale Distributed Image Retrieval System (Java), Mar. 2015 [paper][code]
- Designed and implemented a distributed image retrieval system with high efficiency and scalability on millions of images using Hadoop, Mahout and Solr
- Parallelized the bag-of-visual-words model with distributed clustering algorithms on Hadoop, improving performance by orders of magnitude compared to previous approaches
- Streaming Fact Extraction from Big Data for TREC KBA (Java, Scala), Dec. 2013 [paper][blog]
- Implemented a streaming system to perform entity recognition, entity resolution and relation extraction on 5 TB of text data (blogs, news, forum posts, tweets and Wikipedia) using Stanford NLP, LingPipe and OpenNLP
- Non-Arrival Flights: A Website for Presenting Multi-lingual Experiments [demo][code]
- Developed a website to present dynamic translations among different languages
- Yang Peng, Xiaofeng Zhou, Daisy Zhe Wang, Ishan Patwa, Dihong Gong, Chunsheng Victor Fang.
Multimodal Ensemble Fusion for Disambiguation and Retrieval.
In IEEE MultiMedia Magazine, April-June 2016.
- Yang Peng, Xiaofeng Zhou, Daisy Zhe Wang, Chunsheng Victor Fang.
Scalable Image Retrieval with Multimodal Fusion.
In proceedings of the 29th International FLAIRS Conference, May 2016.
- Yang Peng, Daisy Zhe Wang, Ishan Patwa, Dihong Gong.
Probabilistic Ensemble Fusion for Multimodal Word Sense Disambiguation.
In proceedings of the IEEE International Symposium on Multimedia, December 2015.
- Morteza Shahriari Nia*, Christan Grant*, Yang Peng*, Daisy Zhe Wang, Milenko Petrovic.
Streaming Fact Extraction for Wikipedia Entities at Web-Scale.
In proceedings of the 27th International FLAIRS Conference, May 2014.
(* authors contributed equally)
- Morteza Shahriari Nia, Christan Grant, Yang Peng, Daisy Zhe Wang, Milenko Petrovic.
University of Florida Knowledge Base Acceleration Notebook.
In the Proceedings of the Twentieth Text REtrieval Conference (TREC 2013), Gaithersburg, Maryland. November 2013.
Awards and Honors
- University of Florida Graduate Fellowship, 2012 - 2016
- Excellent Student of Nanjing University, 2011
- China National Scholarship, 2011
- Excellent Youth League Member of Nanjing University, 2010
- China Institute of Electronics Technology Group Corporation Scholarship, 2010
- Excellent Student of Computer Science and Technology Department, 2009
- China National Scholarship, 2009
- Reviewer: WWW 17
- Session chair: IEEE ISM 2015, session 16
- External reviewer: VLDB 16, VLDB Journal 16, IJCAI 16, ICDE 16
- Programming Fundamentals I, Spring 2016
- Database Management Systems, Fall 2013 and Spring 2014
- Problem Solving using Computer Software, Spring 2013 and Summer 2013
- Programming Using C Language, Fall 2012
Courses: Database Management Systems, Analysis of Algorithms, Machine Learning, Advanced Data Structures, Computer Networks, Operating Systems, Distributed Operating Systems, Data Science, Data Mining, Digital Poetics, Artificial Intelligence, etc.