Ph.D. in Computer Science
Computer and Information Science and Engineering
University of Florida, Gainesville, FL 32611
Office: E457 CSE Building
Phone: (352) 284 - 6721
Ph.D., Computer Science, University of Florida (August 2012 - December 2017)
B.S., Computer Science, Nanjing University, China (2008 - 2012)
GPA 3.70/4.00, rank 10/160
Software Engineering Intern, Google Photos Team, Google, Fall 2016
- Implemented an efficient concept mining system to tag photos with concepts automatically, avoiding time-consuming manual concept design
- Learned thousands of meaningful and useful concepts from millions of photos using topic modelling algorithms and MapReduce
Research and Projects
Knowledge Base Completion, August 2017
- Implemented a query-driven knowledge base completion system using web-based question answering and rule inference to fill in missing information in knowledge bases
- Employed multimodal fusion to combine unstructured and structured data for state-of-the-art performance, and query-driven approaches to provide real-time responses
Multimodal Ensemble Fusion [paper][blog]
- Designed an ensemble fusion model to combine text and images to obtain higher quality than single-modality methods for word sense disambiguation and information retrieval
- Implemented rule-based and classification-based fusion approaches, and text/image classification/retrieval methods
- Multimodal Information Retrieval (Java), December 2015 [paper][code]
- Multimodal Word Sense Disambiguation (C++, Python), December 2015 [paper][code]
Scalable Systems for Information Extraction and Image Retrieval in Big Data
- ScaDIR: Large-scale Distributed Image Retrieval System (Java), Mar. 2015 [paper][code]
- Designed and implemented a distributed image retrieval system with high efficiency and scalability on millions of images using Hadoop, Mahout and Solr
- Parallelized the bag-of-visual-words model with distributed clustering algorithms on Hadoop, improving performance by orders of magnitude compared to previous approaches
- Streaming Fact Extraction from Big Data for TREC KBA (Java, Scala), Dec. 2013 [paper][blog]
- Implemented a fast streaming system to perform entity recognition, entity resolution and relation extraction on 5 TB of text data, including blogs, news, forum posts, tweets and Wikipedia, using Stanford NLP, LingPipe and OpenNLP
- Non-Arrival Flights: A Website for Presenting Multi-lingual Experiments, Nov. 2015 [demo][code]
- Developed a website to present dynamic translations among different languages
- Yang Peng.
Multimodal Fusion: a Theory and Applications [paper][slides].
PhD Dissertation. The University of Florida, 2017.
- Dihong Gong, Daisy Zhe Wang, Yang Peng.
Multimodal Learning for Web Information Extraction.
In proceedings of ACM Multimedia, October 2017.
- Yang Peng, Xiaofeng Zhou, Daisy Zhe Wang, Ishan Patwa, Dihong Gong, Chunsheng Victor Fang.
Multimodal Ensemble Fusion for Disambiguation and Retrieval.
In IEEE MultiMedia Magazine, April-June 2016.
- Yang Peng, Xiaofeng Zhou, Daisy Zhe Wang, Chunsheng Victor Fang.
Scalable Image Retrieval with Multimodal Fusion.
In proceedings of the 29th International FLAIRS Conference, May 2016.
- Yang Peng, Daisy Zhe Wang, Ishan Patwa, Dihong Gong.
Probabilistic Ensemble Fusion for Multimodal Word Sense Disambiguation.
In proceedings of the IEEE International Symposium on Multimedia, December 2015.
- Morteza Shahriari Nia*, Christan Grant*, Yang Peng*, Daisy Zhe Wang, Milenko Petrovic.
Streaming Fact Extraction for Wikipedia Entities at Web-Scale.
In proceedings of the 27th International FLAIRS Conference, May 2014.
(* authors contributed equally)
- Morteza Shahriari Nia, Christan Grant, Yang Peng, Daisy Zhe Wang, Milenko Petrovic.
University of Florida Knowledge Base Acceleration Notebook.
In the Proceedings of the Twenty-Second Text REtrieval Conference (TREC 2013), Gaithersburg, Maryland, November 2013.
Awards and Honors
- University of Florida Graduate Fellowship, 2012 - 2016
- Excellent Student of Nanjing University, 2011
- China National Scholarship, 2011
- Excellent Youth League Member of Nanjing University, 2010
- China Institute of Electronics Technology Group Corporation Scholarship, 2010
- Excellent Student of Computer Science and Technology Department, 2009
- China National Scholarship, 2009
- Reviewer: WWW 17
- Session chair: IEEE ISM 2015, session 16
- External reviewer: VLDB 17, VLDB 16, VLDB Journal 16, IJCAI 16, ICDE 16, etc.
- Programming Fundamentals I: Spring 2016
- Database Management Systems: Fall 2013, Spring 2014, Spring 2017 and Fall 2017
- Problem Solving using Computer Software: Spring 2013 and Summer 2013
- Programming Using C Language: Fall 2012
Courses: Database Management Systems, Analysis of Algorithms, Machine Learning, Advanced Data Structures, Computer Networks, Operating Systems, Distributed Operating Systems, Data Science, Data Mining, Digital Poetics, Artificial Intelligence, etc.