Date: January 23, 2024
Time: 12:00 PM - 1:00 PM
Location: 1889 Museum Road, Gainesville, FL, 32611
Host: Department of CISE; Faculty Host: Dr. Kejun Huang
Zoom Link to the Meeting: https://ufl.zoom.us/j/99350698347
Bio: Xiao Fu has been with the School of Electrical Engineering and Computer Science, Oregon State University since 2017, where he is currently an Associate Professor. He received the Ph.D. degree in Electronic Engineering from The Chinese University of Hong Kong, in 2014. He was a Postdoctoral Associate with the Department of Electrical and Computer Engineering, University of Minnesota – Twin Cities, from 2014 to 2017. His research interests include the broad area of machine learning and signal processing, especially theory and algorithms.
Dr. Fu received the Best Student Paper Award at ICASSP 2014, the 2022 IEEE Signal Processing Society (SPS) Best Paper Award, and the 2022 IEEE SPS Donald G. Fink Overview Paper Award. He also received the Outstanding Postdoctoral Scholar Award at the University of Minnesota in 2016, the National Science Foundation (NSF) CAREER Award in 2022, and the Engelbrecht Early Career Faculty Award from the College of Engineering at Oregon State University in 2023. Since 2023, he has been the Chair of the IEEE SPS Oregon Chapter. He is currently an Associate Editor of IEEE Transactions on Signal Processing. He was a tutorial speaker at ICASSP 2017 and the SIAM Conference on Linear Algebra 2021.
Title: Towards Provable Multimodal Learning: A Model Identification Perspective
Abstract: 2023 was “the year of AI”, highlighted by the release of numerous AI models with remarkable capabilities. Multimodal learning is at the forefront of AI advancements, with state-of-the-art models like GPT-4 and Gemini emphasizing multimodal functionalities as their defining features. Despite its importance, many aspects of multimodal learning, and AI developments in general, still lack a concrete and comprehensive understanding—which is essential for building resilient and trustworthy systems. Our research focuses on the understanding of AI/ML systems to drive theory-backed advancements. From this perspective, this presentation revisits a core component of multimodal learning—Unsupervised Domain Translation (UDT).
Many UDT systems, such as CycleGAN, use Distribution Matching (DM) modules, which often fail in content-aligned translations due to measure-preserving automorphism (MPA). Existing remedies fall short of guaranteed performance. In my talk, I will introduce a model identification perspective for UDT, overcoming the MPA issues and ensuring identifiability of the desired translation functions. This is the first proven identification result in UDT under CycleGAN’s settings, to our knowledge. We have also broadened these concepts, providing solutions for various translation challenges, enabling provable content-style disentanglement, and offering more versatile cross-domain data generation. These advancements promise significant theoretically supported enhancements for UDT applications, particularly in data-limited fields such as medicine and biology.