Sketching Big Network Data
                                                                               

 

Project Goals

This project aims to develop new sketching methods that reduce big network data to measurement summaries orders-of-magnitude smaller than what the traditional sketches can do. The new methods hold the promise of allowing routers to perform measurement on large network traffic at line speed, allowing enterprise systems to keep their network data records for much longer time, and allowing users with ordinary computing resources to work on big network data.

Sketches are compact data structures that store the summary of a large data set and answer queries with estimated results. A major research thrust in sketches is to reduce their memory footprint. After decades of improvement, the traditional way of sketching appears to reach its limit, e.g., requiring hundreds of bits per data flow for cardinality estimation. This project sets an ambitious goal of further reducing average space requirement by orders of magnitude to 1 bit or even 0.1 bit per flow. To achieve this goal, we take a non-conventional approach based on a new concept called virtual sketches. We still allocate (virtual) sketches to each data flow. However, the sketches of different flows share a common pool of memory. Space sharing can drastically reduce the memory requirement, but when we write information to the sketches of one flow, it introduces noise to the measurement of other flows through shared data structures. Fortunately, we find that for randomized sharing schemes, the noise can be statistically measured and removed.

Virtual sketches represent a new research branch, still in its infancy and in need of a comprehensive investigation to fully realize its potential in handling big network data. This project will fulfill such a need with the following research goals: (1) developing new methods of space sharing, (2) designing common procedures for highly efficient online operations on virtual sketches with low overhead, (3) constructing spatial virtual sketches for joint measurement of network data at different routers, (4) constructing temporal virtual sketches for joint measurement of network data across different time periods, and (5) constructing virtual composite sketches for sophisticated network data measurement.

Funding Agency: National Science Foundation

PI: Dr. Shigang Chen, Co-PI: Jih-kwon Peir

Postdoc: Dr. Yu-e Sun

Graduate Students:  Youlin Zhang, Chaoyi Ma

Project Duration: 07/15/2017-07/14/2020

Publication

  1. Qingyun Xiao, Shigang Chen, You Zhou, Min Chen, Junzhou Luo, Tengli Li, Yibei Ling, Cardinality Estimation for Elephant Flows: A Compact Solution based on Virtual Register Sharing, IEEE/ACM Transactions on Networking, vol. 25, issue 6, pp. 3738-3752, December 2017.

  2. Hongli Xu, He Huang, Shigang Chen, Gongming Zhao, Liusheng Huang, Achieving High Scalability Through Hybrid Switching in Software-Defined Networking, IEEE/ACM Transactions on Networking, vol. 26, no. 1, pp. 618-632, February 2018.

  3. Junzhi Gong, Tong Yang, Haowei Zhang, Hao Li, Steve Uhlig, Shigang Chen, Lorna Uden, Xiaoming Li, HeavyKeeper: An Accurate Algorithm for Finding Top-k Elephant Flows, in Proc. of USENIX Annual Technical Conference (ACT'18), Boston, MA, USA, JULY, 2018. (Acceptace rate: 20%)

  4. You Zhou, Yian Zhou, Shigang Chen, Youlin Zhang, Highly Compact Virtual Active Counters for Per-flow Traffic Measurement, Proc. of IEEE INFOCOM, Honolulu, HI, USA, April 2018. (Acceptance rate: 19.2%)

  5. He Huang, Yu-E Sun, Shigang Chen, Shaojie Tang, Kai Han, Jing Yuan, Wenjian Yang, You Can Drop but You Can't Hide: K-persistent Spread Estimation in High-speed Networks, Proc. of IEEE INFOCOM, Honolulu, HI, USA, April 2018. (Acceptance rate: 19.2%)

  6. Yu-e Sun, He Huang, Shigang Chen, Hongli Xu, Kai Han, Yian Zhou, Persistent Traffic Measurement Through Vehicle-to-Infrastructure Communications in Cyber-Physical Road Systems, IEEE Transactions on Mobile Computing, vol. 18, no. 7, pp. 1616-1630, July 2019. 

  7. Qi Zeng, Rakesh Jha, Shigang Chen, Jih-Kwon Peir, Data Locality Exploitation in Cache Compression, in Proc. of 24th IEEE International Conference on Parallel and Distributed Systems, Singapore, December 2018.

  8. Olufemi Odegbile, Shigang Chen, Yuanda Wang, Dependable Policy Enforcement in Traditional Non-SDN Networks, in Proc. of 39th IEEE International Conference on Distributed Computing (ICDCS), Dallas, Texas, USA, July 2019. (Acceptance ratio: 19.6%)

  9. You Zhou, Yian Zhou, Shigang Chen, Threshold-Based Widespread Event Detection, in Proc. of 39th IEEE International Conference on Distributed Computing (ICDCS), Dallas, Texas, USA, July 2019. (Acceptance ratio: 19.6%)

  10. Hongli Xu, Shigang Chen, Qianpiao Ma, Liusheng Huang, Lightweight Flow Distribution for Collaborative Traffic Measurement in Software Defined Networks, in Proc. of IEEE INFOCOM, 2019.(Acceptace rate: 19.7%)

  11. Yu-e Sun, He Huang, Shigang Chen, You Zhou, Kai Han, Wenjian Yang, Privacy-preserving estimation of k-Persistent Traffic in Vehicular Cyber-Physical Systems, accepted by IEEE Internet of Things.