Study Memory Behaviors of Multithreaded Multimedia Workloads Using Simics

Investigators: Jih-Kwon Peir

Sponsor: Intel

Abstract:

Digital signal and multimedia processing have become increasingly popular in many microprocessor applications. Today, almost all commercial processors, from ARM to Pentium, have some type of media enhancement hardware. While there has been much work studying memory behavior and performance for general integer and large-scale scientific applications, this proposed research will focus on the needs of multimedia applications.

The memory reference behavior of multimedia applications is complicated by the fact that general-purpose processors have begun to support multithreading to improve throughput and hardware utilization. The new Hyper-Threading technology brings the multithreading idea into the Intel Architecture, making a single physical processor appear as two logical processors. Active threads have their own local state, such as the Program Counter (PC), register file, and thread status word, while sharing other expensive hardware, such as functional units and caches. When multiple threads run in parallel on a multithreaded processor, their memory behavior can be either constructive or disruptive. The memory behavior is constructive when the threads share data that was brought in by one of them. The memory behavior is disruptive when the threads have their own distinct working sets and compete for the limited memory hierarchy resources.
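The difference between constructive and disruptive sharing can be illustrated with a toy model. The sketch below is not part of the proposed Simics methodology; it assumes a single fully associative LRU cache shared by all threads, strict round-robin interleaving of references, and hypothetical block-address traces chosen only for illustration.

```python
from collections import OrderedDict


def simulate_shared_lru(traces, capacity):
    """Interleave per-thread block-address traces round-robin through one
    shared, fully associative LRU cache and return the number of misses."""
    cache = OrderedDict()  # keys are block addresses, ordered oldest to newest
    misses = 0
    for step in range(max(len(t) for t in traces)):
        for trace in traces:
            if step >= len(trace):
                continue  # this thread has finished its trace
            block = trace[step]
            if block in cache:
                cache.move_to_end(block)  # hit: mark as most recently used
            else:
                misses += 1
                cache[block] = True
                if len(cache) > capacity:
                    cache.popitem(last=False)  # evict least recently used
    return misses


# Hypothetical parameters: an 8-block cache and 6-block working sets,
# each swept four times.
CAPACITY = 8
PASSES = 4
shared_set = list(range(6)) * PASSES        # both threads touch blocks 0..5
private_a = list(range(0, 6)) * PASSES      # thread A: blocks 0..5
private_b = list(range(100, 106)) * PASSES  # thread B: blocks 100..105

alone = simulate_shared_lru([private_a], CAPACITY)
constructive = simulate_shared_lru([shared_set, shared_set], CAPACITY)
disruptive = simulate_shared_lru([private_a, private_b], CAPACITY)

print("one thread alone:      ", alone, "misses")
print("two threads, shared:   ", constructive, "misses")
print("two threads, disjoint: ", disruptive, "misses")
```

In this model, two threads that share a working set incur no more misses than one thread alone, because each block fetched by one thread is reused by the other. Two threads with disjoint working sets that together exceed the cache capacity evict each other's blocks before they can be reused, so every reference misses.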

For this project we will use Simics, a full-system simulator capable of simulating multiprocessor and multithreaded applications. Simics is a system-level architecture simulator that can run unmodified commercial operating systems and applications. The host and target operating systems can be Windows or Linux. A consistent environment will be established between Intel and Dr. Peir's lab. We will collaborate with Intel's China Research Center (ICRC) and the Graphics and Media Lab at Intel's Microprocessor Research Lab (MRL). We will study several multithreaded media workloads that are parallelized with Intel's OpenMP compiler.

 

Acknowledgement: This research project uses Simics, a whole-system simulation tool from Virtutech, to conduct performance evaluation. Visit the Virtutech website to obtain an academic Simics license.

 

Papers and Presentations:

  1. X. Shi, F. Su, J-K. Peir, Y. Xia, and Z. Yang, “Accessibility vs. Capacity: Modeling and Single-Pass Stack Simulation on CMP Caches,” 2007 IEEE Int'l Symp. on Performance Analysis of Systems and Software (ISPASS), April 2007.
  2. Z. Yang, X. Shi, F. Su, and J-K. Peir, “Overlapping Dependent Loads with Addressless Preload,” IEEE/ACM 15th Int'l Conf. on Parallel Architectures and Compilation Techniques (PACT), Sep. 2006.
  3. X. Shi, Z. Yang, J-K. Peir, L. Peng, Y. Chen, V. Lee, and B. Liang, “Coterminous Locality and Coterminous Group Data Prefetching on Chip-Multiprocessors,” 20th IEEE Int'l Parallel & Distributed Processing Symp. (IPDPS), April 2006.
  4. Z. Yang, X. Shi, F. Su, and J-K. Peir, “MLP Exploitation with Seamless Preload,” 4th Workshop on Memory Performance Issues (WMPI), Feb. 2006.
  5. L. Peng, J. Song, S. Ge, Y-K. Chen, V. Lee, J-K. Peir, and B. Liang, “Case Studies: Memory Behavior of Multithreaded Multimedia and AI Applications,” 7th Workshop on Computer Architecture Evaluation Using Commercial Workloads (CAECW 7), Feb. 2004.
  6. A. Gheewala, J-K. Peir, Y. Chen, and K. Lai, “Estimating Multimedia Instruction Performance Based on Workload Characterization and Measurement,” IEEE 2002 Int'l Workshop on Workload Characterization, Nov. 2002.