High Performance Computing Laboratory

Texas A&M University College of Engineering

Packet Coalescing Exploiting Data Redundancy in GPGPU Architectures

K. H. Kim, R. Boyapati, J. Huang, Y. Jin, K. H. Yum and E. J. Kim

Proceedings of the 31st International Conference on Supercomputing (ICS), Chicago, IL, USA, June 2017

General-purpose graphics processing units (GPGPUs) are becoming a cost-effective hardware approach for parallel computing. Many GPGPU workloads place heavy stress on the memory system, creating network bottlenecks near the memory controllers. We observe that data redundancy in communication traffic is commonplace across a wide range of GPGPU applications. To exploit this redundancy, we propose a packet coalescing mechanism that alleviates the network bottlenecks by directly reducing traffic volume. The key idea is to coalesce multiple packets that carry redundant cache blocks into a single packet, without increasing the packet size. To ensure that coalesced packets are delivered to their respective destinations, we adopt multicast routing in the GPGPU interconnection network. Across various GPGPU applications, our coalescing approach yields a 15% IPC improvement (up to 112%) in a large-scale GPGPU with a 2D mesh, by reducing average memory access time (AMAT) by 15.5% (up to 65.2%) and saving 13% of network bandwidth (up to 37%). It also achieves a 7% IPC improvement in the NVIDIA Fermi architecture with a crossbar.
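
The coalescing idea can be illustrated with a small software sketch. This is only a toy model under stated assumptions, not the hardware mechanism from the paper: the abstract does not say where coalescing takes place, how redundant blocks are detected, or how multicast destinations are encoded. The `Packet` type, the `coalesce` function, and the representation of destinations as a set of node IDs are all hypothetical names introduced here for illustration.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Packet:
    dests: FrozenSet[int]  # destination node IDs (hypothetical encoding)
    block: bytes           # cache block carried as the payload


def coalesce(queue: List[Packet]) -> List[Packet]:
    """Merge queued packets whose cache blocks are byte-identical.

    Each merged packet keeps a single copy of the block, so the payload
    size does not grow; only the destination set does. Delivering it
    would then rely on multicast routing in the network.
    """
    merged = OrderedDict()  # block -> destination set, in first-seen order
    for pkt in queue:
        merged.setdefault(pkt.block, set()).update(pkt.dests)
    return [Packet(frozenset(d), b) for b, d in merged.items()]


# Four reply packets; three of them carry the same (all-zero) block.
queue = [
    Packet(frozenset({0}), bytes(4)),
    Packet(frozenset({3}), b"\x01\x02\x03\x04"),
    Packet(frozenset({5}), bytes(4)),
    Packet(frozenset({7}), bytes(4)),
]
out = coalesce(queue)
for pkt in out:
    print(sorted(pkt.dests), pkt.block.hex())
```

In this toy queue, four unicast packets become two: one multicast packet for nodes 0, 5, and 7, and one unchanged packet for node 3. The number of payload copies injected into the network drops from four to two, which is the kind of traffic reduction the abstract attributes to coalescing.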
