SCHEDULE: NOV 15-20, 2015
Optimizing CUDA Shared Memory Usage
SESSION: Regular & ACM Student Research Competition Poster Reception
EVENT TYPE: Posters, Receptions, ACM Student Research Competition
EVENT TAG(S): HPC Beginner Friendly, Regular Poster
TIME: 5:15PM - 7:00PM
SESSION CHAIR(S): Michela Becchi, Manish Parashar, Dorian C. Arnold
AUTHOR(S): Shuang Gao, Gregory D. Peterson
ROOM: Level 4 - Lobby
ABSTRACT:
CUDA shared memory is fast, on-chip storage, but bank conflicts can turn it into a performance bottleneck. Current NVIDIA Tesla GPUs support shared memory bank accesses with configurable bit-widths. While this feature provides an efficient bank mapping scheme for both 32-bit and 64-bit data types, it also makes the bank conflict problem harder to solve through manual code tuning. This paper presents a framework for automatic bank conflict analysis and optimization. Given static array access information, we calculate the conflict degree and then provide optimized data access patterns. By searching among different combinations of inter- and intra-array padding, along with bank access bit-width configurations, we can efficiently reduce or eliminate bank conflicts. From Rodinia and the CUDA SDK we selected 13 kernels whose bottlenecks are due to shared memory bank conflicts. With our approach, these benchmarks achieve a 5%-35% improvement in runtime.
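The idea of computing a conflict degree and searching over padding and bank width can be illustrated with a small Python model. This is a minimal sketch, not the authors' framework: it assumes 32 banks and 32 threads per warp, maps a byte address to a bank as `(address // bank_width) % 32`, and counts only the word holding each element's first byte, ignoring hardware details such as how multi-word accesses are split. The function names (`conflict_degree`, `column_access`, `search`) and the column-access pattern are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

NUM_BANKS = 32   # assumption: 32 shared memory banks
WARP_SIZE = 32   # assumption: 32 threads issue one request together

def conflict_degree(byte_addresses, bank_width):
    """Largest number of distinct bank-width words that fall into one bank.

    Threads touching the same word are counted once, so a degree of 1
    means the simplified model sees no conflict.
    """
    words_per_bank = defaultdict(set)
    for addr in byte_addresses:
        word = addr // bank_width
        words_per_bank[word % NUM_BANKS].add(word)
    return max(len(words) for words in words_per_bank.values())

def column_access(row_len, pad, elem_size):
    """Byte addresses when thread t reads element [t][0] of a 2D tile
    whose rows are padded by `pad` extra elements (intra-array padding)."""
    return [t * (row_len + pad) * elem_size for t in range(WARP_SIZE)]

def search(row_len, elem_size, max_pad=4, bank_widths=(4, 8)):
    """Exhaustive search; returns the smallest (degree, pad, bank_width)."""
    candidates = [
        (conflict_degree(column_access(row_len, pad, elem_size), bw), pad, bw)
        for pad in range(max_pad + 1)
        for bw in bank_widths
    ]
    return min(candidates)

if __name__ == "__main__":
    # 32x32 tile of 4-byte floats, read column-wise with no padding.
    print(conflict_degree(column_access(32, 0, 4), 4))  # 32
    print(search(32, 4))  # (1, 1, 4): one float of padding, 4-byte banks
    print(search(32, 8))  # (1, 1, 8): one double of padding, 8-byte banks
```

In this model, an unpadded 32-element row sends every thread of a column read to the same bank, while one element of padding spreads the threads across all 32 banks. For 8-byte elements the same padding removes the conflict only when the bank width is also set to 8 bytes, which is why the search covers padding and bank width together.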
Chair/Author Details:
Michela Becchi (Chair) - University of Missouri
Manish Parashar (Chair) - Rutgers University
Dorian C. Arnold (Chair) - University of New Mexico
Shuang Gao - University of Tennessee, Knoxville
Gregory D. Peterson - University of Tennessee, Knoxville
