AIR: Accelerated Image Registration

Master's Thesis

Abstract

This thesis presents a performance analysis of an accelerated 2-D rigid image registration implementation that employs the Compute Unified Device Architecture (CUDA) programming environment to take advantage of the parallel processing capabilities of NVIDIA’s Tesla C870 GPU. We explain the underlying structure of the GPU implementation and compare its performance and accuracy against a fast CPU-based implementation. Our experimental results demonstrate that our GPU version is capable of up to 90× speedup with bilinear interpolation and 30× speedup with bicubic interpolation while maintaining a high level of accuracy. This compares favorably to recent image registration studies, but it also indicates that our implementation only reaches about 70% of theoretical peak performance. To analyze our results, we utilize profiling data to identify some of the underlying limitations of CUDA that prohibit peak performance. At the end, we emphasize the need to manage memory resources carefully to fully utilize the GPU and obtain maximum speedup.

Attributes

Attribute Name    Values
URN
  • etd-03092010-154958

Author Peter James Bui
Advisor Jay Brockman
Contributor Peter Kogge, Committee Member
Contributor Jay Brockman, Committee Chair
Contributor Patrick Flynn, Committee Member
Degree Level Master's Thesis
Degree Discipline Computer Science and Engineering
Degree Name MSCSE
Defense Date
  • 2009-11-02

Submission Date 2010-03-09
Country
  • United States of America

Subject
  • performance analysis

  • gpgpu

  • image registration

Publisher
  • University of Notre Dame

Language
  • English

Record Visibility Public
Content License
  • All rights reserved
