AIR: Accelerated Image Registration

Master's Thesis


This thesis presents a performance analysis of an accelerated 2-D rigid image registration implementation that employs the Compute Unified Device Architecture (CUDA) programming environment to take advantage of the parallel processing capabilities of NVIDIA’s Tesla C870 GPU. We explain the underlying structure of the GPU implementation and compare its performance and accuracy against a fast CPU-based implementation. Our experimental results demonstrate that our GPU version is capable of up to 90× speedup with bilinear interpolation and 30× speedup with bicubic interpolation while maintaining a high level of accuracy. This compares favorably to recent image registration studies, but it also indicates that our implementation only reaches about 70% of theoretical peak performance. To analyze our results, we utilize profiling data to identify some of the underlying limitations of CUDA that prohibit peak performance. At the end, we emphasize the need to manage memory resources carefully to fully utilize the GPU and obtain maximum speedup.


Attribute Name  Values
Identifier etd-03092010-154958

Author Peter James Bui
Advisor Jay Brockman
Contributor Peter Kogge, Committee Member
Contributor Jay Brockman, Committee Chair
Contributor Patrick Flynn, Committee Member
Degree Level Master's Thesis
Degree Discipline Computer Science and Engineering
Degree Name MSCSE
Defense Date 2009-11-02

Submission Date 2010-03-09
Country United States of America

Subject
  • performance analysis
  • gpgpu
  • image registration

Publisher University of Notre Dame

Language English

Record Visibility Public
Content License All rights reserved


