New Algorithms for Spatial Data Clustering Problems

Doctoral Dissertation
Thumbnail

Abstract

Knowledge discovery in spatial (i.e., numerical) data sets is becoming more and more important since increasingly large amounts of spatial data are available and used in key application areas such as public administration, science, and business. Data clustering, i.e., grouping various objects into meaningful subclasses, is one of the major data mining operations, and is a fundamental problem that arises in many applications. Due to the huge sizes and large number of attributes of the input data nowadays, achieving computational efficiency has become a primary objective and a big challenge to data clustering algorithms. In this dissertation, we study a number of spatial data clustering problems such as density-based data clustering, density-based data clustering with secondary memory management for lower dimensions, agglomerative hierarchical data clustering, and constrained 1-D K-means data clustering. Based on interesting spatial data structures such as BBD-tree, fair-split tree, and hashing, and computational geometry techniques such as range search, closest pair search, well-separate pair decomposition, and K-link shortest path, we present new efficient and effective spatial data clustering algorithms with applications. Comparing with the previous best known algorithms, our algorithms improve the time complexity and/or the quality of the output significantly. We believe that our spatial data clustering algorithms are of considerable practical importance.

Attributes

Attribute NameValues
URN
  • etd-04022008-234707

Author Bin Xu
Advisor Danny Z. Chen
Contributor Gregory R. Madey, Committee Member
Contributor Amitabh Chaudhary, Committee Member
Contributor Xiaobo Sharon Hu, Committee Member
Contributor Danny Z. Chen, Committee Chair
Degree Level Doctoral Dissertation
Degree Discipline Computer Science and Engineering
Degree Name PhD
Defense Date
  • 2008-02-20

Submission Date 2008-04-02
Country
  • United States of America

Subject
  • algorithm

  • computational geometry

  • data mining

  • spatial data clustering

Publisher
  • University of Notre Dame

Language
  • English

Record Visibility and Access Public
Content License
  • All rights reserved

Departments and Units

Files

Please Note: You may encounter a delay before a download begins. Large or infrequently accessed files can take several minutes to retrieve from our archival storage system.