Most conventional data structures and data analysis methods were designed with simple transaction data in mind. However, data miners are increasingly presented with more complex datasets that have relationships or dependencies embedded within them. Incorporating these relationships into the data mining process can pose both algorithmic and computational challenges, but there is also a tremendous opportunity to leverage them as an additional source of information. Indeed, we believe that there is relational structure in every dataset, which can be exploited for analysis and learning if a suitable data representation is used. In this dissertation, we look at the world through a ‘network lens’; that is, we advocate the use of networks for representing and analyzing complex datasets from various domains. First, we propose a methodological advance in the form of a novel algorithm for identifying community structure in networks that is relevant across many domains. Second, we present applications in which we impose the network view on datasets that do not contain explicit relationships and show how the ‘network lens’ brings into focus interesting and potentially useful patterns in the data. Specifically, in climate science we demonstrate the value of networks as a unified framework for descriptive analysis and predictive modeling, which has led to novel insights in the domain.
|Contributor||Nitesh Chawla, Committee Member|
|Contributor||Patrick Flynn, Committee Member|
|Contributor||Jessica Hellman, Committee Chair|
|Contributor||Auroop Ganguly, Committee Member|
|Contributor||Edward Bensman, Committee Member|
|Contributor||Kevin Bowyer, Committee Member|
|Degree Level||Doctoral Dissertation|
|Degree Discipline||Computer Science and Engineering|
|Degree Name||Doctor of Philosophy|
|Departments and Units|