GPU-Accelerated Summarization and Reconstruction Techniques for Big Data Analysis and Visualization

Doctoral Dissertation


With the ever-growing amount of data we are now capable of collecting, there is an emerging need for methods that analyze big data and generate insight into it. Data visualization is a powerful tool that helps humans better understand data. However, large-scale data generated through simulation or collected from real-world scenarios are too voluminous for humans to sift through, even when visualized in a concise format. To combat this, it is necessary to simplify the data using summarization techniques. While summarization provides an overview that is easier for humans to digest, it comes with the drawback of omitting parts of the data. To overcome this drawback, data reconstruction techniques allow for a level-of-detail analysis of the underlying data. They further make it possible to synthesize missing data. For both data summarization and reconstruction, it is important to handle each kind of data in a way suited to it. In this dissertation, I describe several ways to summarize and reconstruct time-varying multivariate data, vector field data, and graph data.

Time-varying multivariate volumetric data typically stem from scientific simulations that describe physical or chemical processes. They usually span two or three spatial dimensions and a temporal dimension, and contain multiple variables. Volume visualization techniques are usually used to visualize and analyze them. In this dissertation, I focus on data analysis using isosurface rendering, a commonly used volume visualization technique.

While isosurface rendering allows detailed insight into time-varying multivariate data sets, their sheer complexity often makes it impossible to analyze and visualize all the data at once. A major challenge is the selection of interesting or important parts to bring to the attention of the analyst. Previous works have presented algorithms to select salient values; however, these algorithms do not scale well enough for big data analysis. In this dissertation, I present an acceleration and analysis framework to efficiently analyze and visualize complete large-scale time-varying multivariate data sets using isosurfaces.

A different type of data produced by scientific simulation is vector field data. At the core of describing fluid dynamics, these data sets are composed of vector fields revealing the underlying development of the flow. When it comes to the simulation of unsteady flow, scientists often face the challenge of only being able to generate output at a coarse temporal granularity. This leads to the problem of reconstructing the data at missing time steps with high quality for postprocessing and analysis. In this dissertation, I study three different deep-learning approaches to reconstruct missing vector fields from a small set of stored ones.

Another way to navigate through data is by using a graph representation. Graph representations are commonly used to show the relationships among entities. A typical way to depict a graph is a node-link diagram, with the entities as nodes and their connections as links. Similar to isosurface rendering, graph visualization suffers from data overload. Further, computing a pleasing and information-revealing layout for large graphs can take a long time. To combat these problems, I present a framework to efficiently summarize a graph into sparse levels, compute a layout, and then reconstruct this information for denser levels of the graph.


Attribute Name Values
Author Martin Imre
Contributor Hanqi Guo, Committee Member
Contributor Collin McMillan, Committee Member
Contributor Chaoli Wang, Research Director
Contributor Ronald Metoyer, Committee Member
Degree Level Doctoral Dissertation
Degree Discipline Computer Science and Engineering
Degree Name Doctor of Philosophy
Defense Date 2020-04-16
Submission Date 2020-04-22
Record Visibility Public

