Posted on 2010-12-14, authored by Michael Albrecht
Dataset sizes for scientific experiments are expanding at a prodigious rate; even small-scale laboratories can produce terabytes of raw data each year. This data must not only be stored but also analyzed, or it amounts to little more than wasted space. Furthermore, in fields such as physics, scientists are frequently searching for interesting events or trends amid a sea of uninteresting data, making visualization and large-scale analysis especially important.
One experiment that follows this pattern is GRAND (Gamma Ray Astrophysics at Notre Dame). In this work I discuss the needs and constraints of data repositories for data-intensive scientific experiments in the context of developing such a system for GRAND. Challenges such as storing large datasets, interface design, fast data analysis, and large-scale data visualization are examined, and solutions based on distributed storage and parallel computation are presented.
History
Date Modified: 2017-06-02
Research Director: Douglas Thain
Committee Members: Scott Emrich, Greg Madey
Degree: Master of Science in Computer Science and Engineering