Title: Vision for Data-Intensive Analysis Centers and the Challenges to Its Realization
Presenting Author: Kwo-Sen Kuo
Organization: NASA GSFC/Bayesics
Co-Author(s):
Thomas L Clune; Amidu O Oloso; Michael L Rilee; Khoa Doan; Rahul Ramachandran

Abstract:
In the race to provide a basis for wise decisions regarding our climate future, we need to extract timely and accurate information from massive, diverse, and ever-increasing volumes of data. Existing practice is not only inefficient but also has other undesirable consequences, such as impeding collaboration and reproducibility. We present our vision for data-intensive analysis centers, which has the potential to revolutionize data analysis research. We describe the challenges being addressed by our current project and those we anticipate in the future. In this project, we are addressing mostly technical challenges, the solutions to which will not only provide a convenient and uniform means of analysis regardless of the original data model used (i.e., Grid, Swath, or Point), but also help optimize performance for Big Data technologies based on shared-nothing architectures. An upcoming challenge that follows from the vast improvement in analysis technology is finding a fast and effective way to visualize large, multi-terabyte data sets, such as an animation of the evolution of blizzards identified and tracked by the analysis system over the past 30 years. There are multiple scenarios in which more effective visualization is crucial. Because visualization ultimately involves communication and data movement, we are attempting to find techniques to optimize this process. We also face other, less technical but perhaps more important, challenges, such as devising a strategy for transition and adoption. There are two aspects to this challenge. First, assuming that we start with an initial subset of the entire data holdings, what is the best way to incorporate additional data into the system incrementally? Second, how can we incentivize researchers to adopt the new practice at a sufficiently brisk pace to ensure the continuous, sustained development (and survival) of the new system?