AIST Mini-NRA Selections Announced
NASA’s Office of Earth Science Awards Six Grants for Advanced Information Systems Technology
08/19/2004 – The National Aeronautics and Space Administration (NASA) has
awarded funding for six new investigations for information systems technology
development, under the Advanced Information Systems Technology (AIST) Program,
which supports NASA’s mission to understand and protect our home planet.
The proposals, selected from a field of 30 submissions, focus on high-priority
information technology areas: tools for warehousing, data mining, and knowledge
discovery; technologies to facilitate queries/access of multi-disciplinary data;
and techniques to facilitate customized data services. The data mining technologies
sought address two challenge areas: ocean biology and biogeochemistry data mining,
and data mining for climate and weather models. The total funding for these
investigations, over a period of two years, is approximately $1.9 million; investigators
hail from seven states.
The main purpose of AIST is to invest in research and development
of new and innovative information technologies to support and enhance the Earth
science capability. AIST focuses on creating mature technologies leading to
smaller, less resource-intensive and less expensive flight systems that can
be built quickly and efficiently, and on more-efficient ground-based processing
and modeling systems that improve the use of Earth science data.
The technologies selected include a statistical data mining and
machine learning toolkit whose development will enable scaling of global data
sets and integration of heterogeneous data sources to evaluate/predict the effects
of varying weather patterns on agricultural crop yields. A spatiotemporal data
mining tool will enable monitoring and modeling of multiple oceanographic objects,
such as river-based plumes and harmful algal blooms.
Technologies to improve the utilization of large heterogeneous
data sets will also be developed. These include the modification of data compression
techniques for use as a data reduction method to create small summary data sets
that are substantially reduced in volume and complexity, and the wavelet analysis
of local information content in a data scene to intelligently select the density
of observations to use for weather and climate modeling.
Climate modeling and prediction techniques will be further enhanced
through the development of data mining and knowledge discovery tools. A suite
of data mining tools based on new information-theoretic techniques will enable
rapid identification, characterization and quantification of causal interactions
among relevant climate variables in large distributed data sets, allowing evaluation
and prediction of climate and climate subsystem changes over time in response
to natural and human-induced changes. Data mining and knowledge discovery techniques
will facilitate analysis, visualization, and modeling of land-surface variables
obtained from the TERRA and AQUA platforms in support of climate and weather
applications to enable better parameterization of the relevant processes in
forecast models for weather and inter-annual climate prediction.
The investigations selected by NASA’s Earth Science Technology
Office are:
Braverman, Amy (Jet Propulsion Laboratory (JPL), Pasadena, CA): Mining Massive Earth Science Data Sets for Climate and Weather Forecast Models
Cai, Yang (Carnegie Mellon University, Pittsburgh, PA): Data Mining System for Tracking and Modeling Ocean Object Movement
Hoffman, Ross (Atmospheric and Environmental Research (AER) Incorporated, Lexington, MA): Selection Technique for Thinning Satellite Data for Numerical Weather Prediction
Knuth, Kevin (NASA Ames Research Center, Moffett Field, CA): Rapid Characterization of Causal Interactions among Climate/Weather System Variables: An Advanced Information-Theoretic Technique
Kumar, Praveen (University of Illinois, Urbana, IL): Data Mining for Understanding the Dynamic Evolution of Land-Surface Variables: Technology Demonstration Using the D2K Platform
Wagstaff, Kiri (JPL, Pasadena, CA): Interactive Analysis of Heterogeneous Data to Determine the Impact of Weather on Crop Yield
Title | Mining Massive Earth Science Data Sets for Climate and Weather Forecast Models |
Full Name | Amy Braverman |
Institution Name | JPL |
Proposal # | AIST-QRS-04-3014 |
In this proposal we address the technology objectives specified in Section I.1. of the Mini-AIST NRA announced May 5, 2004. Specifically, we will provide tools and support for data warehousing, data mining, and knowledge discovery for the ESE science challenge posed in Section I.2.2.b: Data Preparation for Medium Range Weather Forecasts. The sheer volume of Earth science data precludes the interactive, real-time scientific exploration required to characterize and understand features that can inform and improve physical models. We propose to solve this problem by creating small summary data sets of reduced volume and complexity, which can be used in place of the original data as input to models or for comparison with model output. We propose using data compression techniques, modified for use as data reduction methods, to create summary data sets of small size and high accuracy for observational data from AIRS, MISR, and ISCCP (International Satellite Cloud Climatology Project), together with model data from NCAR’s CAM3 and GFDL’s AM2 atmospheric models. These summary data sets can be thought of as “thinned” in the sense of retaining representative observations which, taken together, preserve the statistical and distributional character of the original data. The summary data can therefore also be used to create customized data products that estimate features of the modelers’ choice. Our technology is currently at TRL 4, and we expect to achieve TRL 6 in the 24-month performance period. |
|
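The idea of replacing a large data set with a few weighted representatives can be illustrated with a minimal sketch. It uses plain k-means clustering as a stand-in and is not the proposal's actual compression-based method; the function name `summarize` and its parameters are hypothetical.

```python
import random

def summarize(data, k, iters=20, seed=0):
    """Reduce a list of equal-length tuples to k representatives with counts.

    Plain k-means, used here only as an assumed stand-in for the
    compression-derived reduction described in the abstract.
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(data, k)]
    dim = len(data[0])
    counts = [0] * k
    for _ in range(iters):
        sums = [[0.0] * dim for _ in range(k)]
        counts = [0] * k
        for p in data:
            # Assign the observation to its nearest representative
            # (squared Euclidean distance).
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            counts[j] += 1
            for d in range(dim):
                sums[j][d] += p[d]
        for j in range(k):
            if counts[j]:  # an empty cluster keeps its previous representative
                centers[j] = [s / counts[j] for s in sums[j]]
    return centers, counts
```

Each representative carries the number of observations it stands for, so weighted statistics of the summary can approximate those of the full data; the count-weighted mean of the representatives, for instance, equals the mean of the original observations.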
Title | Spatiotemporal Data Mining System for Tracking and Modeling Ocean Object Movement |
Full Name | Yang Cai |
Institution Name | Carnegie Mellon University |
Proposal # | AIST-QRS-04-3031 |
Tracking and modeling the spatiotemporal dynamics of ocean objects are essential to ESE missions in oceanographic studies, such as monitoring and predicting harmful algal blooms along the coastline or river-based plumes discharged to the open ocean. In this project, we propose a spatiotemporal data mining system for following [...] This generalized spatiotemporal data mining tool enables monitoring and [...] We will use the SeaWiFS database as our main source. Meanwhile, we will explore [...] The technology would be based on our lab prototypes of multi-sensor data [...] The Co-PI Dr. Richard P. Stumpf, Oceanographer from NOAA will specify [...] |
|
Title | Selection Technique for Thinning Satellite Data for Numerical Weather Prediction |
Full Name | Ross Hoffman |
Institution Name | Atmospheric and Environmental Research, Inc. |
Proposal # | AIST-QRS-04-3019 |
Operational weather prediction centers use only a fraction of the observations of the atmosphere and the Earth’s surface that are made by satellite, in situ, and ground-based instruments. In many cases satellite data are selected by regular decimation, i.e., every nth observation. The objective of this proposed project is to develop a more intelligent selection method that uses the local information content in a data scene to determine the density of observations to use. The method will be based on a wavelet analysis of the satellite data. Tests using QuikSCAT scatterometer wind observations in analysis and forecast systems will compare results based on ALL of the data to results from the REGULAR and WAVELET selections. A two-year level of effort is proposed to advance the TRL of the method from 4 to 6. |
|
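The contrast between regular decimation and information-based selection can be sketched on a one-dimensional series. This is an illustration under stated assumptions, not the proposed method: it uses only the finest-level Haar wavelet detail, and the function names and the `threshold` parameter are hypothetical.

```python
import math

def regular_thinning(obs, n):
    """Keep every n-th observation (regular decimation). Returns kept indices."""
    return list(range(0, len(obs), n))

def haar_details(obs):
    """Finest-level Haar detail magnitude for each adjacent pair of observations."""
    return [abs(obs[i] - obs[i + 1]) / math.sqrt(2)
            for i in range(0, len(obs) - 1, 2)]

def wavelet_thinning(obs, n, threshold):
    """Keep every n-th observation, plus both members of any pair whose
    Haar detail exceeds `threshold` (an assumed tuning parameter)."""
    keep = set(regular_thinning(obs, n))
    for pair, detail in enumerate(haar_details(obs)):
        if detail > threshold:
            keep.update((2 * pair, 2 * pair + 1))
    return sorted(keep)

# A flat series with one sharp jump between indices 6 and 7.
obs = [1.0] * 7 + [5.0] * 9
print(regular_thinning(obs, 4))       # [0, 4, 8, 12]
print(wavelet_thinning(obs, 4, 1.0))  # [0, 4, 6, 7, 8, 12]
```

The wavelet-guided selection retains the two observations that bracket the jump, which regular decimation skips. A single Haar level only detects changes that fall inside a pair, so a fuller treatment would examine several wavelet scales.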
Title | Rapid Characterization of Causal Interactions among Climate/Weather System Variables: An Advanced Information-Theoretic Approach |
Full Name | Kevin Knuth |
Institution Name | NASA Ames Research Center |
Proposal # | AIST-QRS-04-3010 |
The NASA Earth Science Enterprise is focused on obtaining a better understanding of our home planet. While it is clear that the Earth’s climate changes over time, it is not known how this change occurs, what the primary causes of change are, or how climate subsystems respond to natural and human-induced changes. Vast amounts of data are being collected on the Earth climate system and it is increasingly important to rapidly discover relevant climate variables, and qualify and quantify their causal interactions. We will develop a suite of data-mining tools based on new information-theoretic techniques to rapidly identify, characterize, and quantify causal interactions among relevant climate variables in large distributed datasets. This information-theoretic approach relies on established quantities such as mutual information in addition to novel quantities called co-informations and derived quantities such as transfer entropy, which enable us to quantify complex causal interactions over different spatiotemporal scales. We have demonstrated these techniques at TRL 4, and during the period of performance from 10/01/04 through 9/30/06 the development of these tools will take them to TRL 6. In addition to quantifying causal interactions, these tools will also quantify the errors in the estimates thus quantifying inherent uncertainties in the results. These uncertainties are crucial to accurately evaluating our state of knowledge about the climate system, which is an element of key interest to the US Climate Change Science Program. These measures will be demonstrated using important climate datasets including MODIS, TRMM, and the International Satellite Cloud Climatology Project (ISCCP) dataset. |
|
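Two of the quantities named in this abstract, mutual information and transfer entropy, can be estimated for discrete time series with simple plug-in (frequency-count) estimators. The sketch below is a minimal illustration under that assumption, with a history length of one time step; it does not reproduce the proposal's tools, its co-information measures, or its error estimates, and the function names are hypothetical.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # p(x,y) / (p(x) p(y)) simplifies to c * n / (count_x * count_y).
    return sum(c / n * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def transfer_entropy(source, target):
    """Plug-in estimate of transfer entropy source -> target, in bits,
    using a history length of one time step."""
    triples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(triples)
    c_full = Counter(triples)                                 # (t_next, t_now, s_now)
    c_hist = Counter((t_now, s_now) for _, t_now, s_now in triples)
    c_self = Counter((t_next, t_now) for t_next, t_now, _ in triples)
    c_now = Counter(t_now for _, t_now, _ in triples)
    te = 0.0
    for (t_next, t_now, s_now), c in c_full.items():
        cond_full = c / c_hist[(t_now, s_now)]                # p(t_next | t_now, s_now)
        cond_self = c_self[(t_next, t_now)] / c_now[t_now]    # p(t_next | t_now)
        te += c / n * math.log2(cond_full / cond_self)
    return te
```

Unlike mutual information, transfer entropy is directional: if one series simply copies another with a one-step lag, the estimate from the driver to the follower is large, while the estimate in the reverse direction stays near zero.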
Title | Data Mining for Understanding the Dynamic Evolution of Land-Surface Variables: Technology Demonstration using the D2K Platform |
Full Name | Praveen Kumar |
Institution Name | University of Illinois |
Proposal # | AIST-QRS-04-3015 |
The objective of this research proposal is to develop data mining and knowledge discovery in databases (KDD) techniques, using the “Data to Knowledge” (D2K) platform developed by the National Center for Supercomputing Applications (NCSA), to facilitate analysis, visualization and modeling of land-surface variables obtained from the TERRA and AQUA platforms in support of climate and weather applications. The specific technology objective addressed is: Tools and support for data mining and knowledge discovery. The ESE science challenge addressed is: Data mining for climate and weather applications. The specific science questions that this project will focus on are: 1. How are evolving surface variables such as vegetation indices, temperature, and emissivity, as obtained from the TERRA and AQUA platforms, dynamically linked? 2. How do they evolve in response to climate variability such as ENSO (El Niño Southern Oscillation)? and 3. How are they dependent on temporally invariant factors such as topography (and derived variables such as slope, aspect, nearness to streams), soil characteristics, land cover classification, etc.? Answers to these questions, at the continental to global scales, using data mining, will enable us to develop better parameterization of the relevant processes in forecast models for weather, and inter-seasonal to inter-annual climate prediction. The entry Technology Readiness Level (TRL) for the project is 4 and we [...] |
|
Title | Interactive Analysis of Heterogeneous Data to Determine the Impact of Weather on Crop Yield |
Full Name | Kiri Wagstaff |
Institution Name | JPL |
Proposal # | AIST-QRS-04-3004 |
We will develop a versatile toolkit for statistical data mining and machine learning that will enable (1) rapid, in-depth analysis of subtle relationships between multiple, different science data products, and (2) efficient testing of competing scientific hypotheses. The toolkit will feature advanced methods that are optimized for the analysis of data with spatial dependencies. We will include technologies for classification (support vector machines), clustering (using spatial constraints), and prediction (multivariate spatial models) that currently exist only as standalone methods (TRL 4). These techniques will be refined and demonstrated in a critical scientific investigation: a study of the fine-scale effects of varying weather patterns on agricultural crop yields. The final system will be an integrated toolkit with an easy-to-use graphical interface, demonstrated on full-scale science data (TRL 6). As a result of this work, scientists will be able to easily answer questions such as, “What is the impact on corn yield if Kansas receives only 75% of its normal rainfall?” Benefits over the state of the art include a) analysis methods that scale to global data sets and b) the ability to integrate heterogeneous data sources for improved prediction accuracy. The anticipated period of performance is October 2004 to September 2006. |
|