22 Projects Awarded Funding Under the Advanced Information Systems Technology (AIST) Program
(2016 ROSES A.41 Solicitation NNH16ZDA001N - AIST, Research Opportunities in Space and Earth Sciences)
06/07/2017 – NASA's Science Mission Directorate, NASA Headquarters, Washington, DC, has selected proposals for the Advanced Information Systems Technology Program (AIST-16) in support of the Earth Science Division (ESD). The AIST-16 awards will provide technologies that reduce the risk and cost of evolving NASA information systems to support future Earth observation and that transform those observations into Earth information.
Through ESD’s Earth Science Technology Office, a total of 22 proposals will be funded over a two-year period; the combined value of the awards is roughly $25M.
The Advanced Information Systems Technology (AIST) program sought proposals for technology development activities to enable science measurements, make use of data for research, and facilitate practical applications for societal benefit by directly supporting each of the core functions within ESD: research and analysis, flight, and applied sciences. The objectives of the AIST Program are to identify, develop and demonstrate advanced information system technologies that:
• Reduce the risk, cost, size, and development time of ESD space-based and ground-based information systems,
• Increase the accessibility and utility of Earth science data and models, and
• Enable new Earth observation measurements and information products.
A total of 137 proposals were evaluated, of which 22 have been selected for award. The awards are as follows (names hyperlinked to project abstracts):
Ved Chirayath, Ames Research Center
Martyn Clark, National Center for Atmospheric Research
Helen Conover, University of Alabama in Huntsville
Dara Entekhabi, Massachusetts Institute of Technology
Matthew French, University of Southern California
Barton Forman, University of Maryland, College Park
Milton Halem, University of Maryland, Baltimore County
Jonathan Hobbs, Jet Propulsion Laboratory
Walter Jetz, Yale University
Joel Johnson, The Ohio State University
Branko Kosovic, National Center for Atmospheric Research
Jacqueline Le Moigne, Goddard Space Flight Center
James McDuffie, Jet Propulsion Laboratory
Jeffrey Morisette, USGS Fort Collins Science Center
Christopher Neigh, Goddard Space Flight Center
Victor Pankratius, Massachusetts Institute of Technology
Paul Rosen, Jet Propulsion Laboratory
Jennifer Swenson, Duke University
Philip Townsend, University of Wisconsin-Madison
Petr Votava, Ames Research Center
Anne Wilson, University of Colorado, Boulder
Charlie Zender, University of California at Irvine
Ved Chirayath, Ames Research Center
NeMO-Net - The Neural Multi-Modal Observation & Training Network for Global Coral Reef Assessment
We propose NeMO-Net, the first neural multi-modal observation and training network for global coral reef assessment. NeMO-Net is an open-source deep convolutional neural network (CNN) and interactive active learning training software proposed to accurately assess the present and past dynamics of coral reef ecosystems through determination of percent living cover and morphology as well as mapping of spatial distribution. NeMO-Net exploits active learning and data fusion of mm-scale remotely sensed 3D images of coral reefs from our ESTO FluidCam instrument, presently the highest-resolution remote sensing benthic imaging technology capable of removing ocean wave distortion, as well as lower-resolution airborne remote sensing data from the ongoing NASA CORAL mission and satellite data to determine coral reef ecosystem makeup globally at unprecedented spatial and temporal scales.
Aquatic ecosystems, particularly coral reefs, remain quantitatively misrepresented by low-resolution remote sensing as a result of refractive distortion from ocean waves, optical attenuation, and remoteness. ESTO-funded machine learning classification of coral reefs using FluidCam mm-scale 3D data showed that present satellite and airborne remote sensing techniques poorly characterize fundamental coral reef health indicators, such as percent living cover, morphology type, and species breakdown at the cm and meter scale. Indeed, current global assessments of coral reef cover and morphology based on km-scale satellite data alone can suffer from segmentation errors greater than 40% and can detect change only at yearly temporal scales and decameter spatial scales. This significantly hinders our understanding of patterns and processes in marine biodiversity at a time when these ecosystems are experiencing unprecedented anthropogenic pressures, ocean acidification, and sea surface temperature rise.
NeMO-Net leverages ESTO investment in our augmented machine learning algorithm, which demonstrates that data fusion of regional FluidCam (mm, cm-scale) airborne remote sensing with global low-resolution (m, km-scale) airborne and spaceborne imagery reduces classification errors by up to 80% over regional scales. Such technologies can substantially enhance our ability to assess coral reef ecosystem dynamics using NASA EOS data. Through unique international partnerships with the IUCN Global Marine Program, Dr. Sylvia Earle’s Mission Blue, and NASA’s CORAL, HICE-PR, and CoralBASICS projects, we are working directly with target recipient communities on NeMO-Net to train the largest aquatic neural network and produce a technology development with real-world scientific and policy impact.
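To make the data-fusion idea concrete, the sketch below shows a minimal two-branch convolutional network that combines a high-resolution FluidCam-like patch with a co-located low-resolution satellite patch to predict benthic cover classes. It is an illustrative assumption, not the NeMO-Net architecture; the patch sizes, band counts, and number of classes are made up.

```python
# Hedged illustrative sketch (not NeMO-Net code): a two-branch CNN that fuses
# a high-resolution patch with a co-located low-resolution patch.
import torch
import torch.nn as nn

class TwoBranchFusionCNN(nn.Module):
    def __init__(self, n_classes=4):
        super().__init__()
        # Branch for mm/cm-scale imagery (assumed 3-band RGB, 64x64 patch)
        self.hi = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Branch for m/km-scale imagery (assumed 4-band multispectral, 8x8 patch)
        self.lo = nn.Sequential(
            nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Classifier over the concatenated branch features
        self.head = nn.Linear(32 + 16, n_classes)

    def forward(self, x_hi, x_lo):
        f_hi = self.hi(x_hi).flatten(1)   # (N, 32)
        f_lo = self.lo(x_lo).flatten(1)   # (N, 16)
        return self.head(torch.cat([f_hi, f_lo], dim=1))

# Example forward pass on random tensors standing in for co-registered patches
model = TwoBranchFusionCNN()
logits = model(torch.randn(2, 3, 64, 64), torch.randn(2, 4, 8, 8))
print(logits.shape)  # torch.Size([2, 4])
```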
Our project goals are to: (1) create a fused global dataset of coral reefs from FluidCam, CORAL, and NASA EOS data, (2) train NeMO-Net’s CNN through active learning via an interactive app and global partners, (3) develop the NeMO-Net CNN architecture, (4) perform global coral reef assessment using NeMO-Net and determine the spatial distribution, percent living cover, and morphology breakdown of corals at present and over the past decade at meter spatial scales and weekly intervals, (5) evaluate NeMO-Net CNN error and robustness against existing unfused methods and (6) deploy NeMO-Net as a NASA NEX and QGIS open-source module for use in the community.
NeMO-Net is relevant to AIST Data-Centric Technologies in data fusion and data mining, as well as special subtopics subsection 3.2.1 (a-d), (f-i), through autonomous integration of data from sensors of various observational capacities to form data products across a wide range of spatial and temporal domains. NeMO-Net has broad applicability to data fusion for automated assessment of both terrestrial and aquatic ecosystems. The period of performance for this project spans 2 years and leverages significant previous ESTO investments in our technology, for an entry TRL of 2 and an exit TRL of 4.
Martyn Clark, National Center for Atmospheric Research
Climate Risks in the Water Sector: Advancing the Readiness of Emerging Technologies in Climate Downscaling and Hydrologic Modeling
Objectives and benefits: Many water resources planning decisions require understanding the vulnerability of hydrologic systems to a wide range of different stresses. For many users, this requires developing a set of discrete quantitative hydrologic storylines of climate change impacts that can be used to evaluate adaptation measures. Quantitative hydrologic storylines rely on modern climate downscaling tools and process-based hydrologic models. Each storyline represents key features from the full range of possible climate scenarios, and, taken together, the storylines provide a comprehensive yet concise description of possible climate change impacts.
The research community has made substantial scientific advances in understanding impacts of climate variability and change on water resource systems; however, the technologies in climate downscaling and hydrologic modeling have considerable unrealized potential and lack sufficient technical readiness to be used widely for water resources planning.
The goal of this proposed project is to increase the value of emerging science advances in climate downscaling and hydrologic modeling for water resources planning.
This proposed work will provide tools and data resources for both researchers and practitioners to better manage current climate risk, reveal future climate change risks, and more effectively evaluate future change and adaptation options.
Outline of the proposed work and methodology: We propose to increase the readiness of emerging technologies in climate downscaling and hydrologic modeling by extending the NASA Land Information System (NASA-LIS) to evaluate climate-related risks in the water sector. The specific work elements in the proposal are:
1. Advance climate downscaling tools to provide climate change scenarios for input to NASA-LIS, including capabilities to explore sensitivities to downscaling methodological choices;
2. Develop watershed-based hydrologic model configurations, and implement them in NASA-LIS, to define land model configurations for water-resource planning;
3. Refine hydrologic models to improve the fidelity of hydrologic model simulations, using a suite of remotely sensed data for diagnostic assessment and improvement as well as bias correction of simulated streamflow time series;
4. Tailor model outputs to increase applicability of hydrologic climate change scenarios for water resources planning, using interactive web-based tools and summary products; and
5. Apply advanced concepts of information theory and machine learning to identify process-level tradeoffs between modeling options, and guide priorities for future research investments.
We will accomplish the work elements in this proposal by extending and applying new community models and methods for climate downscaling and hydrologic modeling developed jointly by the Computational Hydrology groups at the National Center for Atmospheric Research and the University of Washington (see https://ral.ucar.edu/hap/computational-hydrology).
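As a purely illustrative example of the kind of post-processing named in work element 3, the sketch below applies empirical quantile mapping, one common way to bias-correct simulated streamflow against observations. The flow values are synthetic stand-ins; this is an assumed method for illustration, not the project's chosen algorithm.

```python
# Illustrative sketch: empirical quantile mapping for streamflow bias correction.
import numpy as np

def quantile_map(sim_hist, obs_hist, sim_future):
    """Map simulated values onto the observed distribution via empirical CDFs."""
    quantiles = np.linspace(0.0, 1.0, 101)
    sim_q = np.quantile(sim_hist, quantiles)
    obs_q = np.quantile(obs_hist, quantiles)
    # Find each value's quantile in the historical simulation, then look up
    # the corresponding observed value.
    p = np.interp(sim_future, sim_q, quantiles)
    return np.interp(p, quantiles, obs_q)

rng = np.random.default_rng(0)
obs = rng.gamma(2.0, 50.0, size=3650)          # observed daily flow (m^3/s)
sim = 0.7 * rng.gamma(2.0, 50.0, size=3650)    # biased simulated flow
corrected = quantile_map(sim, obs, sim)
print(round(sim.mean(), 1), round(corrected.mean(), 1), round(obs.mean(), 1))
```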
Helen Conover, University of Alabama in Huntsville
VISAGE: Visualization for Integrated Satellite, Airborne, and Ground-Based Data Exploration
A key component of NASA’s Earth observation system is its field experiments, for intensive observation of particular phenomena such as hurricanes, or for ground validation of satellite observations. These experiments collect datasets from a wide variety of airborne and ground-based instruments, on different spatial and temporal scales, often in unique formats. The field data are often used with high volume satellite observations that have very different spatial and temporal coverage. The challenges inherent in working with such diverse datasets make it difficult for scientists to rapidly collect and analyze the data for physical process studies and validation of satellite algorithms, such as precipitation estimation.
The VISAGE (Visualization for Integrated Satellite, Airborne, and Ground-based data Exploration) project will address these issues by combining and extending nascent efforts to provide a data-centric technology with on-line data fusion, exploration, analysis and delivery capabilities. A key building block is the web-based Field Campaign Explorer (FCX), due for beta release at NASA’s Global Hydrology Resource Center Distributed Active Archive Center (GHRC DAAC) in mid-2017. FCX allows users to examine data collected during field campaigns and simplifies targeted and process-specific data acquisition for event-based research. VISAGE will extend FCX’s capabilities beyond interactive exploration of coincident datasets, to provide integrated visualization, interrogation of data values, and basic analyses such as ratios and differences between data fields.
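A minimal sketch of the kind of basic on-line analysis described above (ratios and differences between data fields) follows; it is not VISAGE code, and the field names, grids, and values are made-up stand-ins. The key step is putting two fields with different grids on a common grid before comparing them.

```python
# Illustrative sketch: regrid one field onto another, then difference and ratio.
import numpy as np
import xarray as xr

# Coarse satellite precipitation estimate (0.5 deg grid)
sat = xr.DataArray(
    np.random.rand(21, 41) * 10.0,
    coords={"lat": np.linspace(30, 40, 21), "lon": np.linspace(-100, -80, 41)},
    dims=("lat", "lon"), name="sat_precip_mm",
)
# Finer ground-radar estimate (0.1 deg grid)
radar = xr.DataArray(
    np.random.rand(100, 200) * 10.0,
    coords={"lat": np.linspace(30.05, 39.95, 100),
            "lon": np.linspace(-99.95, -80.05, 200)},
    dims=("lat", "lon"), name="radar_precip_mm",
)

# Interpolate the satellite field onto the radar grid, then compare
sat_on_radar = sat.interp_like(radar)
difference = radar - sat_on_radar
ratio = radar / sat_on_radar.where(sat_on_radar > 0)
print(float(difference.mean()), float(ratio.median()))
```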
Another key aspect of the VISAGE project will be incorporation of new, higher level fused and aggregated analysis products into the system from the System for Integrating Multi-platform data to Build the Atmospheric column (SIMBA), which combines satellite and ground-based observations into a common gridded atmospheric column data product; and the Validation Network (VN), which compiles a nationwide database of coincident ground- and satellite-based radar measurements of precipitation for larger scale scientific analysis. The VISAGE proof-of-concept will target Global Precipitation Measurement Ground Validation and precipitation analysis use cases, though reliance on standards will support broader use. “Golden cases” from GPM GV campaigns will be selected, and both field observation datasets and higher level products most relevant to these cases will be identified.
Taken together, these components will support the display of multi-sensor observation visualizations, allowing data exploration in a single framework without worries of coordinate system inconsistencies or the complexities of reading multiple data formats.
The multi-discipline research team includes developers of FCX, SIMBA, and VN, and representatives of the GHRC DAAC, which hosts the current FCX prototype and is the potential home for the proposed FCX enhancements as well as selected SIMBA and VN data products. The target user community is Earth scientists who require diverse measurements to help them address science research and analysis questions, as well as those who use numerical weather prediction models to conduct Earth Science investigations. The proposed capabilities have been identified as a need by VISAGE investigators from NASA’s Precipitation Measurement Mission (PMM) Science Team, which has been tasked with using GPM data to better understand Earth’s water cycle, weather and climate.
Ultimately, technology developed for VISAGE will make Earth Science research more efficient. Its visualization and interrogation of diverse datasets will facilitate selection of weather events or features for study and assist with both qualitative and quantitative analysis of the measurements. VISAGE will assist researchers in quickly formulating and justifying science questions, reducing the amount of time spent on proposal development and speeding publication of results.
Dara Entekhabi, Massachusetts Institute of Technology
Autonomous Moisture Continuum Sensing Network
We propose new technology concepts and advancements to aid in our understanding of the effects of climate change on biodiversity. This proposal is targeted toward the current AIST “Operational Technologies” core topics, with a focus on the “Climate Change and Biodiversity” special subtopics. Specifically, this proposal seeks to (1) develop wireless in situ observation networks and associated technologies to monitor the vertical flow and distribution of water along the soil, vegetation, and atmosphere continuum, and (2) develop autonomous and event-driven network decision-making strategies based on ecohydrological understanding. Outcomes of this proposal will help interlink plant-level and field-level hydrological processes. Furthermore, immediate application to and support of ground calibration and validation (Cal/Val) activities of current and planned NASA Earth remote sensing missions is expected. Key building blocks developed under prior AIST support exist at high Technology Readiness Levels (TRLs 6-7); however, with the inclusion of the newly proposed concepts, entry TRLs are 2-3, with an expected exit TRL of 4-5 after two years (starting September 2017 and ending September 2019).
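The sketch below gives one hedged illustration of what an event-driven network decision strategy could look like: a node raises its soil-moisture sampling rate when rain is detected and relaxes back to conserve power. All thresholds and intervals are assumptions for illustration, not the proposed system's logic.

```python
# Illustrative event-driven sampling logic (assumed thresholds and intervals).
BASE_INTERVAL_MIN = 60      # routine sampling every hour
EVENT_INTERVAL_MIN = 5      # dense sampling during/after rain
RAIN_THRESHOLD_MM_HR = 1.0
RELAX_AFTER_DRY_STEPS = 6   # consecutive dry readings before relaxing

def next_interval(rain_rate_mm_hr, dry_streak, current_interval):
    """Return (sampling interval, updated dry streak) for one decision step."""
    if rain_rate_mm_hr >= RAIN_THRESHOLD_MM_HR:
        return EVENT_INTERVAL_MIN, 0
    dry_streak += 1
    if dry_streak >= RELAX_AFTER_DRY_STEPS:
        return BASE_INTERVAL_MIN, dry_streak
    return current_interval, dry_streak

# Walk through a short synthetic rain record
interval, dry = BASE_INTERVAL_MIN, 0
for rain in [0.0, 0.2, 3.5, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]:
    interval, dry = next_interval(rain, dry, interval)
    print(f"rain={rain:4.1f} mm/h -> sample every {interval} min")
```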
Matthew French, University of Southern California
SpaceCubeX: On-board processing for Distributed Measurement and Multi-Satellite Missions
This proposal addresses NASA’s Earth Science missions and its underlying needs for high performance, scalable on-board processing. The decadal survey missions stressed higher resolution instruments and persistent measurements, which drove computing needs up by 100-1,000x. Our AIST-14 SpaceCubeX effort developed a framework which supported rapid trade space exploration of on-board heterogeneous computing solutions (Multi-core CPUs coupled with DSP or FPGA co-processors) across a benchmark suite of Earth Science applications, achieving 20-20,000x performance improvements. Today, measurement capabilities emerging from the upcoming 2017 decadal survey such as distributed sensing, data continuity, multi-satellite constellations, and intelligent sensor control will require an additional 10-100x increase in onboard computing performance.
For this renewal effort, we propose to extend the SpaceCubeX on-board compute analysis framework and develop a hardware prototype to support the evaluation of these new measurement criteria. Distributed sensing missions require application portability across platform types; however, platform constraints lead UAVs and satellites to utilize different processor types (GPUs vs. FPGAs). A common framework which supports both FPGAs and GPUs will facilitate migration between these platform types. Multi-satellite missions enable diurnal and multi-angle measurements and invoke complex communication and control logic which must be processed on board. Intelligent sensor control capabilities present complex, ad hoc processing which requires experimentation on prototype hardware. The SpaceCubeX project addresses these challenges by extending the evolvable testbed to include GPU support, modeling distributed sensors and high-bandwidth communication links, developing prototype hardware, and demonstrating the technology. SpaceCubeX provides the following benefits:
-Accessible, rapid prototyping of next-generation satellite and multi-satellite constellation capabilities by creating virtual satellites in the cloud.
-A prototype heterogeneous on-board computer for experimentation with the advanced autonomy and control capabilities required by intelligent instrument control and constellation management.
-Accelerated migration of missions from UAV and airborne platforms to satellites to support distributed sensing.
-An accurate, scalable approach to assessing multi-satellite mission performance.
-Detailed analysis and initial run-time implementation of FluidCam Structure from Motion, MiDAR, Diurnal Measurement, and Multi-Angle Measurement applications.
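As a toy illustration of the trade-space exploration idea described above, the sketch below scores candidate on-board processor configurations against a benchmark suite under a power constraint. The processor names, throughput numbers, power figures, and workload weights are all assumptions, not SpaceCubeX data.

```python
# Toy sketch: rank candidate on-board processors by weighted throughput
# subject to a platform power budget.
processors = {          # name: (relative throughput, power in W), assumed values
    "multicore_cpu": (1.0, 10.0),
    "cpu_plus_fpga": (8.0, 18.0),
    "cpu_plus_gpu":  (12.0, 25.0),
}
benchmarks = {           # application: workload weight, assumed values
    "sar_focusing": 3.0,
    "structure_from_motion": 2.0,
    "onboard_classification": 1.0,
}
POWER_BUDGET_W = 20.0

def score(proc):
    throughput, power = processors[proc]
    if power > POWER_BUDGET_W:
        return None                       # violates the platform constraint
    return sum(w * throughput for w in benchmarks.values())

ranked = sorted(
    ((score(p), p) for p in processors if score(p) is not None), reverse=True
)
for s, p in ranked:
    print(f"{p:15s} weighted throughput = {s:.1f}")
```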
This project utilizes advanced instruments (FluidCam and MiDAR) currently operational on UAV platforms to drive distributed sensing and intelligent sensor research, and upcoming sensing concepts (Diurnal Measurements and Multi-Angle Measurements) to guide multi-satellite constellation platform research. SpaceCubeX leverages substantial research investments from NASA, DARPA, and NRO in revolutionary imaging instruments, space-based computing, and multi-core and FPGA architectures, and focuses them on NASA Earth science missions and applications. The core team has worked together successfully for 14 years. The University of Southern California's Information Sciences Institute (USC/ISI) will oversee the effort, leading the extension of the framework. NASA Goddard Space Flight Center will develop a prototype science mission processor and provide multi-satellite applications. NASA Ames Research Center will provide distributed sensing applications. NASA Jet Propulsion Laboratory will advise on application mapping to heterogeneous processors. The team will work in these areas over a two-year period, developing distributed sensing and multi-satellite framework capabilities in year 1, then using this framework to characterize end-to-end performance of candidate processor architectures and create a hardware prototype in year 2, raising the TRL from 3 to 5 in all areas.
Barton Forman, University of Maryland, College Park
A science and applications driven mission planning tool for next generation remote sensing of snow
The central objective of this proposal is to create a terrestrial snow mission planning tool to help inform experimental design with relevance to terrestrial snow (i.e., snow depth and snow water equivalent), passive and active microwave remote sensing, optical remote sensing, hydrologic modeling, and data assimilation. Accurate estimates of snow are important as snow is the primary freshwater resource for 1+ billion people globally, has been identified as a priority in the most recent decadal survey, and can only be viewed globally (in an operational sense) with the use of satellite-based sensors. Currently, preparatory activities and field campaigns are underway towards the determination of sensors and instruments for the next generation NASA snow mission. The development of a simulation tool that can quantify the utility of mission data for research and applications is directly relevant to these planning efforts.
Leveraging the existing capabilities of the NASA Land Information System (LIS) and the Trade Space Analysis Tool for Constellations (TAT-C), a comprehensive environment for conducting observing system simulation experiments (OSSEs) will be developed to quantify the added value associated with different sensor and orbital configurations as related to snow. In addition, the integrated system will provide insights into the advantages (and disadvantages) of different configurations including simultaneous measurements. The goal of the OSSE is to maximize the utility in experimental design in terms of greatest benefit to global snow characterization.
TAT-C is a pre-phase A tool that can generate multiple mission configurations for Earth imaging and RADAR missions. It also allows quantitative assessment of orbital configurations (e.g., polar versus geostationary), number of sensors (e.g., single sensor versus constellation), and the associated costs with installing space-based instrumentation. The NASA LIS is a comprehensive land surface modeling and data assimilation framework that includes a large collection of land surface models and data assimilation algorithms. Instances of LIS are used in several agencies around the world for monitoring and addressing water management and availability issues. As part of the proposed project, we plan to combine the capabilities of both systems to provide a comprehensive OSSE environment for terrestrial hydrology with a focus on snow. The integrated system will enable a true end-to-end OSSE that can help to quantify the value of observations based on their utility to science research and applications and to guide mission designs and configurations.
Synthetic passive and active microwave brightness temperature (Tb) observations as well as synthetic optical (e.g., LIDAR) observations will be generated to provide information related to snow conditions in the terrestrial environment. This suite of synthetic observations will be assimilated into the NASA LIS to systematically assess the added value (or lack thereof) associated with an observation in space and time. Science and mission planning questions addressed as part of this proposal include, but are not limited to:
1. What observational records are needed (in space and time) to maximize terrestrial snow experimental utility?
2. How might observations be coordinated (in space and time) to maximize utility? How can this coordination help inform experimental design and mission planning?
3. What is the added utility associated with an additional observation?
4. How (and where) are sensitivities reflected in the observations and the different orbital configurations?
5. How can future mission costs be minimized while ensuring science requirements are fulfilled?
This project will leverage a suite of existing NASA data products and models and will therefore add value to the previously incurred costs associated with their development.
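The sketch below gives one minimal, hedged illustration of the OSSE idea described above: a synthetic truth for snow water equivalent (SWE), a noisy observation generated through a toy observation operator, an ensemble update, and a measure of how much the observation reduced the error. The observation operator, error levels, and ensemble statistics are assumptions, not LIS/TAT-C components.

```python
# Toy OSSE sketch: synthetic truth, synthetic observation, ensemble update.
import numpy as np

rng = np.random.default_rng(42)
truth_swe = 120.0                          # mm, synthetic truth
h = lambda swe: 250.0 - 0.5 * swe          # toy brightness-temperature operator (K)

ens = rng.normal(90.0, 25.0, size=50)      # biased prior ensemble of SWE (mm)
obs = h(truth_swe) + rng.normal(0.0, 2.0)  # synthetic observation (K)
obs_var = 2.0 ** 2

# Ensemble Kalman update for a scalar state and scalar observation
y_ens = h(ens)
cov_xy = np.cov(ens, y_ens)[0, 1]
var_y = np.var(y_ens, ddof=1)
gain = cov_xy / (var_y + obs_var)
analysis = ens + gain * (obs + rng.normal(0.0, 2.0, size=ens.size) - y_ens)

print("prior error in ensemble mean:   ", round(abs(ens.mean() - truth_swe), 1), "mm")
print("analysis error in ensemble mean:", round(abs(analysis.mean() - truth_swe), 1), "mm")
```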
Milton Halem, University of Maryland, Baltimore County
Computational Technologies: An Assessment of Hybrid Quantum Annealing Approaches for Inferring and Assimilating Satellite Surface Flux Data into Global Land Surface Models.
The objective of this proposal is to expand the research progress the investigators have made in developing quantum annealing algorithms that directly support science-related NASA Earth science mission products on the current Ames D-Wave 2X, and to port and substantially extend these capabilities to the next-generation D-Wave 2000Q when and where available. In particular, having developed unique hybrid neural net algorithmic capabilities for the OCO-2 mission, we plan to expand our research to a broader class of Earth science mission data products and problems, namely calculating surface fluxes from other satellite data products and fusing these data for land surface model data assimilation. This includes completing the development, testing, and evaluation of ensemble quantum Kalman-filter algorithms applicable to current and planned Earth science missions over the next two years and beyond. We will conduct extensive validation demonstrations of the potential of these D-Wave quantum annealing algorithms at ARC, which we believe will show significant scientific impacts and benefits, potentially more effective than what can be achieved with today's classical computers. It is our, and others', experience that neural network (NN) optimization algorithms lend themselves especially well to quantum annealing architectures. If successful, this mission-enhancing capability would lead to consideration of continued quantum computer technology infusion over the next four years for integration into an operational phase.
As use cases, we initially focus on assessing the D-Wave quantum annealing capability to (i) address the global carbon source and sink budgets over land, (ii) perform image registration for direct estimates of vegetation growth from solar-induced fluorescence employing a multi-satellite triple-collocation NN quantum algorithm, and (iii) conduct data information fusion analysis of satellite and in situ sensor observations with reanalysis model outputs. We will extend our 3-year OCO-2 data collection to 5 years to infer annual variations in quantum-computed global gridded CO2 fluxes, and to the upcoming ISS-based OCO-3 if available in the next two years.
We have successfully fitted a highly complex, turbulent, multivariate, non-linear ARM data set employing a feed-forward and backward-propagation neural net algorithm on a loosely coupled ARC D-Wave 2X system used as a co-processor accelerator, with a remote cluster at UMBC handling the general-purpose computations. The algorithm processed thousands of samples and hundreds of epochs with two hidden layers. Utilizing these algorithms, we have shown that we can infer CO2 fluxes, using historical ARM data for training, with results comparable to those from classical computers. We believe this is the first successful demonstration that the D-Wave can perform feed-forward regressions yielding results comparable to those obtained with classical computers by employing the D-Wave in such a hybrid algorithmic approach.
In this follow-on proposal, we plan to completely couple the hidden layers, forming a Boltzmann machine, as part of the feed-forward algorithm, and will test various methods for recalculating the training weights in the backward propagation, which should produce improved global optimizations. This could prove to be a unique quantum capability for improved global optimization that is not reasonably possible with conventional computers. We have added several additional Earth scientists to substantially broaden the quantum computational science scope of applications. Thus, if awarded, we expect to improve on the current TRL 3/4 quantum computing capabilities to achieve TRL 5/6 by the end of the period of performance, thereby moving quantum annealing computing well on its way toward operational infusion.
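For readers unfamiliar with the problem form an annealer accepts, the sketch below shows, purely as an assumed illustration and not the proposers' D-Wave code, a tiny quadratic unconstrained binary optimization (QUBO) problem of the kind a hybrid workflow would hand to the annealer, solved here by brute force.

```python
# Illustrative QUBO formulation, solved by exhaustive enumeration.
import itertools
import numpy as np

# Q encodes an objective over 4 binary variables: E(x) = x^T Q x
Q = np.array([
    [-1.0,  0.5,  0.0,  0.0],
    [ 0.0, -1.0,  0.5,  0.0],
    [ 0.0,  0.0, -1.0,  0.5],
    [ 0.0,  0.0,  0.0, -1.0],
])

best_x, best_e = None, np.inf
for bits in itertools.product([0, 1], repeat=4):
    x = np.array(bits, dtype=float)
    e = x @ Q @ x
    if e < best_e:
        best_x, best_e = bits, e

print("minimum-energy assignment:", best_x, "energy:", best_e)
```

On an annealer, the same Q matrix would define the hardware's energy landscape and the sampler would return low-energy assignments directly.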
Jonathan Hobbs, Jet Propulsion Laboratory
Simulation-Based Uncertainty Quantification for Atmospheric Remote Sensing Retrievals
The project will develop a data-centric technology consisting of statistical methods and analysis software to facilitate uncertainty quantification (UQ) for atmospheric remote sensing products, specifically Level 2 data products produced by operational retrieval algorithms. In this very common retrieval setting, an instrument (OCO-2, AIRS, and upcoming missions like HyspIRI and OCO-3) observes a radiance spectrum characterizing the atmospheric composition, and the retrieval algorithm converts the radiance into a quantity of interest, such as temperature, water vapor, or CO2 concentration. The framework we propose relies on Monte Carlo simulation by assembling an ensemble of true atmospheric states, generating synthetic radiances from an appropriate forward model, and performing an operational retrieval. Mission algorithm teams can typically generate these datasets, and our tools will allow the investigation of the retrieval error distribution. In addition, the full collection of true states, radiances, and retrieved states can be summarized in this framework.
The UQ tools will include the capability to summarize the correlation in retrieval errors for different components of the state vector. This correlation structure is particularly relevant for applications that use Level 2 for further inference, such as flux estimation in carbon cycle science. This capability relies on characterizing state vector ensembles with heterogeneous constituents. For example, a particular use case might require a state that includes a combination of temperature and humidity profiles, along with surface properties and cloud information.
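A schematic Monte Carlo sketch of this workflow follows; the two-component state, linear forward model, and least-squares "retrieval" are toy assumptions standing in for the full-physics forward model and operational retrieval, but the loop structure (sample true states, simulate radiances, retrieve, summarize joint errors) mirrors the framework described above.

```python
# Schematic Monte Carlo UQ sketch with toy models.
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# Toy 2-component state: [CO2 column (ppm), surface pressure (hPa)]
true_states = np.column_stack([
    rng.normal(400.0, 2.0, n),
    rng.normal(1000.0, 5.0, n),
])

A = np.array([[0.8, 0.01], [0.1, 0.05], [0.3, 0.02]])  # toy linear forward model

def forward(state):
    """Map state to a 3-channel 'radiance'."""
    return state @ A.T

def retrieve(radiance):
    """Toy retrieval: least-squares inversion of the same linear model."""
    return np.linalg.lstsq(A, radiance.T, rcond=None)[0].T

radiances = forward(true_states) + rng.normal(0.0, 0.05, (n, 3))
retrieved = retrieve(radiances)
errors = retrieved - true_states

print("error means:    ", errors.mean(axis=0).round(3))
print("error std devs: ", errors.std(axis=0).round(3))
print("error correlation:\n", np.corrcoef(errors.T).round(3))
```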
The team will build on experience with simulation-based UQ applied to individual retrievals for the optimal estimation retrieval used by OCO-2. The proposed tools can be infused into a variety of retrieval systems, as long as an appropriate model for generating radiances given true atmospheric states is available. We will implement the methodology with the AIRS Level 2 algorithm as a use case. The UQ framework provides valuable information about sources of uncertainty, such as cloud clearing, that are unique to this retrieval and hard to grapple with using conventional technologies. Further, the proposed framework can be used as a data-centric technology for contrasting uncertainties from different retrievals that are estimating the state of the same true atmosphere.
Walter Jetz, Yale University
Software Workflows and Tools for Integrating Remote Sensing and Organismal Occurrence Data Streams to Assess and Monitor Biodiversity Change
Remote sensing combined with rapidly growing types and amounts of in situ spatiotemporal biodiversity data now enable an unrivaled opportunity for planetary scale monitoring of biodiversity change. However, i) the breadth of remote-sensing data streams of different spectral and spatiotemporal nature, ii) the heterogeneity of spatial biodiversity data types, including individual movement GPS tracks, survey- or sensor- based inventories, and vast citizen science observations, and iii) the spatial and temporal scale-dependence of biodiversity change and its detection, all necessitate versatile technology capable of complex data fusion. This need is further exacerbated by ongoing growth of data that requires highly scalable visualization and analysis solutions. No general solution currently exists. The objective of the proposed work is to fill this gap with dedicated open-source software workflows and tools to the benefit of both remote sensing and biodiversity change communities.
The proposed work will build on earlier developments of global remote-sensing supported climate and environmental layers for biodiversity assessment and develop a general workflow that allows the environmental annotation, visualization, and change assessment for past and future spatial biodiversity occurrence data. Observed, in-situ biodiversity has intrinsic spatiotemporal grain and associated uncertainty based on observation methodology and data collection. Likewise, remotely sensed environmental data also vary in spatiotemporal grain from meters to kilometers. This project will develop technical infrastructure and software workflows to easily develop and serve appropriate summaries of environmental data for biodiversity observations. For example, a list of migrating birds observed from one location one afternoon would require a different summary of environmental data compared to a list of vascular plants known to exist in a 100km2 protected area. We will develop algorithms that automate appropriate summaries of relevant environmental data to characterize the spatiotemporal environmental context of the in-situ biodiversity observation. Furthermore, this system will draw from near-real time collection of RS and RS-derived environmental data (such as land surface temperature, precipitation, and vegetation indices) to enable both historical and near real-time annotation of continuously updating biodiversity data streams. Our scalable system will be capable of fusing large spatial biodiversity data available through Map of Life (https://mol.org), including data assets from GBIF (http://www.gbif.org, >700M records), public Movebank GPS tracking records (https://www.movebank.org, ca. 20M records), and other incidental and inventory datasets (ca. 100M records). The generalized software workflows and tools will enable characterization and comparison of environmental associations of individuals, populations or species over time, globally. This will allow the quantification of observed environmental niches as well as the detection of change through time in both environmental associations and geographic distributions of biodiversity.
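A small sketch of the annotation idea follows; it is not Map of Life code, and the gridded variable, spatial grid, and occurrence records are synthetic stand-ins. It simply samples a gridded environmental field at the location and time of each biodiversity record, the simplest form of the spatiotemporally aware summaries described above.

```python
# Illustrative sketch: annotate occurrence records with a gridded RS variable.
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic monthly land-surface-temperature grid (time, lat, lon)
lst = xr.DataArray(
    20 + 5 * np.random.rand(12, 60, 120),
    coords={
        "time": pd.date_range("2016-01-01", periods=12, freq="MS"),
        "lat": np.linspace(-29.5, 29.5, 60),
        "lon": np.linspace(-59.5, 59.5, 120),
    },
    dims=("time", "lat", "lon"), name="lst_celsius",
)

# Occurrence records: species, where, and when they were observed (made up)
records = pd.DataFrame({
    "species": ["A. fictus", "B. exemplar", "A. fictus"],
    "lat": [5.2, -12.7, 20.1],
    "lon": [30.3, -45.0, 10.8],
    "date": pd.to_datetime(["2016-03-14", "2016-07-02", "2016-11-20"]),
})

# Annotate each record with the LST value nearest in space and time
records["lst_celsius"] = [
    float(lst.sel(time=row.date, lat=row.lat, lon=row.lon, method="nearest"))
    for row in records.itertuples()
]
print(records)
```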
Our proposal addresses the ROSES A.41 solicitation’s goal of dramatically “improving the ease with which the biology and ecology communities can understand, select and use appropriately NASA remote sensing data.” The planned workflows will provide “automated analytic techniques to scale the use of all relevant observational data in the understanding of patterns and processes in biodiversity” as well as “tools which aid the researcher in formulating and evaluating hypotheses quickly.” The proposed work will address the biodiversity aspect of the overarching science goal of the NASA CC&E focus area to “detect and predict changes in Earth’s ecosystems and biogeochemical cycles, including land cover, biological diversity, and the global carbon cycle.” The planned entry TRL for this project is 2-3, and the exit target is TRL 5-6.
Joel Johnson, The Ohio State University
Enabling Multi-Platform Mission Planning and Operations Simulation Environments for Adaptive Remote Sensors
We propose to develop a flexible and modular open source library for designing multi-platform missions with adaptive sensors operating under resource constraints, in order to predict the science performance achieved and to refine adaptive methods in the design process. This work represents an important step in promoting the use of adaptive sensors in Earth observing missions.
The growing potential of sensors capable of real-time adaptation of their operational parameters calls for a new class of mission planning and simulation tools. Existing simulation tools used in performing observing system simulation experiments (OSSEs) assume a fixed set of sensor parameters in terms of observation geometry, frequencies used, resolution, or observation time, which allows simplifications to be made in the simulation process and allows sensor observation errors to be characterized a priori. Adaptive sensors may vary all of these parameters depending on the scene observed, so sensor performance is not simple to model without conducting OSSE simulations that include sensor adaptation in response to stochastic variations in the scenes observed. The management of power and data volume resources on small satellite platforms, as well as methods to allow collaborative sensing among sensors on multiple platforms, are also pressing needs for inclusion in mission simulation tools.
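The toy sketch below illustrates, under assumed modes, costs, and thresholds rather than anything from the proposed library, why adaptive sensors complicate OSSEs: the sensor changes mode scene by scene under a power budget, so its error characteristics cannot be fixed a priori.

```python
# Toy adaptive-sensor simulation loop under a power budget.
import random

MODES = {                    # mode: (power draw per scene, observation error), assumed
    "low_res":  (1.0, 2.0),
    "high_res": (3.0, 0.5),
}
POWER_BUDGET = 20.0

random.seed(0)
scenes = [random.random() for _ in range(15)]   # "interestingness" of each scene
power_used, log = 0.0, []

for i, interest in enumerate(scenes):
    remaining = POWER_BUDGET - power_used
    if remaining < MODES["low_res"][0]:
        log.append((i, "skipped", None))        # budget exhausted
        continue
    # Adapt: spend high-resolution observations only on interesting scenes,
    # and only while the power budget allows it.
    mode = ("high_res"
            if interest > 0.7 and remaining >= MODES["high_res"][0]
            else "low_res")
    power_used += MODES[mode][0]
    log.append((i, mode, MODES[mode][1]))

for i, mode, err in log:
    print(f"scene {i:2d}: {mode:8s} error={err}")
print(f"total power used: {power_used} / {POWER_BUDGET}")
```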
The library will be developed based on the past experience of the project team with the development of OSSE and mission planning tools for multiple projects. These include the end-to-end-simulator developed for the eight satellite constellation of the CYGNSS program, a simulation tool and onboard processor developed to optimize mission operations for the CubeSat Radiometer Radio Frequency Interference Technology validation experiment, and past experience in the development of a fully adaptive radar system and associated modeling and simulation environment. The library will progress from its current TRL 2 to a TRL 4 exit status through a two year project focused on library development and testing in year one and extensive demonstration through three case studies in year two. The open source modular library will be designed to facilitate its incorporation into a variety of OSSE tools for future missions, particularly those currently under consideration by the Earth Science Decadal Survey.
Branko Kosovic, National Center for Atmospheric Research
Estimations of Fuel Moisture Content for Improved Wildland Fire Spread Prediction
Decision support systems for wildland fire behavior are essential for effective and efficient wildland fire risk assessment and firefighting. Together with the Center of Excellence for Advanced Technology Aerial Firefighting in Rifle, Colorado, we are developing a wildland fire prediction system for the State of Colorado. The mission of the Center of Excellence is: “To protect citizens, land, and resources in Colorado, the Center of Excellence will research, test, and evaluate existing and new technologies that support sustainable, effective, and efficient aerial firefighting techniques.” The wildland fire prediction system is based on the National Center for Atmospheric Research’s Coupled Atmosphere Wildland Fire Environment (CAWFE) model and the Weather Research and Forecasting - Fire (WRF-Fire) model. WRF-Fire is an extension of the widely used community numerical weather prediction model WRF.
In addition to atmospheric conditions and fuel type, fuel moisture content (FMC) is a critical factor controlling the rate of spread and heat release from wildland fires. Previous studies have shown that the intensity and frequency of occurrence of large fires are more highly correlated to reduced vegetation moisture than increased air or fuel temperature. Accurate information about FMC is therefore essential for more accurate wildland fire spread prediction.
Currently, the coupled operational wildland fire prediction system is at Technology Readiness Level (TRL) 4. The coupled atmosphere-wildland fire spread model has been implemented in the prototype operational system, and the available relevant data sets are assimilated in the operational forecasting process. The system is undergoing real-time testing at the scale of the State of Colorado, as required. The fuel data component of the operational system is at TRL 2. A simple fuel model (Anderson 1982) has been implemented; this model does not allow for the observed variability in FMC. A more advanced fuel model (i.e., Scott and Burgan 2005) is being implemented. However, a dynamic, gridded FMC data set that can be assimilated in real time into the operational system does not exist.
Presently, the National Fuel Moisture Database (available via the Wildland Fire Assessment System) provides continuously updated information about FMC based on surface observations from Remote Automated Weather Stations (RAWS). To provide FMC data at any location in the CONUS, measurements are interpolated using inverse-distance-squared interpolation. However, considering the sparse spatial distribution of RAWS, any interpolation method can result in large errors in the spatial FMC distribution. We therefore propose to use satellite remote sensing observations to develop a more accurate gridded, dynamic, real-time FMC database product for use with the dynamic Scott and Burgan (2005) fire behavior fuel models. By combining vegetation index products from the polar-orbiting MODIS instruments, we will develop a high temporal and spatial resolution FMC product. The vegetation indices will be combined using machine learning algorithms and calibrated using surface RAWS observations to produce a best estimate of the dead and live fuel moisture content. The dead and live FMC will be combined to derive the total fuel moisture content used in the widely applied Rothermel (1972) fire spread model. The use of the new gridded FMC database will be demonstrated in our WRF-Fire coupled atmosphere-wildland fire prediction model.
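As a schematic illustration of the calibration step described above, the sketch below fits a machine-learning regression of fuel moisture content to MODIS-like vegetation indices using RAWS-style surface observations, then predicts FMC for new pixels. The predictors, relationships, and data are synthetic assumptions, not the proposed product.

```python
# Illustrative sketch: calibrate an FMC regression against station observations.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
n_stations = 300

# Synthetic predictors at RAWS locations: NDVI, EVI, land surface temperature
ndvi = rng.uniform(0.1, 0.9, n_stations)
evi = ndvi * rng.uniform(0.8, 1.0, n_stations)
lst = rng.uniform(280, 320, n_stations)
X = np.column_stack([ndvi, evi, lst])

# Synthetic "observed" live FMC (%): greener/cooler vegetation holds more water
fmc_obs = 40 + 120 * ndvi - 0.4 * (lst - 280) + rng.normal(0, 5, n_stations)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, fmc_obs)

# Predict FMC for a small set of MODIS-like pixels
grid_X = np.column_stack([
    rng.uniform(0.1, 0.9, 10), rng.uniform(0.1, 0.9, 10), rng.uniform(280, 320, 10)
])
print(model.predict(grid_X).round(1))
```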
More accurate accounting for live and dead FMC through assimilation of satellite observations will result in a more realistic, dynamic representation of fuel heterogeneity and in improved accuracy of wildland fire spread prediction. The effectiveness of the coupled atmosphere-wildland fire spread prediction model accounting for FMC will be assessed, in collaboration with the Center of Excellence for Advanced Technology Aerial Firefighting, using observations of wildland fires over Colorado.
Jacqueline Le Moigne, Goddard Space Flight Center
Generalizing Distributed Missions Design Using the Trade-Space Analysis Tool for Constellations (TAT-C) and Machine Learning (ML)
A large amount of Earth Science data must be sustained or augmented for scientific and operational purposes under constrained budget requirements. Multipoint measurement missions can provide a significant advancement in science return at a manageable cost. Coupled with recent technological advances, this science interest drives a trend toward distributed architectures for future NASA missions instead of the traditional monolithic ones. As a general definition, Distributed Spacecraft Missions (DSMs) leverage multiple spacecraft to achieve one or more common goals. In particular, a constellation is the most general form of DSM with two or more spacecraft placed into specific orbit(s) for the purpose of serving a common objective.
DSMs are gaining momentum in all science domains and in Earth Science they enable new measurements and simultaneous observation sampling increases in spatial, spectral, temporal and angular dimensions. Additionally, DSMs are expected to increase mission flexibility, scalability, evolvability and robustness, to facilitate data continuity, and to minimize cost risks associated with launch and operations, thus responding to both data needs and budget constraints. However, distributed architectures also carry a risk of being “robust-yet-fragile,” a paradoxical behavior where poorly-understood interdependencies lead to unexpected failures. Considering both the upside potential and downside risk of DSMs requires careful evaluation of operations in a simulated environment. Furthermore, a DSM architectural trade-space includes both monolithic and distributed design variables subject to combinatorial factors. As a result, DSM optimization is a large and complex problem with multiple conflicting objectives.
Our proposed solution to these challenges develops an open-access tool which will be available to the scientific community for pre-Phase A constellation mission analysis. Over the last two years, our team has developed the prototype Trade-space Analysis Tool for Constellations (TAT-C). By enumerating and evaluating alternative mission architectures, TAT-C minimizes cost and maximizes performance for pre-defined science goals and helps to quantify and evaluate specific DSM challenges such as data calibration. TAT-C is suitable for missions ranging from smallsats to flagships and is based on existing modeling and analysis capabilities developed at NASA Goddard. TAT-C currently addresses basic capabilities required for pre-Phase A constellation design with a general framework and has already proven valuable in analyzing imaging systems. Its implementation provides improved and integrated capabilities compared to existing solutions; it also enables easy addition of new functionality.
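To make the enumerate-and-evaluate idea concrete, the toy sketch below lists candidate constellation architectures, scores them with simple coverage and cost proxies, and keeps the Pareto-optimal set. All options, models, and numbers are assumptions for illustration; they are not TAT-C's design variables or models.

```python
# Toy constellation trade-space enumeration with a Pareto filter.
from itertools import product

n_sats_options = [1, 2, 4, 8]
altitudes_km = [500, 700]
instruments = {"imager": (1.0, 10.0), "radar": (1.8, 35.0)}  # (coverage factor, unit cost $M)

def evaluate(n_sats, alt, instr):
    cov_factor, unit_cost = instruments[instr]
    coverage = min(1.0, 0.1 * n_sats * cov_factor * (alt / 500.0))  # revisit proxy
    cost = n_sats * unit_cost + 20.0                                # plus fixed mission cost
    return coverage, cost

architectures = [
    (n, a, i, *evaluate(n, a, i))
    for n, a, i in product(n_sats_options, altitudes_km, instruments)
]

def dominates(o, a):
    """o dominates a if it has >= coverage and <= cost, and is strictly better in one."""
    return o[3] >= a[3] and o[4] <= a[4] and (o[3] > a[3] or o[4] < a[4])

pareto = [a for a in architectures if not any(dominates(o, a) for o in architectures)]
for n, a, i, cov, cost in sorted(pareto, key=lambda x: x[4]):
    print(f"{n} x {i} @ {a} km: coverage={cov:.2f}, cost=${cost:.0f}M")
```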
This proposal will extend TAT-C to broader Earth Science interests, support additional trades on instruments, spacecraft sizes, launch choices and onboard processing hardware and computations, and extend cost and risk analysis to reconsider requirements of ground operations and mission replanning. The increased number of design variables coupled with combinatorial factors associated with DSMs demand a new Trade-Space Search Iterator driven by machine learning (ML) techniques and working closely with a fully functional and populated knowledge base to efficiently explore and optimize over a tractable design space. The final TAT-C ML software developed under this proposal will provide the Earth Science community a powerful tool to quickly design novel DSMs or augment existing missions to optimize their science return. Without TAT-C, missions would either have sub-optimal performance and science return, or each DSM design team would need to develop an equivalent to TAT-C to enable their mission optimization. Our proposed project responds to the Earth Science Technology Office's AIST Program Operations Technologies Core Topic by designing a tool that will perform mission design trade studies and will enable new types of observations.
James McDuffie, Jet Propulsion Laboratory
Multi-Instrument Radiative Transfer and Retrieval Framework
The objective of this proposal is to create an extensible multi-instrument atmospheric composition retrieval framework that enables software reuse while also improving science. This framework will be reusable and extensible allowing different instrument teams to use the same code base. It will help reduce the cost and risk of L2 development for new atmospheric Earth science missions. Secondly, the proposed framework will support the data fusion of radiance measurements from multiple instruments through joint retrievals. This would yield data fusion products that serve to advance Earth science research.
The framework will be implemented by extending the existing OCO-2 Full Physics software (Bösch et al. 2015) to handle multiple instruments both jointly and individually. This entails both Level 1 readers and the components modeling instrument characteristics. Structural changes to the existing software will be necessary to take advantage of the new instrument modeling components. These changes enable the framework to handle multiple instruments when performing joint retrievals. The integration of fast thermal infrared (TIR) to ultraviolet (UV) radiative transfer (RT) software will allow modeling the spectra of the wide range of instrument types available for ingestion. When setting up joint retrievals, the selection of the specific measurements, temporally and spatially, will be handled by data matching algorithms. To give more flexibility in how the optimal estimation retrieval methods frame the problem, the framework will incorporate retrieval methodologies developed by the TES science team (Bowman et al. 2006). Assessment of the quality of linear error estimates will come from a generalized approach building upon Monte Carlo uncertainty quantification methods from OCO-2 (Hobbs et al. 2017). Once integration of the aforementioned components is complete, work will focus on configuration of an operational joint retrieval use case. This use case will be tested and validated against recent research (Fu et al. 2016).
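For orientation, the sketch below shows a single Gauss-Newton step of the optimal estimation update that underlies such retrievals, x_hat = x_a + (K^T Se^-1 K + Sa^-1)^-1 K^T Se^-1 (y - F(x_a)), using toy matrices. It is a numerical illustration of the standard formulation, not code from the proposed framework.

```python
# Illustrative optimal estimation (Gauss-Newton) step with toy matrices.
import numpy as np

# Toy linear forward model F(x) = K x with 3 channels and 2 state elements
K = np.array([[0.9, 0.1],
              [0.2, 0.8],
              [0.5, 0.5]])
x_true = np.array([2.0, -1.0])
x_a = np.array([0.0, 0.0])                      # prior mean
S_a = np.diag([4.0, 4.0])                       # prior covariance
S_e = np.diag([0.01, 0.01, 0.01])               # measurement-noise covariance

rng = np.random.default_rng(7)
y = K @ x_true + rng.multivariate_normal(np.zeros(3), S_e)

S_a_inv, S_e_inv = np.linalg.inv(S_a), np.linalg.inv(S_e)
S_hat = np.linalg.inv(K.T @ S_e_inv @ K + S_a_inv)      # posterior covariance
x_hat = x_a + S_hat @ K.T @ S_e_inv @ (y - K @ x_a)     # retrieved state

print("retrieved state:", x_hat.round(3), " true state:", x_true)
print("posterior covariance:\n", S_hat.round(4))
```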
Jeffrey Morisette, USGS Fort Collins Science Center
Advanced Phenological Information System
The ROSES A.41 solicitation highlights the need within the Earth science community to develop a “more complete and more accurate understanding of the temporal and spatial behavior of plant and animal species.” It specifically calls for the analysis of “time series measurements of key environmental parameters with the spatiotemporal distribution of organismal populations, communities, and species for improved understanding of the impact of climate change."
Phenological observations (the study of cyclic and seasonal natural phenomena, especially in relation to climate and plant and animal life) are highly relevant to these needs. There are extensive existing efforts to collect phenologically-relevant information, including a range of data from in situ, to near-surface cameras and flux towers, to airborne data, to polar orbiting and geostationary satellites. There are also significant efforts to provide gridded historical and projected climate data.
However, there is no consolidated effort to compile spatially-associated pheno-climatic information into a data framework that would improve information comparison and reuse, facilitate collaboration within the research community, and increase the speed of production and publication of results related to the needs identified in the solicitation.
Here, our objective is to develop an Advanced Pheno-climatic Information System (APIS). The system will incorporate tools, workflows, and software for bringing phenological and climate data together. APIS will be designed to leverage existing expertise, techniques, and networks as much as possible. The primary benefit of this effort is that it will focus on providing pheno-climatic data in a way that allows the scientific community to test families of hypotheses generated from the seed hypothesis that climate change is influencing the spatial distribution and temporal behavior of key populations of organisms, communities, and species.
Founded on this seed hypothesis, our proposed methodology is to build APIS with an iterative and agile programming approach, in parallel with a series of increasingly complex use cases. These three use cases will both provide insights into constructing the information system and demonstrate its utility. We start with a fairly straightforward use case connecting observations at point locations to gridded phenology data. We then develop a more involved use case exploring the validation of satellite-based phenology products. Finally, we explore a more complex use case pertaining to the phenology of wildland corridors.
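The sketch below gives one hedged illustration of the second use case: validating a satellite-derived start-of-season (green-up) product against ground phenology observations by pairing each site with a co-located pixel and summarizing the agreement. All numbers are synthetic; this is not APIS code.

```python
# Illustrative sketch: compare ground green-up dates with a satellite product.
import numpy as np

rng = np.random.default_rng(5)
n_sites = 25

# Ground observations: day of year (DOY) of first leaf at each site
ground_doy = rng.normal(110, 10, n_sites).round()

# Satellite product: green-up DOY at the co-located pixels, with bias and noise
satellite_doy = ground_doy + 6 + rng.normal(0, 8, n_sites)

bias = np.mean(satellite_doy - ground_doy)
rmse = np.sqrt(np.mean((satellite_doy - ground_doy) ** 2))
corr = np.corrcoef(ground_doy, satellite_doy)[0, 1]

print(f"bias = {bias:.1f} days, RMSE = {rmse:.1f} days, r = {corr:.2f}")
```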
The proposed system will enter at Technology Readiness Level (TRL) 2. The technology concept (i.e., a data-centric, distributed SOA) and its practical application (i.e., analytic workflows to test phenology hypotheses) have been formulated. While multiple service provider APIs have been identified as important, maturity levels vary, and the interoperability and integration of these services into scientific workflows have not yet been demonstrated. We will mature the proposed system to exit at TRL 3. Use cases will drive system functionality, leading to an end-to-end proof-of-concept demonstration. The proof of concept will be used to validate critical properties of the loosely coupled system.
Christopher Neigh, Goddard Space Flight Center
Automated Protocols for Generating Very High-Resolution Commercial Validation Products with NASA HEC Resources
The volume of available remotely sensed data is growing at rates exceeding petabytes per year. Over the past decade the costs of data storage systems and compute power have both dropped exponentially. This has opened the door for “Big Data” processing systems such as the Google Earth Engine, NASA Earth Exchange, and NASA Center for Climate Simulation (NCCS). At the same time, commercial very high-resolution (VHR) satellites have grown into a constellation with global repeat coverage that can support existing NASA Earth observing missions with stereo and super-spectral capabilities. Through agreements with the National Geospatial-Intelligence Agency, NASA-GSFC is acquiring petabytes of sub-meter to 4-meter resolution imagery from around the globe from the WorldView-1/2/3/4, QuickBird-2, GeoEye-1, and IKONOS-2 satellites. Prior to 2008 these data were spatially disparate and were primarily used for evaluation and validation of coarser-resolution data products. Current data collections often include repeat coverage in many large regions with contiguous coverage. These data are a valuable no-direct-cost resource for the enhancement of NASA Earth observation science that is currently underutilized.
We propose to develop automated protocols for generating VHR products to support NASA Earth observing missions. These include two primary foci:
1) On-Demand VHR 1/4° Ortho Mosaics - Systematic VHR HEC processing to orthorectify and co-register multi-temporal 2 m multi-spectral imagery, compiled as user-defined regional mosaics, to provide an easily accessible evaluation dataset for LCLUC program mapping efforts. We will apply a consistent image normalization approach to minimize the effects of topography, view angle, and date and time of day of collection. This work builds on PI Neigh's prior experience (cad4nasas.gsfc.nasa.gov) and the experience of Co-Is Carroll and Montesano in processing VHR data on GSFC's NCCS ADAPT (https://www.nccs.nasa.gov/services/adapt) cluster. We will work with experts in the generation of surface reflectance data to develop a process for normalizing the VHR data, which will yield scientifically valid mosaics that can be used to investigate biodiversity, tree canopy closure, surface water fraction, and cropped area for smallholder agriculture.
2) On-Demand VHR DEM Generation - Systematic VHR HEC processing of available within-track and cross-track stereo VHR imagery to produce VHR digital elevation models with the NASA Ames Stereo Pipeline (https://ti.arc.nasa.gov/tech/asr/intelligent-robotics/ngt/stereo/). We will apply a consistent vertical normalization with ICESat to merge and mosaic DEMs systematically, providing products that can support other NASA missions across a number of different programs. These could potentially include Earth surface studies of the cryosphere (glacier mass balance, flow rates, and snow depth), hydrology (lake/water body levels, landslides, subsidence), and biosphere (forest structure, canopy height/cover), among others.
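A minimal sketch of the vertical-normalization step described above follows; the arrays are synthetic and the method (a robust median offset against co-located ICESat elevations) is an illustrative assumption, not the proposed HEC protocol.

```python
# Illustrative sketch: remove a DEM's vertical bias using ICESat reference points.
import numpy as np

rng = np.random.default_rng(11)

dem = 1500 + 50 * rng.random((200, 200))          # VHR stereo DEM tile (m)
dem += 2.3                                        # pretend the tile carries a vertical bias

# Co-located ICESat footprints: pixel indices and reference elevations
rows = rng.integers(0, 200, 40)
cols = rng.integers(0, 200, 40)
icesat_elev = dem[rows, cols] - 2.3 + rng.normal(0, 0.3, 40)

# Robust bias estimate (median of differences), then correct the tile
bias = np.median(dem[rows, cols] - icesat_elev)
dem_corrected = dem - bias

print(f"estimated vertical bias: {bias:.2f} m")
```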
Successful development of an HEC protocol to process VHR data could help surmount the spatial-temporal limitations previously encountered when these data are used on an individual-PI basis, with broad benefits to many NASA programs.
Victor Pankratius, Massachusetts Institute of Technology
Computer Aided Discovery and Algorithmic Synthesis for Spatio-Temporal Phenomena in InSAR
Objectives and Benefits: The goal of this research is to provide Earth scientists with a computer-aided discovery environment that advances the synthesis of computational pipelines and creates new artificial intelligence (AI) guided capabilities for the discovery of deformation phenomena in Interferometric Synthetic Aperture Radar (InSAR) data. The project will provide new cloud-scalable algorithms, tools, and data fusion capabilities for different instruments to enable complex spatio-temporal inferences on big data sets.
Work & Methodology: Interferometric Synthetic Aperture Radar (InSAR) has become a key technique in analyzing effects such as subsidence, co-seismic offsets after earthquakes, effects of volcano inflation and eruptions, and other natural phenomena related to global hazards and threats to human life. InSAR enables detections of deformations based on imaging with millimeter sensitivity over swath widths of up to 400 km. However, InSAR data processing is currently facing numerous challenges. Data sets are drastically increasing in size for current operations, and new missions like NASA ISRO SAR (NISAR) will increase the temporal density of InSAR images by orders of magnitude. While current NASA systems like ARIA are focusing on storage, retrieval, and generation of interferograms with Web-based techniques, new discovery and prediction capabilities are needed as a layer on top. Computational workflows generating interferograms and higher-level data products are highly complex (e.g., due to corrections for atmosphere, ionosphere, instrumental biases etc.). These workflows require a sophisticated search and adaptation of algorithmic choices and parameters to generate data products that visually amplify interesting phenomena that would lead to new discoveries. Yet another challenge includes data fusion with thousands of GPS sites worldwide, as well as instruments such as MODIS and GRACE, for purposes of eliminating false positives and maximizing phenomena information in temporal and spatial dimensions.
Addressing these challenges, this research will leverage the current NASA InSAR and NASA UAVSAR data efforts, and expand the successful NASA AIST-14 Computer-Aided Discovery project with InSAR capabilities. Our cloud environment will allow scientists to programmatically express hypothesized scenarios, constraints, and model variants (e.g. parameters, choice of algorithms, workflow alternatives), to automatically explore with machine learning the combinatorial search space of possible model applications in parallel on multiple data sets. This project will also investigate new AI-based methods for generating and pruning geophysical models aimed at phenomena characterization, as well as processing pipeline synthesis for InSAR workflows.
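The sketch below illustrates, with entirely hypothetical pipeline choices and a toy scoring model, the combinatorial exploration just described: enumerate variants of an InSAR processing pipeline, evaluate them in parallel, and keep the variant that scores best on a quality metric such as coherence. None of the named options or scores come from the project.

```python
# Illustrative parallel search over hypothetical InSAR-pipeline variants.
from concurrent.futures import ProcessPoolExecutor
from itertools import product

unwrappers = ["snaphu_mcf", "branch_cut"]          # hypothetical algorithmic choices
filter_strengths = [0.2, 0.5, 0.8]
atmos_corrections = ["none", "weather_model"]

def run_variant(config):
    """Stand-in for running a pipeline variant and returning a quality score."""
    unwrapper, strength, atmos = config
    score = 0.6 + 0.2 * strength                   # toy scoring model
    score += 0.1 if atmos == "weather_model" else 0.0
    score += 0.05 if unwrapper == "snaphu_mcf" else 0.0
    return config, score

if __name__ == "__main__":
    configs = list(product(unwrappers, filter_strengths, atmos_corrections))
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(run_variant, configs))
    best_config, best_score = max(results, key=lambda r: r[1])
    print("best pipeline variant:", best_config, "score:", round(best_score, 3))
```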
All capabilities will be demonstrated in the context of concrete case studies. (1) Algorithmic coherence improvement in areas of de-correlation through enhanced phase matching between interferograms. (2) Model exploration and extraction of Episodic Tremor and Slip events in InSAR with improved coherence, applied to the Pacific Northwest and Guerrero, Southern Mexico regions. (3) Exploration of novel algorithmic approaches to directly generate InSAR time series without forming interferograms and using filtering and state estimation rather than differencing.
Significance: Our proposal will advance NASA’s capability for modeling, assessment, and computing of Earth Science data (3.1.2 Computational Technologies) and improve technical means to assess, mitigate, and forecast natural hazards. Computer-aided discovery will enhance the productivity and ability of scientists to process big data from a variety of sources and generate new insight.
Paul Rosen, Jet Propulsion Laboratory
Simplified, Parallelized InSAR Scientific Computing Environment
There is a vast amount of SAR data that is challenging for scientists to use. We propose a variety of technologies in SAR processing that will accelerate the processing and the use of the science products. Specifically, we will:
1) Develop methods of computational acceleration by exploiting back-projection methods on cloud-enabled GPU platforms to directly compute focused imagery in UTM (the Landsat grid). This will deliver SAR data to users as user-ready products, in the form most familiar to them from optical sensors, something that has never been done before and whose absence has been a major obstacle to scientists adopting radar data. Once formed, the data can be accessed on standard GIS platforms. We could greatly reduce the processing complexity for users so they can concentrate on the science, and bring the products seamlessly into the 21st-century tools that are rapidly evolving to handle the developing data explosion (a simplified sketch of the back-projection step appears after this list);
2) Develop python-based framework technologies at the user interface that support a more natural way for scientists to specify products and actions, thereby accelerating their ability to generate science results;
3) Extend the ESTO-funded InSAR Scientific Computing Environment framework to uniformly treat polarimetric and interferometric time-series such as those that will be created by the NISAR mission using serialized product-based workflow techniques.
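The following simplified numpy sketch illustrates the back-projection step referenced in item 1 above and why it maps naturally onto GPUs: each output pixel is an independent coherent sum over pulses, so pixels can be processed in parallel. The geometry, sampling, and variable names are placeholders, not the proposed processor.

```python
# Simplified sketch of time-domain back-projection onto a map grid. Every
# pixel's sum is independent, which is what makes the algorithm a good fit
# for GPUs; the names and geometry here are illustrative only.
import numpy as np

def backproject(pulses, platform_pos, slant_ranges, grid_xyz, wavelength):
    """Coherently accumulate range-compressed pulses onto output pixels.

    pulses       : (n_pulses, n_bins) complex range-compressed echoes
    platform_pos : (n_pulses, 3) antenna positions, same frame as grid_xyz
    slant_ranges : (n_bins,) slant range of each range bin [m], ascending
    grid_xyz     : (n_pixels, 3) output pixel centers (e.g. a UTM grid + heights)
    """
    image = np.zeros(grid_xyz.shape[0], dtype=complex)
    for echo, pos in zip(pulses, platform_pos):
        # Range from this pulse's antenna position to every output pixel.
        r = np.linalg.norm(grid_xyz - pos, axis=1)
        # Nearest range bin (a real processor would interpolate).
        idx = np.clip(np.searchsorted(slant_ranges, r), 0, len(slant_ranges) - 1)
        # Undo the two-way propagation phase and accumulate coherently.
        image += echo[idx] * np.exp(1j * 4 * np.pi * r / wavelength)
    return image
```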
There are several key challenges that need to be addressed in parallel: 1) speed and efficiency in handling very large, multi-terabyte time-series imagery data files, which requires innovations in multi-scale (GPU, node, cluster, cloud) workflow control; 2) framework technologies that can support the varied algorithms these data enable, including SAR focusing, interferometry, polarimetry, interferometric polarimetry, and time-series processing; and 3) framework technologies that can support heterogeneous, multi-sensor data types (point clouds and rasters) in time and space.
NASA’s upcoming radar mission, NISAR, will benefit from this technology after its planned launch in 2021, but first the vast archives of international missions, such as the Sentinel-1 A/B data held at the Alaska Satellite Facility, can be exploited more fully.
Jennifer Swenson, Duke University
Generative Models to Forecast Community Reorganization with Climate Change
We propose to fully integrate key remote sensing variables with continental scale ecological data to provide broadly accessible ecological forecasts to a user community of ecologists and managers. The product will integrate biological data and remote sensing in a form that translates directly to decision-ready products. To determine which species and communities are most vulnerable to climate change and to forecast responses of entire communities, we propose a forecasting framework and software that will be capable of synthesizing National Ecological Observatory Network (NEON), other biodiversity networks, and related physical and biological data with remotely sensed information. Species distribution models (SDMs) used to anticipate community responses to climate change can be unreliable and imprecise—current estimates range from 0 to 50% species loss. SDMs fail to accommodate the joint relationships between species and the different scales of measurement—they cannot coherently synthesize data and thus cannot predict entire communities. We have developed a generative model of community response to climate change that accurately predicts distribution and abundance of each species jointly as well as their organization in communities. Satellite imagery characterizes habitat with temporal, spatial, multispectral detail that goes well beyond that available from interpolated climate data and land cover/use maps. Frequent imagery collection also provides the opportunity to incorporate phenology dynamics across seasons as well as year-to-year changes over the last 17 years. We will test our model for different communities with a range of remotely sensed products at different temporal and spatial scales, including MODIS Leaf Area Index, Vegetation Indices, Continuous Vegetation Fields, Gross Primary Productivity and Land Surface Temperature. Joint analysis will allow substantial improvement in community prediction by synthesizing directly with NEON abundance data such ecosystem properties as canopy density and structure, productivity, surface heat exchange, fragmentation and disturbance. The technical themes of our proposal are data-centric technology (primary) and computational technology (secondary).
As part of this analysis, we will extend the predictive modeling to explore fine scale ecosystem attributes from data currently being acquired for NEON sites by the airborne observation platform (AOP) which include waveform and discrete lidar and hyperspectral information from JPL’s airborne imaging spectrometer. This will allow us to determine how fine scale ecosystem attributes are represented at the regional scale with Landsat and MODIS products.
Application to taxonomically diverse communities monitored in NEON and other networks will allow us to forecast community change and reorganization. Predictive distributions, with full uncertainty, will be updated as new data become available. Predictive sensitivity analysis under climate change will anticipate community reorganization and the habitats that will be most critical for new species assemblages. This two-year project will move the current version of our tool from TRL 3/4 to 6. Our new semi-automated interface will enable revised forecasts as updated climate, remote sensing and biodiversity data become available. The tool will enable instant selection and retrieval of key remotely sensed products (e.g. MODIS products at the LP-DAAC) via web application programming interface (API), selection of biodiversity data via API (e.g. from NEON) or upload, parameterization of the model, and display and download of model output. NASA interests and goals will be advanced by linking the broad NEON data network to the rich time series of regional NASA products to improve predictions and understanding of spatial-temporal distributions of ecological communities.
Philip Townsend, University of Wisconsin-Madison
Spectral Data Discovery, Access and Analysis through EcoSIS Toolkits
Over the last decade, the application of hyperspectral data to estimate foliar traits in vegetation has exploded in use, including applications in agriculture and phenotyping, ecosystem functioning, and biodiversity. There is an urgent need for a common set of software tools and an open-source repository for those tools and for spectral models derived from spectral data. Our effort focuses on developing the software, database/accessibility tools, and web front- and back-ends needed to ensure open-source usage of spectral data among a large user community, including: data repositories, data processing tools, and spectral model distribution. Building on our previous EcoSIS.org effort, we will greatly expand the functionality of open-source tools for spectral data analysis, thus reducing barriers to entry for new researchers wanting to make use of field and/or imaging spectroscopy. This proposal is being submitted under the Data-Centric Technologies core topic area. Our primary target audiences are users wishing to scale from ground measurements to imagery or simply to use existing published algorithms to predict foliar traits from new ground reflectance or imaging spectroscopy data.
EcoSIS.org is an easy-to-use online database that we developed with NASA support for storing, documenting, and distributing vegetation-themed spectroscopic datasets. The EcoSIS data portal makes contribution of rigorously attributed datasets intuitive and uncomplicated. Ancillary data, such as chemical and physiological traits, as well as spectroscopy metadata, are easily added to make datasets discoverable across the internet, facilitate synergistic studies, and provide data to inform remote sensing research. Datasets published via EcoSIS are eligible to receive a DOI, providing persistent access by the user community as required by peer-reviewed journals and funding agencies. We propose to expand participation in EcoSIS by creating a suite of complementary open-source tools — the EcoSIS Toolkit — that make processing and preparation of spectral data straightforward, further removing the potential barriers to entry for those whose research would greatly benefit from the inclusion of spectroscopy datasets and models.
Throughout the development process for all tools we will employ Agile practices to iteratively add and test new features. All software developed through this project will use only open-source technologies and will be licensed under the Apache License 2.0 (http://www.apache.org/licenses/LICENSE-2.0). We will develop: 1) the Ecological Spectral Model Library - EcoSML.org — an online repository that distributes model parameters, example code, and other supporting resources related to spectra-derived models used to predict sample traits from spectra; 2) EcoSIS SDK library packages for popular scientific open-source languages to interface with EcoSIS; 3) the Spectroscopy Data Abstraction Library (SpecDAL, https://specdal.github.io/), an open-source Python library inspired by GDAL (http://www.gdal.org/) that will provide functions and classes to work with data files from industry-standard portable field spectrometers as well as custom-built instruments; and 4) HyTools, a toolbox of open-source Python programs used to perform necessary processing of hyperspectral images, providing new users with a source for code that has either been part of closed-source software packages or been developed by individual researchers on an ad hoc basis. No comparable resources for these proposed functions are widely available, although there are some disparate sources on the web providing these functions.
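To illustrate the kind of artifact EcoSML would distribute and the kind of client-side application the SDKs would support, the sketch below applies a hypothetical PLSR trait model (an intercept plus one coefficient per band) to a field reflectance spectrum; the coefficients, band set, and trait are invented for the example.

```python
# Illustrative sketch: apply a (made-up) PLSR foliar-trait model to a field
# reflectance spectrum, the sort of operation EcoSML models are meant to enable.
import numpy as np

model_bands_nm = np.arange(400, 2401, 10)                         # model's band centers
coefficients = np.random.default_rng(0).normal(0.0, 0.01, model_bands_nm.size)
intercept = 1.8                                                   # placeholder offset

def predict_trait(measured_wavelengths_nm, reflectance):
    """Resample a measured spectrum onto the model's bands and apply the model."""
    resampled = np.interp(model_bands_nm, measured_wavelengths_nm, reflectance)
    return intercept + resampled @ coefficients

# A synthetic 1 nm resolution field spectrum standing in for an EcoSIS dataset.
field_wl = np.arange(350, 2501)
field_refl = 0.3 + 0.05 * np.sin(field_wl / 200.0)
print("predicted trait value:", round(predict_trait(field_wl, field_refl), 3))
```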
During this 2-year project, EcoSML will enter at TRL 2 and exit at TRL 4, while the SDK libraries, SpecDAL and HyTools will enter at TRL 3 and exit at TRL 5-7.
Petr Votava, Ames Research Center
Framework for Mining and Analysis of Petabyte-Size Time-Series on the NASA Earth Exchange (NEX)
Time-series analysis is key to understanding and uncovering changes in the Earth system. However, many currently available geospatial tools only provide easy access to the spatial rather than the temporal component, so the burden is on researchers to correctly extract the time-series from multiple files for further analysis. While inconvenient, this is often achievable on a small scale; searching for trends across millions of time-series, however, quickly becomes a huge undertaking for individual researchers, because apart from scaling the analysis algorithm itself it requires considerable effort in large-scale data processing, metadata and data management. Additionally, for most researchers in Earth sciences there are almost no tools that would enable easy time-series access, search, and analysis, and there are few places where algorithms supporting novel time-series approaches can be tested and evaluated at scale. Given the importance of time-series analysis to Earth sciences, we view it as an opportunity to engage and bring together the Earth science, machine learning, and data mining communities - an important goal of the NASA Earth Exchange (NEX) project.
The overall goal of the proposed effort is to develop a platform for fast and efficient analysis of time-series data from NASA’s satellite and derived datasets at large scale (billions of time-series from hundreds of terabytes to petabytes of data) that would be deployed on NEX and accessible in both supercomputing and cloud environments. While the initial focus will be on deploying the technology to support NEX and NASA users, the overall system will be developed as a flexible framework that can easily accommodate any user’s time-series data and codes and will be deployable outside NEX using Docker containers. The project will significantly enhance the scale of the state of the art in time-series analysis, which is currently several orders of magnitude below the needs of the Earth science community. To accomplish this goal, we will develop time-series indexing and search components based on the Symbolic Aggregate approXimation (SAX/iSAX) that will be able to extract and index billions of time-series from satellite, model, and climate data, giving both science and application users an important analysis tool and lowering a major barrier in Earth science research. Because time-series analysis is a very active field of research, the platform will be developed as a plug-in framework able to accommodate new improvements in time-series analysis, such as different space-reduction methods that form the first step in the indexing process. Apart from production use on the NEX system, we will deploy the system as a test-bed for users that will drive advancements in time-series analysis research, while providing unified access to billions of time-series. Because of the symbolic nature of the SAX representation, it is possible to deploy a number of algorithms from text mining, deep learning, and bioinformatics that are already showing good results in other fields and will provide a giant leap in our ability to analyze time-series data.
In terms of the current NRA, this project proposes to develop a data-centric technology that will significantly reduce the development time of Earth science research and increase the accessibility and utility of NASA data. In terms of the specific technology areas outlined in the NRA, the proposed project will provide new big data analytics capability, as well as tools for scalable data mining and machine learning.
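For readers unfamiliar with SAX, the following minimal sketch shows the representation that underlies the proposed indexing: z-normalize a series, reduce it with Piecewise Aggregate Approximation, and map segment means to letters using equiprobable breakpoints of a standard normal distribution. The word length and alphabet size are illustrative, not the values the platform will use.

```python
# Minimal SAX sketch: z-normalize, PAA-reduce, then discretize with Gaussian
# breakpoints. Parameters here are illustrative only.
import numpy as np
from scipy.stats import norm

def sax_word(series, word_length=8, alphabet_size=4):
    x = np.asarray(series, dtype=float)
    x = (x - x.mean()) / (x.std() + 1e-12)                        # z-normalize
    # Piecewise Aggregate Approximation: mean of each of `word_length` segments.
    paa = np.array([seg.mean() for seg in np.array_split(x, word_length)])
    # Breakpoints splitting the standard normal into equiprobable regions.
    breakpoints = norm.ppf(np.linspace(0.0, 1.0, alphabet_size + 1)[1:-1])
    symbols = np.searchsorted(breakpoints, paa)                   # 0 .. alphabet_size-1
    return "".join(chr(ord("a") + int(s)) for s in symbols)

# One annual cycle of a synthetic vegetation-index time series (~8-day composites).
t = np.linspace(0.0, 2.0 * np.pi, 46)
print(sax_word(np.sin(t)))                                        # prints an 8-letter word
```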
Through the use of a flexible container-based architecture and building upon existing capabilities of NEX and OpenNEX, the system will be demonstrated in both high-performance computing (HPC) and cloud environments on AWS. The period of performance for the proposed project is 2 years. The entry TRL is 3 and the exit TRL is 6.
Anne Wilson, University of Colorado, Boulder
HY-LaTiS: Evolving the Functional Data Model through Creation of a Tool Set for Hyperspectral Image Analysis
Researchers in the Earth atmosphere remote sensing community need tools to effectively handle the hyperspectral data that is now becoming available. The massive data volumes of these datasets, demanding up to an estimated 100 TB of storage for a set of analysis activities to be completed, are a barrier to their use. We propose to develop a tool, HY-LaTiS, to facilitate the use of hyperspectral data (HD) by providing convenient access to and operations over the massive data volumes they present. The data will be stored in the cloud where server-side computations can be performed. Users will be able to analyze data in the cloud using a notebook approach that enables processing to occur in the cloud without requiring dataset download to user workstations. They will also be able to reduce HD and stream it into local memory for the purpose of integrating hyperspectral information into existing sequential analyses. Overall, HD will be easier to access and use, with better interactivity in analyses than is available today.
This will be achieved by putting a functional data processing layer over the Spark big data processing engine, leveraging the existing LaTiS data access server and framework as that layer. LaTiS has been in use at the Laboratory for Atmospheric and Space Physics (LASP) for many years, with about 20 instances currently running, serving a wide variety of datasets from a wide variety of domains, formats, and locations. We estimate LaTiS to be at an entry assessment of TRL 4 due to its demonstrated success within the lab.
This project expands the Spark interface to offer not only the relational data model interface that it does now, but also a functional interface in the functional programming sense. This is accomplished by implementing the LaTiS functional data model and associated Hyperspectral Imaging Analysis (HIA) operations with a functional style, while leveraging the power of Spark underneath. We argue that a functional approach towards data and processing provides a flexibility of representation coupled with the right set of semantics to be a very good approach for scientific analysis.
This two-year project will start with the technical team working with our science Co-Is, who are our subject matter experts (SMEs), to develop a core hyperspectral image analysis (HIA) domain tool set that supports HD analysis in the cloud and the streaming of reduced-volume HD into local memory. The tool set will provide an intuitive interface designed specifically at the level of HIA. The first operations to implement will be access operations, such as subsetting on geolocation, wavelength, or pixel. The second set of operations will be server-side analyses, such as various forms of integration to be applied to reduce the data volume to a lower resolution.
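As a toy illustration of what such functional-style operations might look like (and not the LaTiS or HY-LaTiS API), the sketch below composes a spectral subsetting access operation with a server-side reduction that collapses the wavelength axis, so that only a small result would need to stream to the user.

```python
# Toy sketch of composable hyperspectral operations: an access operation
# (spectral subsetting) followed by a reduction (band mean). The in-memory
# cube and all names are illustrative only.
import numpy as np

rng = np.random.default_rng(1)
cube = rng.random((100, 100, 64))                 # (y, x, band) radiance cube
wavelengths = np.linspace(400.0, 2500.0, 64)      # band centers in nm

def subset_wavelength(cube, wavelengths, lo_nm, hi_nm):
    """Access operation: restrict the spectral domain to [lo_nm, hi_nm]."""
    keep = (wavelengths >= lo_nm) & (wavelengths <= hi_nm)
    return cube[:, :, keep], wavelengths[keep]

def reduce_spectrum(cube):
    """Server-side reduction: collapse the spectral axis (here a simple band mean)."""
    return cube.mean(axis=2)

# Compose: subset first, then reduce, so only a small result streams to the user.
vnir_cube, vnir_wl = subset_wavelength(cube, wavelengths, 400.0, 1000.0)
reduced = reduce_spectrum(vnir_cube)
print(reduced.shape)                              # (100, 100)
```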
After this work is underway, and depending on the timing of the needs of our SMEs, new analyses and underlying operations will be developed that leverage HY-LaTiS operations via notebook technologies, such as Jupyter Notebook or JupyterLab. This is where the true interactive experience will lie. Instead of the batch-oriented HIA practices of today, this approach will save time by reducing or eliminating the tedious download-extract-operate processing now required.
Once the capabilities of subsetting, interpolating, and performing a few other essential operations are operational in a notebook environment, the HY-LaTiS development team will hold a workshop to demonstrate the tool to the Earth atmosphere remote sensing community. There they will be able to test the tool and provide feedback. Our success will be determined by the adoption and usage of the tools by our SMEs and in performing science research operations that involve these hyperspectral datasets.
Charlie Zender, University of California at Irvine
JAWS: Justified AWS-Like Data through Workflow Enhancements that Ease Access and Add Scientific Value
Automated Weather Station and AWS-like networks are the primary source of surface-level meteorological data in remote polar regions. These networks have developed organically and independently, and deliver data to researchers in idiosyncratic ASCII formats that hinder automated processing and intercomparison among networks. Moreover, station tilt causes significant biases in polar AWS measurements of radiation and wind direction. Researchers, network operators, and data centers would benefit from AWS-like data in a common format, amenable to automated analysis, and adjusted for known biases. This project addresses these needs by developing a scientific software workflow called "Justified AWS" (JAWS) to ingest Level 2 (L2) data in the multiple formats now distributed, harmonize it into a common format, and deliver value-added Level 3 (L3) output suitable for distribution by the network operator, analysis by the researcher, and curation by the data center.
Polar climate researchers currently face daunting problems including how to easily: 1. Automate analysis (subsetting, statistics, unit conversion) of AWS-like L2 ASCII data. 2. Combine or intercompare data and data quality from among unharmonized L2 datasets. 3. Adjust L2 data for biases such as AWS tilt angle and direction. JAWS addresses these common issues by harmonizing AWS L2 data into a common format, and applying accepted methods to quantify quality and estimate biases. Specifically, JAWS enables users and network operators to 1. Convert L2 data (usually ASCII tables) into a netCDF-based L3 format compliant with metadata conventions (Climate and Forecast (CF) and ACDD) that promote automated discovery and analysis. 2. Include value-added L3 features like the Retrospective, Iterative, Geometry-Based (RIGB) tilt angle and direction corrections, solar angles, and standardized quality flags. 3. Provide a scriptable API to extend the initial L2-to-L3 conversion to newer AWS-like networks and instruments. Polar AWS network experts and NSIDC DAAC personnel, each with decades of experience, will help guide and deliberate on the L3 conventions implemented in Stages 2-3.
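A minimal sketch of the L2-to-L3 conversion idea, assuming placeholder column names, units, and attribute values rather than the project's actual conventions, is to read one station's L2 table and write a CF/ACDD-annotated netCDF file:

```python
# Illustrative sketch: one station's parsed L2 table written as a CF/ACDD-
# annotated netCDF file. All names, units, and attribute values are placeholders.
import pandas as pd
import xarray as xr

# In practice this frame would come from reading an L2 ASCII file, e.g.
# pd.read_csv("station_L2.txt", sep=r"\s+", parse_dates=["time"]).
df = pd.DataFrame({
    "time": pd.to_datetime(["2017-07-01T00:00", "2017-07-01T01:00", "2017-07-01T02:00"]),
    "temp_c": [-12.4, -11.9, -11.1],
    "wind_ms": [4.2, 5.0, 4.7],
})

ds = xr.Dataset(
    {
        "air_temperature": ("time", df["temp_c"].to_numpy() + 273.15,
                            {"units": "K", "standard_name": "air_temperature"}),
        "wind_speed": ("time", df["wind_ms"].to_numpy(),
                       {"units": "m s-1", "standard_name": "wind_speed"}),
    },
    coords={"time": df["time"].to_numpy()},
    attrs={  # ACDD-style discovery metadata (placeholder values)
        "Conventions": "CF-1.7, ACDD-1.3",
        "title": "Harmonized AWS L3 product (illustrative)",
        "summary": "One station's L2 table converted to a common netCDF form.",
    },
)
ds.to_netcdf("station_L3.nc")
```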
The project will start on July 1, 2017 at entry Technology Readiness Level 3 and will exit on June 30, 2019 at TRL 6. JAWS is now a heterogeneous collection of scripts and methods developed and validated at UCI over the past 15 years. At exit, JAWS will comprise three modular stages written in or wrapped by Python, installable by Conda: Stage 1 ingests and translates L2 data into netCDF. Stage 2 annotates the netCDF with CF and ACDD metadata. Stage 3 derives value-added scientific and quality information. The labor-intensive tasks include turning our heterogeneous workflow into a robust, standards-compliant, extensible workflow with an API based on best practices of modern scientific information systems and services. Implementation of Stages 1-2 may be straightforward though tedious due to the menagerie of L2 formats, instruments, and assumptions. The RIGB component of Stage 3 requires ongoing assimilation of ancillary NASA data (CERES, AIRS) and use of automated data transfer protocols (DAP, THREDDS).
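To give a flavor of the Stage 3 value-added quantities, the sketch below computes an approximate solar zenith angle for a station from day of year, local solar hour, and latitude using textbook declination and hour-angle formulas; it ignores the equation of time and atmospheric refraction and is illustrative only, not the RIGB method.

```python
# Crude solar-geometry sketch for adding solar angles to AWS records.
# Textbook approximation: ignores the equation of time, longitude/time-zone
# handling, and atmospheric refraction.
import numpy as np

def solar_zenith_deg(day_of_year, solar_hour, latitude_deg):
    """Approximate solar zenith angle (degrees) at local solar time."""
    decl = np.radians(23.44) * np.sin(2 * np.pi * (284 + day_of_year) / 365.0)
    hour_angle = np.radians(15.0 * (solar_hour - 12.0))
    lat = np.radians(latitude_deg)
    cos_zen = (np.sin(lat) * np.sin(decl)
               + np.cos(lat) * np.cos(decl) * np.cos(hour_angle))
    return np.degrees(np.arccos(np.clip(cos_zen, -1.0, 1.0)))

# Example: a high-latitude station (~72.6 N) at local solar noon on 1 July.
print(round(solar_zenith_deg(182, 12.0, 72.6), 1))
```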
The immediate target recipient elements are polar AWS network managers, users, and data distributors. L2 borehole data suffers from similar interoperability issues, as does non-polar AWS data. Hence our L3 format will be extensible to global AWS and permafrost networks. JAWS will increase in situ data accessibility and utility, and enable new derived products (both are AIST goals). The PI is a long-standing researcher, open source software developer, and educator who understands obstacles to harmonizing disparate datasets with NASA interoperability recommendations. Our team participates in relevant geoscience communities, including ESDS working groups, ESIP, AGU, and EarthCube.