Project Selections for AIST-18
22 Projects Awarded Funding Under the Advanced Information Systems Technology (AIST) Program
2018 ROSES A.41 Solicitation NNH18ZDA001N-AIST Research Opportunities in Space and Earth Sciences
09/30/2019 – NASA’s Science Mission Directorate, NASA Headquarters, Washington, DC, has selected proposals for the Advanced Information Systems Technology Program (AIST-18) in support of the Earth Science Division (ESD). The AIST-18 awards will provide technologies to reduce the risk and cost of NASA information systems that support future Earth observations and transform those observations into Earth information.
ESD’s Earth Science Technology Office (ESTO) received 100 proposals and has selected 22 of them for award, each with a 2-year period of performance. The total value of all the awards is approximately $24M.
NASA’s Advanced Information Systems Technology (AIST) Program identifies, develops, and supports adoption of information technology expected to be needed by the Earth Science Division in the next 5 to 20 years. Currently, the AIST Program is organized around two primary thrusts, the Analytic Center Framework (ACF) and the New Observing Strategy (NOS). Proposals were solicited that fall into either of these two thrusts and explicitly showed how the resulting technology would be infused into one of ESD’s science domains. The AIST Program expects to mature the technologies in these proposals at least one Technology Readiness Level with an eventual goal to demonstrate their value to the relevant science communities.
The awards are as follows (links go to abstracts below):
John Beck, University of Alabama, Huntsville | Cloud-Based Analytic Framework for Precipitation Research (CAPRi)
James Carr, Carr Astronautics Corporation | StereoBit: Advanced Onboard Science Data Processing to Enable Future Earth Science
Janice Coen, University Corporation For Atmospheric Research | Creation of a Wildland Fire Analysis: Products to Enable Earth Science
Andrea Donnellan, Jet Propulsion Laboratory | Quantifying Uncertainty and Kinematics of Earthquake Systems (QUAKES-A) Analytic Center Framework
Riley Duren, Jet Propulsion Laboratory | Multi-Scale Methane Analytic Framework
Barton Forman, University of Maryland, College Park | Towards the Next Generation of Land Surface Remote Sensing: A Comparative Analysis of Passive Optical, Passive Microwave, Active Microwave, and LiDAR Retrievals
Ethan Gutmann, University Corporation For Atmospheric Research | Preparing NASA for Future Snow Missions: Incorporation of the Spatially Explicit SnowModel in LIS
Daven Henze, University of Colorado, Boulder | Surrogate Modeling for Atmospheric Chemistry and Data Assimilation
Jeanne Holm, City of Los Angeles | Predicting What We Breathe: Using Machine Learning to Understand Air Quality
Hook Hua, Jet Propulsion Laboratory | Smart On-Demand Analysis of Multi-Temporal and Full Resolution SAR ARDs in Multi-Cloud & HPC
Beth Huffer, Lingua Logica LLC | AMP: An Automated Metadata Pipeline
Anthony Ives, University of Wisconsin, Madison | Valid Time-Series Analyses of Satellite Data to Obtain Statistical Inference about Spatiotemporal Trends at Global Scales
Walter Jetz, Yale University | An Analytic Center for Biodiversity and Remote Sensing Data Integration
Randall Martin, Washington University in St. Louis | Development of the High Performance Version of GEOS-Chem (GCHP) to Enable Broad Community Access to High-Resolution Atmospheric Chemistry Modeling in Support of NASA Earth Science
Mahta Moghaddam, University of Southern California | SPCTOR: Sensing-Policy Controller and OptimizeR
John Moisan, NASA Goddard Space Flight Center | NASA Evolutionary Programming Analytic Center (NEPAC) for Climate Data Records, Science Products and Models
Sreeja Nag, NASA Ames Research Center | D-SHIELD: Distributed Spacecraft with Heuristic Intelligence to Enable Logistical Decisions
Derek Posselt, Jet Propulsion Laboratory | A Science-Focused, Scalable, Flexible Instrument Simulation (OSSE) Toolkit for Mission Design
Stephanie Schollaert Uz, NASA Goddard Space Flight Center | Supporting Shellfish Aquaculture in the Chesapeake Bay Using Artificial Intelligence to Detect Poor Water Quality Through Sampling and Remote Sensing
Jennifer Swenson, Duke University | The Bridge from Canopy Condition to Continental Scale Biodiversity Forecasts, Including the Rare Species of Greatest Conservation Concern
Philip Townsend, University of Wisconsin, Madison | On-Demand Geospatial Spectroscopy Processing Environment on the Cloud (GeoSPEC)
Jia Zhang, Carnegie Mellon University | Mining Chained Modules in Analytic Center Framework
John Beck, University of Alabama, Huntsville
Cloud-Based Analytic Framework for Precipitation Research (CAPRi)
Researchers at the University of Alabama in Huntsville (UAH) Information Technology and Systems Center (ITSC) and Earth Systems Science Center (ESSC), in collaboration with NASA Marshall Space Flight Center (NASA/MSFC), propose to leverage cloud-native technologies explored in the AIST-2016 VISAGE (Visualization for Integrated Satellite, Airborne and Ground-based data Exploration) project to develop a Cloud-based Analytic framework for Precipitation Research (CAPRi). CAPRi will host Global Precipitation Measurement Validation Network (GPM-VN) data integrated with a Deep Learning framework to provide an analysis-optimized cloud data store and access via on-demand cloud-based serverless tools. CAPRi services will automate the generation of large volumes of high-quality training data for successful deployment of Deep Learning models.
The research focus will be to develop a Deep Learning application to enhance the resolution of GPM data for improved identification of convective scale precipitation features, particularly outside the range of ground-based weather radar. This resolution enhancement technology is called super-resolution or downscaling. Deep Learning Convolutional Neural Networks (CNNs) will be used to learn features that can infer high-resolution information from low-resolution variables, building on a prototype developed by the PI of this proposal in collaboration with GPM mission scientists. The research focus addresses the NASA AIST Program’s Analytic Center Framework (ACF) thrust within the Water and Energy Cycle domain.
To conduct this research, we have assembled a multi-disciplinary team that includes leaders in Machine Learning/Deep Learning, atmospheric/earth science, precipitation, data analytics, remote sensing, and information technology: PI Dr. John Beck (UAH/ITSC), Deep Learning and remote sensing lead; Co-I Dr. Patrick Gatlin (NASA/MSFC), precipitation science and GPM GV subject matter expert; Co-I Mr. Todd Berendes (UAH/ITSC), GPM VN network and cloud-based tools and analytics lead; Co-I Dr. Geoffrey Stano (UAH/ITSC) atmospheric science and information technology; Co-I Ms. Anita LeRoy (UAH/ESSC) precipitation features subject matter expert (SME); and collaborator Dr. Walt Petersen (NASA/MSFC), GPM subject matter expert.
The GPM-VN, having already identified and extracted coincident low-resolution satellite radar and high-resolution ground radar observations of a variety of precipitation events, provides an ideal source of training and test data for this proposal. We propose to develop cloud-based tools within CAPRi to automate the generation of this type of training data directly from the VN. The results will be used within an extended CNN architecture to downscale the spatial resolution of satellite-based GPM DPR precipitation products from ~5 km to ~1 km radar resolution and thus advance precipitation measurements from space. CAPRi will simplify the access to and processing of these data, leading to practical applicability of these techniques to improve rainfall retrieval estimates from GPM DPR data in areas where ground-based scanning radar data and reliable precipitation gauge observations are lacking.
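To make the downscaling step concrete, the sketch below shows a minimal SRCNN-style convolutional network of the general kind described above, written in PyTorch with random tensors standing in for matched DPR/ground-radar tiles. The architecture, tile size, and training details are illustrative assumptions, not the CAPRi design.

```python
# Illustrative sketch only (assumes PyTorch; GPM DPR tiles are presumed to be
# resampled onto the 1 km target grid before entering the network).
import torch
import torch.nn as nn

class SRCNN(nn.Module):
    """Minimal SRCNN-style network: maps an upsampled low-resolution
    precipitation field to a high-resolution estimate."""
    def __init__(self, channels: int = 1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, 64, kernel_size=9, padding=4),  # feature extraction
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, kernel_size=1),                   # non-linear mapping
            nn.ReLU(inplace=True),
            nn.Conv2d(32, channels, kernel_size=5, padding=2),  # reconstruction
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x)

# Toy training step with random tensors standing in for (DPR, ground-radar) pairs.
model = SRCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()
low_res_upsampled = torch.rand(8, 1, 64, 64)  # ~5 km DPR field resampled to a 1 km grid
high_res_target = torch.rand(8, 1, 64, 64)    # matched ~1 km ground-radar field (VN)
pred = model(low_res_upsampled)
loss = loss_fn(pred, high_res_target)
loss.backward()
optimizer.step()
print(f"toy training loss: {loss.item():.4f}")
```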
As a science use test case, CAPRi will be used to develop a precipitation features demonstration database to support the precipitation science community. In prior research, Co-I LeRoy developed a method to identify convective scale precipitation features from a larger scale precipitation features database (developed at the University of Utah). This research will be expanded to identify 3D convective scale precipitation features in the GPM era, leveraging the VN database of precipitation observations from GPM including super-resolution GPM data to improve the identification and refine the scale of 3D precipitation features beyond the scope of the VN.
James Carr, Carr Astronautics Corporation
StereoBit: Advanced Onboard Science Data Processing to Enable Future Earth Science
We propose a two-year investigation to demonstrate higher-level onboard science data processing for more intelligent SmallSats and CubeSats to enable future Earth science missions and Earth observing constellations. Low-cost SmallSat architectures with limited numbers of ground stations generally suffer from downlink bottlenecks that present impediments to delivering large volumes of data to science algorithms implemented on the ground and generating timely science or operational products. When data compression and playback cannot provide a complete solution, data acquisitions per orbit must be sacrificed. This is highly undesirable for weather and climate objectives that require global-scale observations. Our approach to resolving this bottleneck is to disaggregate science data processing and embed science algorithms onboard. Our project leverages ESTO’s investment in the SpaceCube family of onboard processors with Size-Weight-and-Power (SWAP) suitable for SmallSat and CubeSat applications. It is now important to focus on application development for these processors to realize the full benefit of this investment. We target our pathfinder application to an objective relevant to the 2017-2027 Earth Sciences Decadal Survey – atmospheric dynamics with 3D stereo tracking of cloud and moisture features using a Structure from Motion (SfM) technique that we call StereoBit. StereoBit is well suited to the hybrid SpaceCube architecture that includes Field Programmable Gate Arrays (FPGAs) that can be programmed for computation. Our StereoBit approach derives from our work in 3D Winds using the MISR instrument onboard NASA’s Terra spacecraft together with imagery from NOAA’s geostationary satellites. We envision implementing StereoBit onboard using multi-angle data from a sensor such as the ESTO-funded Compact Mid-wave IR System (CMIS) and exploiting information from other vantage points, including geostationary satellites. StereoBit enables large-scale data volume reductions to realize the full science value of each orbit without degradation or compromise. Our proposal includes a testbed to enable development of intelligent onboard systems and test them under flight-like conditions while running on actual spacecraft processors. The testbed has a cloud-hosted, FPGA-enabled component that is scalable to emulate computing capabilities implemented on other platforms for collaborative computing concepts and for off-premise development and testing. We will develop and prove the onboard StereoBit approach to stereo 3D winds to achieve TRL 6 capability by study completion. Best practices derived from our investigations will be valuable to other application areas in the Earth sciences and will be shared with the community.
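As a simple illustration of the stereo principle behind StereoBit (not the SfM implementation itself), the sketch below converts an apparent along-track displacement between two view angles into a feature height. In practice the retrieval must also separate true cloud motion (wind) from parallax, which is why multiple vantage points and acquisition times are exploited; all numbers here are hypothetical.

```python
# Hedged illustration only: the actual StereoBit retrieval involves feature
# matching, multi-angle geometry, and a simultaneous wind/height solution.
import numpy as np

def height_from_parallax(parallax_m, vza1_deg, vza2_deg):
    """Height of a cloud feature from the apparent along-track displacement
    (parallax) between two views at different view zenith angles, assuming the
    feature has been geolocated as if it sat on the surface."""
    t1, t2 = np.tan(np.radians(vza1_deg)), np.tan(np.radians(vza2_deg))
    return parallax_m / abs(t1 - t2)

# Example: a 6 km apparent shift between a nadir view and a 45-degree view
print(f"retrieved height: {height_from_parallax(6000.0, 0.0, 45.0) / 1000:.1f} km")
```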
Janice Coen, University Corporation For Atmospheric Research
Creation of a Wildland Fire Analysis: Products to Enable Earth Science
Wildland fire science and related applications have benefitted from a wide range of space-based and airborne fire observations, each with different spatial resolutions and revisit frequencies, as additional sensors and constellations continue to be added by both the public and private sector. Greater use of such observations for analysis and modeling of wildland fire occurrence, behavior, and effects such as emissions is hampered by limitations including: (1) the observations’ disparate resolutions and extent, (2) temporal or spatial gaps due to cloud cover, satellite revisit timing, and other issues, and (3) partial fire mapping resulting from (1,2) with crudely mosaicked image segments.
Our purpose is to develop the methodology to create, test, and assess the first wildland fire analysis (sometimes called “reanalysis”) products – standardized, gridded outputs of desired wildfire products produced at regular intervals. The idea is to develop fire products analogous to atmospheric analyses — powerful products used across atmospheric science to initialize atmospheric models and support many types of scientific studies. Reanalyses integrate dissimilar, disconnected, asynchronous observations using the physical consistency of a model and data assimilation system to fill gaps in time and space, estimate variables that are not directly observed, and create a physically balanced, more complete image of reality and how it changed over time. Similar needs exist for sectors of wildland fire science and operations but no analogous process or product exists; instead, investigators have attempted to numerically interpolate between widely separated observations or to estimate or retrieve information from spatially and/or temporally coarse data.
We aim to develop the fire reanalysis methodology using fire detection products (e.g. Suomi-NPP VIIRS, Landsat OLI, Sentinel-2, Terra/Aqua MODIS, AVHRR, and Sentinel-3 SLSTR), and other products (e.g. ASTER). Subsequently, we will investigate assimilation of supplementary infrared observations from small satellite constellations and private sector products (e.g. DigitalGlobe WorldView-3), airborne observations such as USDA’s National Infrared Operations (NIROPs), or experimental platforms (e.g. German Aerospace Center TET-1) that reflect the growing number of remote sensing observations. The CAWFE coupled numerical weather prediction – wildland fire model, which has been successfully used to model many wildfire events, will assimilate this remotely-sensed fire detection data to simulate fires’ evolution while minimizing variance from those observations.
In this exploratory study, we will produce reanalysis products for a sample of different types of wildfire events. Products would include 2-D gridded variables including cumulative burned area extent, active burning areas, and heat release rate (related to Fire Radiative Power) in common data formats at hourly intervals (or other frequency to be determined by users – team members who develop additional products, e.g. biomass burning emissions inventories, for the research community and decision makers). As part of the project, users will test the products and assess their utility in smoke and emissions modeling, where the infrequency of fire detection observations (e.g., 2 per day for the original VIIRS) currently creates errors in predicted fire emissions due to the diurnal variability in fire activity. Also, 3-D products will be considered, including atmospheric state variables and smoke concentration, which could indicate injection height, and smoke concentration profiles needed to initialize air quality models. Data, software, documentation, products, and metadata will be released in a publicly accessible repository, allowing the community to further develop the methods and product library, and outreach will be made by initiating workshops and sessions at scientific conferences.
Andrea Donnellan, Jet Propulsion Laboratory
Quantifying Uncertainty and Kinematics of Earthquake Systems (QUAKES-A) Analytic Center Framework
We propose to develop an analytic center framework for creating a uniform crustal deformation reference model for the active plate margin of California by fusing InSAR, topographic, and GNSS geodetic imaging data. We will quantify uncertainties for the reference model, which can serve to improve earthquake forecast models and be used to improve understanding of the physical processes leading to and following earthquakes. Users will be able to access and generate custom crustal deformation products for further analysis. Our approach will be to 1) infuse GNSS network solutions into UAVSAR baseline estimation and extract features from InSAR images, 2) develop cluster analysis to identify crustal blocks and rank active fault systems spatially and temporally, 3) interpolate the analyzed InSAR and GNSS data to provide an adaptively sampled deformation field, and 4) assimilate and correlate the crustal deformation products into geodetic/seismicity-based earthquake forecasts and test against past data. All tools and products will be open source (Apache Software License, version 2) and available through geospatial web map services.
The key technical challenge will be harmonizing data products with widely varying spatial and temporal resolutions that provide one or more components of the 3D time-dependent deformation field and have unique error sources and intrinsically different accuracies. Spatial resolutions range from cm to sub-meter for topography, ~10 m for airborne InSAR (UAVSAR), ~100 m for spaceborne InSAR, and ~10 km for GNSS. Temporal sampling can range from minutes (GNSS) to weekly, monthly, or yearly for the other geodetic imaging techniques. Processing assumptions can weaken solutions that could be strengthened using knowledge from other data types. For example, repeat pass interferometry requires an estimation of the position of the instrument at each pass, and tectonic deformation can add error to baseline estimation if not properly incorporated. Error sources can bias each data set in unique ways and must be understood, and accuracies quantified, in order to best establish a gridded crustal deformation model that is dense in areas of rapid change and sparse where little change occurs. Our project is divided into three main tasks: 1) Data fusion and uncertainty quantification, 2) Data management and geospatial information services (the analytic center framework), and 3) Collaboration and infusion into target communities.
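As a toy illustration of one ingredient of such harmonization, the sketch below combines two independent estimates of the same deformation rate by inverse-variance weighting. The real fusion problem additionally involves differing components (e.g., line-of-sight versus 3D), spatial resolutions, and correlated error sources, and the numbers shown are hypothetical.

```python
import numpy as np

def fuse(estimates, sigmas):
    """Inverse-variance weighted combination of independent estimates of the
    same deformation rate, with the standard deviation of the fused value."""
    w = 1.0 / np.asarray(sigmas, dtype=float) ** 2
    est = np.asarray(estimates, dtype=float)
    fused = np.sum(w * est) / np.sum(w)
    fused_sigma = np.sqrt(1.0 / np.sum(w))
    return fused, fused_sigma

# e.g., a GNSS-derived rate of 3.0 +/- 0.5 mm/yr and an InSAR-projected rate of 2.4 +/- 1.5 mm/yr
rate, sigma = fuse([3.0, 2.4], [0.5, 1.5])
print(f"fused rate: {rate:.2f} +/- {sigma:.2f} mm/yr")
```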
This work is directly relevant to NASA’s Earth Surface and Interior program and adds value to NASA supported GNSS and UAVSAR results. It is also relevant to the NISAR mission scheduled for launch in early 2022. Products from this project can serve to validate NISAR and methodologies developed here can be applied to NISAR data when the mission is operational. The earthquake geophysics community will benefit the most from this work, followed by disaster responders. Target users for earthquake geophysics are the multi-institutional Southern California Earthquake Center community and the US Geological Survey. Target applications communities are the Federal Emergency Management Agency (FEMA), State of California Office of Emergency Services, and local California jurisdictions.
Riley Duren, Jet Propulsion Laboratory
Multi-Scale Methane Analytic Framework
We propose to mature and integrate methane data analysis capabilities spanning multiple observing systems and spatial scales into an analytic center framework. The Multi-scale Methane Analytic Framework (M2AF) will address the needs of public and private sector stakeholders for improved decision support and also help the science community fully exploit emerging and planned airborne and space-based methane observations.
We will develop an analytic center framework that:
1. Provides workflow optimization and management tools that help methane science investigations and applications proceed in an efficient (low latency) and repeatable manner.
2. Provides analytic tools to characterize methane fluxes and the physical processes that control them.
3. Facilitates data search and discovery relevant to specific methane science investigations and applications from diverse data sets.
4. Leverages our prototype methane data portal to provide collaborative, web-based tools for enabling scientific discussion and application of data analysis and modeling results.
This framework will significantly mature and integrate component capabilities we developed under previous and ongoing methane projects, including NASA’s Carbon Monitoring System (CMS) program (Jacob-2013, 2016; Duren-2012, 2015), the ACCESS Methane Source Finder project, the interagency California Methane Survey project (CARB/CEC/NASA funded), and the interagency Megacities Carbon Project (NIST, NASA, NOAA, CARB funded). Those capabilities currently involve primarily standalone experimental demonstrations with surface, airborne, and satellite observations, and uncoordinated data processing and modeling systems. We assess these capabilities to be mostly at TRL 3-4, with some (the data portal) at TRL 5. We propose to further develop and mature these capabilities, with additional testing against airborne and satellite observations and input from our technology recipients, achieving exit TRLs in the 5-6 range.
Barton Forman, University of Maryland, College Park
Towards the Next Generation of Land Surface Remote Sensing: A Comparative Analysis of Passive Optical, Passive Microwave, Active Microwave, and LiDAR Retrievals
The central objective of this proposal is to create a terrestrial hydrology mission planning tool to help inform experimental design with relevance to terrestrial snow, soil moisture, and vegetation using passive and active microwave remote sensing, LiDAR remote sensing, passive optical remote sensing, hydrologic modeling, and data assimilation. Accurate estimation of land surface states and fluxes – including soil moisture, snow, vegetation, and surface runoff – has been identified as a priority in the most recent decadal survey, and these states can only be observed globally (in an operational sense) with satellite-based sensors. The development of a simulation tool that can quantify the utility of mission data for research and applications is directly relevant to these planning efforts.
Leveraging the existing capabilities of the NASA Land Information System (LIS) and the Tradespace Analysis Tool for Constellations (TAT-C), a comprehensive environment for conducting observing system simulation experiments (OSSEs) will be developed to quantify the added value associated with different sensor and orbital configurations as related to snow, soil moisture, and vegetation and the subsequent hydrologic response of the coupled snow-soil moisture-vegetation system. It will allow for quantitative assessment of different orbital configurations (e.g., polar versus geostationary), different numbers of sensors (e.g., single sensor versus constellation), and the costs associated with deploying space-based instrumentation. In addition, the integrated system will provide insights into the advantages (and disadvantages) of different configurations, including simultaneous measurements.
The goal of the OSSE is to maximize the utility of experimental design in terms of the greatest benefit to global freshwater resource characterization. The integrated system will enable a true end-to-end OSSE that can help quantify the value of observations based on their utility to science research and applications and guide mission designs and configurations. Synthetic passive microwave (radiometry), active microwave (RADAR), passive optical (VIS/NIR), and active infrared (LiDAR) observations will be generated to provide information related to the coupled snow-soil moisture-vegetation system in the terrestrial environment. This suite of synthetic observations will then be assimilated into NASA LIS to systematically assess the added value (or lack thereof) associated with an observation in space and time; a minimal numerical illustration of this added-value idea is sketched after the questions below. Science and mission planning questions addressed as part of this proposal include, but are not limited to:
1. How can sensor viewing be optimized to best capture and characterize the integrated snow-soil moisture-vegetation response? Further, how would the efficacy of these sensors behave during extreme (e.g., flood, drought, rain-on-snow, post-wildfire, atmospheric river) events and non-extreme (e.g., climatological) events?
2. How might observations be coordinated (in space and time) to maximize utility and inform experimental design and mission planning?
3. What is the added utility associated with an additional observation? Further, what is the marginal gain associated with adaptive (a.k.a., dynamic) viewing relative to a more traditional, fixed (a.k.a., static) viewing strategy?
4. What is the tradeoff space between different mixes of sensor types (i.e., passive MW, RADAR, and LiDAR), swath widths, fields-of-view, and error characteristics that maximizes scientific return while minimizing mission cost and mission risk?
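The sketch below gives a minimal numerical illustration of the added-value idea behind such an OSSE: synthetic truth, a synthetic model background, and a synthetic observation are combined with a scalar optimal weight, and the RMSE reduction measures the observation's value. All error magnitudes and distributions are hypothetical, and the actual LIS/TAT-C experiments are far more complete.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000                                               # synthetic grid cells
truth = rng.gamma(shape=2.0, scale=50.0, size=n)         # "true" SWE [mm]
bkg_sigma, obs_sigma = 40.0, 25.0
forecast = truth + rng.normal(0.0, bkg_sigma, size=n)    # model background
obs = truth + rng.normal(0.0, obs_sigma, size=n)         # synthetic sensor retrieval

# Scalar optimal-weighting (Kalman-like) update per grid cell
gain = bkg_sigma**2 / (bkg_sigma**2 + obs_sigma**2)
analysis = forecast + gain * (obs - forecast)

rmse = lambda x: float(np.sqrt(np.mean((x - truth) ** 2)))
print(f"forecast RMSE: {rmse(forecast):.1f} mm")
print(f"analysis RMSE: {rmse(analysis):.1f} mm  (added value of the observation)")
```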
This project is relevant to several current and planned NASA research interests. This project will leverage a suite of existing NASA data products and models, and therefore, add value to previously incurred costs associated with their development.
Ethan Gutmann, University Corporation For Atmospheric Research
Preparing NASA for Future Snow Missions: Incorporation of the Spatially Explicit SnowModel in LIS
Objectives and benefits: Snow is a critical component of the natural earth system and it has significant socio-economic effects. Snow provides a natural reservoir for water resources that billions of people rely on; it is a key link between the global energy and water cycles; it plays a major role in both plant and animal habitats; it has the power to shut down economic activity; and yet it also provides the foundation for a massive recreation industry. While the importance of snow is often considered on regional to global scales, one of the most important horizontal length scales for snow processes can only be represented with model grid spacings of 100 meters or less.
The snow science community has made great advances in understanding and representing these critical scales in small scale snow models; however, the software infrastructure does not yet exist to perform such fine spatial resolution simulations on continental scales. This presents a significant constraint on NASA when planning and operating future snow science missions. The snow community has determined that the optimal future snow product must combine modeling with remote sensing, and to achieve this it must be possible to perform simulations on approximately a 100-meter grid while representing the processes that operate on that scale, e.g. preferential deposition, snow redistribution, enhanced radiation load.
The goal of this proposed project is to improve NASA’s capability to plan and operate a future snow mission by coupling an advanced snow modeling system (SnowModel) into NASA’s Land Information System Framework (NASA-LISF). This work will provide the tools necessary to plan and operate a more cost-effective snow mission, yielding better snow products for the research and applications communities.
Proposed work and approach: We propose to increase the readiness of NASA’s snow modeling capability by extending the LISF to include an explicit representation of spatial variability (LISF-SnowModel). The specific work elements in the proposal are:
(1) Incorporate and enhance fine scale wind field and precipitation modifications from SnowModel’s MicroMet tool into the NASA-LISF forcing engine.
(2) Couple SnowModel’s snowpack and snow redistribution components into LISF and the Noah-MP land surface model in LISF.
(3) Improve the computational infrastructure supporting LISF-SnowModel to provide parallel performance sufficient to run continental domain simulations at snowdrift resolving scales (100-m). This includes coupling snow transport processes represented in SnowModel with the MPI communications in LISF.
(4) Perform a mission trade-off analysis based on high-resolution continental scale SnowModel simulations within LISF-SnowModel.
Period of performance: December 1, 2019 – November 30, 2021.
Technology Readiness Level Advancement: Entry TRL: 2; Exit TRL: 5.
Daven Henze, University of Colorado, Boulder
Surrogate Modeling for Atmospheric Chemistry and Data Assimilation
Exposure to ambient concentrations of ozone (O3) is the second largest pollution-related risk of premature death in the US, leading to approximately 20,000 premature deaths annually. Tens of millions of US citizens live in areas where O3 concentrations exceed federal standards. While air quality forecasts could minimize these exposure risks and help develop attainment strategies, accurate and reliable O3 forecasts have been elusive owing to the computational complexity of O3 air quality modeling. From the prediction of urban-scale pollution distributions, to short-term O3 forecasts, to better understanding the relationship between O3 and climate change, a persistent challenge in the atmospheric community is the computational expense of chemistry within the models used to research these problems.
To address these challenges, this project aims to build a robust and computationally efficient chemical data assimilation (DA) system, merging research in compressive sampling and machine learning for large-scale dynamical systems. In particular, we will use low-rank tensor representations, demonstrated recently for the first time by Co-PI Doostan for surrogate modeling of chemical kinetics. This approach allows for efficient construction of a surrogate which is itself very computationally cheap to evaluate. Our approach will also draw from expertise in polynomial chaos expansions (PCE – a spectral representation of the model solution that can be constructed non-intrusively, i.e., by treating the chemical model as a black box) coupled with multiscale stochastic preconditioners (to address stiffness of the chemical system) to develop fast surrogate models for atmospheric chemistry; a toy non-intrusive PCE construction is sketched after the objectives below. In particular, we will accomplish the following objectives:
(1) Develop, test, and deliver a surrogate model for the chemical solver in a widely used AQ model (GEOS-Chem).
(2) Generalize the surrogate model generation procedure within a software toolbox applicable to any user-provided chemical mechanisms.
(3) Demonstrate the benefits of using a surrogate-based AQ modeling framework for assimilation of geostationary observations of atmospheric composition to improve O3 simulations.
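The following toy example sketches the non-intrusive construction referred to above: a small Legendre polynomial chaos surrogate is fit by least squares to samples of a stand-in "chemical" response, after which the surrogate costs only a dot product to evaluate. The response function, input dimension, and polynomial degree are illustrative assumptions, not the GEOS-Chem mechanism.

```python
import numpy as np
from numpy.polynomial import legendre as L

rng = np.random.default_rng(1)

def toy_chemistry(x):
    """Stand-in for an expensive chemical-solver output, e.g. an ozone response
    to two scaled inputs (emissions, temperature) on [-1, 1]."""
    return np.exp(0.3 * x[:, 0]) * (1.0 + 0.5 * x[:, 1] ** 2)

# Legendre basis up to total degree 2 in 2 dimensions (a tiny PCE)
degrees = [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]
def basis(x):
    cols = []
    for (i, j) in degrees:
        ci = np.zeros(i + 1); ci[i] = 1.0   # coefficients selecting P_i
        cj = np.zeros(j + 1); cj[j] = 1.0   # coefficients selecting P_j
        cols.append(L.legval(x[:, 0], ci) * L.legval(x[:, 1], cj))
    return np.column_stack(cols)

# Non-intrusive construction: sample the "black box" model, then regress
x_train = rng.uniform(-1.0, 1.0, size=(200, 2))
coeffs, *_ = np.linalg.lstsq(basis(x_train), toy_chemistry(x_train), rcond=None)

# The surrogate is now just a dot product -- cheap to evaluate
x_test = rng.uniform(-1.0, 1.0, size=(10_000, 2))
err = np.abs(basis(x_test) @ coeffs - toy_chemistry(x_test))
print(f"mean |surrogate - model| on test points: {err.mean():.3f}")
```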
This project will advance computational tools available for AQ prediction, mitigation, and research. Not only do we aim to deliver a surrogate model for the chemical mechanism of an AQ model used by a very large research community (GEOS-Chem), we will also provide a software toolbox for generating surrogate models for any user-provided mechanism. The potential impacts, though, are much greater, as improvements in computational efficiency could affect research in areas from urban air pollution modeling to long-term studies of chemistry-climate interactions. Eliminating the computational bottleneck associated with chemistry will in turn help to promote the broader use of data assimilation within other AQ forecasting systems. As a case study, we will explore and demonstrate the benefits of surrogate modeling for 4D-Var assimilation using geostationary measurements of NO2, which in the next few years will be available for the first time over North America, Europe, and East Asia. This supports a broader goal of facilitating the use of NASA remote sensing data for air quality forecasting here in the US. The proposed work is highly relevant to the AIST program and the broader goals of the Earth science and applied sciences programs. Our work directly responds to the NASA AIST proposal solicitation for “data-driven modeling tools enabling the forecast of future behavior of the phenomena” as well as “analytic tools to characterize the natural phenomena or physical processes from data” within the program thrust for Analytic Center Framework Development.
Jeanne Holm, City of Los Angeles
Predicting What We Breathe: Using Machine Learning to Understand Air Quality
Every 7 seconds, someone dies from the effects of air pollution. Air pollution is responsible for 4.5 million deaths and 107.2 million disability-adjusted-life-years globally. With the percentage of the global population living in urban areas projected to increase from 54% in 2015 to 68% in 2050 (and up to 89% in the U.S.), preventing a significant increase in air pollution-related loss of life requires comprehensive mitigation strategies, as well as forecast systems, to limit and reduce exposure to harmful urban air. While some megacities like Los Angeles operate an extensive network of ground-based monitoring stations for ozone, particulate matter (PM) 2.5, and other pollutants, air quality (AQ) in many cities around the globe is poorly characterized at ground level. The main source of information on atmospheric environmental conditions in those cities is space-based monitoring of a limited set of AQ indicators.
We propose the development of advanced Machine Learning (ML)-based algorithms and models that link ground-based in situ and space-based remote sensing observations of major AQ components, with the aim to (a) classify patterns in urban air quality, (b) enable the deduction and forecast of air pollution events related to PM2.5 and ozone from space-based observations, and ultimately (c) identify similarities in AQ regimes between megacities around the globe for improved air pollution mitigation strategies. Furthermore, this proposal will help us understand the correlation between air pollution and health conditions across the City of Los Angeles, and predict individuals’ health risks related to air pollution based on air quality measurements. Using the City of Los Angeles as a test case, this proposal will focus on elements (a) and (b), with the extension to element (c) envisaged for follow-on studies.
The objective of this proposal is to increase the accessibility and use of space data by using machine learning to help cities predict air quality in ways that can be acted upon to improve human health outcomes and provide better data to individuals and cities. Secondarily, the goal is to make these tools and algorithms available to future Earth science missions (e.g., MAIA) to provide rapid ground truth, combine multiple data sources, and support more rapid use of mission data.
This proposal will focus on maturing the technologies involved in:
- Developing machine learning algorithms for predictive models of air quality based on PM2.5 and other air pollutants (a minimal sketch follows this list)
- Building a big data analytics algorithm for integrating ground and space data
- Providing predictive models for health risk using deep learning and machine learning
- Building an open source PM2.5 stack for integrating ground and space data
- Creating a model for cities with shared attributes to understand predictions and effective interventions
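A minimal sketch of such a predictive model is shown below, assuming scikit-learn and using synthetic data in place of matched monitor/satellite/meteorology records. The features, their relationship to PM2.5, and the model choice are illustrative assumptions only.

```python
# Hedged sketch with synthetic data standing in for matched ground-monitor /
# satellite / meteorology records; feature names and relationships are hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(42)
n = 5000
aod = rng.gamma(2.0, 0.15, n)       # satellite aerosol optical depth
blh = rng.uniform(200, 2500, n)     # boundary-layer height [m]
temp = rng.uniform(5, 40, n)        # 2 m temperature [C]
wind = rng.gamma(2.0, 1.5, n)       # wind speed [m/s]
# Synthetic "true" PM2.5 with noise: higher AOD and shallower mixing -> higher PM
pm25 = 80 * aod * (1000 / blh) + 0.3 * temp - 1.5 * wind + rng.normal(0, 3, n)

X = np.column_stack([aod, blh, temp, wind])
X_tr, X_te, y_tr, y_te = train_test_split(X, pm25, test_size=0.25, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print(f"hold-out MAE: {mean_absolute_error(y_te, model.predict(X_te)):.2f} ug/m3")
```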
Hook Hua, Jet Propulsion Laboratory
Smart On-Demand Analysis of Multi-Temporal and Full Resolution SAR ARDs in Multi-Cloud & HPC
We will build upon our cloud-native Advanced Rapid Imaging & Analysis (ARIA) science data system to address pain points in large-scale algorithm development and on-demand analysis of voluminous SAR measurements at full resolution, from L1 SLCs to L3 time series. We will increase the use of multi-temporal and full resolution SAR data and facilitate algorithm development and analysis for higher fidelity surface deformation and urgent response use cases.
The approach will enable full resolution time series analysis and high-accuracy flood and damage assessments with remote sensing SAR Analysis Ready Data (ARD). We will build upon our existing capabilities:
- Generation and analysis of the SAR ARDs will be done in a science notebook-based (e.g., Jupyter) algorithm development environment, where algorithms are published into the SAR ARD algorithm analysis “appstore”.
- We will integrate curated natural events from NASA’s Earth Observatory Natural Event Tracker (EONET) to help users set up automated triggers of on-demand SAR ARD analysis using their own algorithms from the algorithm analysis “appstore”.
- On-demand analysis jobs will be handled across multi-cloud (AWS, Google Cloud Platform, and Microsoft Azure) and NASA HPC (Pleiades) environments.
- Enabling “smart on-demand” analysis, in which jobs are cost-model-informed to manage processing costs across multi-cloud and HPC environments, e.g., optimizing for fast processing versus lower-cost requests; a toy placement decision of this kind is sketched below.
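The sketch below illustrates the kind of cost-model-informed placement decision described in the last item, with hypothetical environment names, prices, and queue times; the actual system weighs far more factors (data locality, spot-market dynamics, allocation limits).

```python
# Hypothetical numbers and environment names; illustrative decision logic only.
from dataclasses import dataclass

@dataclass
class Environment:
    name: str
    cost_per_hour: float     # USD per compute hour (0 for an existing HPC allocation)
    runtime_hours: float     # expected wall-clock time for one SAR ARD analysis job
    queue_wait_hours: float  # expected wait before the job starts

ENVS = [
    Environment("aws-spot",     0.50, 12.0, 0.1),
    Environment("gcp-preempt",  0.45, 12.0, 0.2),
    Environment("azure-lowpri", 0.55, 12.0, 0.15),
    Environment("nasa-hpc",     0.00, 12.0, 6.0),   # no marginal cost, but a queue
]

def place_job(envs, mode="cheapest", deadline_hours=None):
    """Pick an environment: 'cheapest' minimizes dollar cost, 'fastest' minimizes
    wall-clock completion time; an optional deadline filters candidates first."""
    completion = lambda e: e.queue_wait_hours + e.runtime_hours
    cost = lambda e: e.cost_per_hour * e.runtime_hours
    candidates = [e for e in envs
                  if deadline_hours is None or completion(e) <= deadline_hours]
    if not candidates:
        raise ValueError("no environment can meet the deadline")
    return min(candidates, key=cost if mode == "cheapest" else completion)

print(place_job(ENVS, mode="cheapest").name)                     # nasa-hpc
print(place_job(ENVS, mode="fastest").name)                      # aws-spot
print(place_job(ENVS, mode="cheapest", deadline_hours=13).name)  # gcp-preempt
```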
We plan to use the system to develop and process data for the following demonstration use cases:
(1) multi-temporal and full resolution ARDs of L1 coregistered SLC stacks
(2) full resolution time series for critical infrastructure monitoring use cases
(3) multi-temporal flood extent maps
(4) multi-temporal damage assessment maps
Beth Huffer, Lingua Logica LLC
AMP: An Automated Metadata Pipeline
A core function of the AIST Analytic Center Framework is to facilitate research and analysis that uses the full spectrum of data products available in archives hosting relevant, publicly available data. Key to this is making data “FAIR” (findable, accessible, interoperable, and reusable), not just for humans, but for automated systems. Effective data discovery services and fully automated, machine-driven transactions require metadata that can be understood by both humans and machines. But such metadata are uncommon. More commonly, metadata records are inadequately contextualized, incomplete, or simply do not exist. When they do exist, they often lack the semantic underpinnings to make them meaningful.
The goal of the Automated Metadata Pipeline (AMP) project is to
(1) Develop a fully-automated metadata pipeline that integrates machine learning and ontologies to generate syntactically and semantically consistent metadata records that advance FAIR objectives and support Earth science research for a diverse group of stakeholders ranging from scientists to policy makers, and
(2) Demonstrate the application of AMP-enhanced data to automate and substantially improve the use and reuse of NASA-hosted data in environmental/Earth systems models in the ARtificial Intelligence for Ecosystem Services (ARIES; Villa et al. 2014) platform, a distributed network of ecosystem services and Earth science models and data that relies on semantics to assemble network-available data and model components into ecosystem services models, built on demand and optimized for the context of application.
ARIES is a context-aware modeling system that, given adequate, semantically-grounded metadata, is capable of finding and assessing the suitability of candidate data sets for use with a particular model, and automatically establishing linkages between data sources and model components, performing a variety of mediation and pre-processing tasks to integrate heterogeneous data sets.
The AMP project aims to use machine learning techniques to auto-generate semantically consistent, variable-level metadata records for NASA data products and, in collaboration with the ARIES developer and user communities, demonstrate their value in supporting scientific research. In so doing, we hope to achieve several objectives:
- Address usability and scalability issues for data providers and metadata curators in connection with tools for generating robust variable-level metadata records;
- Improve the semantic interoperability of target NASA data products by linking concepts in AMP-generated metadata records to terms from well-established, external vocabularies such as the Environment Ontology (ENVO), thereby taking advantage of existing term mappings between ENVO and ontologies developed by other communities of practice, including NASA’s Semantic Web for Earth and Environment Technology (SWEET) Ontology;
- Demonstrate the benefits of semantically interoperable, FAIR data across communities of practice.
AMP will provide a platform for auto-generating robust, FAIR-promoting, semantically consistent metadata records, using neural nets to assign variables to ontological classes in the AMP Ontology. Our approach recognizes that there is a wealth of information contained within the data itself, which can be exploited to generate accurate and consistent metadata. We will work with the Goddard Earth Sciences Data and Information Services Center (GES DISC) at NASA Goddard Space Flight Center (GSFC), which will provide access to data via its OPeNDAP servers for training and testing the AMP pipeline, and for use by the ARIES platform.
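As a toy stand-in for the neural-net classification step, the sketch below trains a character n-gram text classifier (scikit-learn) to map variable-name/units strings to ontology class labels. The training strings and the "envo:" class labels are hypothetical, and AMP's actual pipeline also exploits the data values themselves rather than names alone.

```python
# Illustrative only: a toy text classifier standing in for the neural-net
# component that maps variable names/attributes to ontology classes.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical (variable name + units) strings and target ontology classes
train_text = [
    "t2m air temperature K", "skin_temp surface temperature K",
    "precip total precipitation mm hr-1", "rainfall_rate mm hr-1",
    "sm_surface soil moisture m3 m-3", "soil_moisture_profile m3 m-3",
    "chlor_a chlorophyll concentration mg m-3", "chl_ocx chlorophyll mg m-3",
]
train_class = [
    "envo:air_temperature", "envo:air_temperature",
    "envo:precipitation_rate", "envo:precipitation_rate",
    "envo:soil_moisture", "envo:soil_moisture",
    "envo:chlorophyll_concentration", "envo:chlorophyll_concentration",
]

clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),  # robust to odd variable names
    LogisticRegression(max_iter=1000),
).fit(train_text, train_class)

print(clf.predict(["soil_moist_rootzone m3 m-3", "precipitationCal mm hr-1"]))
```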
AMP advances the Analytic Center Framework objectives of allowing seamless integration of new and user-supplied components and data; increasing research capabilities and speed; handling large volumes of data efficiently; and providing novel data discovery tools.
Anthony Ives, University of Wisconsin, Madison
Valid Time-Series Analyses of Satellite Data to Obtain Statistical Inference about Spatiotemporal Trends at Global Scales
(a) Objectives and Benefits
As remote sensing has matured, there is a growing number of datasets that have both broad spatial extent and repeated observations over decades. These datasets provide unprecedented ability to detect broad-scale changes in the world through time, and to forecast changes into the future. However, rigorously testing for patterns in these datasets, and confidently making forecasts, require a solid statistical foundation that is currently missing. The challenge presented by remotely sensed data is the same as its remarkable value: remotely sensed datasets consist of potentially millions of time series that are non-randomly distributed in space.
We propose to develop new statistical tools to analyze large, remotely sensed datasets that will give statistical rigor to conclusions about patterns of change and statistical confidence to forecasts of future change. Our focus is providing statistical tests for regional-scale hypotheses using pixel-scale data, thereby harnessing the statistical power contained within all of the information in remotely sensed time series.
(b) Proposed Work and Methodology
We will develop both a framework and software tools for incorporating spatial correlation (non-independence) into analyses of remotely sensed time-series data. Our framework will apply to different types of time-series models that are currently used to analyze pixel-level or small-scale data including continuous changes (e.g., directional trends) and abrupt changes (e.g., breakpoint analyses). Specifically, we will address:
i. Patterns in annual trends in time series. Our tools will identify where there are significant time trends.
ii. Causes of trends. Our tools will examine which variables explain observed changes best (e.g., climate, elevation, human population, etc.).
iii. Within-year patterns of seasonal trends and phenological events.
Forecasts of future change are only useful if they include an estimate of their uncertainty; thus, statistical models are necessary. Our statistical approach is based on models that, once fit to data, can be used for forecasting. These forecasts will use the spatio-temporal correlations estimated from the data and therefore can account implicitly for regional differences in the past time series that are likely to be perpetuated into the future, even if the underlying drivers for these changes are unknown.
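A minimal sketch of one ingredient, correcting a per-pixel trend test for lag-1 temporal autocorrelation via an effective sample size, is given below. The proposed framework goes much further by modeling spatial correlation among millions of pixel time series, which this toy example ignores; the synthetic NDVI series and AR(1) parameter are assumptions.

```python
# Minimal sketch: a per-pixel trend test that inflates the trend-slope variance
# for lag-1 temporal autocorrelation (spatial correlation is ignored here).
import numpy as np

def trend_with_ar1_correction(y):
    """OLS slope of y against time, with its standard error computed from the
    effective sample size n_eff = n * (1 - r1) / (1 + r1)."""
    n = y.size
    t = np.arange(n, dtype=float)
    slope, intercept = np.polyfit(t, y, 1)
    resid = y - (slope * t + intercept)
    r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]       # lag-1 autocorrelation
    n_eff = max(3.0, n * (1.0 - r1) / (1.0 + r1))
    se = np.sqrt(np.sum(resid**2) / (n_eff - 2) / np.sum((t - t.mean())**2))
    return slope, se

rng = np.random.default_rng(7)
n_years = 35
noise = np.zeros(n_years)
for i in range(1, n_years):                              # AR(1) noise, rho = 0.6
    noise[i] = 0.6 * noise[i - 1] + rng.normal(0, 0.05)
ndvi = 0.4 + 0.002 * np.arange(n_years) + noise          # synthetic pixel time series
slope, se = trend_with_ar1_correction(ndvi)
print(f"trend: {slope:.4f} +/- {se:.4f} NDVI units per year")
```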
We will test our algorithms with AVHRR/GIMMS3g, MODIS, AMSR-E, JAXA/JASMES, and Landsat data at global to regional scales to provide a proof-of-concept, demonstrate the feasibility, and highlight the value of our approach to the remote sensing community at large.
Our project will make substantial contributions to the AIST Goal of Increasing the Accessibility and Utility of Science Data by providing appropriate statistical tools for large remotely sensed time-series datasets. There are no available, easy-to-apply methods for testing hypotheses explaining regional patterns of past change and predicting future change that can be scaled to the size of remotely sensed datasets. We propose to remedy this, thereby making a major contribution to both remote sensing science and the application of satellite imagery for decision making. Our proposal fits under the AIST Core Topic of Data-Centric Technologies.
(c) Period of Performance
Our project will span two years.
(d) Entry and Planned Exit TRL
Our proposal will enter at Software TRL 2 and exit at TRL 5.
Walter Jetz, Yale University
An Analytic Center for Biodiversity and Remote Sensing Data Integration
Novel and rapidly growing biodiversity data streams from a range of sensors and sources now enable a much improved capture of geographic variation in status and trends of biodiversity. The combination of these data with remotely sensed information powerfully extends this assessment into environmental space and, with the right workflows and infrastructure, holds the potential for near real-time monitoring of the biological pulse of our planet. Successful integration needs to account for the differing spatiotemporal resolution and uncertainty in both data as well as biodiversity data biases, and support analysis, visualization and change detection across scales. These complexities require a range of tasks and decisions by scientists and communities who wish to benefit from new opportunities at the interface of biodiversity and remote sensing.
The proposed Analytic Center will provide analysis products, visualizations, interactive analytical tools, and guidance, all combined in an online dashboard, that will support a large community of users interested in linking environmental and biodiversity information. This project will leverage prototyped tools developed in a first-phase AIST grant (16-AIST16-0092) and specifically i) improve their technological readiness and usability, ii) add a range of analysis and visualization options, iii) strongly expand the breadth of remote sensing products and pre-annotated biodiversity products, and iv) scale up the on-demand biodiversity-environment annotation service. In-situ biodiversity has intrinsic spatiotemporal grain and associated uncertainty that varies with methodology.
The Center will provide this service to a broad community. Specifically, the Center online dashboard will i) allow users to perform a guided selection among a large suite of biodiversity-relevant remote sensing-supported biodiversity layers (including MODIS, Landsat, Sentinel, Airbus One Atlas, and others), ii) access available spatial biodiversity data (ca. 1 billion records) from repositories such as GBIF (publicly shared data), Movebank (movement data), Wildlife Insights (camera-trapping data) and Map of Life (additional data sources and types, including inventories and expert maps) and visually explore these data in multi-dimensional environment space, iii) support decisions around the most appropriate environmental characterization of these data, iv) support user-driven visualizations, reports, and data download, v) provide visual niche data coverage and niche change reporting, and vi) support these same services for user-uploaded data or user-driven annotation algorithms for up to 0.5M points.
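A minimal sketch of the core annotation step, attaching the nearest grid-cell value of an environmental layer to occurrence records, is shown below with a synthetic raster. The operational service described above adds layer selection, temporal matching, uncertainty handling, and scaling to hundreds of millions of records; the grid, layer, and points here are hypothetical.

```python
# Minimal sketch of the point-annotation step (grid and values are synthetic).
import numpy as np

# A toy global environmental raster on a regular 0.5-degree grid
lat_grid = np.arange(-89.75, 90.0, 0.5)
lon_grid = np.arange(-179.75, 180.0, 0.5)
layer = np.random.default_rng(3).uniform(0.0, 1.0, (lat_grid.size, lon_grid.size))

def annotate(points_latlon, grid, lats, lons):
    """Attach the nearest grid-cell value of an environmental layer to each
    (lat, lon) occurrence record."""
    pts = np.asarray(points_latlon, dtype=float)
    i = np.abs(lats[:, None] - pts[:, 0][None, :]).argmin(axis=0)
    j = np.abs(lons[:, None] - pts[:, 1][None, :]).argmin(axis=0)
    return grid[i, j]

occurrences = [(41.3, -72.9), (-1.5, 36.8), (68.0, 27.0)]  # species records (lat, lon)
print(annotate(occurrences, layer, lat_grid, lon_grid))
```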
The proposed platform will fill an important technological gap between biodiversity and remote sensing data and their associated communities. Specifically, it will lower the barrier of entry for the use of remote sensing data for a range of user communities in biodiversity science and conservation and enable them to access, use, and interpret biological signals enriched with remotely sensed environmental data. The proposed Center will provide the thousands of users of GBIF, Movebank, or Wildlife Insights data with the option to work with environmentally enriched biodiversity data or drive custom annotations. The workflows developed under this grant will provide a scalable solution to environmental annotation that is central to species environmental niche assessment, distribution modelling, and the development of species population EBV products. Together with additional visual reports this will directly benefit the biodiversity monitoring and assessment community, including GEO BON, and support EBV development and change assessment in Map of Life. The planned entry TRL for this project is 5 and the exit is aimed at TRL 7.
Randall Martin, Washington University in St. Louis
Development of the High Performance Version of GEOS-Chem (GCHP) to Enable Broad Community Access to High-Resolution Atmospheric Chemistry Modeling in Support of NASA Earth Science
Global modeling of atmospheric chemistry is a grand scientific and computational challenge because of the need to simulate large coupled systems of order ~100-1000 chemical and aerosol species interacting with transport on all scales. Atmospheric chemistry models must be integrated online into Earth system models (ESMs) to address a range of fundamental Earth Science questions but must also be able to operate offline, where the chemical continuity equations are solved using meteorological data as input. The offline approach has advantages of simplicity, accessibility, reproducibility, and application to inverse modeling. It is thus highly attractive to atmospheric chemistry researchers. The GEOS-Chem global 3-D model of atmospheric chemistry, developed and managed with NASA support, is used offline by hundreds of research groups worldwide with meteorological fields from the NASA Goddard Earth Observing System (GEOS), and the exact same model also operates online as a chemical module within the GEOS ESM at the NASA Global Modeling and Assimilation Office (GMAO).

Through partnership with GMAO, we have recently developed a high-performance version of GEOS-Chem (GCHP) using the Earth System Modeling Framework (ESMF) in its Modeling Analysis and Prediction Layer (MAPL) implementation, that permits GEOS-Chem to operate offline in a distributed-memory framework for massive parallelization consistent with the NASA GEOS system. GCHP allows GEOS-Chem to exploit the native GEOS cubed-sphere grid for greater accuracy and computational efficiency. Global simulations of stratosphere–troposphere oxidant–aerosol chemistry at very high resolution such as cubed-sphere C720 (∼12 km) become feasible.

Here we solicit support from AIST to make this high-performance version of GEOS-Chem highly accessible by the atmospheric chemistry community in sustained partnership with GMAO. This will allow the atmospheric chemistry community to better exploit the GEOS system through the use of GEOS-Chem, and to advance atmospheric chemistry knowledge for the benefit of the GEOS system and NASA’s mission. Specifically, we propose to:
1) Update to current version of MAPL and enable seamless updates to benefit from future software engineering advances at GMAO for ESM coupling and data transfer, and to contribute MAPL enhancements produced by the GEOS-Chem community for atmospheric chemistry applications. MAPL developments at GMAO will increase GCHP functionality and capabilities, such as the use of stretched-grid simulations.
2) Improve GCHP performance and portability to promote usage across the large GEOS-Chem community and beyond. This effort includes (a) fully parallelizing the model to remove current bottlenecks, thus enabling finer resolution simulations that better emulate the GEOS system, (b) improving the build system and using software containers to facilitate access and ease model configuration, and (c) supporting a mature multi-node capability on the Amazon cloud.
3) Generate an operational cubed-sphere archive of GEOS assimilated meteorological data for driving GCHP. This task will increase accuracy in modeling transport and will provide a better foundation for collaboration between the GEOS-Chem community and GMAO.
This project will greatly increase the accessibility and value of NASA GEOS products and GEOS-Chem to the atmospheric chemistry community worldwide, from academic researchers to air quality managers and policy analysts. It will not only exploit NASA’s investment in atmospheric modeling to benefit the atmospheric chemistry community, but also engage that community in long-term partnership with GMAO to advance atmospheric chemistry within the NASA Earth modeling and data assimilation system. Thus, this project is at the core of NASA’s ACF mission to develop powerful, re-usable tools to support scientific analysis of large volumes of data, often from disparate sources, as needed by the Earth Science Division in the 5-20 year timeframe.
Mahta Moghaddam, University of Southern California
SPCTOR: Sensing-Policy Controller and OptimizeR
This proposal outlines new technology concepts for multi-sensor infusion under the AIST Program’s current New Observing Strategy (NOS) thrust, with specific focus on (a) Integrated operation of different types of instruments or at different vantage points, (b) Evaluation/comparison of alternative observing [sensing] strategies, and (c) Estimation of science value to enable comparison of observing strategies.
Specifically, we propose to develop technologies that coordinate the operations of networks of ground-based and Unmanned Aerial Vehicle (UAV)-based sensors, jointly referred to as Agents. When sensing Agents are coordinated, they can deliver ground-truth information to multiple end-uses that may have different application requirements, especially for different NASA remote sensing science products. We call the overall system achieving this technology vision “Sensing Policy Controller and OptimizeR (SPCTOR).”
This technology concept is particularly relevant when considering the fact that existing ground networks have limited capabilities in terms of spatiotemporal sampling flexibility, multi-user coordination, and multi-sensor integration. These limitations are further compounded when multiple users, each with different application requirements, seek access to sensing networks that may have limited resources. Earth-looking remote sensing observatories, in particular, almost always require ground based in-situ networks for geophysical product validation and calibration. In the domain of soil moisture remote sensing, which is our target observable, current (e.g., SMAP and SMOS) and future (e.g., NISAR) missions have different spatial (100 m – 10s km) and temporal (daily to weekly) mapping requirements, resulting in complex and potentially conflicting demands on in-situ networks for validating their products.
This proposal will directly address the aforementioned limitations. Building upon elements of existing high-TRL (up to 7) technology heritage from the Soil moisture Sensing Controller and oPtimal Estimator (SoilSCAPE), we propose a new framework to coordinate soil moisture sensing strategies across multiple battery-operated Agents by means of a new machine-learning-based entity called the Sensing-Policy Controller (SPC). In essence, the SPC considers Agents’ energy constraints, recent data and observations, and respective application (or science) based performance metrics, and then determines whether to update an Agent’s observation strategy. We will further expand in situ sensor network capabilities to enable integration, interaction, and interoperability between ground sensor networks and UAV-based software-defined radars (SDRadars) for field-scale operations. UAVs have reached a level of technological maturity such that their inclusion within existing sensor networks with potential science applications can be feasibly considered.
We therefore pursue two principal technology development and research objectives:
Objective 1: Develop a Sensing-Policy Controller (SPC) for multi-Agent observation strategy coordination and optimization. (NOS elements b and c).
Objective 2: Develop and demonstrate integrated operations between in-situ wireless sensor networks and networks of UAV-SDRadars (NOS element a).
Our technology development plan includes analysis, computations, laboratory experiments, and field demonstrations. The entry Technology Readiness Levels (TRLs) for the above two objectives are 2 and 4 and exit TRLs are 4 and 6, respectively.
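As an illustration of the kind of decision the SPC in Objective 1 must make, the sketch below hard-codes a simple rule for adjusting an Agent's sampling interval from its battery state and recent observation variability. The thresholds and state variables are hypothetical, and the proposed SPC learns such policies with machine learning rather than fixing them by hand.

```python
# Illustrative decision rule only (thresholds and field names are hypothetical).
from dataclasses import dataclass

@dataclass
class AgentState:
    battery_fraction: float      # 0..1 remaining energy
    recent_obs_std: float        # variability of recent soil-moisture samples
    sampling_interval_min: int   # current interval between measurements [minutes]

def update_policy(state: AgentState,
                  min_interval: int = 15,
                  max_interval: int = 720) -> int:
    """Return a new sampling interval: sample more often when the signal is
    changing quickly, back off when energy is low or the signal is quiet."""
    interval = state.sampling_interval_min
    if state.battery_fraction < 0.2:
        interval *= 2                      # conserve energy
    elif state.recent_obs_std > 0.05:      # rapid wetting/drying detected
        interval //= 2                     # densify sampling
    elif state.recent_obs_std < 0.01:
        interval *= 2                      # quiet period, relax sampling
    return int(min(max(interval, min_interval), max_interval))

print(update_policy(AgentState(0.90, 0.08, 60)))  # -> 30 (event detected: sample faster)
print(update_policy(AgentState(0.15, 0.08, 60)))  # -> 120 (low battery takes priority)
```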
John Moisan, NASA Goddard Space Flight Center
NASA Evolutionary Programming Analytic Center (NEPAC) for Climate Data Records, Science Products and Models
We propose to develop an Analytic Center Framework (ACF), called the NASA Evolutionary Programming Analytic Center (NEPAC), that will enable scientists and engineers to rapidly formulate algorithms for satellite data products using both satellite and in situ observations. ACFs are one of two primary thrusts of the NASA Advanced Information Systems Technology (AIST) Program. NEPAC’s primary scientific goal is to discover and apply new and novel algorithms for ocean chlorophyll a (Chl-a), the key biological state variable for ocean and inland water ecosystem assessment, historically the first-order proxy for phytoplankton biomass estimates (Cullen, 1982), and a necessary input for contemporary carbon-based productivity estimates (Behrenfeld et al., 2005), making Chl-a observations a critical element for global phytoplankton climate assessment (Behrenfeld, 2014). The ACF will initially focus on generating Chl-a algorithms that target improvement of the uncertainties and on annealing these estimates across multiple Ocean Color (OC) satellite data sets, a needed step for establishing a coherent Chl-a climate data record. Ocean phytoplankton are primary components of the global biogeochemical cycle, generating nearly half of the global atmospheric oxygen supply, and are drivers of the global ocean carbon sequestration processes (Dutkiewicz et al., 2019).

The core element of this ACF is a Genetic Programming (GP) application that generates either prognostic equations, such as satellite algorithms, or coupled systems of equations, such as those used in ocean ecosystem models. The proposed design of the ACF’s work environment centers on providing ocean scientists, the ACF target community, easy access through a web-enabled, user-friendly Graphical User Interface (GUI) to an evolutionary search algorithm toolbox which will connect data and applications to high end computing resources. This GUI-based environment will be the portal through which users will: i) select the independent and dependent variables from relevant satellite and in situ data sets; ii) select the performance metrics from a range of traditional and improved techniques; iii) set the GP application’s free parameters and options; and iv) select the computational infrastructure to which the work will be submitted. The GUI will operate outside of the high end computational and data storage infrastructure. A set of Application Programming Interfaces (APIs) will be developed, or a pre-existing and compatible one modified, to link the GUI application to the computational infrastructure that will carry out the GP computations using the chosen data sets.

The overall technology goal is to enable the ocean science community to use computers to objectively, efficiently, and rapidly generate algorithm-based satellite products with improved outcomes (lower errors, bias, and uncertainty) necessary for answering key science questions regarding the sensitivity of ocean ecology to relevant issues, such as global climate variability. The initial target satellite product for this proposal is Chl-a. However, we, and others, have demonstrated that satellite observations are capable of retrieving pigments (Moisan et al., 2011), phytoplankton functional types (Hirata et al., 2011; Moisan et al., 2017), and primary production estimates (Behrenfeld, 2005), all important components for ecosystem and carbon cycle studies. Algorithms for these and other potential satellite products (new production, carbon flux) can also be generated using this ACF.
In addition, this ACF will be extensible so that it can support other algorithm development needs within and outside of the NASA community. NEPAC will lead to considerable time and cost savings, and yield improved algorithms that will enable improved science results.
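As a rough, purely illustrative sketch of the evolutionary search idea, the toy example below evolves the coefficients of a log band-ratio polynomial against synthetic in situ Chl-a matchups; the data, coefficient-mutation scheme, and metric are assumptions made for illustration and are not NEPAC’s tree-based GP, its data sets, or its performance metrics.

```python
# Toy evolutionary search for a Chl-a band-ratio algorithm (illustration only).
# All data are synthetic; real runs would use satellite Rrs and in situ Chl-a matchups.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic matchups: log10 blue/green reflectance ratio vs. log10 Chl-a (mg m^-3)
log_ratio = rng.uniform(-0.5, 0.5, 200)
true_coeffs = np.array([0.3, -2.5, 1.0])                       # hidden "truth"
log_chl_insitu = np.polyval(true_coeffs, log_ratio) + rng.normal(0, 0.05, 200)

def rmse(coeffs):
    """Performance metric: RMSE of predicted vs. in situ log10(Chl-a)."""
    return np.sqrt(np.mean((np.polyval(coeffs, log_ratio) - log_chl_insitu) ** 2))

# Evolve a population of candidate coefficient sets by selection + mutation
population = rng.normal(0.0, 1.0, (50, 3))
for generation in range(200):
    scores = np.array([rmse(ind) for ind in population])
    parents = population[np.argsort(scores)[:10]]              # keep the 10 fittest
    children = parents[rng.integers(0, 10, 40)] + rng.normal(0, 0.1, (40, 3))
    population = np.vstack([parents, children])                # elitism + mutation

best = min(population, key=rmse)
print("best coefficients:", best, "RMSE:", rmse(best))
```

NEPAC’s GP would instead evolve the functional form of the equations themselves and submit the search to high-end computing resources through the GUI and APIs described above.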
Sreeja Nag, NASA Ames Research Center
D-SHIELD: Distributed Spacecraft with Heuristic Intelligence to Enable Logistical Decisions
Distributed Space Missions (DSMs) can increase measurement sampling across multiple spatio-temporal vantage points. Smaller spacecraft can now carry imager/radar payloads, even if constrained by power and data downlink. Large numbers of spacecraft improve responsiveness, revisit time, and coverage, even with static payloads. Fully re-orientable payloads, tasked by autonomous decision-making, can further improve the capability to reactively observe phenomena. D-SHIELD is a suite of scalable software tools: a Scheduler, a Science Simulator, and an Analyzer. The goal is to schedule the payload operations of a large constellation, with multiple payloads per spacecraft and across spacecraft, such that the collection of observational data and their downlink, subject to the DSM’s constraints (orbital mechanics), resources (e.g., power), and subsystems (e.g., attitude control), results in maximum science value for a selected use case. The constellation topology, spacecraft, and ground network characteristics can be imported from design tools or existing constellations. The D-SHIELD Scheduler is informed by the D-SHIELD Science Simulator, which is based on a simplified Observing System Simulation Experiment developed for a relevancy scenario. It assimilates data from past observations of the DSM along with other sources, and predicts the relative, quantitative value of observations or operational decisions. We will also build a D-SHIELD Analyzer to evaluate the performance of D-SHIELD for given user inputs and to compare options for its components (e.g., optimization algorithms). The Analyzer will also assess the trades of running D-SHIELD onboard vs. on the ground vs. a combination of both. D-SHIELD is thus a mission operations design tool that coordinates autonomous reactions to new events, observations of existing events with changed requirements, and off-nominal situations like failed observations or communications.
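As a simplified illustration of the scheduling problem only (not the D-SHIELD Scheduler’s algorithms), the sketch below greedily assigns observation opportunities to spacecraft by predicted science value per unit energy under a per-spacecraft energy budget; the data structures, fields, and numbers are hypothetical.

```python
# Hypothetical, minimal constellation-scheduling sketch (not the D-SHIELD Scheduler).
from dataclasses import dataclass

@dataclass
class Opportunity:
    spacecraft: str       # spacecraft able to take this observation
    target: str           # ground target, e.g. a soil-moisture grid cell
    science_value: float  # relative value predicted by a science simulator
    energy_cost: float    # Wh consumed by slewing plus payload operation

def greedy_schedule(opportunities, energy_budget_wh):
    """Select opportunities in order of value per unit energy until budgets run out."""
    remaining = dict(energy_budget_wh)            # spacecraft -> Wh still available
    plan = []
    for opp in sorted(opportunities,
                      key=lambda o: o.science_value / o.energy_cost,
                      reverse=True):
        if remaining.get(opp.spacecraft, 0.0) >= opp.energy_cost:
            remaining[opp.spacecraft] -= opp.energy_cost
            plan.append(opp)
    return plan

opportunities = [
    Opportunity("sat-1", "cell-A", science_value=0.9, energy_cost=12.0),
    Opportunity("sat-1", "cell-B", science_value=0.4, energy_cost=5.0),
    Opportunity("sat-2", "cell-A", science_value=0.7, energy_cost=10.0),
]
print(greedy_schedule(opportunities, {"sat-1": 15.0, "sat-2": 8.0}))
```

The actual Scheduler must additionally honor orbital mechanics, attitude-control, and downlink constraints and coordinate decisions across spacecraft, which is what motivates the heuristic approach and the onboard-versus-ground trade studies described above.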
To validate D-SHIELD, we will apply it to schedule representative constellations measuring spatio-temporal distributions of soil moisture, which varies on spatial scales of a few meters to many kilometers and time scales of minutes to weeks. Observations are accomplished most effectively via power-hungry microwave active and passive sensors (radars and radiometers) at frequency bands P through K. Because soil moisture fields are dynamic and sensitive to viewing geometry, an ideal observational strategy requires that remote sensing instruments have a high degree of agility in their orientation and spatio-temporal sampling. Using a combination of prior/current observations and hydroecologic modeling predictions, the Science Simulator will task the next set of space assets based on anticipated events, such as heavy precipitation and flooding (rapid time scales) and droughts (not rapidly emerging but requiring well-planned long time series of observations). We will also ensure D-SHIELD’s applicability to other relevancy scenarios such as tropical cyclones, wildfires, and urban floods.
We have several ongoing projects complementing the proposed work. We have published an algorithmic framework that schedules the time-varying, full-body orientation of single-payload small spacecraft in a constellation. This scheduler can run autonomously at the ground station, with schedules uplinked to the spacecraft, or onboard the small spacecraft themselves, so that the constellation can make decisions autonomously, without ground control. The algorithm also applies Disruption Tolerant Networking for reliable, low-latency communication over constantly changing inter-satellite links. We are working on a flight mission that will demonstrate reactive measurements of plasma density by an autonomous swarm of four cross-linked spacecraft, due for launch in 2020-21. We have prototyped an agent-based simulator with centralized and decentralized planning algorithms. It can probabilistically assess the value of forming temporary coalitions to observe targets on the ground simultaneously from various platforms and sensors.
Derek Posselt, Jet Propulsion Laboratory
A Science-Focused, Scalable, Flexible Instrument Simulation (OSSE) Toolkit for Mission Design
Observing System Simulation Experiments (OSSEs) are used to design missions and instrument constellations and to evaluate their science return. A critical component in this process is the simulation of both measurements and retrievals, and the assessment of candidate measurement configurations. The trade space that consists of all possible instrument and spacecraft configurations has expanded tremendously with the recent miniaturization of instruments, and the rise of small-sat (and cube-sat) technology. An exhaustive search through all possible instrument configurations is computationally infeasible with our current tools and infrastructure. Our approach allows quantification of elements in a science and applications traceability matrix (SATM), making it distinct from other mission design toolkits.
We propose to develop a fast-turnaround, scalable OSSE Toolkit that can support both rapid and thorough exploration of the trade space of possible instrument configurations, with full assessment of the science fidelity. The capability to rapidly explore various instrument configurations is enabled through the use of both lower-fidelity and state-of-the-art simulators and radiative transfer codes, along with a scalable parallel computing framework utilizing the Apache PySpark (Map-Reduce analytics) and xarray/dask technologies. The toolkit will automate the entire mission simulation workflow and scale to large analyses by parallelizing many operations, including the search of the parameter space by an ensemble of runs (30 to 1000x speedups), using cluster computing either on-premise or in the Cloud.
The toolkit will consist of:
- An Apache Spark & xarray Map-Reduce compute framework with pluggable modules in the form of Docker containers for measurement and retrieval simulation, with the flexibility to plug in other instrument simulators and retrieval codes
- A scalable system, tested in the cloud, capable of examining large numbers of potential measurement configurations with the Map/Reduce framework
- A front-end GUI for easy configuration, plotting, and computation of statistics, and a set of “live” Python notebooks that use JupyterLab and JupyterHub, a Python client library, and a web services API
- A Knowledge Base (ElasticSearch JSON doc database) set up to track the simulations performed to facilitate rapid exploration of simulation results and ensure reproducibility
The target application is quantitative evaluation of science return from candidate measurements made in the Decadal Survey Aerosols, Clouds, Convection, and Precipitation (A-CCP) mission. In our OSSE workflow, many instrument configurations are simulated in parallel (the Map step), including measurements (e.g., from spaceborne radar) and retrievals (e.g., ice water path and precipitation content profile). Then metrics are aggregated that allow quantitative comparison of the possible configurations (the Reduce step). The majority of the computational work is highly parallelizable by segmenting over time (each instrument view) and over the ensemble of runs needed to search and characterize the mission parameter space and evaluate tradeoffs. We utilize instrument simulators that have already been packaged for use within virtual machines, and as such are already capable of running on the Cloud. Our toolkit will enable a breakthrough in the number and fidelity of mission simulations undertaken by automating the entire pipeline, parallelizing many of the steps, and providing fast turnaround for iterative exploration.
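A minimal sketch of this Map/Reduce pattern in PySpark follows; simulate_config() is a hypothetical placeholder for the packaged instrument and retrieval simulators, and the “metric” it returns is purely illustrative.

```python
# Minimal PySpark Map/Reduce sketch of the OSSE configuration sweep (placeholder simulator).
from pyspark.sql import SparkSession

def simulate_config(config):
    """Map step: run a placeholder forward simulation + retrieval for one candidate
    instrument configuration and return a science-fidelity metric."""
    n_radar_bands, orbit_alt_km = config
    retrieval_error = orbit_alt_km / (100.0 * n_radar_bands)   # toy stand-in metric
    return {"config": config, "retrieval_error": retrieval_error}

spark = SparkSession.builder.appName("osse-sweep").getOrCreate()
configs = [(bands, alt) for bands in (1, 2, 3) for alt in (400.0, 500.0, 600.0)]

# Map: simulate all configurations in parallel; Reduce: keep the best-performing one
best = (spark.sparkContext.parallelize(configs)
        .map(simulate_config)
        .reduce(lambda a, b: a if a["retrieval_error"] < b["retrieval_error"] else b))

print("best configuration:", best)
spark.stop()
```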
Stephanie Schollaert Uz, NASA Goddard Space Flight Center
Supporting Shellfish Aquaculture in the Chesapeake Bay Using Artificial Intelligence to Detect Poor Water Quality Through Sampling and Remote Sensing
Aquaculture in the Chesapeake Bay and around the world is the fastest-growing food-producing sector, now accounting for almost 50% of the world’s fish harvest according to the FAO. Yet waterways face increasing pressures as the world’s population grows near the coasts and extreme weather leads to greater run-off from land. Pollutants such as agricultural fertilizers, livestock waste, and overflowing sewage make their way into streams and rivers, with detrimental impacts on water quality in the Bay and elsewhere [e.g. 2019 Chesapeake Bay Foundation Report Card; MD DNR, 2019; Schollaert Uz et al., 2019]. Responsible management of aquaculture requires access to reliable information on a variety of environmental factors that are not currently available at optimal scales in space and time.
To address this challenge, we propose to expand the use of Earth observations from satellites and other data sets by using an Analytic Center Framework (ACF) Artificial Intelligence (AI) tool to improve efficiencies in managing diverse environmental datasets. Computationally intensive AI algorithms trained with many sources of observations have the potential to detect patterns of poor water quality that were not previously detectable through traditional techniques. Our team of experts in remote sensing, aquatic ecology, and biogeochemistry, along with Maryland Department of the Environment (MDE) shellfish regulators, has been conducting a scoping study of this problem since last fall in search of optical signatures in the water around leaking septic systems. We have been collecting and analyzing biological, chemical, and physical variables in and above the water at target sites and in the lab. Although toxins that typically cause shellfish bed closures are not discernible by traditional multispectral techniques, we are looking for hyperspectral proxies covarying with such toxins. One promising result of our pilot study has been the detection of a shifted fluorescence emission from dissolved organic matter, along with emission by phycoerythrin pigments, associated with high fecal coliform counts.
Now we propose to team up with computer scientists to apply an AI model to this problem, using datasets collected monthly at 800 sites around the Bay for decades in combination with remotely sensed satellite and aerial data from targeted field work. This addresses an applied research need: to sort through disparate data sets more effectively and rapidly and to identify areas of poor water quality that result in shellfish bed closures. While this project is a proof-of-concept in the well-sampled Chesapeake Bay, the long-term goal is a global application on a satellite platform. This is aligned with several goals of NASA for this decade, namely through the development of applications that contribute to managing water quality, one of the essential but overlooked elements of regional and local sustainable water resources management. This proposal is relevant to the AIST Program Element by integrating previously unlinked datasets and tools into a common platform to address this previously intractable science problem.
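For illustration only, the sketch below trains a random forest on synthetic spectral features paired with a hypothetical coliform-exceedance label; the features, labels, and model choice are assumptions meant to indicate the kind of supervised learning envisioned, not the project’s actual AI model or data.

```python
# Illustrative water-quality classifier on synthetic data (not the project's model).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n_samples = 500

# Hypothetical features per sample: band ratio, fluorescence peak shift, SST, salinity
X = rng.normal(size=(n_samples, 4))
# Hypothetical label: 1 if fecal coliform counts exceed a closure threshold
y = (X[:, 1] + 0.5 * X[:, 2] + rng.normal(0, 0.5, n_samples) > 0.8).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```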
Jennifer Swenson, Duke University
The Bridge from Canopy Condition to Continental Scale Biodiversity Forecasts, Including the Rare Species of Greatest Conservation Concern
Emerging remote sensing tools have the potential to track biodiversity changes that are happening now and anticipated for coming decades, with new insights on the foundations of terrestrial food webs, particularly the high-quality masting fruits, nuts, and seeds that make up 30-100% of herbivore diets. The need for scientific progress on the foundations of terrestrial food webs is most acute for the threatened species of greatest conservation concern. Lidar and hyperspectral imagery provide a high-dimensional representation of canopy structure and its changing spectral reflectance properties over time that are only just beginning to be explored. Emerging insights on canopy condition must be linked to the production of mast that supports herbivores. The Masting Inference and Forecasting (MASTIF) network, coupled with remote sensing and herbivore monitoring at National Ecological Observatory Network (NEON) sites, provides a first opportunity to fully calibrate changes in canopy condition, mast production, and herbivore responses at a continental scale. Building on NASA-AIST support to initiate joint analysis of NEON and remote sensing products, this proposal confronts three main prediction challenges for ecological forecasting: i) translating the most recent remote sensing products into the canopy condition and drought stress metrics with the greatest promise for predicting the supply of mast resources to herbivores, ii) leveraging information from the full community of species to improve prediction for the rare species of greatest conservation concern, and iii) resolving the big-data challenge represented by spatio-temporal dependence in large, raster-based arrays of remotely sensed canopy variables and climate. We will create a web visualization portal that provides display and interaction with model results as well as visualization of biodiversity trends.
The three elements of our proposed study lead from canopy characterization to mast production by individual trees to continent-scale prediction of mast and the consumers that depend on it, jointly as a community. First, we characterize canopy condition at NEON locations, where hyperspectral and lidar data are paired with mast-production and consumer data. Second, new canopy condition variables enter as covariates (predictors) for tree fecundity, including coarser-scale remotely sensed drought stress indices (e.g., a new thermal stress index developed from MODIS LST (DroughtEye, 4 km), and NASA/JPL’s new ECOSTRESS sensor for validation and downscaling). This second step uses MASTIF and Generalized Joint Attribute Modeling (GJAM) for joint community prediction. Third, we use US Forest Inventory and Analysis plots with the fitted MASTIF to project mast production nationally. This third step engages dimension reduction techniques developed specifically for dependence in large spatio-temporal arrays. Finally, all canopy condition and drought indices, habitat, and mast predictors are used to predict consumer abundances nationally. These predictive distributions include conditional prediction of rare species to reevaluate ecological forecasting for conservation goals.
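As a loose, simplified analogue of the joint community prediction step (a stand-in, not MASTIF or GJAM), the sketch below fits a multi-output regressor that predicts mast production for several species jointly from hypothetical canopy-condition covariates and then projects to new plots.

```python
# Stand-in for joint prediction of mast production across species (synthetic data).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
n_plots = 300

# Hypothetical covariates: lidar canopy height, foliar spectral index, thermal drought stress
X = rng.normal(size=(n_plots, 3))
# Jointly modeled responses: relative seed production for three species
Y = np.column_stack([
    2.0 * X[:, 0] - 1.0 * X[:, 2] + rng.normal(0, 0.3, n_plots),
    1.5 * X[:, 1] - 0.5 * X[:, 2] + rng.normal(0, 0.3, n_plots),
    0.5 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.3, n_plots),  # a rarer species
])

model = RandomForestRegressor(n_estimators=300, random_state=0).fit(X, Y)

# Project to new plots (e.g., inventory locations) where only covariates are observed
X_new = rng.normal(size=(5, 3))
print(model.predict(X_new))   # one column of predictions per species
```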
We will provide public access to our results and visualization of the NEON biodiversity trends, indices of ecosystems that characterize community and physical habitat structure, energy flow, herbivore food webs, and their vulnerability to climate change via drought stress (http://pbgjam.org). Products of this analysis will take the form of i) software for dimension reduction in ecological forecasting, available as an R package, ii) remotely sensed canopy condition and drought-stress maps, iii) community predictions of biodiversity change, and iv) conditional prediction of threatened species.
The goals of this project parallel those of the Decadal Survey in tracking flows of energy and changes in ecosystem structure, function, and biodiversity, and in relying on national field data collection and space-borne NASA mission data.
Philip Townsend, University of Wisconsin, Madison
On-Demand Geospatial Spectroscopy Processing Environment on the Cloud (GeoSPEC)
The 2017-2027 Decadal Survey prioritizes a spaceborne imaging spectrometer to advance the study of Surface Biology and Geology (SBG) globally. It would generate large volumes (~20 TB/day) of high dimensional (>224 bands) data, with a wide range of measurement objectives spanning vegetation, hydrology, geology, and aquatic studies. Many of the observables identified for SBG have been demonstrated using airborne imaging spectroscopy, but there is a potentially very large set of products that may be asked of a mission like SBG, and the algorithms, L1-to-L3+ processing flows, and intermediate products may vary considerably by application. Further, multiple well-vetted algorithms in the processing chain may exist for the same observables (and are ever-improving), and users may wish to tune the retrievals based on parameterizations particular to an application or using constraints from field measurements to improve algorithm performance in localized settings. Because of the demand for products from imaging spectroscopy from a wide range of users as well as the dynamic nature of the algorithms, our basic premise is that in the future science data systems for imaging spectrometer data will differ dramatically from current approaches. We propose an On-Demand Geospatial Spectroscopy Processing Environment on the Cloud (GeoSPEC) as a necessary information technology innovation required to meet the needs of both users and distributors of the products of imaging spectroscopy in the forthcoming era of widespread and possibly global hyperspectral data availability.
We will develop the technology using terrestrial vegetation use cases for mapping vegetation foliar traits and fractional cover. We will provide users with options for new atmospheric correction protocols, other corrections (such as BRDF and topography), and options for algorithm selection. GeoSPEC will also include off- and on-ramps in the processing workflow for users to implement their own code or commercial programs on their own systems, thus facilitating flexibility in the application of vetted algorithms for product distribution. The proposed project will leverage existing NASA-funded technologies: EcoSIS for spectral libraries, EcoSML for spectral models, the SWOT/NISAR cloud-based science data system (HySDS), as well as data analysis services (Apache SDAP), interactive visualization and analytics (Common Mapping Client, CMC), and the open-source Python packages HyTools for hyperspectral processing and ISOFIT for atmospheric correction. The platform will be developed using Level 1 calibrated radiance from two imaging spectrometers with large and diverse data records: NASA AVIRIS-Classic and AVIRIS-NG. Ultimately, development of flexible, on-demand, cloud-based processing of hyperspectral imagery, which does not currently exist, will reduce barriers to usage of complex imaging spectroscopy data and its products, such as NASA’s vast airborne AVIRIS archives at present, data from EMIT and HISUI in the near future, and eventually SBG.
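The sketch below is a schematic of the on-demand, user-selected processing chain; the step functions are placeholders standing in for atmospheric correction (e.g., an ISOFIT-based step), BRDF/topographic correction, and vetted trait algorithms, and do not use those packages’ actual APIs.

```python
# Schematic on-demand processing chain with placeholder steps (not GeoSPEC code).
import numpy as np

def atmospheric_correction(radiance):
    """Placeholder for an atmospheric-correction step: radiance -> toy 'reflectance'."""
    return radiance * 1e-4

def brdf_topo_correction(reflectance):
    """Placeholder for BRDF and topographic correction."""
    return np.clip(reflectance, 0.0, 1.0)

def foliar_trait_retrieval(reflectance):
    """Placeholder for a vetted trait algorithm; returns a toy per-pixel trait map."""
    return reflectance.mean(axis=-1)

# The user selects which steps run and in what order; off-ramps could export any
# intermediate product, and on-ramps could insert a user-supplied module instead.
pipeline = [atmospheric_correction, brdf_topo_correction, foliar_trait_retrieval]

scene = np.random.rand(100, 100, 224) * 1e4   # toy scene: rows x cols x 224 bands
product = scene
for step in pipeline:
    product = step(product)
print("L3 product shape:", product.shape)      # (100, 100) trait map
```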
Finally, there is a large potential user base for the L3+ outputs of SBG-like measurements, which for our use case includes ecologists and ecosystem scientists interested in using the high-level products to predict processes related to carbon uptake and nutrient processing, or to characterize biodiversity based on spatial variation in functional traits. These potential users represent a new constituency for imaging spectroscopy data products who previously would not have had access to such measurements across broad spatial and temporal extents. GeoSPEC is a necessary Analytic Center Framework development both for a scientific community wishing to use the high-level products of imaging spectroscopy (we have three letters of endorsement representing Map of Life, distribution modeling, and evolutionary biology) and for two mission communities (letters from SBG mission study leadership and NEON). GeoSPEC enters at TRL 3 and exits at TRL 5.
Jia Zhang, Carnegie Mellon University
Mining Chained Modules in Analytic Center Framework
NASA is building the Analytic Center Framework (ACF) as a collaboration platform for community users to harmonize existing tools, data, and computing environments. In the next 5-10-year timeframe, it can be anticipated that many data analytics tools and models will be published to the NASA ACF as reusable software modules.
However, such a large number of software modules will make it difficult for Earth scientists to choose components for building new data analytics experiments (workflows). How to help Earth scientists find suitable software modules in a sea of available candidates, and use them productively, will soon become a major challenge for the ACF and thus demands a systematic study.
Therefore, we propose this AIST project, preparing for the next 5-10-year timeframe, to build a workflow tool as a building block for the ACF that is capable of recommending multiple software modules, already chained together, based on their past usage history. We propose to study how software modules have collaborated, or could collaborate, to serve various workflows. Based on such investigations, we aim to develop a recommend-as-you-go technique to assist Earth scientists in designing data analytics workflows to address previously intractable scientific questions. The ultimate goal is to reduce time-to-experiment, share expertise, and avoid reinventing the wheel.
It should be noted that in recent years, many Earth scientists have adopted the Jupyter Notebook to conduct interactive data analytics. Thus, we will leverage the Jupyter Notebook as the environment to focus on the following three objectives:
(1) to develop algorithms to mine software module usage history and dependencies from Jupyter notebooks, and construct a knowledge network to store and retrieve data effectively (a minimal sketch of this mining step appears after this list);
(2) to develop algorithms to explore and extract reusable chains of software modules from the knowledge network; and
(3) to develop an intelligent service that provides personalized recommend-as-you-go support to help Earth scientists design workflows.
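A minimal sketch of the notebook-mining step in objective (1) follows, assuming a corpus of .ipynb files: it extracts each notebook’s imported modules with Python’s ast module and counts pairwise co-occurrences as candidate edges of the knowledge network. The notebook paths are placeholders.

```python
# Sketch: mine module usage from Jupyter notebooks and build co-occurrence edges.
import ast
import json
from collections import Counter
from itertools import combinations

def modules_used(notebook_path):
    """Return the set of top-level modules imported by one Jupyter notebook."""
    with open(notebook_path) as f:
        nb = json.load(f)
    modules = set()
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        try:
            tree = ast.parse("".join(cell.get("source", [])))
        except SyntaxError:            # skip cells containing magics or broken code
            continue
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                modules.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                modules.add(node.module.split(".")[0])
    return modules

co_occurrence = Counter()
for path in ["analysis1.ipynb", "analysis2.ipynb"]:   # placeholder notebook corpus
    mods = sorted(modules_used(path))
    co_occurrence.update(combinations(mods, 2))        # candidate knowledge-network edges

print(co_occurrence.most_common(10))                   # most frequently co-used modules
```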