Title: Anomaly Detection and Analysis Framework for Terrestrial Observation and Prediction System (TOPS)
Author: Petr Votava
Organization: NASA Ames Research Center/CSUMB California State University - Monterey Bay
Co-Authors: Ramakrishna Nemani, Andrew Michaelis, Hirofumi Hashimoto
Terrestrial Observation and Prediction System (TOPS) is a flexible modeling software system that integrates ecosystem models with frequent satellite and surface weather observations to produce ecosystem nowcasts (assessments of current conditions) and forecasts useful in natural resources management, public health and disaster management. We have been extending the Terrestrial Observation and Prediction System (TOPS) to include capability for automated anomaly detection and analysis of both on-line (streaming) and off-line data. While there are large numbers of anomaly detection algorithms for multivariate datasets, we are extending this capability beyond the anomaly detection itself and towards an automated analysis that would discover the possible causes of the anomalies. There are often indirect connections between datasets that manifest themselves during occurrence of external events and rather than searching exhaustively throughout all the datasets, our goal is to capture this knowledge and provide it to the system during automated analysis. This results in more efficient processing. Since we donít need to process all the datasets using the original anomaly detection algorithms, which is often compute intensive; we achieve data reduction as we donít need to store all the datasets in order to search for possible connections but we can download selected data on-demand based on our analysis. For example, an anomaly observed in vegetation Net Primary Production (NPP) can relate to an anomaly in vegetation Leaf Area Index (LAI), which is a fairly direct connection, as LAI is one of the inputs for NPP, however the change in LAI could be caused by a fire event, which is not directly connected with NPP. Because we are able to capture this knowledge we can analyze fire datasets and if there is a match with the NPP anomaly, we can infer that a fire is a likely cause. The knowledge is captured using OWL ontology language, where connections are defined in a schema that is later extended by including specific instances of datasets and models. The information is stored using Sesame server and is accessible through both Java API and web services using SeRQL and SPARQL query languages. Inference is provided using OWLIM component integrated with Sesame.