Title of Presentation: Segmentation Data Analysis

Primary (Corresponding) Author: Jeffrey D. Scargle

Organization of Primary Author: NASA Ames Research Center

 

Abstract:  Many data sets in astrophysics, Earth science, and space science consist of measurements distributed over a data space of some fixed dimension.  Examples are:

            time series data -- 1D

            image data -- 2D

            redshift galaxy surveys -- 3D

            gamma-ray photon data -- 4D  (space, time, energy)

The measurements can refer to a physical quantity measured over a pre-defined interval (e.g., a pixel) or to discrete points distributed over the data space (e.g., X-ray or gamma-ray photon maps).  In both cases, science analysis can be based on a density estimation procedure, followed by, for example, the detection of clusters or sources in the density map.

I have developed an algorithm for the segmentation analysis of 1D data, based on finding the optimal segmentation of the interval for a given fitness function (e.g., the likelihood of a piecewise constant model of the data).  In addition, the optimal segmentation problem in a space of any dimension can be rigorously solved with an extension of the 1D algorithm.
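
To make the idea of an optimal 1D segmentation concrete, the following is a minimal sketch and not necessarily the algorithm referred to above. It assumes a dynamic-programming search over block boundaries, a fitness equal to the maximized Gaussian log-likelihood (unit variance, constants dropped) of a piecewise constant model, and a hypothetical per-block `penalty` that discourages over-segmentation. The function name `optimal_segmentation` and all parameter values are illustrative assumptions.

```python
import numpy as np

def optimal_segmentation(x, penalty=3.0):
    """Return block edges [0, ..., n] of the best piecewise constant partition.

    Assumptions (not from the abstract): unit-variance Gaussian noise,
    additive block fitness, and a fixed penalty per block.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Prefix sums let us evaluate any block's fitness in O(1).
    c1 = np.concatenate(([0.0], np.cumsum(x)))
    c2 = np.concatenate(([0.0], np.cumsum(x * x)))
    best = np.zeros(n + 1)             # best[k]: optimal total fitness of x[:k]
    last = np.zeros(n + 1, dtype=int)  # last[k]: start of final block for x[:k]
    for k in range(1, n + 1):
        starts = np.arange(k)          # candidate starts of the final block
        length = k - starts
        s1 = c1[k] - c1[starts]
        s2 = c2[k] - c2[starts]
        # Block fitness: -0.5 * (sum of squared residuals about the block mean).
        fit = -0.5 * (s2 - s1 * s1 / length) - penalty
        total = best[starts] + fit
        j = int(np.argmax(total))
        best[k] = total[j]
        last[k] = j
    # Backtrack from the end to recover the block edges.
    edges = [n]
    k = n
    while k > 0:
        k = int(last[k])
        edges.append(k)
    return edges[::-1]

# A step from level 0 to level 5 is split at the change point.
step = [0.0] * 10 + [5.0] * 10
print(optimal_segmentation(step))
```

Because the total fitness is a sum over blocks, the best partition of the first k points can be built from the best partitions of shorter prefixes, which gives an exact optimum in O(N^2) block evaluations rather than a search over all 2^(N-1) partitions. A different fitness function (for example, a Poisson likelihood for photon counts) could be substituted for the `fit` line under the same assumptions.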

Examples of such analysis include:

  • Light curves of gamma-ray bursts
  • Cluster and other structural analysis of the large-scale distribution of galaxies with the Sloan Digital Sky Survey
  • Point source identification and characterization in gamma-ray data (for GLAST, the Gamma-ray Large Area Space Telescope, to be launched in November 2007)
  • Anomaly detection for a homeland security problem (detection of anomalous events in domestic water distribution systems)