Document Details

Title Using the GCE Data Toolbox as an EML-compatible workflow engine for PASTA
Archive All Files / Documents / Publications / Newsletter Articles
Abstract

The GCE Data Toolbox for MATLAB was initially developed in 2000 to process, quality control and document environmental data collected at the then-new Georgia Coastal Ecosystems LTER site (Sheldon, 2001). Development of this software framework has continued steadily since then, adding graphical user interface dialogs (Sheldon, 2002), data indexing and search (Sheldon, 2005), web-based data mining (Sheldon, 2006; Sheldon, 2011b), dynamic QA/QC (Sheldon, 2008), and a growing suite of tools for automating data harvesting and publishing (Sheldon et al., 2013; Gries et al., 2013). We began distributing a compiled version of the toolbox to the public in 2002, and in 2010 we released the complete source code under an open-source GPL license (Sheldon, 2011a). Today, the GCE Data Toolbox is used at multiple LTER sites and other research programs across the world for a wide variety of environmental data management tasks, and we are actively working to make it a more generalized tool for the scientific community (Chamblee et al., 2013).
The toolbox can be leveraged in many ways, but it has proven particularly useful for designing automated data processing, quality control and synthesis workflows (Sheldon et al., 2013; Cary and Chamblee, 2013; Gries et al., 2013). Key factors include broad data format support, a flexible metadata templating system, dynamic rule-based QA/QC, automated metadata generation and metadata-based semantic processing (Fig. 1). Consequently, the GCE Data Toolbox was one of the technologies chosen for a 2012 LTER NIS workshop convened to test the PASTA Framework for running analytical workflows (see http://im.lternet.edu/im_practices/data_management/nis_workflows). During the workshop, the lack of built-in support for EML metadata proved to be a significant barrier to fully utilizing the toolbox for PASTA workflows; however, complete EML support has since been implemented.
This article describes how the GCE Data Toolbox can now be used as a complete workflow engine for PASTA and other EML-compatible frameworks.

Contributor Wade M. Sheldon
Citation

Sheldon, W.M. Jr. 2014. Using the GCE Data Toolbox as an EML-compatible workflow engine for PASTA. In: LTER Databits - Information Management Newsletter of the Long Term Ecological Research Network: Fall 2014. LTER Network, Albuquerque, NM.

Key Words EML, LTER-IMC, MATLAB, metadata, NIS, PASTA, toolbox, workflows
File Date 2014
Web Link PDF file

This material is based upon work supported by the National Science Foundation under grants OCE-9982133, OCE-0620959, OCE-1237140 and OCE-1832178. Any opinions, findings, conclusions, or recommendations expressed in the material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.