Ecological Metadata Language

Background

Ecological Metadata Language (EML) is a metadata specification developed by and for the ecology discipline. It is based on prior work done by the Ecological Society of America and associated efforts (Michener et al., 1997, Ecological Applications). EML is implemented as a series of XML document types that can be used in a modular and extensible manner to document ecological data. Each EML module is designed to describe one logical part of the total metadata that should be included with any ecological dataset.

Complete information about the EML specification and its development and application is available on the Ecoinformatics.org web site.

EML in LTER

In July 2002, the LTER Information Managers Committee voted to pursue the adoption of EML 2 as a network-level metadata standard to promote integrated data cataloging and software-mediated data set discovery and integration across the LTER Network. The EML 2.0 specification was finalized in December 2002, and in Spring 2003 the LTER Coordinating Committee voted to accept the IM Committee recommendation and recognize EML 2.0 as the official metadata standard for the LTER Network and Network Information System.

EML Implementation at GCE

As of October 2003, EML metadata is dynamically generated for every data set in the GCE Data Catalog from information stored in the GCE Metadata Database. EML documents and corresponding data files are offered on the detail page for each data set as an alternative to formatted text metadata and standard text and MATLAB distribution files. Support was also added for accessing data via EML documents hosted in external metadata servers at the LTER Network Office and elsewhere to support proposed grid-based computing initiatives.

EML metadata documents, conforming to EML schema version 2.1.0, are available for each data set at two levels of detail:

  • Basic EML Metadata contains general data set information, including geographic, temporal and taxonomic coverage, and is optimized for data set discovery in general-purpose metadata catalogs
  • Complete EML Metadata contains all discovery information plus complete details about the processing, structure and distribution of the data table, optimized for automated access and analysis of GCE data sets
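As a rough illustration of how a client might use the discovery-level content, the sketch below reads the title and coverage from a small EML fragment using Python's standard library. The sample document, its packageId and the function name read_discovery_info are invented for this example. Only the element names follow the EML 2.1.0 schema, and real GCE documents carry far more detail.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed EML 2.1.0 document (not a real GCE data set).
# In EML only the root element is namespace-qualified; children are not.
SAMPLE_EML = """<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.1.0"
         packageId="example.1.1" system="example">
  <dataset>
    <title>Example marsh survey</title>
    <coverage>
      <geographicCoverage>
        <geographicDescription>Example study site</geographicDescription>
        <boundingCoordinates>
          <westBoundingCoordinate>-81.4</westBoundingCoordinate>
          <eastBoundingCoordinate>-81.2</eastBoundingCoordinate>
          <northBoundingCoordinate>31.5</northBoundingCoordinate>
          <southBoundingCoordinate>31.3</southBoundingCoordinate>
        </boundingCoordinates>
      </geographicCoverage>
      <temporalCoverage>
        <rangeOfDates>
          <beginDate><calendarDate>2001-01-01</calendarDate></beginDate>
          <endDate><calendarDate>2001-12-31</calendarDate></endDate>
        </rangeOfDates>
      </temporalCoverage>
    </coverage>
  </dataset>
</eml:eml>
"""


def read_discovery_info(eml_text):
    """Return the title, bounding box and date range from an EML string."""
    root = ET.fromstring(eml_text)
    dataset = root.find("dataset")
    bounds = dataset.find("coverage/geographicCoverage/boundingCoordinates")
    dates = dataset.find("coverage/temporalCoverage/rangeOfDates")
    return {
        "package_id": root.get("packageId"),
        "title": dataset.findtext("title"),
        "bounding_box": {
            side: float(bounds.findtext(side + "BoundingCoordinate"))
            for side in ("west", "east", "north", "south")
        },
        "begin_date": dates.findtext("beginDate/calendarDate"),
        "end_date": dates.findtext("endDate/calendarDate"),
    }


info = read_discovery_info(SAMPLE_EML)
print(info["title"], info["begin_date"], info["end_date"])
```

A general-purpose catalog would typically index only fields like these, which is why the basic level omits the data table details.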

Standard and user-customized species lists generated from the GCE Taxonomic Database are also optionally available in EML 2.1.0 format, with complete systematic information included as a taxonomicCoverage section within a species list data set. Coverage information is available in list form, with all taxonomic ranks listed for each species record, and in compact phylogenetic tree form with shared ranks within each taxonomic group removed.
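The difference between the list form and the compact tree form can be sketched in a few lines of Python. This is not the code used by the GCE Taxonomic Database; the function names and the two-species sample are assumptions made for illustration. The output uses the nested taxonomicClassification elements that EML defines inside taxonomicCoverage.

```python
import xml.etree.ElementTree as ET

# List form: every rank is repeated for each species record (illustrative sample).
SPECIES_RECORDS = [
    [("Kingdom", "Plantae"), ("Family", "Poaceae"),
     ("Genus", "Spartina"), ("Species", "Spartina alterniflora")],
    [("Kingdom", "Plantae"), ("Family", "Poaceae"),
     ("Genus", "Distichlis"), ("Species", "Distichlis spicata")],
]


def build_taxon_tree(records):
    """Merge per-species rank lists into one nested dict keyed by (rank, value)."""
    tree = {}
    for ranks in records:
        node = tree
        for rank_name, rank_value in ranks:
            # Shared ranks map to the same key, so they are stored only once.
            node = node.setdefault((rank_name, rank_value), {})
    return tree


def to_taxonomic_coverage(tree):
    """Serialize the nested dict as an EML-style taxonomicCoverage element."""
    coverage = ET.Element("taxonomicCoverage")

    def add_children(parent, subtree):
        for (rank_name, rank_value), children in subtree.items():
            element = ET.SubElement(parent, "taxonomicClassification")
            ET.SubElement(element, "taxonRankName").text = rank_name
            ET.SubElement(element, "taxonRankValue").text = rank_value
            add_children(element, children)

    add_children(coverage, tree)
    return coverage


taxon_tree = build_taxon_tree(SPECIES_RECORDS)
coverage = to_taxonomic_coverage(taxon_tree)
list_form_count = sum(len(ranks) for ranks in SPECIES_RECORDS)
tree_form_count = len(list(coverage.iter("taxonomicClassification")))
print(list_form_count, tree_form_count)
```

In this sample the list form holds eight rank entries while the tree form holds six, because the shared Kingdom and Family ranks appear only once.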

EML Support in the GCE Data Toolbox

In 2013 the GCE Data Toolbox for MATLAB software was extended to support importing EML metadata and EML-described data tables into the MATLAB environment for analysis and synthesis. In September 2014 this support was extended to include EML export as well, so complete EML-described data packages can now be generated natively using this software.

With EML package export support, the GCE Data Toolbox can now be used as a complete data synthesis environment for the LTER NIS and PASTA framework. EML-described data packages can be programmatically retrieved from the LTER Data Portal and processed using the toolbox, then products can be packaged and exported as new EML-described data sets, all without manual XML editing. The EML support functions are also modular, so EML attributeList trees, dataTable trees or complete documents can be generated in workflows for more advanced use cases (e.g., multi-entity synthetic data sets). The GCE Data Toolbox can therefore function as a stand-alone synthesis engine for the LTER Data Portal, as well as part of a larger software system for metadata-based processing and analysis.
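To make the idea of generating an attributeList tree on its own more concrete, here is a minimal Python sketch. It does not reproduce the GCE Data Toolbox functions, which are written in MATLAB; the function names build_attribute and build_attribute_list and the two sample columns are hypothetical. The element names follow EML 2.1.0, but the output is simplified and has not been validated against the schema.

```python
import xml.etree.ElementTree as ET


def build_attribute(name, definition, unit=None):
    """Build one simplified EML attribute element.

    unit is an EML standard unit name for numeric columns,
    or None for free-text columns.
    """
    attribute = ET.Element("attribute")
    ET.SubElement(attribute, "attributeName").text = name
    ET.SubElement(attribute, "attributeDefinition").text = definition
    scale = ET.SubElement(attribute, "measurementScale")
    if unit is not None:
        # Numeric column: ratio scale with a unit and a numeric domain.
        ratio = ET.SubElement(scale, "ratio")
        ET.SubElement(ET.SubElement(ratio, "unit"), "standardUnit").text = unit
        domain = ET.SubElement(ratio, "numericDomain")
        ET.SubElement(domain, "numberType").text = "real"
    else:
        # Text column: nominal scale with a free-text domain.
        nominal = ET.SubElement(scale, "nominal")
        non_numeric = ET.SubElement(nominal, "nonNumericDomain")
        text_domain = ET.SubElement(non_numeric, "textDomain")
        ET.SubElement(text_domain, "definition").text = definition
    return attribute


def build_attribute_list(columns):
    """Wrap a sequence of (name, definition, unit) tuples in an attributeList."""
    attribute_list = ET.Element("attributeList")
    for name, definition, unit in columns:
        attribute_list.append(build_attribute(name, definition, unit))
    return attribute_list


# Hypothetical column descriptions for a synthetic data table.
attribute_list = build_attribute_list([
    ("Site", "Name of the sampling site", None),
    ("Depth", "Water depth at the sampling site", "meter"),
])
print(ET.tostring(attribute_list, encoding="unicode"))
```

Because the attributeList is built independently of any document, the same tree could be attached to one dataTable or reused across several entities in a multi-entity package.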

This material is based upon work supported by the National Science Foundation under grants OCE-9982133, OCE-0620959 and OCE-1237140. Any opinions, findings, conclusions, or recommendations expressed in the material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.