2003 LTER Site Byte
LTER Site: Georgia Coastal Ecosystems
Contributor: Wade Sheldon (Sep 11, 2003)
Site Byte:
Overview
Information management efforts over the past year were largely directed towards providing rapid access to quality-controlled and documented data from our expanded monitoring program and from affiliated monitoring programs operated in partnership with USGS, the Sapelo Island National Estuarine Research Reserve, and UGA Marine Institute. Data downloaded semi-monthly from our network of moored hydrographic data loggers are now processed, documented, plotted and posted on the GCE private web site within 1 day of collection, providing immediate access to GCE participants.
In November 2002, we used supplemental funding provided by NSF for LTER ClimDB/HydroDB participants to equip our primary climate station on Sapelo Island (operated in partnership with SINERR and UGAMI) with a modem to permit remote data access. Communications software also purchased with these funds was then used in conjunction with the GCE Data Toolbox for MATLAB (described below) to develop a fully automated data harvesting, processing, and web-posting application. This automated harvesting system was also extended to include real-time data, obtained from the USGS web site, from the meteorological and hydrographic station at Hudson Creek/Meridian Landing and the river gaging station on the Altamaha River at Doctortown. As a result, GCE participants now have access to near-real-time meteorological and hydrographic data for several locations across the study domain; these data are automatically harvested, documented, quality-checked, plotted and posted to the web on a daily basis. In addition, past data from these stations for 2000-2002 and historic weather data from the National Weather Service station on Sapelo from 1957-2002 were acquired, documented, plotted and posted this year to provide GCE researchers with a long-term database of climate observations.
All of these monitoring data are automatically standardized to common units and date/time standards and provided in both ASCII and MATLAB® file formats to enhance the usability of the data and facilitate comparative and synthetic analyses. Data from the meteorological stations are provided to the broader scientific community through ClimDB on a continuing basis, and yearly data sets from hydrographic monitoring will be added to the GCE Data Catalog each year for public distribution. Work funded by the NERR System-wide Monitoring Program is now underway to wire the existing hydrographic sonde at Marsh Landing to the weather station data logger, which will automatically add water quality parameters to the data being harvested at our primary weather station.
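As an illustration of what standardizing to common units and date/time standards can involve, here is a minimal Python sketch. It is not the GCE Data Toolbox code (which is written in MATLAB); the unit labels, the conversion table, the input timestamp layout, and the function names `standardize` and `to_utc_iso` are all assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical conversion table: (source unit, target unit) -> converter.
TO_STANDARD = {
    ("degF", "degC"): lambda v: (v - 32.0) * 5.0 / 9.0,
    ("mph", "m/s"): lambda v: v * 0.44704,
    ("inHg", "hPa"): lambda v: v * 33.8639,
}


def standardize(value, unit, target_unit):
    """Convert a value to the target unit; pass it through if already there."""
    if unit == target_unit:
        return value
    return TO_STANDARD[(unit, target_unit)](value)


def to_utc_iso(local_text, utc_offset_hours):
    """Parse a local 'MM/DD/YYYY HH:MM' timestamp (assumed layout) and
    return it as an ISO 8601 string in UTC."""
    local = datetime.strptime(local_text, "%m/%d/%Y %H:%M")
    aware = local.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return aware.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

With every station reduced to the same units and a single time base, records from different loggers can be compared or merged without per-source special cases.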
Software Development
The MATLAB data analysis toolbox developed in years 1 and 2 of our project was significantly enhanced this year to support these automated data processing initiatives (http://gce-lter.marsci.uga.edu/lter/research/tools/data_toolbox.htm). For example, dynamic quality control flagging functions were extended to support multiple-parameter expressions so that more complex criteria can be evaluated (e.g., flagging dissolved oxygen concentrations based on calculations of oxygen saturation made using corresponding data in temperature and salinity columns). This allowed us to develop more sophisticated filters for automatically identifying and flagging invalid and questionable data values, and for propagating flags between mutually dependent parameters during automated processing. Support for manual QA/QC flagging by visual inspection of values in worksheets or plots was also added, allowing us to flag questionable values in highly dynamic or structurally complex data sets when algorithmic analyses proved to be impossible or impractical. During the past year we also began providing access to the toolbox and documentation on the public GCE web site to support end-user analysis of our data and as a service to the LTER and broader scientific community.
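The dissolved oxygen example can be sketched in Python as follows. This is an illustration of a multiple-parameter flagging rule, not the toolbox's MATLAB implementation: the column names, the flag codes (`"I"` invalid, `"Q"` questionable, `"M"` missing), and the percent-saturation thresholds are hypothetical, and the Weiss (1970) solubility fit is used only as one common choice, since the report does not say which formula the toolbox applies.

```python
import math


def oxygen_solubility_mg_per_l(temp_c, salinity):
    """Oxygen solubility at 1 atm from the Weiss (1970) fit (illustrative choice)."""
    t = (temp_c + 273.15) / 100.0  # absolute temperature divided by 100
    ln_c = (-173.4292 + 249.6339 / t + 143.3483 * math.log(t) - 21.8492 * t
            + salinity * (-0.033096 + 0.014259 * t - 0.0017000 * t * t))
    return math.exp(ln_c) * 1.4276  # convert ml/L to mg/L


def flag_dissolved_oxygen(rows, questionable_pct=120.0, invalid_pct=200.0):
    """Return one flag per row, using temperature and salinity to judge DO.

    Thresholds are hypothetical defaults chosen for this sketch.
    """
    flags = []
    for row in rows:
        do, temp, sal = row["do_mg_l"], row["temp_c"], row["salinity"]
        if do is None or temp is None or sal is None:
            flags.append("M")  # cannot evaluate the expression
            continue
        pct = 100.0 * do / oxygen_solubility_mg_per_l(temp, sal)
        if do < 0 or pct > invalid_pct:
            flags.append("I")
        elif pct > questionable_pct:
            flags.append("Q")
        else:
            flags.append("")
    return flags
```

Because the rule depends on three columns at once, a missing temperature or salinity value also marks the oxygen value, which is one simple way flags can follow dependencies between parameters.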
Data Catalog Additions
At the request of data users we began providing access to all GCE data sets as standard array-based MATLAB binary files this year, in addition to the various ASCII and structure-based MATLAB formats already offered. A MATLAB Web Server application was also developed for customization of data sets prior to download, allowing end-users to select various custom ASCII and MATLAB file formats and specify metadata detail level, handling of QA/QC-flagged values, and custom missing value characters; users can additionally request multiple statistical summaries of data sets grouped by various parameters. This application is currently only available to GCE members, but public access will be provided after more testing. Web-based data set subsetting, resampling, and visualization applications are also under development. These server-side programs allow non-MATLAB users to benefit from the metadata-driven data processing tools developed as part of the GCE IM program using a familiar HTML forms interface.
Web Site Additions
We developed a web-accessible bibliographic database for GCE publications that supports custom searches, display of abstracts and hyperlinks, and bibliography generation in several formats, including EndNote import (http://gce-lter.marsci.uga.edu/lter/asp/db/biblio_query.asp). The entire catalog of UGAMI publications (Collected Sapelo Reprints) was also uploaded to this database, allowing the scientific community to search online for publications from all research performed on Sapelo over the past 50 years. A URL-based API (driven by querystring parameters) was also developed to allow customized data retrieval in EndNote format for synchronization with the LTER all-site bibliography; XML files in eml-literature format will also be available shortly. We also enhanced the integrated taxonomic database developed in year 2 to support tabular species list generation (HTML table and CSV file formats), permitting use of the GCE species list in LTER cross-site comparisons and other database projects, and cross-referenced our species records to the USDA Integrated Taxonomic Information System (ITIS) database. Corresponding ITIS taxonomic serial numbers and hyperlinks are displayed on the web to support comprehensive taxonomic research (http://gce-lter.marsci.uga.edu/lter/asp/db/all_species_lists.asp). Details about the development of this web-accessible taxonomic database were presented at the annual LTER Information Managers meeting in Orlando, FL, and in an OBFS informatics training workshop in Sevilleta, NM.
USGS Data Harvesting Service
In cooperation with Don Henshaw and Suzanne Remillard of Andrews LTER, automated data harvesting and processing technology developed at GCE was linked with the HydroDB database harvesting programs to produce a fully automated system for uploading USGS streamflow data for any station across the country to HydroDB. This cooperative effort allows LTER and USFS sites that don't collect and manage streamflow data but are near a USGS gauging station to participate in HydroDB, expanding the scope of the database and its utility for hydrologists and climatologists seeking to perform cross-site comparisons. Finalized and provisional data are harvested weekly, and provisional data are automatically overwritten as new finalized data are released, ensuring that HydroDB contains the best available data and that USGS-assigned QC flags are represented. Seven LTER sites are now participating in this harvesting service (GCE, AND, KBS, LUQ, NTL, PIE, SBC), for a total of 31 streamflow stations harvested each week. This cooperative effort serves as a good example of how metadata-based software tools and data format standards aid synthesis by easing the extension of site-specific tools and technologies to broader-scale problems; an LTER Network News article on this collaboration is already planned.
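The overwrite rule for provisional data can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the actual GCE or HydroDB harvesting code: the record layout (a mapping from station and date to a value and status), the status labels, and the function name `merge_harvest` are hypothetical.

```python
def merge_harvest(stored, harvested):
    """Merge newly harvested records into the stored ones.

    Both arguments map (station_id, date) -> (value, status), where status
    is 'final' or 'provisional' (labels assumed for this sketch).
    """
    merged = dict(stored)
    for key, (value, status) in harvested.items():
        current = merged.get(key)
        if current is not None and current[1] == "final" and status == "provisional":
            continue  # never replace finalized data with provisional data
        merged[key] = (value, status)  # new record, update, or final over provisional
    return merged
```

Run weekly, a merge like this lets provisional values fill the most recent dates and then be replaced as finalized values for those dates are released.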
Site Review
We also underwent our first mid-term NSF review this past year. Preparation for this review brought very welcome focus on our web site and IM program by all our project participants. I provided several overview presentations and training sessions to PIs, staff and graduate students, which were very well received and provided excellent opportunities for feedback on all sides. On the downside, I was inundated with new data submissions (many requiring extensive editing and documentation) by some reluctant data providers. This severely taxed our limited IM resources and interrupted workflow for over a month, but raised awareness of the need for timely data submission among our lead PIs and Co-PIs.
The review itself went very well, and our IM program was particularly well received by all reviewers and highlighted throughout the report. Tight integration with the science program, development of automated data harvesting and distribution technology, and high levels of network participation and service were listed as key strengths.