LTER Site Bytes
- AND (Andrews LTER)
- ARC (Arctic LTER)
- BES (Baltimore Ecosystem Study)
- BNZ (Bonanza Creek LTER)
- CAP (Central Arizona - Phoenix)
- CCE (California Current Ecosystem)
- CDR (Cedar Creek LTER)
- CWT (Coweeta LTER)
- FCE (Florida Coastal Everglades)
- GCE (Georgia Coastal Ecosystems)
- HBR (Hubbard Brook LTER)
- HFR (Harvard Forest)
- JRN (Jornada Basin)
- KBS (Kellogg Biological Station)
- KNZ (Konza Prairie LTER)
- LUQ (Luquillo LTER)
- MCM (McMurdo Dry Valleys)
- MCR (Moorea Coral Reef)
- NTL (North Temperate Lakes)
- NWT (Niwot Ridge LTER)
- PAL (Palmer Station)
- PIE (Plum Island Ecosystem)
- SBC (Santa Barbara Coastal)
- SGS (Shortgrass Steppe)
- VCR (Virginia Coast Reserve)
- Other (Guest Participant)
LTER Site: Andrews LTER (AND)
Contributor: Suzanne Remillard, Don Henshaw, Theresa Valentine (Jul 26, 2007)
Site Byte:
1. Annual Site Byte
After spending 3 days backpacking in Rocky Mountain National Park, following last year’s meetings, the IM team was rejuvenated and ready to move ahead with the next year of IM activities. The main focus has been a planning process in conjunction with our LTER6 proposal submission (due Feb 2008). Planning activities have included significant interaction between the IM Team and site PIs with the objective of developing a strategy for keeping pace with the large IM workload. The strategy includes streamlining certain IM processes, better defining the responsibilities of PIs and IMs with respect to providing and posting data online, better defining the bases for assigning priorities to individual data sets, and actually setting major priorities for this proposal writing year. One of our streamlining strategies is the revamping of our QA/QC process for our climate data with the goal of being more efficient and reducing the hands-on need for data processing and graphical checking. A strategy for improving the data submission process includes improving guidelines for data submission and improving our web administrative interface to allow PIs to enter and edit their upper level metadata (abstract, coverage, methods, etc.). Additionally, the Andrews Team received a $25K planning grant and will conduct a fall workshop to consider planning for increased streaming sensor data and associated Q/C, larger bandwidth and WiFi or WiMax capability, and considering long-term personnel needs.
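As a rough illustration of the kind of automated climate-data Q/C described above, the sketch below flags missing values, out-of-range values, and large jumps between consecutive readings. It is not the Andrews procedure: the `Reading` record, the flag codes, and the threshold values are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Reading:
    timestamp: str                 # ISO 8601 text, kept as a string for simplicity
    air_temp_c: Optional[float]    # None marks a missing value


def flag_readings(readings: List[Reading],
                  low: float = -40.0,      # hypothetical lower limit (deg C)
                  high: float = 50.0,      # hypothetical upper limit (deg C)
                  max_step: float = 10.0   # hypothetical largest plausible jump
                  ) -> List[str]:
    """Return one flag per reading.

    'M' = missing, 'R' = outside [low, high],
    'S' = jump larger than max_step from the last in-range value,
    ''  = accepted.
    """
    flags = []
    previous = None  # last in-range value seen so far
    for reading in readings:
        value = reading.air_temp_c
        if value is None:
            flags.append("M")
            continue
        if value < low or value > high:
            flags.append("R")
            continue
        if previous is not None and abs(value - previous) > max_step:
            flags.append("S")
        else:
            flags.append("")
        previous = value
    return flags


sample = [
    Reading("2007-07-01T00:00", 10.0),
    Reading("2007-07-01T01:00", None),
    Reading("2007-07-01T02:00", 11.0),
    Reading("2007-07-01T03:00", 99.0),
    Reading("2007-07-01T04:00", 30.0),
    Reading("2007-07-01T05:00", 12.0),
]
sample_flags = flag_readings(sample)
```

Because the comparison value is updated even when a step is flagged, both sides of a spike are marked 'S'. That leaves the decision about which reading is wrong to a person, while still cutting down on routine graphical checking.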
______________________
2. Planned IM Projects
a. Improve the efficiency of data processing and Q/C for our climate, streamflow, air shed and other planned sensor networks
b. Develop a proposal for improving telemetry to accommodate new, streaming data sets, establishing WiFi or WiMax throughout the Andrews, and increasing our bandwidth for communications
c. Review all data and metadata and set priorities for updating, establishing online, improving the metadata content, and improving EML
d. Improve the data submission process through 1) better defining the responsibilities of the PI and IM, and 2) improving the web interface for managing and editing metadata to better accommodate site researchers
e. Create GIS web services using ESRI ArcGIS Server technology to provide remote web access to Andrews GIS data by researchers/staff at the Andrews site
LTER Site: Arctic LTER (ARC)
Contributor: Jim Laundre (Aug 01, 2007)
Site Byte:
1. Annual Site Byte
This past year Arctic LTER updated its web pages and continued work on implementing our new Excel metadata entry form (http://ecosystems.mbl.edu/arc/metadata_forms/MetadataBlank.xls). Since most ARC-LTER investigators use Excel this form allows the metadata and data to reside in the same workbook. VBA macros are then used to output EML, rich text, ASCII and html files for the web.
Our site review occurred this past June. I was not able to be at the field site during the review but reports back indicate that the site review team had a good time and that many useful discussions were held. The report will be out soon.
Several new large projects will provide opportunities for enhancing the data information available for the Toolik Region:
- International Polar Year (IPY): Collaborative Research on Carbon, Water, and Energy Balance of the Arctic Landscape at Flagship Observatories and in a PanArctic Network (AON). This project will provide landscape-level data and analyses of terrestrial carbon, water, and surface energy. An integrated terrestrial carbon/water/energy database for use in modeling and synthesis activities will be developed. Proposed is the development or extension of three databases: (1) the Toolik/Imnavait database (already a part of the Arctic LTER database), (2) a Cherskii database (to be developed and archived with the Arctic LTER) and (3) the PanArctic database (to be developed in partnership with our collaborators, archived with the Arctic LTER).
- Toolik Field Station is the tundra candidate for The National Ecological Observatory Network (NEON).
These projects will expand the environmental baseline datasets and provide exciting opportunities for science at Toolik. The task of integrating the databases will be both challenging and rewarding.
Unfortunately because of family commitments I will not be able to attend this year’s meeting. The IM team member who was planning on attending also had to cancel.
______________________
2. Planned IM Projects
1) Add new and update existing datasets.
2) EML implementation: Continue work on the Excel metadata worksheet entry form. More documentation is needed, and taxonomic coverage and protocols sections need to be added in the VBA macros that output EML files. Our legacy metadata has been converted to Ecological Metadata Language (EML) but only at EML Best Practices level 2/3 (no attribute EML). In bringing the files up to level 4, the files will be reviewed and where appropriate consolidated into multi-year files. Differences in methods and personnel will require that some years remain separate.
3) Train new ARC IM team members in metadata and data methods.
4) Integration of non-LTER data such as AON (see above in section 1).
5) Evaluate relational databases for use with some of our long-term datasets.
LTER Site: Baltimore Ecosystem Study (BES)
Contributor: Jonathan Walsh (Aug 01, 2007)
Site Byte:
1. Annual Site Byte
We underwent our mid-term review. Our report was for the most part very good. With regard to information management the report indicated we should be investing more but other than that the system was met with praise.
We are “adopting” our metadata system. The Open Research System (ORS), which we have used for years as our metadata repository and clearinghouse, has run into funding shortfalls. Up until now it has been hosted by Dr Charlie Schweik’s public policy GIS lab at the University of Massachusetts Amherst. In order to continue using it, we have installed the requisite hardware (a server) and software (ColdFusion, Server 2005, SQL server 2005) at the Institute of Ecosystem Studies and we are currently porting the existing code and data onto it.
We have continued to work with the synthesis of demographic and social data and physical data.
We now have close to seven years of stream chemistry data online and available to the public. These data represent several sites along the urban rural gradient of the city of Baltimore.
______________________
2. Planned IM Projects
We plan to facilitate a means to combine geodatabases. This will be accomplished by means of the ORS metadata system. The metadata generated by these geodatabases (in ArcCatalog) will be synthesized so the geodatabases can be polled in larger searches.
We plan to increase bandwidth of our sensing networks and move toward more wireless solutions.
We plan to collect data as part of educational outreach and involve high school students in the design and planning of the sampling and collection efforts.
We plan to improve our online databases used in collection (rather than dissemination) of data. We plan to increase the number of fields collected, adopt wireless connectivity to the data, and facilitate more streaming data collection.
Jonathan M Walsh, August, 2007
LTER Site: Bonanza Creek LTER (BNZ)
Contributor: Brian Riordan (Aug 01, 2007)
Site Byte:
1. Annual Site Byte
The BNZ IM group has been very busy the last year as we prepped for our site review that occurred in June. Some of our major accomplishments are:
- Level 4-5 EML for all of our data files (Thank You Inigo!!!)
- Versioning for our EML that adjusts as metadata changes
- 10 hourly streaming weather stations
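One way EML versioning that follows metadata changes could work is sketched below. It is an assumption, not a description of the BNZ system: it supposes package ids follow the common `scope.identifier.revision` pattern and that a digest of the last published metadata is stored alongside the id. The function names and the example id are hypothetical.

```python
import hashlib
from typing import Tuple


def metadata_digest(metadata_text: str) -> str:
    """Fingerprint the metadata so that any change can be detected."""
    return hashlib.sha256(metadata_text.encode("utf-8")).hexdigest()


def bump_if_changed(package_id: str,
                    stored_digest: str,
                    current_metadata: str) -> Tuple[str, str]:
    """Return (package_id, digest), with the revision raised by one
    only when the current metadata differs from the stored digest."""
    # Split from the right so that dots inside the scope are preserved.
    scope, identifier, revision = package_id.rsplit(".", 2)
    current = metadata_digest(current_metadata)
    if current == stored_digest:
        return package_id, stored_digest
    return f"{scope}.{identifier}.{int(revision) + 1}", current


# Hypothetical package id and metadata text, for illustration only.
pid = "knb-lter-bnz.12.3"
published = metadata_digest("title: Hourly weather")
unchanged = bump_if_changed(pid, published, "title: Hourly weather")
changed = bump_if_changed(pid, published, "title: Hourly weather, 2007 update")
```

Storing only a digest keeps the comparison cheap and avoids retaining every earlier copy of the metadata just to detect a change.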
We are in the process of implementing field data loggers to help streamline our core datasets into our database. This will help to improve our QA/QC as well as allow us to make ecological observations as they occur.
Our site review went well and we are feeling rejuvenated by many of the responses from the review committee. Thank you Kristin for taking the time to come up to AK.
We are in the process of a Data Manager change as Brian has left for the Minnesota Dept. of Natural Resources to pursue GIS/Remote Sensing. We hope to have a new IM hired by the end of August.
______________________
2. Planned IM Projects
a) Rebuild our Intranet to allow our PIs the ability to adjust their datasets as well as add new ones.
b) Rebuild the way we maintain site information (super sites, sub-sites, etc.). This could possibly entail a large GIS project.
c) Continue to develop web applications for data that is stored in the database for on the fly analysis
d) Explore historical data files and metadata to ensure that the information is correct and robust.
LTER Site: Central Arizona - Phoenix (CAP)
Contributor: Corinna Gries (Jul 31, 2007)
Site Byte:
Site Bytes 2007
Many important changes have taken place for CAP data management. We moved all our data out of SQL server and into MySQL, which is now our main database server. In addition we have moved all data that were stored on CAP servers to storage space at ASU’s High Performance Computing Cluster. The web applications hosted on our servers were moved to a virtual server in the same cluster. Currently, we are maintaining only our development server on site. The Global Institute of Sustainability has provided personal storage space as well as collaboratively accessible space for researchers and administrators. Therefore, this move affected the entire personnel at GIOS which implies major efforts in user support.
The CAP website saw many additions this year. Several password-protected areas were added for data entry and for the regular management activities for data provided to the LTER Network databases. The data entry applications are written in PHP using and improving the earlier developed template. Entry applications for all long term data collection efforts at CAP are in place. These applications replace the formerly used Access front ends and have already proved to be much easier to access and use by the data entry personnel. In the process the data models were streamlined, metadata tables added and attribute level metadata stored in comment fields directly in MySQL table descriptions. The earlier developed reverse engineering tool now produces valid level 5 EML automatically with minimal editing of the resulting document. An application to manage contributions to the LTER Network databases was added in an effort to document these activities in one central location.
Public data access was changed on the CAP website. Download applications for the large long term monitoring databases were developed and table level data download was enabled via EML files. EML files now allow direct data access which may be used by other applications as well as the CAP website. Currently, user registration and logging of downloads are being developed.
Ed Gilbert, who originally developed our biodiversity portal within the Southwest Environmental Information Network (SEINet http://seinet.asu.edu) as a grad student, has been working for GIOS this past year as web developer. In this capacity he developed a new online keying application within SEINet. The application provides major improvements over traditional matrix-based online keys like DELTA intkey. The underlying data model combines hierarchical taxonomic information obtained from ITIS with the traditional character and character state information. This allows for strict normalization of information and character inheritance within taxonomic groups which greatly reduces data entry and management efforts. In addition, this approach allows for easy integration of new species if the key is to be expanded geographically. It also provides the means to dynamically build keys based on a subset of species and makes the key very user friendly as characters may be added and dropped dynamically depending on where in the keying process a user is.
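The idea of keeping attribute-level metadata in MySQL column comments and deriving EML from it can be sketched roughly as follows. This is not the CAP reverse engineering tool. The column records stand in for what a query such as `SHOW FULL COLUMNS` would return, the table contents are invented, and the output is only a fragment: a schema-valid EML attribute also needs a `measurementScale` element, which is omitted here.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def attribute_list(columns: List[Dict[str, str]]) -> ET.Element:
    """Build an EML-style attributeList from column name, type and comment."""
    root = ET.Element("attributeList")
    for col in columns:
        attr = ET.SubElement(root, "attribute", id=col["field"])
        ET.SubElement(attr, "attributeName").text = col["field"]
        # The column comment doubles as the attribute definition.
        definition = col["comment"] or "(no comment recorded)"
        ET.SubElement(attr, "attributeDefinition").text = definition
        ET.SubElement(attr, "storageType").text = col["type"]
    return root


# Hypothetical column descriptions, as they might be read from the database.
columns = [
    {"field": "site_id", "type": "varchar(8)",
     "comment": "Code of the sampling site"},
    {"field": "air_temp", "type": "float",
     "comment": "Air temperature in degrees Celsius"},
    {"field": "notes", "type": "text", "comment": ""},
]

eml_fragment = attribute_list(columns)
xml_text = ET.tostring(eml_fragment, encoding="unicode")
```

Keeping the definitions in the table description means the metadata travels with the schema, so a regenerated EML file picks up any edited comment without a separate metadata store to keep in sync.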
Last year the Arizona Water Institute received funding from the Governor’s office for water related research and the development of an Arizona Hydrologic Information System. This is a tri University effort (University of Arizona, Northern Arizona University and Arizona State University) in which we are developing infrastructure for data storage, curation and access. A federated data storage system is accessed via web services which provide the data in standard format modeled according to CUAHSI’s standards. One full time programmer and two computer science grad students are now working at GIOS to develop a metadata editor and management system, web services accessing groundwater data from the Arizona Department of Environmental Quality and data in the SRP flood warning system, and a search engine for data sets.
The School of Sustainability is growing in leaps and bounds. Five new faculty positions have recently been added through an endowment to the School. Charles Redman has given up the directorship of GIOS to fully concentrate on running the school and ASU’s former Vice President of Research, Jonathan Fink, has taken over as director of GIOS.
______________________
2. Planned IM Projects
online EML editor
data set search engine
online specimen keying application
LTER Site: California Current Ecosystem (CCE)
Contributor: Karen Baker (Sep 13, 2007)
Site Byte:
1. Annual Site Byte
There were two major foci this year: growth of the information infrastructure and launch of a queriable site information system. In terms of physical infrastructure, a second server (iSurf) was deployed providing increased storage together with isolation of web and collaboration services and also providing an LDAP server for single sign-on authentication. In terms of organizational infrastructure, department arrangements for a computational recharge facility were supported as a growth strategy central to cross-project infrastructure and resource sharing. Our physical work area was moved from a long-time laboratory to a more compact Design Studio, centered on a refurbished design table and posting spaces that serve as memory traces. In terms of conceptual infrastructure, we continue to develop the tie between theory and practice through the framework of Ocean Informatics. In terms of social infrastructure, an informatics team approach coalesced to include KBaker, MKortz, JConners, LYarmey, JWanetick together with part-time associates and stronger ties with other local data initiative efforts (CalCOFI and SWFSC).
Throughout the year, Datazoo, an online information system, was designed and developed, combining a MySQL backend with an object-oriented, PHP-based web frontend. A data link and a site data policy were added to the CCE website data page and connected to the information system. Data practices that included the information system were discussed with site researchers and students during demonstrations of the system held throughout the summer. The system architecture addresses critical data integration issues so that multi-dataset browsing, comparison, and joins are possible. Though the core functions had all been addressed by June, subsequent recoding improved the interface and created new functionality. We expect this process to continue according to the iterative design principle that “Data-using informs Information-Managing informs Data-using”.
Population of the database began over the summer, during which the data ingestion procedure and template were streamlined. Datazoo is now a cross-site information system using an augmented EML specification where metadata describes datasets to the column level through use of attribute, unit, and qualifier dictionaries. Work on attribute naming conventions and re-naming conventions was carried out. Capacity to publish dataset metadata to the community catalogue was demonstrated by submitting our first dataset to the Metacat server this summer.
Tools were added as a feature of the information system. A grid converter, which calculates line and station given latitude and longitude (and vice versa) and uses Google maps for visual display, was given a second interface for batch mode input. Code was refactored to separate input from display functionality. A time formatting tool permits construction of time in a format required for dataset ingestion. Finally, a joining tool permits matching a dataset file against an existing eventlog dataset so that designated columns can be added to the original dataset file.
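The joining tool described above can be pictured as a keyed lookup against the eventlog. The sketch below is a guess at the general technique, not the Datazoo code. The function name, the `event_id` key, and the sample rows are hypothetical.

```python
import csv
import io
from typing import Dict, List


def join_eventlog(dataset_rows: List[Dict[str, str]],
                  eventlog_rows: List[Dict[str, str]],
                  key: str,
                  columns: List[str]) -> List[Dict[str, str]]:
    """Append the designated eventlog columns to each dataset row.

    Rows with no matching event keep their data and get empty strings.
    """
    lookup = {row[key]: row for row in eventlog_rows}
    joined = []
    for row in dataset_rows:
        event = lookup.get(row[key], {})
        extended = dict(row)  # copy, so the input rows are left untouched
        for col in columns:
            extended[col] = event.get(col, "")
        joined.append(extended)
    return joined


# Hypothetical dataset file and eventlog, inlined as CSV text.
dataset_csv = "event_id,chl\nE1,0.4\nE9,0.7\n"
eventlog_csv = "event_id,line,station,activity\nE1,90.0,80.0,CTD\nE2,90.0,90.0,net tow\n"

dataset = list(csv.DictReader(io.StringIO(dataset_csv)))
eventlog = list(csv.DictReader(io.StringIO(eventlog_csv)))
result = join_eventlog(dataset, eventlog, "event_id", ["line", "station"])
```

Unmatched rows are kept rather than dropped, so the output file always has the same number of rows as the original dataset file.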
Support was provided for the second CCE process cruise in May 2007. The eventlogger was installed onboard, providing centralized activity coordination. In addition, procedures were developed for Picture-of-the-Day in support of Education/Outreach. The development of cruise web sites continued with the cruise glossary and eventlog among the first files posted. A cruise map was generated this year using the previously developed dynamic mapping tool, and the participant list was ingested using the existing standard template.
Three posters were presented at the annual IMC: Data Integration in the Decade of Synthesis, (MKortz, LYarmey, JConners, and KBaker); Environmental Data Management: Infrastructure Studies Insights (FMillerand and KBaker); and Long Term Informatics (KBaker, CChandler, AGold, FMillerand, and JWanetick). More than a dozen Databits contributions over the year included news articles, reports, and good reads as well as an editorial and an FAQ. Work with the NSF Comparative Interoperability Project continued bringing forward the concepts of ‘enactment’, ‘articulation work’ and of ‘knowledge provinces’. A joint keynote with FMillerand at the IMC07 presented to the community the notions of roles, informatics, and sociotechnical systems. A history associate worked at SIO early summer focusing on historical data models.
______________________
2. Planned IM Projects
Planned projects include continuing to migrate data into the new Datazoo system and populating our augmented EML metadata schema by working through the metadata web forms with participants. We recognize this as an opportunity to elicit tacit data collection information and to facilitate category building and vocabulary development with local participants. Having added a user-friendly view of the metadata this year in addition to the xml view, we will be considering methods for download of the metadata with the data. With Datazoo, we will reconsider both project-study and cross-project relations as well as dataset creation and ingestion. Further, there will be a more extensive online help system developed in coordination with user input. Finally, as an outgrowth of working groups on Matlab and on plotting reviews held in the last year, we are planning to develop visualization in general and a plot gallery that builds from the redesign of existing media galleries in particular.
Ongoing partnerships within LTER and with Ocean Informatics Woods Hole and MIT will continue as will collaboration with Science Studies, NOAA, and international partners. Within LTER, we will continue to contribute to efforts to articulate and/or demonstrate strategies that enable site contributions to network efforts. Specifically we are working locally on units, dictionaries, quality control, semantic bridges, and learning environments so are engaged with the LTER Information Management Unit Registry Task Force and Dictionary Process, Quality Control, and Training Working Groups.
LTER Site: Cedar Creek LTER (CDR)
Contributor: Dan Bahauddin (Jul 27, 2007)
Site Byte:
1. Annual Site Byte
IM activities at Cedar Creek over the last year have focused on standardizing datasets and migrating them to an online database. Metadata has been systematically examined in an effort to bring our content to EML level 4.
Concurrent with these activities, we are completely redesigning our website to better Cedar Creek’s outreach efforts while making information more accessible to researchers. We have also upgraded many proprietary software packages Cedar Creek had come to rely on to more accessible, efficient systems.
______________________
2. Planned IM Projects
1. Complete web redesign (September 2007)
2. Create procedures for EML file generation from database.
3. Collect and convert legacy GIS data and make GIS database web accessible.
4. Improve and expand metadata.
5. Make database schema EML compliant.
LTER Site: Coweeta LTER (CWT)
Contributor: Barrie Collins (Jul 19, 2007)
Site Byte:
1. Annual Site Byte
I think that the comments below serve as a pretty nice site byte...Thanks, Barrie.
______________________
2. Planned IM Projects
2007 has largely been about preparation for Coweeta LTER's 2008 review. Thus, our focus has been oriented towards filling gaps in our data, publications, and metadata.
a. Improvement of Data metadata. (Thanks to John Porter for helping get PIs motivated).
b. Working with the redoubtable Inigo San Gil to improve metadata in EML format to a higher plane.
c. Classification of 2006 imagery from Ga to Virginia to fit with our current coverage on five year increments back to 1986 or so.
d. Support of EcoTrends Demographics. http://coweeta.ecology.uga.edu/trends/catalog_trends_base2.php
Point a especially is grueling work, as it's one thing to write: Improve metadata. It's another thing to be the combination of Sherlock Holmes and wolverine required to get the job done...but the payoff has been a more professional site and I say this with full sincerity: I'm glad we're biting the bullet and doing it right.
LTER Site: Florida Coastal Everglades (FCE)
Contributor: Linda Powell (Jul 27, 2007)
Site Byte:
1. Annual Site Byte
As the Florida Coastal Everglades LTER program heads into its second round of funding (FCE II), the IMS team (Mike Rugge and Linda Powell) is in the process of completely redesigning the FCE website to reflect its new research phase and expects to launch the new website sometime this fall. The new website was designed to be more graphical and user-friendly and will hopefully capture the interest of the ‘general public’ in addition to facilitating LTER research. FCE program information is now organized with a series of tabs that give a ‘file folder’ look to the web page and once a tab is selected, the user will find several related subcategories. Many new features have been added to enhance a user’s visit to our website such as ‘What we do’, ‘About the Everglades’, and ‘Featured Movies’. If you’re interested in taking a look at the new website, please see my FCE website demo at the upcoming IM meeting in San Jose.
Although the IMS team has made many changes to the FCE website, two important sections have been preserved and enhanced: 1) FCE Projects and 2) FCE Sampling. Following the new organizational template, the ‘Projects’ section organizes specific projects and their related information using a series of tabs that give a ‘file folder’ look on the web page (http://fcelter.fiu.edu/research/projects/projects.htm?pid=14). For any given project, a user can easily find the abstract, research sites, personnel, sampling attributes, datasets and publications related to that specific project all on one web page. Additionally, projects can be searched by keyword, researcher or funding organization.
The ‘Sampling’ application (http://fcelter.fiu.edu/research/projects/sampling.htm) is extremely useful in information and project management. Users are able to search for sampling attributes by entering keywords or manually selecting an attribute of interest from a list. Once the selection has been made, a map with a series of tabs gives the web user access to research site information, dataset listings and project information related to their sampling attribute selection. This feature greatly facilitates project management and site science as users are able to easily access related information from one portal. For example, researchers interested in porewater salinity can enter the sampling attribute in the keyword field and the results will return a map of the FCE research sites where porewater salinity is collected as well as tabular links to all datasets containing porewater salinity and projects responsible for the collection of that particular attribute.
This coming year, the IMS team will focus on building a web-based query interface for FCE physical and chemical data values. In conjunction with the query capabilities, a graphing application will be added so that FCE researchers can graph their query results in real-time. Allowing users to query, download, and graph data for specific sampling attributes across all FCE data files will be a welcomed addition to the FCE website.
______________________
2. Planned IM Projects
A. Finish FCE website redesign and launch by late Fall, 2007
B. Create synthetic physical and chemical data tables (and appropriate related tables) in the FCE Oracle DB.
C. Build a web-based query interface for FCE physical and chemical data values and add a web-based graphing application.
D. Create FCE ArcGIS data.
LTER Site: Georgia Coastal Ecosystems (GCE)
Contributor: Wade Sheldon (Jul 25, 2007)
Site Byte:
1. Annual Site Byte
Our major focus this past year has been transitioning from GCE-I to GCE-II, which began in January 2007. Our GCE-II study plan emphasizes marsh processes and landscape-scale ecology, so we added a major GIS component to our project to support this research. We began by hiring a full time assistant IM / spatial data manager (Kris Meehan) to develop GIS resources for the project and assist investigators and students with GIS analyses, as well as provide backup and support to our lead IM (Wade Sheldon). We also greatly expanded our IT infrastructure to accommodate large volumes of geospatial data. For example, we added a high speed GIS workstation, 2 TB spatial data server (running ArcGIS SDE 9.2 with MS SQL 2000) and LTO-3 tape backup system, and we have just ordered an updated web server with 1 TB RAID to support hosting of GIS data and imagery on the web. We also maintain a GIS workstation and file server at our field site, as well as 50-seat ArcGIS license servers both at Sapelo and UGA Marine Sciences.
We also acquired a high precision field GPS unit (Trimble GeoXH, with subfoot accuracy) and post-processing software this spring, and are finally starting to acquire good geo-location information for our land-based sampling sites and plots. Hydrographic sites are already geo-referenced using high-precision GPS on research vessels, so now all our research locations can be properly registered in the GIS. Kris has made major progress in acquiring maps, imagery, and GPS data relevant to our study area and organizing them into documented geodatabases; unfortunately, she will be leaving the project this fall for a job in Iceland so we are about to begin the hiring and training process all over again.
These major additions to our IM program (as well as moving into a new office at UGA) have dominated work this year, but we did make progress in a few other areas as well. Don Henshaw, Ken Ramsey and I received post-ASM funding to conduct a workshop on quality control for derived data products, which was held at JRN in conjunction with a Trends editorial meeting in February 2007. Several other LTER IMs attended, as well as representatives from LNO, CUAHSI, SEEK and the Canopy Databank Project. Follow-up work is planned during this IM meeting in San Jose, and I will also be demonstrating new Q/C capabilities in the GCE Data Toolbox software inspired by discussion in the workshop. I also collaborated with Barrie Collins to provide access to near-real-time USGS streamflow data on the CWT web site (http://coweeta.ecology.uga.edu/ecology/hydrologic_data/hydrologic_data.html), as well as automatic data harvesting for HydroDB. Similar capability could also be added for other sites very quickly, with results "skinned" using html templates to provide appropriate look and feel, so let me know if anyone is interested in a similar collaboration.
The other major project we undertook this spring was to redesign the GCE web site. We needed a more flexible menu system as well as better support for dynamic content and user contributions. We also wanted to simplify maintenance, update the code for better W3C standards compliance, and incorporate more of the LTER web design group recommendations. We finished developing a prototype framework early this summer, and we are currently gathering feedback and moving our existing content to the new site. We will launch the new site in conjunction with deployment of our new web server this fall. Planned additions to this site are listed below.
______________________
2. Planned IM Projects
a. finish migration of legacy content to new web site (designed in summer 2007)
b. develop web forms to allow GCE participants to directly upload bibliographic citations, reprints, project announcements and documents to the database
c. implement a "current conditions" page on the GCE web site for integrated display of tide predictions and near-real-time weather and hydrographic plots (with access to documented data sets)
d. expand GCE data catalog coverage to include ancillary data holdings (to provide EML support and integrated searching)
e. add support for multiple data set selection and zip archive downloads to data catalog (i.e. shopping cart metaphor)
f. add search and browse interfaces to data catalog
g. extend GCE data catalog to include new geospatial data sets (with EML metadata)
h. implement map query interface (Google maps) to data catalog
LTER Site: Hubbard Brook LTER (HBR)
Contributor: John Campbell (Aug 01, 2007)
Site Byte:
1. Annual Site Byte
In March, 2007 we released a new redesigned Hubbard Brook webpage (http://www.hubbardbrook.org/). The new webpage is a vast improvement over the old one and includes features such as better search capabilities, image archive with an upload feature, curricula vitae that can be edited by investigators, password protected intranet site, near real-time data graphing, and more.
Last summer we received an NSF supplement to install a wireless sensor network at Hubbard Brook. A repeater station was erected which made the entire Hubbard Brook Valley accessible with 900 MHz radios. We are currently collecting real-time data from 5 locations and are planning to expand the network over the coming year. An example of the online graphical interface is posted at http://www.hubbardbrook.org/data/Realtime_Data/index.php.
We recently received another NSF supplement this year to improve the physical sample archive database. In the coming months we will post the database online and make a number of other enhancements.
In June, we had an NSF midterm site review. This was a positive experience and the reviewers offered a number of constructive ideas for improving information management at the site that we have begun to implement.
______________________
2. Planned IM Projects
Develop QA/QC for near realtime data
Develop XML database for EML
Put database for physical sample archive online
Improve method for tracking research projects
LTER Site: Harvard Forest (HFR)
Contributor: Emery R. Boose (Aug 01, 2007)
Site Byte:
1. Annual Site Byte
Activities over the past year included:
1. Design and initial field testing of a wireless network to reach experimental sites in the forest.
2. Installation of optical fiber and wiring for a new building.
3. Upgrades and reconfiguration of Harvard Forest servers.
4. Creation of attribute-level EML for most datasets using Morpho.
5. Continued collaboration with computer scientists at UMass Amherst on analytic web project.
______________________
2. Planned IM Projects
1. Installation of 100 Mbps optical fiber connection to university through Verizon (to replace T1 line).
2. Completion of wireless network design and submission of proposals for funding.
3. Completion of level 5 EML for all online datasets.
4. Implementation of new policy requiring researchers to update data & metadata annually before granting permission to conduct or continue field work.
5. Continued collaboration with computer scientists at UMass Amherst on analytic web project.
LTER Site: Jornada Basin (JRN)
Contributor: Ken Ramsey (Aug 01, 2007)
Site Byte:
1. Annual Site Byte
Information Management
Ken Ramsey and Barbara Nolen have been planning on how to effectively integrate GIS data layers and research site locations with research data and associated metadata to enhance the quality and availability of all JRN data and to generate more detailed and precise EML documentation.
Barbara continues to provide GPS of research site locations and production of new GIS layers, to provide GIS and GPS support to researchers and students including training and map production, and to administer NMSU site licenses for GIS/RS software.
Ken is currently populating the JIMS database with research project and associated dataset metadata to support EML generation for all Jornada Basin LTER (JRN) datasets as well as the new dynamic data catalog and data cart that will enforce the JRN data policies for download notification using a user registration system that is in beta testing. This process includes validating data structure consistency between data and metadata.
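As an illustration of the kind of data/metadata consistency check mentioned above, the following minimal Python sketch compares the column names in a delimited data file with the attribute names recorded in its metadata. The function name, the sample columns, and the comparison rule are assumptions made for illustration; they are not taken from JIMS.

```python
import csv
import io


def compare_columns(data_text, metadata_attributes):
    """Compare a delimited file's header row with the metadata attribute list.

    Returns (missing, unexpected): attributes documented in the metadata but
    absent from the data, and data columns not documented in the metadata.
    """
    reader = csv.reader(io.StringIO(data_text))
    header = [name.strip() for name in next(reader)]
    missing = [a for a in metadata_attributes if a not in header]
    unexpected = [c for c in header if c not in metadata_attributes]
    return missing, unexpected


# Hypothetical data file and metadata attribute list.
sample = "site,date,ppt_mm\nA1,2007-06-01,3.2\n"
missing, unexpected = compare_columns(sample, ["site", "date", "precip_mm"])
print("missing from data:", missing)
print("not in metadata:", unexpected)
```

A check like this only catches naming mismatches; type and unit checks would need the attribute-level metadata as well.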
Database Server
The USDA ARS Jornada Experimental Range (JER) purchased a new database server and the JIMS’ databases and Trends administrative databases have been migrated to the new server, which required upgrading from Microsoft SQL Server 2000 to 2005. During this process we discovered that Xanthoria (CAP LTER product) would not work on SQL Server 2005. This required that we develop a customized EML creation solution from metadata stored with JIMS. New style sheets have been developed that generate EML level 3 from the JIMS database.
Integrating GIS Data, Metadata, and Services
The GIS metadata in xml will be used to generate EML documentation for all JRN GIS layers stored within JIMS, which includes geographic bounding coordinates within EML documents describing research datasets. JRN purchased a new server to support ESRI ArcGIS Server and ArcGIS Image Server software to allow JRN researchers to query and access the JRN data, GIS layers, and imagery. Ken and Barbara attended the Annual ESRI International User Conference in San Diego, CA to gather technical information necessary to develop and serve the GIS services on the new GIS Server. The new GIS Server will support the following GIS Services:
• query and access to aerial photograph archive
• access research site locations (Intranet)
• online shapefile production to support research site location selection and approval
• Jornada Interactive Map (2D) on website
• Jornada Interactive Map (3D) using ArcGIS Explorer
Deployment of ESRI ArcGIS Explorer is being evaluated to support interaction with the globe services and research site locations and their associated data and metadata as well as GIS thematic layers such as vegetation, soils, and base maps.
Jornada Website
The new web server (purchased by JRN) has been installed and configured to support the new Jornada website currently in beta testing. The new website will include a new XML-based data catalog and data cart that will enforce the JRN data access policy by requiring user registration and authentication prior to download of JRN data. Users that download data will be required to supply an intended use statement upon download. Users will provide contact information, affiliation, and acknowledgement of the JRN data policies when they register with JIMS. When the beta testing of the new website is completed, the new server will be renamed and replace the web server that currently serves the Jornada website.
______________________
2. Planned IM Projects
1. Data:
   a. Production of Chihuahuan Desert Landforms map
   b. Update Desert Project map data
   c. Review and update metadata for legacy research datasets
   d. Upload legacy dataset files into databases (JIMS and ArcSDE) and correct any problems discovered while moving from ASCII text files to RDBMS
2. EML:
   a. Generate EML level 3 for all Jornada datasets including GIS layers (soils, vegetation, etc.)
   b. Enhance style sheets used to create EML to generate EML level 5 (attribute information).
   c. Implement and enhance the esri2eml style sheet to create EML for GIS layers and incorporating research site location bounding coordinates within research dataset EML documentation
3. JIMS:
   a. GIS Services:
      i. Incorporate vector drawing functionality to website to give prospective researchers at the Jornada the ability to draw proposed research site locations on the Jornada base map to aid in selection and evaluation of impact of proposed research on nearby ongoing/historic research efforts for approval/disapproval.
      ii. Spatial and tabular query interface to aerial photograph archive
      iii. Information management notification system
   b. Develop research notification administrative and user interfaces
   c. Develop automated dataset tracking system
   d. Implement QA rules for dataset attributes within metadata and enforce when uploading research data to JIMS database
   e. Consistently document and implement QA rules and flags within JRN research dataset documentation and JIMS databases and interfaces
4. Systems and networking:
   a. Upgrade to ArcSDE 9.2 on Database Server
   b. Upgrade to ArcIMS 9.2 on the Map Server.
   c. Deploy development server as new web server to add new data catalog and cart functionality
   d. Extend wireless coverage in the field
   e. Increase storage capacity for imagery using SAN/NAS solution
LTER Site: Kellogg Biological Station (KBS)
Contributor: Sven Bohm (Aug 02, 2007)
Site Byte:
1. Annual Site Byte
This year we had our mid-term review and redesigned our website, which was an opportunity to redesign the dynamic part of the site. I implemented the data catalog, directory, and publication parts of the site using the Ruby on Rails web framework, which provided me with a testing framework so that it will be easier to evolve both the database(s) and the website.
______________________
2. Planned IM Projects
Write data upload applets to help researchers do a first pass quality control of their data.
LTER Site: Konza Prairie LTER (KNZ)
Contributor: Jincheng Gao (Aug 01, 2007)
Site Byte:
1. Annual Site Byte
The main focus of Konza IM activities this past year has been the QA/QC checking of our datasets and metadata, updating of GIS coverages and spatial metadata, and web site design. We generated window-based interfaces to convert text file data into a format for uploading into our SQL Server database and to perform initial QA/QC checking based on data ranges within specified minimum and maximum values. We also updated the historical Konza Prairie climate data and reharvested it into CLIMDB to effectively recover missing data. We also performed a quality check of our metadata database in SQL Server, which is dynamically connected with our web site. The geospatial datasets were updated and checked for accuracy with a GPS unit. Spatial metadata were updated and converted into EML. The Konza geospatial data were stored in shapefiles and arcinfo files, as well as in a spatial feature format which is managed with ArcSDE and SQL Server. A spatial database of Konza LTER research activities (locations of plots, sampling transects, experimental infrastructure) was generated and added to interactive web sites managed with ArcIMS v9.2. The Konza LTER web site was redesigned, and more functions which are dynamically connected to the metadata database were added.
In the past year, we also updated our IM hardware systems to support better data service. We expanded the capabilities of our database server and added one more SQL Server instance to accommodate the large volume of GIS data and images associated with the development of several new databases. We also replaced our old web server, which served us for more than 9 years, with a new, faster server with more storage space. The new server has a 3.0 GHz Intel CPU and 2.0 GB of RAM. Our new server will support our expanding datasets and the increased demands of our new GIS server.
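A range check of the kind described above can be sketched in a few lines of Python. This is a minimal illustration, not the Konza interface itself: the function name, variable names, and limits are hypothetical.

```python
def range_check(records, limits):
    """Flag values outside [minimum, maximum] for each checked variable.

    records: list of dicts mapping variable name to a numeric value or None.
    limits: dict mapping variable name to (minimum, maximum).
    Returns a list of (row_index, variable, value) tuples that fail the check.
    """
    failures = []
    for index, record in enumerate(records):
        for variable, (minimum, maximum) in limits.items():
            value = record.get(variable)
            if value is None:
                continue  # missing values are left for a separate check
            if value < minimum or value > maximum:
                failures.append((index, variable, value))
    return failures


# Hypothetical limits and observations, for illustration only.
limits = {"air_temp_c": (-40.0, 50.0), "rel_humidity_pct": (0.0, 100.0)}
records = [
    {"air_temp_c": 21.5, "rel_humidity_pct": 64.0},
    {"air_temp_c": 99.9, "rel_humidity_pct": 55.0},
    {"air_temp_c": None, "rel_humidity_pct": 101.0},
]
flagged = range_check(records, limits)
for row in flagged:
    print(row)
```

Returning the failing rows rather than dropping them keeps the decision about flagging or correcting each value with the data manager.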
______________________
2. Planned IM Projects
a. Continue with QA/QC checking of our LTER datasets and metadata, and add spatial attributes and methodology in EML metadata based on EML Best Practices.
b. Improve our dynamic web sites with integration of our metadata database in order to make our web site more flexible and more convenient for users.
c. Create GIS services to manage Konza geospatial data and the geospatial data of several related projects, and harvest spatial EML in Metacat.
d. Work with Konza LTER PIs as we prepare for our next LTER renewal proposal.
LTER Site: Luquillo LTER (LUQ)
Contributor: Eda C. Melendez-Colom (Aug 01, 2007)
Site Byte:
1. Annual Site Byte
Over the last 12 months, LUQ Information Management (IM) was dedicated to three types of projects and tasks: further EML development of LUQ metadata, improving the services given to the LUQ scientists, and outreach-type projects.
• Once EML level 3-4 was achieved for all the LUQ databases, we started on the path to the attribute level (5). We received a second visit from Iñigo, whose assistance has been crucial to this task. We realized that further QA/QC was needed on the attribute-related metadata items so that the EML metadata would be suitable for web services processes. We decided that, although the process to achieve level 5 will take longer, we will assure the quality of these metadata before producing new level 5 EML packages. Nineteen percent of the metadata files have gone through the necessary QA/QC to be EML level-5 ready, and the transitory files to produce their EML packages have been delivered and EML-proved by Iñigo’s Perl scripts for LUQ’s metadata.
This year, four new databases were published on our web site (http://luq.lternet.edu/data), and the metadata of four additional databases, whose data dissemination is still restricted, were also published on the web site. All of the new, completely published databases have gone through the QA/QC processes.
• Special attention was given to assisting the LUQ investigators in their synthesis-related tasks. A private Plone site was installed and sections were made accessible to different science project communities within the site; the Canopy Trimming Experiment, the Synthesis Book, and the Schoolyard project groups were given special group access and rights to share documents and information and to make collaboration easier. A section was established to give all the LUQ PIs access to long-term databases and publications. The interesting thing about these last two sites was that they were an idea created and promoted by two of the LUQ PIs.
• We further developed a Perl script that handles online input forms. The script was modified to send the data to an ASCII file that can be readily imported into a database, to display the just-entered information on the user’s screen on the fly, and to add the URL of the updated ASCII file to the email that the original script sends to the designated recipient. With a few modifications, the script can process any kind of input in the same way.
The first version of the script handles 4 different forms designed to get group, personal, and project information from groups visiting the El Verde Field Station (http://ites.upr.edu/EVFS/reservations.htm ). The second modification of the script was designed to get LUQ LTER graduate students’ personal and thesis project information (http://luq.lternet.edu/people/StudentdPers/PersonalDataEntryForm.html ).
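The form-handling approach described above (writing each submission to an ASCII file that a database can import) can be sketched as follows. The LUQ script is written in Perl; this is a Python illustration of the same idea, and the function name and field names are hypothetical.

```python
import csv
import io


def append_submission(handle, fields, submission):
    """Write one form submission as a tab-delimited row in a fixed field order.

    Runs of whitespace inside values (including tabs and line breaks) are
    collapsed to single spaces so that each submission stays on one line.
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    row = []
    for name in fields:
        value = str(submission.get(name, ""))
        row.append(" ".join(value.split()))
    writer.writerow(row)
    return row


# Hypothetical form fields; an in-memory buffer stands in for the ASCII file.
fields = ["group_name", "contact_email", "arrival_date"]
buffer = io.StringIO()
row = append_submission(
    buffer,
    fields,
    {
        "group_name": "Field\tCourse",
        "contact_email": "visitor@example.org",
        "arrival_date": "2007-08-01",
    },
)
print(buffer.getvalue(), end="")
```

Keeping the field order in one list means the same function can serve several forms by passing a different list, which is in the spirit of the "few modifications" noted above.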
______________________
2. Planned IM Projects
• Further develop LUQ metadata database by:
  o Finishing the export of LUQ metadata to EML level 5 with attribute data, which will make LUQ's metadata ready to be processed by present and future web services.
  o Re-structuring the metadata database to accommodate the new uses given to the metadata, so that the process of producing EML packages or any other kind of metadata products can be automated (that is, re-structure LUQ metadata database to a more modular structure).
• Change the LUQ LTER site to a more dynamic, database-driven web site that will facilitate the installation of effective searching mechanisms for the users.
• Create scripts that will automate the online input of metadata information. Two kinds are to be produced:
  o Script(s) that process one set of forms that take direct metadata input from the user and store it in an online database (preferably MySQL)
  o Other scripts that accept the LUQ LTER MS Word metadata standards (http://luq.lternet.edu/datamng/imdocs/dsetallfrm.doc), process them, and load the metadata information into the online metadata database (preferably MySQL).
LTER Site: McMurdo Dry Valleys (MCM)
Contributor: Chris Gardner (Aug 10, 2007)
Site Byte:
First of all, I apologize for the lateness - I thought I had submitted this several weeks ago, but I guess I wrote it and forgot!
1. Annual Site Byte
Information Management at MCM over the past year has focused on improving our GIS capability, increasing data redundancy, better tracking analytical samples in the field, and preparing for our site review in January, 2008 in Antarctica.
Many GIS layers in our database were cleaned up and properly attributed, and protocols were developed for handling spatial layers. Several new spatial layers were created, including a DEM and bathymetric lake layers. Finally, an interactive online map using ESRI's ArcGIS Server technology was implemented on www.mcmlter.org. This map is JSP based, and pulls layers directly from our Oracle database. The map also serves as a data portal - clicking on certain features returns actual data and metadata.
An older Sun workstation was also acquired for free. This machine is now running Oracle 10g and automatically replicates any changes in the main database every 7 days using materialized views. Rsync open source replication software was also installed on this machine, and it replicates key directories from the main server.
Chain of Custody (COC) forms for analytical chemical samples were redesigned for the 2006-2007 Antarctic field season. These forms are now inserted into the database and viewable both by users of Analytical Services in the field, and by team members running the samples. A page in the restricted section of our website gives users new flexibility for tracking samples and assuring data continuity from the field to the database. The COC database can be used to compare actual samples and sample names received to data submitted after the field season. This will solve many problems with missing data and incorrect sample names.
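The comparison of samples received against data submitted can be illustrated with a short Python sketch. The sample names and the function name are hypothetical, and the real comparison would run against the COC tables in the database rather than in-memory lists.

```python
def compare_samples(received_names, submitted_names):
    """Compare sample names logged on COC forms with names in submitted data.

    Returns names that were received but have no data, and names that appear
    in the data but were never logged (often a sign of a mistyped name).
    """
    received = set(received_names)
    submitted = set(submitted_names)
    return {
        "no_data_submitted": sorted(received - submitted),
        "not_in_custody_log": sorted(submitted - received),
    }


# Hypothetical sample names for illustration.
received = ["LB-001", "LB-002", "LF-010"]
submitted = ["LB-001", "LB-02", "LF-010"]
report = compare_samples(received, submitted)
print(report)
```

In this example the pair LB-002 / LB-02 shows up on both lists, which is the pattern a mistyped sample name would produce.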
______________________
2. Planned IM Projects
Our main task is now to prepare for our site review in January, 2008. We will be updating data and protocols and ensuring that we meet the review criteria for LTER Information Management Systems. Another goal is to continue to make the database more useful by offering complex web-based queries and joins on data. Finally, the online map will be expanded and GIS capabilities will be increased. We will also be creating EML for our spatial layers.
LTER Site: Moorea Coral Reef (MCR)
Contributor: Sabine Grabner (Jul 25, 2007)
Site Byte:
1. Annual Site Byte
The past year's IM activities focused on setting up an environment to post the site's long term datasets online. Datasets were described in EML varying from level 3 to 5. Posting EML metadata online through XSLT transformation was adapted from SBC. Most datasets can be downloaded automatically and downloads are registered in the database.
Our back end is a combination of disk storage and database. For insertion of manually collected data we developed database interfaces using web forms and Java applications. Maintaining the database schema is a continuous task and will become a major task in the coming year, when we will make the schema fully compliant with the EML tags we use.
______________________
2. Planned IM Projects
1. develop generic instrument data harvester for harvesting TCP connected instruments/loggers/fieldpcs (Java, XML, RBNB)
2. make database compliant with used EML tags
3. develop web interface to edit metadata
4. generate EML from DB
LTER Site: North Temperate Lakes (NTL)
Contributor: Dave Balsiger, Barbara Benson, Jonathan Chipman (Aug 01, 2007)
Site Byte:
1. Annual Site Byte
We have continued our development in the area of sensor networks and our leadership in the Global Lake Ecological Observatory Network (GLEON). We are collaborating with computer scientists from the University of California San Diego Supercomputer Center and SUNY-Binghamton on automating the configuration and quality assurance components of the NTL information system for near-real time data streaming from instrumented buoys. We are developing a database of lake characteristics information on lakes associated with GLEON including web forms for users to enter and edit information from their sites.
Two major upgrades to NTL-LTER data collection software were accomplished by our student programmer during the past year. Software used in counting and measuring zooplankton was made more user friendly and efficient. Software used by the fish crew for field data collection was migrated to the latest IPAQs with substantial program enhancements based on feedback from the fish crew and information management staff.
During the past year we developed complete EML metadata for the NTL spatial data sets. Inigo San Gil was very helpful with the development of a set of methods for converting ESRI-format metadata into EML. At NTL, most spatial metadata are originally developed using ESRI's ArcCatalog utility. We then save the metadata as XML, use a stylesheet to reformat them into EML, and then use the oXygen XML editor to correct various errors and make the metadata fully EML-compliant. EML from the spatial datasets are included in the Metacat harvest.
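To make the ESRI-to-EML conversion step more concrete, here is a minimal sketch of one small piece of it: mapping the FGDC-style bounding coordinates that ArcCatalog exports into an EML geographicCoverage element. NTL uses a stylesheet for this; the sketch below swaps in Python's standard xml.etree.ElementTree instead, and the function name, sample coordinates, and description text are assumptions for illustration. The output is not validated against the EML schema.

```python
import xml.etree.ElementTree as ET

# FGDC bounding element names paired with their EML counterparts.
FIELD_MAP = [
    ("westbc", "westBoundingCoordinate"),
    ("eastbc", "eastBoundingCoordinate"),
    ("northbc", "northBoundingCoordinate"),
    ("southbc", "southBoundingCoordinate"),
]


def bounding_to_eml(fgdc_xml, description):
    """Build an EML geographicCoverage element from FGDC-style metadata XML."""
    source = ET.fromstring(fgdc_xml)
    bounding = source.find("./idinfo/spdom/bounding")
    if bounding is None:
        raise ValueError("no idinfo/spdom/bounding element found")
    coverage = ET.Element("geographicCoverage")
    ET.SubElement(coverage, "geographicDescription").text = description
    coords = ET.SubElement(coverage, "boundingCoordinates")
    for fgdc_name, eml_name in FIELD_MAP:
        value = bounding.findtext(fgdc_name, "").strip()
        ET.SubElement(coords, eml_name).text = value
    return coverage


# Hypothetical metadata fragment; the coordinates are made up.
sample = """<metadata><idinfo><spdom><bounding>
<westbc>-89.70</westbc><eastbc>-89.40</eastbc>
<northbc>46.10</northbc><southbc>45.90</southbc>
</bounding></spdom></idinfo></metadata>"""
element = bounding_to_eml(sample, "Hypothetical study area")
print(ET.tostring(element, encoding="unicode"))
```

A full conversion also has to carry over titles, contacts, attribute definitions, and so on, which is where the manual correction step in an XML editor comes in.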
We currently provide access to the NTL spatial data through several interfaces -- a "static" data catalog for downloading individual files, a set of interactive ArcIMS map applications, and an Oracle/ArcSDE geodatabase for direct interactive access to all spatial data. We are exploring a series of improvements to these interfaces, including investigating methods for efficient storage of and access to very high-volume (multi-gigabyte) raster images.
______________________
2. Planned IM Projects
1. Design and implement an alternate search interface for the NTL data catalog to allow users to query by investigator, subject, keywords, lake name, geographic location, and taxonomic criteria.
2. Continue our progress in automating data processing of near-real time data from instrumented buoys and the NTL meteorological station including testing a new data model.
3. Develop a generic version of our custom zooplankton software that would enable microscope counting and measuring of other organisms.
4. A new seminar in information management and technology for graduate students will be offered this fall at the Center for Limnology. Organized by Barbara Benson and Paul Hanson, this course will cover topics such as database design, software options, and metadata requirements.
LTER Site: Niwot Ridge LTER (NWT)
Contributor: T. M. Ackerman (Aug 01, 2007)
Site Byte:
1. Annual Site Byte
This year was our mid-term site review. The review process proved once again a trying but very important process. The comments from the review team were extremely helpful and will result in an improvement in our IM system.
This year also saw Niwot Ridge being selected as a Core Site for the NEON.
Our Relational Database server was upgraded from MSSQL 2000 to MSSQL 2005. The production database server (Dell PowerEdge 500SC) has been upgraded to a more powerful server (Dell PowerEdge 2600) capable of handling more requests. We migrated the existing databases designed in Microsoft SQL Server 2000 to a redesigned SQL Server 2005 schema. Currently this server is handling the generation of the EML metadata and, once version upgrade issues are resolved, will be able to allow query of the climate data. All of the meteorological data that are collected via the wireless radios are imported into the current schema.
Three sites have recently been enhanced with High Throughput radios which support Ethernet as well as serial communications. The Ethernet communications are in use at the Soddie lab in order to remotely control and manage the computer which interfaces with the instrumentation. These radios allow problems to be troubleshot quickly; previously, such problems may have gone weeks without notice. The installation of these radios has saved time and energy for researchers and MRS personnel, adding the ability to perform minor tasks such as changing data logger programs or rebooting upon system failure.
A wireless connection from the MRS to Boulder replacing the current T1 connectivity is in development. The current system consists of a fiber optic line from the Tundralab to the MRS and a T1 from the MRS to the world. The new system will consist of the existing fiber optic line from the MRS to the Tundralab and a wireless connection from the Tundralab to the world with a bandwidth of 3 Mbps. We have installed two relay stations, ground-truthed the sites, and are currently awaiting the installation of the receiver on the CU campus.
The precipitation and temperature datasets for C-1 (1952-2006), Saddle (1981-2006), and D-1 (1952-2006) have been analyzed in order to fill in missing values and correct erroneous values. Adjacent sites were used to statistically insert values. These new datasets provide a more robust ability to look at some of the longest continuous high elevation climate data sets.
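The text does not say which statistical method was used to insert values from adjacent sites. One common approach, shown here purely as an assumed illustration, is to fit a least-squares line between the target station and a neighboring station over the dates when both have data, then use it to estimate the target's missing values. All names and numbers below are hypothetical.

```python
def fit_line(x_values, y_values):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x_values)
    if n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    if sxx == 0:
        raise ValueError("neighbor values are constant; cannot fit a line")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fill_gaps(target, neighbor):
    """Estimate missing target values (None) from a neighboring station.

    Returns a list of (value, was_filled) pairs so that estimated values
    stay distinguishable from observed ones.
    """
    pairs = [(n, t) for t, n in zip(target, neighbor)
             if t is not None and n is not None]
    slope, intercept = fit_line([p[0] for p in pairs], [p[1] for p in pairs])
    filled = []
    for t, n in zip(target, neighbor):
        if t is None and n is not None:
            filled.append((round(intercept + slope * n, 2), True))
        else:
            filled.append((t, False))
    return filled


# Hypothetical daily temperatures; None marks a missing observation.
target = [1.0, 3.0, None, 7.0]
neighbor = [2.0, 4.0, 6.0, 8.0]
filled = fill_gaps(target, neighbor)
print(filled)
```

Carrying a was_filled flag alongside each value matters for long climate records, since users need to know which values were estimated.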
______________________
2. Planned IM Projects
1) Improve Site Research Application to include formal project definitions, for better data acquisition tracking.
2) Web metadata form.
3) Attempt to get data into EML file.
4) Climate data query system.
LTER Site: Palmer Station (PAL)
Contributor: Karen Baker (Sep 13, 2007)
Site Byte:
1. Annual Site Byte
There were two major foci this year: growth of the information infrastructure and launch of a queriable site information system. In terms of physical infrastructure, a second server (iSurf) was deployed providing increased storage together with isolation of web and collaboration services and also providing an LDAP server for single sign-on authentication. In terms of organizational infrastructure, department arrangements for a computational recharge facility were supported as a growth strategy central to cross-project infrastructure and resource sharing. Our physical work area was moved from a long-time laboratory to a more compact Design Studio, centered on a refurbished design table and posting spaces that serve as memory traces. In terms of conceptual infrastructure, we continue to develop the tie between theory and practice through the framework of Ocean Informatics. In terms of social infrastructure, an informatics team approach coalesced to include KBaker, MKortz, JConners, LYarmey, JWanetick together with part-time associates and stronger ties with other local data initiative efforts (CalCOFI and SWFSC).
Throughout the year, Datazoo, an online information system, was designed and developed, combining a MySQL backend with an object-oriented, PHP-based web frontend. Data practices that included the information system were discussed with site researchers and students during demonstrations of the system conducted in the summer. The system architecture addresses critical data integration issues so that multi-dataset browsing, comparison, and joins are possible. Though the core functions had all been addressed by June, subsequent recoding improved the interface and created new functionality. We expect this process to continue according to the iterative design principle that “Data-using informs Information-Managing informs Data-using”.
Population of the database began over the summer, during which the data ingestion procedure and template were streamlined. Datazoo is now a cross-site information system using an augmented EML specification where metadata describes datasets to the column level through use of attribute, unit, and qualifier dictionaries. Work on attribute naming conventions and re-naming conventions was carried out. Capacity to publish dataset metadata to the community catalogue was demonstrated by submitting our first dataset to the Metacat server this summer.
Tools were added as a feature of the information system. A grid converter, which calculates line and station given latitude and longitude (and vice versa) and uses Google maps for visual display, was given a second interface for batch mode input. Code was refactored to separate input from display functionality. A time formatting tool permits construction of time in a format required for dataset ingestion. Finally, a joining tool permits matching a dataset file against an existing eventlog dataset so that designated columns can be added to the original dataset file.
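The joining tool described above can be pictured with a small Python sketch: rows in a dataset are matched against an event log on a shared key, and designated event-log columns are copied onto each dataset row. The function name, key, and column names are hypothetical and are not taken from Datazoo.

```python
def join_event_columns(dataset_rows, eventlog_rows, key, columns):
    """Add designated eventlog columns to each dataset row, matched on key.

    Rows with no matching event keep their original values and receive
    empty strings for the added columns.
    """
    lookup = {row[key]: row for row in eventlog_rows}
    joined = []
    for row in dataset_rows:
        event = lookup.get(row[key], {})
        merged = dict(row)
        for column in columns:
            merged[column] = event.get(column, "")
        joined.append(merged)
    return joined


# Hypothetical dataset and eventlog rows.
dataset = [
    {"event_id": "E1", "chl_a": "0.42"},
    {"event_id": "E9", "chl_a": "0.37"},
]
eventlog = [
    {"event_id": "E1", "latitude": "-64.77", "longitude": "-64.05"},
]
result = join_event_columns(dataset, eventlog, "event_id", ["latitude", "longitude"])
for row in result:
    print(row)
```

Leaving unmatched rows in place with blank values, rather than dropping them, makes gaps in the event log visible to the person preparing the dataset.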
Three posters were presented at the annual IMC: Data Integration in the Decade of Synthesis, (MKortz, LYarmey, JConners, and KBaker); Environmental Data Management: Infrastructure Studies Insights (FMillerand and KBaker); and Long Term Informatics (KBaker, CChandler, AGold, FMillerand, and JWanetick). More than a dozen Databits contributions over the year included news articles, reports, and good reads as well as an editorial and an FAQ. Work with the NSF Comparative Interoperability Project continued bringing forward the concepts of ‘enactment’, ‘articulation work’ and of ‘knowledge provinces’. A joint keynote with FMillerand at the IMC07 presented to the community the notions of roles, informatics, and sociotechnical systems. A history associate worked at SIO early summer focusing on historical data models.
______________________
2. Planned IM Projects
Planned projects include continuing to migrate data from the original PAL data system to the new Datazoo system and populating our augmented metadata schema by working through the metadata web forms with participants. We recognize this as an opportunity to elicit tacit data collection information and to facilitate category building and vocabulary development with local participants. Having added a user-friendly view of the metadata in addition to the xml view, we will be considering methods for download of the metadata with the data. With Datazoo, we will reconsider both project-study and cross-project relations as well as dataset creation and ingestion. Further, there will be a more extensive online help system developed in coordination with user input. Finally, working groups on Matlab and on a plotting review were held in the last year. We are planning to develop visualization in general and a plot gallery that builds from the redesign of existing media galleries in particular.
Ongoing partnerships within LTER and with Ocean Informatics Woods Hole and MIT will continue as will collaboration with Science Studies, NOAA, and international partners. Within LTER, we will continue to contribute to efforts to articulate and/or demonstrate strategies that enable site contributions to network efforts. Specifically we are working locally on units, dictionaries, quality control, semantic bridges, and learning environments so are engaged with the LTER Information Management Unit Registry Task Force, Dictionary Process, Quality Control, and Training Working Groups.
LTER Site: Plum Island Ecosystem (PIE)
Contributor: Hap Garritt (Jul 23, 2007)
Site Byte:
1. Annual Site Byte
The Plum Island Ecosystems (PIE) LTER has been working diligently the last year at upgrading our EML metadata from Level 3 to Level 4+. We feel complete attribute level information is a critical component of metadata as most researchers want the actual research data sets. We are currently redoing existing Level 3 EML by reentering our MS Word metadata into a modified version of the ARC LTER Excel based metadata form developed by Jim Laundre. The Excel form is relatively straightforward with metadata in Sheet 1 of Excel and data in Sheet 2 thus allowing the metadata and research data to always “travel” together. With the help of a site visit by Inigo San Gil in February 2007, we are also making progress on generating EML compliant metadata for our ArcGIS data. Generation of EML compliant metadata dominates most IM activity; hopefully this effort will lessen as our legacy metadata reaches Level 4+ status.
______________________
2. Planned IM Projects
1) Generate EML level 4+ metadata for all data sets.
   a) all non spatial data
   b) ArcGIS spatial data
   c) IDRISI (Clark University) GIS spatial data
   d) RIVERGIS (University of New Hampshire) GIS spatial data
2) Redesign PIE web site
   a) Redesign skin template
   b) Develop usable GIS web page
   c) Develop usable Photo/image web page
3) Continue generation of discrete TRENDS datasets
LTER Site: Santa Barbara Coastal (SBC)
Contributor: Margaret O'Brien (Jul 20, 2007)
Site Byte:
1. Annual Site Byte
Most of the past year's IM activities at SBC have been laying the groundwork for the tasks listed in Part 2.
Metadata: work on a design for a metadata database which can be used by both SBC and MCR; work with PIs to more completely and correctly describe sampling locations and parameters. Start work on a prototype data portal and gather feedback from user groups.
Integrate data processing with IM: assist scientists in a redesign of hydrologic data processing using Matlab to lessen the dependence on Excel spreadsheets. We plan to create EML packages as a data processing step.
EML data table query tool: create a generic application for querying tables which are described in EML (see Spring 07 issue of Data Bits).
In addition, the SBC website was redesigned, adding dynamic menus and incorporating several features recommended by the network. The new design uses templates and libraries in a flexible framework that allows existing pages to be edited and new pages to be added easily.
______________________
2. Planned IM Projects
A. Create an interface to browse SBC's data catalog by sampling site and parameter
B. Finish and install the EML data table query application
C. Populate metadata database
D. Write Matlab m-files to output EML as a processing step, with metadata drawn from db
LTER Site: Shortgrass Steppe (SGS)
Contributor: Nicole Kaplan and Bob Flynn (Aug 01, 2007)
Site Byte:
1. Annual Site Byte
The SGS-LTER Information Management System archives extensive amounts of data, metadata, and other information. We have made significant revisions to our relational database management system to more efficiently organize, relate and deliver information, data and metadata. Our new schema also contains metadata content to generate level 5 EML, which defines each attribute within each dataset and the associated measurement units or code definitions. Implementation of EML level 5 will begin with long term studies consisting of 30 datasets (15% of our entire database), which support both legacy and current studies. Perl scripts have been developed to generate EML from our database and store it on our web server for harvest by the LTER Network Metacat catalog. This level of metadata and our participation in the community-wide Metacat will facilitate data discovery, data integration and synthetic research.
We are contracting with a web development team at our home institution, Colorado State University, to implement our newly designed website that queries, joins, and publishes information from the revised database. The benefits of using the database to drive the website content include delivery of information that is more dynamic, connections between different types of information within the relational database management system, and increased ease of use for our end users with the latest web features. An example of related information that can be delivered within just a few clicks of the mouse includes a publication citation with the supporting data and the authors’ e-mails. The SGS-LTER website is large in scope, presenting an impressive amount of information that requires unique tools compared to other projects on the CSU campus. The tools we will build can be maintained and expanded by SGS-LTER Staff and may be passed along to other LTER sites. Implementation is planned for Fall of 2007.
We are also providing data, metadata, and expertise to integrate datasets with other LTER sites. For example, researchers working at SGS, SEV, and JRN, as well as on the TRENDS project, are interested in determining the drivers of Net Primary Productivity (NPP). The Grasslands Data Integration Project (GDI) brings together ecologists, computer scientists, and information managers to address the challenges of integrating diverse datasets and to produce analyses from the integration product. Unfortunately, integrating NPP data over time and across regions is not a straightforward problem. Issues of ecological synthesis and data integration arise from questions of methodology (e.g., sampling, plot sizes), data semantics (e.g., changing taxonomic information), and straightforward but messy data transformations. These datasets represent a core area of research within the LTER community, and an integrated databank would create a powerful resource for ecologists from several sites to perform cross-site analyses.
Our staff provides support for SGS-LTER researchers and students in various aspects of GIS including gathering data with GPS equipment and imagery, assisting with GIS model development for their particular research, and providing GIS data and maps for field work and modeling. We have also extended existing GIS programs for analysis of SGS-LTER data.
We also continue to participate in planning future activities for the LTER IM committee. Nicole serves as co-chair of the IM Committee, which establishes priorities and strategies to facilitate network-level and synthetic research that requires greater integration of information and interoperability of systems.
______________________
2. Planned IM Projects
- Populate database with EML level 4-5 content
- Harvest more EML
- Implement new website
- Upgrade server
- Contribute to site renewal proposal
LTER Site: Virginia Coast Reserve (VCR)
Contributor: John Porter (Aug 01, 2007)
Site Byte:
1. Annual Site Byte
Another busy year at the Virginia Coast Reserve LTER project! We're greatly enjoying our new home at the Anheuser-Busch Coastal Research Center. Its modern, climate-controlled laboratory and dormitory facilities are a giant improvement over the renovated farm house it replaced. However, based on how busy it's been this summer, we may need to expand it soon!
On the IM front, this has been a year for consolidating some activities and improving user interfaces for data access. In particular, we've improved our local data catalog by changing a long list into a matrix format that can be sorted by a variety of criteria, including core area and popularity. We also did some major work on our database of locations (for datasets, species observations, etc.), dealing with a large number of locations that were named and described but had no coordinates. To avoid such problems in the future, a new interface for identifying locations was developed. It uses the Google Maps API v.2 (following testing to make sure that the georeferencing was sufficiently accurate in our area) so that a user can use the standard Google Maps controls to zoom and pan the map, then click on a location to record its coordinates in our location data table. Coupled with that new facility is an interactive map (again using the Google API) that plots all locations for VCR/LTER datasets. Clicking on a location pulls up a list of all the datasets associated with that location. Currently the system supports only points and bounding boxes, but we look forward to adding polygon capabilities in the future.
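The lookup behind such a map can be sketched roughly as follows. This is a minimal Python illustration, not the VCR implementation: it assumes each location is stored as a bounding box (a point being a box with identical corners), and the class, function, coordinates, and dataset IDs are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Location:
    """A named location stored as a bounding box; a point has equal corners."""
    name: str
    south: float
    west: float
    north: float
    east: float
    datasets: list = field(default_factory=list)

    def contains(self, lat, lon, tolerance=0.0):
        """True if (lat, lon) falls inside the box, padded by tolerance degrees."""
        return (self.south - tolerance <= lat <= self.north + tolerance
                and self.west - tolerance <= lon <= self.east + tolerance)


def datasets_at(locations, lat, lon, tolerance=0.001):
    """Return the sorted IDs of all datasets whose location matches a map click."""
    found = set()
    for loc in locations:
        if loc.contains(lat, lon, tolerance):
            found.update(loc.datasets)
    return sorted(found)


# Hypothetical records; names, coordinates, and dataset IDs are illustrative only.
LOCATIONS = [
    Location("Well A", 37.450, -75.670, 37.450, -75.670, ["DS001"]),
    Location("Island transect", 37.400, -75.720, 37.480, -75.650, ["DS002", "DS003"]),
]

if __name__ == "__main__":
    print(datasets_at(LOCATIONS, 37.4501, -75.6702))
```

The tolerance lets a click near a point location still match it. Polygon locations would need a point-in-polygon test in place of the simple box comparison, and boxes that cross the antimeridian are not handled.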
We have also worked on enhancing data systems for important datasets. Traditionally these have been handled using user-designed spreadsheets. However, in some cases these were inconsistent in format and highly variable in the degree of quality control and assurance. For our water quality dataset we have developed an operational system that incorporates spreadsheet forms (including both data and “flags”), a Perl ingestion program that resolves relational issues that the spreadsheet handles poorly, a MySQL database, R scripts that conduct standard QA analyses, and data editing capabilities using phpMyAdmin and an ODBC-linked Access database. A web page provides a central location for uploading spreadsheets, running QA analyses, and editing data. We plan to use this system as a prototype for additional types of data in the near future.
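A standard QA analysis of the kind mentioned above might resemble the following range check, which assigns a flag to each value. It is a sketch in Python rather than the R used at VCR, and the variable names, limits, and flag codes are assumptions made for illustration only.

```python
# Hypothetical acceptable ranges; real limits would come from the data managers.
LIMITS = {"salinity": (0.0, 40.0), "water_temp_c": (-2.0, 40.0)}


def qa_flags(record, limits=LIMITS):
    """Return a flag per variable: 'M' missing, 'R' out of range, '' otherwise."""
    flags = {}
    for variable, (low, high) in limits.items():
        value = record.get(variable)
        if value is None:
            flags[variable] = "M"
        elif not low <= value <= high:
            flags[variable] = "R"
        else:
            flags[variable] = ""
    return flags


if __name__ == "__main__":
    print(qa_flags({"salinity": 31.5, "water_temp_c": 55.0}))
```

Storing the flags alongside the values, rather than deleting suspect values, keeps the original observations available for later review.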
We have also experimented with the Drupal content-management system. Although PostNuke systems are used for all of our major web pages, Drupal is being used for new graduate student and project-specific web pages that depend heavily on relatively unstructured user input, as the Drupal tools are somewhat simpler and easier to use for the novice user, albeit perhaps less powerful than the PostNuke tools for the expert user.
We are also looking at options for the future. Currently our main systems are hosted by a powerful, but aging Sun Blade 2000 server. However, as more software has become available in the Linux environment, we see that as an option for our future. We have therefore set up a “developmental” Linux server on a functional, but obsolete, PC, and are using it to test software etc.
Our wireless network on the shore continues to expand. We had to do a major rebuild on one of the nodes when it took a direct lightning strike so intense that most connections were welded together. However, that gave us a chance to experiment with improved power systems that now permit 24/7 operation. We are adding an extensive network of ground-water wells to the network on Hog Island and developing plans for extending the network to other islands. We have also collaborated with the University of Virginia's Engineering School to deploy a mote-based network of light sensors (LUSTER – Light Under Shrub Thickets Ecological Research).
We have continued our strong interactions with the Taiwan Ecological Research Network (TERN). TERN researchers Chi-Wen Hsaio and Chau-Chin Lin visited during the winter of 2006 and spring of 2007, respectively. We worked with them on developing web-based systems that ingest EML documents and the associated data and produce quality assurance analyses. We also participated, along with Don Henshaw from AND and Kristin Vanderbilt from SEV, in a workshop in Shanping, Taiwan, that helped train East Asia Pacific (EAP) ILTER researchers in the use of Kepler and other tools that exploit EML.
Along with Chau-Chin Lin, John Porter made an invited presentation to the Coastal Environmental Sensing Networks workshop at the University of Massachusetts, Boston, in April 2007. We also took the opportunity to visit Emory Boose at HFR during the trip. John Porter also remains a member of the Oak Ridge National Laboratory Distributed Active Archive Center User Working Group.
______________________
2. Planned IM Projects
1) Expand use of relational databases in day-to-day data input and management and improve automated QA/QC checking
2) Explore how to best capture polygon-based location data
3) Continue development of EML-based tools, ideally as web services
4) Link VCR Personnel Database with LNO Personnel database, as API becomes available
LTER Site: Other Site (Guest Participant)
Contributor: Sheng-Shan Lu, Chau-Chin Lin, Meei-Ru Jeng (Jul 28, 2007)
Site Byte:
1. Annual Site Byte
The Taiwan Ecological Research Network (TERN) IM team at the Taiwan Forestry Research Institute has benefited in many ways from working with US LTER information managers since 2004 (see the article ‘LTER intensifies IM interactions with Taiwan’ in The LTER Network News, Vol. 20, No. 1, pp. 18-19, Spring 2007). The team adopted a three-tier framework for an ecological information system based on EML in 2006. The system aids with editing, storing, and using documents in the multiple languages of the Asian cultures that make up the East-Asia Pacific Regional Network (EAPRN-ILTER). The IM team has held over 20 workshops in the past two years to teach researchers how to use Morpho and create their own EML documents, mainly within Taiwan but also promoting these tools to the EAPRN-ILTER. Just two weeks ago, the EAPRN-ILTER IM Committee Member Meeting and Ecological Information Management Workshop was held in Kaohsiung, Taiwan, from 9-14 July 2007. We greatly appreciate that John Porter, Don Henshaw, and Kristin Vanderbilt came to Taiwan to assist with this workshop. In addition, the Third ILTER Workshop on the Ecological Information Management in the East Asia-Pacific Region will be held in Seoul, Korea, during 16-21 October 2007, again with help from TERN and US LTER. We also expanded our relationship to the Australian LTER this March, focusing mainly on ecological information management system issues, and further cooperation between Australia and Taiwan on IM issues will continue. However, a strategy for encouraging data submission or improving data quality still needs to be developed in this regional network.
______________________
2. Planned IM Projects
a. Review all data and metadata, improve the metadata content, and improve EML through QA/QC
b. Promote scientific collaboration through a data sharing policy under which all ecological or environmental monitoring data can be shared with all academic institutions through cyberinfrastructure deployment
c. Create Open GIS web services using Google Earth/Google Maps to provide the locations of biological collections at the TFRI herbarium and insect collection
d. Apply the Kepler workflow system to give researchers the opportunity to create their own EML documents
e. Third ILTER Workshop on the Ecological Information Management in the East Asia-Pacific Region, in Korea, October 2007