LTER Site Bytes
- AND (Andrews LTER)
- ARC (Arctic LTER)
- BNZ (Bonanza Creek LTER)
- CCE (California Current Ecosystem)
- CAP (Central Arizona - Phoenix)
- CWT (Coweeta LTER)
- FCE (Florida Coastal Everglades)
- GCE (Georgia Coastal Ecosystems)
- HBR (Hubbard Brook LTER)
- JRN (Jornada Basin)
- KBS (Kellogg Biological Station)
- KNZ (Konza Prairie LTER)
- LUQ (Luquillo LTER)
- MCM (McMurdo Dry Valleys)
- NWT (Niwot Ridge LTER)
- NTL (North Temperate Lakes)
- PAL (Palmer Station)
- SEV (Sevilleta LTER)
- SGS (Shortgrass Steppe)
- VCR (Virginia Coast Reserve)
LTER Site: Andrews LTER
Contributor: Suzanne Remillard and Don Henshaw (Sep 14, 2006)
Site Byte:
The Andrews Information Management (IM) Team includes Don Henshaw (USFS), Suzanne Remillard (Oregon State University), Theresa Valentine (USFS), and Fred Bierlmaier (OSU). Zhiqiang Yang, a remote sensing post-doc at OSU and a former Information Manager for KNZ, is helping us review and edit our EML with respect to best practices and provides expertise for dynamic EML generation and other technical issues.
This past year has been a year of transition for the Andrews Forest. Gody Spycher, the long-time LTER-funded Information Manager for the Andrews, retired in April and Suzanne has officially taken the reins. Gody has been terrifically productive working behind-the-scenes since 1985 in building information systems for the Andrews and will be greatly missed. Fortunately, Gody will still be available to assist our team on a temporary appointment. Additionally, Mark Harmon has stepped down as the lead site PI and Barbara Bond is now the lead.
In light of this transition and the loss of an IM position (Suzanne’s previous soft-funded position is eliminated), the Andrews has initiated a planning effort to discuss the scope of Information Management (IM) activities, priorities with respect to database and metadata development, and the roles of investigators with respect to IM. The Andrews also received an NSF-funded field station planning grant. This funding will be used in part to improve efficiency in data processing and quality assurance checking of data from our sensor networks and to streamline the flow of data from the field to the lab.
Study database and metadata development for online access continues to be the highest priority with continual updates to long-term data sets, the migration of legacy data sets, and the development of newly submitted data sets from Andrews researchers. In the last year, we have added or updated 40 data sets. Currently, 140 data sets are online, including 24 spatial (ArcGIS) data sets. A web demo and poster at this year’s ASM will feature our metadata-driven information system and interactive data retrieval programs that deliver value-added climate and streamflow data. Gody completed development of an interactive management system before retiring, which will facilitate entry, editing, and quality assurance checks of data.
Suzanne continues as the database administrator for the combined ClimDB/HydroDB effort. The ClimDB/HydroDB warehouse now includes over 7 million daily measurement values from 39 sites (280 stations) and hosts 30 visitor sessions per day. Don and Suzanne are collaborating with Wade Sheldon (GCE) on a ClimDB/HydroDB paper which Don presented at the International Conference on Hydrological Science and Engineering in Philadelphia on September 12, 2006.
Don chairs the LTER Network Information System Advisory Committee (NISAC) and serves on the LTER Executive Board representing the IM Committee. As NISAC chair Don also participates with the Cyberinfrastructure (CI) Team and as a technical representative on the Trends Editorial Board. Don attended an East Asian Pacific (EAP) ILTER Information Management Workshop hosted by the Taiwan Ecological Research Network (TERN) in February. John Porter (VCR) and Peter McCartney (NSF) also participated as instructors and in attendance were representatives from 7 other nations. This visit has enhanced collaboration with TERN and Taiwan sites are now participating in ClimDB/HydroDB. ILTER workshop participants from Taiwan, Japan, and Malaysia will be attending this year’s ASM. Don also participated as a panelist in the Central America Workshop for Information Technology and Biodiversity in Panama to better link information technology efforts with the community of biologists and to consider cyberinfrastructure improvements in Central America.
Theresa remains involved with the LTER GIS working group and will lead a GIS workshop at the ASM 2006. Theresa has led a pilot project to develop WatershedDB, an interactive web application and database intended to provide spatial data layers in conjunction with ClimDB/HydroDB sites. (Look for the demo and poster at this year’s ASM.) The project is complete and we are looking for additional funding or collaboration to continue this effort. A website is available: http://wwwgis.forestry.oregonstate.edu/website/wsheddb1/wsheddb.htm. Theresa also co-leads the annual GIS Day in coordination with OSU departments and hosted over 400 Corvallis middle school students this past year. Students interacted in the use of GPS and the Andrews Interactive Mapping Site, and volunteer lectures highlighted the use of GIS in natural resources.
Fred Bierlmaier (OSU) is the on-site Andrews system administrator and maintains the site Local Area Network (LAN), wireless LAN, and digital radio and spread spectrum telemetry networks. The newly installed wireless access has been incredibly useful for researchers and conference attendees at the Andrews. The site has also received OSU matching funds from student fees that will buy two new computers and a Polycom video-conferencing system for the Andrews site.
LTER Site: Arctic LTER
Contributor: Jim Laundre (Oct 27, 2006)
Site Byte:
This past year EML implementation and new web design have been a major focus of the Arctic LTER IM. We have successfully moved our legacy data to EML level 2 using a Perl/java script developed by the LNO. For new metadata we have developed a simple one-sheet Excel form (modeled after FCE’s form) for metadata entry (see http://ecosystems.mbl.edu/arc/metadata_forms/MetadataBlank.xls). The one-worksheet Excel form will help researchers easily include metadata with their Excel datasheets. A rich text metadata form is available for researchers who do not use Excel. EML is then generated using a VBA macro. Currently the EML produced is level 3-4. There is still some coding to be completed in order to meet the “Best Practices” guidelines. As soon as this coding is done and the documentation is completed, a version will be placed on the LNO’s CVS site.
In order to update legacy metadata, a script was also developed to populate an Excel workbook with a metadata sheet and a data sheet. The code made best guesses at the attribute table based on our previous text metadata. We have begun checking and updating these files. At the same time, where possible, we will combine yearly files into multi-year files.
We started to redesign our web site more than a year ago but have been waiting to deploy it until after the Web Design Recommendations were completed. In designing our new web site we are moving away from a frame-based web site to using templates. We also plan to use style sheets to dynamically display metadata and have the Excel metadata and data workbook file available as a download.
From the field: This winter will be the start of Toolik Field Station having year round power. At least two staff will be on station all winter. For the Toolik Field Station webcam see http://www.uaf.edu/toolik/webcam. With power the fiber optic internet connection will be available year round and will provide continual downloads of weather data (http://ecosystems.mbl.edu/ARC).
LTER Site: Bonanza Creek LTER
Contributor: Brian Riordan (Sep 18, 2006)
Site Byte:
Bonanza Creek LTER data management has made huge strides this past year, not only in EML but also in overall data structure and our website flow. We have managed to be taken off of probation and are looking forward to our site review this coming summer. We have recoded a number of the ColdFusion web pages to reflect the new database design. We have taken as many of the Website guidelines as possible into account during our changes.
Data/website focus redirection: We have changed our philosophies on data structure and focus. Previously we were focused on a "story line" structure, in which the concepts behind the data files were more important than the data itself. Now we are focused on individual files and letting researchers decide how they want to use them.
Hardware - The purchase of a dedicated database Penguin server has allowed us to generate large climate search engines, robust data file searches, and 3 million + row tables. We now have the Coldfusion html server on a separate piece of hardware.
Bibliography - Half of the references are linked to full pdfs. The bibliography is very dynamic and can be searched on a number of different levels. We hope to merge the data files with bibliographies over the next year.
Personnel - We have modeled our new format after the Andrews site. It makes it look more professional and informative with keywords, publications, data files, and key personnel information.
Data - We now have over 191 diverse data sets. The PIs responded very well to this year's data file call and together they produced 40+ new data files this year. We rebuilt the data file search engine. It is formatted after the GCE site. All data files can now be searched for by personnel, keywords, sites, titles, and dates.
EML - We are currently producing 131 EML documents around level 2 - 3. We have made a number of different changes to the database to allow us to support attributes now. We are developing our site section to further round out our EML information. We hope to generate a dynamic EML system for the BNZ DB by Spring 07.
Climate Search engine - We have built a number of different search engines that are modeled after a "tax wizard" format. The engine dynamically builds queries that are executed once the user finishes the wizard. The results are then displayed in downloadable formats as well as graphs.
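The wizard-style query building described above can be illustrated with a minimal Python sketch. This is not the BNZ implementation (the site uses ColdFusion); the table name `climate`, its columns, and the selection keys (`station`, `variable`, `start`, `end`) are all hypothetical, chosen only to show how step-by-step selections might be assembled into one parameterized query that runs when the wizard finishes.

```python
import sqlite3

# Hypothetical filter columns; the real BNZ schema is not described in the text.
ALLOWED_FIELDS = {"station", "variable"}


def build_query(choices):
    """Turn the selections gathered by the wizard into SQL plus parameters."""
    clauses, params = [], []
    # Only whitelisted column names are ever placed into the SQL text.
    for field in sorted(ALLOWED_FIELDS):
        if choices.get(field):
            clauses.append(f"{field} = ?")
            params.append(choices[field])
    if choices.get("start"):
        clauses.append("obs_date >= ?")
        params.append(choices["start"])
    if choices.get("end"):
        clauses.append("obs_date <= ?")
        params.append(choices["end"])
    sql = "SELECT station, variable, obs_date, value FROM climate"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY obs_date"
    return sql, params


def run_wizard(conn, choices):
    """Execute the query only after all wizard steps are complete."""
    sql, params = build_query(choices)
    return conn.execute(sql, params).fetchall()


def demo():
    """Run the sketch against a tiny in-memory table of made-up values."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE climate "
        "(station TEXT, variable TEXT, obs_date TEXT, value REAL)"
    )
    conn.executemany(
        "INSERT INTO climate VALUES (?, ?, ?, ?)",
        [
            ("A", "air_temp", "2006-01-01", -20.5),
            ("A", "air_temp", "2006-01-02", -18.0),
            ("B", "air_temp", "2006-01-01", -15.0),
        ],
    )
    return run_wizard(conn, {"station": "A", "start": "2006-01-02"})
```

Keeping user-supplied values in the parameter list, rather than pasting them into the SQL string, is one common way to let a wizard build queries dynamically without exposing the database to malformed input.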
Future: We are now beginning to look at the site section of the database and generate a new look and feel to it. This will be a large task since we have over 1000 sites in our database and over 500 site alternative names. With the acquisition of the radios we are now working on fully automating our climate data. This will provide "real time" hourly data that will be available online immediately.
LTER Site: California Current Ecosystem
Contributor: Karen Baker (Oct 22, 2006)
Site Byte:
The CCE information management effort is in its second year extending information management strategies to support the CCE community of scientists and interfaces with the LTER Network. A new disk share capability augmented last year’s technical infrastructure build-out. Two disk mount technologies have been established for immediate use and comparative purposes (see MKortz, File Sharing Options: Elements of a Collaborative Infrastructure, http://intranet.lternet.edu/archives/documents/Newsletters/DataBits/06spring/#7fa). In addition, plans include a new server that will isolate web and storage services. Our informatics design studio capacity was updated and used for joint design sessions with California Cooperative Oceanographic Fisheries Investigations (CalCOFI) program participants. Organizational infrastructure was also built-out by initiating collaboration with CCE related projects.
Features added to the CCE web site (http://ccelter.sio.ucsd.edu) include a site dynamic mapper consisting of a web interface to the GMT plotting tool. In addition, a grid station converter was developed in order to make available the standard sampling grid line and station given any latitude and longitude. The eventlogger system designed last year was further developed and deployed on a series of CalCOFI cruises as well as on the first CCE Process Cruise in May 2006. During this cruise, support included ensuring daily web access to satellite imagery. After the cruise, a cruise web page is being developed and populated in order to collect cruise information and data. This serves as a data focal point while best practices, data procedures, and the local database are designed. In addition, identifying and gathering of long-term data for the Trends project took place.
Local cross-project work focused on cruise glossaries, eventlogs, and dataset dictionaries. As Ocean Informatics participants, we initiated a cross-project Matlab User Group, drawing together a variety of participants including programmers, technicians, and graduate students. An Infrastructure Studies presentation was made to the Integrative Oceanography Division in collaboration with a social science project, and posters (Ocean Informatics; SCCOOS Data System: A Real-Time Data Acquisition, Storage and Access System) were presented at the Marine Monitoring Conference 24-25 April hosted by Aquarium of the Pacific. KBaker became a UCSD Science Studies Affiliate (http://sciencestudies.ucsd.edu/) and collaboration with science studies projects (http://interoperability.ucsd.edu) and participants continued with a focus on development of the notion of Infrastructure Studies. A student from sociology worked at SIO over the summer with a focus on the development of the eventlogger system. As LTER Network participants, we continued communications via feature article and good read submissions to the Databits Newsletter. A team of five Ocean Informatics participants for PAL & CCE submitted posters (CCE LTER: Information Management (2004-2006); Research in Infrastructure Studies: Social and Organizational Perspectives on Ecological Data Management) and attended the 2006 All Scientists Meeting.
LTER Site: Central Arizona - Phoenix
Contributor: Corinna Gries (Sep 12, 2006)
Site Byte:
During the last year we improved publication search and download functionality on the CAP website. Results from a simple or advanced search can be marked and exported as plain text, EML or EndNote export format. The EndNote export format is also used for bulk upload to the LNO publications database. About 40 protocols used by CAP have been converted to EML format and added to the EML database in eXist, an open source XML database system.
As a major update to data management at CAP we have experimented with and implemented online data entry applications written in PHP and accessing data in MySQL. This represents a move away from SQLServer and ACCESS front end applications. A basic template was developed in PHP to rapidly add new entry applications. Currently three data entry applications are being used successfully and a fourth is ready.
Thanks to the addition of a student programmer we have made major progress with our intranet. Bugs have been worked out and this year’s annual report submission went smoothly. Online equipment and vehicle reservations have been added and the very successful concept of an informal working group was expanded. Working groups allow people to post and download documents, have a threaded discussion and collaboratively edit text documents. After several less successful experiments with commercial and open source software packages we have integrated this functionality into the intranet and at this point the document posting and downloading has been very well accepted by the researchers.
A former graduate student, Ed Gilbert, has returned to work with us full time as web programmer. A plant taxonomist by training, he is mainly interested in biodiversity informatics and during his former tenure was instrumental in establishing the biodiversity section of the Southwest Environmental Information Network (http://seinet.asu.edu ). Arizona’s large biodiversity collections can all be searched simultaneously, species distributions mapped and dynamic checklists created for any region based on collection records. In his spare time he is working on an application that allows interested people to contribute remotely to a central character database and online key for Arizona. His first duty will be the improvement of our data downloading functionality.
The Global Institute of Sustainability (GIOS) is participating in the state funded Arizona Water Institute, a consortium of Arizona’s three universities focused on water sustainability through research, technical assistance, education and technology. The GIOS datalab staff is involved in building the Arizona Hydrologic Information system (http://www.azwaterinstitute.org/ahis/ ) in collaboration with the other universities, local stakeholders and CUAHSI (http://www.cuahsi.org/ ).
GIOS has recently added the School of Sustainability (http://sos.asu.edu ), offering undergraduate and graduate degree programs in the area of sustainability research.
LTER Site: Coweeta LTER
Contributor: Barrie Collins (Sep 14, 2006)
Site Byte:
In the past year, Coweeta LTER IM has focused on improving metadata and data consistency. This is a follow-on . In addition, we've been in the process of debuting several large projects.
Land cover southern Appalachians (32,000 Square Miles): Coweeta LTER has classified Landsat TM imagery on five year intervals (1986, 1991, 1996, & 2001). This data is the beginning of an ongoing record of land cover that will provide scientists with historic data of change over time.
Socioeconomic data for 22 LTER sites: As part of the LTER Trends Project, Coweeta LTER IM has organized demographic data since 1790 over 31 variables covering 236 fields. This data is currently being folded into an online prototype for serving data to scientists and the public.
Internet Mapping: A tough challenge under any circumstance, internet mapping is still a developing technology. Coweeta LTER IM is in the final stages of a product that will allow users to view and download gigabytes of GIS data covering over 30,000 square miles. Using wavelet compression technologies, we are successfully serving raster data sets for the entire southern Appalachians (original files size in some cases exceeding 8 gigabytes) across the internet. This product is still under development but should be released in October 2006.
It is worthwhile to note that Internet Mapping is a buzz-worthy concept that rarely has a well-defined elegant application in real-world situations. Coweeta LTER IM is trying to overcome the current design limitations and poor or non-existent design standards to release a truly usable product.
Telling Stories: In the 4th quarter of 2006, there will be a paradigm shift in how Coweeta LTER presents itself across the internet. We have a fine legacy of data and a capable dynamic data and publications database. However, our goal is to begin communicating the context of the research. Why is this research conducted? Who are the main players? What is the importance? Why does this matter? Our hope is to begin to create a narrative of the research at Coweeta.
The first two prototypes for this plan are to create a larger narrative for the afore-mentioned land cover classification, as well as a real estate program being implemented by Dr. Carolyn Dehring focusing on Buncombe County, North Carolina (Asheville area).
It is our definite goal to create a larger understanding for our users of what is going on at the research level for individual projects. This will require IM to follow the recommendations of past reviewers to work more closely with our scientists to construct this narrative.
Best regards,
Barrie
LTER Site: Florida Coastal Everglades
Contributor: Linda Powell (Sep 05, 2006)
Site Byte:
In preparation for our 2006 funding renewal this past February, we made several enhancements to the Florida Coastal Everglades LTER program’s information management system (IMS). With the addition of three new servers to the IMS, Linda Powell (information manager) and Mike Rugge (project manager) have spent numerous hours incorporating the new equipment into the FCE network. Their focus over the past year has been on several important components of the IMS: 1) upgrading and expanding the FCE Oracle10g database, 2) upgrading and migrating information to the new FCE web server, 3) adding new web applications to the FCE website to facilitate information and project management and 4) converting all 270 FCE metadata files into tier 4 and 5 EML.
We have redesigned the ‘Research’ section of the FCE website by adding several new categories: 1) Information Management, 2) Projects and 3) Sampling. The Information Management System (IMS) (http://fcelter.fiu.edu/research/information_management/) is described in detail as it includes an IMS overview, the FCE data management and data submission policies, IMS statistics and a link to the EXCEL2EML tool. This information is also cross-referenced under the ‘Data Resources’ section.
Our ‘Projects’ section has greatly improved as we have organized specific projects and their related information using a series of tabs that give a ‘file folder’ look on the web page (http://fcelter.fiu.edu/research/projects/projects.htm?pid=1). For any given project, a user can easily find the abstract, research sites, personnel, sampling attributes, datasets and publications related to that specific project all on one web page. Additionally, projects can be searched by keyword, researcher or funding organization.
The ‘Sampling’ application (http://fcelter.fiu.edu/research/projects/sampling.htm) is extremely useful in information and project management. Users are able to search for sampling attributes by entering keywords or manually selecting an attribute of interest from a list. Once the selection has been made, a map with a series of tabs gives users access to research site information, dataset listings and project information related to their sampling attribute selection. This feature greatly facilitates project management and site science. Users are able to easily access related information from one portal. For example, researchers interested in porewater salinity can enter the sampling attribute in the keyword field and the results will return a map of the FCE research sites where porewater salinity is collected as well as tabular links to all datasets containing porewater salinity and projects responsible for the collection of that attribute. The FCE LTER has a written policy whereby researchers beginning new projects must submit their project information, including proposed sampling attributes, to the information manager no later than six months after the beginning of such project. With a list of sampling characteristics, both real and proposed, this application allows the information manager to track which sampling attribute values have not been submitted to the FCE IMS in a timely fashion as there will be no dataset affiliated with the missing attributes.
Additionally, the FCE student organization now has a section on our home page that delivers information on ongoing events and activities and links to important student features such as graduate student tools, links to other student organizations, poster logo information and student presentations, dissertations, and theses. There is also a ‘Featured FCE Student’ section that highlights the student’s background and interests.
This coming year our focus will be on building a web-based query interface for FCE data values and adding a graphing application so that FCE researchers can graph their query results in real-time. To date, our data are available in ASCII files only. Many sampling attributes, like Total Phosphorus, can be found in several different flat files. Users must download each file to compile one Total Phosphorus data table for the FCE research sites. Allowing users to query the database for specific sampling attributes across all data files, and to download or graph the results, will be a welcome addition to our website.
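To make the flat-file problem concrete, the following Python sketch shows the kind of manual compilation a user currently has to do: pulling one attribute out of several delimited files into a single table. It is only an illustration under stated assumptions. The file names, the column names (`site`, `date`, `TP`), and the comma-delimited layout are hypothetical and do not describe the actual FCE files.

```python
import csv
import io


def compile_attribute(files, attribute):
    """Collect one attribute from several CSV flat files into a single table.

    `files` maps a file name to an open text stream. Files that lack the
    attribute column are skipped, as are rows where the value is blank.
    """
    rows = []
    for name, stream in files.items():
        reader = csv.DictReader(stream)
        if attribute not in (reader.fieldnames or []):
            continue
        for record in reader:
            value = record[attribute]
            if value == "":
                continue  # missing measurement
            rows.append(
                {
                    "source": name,
                    "site": record["site"],
                    "date": record["date"],
                    attribute: float(value),
                }
            )
    # ISO dates sort correctly as plain strings.
    rows.sort(key=lambda r: (r["site"], r["date"]))
    return rows


def demo():
    """Three made-up files; two of them carry a TP column."""
    files = {
        "water_quality.csv": io.StringIO(
            "site,date,TP,salinity\n"
            "SRS1,2006-01-10,0.25,0.1\n"
            "SRS2,2006-01-10,,0.3\n"
        ),
        "soil.csv": io.StringIO("site,date,bulk_density\nSRS1,2006-01-10,0.4\n"),
        "porewater.csv": io.StringIO("site,date,TP\nSRS1,2006-02-14,0.5\n"),
    }
    return compile_attribute(files, "TP")
```

A database-backed query interface would replace this per-file loop with a single query across all data files.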
The FCE IMS has fully adopted the LTER network metadata standard Ecological Metadata Language (EML) and one hundred percent (100%) of the FCE tabular data are accompanied by Level 4/5 (Data Identification, Discovery, Evaluation, Access and Integration) EML (XML) metadata documents. FCE EML documents are harvested daily to the LTER network metacat XML database.
LTER Site: Georgia Coastal Ecosystems
Contributor: Wade Sheldon (Nov 14, 2006)
Site Byte:
We submitted our GCE-II renewal proposal to NSF in February 2006, so most IM effort this past year was dedicated to helping researchers acquire, integrate and analyze data for publication and incorporation in the proposal. We also acquired some new ancillary long-term data sets from NWS and NOAA stations along the way, which we documented and added to our online data portal (http://gce-lter.marsci.uga.edu/portal/). We decided not to significantly overhaul our web site prior to the renewal, so we focused instead on tightening integration between web-accessible databases and web applications. For example, we developed a dynamic personnel list and bio pages that automatically display recent data submissions and complete lists of publications and presentations, with follow-up query links to our data catalog and bibliographic database query page. We also replaced all contact information links displayed by other web applications (e.g. data catalog, web calendar, species lists) and static web pages with query links to these dynamic bio pages to improve cross-linking of information and centralize content control for contact information to one database.
The GCE-II renewal proposal was very well received by reviewers, and both science and IM reviewers were very supportive of our IM program, web site, software development, and network participation. The only suggested changes were to provide a web-accessible version of our MATLAB data search client (already under development) and to "take some risks in thinking about value-added products or interfaces with other digital library efforts beyond the LTER". The latter suggestion will be difficult to address, both logistically and philosophically, so we're not yet sure how we will respond.
Work this summer has largely focused on wrapping up projects left over from GCE-I and preparing for new challenges in GCE-II, particularly a new emphasis on spatial data and GIS to support studies on marsh hammocks. I took over hosting of ESRI licenses in the UGA Marine Sciences Dept. and began supporting several GIS workstations, due in part to the relocation of our Marine Extension Service GIS lab across campus. I also set up a small GIS lab on Sapelo Island as part of a collaboration with the Sapelo Island National Estuarine Research Reserve. We are currently in the process of hiring a full time assistant IM / GIS specialist to assist researchers with collecting GPS data in the field and develop geospatial databases for our project, as well as assist me in acquiring and documenting new tabular data sources and maintaining our IS. This represents a huge and exciting operational change at GCE, more than doubling the FTE dedicated to informatics.
At the network level, I was elected to serve on NISAC to fill out Peter McCartney's term. This committee provides a great opportunity to interact with domain scientists on issues relevant to IM, and I expect to learn a lot this year. I have also participated in the Trends editorial meetings as a NISAC representative, and look forward to more interaction with this group. I have also continued to collaborate with Don Henshaw and Suzanne Remillard at AND on ClimDB/HydroDB development. We recently contributed a paper on ClimDB/HydroDB to the International Conference on HydroScience and Engineering, which is now submitted for review and publication. I also developed a ClimDB/HydroDB data access client for the GCE Data Toolbox for MATLAB (http://gce-lter.marsci.uga.edu/lter/research/tools/data_toolbox.htm), allowing users to retrieve data from any registered station directly into MATLAB for analysis and transformation. Suzanne and I presented a poster and demo on these tools at the 2006 LTER ASM. I am currently collaborating with Barrie Collins at CWT to provide near-real-time USGS data and plots on the web for the CWT study site, and we are exploring the possibility of sharing resources on GIS development in the future.
LTER Site: Hubbard Brook LTER
Contributor: John Campbell (Sep 21, 2006)
Site Byte:
The information management team at Hubbard Brook is in the process of installing a wireless sensor network for automated collection of continuous, 15-minute meteorological and stream water data from one remote watershed. The wireless network will be based around 900 MHz spread spectrum Freewave radios. The network will be complemented by a web-based data delivery system through which researchers can view graphic displays of real-time environmental data, or download archived data. In September, a 15 m repeater station was installed on a ridge between the experimental watersheds and the Forest Service headquarters to establish line-of-sight radio transmission. Once the data are transmitted to the headquarters building they will be uploaded via satellite internet connection to a MySQL database.
This past summer we hired Pavel Dorovskoy, an undergraduate Computer Science major at the University of New Hampshire. Pavel was the first REU student at Hubbard Brook specializing in information management. His research project was centered on the wireless sensor network and he gave a presentation at the annual Hubbard Brook Cooperators’ meeting and will be submitting an article to the Databits newsletter.
Considerable progress has been made in redesigning the Hubbard Brook web page. In January, the Hubbard Brook’s Information Oversight Committee approved the new design of the web page. Since that time, the Hubbard Brook information management team has been converting the old web page to the new format. The new webpage will adhere to the new LTER webpage design recommendations. Some of the new features of the web page include: an image archive with an image upload feature; curriculum vitas with forms for making on-line updates; and improved methods for searching and downloading data. A draft of the new website will be completed by October 1st for review by the Information Oversight Committee. The final website will be made public before Hubbard Brook’s mid-cycle NSF review in 2007.
The Hubbard Brook sample archive was recently dedicated to Cindy Veen. Cindy Veen served as Hubbard Brook database manager from 1988 until her untimely passing in 1996. She was well-respected at the USDA Forest Service and among her peers in the LTER network. Among her many accomplishments, she was instrumental in the establishment of the Hubbard Brook sample archive and worked diligently to make it operational. In recognition of her achievements, the Hubbard Brook sample archive building was dedicated to her in July at the Annual Hubbard Brook Cooperators’ meeting. The exterior of the building now has a sign bearing Cindy’s name and there is a plaque on the inside with her picture and a description of her role in the Hubbard Brook Ecosystem Study.
LTER Site: Jornada Basin
Contributor: Ken Ramsey (Nov 02, 2006)
Site Byte:
In the last year, the information management team at the Jornada Basin LTER has redesigned the Jornada website and prepared for our site renewal (which was approved) in addition to normal information management tasks and responding to requests from researchers for map products. The Jornada website was redesigned using a draft of the LTER web design guidelines and by visiting all LTER websites. The website was initially developed using the PHP programming language and a SQL Server backend database. We have migrated most of our dynamic web pages from using a relational database to using XML files. This has made our website available when our database server is down for maintenance and updates.
We are populating the metadata for all research projects and datasets in the relational database to allow dynamic generation of EML. We are also editing the style sheets used to create EML documents from the database to correct some initial problems caused by generating blank EML elements and to enable the creation of level 5 EML from our database once the metadata population has been completed. Inigo San Gil is scheduled to come to the Jornada to assist with completing the style sheets in early November.
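The blank-element problem mentioned above comes up in any approach that emits one XML element per database field, whether or not the field holds a value. The Jornada uses style sheets for this step; the Python sketch below swaps in a different technique purely for illustration, and its element names are a simplified, hypothetical subset rather than a complete or valid EML document. The idea is just to create an element only when there is non-blank content to put in it.

```python
import xml.etree.ElementTree as ET


def add_if_present(parent, tag, value):
    """Append <tag>value</tag> only when value holds non-blank text."""
    if value is None or not str(value).strip():
        return None  # emit nothing rather than an empty element
    child = ET.SubElement(parent, tag)
    child.text = str(value).strip()
    return child


def dataset_to_xml(record):
    """Build a small metadata fragment from one database row (a dict)."""
    dataset = ET.Element("dataset")
    # Hypothetical field list; a real mapping would follow the EML schema.
    for tag in ("title", "abstract", "pubDate"):
        add_if_present(dataset, tag, record.get(tag))
    return ET.tostring(dataset, encoding="unicode")


def demo():
    """A row with a title, a whitespace-only abstract, and no pubDate."""
    return dataset_to_xml(
        {"title": "Example rainfall data", "abstract": "  ", "pubDate": None}
    )
```

In a style sheet the equivalent guard would be a conditional test around each output element; the principle is the same.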
LTER Site: Kellogg Biological Station
Contributor: Sven Bohm (Sep 17, 2006)
Site Byte:
This past year we consolidated and normalized our databases, reorganizing tables and trying to eliminate redundancies. We made one more switch in database systems, from MSSQL to DB2, to gain spatial SQL capability. We also added a server with 1 terabyte of disk space to store our annual air photo campaigns. We're currently replacing our old web/db server with a new virtual server. Using a virtual server, we can easily create a copy to test changes and troubleshoot without disrupting the service.
I wrote a couple of small quality control web applications to provide an incentive for the lab and other researchers to submit data earlier in the process. Two of the QC apps have been successfully used so far, and they have generated requests for apps to process other datasets.
LTER Site: Konza Prairie LTER
Contributor: Jincheng Gao (Sep 16, 2006)
Site Byte:
With 18 months of work experience as an Information Manager at the Konza site, I have learned a lot and have benefited from the efforts of our former Information Managers, including John Briggs, Brent Brock and Zhiqiang Yang. In particular, I appreciate the efforts of Zhiqiang Yang, who did an excellent job in building the KNZ metadata database and establishing the dynamic web site. In the past year, I have continued to work on the transfer of data from ASCII text files to a SQL Server database and QA/QC checking, as well as GIS data generation and regular updating of the database and website.
So far, we have finished the transfer of historical data from text files into the SQL Server database. The data can now be searched and downloaded based on specific watershed, treatment, or date of data collection criteria. Data input interfaces for each dataset were created with VC# to convert text data files into the SQL Server database. Data quality checking is performed based on data range and logical consistency before data transfer. The data ranges are defined according to expected minimum and maximum values for each dataset, and logical consistency is checked based on general knowledge for each data type. For example, daily mean temperature must be less than the daily maximum and greater than the daily minimum values, or the data are flagged for further checking and correction. We are continuing to update the EML for each dataset to comply with the Best Practice for the Metacat harvest. Currently, over 50 datasets in the KNZ database (more than 80% of our core datasets) are harvested weekly by the Metacat database. After QA/QC checking, four stream flow datasets from the KNZ are being regularly harvested by HydroDB.
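The range and logical-consistency checks described above can be sketched as follows. This is only an illustration in Python, not the VC# code used at KNZ; the temperature range, field names, and sample records are assumptions made up for the example.

```python
from dataclasses import dataclass, field

# Hypothetical expected range for air temperature in degrees C (illustrative only).
TEMP_RANGE = (-40.0, 50.0)

@dataclass
class DailyTemp:
    date: str
    t_min: float
    t_mean: float
    t_max: float
    flags: list = field(default_factory=list)

def check_record(rec, valid_range=TEMP_RANGE):
    """Flag values outside the expected range and means not between min and max."""
    lo, hi = valid_range
    for name in ("t_min", "t_mean", "t_max"):
        if not lo <= getattr(rec, name) <= hi:
            rec.flags.append(f"{name} out of range")
    # Logical consistency: the daily mean must lie strictly between min and max.
    if not rec.t_min < rec.t_mean < rec.t_max:
        rec.flags.append("mean not between min and max")
    return rec

# Made-up sample records: one clean, one inconsistent, one out of range.
records = [
    DailyTemp("2006-07-01", 18.2, 25.1, 33.4),
    DailyTemp("2006-07-02", 19.0, 35.0, 31.5),
    DailyTemp("2006-07-03", -99.0, 24.0, 30.0),
]
checked = [check_record(r) for r in records]
flagged = [r.date for r in checked if r.flags]
print(flagged)
```

Flagged records are kept rather than dropped, so they can be reviewed and corrected before loading.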
Based on the suggestion of the KNZ site review last year, I have also collected and edited the GPS data for historical and long-term sites of data collection, such as the climate and stream gauging stations, and long-term transects for collection of plant species data. The accuracy of other spatial data in the KNZ database has been checked, and updated where necessary. The GIS and remote sensing data in KNZ have also been reorganized and their metadata have been updated in FGDC and EML formats. The dataset of annual burning history of various watersheds at KNZ is now stored in two formats. One format is as layers for the various watersheds or fire treatments, stored as a polygon feature class in our database. The other format is as non-spatial data. The non-spatial data are linked with KNZ watershed spatial data through ArcSDE. The burning history data can be interactively queried by watershed name, burning type, and burn time (year or date) in the KNZ interactive web site.
A new bison dataset was created last year. A GPS collar unit was installed on an adult bison as an initial trial to assess the utility of this approach for tracking bison movement patterns. The geographic coordinates of the bison are automatically collected every two hours. The GPS data of the bison are in the interactive web site (www.konza.ksu.edu/website/knzbison315/). As more animals are fitted with tracking devices, the GPS data for bison movement will be integrated with the vegetation data in KNZ to study bison behavior and their effects on heterogeneity in the tallgrass prairie landscape.
In the coming year, I will focus on our web design and the QC/QA checking of our EML metadata. In addition, the methodology associated with each dataset will be converted from the current text format to EML metadata.
LTER Site: Luquillo LTER
Contributor: Eda C. Melendez-Colom (Sep 14, 2006)
Site Byte:
The main achievement of LUQ LTER this year was the unconditional approval of the LUQ LTER 4 proposal. Over the last year, LUQ IM dedicated a great deal of time to the ongoing task of improving the databases’ metadata and updating data on the web site.
LUQ IM is always looking for ways to improve the communication among the different communities of the LUQ LTER. Web sites can be a means to improve this communication. A new Plone web site was created for the LUQ LTER Canopy Trimming Experiment (CTE) group, which is now being used by its members to share and edit their documents, e.g., the LUQ LTER CTE Poster for the 2006 ASM.
To further enhance communication and introduce the community to information management concepts, we created periodic email messages called “Gotitas del Saber” or “Drops of Knowledge”, copies of which can be accessed on the web site (http://luq.lternet.edu/datamng/GotitasdelSaber/index.html). The explanation of simple concepts, such as what a URL is, was very well received by the community.
Recognizing that the LUQ LTER Schoolyard Program is going through a re-birth, we saw that the program lacked an IM component. We had collaborated in 2000 by giving a workshop to the teachers on data management and creating a static directory on the web. Since fall 2005, LUQ IM has been closely collaborating with the investigator in charge by visiting the schools to learn more about the research sites and their data sets, creating a mailing group alias (lterschool@ites.upr.edu), designing a Web Site (http://luq.lternet.edu/outreach/schoolyard/index.html), entering their data, and planning workshops for managing data at the different schools. In these workshops, the students will be exposed to the principles that define an LTER site: sharing and preserving information. A subset of LUQ metadata standards will be used to document their data uniformly across the schools. IM has suggested modifications to the structures of similar data sets in an attempt to standardize the way data are managed in the schools. On the web site, each school has a page with photos, a data set catalog, the school's main results, publications and contact information. Finally, IM will give a hands-on workshop in an internship that will be held next November for selected students of all three schools.
LUQ IM continues the effort of translating its metadata into EML and will focus on this activity for the next year.
The LUQ IM System needs to be enhanced. We plan to devote ourselves to studying ways to create a framework that helps users (internal and external) access information and perform synthesis-related tasks. The best way to do this is still not clear, and LUQ IM intends to learn more about the best infrastructure for it. We hope that EML will be the answer.
LTER Site: McMurdo Dry Valleys
Contributor: Chris Gardner (Oct 31, 2006)
Site Byte:
This year has been one of major IM-related advances at MCM. I went live with our new website at www.mcmlter.org in January 2006. Working with Inigo at the LNO, we were able to take MCM from 0% EML compliance to 100% EML level 5 in a matter of a few short months.
The new web presence represents a total redesign with new database-driven features. Some new features include an updated searchable bibliography and personnel list, a database of freely downloadable high quality Antarctic photos, and several tools to query our Oracle database for different dynamically derived datasets. A main goal of mine is to take full advantage of the relational database structure through the use of derived datasets that are useful to MCM researchers and the greater science community. One example of these derived dataset query tools is the ability to query stream chemistry and have Oracle dynamically convert mass units to moles, calculate TDS and charge balance, and then round the time the chemistry sample was taken to the nearest quarter hour and join those chemical data to stream discharge measurements. Other dynamic pages calculate the number of days of flow of these ephemeral streams at MCM and the total volume during the summer flow season. Other accomplishments include a site management section in the restricted access portion of the website where ancillary research projects can be documented, news stories can be added to the database, chain of custody forms can be downloaded, etc.
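The chemistry-to-discharge derivation described above can be sketched outside the database. The Python below is only an illustration of the idea, not the Oracle queries used at MCM: it converts mg/L to micromoles per litre, rounds sample times to the nearest quarter hour, and looks up the matching discharge value. All sample values, field names, and the discharge units are hypothetical, and TDS and charge balance are left out.

```python
from datetime import datetime, timedelta

# Molar masses in g/mol (standard atomic weights, rounded).
MOLAR_MASS = {"Na": 22.990, "Cl": 35.453, "Ca": 40.078}

def mg_per_l_to_umol_per_l(ion, mg_per_l):
    """Convert a mass concentration (mg/L) to micromoles per litre."""
    return mg_per_l / MOLAR_MASS[ion] * 1000.0

def round_to_quarter_hour(ts):
    """Round a timestamp to the nearest 15 minutes (ties round up)."""
    quarter = 15 * 60
    seconds = ts.hour * 3600 + ts.minute * 60 + ts.second
    rounded = int(seconds / quarter + 0.5) * quarter
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + timedelta(seconds=rounded)

def join_chemistry_to_discharge(samples, discharge):
    """Attach the quarter-hour discharge reading to each chemistry sample."""
    rows = []
    for s in samples:
        t = round_to_quarter_hour(s["time"])
        rows.append({
            "sample_time": s["time"],
            "rounded_time": t,
            "ion": s["ion"],
            "umol_per_l": mg_per_l_to_umol_per_l(s["ion"], s["mg_per_l"]),
            "discharge": discharge.get(t),  # None when no reading exists
        })
    return rows

# Hypothetical quarter-hourly discharge readings and chemistry samples.
discharge = {
    datetime(2006, 1, 5, 14, 15): 12.4,
    datetime(2006, 1, 5, 14, 30): 13.1,
}
samples = [
    {"time": datetime(2006, 1, 5, 14, 22), "ion": "Cl", "mg_per_l": 7.0906},
    {"time": datetime(2006, 1, 5, 14, 23), "ion": "Na", "mg_per_l": 2.299},
]
rows = join_chemistry_to_discharge(samples, discharge)
for row in rows:
    print(row["ion"], row["rounded_time"], round(row["umol_per_l"], 1), row["discharge"])
```

Rounding the sample time first turns the join into an exact key lookup against the quarter-hourly discharge record.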
Current projects include documenting our spatial data layers in ESRI’s ArcSDE and linking those data dynamically to our tabular database in Oracle. Additionally, this field season I will be implementing improved sample tracking protocols where the chain of custody forms will be entered into the database, which will eventually be used as table joins and to verify the existence of data. Using a new standardized chain of custody form that requires specific sample name formats will also greatly reduce the time spent manually verifying and combining chemical data. These and other new protocols will greatly increase the efficiency and accuracy of MCM’s system of dealing with data starting from the point of collection all the way to making the QA/QC’d data publicly available on the web through dynamic query tools.
LTER Site: Niwot Ridge LTER
Contributor: Todd Ackerman (Sep 18, 2006)
Site Byte:
The personnel of Niwot Ridge LTER Information Management over the past year has consisted of Information Manager Todd Ackerman and Student Worker Anobha Gurung. It has been another busy year for the IM staff. We have been working on several new projects on top of the centralized data entry/process/dissemination.
One major dataset undertaking that has occurred this year was to re-visit our 50+ year climate record for the D1 and C1 Station and the 20+ year record at the Saddle Station. We realized a shortcoming in the data when it comes to modeling. We needed to have a solid/standard record of daily values for precipitation and temperature for these main stations, and there were many missing days throughout such a long record causing false trends when the long-term record was analyzed. Using adjacent climate stations and statistical methods we filled and flagged these missing days so that modeling could be better performed when needed. The datasets are nearly complete, as we are revisiting some of the methodology used.
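One simple way to fill and flag missing days from an adjacent station is sketched below. The statistical methods actually used at Niwot Ridge are not described here, so this Python example assumes a plain least-squares line fitted on days when both stations reported; the station values are made up.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fill_gaps(target, neighbor):
    """Return (value, flag) pairs; missing target days are estimated from the neighbor."""
    pairs = [(n, t) for t, n in zip(target, neighbor)
             if t is not None and n is not None]
    slope, intercept = fit_line([p[0] for p in pairs], [p[1] for p in pairs])
    filled = []
    for t, n in zip(target, neighbor):
        if t is not None:
            filled.append((t, "observed"))
        elif n is not None:
            filled.append((slope * n + intercept, "filled"))
        else:
            filled.append((None, "missing"))  # nothing to estimate from
    return filled

# Hypothetical daily temperatures; None marks a missing day.
target = [1.0, 3.0, None, 7.0, None]
neighbor = [2.0, 4.0, 6.0, 8.0, None]
filled = fill_gaps(target, neighbor)
print(filled)
```

Keeping a flag beside every value lets analysts include or exclude estimated days when looking at long-term trends.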
We have also been developing a queryable system for the chemistry data produced by our in-house Kiowa Environmental Chemistry Laboratory run by Chris Seibold. In the past these data existed as MS Excel files, and only some of the data were posted as ancillary data to the core datasets. The new query system is currently available internally only, until use permissions and proper metadata can be developed. It has vastly improved the compilation of data for researchers when requested.
We are also now installing the latest FreeWave HT-plus radio system for two new sites (new high frequency data collecting eddy-flux towers run by Peter Blanken) and adding one to the existing Soddie Site to allow Detlev Helmig’s group to better access and control their sampling tower remotely in the winter. Our field personnel (Mark Losleben and Kurt Chowansky) have been key in getting this project going.
This year we have also been participating in the DayCent-Chem biogeochemical cross-site inter-comparison led by Jill Baron. Along with several other LTER sites (NWT, CWT, HBR, and HJA), as well as experimental forest sites and National Parks, we attended workshops and submitted data to Baron's group to look at C and N dynamics.
We have recently upgraded the core LTER file/web server CULTER from a Sun Ultra 60 to a Sunfire V250. This has vastly increased our file storage capabilities allowing us to make larger datasets such as imagery more easily accessible. It also allowed us to consolidate one other LTER server SNOBEAR with CULTER to reduce the number of machines to maintain. We are also upgrading our Windows server from a development server (Dell PowerEdge 500SC) to a much more powerful server (Dell PowerEdge 2600) which will allow us to serve data publicly from the MSSQL RDBMS which the previous machine could not handle from lack of system resources.
LTER Site: North Temperate Lakes
Contributor: Barbara Benson, Dave Balsiger, Jonathan Chipman (Sep 15, 2006)
Site Byte:
We have continued our development in the area of sensor networks as NTL increases the number of lakes on which instrumented buoys are deployed. Currently data are flowing automatically, in near real-time, into our Oracle database from 8 buoys. We are collaborating with computer scientists from the University of California San Diego Supercomputer Center, SUNY-Binghamton, and Indiana University on automating the configuration and quality assurance components of the NTL information system for near-real time data streaming from instrumented buoys. We have created a metadata model for instrument management. Calibration previously had been done for the dissolved oxygen data in a manual process; we designed and implemented a web-interface for computer-assisted calculation of correction factors for the calibration and updating values in the database based on the correction factors. We designed and implemented a Cyberdashboard based on the GridSphere portal framework to help manage the network of instrumented lake buoys. This web-based application enables the buoy technician or data manager to quickly determine the status of the multiple remote sensors. Information regarding missing data and the automated QA/QC processing of the data is also available via the Cyberdashboard.
NTL is providing leadership in the Global Lake Ecological Observatory Network. The Global Lake Ecological Observatory Network (http://gleon.org) is a grassroots network of limnologists, information technology experts, and engineers who have a common goal of building a scalable, persistent network of lake ecology observatories. Data from these observatories will allow us to better understand key processes such as the effects of climate and land-use change on lake function, the role of episodic events such as typhoons in resetting lake dynamics, and carbon cycling within lakes. In March 2006, we met in Townsville, Australia for a GLEON/CREON workshop. Another workshop will be held in Hsinchu, Taiwan in October 2006. We maintain the GLEON website and are developing a database of information on lakes associated with this global network. More social science data sets, including a manure management survey and a data set on shoreline property sales, were added to the NTL Oracle database and are available through the website. We have implemented a second “tier” for the information management system in order to capture data sets as text files along with their metadata when it has been decided that these data are not in high enough demand to warrant incorporation into the Oracle database but need to be permanently archived.
Ongoing spatial data management activities include development of web-based mapping systems and the addition of new data sets to the catalog. The NTL website now has three interactive map servers that provide users with the functionality to create maps using spatial data layers for our two study areas (the Madison region and the Trout Lake region in Vilas County) and for Wisconsin statewide.
LTER Site: Palmer Station
Contributor: Karen Baker (Oct 22, 2006)
Site Byte:
The PAL information management effort is developing within a broader Ocean Informatics environment that includes the CCE LTER marine site co-located with the PAL information management component at UCSD/Scripps Institution of Oceanography. Technical, organizational, social, and individual infrastructure needs are addressed together in order to support the PAL community and interfaces with the LTER Network.
A new remote mount technology was established for manuscript, presentation, and file sharing. Our informatics design studio capacity was updated with a system able to support multi-user chat sessions and with processor speeds more effective for CPU intensive applications such as Google Earth. Local efforts are ongoing in preparation for Open Directory LDAP services, an approach intended to enable secure and versatile information exchange. In the interim, local authentication/authorization conventions have been developed as collaborative needs arise for accessibility and security. Community tools developed include JPGraph for web plotting, Matlab for data analysis, and WordPress for blog and web page editing.
The local database design was expanded into a dual component model influenced by identifying, discussing, and gathering of long-term data for the Trends project. Attribute naming conventions are in development, providing a framework for dictionaries that ultimately will support interoperability. The PAL web site (http://ccelter.sio.ucsd.edu) was updated for meetings of the site review committee in November 2005, when ice precluded the ship carrying site reviewers from reaching Palmer Station, and in May 2006. The site bibliography content was updated and electronic copies of manuscripts made available either publicly or to the site. In addition, the bibliography module was refactored to include script delivery for an LNO harvester and to conform with the recently updated LTER Network requirements. An Ocean Informatics web site (http://oceaninformatics.ucsd.edu) is being developed into a cross-project web portal. Design issues have been explored (Design Interfaces; Design Patterns; http://intranet.lternet.edu/archives/documents/Newsletters/DataBits/06spring/#2gr) and techniques used to develop a hybrid model incorporating elements of global navigation as well as hub-spoke design. Cross-project work (CCE, CalCOFI, Ocean Observing System, and NOAA Fisheries) is prompting development of meta level context and stimulates development of best practices.
Collaborative local activities included joint design sessions focusing on eventlogs and dictionaries. This work was part of a proposal written to frame and support cross-project infrastructure work. Work with the LTER Network included participation on the LTER Governance Committee, where new text was drafted that develops more fully the role of information management. Participation on the network Web Recommendations Committee provided an opportunity to foreground the SiteDB network module.
KBaker became a UCSD Science Studies affiliate (http://sciencestudies.ucsd.edu/) with synergies developing alongside previous LTER ethnographic studies focusing on articulation work and data stewardship. A team of five Ocean Informatics participants for PAL & CCE submitted posters (Palmer LTER: Designing a Queriable Community Data System; Research in Infrastructure Studies: Social and Organizational Perspectives on Ecological Data Management) and attended the 2006 All Scientists Meeting.
LTER Site: Sevilleta LTER
Contributor: Kristin Vanderbilt (Oct 30, 2006)
Site Byte:
The Sevilleta IM Team continues to support site and network level science by seeking ways to improve data discovery and availability.
Sevilleta scientists have two studies in which wireless sensor networks play a significant role. Data is being collected in large volumes, and the challenge is to import the data from the Campbell datalogger into our MySQL database, write QA/QC scripts, and graph the data in near real-time. We are presently exploring off-the-shelf software and custom software solutions using SAS for these purposes. Sevilleta LTER graduate student Etsuko Nonaka is working with two scientists at Los Alamos National Labs to develop QA/QC methods for the Sevilleta sensor data.
Sevilleta continues to translate legacy semi-structured metadata into EML using a variety of methods including Morpho and a custom script developed by Inigo San Gil of LNO. We have Level 5 EML for about 50% of Sevilleta legacy data.
Wikis have become valuable parts of the Sevilleta IMS. Sevilleta submitted a successful renewal proposal in 2006, and the preparation of the proposal made use of a wiki for sharing document revisions. A wiki is also being maintained by the permanent four person field crew and the IM, describing all steps needed to collect, manage, and archive the long-term data sets collected by the field crew. A password-protected System Administration wiki has been implemented to document mission-critical sys admin tasks.
IM Kristin Vanderbilt has been active on the US ILTER Committee. She gave two presentations at the EAP-ILTER meeting in Kyoto, Japan in February about information management training. At the ILTER Coordinating Committee meeting in Namibia in August she became Chair of the ILTER Information Management Committee.
LTER Site: Shortgrass Steppe
Contributor: Nicole Kaplan (Sep 18, 2006)
Site Byte:
Over several weeks, the IM Team and SGS-LTER Staff have focused on setting up new administrative and staff offices at Colorado State University. Staff and the IM Team have been responsible for inventorying and organizing legacy data and reports dating back to the 1930s, located across six offices and over a dozen filing cabinets. We have used this opportunity to create a library of historic International Biome Project and LTER data and reports, and prepare older, richer metadata for entry into our database. These efforts will help preserve the history of project goals, and document the evolution of ecological studies and data that span over forty years at the SGS field research site. We hope this will serve as a useful resource for LTER Researchers in the future.
The IM Team now consists of Nicole Kaplan, full-time IM Team Leader, Bob Fynn, half-time IT/GIS Manager, a half-time student web developer and two quarter-time data entry students. We continue to work on improving data discovery, delivery and interoperability tools within the SGS Information Management system. Our efforts have been guided by both internal evaluation of our strengths and weaknesses, and an external mid-term review conducted last summer by the National Science Foundation. We have continued to work with Inigo San Gil at the LTER Network Office to implement a newly designed relational database management system (RDBMS) and PERL and XSLT (Extensible Stylesheet Language Transformations) scripts that now contain and generate level 5 EML (Ecological Metadata Language) content. The IM Team is working with our core Staff and Principal Investigators to enhance metadata for our current long-term studies to Level 5, which describes tables and attributes in addition to project objectives, methods, locations, and principal investigators.
We are also constructing new information delivery tools within a redesigned SGS-LTER website that serves metadata and other information from the new RDBMS. Nicole has been involved with developing and adopting Recommendations for Website Design within the LTER Network. We are taking these recommendations into account, applying a JavaScript script for menu navigation, installing website search tools and using Macromedia (Adobe) Fireworks for displaying images from the field site.
Bob provides support for SGS-LTER researchers and students in various aspects of GIS including gathering data with GPS equipment and imagery, assisting with GIS model development for their particular research, and providing GIS data and maps for field work and modeling. He has also extended existing programs for analysis of SGS-LTER data.
LTER Site: Virginia Coast Reserve
Contributor: John Porter (Sep 17, 2006)
Site Byte:
This has been a busy year for information management at the VCR/LTER in a number of areas. The first has been work on developing tools that use EML to produce documents, such as statistical programs, for use by researchers. Converters were developed that convert suitable EML documents to SAS, SPSS and R statistical languages. In addition, a web interface to the tools was developed and means to link the programs to static web pages were developed.
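The idea behind such converters can be shown with a minimal sketch. This is not the VCR/LTER converter itself: the Python below reads a simplified, namespace-free EML-like fragment and writes a short R script that loads the described table. The fragment, file name, attribute names, and the storage-type mapping are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment; real EML documents carry far more detail.
EML_FRAGMENT = """
<dataTable>
  <entityName>Daily tide levels</entityName>
  <physical><objectName>tides.csv</objectName></physical>
  <attributeList>
    <attribute><attributeName>date</attributeName><storageType>string</storageType></attribute>
    <attribute><attributeName>level_m</attributeName><storageType>float</storageType></attribute>
  </attributeList>
</dataTable>
"""

# Assumed mapping from storage types to R column classes.
R_CLASS = {"string": "character", "float": "numeric", "integer": "integer"}

def quote_list(items):
    """Render items as a comma-separated list of double-quoted R strings."""
    return ", ".join('"' + item + '"' for item in items)

def eml_to_r(xml_text):
    """Build an R script that reads the table described by the metadata."""
    table = ET.fromstring(xml_text)
    file_name = table.findtext("physical/objectName")
    names, classes = [], []
    for attr in table.findall("attributeList/attribute"):
        names.append(attr.findtext("attributeName"))
        classes.append(R_CLASS.get(attr.findtext("storageType"), "character"))
    return "\n".join([
        "# " + table.findtext("entityName"),
        'data <- read.csv("' + file_name + '", header = TRUE,',
        "                 col.names = c(" + quote_list(names) + "),",
        "                 colClasses = c(" + quote_list(classes) + "))",
        "summary(data)",
    ])

script = eml_to_r(EML_FRAGMENT)
print(script)
```

Because column names and types come straight from the metadata, the generated script stays in step with the documented table structure.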
The VCR/LTER also hosted the second-to-last Coordinating Committee meeting in the fall of 2005 (the CC was replaced by the "Science Council" in the spring of 2006). We used a system we have developed for easy-to-use online forms to manage the complex logistics of getting everyone to and from the airport.
During 2005 and January 2006, we also wrote our new proposal. During that process we made extensive use of videoconferencing to bring together (virtually) investigators taking the lead on specific proposal sections. The system we used consisted of PCs running the Polycom PVX software ($120) using an inexpensive ($50) USB-camera/microphone combination. These then linked to a higher-end Polycom FX unit that allowed us to have 4 participants in the conversations. We also continued revamping our web page using the PostNuke Content Management System.
This has also been an important year for international collaboration with the Taiwan Ecological Research Network (TERN). It started in January when John Porter (VCR), Don Henshaw (AND) and Peter McCartney (NSF) participated in a series of workshops with Taiwanese researchers and with other LTER Information Managers from the East Asia Pacific (EAP) region. The trip also included visits to several of their research sites. As a result of the trip, several sites now participate in CLIMDB. Closer to home, we have had a series of visits by members of the TERN "IM Team." Chien Wen "Kevin" Chen visited the VCR/LTER during February-May 2006. The focus of his visit was collaboration on the development of wireless sensor networks for field research. He was followed by a return visit (June-Sept) from his colleague Meei-ru Jeng. She focused on collaborations involving EML, specifically the creation of EML documents, XML editors, stylesheets and "R" statistical language programming. In Sept. 2006 she was joined by her colleagues Yunyin Yeh and Fu-Ching "Tanya" Yang. During their visit they focused on the examination of several US LTER web sites and how they provide services to researchers. This was followed up by visits to the Baltimore Ecosystem Study, Harvard Forest and Hubbard Brook LTER sites, where they met with the information managers from each site. This fall, their colleague Chi-Wen Hsao will be coming to work on GIS-related issues.
We also saw the opening in August of the long-awaited "Anheuser-Busch Coastal Research Center of the University of Virginia" (ABCRC). The new ABCRC provides much improved laboratory and housing facilities, and replaces our old, rented farmhouse as the field headquarters for VCR/LTER activities. IM activities at the new Center involve moving our T-1 connection, developing wireless and wired networks in the new buildings and connecting those networks to wireless field equipment.
We have also been active in several ongoing network-wide activities. We have been involved in the Cyberinfrastructure working group of the LTER Planning Grant activity, and hosted a modeling workshop at the University of Virginia in January 2006. We have also been active with the controlled-vocabulary working group and ILTER training in the East Asia Pacific region.
Plans for the coming year include continued improvement of the EML we produce, specifically in the areas of units and geographical information, development and improvement of tools for using EML, continued collaborations with TERN and EAP researchers, and development of systems to support management of the expanded LTER facilities at the ABCRC.