Data Management

PLE's missions have grown extensively over the past five years, and it has become clear that the station needs a comprehensive data management plan. Some types of data have been collected for decades, whereas others have not. In 2010, PLE therefore began developing a comprehensive data management plan, which was fully implemented in 2011. The plan was prepared in consultation with the NSF Award and Administration Guide, OBFS guidelines (Gorentz 1992), multiple recent publications (Michener et al. 1997, Moore et al. 2010, Porter 2010, Rausher et al. 2010), and the PLE Advisory Board.

PLE's data fall into the following categories: 1) Long-term climate data, 2) Station user-days statistics, 3) Historic research, and 4) Current and future research. Each of these categories is discussed below, followed by a description of the system used to archive and protect the data.

Field-station data

PLE has access to long-term climate data collected by the adjacent PA State Fish Hatchery (located within 300 m of the field station). These daily data go back to 1941 and are available from the NOAA Climate Data Online website. The weather station Network:ID is GHCND:USC00365050.
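For readers who prefer programmatic access, the station identifier above can also be used with NOAA's Climate Data Online web services. The following is a minimal sketch that only builds a request URL and does not contact NOAA. It assumes the version 2 web-services endpoint and a personal access token requested from NOAA; the function name, the data types (TMAX, TMIN, PRCP), and the example year are illustrative choices, not part of PLE's plan.

```python
from urllib.parse import urlencode

# Assumed base URL of the NOAA Climate Data Online (CDO) v2 web services.
CDO_BASE_URL = "https://www.ncei.noaa.gov/cdo-web/api/v2/data"

# Station identifier quoted in the text (network GHCND, station USC00365050).
STATION_ID = "GHCND:USC00365050"


def build_daily_request(start_date, end_date,
                        datatypes=("TMAX", "TMIN", "PRCP"), limit=1000):
    """Build the URL for a daily-summaries query for the hatchery station.

    Dates are ISO strings (YYYY-MM-DD). The data types are illustrative
    assumptions; the station may not report all of them for every year.
    """
    params = [
        ("datasetid", "GHCND"),
        ("stationid", STATION_ID),
        ("startdate", start_date),
        ("enddate", end_date),
        ("limit", str(limit)),
    ]
    # One datatypeid parameter per requested variable.
    params += [("datatypeid", name) for name in datatypes]
    return f"{CDO_BASE_URL}?{urlencode(params)}"


# Arbitrary example year within the period of record described above.
url = build_daily_request("1950-01-01", "1950-12-31")
print(url)
```

Actually sending the request requires passing the access token in a `token` HTTP header. Long records are best requested one year at a time and then combined.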

In 2016, PLE partnered with WeatherSTEM to host a weather station as part of its network of sensors. Up-to-the-second weather data, as well as historical data, are now available to users through the WeatherSTEM website: https://crawford.weatherstem.com/ple

PLE has been recording the field site use of all researchers since the mid-2000s. This record provides a valuable history of field sites for current and future investigators. These data are available in an Excel spreadsheet here. If you need more detail about a previous researcher's plot locations, you can contact the PI directly or email us at pymlab@pitt.edu, and we will send scans of any maps the researcher provided to us.

Any data management plan must address access and security. In consultation with IT personnel at Pitt, PLE has developed a strategy of providing data through the PLE web page portal, using a server based at the University of Pittsburgh main campus. All data will be automatically backed up on University servers to ensure redundancy and prevent data loss. Unrestricted access will be granted to the databases that contain historic publications, climate data, historic field use, and user days.

Principal Investigator data and metadata

Current researchers at PLE are a tremendous resource for research data, and we have recently experienced a rapid increase in the number of annual publications. Prior to 2010, PLE did not have a data management plan, nor did most PIs archive their data in publicly available repositories. The PLE research community understands the value of archiving data and there is broad support for this effort, particularly since data archiving is currently required by several NSF programs and is already being done by the leading ecology and evolutionary journals (e.g., Rausher et al. 2010).

To assist PIs with data management, PLE has developed a data management plan (DMP) document that is available online to all researchers.

See also the University of Pittsburgh Guidelines on Research Data Management.

The PLE DMP document explains the need for archiving data and the benefits of archiving to both the PI and the scientific community. It also provides guidelines to assist PLE investigators in the following steps: 1) Recording and maintaining data, 2) Policies regarding rights and obligations of involved parties, 3) Notification of publication (to make PLE aware of all publications), and 4) Archiving the data in an international repository.

The DMP document includes the complete list of possible metadata categories: 1) Data set descriptors, 2) Research origin descriptors, 3) Data set status and accessibility, 4) Data structure, 5) Supplemental descriptors, and 6) Location(s) of physical specimens. The online DMP also includes an example DMP for researchers to view and a link to the DataONE web site, which provides a useful search engine of data management tools, including interactive software that drafts a DMP based on user input.
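As an illustration only, the six categories listed above could be tracked for each data set with a small record and a completeness check. The class name, field names, and example values below are hypothetical and are not taken from the PLE DMP; whether a blank category is acceptable for a given data set (for example, a study with no physical specimens) is a decision for the researcher.

```python
from dataclasses import dataclass, fields


@dataclass
class DatasetMetadata:
    """One record per archived data set; field names are hypothetical."""
    data_set_descriptors: str = ""
    research_origin_descriptors: str = ""
    status_and_accessibility: str = ""
    data_structure: str = ""
    supplemental_descriptors: str = ""
    specimen_locations: str = ""


def missing_categories(record):
    """Return the names of categories that are still blank, in list order."""
    return [f.name for f in fields(record)
            if not getattr(record, f.name).strip()]


# Made-up example values for a partly documented data set.
record = DatasetMetadata(
    data_set_descriptors="Title, authors, abstract, keywords",
    research_origin_descriptors="Study site, design, and sampling methods",
    data_structure="CSV; one row per plot visit",
)
print(missing_categories(record))
```

Running the check before upload gives a quick list of categories that still need attention.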

The PLE data management plan requires all researchers to archive a copy of their data, including metadata that follow the standards recommended by Michener et al. 1997, within one year of publication. Data can be archived in a public repository (e.g., Dryad, GenBank, TreeBASE, and NCEAS Data Repository) or as online supplements to manuscripts. As is standard practice with data repositories, privileged or confidential information will be protected to respect the privacy of individuals. Data releases will also consider any legitimate concerns of investigators. Because PLE cannot afford a Data Manager staff position, each researcher is required to upload their own data following publication. Each researcher is also responsible for quality assurance and quality control.
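Because there is no Data Manager to send reminders, researchers may find it useful to compute the archiving deadline themselves. The sketch below is one way to do so, under two assumptions that the plan does not spell out: the deadline falls on the anniversary of the publication date (inclusive), and a 29 February publication date maps to 28 February of the following year. The function names are hypothetical.

```python
from datetime import date


def archive_deadline(publication_date):
    """Return the date one year after publication."""
    try:
        return publication_date.replace(year=publication_date.year + 1)
    except ValueError:
        # 29 February has no counterpart in the following year;
        # falling back to 28 February is an assumption.
        return publication_date.replace(year=publication_date.year + 1, day=28)


def is_overdue(publication_date, today):
    """True once the assumed one-year archiving window has passed."""
    return today > archive_deadline(publication_date)


# Made-up publication date, for illustration only.
published = date(2015, 6, 1)
print(archive_deadline(published))
print(is_overdue(published, date(2016, 6, 2)))
```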

In addition to making archived data available in international repositories, PLE provides web links to the raw data that accompany papers published by PLE researchers. These links are placed at the end of each citation, next to the PDF file, on PLE's Publications webpage. As researchers archive more of their studies in repositories, the number of archived data sets listed on the PLE Publications webpage will continue to grow.

Helpful data archiving references

Bruna, E. M. 2010. Scientific journals can advance tropical biology and conservation by requiring data archiving. Biotropica 42:399-401.

Gorentz, J. B., ed. 1992. Data management at biological field stations and coastal marine laboratories. OBFS Workshop April 22-26, 1990.

Michener, W. K., J. W. Brunt, J. J. Helly, T. B. Kirchner, and S. G. Stafford. 1997. Nongeospatial metadata for the biological sciences. Ecological Applications 7:330-342.

Moore, A. J., M. A. McPeek, M. D. Rausher, L. Rieseberg, and M. C. Whitlock. 2010. The need for archiving data in evolutionary biology. Journal of Evolutionary Biology 23:659-660.

Porter, J. H. 2010. A brief history of data sharing in the U.S. Long Term Ecological Research Network. Bulletin of the Ecological Society of America 91:14-20.

Rausher, M. D., M. A. McPeek, A. J. Moore, L. Rieseberg, and M. C. Whitlock. 2010. Data archiving. Evolution 64:603-604.