Review
Special Issue: Ecological and evolutionary informatics
Ecoinformatics: supporting ecology as a data-intensive science

https://doi.org/10.1016/j.tree.2011.11.016

Ecology is rapidly evolving into a more open, accountable, interdisciplinary, collaborative and data-intensive science. Discovering, integrating and analyzing massive amounts of heterogeneous data are central to ecology as researchers address complex questions at scales from the gene to the biosphere. Ecoinformatics offers tools and approaches for managing ecological data and transforming them into information and knowledge. Here, we review the state of the art and recent advances in ecoinformatics that can benefit ecologists and environmental scientists as they tackle increasingly challenging questions that require large volumes of data across disciplines and scales of space and time. We also highlight the challenges and opportunities that remain.

Section snippets

Ecology as an evolving discipline

Ecology is increasingly becoming a data-intensive science (see Glossary) [1, 2], relying on massive amounts of data collected by both remote-sensing platforms [3] and sensor networks that are embedded in the environment [4, 5, 6, 7]. New observatory networks, such as the US National Ecological Observatory Network (NEON) [8] and Global Lake Ecological Observatory Network (GLEON) [9], provide research platforms that enable scientists to examine phenomena across diverse ecosystem types through access

What is ecoinformatics?

Ecoinformatics is a framework that enables scientists to generate new knowledge through innovative tools and approaches for discovering, managing, integrating, analyzing, visualizing and preserving relevant biological, environmental, and socioeconomic data and information. Many ecoinformatics solutions have been developed over the past decade, increasing scientists’ efficiency and supporting faster and easier data discovery, integration and analysis; however, many challenges remain, especially

The data life cycle

Knowledge is derived through the acquisition of data and the transformation of those data into information that can be incorporated into the corpus of scientific facts, principles and theories. Figure 1 illustrates the different stages that data might progress through during the processes that lead to new information and knowledge. Two stages are reflected in this depiction of the data life cycle. First, projects that include collection of new data typically proceed through steps 1–5 (i.e.

Supporting the full data life cycle

New ground, aerial and satellite-based environmental observing systems coupled with the rapid growth in the use of in situ environmental sensor networks for field research and monitoring, as well as an ever-growing number of citizen-science programs, will soon push ecology and the environmental sciences into a new era where petabytes of data are being collected annually. Powerful informatics platforms will be required to support scientists as they move into this age of data-intensive science.

Remaining challenges

Despite the emergence of ecoinformatics solutions that enable science, several technical and sociocultural challenges and research opportunities remain. First, from the technical side, it is difficult to transport terabyte- and petabyte-sized data sets. Possible solutions include adding computing capabilities to data repositories so that data sets can be processed prior to transport and colocating high-performance computing with large data resources. Second, new visualization approaches and

Concluding remarks

In a manner analogous to the transformation undertaken in the physics domain, new environmental observational systems are moving ecology into the realm of big science, whereby scientists and institutions share observation platforms, accumulate and analyze massive amounts of data, and collaborate across institutions to address environmental grand challenge questions. NEON, GLEON, OOI and other observational platforms play a key role in this scientific transformation, much like telescopes,

Acknowledgments

This work was supported by National Science Foundation awards #0619060, #0743429, #0722079, #0753138, #0814449, #0830944, #0918635, and the National Center for Ecological Analysis and Synthesis [funded by NSF (Grant #EF-0553768), the University of California, Santa Barbara, and the State of California].

Glossary

Cloud computing
provision of computing cycles, storage resources and software as a service that is accessible from the Internet via a standardized approach that treats these shared resources as a commodity utility.
Data-intensive science
a transformative, new way of doing science that entails the capture, curation and analysis of massive amounts of data from an array of sources, including satellite and aerial remote sensing, instruments, sensors and human observation.
Data life cycle
the data life

References (76)

  • J.H. Porter. New eyes on the world: advanced sensors for ecology. Bioscience (2009)
  • P.W. Rundel. Environmental sensor networks in ecological research. New Phytol. (2009)
  • B.J. Benson. Perspectives on next-generation technology for environmental sensor networks. Front. Ecol. Environ. (2010)
  • M. Keller. A continental strategy for the National Ecological Observatory Network. Front. Ecol. Environ. (2008)
  • T.K. Kratz. Toward a global lake ecological observatory network. Publ. Karelian Inst. (2006)
  • E. Fleishman. Top 40 priorities for science to inform US conservation and management policy. Bioscience (2011)
  • E.J. Hackett. Ecology transformed: the National Center for Ecological Analysis and Synthesis and the changing patterns of ecological research
  • D.P.C. Peters. Living in an increasingly connected world: a framework for continental-scale environmental science. Front. Ecol. Environ. (2008)
  • W.K. Michener et al. The evolution of collaboration in ecology: lessons from the United States Long Term Ecological Research Program
  • J.R. Gosz. Twenty-eight years of the US-LTER program: experience, results, and research questions
  • O.J. Reichmann. Challenges and opportunities of open data in ecology. Science (2011)
  • M.C. Whitlock. Data archiving in ecology and evolution: best practices. Trends Ecol. Evol. (2010)
  • M.C. Whitlock. Data archiving. Am. Nat. (2010)
  • P.B. Heidorn. Shedding light on the dark data in the long tail of science. Libr. Trends (2008)
  • E. Borer. Some simple guidelines for effective data management. Bull. Ecol. Soc. Am. (2009)
  • R.B. Cook. Best practices for preparing ecological data sets to share and archive. Bull. Ecol. Soc. Am. (2000)
  • M. Donnelly. DMP online: the Digital Curation Centre's web-based tool for creating, maintaining and exporting data management plans. Int. J. Digit. Curation (2010)
  • C. Strasser. DataONE promoting data stewardship through best practices
  • T. Cowles. The Ocean Observatories Initiative: sustained ocean observing across a range of spatial scales. Mar. Technol. Soc. J. (2010)
  • K. Vanderbilt et al. Information management standards and strategies for net primary production data
  • D. Barseghian. Workflows and extensions to the Kepler scientific workflow system to support environmental sensor data access and analysis. Ecol. Inform. (2010)
  • D. Barseghian. Sensor lifecycle management using scientific workflows
  • C. Gries et al. Moving from custom scripts with extensive instructions to a workflow system: use of the Kepler workflow engine in environmental information management
  • B. Lerner. Provenance and quality control in sensor networks
  • E.H. Fegraus. Maximizing the value of ecological data with structured metadata: an introduction to ecological metadata language (EML) and principles for metadata creation. Bull. Ecol. Soc. Am. (2005)
  • M.B. Jones. Managing scientific metadata. IEEE Internet Comput. (2001)
  • Rugge, D.J. (2005) Creating FGDC and NBII Metadata using Metavist 2005, Gen. Tech. Rep. NC-255, US Department of...
  • D. Higgins. Managing heterogeneous ecological data using Morpho
