Background & Summary

Most ecosystems are rich in species that display a wide diversity of characteristics1 (i.e., traits). One way to make meaningful generalizations from this diversity has been to identify physiological, ecological or functional traits of organisms and use them (e.g., as explanatory variables) to infer patterns of demography, distribution and abundance, and more broadly, ecosystem function and evolution2. Moreover, species traits can be used as explanatory variables for the responses of ecosystems to environmental change, as functionally significant traits mediate species’ responses to disturbances3. Recently, research has demonstrated the utility of trait-based approaches for understanding the effects of anthropogenic disturbances4, the provisioning of ecosystem services5, species distributions6–8, species composition9,10, and energetic and ecological trade-offs11,12. In seminal papers, compilations of species trait data with broad taxonomic coverage have revealed, for example, a general axis of variation in plants that describes costs and benefits of key chemical, structural and physiological traits11, and factors influencing the metabolic rates of organisms13. However, such broad-scale insights have been restricted to relatively few taxonomic groups, often owing to a lack of data and, where data do exist, a lack of information about the ecological context in which they were collected.

Trait data for stony corals (Cnidaria: Scleractinia) have been collected for more than 100 years and published in many languages. Sufficient data might well exist already for addressing broad-scale hypotheses regarding the ecology and evolution of corals. Although trait compilations are accumulating4,14–16, and new statistical approaches for analysing such data are emerging7,12, these datasets are typically gathered for specific traits in isolation to address specific questions, which can result in duplication of effort by separate research groups (e.g., Darling et al.12 and Pratchett et al.17 both independently compiled growth rate data). Trait data also tend to be gathered rapidly, for instance with means extracted from tables that present a mixture of original data and data collected previously by others (i.e., meta-analyses). Such rapid assembly of data can result in the omission of important contextual information (e.g., local environmental conditions and levels of variation and replication); confusion about the origin of the data, which prevents appropriate provenance and credit18; and the accidental duplication of data points in large datasets.

In this data descriptor, we introduce the Coral Trait Database: a curated database of trait information for coral species from the global oceans. The goals of the Coral Trait Database are: (i) to assemble disparate information on coral traits, (ii) to provide unrestricted, open-source access to coral trait data, (iii) to facilitate and encourage the appropriate crediting of original data sources, and (iv) to engage the reef coral research community in the collection and quality control of trait data. We release 56 error-checked, validated and referenced traits, and also provide their context of measurement, together with an online system for transparently and accurately archiving and presenting coral trait data in future research. Our vision is an inclusive and accessible data resource to more rapidly advance the science and management of a sensitive ecosystem at a time of unprecedented environmental change.

Methods

The data are held in the Coral Trait Database (https://coraltraits.org). The database was designed to contain individual-level traits and species-level characteristics and is currently focused on shallow water zooxanthellate (‘reef building’) scleractinian corals. Individual-level traits include any potentially heritable quality of an organism19,20. In the database, individual-level traits are accompanied by contextual characteristics, which give information about the environment or situation in which an individual-level trait was measured (e.g., characteristics of the habitat, seawater or an experiment). These contextual variables are important for understanding variation in individual-level traits (e.g., as predictor variables in analyses). For example, if colony growth rate was measured at a given depth, that depth is recorded to provide context for the focal measurement. Some individual-level traits have little or no variation (e.g., mode of larval development), and therefore contextual information is not required. Species-level characteristics do not have contextual information because they are characteristics of species as entities (such as geographical range size and maximum depth observed).

For simplicity, we use the single term ‘trait’ to refer to individual-level (variant and invariant), species-level (emergent) and contextual (environmental or situational) measurements. Moreover, these traits are grouped into ten use-classes based on various sub-disciplines of reef coral research: biomechanical, conservation, ecological, geographical, morphological, phylogenetic, physiological, reproductive, stoichiometric, and contextual.

Observations and measurements

The database contains two core data tables—Observations and Measurements—each of which has a series of associated tables (Fig. 1). We follow the high-level structure of the Observation and Measurement Ontology21 in that observations bind related measurements and potentially provide context for other observations.

Figure 1: Overview of the design of the Coral Trait Database.

(a) The general schema consists of an Observation of a coral colony that is a collection of one or more Measurements associated with the colony. Solid borders represent table associations and dotted borders represent values. Observations have four table associations (contributor, coral species, resource and location) and one value for access (i.e., public or private). Measurements have four table associations (observation, trait, methodology and standard) and five values. (b) An example of an observation where coral growth rate was measured along with two contextual measurements (represented in the database by an eye). All observation-level attributes are required. Required measurement-level attributes are trait, standard, value and value type. Precision details are entered when a value type is not a raw value. Photograph: Emily Darling.

The observation table contains information about the observation of a coral or coral species. Observation-level data must include the Enterer, Species, Location and Resource. Access is an optional variable, and can be controlled by database users entering data for a project that has not yet been published (see https://coraltraits.org/procedures for more information). Observation-level data are the same for all measurements corresponding to the observation. Measurement-level data include the Trait, Value, Standard (measurement unit), Methodology, and estimates of precision (if applicable). The hypothetical example given in Fig. 1b is for growth rate that was measured within the context of a water depth and habitat that were given in the published resource.
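The observation and measurement structure described above can be sketched as a pair of record types. This is only an illustration under stated assumptions, not the database's actual schema: the class and field names, the `expects_precision` helper and all example values are hypothetical placeholders, and the set of value types is the current list given under Data acquisition.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Value types currently listed in the text; the database can add more.
VALUE_TYPES = {
    "raw", "mean", "median", "maximum", "minimum",
    "expert opinion", "group opinion", "model derived",
}


@dataclass
class Measurement:
    """One measured value; field names are hypothetical."""
    trait: str
    value: str
    standard: str                      # measurement unit
    value_type: str
    methodology: Optional[str] = None
    precision: Optional[float] = None  # e.g., a standard deviation

    def __post_init__(self) -> None:
        if self.value_type not in VALUE_TYPES:
            raise ValueError(f"unknown value type: {self.value_type!r}")

    @property
    def expects_precision(self) -> bool:
        # Per the Fig. 1 caption, precision details apply to non-raw values.
        return self.value_type != "raw"


@dataclass
class Observation:
    """Binds related measurements; all measurements share these attributes."""
    contributor: str
    species: str
    location: str
    resource: str
    access: str = "public"             # "public" or "private"
    measurements: List[Measurement] = field(default_factory=list)


# Placeholder values only, mirroring the shape of the Fig. 1b example:
# one focal trait plus two contextual measurements.
obs = Observation(
    contributor="contributor-1",
    species="Example species",
    location="Example reef",
    resource="Example resource",
    measurements=[
        Measurement("growth rate", "12.0", "mm per year", "mean"),
        Measurement("water depth", "5", "m", "raw"),
        Measurement("habitat type", "reef crest", "text", "raw"),
    ],
)
```

Keeping contextual measurements in the same list as the focal trait reflects the idea that an observation binds related measurements, so the context travels with the value it qualifies.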

The Species table provides taxonomy that is regularly updated by the Taxonomy Advisory Board (https://coraltraits.org/procedures) to keep pace with the rapid rate of revision22–24. The table contains the valid name for each coral species based largely on the World Register of Marine Species (http://www.marinespecies.org), the major clade (Basal, Robust or Complex25), family based on molecular work26, family based on morphology (following Cairns27 or Veron28), and other names and synonyms.

Data acquisition

All public data in the Coral Trait Database and included in this data descriptor release are linked with published resources, which include peer-reviewed papers, taxonomic monographs and books. The original source of entered data must be included (called the primary resource), even when extracted from secondary compilations (e.g., for the purpose of meta-analyses). Secondary sources can be included optionally, and so the database captures both the original data collector and subsequent data compilers, which allows both to be credited when re-using data. Measurement value types, which can be flexibly added to, currently include: raw, mean, median, maximum, minimum, expert opinion (the view of a single expert), group opinion (the consensus of a group of experts), and model derived. Continuous data are typically means extracted from tables or figures unless raw data are available. When available, aggregate values such as means and medians should be accompanied by the number of replicates and a measure of dispersion (e.g., standard deviation). Means and estimates of dispersion from figures in resources were captured using ImageJ29. The data released in this data descriptor have broad taxonomic (Fig. 2), global (Fig. 3) and phylogenetic (Fig. 4) coverage. However, some large data gaps exist, because few species have been comprehensively measured in many locations.

Figure 2: Trait by species matrix, illustrating the coverage of trait data currently available in the Coral Trait Database across the world's 1547 coral species.

Blue cells correspond with the traits released in this data descriptor. Grey cells correspond with other available data for which thorough error checking is still being conducted.

Figure 3: Locations where data already released in the Coral Trait Database were collected.

Figure 4: The phylogenetic coverage of traits in the Coral Trait Database, for the subset of species in the current molecular phylogeny.

As for Fig. 2, blue cells indicate traits for species released in this data descriptor and grey cells indicate other available information in the database, still being federated.

Data Records

A static release of the 56 traits contained in this descriptor is available from the Coral Trait Database (Data Citation 1) and Figshare (Data Citation 2). Details and references for the trait data are summarised in Table 1 (available online only). Up-to-date data can be downloaded directly from the database. However, as validation (see Technical Validation, below) and data entry are ongoing, users are advised to use the static releases so that results remain consistent as the database is updated. Both static releases and datasets downloaded from the database are accompanied by the primary (and, if applicable, secondary) resource lists for the data, which should be credited wherever feasible.

Table 1: Overview of traits in release 1.1.1, including descriptions, measurement standards, the number of measurements and the references.

Technical Validation

The database is curated on a voluntary basis by a Managerial Board, Editorial Board, Taxonomy Advisory Board and Database Administrator (https://coraltraits.org/procedures). Database Contributors who add data for a new trait are typically asked to be that trait’s editor. Quality control of data and editorial procedures include:

  1. Contributor approval: Database users must request permission to become a database contributor, and any observations entered by the contributor are associated with their user account.

  2. Editorial approval: Once a contributor enters an observation of a coral trait, an email is sent automatically to the editor of that trait. The editor must approve the observation to remove the ‘pending’ flag from the observation record.

  3. User feedback: Data issues can be reported for any observation using a simple form. Editors are automatically emailed if an issue with one of their traits is reported.

  4. Duplicate detection: Measurements with the same value, resource, location and species are flagged for confirmation.

  5. Outlier detection: Frequency histograms are generated in real time when loading trait pages. Outliers can be detected visually (e.g., a very large value for continuous data or a category that has one or few associated measurements for categorical data).
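The duplicate-detection rule in the list above lends itself to a short illustration. The following is a minimal sketch, assuming measurements are available as dictionaries with hypothetical keys `value`, `resource`, `location` and `species`; it is not the database's own implementation, and the records are placeholders. It groups records on the four fields named in the text and reports any group with more than one member.

```python
from collections import defaultdict


def flag_duplicates(measurements):
    """Return lists of record indices that share value, resource,
    location and species (candidates for manual confirmation)."""
    groups = defaultdict(list)
    for index, m in enumerate(measurements):
        key = (m["value"], m["resource"], m["location"], m["species"])
        groups[key].append(index)
    # Only groups with two or more records are potential duplicates.
    return [indices for indices in groups.values() if len(indices) > 1]


# Placeholder records; keys and values are illustrative only.
records = [
    {"value": "12.0", "resource": "R1", "location": "Reef A", "species": "Species A"},
    {"value": "12.0", "resource": "R2", "location": "Reef A", "species": "Species A"},
    {"value": "12.0", "resource": "R1", "location": "Reef A", "species": "Species A"},
]

flagged = flag_duplicates(records)
```

Here records 0 and 2 match on all four fields and would be flagged together, while record 1 differs in its resource and is left alone. Flagging rather than deleting matches the text, which says such measurements are flagged for confirmation.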

Usage Notes

The data release is a compressed folder containing two files:

  1. A CSV-formatted data file containing all publicly available observation and measurement data, which includes contextual data.

  2. A CSV-formatted resource file containing all the resources (primary and secondary) that correspond with the data. Users are expected to cite the data correctly using these resources.

An example of extracting and reshaping release data for analysis can be found online (https://coraltraits.org/procedures).
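As a rough illustration of what such reshaping might involve, the sketch below turns a long-format table (one measurement per row) into a species-by-trait structure using only the Python standard library. The column names `species`, `trait` and `value` and the inline sample rows are assumptions made for this example; the headers in the actual release file may differ and should be checked before adapting it.

```python
import csv
import io
from collections import defaultdict

# Placeholder data standing in for the release data file.
SAMPLE = """species,trait,value
Species A,Growth rate,10
Species A,Growth rate,14
Species A,Growth form,branching
Species B,Growth rate,5
"""


def species_by_trait(csv_file):
    """Collect every value for each (species, trait) pair.

    Values are kept as lists of strings because a species can have
    several measurements of the same trait.
    """
    table = defaultdict(lambda: defaultdict(list))
    for row in csv.DictReader(csv_file):
        table[row["species"]][row["trait"]].append(row["value"])
    # Convert nested defaultdicts to plain dicts for predictable lookups.
    return {species: dict(traits) for species, traits in table.items()}


wide = species_by_trait(io.StringIO(SAMPLE))
```

To read a downloaded file instead of the inline sample, the same function can be passed an open file handle, for example `open(path, newline="")`. How repeated measurements are then summarised (mean, median, or kept raw) is left to the analysis, since the release mixes value types.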

Additional Information

Table 1 is only available in the online version of this paper.

How to cite this article: Madin, J. S. et al. The Coral Trait Database, a curated database of trait information for coral species from the global oceans. Sci. Data 3:160017 doi: 10.1038/sdata.2016.17 (2016).