
Biodiversity & Ecology

Research Article (Open Access)

A methodological framework to quantify the spatial quality of biological databases

Jaime R. García Márquez*, Carsten F. Dormann, Jan Henning Sommer, Marco Schmidt, Adjima Thiombiano, Sié Sylvestre Da, Cyrille Chatelain, Stefan Dressler & Wilhelm Barthlott

Article first published online: 24 September 2012

DOI: 10.7809/b-e.00057

*Corresponding author contact: jrgarcia.marquez@gmail.com

Biodiversity & Ecology  (Biodivers. Ecol.)

Special Volume: Vegetation databases for the 21st century,
edited by Jürgen Dengler, Jens Oldeland, Florian Jansen, Milan Chytrý, Jörg Ewald, Manfred Finckh, Falko Glöckler, Gabriela Lopez-Gonzalez, Robert K. Peet & Joop H.J. Schaminée
Volume 4, pages 25–39, September 2012

Keywords: Bootstrap; completeness; environmental bias; Jackknife; multi-scale analysis; point pattern analysis; sampling bias; species richness.


Abstract: Biogeographical analysis depends fundamentally on the geographical locations attached to the records in biological databases, so the reliability of such analyses hinges on the quality of the available spatial information. In the present study we build on a database of vascular plants of West Africa (Ivory Coast, Burkina Faso, Benin) containing 53,205 georeferenced observations distributed over 2,931 collection localities. We propose a methodology to quantify the quality of the database through a series of analyses of the spatial configuration of the collection localities, their spatial and environmental bias, and inventory completeness. The collection localities followed a highly clustered pattern and were strongly biased with respect to distance to cities, the coast, rivers, roads and protected areas. A similarly biased pattern was found in relation to several environmental factors. Inventory completeness was assessed by estimating the total number of species with two non-parametric estimators (first-order Jackknife and Bootstrap) at different grid cell sizes. At the highest resolution (100 km² cells), only 5.5% of the cells contained a near-complete species inventory (> 80% of the Jackknife estimate). The percentage of near-complete cells increased as the resolution of the analysis became coarser. The results of all analyses were integrated into a new index, the Gap Selection Index, which serves to guide future field campaigns and as a cautionary criterion for the uncertainties associated with biogeographical applications based on the current database.
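The completeness measure described in the abstract can be illustrated with a minimal Python sketch. It uses the standard incidence-based forms of the first-order Jackknife and Bootstrap richness estimators and is not necessarily the authors' implementation. The function names, the choice of one species set per sampling unit within a grid cell, and the example data are assumptions made for illustration only; the 0.8 threshold is the one stated in the abstract.

```python
def richness_estimates(samples):
    """Return (observed, jackknife1, bootstrap) richness for one grid cell.

    `samples` is a list of sets of species names, one set per sampling
    unit in the cell (hypothetical input format, assumed for this sketch).
    """
    m = len(samples)
    if m == 0:
        raise ValueError("at least one sampling unit is required")

    # Number of sampling units in which each species occurs.
    incidence = {}
    for unit in samples:
        for species in set(unit):
            incidence[species] = incidence.get(species, 0) + 1

    s_obs = len(incidence)
    # Species found in exactly one sampling unit ("uniques").
    uniques = sum(1 for n in incidence.values() if n == 1)

    # First-order Jackknife: S_obs + Q1 * (m - 1) / m
    jack1 = s_obs + uniques * (m - 1) / m
    # Bootstrap: S_obs + sum over species of (1 - p_k)^m, with p_k = n_k / m
    boot = s_obs + sum((1 - n / m) ** m for n in incidence.values())
    return s_obs, jack1, boot


def is_near_complete(samples, threshold=0.8):
    """True if observed richness exceeds `threshold` of the Jackknife estimate."""
    s_obs, jack1, _ = richness_estimates(samples)
    return jack1 > 0 and s_obs / jack1 > threshold


# Made-up example cell with four sampling units.
cell = [{"a", "b"}, {"a", "c"}, {"a", "b", "d"}, {"a"}]
s_obs, jack1, boot = richness_estimates(cell)
print(s_obs, jack1, boot, is_near_complete(cell))
```

In this made-up cell, two of the four observed species occur in only one sampling unit, so the Jackknife estimate is well above the observed richness and the cell would not count as near-complete. Repeating the calculation for grids of increasing cell size pools more sampling units per cell, which corresponds to the multi-scale comparison the abstract describes.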

Suggested citation:
García Márquez, J.R., Dormann, C.F., Sommer, J.H., Schmidt, M., Thiombiano, A., Sylvestre Da, S., Chatelain, C., Dressler, S., Barthlott, W. (2012): A methodological framework to quantify the spatial quality of biological databases. – In: Dengler, J., Oldeland, J., Jansen, F., Chytrý, M., Ewald, J., Finckh, M., Glöckler, F., Lopez-Gonzalez, G., Peet, R.K., Schaminée, J.H.J. [Eds.]: Vegetation databases for the 21st century. – Biodiversity & Ecology 4: 25–39. DOI: 10.7809/b-e.00057.