Elsevier

Pedosphere

Volume 28, Issue 5, October 2018, Pages 739-750

An Insight into Machine Learning Algorithms to Map the Occurrence of the Soil Mattic Horizon in the Northeastern Qinghai-Tibetan Plateau

https://doi.org/10.1016/S1002-0160(17)60481-8

Abstract

Soil diagnostic horizons, each defined by a set of quantified properties, play a key role in soil classification. However, they are difficult to predict, and few attempts have been made to map their spatial occurrence. We evaluated and compared four machine learning algorithms, namely, the classification and regression tree (CART), random forest (RF), boosted regression trees (BRT), and support vector machine (SVM), to map the occurrence of the soil mattic horizon in the northeastern Qinghai-Tibetan Plateau using readily available ancillary data. Resampling and ensemble techniques significantly improved prediction accuracy, measured as the area under the receiver operating characteristic curve (AUC), and produced more stable results for the BRT (AUC of 0.921 ± 0.012, mean ± standard deviation) and RF (0.908 ± 0.013) algorithms compared with the CART algorithm (0.784 ± 0.012), which is the most commonly used machine learning method. Although the SVM algorithm yielded an AUC value (0.906 ± 0.006) comparable to those of the RF and BRT algorithms, it is sensitive to parameter settings, and tuning those parameters is extremely time-consuming. Therefore, we consider it inadequate for occurrence-distribution modeling. Considering its high prediction accuracy, robustness to parameter settings, ability to estimate prediction uncertainty, and easy interpretation of predictor variables, BRT seems to be the most desirable method. These results provide an insight into the use of machine learning algorithms to map the mattic horizon and potentially other soil diagnostic horizons.
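To make the reported metric concrete, the following is a minimal Python sketch of how an AUC value and a "mean ± standard deviation" summary might be computed for any occurrence model. It is an illustration only: the function names `auc_score` and `summarize_auc`, the bootstrap resampling scheme, and the toy labels and scores are assumptions made for this example and are not taken from the study, whose exact resampling procedure is not described in the abstract.

```python
import random
import statistics


def auc_score(labels, scores):
    """AUC as the fraction of (presence, absence) pairs in which the
    presence site receives the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarize_auc(labels, scores, n_repeats=50, seed=0):
    """Mean and standard deviation of AUC over bootstrap resamples.
    The bootstrap is an assumed, illustrative resampling scheme."""
    rng = random.Random(seed)
    idx = range(len(labels))
    aucs = []
    while len(aucs) < n_repeats:
        sample = [rng.choice(idx) for _ in idx]
        ys = [labels[i] for i in sample]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        aucs.append(auc_score(ys, [scores[i] for i in sample]))
    return statistics.mean(aucs), statistics.stdev(aucs)


# Hypothetical data: 1 = horizon present, 0 = absent, with model scores.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc_score(toy_labels, toy_scores)
toy_mean, toy_sd = summarize_auc(toy_labels, toy_scores)
```

In this form, an AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of presence and absence sites; the standard deviation across resamples gives a rough indication of how stable a model's score is.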
