Pedosphere 28(5): 739-750, 2018
ISSN 1002-0160/CN 32-1315/P
©2018 Soil Science Society of China
Published by Elsevier B.V. and Science Press
An Insight into Machine Learning Algorithms to Map the Occurrence of the Soil Mattic Horizon in the Northeastern Qinghai-Tibetan Plateau
ZHI Junjun1, ZHANG Ganlin1,2, YANG Renmin1,2, YANG Fei1,2, JIN Chengwei1, LIU Feng1, SONG Xiaodong1, ZHAO Yuguo1, and LI Decheng1
1State Key Laboratory of Soil and Sustainable Agriculture, Institute of Soil Science, Chinese Academy of Sciences, Nanjing 210008 (China)
2University of Chinese Academy of Sciences, Beijing 100049 (China)
ABSTRACT
      Soil diagnostic horizons, each of which is defined by a set of quantified properties, play a key role in soil classification. However, they are difficult to predict, and few attempts have been made to map their spatial occurrence. We evaluated and compared four machine learning algorithms, namely, the classification and regression tree (CART), random forest (RF), boosted regression trees (BRT), and support vector machine (SVM), to map the occurrence of the soil mattic horizon in the northeastern Qinghai-Tibetan Plateau using readily available ancillary data. Prediction accuracy was measured by the area under the receiver operating characteristic curve (AUC). The resampling and ensemble mechanisms of the BRT (AUC of 0.921 ± 0.012, mean ± standard deviation) and RF (0.908 ± 0.013) algorithms significantly improved prediction accuracy and produced more stable results compared to the CART algorithm (0.784 ± 0.012), which is the most commonly used machine learning method. Although the SVM algorithm yielded an AUC value (0.906 ± 0.006) comparable to those of the RF and BRT algorithms, it is sensitive to parameter settings, and tuning them is extremely time-consuming. Therefore, we consider it inadequate for occurrence-distribution modeling. Considering its high prediction accuracy, robustness to parameter settings, ability to estimate prediction uncertainty, and easy interpretation of predictor variables, BRT seems to be the most desirable method. These results provide an insight into the use of machine learning algorithms to map the mattic horizon and potentially other soil diagnostic horizons.
Key Words: boosted regression trees, classification and regression tree, digital soil mapping, random forest, soil diagnostic horizons, support vector machine
Citation: Zhi J J, Zhang G L, Yang R M, Yang F, Jin C W, Liu F, Song X D, Zhao Y G, Li D C. 2018. An insight into machine learning algorithms to map the occurrence of the soil mattic horizon in the northeastern Qinghai-Tibetan Plateau. Pedosphere. 28(5):739-750.
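The abstract reports each algorithm's accuracy as an AUC summarized as mean ± standard deviation over repeated runs. As a minimal sketch of how such a figure can be computed, the Python below calculates AUC from its rank-based definition (the probability that a randomly chosen presence site receives a higher predicted score than a randomly chosen absence site) and then summarizes a list of AUC values. The function names `auc_score` and `summarize`, and the sample labels and scores, are hypothetical illustrations and are not taken from the paper; the authors' actual software and resampling scheme are not described in this abstract.

```python
from statistics import mean, stdev


def auc_score(labels, scores):
    """Return the AUC for binary labels (1 = presence, 0 = absence).

    Uses the pairwise (Mann-Whitney) definition: the fraction of
    presence/absence pairs in which the presence site has the higher
    score, with tied scores counted as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one presence and one absence")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarize(aucs):
    """Return (mean, sample standard deviation) of AUCs from repeated runs."""
    return mean(aucs), stdev(aucs)


if __name__ == "__main__":
    # Hypothetical observations and predicted occurrence probabilities.
    labels = [0, 0, 1, 1]
    scores = [0.10, 0.40, 0.35, 0.80]
    print(f"AUC = {auc_score(labels, scores):.3f}")

    # Hypothetical AUCs from three repeated resampling runs.
    m, sd = summarize([0.90, 0.92, 0.94])
    print(f"AUC = {m:.3f} ± {sd:.3f}")
```

The pairwise form is quadratic in the number of sites, which is acceptable for illustration; a sorted-rank formulation would be preferable for large validation sets.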