Pedosphere 35(2): 387-404, 2025
ISSN 1002-0160/CN 32-1315/P
©2025 Soil Science Society of China
Published by Elsevier B.V. and Science Press
Comparing disaggregation approaches DSMART and PPD in disaggregating soil series maps
Tahmid Huq EASHER1, Daniel SAURETTE1, Brandon HEUNG2, Adam GILLESPIE1, Richard J. HECK1, Asim BISWAS1
1 School of Environmental Sciences, University of Guelph, 50 Stone Road East, Guelph ON N1G 2W1 (Canada); 2 Department of Plant, Food, and Environmental Sciences, Faculty of Agriculture, Dalhousie University, P. O. Box 550, Truro NS B2N 5E3 (Canada)
ABSTRACT
Conventional soil maps (CSMs) often contain multiple soil types within a single polygon, which hinders the ability of machine learning to predict soils accurately. Soil disaggregation approaches are commonly used to improve the spatial and attribute precision of CSMs. The disaggregation and harmonization of soil map units through resampled classification trees (DSMART) approach is popular but computationally intensive, as it generates synthetic samples and assigns them to soil series based on the areal coverage information of CSMs. Alternatively, the pure polygon disaggregation (PPD) approach assigns soil series based solely on the proportions of soil series in pure polygons in CSMs. This study compared these two disaggregation approaches by applying them to a CSM of Middlesex County, Ontario, Canada. Four sampling plans were used, combining two sampling designs, simple random sampling (SRS) and conditioned Latin hypercube sampling (cLHS), with two sample sizes (83 100 and 19 420 samples per sampling plan), both based on an area-weighted approach. Two machine learning algorithms (MLAs), C5.0 decision tree (C5.0) and random forest (RF), were applied to the disaggregation approaches to compare the disaggregation accuracy. The accuracy assessment used a set of 500 validation points obtained from the Middlesex County soil survey report. With the larger sample size, C5.0 (Kappa index = 0.58-0.63) performed better than RF (Kappa index = 0.53-0.54), and PPD with C5.0 was the best-performing approach (Kappa index = 0.63). With the smaller sample size, cLHS (Kappa index = 0.41-0.48) and SRS (Kappa index = 0.40-0.47) produced similar accuracy results. The PPD approach placed lower demands on processing capacity and time (1.62-5.93 h) than DSMART (2.75-194.2 h) while yielding maps with lower uncertainty. For CSMs predominantly composed of pure polygons, using PPD for soil series disaggregation is a more efficient and rational choice. However, DSMART is the preferable approach for disaggregating soil series that lack pure polygon representations in the CSMs.
Key Words: conditioned Latin hypercube sampling, conventional soil map, machine learning algorithm, processing capacity and time, sample size, simple random sampling, soil map unit, soil series disaggregation
Citation: Easher T H, Saurette D, Heung B, Gillespie A, Heck R J, Biswas A. 2025. Comparing disaggregation approaches DSMART and PPD in disaggregating soil series maps. Pedosphere. 35(2): 387-404.
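The Kappa index reported in the abstract is not defined there. Assuming it refers to the usual Cohen's kappa, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance, it could be computed from validation points roughly as in the sketch below. The function name, the soil series labels, and the four-point example are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohens_kappa(observed, predicted):
    """Cohen's kappa for two equal-length sequences of class labels.

    Assumption: the "Kappa index" in the abstract is standard Cohen's kappa.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(observed)

    # Observed agreement: share of points where the labels match.
    p_o = sum(o == p for o, p in zip(observed, predicted)) / n

    # Chance agreement: sum over classes of the product of marginal shares.
    obs_counts = Counter(observed)
    pred_counts = Counter(predicted)
    p_e = sum(obs_counts[c] * pred_counts[c] for c in obs_counts) / (n * n)

    if p_e == 1.0:
        # Both sequences contain a single identical class; kappa is undefined.
        return float("nan")

    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical validation points: soil series observed vs. predicted.
observed = ["A", "A", "B", "B"]
predicted = ["A", "A", "B", "A"]
kappa = cohens_kappa(observed, predicted)
print(round(kappa, 2))  # 0.5
```

In this toy case three of four points agree (p_o = 0.75) and chance agreement is 0.5, giving a kappa of 0.5. Values near 1 indicate agreement well beyond chance, and values near 0 indicate agreement no better than chance.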