Journal: |
Sensors |
Authors: |
Tiezhu Shi*, Huizeng Liu, Yiyun Chen, Teng Fei, Junjie Wang, Guofeng Wu |
Publication year: |
2017 |
Volume: |
17 |
Issue: |
|
Pages: |
|
This study investigated the abilities of pre-processing, feature selection and machine-learning methods for the spectroscopic diagnosis of soil arsenic contamination. The spectral data
were pre-processed by using Savitzky-Golay smoothing, first and second derivatives, multiplicative
scatter correction, standard normal variate, and mean centering. Principal component analysis
(PCA) and the RELIEF algorithm were used to extract spectral features. Machine-learning methods,
including random forests (RF), artificial neural networks (ANN), and radial basis function- and linear
function-based support vector machines (RBF- and LF-SVM), were employed to establish diagnosis
models. The model accuracies were evaluated and compared by using overall accuracies (OAs).
The statistical significance of the difference between models was evaluated by using McNemar’s test
(Z value). The results showed that the OAs varied with the different combinations of pre-processing,
feature selection, and classification methods. Feature selection methods could improve the modeling
efficiencies and diagnosis accuracies, and RELIEF often outperformed PCA. The optimal models
established by RF (OA = 86%), ANN (OA = 89%), RBF-SVM (OA = 89%), and LF-SVM (OA = 87%) showed no statistically significant difference in diagnosis accuracy (Z < 1.96, not significant at the 0.05 level). These results indicated that it was feasible to diagnose soil arsenic contamination using reflectance spectroscopy. The appropriate combination of multivariate methods was important to improve diagnosis accuracies.
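
The evaluation step described above (overall accuracy plus a McNemar Z value) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the common uncorrected form of the statistic, Z = (f_ab - f_ba) / sqrt(f_ab + f_ba), where f_ab counts samples that model A classifies correctly and model B does not, and f_ba counts the reverse. The function names and the label vectors are hypothetical.

```python
import math


def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the reference class."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mcnemar_z(y_true, pred_a, pred_b):
    """McNemar Z value for two classifiers evaluated on the same samples.

    Assumed form (no continuity correction):
        Z = (f_ab - f_ba) / sqrt(f_ab + f_ba)
    Only samples on which the two models disagree in correctness contribute.
    """
    # Samples model A gets right and model B gets wrong.
    f_ab = sum(a == t and b != t for t, a, b in zip(y_true, pred_a, pred_b))
    # Samples model A gets wrong and model B gets right.
    f_ba = sum(a != t and b == t for t, a, b in zip(y_true, pred_a, pred_b))
    if f_ab + f_ba == 0:
        return 0.0  # the models never disagree, so there is no evidence of a difference
    return (f_ab - f_ba) / math.sqrt(f_ab + f_ba)


# Hypothetical labels: 1 = contaminated, 0 = uncontaminated.
y_true = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
pred_a = [1, 1, 1, 0, 0, 0, 1, 0, 0, 0]  # one error
pred_b = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]  # two errors

z = mcnemar_z(y_true, pred_a, pred_b)
# |Z| < 1.96 means the difference is not significant at the 0.05 level (two-sided).
print(overall_accuracy(y_true, pred_a), overall_accuracy(y_true, pred_b), abs(z) < 1.96)
```

Under this assumed form, two models with different OAs can still be statistically indistinguishable when the number of samples on which they disagree is small, which is consistent with the comparison reported for the four optimal models.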