Open Access
Comparison of Geographical Traceability of Wild and Cultivated Macrohyporia cocos with Different Data Fusion Approaches
Author(s) - Li Wang, Qinqin Wang, Yuanzhong Wang, Yunmei Wang
Publication year - 2021
Publication title - Journal of Analytical Methods in Chemistry
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.407
H-Index - 25
eISSN - 2090-8865
pISSN - 2090-8873
DOI - 10.1155/2021/5818999
Subject(s) - principal component analysis, traceability, pattern recognition (psychology), artificial intelligence, linear discriminant analysis, mathematics, partial least squares regression, data mining, computer science, statistics
Poria, derived from the dried sclerotium of Macrohyporia cocos, is an edible traditional Chinese medicine with high economic value. Because wild and cultivated M. cocos differ significantly in quality, this study traced the geographical origin of the fungus separately for wild and cultivated samples. Data fusion is a potentially useful strategy, but few studies have applied or discussed it in the geographical traceability of M. cocos, so multiple data fusion approaches were compared here. Supervised pattern recognition techniques, such as partial least squares discriminant analysis (PLS-DA) and random forest, were employed. Five types of data fusion involving low-, mid-, and high-level data fusion strategies were performed. Two feature extraction approaches were considered in data fusion: variable selection with the Boruta algorithm, a random forest-based method, and generation of principal components by principal component analysis (PCA), a dimension reduction technique. The results indicate the following: (1) Wild and cultivated samples differed in both the content of vital chemical components and their fingerprints. (2) Wild samples needed data fusion to achieve origin traceability, and the accuracy of the validation set was 95.24%. (3) Boruta outperformed PCA in feature extraction. (4) The mid-level Boruta PLS-DA model took full advantage of information synergy and showed the best performance. This study showed that both the geographical traceability and the optimal identification methods differed between cultivated and wild samples, and that data fusion is a promising technique for geographical identification.
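
The three fusion levels named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the two data blocks are random placeholders, PCA stands in for feature extraction (the Boruta route is omitted), and the majority vote used for high-level fusion is an assumed decision rule that the abstract does not specify. All function names are hypothetical.

```python
import numpy as np

def autoscale(X):
    """Mean-center each column and scale it to unit variance."""
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # guard against constant columns
    return (X - X.mean(axis=0)) / std

def pca_scores(X, n_components):
    """Return the first n_components principal component scores of X."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def low_level_fusion(blocks):
    """Low level: concatenate the autoscaled raw blocks column-wise."""
    return np.hstack([autoscale(b) for b in blocks])

def mid_level_fusion(blocks, n_components):
    """Mid level: extract features per block (here PCA scores), then concatenate."""
    return np.hstack([pca_scores(autoscale(b), n_components) for b in blocks])

def high_level_fusion(predictions):
    """High level: combine per-block class predictions.

    Assumption: simple majority vote; ties go to the lowest class label.
    """
    votes = np.asarray(predictions)  # shape: (n_models, n_samples)
    return np.array([np.bincount(col).argmax() for col in votes.T])

# Placeholder data: two measurement blocks for the same 12 samples.
rng = np.random.default_rng(0)
block_a = rng.normal(size=(12, 200))
block_b = rng.normal(size=(12, 50))

low = low_level_fusion([block_a, block_b])                  # all raw variables
mid = mid_level_fusion([block_a, block_b], n_components=3)  # 3 scores per block

# Hypothetical class predictions from three separate models for three samples.
fused_labels = high_level_fusion([[0, 1, 1], [0, 1, 0], [1, 1, 0]])
```

In this sketch the low-level matrix keeps every variable from both blocks, while the mid-level matrix holds only the extracted scores. Either fused matrix would then be passed to a classifier such as PLS-DA or random forest.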
