Pedosphere 32(4): 507-520, 2022
ISSN 1002-0160/CN 32-1315/P
©2022 Soil Science Society of China
Published by Elsevier B.V. and Science Press
Protein sequence databases generated from metagenomics and public databases produced similar soil metaproteomic results of microbial taxonomic and functional changes
Yi XIONG1,2, Lu ZHENG1, Xiangxiang MENG1,2, Ren Fang SHEN1,2, Ping LAN1,2
1State Key Laboratory of Soil and Sustainable Agriculture, Institute of Soil Science, Chinese Academy of Sciences, Nanjing 210008 (China)
2University of Chinese Academy of Sciences, Beijing 100049 (China)
Soil metaproteomics has excellent potential as a tool to elucidate the structural and functional changes in soil microbial communities in response to environmental alterations. However, it is hindered by several challenges and gaps. Soil microbial communities are extremely complex in composition and include many uncultured microorganisms whose whole genomes have not been sequenced. Selecting a suitable protein sequence database therefore remains a challenge in soil metaproteomics. In this study, two databases, the Public database and the Meta-database, were constructed using protein sequences from public databases and from metagenomics, respectively. Based on published soil metaproteomic raw data, we comprehensively analyzed and compared the soil metaproteomic results obtained when each of these two kinds of protein sequence databases was used for protein identification. The Meta-database identified many more proteins, gave higher sequence coverage, and yielded more microbial species and functional annotations than the Public database. These findings indicated that the Meta-database was more specific as a protein sequence database. However, the follow-up in-depth metaproteomic analyses produced similar main results regardless of the database used. The microbial community composition at the genus level was similar between the two databases, especially for the species annotations with high peptide-spectrum match counts and high abundance. The functional analyses in response to stress, such as the gene ontology enrichment of biological process and molecular function terms and the identification of key functional microorganisms, were also similar regardless of the database. Our analysis revealed that the Public database could also, to some extent, meet the demand to explore the functional responses of microbial proteins. This study provides valuable insights into the choice of protein sequence databases and their impacts on subsequent bioinformatic analysis in soil metaproteomic research, and it will facilitate the optimization of experimental design for different purposes.
Key Words: bioinformatics, differentially accumulated protein, functional annotation, functional microorganism, Meta-database, microbial community, microbial species, Public database
Citation: Xiong Y, Zheng L, Meng X X, Shen R F, Lan P. 2022. Protein sequence databases generated from metagenomics and public databases produced similar soil metaproteomic results of microbial taxonomic and functional changes. Pedosphere. 32(4): 507-520.