Citation: LI Yong, ZHOU Wei, DAI Zhi-Jun, CHEN Yuan, WANG Zhi-Ming, YUAN Zhe-Ming. Predicting the Protein Folding Rate Based on Sequence Feature Screening and Support Vector Regression[J]. Acta Physico-Chimica Sinica, 2014, 30(6): 1091-1098. doi: 10.3866/PKU.WHXB201404091

Predicting the Protein Folding Rate Based on Sequence Feature Screening and Support Vector Regression

  • Received Date: 12 December 2013
    Available Online: 9 April 2014

  • Predicting folding rates helps clarify the mechanism of protein folding. In this work, we collected 115 proteins with known folding rates, including two-, multi-, and mixed-state proteins. To characterize the primary structure more comprehensively, we considered sequence length, residue composition at different scales, k-space features for residue pairs, and geostatistical association features between residue positions, with each residue replaced by its corresponding physicochemical property values. Each protein sequence was thus represented by a numeric vector of 9357 features. Using an improved binary matrix shuffling filter followed by a multi-round worst descriptor elimination method, we selected 23 features with a clear meaning from this high-dimensional set. We then built a nonlinear support vector regression (SVR) model relating the folding rate to the 23 retained features. The correlation coefficient under jackknife cross-validation was 0.95. This prediction accuracy was superior to results reported in the literature and to those obtained with other reference feature selection methods. Finally, we established an interpretability system for SVR, which showed that the nonlinear regression relationship between the folding rate and the retained features was highly significant. Further analysis of the effect of each retained descriptor suggested that the protein folding rate may be closely related to sequence length, medium- and short-range features, and triplet residue composition features, among others.
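
As a rough illustration of two steps named in the abstract, the sketch below builds a few sequence features (log length, residue composition, residue-pair frequencies at a given spacing) and runs a jackknife (leave-one-out) loop that returns the correlation between held-out predictions and observed values. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names are hypothetical, the pair-spacing definition is one common reading of "k-space features", the feature set is far smaller than the 9357 features described, and a one-variable least-squares line stands in for the SVR model.

```python
import math
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Fraction of each standard residue in the sequence."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def k_spaced_pairs(seq, k):
    """Frequencies of ordered residue pairs with k residues between them.

    Assumption: this spacing definition is illustrative only.
    """
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    total = len(seq) - k - 1
    for i in range(total):
        pair = seq[i] + seq[i + k + 1]
        if pair in counts:  # skip non-standard residues
            counts[pair] += 1
    if total <= 0:
        return [0.0] * len(counts)
    return [c / total for c in counts.values()]


def sequence_features(seq, max_k=2):
    """Log length, composition, and pair frequencies for k = 0..max_k."""
    feats = [math.log(len(seq))] + composition(seq)
    for k in range(max_k + 1):
        feats += k_spaced_pairs(seq, k)
    return feats


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def jackknife_correlation(X, y, fit, predict):
    """Hold out each sample once, refit on the rest, correlate predictions."""
    preds = []
    for i in range(len(y)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        preds.append(predict(model, X[i]))
    return pearson(preds, y)


def fit_line(X, y):
    """Stand-in model (NOT SVR): least-squares line on the first feature."""
    xs = [row[0] for row in X]
    mx, my = sum(xs) / len(xs), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, y)) / sxx
    return slope, my - slope * mx


def predict_line(model, row):
    slope, intercept = model
    return slope * row[0] + intercept
```

Passing `fit` and `predict` as arguments keeps the jackknife loop independent of the model, so an SVR implementation could be substituted for the stand-in line without changing the validation code.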
