
Citation: CONG Yong, XUE Ying. Quantitative Structure-Activity Relationship Study of the Non-Nucleoside Inhibitors of HCV NS5B Polymerase by Machine Learning Methods[J]. Acta Physico-Chimica Sinica, 2013, 29(08): 1639-1647. doi: 10.3866/PKU.WHXB201305171

Quantitative Structure-Activity Relationship Study of the Non-Nucleoside Inhibitors of HCV NS5B Polymerase by Machine Learning Methods
A quantitative structure-activity relationship (QSAR) study was carried out on 89 non-nucleoside inhibitors of hepatitis C virus (HCV) NS5B polymerase built on two scaffolds, benzoisothiazole and benzothiazine. Two feature selection methods, genetic algorithm-partial least squares (GA-PLS) and linear stepwise regression analysis (LSRA), were used to select the optimal descriptor subsets, from which multiple linear regression (MLR) and partial least squares (PLS) linear models were built. The genetic algorithm-support vector machine (GA-SVM) approach was also applied, for the first time, to build nonlinear support vector regression models from the descriptor subset chosen by each selection method. All three machine learning methods gave satisfactory predictions. The three QSAR models built with the six LSRA-selected descriptors gave correlation coefficients of 0.958-0.962 for the test set, with GA-SVM giving the highest prediction accuracy (0.962). The three QSAR models built with the seven GA-PLS-selected descriptors gave correlation coefficients of 0.918-0.960 for the test set, of which the PLS model was the best (0.960). This work provides an effective way to predict the biological activity of HCV inhibitors, and the approach can be extended to other similar QSAR studies.
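The workflow summarized in the abstract (select a small descriptor subset, fit a linear model, report the correlation coefficient on held-out compounds) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic random numbers rather than the paper's 89 inhibitors, the greedy forward selection is a simplified stand-in for LSRA (no F-to-enter or F-to-remove tests), and the names `forward_stepwise`, `fit_mlr`, and `pearson_r` are hypothetical helpers, not the authors' code. GA-PLS and GA-SVM are not reproduced.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def fit_mlr(rows, y):
    """Ordinary least squares via the normal equations.

    Returns [intercept, b1, b2, ...].
    """
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    # Augmented system (X^T X | X^T y).
    A = [[sum(x[i] * x[j] for x in X) for j in range(p)]
         + [sum(x[i] * t for x, t in zip(X, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def predict(coef, rows):
    """Apply a fitted linear model to each row of descriptor values."""
    return [coef[0] + sum(b * v for b, v in zip(coef[1:], r)) for r in rows]

def columns(rows, idx):
    """Keep only the descriptor columns listed in idx."""
    return [[r[i] for i in idx] for r in rows]

def forward_stepwise(rows, y, max_features):
    """Greedily add the descriptor that most reduces the training RSS."""
    chosen = []
    while len(chosen) < max_features:
        best = None
        for j in range(len(rows[0])):
            if j in chosen:
                continue
            sub = columns(rows, chosen + [j])
            pred = predict(fit_mlr(sub, y), sub)
            rss = sum((a - b) ** 2 for a, b in zip(y, pred))
            if best is None or rss < best[0]:
                best = (rss, j)
        chosen.append(best[1])
    return chosen

# Synthetic stand-in data: 89 "compounds", 12 random "descriptors".
# The activity depends on descriptors 2, 7, and 10 plus a little noise.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(12)] for _ in range(89)]
y = [2.0 * r[2] - 1.5 * r[7] + 0.8 * r[10] + rng.gauss(0, 0.2) for r in X]

# Hypothetical 67/22 split into training and test sets.
X_train, y_train, X_test, y_test = X[:67], y[:67], X[67:], y[67:]

selected = forward_stepwise(X_train, y_train, max_features=3)
coef = fit_mlr(columns(X_train, selected), y_train)
r_test = pearson_r(y_test, predict(coef, columns(X_test, selected)))
print("selected descriptors:", sorted(selected))
print("test-set correlation coefficient:", round(r_test, 3))
```

Selection and fitting use only the training compounds, and the correlation coefficient is computed on compounds the model has not seen, which is what makes it a measure of predictive ability rather than goodness of fit.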
