
Citation: HE Bing, LUO Yong, LI Bing-Ke, XUE Ying, YU Luo-Ting, QIU Xiao-Long, YANG Teng-Kuei. Predicting and Virtually Screening Breast Cancer Targeting Protein HEC1 Inhibitors by Molecular Descriptors and Machine Learning Methods[J]. Acta Physico-Chimica Sinica, 2015, 31(9): 1795-1802. doi: 10.3866/PKU.WHXB201507301

Predicting and Virtually Screening Breast Cancer Targeting Protein HEC1 Inhibitors by Molecular Descriptors and Machine Learning Methods
Highly expressed in cancer 1 (HEC1) is a conserved mitotic regulator that is critical for spindle checkpoint control, kinetochore function, and cell survival. Overexpression of HEC1 has been detected in a variety of human cancers and is linked to poor prognosis in primary breast cancer, so screening for novel inhibitors with high affinity for HEC1 is important for exploring targeted breast cancer therapy. Machine learning (ML) methods have shown good performance in predicting properties such as pharmacodynamics and toxicity. In this work, two ML methods, support vector machines (SVMs) and random forests (RFs), were used to build classification models that distinguish HEC1 inhibitors from non-inhibitors in a structurally diverse compound library, following feature selection on molecular descriptors. Both methods achieved promising prediction accuracy, with the RF model performing better. We then used the RF model to virtually screen an in-house compound library and identified two novel candidate HEC1 inhibitors. In vitro assays showed that both compounds had some antitumor activity against the MDA-MB-468 and MDA-MB-231 breast cancer cell lines. These results suggest that ML methods are a promising approach to designing and virtually screening HEC1 inhibitors.
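The workflow summarized in the abstract (descriptor feature selection, SVM and RF classifiers, then RF-based virtual screening) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes scikit-learn, a synthetic descriptor matrix in place of real HEC1 data, a univariate F-test as the feature-selection step, and illustrative hyperparameters, none of which are specified in the abstract.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a descriptor matrix (assumption): rows are compounds,
# columns are molecular descriptors, y is 1 for "inhibitor", 0 for "non-inhibitor".
X, y = make_classification(
    n_samples=400, n_features=50, n_informative=10, class_sep=1.5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Each pipeline keeps the 20 descriptors with the highest F-scores (assumed
# selection method), then fits the classifier. The SVM also gets scaled inputs.
models = {
    "SVM": make_pipeline(
        StandardScaler(), SelectKBest(f_classif, k=20), SVC(kernel="rbf", C=1.0)
    ),
    "RF": make_pipeline(
        SelectKBest(f_classif, k=20),
        RandomForestClassifier(n_estimators=200, random_state=0),
    ),
}

# Compare the two classifiers on the held-out test set.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name} test accuracy: {results[name]:.3f}")

# Virtual screening step: score an unlabeled library (random placeholder data)
# with the RF model and rank compounds by predicted inhibitor probability.
rng = np.random.default_rng(0)
library = rng.normal(size=(1000, X.shape[1]))
scores = models["RF"].predict_proba(library)[:, 1]
top_hits = np.argsort(scores)[::-1][:10]
print("Top-ranked library indices:", top_hits.tolist())
```

In a real study the synthetic arrays would be replaced by computed descriptors for known inhibitors and non-inhibitors, and the top-ranked compounds would go on to experimental testing, as the two candidates did here.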
Key words: HEC1; Selective inhibitor; Machine learning method; Support vector machine; Random forest; Virtual screening