Predicting drug-target interactions based on dynamic multi-grained scanning

ZHANG Qi; YIN Zhixiang; LU Lin

doi:10.12299/jsues.24-0143


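ZHANG Qi^1, YIN Zhixiang^1, LU Lin^2

1. School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai 201620, China
2. Shanghai Xinhao Information Technology Co., Ltd., Shanghai 200235, China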

Funding: National Natural Science Foundation of China (62573282)

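Author information

About the author:
ZHANG Qi (born 1998), female, master's student; research interest: biostatistics. E-mail: 942587377@qq.com

Corresponding author:
YIN Zhixiang (born 1966), male, professor, Ph.D.; research interests: graph theory and combinatorics, DNA computing, and protein structure prediction. E-mail: zxyin66@163.com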

CLC number: TP391
Publication history
- Received: 2024-05-28
- Published online: 2025-12-22
- Issue date: 2025-09-30


Abstract

Abstract: Traditional machine learning models classify poorly in drug-target prediction because their model structure is shallow and the data features are complex. To address this problem, a new prediction model, DMS-DF, is proposed. Built on the deep forest algorithm, the model introduces a dynamic adaptive multi-grained scanning mechanism and uses CatBoost and XGBoost as the base classifiers of the cascade forest. The results show that, on the same dataset, DMS-DF achieves higher MCC, AUC, and AUPR than the four other models compared, offering a new approach to drug discovery.

Keywords:
- drug-target interactions
- machine learning
- multi-grained cascade forest (gcForest) model
- multi-grained scanning
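
The abstract names multi-grained scanning but this page does not describe how the dynamic adaptive variant chooses its windows. The following is only a minimal sketch of plain multi-grained scanning as used in deep forest: sliding windows of several sizes cut one feature vector into sub-vectors, which a forest would then turn into class-probability features. The function names and the `fractions` parameter (window size as a share of the feature dimension) are hypothetical and are not taken from the article.

```python
from typing import Dict, List, Sequence


def sliding_windows(features: Sequence[float], window: int,
                    stride: int = 1) -> List[List[float]]:
    """Return every contiguous sub-vector of length `window`, stepping by `stride`."""
    if window < 1 or window > len(features):
        raise ValueError("window must be between 1 and len(features)")
    return [list(features[i:i + window])
            for i in range(0, len(features) - window + 1, stride)]


def multi_grained_scan(features: Sequence[float],
                       fractions: Sequence[float] = (0.25, 0.5),
                       stride: int = 1) -> Dict[int, List[List[float]]]:
    """Scan one feature vector at several granularities.

    `fractions` is an assumption of this sketch: each window size is derived
    from the feature dimension instead of being a fixed constant.
    """
    dim = len(features)
    scans = {}
    for frac in fractions:
        window = max(1, int(dim * frac))
        scans[window] = sliding_windows(features, window, stride)
    return scans


if __name__ == "__main__":
    x = [0.1, 0.4, 0.3, 0.9, 0.7, 0.2, 0.8, 0.5]
    for window, pieces in multi_grained_scan(x).items():
        print(window, len(pieces))  # window size, number of sub-vectors
```

For an 8-dimensional vector, the default fractions give windows of size 2 and 4, producing 7 and 5 sub-vectors respectively.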


Figure 1. Dynamic adaptive multi-grained scanning

Figure 2. Improved cascade structure

Table 1. Performance of DMS-DF and baseline methods (S_n: sensitivity; S_p: specificity)

Model	S_n	S_p	MCC	AUC	AUPR
DMS-DF	0.9417	0.9317	0.8935	0.9847	0.9857
LGBMDF	0.9451	0.9471	0.8924	0.9844	0.9855
NEDTP	0.9194	0.9267	0.8462	0.9714	0.9690
SVM	0.8869	0.9286	0.8162	0.9668	0.9664
RF	0.9138	0.9348	0.8488	0.9784	0.9798
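
The threshold-based columns of Table 1 can be computed from a binary confusion matrix. The sketch below assumes the standard definitions of sensitivity, specificity, and the Matthews correlation coefficient; the article's own formulas are not shown on this page, and the function name and example counts are illustrative only, not data from the study.

```python
import math
from typing import Dict


def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Sensitivity, specificity, and MCC from binary confusion-matrix counts."""
    # Sensitivity: share of true interactions that are predicted as interactions.
    sn = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: share of non-interactions that are predicted as non-interactions.
    sp = tn / (tn + fp) if (tn + fp) else 0.0
    # MCC: correlation between predicted and true labels, in [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "MCC": mcc}


if __name__ == "__main__":
    # Made-up counts for illustration.
    print(confusion_metrics(tp=90, fp=20, tn=80, fn=10))
```

AUC and AUPR are not covered here because they depend on the ranked prediction scores rather than on a single confusion matrix.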

Table 2. Performance comparison of each method

Model	AUC	AUPR
3XGBoost-3RF	0.9813	0.9834
3CatBoost-3RF	0.9796	0.9818
DMS-DF	0.9847	0.9857
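
Table 2 compares cascade configurations that differ in their base classifiers. As a rough illustration of how a cascade forest passes information between layers, the sketch below feeds each layer the original features concatenated with the previous layer's class-probability vectors, then averages the last layer's outputs. It is a simplification under stated assumptions: `toy_mean` and `toy_max` are hypothetical stand-ins for trained models such as CatBoost or XGBoost, and the cross-validated training and automatic depth selection of deep forest are omitted.

```python
from typing import Callable, List, Sequence

# A "classifier" here is any callable mapping a feature vector to class probabilities.
Classifier = Callable[[Sequence[float]], List[float]]


def run_cascade(features: Sequence[float],
                layers: Sequence[Sequence[Classifier]]) -> List[float]:
    """Pass one sample through the cascade and return averaged class probabilities."""
    original = list(features)
    current = list(original)
    probs: List[List[float]] = []
    for classifiers in layers:
        # Every base classifier in the layer sees the same augmented input.
        probs = [clf(current) for clf in classifiers]
        # The next layer receives the original features plus all probability vectors.
        current = original + [p for vec in probs for p in vec]
    n_classes = len(probs[0])
    return [sum(vec[k] for vec in probs) / len(probs) for k in range(n_classes)]


def toy_mean(x: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for a trained model: scores by the mean feature value."""
    p = min(1.0, max(0.0, sum(x) / len(x)))
    return [1.0 - p, p]


def toy_max(x: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for a trained model: scores by the largest feature value."""
    p = min(1.0, max(0.0, max(x)))
    return [1.0 - p, p]


if __name__ == "__main__":
    two_layers = [[toy_mean, toy_max], [toy_mean, toy_max]]
    print(run_cascade([0.2, 0.6], two_layers))
```

Mixing different classifier families within a layer is what gives the cascade diverse probability features to build on; the toy functions above only mimic that interface.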

