Ensemble model and empirical analysis of breast cancer diagnosis based on Stacking

CHENG Mushuang, WANG Guoqiang

Citation: CHENG Mushuang, WANG Guoqiang. Ensemble model and empirical analysis of breast cancer diagnosis based on Stacking[J]. Journal of Shanghai University of Engineering Science, 2025, 39(3): 360-365. doi: 10.12299/jsues.24-0147

doi: 10.12299/jsues.24-0147
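Funding: National Natural Science Foundation of China, General Program (11971302); Pudong New Area Science and Technology Development Fund, Industry-University-Research Special Fund (Artificial Intelligence) project (PKX2020-R02); National Statistical Science Research Project, General Project (2020LY067)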
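Details
    About the author:

    CHENG Mushuang (b. 1997), female, master's student. Research interests: machine learning and data mining. E-mail: Mushuangcheng@hotmail.com

    Corresponding author:

    WANG Guoqiang (b. 1977), male, professor, Ph.D. Research interests: optimization theory and algorithms, statistical inference for high-dimensional data, statistical optimization, data mining, machine learning, and artificial intelligence. E-mail: guoq_wang@hotmail.com

  • CLC number: TP181;R737.9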

Ensemble model and empirical analysis of breast cancer diagnosis based on Stacking

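  • Abstract: Early diagnosis of breast cancer significantly improves the chance of a cure. In recent years, the rapid rise of big data and artificial intelligence has provided technical support for the early diagnosis of many diseases, including breast cancer. To improve the accuracy of breast cancer diagnosis, a Stacking ensemble model improved with the area under the curve (AUC) is constructed. First, an AdaBoost ensemble model based on $v$-SVM is built and used as the meta-learner of Stacking. Second, the overall AUC value of each base learner is used to weight that learner's training output, and the weighted outputs form the training set on which the meta-learner is trained. Finally, an empirical analysis is carried out on the WDBC and WBC datasets. The results show that the AUC-improved Stacking ensemble model reaches accuracies of 96.49% on WDBC and 97.81% on WBC (Table 1), and can give physicians a finer-grained, more individualized basis for diagnosis, supporting earlier intervention and more effective treatment.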
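The AUC-weighting step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact weighting formula, so the version below is an assumption: each base learner's out-of-fold score is multiplied by that learner's AUC, normalised so the weights sum to 1. The function names, the normalisation, and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def auc_score(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive sample is scored above
    a random negative one (ties count one half). O(n_pos * n_neg)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_weighted_meta_features(
    y_true: Sequence[int], base_outputs: Sequence[Sequence[float]]
) -> Tuple[List[float], List[List[float]]]:
    """base_outputs[k][i] is the out-of-fold score of base learner k on sample i.

    Returns (weights, meta_X) with meta_X[i][k] = weights[k] * base_outputs[k][i].
    """
    aucs = [auc_score(y_true, col) for col in base_outputs]
    total = sum(aucs)
    # Assumption: normalise the AUC values so the weights sum to 1.
    weights = [a / total for a in aucs]
    meta_X = [
        [w * col[i] for w, col in zip(weights, base_outputs)]
        for i in range(len(y_true))
    ]
    return weights, meta_X


# Toy data, made up for illustration (not drawn from WDBC or WBC).
y = [0, 0, 1, 1]
outputs = [
    [0.1, 0.4, 0.35, 0.8],  # hypothetical base learner A
    [0.2, 0.3, 0.6, 0.9],   # hypothetical base learner B
]
weights, meta_X = auc_weighted_meta_features(y, outputs)
```

In this sketch `meta_X` plays the role of the meta-learner's training set. In the paper the meta-learner is an AdaBoost ensemble of $v$-SVMs, and fitting it is left out here. The weighting is deliberately computed from out-of-fold scores, since weighting by AUC measured on the same data a base learner was fitted on would tend to overstate the stronger-fitting learners.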
  • Figure 1.  Flowchart of Ada-$v$SVM

    Figure 2.  Flowchart of traditional Stacking

    Figure 3.  Flowchart of improved Stacking

    Table 1.  Comparison between the improved Stacking ensemble model and the unimproved model

    Model                               Accuracy /%         AUC /%
                                        WDBC     WBC        WDBC     WBC
    Stacking ensemble model             95.61    97.08      94.68    97.27
    Improved Stacking ensemble model    96.49    97.81      96.38    98.28

    Table 2.  Comparison with the base learners

    Model                               Accuracy /%         AUC /%
                                        WDBC     WBC        WDBC     WBC
    Improved Stacking ensemble model    96.49    97.81      96.38    98.28
    KNN                                 95.61    94.89      94.68    95.98
    GBDT                                96.49    97.08      96.38    97.28
    RF                                  96.49    95.62      96.70    95.28
    NB                                  90.35    97.08      90.20    97.28
Publication history
  • Received:  2024-05-23
  • Published online:  2025-12-22
  • Issue date:  2025-09-30
