Volume 39 Issue 1
May 2025
WANG Jiajun, ZHAO Shouwei. BiLSTM_Attention text classification method based on pre-trained model ERNIE3.0[J]. Journal of Shanghai University of Engineering Science, 2025, 39(1): 113-118. doi: 10.12299/jsues.23-0206

BiLSTM_Attention text classification method based on pre-trained model ERNIE3.0

doi: 10.12299/jsues.23-0206
  • Received Date: 2023-09-22
  • Publish Date: 2025-05-19
  • To enhance the accuracy of text classification models and address the deficiencies of traditional word-vector models in representing syntax, semantics, and deep-level information, a text classification model combining the ERNIE 3.0 pre-trained model with BiLSTM_Attention was proposed. First, the ERNIE 3.0 model was used to encode the text dataset, generating word vectors rich in semantic information. Then, text features were extracted through the BiLSTM layer and the Attention layer. Finally, the output vectors were classified via the Softmax layer. Classification experiments on the THUCNews dataset compared accuracy and F1-score across different models. The results show that the ERNIE 3.0_BiLSTM_Attention model achieves superior classification performance.
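  • The abstract does not include code, but the attention step it describes (pooling BiLSTM hidden states into a single sentence vector before Softmax classification) can be illustrated with a minimal NumPy sketch. All names and dimensions below (`attention_pool`, sequence length `T`, hidden size `2d`, attention size `a`) are hypothetical choices for illustration, not the paper's actual configuration:

    ```python
    import numpy as np

    def softmax(x, axis=-1):
        # Numerically stable softmax.
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def attention_pool(H, w, v):
        """Additive attention over BiLSTM hidden states.

        H: (T, 2d) hidden states for T time steps (forward + backward concatenated)
        w: (2d, a) projection matrix, v: (a,) scoring vector
        Returns the attention-weighted context vector and the weights.
        """
        scores = np.tanh(H @ w) @ v      # (T,) unnormalized alignment scores
        alpha = softmax(scores)          # (T,) attention weights, sum to 1
        context = alpha @ H              # (2d,) weighted sum of hidden states
        return context, alpha

    # Toy example with random states standing in for BiLSTM output.
    rng = np.random.default_rng(0)
    T, two_d, a = 5, 8, 4
    H = rng.normal(size=(T, two_d))
    w = rng.normal(size=(two_d, a))
    v = rng.normal(size=(a,))
    ctx, alpha = attention_pool(H, w, v)
    print(ctx.shape, float(alpha.sum()))
    ```

    In the full model, `ctx` would then be passed through a linear layer and Softmax to produce class probabilities; here the sketch only shows how the Attention layer condenses the variable-length BiLSTM output into a fixed-size feature vector.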


    Figures(3)  / Tables(4)
