Title | Automatic Identification of Non-compliant Information in Chinese Text Based on BERT-TextCNN Algorithm |
Authors | Gao, Yuan; Wang, Chunning; Xie, Yingchong |
Publication date | 2024-12-01 |
Conference name | International Conference on Research in Education and Science (ICRES) |
Proceedings title | Proceedings of International Conference on Research in Education and Science: ICRES 2024 |
Proceedings editors | Mack Shelley & Omer Tayfur Ozturk |
ISBN | 9781952092633 |
Pages | 2312-2325 |
Conference dates | April 27-30, 2024 |
Conference location | Antalya, Turkey |
Publisher | ISTES |
Abstract | With the rapid global expansion and popularity of short video platforms, the identification and filtering of non-compliant content have become crucial tasks to ensure the health of the online environment. Traditional manual review methods face challenges of low efficiency and poor consistency, urgently requiring an efficient, automated solution. This study proposes a deep learning model based on BERT and TextCNN, aimed at automatically identifying non-compliant information within Chinese text content. By integrating the deep semantic understanding capabilities of the BERT model with the local feature extraction advantages of TextCNN, we designed and implemented an effective model for non-compliant information identification. This research first conducted a detailed preprocessing and analysis of the dataset, including exploring word length distribution, analyzing the proportion of compliant and non-compliant information, and visualizing key vocabulary through word cloud graphics. Subsequently, we trained and tested the model, which achieved an accuracy of 93.61% on the identification task, demonstrating good balance across precision, recall, and F1 score metrics, with an AUC value of 0.96. This indicates the model's high accuracy and reliability in distinguishing between compliant and non-compliant information. The outcomes of this study not only provide an effective automatic identification tool for short video platforms but also offer a new research perspective and practical evidence for the application of deep learning in text analysis. |
Keywords | BERT-TextCNN; deep learning; natural language processing; non-compliant information identification; text analysis |
Language | English |
Scopus accession number | 2-s2.0-85217632923 |
Document type | Conference paper |
Item identifier | https://repository.uic.edu.cn/handle/39GCC9TT/12559 |
Collection | Beijing Normal University-Hong Kong Baptist University United International College (北师香港浸会大学) |
Author affiliations | 1. Beijing Normal University-Hong Kong Baptist University, United International College, China 2. Beijing Institute of Technology, China |
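First author affiliation | Beijing Normal University-Hong Kong Baptist University United International College (北师香港浸会大学) |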
Recommended citation (GB/T 7714) | Gao, Yuan, Wang, Chunning, Xie, Yingchong. Automatic Identification of Non-compliant Information in Chinese Text Based on BERT-TextCNN Algorithm[C]//Mack Shelley & Omer Tayfur Ozturk: ISTES, 2024: 2312-2325. |
Files in this item | There are no files associated with this item. |