BNU-HKBU UIC NLP team 2 at SemEval-2019 task 6: Detecting offensive language using BERT model

Research Output Details

Title	BNU-HKBU UIC NLP team 2 at SemEval-2019 task 6: Detecting offensive language using BERT model
Authors	Wu, Zhenghao; Zheng, Hao; Wang, Jianming; Su, Weifeng; Fong, Jefferson
Publication date	2019
Conference name	NAACL HLT 2019 - International Workshop on Semantic Evaluation
Proceedings title	NAACL HLT 2019 - International Workshop on Semantic Evaluation - Proceedings of the 13th Workshop
ISBN	9781950737062
Pages	551-555
Conference date	June 6–June 7, 2019
Conference location	Minneapolis, Minnesota, USA
Abstract	In this study we address the problem of identifying and categorizing offensive language in social media. Our group, BNU-HKBU UIC NLP Team2, uses supervised classification along with multiple versions of the data generated by different pre-processing methods. We then use the state-of-the-art model Bidirectional Encoder Representations from Transformers, or BERT (Devlin et al., 2018), to capture linguistic, syntactic and semantic features. Long-range dependencies between the parts of a sentence can be captured by BERT's bidirectional encoder representations. Our results show 85.12% accuracy and an 80.57% F1 score in Subtask A (offensive language identification), 87.92% accuracy and a 50% F1 score in Subtask B (categorization of offense types), and 69.95% accuracy and a 50.47% F1 score in Subtask C (offense target identification). Analysis of the results shows that distinguishing between targeted and untargeted offensive language is not a simple task. More work needs to be done on the imbalanced data problem in Subtasks B and C. Some future work is also discussed.
Language	English
Scopus accession number	2-s2.0-85091389212
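Document type	Conference paper
Item identifier	https://repository.uic.edu.cn/handle/39GCC9TT/6839
Collection	Division of Science and Technology
Author affiliation	Computer Science and Technology, Division of Science and Technology, BNU-HKBU United International College, Zhuhai, Guangdong, China
First author affiliation	BNU-HKBU United International College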
Recommended citation (GB/T 7714)	Wu, Zhenghao, Zheng, Hao, Wang, Jianming, et al. BNU-HKBU UIC NLP team 2 at SemEval-2019 task 6: Detecting offensive language using BERT model[C], 2019: 551-555.