Research Output Details

Publication Status: Published
Title: MLFormer: a high performance MPC linear inference framework for transformers
Authors
Publication Date: 2025-04-01
Journal: Journal of Cryptographic Engineering
ISSN/eISSN: 2190-8508
Volume: 15  Issue: 1
Abstract

Transformer-based models are widely used in natural language processing tasks, and their application has been further extended to computer vision. In such deployments, data security becomes a crucial concern when deep learning services run on cloud platforms. To address these concerns, multi-party computation (MPC) is employed to prevent data and model leakage during inference. However, the Transformer model introduces several challenges for MPC computation: the time overhead of the Softmax (normalized exponential) function, the accuracy issues caused by the "dynamic range" of approximated division and exponentiation, and the high memory overhead when processing long sequences. To overcome these challenges, we propose MLFormer, an MPC-based inference framework for Transformer models in the semi-honest adversary model, built on CrypTen (Knott et al., Adv Neural Inf Process Syst 34:4961-4973, 2021), a secure machine learning framework developed by the Facebook AI Research group. In this framework, we replace softmax attention with linear attention, whose time and memory complexity are linear in the input length. This modification eliminates the softmax function entirely, yielding lower time and memory overhead. To preserve the accuracy of linear attention, we propose scaled linear attention to address the dynamic-range issue caused by the MPC division used, and we propose a new approximate division function to reduce the computation time of the attention block. Furthermore, to improve the efficiency and accuracy of the MPC exponential and reciprocal operations that are pervasive in Transformer models, we propose a novel MPC exponential protocol and are the first to integrate the efficient reciprocal protocol of Bar-Ilan and Beaver (Proceedings of the 8th Annual ACM Symposium on Principles of Distributed Computing, pp. 201-209, 1989) into our framework.
Additionally, we optimize the computation of causal linear attention, which is used in the private inference of auto-regressive tasks, with novel CUDA kernel functions. All of the preceding optimizations contribute to a more accurate and efficient framework. Experimental results demonstrate that our framework achieves comparable accuracy with reduced inference time and GPU memory overhead compared to the original Transformer model. The speedup reaches 78.79% over the traditional private Transformer at an input length of 1024 patches.
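As a plain-text illustration of the linear-attention substitution the abstract describes (in the clear, outside MPC; the ELU+1 feature map and the cumulative-sum causal variant are common choices from the linear-attention literature, not necessarily the paper's exact construction):

```python
import numpy as np

def feature_map(x):
    # ELU(x) + 1: a positive feature map often paired with linear attention
    # (an assumption; the paper's kernel may differ)
    return np.where(x > 0, x + 1.0, np.exp(x))

def softmax_attention(Q, K, V):
    # O(n^2 * d): materializes the full n x n attention matrix
    S = Q @ K.T / np.sqrt(Q.shape[1])
    A = np.exp(S - S.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)
    return A @ V

def linear_attention(Q, K, V, eps=1e-6):
    # O(n * d * d_v): compute K'^T V once, then one matmul for all queries;
    # no softmax, hence no MPC exponential inside the attention block
    Qp, Kp = feature_map(Q), feature_map(K)
    KV = Kp.T @ V                          # (d, d_v), independent of n
    Z = Qp @ Kp.sum(axis=0)                # (n,) normalizer
    return (Qp @ KV) / (Z[:, None] + eps)

def causal_linear_attention(Q, K, V, eps=1e-6):
    # Prefix sums give causal (auto-regressive) attention in O(n)
    # memory per step, without an n x n mask
    Qp, Kp = feature_map(Q), feature_map(K)
    KV = np.cumsum(Kp[:, :, None] * V[:, None, :], axis=0)  # (n, d, d_v)
    Z = np.einsum('nd,nd->n', Qp, np.cumsum(Kp, axis=0))
    out = np.einsum('nd,ndv->nv', Qp, KV)
    return out / (Z[:, None] + eps)
```

The last row of the causal variant coincides with the last row of the non-causal one, since the final query sees the whole prefix.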
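The reciprocal protocol of Bar-Ilan and Beaver that the abstract mentions can likewise be sketched as a cleartext simulation (2-party additive sharing over a prime field; the modulus and sharing helpers are illustrative, and the masked product is computed in the clear here, whereas a real protocol would multiply shares with, e.g., Beaver triples):

```python
import numpy as np

P = 2**61 - 1  # illustrative prime modulus for the secret-sharing field

def share(x, rng):
    # 2-party additive secret sharing over Z_P
    s0 = int(rng.integers(0, P))
    return s0, (x - s0) % P

def reconstruct(sh):
    return (sh[0] + sh[1]) % P

def bb_reciprocal(x_shares, rng):
    # Bar-Ilan/Beaver idea: mask the secret x with a random nonzero r,
    # open only the product c = r*x (which reveals nothing about x),
    # then [1/x] = c^{-1} * [r], computable locally on shares.
    r = int(rng.integers(1, P))
    r_sh = share(r, rng)
    # Simplification: the product on shares is simulated by reconstructing x.
    c = (r * reconstruct(x_shares)) % P
    c_inv = pow(c, P - 2, P)  # field inverse of the *public* value c
    return ((c_inv * r_sh[0]) % P, (c_inv * r_sh[1]) % P)
```

The cost is one share multiplication, one opening, and one public field inversion, which is why the abstract calls it efficient compared with iterative MPC reciprocal approximations.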

Keywords: GPU; Linear transformer; Multi-party computation; Parallel processing; Private inference
DOI: 10.1007/s13389-024-00365-1
Indexed In: SCIE
Language: English
WOS Research Area: Computer Science
WOS Category: Computer Science, Theory & Methods
WOS Accession Number: WOS:001358922700001
Scopus Accession Number: 2-s2.0-85209581974
Document Type: Journal article
Item Identifier: https://repository.uic.edu.cn/handle/39GCC9TT/12802
Collection: Faculty of Science and Technology
Corresponding Author: Chen, Donglong
Affiliations
1. Guangdong Provincial Key Laboratory of IRADS, BNU-HKBU United International College, Zhuhai 519000, China
2. Hangzhou Innovation Institute of Beihang University, Hangzhou 311121, China
3. Zhejiang Lab, Hangzhou 310000, China
4. Sun Yat-sen University, Shenzhen 518107, China
5. Nanjing University of Aeronautics and Astronautics, Nanjing 210000, China
6. City University of Hong Kong, Hong Kong, China
7. Iğdır University, Turkey, and University of California Santa Barbara, Santa Barbara, United States
First Author Affiliation: BNU-HKBU United International College
Corresponding Author Affiliation: BNU-HKBU United International College
Recommended Citation
GB/T 7714
Liu, Siqi, Liu, Zhusen, Chen, Donglong, et al. MLFormer: a high performance MPC linear inference framework for transformers[J]. Journal of Cryptographic Engineering, 2025, 15(1).
APA Liu, Siqi, Liu, Zhusen, Chen, Donglong, Dai, Wangchen, Zhou, Lu, ... & Koç, Çetin Kaya. (2025). MLFormer: a high performance MPC linear inference framework for transformers. Journal of Cryptographic Engineering, 15(1).
MLA Liu, Siqi, et al. "MLFormer: a high performance MPC linear inference framework for transformers." Journal of Cryptographic Engineering 15.1 (2025).
Files in This Item
No files are associated with this item.

Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.