Research Output Details

Title: Enhancing LLM QoS Through Cloud-Edge Collaboration: A Diffusion-Based Multi-Agent Reinforcement Learning Approach
Authors: Yao, Zhi; Tang, Zhiqing; Yang, Wenmian; Jia, Weijia
Publication date: 2025
Journal: IEEE Transactions on Services Computing
Volume: 18, Issue: 3, Pages: 1412-1427
Abstract: Large Language Models (LLMs) are widely used across various domains, but deploying them in cloud data centers often leads to significant response delays and high costs, undermining Quality of Service (QoS) at the network edge. Although caching LLM request results at the edge using vector databases can greatly reduce response times and costs for similar requests, this approach has been overlooked in prior research. To address this, we propose a novel Vector database-assisted cloud-Edge collaborative LLM QoS Optimization (VELO) framework that caches LLM request results at the edge using vector databases, thereby reducing response times for subsequent similar requests. Unlike methods that modify LLMs directly, VELO leaves the LLM's internal structure intact and is applicable to various LLMs. Building on VELO, we formulate the QoS optimization problem as a Markov Decision Process (MDP) and design an algorithm based on Multi-Agent Reinforcement Learning (MARL). Our algorithm employs a diffusion-based policy network to extract the LLM request features, determining whether to request the LLM in the cloud or retrieve results from the edge's vector database. Implemented in a real edge system, our experimental results demonstrate that VELO significantly enhances user satisfaction by simultaneously reducing delays and resource consumption for edge users of LLMs. Our DLRS algorithm improves performance by 15.0% on average for similar requests and by 14.6% for new requests compared to the baselines.
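The edge-versus-cloud decision described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the learned diffusion-based MARL policy is replaced here by a fixed cosine-similarity threshold, and `embed`, `EdgeVectorCache`, `handle_request`, `fake_cloud_llm`, and the `threshold` value are all hypothetical names and choices introduced only for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding (assumption; a real system would use a learned encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class EdgeVectorCache:
    """Hypothetical in-memory stand-in for the edge vector database."""

    def __init__(self):
        self.entries = []  # list of (embedding, cached_response)

    def add(self, request, response):
        self.entries.append((embed(request), response))

    def nearest(self, request):
        """Return (similarity, response) of the closest cached request."""
        query = embed(request)
        best = (0.0, None)
        for vec, response in self.entries:
            sim = cosine(query, vec)
            if sim > best[0]:
                best = (sim, response)
        return best


def handle_request(request, cache, cloud_llm, threshold=0.8):
    """Serve from the edge cache when similar enough, else call the cloud LLM.

    The fixed threshold is an assumption standing in for the learned policy.
    """
    sim, cached = cache.nearest(request)
    if cached is not None and sim >= threshold:
        return "edge", cached
    response = cloud_llm(request)
    cache.add(request, response)  # cache the result for later similar requests
    return "cloud", response


def fake_cloud_llm(request):
    """Placeholder for a remote LLM call."""
    return f"answer to: {request}"


cache = EdgeVectorCache()
first = handle_request("what is edge computing", cache, fake_cloud_llm)
second = handle_request("what is edge computing", cache, fake_cloud_llm)
third = handle_request("explain diffusion models briefly", cache, fake_cloud_llm)
```

In this sketch the first request misses the empty cache and goes to the cloud, the repeated request is answered from the edge, and the unrelated third request goes to the cloud again. A fixed threshold cannot weigh delay against resource cost the way a learned policy might, which is the gap the paper's scheduling algorithm is meant to address.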
Keywords: diffusion model; edge computing; multi-agent reinforcement learning; request scheduling; vector database
DOI: 10.1109/TSC.2025.3562362
Language: English
Scopus accession number: 2-s2.0-105007982160
Document type: Journal article
Item identifier: https://repository.uic.edu.cn/handle/39GCC9TT/13733
Collection: Beijing Normal-Hong Kong Baptist University
Corresponding authors: Tang, Zhiqing; Jia, Weijia
Author affiliations:
1. Beijing Normal University, School of Artificial Intelligence, Beijing, 100875, China
2. Beijing Normal University, Institute of Artificial Intelligence and Future Networks, Zhuhai, 519087, China
3. Beijing Normal-Hong Kong Baptist University, Guangdong Key Lab of AI and Multi-Modal Data Processing, Zhuhai, 519087, China
Corresponding author affiliation: Beijing Normal-Hong Kong Baptist University
Recommended citation:
GB/T 7714
Yao, Zhi, Tang, Zhiqing, Yang, Wenmian, et al. Enhancing LLM QoS Through Cloud-Edge Collaboration: A Diffusion-Based Multi-Agent Reinforcement Learning Approach[J]. IEEE Transactions on Services Computing, 2025, 18(3): 1412-1427.
APA Yao, Zhi, Tang, Zhiqing, Yang, Wenmian, & Jia, Weijia. (2025). Enhancing LLM QoS Through Cloud-Edge Collaboration: A Diffusion-Based Multi-Agent Reinforcement Learning Approach. IEEE Transactions on Services Computing, 18(3), 1412-1427.
MLA Yao, Zhi, et al. "Enhancing LLM QoS Through Cloud-Edge Collaboration: A Diffusion-Based Multi-Agent Reinforcement Learning Approach". IEEE Transactions on Services Computing 18.3 (2025): 1412-1427.

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.