Currently, I am a postdoctoral fellow at CUHK, working with Prof. Guoliang Xing.
I received my PhD degree from USTC in 2024, advised by Prof. Xiang-Yang Li and Prof. Lan Zhang.
I received my Bachelor's degree from USTC in 2019, as a member of the Hua-Xia Talent Class (华夏英才班).
My research interests lie in designing theory-backed algorithms and building innovative systems for AI workloads.
STIP: Three-Party Privacy-Preserving and Lossless Inference for Large Transformers in Production
Mu Yuan, Lan Zhang, Yihang Cheng, Miao-Hui Song, Guoliang Xing, Xiang-Yang Li
NDSS 2026
Code
Official MindSpore Support
Qwen2.5 Usage Example
SCX: Stateless KV-Cache Encoding for Cloud-Scale Confidential Transformer Serving
Mu Yuan, Lan Zhang, Liekang Zeng, Siyang Jiang, Bufang Yang, Di Duan, Guoliang Xing
SIGCOMM 2025
Code
Myo-Trainer: A Vision-based Muscle-Aware Motion Feedback System for In-Home Resistance Training
Yuting He, Xinyan Wang, Mu Yuan, Bufang Yang, Siyang Jiang, Yihua Huang, Doris S. F. Yu, Guoliang Xing, Hongkai Chen
MobiCom 2025 (🏅 ACM SenSys '24 Best Demo Runner-up Award)
RemoteRAG: A Privacy-Preserving LLM Cloud RAG Service
Yihang Cheng, Lan Zhang, Junyang Wang, Mu Yuan, Yunhao Yao
ACL 2025
A-VL: Adaptive Attention for Large Vision-Language Models
Junyang Zhang, Mu Yuan, Ruiguang Zhong, Puhan Luo, Huiyou Zhan, Ningkang Zhang, Chengchen Hu, Xiangyang Li
AAAI 2025
Venus: An Efficient Edge Memory-and-Retrieval System for VLM-based Online Video Understanding
Shengyuan Ye, Bei Ouyang, Tianyi Qian, Liekang Zeng, Mu Yuan, Xiaowen Chu, Weijie Hong, Xu Chen
IEEE INFOCOM 2026
IoT-Brain: Grounding LLMs for Semantic-Spatial Sensor Scheduling
Zhaomeng Zhou, Lan Zhang, Junyang Wang, Mu Yuan, Junda Lin, Jinke Song
ACM MobiCom 2026
Dataset
Argus: Multi-View Egocentric Human Mesh Reconstruction Based on Stripped-Down Wearable mmWave Add-on
Di Duan, Shengzhe Lyu, Mu Yuan, Hongfei Xue, Tianxing Li, Weitao Xu, Kaishun Wu, Guoliang Xing
SenSys 2025 (🏅 Best Paper Honorable Mention Award)
PacketGame: Multi-Stream Packet Gating for Concurrent Video Inference at Scale
Mu Yuan, Lan Zhang, Xuanke You, Xiang-Yang Li
ACM SIGCOMM 2023 (SIG Grant Award)
PDF
Code
InFi: End-to-end Learnable Input Filter for Resource-efficient Mobile-centric Inference
Mu Yuan, Lan Zhang, Fengxiang He, Xueting Tong, Xiang-Yang Li
ACM MobiCom 2022
PDF
Code
Mitigating Tail Latency for on-Device Inference with Load-Balanced Heterogeneous Models
Mu Yuan, Lan Zhang, Di Duan, Liekang Zeng, Miao-Hui Song, Zichong Li, Guoliang Xing, Xiang-Yang Li
IEEE TMC 2025
MLink: Linking Black-Box Models from Multiple Domains for Collaborative Inference
Mu Yuan, Lan Zhang, Zimu Zheng, Yi-Nan Zhang, Xiang-Yang Li
IEEE TPAMI 2023
PDF
MLink: Linking Black-box Models for Collaborative Multi-model Inference
Mu Yuan, Lan Zhang, Xiang-Yang Li
AAAI 2022 (Oral 4.5%)
PDF
Code
Comprehensive and Efficient Data Labeling via Adaptive Model Scheduling
Mu Yuan, Lan Zhang, Xiang-Yang Li, Hui Xiong
IEEE ICDE 2020
PDF
Heterogeneous Collaborative Model Inference (PhD dissertation)
Mu Yuan
CCF Doctoral Dissertation Award
CCF TCIoT Doctoral Dissertation Award
USTC Doctoral Dissertation Award
Keynote: Co-Designing the Edge and Cloud for Scalable and Secure AI Inference [Slides], ECCAI @ CoNEXT 2025, 2025.12.1
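Device-Cloud Collaboration: Systems Research for Efficient and Secure Large Model Serving, ACM Turing Award Celebration Conference China (ACM TURC 2025), 2025.10.11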
Enhancing AI System Performance and Security via Device-Cloud Collaboration [Best Oral Presentation Award], The 3rd International Conference on the Frontiers of Robotics and Software Engineering (FRSE2025), 2025.8.9
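Device-Cloud Collaborative Intelligence for Efficient, Secure, and Personalized Model Serving, The 15th CCF Outstanding Doctoral Dissertation Forum, 2025.8.4
A Device-Cloud Collaboration Paradigm for Efficient Confidential Inference with Large Models, USTC Special Seminar, 2025.5.6
Model Inference-Native AIoT Systems, CUHK Technical Salon on Large Model Reliability, 2025.2.21
Outstanding Doctoral Dissertation Talk, The 18th China Conference on Internet of Things (CWSN 2024), 2024.9.21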
SCX: Stateless KV-Cache Encoding for Cloud-Scale Confidential Transformer Serving [YouTube] [Bilibili], ACM SIGCOMM 2025, 2025.9.9
PacketGame: Multi-Stream Packet Gating for Concurrent Video Inference at Scale [Bilibili] [Slides], ACM SIGCOMM 2023, 2023.9.13
InFi: End-to-end Learnable Input Filter for Resource-efficient Mobile-centric Inference [Bilibili] [Slides], ACM MobiCom 2022, 2022.10.18
MLink: Linking Black-box Models for Collaborative Multi-model Inference [Bilibili] [Slides (20min)] [Slides (1min)], AAAI 2022
Comprehensive and Efficient Data Labeling via Adaptive Model Scheduling [Bilibili] [Slides], IEEE ICDE 2020