Publications

Publications by category in reverse chronological order. Generated by jekyll-scholar.

2025

  1. Towards Energy Efficient 5G vRAN Servers
    In NSDI, 2025

2024

  1. MoE-Infinity: Offloading-Efficient MoE Model Serving
    Leyang Xue, Yao Fu, Zhan Lu, Luo Mai, and Mahesh K. Marina
    2024
  2. MoE-CAP: Cost-Accuracy-Performance Benchmarking for Mixture-of-Experts Systems
    Yao Fu, Yinsicheng Jiang, Yeqi Huang, Ping Nie, Zhan Lu, Leyang Xue, Congjie He, Man-Kit Sit, and 5 more authors
    2024
  3. ServerlessLLM: Low-Latency Serverless Inference for Large Language Models
    Yao Fu, Leyang Xue, Yeqi Huang, Andrei-Octavian Brabete, Dmitrii Ustiugov, Yuvraj Patel, and Luo Mai
    In OSDI, 2024

2022

  1. PAINT: Path Aware Iterative Network Tomography for Link Metric Inference
    Leyang Xue, Mahesh K. Marina, Geng Li, and Kai Zheng
    In ICNP, 2022