Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor
Cores |
Chenpeng Wu (Shanghai Jiao Tong University), Qiqi Gu (Shanghai Jiao Tong
University), Heng Shi (Shanghai Enflame Technology Co., Ltd.; Shanghai Jiao Tong
University), Jianguo Yao (Shanghai Jiao Tong University), Haibing Guan (Shanghai
Jiao Tong University) |
BINGO: Radix-based Bias Factorization for Random Walk on Dynamic Graphs |
Pinhuan Wang (Rutgers, The State University of New Jersey), Chengying Huan (Nanjing
University), Zhibin Wang (Nanjing University), Chen Tian (Nanjing University), Yuede
Ji (The University of Texas at Arlington), Hang Liu (Rutgers, The State University
of New Jersey) |
Achilles: Efficient TEE-Assisted BFT Consensus via Rollback Resilient Recovery |
Jianyu Niu (Southern University of Science and Technology), Guanlong Wu (Southern
University of Science and Technology), Shengqi Liu (Southern University of Science
and Technology), Xiaoqing Wen (University of British Columbia), Jiangshan Yu (The
University of Sydney), Yinqian Zhang (Southern University of Science and Technology
(SUSTech)) |
LOFT: A Lock-free and Adaptive Learned Index with High Scalability for Dynamic
Workloads |
Yuxuan Mo (Huazhong University of Science and Technology), Yu Hua (Huazhong
University of Science and Technology) |
SpotHedge: Serving AI Models on Spot Instances |
Ziming Mao (UC Berkeley), Tian Xia (UC Berkeley), Zhanghao Wu (UC Berkeley), Wei-Lin
Chiang (UC Berkeley), Tyler Griggs (UC Berkeley), Romil Bhardwaj (UC Berkeley),
Zongheng Yang (UC Berkeley), Scott Shenker (ICSI and UC Berkeley), Ion Stoica (UC
Berkeley) |
Groot: Graph-Centric Row Reordering with Tree for Sparse Matrix Multiplications on
Tensor Cores |
YuAng Chen (The Chinese University of Hong Kong), Jiadong Xie (The Chinese
University of Hong Kong), Siyi Teng (The Chinese University of Hong Kong), Wenqi
Zeng (Hong Kong University of Science and Technology), Jeffrey Xu Yu (The Chinese
University of Hong Kong) |
Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism
Co-Optimization |
Zhanda Zhu (University of Toronto, CentML, Vector Institute), Christina Giannoula
(University of Toronto), Muralidhar Andoorveedu (CentML), Qidong Su (University of
Toronto, CentML, Vector Institute), Karttikeya Mangalam (UC Berkeley), Bojian Zheng
(Independent Researcher), Gennady Pekhimenko (CentML, University of Toronto, Vector
Institute) |
Chrono: Meticulous Hotness Measurement and Flexible Page Migration for Memory
Tiering |
Zhenlin Qi (Shanghai Jiao Tong University), Shengan Zheng (Shanghai Jiao Tong
University), Ying Huang (Intel), Yifeng Hui (Shanghai Jiao Tong University), Bowen
Zhang (Shanghai Jiao Tong University), Linpeng Huang (Shanghai Jiao Tong
University), Hong Mei (Shanghai Jiao Tong University) |
Enabling Virtual Priority in Data Center Congestion Control |
Zhaochen Zhang (Nanjing University), Feiyang Xue (Nanjing University), Keqiang He
(Shanghai Jiao Tong University), Zhimeng Yin (City University of Hong Kong), Gianni
Antichi (Politecnico Milano & Queen Mary University of London), Jiaqi Gao
(Unaffiliated), Yizhi Wang (Nanjing University), Rui Ning (Nanjing University),
Haixin Nan (Nanjing University), Xu Zhang (Nanjing University), Peirui Cao (Nanjing
University), Xiaoliang Wang (Nanjing University), Wanchun Dou (Nanjing University),
Guihai Chen (Nanjing University), Chen Tian (Nanjing University) |
Erebor: A Drop-In Sandbox Solution for Private Data Processing in Untrusted
Confidential Virtual Machines |
Chuqi Zhang (National University of Singapore), Rahul Priolkar (Arizona State
University), Yuancheng Jiang (National University of Singapore), Yuan Xiao (Intel
Labs), Mona Vij (Intel Labs), Zhenkai Liang (National University of Singapore), Adil
Ahmad (Arizona State University) |
SeBS-Flow: Benchmarking Serverless Cloud Function Workflows |
Larissa Schmid (Karlsruhe Institute of Technology), Marcin Copik (ETH Zurich),
Alexandru Calotoiu (ETH Zurich), Laurin Brandner (ETH Zurich), Anne Koziolek
(Karlsruhe Institute of Technology), Torsten Hoefler (ETH Zurich) |
Comprehensive Deadlock Prevention for GPU Collective Communication |
LiChen Pan (School of Computer Science, Peking University), Juncheng Liu (OneFlow
Inc.), Yongquan Fu (Science and Technology Laboratory of Parallel and Distributed
Processing; College of Computer, National University of Defense Technology,
Changsha, Hunan Province, China), Jinhui Yuan (OneFlow Inc.), Rongkai Zhang (Unaffiliated),
PengZe Li (School of Computer Science, Peking University), Zhen Xiao (School of
Computer Science, Peking University) |
Hourglass: Enabling Efficient Split Federated Learning with Data Parallelism |
Qiang He (Huazhong University of Science and Technology), Kaibin Wang (Swinburne
University of Technology), Zeqian Dong (Swinburne University of Technology), Liang
Yuan (University of Adelaide), Feifei Chen (Deakin University), Hai Jin (Huazhong
University of Science and Technology), Yun Yang (Swinburne University of Technology)
|
DeltaZip: Efficient Serving of Multiple Full-Model-Tuned LLMs |
Xiaozhe Yao (ETH Zurich), Qinghao Hu (MIT), Ana Klimovic (ETH Zurich) |
MEPipe: Democratizing LLM Training with Memory-Efficient Slice-Level Pipeline
Scheduling on Cost-Effective Accelerators |
Zhenbo Sun (Tsinghua University), Shengqi Chen (Tsinghua University), Yuanwei Wang
(Tsinghua University), Jian Sha (Tsinghua University), Guanyu Feng (Zhipu AI), Wenguang Chen
(Tsinghua University) |
Empowering WebAssembly with Thin Kernel Interfaces |
Arjun Ramesh (Carnegie Mellon University), Tianshu Huang (Carnegie Mellon
University), Ben Titzer (Carnegie Mellon University), Anthony Rowe (Carnegie Mellon University) |
PET: Proactive Demotion for Efficient Tiered Memory Management |
Wanju Doh (Seoul National University), Yaebin Moon (Samsung Electronics), Seoyoung
Ko (Seoul National University), Seunghwan Chung (Seoul National University), Kwanhee
Kyung (Seoul National University), Eojin Lee (Inha University), Jung Ho Ahn (Seoul
National University)
|
Empower Vision Applications with LoRA LMM |
Liang Mi (Nanjing University), Weijun Wang (Institute for AI Industry Research
(AIR), Tsinghua University), Wenming Tu (Institute for AI Industry Research (AIR),
Tsinghua University), Qingfeng He (Institute for AI Industry Research (AIR),
Tsinghua University), Kui Kong (Institute for AI Industry Research (AIR), Tsinghua
University), Xinyu Fang (Institute for AI Industry Research (AIR), Tsinghua
University), Yazhu Dong (Institute for AI Industry Research (AIR), Tsinghua
University), Yikang Zhang (Nanjing University), Yuanchun Li (Institute for AI
Industry Research (AIR), Tsinghua University), Meng Li (Nanjing University), Haipeng
Dai (Nanjing University), Guihai Chen (Nanjing University), Yunxin Liu (Institute
for AI Industry Research (AIR), Tsinghua University), Weijun Wang (Tsinghua
University) |
A Hardware-Software Co-Design for Efficient Secure Containers |
Jiacheng Shi (Shanghai Jiao Tong University), Yang Yu (Shanghai Jiao Tong
University), Jinyu Gu (Shanghai Jiao Tong University), Yubin Xia (Shanghai Jiao Tong
University) |
OHMiner: An Overlap-centric System for Efficient Hypergraph Pattern Mining |
Hao Qi (Huazhong University of Science and Technology), Kang Luo (Huazhong
University of Science and Technology), Ligang He (University of Warwick), Yu Zhang
(Huazhong University of Science and Technology), Minzhi Cai (Huazhong University of
Science and Technology), Jingxin Dai (Huazhong University of Science and
Technology), Bingsheng He (National University of Singapore), Hai Jin (Huazhong
University of Science and Technology), Zhan Zhang (Zhejiang Lab, China), Jin Zhao
(Huazhong University of Science and Technology), Hengshan Yue (Jilin University),
Hui Yu (Huazhong University of Science and Technology), Xiaofei Liao (Huazhong
University of Science and Technology) |
Adios to Busy-Waiting for Microsecond-scale Memory Disaggregation |
Wonsup Yoon (KAIST), Jisu Ok (KAIST), Sue Moon (KAIST), Youngjin Kwon (KAIST) |
Towards VM Rescheduling Optimization Through Deep Reinforcement Learning |
Xianzhong Ding (University of California, Merced), Yunkai Zhang (University of
California, Berkeley), Binbin Chen (ByteDance), Donghao Ying (UC Berkeley), Tieying
Zhang (ByteDance), Jianjun Chen (ByteDance), Lei Zhang (ByteDance), Alberto Cerpa
(University of California, Merced), Wan Du (University of California, Merced) |
HawkSet: Automatic, Application-Agnostic, and Efficient Concurrent PM Bug Detection
|
João Oliveira (INESC-ID, IST), João Gonçalves (INESC-ID & IST U. Lisboa), Miguel
Matos (IST Lisbon) |
Solid State Drive Targeted Memory-Efficient Indexing for Universal I/O Patterns and
Fragmentation Degrees |
Junsu Im (POSTECH), Jeonggyun Kim (DGIST), Seonggyun Oh (DGIST), Jinhyung Koo
(POSTECH), Juhyung Park (DGIST), Hoon Sung Chwa (DGIST), Sam H. Noh (Virginia Tech),
Sungjin Lee (POSTECH), Junsu Im (DGIST), Jinhyung Koo (DGIST), Sungjin Lee (DGIST)
|
Byte vSwitch: A High-Performance Virtual Switch for Cloud Networking |
Xin Wang (ByteDance Inc.), Deguo Li (ByteDance Inc.), Zhihong Wang (ByteDance Inc.),
Lidong Jiang (ByteDance Inc.), Shubo Wen (ByteDance Inc.), Daxiang Kang (ByteDance
Inc.), Engin Arslan (ByteDance Inc.), Peng He (ByteDance Inc.), Xinyu Qian
(ByteDance Inc.), Bin Niu (ByteDance Inc.), Jianwen Pi (ByteDance Inc.), Xiaoning
Ding (ByteDance Inc.), Ke Lin (ByteDance Inc.), Hao Luo (ByteDance Inc.) |
TUNA: Tuning Unstable and Noisy Cloud Applications |
Johannes Freischuetz (University of Wisconsin-Madison), Konstantinos Kanellis
(University of Wisconsin-Madison), Brian Kroth (Microsoft), Shivaram Venkataraman
(University of Wisconsin-Madison) |
SpInfer: Leveraging Low-Level Sparsity for Efficient Large Language Model Inference
on GPUs |
Ruibo Fan (Data Science and Analytics Thrust, HKUST(GZ)), Xiangrui Yu (Data Science
and Analytics Thrust, HKUST(GZ)), Peijie Dong (Data Science and Analytics Thrust,
HKUST(GZ)), Zeyu Li (Data Science and Analytics Thrust, HKUST(GZ)), Gu Gong (Data
Science and Analytics Thrust, HKUST(GZ)), Qiang Wang (Harbin Institute of Technology
(Shenzhen)), Wei Wang (Hong Kong University of Science and Technology), Xiaowen Chu
(Data Science and Analytics Thrust, HKUST(GZ)) |
Daredevil: Rescue Your Flash Storage from Inflexible Kernel Storage Stack |
Junzhe Li (The University of Hong Kong), Ran Shu (Microsoft Research), Jiayi Lin
(The University of Hong Kong), Qingyu Zhang (The University of Hong Kong), Ziyue
Yang (Microsoft Research), Jie Zhang (Peking University), Yongqiang Xiong (Microsoft
Research), Chenxiong Qian (The University of Hong Kong) |
Eva: Cost-Efficient Cloud-Based Cluster Scheduling |
Tzu-Tao Chang (University of Wisconsin-Madison), Shivaram Venkataraman (University
of Wisconsin-Madison) |
HyperAlloc: Efficient VM Memory De/Inflation via Hypervisor-Shared Page-Frame
Allocators |
Lars Wrenger (Leibniz Universität Hannover), Kenny Albes (Leibniz Universität
Hannover), Marco Wurps (Leibniz Universität Hannover), Christian Dietrich
(Technische Universität Braunschweig), Daniel Lohmann (Leibniz Universität Hannover)
|
Impeller: Stream Processing on Shared Logs |
Zhiting Zhu (Lepton AI), Zhipeng Jia (Google), Newton Ni (University of Texas at
Austin), Dixin Tang (UT Austin), Emmett Witchel (UT Austin) |
Marlin: Enabling High-Throughput Congestion Control Testing in Large-Scale Networks
|
Yanqing Chen (Nanjing University), Li Wang (Nanjing University), Jingzhi Wang
(Nanjing University), Songyue Liu (Nanjing University), Keqiang He (Shanghai Jiao
Tong University), Jian Wang (Nanjing University), Xiaoliang Wang (Nanjing
University), Wanchun Dou (Nanjing University), Guihai Chen (Nanjing University),
Chen Tian (Nanjing University) |
Seal: Towards Diverse Specification Inference for Linux Interfaces from Security
Patches |
Wei Chen (The Hong Kong University of Science and Technology), Bowen Zhang (The Hong
Kong University of Science and Technology), Chengpeng Wang (HKUST), Wensheng Tang
(The Hong Kong University of Science and Technology), Charles Zhang (HKUST) |
Overcoming the Last Mile between Log-Structured File Systems and Persistent Memory
via Scatter Logging |
Yifeng Zhang (Harbin Institute of Technology, Shenzhen), Yanqi Pan (Harbin Institute
of Technology, Shenzhen), Hao Huang (Harbin Institute of Technology, Shenzhen),
Yuchen Shan (Harbin Institute of Technology, Shenzhen), Wen Xia (Harbin Institute of
Technology, Shenzhen) |
NeuStream: Bridging Deep Learning Serving and Stream Processing |
Haochen Yuan (Peking University), Yuanqing Wang (Peking University and Microsoft
Research), Wenhao Xie (Peking University), Yu Cheng (Peking University and Microsoft
Research), Ziming Miao (Microsoft Research), Lingxiao Ma (Microsoft Research),
Jilong Xue (Microsoft Research), Zhi Yang (Peking University) |
AlloyStack: A Library Operating System for Serverless Workflow Applications |
Jianing You (Tianjin University), Kang Chen (Tsinghua University), Laiping Zhao
(Tianjin University), Yiming Li (Tianjin University), Yichi Chen (Tianjin
University), Yuxuan Du (Tianjin University), Yanjie Wang (Tianjin University),
Luhang Wen (Tianjin University), Keyang Hu (Tsinghua University), Keqiu Li (Tianjin
University) |
Jupiter: Pushing the Speed and Scalability Limitations for Subgraph Matching on
Multi-GPUs |
Zhiheng Lin (Institute of Computing Technology, Chinese Academy of Sciences), Ke
Meng (Institute of Computing Technology), Changjie Xu (Institute of Computing
Technology, Chinese Academy of Sciences), Weichen Cao (Institute of Computing
Technology, Chinese Academy of Sciences), Guangming Tan (Institute of Computing
Technology) |
MetaHG: Enhancing HGNN Systems Leveraging Advanced Metapath Graph Abstraction |
Haiheng He (Huazhong University of Science and Technology), Haifeng Liu (Huazhong
University of Science and Technology), Long Zheng (Huazhong University of Science
and Technology), Yu Huang (Huazhong University of Science and Technology), Xinyang
Shen (Huazhong University of Science and Technology), Wenkan Huang (Huazhong
University of Science and Technology), Shuaihu Cao (Huazhong University of Science
and Technology), Xiaofei Liao (Huazhong University of Science and Technology), Hai
Jin (Huazhong University of Science and Technology), Jingling Xue (University of New
South Wales) |
Garbage Collection Does Not Only Collect Garbage: Piggybacking-Style Defragmentation
for Deduplicated Backup Storage |
Dingbang Liu (Harbin Institute of Technology, Shenzhen), Xiangyu Zou (Harbin
Institute of Technology, Shenzhen), Tao Lu (DapuStor), Philip Shilane (Dell
Technologies), Wen Xia (Harbin Institute of Technology, Shenzhen), Wenxuan Huang
(Harbin Institute of Technology, Shenzhen), Yanqi Pan (Harbin Institute of
Technology, Shenzhen), Hao Huang (Harbin Institute of Technology, Shenzhen) |
Atlas: Towards Real-Time Verification in Large-Scale Networks via a Native
Distributed Architecture |
Mingxiao Ma (State Key Laboratory of Networking and Switching Technology, Beijing
University of Posts and Telecommunications), Yuehan Zhang (State Key Laboratory of
Networking and Switching Technology, Beijing University of Posts and
Telecommunications), Jingyu Wang (State Key Laboratory of Networking and Switching
Technology, Beijing University of Posts and Telecommunications), Bo He (State Key
Laboratory of Networking and Switching Technology, Beijing University of Posts and
Telecommunications), Chenyang Zhao (State Key Laboratory of Networking and Switching
Technology, Beijing University of Posts and Telecommunications), Qi Qi (State Key
Laboratory of Networking and Switching Technology, Beijing University of Posts and
Telecommunications), Zirui Zhuang (State Key Laboratory of Networking and Switching
Technology, Beijing University of Posts and Telecommunications), Haifeng Sun (State
Key Laboratory of Networking and Switching Technology, Beijing University of Posts
and Telecommunications), Lingqi Guo (State Key Laboratory of Networking and Switching
Technology, Beijing University of Posts and Telecommunications), Yuebin Guo (State Key
Laboratory of Networking and Switching Technology, Beijing University of Posts and
Telecommunications), Gong Zhang (Huawei Technologies), Jianxin Liao (State Key
Laboratory of Networking and Switching Technology, Beijing University of Posts and
Telecommunications) |
Occamy: A Preemptive Buffer Management for On-chip Shared-memory Switches |
Danfeng Shan (Xi'an Jiaotong University), Yunguang Li (Xi'an Jiaotong University),
Jinchao Ma (Xi'an Jiaotong University), Zhenxing Zhang (Huawei), Zeyu Liang (Xi'an
Jiaotong University), Xinyu Wen (Xi'an Jiaotong University), Hao Li (Xi'an Jiaotong
University), Wanchun Jiang (Central South University), Nan Li (Huawei), Fengyuan Ren
(Tsinghua University) |
Heimdall: Optimizing Storage I/O Admission with Extensive Machine Learning Pipeline
|
Daniar H. Kurniawan (University of Chicago and MangoBoost Inc.), Rani Ayu Putri
(Bandung Institute of Technology and University of Chicago), Peiran Qin (University
of Chicago), Kahfi S. Zulkifli (Bandung Institute of Technology), Ray A. O. Sinurat
(University of Chicago), Janki Bhimani (Florida International University), Sandeep
Madireddy (Argonne National Laboratory), Achmad Imam Kistijantoro (Bandung Institute
of Technology), Haryadi Gunawi (University of Chicago) |
Revealing the Unstable Foundations of eBPF-Based Kernel Extensions |
Shawn Zhong (University of Wisconsin-Madison), Jing Liu (Microsoft Research), Andrea
Arpaci-Dusseau (UW-Madison), Remzi Arpaci-Dusseau (University of Wisconsin-Madison) |
CRAVE: Analyzing Cross-Resource Interaction to Improve Energy Efficiency in
Systems-on-Chip |
Dipayan Mukherjee (University of Illinois, Urbana-Champaign), Sam Hachem (University
of Illinois, Urbana-Champaign), Jeremy Bao (University of Illinois,
Urbana-Champaign), Curtis Madsen (Sandia National Labs), Tian Ma (Sandia National
Labs), Saugata Ghose (University of Illinois Urbana-Champaign), Gul Agha (University
of Illinois at Urbana-Champaign) |