About me
Hello, I'm Jie Sun. I am a fourth-year Ph.D. student in the Department of Computer Science at Zhejiang University, supervised by Zeke Wang and Fei Wu. My research interests include machine learning systems, graph computing, and recommendation systems. I enjoy building efficient and scalable machine learning systems (for GNNs, DLRMs, and LLMs) that leverage heterogeneous hardware, such as NVMe SSDs and GPUs, to address large-scale challenges from industry. Currently, I am collaborating with Alibaba on a large-scale recommendation system that is expected to be released in the coming months.
Publications
Jie Sun, Li Su, Zuocheng Shi, Wenting Shen, Zeke Wang, Lei Wang, Jie Zhang, Wenyuan Yu, Yong Li, Jingren Zhou, Fei Wu
USENIX Annual Technical Conference (ATC), 2023
[Paper] [Code]
We build Legion, which co-designs GPU-topology-aware hierarchical graph partitioning with an NVLink-enhanced multi-GPU unified cache to accelerate large-scale GNN training. Legion minimizes CPU-GPU PCIe traffic, achieving throughput close to pure in-GPU systems even on billion-scale graphs.
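To give a flavor of the mechanism the multi-GPU unified cache builds on, here is a minimal CUDA sketch of a kernel on one GPU reading cached features that physically reside on a peer GPU through CUDA peer-to-peer access (the kind of transfer NVLink accelerates). The kernel, buffer names, and sizes are illustrative only and are not Legion's actual data structures.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread gathers one cached feature value; the cache buffer lives on a
// peer GPU, so with P2P access enabled the load travels over NVLink.
__global__ void gather_from_peer_cache(const float* peer_cache,
                                       const int* node_ids,
                                       float* out, int num_nodes, int dim) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= num_nodes * dim) return;
    int node = node_ids[i / dim];
    out[i] = peer_cache[node * dim + i % dim];  // direct remote read
}

int main() {
    const int dim = 8, num_nodes = 4, cache_nodes = 16;
    cudaSetDevice(1);                                 // feature cache on GPU 1
    float* peer_cache;
    cudaMalloc(&peer_cache, cache_nodes * dim * sizeof(float));
    cudaMemset(peer_cache, 0, cache_nodes * dim * sizeof(float));
    cudaSetDevice(0);                                 // compute on GPU 0
    cudaDeviceEnablePeerAccess(1, 0);                 // let GPU 0 read GPU 1
    float* out;     cudaMalloc(&out, num_nodes * dim * sizeof(float));
    int* node_ids;  cudaMalloc(&node_ids, num_nodes * sizeof(int));
    cudaMemset(node_ids, 0, num_nodes * sizeof(int)); // dummy node ids
    gather_from_peer_cache<<<1, 64>>>(peer_cache, node_ids, out, num_nodes, dim);
    cudaDeviceSynchronize();
    printf("gathered %d cached features from the peer GPU\n", num_nodes);
    return 0;
}
```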
Jie Sun, Mo Sun, Zheng Zhang, Zuocheng Shi, Jun Xie, Zihan Yang, Jie Zhang, Fei Wu, Zeke Wang
IEEE International Conference on Data Engineering (ICDE), 2025
[Paper] [Code]
We build Hyperion, a cost-efficient out-of-core GNN training system that achieves in-memory-like throughput on terabyte-scale graphs using only a few inexpensive NVMe SSDs. We also propose a GPU-initiated asynchronous disk IO stack that saturates the SSDs with only a few GPU cores. We believe this asynchronous disk IO stack can also benefit other out-of-core applications such as DLRM, LLM inference (KVCache on disk), and RAG systems.
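As a rough illustration of the idea behind GPU-initiated IO, the CUDA sketch below has GPU threads claim slots in a submission queue kept in pinned host memory and fill in read requests. The queue layout, the IoRequest struct, and the (omitted) NVMe engine that would drain the queue are hypothetical assumptions for this sketch, not Hyperion's actual IO stack.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

struct IoRequest {            // one page-sized read request (hypothetical)
    unsigned long long lba;   // disk logical block address
    void* dst;                // destination buffer on the GPU
};

// Each thread claims a queue slot with an atomic and writes its request,
// so a handful of GPU threads can keep many IOs in flight concurrently.
__global__ void submit_reads(IoRequest* sq, unsigned int* sq_tail,
                             const unsigned long long* lbas,
                             char* dst_pool, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    unsigned int slot = atomicAdd(sq_tail, 1u);   // claim a submission slot
    sq[slot].lba = lbas[i];
    sq[slot].dst = dst_pool + (size_t)slot * 4096;
    __threadfence_system();                       // publish to the host/SSD side
}

int main() {
    const int n = 64;
    IoRequest* sq;            cudaMallocHost(&sq, n * sizeof(IoRequest));
    unsigned int* sq_tail;    cudaMallocHost(&sq_tail, sizeof(unsigned int));
    *sq_tail = 0;
    unsigned long long* lbas; cudaMalloc(&lbas, n * sizeof(unsigned long long));
    cudaMemset(lbas, 0, n * sizeof(unsigned long long));
    char* dst_pool;           cudaMalloc(&dst_pool, (size_t)n * 4096);
    submit_reads<<<1, n>>>(sq, sq_tail, lbas, dst_pool, n);
    cudaDeviceSynchronize();
    printf("%u read requests submitted from the GPU\n", *sq_tail);
    return 0;
}
```

Because each thread only enqueues a small descriptor, a few GPU cores suffice to keep many outstanding IOs in flight while the rest of the GPU continues training.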
Jie Sun*, Zuocheng Shi*, Li Su, Wenting Shen, Zeke Wang, Yong Li, Wenyuan Yu, Wei Lin, Fei Wu, Bingsheng He, Jingren Zhou. *: Contributed equally to this project.
Annual Symposium on Principles and Practice of Parallel Programming (PPoPP), 2025
[Paper] [Code]
We build Helios, a distributed dynamic graph sampling service for online GNN inference. Helios achieves millisecond-level sampling latency on rapidly updated dynamic graphs and scales out linearly. Helios is now part of Alibaba Graph-Learn, an industrial GNN framework; see the dynamic sampling service documentation for more details (https://graph-learn.readthedocs.io/en/latest/en/dgs/intro.html).
Meng Zhang*, Jie Sun*, Qinghao Hu, Peng Sun, Zeke Wang, Yonggang Wen, Tianwei Zhang. *: Contributed equally to this project.
International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2024
[Paper]
Graph Transformers capture global/long-range dependencies better than GNNs, but the quadratic cost of self-attention makes them hard to scale. We propose TorchGT, an algorithm-system co-design that accelerates Graph Transformer training and scales to sequence lengths of over 1M.
Qi Liu, Mo Sun, Jie Sun, Liqiang Lu, Jieru Zhao, Zeke Wang
International Conference on Field-Programmable Technology (FPT), 2023
Jie Zhang, Hongjing Huang, Jie Sun, Juan Gómez Luna, Onur Mutlu, Zeke Wang
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2023
Hongjing Huang, Yingtao Li, Jie Sun, Xueying Zhu, Jie Zhang, Liang Luo, Jialin Li, Zeke Wang
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2023
Education
[Sep. 2021 - Present] Zhejiang University, Ph.D. student in Computer Science (CS)
[Sep. 2017 - Jun. 2021] Zhejiang University, B.S. in Electronic Engineering (EE)
Internship
[Jun. 2024 - Present] Research Intern, NUS Xtra Group, supervised by Bingsheng He
[Nov. 2020 - Jun. 2024] Research Intern, Alibaba Group
Awards
[Jan. 2024] Alibaba Outstanding Research Intern (by Tongyi Lab)
[Jun. 2023] Outstanding Graduate Student of Zhejiang University
[Jan. 2023] EuroSys Best Poster Award, for early work on Helios
[Jan. 2021] Alibaba-Zhejiang University Joint Institute of Frontier Technologies (AZFT) Annual Outstanding Research Intern