Publications

You can also find my articles on my Google Scholar profile.

Conference Papers

DCP: Addressing Input Dynamism in Long-Context Training via Dynamic Context Parallelism

Download Paper | Download Slides

Lancet: Accelerating Mixture-of-Experts Training by Overlapping Weight Gradient Computation and All-to-All Communication

Download Paper | Download Slides

DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines

Download Paper | Download Slides

dPRO: A Generic Performance Diagnosis and Optimization Toolkit for Expediting Distributed DNN Training