Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism*
Tim Tsz‑Kit Lau, Weijian Li, Chenwei Xu, Han Liu, Mladen Kolar
I am a Computer Science PhD student at Northwestern University, supervised by Prof. Han Liu in the MAGICS lab. I received my B.S. in Artificial Intelligence and Computer Science from the University of Edinburgh. My PhD research focuses on large time series models, large language models (LLMs), and their applications to scientific discovery.
Tim Tsz‑Kit Lau, Weijian Li, Chenwei Xu, Han Liu, Mladen Kolar
Lingzhi Wang, Xiangmin Shen, Weijian Li, Zhenyuan Li, R. Sekar, Han Liu, Yan Chen
Weijian Li, Han Liu
Weijian Li, Haozheng Luo, Chenwei Xu, Han Liu
Weijian Li, Stephen S. Cheng, Lining Mao, Jigyasa Kumari, Alex Pyo, Mehak Kawatra, Jialong Li, Jiayi Wang, Ammar Gilani, Jingya Xun, Jui‑Hui Chung, Jerry Yao‑Chieh Hu, Han Liu
Tim Tsz‑Kit Lau, Weijian Li, Chenwei Xu, Han Liu, Mladen Kolar
Jerry Yao‑Chieh Hu, Pei‑Hsuan Chang, Haozheng Luo, Hong‑Yu Chen, Weijian Li, Wei‑Po Wang, Han Liu
Zhihan Zhou, Yanrong Ji, Weijian Li, Pratik Dutta, Ramana V. Davuluri, Han Liu
Zhi Zhang, Weijian Li, Han Liu
Dennis Wu, Jerry Yao‑Chieh Hu, Weijian Li, Bo‑Yu Chen, Han Liu
Chenwei Xu, Yu‑Chao Huang, Jerry Yao‑Chieh Hu, Weijian Li, Ammar Gilani, Hsi‑Sheng Goan, Han Liu
Alex Reneau, Jerry Yao‑Chieh Hu, Chenwei Xu, Weijian Li, Ammar Gilani, Han Liu
Tong Xie, Yuwei Wan, Weijian Li, Qingyuan Linghu, Shaozhou Wang, Yalun Cai, Han Liu
* denotes co-first author