Distributed Scripts

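Use --export_onnx to set the directory where the ONNX model is exported; the default is the current path ./. Without this flag, the script only runs the computation and generates the input and output files.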

python run_pytorch.py --model gpt2 --batch_size 1 --length 1 --export_onnx ./
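The flag behaviour described above (an optional directory that defaults to ./, and no export when the flag is absent) can be expressed with argparse roughly as follows. This is only a sketch: `build_parser` is a hypothetical helper, and the defaults for --model, --batch_size and --length are assumptions for illustration; the real run_pytorch.py may parse its arguments differently.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags used in the command above."""
    parser = argparse.ArgumentParser(description="sketch of the run_pytorch.py flags")
    # The defaults below are assumptions, not taken from the script.
    parser.add_argument("--model", type=str, default="gpt2")
    parser.add_argument("--batch_size", type=int, default=1)
    parser.add_argument("--length", type=int, default=1)
    # nargs="?" makes the value optional:
    #   flag absent          -> None  (compute only, no ONNX export)
    #   flag without a value -> "./"  (export to the current path)
    #   flag with a value    -> that directory
    parser.add_argument("--export_onnx", type=str, nargs="?", default=None, const="./")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    if args.export_onnx is None:
        print("ONNX export disabled; only inputs and results are generated")
    else:
        print(f"ONNX model will be exported to {args.export_onnx}")
```

With this layout, a script can test `args.export_onnx is None` to decide whether to skip the export step.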

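run_pytorch.py writes the input and output files test_inputs.npy and test_results.npy to the current directory. Currently only a single input and a single output are supported.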
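The two .npy files can serve as a reference when checking another backend's output. The sketch below loads them with NumPy and compares a candidate output against the saved result. The helper names `load_reference` and `matches_reference` and the tolerance values are assumptions chosen for illustration; they are not part of the repository.

```python
import os

import numpy as np


def load_reference(directory="."):
    """Load the single input and single output saved as .npy files."""
    inputs = np.load(os.path.join(directory, "test_inputs.npy"))
    results = np.load(os.path.join(directory, "test_results.npy"))
    return inputs, results


def matches_reference(actual, expected, rtol=1e-3, atol=1e-5):
    """Return True if `actual` has the same shape as `expected` and agrees
    with it element-wise within the given (assumed) tolerances."""
    actual = np.asarray(actual)
    expected = np.asarray(expected)
    if actual.shape != expected.shape:
        return False
    return bool(np.allclose(actual, expected, rtol=rtol, atol=atol))


if __name__ == "__main__":
    inputs, expected = load_reference(".")
    print("input shape:", inputs.shape, "result shape:", expected.shape)
    # Replace `expected.copy()` with the output of the backend under test.
    print("match:", matches_reference(expected.copy(), expected))
```

The tolerances are deliberately loose because distributed and single-device runs can differ slightly in floating-point accumulation order; tighten them if the models in use allow it.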

python cuda_launch.py --model "/XXX/XXX.onnx" --nproc_per_node 4