Updated README with new information
parent 8e04794b2d
commit df9b4fb90a

@@ -519,7 +519,8 @@ use_cpu: false

```bash
# If the training data is very large, add --ddp_timeout to prevent NCCL errors.
deepspeed --num_gpus 8 src/train_bash.py \
    --deepspeed ds_config.json \
    --ddp_timeout 180000000 \
    ... # arguments (same as above)
```

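To put the timeout value in perspective, here is a minimal sketch of the arithmetic. It assumes `--ddp_timeout` is interpreted in seconds and that the library default is 1800 seconds, as in Hugging Face `TrainingArguments`; neither is stated in this diff, and the variable names are illustrative only.

```bash
# Assumption: --ddp_timeout is given in seconds.
ddp_timeout=180000000
# Assumption: the default timeout is 1800 seconds (30 minutes).
default_timeout=1800

# Integer arithmetic: whole days covered by the timeout, and its ratio to the default.
days=$(( ddp_timeout / 86400 ))
ratio=$(( ddp_timeout / default_timeout ))

echo "ddp_timeout=${ddp_timeout}s is about ${days} days (${ratio}x the assumed default)"
```

Under those assumptions the value effectively disables the timeout, so a slow preprocessing step on a large dataset should not cause the process group to give up. A smaller value that comfortably exceeds the longest expected stall would likely work as well.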
@@ -519,7 +519,9 @@ use_cpu: false

```bash
# If the training data is very large, add --ddp_timeout to prevent NCCL errors.
deepspeed --num_gpus 8 src/train_bash.py \
    --deepspeed ds_config.json \
    --ddp_timeout 180000000 \
    ... # arguments (same as above)
```

<details><summary>Example ds_config.json for full-parameter training with DeepSpeed ZeRO-2</summary>
